Is Cpk Analysis only for Data that came from Normal Distribution?


Tavoludo

Hi everybody.
I have been working lately with Cpk analysis at my company, but somebody told me that I cannot use it with data that are not normally distributed, speaking specifically about individual data.
My point is that it does not matter, since we are using n=3, which based on the CLT makes my averages normal, and the d2 calculation is driven by the individual ranges of the subgroups. So based on that, my understanding is that it can be used even if the individual values or data are not normally distributed.
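As a minimal sketch of the calculation being described (Python/NumPy, with made-up skewed data and hypothetical spec limits), here is the n=3 subgroup approach: the averages and R-bar/d2 come from the subgroups, but note that the spread being estimated is still the spread of the individuals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical skewed (non-normal) individual data: 50 subgroups of n=3
data = rng.lognormal(mean=0.0, sigma=0.5, size=(50, 3))

xbar = data.mean(axis=1)             # subgroup averages
rbar = np.ptp(data, axis=1).mean()   # average subgroup range (R-bar)
d2 = 1.693                           # d2 constant for n=3
sigma_within = rbar / d2             # estimated within-subgroup sigma

# Hypothetical spec limits, purely for illustration
LSL, USL = 0.2, 3.0
xbarbar = xbar.mean()
cpk = min(USL - xbarbar, xbarbar - LSL) / (3 * sigma_within)
print(sigma_within, cpk)
```

The subgroup averages do tend toward normality, but the Cpk formula compares spec limits against 3*sigma of the individuals, which is where the normality assumption re-enters.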
Any further clarification?
Thank You!!
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
Hi everybody.
I have been working lately with Cpk analysis at my company, but somebody told me that I cannot use it with data that are not normally distributed, speaking specifically about individual data.

Are you supplying to automotive?

My point is that it does not matter, since we are using n=3, which based on the CLT makes my averages normal, and the d2 calculation is driven by the individual ranges of the subgroups. So based on that, my understanding is that it can be used even if the individual values or data are not normally distributed.

Yeah, I have heard that urban legend many times. It works great if your data are independent. If they are dependent - such as with tool wear - the CLT and similar statistical tools are not applicable. They assume independent data. Lots of folks forget that little caveat. So it really depends on your process. Even AIAG's "Gold Standard" normal data for process capability is really not normal.

Cpks are pretty sloppy to begin with. They really do not recognize that process outputs are virtually always multimodal. But, often people will look at their data and see a normal curve - only because measurement error (which does tend to be normal) masks all of the actual process variation.
 

Bev D

Heretical Statistician
Leader
Super Moderator
...My point is that it does not matter, since we are using n=3, which based on the CLT makes my averages normal, and the d2 calculation is driven by the individual ranges of the subgroups. So based on that, my understanding is that it can be used even if the individual values or data are not normally distributed.
Any further clarification?

Even if you are starting with subgrouped data, the formulas for Cpk translate back to individual data. So if your population is not roughly normal, your Cpk value will be overstated or understated.
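A quick simulation (hypothetical skewed data and spec limit, not from the thread) shows how badly the index can mislead: for exponential individuals, the out-of-spec fraction implied by Cpk under the normal assumption is roughly an order of magnitude lower than the actual fraction:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

# Skewed individual data with a hypothetical one-sided upper spec
x = rng.exponential(scale=1.0, size=100_000)
USL = 4.0

mu, sigma = x.mean(), x.std(ddof=1)
cpk = (USL - mu) / (3 * sigma)       # comes out near 1.0 here

def norm_sf(z):
    """Upper-tail probability of the standard normal."""
    return 0.5 * (1 - erf(z / sqrt(2)))

predicted = norm_sf(3 * cpk)         # out-of-spec fraction implied by Cpk
actual = (x > USL).mean()            # observed out-of-spec fraction
print(predicted, actual)             # actual is roughly 10x the prediction
```

A Cpk of 1.0 "promises" about 0.13% nonconforming, while the skewed tail actually puts closer to 2% of parts over the spec limit.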

Why are you calculating Cpk? What will the data be used for?
 

Tavoludo

This is the aerospace business, and the reason we are calculating Cpk indices is that we do not know the capability of our equipment yet.
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
This is the aerospace business, and the reason we are calculating Cpk indices is that we do not know the capability of our equipment yet.

Are you doing precision machining? If so, the technique for calculating capability in the presence of tool wear has been clearly described - and it is not Cpk.
 

Bev D

Heretical Statistician
Leader
Super Moderator
So here's my advice: since you aren't calculating Cpk to 'fill out a form to submit to your customer', the approach I take is to set the indices aside and focus on understanding the capability first. Then, if I need to provide an index to someone later, I know which index and formula is the most appropriate one for the need.

I have found that simply running the process and taking time-series data (in whatever subgrouping fashion is appropriate for your process) across all of the components of variation, and then plotting them in time series using a multi-vari plot, is the best starting place. Statistics, math, and fancy charting can come later.

The general components of variation to consider are:
measurement error
within piece
piece to piece
time to time (hours, shifts, days, weeks, seasons, years)

lot to lot (set-up to set-up)
vendor lot to lot
operator to operator
equipment to equipment
cavity to cavity
cycle to cycle

The rule of thumb for how many subgroups to take is that each component of variation is covered 3 times, or until most (~80%) of the variation has occurred. It rarely takes very long (it's usually the short-time-period components that contribute the most variation), but if one of the 'longer' components (like vendor lot) is the primary contributor, you are going to want to know that.

From this data you will be able to perform any number of statistical analyses and determine the best subgrouping scheme should you move on to SPC.
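One way to sketch this kind of data collection (illustrative Python/NumPy with invented component sizes, not real process data) is to simulate a few components of variation and then look at the spread at each grouping level, which is essentially what a multi-vari plot displays:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical study: 5 time periods x 10 pieces x 3 within-piece readings
n_time, n_piece, n_within = 5, 10, 3
time_effect  = rng.normal(0, 2.0, size=(n_time, 1, 1))        # time-to-time
piece_effect = rng.normal(0, 1.0, size=(n_time, n_piece, 1))  # piece-to-piece
meas_error   = rng.normal(0, 0.5, size=(n_time, n_piece, n_within))
y = 100 + time_effect + piece_effect + meas_error

# Crude multi-vari style breakdown: spread of the means at each level
print("time-to-time spread:  ", y.mean(axis=(1, 2)).std(ddof=1))
print("piece-to-piece spread:", y.mean(axis=2).std(ddof=1, axis=1).mean())
print("within-piece spread:  ", y.std(ddof=1, axis=2).mean())
```

With the component sizes chosen here, the breakdown correctly flags time-to-time as the dominant contributor, which is the kind of answer the plot gives you before any fancy statistics.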
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
I agree with what Bev D suggested. My approach is to sit down and list the possible sources of variation you think will occur. Put them in a total variance equation, which will help you understand that your data will usually represent multi-modal variation. Then prepare a CNX chart that identifies the variation you can adjust (X), the variation you can keep constant (C), and the variation you have no control over (N). Set the constants, and look at the opportunities for adjusting the process to keep it capable. If you need to adjust during the process, then it is likely not going to use traditional SPC, because it usually means the variable is dependent. That's where I would start to understand the process.
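The total variance equation mentioned here is just the fact that independent sources add in variance: sigma_total^2 = sigma_1^2 + sigma_2^2 + ... A tiny numeric check (hypothetical source names and sizes, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Hypothetical independent variation sources
machine  = rng.normal(0, 0.8, n)   # machine-to-machine
material = rng.normal(0, 0.5, n)   # material lot-to-lot
gage     = rng.normal(0, 0.3, n)   # measurement error
total = machine + material + gage

# Independent variances add: 0.8^2 + 0.5^2 + 0.3^2 = 0.98
expected = 0.8**2 + 0.5**2 + 0.3**2
print(total.var(), expected)
```

Standard deviations do not add, variances do - which is why the biggest single source dominates the total and is the one worth attacking first.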
 