Non-normal Distributions in SPC - How do I Normalize Data?

arezoo

Hi everybody,
I want to implement SPC in our company.
Some of our processes do not have a normal distribution. Is there any SPC method for non-normal distributions? Please help me solve this problem.

MSAFAI

Dear Arezoo,

Since I'm not a specialist in this field, I can only give you a hint:

One of the methods for dealing with non-normal distributions is data 'transformation'. That is, depending on the shape of the distribution, you apply a formula that transforms it into a normal one. For example, in some cases you can use the square root of x.

Please have a look at Juran's quality handbook (5th edition) for the outline and some references.

Good Luck
MSAFAI

Ken K.

Along those lines, both MINITAB and JMP have Box-Cox transformation tools that allow you to easily identify a transformation that provides normality.

MINITAB also gives a confidence interval for the transformation constant (lambda), which means you have a range of "best" transformations to choose from. If 0 is inside that CI, then you'd just take the log of the data. If 0.5 is inside, you'd take the square root, etc.
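
The idea of picking lambda from a confidence interval can be tried without MINITAB or JMP. The following is a minimal sketch, not either package's actual algorithm: it assumes strictly positive data, searches a fixed grid of lambda values for the largest Box-Cox profile log-likelihood, and takes as an approximate 95% interval every lambda whose log-likelihood is within 1.92 (half the chi-square 0.95 quantile with 1 degree of freedom) of the maximum. The data are simulated lognormal values, and all function names are hypothetical.

```python
import math
import random


def boxcox_transform(xs, lam):
    """Box-Cox transform of strictly positive data for one lambda."""
    if abs(lam) < 1e-12:
        return [math.log(x) for x in xs]  # lambda = 0 is the log transform
    return [(x ** lam - 1.0) / lam for x in xs]


def boxcox_loglik(xs, lam, sum_log_x):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(xs)
    ys = boxcox_transform(xs, lam)
    mean = math.fsum(ys) / n
    var = math.fsum((y - mean) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum_log_x


def boxcox_search(xs):
    """Return (best lambda, CI low, CI high) over a grid from -2 to 2."""
    grid = [i / 20 for i in range(-40, 41)]  # step 0.05, includes 0.0
    sum_log_x = math.fsum(math.log(x) for x in xs)
    ll = {lam: boxcox_loglik(xs, lam, sum_log_x) for lam in grid}
    best = max(grid, key=ll.get)
    # Approximate 95% interval: lambdas within 1.92 of the maximum.
    inside = [lam for lam in grid if ll[lam] >= ll[best] - 1.92]
    return best, min(inside), max(inside)


random.seed(1)
# Simulated right-skewed data; a log transform should suit it.
data = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]
lam_hat, ci_low, ci_high = boxcox_search(data)
print("lambda:", lam_hat, "approx. 95% interval:", (ci_low, ci_high))
```

If the printed interval contains 0, the rule of thumb above says to take the log; if it contains 0.5, take the square root.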

You will also run into a lot of people who say don't sweat the non-normality, but I haven't seen anything to justify that lack of concern.

On the other hand, if you are taking larger subsamples, say 5 to 10 or more, then you can take advantage of the fact that the sample mean, regardless of the parent population's distribution, will follow an approximately normal distribution (this is what the Central Limit Theorem is about).

Don Winton

-------Begin Snip-------

Even though the distribution in the universe is not normal, the distribution of Xbar values tends to be close to normal. The larger the sample size and the more nearly normal the universe, the closer will the frequency distribution of averages approach the normal curve.

However, even if n is as small as 4 and the universe far from normal, the distribution of the averages of the samples will be very close to normal. Shewhart illustrates this by showing the distributions of averages of 1,000 samples of four from each of two bowls of chips, one containing a rectangular and the other a triangular distribution. Neither of these universes even faintly resembles a normal curve. However, the distribution of the samples drawn from these bowls fairly approximates normal.

The main point of Shewhart’s bowl is this. Even with great departures from normality in the universe, the distribution of Xbar values with n=4 are approximately normal. In sampling from most distributions found in nature and industry, the distribution of Xbar values will be even closer to normal.

However, it is of interest to observe that distributions similar to the rectangular and triangular distributions are sometimes found in industry. Although they seldom occur as a result of production alone, they may be found as a result of production followed by 100% inspection. For example, if a production operation gives a distribution on a certain dimension which is roughly normal with a standard deviation of 0.001 cm, and the specified tolerances on the dimension are +/- 0.001 cm, it is obvious that only about 68% of the product will meet the specification. If the production operation accurately centers the dimension at its specified nominal value, about 16% of the product will be rejected by the go gage and another 16% by the no-go gage. The distribution of the accepted product will not be far from rectangular. There will be two distributions something like the triangular, one for the product rejected by the go gage and the other for the product rejected by the no-go gage.

The great practical importance of the normal curve arises even more from its uses in sampling theory than from the fact that some observed distributions are described by it well enough for practical purposes. Of great practical significance is the fact that distributions of averages of samples tend to be approximately normal even though the samples were drawn from non-normal universes.

Grant and Leavenworth, Statistical Quality Control, pp. 60-62.

-------End Snip-------
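
The bowl experiment described in the excerpt can be re-run as a simulation. This is only a sketch under stated assumptions: the "bowls" are continuous distributions on 0 to 1, the triangular bowl is taken to be a right triangle, and 20,000 subgroups of four are drawn instead of Shewhart's 1,000 so that the fractions are stable. It reports the share of subgroup averages within 1, 2 and 3 standard deviations of their mean, which for a normal curve would be about 68.3%, 95.4% and 99.7%.

```python
import random
import statistics


def subgroup_means(draw, n, groups):
    """Average of n draws, repeated for the given number of subgroups."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(groups)]


def fraction_within(values, k):
    """Share of values within k standard deviations of their own mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum(abs(v - mu) <= k * sigma for v in values) / len(values)


random.seed(4)
# Hypothetical stand-ins for the two bowls of chips.
bowls = {
    "rectangular": lambda: random.uniform(0.0, 1.0),
    "triangular": lambda: random.triangular(0.0, 1.0, 1.0),
}

results = {}
for name, draw in bowls.items():
    means = subgroup_means(draw, n=4, groups=20000)
    results[name] = [fraction_within(means, k) for k in (1, 2, 3)]
    print(name, [round(f, 4) for f in results[name]])
```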

Regards,
dWizard

------------------
I was better but I got over it.

bobdoering

Trusted Information Resource

"However, it is of interest to observe that distributions similar to the rectangular and triangular distributions are sometimes found in industry. Although they seldom occur as a result of production alone, they may be found as a result of production followed by 100% inspection."

Grant and Leavenworth, Statistical Quality Control, pp. 60-62.

Actually, they are not quite right with that statement - the rectangular distribution is very common in precision machining where all special causes have been removed and the most significant common cause is tool wear. I see it every day...

Steve Prevette

Deming Disciple
Super Moderator
Interesting how many threads are now open on non-normal distributions. But let me post here my standard answer - SPC works because of the Tchebychev Inequality. It was tested and developed by Dr. Shewhart in 1930 to work for any distribution.
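
For reference, the Tchebychev (Chebyshev) inequality says that for any distribution with a finite variance, the fraction of values at least k standard deviations from the mean is at most 1/k². At k = 3 that is at most 1/9, about 11.1%, compared with about 0.27% for an exactly normal distribution. The sketch below checks the bound against a simulated, strongly skewed (exponential) sample; the sample and the function name are illustrative assumptions only.

```python
import random
import statistics


def chebyshev_bound(k):
    """Upper bound on the fraction at least k sigma away from the mean."""
    return 1.0 / (k * k)


random.seed(7)
# Simulated skewed data, chosen only because it is clearly non-normal.
sample = [random.expovariate(1.0) for _ in range(50000)]
mu = statistics.fmean(sample)
sigma = statistics.pstdev(sample)

outside = {}
for k in (2, 3):
    outside[k] = sum(abs(x - mu) >= k * sigma for x in sample) / len(sample)
    print(f"k={k}: observed {outside[k]:.4f}, bound {chebyshev_bound(k):.4f}")
```

The observed fractions sit well under the bound, which is a worst case over all distributions and is usually far from tight.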

bobdoering

Trusted Information Resource
Prior to utilizing Tchebycheff's Theorem in his book “Economic Control of Quality of Manufactured Product,” Shewhart made a much more significant statement (with which I wholeheartedly agree):

"The total information is given by the observed distribution."

His examples of depth of sapwood and tensile strength easily support his work on the normal distribution, because their variation is a natural variation. However, I am not sure how far the precision tool industry had progressed in the 1930s. They may not have observed the same distributions then that we do now with +/- 10 micron tolerances. What we readily observe is the uniform distribution. That is not a distribution on the list of distributions Shewhart reviewed in figure 47 of the same book. When utilizing the uniform distribution correctly, the mean is a useless factor – nowhere near as important as his conclusion in Chapter 8. The good news is that the math of a rectangle is a far sight easier to deal with than that of the normal curve!

Shewhart’s observations were great for the time, but we now know a bit more. Just like the deal about the flat earth. It is true, however, that SPC itself does work, and can work very well with the uniform distribution without any transformations. One just needs to apply the correct statistics - and not those of the normal curve.
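
One way to see why normal-curve statistics misdescribe a uniform process: a uniform distribution between limits a and b has standard deviation (b - a)/√12, so the usual "6 sigma" spread comes out √3 ≈ 1.73 times wider than the range the process can actually occupy. The sketch below shows only that arithmetic; the limits are hypothetical values, and it is not the charting method the poster uses.

```python
import math


def uniform_sigma(a, b):
    """Standard deviation of a uniform distribution on [a, b]."""
    return (b - a) / math.sqrt(12.0)


# Hypothetical process limits for a dimension drifting with tool wear.
a, b = 9.990, 10.010

actual_spread = b - a                       # the process never leaves [a, b]
six_sigma_spread = 6.0 * uniform_sigma(a, b)
ratio = six_sigma_spread / actual_spread    # equals sqrt(3) for any a < b

print(f"actual spread:  {actual_spread:.4f}")
print(f"6-sigma spread: {six_sigma_spread:.4f}")
print(f"ratio:          {ratio:.3f}")
```

A capability index computed from 6 sigma would therefore understate what a truly uniform process can hold.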

Chennaiite

Never-say-die
Trusted Information Resource
Before I get into the exact question being raised here...
A normal distribution is what links the numerical value of Cp/Cpk to the PPM of the process. Hence, for a Cpk result to be genuine, the process should pass a normality test. Normality is checked with software such as Minitab, since in practice no process follows a perfectly normal distribution. Now, normality is not necessarily the target of all specifications. For example, a specification of Dia. 10 +0.1/-0.0 is not expected to follow a normal curve, even theoretically. In this case, '10' is the target dimension and most of the manufactured parts are expected to be at or to the right-hand side of '10'; hence the theoretically targeted curve is not normal but may be considered half-normal.
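
The link between Cpk and PPM can be written out. The sketch below assumes an exactly normal, stable process, in which case the nearer specification limit lies 3 × Cpk standard deviations from the mean and the fraction beyond it is the normal tail area at that distance. The function name is hypothetical, and the figures cover only the nearer limit, not both sides.

```python
from statistics import NormalDist


def ppm_beyond_nearest_limit(cpk):
    """PPM outside the nearer spec limit, assuming an exactly normal process."""
    z = 3.0 * cpk  # distance from mean to nearer limit, in standard deviations
    return NormalDist().cdf(-z) * 1e6


for cpk in (1.0, 1.33, 1.67):
    print(f"Cpk {cpk}: about {ppm_beyond_nearest_limit(cpk):.2f} PPM")
```

If the process is not normal, these PPM figures no longer follow from the Cpk value, which is the point made above.
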
Coming back directly to the question being asked here: if a process is targeted at a normal curve and in practice does not follow normality, that is an indication of one or all of the following:
There is a special cause of variation in the process.
There is an interruption to the process being studied and the nature of the interruption is significant.
There is a change in the setting of the process.
So, one has to work on the above to normalise the process.
Thanks.

bobdoering

Trusted Information Resource
Now, normality is not necessarily the target of all specifications. For example, a specification of Dia. 10 +0.1/-0.0 is not expected to follow a normal curve, even theoretically. In this case, '10' is the target dimension and most of the manufactured parts are expected to be at or to the right-hand side of '10'; hence the theoretically targeted curve is not normal but may be considered half-normal.

A process has its own distribution - it really does not care about the specification. A unilateral specification for precision machining does not change the fact that a correctly controlled precision machining process is a uniform distribution and not normal at all.

Coming back straight forwardly to the question being asked here, if a process is targeted for normal curve and practically it does not follow the normality, it is the indication of one or all of the following:
There is a special cause of variation in the process
There is an interruption to the process being studied and the nature of interruption is significant.
There is a change in setting of the process.
So, one has to work on the above to normalise the process.

Again, you cannot target a process for a distribution; a process has a distribution (the voice of the process), and your job must be to keep its natural distribution within the specifications (capability). Now, you can maintain or improve the process by controlling the common causes (adjustment) or trying to eliminate the special causes.

We know that if you see a normal distribution in precision machining, it is evidence of being out of control, overadjusted, and the operator has become the process, and not the machine.