First, a BIG thank-you for taking the time to go through the data and diagnose the situation.

Your process has a lot going on. The basic problem is that your process is very unstable; therefore, you cannot fit a distribution or assess its capability.

Look at the attached graph. The process shown in the upper left is out of control and appears to have experienced a large downward shift at observation 248 (not counting missing observations). After splitting the data at observation 248, the process shown in the upper right still exhibits smaller shifts and occasional extreme values.

The bottom two graphs show that the process would tend toward normality if it were stabilized. However, you may have to judge this visually rather than with a normality test. Note the probability plot in the lower right: even though the data for Stage 1 fall in a straight line and would visually pass the "fat pencil" test for normality, the p-value is still quite small. This is due to the chunkiness of the data, which in turn comes from the resolution of the measurement system.
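
To see the point about chunkiness concretely: a formal normality test will reject perfectly normal data once the gauge resolution is coarse relative to the process spread. A minimal sketch in Python using SciPy's Shapiro-Wilk test (the location, spread, and resolution below are made-up numbers, not your data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# A truly normal characteristic (hypothetical numbers)
true_values = rng.normal(loc=0.050, scale=0.0015, size=300)

# The same values recorded at a gauge resolution of 0.001 --
# coarse relative to the spread, so the data become "chunky"
recorded = np.round(true_values, 3)

print(f"Shapiro-Wilk p, full resolution: {stats.shapiro(true_values).pvalue:.3f}")
print(f"Shapiro-Wilk p, chunky data:     {stats.shapiro(recorded).pvalue:.2e}")
```

The underlying process is exactly normal in both cases; only the recording resolution differs, yet the chunky version fails the test badly. That is why the visual "fat pencil" judgment can be more trustworthy here than the p-value.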

@Miner, YES, it's counter-intuitive to start with "assess the distribution / capability of an unstable process", BUT to explain the situation:

a) we are starting to use the tools of SPC/SQC, and this is the first step;

b) we have these preceding/prerequisite steps defined (control charts and assessment);

c) one of the steps was to arrive at a baseline of Cpk/Ppk.

Now, given that background, and based on your comment, is it reasonable to assess the (nearest) distribution based on a visual/techno-functional assessment and evaluate the Cpk/Ppk?

I have attached two approaches - assuming the data is homogeneous. I did not look at the details of the process as Miner did.

As to which is preferred, I recommend reading my blog, Unilateral Tolerance Capability Calculation.

I do NOT recommend reporting Ppk or Cpk, as you do not have a process where the target is in the center of the tolerance. The point of Ppk and Cpk is to determine how well centered between the specifications the process is. I assume your target is zero.

Cp or Pp would be far more appropriate. They simply illustrate how well your process variation (when represented by the correct model, or distribution) fits within the tolerance.
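
For concreteness, the difference between the two families can be sketched in a few lines of Python. The spec limit and data below are hypothetical, and since the tolerance here is unilateral ("not more than"), only the upper one-sided index applies:

```python
import numpy as np

def pp(usl, lsl, s):
    """Pp: how well six sigmas of process spread fit within the
    tolerance, regardless of where the process is located."""
    return (usl - lsl) / (6 * s)

def ppk(usl, lsl, mean, s):
    """Ppk: the same spread, but penalized for off-center location."""
    return min((usl - mean) / (3 * s), (mean - lsl) / (3 * s))

# Hypothetical unilateral "not more than" characteristic: USL only
data = np.array([0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.9, 1.1, 1.0])
usl = 5.0
mean, s = data.mean(), data.std(ddof=1)

# With no LSL, only the one-sided upper index can be computed
ppu = (usl - mean) / (3 * s)
print(f"PpU = {ppu:.2f}")  # prints "PpU = 7.30"
```

Note that these formulas assume a normal model; if the appropriate distribution is non-normal (as is likely near a physical limit of zero), percentile-based equivalents should be used instead.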

The expected variation should be non-normal if the process is close to the physical limit of zero. If the actual distribution is normal, then you need to evaluate the data's location relative to the influence of that physical limit.

The further away it is, the more normal it may be. The other point is that you are using so little of your tolerance that the effort to analyze the process variation further needs to be balanced against its value added (or value maintained, if that can be established).

Measurement error, with its normal distribution, can also mask the process variation. The non-normal distributions have a very high p-value, especially compared to the normal distribution, so I would be surprised if the true underlying process distribution were normal.
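
One way to make that comparison concrete is to fit both a normal and a candidate non-normal model to the same data and compare goodness of fit. A rough sketch with simulated data (the lognormal is purely an illustration, not the poster's distribution; a KS test with parameters fitted from the data gives optimistic p-values, so treat the result as a ranking rather than an absolute verdict):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical data bounded below by the physical limit of zero
data = rng.lognormal(mean=-2.0, sigma=0.8, size=200)

for name, dist in [("normal", stats.norm), ("lognormal", stats.lognorm)]:
    params = dist.fit(data)                        # maximum-likelihood fit
    result = stats.kstest(data, dist.cdf, args=params)
    print(f"{name:9s}  KS p-value = {result.pvalue:.3f}")
```

For skewed, zero-bounded data like this, the fitted lognormal should score a clearly higher p-value than the fitted normal, which is the pattern described above.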

However, it would be good to understand whether the shifts Miner observed are expected long term for this process (common cause), or whether they can be eliminated through better controls (special cause). Again, the controls need to be balanced against the economic need, based on how little of the tolerance is actually being used.

@Bobdoering

wrt centered target ==> the specification limit has a "not more than" criterion; hence we would prefer the values to be as near to 'zero' as possible.

wrt target ==> yes, the target is zero.

wrt Cpk/Ppk ==> do you recommend Cp/Pp as adequate?

@ reality...

the above set of data / process is the best-case scenario, i.e., a considerable amount of data;

* on the other hand, there are other products with only 10-20 lots... and the above scenario becomes tough to deal with (wrt having Pp/Cp).

* what is the best approach for this scenario of 10-20 lots? (let's say we are interested in / insist on looking at Pp/Cp values!)

Kindly confirm, and thank you in advance for the guidance.