# Probability of correctly detecting a defect size

#### tahirawan11

##### Involved In Discussions
Hi,

I want to perform a study to calculate the 'probability of correctly detecting a defect size'. The defect is 'wrinkle size' and it varies from 0 to 0.2. The defects are classified into three categories: Low wrinkle (less than 0.10), Medium wrinkle (between 0.10 and 0.15) and High wrinkle (0.15 to 0.20). The defects are classified visually by an operator looking at the 'scan result picture'. The objective of the study is to make statements such as 'the probability of detecting a Low wrinkle is xx with a confidence interval of yy, the probability of detecting a Medium wrinkle is yy with a confidence interval of zz', and so forth (see attached picture).

For this purpose I plan to take 10 samples of each defect category: Low, Medium and High. They will be classified by a better measurement system, and then an operator will be asked to classify each sample by defect size. I want to know how to calculate the probability from this data. Is this the right approach, or should I perform a GR&R instead?

#### Statistical Steven

##### Statistician
Super Moderator
Just a couple of points and a few questions to clarify your question.

1. The graph you show is a cumulative probability plot. That is, the probability of detecting a high wrinkle is close to 100% because it includes the probability of low and medium too.

2. Do you have an estimate of the distribution of each wrinkle class? 10% low, 30% medium and 60% high for example.

3. What is the defect rate in the population?

4. Since you have 3 classes of defects, you need to determine if classifying a medium as either a high or a low is "bad". That is, misclassification of a defect plays into the probability calculations.

You can accomplish this via a GR&R, but you still need to determine what is "right" and what is "wrong".
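To make points 2 and 4 concrete, here is a minimal Python sketch of how the class mix and the misclassification pattern combine into an overall probability. The 10/30/60 mix is the example split from point 2; the classification probabilities in `p_assigned` are purely hypothetical placeholders, and the names `overall_correct` and `error_rates` are just illustrative helpers, not part of any standard method.

```python
CLASSES = ("low", "medium", "high")  # ordered from smallest to largest wrinkle

# Example class mix from point 2 (10% low, 30% medium, 60% high).
prevalence = {"low": 0.10, "medium": 0.30, "high": 0.60}

# HYPOTHETICAL values: P(operator assigns column class | true class is row).
# Each row must sum to 1. Replace with estimates from a real study.
p_assigned = {
    "low":    {"low": 0.70, "medium": 0.25, "high": 0.05},
    "medium": {"low": 0.10, "medium": 0.80, "high": 0.10},
    "high":   {"low": 0.00, "medium": 0.05, "high": 0.95},
}


def overall_correct(prevalence, p_assigned):
    """Probability that a randomly drawn part is classified correctly."""
    return sum(prevalence[c] * p_assigned[c][c] for c in CLASSES)


def error_rates(prevalence, p_assigned):
    """Return (under_call, over_call): the probability that a part is
    assigned a smaller class, or a larger class, than its true class."""
    rank = {c: i for i, c in enumerate(CLASSES)}
    under = over = 0.0
    for true_cls in CLASSES:
        for assigned_cls, p in p_assigned[true_cls].items():
            joint = prevalence[true_cls] * p  # P(true and assigned)
            if rank[assigned_cls] < rank[true_cls]:
                under += joint
            elif rank[assigned_cls] > rank[true_cls]:
                over += joint
    return under, over


correct = overall_correct(prevalence, p_assigned)
under, over = error_rates(prevalence, p_assigned)
print(f"correct={correct:.3f} under-call={under:.3f} over-call={over:.3f}")
```

Separating the two error directions is a design choice for this sketch: it lets you weight them differently once you have decided which misclassifications count as "bad".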

#### tahirawan11

##### Involved In Discussions
Just a couple of points and a few questions to clarify your question.

1. The graph you show is a cumulative probability plot. That is, the probability of detecting a high wrinkle is close to 100% because it includes the probability of low and medium too.

I don't need a cumulative probability plot; I just need to connect the means of the three probabilities. On the other hand, it is much easier to detect or measure a medium or high wrinkle than a smaller one.

2. Do you have an estimate of the distribution of each wrinkle class? 10% low, 30% medium and 60% high for example.
I am developing this new measurement system, so I don't have a lot of samples, but an estimate could be 40% each for medium and high wrinkles and 20% for low wrinkles.

3. What is the defect rate in the population?
I don't know yet, but I will try to find out.

4. Since you have 3 classes of defects, you need to determine if classifying a medium as either a high or a low is "bad". That is, misclassification of a defect plays into the probability calculations.

Yes, classifying a medium as high or low is bad. If a medium is identified as low, there is a risk of part failure in the field, and if a medium is misclassified as high, there is an unjustified cost of repairs.

You can accomplish this via a GR&R but you still need to determine what is "right" and what is "wrong"
The best I can do to find what is right or wrong is to use a better measurement system; that will be taken as the gold standard and used to calculate the probabilities.

#### Statistical Steven

##### Statistician
Super Moderator
Here are a few approaches to think about:

1. Use a 3x3 contingency table (true class by assigned class) to estimate the probability of each of the 9 possible outcomes. You will need approximately 50 of each defect type.

2. Just calculate the percent correct for each defect type and use a binomial confidence interval around it to get the probability for each defect class.

3. Do an MSA where the same sample is categorized multiple times by the same operator. The study might look something like this:

10 each low, medium and high defects
3 inspectors
3 repeats of each of the 30 defects

Score 1 for a correct identification and 0 for a wrong one. Each defect class will have 90 values (10 defects × 3 inspectors × 3 repeats).

Analyze by defect class and across defect classes to get estimates of precision (repeatability and reproducibility).
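As a rough illustration of approaches 1 and 2, the Python sketch below tallies (true class, assigned class) pairs into a 3x3 table and attaches a confidence interval to the percent correct for each class. The Wilson score interval is used here as one common binomial interval; the thread does not name a specific method, so treat that choice as an assumption. The `pairs` data are hypothetical (10 gold-standard parts per class, one operator pass), and the function names are illustrative.

```python
import math
from collections import Counter

CLASSES = ("low", "medium", "high")


def contingency_table(pairs):
    """Count (true_class, assigned_class) pairs into a 3x3 table."""
    counts = Counter(pairs)
    return {t: {a: counts[(t, a)] for a in CLASSES} for t in CLASSES}


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def percent_correct_by_class(table):
    """For each true class: (proportion correct, lower, upper)."""
    result = {}
    for t in CLASSES:
        n = sum(table[t].values())   # parts whose true class is t
        k = table[t][t]              # of those, how many were called t
        result[t] = (k / n, *wilson_interval(k, n))
    return result


# HYPOTHETICAL data: (gold-standard class, operator's class).
pairs = (
    [("low", "low")] * 7 + [("low", "medium")] * 3
    + [("medium", "medium")] * 8 + [("medium", "low")] + [("medium", "high")]
    + [("high", "high")] * 9 + [("high", "medium")]
)

table = contingency_table(pairs)
for cls, (p, lo, hi) in percent_correct_by_class(table).items():
    print(f"{cls}: {p:.2f} correct (95% CI {lo:.2f} to {hi:.2f})")
```

With only 10 parts per class the intervals are wide: for 9 of 10 correct, the Wilson interval runs from roughly 0.60 to 0.98, which is one reason to prefer something closer to 50 per class. If the 90 scores per class from the MSA design are fed into the same function, keep in mind that they are repeated readings of the same 10 parts by the same 3 inspectors, so they are not independent trials and a plain binomial interval will likely look narrower than it should.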