# How to determine sample size to test when the acceptance criterion is x >= standard value?

#### Samshah

##### Registered
Hello,

How do you determine a statistically significant sample size for a continuous variable where each reading must be equal to or above a standard value? I've always used a one-sample t-test; however, that only checks whether the sample mean is greater than the standard value, not how many samples are required to show that individual values are above the standard value.

My device is a Class II medical device in the US, thus subject to 21 CFR Part 820. This sample size is required for verification, and not incoming quality inspection or determining process quality.

#### Steve Prevette

##### Deming Disciple
Super Moderator
What is the basis for your "standard value"? If every reading is to be above a certain specification value, you would need to determine your acceptable risk and also know the standard deviation and mean of past data. One method is to use six sigma principles - plot a control chart, and be sure your mean value is six standard deviations above the "standard value", or some other number of standard deviations (at least 3) depending upon your risk. Adding other trend rules could help, such as seven in a row below the mean value (even if still above the "standard value").
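As an illustration of the control-chart idea above, here is a minimal Python sketch of an individuals chart plus the seven-in-a-row rule. The readings are hypothetical, the function names are made up for this example, and the limits use the usual moving-range constant 2.66 (3 / 1.128); treat it as a starting point, not a validated procedure.

```python
from statistics import mean


def individuals_limits(values):
    """Return (center, lower, upper) limits for an individuals (X) chart."""
    center = mean(values)
    # Moving ranges between consecutive readings
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of span 2
    spread = 2.66 * mean(moving_ranges)
    return center, center - spread, center + spread


def has_run_below(values, center, run_length=7):
    """True if `run_length` consecutive points fall below `center`."""
    streak = 0
    for v in values:
        streak = streak + 1 if v < center else 0
        if streak >= run_length:
            return True
    return False


# Hypothetical peak-force readings in newtons, in time order
readings = [8.1, 7.9, 8.4, 8.0, 7.7, 8.3, 8.2, 7.8, 8.5, 8.0]
center, lcl, ucl = individuals_limits(readings)
print(f"center={center:.2f} N, LCL={lcl:.2f} N, UCL={ucl:.2f} N")
print("lower control limit clears the 5 N spec:", lcl > 5.0)
print("run of 7 below center:", has_run_below(readings, center))
```

If the lower control limit sits comfortably above the specification, the process has at least a three-sigma cushion; how much more is needed depends on the risk assessment.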

This article contains information on a variables sampling plan against a lower specification limit: Chapter 3, "Variables Sampling Plans", in An Introduction to Acceptance Sampling and SPC with R (bookdown.org).

#### Bev D

##### Heretical Statistician
Super Moderator
This requires some thought and perhaps even an iterative approach. For us to help you we really will need to understand two things to begin with. As Steve asked, we need to understand this ‘standard value’. We also need to understand if you have any reason to believe that the standard deviation of the test sample will be different than the historical value.
You will need to base your sample size calculation on the estimate of the standard deviation. There are formulas for this. (The sample size will be larger than one for a mean estimate…)
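One family of such formulas is the one-sided normal tolerance limit: accept if mean - k*s is at or above the lower limit, where k depends on n, the proportion of units to be covered, and the confidence. The sketch below uses a closed-form approximation for k (the form I recall from the NIST/SEMATECH e-Handbook, not the exact noncentral-t value) and searches for the smallest n given an assumed margin in standard-deviation units. It assumes normality, and the function names and default proportion/confidence are illustrative choices, not requirements from any standard.

```python
from math import sqrt
from statistics import NormalDist


def k_factor(n, proportion=0.99, confidence=0.95):
    """Approximate one-sided normal tolerance factor; None if undefined for this n."""
    z_p = NormalDist().inv_cdf(proportion)
    z_g = NormalDist().inv_cdf(confidence)
    a = 1 - z_g**2 / (2 * (n - 1))
    b = z_p**2 - z_g**2 / n
    disc = z_p**2 - a * b
    if a <= 0 or disc < 0:
        return None  # approximation breaks down for very small n
    return (z_p + sqrt(disc)) / a


def min_sample_size(margin_in_sd, proportion=0.99, confidence=0.95, n_max=10_000):
    """Smallest n whose k factor fits inside the assumed margin, else None.

    margin_in_sd = (expected mean - lower limit) / expected standard deviation.
    """
    for n in range(2, n_max + 1):
        k = k_factor(n, proportion, confidence)
        if k is not None and k <= margin_in_sd:
            return n
    return None


# Assumed margin: mean expected about 3 standard deviations above the limit
print(min_sample_size(3.0))
# A margin smaller than the normal quantile for the proportion can never pass
print(min_sample_size(2.0))
```

The design point is that the sample size is driven by how many standard deviations of cushion you expect, which is why an estimate of the standard deviation has to come first.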

You will then need to PLOT the individual values against the historical data. The historical data can be plotted in time series if you have it in that form. (You should.) If not, you can use a sample of data produced under the current conditions as a “control”. The complicating factor here is that you will have variation in the individual values as well as variation between the averages of the process if the process is non-homogeneous, which is why we are typically better off with multiple independent small samples than one larger sample from a single run.

Look at the plot of data. Is the mean of the test data far enough away from the standard value that the standard deviation of the test result will guarantee that the lower tail of individual values will be above the standard value? This will help tell you if you need a second, larger sample or a series of samples.

If you run the test and return here with the data we can help you further. I do understand that in some industries your regulatory statistician may require that you determine the sample size and receive approval of the approach before running any tests. If that is your case you can return here and provide us some of your historical data and we can provide a specific approach.

#### Samshah

##### Registered
Hi Bev, thanks for your comment!

The continuous variable in question is tensile peak force (in newtons) for a joint in a medical device. There is no upper specification; the lower limit is 5 N. This requirement comes from an ISO standard and depends on the outer diameter of the tubing.

There is no "historical data". I don't know what the mean and std dev will be. The first step would be to determine the sample size to collect data for verification.

I need to figure out what kind of hypothesis testing to apply (if any) and determine statistically valid sample size. As I mentioned in my original post, it's for verification of the design. If for N number of samples, the peak tensile force is above 5N, then we would consider the design verified.
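For reference, if each reading is treated purely as pass/fail against 5 N, the common zero-failure ("success run") calculation gives the number of consecutive passing samples needed to claim a given reliability at a given confidence: n = ln(1 - C) / ln(R). A minimal sketch follows; the function name and the reliability/confidence pairs are illustrative assumptions, and this approach ignores how far above 5 N the readings are.

```python
from math import ceil, log


def zero_failure_sample_size(reliability, confidence):
    """Samples needed, all passing, to claim `reliability` at `confidence`."""
    return ceil(log(1 - confidence) / log(reliability))


# Illustrative reliability/confidence pairs; the right pair comes from risk assessment
for reliability, confidence in [(0.90, 0.95), (0.95, 0.95), (0.99, 0.95)]:
    n = zero_failure_sample_size(reliability, confidence)
    print(f"R={reliability:.2f}, C={confidence:.2f}: n={n}")
```

Because it throws away the continuous information, this usually asks for more samples than a variables-based plan would when the mean sits well above the limit.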

Thanks!

#### Samshah

##### Registered
Hi Steve, thanks for your comment!

Please read my response to Bev's comment above. Adding to that, I would like to make sure that each reading of the peak tensile force is above 5N. It may not need to be 6 std devs above 5N.

#### Steve Prevette

##### Deming Disciple
Super Moderator
You do have a chicken and the egg situation here. With a continuous random variable - you have to have some estimate of the standard deviation in order to validate your sample size. I'd recommend taking 15 measurements and see what they come out as.
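A pilot run like that can be summarized in a few lines. The sketch below uses 15 made-up readings (not real data) and reports the mean, the sample standard deviation, and how many standard deviations the mean sits above the 5 N limit; the function name is hypothetical.

```python
from statistics import mean, stdev

LOWER_LIMIT_N = 5.0  # lower specification from the thread


def pilot_summary(values, lower_limit=LOWER_LIMIT_N):
    """Return (mean, sample sd, margin above the limit in sd units)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return m, s, (m - lower_limit) / s


# Hypothetical pilot readings in newtons
pilot = [7.9, 8.4, 8.1, 7.6, 8.8, 8.2, 7.7, 8.5, 8.0, 8.3, 7.8, 8.6, 8.1, 7.9, 8.2]
m, s, margin = pilot_summary(pilot)
print(f"mean={m:.2f} N, sd={s:.3f} N, margin={margin:.1f} sd above {LOWER_LIMIT_N} N")
```

The margin in standard-deviation units is the number that feeds into any later sample size or risk calculation.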

Now, if not 6 sigma, you do have to do a risk assessment because just having your samples come out better than 5N (unless you are 100% sampling) is not sufficient. There always needs to be some form of statistical "cushion" there to get some certainty that no individual will be less than 5N. For example, if you want to be 95% sure that a result will not be less than 5N, you need a two standard deviation buffer (assuming a lot of things - such as normality). The only way to be 100% sure that nothing is less than 5N is a 100% sample.
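To put numbers on the cushion, here is a small sketch of the normal-tail arithmetic. It assumes a normal distribution with the mean and standard deviation known exactly, so it ignores the extra uncertainty from estimating them from a small sample; the function names are made up for illustration.

```python
from statistics import NormalDist


def fraction_below_limit(buffer_in_sd):
    """Expected fraction of readings below the limit for a given buffer."""
    return NormalDist().cdf(-buffer_in_sd)


def buffer_for_fraction(max_fraction_below):
    """Buffer (in sd units) needed to keep the fraction below the limit at or under the target."""
    return -NormalDist().inv_cdf(max_fraction_below)


for k in (2, 3, 6):
    print(f"{k} sd buffer: fraction below limit = {fraction_below_limit(k):.2e}")
print(f"buffer for 5% below limit: {buffer_for_fraction(0.05):.3f} sd")
```

Under these assumptions a two standard deviation buffer leaves roughly 2.3% of readings below the limit on one side, so it is somewhat more conservative than 95%; a one-sided 95% corresponds to about 1.645 standard deviations.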

As alluded to by Bev and me, a control chart would be much more effective.

#### Bev D

##### Heretical Statistician
Super Moderator
To echo Steve, when you don’t know much about a process, you need to take some data, plot it and think about it. (Actually echoing Ellis Ott)

#### Tidge

Trusted Information Resource
My echoing voice will also come back from the chasm. Since the OP mentioned this as a verification activity (in the context of medical devices/Part 820) I can imagine a few options to collect the "historical data" that @Steve Prevette alluded to using.
1. There may exist pre-verification build samples that can be leveraged, via some sort of engineering analysis, as the source of the preliminary estimates.
2. It is possible, but unlikely, that the estimates of interest can be established from engineering analysis of the design outputs as opposed to prototype builds. This requires much experience, a little black magic, and generous (unbiased) prior beliefs.
3. Perhaps there is a planned set of pre-production builds (for process piloting, etc.) that offer the opportunity to collect the historical data. This sort of activity often gets skipped (in the interest of time tables, materials) but such effort can be leveraged in many different ways.