We make primarily Class I devices, with a few Class II. For our stability program, we are required to test 60 samples at each time point for a Pass/Fail test, which (if all 60 pass) gives an upper bound on the probability of failure of <5% at 95% confidence. For stability tests that generate numerical results, where we will use regression analysis, we are required to test 17 replicates at each time point. This seems extreme: from what I've seen in the Pharma industry, typically only 3-5 samples are tested at each time point. Just curious, what sample sizes do others in the device industry use for their stability testing? Also, could anyone provide some references for sample size selection for stability testing? I couldn't find anything in the ICH guidelines or ASTM F1980. Thanks in advance.

Sample size always depends on the standard deviation of the data (among other things), so 17 doesn't actually seem very extreme for stability testing where regression will be applied. You are not merely trying to estimate the MEAN; you are trying to predict the individual values, because your customer uses individual devices. It does you no good for the average to be above the end-of-life specification if 49% of your devices are below it... :)
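To put rough numbers on the mean-versus-individuals point, here is a minimal sketch. It assumes the results are normally distributed, and the mean, standard deviation, and lower spec limit are made-up illustration values, not data from anyone's product.

```python
from statistics import NormalDist


def fraction_below_spec(mean, sd, lower_spec):
    """Fraction of individual units expected to fall below lower_spec,
    assuming unit results follow a normal distribution (an assumption)."""
    return NormalDist(mu=mean, sigma=sd).cdf(lower_spec)


# Hypothetical end-of-life data: the mean sits just above the spec limit.
wide = fraction_below_spec(mean=100.1, sd=5.0, lower_spec=100.0)

# Same mean, much tighter spread.
tight = fraction_below_spec(mean=100.1, sd=0.05, lower_spec=100.0)

print(f"sd = 5.0 : {wide:.1%} of units below spec")
print(f"sd = 0.05: {tight:.1%} of units below spec")
```

With the wide spread, roughly 49% of units fall below the limit even though the mean passes; with the tight spread it is only a couple of percent. That is why the standard deviation, and not just the mean, drives how many replicates you need.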

I know some companies use smaller sample sizes, but that doesn't mean they are correct.

60 for attribute and 17 for variable doesn't sound excessive to me.
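For what it's worth, the 60-sample attribute figure from the original question can be checked with the usual zero-failure (success-run) calculation. This is only a sketch: it assumes every sample passes and that units are independent, and the function names are my own, not from any standard.

```python
import math


def zero_failure_upper_bound(n, confidence=0.95):
    """One-sided upper confidence bound on the failure probability
    when n units are tested and none fail (binomial model)."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


def min_samples_zero_failures(max_failure_rate, confidence=0.95):
    """Smallest n such that zero failures in n units demonstrates a
    failure probability below max_failure_rate at the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_failure_rate))


bound_60 = zero_failure_upper_bound(60)
n_needed = min_samples_zero_failures(0.05)

print(f"Upper bound with 60 passes, 0 failures: {bound_60:.2%}")
print(f"Minimum n for <5% at 95% confidence: {n_needed}")
```

Under those assumptions the bound for 60 passing units comes out a little under 5%, and the minimum is 59, so 60 is about as small as you can go for that claim. A single failure at a time point would invalidate the claim with this sample size.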

I once heard that the FDA expects ~30 if you're trying to fit one of the standard distributions to predict extremes. They might take less (depending on context) but are normally comfortable with 30.
