Need help wrapping my head around confidence vs beta error



So I've tried reading into this topic many times in the past and my understanding of these terms is as follows:

Reliability - percent of product we say meets spec
Confidence - percent of the time we will be correct about a given reliability
(i.e. 95% confidence/95% reliability means when accepting a lot, we want to be correct 95% of the time that 95+% of parts are acceptable).

So with that in mind, is Beta error simply 1-[confidence]? So if we're right 95% of the time, we will be wrong 5% of the time?

I couldn't really find many resources that flat-out compared beta to confidence with sampling plans, but it seems logical?

I ask because we have a procedure that discusses sample size determination, and it says that "a min 95% confidence with a max 5% beta is recommended". If my prior statement is correct, then this seems redundant? (i.e. if you have 95+% confidence, you will automatically have no more than 5% beta)

If I am wrong on anything here, please correct me!!!!

John Predmore

Trusted Information Resource
There are 3 separate concepts in your example, but you used 95% for all three. If I explain using a different percentage for each, there may be less confusion.

The terms confidence and beta error come from the universe of hypothesis testing. In a manufacturing quality context, we start with the hypothesis that the overall quality of the supplier’s entire shipment is acceptable. There is the true (unknown) state of the shipment, and there is a decision made based on an experiment or a sample. There are 4 possible outcomes:
In reality, the product is good, and decision is made to accept <- this is desired outcome
In reality, the product is bad, and decision is made to reject <- this is desired outcome
In reality, the product is good, and decision is made to reject <- this is Type I error
In reality, the product is bad, and decision is made to accept <- this is Type II error

You used the term reliability to indicate that some fraction of the shipment can be non-conforming and the batch would still be acceptable. For my example, out of a batch of 1000 widgets I want assurance that 90% are within spec, so reliability must be 90%. Based on testing the sample, I accept or reject the entire lot. Because we are counting rejects rather than evaluating the mean of a continuous variable, I use a discrete probability function rather than a Z-test or a t-test. (I used the binomial instead of the hypergeometric since the batch >> sample size.) If there are 10% bad parts in the batch of 1000, there is a 35% chance of seeing none in a sample of 10 (which I calculated using the BINOM.DIST function in Excel) and therefore a 65% cumulative probability of seeing one or more out of 10. To be 95% confident of detecting a lot of 1000 which is 10% bad, I need to see zero in a sample of 29 (to be better than 95% confident in this 1-sided test).
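The binomial figures in the paragraph above can be checked with a few lines of Python. This is only a minimal sketch, assuming a zero-acceptance plan (the lot passes only if the sample contains no nonconforming parts) and the binomial approximation mentioned above. The function names are made up for illustration and do not come from any library.

```python
import math


def prob_zero_defects(defect_rate: float, sample_size: int) -> float:
    """Binomial probability that a random sample contains no nonconforming parts."""
    return (1.0 - defect_rate) ** sample_size


def min_sample_size(defect_rate: float, confidence: float) -> int:
    """Smallest n for which a lot at `defect_rate` gives a clean sample
    with probability no greater than 1 - confidence (zero-acceptance plan)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - defect_rate))


p_clean = prob_zero_defects(0.10, 10)
print(f"P(0 bad in sample of 10)   = {p_clean:.3f}")
print(f"P(>=1 bad in sample of 10) = {1.0 - p_clean:.3f}")
print(f"n for 95% confidence at 10% bad = {min_sample_size(0.10, 0.95)}")
```

With these inputs the sketch gives roughly 0.35 and 0.65 for the sample of 10, and 29 as the smallest sample size, which matches the numbers quoted above.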

In a single shipment, there is 5% risk I reject the entire batch of parts which are actually 90% quality, due to luck of the draw in the sample. In this scenario, the producer’s risk of loss is 5%, which is also called the alpha risk, the risk of wrongly rejecting the hypothesis (that the quality of parts is better than 90%). The confidence level of the hypothesis test (as illustrated in this scenario) is 95%, which corresponds to a significance level of 5%.

For all possible quality levels, there is the possibility that a shipment with unacceptable quality might be accepted. Let’s say I want no more than a 20% chance of a Type II error, where the sample of widgets from an overall defective lot contains zero nonconforming parts (I am willing to accept a little more risk because I have automatic in-process gaging). This consumer’s risk is called Beta, where the hypothesis of 90% quality parts is wrongly accepted. The Beta risk weighs the range of possible scenarios with the defined statistical result. The complement of the beta risk (1 - beta) is called the power of a statistical test, the probability of correctly rejecting a hypothesis (when the alternate is true in reality). The power in this scenario would typically be determined using statistical tables or a computer.
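As a rough illustration of beta and power, the sketch below assumes the same zero-acceptance plan and treats a lot with 10% nonconforming parts as the "bad" case. Both are assumptions made for the example, as are the function names. It finds the sample size that keeps beta at or below the 20% mentioned above, then shows how power changes with the true defect rate.

```python
import math


def beta_risk(defect_rate: float, sample_size: int) -> float:
    """Probability of accepting the lot (zero bad parts in the sample)
    when the true defect rate is `defect_rate`."""
    return (1.0 - defect_rate) ** sample_size


def power(defect_rate: float, sample_size: int) -> float:
    """Probability of correctly rejecting the lot: the complement of beta."""
    return 1.0 - beta_risk(defect_rate, sample_size)


def sample_size_for_beta(defect_rate: float, max_beta: float) -> int:
    """Smallest zero-acceptance sample size with beta <= max_beta."""
    return math.ceil(math.log(max_beta) / math.log(1.0 - defect_rate))


n = sample_size_for_beta(0.10, 0.20)  # assumed: 10% bad lot, 20% max beta
print(f"sample size = {n}")

# Power depends on how bad the lot really is, so tabulate a few cases.
for rate in (0.05, 0.10, 0.20):
    print(f"defect rate {rate:.0%}: beta = {beta_risk(rate, n):.3f}, "
          f"power = {power(rate, n):.3f}")
```

Under these assumptions the required sample size is 16. Beta is not a single number for the plan: it has to be stated together with the defect rate it is evaluated at.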

See also Other 1-Sample Binomial | Power and Sample Size Calculators | HyLown

Bev D

Heretical Statistician
Super Moderator
We can try a slightly different approach to understanding.
There are two important parameters of any inspection sampling plan: the defect rate that is acceptable and the defect rate that is NOT acceptable (Rejectable). The first is called the Acceptable Quality Level (AQL) and the second is the Rejectable Quality Level (RQL or LTPD - Lot Tolerance Percent Defective).

AQLs have alpha risk - the probability of REJECTING a lot at the ACCEPTABLE defect rate. Here Confidence is 1-alpha (The probability of ACCEPTING the lot at the AQL defect rate)

RQLs have beta risk - the probability of ACCEPTING a lot at the RQL defect rate. Here Confidence is 1-beta (The probability of REJECTING the lot at the RQL defect rate)
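To make the two risks concrete, here is a minimal sketch that evaluates a sampling plan at both quality levels using the binomial distribution. The plan (n = 59, accept on zero) and the AQL and RQL values are hypothetical numbers picked only for illustration, and `prob_accept` is an illustrative helper, not a library function.

```python
from math import comb


def prob_accept(defect_rate: float, n: int, c: int = 0) -> float:
    """Binomial probability of finding at most c nonconforming parts
    in a sample of n, i.e. the probability the lot is accepted."""
    return sum(
        comb(n, k) * defect_rate**k * (1.0 - defect_rate) ** (n - k)
        for k in range(c + 1)
    )


# Hypothetical plan and quality levels, assumed for this example only.
n, c = 59, 0
aql, rql = 0.001, 0.05

alpha = 1.0 - prob_accept(aql, n, c)  # risk of REJECTING a lot at the AQL
beta = prob_accept(rql, n, c)         # risk of ACCEPTING a lot at the RQL

print(f"alpha at AQL {aql:.1%}: {alpha:.3f}  (confidence 1-alpha = {1 - alpha:.3f})")
print(f"beta  at RQL {rql:.1%}: {beta:.3f}  (confidence 1-beta  = {1 - beta:.3f})")
```

The same plan has both an alpha and a beta. Which one "confidence" refers to depends on whether it is stated at the AQL or at the RQL.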

The type of plan you reference is an AQL only plan (There is an RQL for it of course but it is not specified or often not calculated for the Confidence/Reliability type of plan)