The original post asked a question about two different areas: design verification and process validation. The N=1 sample size (when allowed) for design verification doesn't involve "extreme conditions" because, as noted, no standard deviation can be calculated from a single sample to establish anything like a one-sided limit (which presumably would imply "requirement satisfied"). N=1 is accepted because some binomial attribute hypothesis test is satisfied, and the binomial distribution is valid for N=1.
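To make that last point concrete, here is a sketch of the c=0 "success-run" attribute formula, which is one common binomial statement of this kind (the thread doesn't say which reliability target any given verification plan uses, so the 85% below is purely illustrative):

```python
# Success-run (c = 0 failures) attribute test: if all n units pass,
# the demonstrated confidence that reliability is at least R is
#     confidence = 1 - R**n
# This binomial statement is perfectly well-defined at n = 1 --
# it just demonstrates very little.

def success_run_confidence(n: int, reliability: float) -> float:
    """Confidence demonstrated by n passes and 0 failures."""
    return 1.0 - reliability ** n

# Illustrative reliability target of 85% (an assumption, not from the thread):
print(success_run_confidence(1, 0.85))   # N=1: only ~15% confidence
print(success_run_confidence(15, 0.85))  # N=15: ~91% confidence
```

So N=1 is a valid binomial test — it simply makes a very weak claim unless the pass/fail probability is already extreme.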
Process validation is about understanding the variation in the process of manufacturing instances of the (single) design that was verified... so without knowing what went into Taylor's N=15 (simply stated in the OP), we can only speculate(*). The same math is applicable to design verification of course, but the question was about process validation. There was this question:
Although both are considered challenge tests, why is there a difference in sample size?
With the answer here being (I believe):
Sample sizes are different because different hypotheses are being challenged in the two examples (some design verification test vs. some process validation test), so that alone could have been left as the answer.
(*) I'm now curious as to what went into Taylor's N=15 number. I'm guessing it is about some variable data, and I'm curious what sort of assumptions were baked into the result... or if it was an empirical observation!
EDIT: After some googling, I found this note (that I'll attribute to Taylor):
N=15 samples represent the fewest number of samples which should be used for a variable sampling plan for the following reason. Variable data must be tested for normality before analysis commences. 15 data points are required to perform a normality test. Therefore, at least 15 samples are required for the variable sampling plan.
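As a sketch of that workflow — test N=15 variable measurements for normality before proceeding — here is one way it might look. The source doesn't name a specific normality test, so the use of Shapiro-Wilk here is my assumption (Minitab's default is Anderson-Darling), and the data are simulated:

```python
import numpy as np
from scipy import stats

# Sketch: check N=15 variable measurements for normality before
# computing capability statistics like Pp/Ppk. Test choice
# (Shapiro-Wilk) and data are illustrative assumptions.
rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=0.5, size=15)  # simulated measurements

stat, p = stats.shapiro(data)
print(f"W = {stat:.3f}, p = {p:.3f}")
if p < 0.05:
    print("Reject normality -> consider switching to an attribute plan")
else:
    print("No evidence against normality -> proceed with variable plan")
```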
The normality requirement is important (in the source document's presentation) because some of the examples relating sample sizes to OQ and PQ acceptance criteria involve tests against Pp and Ppk (and it looks like Minitab is recommended to get the actual challenge values). It is certainly convenient if the data is normally distributed (and non-normality is often a cause for concern), but the math for hypothesis testing doesn't strictly rely on a normal distribution... the source text does appear to offer the option to switch from variable to attribute studies.
The specifics hidden elsewhere in the same source are that the target is 90% confidence / 85% reliability for an attribute sampling plan, and it is true that N=15 is the smallest N for which 0 failures demonstrates 90%/85%; below N=15 we can't hit 90%/85%. What I call Power is called Reliability in the source.
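That claim is easy to verify with the c=0 success-run formula (a standard attribute-plan computation; the only inputs are the 90%/85% targets from the source):

```python
import math

# Verify: smallest n such that n passes with 0 failures demonstrates
# 90% confidence of 85% reliability.
# Requirement: 1 - 0.85**n >= 0.90, i.e. 0.85**n <= 0.10.
confidence, reliability = 0.90, 0.85

n = 1
while 1 - reliability ** n < confidence:
    n += 1
print(n)  # 15

# Closed form: n = ceil(ln(1 - C) / ln(R))
print(math.ceil(math.log(1 - confidence) / math.log(reliability)))  # 15
```

At N=14, 0.85**14 is about 0.103, just missing the 10% threshold — hence N=15.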
Most interesting to me: there is an assumption about looking for "two or more defects" used to justify a confidence level of only 90%. A table for 95% confidence is also provided, which results in larger sample sizes; the sample sizes in the Taylor reference for 95% confidence agree with my calculations (only sample sizes for c=0 failures are shown in the reference).
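For anyone who wants to reproduce those 95% confidence numbers, the same c=0 closed form does it; the reliability targets below are the ones I checked, and the printed values can be compared against the published table:

```python
import math

def c0_sample_size(confidence: float, reliability: float) -> int:
    """Smallest n with 0 allowed failures that demonstrates the
    stated confidence/reliability (success-run formula)."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

# c = 0 sample sizes at 95% confidence for a few reliability targets
# (computed from the formula; compare against the published table).
for r in (0.80, 0.85, 0.90, 0.95):
    print(f"R = {r:.0%}: n = {c0_sample_size(0.95, r)}")
```

Note that 95%/85% pushes the c=0 sample size from 15 up to 19.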