My thoughts ...
Jerry,
Like you, virtually all of my workload is electronic test and measurement equipment. Here's what I do. It's based on experience in the Government (Department of the Navy), in a regulated industry where safety is paramount (aviation), and on experience acquired from a number of other sources.
1. I use 95% as the reliability target - the probability of a single item being in-tolerance at the end of the calibration recall period. In another industry I might go to 90% but I'm personally not comfortable with anything lower than that. Anything higher than 95% is really not cost-effective because it takes way too long to collect "enough" data.
2. I use 95% as the confidence factor - the probability of making a correct in/out tolerance decision. Some would like to see 100% here, but the only way to get that is if it never leaves the cal lab, in which case it is perfectly reliable but totally useless to the end user.
3. I use a maximum interval of 60 months (5 years). The only product I have ever seen a longer interval on is waveguide directional couplers (72 months), but about the only way they change is if they are physically damaged.
4. The minimum interval used right now is 1 month, but that is a special case. The "normal" minimum is 3 months. Some infrequently used equipment may be labeled "Calibrate before each use."
5. Adjustment (renewal) is performed if the measured error exceeds +/- 75% of the calibration limits, or at a different value if specified in the calibration procedure. If the TAR (test accuracy ratio) is less than 4:1 I may specify a different adjustment limit in the procedure.
6. The lab supervisor recently purchased a commercial interval analysis program and that is being used now. It does some fancy statistical analysis of the data. Before that, I used a spreadsheet I developed that is based on method A3 but uses the binomial distribution to estimate an interval. (I have attached a copy of the spreadsheet.)
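The binomial approach in that spreadsheet can be sketched roughly as follows. This is a minimal sketch, not the actual spreadsheet or the commercial program's logic: the function names are mine, and the exponential-decay rescaling is one common simple model for turning an observed reliability into a suggested interval, not necessarily the method either tool uses.

```python
from math import comb, log

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def meets_target(n, failures, target=0.95, confidence=0.95):
    """One-sided exact binomial test.  If this many in-tolerance results
    (or fewer) would be very unlikely at the target reliability, the data
    say the interval is too long."""
    successes = n - failures
    p_value = binom_cdf(successes, n, target)  # chance of doing this badly at 95% reliability
    return p_value > (1 - confidence)

def rescaled_interval(current_months, n, failures, target=0.95):
    """Rescale the interval with a simple exponential model R(t) = exp(-lambda*t)
    fitted to the observed end-of-period reliability.  (An illustrative
    assumption -- RP-1 describes several more sophisticated methods.)"""
    observed = (n - failures) / n
    if failures == 0:
        return current_months  # model undefined with zero failures; leave unchanged
    return current_months * log(target) / log(observed)
```

For example, 10 failures in 100 calibrations on a 12-month interval is an observed reliability of 90%, and the model suggests shortening to about 5.8 months (which would then be rounded down to 5).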
Other - fixed interval: I've never used that ... it seems somewhat foolish to me because you never have the opportunity to improve reliability by reducing an interval, or save money while maintaining the same reliability by increasing the interval.
Other - 1-2-3: I have never used this type of method but I have seen it used in many places. I will observe that RP-1 specifically states that this and other reactive methods (A1, A2 and A3 if used without statistical methods) never settle to a stable value. As you said, the interval is always either too long or too short.
Additional thoughts:
I never try to do interval analysis on a single item or even on a small number of similar items. It takes so long to get a statistically significant sample size that it often exceeds the useful lifetime of the equipment. (And that's almost exactly what RP-1 says as well.)
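The "too long for a single item" point can be put in numbers with the success-run formula: with zero failures allowed, the smallest sample that demonstrates reliability R at confidence C satisfies R**n <= 1 - C. A quick sketch (the helper is my own, not from RP-1's text):

```python
from math import ceil, log

def zero_failure_sample_size(target=0.95, confidence=0.95):
    """Smallest n such that n consecutive in-tolerance results demonstrate
    the target reliability at the stated confidence:
    target**n <= 1 - confidence."""
    return ceil(log(1 - confidence) / log(target))
```

That comes out to 59 calibrations at 95% reliability / 95% confidence - on a 12-month interval, a single item would need decades - and 299 calibrations at a 99% target, which is also why reliability targets above 95% stop being cost-effective.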
I try to group things by model where possible and have all of them on the same interval. For example, my client has about 350 Fluke 87 digital multimeters which is a large sample. However, there are also a lot of 3-1/2 digit multimeters from various manufacturers, and while there are over 100 total there are fewer than 10 from any single manufacturer. Since they are similar in type, function and accuracy all of those are treated together as a group for interval analysis.
I am very conservative in extending calibration intervals - I don't take the numbers at face value. I suggest doing no more than doubling the current interval even if the numbers say it could be more. One analysis showed that a particular model of instrument, that had been on a 12 month interval, could be extended to something like 9-1/2 years -- it was actually changed to 24 months, and will never be more than 60 months.
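That conservative policy is easy to mechanize; this is just the rule from the paragraph above expressed as code (the function name is mine):

```python
def capped_extension(current_months, suggested_months, max_months=60):
    """Never extend past double the current interval or the lab maximum,
    no matter what the analysis suggests."""
    return min(suggested_months, 2 * current_months, max_months)
```

The 9-1/2-year suggestion above (about 114 months) against a 12-month interval comes out to 24 months - exactly the change that was actually made - and no suggestion can ever push past 60.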
After changing an interval, it is best to wait for at least the time (old interval + new interval) before analyzing that group again. An axiom in RP-1 is that all items in the group must have been on the same interval, so that amount of time allows something that was calibrated the day before the change to complete the old interval and one cycle of the new interval.
For ease of calculation, all time intervals should be in the same units. That is why I always use months. It is a convenient size: years are too coarse for a lot of things, and weeks or days rapidly balloon into numbers that are hard to grasp. (How long is 1,461 days? How long is 208 weeks? Both equate to 48 months.) If a calculation produces a fractional result, it is always rounded down to an integer.
This lab's definition of "calibration" specifically excludes anything that the equipment manufacturer may call calibration but is really setup, normalization, standardization or error-correction storage that is performed by the operator before use. The definition also recognizes that a large number of "calibration" procedures in equipment manuals are really factory adjustment procedures. According to NCSL RP-3, a fundamental assumption for writing a calibration procedure is that the item to be calibrated is in good working order and ready for use in every respect -- and the purpose of calibration is to verify that status by using traceable standards. On the other hand, many manufacturer procedures are adjustment or alignment procedures intended for use after the item has been manufactured or repaired, and is therefore in an unknown state.
Increasing and decreasing calibration intervals can be justified with different amounts of data. Increasing an interval requires a large sample size in order to get statistically significant results; decreasing an interval can be justified with much less data. Take the product that is on a 1-month interval: it had been at three months, but across the three units there had been five failures in the previous 12 calibrations. Even without doing the math it is easy to see that this is way less than 95% reliability, and they would have to pass all of the next 100 calibrations to have a chance of meeting it. There was not a chance of that happening, so it was an easy decision.
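The "easy to see" arithmetic, sketched under the simplest possible model (point estimates only - requiring statistical confidence on top of this pushes the number of consecutive passes needed well past 100):

```python
# Observed reliability: 5 failures in 12 calibrations.
n, failures = 12, 5
observed = (n - failures) / n  # about 0.583 -- nowhere near 0.95

# How many consecutive passes k until the running point estimate
# (n - failures + k) / (n + k) first climbs back to 95%?
k = 0
while (n - failures + k) / (n + k) < 0.95:
    k += 1
print(observed, k)  # the five failures only "dilute out" after 88 straight passes
```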