A few thoughts:
No amount of testing (or field experience, for that matter) can remove uncertainty completely. With PROPER use of statistics, the uncertainty can be quantified. When available data is ABUNDANT, the statistical confidence may approach certainty, but it will almost never reach it, and in most cases collecting that much data is impractical anyway.
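To put rough numbers on this, here is a minimal sketch of the textbook zero-failure bound. It assumes independent, identical pass/fail trials (a binomial model) with no failures observed; the function names are mine, purely for illustration. It shows that the upper bound on the failure probability shrinks as the number of trials grows, but never hits zero.

```python
import math


def upper_bound_failure_prob(n_trials: int, confidence: float) -> float:
    """One-sided upper bound on the per-trial failure probability
    after n_trials independent trials with ZERO observed failures."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - p)^n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


def trials_needed(p_max: float, confidence: float) -> int:
    """Smallest number of failure-free trials needed to claim
    'failure probability <= p_max' at the given confidence."""
    if not 0.0 < p_max < 1.0:
        raise ValueError("p_max must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_max))


if __name__ == "__main__":
    # 300 failure-free trials at 95% confidence: bound is roughly 1%.
    print(upper_bound_failure_prob(300, 0.95))
    # Trials needed to claim p <= 1% at 95% confidence.
    print(trials_needed(0.01, 0.95))
```

Tightening the claim by a factor of ten costs roughly ten times as many failure-free trials, which is where the impracticality comes from.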
Inherently Safe Design (ISD) vs. Protective Measures (PM) -
Categorisation is meaningless in the absence of specific context. A given means might be either.
A hazard is defined as a potential source of harm. For example, a device that employs electrical potential (for convenience, think about voltage, electric current, capacitors etc.) might have a potential for electric shock. If the device is designed to achieve the same purpose without employing (or potentially harbouring) electrical potential, it won't have the potential to cause an electric shock - no matter what. That's ISD - removal of the hazard altogether.
PM don't remove the hazard; instead they interfere with the formation/unfolding of the hazardous situation and its further eventuating in harm. The hazard is still there, but the probability of it eventuating in harm is reduced. This can be done by attacking any of the steps or ingredients necessary for the actual harm to come about.
I believe that this view may be used to neatly resolve this confusion. ISD - remove the hazard altogether; PM - interfere with the path from hazard to harm. According to this view most mitigation means will reveal themselves as PM. True ISD is quite rare as a retroactive risk mitigation path; it'd more often be built into the selected design concept from the outset.
As can be seen, the answer to whether a specific means is ISD or PM depends on how the hazards and hazardous situations have been called out.
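The ISD/PM distinction can be sketched numerically. This is only an illustration under my own assumptions: the path from hazard to harm is modelled as a chain of conditional step probabilities (each step given that the previous one occurred), and all numbers and names are made up. A PM lowers one factor in the chain; ISD removes the hazard, so there is no chain left to evaluate.

```python
from math import prod
from typing import Sequence


def probability_of_harm(hazard_present: bool,
                        step_probabilities: Sequence[float]) -> float:
    """Probability of harm for one hazard.

    step_probabilities: hypothetical conditional probabilities of each step
    on the path from hazard to harm (each given the previous step occurred).
    """
    if not hazard_present:
        # ISD: the hazard has been designed out, so harm from it cannot occur.
        return 0.0
    for p in step_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each step probability must be within [0, 1]")
    return prod(step_probabilities)


# Made-up numbers, for illustration only.
baseline = [0.1, 0.5, 0.2]   # no mitigation
with_pm = [0.1, 0.05, 0.2]   # a PM attacks the second step only

if __name__ == "__main__":
    print(probability_of_harm(True, baseline))   # hazard present, no PM
    print(probability_of_harm(True, with_pm))    # hazard present, PM in place
    print(probability_of_harm(False, baseline))  # ISD: hazard removed
```

Note that the PM case stays above zero however good the PM is, while the ISD case is zero regardless of the step probabilities.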
P=1 for SW PM -
Mainstream risk analysis methodology addresses Single Fault Condition (SFC). Why? Because otherwise it's quite likely that no complex device would ever be deemed safe while still being economically viable (which means it wouldn't exist in reality). Under SFC one should not assume that both the risky element and the PM intended to mitigate the same risk fail simultaneously. So where the risky element fails, the SW PM should be assumed to work as intended (provided that it passed design verification), i.e. failure P=0. Where the SW PM fails (which may well be considered at P=1), the risky element that the SW PM was put in place to cover for should be assumed to be performing to spec. Problem solved.
If you reject the argument above, you must be willing to conduct Multiple Fault Condition analysis across the board, at least for N=2 simultaneous faults.
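To make the SFC vs. N=2 difference concrete, here is a small sketch that enumerates fault scenarios. The element names and the count of 50 elements are hypothetical, chosen only to show how the two analyses differ in scope.

```python
from itertools import combinations
from math import comb
from typing import Iterator, Sequence


def fault_scenarios(elements: Sequence[str],
                    max_faults: int) -> Iterator[frozenset]:
    """Yield every set of simultaneously failed elements,
    from 1 fault up to max_faults faults."""
    for n in range(1, max_faults + 1):
        for failed in combinations(elements, n):
            yield frozenset(failed)


def scenario_count(num_elements: int, max_faults: int) -> int:
    """Number of fault scenarios with 1..max_faults simultaneous faults."""
    return sum(comb(num_elements, n) for n in range(1, max_faults + 1))


# Hypothetical two-element system: the risky element and its SW PM.
elements = ["risky element", "SW PM"]
single = list(fault_scenarios(elements, 1))  # SFC: one fault at a time
double = list(fault_scenarios(elements, 2))  # adds "both failed"

if __name__ == "__main__":
    for scenario in double:
        working = [e for e in elements if e not in scenario]
        print("failed:", sorted(scenario), "| assumed working:", working)
    # Scope of the analysis for a hypothetical device with 50 elements.
    print(scenario_count(50, 1), scenario_count(50, 2))
```

Under SFC the "both failed" scenario never appears, which is exactly why the P=1 question resolves itself. With N=2 it does appear, and for 50 elements the scenario count goes from 50 to 1275.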