Verification of Risk Control Effectiveness - This is likely a dumb question... is it required?

QbDNerd

Registered
Okay, I was under the impression that ALL risk controls listed require effectiveness checks... however, I have run into a more senior person with an interesting perspective. They interpret the clause in 14971 as meaning that effectiveness verification is ONLY required for risk controls where you take credit for the control via a reduction in S(harm) or O(harm).

They cite this from 14971: "The second verification is required to ensure that the risk control measure (including information for safety) as implemented actually reduces the risk. In some instances, a validation study can be used for verifying the effectiveness of the risk control measure."

Thus, the interpretation is that as long as you don't take a reduction, effectiveness verification is not required. For a little more context, this is coming out of the use of information for safety (IFS). There are a lot of opinions on whether taking a reduction for IFS is allowed; the stance where I am now is generally no, unless there is evidence that supports the reduction...

So really the question is: if you don't take a reduction for information for safety, do you have to prove it is effective? If you do have to prove effectiveness, do you happen to have chapter and verse from some standard or regulatory body I can use to support this? (I don't know if I am overlooking something when I try to substantiate the requirement that effectiveness is always required.)

My next question will be: if you do have to prove IFS is effective, is an n=2 or 3 sufficient? (I ask because our HFE group wrote into the usability procedures that IFS knowledge only requires at most 5 people; I have been under the impression from reading the regulations that it would need to be 15, particularly if it is related to preventing a significant harm.)

Super curious and any advice is appreciated :)
 

yodon

Leader
Super Moderator
So really the question is: if you don't take a reduction for information for safety, do you have to prove it is effective? If you do have to prove effectiveness, do you happen to have chapter and verse from some standard or regulatory body I can use to support this? (I don't know if I am overlooking something when I try to substantiate the requirement that effectiveness is always required.)
We approach this generally through a summative study, getting feedback on whether the information is understandable and complete. This is often elicited through interview questions. We don't reduce the risk based on labeling alone. Whether this proves the control (information) is effective is probably debatable, but it has been accepted so far.

And we have a minimum of 15 study participants of each user type (per FDA guidance on HFE).
 

Chrisx

Quite Involved in Discussions
If the risk control measure was selected to maintain the level of risk, then I think you could be asked to demonstrate that the risk control did that. In other words, the level of risk did not get worse after implementing the risk control.
 

Tidge

Trusted Information Resource
My answer to the first question is that I agree with "if there is no claim of an improvement, then strictly speaking(*1) there is no need to verify the effectiveness of a risk control." I do believe that there should be some documentation of why this was even listed as a risk control, and possibly why there was no attempt at VoE(*2).

(*1) My tongue is slightly in my cheek; as the "improvement" is likely a lower ordinal number on some qualitative scale anyway.

(*2) Not many people do things the way I am about to describe, but here goes. IF this "risk line" is in some document completely subordinate to a Hazard Analysis, it is very possible that the risk in the Hazard Analysis that the line traces to is actually made completely acceptable by some other control (which is demonstrated to be effective). In such a case it doesn't make sense to try to implement/verify risk controls all over the place, as it would be a lot of wasted effort.


As to the second part, for HFE the minimum sample size for user validation is 15 (for each user class).

As for N=2 or 3: this is just a guess on my part, but the Faulkner (2003) paper referenced in the FDA HFE guidance mentions an older study (Nielsen, 1993) which found that "only 3 users would reveal most of the problems". The N=5 was the recommendation of that older study.
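
(For anyone curious where those small-N claims come from, the usability literature often uses a simple discovery model: the probability that at least one of n users encounters a given problem, when each user independently encounters it with probability p. The p = 0.31 below is the commonly quoted average from Nielsen's data and is an assumption for illustration only; it is not anything in the FDA guidance. A quick sketch:)

```python
# Classic problem-discovery model often credited to Nielsen: the chance that
# at least one of n users encounters a given usability problem, assuming each
# user independently encounters it with probability p.
# p = 0.31 is the commonly quoted average from Nielsen's data (an assumption
# here, not a figure from the FDA guidance or from Faulkner).
def discovery_rate(n: int, p: float = 0.31) -> float:
    return 1 - (1 - p) ** n

for n in (3, 5, 15):
    print(f"n={n:2d}: expected fraction of problems seen ~ {discovery_rate(n):.0%}")
# Roughly: n=3 -> ~67%, n=5 -> ~84%, n=15 -> ~99.6%
```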
 

Tidge

Trusted Information Resource
As an aside: I was curious about the N=15 number. My mind goes immediately to a place occupied by hypothesis testing of attribute data with some confidence and tolerance values... and N=15 is a sort of weird sample size in that space. My quick review of the Faulkner paper is that N=15 is an (essentially) empirically derived sample size that FDA accepted.
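
(For context on why N=15 looks "weird" from the attribute-testing side, here is the standard success-run arithmetic. The 95% confidence level is just my choice for illustration and has nothing to do with how Faulkner derived N=15.)

```python
# Success-run theorem: with n passes and zero failures, the demonstrated
# reliability at confidence C is R = (1 - C)**(1/n). Standard formula; the
# 95% confidence below is my assumption for illustration.
def demonstrated_reliability(n: int, confidence: float = 0.95) -> float:
    return (1 - confidence) ** (1 / n)

for n in (15, 29, 59):
    print(f"n={n}: ~{demonstrated_reliability(n):.1%} reliability at 95% confidence")
# n=15 -> ~81.9%, n=29 -> ~90.2%, n=59 -> ~95.1%
# i.e. N=15 doesn't line up with the usual 95/90 or 95/95 attribute sample
# sizes, which is why it looks odd from that side of the fence.
```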

Somewhat interesting to me is that my reading of the Faulkner paper is that N=15 is to be interpreted as the smallest sample size that guarantees a minimum of 90% of (known!) usability "problems" will be uncovered. This is (in that paper) compared to N=5, which found a minimum of 55% of (again, known) usability "problems". When performing user validations, I've never seen the validation constructed such that users are looking for known problems... rather, the users are given a task to perform and their performance is observed/tested/evaluated.

I apologize if what follows is bluntly obvious...

We've always accepted feedback, and occasionally usability defects are exposed... but it doesn't always feel to me like the user validation is constructed to expose usability "problems"; rather, it's just been an assessment of "correct task completion" with a side effect of "if you test with N=15 users, you can be relatively certain that within the group they ought to have discovered at least 90% of any usability problems."

I think I can see (from Table 2 in the Faulkner paper) why the original proponents of the N=5 would be tempted to "stick to their guns"... and also why the N=15 would end up as the FDA recommendation. At some level, I have to believe that in an N=5 sample (Faulkner says minimum defects found = 55%, mean defects found = 85.6% ± 9.2%), as long as a validation team is considering all the defects found by all participants, there is a high probability that "hella many" (my words, not anyone else's!) defects have been found and can be addressed. With N=15, Faulkner reports a minimum found of 90% with a "mean" found of 97.1% ± 2.1%... this is the reported value of N that "includes" 99% of (known) problems found.

As a practical matter, it shouldn't matter which of the N participants found a problem that needs to be fixed; it appears the larger sample size is just increasing the chances of more unique problems being found by someone. Another practical consideration is that when it comes to unknown usability issues... my experience has been that it is rare that two users will report the same issue the same way, so we've always had to triage/pareto the usability issues. This can introduce some bias into the analysis.
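
If anyone wants to play with the numbers, below is a crude simulation of the same idea. It is not a reproduction of Faulkner's method (she resampled groups drawn from real test records); I'm simply assuming each known problem is independently detected by each user with some probability, then looking at the average group and the worst group you might draw.

```python
# Crude Monte Carlo sketch of why larger groups protect against an unlucky
# draw: simulate many groups of n users, where each user independently
# detects each known problem with probability p, and report the mean and the
# worst-case fraction of problems found across groups.
# p = 0.31 and 20 "known" problems are assumptions for illustration only;
# Faulkner's actual analysis resampled real usability-test data.
import random

def simulate(n_users: int, n_problems: int = 20, p: float = 0.31,
             n_groups: int = 10_000, seed: int = 1) -> tuple[float, float]:
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_groups):
        found = sum(
            any(rng.random() < p for _ in range(n_users))
            for _ in range(n_problems)
        )
        fractions.append(found / n_problems)
    return sum(fractions) / n_groups, min(fractions)

for n in (5, 15):
    mean, worst = simulate(n)
    print(f"n={n:2d}: mean fraction found {mean:.0%}, worst group found {worst:.0%}")
```

The qualitative pattern matches the Faulkner numbers quoted above: the mean barely moves going from 5 to 15 users, but the floor (the unlucky group) improves a lot.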

TL;DR: The FDA wants N=15, give them N=15.
 

Bev D

Heretical Statistician
Leader
Super Moderator
When it comes to real defects that can be reliably detected (through a measurement system that has passed a robust MSA), I am a proponent of small sample sizes built at the extremes of the allowable specification limits and, for functional failures, of testing those few units under the worst-case conditions. This makes scientific (and statistical) sense. Empirically it is very powerful.

Usability is similar to a functional failure in that it is contingent on varying conditions of the user's abilities (arthritis, low strength, figure size, eyesight…) and on variation of the thing being tested (tightness and access gaps, size…). So this isn't a matter of statistics; it's a matter of science and of understanding the functions to be performed, the variation of the parts themselves, and the types of conditions the users will have… that is what should drive your sample size. Mathematical calculation is no substitute for thinking, no matter how fancy your math is…
 

d_addams

Involved In Discussions
Okay, I was under the impression that ALL risk controls listed require effectiveness checks... however, I have run into a more senior person with an interesting perspective. They interpret the clause in 14971 as meaning that effectiveness verification is ONLY required for risk controls where you take credit for the control via a reduction in S(harm) or O(harm).

They cite this from 14971: "The second verification is required to ensure that the risk control measure (including information for safety) as implemented actually reduces the risk. In some instances, a validation study can be used for verifying the effectiveness of the risk control measure."

Thus, the interpretation is that as long as you don't take a reduction, effectiveness verification is not required. For a little more context, this is coming out of the use of information for safety (IFS). There are a lot of opinions on whether taking a reduction for IFS is allowed; the stance where I am now is generally no, unless there is evidence that supports the reduction...

So really the question is: if you don't take a reduction for information for safety, do you have to prove it is effective? If you do have to prove effectiveness, do you happen to have chapter and verse from some standard or regulatory body I can use to support this? (I don't know if I am overlooking something when I try to substantiate the requirement that effectiveness is always required.)

My next question will be: if you do have to prove IFS is effective, is an n=2 or 3 sufficient? (I ask because our HFE group wrote into the usability procedures that IFS knowledge only requires at most 5 people; I have been under the impression from reading the regulations that it would need to be 15, particularly if it is related to preventing a significant harm.)

Super curious and any advice is appreciated :)
I would say this is an incorrect attempt at a literal interpretation of the standard. How I would interpret 'actually reduces the risk' is that the control actually prevents the hazard from occurring. For instance, a control for the hazard of 'tensile failure when exposed to in-vivo load' is to specify a tensile strength for the component. Your verification of effectiveness would be to test the part at the noted load and verify that it does not fail. This demonstrates effectiveness: parts meeting the control conditions actually reduce the risk of the hazard. The standard does not require you to come up with an explicit reduction in predicted failure rate (sure, you could do that, but you're fooling yourself if you think that is valuable or that you have that precise a measurement of the occurrence of your failure modes and harms).

If that control is state of the art or is not unusually burdensome, how does one claim the risk is as low as possible without it? If the control has no value, it does not need to be listed; but since you've identified it, if it is not implemented, how can you claim the risk is as low as possible? So if you list a control, you need evidence that it is effective at preventing the failure. You aren't on the hook to demonstrate the quantitative reduction in risk, only that the control prevents the failure mode. If you have a rare case where you do have validated reliability prediction models, go ahead and get specific, but for the 99% of other failure modes you won't have that.
 

EmiliaBedelia

Quite Involved in Discussions
I think your colleague is incorrectly conflating the concept of "risk reduction" with the mathematics of "reducing" the pre-RCM numerical risk calculation after risk controls are applied. Even if you don't quantitatively reduce the post-RCM risk, you still have to show that the residual risk is as low as possible.
I also think maybe there is some confusion about "validation" vs "verification" - it IS acceptable to use "validation" studies as "verification of effectiveness". You have to have evidence for each risk to verify it was reduced, but it doesn't have to be a "Verification Test Report".


If that control is state of the art or is not unusually burdensome, how does one claim the risk is as low as possible without it? If the control has no value, it does not need to be listed; but since you've identified it, if it is not implemented, how can you claim the risk is as low as possible? So if you list a control, you need evidence that it is effective at preventing the failure.
This is the best practical answer.

Under MDR, all risks must be reduced as far as possible. If you do not have VoE, you cannot demonstrate that the risk control was applied or effective. And if you have not implemented an effective risk control, then you have not reduced the risk. Without having any verification at all, how can you verify that the residual risk is, in fact, acceptable?
 

Tidge

Trusted Information Resource
If the control has no value, it does not need to be listed; but since you've identified it, if it is not implemented, how can you claim the risk is as low as possible?

There are a few things tied up in this bit that I'd like to comment on.

It is not uncommon that something like a manufacturing process will include steps that don't do anything (that can be measured) to control (or reduce) product risk, but which experts insist do something(*1), so they want them documented... often because the experts don't want those steps removed from the process. There is also a possibility that any included risk control could introduce new risks, so listing "risk controls" for which there is no measurable change in a quantitative rating is a way to stimulate discussion and analysis about any potential new risks(*2).

(*1) This could be something like reducing waste of consumables, keeping the shop floor easier to clean, or segregating scrap materials. There are other examples I can think of, but I see them all on a spectrum of "how important is it to tie this to the final risk profile of the product?"

(*2) At some point the iterative risk analysis stops. If a line of analysis claims "no risk reduction" and "introduces no new risks", that seems like a convenient way to end that line, at least to me.


On the subject of "not implementing": I was under the impression that the risk controls were implemented, but that there was no evidence that they were effective, with either zero evidence or low-power evidence. I've worked with a "quantitative scale" based on "powers of ten"; for low-volume production it can be difficult to claim orders-of-magnitude improvement because of some implemented risk controls.
 

d_addams

Involved In Discussions
There are a few things tied up in this bit that I'd like to comment on.

It is not uncommon that something like a manufacturing process will include steps that don't do anything (that can be measured) to control (or reduce) product risk, but which experts insist do something(*1), so they want them documented... often because the experts don't want those steps removed from the process. There is also a possibility that any included risk control could introduce new risks, so listing "risk controls" for which there is no measurable change in a quantitative rating is a way to stimulate discussion and analysis about any potential new risks(*2).

(*1) This could be something like reducing waste of consumables, keeping the shop floor easier to clean, or segregating scrap materials. There are other examples I can think of, but I see them all on a spectrum of "how important is it to tie this to the final risk profile of the product?"

(*2) At some point the iterative risk analysis stops. If a line of analysis claims "no risk reduction" and "introduces no new risks", that seems like a convenient way to end that line, at least to me.

On the subject of "not implementing": I was under the impression that the risk controls were implemented, but that there was no evidence that they were effective, with either zero evidence or low-power evidence. I've worked with a "quantitative scale" based on "powers of ten"; for low-volume production it can be difficult to claim orders-of-magnitude improvement because of some implemented risk controls.
I agree there can be value in collecting those 'no effect' items, since they can be used for the overall robustness of the operation, but recently the FDA 'invited' us to stop putting 'no effect' or 'no hazard' lines in our FMEAs. So I'm a little sensitized to teams putting 'no effect' as the outcome/hazard.
 