Risk Analysis Flow - Confusion between ISO 14971 and IEC 62304

Mr Yan

Registered
Hello all,

I'm having a little bit of trouble reconciling the flow between ISO 14971 and IEC 62304. My current approach is as follows:

1) Perform hazard analysis per 14971 - this entails 5 classes of probability (improbable to frequent) and 5 severities (Negligible to Critical).
2) Identify risk mitigations in the hazard analysis. Some of these are software tasks, some are software that is mitigating a hardware issue.
3) Take these upper level mitigations and pass them to the Software Risk Analysis (this is where we jump from 14971 to 62304)
4) In the SRA:
a) accept all of the SW risk mitigations generated by the hazard analysis into a risk table
b) identify any other risks (specifications, SOUP, user, etc. per 62304) not covered by the hazard analysis and add them to this risk table
c) create the pre- and post-mitigation severity table. This has a % likelihood and expects 3 severities (A, B, C).

Now comes the disconnect example. Let's say we have a 14971 based hazard analysis with a hazard having a prob/severity of Occasional / Major. It lists as a mitigation a software activity that reduces that prob/severity to Improbable / Major. Great, we reduced the risk! But....

In the SRA, the software task must be shown, and per 62304 you have to assume its likelihood of occurrence is 100%. But the 14971 hazard analysis does not have that constraint, or the hazard would have to be listed as Frequent/Major to start with. If I set this software item to 100% likelihood, then there seems to be a break in this flow: you can't have a pre-mitigation probability of "Occasional" in 14971-speak if 62304 requires the software risk to be 100% likely to happen pre-mitigation. That would force you to flow the higher probability back into the 14971 hazard analysis.

So, the three questions I have are:
1) Is the flow shown above reasonable? (14971 hazard to 14971 risk mitigation identification to 62304 software risk analysis to 62304 software requirements). If not, can you suggest something different?
2) How do you reconcile the mismatch in risk probability (likelihood) when you make this leap from 14971 to 62304?
3) 14971 uses a range of 5 levels of severity. 62304 is supposed to use 3 (A, B, C) for risk severity. How do you reconcile this if you have evaluated it using 14971 but are now in the 62304 risk analysis?

Thank you for your time!
 

yodon

Leader
Super Moderator
I think you have a couple of things confused here.

First, neither 14971 nor 62304 defines levels of severity for your risk scoring. 14971 gives an example using 1..5, but it's not mandated (although I generally agree it's a reasonable approach). 62304 defines software safety classes (A, B, C), but that's just the level of management required, not a risk scoring mechanism.

Second, a software control defined in the system hazard analysis just drives a software requirement and only reduces the particular risk in the hazard analysis. For the hazard analysis, assume the software control has the intended effect.

What we do for software risk is then to take the various functions the software can perform and do an FMEA: consider all the ways the software can fail to perform each function. Note that 62304 lists some causes for consideration, e.g., incorrect or incomplete specification of functionality. There, you can assume the software WILL fail (pre-controls), so you can rank probability at 100% (or level 5 if you use such a scale). Also note that if your software provides basic safety or supports essential performance, there are additional cause considerations under 60601-1. Any new controls to be implemented in software drive new software requirements.
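As a minimal sketch of the bookkeeping this implies, here is one way to record a software FMEA row where the pre-control failure probability is pinned at 100% and every new control drives a new software requirement. All class, field, and requirement names are illustrative, not from either standard:

```python
from dataclasses import dataclass, field

@dataclass
class SoftwareFmeaRow:
    function: str                          # software function under analysis
    failure_mode: str                      # one way the function can fail
    cause: str                             # e.g. incomplete specification
    pre_control_probability: float = 1.0   # assume the software WILL fail
    controls: list = field(default_factory=list)
    derived_requirements: list = field(default_factory=list)

    def add_control(self, control: str, requirement: str) -> None:
        """Each new software control drives a new software requirement."""
        self.controls.append(control)
        self.derived_requirements.append(requirement)

row = SoftwareFmeaRow(
    function="Compute dose from user-entered weight/age",
    failure_mode="Uses stale weight value after re-entry",
    cause="Incomplete specification of input-update handling",
)
row.add_control(
    "Recompute dose on every input change",
    "SRS-101: The dose calculation shall re-run whenever any input changes.",
)
```

The point of the fixed 1.0 default is that no estimate of software failure probability is attempted pre-controls; only the controls and their traced requirements change between rows.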

Does that help?
 

Mr Yan

Registered
Thanks Yodon. Some comments/questions:
  1. Levels - agreed on the arbitrariness of the levels in 14971. And yes, the A, B, C was grabbed from a different source; 62304 does not call that out. That was my mistake.
  2. Flow - I've seen this walked through by a few different software consultants, and there seem to be two different approaches. The one I was taking was that the hazard analysis informs the SRA, which informs the SRS. You seem to be proposing the other method that has been described to me: the hazard analysis generates SRS requirements, which are then analyzed by the SRA and may feed back to become more SRS requirements. That seems a little more convoluted, but what I'm looking for is the most accepted method.

Here is an example perhaps that might help you point out where I am wrong...

(Making this up)...Let's say a device can administer vitamins. The user enters weight and age via buttons for different groups. i.e. Weight buttons for 0 to 50 lbs, 50 to 250 lbs. Age buttons for 2 to 20, 20 to 50.

If the user accidentally hits the '0 to 50 lbs' button and the '20 to 50 yrs' button, it is an error that can be caught by logic later on.

The sequence I would use for this might look like this:

14971 Hazard Analysis row in a table would contain:
Hazard: User enters incorrect data
Sequence: User confused and hits wrong button on GUI
Hazardous Situation: Inaccurate dosage output
Severity: 3
Probability of Harm (unmitigated): 3
RPN (Severity*Probability): 9
Classification: Medium
Risk Reduction Type: Protective
Risk Reduction: "System shall not accept a combination of user inputs that are in conflict with known combinations."
New Hazards created: None
Probability of Harm (Mitigated): 1
RPN (Severity*Probability) (Mitigated): 3
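The arithmetic in this row can be sketched as follows. The classification thresholds below are made up for illustration; yours would come from your risk management SOP, not from ISO 14971 itself:

```python
def rpn(severity: int, probability: int) -> int:
    """RPN = Severity x Probability, each on a 1..5 scale."""
    return severity * probability

def classify(rpn_value: int) -> str:
    """Map an RPN to a risk classification (thresholds are hypothetical)."""
    if rpn_value >= 15:
        return "High"
    if rpn_value >= 6:
        return "Medium"
    return "Low"

unmitigated = rpn(severity=3, probability=3)   # 9  -> Medium
mitigated = rpn(severity=3, probability=1)     # 3  -> Low
```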


In the 62304 SRA this would then translate to a row in a table in FMEA format:

Failure Mode: User Data Inaccurate
Failure Cause: Wrong button choice by user on GUI
Effect of Failure: Inaccurate dosage output
Severity: Moderate
Likelihood (unmitigated): 100%
Risk Control Measures: "System shall not accept a combination of user inputs that are in conflict with known combinations."
Likelihood (mitigated): 0%

In the traceability matrix of the SRA, this risk would be tied to an SRS requirement that might read:
"The GUI shall reject entries that have weights of 0 to 50 lbs with age groups of 20 to 50 yrs".

And of course that requirement would be tested and verified.
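A sketch of how that SRS requirement might be implemented and verified, assuming a simple lookup of known-conflicting input combinations (the button labels and the conflict table are illustrative):

```python
# Combinations of (weight group, age group) known to be in conflict,
# per the example requirement. Real devices would derive this table
# from the dosing specification.
CONFLICTING_COMBINATIONS = {
    ("0-50 lbs", "20-50 yrs"),
}

def accept_inputs(weight_group: str, age_group: str) -> bool:
    """Reject any user-input combination known to be in conflict."""
    return (weight_group, age_group) not in CONFLICTING_COMBINATIONS

# Verification of the requirement:
assert accept_inputs("50-250 lbs", "20-50 yrs") is True
assert accept_inputs("0-50 lbs", "20-50 yrs") is False
```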

Can you poke holes in that for me please?

Thanks!
 

ECHO

Involved In Discussions
I have only been in the industry for 5 years so I am not an expert in this matter but the example you gave is a bit different than what I have seen at different companies.

First, when calculating RPN, I have always added "detectability"; in other words, RPN = Severity x Occurrence x Detectability. If RPN only used severity and occurrence, then a frequent, negligible event would be in the same category as an improbable, catastrophic event. When you only have the two variables to work with, you can use the table below. I believe a similar table is in ISO 14971 (I don't have a copy with me).

[Attached image: a severity x occurrence risk matrix assigning high/medium/low risk levels]

If you use the table above, you could assign an IEC 62304 safety class using the same table (i.e. high = class C, low = class A). If you decide to break your system down into modules, you would always use the highest class.
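A minimal sketch of that mapping, assuming a 1..5 severity x occurrence matrix whose regions (high/medium/low) then map onto a 62304 safety class; the region boundaries are invented for illustration and would come from your risk SOP:

```python
# Illustrative matrix regions over (severity 1..5, occurrence 1..5).
RISK_MATRIX = {
    "high":   lambda s, o: s * o >= 15,
    "medium": lambda s, o: 6 <= s * o < 15,
    "low":    lambda s, o: s * o < 6,
}

# ECHO's suggested mapping: high -> class C, low -> class A.
SAFETY_CLASS = {"high": "C", "medium": "B", "low": "A"}

def risk_level(severity: int, occurrence: int) -> str:
    for level, in_region in RISK_MATRIX.items():
        if in_region(severity, occurrence):
            return level

def system_safety_class(items):
    """If the system is broken into modules, the highest class wins."""
    return max(SAFETY_CLASS[risk_level(s, o)] for s, o in items)
```

Because the classes are the single letters A < B < C, taking the lexicographic `max` over the modules implements the "always use the highest" rule.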

Once you have determined the safety classification, the required deliverables change. In IEC 62304, each bullet states which document is required for which classification. I don't have a copy of IEC 62304 with me, but as an example, unit testing is required for classifications B and C.

So if we go back to your example, we would just keep one FMEA. For each failure mode you can list the product-level or software-level requirement as the risk control. The traceability matrix will then make sure all the related requirements are tested during verification.

Your SDLC SOP would define which documents are required during design review, and during the planning phase of your project you would create a document determining which classification you fall under. Or, if you are like most companies, do the documentation after the coding is done :)

I hope this helps.
 

mihzago

Trusted Information Resource
keep in mind that the probability of a software failure or a bug occurring is not the same as the probability of the hazardous situation causing harm.
So, even if you set the probability of a software defect to 100%, this does not automatically translate to certain harm. Write out your hazard scenarios as a sequence of events, where a software defect is the initiating event or one of the steps in the sequence. Consider separately the probability of each of these steps occurring to calculate the overall probability.
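The sequence-of-events idea above can be sketched numerically: the probability of harm is the product of every step in the chain, so a defect probability of 1.0 still gives a small probability of harm when the downstream steps (P1: defect leads to hazardous situation; P2: hazardous situation leads to harm) are unlikely. The step probabilities here are made up for illustration:

```python
def probability_of_harm(step_probabilities):
    """Multiply the probabilities of each event in the hazard sequence."""
    p = 1.0
    for step in step_probabilities:
        p *= step
    return p

sequence = [
    1.0,    # software defect occurs (assumed certain, per 62304 reasoning)
    0.1,    # defect leads to a hazardous situation (P1)
    0.05,   # hazardous situation leads to harm (P2)
]
overall = probability_of_harm(sequence)   # ~0.005: far from certain harm
```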

FMEA is not the best tool for risk assessment, but it's a good tool to identify the sequence of events or initiating events. So if you've already done the work building the FMEA, I would recommend extracting from it those situations (e.g. software bugs) that may lead to hazardous situations resulting in harm. More than half of the failures identified in the FMEA will not result in any hazardous situation.

Oh, and if someone tells you to consider 'detectability', run.
 

ECHO

Involved In Discussions
keep in mind that the probability of a software failure or a bug occurring is not the same as the probability of the hazardous situation causing harm.
So, even if you set the probability of a software defect to 100%, this does not automatically translate to certain harm. Write out your hazard scenarios as a sequence of events, where a software defect is the initiating event or one of the steps in the sequence. Consider separately the probability of each of these steps occurring to calculate the overall probability.

mihzago is correct. I forgot to mention that. 14971 talks about P1 and P2 values.

Oh, and if someone tells you to consider 'detectability', run.

:) I wasn't suggesting that people use detectability. Most of my work has been taking a system that uses detectability and getting rid of it.
 

mihzago

Trusted Information Resource
@ECHO the comment about detectability wasn't against you. Sorry if it sounded that way. I didn't even see your post after I submitted mine.

Detectability is part of the traditional FMEA, but it only makes sense in situations where you can actually detect the failure, e.g. in-process testing to identify component failures, or in a process FMEA where you perform in-process or final visual inspection or testing, etc.
Determining probability is already difficult; adding detectability to the mix makes it even worse and does not add much value.
 

sagai

Quite Involved in Discussions
From time to time in the medical domain I come across a rather fluid way of doing, interpreting, and applying risk management techniques.

Let me put these here, also for argumentation sake.

Hazard Analysis, also known as HARA (Hazard Analysis and Risk Assessment), is not FMEA for me.
It looks like FMEA, but it really has nothing to do with it.

Why ... or why do I think so ...

TIMING
HARA/RA occurs in the early phase of design, when we have little or only vague information about the proposed architecture of the device, regardless of whether it is the oddly persistent "software-only" concept of the medical device industry or an embedded system.
(I know ... "we have a rock-solid architecture" ... that only tells me you are about to do retrospective paperwork ...)
HARA/RA is actually an abstract analysis, largely independent of the actual implementation, although, yes, it considers preliminary design ideas.
I would even volunteer to say that HARA/RA considers that intended, normal behavior could lead to harm in certain hazardous situations.

CONTEXT
FMEA looks at the effect of a failure in the context of the next architectural component.
HARA looks at the effect of a failure in the context of the PATIENT. We do not have an architecture at the time we do HARA/RA.

TIME VIEW
FMEA does not identify the sequence of events.
This is the whole point, and the limitation, of FMEA: it identifies a single error's effect on the next architectural element.
There are 'clones' of it that go a bit further, like D/PFMEA, with the introduction of previous/focus/next element analysis.

Event Tree Analysis is the technique that looks forward FROM the point of failure to see how the error can propagate.
FMEA looks BACK from the failure effect.

CLASSES for severity/probability
When we are in the early phase of development, I am rather skeptical that we can say anything more subtle than: yes, it could affect patient safety, and yes, it is probable, or actually it can never happen.
Also, consider that the qualitative analysis that HARA/RA should be is rather subjective outside the extremes mentioned above, and ... actually culture-dependent.
A team in the EU, a team in the US, a team in Africa or another part of the world would almost certainly come up with different class assignments.

All of this goes back to the "yes it hurts, yes it is probable or impossible" point I made before.

RISK MATRIX as an excuse to accept risk upfront
I would simply forget this concept; "someone can be hurt one time in a million" does not make me accept a risk if the severity is anything other than none.

THAT CLASS C worry of 62304
It does not ask for anything beyond what you would do anyway to have a mature, sustainable source code.
Nothing special, nothing extreme: good software engineering practice, that's all.
Having spent a bit of time in other safety-critical industries, I must say medical device software development is the LEAST regulated compared to the others.


Okay, I'll stop here; this post is already a bit long.

Curious about your thoughts, though.

Regards
Saby
 

akp060

Involved In Discussions


CONCEPTS: I don't think there is any jump from 14971 to 62304. The two have been glued together by "risk acceptability" and "risk management". You are just walking along; no jumps! 62304 is the starting line for the software system, and 14971 is where you have to end up.

SPECIFICS: (Assuming your case is SiMD.) Coming to the example: the "risk number" you assign when a software activity is a risk control measure for a "hardware-initiated hazardous situation" will go through a different evaluation process than when the same software activity is a risk control measure for a "software-initiated hazardous situation", or when you are doing risk analysis of a hazardous situation initiated by the software activity itself. (Note I am referring to the same software activity as in your example in all three cases.) So for the second and third cases, you just take the 100% probability, fill in the table, take the risk acceptability from the risk SOP, and come up with a risk number. It is the "risk acceptability" number defined in your SOP (usually written taking 14971 into account) that matters in the end. Also, your SRA and the hazard analysis for the device should be two separate documents. Don't combine them into one, or this "leap" confusion will persist. Answers:

1. The flow is reasonable; just clear up the perception and implementation, because while implementing the SRA you may find hazards that are not in the 14971 part.
2. The risk probability for the SRA is different from the probability for the hardware one. Here you follow what 62304 dictates for the SRA. No "leaps", only switching over.
3. Again, 14971 only provides an example; you are free to define your own severity levels. And that is not the same as A, B, and C in 62304: the latter is a safety class, just like the one you would assign to your overall device. You do not reconcile; you evaluate separately.
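A sketch of the "evaluate separately" point above: the same control measure is scored differently depending on whether the initiating cause is hardware (estimated probability) or software (probability fixed at 100%, here level 5 on a 1..5 scale). The acceptability table and thresholds are made-up stand-ins for your Risk SOP:

```python
# Hypothetical acceptability outcomes defined in the Risk SOP.
ACCEPTABILITY = {
    "low": "acceptable",
    "medium": "ALARP review",
    "high": "unacceptable",
}

def risk_number(severity: int, probability: int) -> int:
    return severity * probability

def evaluate(severity: int, probability: int, software_initiated: bool) -> str:
    """Score a hazardous situation; software causes are assumed certain."""
    if software_initiated:
        probability = 5   # 100% likelihood on a 1..5 scale
    rn = risk_number(severity, probability)
    level = "high" if rn >= 15 else "medium" if rn >= 6 else "low"
    return ACCEPTABILITY[level]

hardware_case = evaluate(severity=3, probability=2, software_initiated=False)
software_case = evaluate(severity=3, probability=2, software_initiated=True)
```

The same severity-3 situation lands in different acceptability buckets purely because the software-initiated path discards the estimated probability, which is exactly why the SRA and the device hazard analysis are kept as separate evaluations.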
 