Software Risk Estimation: Probability of Medical Device Software Anomaly Occurring

SteveZed

#1
Hi,

Our company makes medical devices following ISO 14971 risk management. We use a qualitative system with tables similar to those found in Annex D (Section D.3.4.1).

Upcoming devices will contain an increased amount of software so we're trying to improve our risk management surrounding software. To that end, I've re-read 14971 and also the IEC TIR80002-1:2009.

In the TIR, the point is made (in Section 4.4.3, among other places) that software anomalies are systematic, so it is hard to estimate their probability of occurrence. Section 4.4.3 states:
When software is present in a sequence of events leading to a HAZARDOUS SITUATION, the
probability of the software failure occurring cannot be considered in estimating the RISK for the
HAZARDOUS SITUATION. In such cases, considering a worse case probability is appropriate, and
the probability for the software failure occurring should be set to 1. When it is possible to
estimate the probability for the remaining events in the sequence (as it may be if they are not
software) that probability may be used for the probability of the HAZARDOUS SITUATION occurring
(P1 in Figure 1). If this is not possible, the probability of the HAZARDOUS SITUATION occurring
should be set to 1.
If I understand this correctly, it says that we may continue to use our qualitative risk analysis even for hazardous situations that have software failure as an item in the sequence of events. When there are items in the sequence that are not software bugs, we'll get a useful number and ideas on what mitigations may bring this down. However, if I can't usefully estimate probability of a software bug beforehand, I'm not sure how a software control can be documented to be less hazardous after mitigation.

At the bottom of the same page (21) containing the above quote is the following:
In many cases, estimating the probability of occurrence of HARM may not be possible, and the
RISK should be evaluated on the basis of the SEVERITY of the HARM alone. RISK ESTIMATION in
these cases should be focused on the SEVERITY of the HARM resulting from the HAZARDOUS
SITUATION.
Here I'm not sure how to interpret the guidance to "focus on the SEVERITY of the HARM". Is the idea to create a separate evaluation process for systemic hazards like software anomalies? The commentary in Section 5 suggests this is the intent:
As described in 4.4.3, it is difficult to estimate the probability of software failures. When this results
in the inability to estimate the probability of HARM then RISK should be evaluated on the basis of
the SEVERITY of the HARM alone.
However, the text in 4.4.3 then goes on to say
Although it may not be possible to estimate the probability of the occurrence of a software
failure, it is obvious that many RISK CONTROL measures reduce the probability that such a failure
would lead to a HAZARDOUS SITUATION.
Then an example of detecting memory corruption using a checksum is presented, ending with
Although the probability of a HAZARDOUS SITUATION cannot be estimated
either before or after the checksum is implemented, it can be asserted that the probability of a
HAZARDOUS SITUATION after the checksum is in place is lower than it was before implementing the
checksum.
Taken together, my first impression on reading the TIR is that the idea is, indeed, to have two methods of evaluating hazards: one that takes both SEVERITY and PROBABILITY levels (as we do currently) and a completely separate one that takes into account SEVERITY only for systemic risks. On a certain level, this all makes some sense. What I'm lacking is some idea of the mechanics. Does anyone have experience along these lines to share?

After writing up this question, now I'm questioning my first reading. :) Specifically, the quote
it is obvious that many RISK CONTROL measures reduce the probability that such a failure
would lead to a HAZARDOUS SITUATION
suggests that one should somehow record this reduction in probability. Is it really just saying that we should use qualitative probability measures rather than trying to estimate quantitatively? If so, that suggests we could continue using our current qualitative system.

Any thoughts?

Thanks,
-Steve
 
pldey42

#3
I'm not a medical device expert but I do know software and information security.

Yes, estimating software reliability is hard. This paper from Carnegie Mellon University gives a good idea as to why this is:

http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/

In information security we have a similar problem. Some events are either very unlikely or have a probability that is hard to estimate, yet can have an enormous impact if they happen. Some information security risk professionals use risk assessment methods that weight impact more heavily than probability for this reason. One simple way to do this is, as the report you quote suggests, to give the probability of a software error the value of 1.

I see nothing wrong with that. Risk assessment matrices are estimates at the end of the day, probably inaccurate, and their purpose is to help managers prioritize risk reduction efforts. It's the relative values of the risk indicators that matter, not their absolute values.

(That said, if one of the objectives of risk assessment is to estimate field failure rates, the presence of software is going to make that impossible, I fear.)

If the purpose of risk assessment is to compare alternative design solutions which incorporate different software options, one idea might be to adapt an idea from the CMU paper:

Identify the factors that impact software reliability, e.g. size, complexity, number of concurrent activities, programming language, use/non use of defensive programming techniques, development process maturity, etc, and for each software option, assign numbers to these factors and mulch them together, FMEA style, to give some sense of the relative reliability of each solution.

Just an idea.
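
To make the factor-scoring idea above a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the factor names are taken from the list above, the 1 to 5 scale, the two design options and their scores are made up, and multiplying the scores is just one way of combining them "FMEA style". The result is only a relative index for comparing options, not a probability.

```python
# Hypothetical factor names, taken from the list in the post above.
FACTORS = (
    "size",
    "complexity",
    "concurrency",
    "language",
    "defensive_programming",
    "process_maturity",
)


def relative_risk_index(scores):
    """Multiply per-factor scores (1 = favourable ... 5 = unfavourable)."""
    index = 1
    for factor in FACTORS:
        score = scores[factor]  # raises KeyError if a factor was not scored
        if not 1 <= score <= 5:
            raise ValueError(f"{factor}: score {score} is outside 1..5")
        index *= score
    return index


# Two made-up design options, scored by hand (illustrative values only).
option_a = {"size": 2, "complexity": 3, "concurrency": 1,
            "language": 2, "defensive_programming": 1, "process_maturity": 2}
option_b = {"size": 4, "complexity": 4, "concurrency": 3,
            "language": 2, "defensive_programming": 3, "process_maturity": 2}

# Lower index = relatively more favourable option.
ranking = sorted(
    [("option_a", relative_risk_index(option_a)),
     ("option_b", relative_risk_index(option_b))],
    key=lambda pair: pair[1],
)
```

Only the ordering of the options carries any meaning here; the absolute numbers depend entirely on the chosen scale.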

Substantially, I agree with you. Use qualitative probabilities.
 
SteveZed

#4
Patrick,

Thanks for the link to the CMU paper. I'm going to read it more carefully but the introduction basically reaffirms my feelings on the matter.

Quote:
Risk assessment matrices are estimates at the end of the day, probably inaccurate, and their purpose is to help managers prioritize risk reduction efforts. It's the relative values of the risk indicators that matter, not their absolute values.

I agree with that, but we also need to assess the residual risk (after implementing risk controls) and answer the question "is the risk acceptable?" The 14971 standard mandates this evaluation but leaves the mechanics up to the manufacturer, though some guidance is given in Annex D.

I've edited my initial post to clarify (hopefully!) my questions.

Thanks,
-Steve
 
pldey42

#5
In information security the best guide to the mechanics of risk assessment that I've seen is the SEI's OCTAVE Allegro, which you can download free at

http://www.cert.org/octave/allegro.html

It helps to meet the requirements of ISO 27001, which requires the details to be explicit.

Another approach that organizations often take is like an engineering FMEA, using a spreadsheet, with columns for likelihood, severity etc. After the first risk assessment they identify mitigations, then assess the residual risks using the same mechanics and yes, effectively they record the lower probability that should have resulted from the mitigation. Using Allegro would achieve something similar.

It makes sense to assess software risks separately because, as you no doubt know and I took a while to realize, when they describe the risk as "systemic" they mean that every user of the device will experience the same impact given the same hazard. For example, if a pacemaker fails because its software can't handle a heart rate of more than x per minute, everyone whose heart rate exceeds x per minute will experience the failure.

Hope this helps,
Pat
 
SteveZed

#7
Hi,

I'd really like to hear about the practices "out in the field"! Anyone care to share?

Failing that, is there any forum or method to contact the folks who wrote the IEC TIR80002-1? I'd like to resolve the apparent contradiction within the document -- where some parts talk about using PROBABILITY and SEVERITY and some parts talk about using SEVERITY alone.

Patrick: I had a quick look at OCTAVE Allegro and came away thinking it is quite a high level strategy document. For medical device software, ISO 14971 already focuses the analysis on the risk of harm to the patient and other personnel. So really, only Steps 6-8 are relevant and these 17 or so pages do not provide a lot of insight above what is in 14971.

-Steve
 
WorkInstruction

#8
Here is one method I've seen used. Create a risk analysis that evaluates each individual potentially harmful scenario, assigning each a pre-mitigation likelihood and severity. The risks are then evaluated both on the combination (likelihood x severity) and on severity alone. Mitigation and justification must be provided for potentially harmful scenarios that exceed the limits. So yes, likelihood is ignored during one part of the analysis.

In practice, if a harm is severe enough (e.g. death), it will appear in both sets of evaluations. The method is therefore sometimes redundant, but it eliminates the uncertainty in assigning likelihoods.

Also - this method doesn't need to be software specific but can be used for the product / system.
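
A rough Python sketch of the two-pass evaluation described above, for anyone who wants to see the mechanics. The 1 to 5 ordinal scales, the two limits and the example scenarios are all hypothetical; a real procedure would take them from the manufacturer's own risk acceptability criteria.

```python
from dataclasses import dataclass

# Hypothetical limits on 1..5 ordinal scales (assumptions, not from any standard).
COMBINED_LIMIT = 8   # likelihood x severity above this needs mitigation
SEVERITY_LIMIT = 4   # severity at or above this needs mitigation regardless of likelihood


@dataclass
class Scenario:
    name: str
    likelihood: int  # pre-mitigation likelihood, 1 (remote) .. 5 (frequent)
    severity: int    # pre-mitigation severity, 1 (negligible) .. 5 (death)


def exceeded_limits(scenario):
    """Return which of the two evaluations flag the scenario for mitigation."""
    flagged = []
    # Pass 1: the usual combination of likelihood and severity.
    if scenario.likelihood * scenario.severity > COMBINED_LIMIT:
        flagged.append("likelihood x severity")
    # Pass 2: severity alone, so an uncertain likelihood cannot hide the scenario.
    if scenario.severity >= SEVERITY_LIMIT:
        flagged.append("severity alone")
    return flagged


# Made-up scenarios for illustration.
scenarios = [
    Scenario("display flicker", likelihood=4, severity=1),
    Scenario("dose calculation error", likelihood=1, severity=5),
    Scenario("alarm not raised", likelihood=3, severity=4),
]
report = {s.name: exceeded_limits(s) for s in scenarios}
```

With these made-up numbers, the second scenario is flagged by the severity-only pass even though its assigned likelihood keeps it under the combined limit, which is the case where an unreliable likelihood estimate would otherwise matter most.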
 

Marcelo

Inactive Registered Visitor
#9
Quote:
When software is present in a sequence of events leading to a HAZARDOUS SITUATION, the
probability of the software failure occurring cannot be considered in estimating the RISK for the
HAZARDOUS SITUATION. In such cases, considering a worse case probability is appropriate, and
the probability for the software failure occurring should be set to 1. When it is possible to
estimate the probability for the remaining events in the sequence (as it may be if they are not
software) that probability may be used for the probability of the HAZARDOUS SITUATION occurring
(P1 in Figure 1). If this is not possible, the probability of the HAZARDOUS SITUATION occurring
should be set to 1.

Quote:
If I understand this correctly, it says that we may continue to use our qualitative risk analysis even for hazardous situations that have software failure as an item in the sequence of events. When there are items in the sequence that are not software bugs, we'll get a useful number and ideas on what mitigations may bring this down.

P1 (the probability of the hazardous situation occurring) is a combination of probabilities from the sequence of events leading to the exposure of the patient/user/etc. to the generic hazard.

What the standard means is this: if the software failure is part of the sequence of events that leads to a hazardous situation, and parts of the sequence are not software-related (note that this is generally true, because the hazardous situation is related to patient harm, so something besides a software failure has to occur for harm to come about), then even when you set the probability of the software failure to 1 you still need to determine the other probabilities. P1 will in fact be a combination of the probability of something always occurring (the software failure) and the probabilities of the other parts of the sequence of events.

In the (rare) case where the software failure probability is the major component of the occurrence of the hazardous situation (for example, when it's also very difficult to estimate the probabilities of the other steps in the sequence of events), you have to set the probability of the hazardous situation occurring to 1 (the software failure one).
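
As an illustration only, the rule described above can be sketched in a few lines of Python. The sketch assumes the events in the sequence are independent, so that "combination" can be read as a simple product; neither the thread nor the quoted TIR text says that. The `Event` type, the numeric probabilities and the example sequence are all hypothetical, and a qualitative system would map levels rather than multiply numbers.

```python
from math import prod
from typing import NamedTuple, Optional


class Event(NamedTuple):
    description: str
    is_software: bool
    probability: Optional[float] = None  # None = cannot be estimated


def p1_hazardous_situation(events):
    """Estimate P1 for a sequence of events, treating software failures as certain."""
    factors = []
    for event in events:
        if event.is_software:
            # Worst case: the software failure is assumed to occur.
            factors.append(1.0)
        elif event.probability is None:
            # A remaining event cannot be estimated, so P1 itself is set to 1.
            return 1.0
        else:
            factors.append(event.probability)
    # Assumes independent events, so the combination is a plain product.
    return prod(factors)


# Made-up sequence: two estimable non-software events around one software failure.
sequence = [
    Event("sensor lead disconnected", is_software=False, probability=0.01),
    Event("software fails to detect the disconnection", is_software=True),
    Event("operator does not notice the frozen reading", is_software=False, probability=0.1),
]
p1 = p1_hazardous_situation(sequence)

# Same sequence, but the operator step cannot be estimated: P1 falls back to 1.
unknown = sequence[:2] + [Event("operator response unknown", is_software=False)]
p1_unknown = p1_hazardous_situation(unknown)
```

In the first sequence the software event contributes a factor of 1, so P1 is driven entirely by the two non-software events; in the second, the inestimable step forces P1 to 1.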

Quote:
However, if I can't usefully estimate probability of a software bug beforehand, I'm not sure how a software control can be documented to be less hazardous after mitigation.

Risk control measures are related to the estimated risk (a combination of P1 and P2). Software failures, as I mentioned before, are always part of the sequence of events which is related to P2. You ALWAYS need to estimate the risk, even in the case when you need to set the probability of occurrence of the hazardous situation (P1) to 1.


Quote:
In many cases, estimating the probability of occurrence of HARM may not be possible, and the
RISK should be evaluated on the basis of the SEVERITY of the HARM alone. RISK ESTIMATION in
these cases should be focused on the SEVERITY of the HARM resulting from the HAZARDOUS
SITUATION.

Quote:
Here I'm not sure how to interpret the guidance to "focus on the SEVERITY of the HARM". Is the idea to create a separate evaluation process for systemic hazards like software anomalies? The commentary in Section 5 suggests this is the intent:

The harm is what happens to the patient. The probability of occurrence of harm (P1 x P2) cannot always be estimated, because many of those probabilities depend on use errors by the user/patient, which cannot be predicted. In this case, the standard says that instead of the usual severity x probability of occurrence of harm (severity x (P1 x P2)), you rely only on the severity of the harm.

This is what is explained in the passage you quoted: "As described in 4.4.3, it is difficult to estimate the probability of software failures. When this results in the inability to estimate the probability of HARM then RISK should be evaluated on the basis of the SEVERITY of the HARM alone."


Quote:
Taken together, my first impression on reading the TIR is that the idea is, indeed, to have two methods of evaluating hazards: one that takes both SEVERITY and PROBABILITY levels (as we do currently) and a completely separate one that takes into account SEVERITY only for systemic risks. On a certain level, this all makes some sense.

Yes, that's exactly it. When probability cannot be relied upon, the usual severity x probability method cannot give you a reliable result. However, there's no need for 2 different systems. You usually have to adapt your system to this situation; please see the comment below.


Quote:
After writing up this question, now I'm questioning my first reading. Specifically, the quote

Quote:
it is obvious that many RISK CONTROL measures reduce the probability that such a failure would lead to a HAZARDOUS SITUATION

suggests that one should somehow record this reduction in probability. Is it really just saying that we should use qualitative probability measures rather than trying to estimate quantitatively? If so, that suggests we could continue using our current qualitative system.

Yes, that's it.

I would say that the general problem is that people (and in this case, software people) are too focused on using a quantitative system, when in fact, for software risk management for medical devices (let's make this distinction because I'm not sure how other areas work), a better system is a qualitative probability / quantitative severity one (with qualitative/qualitative in some situations).

This usually comes from too great a reliance on risk analysis techniques such as FMEA, I think, which are generally too focused on quantitative analysis.
 

Marcelo

Inactive Registered Visitor
#10
Quote:
Failing that, is there any forum or method to contact the folks who wrote the IEC TIR80002-1?

Usually, you can do this through your National Committee.

Or you can participate in a meeting as an observer. There will be a meeting February 25-27 in Berlin.
 