
Examples of inherent safety by design

sagai

Quite Involved in Discussions
#41
One of the subtle differences between software and hardware relates to the probabilities of failure. Hardware elements wear; software does not. If a defect is in software, it will be in all instances of the software. Contrast this with physical elements, which follow the 'bathtub' curve for reliability. This difference has been one of the main drivers for avoiding the use of "Software FMEA" to support ISO 14971.
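For illustration only, here is a rough numerical sketch of that contrast. The 'bathtub' shape is just a Weibull-style mixture with made-up parameters, not data from any standard or device:

```python
def hw_hazard_rate(t, beta_infant=0.5, beta_wearout=3.0, eta=1000.0, base=1e-4):
    """Toy 'bathtub' hazard rate for a hardware element: early-life failures
    (decreasing), a constant random-failure floor, and wear-out (increasing).
    All parameters are invented for illustration."""
    infant = (beta_infant / eta) * (t / eta) ** (beta_infant - 1)
    wearout = (beta_wearout / eta) * (t / eta) ** (beta_wearout - 1)
    return infant + base + wearout

def sw_failure(inputs, defect_trigger):
    """A software defect does not 'wear': every copy of the software fails on
    exactly the same triggering condition, every time that condition occurs."""
    return [x == defect_trigger for x in inputs]

if __name__ == "__main__":
    for t in (1, 10, 100, 500, 1000, 2000):
        print(f"t={t:5d}  hardware hazard rate ~ {hw_hazard_rate(t):.2e}")
    # The 'defective' input (2) trips the bug deterministically, in all copies.
    print(sw_failure([1, 2, 3, 2], defect_trigger=2))
```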
We need whiskey here, lots of whiskey. :)
And a bench to sit down together for a chat on this software thing.
:drunk:
 

Ronen E

Problem Solver
Staff member
Super Moderator
#42
To me the main issue with SW (looking in from the outside, with undergrad experience of coding in Pascal, assembler, C & ADA) is not about its inherent characteristics but about our ability to build real confidence in the SW's performance. In HW design, testing is never a substitute for quantitative engineering analysis; it's only an additional layer of confidence, and we're supposed to acknowledge the level of uncertainty it practically always encompasses.

Can any of you experts comment on SW testing by AI?
 

sagai

Quite Involved in Discussions
#43
For automotive ADAS I'm involved the other way around: testing AI with conventional software and hardware.
 

Tidge

Involved In Discussions
#44
The length of time/amount of use necessary for a medical device to wear to the point of no longer being itself is addressed in wear and aging studies. Have similar studies been done with software? What about studies to confirm that the software cannot be corrupted via download or copying (i.e., that any defect must be in all instances of the software)?
Often software in medical devices (as opposed to software that is the medical device) is a "closed" system, and as such is inherently more secure from corruption than an "open" system. It has been demonstrated in several highly publicized examples that medical device software on open/inter-connected systems can be corrupted. Even in such systems there are methods that can be implemented to deter "threat actors", and there are methods that can detect whether threat actors have compromised the integrity of a software system.
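As one concrete (and purely illustrative) example of the detection side, an integrity check can compare a hash of the installed software image against a reference digest recorded at release. The file name and digest below are hypothetical placeholders:

```python
import hashlib

def file_digest(path: str) -> str:
    """SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def integrity_ok(installed_image: str, expected_digest: str) -> bool:
    """True if the installed image still matches the digest recorded at release.
    A mismatch indicates corruption or tampering (it cannot say which)."""
    return file_digest(installed_image) == expected_digest

# Hypothetical usage; 'device_fw.bin' and EXPECTED_SHA256 are placeholders:
# if not integrity_ok("device_fw.bin", EXPECTED_SHA256):
#     enter_safe_state()
```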

It is important to note that ISO 14971 is not explicitly about risks related to cybersecurity. Within the context of 14971, the cybersecurity areas of concern are most likely "availability" (is the ME equipment no longer working?) and "integrity" (is the performance of the ME equipment compromised?).
 

robert.beck

Involved In Discussions
#45
I have a question on risk assessment of software. This thread seems to be the right place to put it.

This is not a philosophical question about medical device software; it is practical, in a real-world setting. I apologize for the length here, but it's not simple to find a way to be compliant with complicated and possibly unrealistic regulatory requirements. First the question, then the regulatory background in case you need it:

Question:

If software risk cannot be mitigated by reducing the probability of a software error, how can it be mitigated? What's left, after a strict reading of the FDA statements and the older IEC 62304, is: 1) remove the hazard by design, 2) reduce the severity of the hazard by design, or 3) implement procedures to reduce the hazard level. Severity is the only factor for determining risk level. This is not believable or realistic or practical, because even though I'm pretty good at writing software, I'm not so good that I'll always find every mistake, and neither are you.

I do not agree with the FDA statement that 'software failures are systemic in nature', because that is too broad a statement. It depends more on which component has the software defect; for instance, most software failures come from bad design or from bugs that were not caught before the software was released. Whether an uncaught bug results in a systemic failure is another question. I've found several software design flaws in my car, yet I can still drive, turn corners, and stop suddenly if needed. Theoretically, the bugs can be fixed, but the poor design is not so easily fixed in a business setting that cannot afford to hire the very best designers for every project. Would the software be better designed and flawless in a more expensive car? Probably not.

Conclusion: in a software hazard analysis, in FMEA style, use 100% for probability and accept risks that are not minimal if this is reasonable for your device. Explain this in the Risk Management File, such as in the Risk Management Report, or else it might appear you are out of compliance.
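To make that conclusion concrete, here is a toy FMEA-style evaluation with the probability term pinned at 1, so the risk level collapses to severity alone. The failure modes, severity scale and acceptance threshold are invented for the example and are not taken from 14971 or 62304:

```python
from dataclasses import dataclass

@dataclass
class FmeaRow:
    failure_mode: str
    severity: int             # 1 (negligible) .. 5 (catastrophic); scale is illustrative
    probability: float = 1.0  # per the FDA / 62304 reading: assume the failure occurs

    def risk_level(self) -> float:
        # With probability fixed at 1, the risk level is driven by severity alone.
        return self.severity * self.probability

ACCEPTANCE_THRESHOLD = 2  # invented for the example

rows = [
    FmeaRow("dose calculation overflow", severity=5),
    FmeaRow("UI label truncated", severity=1),
]
for row in rows:
    acceptable = row.risk_level() <= ACCEPTANCE_THRESHOLD
    print(f"{row.failure_mode}: risk={row.risk_level()} acceptable={acceptable}")
```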

Background:
  1. The basis for this problem is Annex ZA of EN ISO 14971:2012, which eliminated several useful but overused risk mitigations, to prevent manufacturers from abusing them instead of truly fixing problems. The EU revised the medical device directive so that ISO 14971 fell out of sync with it. EN ISO 14971:2012 says, "Accordingly, manufacturers and Notified Bodies may not apply the ALARP concept with regard to economic considerations." The actual statement in Annex ZA is that various particular Essential Requirements require risks to be reduced "as far as possible" without there being room for economic considerations. Coupled with the inability to use ALARP, in practice this is often interpreted to mean that all risks must be reduced to minimal.
  2. IEC 60601-1, 3rd edition, clause 14, merely says to use ISO 14971 for determination of unacceptable risk.
  3. IEC 62304:2006, paragraph 4.3 "Software safety classification", states: "If the HAZARD could arise from a failure of the SOFTWARE SYSTEM to behave as specified, the probability of such failure shall be assumed to be 100 percent." This is also stated clearly in Figure 3 as, "Probability of software failure shall be assumed to be 1." Paragraph B.4.3 supports this with "assuming that the failure will occur. However, no consensus exists for a method of quantitatively estimating the probability of occurrence of a software failure ..." (A paraphrase of the Figure 3 decision logic is sketched after this list.)
     So far, total agreement with FDA (see #4 below).
     • Note: Paragraph 4.3 is crossed out in the 2015-06 redline version of this standard, but paragraph B.4.3 and the Figure 3 statement are not. The final version of the amendment is the same. There are several online references to Figure 3 that do not include the statement about probability. For example:
       • TUV: IEC 62304, Amendment 1:2015 | TÜV SÜD shows the same figure without the accompanying comment regarding probability of failure, identified as "safety classification according to IEC 62304 - Amendment 1:2015".
       • There is at least one other, but this forum allows only one link per post so it is omitted here.
     Both of these are more recent than revision 1.1 of the 62304 standard. Is TUV actually saying it's OK to use probability, or is this an oversight? What do other NBs say about this?

     • This topic is discussed at some length in the standard. Conclusion: this is still in process and ambiguous as of today; it may be resolved in the 2020 version of the standard, which is not yet available. Personal conclusion: those who work hard and diligently on these standards and guidances may not have much experience writing and testing software.

     • The FDA-recognized consensus version of 62304 is IEC 62304 Edition 1.1 2015-06 CONSOLIDATED VERSION. This is the exact version referenced in #3 above.

  4. FDA says (in the 2005 guidance document entitled Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices):
"In general, FDA considers risk as the product of the severity of injury and the probability of its occurrence. However, software failures are systemic in nature and therefore the probability of occurrence cannot be determined using traditional statistical methods. Therefore, we recommend that you base your estimation of risk for your Software Device on the severity of the hazard resulting from failure, assuming that the failure will occur."

Translation: in an FMEA, the likelihood of a software hazard is 100%, no ifs, ands, or buts. Total agreement with IEC 62304:2006, but not necessarily with IEC 62304:2015 or later.

Note: FDA's 1999 document on COTS essentially follows the same idea and makes similar or identical statements about probability and software risk.
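As promised above, here is my paraphrase of the IEC 62304 Figure 3 decision logic as a sketch; it is not the normative text, and the argument names are my own. The point is that the probability of the software failing never appears as an input:

```python
def software_safety_class(
    can_contribute_to_hazardous_situation: bool,
    risk_acceptable_after_external_controls: bool,
    worst_possible_harm: str,  # "none", "non_serious_injury", "serious_injury_or_death"
) -> str:
    """Rough paraphrase of the IEC 62304 (Amd 1) Figure 3 decision flow.
    Note what is absent: the probability of the software failing is not an
    input anywhere; it is simply assumed to be 1."""
    if not can_contribute_to_hazardous_situation:
        return "A"
    if risk_acceptable_after_external_controls:
        return "A"
    if worst_possible_harm == "serious_injury_or_death":
        return "C"
    if worst_possible_harm == "non_serious_injury":
        return "B"
    return "A"

print(software_safety_class(True, False, "serious_injury_or_death"))  # -> C
```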
 

Ronen E

Problem Solver
Staff member
Super Moderator
#46
Caveat: I'm not a SW expert but do have basic background in coding.
I do have extensive background in observing and analysing regulatory creep though :)

I think this whole sad farce stems from poor definitions.

SW failure is "systemic" and "inevitable" not in the sense that where there's a potential for failure it will always occur and will make the entire system run by that SW collapse; but in the sense that SW is deterministic, i.e. if it contains a bug or a deficiency, and a certain combination of circumstances can trip it, that specific combination will always trip it. No more, no less.

This is different from HW, where there's an unknown, random element in the makeup of the materials used and some inevitable unit-to-unit manufacturing variability. SW copies are supposed to be 100% identical (ignoring for a moment corruption through writing/copying/storage etc.), so if one copy fails under a certain set of circumstances, all copies will.
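A trivial sketch of that distinction, entirely made up for illustration (a deterministic software defect versus a hardware element with unit-to-unit tolerance and drift):

```python
import random

def buggy_division(numerator: float, denominator: float) -> float:
    """A deterministic software defect: every copy of this code fails on the
    same triggering input (denominator == 0), every single time."""
    return numerator / denominator

class HardwareResistor:
    """A hardware element: each manufactured unit differs slightly (tolerance)
    and its value drifts with use (a crude stand-in for wear)."""
    def __init__(self, nominal_ohms: float, tolerance: float = 0.05):
        self.value = nominal_ohms * random.uniform(1 - tolerance, 1 + tolerance)

    def age(self, hours: float, drift_per_hour: float = 1e-6) -> None:
        self.value *= 1 + drift_per_hour * hours

# Every copy of the software trips the same bug on the same input...
# buggy_division(1.0, 0.0)  # always raises ZeroDivisionError, in every copy
# ...while two "identical" hardware units are never exactly identical:
print(HardwareResistor(1000).value, HardwareResistor(1000).value)
```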

The current situation seems to me the result of a "convenient" generalisation that has actually taken the issue way out of context. Just like the EN ISO 14971:2012 Annex Z AFAP/ALARP and "no economic considerations" farce.

Assuming that SW will "always fail" in the presence of a hazard implies that we shouldn't use SW at all, because by its very nature it's incapable of performing and thus dangerous. Total nonsense. If that offends anyone - I'm really sorry, but you guys didn't do a very good job of explaining your intentions / rationales (just see the amount of confusion around this), so you shouldn't complain.
 

robert.beck

Involved In Discussions
#47
Thanks, and I agree with you completely. I have found that many in the regulatory field, including 'experts,' tend to go with the simplest solution rather than look across multiple regulations and documents, sort of like the stereotypical police detective who closes cases by arresting the most obvious suspect. The question I seek to answer is, "How do I mitigate software risk in an FMEA-style risk assessment when the risk level is too high and I can't do anything to reduce the likelihood of the event happening (because it's software), and that's not acceptable to an auditor?" Official documents say 'improve the design', but what does this mean in a particular case? Who is to assess whether a design change solves a problem without testing the software? And if the risk doesn't show up in the testing, who is to say that was the correct test?

I am at this point:
1) there is no way to use likelihood for mitigation.
2) perform the usual FMEA analysis, documenting potentially unacceptable risks.
3) don't use ALARP as it's prohibited by ISO 14971:2012, but do use ALAP which is not prohibited.
4) after using ALAP, justify the residual risk level with a risk/benefit argument, per ISO 14971 and Annex ZA.

Any comments on the above are appreciated.

On a personal note, I just had an eye surgery that was presented to me as a slam-dunk because the surgeon was a "good" surgeon. It failed. Six weeks ago I had 20/20 in this eye and did not need reading glasses; now it's approximately 20/2000 and the only recourse presented to me is to repeat the same operation. Very upsetting, until I researched the procedure and found it has about a 90% success rate. That's OK, I was just unlucky. I don't have to blame the surgeon and can go with a repeat operation. No one can or will tell me what went wrong or why. The cause of this particular eye condition is "idiopathic", i.e. nobody knows. At least it can sometimes be fixed, but not always. A lot of the surgical equipment is software-controlled, but I doubt the surgeon is aware of it as having any importance. Failure of this software would not have been detected, and it's not life-threatening.
 

Ronen E

Problem Solver
Staff member
Super Moderator
#48
Thanks, and I agree with you completely. I have found that many in the regulatory field, including 'experts,' tend to go with the simplest solution rather than look across multiple regulations and documents, sort of like the stereotypical police detective who closes cases by arresting the most obvious suspect. The question I seek to answer is, "How do I mitigate software risk in an FMEA-style risk assessment when the risk level is too high and I can't do anything to reduce the likelihood of the event happening (because it's software), and that's not acceptable to an auditor?" Official documents say 'improve the design', but what does this mean in a particular case? Who is to assess whether a design change solves a problem without testing the software? And if the risk doesn't show up in the testing, who is to say that was the correct test?
Strictly technically and strictly FMEA speaking, you can reduce the RPN (or risk level) by addressing either P, D or S. @Peter Selvey once argued at length, in a discussion we had here some years back, that you can't really reduce S. I agree with that argument. Since P is ruled out, I guess that from that narrow perspective you're left with D, or in other words some sort of alarm. This ties back to the last sentence in your personal story: "failure of this software would not have been detected".
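For what it's worth, a toy illustration of that "left with D" point, using a hypothetical infusion-pump rate as the output being checked (all names and limits are invented): an independent plausibility check doesn't make the defect any less likely, it just detects an implausible output so the system can alarm and enter a safe state.

```python
def compute_pump_rate(requested_ml_per_h: float) -> float:
    """Stand-in for the 'real' computation, which could contain a defect."""
    return requested_ml_per_h

def plausible(rate_ml_per_h: float, hard_limit_ml_per_h: float = 500.0) -> bool:
    """Independent detection measure: it does not make the defect any less
    likely (P stays at 1), but it catches an implausible output so the
    system can raise an alarm and enter a safe state (it improves D, not P)."""
    return 0.0 <= rate_ml_per_h <= hard_limit_ml_per_h

requested = 120.0
computed = compute_pump_rate(requested)
if not plausible(computed):
    print("ALARM: implausible pump rate, entering safe state")
else:
    print(f"delivering {computed} ml/h")
```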

By "improve the design" I think you shouldn't be trying to improve the SW design, but rather look at a more fundamental way to remove the hazard, i.e. the very fundamental characteristic that makes that specific harm even relevant. That's ISD.
On a personal note, I just had an eye surgery that was presented to me as a slam-dunk because the surgeon was a "good" surgeon. It failed. Six weeks ago I had 20/20 in this eye and did not need reading glasses; now it's approximately 20/2000 and the only recourse presented to me is to repeat the same operation. Very upsetting, until I researched the procedure and found it has about a 90% success rate. That's OK, I was just unlucky. I don't have to blame the surgeon and can go with a repeat operation. No one can or will tell me what went wrong or why. The cause of this particular eye condition is "idiopathic", i.e. nobody knows. At least it can sometimes be fixed, but not always. A lot of the surgical equipment is software-controlled, but I doubt the surgeon is aware of it as having any importance. Failure of this software would not have been detected, and it's not life-threatening.
Thanks for sharing. It is discouraging and I'm sorry to hear you were among the unlucky 10%. Actually I think that a 90% success rate is considered quite high, so it's not like you took an unreasonable risk. Maybe the most annoying part is the lack of transparency, but if you're in the USA I guess that this kind of behaviour is just legal damage control on their part.

From the risk management perspective, maybe none of the involved devices failed. It could be the surgeon's shortcoming or maybe the procedure is not 100% suitable in all cases and not that easy to predict. Who knows. It's quite hard to analyse when we/you don't know what actually happened.
 

Watchcat

Quite Involved in Discussions
#50
a 90% success rate is considered quite high, so it's not like you took an unreasonable risk. Maybe the most annoying part is the lack of transparency, but if you're in the USA I guess that this kind of behaviour is just legal damage control on their part.
Probably more of an uninformed risk, like most patients, since information can be hard to come by. I don't know who used the term "slam dunk," but I'd steer clear of any medical professional who used this term to refer to the anticipated outcome of a surgical procedure.

I agree, if in the US, it's almost certain to be legal damage control.
 