In my work in information security and business continuity I've been learning from other fields such as health & safety and medical devices. Perhaps some lessons from Infosec and BC will be helpful in this difficult area:
Both InfoSec and BC are moving towards emphasising impact and severity more than probability, for these reasons:
- Probability is hard to estimate, especially for software failures and for events that have never happened, for which there are no statistics
- Some managers, when they see that a risk is low probability, think no further than "It'll never happen to us," enjoy a false sense of security, and refuse even the simplest, affordable mitigations
- Low-probability, high-impact events can be devastating if they occur, and mitigating them is often affordable. BP in the Gulf of Mexico is an example: the US regulator's requirement for business continuity planning was waived so that drilling could start quickly.
Indeed, BS 25999 (Business Continuity Management) is built on the idea of identifying critical processes and considering the impact should they fail, regardless of why they might fail or the risks inherent in them. You plan to somehow sustain processes whose failure would materially threaten the survival of the business. In a sense, you plan for a failure of risk management. The standard does also call for risk management, used to reduce the probability of bad events and/or their impact, but only as an adjunct, not as something to stake the business upon. The core is to plan for failure.
In information security, the whole thing is based on risk management. To the OP's question, organizations will often have some mitigations (we call them "controls") in place such as locks on doors, secure filing cabinets and networks protected with passwords, firewalls and the like.
Sometimes they assess the risks assuming the mitigations in place work as they should (and sometimes they have data showing the effectiveness of those controls). This enables them to identify any additional mitigations needed.
Other times they assess the risks assuming a green field, without the existing controls. This helps validate the existing selection and identify controls that are disproportionate or out of date, which can safely be removed; or maybe are no longer strong enough to contain a risk that has become more dangerous.
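The two assessment styles above can be sketched as a toy likelihood-times-impact score, in which each control is assumed to knock points off likelihood and/or impact. The 1-5 scales and the reduction figures are purely illustrative, not taken from any standard:

```python
# Toy risk model: score = likelihood x impact, each rated 1-5.
# Control effectiveness values below are hypothetical.

def risk_score(likelihood: int, impact: int) -> int:
    """Simple multiplicative risk score on 1-5 scales."""
    return likelihood * impact

def residual_score(likelihood: int, impact: int, controls: list) -> int:
    """Apply each control's assumed reduction to likelihood and impact,
    never dropping below 1 -- there is no such thing as zero risk."""
    for reduce_likelihood, reduce_impact in controls:
        likelihood = max(1, likelihood - reduce_likelihood)
        impact = max(1, impact - reduce_impact)
    return risk_score(likelihood, impact)

# "Green field" (inherent) view: no controls assumed.
inherent = risk_score(likelihood=4, impact=5)      # 4 * 5 = 20

# View with existing controls in place: say a firewall cuts likelihood
# by 2 and encrypted backups cut impact by 2 (hypothetical figures).
residual = residual_score(4, 5, [(2, 0), (0, 2)])  # 2 * 3 = 6
```

Comparing the two numbers is what tells you whether a control still earns its keep, or whether the gap left behind justifies a new one.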
Either way, risk management is a continuing cycle that's repeated as often as necessary. How often is a matter of judgement, but for some issues of national security it can be daily, hourly even.
Another aspect is that the mitigations, the controls, can themselves introduce risks. Put locks on the doors, and someone will lock themselves out. Use encryption software, and there's a risk that it will fail, or that the key will be lost and the information with it; or worse, an attacker cracks the code and quietly steals secrets you thought were secure. Use the cloud for backups, and the internet connection to the cloud may be down just when you need to restore.
Infosec and BC professionals, and no doubt medical device designers too, assess the residual risks associated with the controls and may choose to mitigate those too, all the time remembering that there is no such thing as zero risk.
The information security management standard is interesting in that it requires managers to sign off on the residual risks: they are accountable for the risks their business takes (and it's recognized that zero risk does not exist; the risks simply have to be "reasonable", whatever that means). There's a risk here of building personal blame games into the management culture, so it seems to me important that everyone understands that, should something bad happen, it's not somebody's fault that a risk was not identified or properly controlled; it's a failure of the risk management process, which, as in quality management, then gets a corrective action.
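A residual-risk sign-off can be recorded as simply as one line in a risk register with a named, accountable owner. This is a minimal sketch; the field names are my own invention, not taken from the standard:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class ResidualRisk:
    """One row of a risk register: the risk that remains after controls."""
    description: str
    residual_score: int                    # e.g. likelihood x impact after controls
    controls: list = field(default_factory=list)
    accepted_by: str = ""                  # accountable manager; never left blank
    accepted_on: Optional[date] = None

    def sign_off(self, manager: str) -> None:
        """Record that a named manager accepts this residual risk."""
        self.accepted_by = manager
        self.accepted_on = date.today()

risk = ResidualRisk("Laptop theft despite disk encryption", residual_score=4,
                    controls=["full-disk encryption", "asset register"])
risk.sign_off("Head of IT")
```

The point of the record is not the data structure but the name on it: someone identifiable accepted this risk, on this date, with these controls in place.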
Risk management is very subjective, because it's not just about the bad things that could happen, but about public or customer perception of them. For example, Pan Am was terminally damaged by the Lockerbie disaster: even though the sad loss of that plane made hardly any difference to the safety statistics of air travel (it was as safe as ever), the constant pictures on TV of the crashed plane linked danger and the Pan Am brand in the public mind.
For Infosec and BC professionals that means that mitigation action is not only about containing and managing the incident itself in the scientific and engineering sense, but also PR. For example, when a Virgin train crashed a few years ago, Richard Branson was almost immediately on TV saying all the right things, not only because it was the right thing to do, but also because he wanted passengers to continue to book Virgin trains despite TV images of his wrecked one. He was astonishing: he promised us that the rail network was safe, only hours after what turned out to be a serious points failure, and we believed him. Charisma can be a risk mitigation!
Given that risk management is somewhat subjective, it's vital that auditors assess the organization's risk assessments against its own defined risk assessment process and put their own judgement to one side. Auditors are naturally conservative souls who hate risk, and must keep those anxieties in check. None of that is the reason for writing down the details of risk assessments, though. The reason for documenting them is so that risk assessments are carried out consistently (and not driven by the most neurotic manager, or the most lurid press stories), and so that, when something does go bang, the organization can defend itself in court by showing it had indeed performed diligent risk assessments and taken responsible, not reckless, risks.
Finally, I read somewhere (wish I could remember where) that how you do risk assessment matters less than how often you do it, and who does it.
Hope this helps,
Pat