Supplier Controls for "Hybrid" AI Regulatory Services (ISO 13485 Clause 7.4)

Huss Mardini

Regulatory AI Validation
Hi everyone,
I’m looking for some interpretation of Clause 7.4 (Purchasing) as we define the quality agreement for our new service model.
We run a regulatory service that uses a "hybrid" approach: we use specialized AI agents to do the heavy lifting of drafting a 510(k) (mapping evidence to eSTAR sections, generating device descriptions, predicate finding, classification matching, etc.), but we then have a human consultant review and finalize the package before it goes to the manufacturer.
My question is: How should a manufacturer audit a service like ours?
Is it "Software Validation"? Since we use AI tools internally to generate the draft, does the manufacturer need to see a CSV package for our algorithms?
Or is it "Purchasing"? Since the final deliverable is reviewed and signed off by a human expert (just like a traditional consultancy), is it sufficient to audit us as a standard service provider based on the competence of the personnel and the verification of the final output?
I want to make sure I’m classifying the risk correctly. We are treating the AI as a "tool used by the consultant" rather than "software as a medical device," but the line is getting blurry.
Has anyone here audited a vendor who uses Generative AI as part of their service delivery yet? What controls did you ask for?
Thanks,
Huss
 
My question is: How should a manufacturer audit a service like ours?
You're just supplying a service and, effectively, the output is 100% verified. There should be no reason for a manufacturer to audit you.

The line isn't blurry at all. You're NOT any part of the medical device. Arguably, you're used in the execution of the QMS, so there are some quality considerations, but they are more in terms of qualifying and approving you as a supplier. And the only risk is that the 510(k) is not approved (specifically, due to your outputs that weren't properly vetted by the manufacturer's regulatory consultant).

For qualifying and approving you as a service provider, they should want assurance that the software meets its intended purpose. With an AI system (one that's continuously learning), that's a challenge. How *do* you train your system? Why do you believe the results are valid? It's a one-time service (I presume), so the whole lifecycle consideration is probably not a big deal, but you'll be in a more defensible position if you have a well-defined software lifecycle, including change management, documentation of how the AI is trained, and evidence that each update performs at least as well as the previous release. If I were a user, those are the questions I'd ask (though I would not conduct an audit).
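For what "each update performs at least as well as the previous release" could look like as auditable evidence, here is a minimal sketch (Python, purely illustrative; the reference cases, the scoring function, and the generate_old/generate_new callables are all assumptions, not features of any real tool): freeze a set of consultant-approved reference sections and show that the new release scores no worse against it than the old one.

```python
# Hypothetical release-over-release regression check for an AI drafting tool.
# Everything here (ReferenceCase, score_draft, the token-overlap metric) is an
# illustrative assumption, not part of any actual product or library.
from dataclasses import dataclass
from statistics import mean
from typing import Callable

@dataclass
class ReferenceCase:
    case_id: str            # e.g. an anonymised past 510(k) project
    section: str            # eSTAR section the draft targets
    expert_reference: str   # consultant-approved "golden" text

def score_draft(draft: str, reference: str) -> float:
    """Placeholder quality metric in [0, 1]; in practice this might be an
    expert rubric score, a factual-consistency check, etc."""
    draft_tokens, ref_tokens = set(draft.split()), set(reference.split())
    return len(draft_tokens & ref_tokens) / max(len(ref_tokens), 1)

def regression_check(generate_old: Callable[[ReferenceCase], str],
                     generate_new: Callable[[ReferenceCase], str],
                     cases: list[ReferenceCase],
                     min_margin: float = 0.0) -> bool:
    """True if the new release scores at least as well as the old release
    on the frozen reference set."""
    old_scores = [score_draft(generate_old(c), c.expert_reference) for c in cases]
    new_scores = [score_draft(generate_new(c), c.expert_reference) for c in cases]
    return mean(new_scores) >= mean(old_scores) + min_margin
```

The point isn't the metric (token overlap is a throwaway stand-in); it's that the reference set is frozen and consultant-approved, so "the new release is at least as good" becomes a documented, repeatable claim rather than a vendor assertion.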
 
Seems like the hazardous situation the tool can contribute to is 'submission contains an error'. The consequences of this are either 'not getting approval' or 'getting approval based on an error'. The effect of 'not getting approval' would be low. The effect of 'getting approval based on an error' could be low or it could be high. So the risk classification for this tool is not as straightforward as 'the worst that can happen is the business risk of a delayed approval'. This is one of the issues with 'automated processes' replacing humans: we know humans are fallible, but we assume 'automated tools' are less so.
 