Accuracy, Robustness and Cybersecurity (ARC)

This compliance category covers the accuracy, robustness, and cybersecurity requirements that AI/ML-based SaMD regulated by the FDA must meet. Within this category, “accuracy” and “performance” are used interchangeably, as are “robustness” and “safety”.

The IMDRF SaMD guidance on ARC includes the following:

Accuracy and Robustness:

All appropriate SaMD lifecycle support processes, and SaMD realization and use processes should be considered. Maintenance activities should preserve the integrity of the SaMD without introducing new safety, effectiveness, performance, and security hazards.

Within the context of SaMD it is important to understand how systems, software, context of use, usability, data, and documentation might be affected by changes, particularly with regards to safety, effectiveness, and performance.

The SaMD manufacturer should take into account the implications of changes to architecture and code, including any patient safety risk those changes introduce.

Cybersecurity:

Building quality into SaMD requires that safety and security should be evaluated within each phase of the product lifecycle and at key milestones. Security threats and their potential effect on patient safety should be considered as possible actors on the system in all SaMD lifecycle activities.

The goal is to engineer a system that: a) maintains patient safety and the confidentiality, availability, and integrity of critical functions and data; b) is resilient against intentional and unintentional threats; and c) is fault-tolerant and recoverable to a safe state in the presence of an attack.

This compliance category covers the following principles from the FDA GMLP:

Principle 2. Good Software Engineering and Security Practices Are Implemented: Model design is implemented with attention to the “fundamentals”: good software engineering practices, data quality assurance, data management, and robust cybersecurity practices. These practices include methodical risk management and design process that can appropriately capture and communicate design, implementation, and risk management decisions and rationale, as well as ensure data authenticity and integrity.
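
As one illustration of the data integrity part of Principle 2, the sketch below records a SHA-256 digest for a data record and later checks that the record is unchanged. The function names and the sample record are hypothetical, and a digest on its own only detects modification; establishing authenticity would additionally need a signature or keyed MAC, which is not shown here.

```python
import hashlib
import hmac


def record_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a data record."""
    return hashlib.sha256(data).hexdigest()


def verify_record(data: bytes, expected_digest: str) -> bool:
    """Check a record against a previously stored digest.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(record_digest(data), expected_digest)


# Hypothetical record and manifest entry, for illustration only.
original = b"patient-scan-001"
manifest_digest = record_digest(original)

unchanged = verify_record(original, manifest_digest)
tampered = verify_record(b"patient-scan-001-modified", manifest_digest)
```

In this sketch `unchanged` is true and `tampered` is false, so a modified record would be flagged before it reaches training or testing.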

Principle 3. Clinical Study Participants and Data Sets Are Representative of the Intended Patient Population: Data collection protocols should ensure that the relevant characteristics of the intended patient population (for example, in terms of age, gender, sex, race, and ethnicity), use, and measurement inputs are sufficiently represented in a sample of adequate size in the clinical study and training and test datasets, so that results can be reasonably generalized to the population of interest. This is important to manage any bias, promote appropriate and generalizable performance across the intended patient population, assess usability, and identify circumstances where the model may underperform.
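
The representation requirement in Principle 3 can be checked mechanically once a reference distribution for the intended population is available. The following is a minimal sketch under assumptions: the attribute name, the reference shares, and the 5-percentage-point tolerance are all hypothetical and would need to come from the device's own clinical and statistical rationale.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares, tolerance=0.05):
    """Return subgroups that are under-represented in `records`.

    A subgroup is reported when its observed share falls below its
    reference share by more than `tolerance` (an assumed threshold).
    The result maps subgroup -> (observed_share, reference_share).
    """
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if expected - observed > tolerance:
            gaps[group] = (observed, expected)
    return gaps


# Hypothetical training set and intended-population shares.
train = (
    [{"age_band": "18-40"}] * 70
    + [{"age_band": "41-65"}] * 25
    + [{"age_band": "65+"}] * 5
)
reference = {"18-40": 0.40, "41-65": 0.35, "65+": 0.25}

gaps = representation_gaps(train, "age_band", reference)
```

With this example data, the "41-65" and "65+" bands are reported as under-represented, while "18-40" is not.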

Principle 6. Model Design Is Tailored to the Available Data and Reflects the Intended Use of the Device: Model design is suited to the available data and supports the active mitigation of known risks, like overfitting, performance degradation, and security risks. The clinical benefits and risks related to the product are well understood, used to derive clinically meaningful performance goals for testing, and support that the product can safely and effectively achieve its intended use. Considerations include the impact of both global and local performance and uncertainty/variability in the device inputs, outputs, intended patient populations, and clinical use conditions.

Principle 7. Focus Is Placed on the Performance of the Human-AI Team: Where the model has a “human in the loop,” human factors considerations and the human interpretability of the model outputs are addressed with emphasis on the performance of the Human-AI team, rather than just the performance of the model in isolation.

Principle 8. Testing Demonstrates Device Performance During Clinically Relevant Conditions: Statistically sound test plans are developed and executed to generate clinically relevant device performance information independently of the training data set. Considerations include the intended patient population, important subgroups, clinical environment and use by the Human-AI team, measurement inputs, and potential confounding factors.
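
Two of the considerations in Principle 8, independence from the training data and performance in important subgroups, lend themselves to a simple check. The sketch below assumes binary labels (0/1), patient-level identifiers, and a single subgroup attribute; the data and function names are hypothetical, and a real test plan would also specify sample sizes and confidence intervals.

```python
from collections import defaultdict


def shared_patients(train_ids, test_ids):
    """Return patient IDs present in both sets; should be empty."""
    return set(train_ids) & set(test_ids)


def subgroup_metrics(cases):
    """Compute sensitivity and specificity per subgroup.

    `cases` is an iterable of (subgroup, true_label, predicted_label)
    tuples with labels encoded as 0 or 1.
    """
    tallies = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for subgroup, truth, pred in cases:
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "tn" if pred == 0 else "fp"
        tallies[subgroup][key] += 1

    results = {}
    for subgroup, t in tallies.items():
        pos = t["tp"] + t["fn"]
        neg = t["tn"] + t["fp"]
        results[subgroup] = {
            "n": pos + neg,
            "sensitivity": t["tp"] / pos if pos else None,
            "specificity": t["tn"] / neg if neg else None,
        }
    return results


# Hypothetical held-out test cases.
test_cases = [
    ("female", 1, 1), ("female", 1, 1), ("female", 0, 0), ("female", 0, 1),
    ("male", 1, 0), ("male", 1, 1), ("male", 0, 0), ("male", 0, 0),
]
metrics = subgroup_metrics(test_cases)
overlap = shared_patients(["p1", "p2", "p3"], ["p4", "p5"])
```

Reporting the per-subgroup figures alongside the overall figure makes it visible when aggregate performance hides a weaker subgroup.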

Principle 9. Users Are Provided Clear, Essential Information: Users are provided ready access to clear, contextually relevant information that is appropriate for the intended audience (such as health care providers or patients) including: the product’s intended use and indications for use, performance of the model for appropriate subgroups, characteristics of the data used to train and test the model, acceptable inputs, known limitations, user interface interpretation, and clinical workflow integration of the model. Users are also made aware of device modifications and updates from real-world performance monitoring, the basis for decision-making when available, and a means to communicate product concerns to the developer.

Principle 10. Deployed Models Are Monitored for Performance and Re-training Risks Are Managed: Deployed models have the capability to be monitored in “real world” use with a focus on maintained or improved safety and performance. Additionally, when models are periodically or continually trained after deployment, there are appropriate controls in place to manage risks of overfitting, unintended bias, or degradation of the model (for example, dataset drift) that may impact the safety and performance of the model as it is used by the Human-AI team.
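
Dataset drift, named in Principle 10 as one source of degradation, is often tracked by comparing the distribution of an input feature in deployment against its distribution at training time. The sketch below uses the Population Stability Index (PSI) as one possible measure; this is a swapped-in technique chosen for illustration, not something the FDA GMLP prescribes. The bin count and the 0.25 alert level are assumptions (0.25 is a common rule of thumb, not a regulatory threshold).

```python
import math


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """Compute PSI between two non-empty samples of one numeric feature.

    Bins are equal-width over the baseline range; values outside that
    range are counted in the first or last bin. Empty bins are floored
    at `eps` so the logarithm stays defined.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the range is zero

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    expected = shares(baseline)
    actual = shares(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Hypothetical feature values at training time and in deployment.
baseline = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
drift_alert = psi_shifted > 0.25  # assumed alert level
```

A monitoring job could compute this per feature on a schedule and route any alert into the risk management process before a re-training decision is made.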

Below is the list of the controls that are part of this compliance category:
