FDA - AI based SaMD

Data Governance (DG)


Last updated 2 years ago

This compliance category contains requirements concerning the collection, management and use of data in AI based SaMD.

Robust algorithms typically require large, high-quality, and well-labelled training data sets. According to the US FDA AI and ML Discussion Paper, an organisation developing AI based SaMD should have:

Data management plan addressing how data will be collected, added to existing data sets, and used: This data management plan may include a quality assurance (QA) plan for determining which new data are appropriate for inclusion as part of an expanded training data set; an approach to the reference standard determination; a data augmentation strategy that allows for additional training and independent test data to be added; and an auditing and sequestration strategy to monitor, document test dataset independence, and control access to both the training and test datasets as additional data are being included and any revised algorithm is being retrained and tested.
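The auditing and sequestration strategy described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not a method prescribed by the FDA paper: the `SequesteredSet` class, the role names, and the fingerprinting scheme are all hypothetical. The idea is to fingerprint the membership of the test set so that any change is detectable, and to log every access request, granted or not.

```python
import hashlib
from datetime import datetime, timezone


def dataset_fingerprint(record_ids):
    """Return a SHA-256 digest over the sorted record IDs.

    The digest changes whenever a record is added to or removed from the set,
    which gives a simple way to document that a test set was left untouched.
    """
    h = hashlib.sha256()
    for rid in sorted(record_ids):
        h.update(rid.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


class SequesteredSet:
    """Hypothetical wrapper around a held-out test set (illustration only)."""

    def __init__(self, record_ids, allowed_roles):
        self._ids = frozenset(record_ids)
        self.allowed_roles = set(allowed_roles)
        self.fingerprint = dataset_fingerprint(self._ids)
        self.audit_log = []

    def request_access(self, user, role, purpose):
        """Log the request, then return the record IDs or raise PermissionError."""
        granted = role in self.allowed_roles
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "purpose": purpose,
            "granted": granted,
        })
        if not granted:
            raise PermissionError(f"role {role!r} may not read the test set")
        return self._ids
```

In this sketch, a training pipeline running under a role outside `allowed_roles` would be refused and the refusal would still appear in `audit_log`. A real system would persist the log and the fingerprint outside the process.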

Similarly, in the US FDA Good Machine Learning Practice (GMLP) guiding principles:

Principle 3. Clinical Study Participants and Data Sets Are Representative of the Intended Patient Population: Data collection protocols should ensure that the relevant characteristics of the intended patient population (for example, in terms of age, gender, sex, race, and ethnicity), use, and measurement inputs are sufficiently represented in a sample of adequate size in the clinical study and training and test datasets, so that results can be reasonably generalized to the population of interest. This is important to manage any bias, promote appropriate and generalizable performance across the intended patient population, assess usability, and identify circumstances where the model may underperform.
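One way to check the representation requirement in Principle 3 is to compare subgroup shares in a dataset against reference shares for the intended patient population. The sketch below is only an illustration: the function names, the record layout (a list of dicts), the reference shares, and the 5 percentage point tolerance are all assumptions, not values taken from the GMLP principles.

```python
from collections import Counter


def subgroup_shares(records, attribute):
    """Return the share of records in each subgroup of the given attribute."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(records, attribute, reference_shares, tolerance=0.05):
    """Return subgroups whose observed share is below the reference share
    by more than `tolerance`, mapped to (observed, expected) pairs.

    `reference_shares` is an assumed description of the intended patient
    population, for example taken from epidemiological data.
    """
    observed = subgroup_shares(records, attribute)
    gaps = {}
    for group, expected in reference_shares.items():
        actual = observed.get(group, 0.0)
        if expected - actual > tolerance:
            gaps[group] = (actual, expected)
    return gaps
```

Running the same check separately on the training set and on the test set shows whether both are representative, not only their union. A share-based check does not replace a sample size assessment, since a subgroup can match its reference share and still be too small to evaluate.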

Principle 4. Training Data Sets Are Independent of Test Sets: Training and test datasets are selected and maintained to be appropriately independent of one another. All potential sources of dependence, including patient, data acquisition, and site factors, are considered and addressed to assure independence.
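The patient-level part of Principle 4 can be sketched as a split that assigns all records of one patient to the same side. The example below is a minimal illustration under stated assumptions: records are dicts with a `patient_id` key, and the function names, the salt, and the 20% test fraction are hypothetical. Hashing the patient ID keeps the assignment stable when new data are added later, so a patient never moves from one set to the other.

```python
import hashlib


def assign_split(patient_id, test_fraction=0.2, salt="dg-demo"):
    """Deterministically map a patient ID to "train" or "test".

    The salted SHA-256 hash is turned into a number in [0, 1); patients whose
    number falls below `test_fraction` go to the test set.
    """
    digest = hashlib.sha256(f"{salt}:{patient_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


def split_records(records, test_fraction=0.2):
    """Split records so that every patient appears in exactly one set."""
    train, test = [], []
    for rec in records:
        side = assign_split(rec["patient_id"], test_fraction)
        (test if side == "test" else train).append(rec)
    return train, test


def shared_patients(train, test):
    """Return patient IDs present in both sets; an empty set means no leakage."""
    return {r["patient_id"] for r in train} & {r["patient_id"] for r in test}
```

The same pattern could key on a site or acquisition device identifier when those are the suspected sources of dependence. With few patients the realised test fraction can differ noticeably from `test_fraction`, so the resulting set sizes should be checked.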

Data Governance (DG) is also in line with IMDRF/SaMD N23, especially the following sections:

7.1 -- Product Planning

7.3 -- Document Control and Records

7.4 -- Configuration Management and Control

8.2 -- Design

8.4 -- Verification and Validation

Referenced documents:

US FDA AI and ML Discussion Paper
US FDA Good Machine Learning Practice (GMLP)
IMDRF/SaMD N23

Below is the list of the controls that are part of this compliance category:

DG01 - Define Sets
DG02 - Dataset Governance Policies
DG03 - Dataset Design Choices
DG04 - Dataset Source Information
DG05 - Dataset Annotations Information
DG06 - Dataset Labels Information
DG07 - Dataset Cleaning
DG08 - Dataset Enrichment
DG09 - Dataset Aggregation
DG10 - Dataset Description, Assumptions and Purpose
DG11 - Dataset Transformation Rationale
DG12 - Dataset Bias Identification
DG13 - Dataset Bias Mitigation
DG14 - Dataset Bias Analysis Action and Assessment
DG15 - Dataset Gaps and Shortcomings
DG16 - Dataset Bias Monitoring - Ongoing
DG17 - Dataset Bias Special/Protected Categories