Data Governance (DG)
This compliance category contains requirements concerning the collection, management, and use of data in AI-based SaMD.
Robust algorithms typically require large, high-quality, well-labelled training data sets. According to the US FDA AI and ML Discussion Paper, an organisation developing AI-based SaMD should have the following in place:
Data management plan addressing how data will be collected, added to existing data sets, and used: This data management plan may include a quality assurance (QA) plan for determining which new data are appropriate for inclusion as part of an expanded training data set; an approach to the reference standard determination; a data augmentation strategy that allows for additional training and independent test data to be added; and an auditing and sequestration strategy to monitor, document test dataset independence, and control access to both the training and test datasets as additional data are being included and any revised algorithm is being retrained and tested.
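One way to picture the auditing and sequestration part of such a plan is the minimal Python sketch below. It is an illustration only, not a method prescribed by the FDA paper: the function names (`record_fingerprint`, `admit_new_training_data`), the record fields, and the use of content hashes are all assumptions made for this example. The idea is that every candidate record is compared against fingerprints of the sequestered test set before it joins the training data, and every decision is written to an audit log.

```python
import hashlib
import json
from datetime import datetime, timezone


def record_fingerprint(record: dict) -> str:
    """Return a stable SHA-256 hash of a record's content (hypothetical helper)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def admit_new_training_data(candidates, sequestered_test_hashes, audit_log):
    """Accept candidates for training only if they do not duplicate a test record.

    Each decision is appended to audit_log so inclusion history can be reviewed.
    """
    accepted = []
    for record in candidates:
        digest = record_fingerprint(record)
        overlaps_test = digest in sequestered_test_hashes
        audit_log.append({
            "fingerprint": digest,
            "decision": "rejected_overlaps_test_set" if overlaps_test else "accepted",
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        if not overlaps_test:
            accepted.append(record)
    return accepted


# Illustrative data only; field names are assumptions.
test_set = [{"patient_id": "P001", "image": "scan_17"}]
test_hashes = {record_fingerprint(r) for r in test_set}
log = []
new_batch = [
    {"patient_id": "P001", "image": "scan_17"},  # exact copy of a test record
    {"patient_id": "P002", "image": "scan_03"},
]
accepted = admit_new_training_data(new_batch, test_hashes, log)
```

Exact-match hashing only catches verbatim duplicates. A real plan would likely also need access controls on the test data and checks for near-duplicates or shared patients, which this sketch does not attempt.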
Similarly, in the US FDA Good Machine Learning Practice (GMLP) guiding principles:
Principle 3. Clinical Study Participants and Data Sets Are Representative of the Intended Patient Population: Data collection protocols should ensure that the relevant characteristics of the intended patient population (for example, in terms of age, gender, sex, race, and ethnicity), use, and measurement inputs are sufficiently represented in a sample of adequate size in the clinical study and training and test datasets, so that results can be reasonably generalized to the population of interest. This is important to manage any bias, promote appropriate and generalizable performance across the intended patient population, assess usability, and identify circumstances where the model may underperform.
Principle 4. Training Data Sets Are Independent of Test Sets: Training and test datasets are selected and maintained to be appropriately independent of one another. All potential sources of dependence, including patient, data acquisition, and site factors, are considered and addressed to assure independence.
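As a rough illustration of Principles 3 and 4, the sketch below splits records at the patient level, so that no patient contributes data to both the training and the test set, and reports the share of patients per subgroup so it can be compared with the intended population. This is a simplified example under stated assumptions: the function names (`split_by_patient`, `subgroup_shares`), the `patient_id` and `sex` fields, and the 20% test fraction are hypothetical, and the split addresses only patient-level dependence, not data acquisition or site factors.

```python
import random
from collections import Counter


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Assign whole patients to either the training set or the test set."""
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_patients = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_patients]
    test = [r for r in records if r["patient_id"] in test_patients]
    return train, test


def subgroup_shares(records, attribute):
    """Return the share of distinct patients for each value of an attribute."""
    value_by_patient = {r["patient_id"]: r[attribute] for r in records}
    counts = Counter(value_by_patient.values())
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


# Illustrative data only: 10 patients with 2 records each.
records = [
    {"patient_id": f"P{i:03d}", "sex": "F" if i % 2 == 0 else "M", "visit": v}
    for i in range(10)
    for v in (1, 2)
]
train, test = split_by_patient(records)
train_patients = {r["patient_id"] for r in train}
test_patients = {r["patient_id"] for r in test}
overall_shares = subgroup_shares(records, "sex")
```

Comparing `subgroup_shares(train, ...)` and `subgroup_shares(test, ...)` against the expected distribution of the intended patient population is one simple way to spot under-represented groups. What counts as sufficient representation remains a clinical and statistical judgement that this sketch does not make.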
Data Governance (DG) also aligns with IMDRF/SaMD N23, in particular the following sections:
7.1 -- Product Planning
7.3 -- Document Control and Records
7.4 -- Configuration Management and Control
8.2 -- Design
8.4 -- Verification and Validation
Below is the list of the controls that are part of this compliance category: