Training and Test Dataset (TTD)
AI systems based on machine learning (ML) need training and test data to train the system and to verify its intended behaviour. The training and test data should fit the intended behaviour with respect to data type and quality, and should be validated for currency and relevance to the intended purpose. The amount of training and test data required varies with the intended functionality and the complexity of the environment. The data should have sufficiently diverse features to give the AI system strong predictive power. Training and test data may not be available within the company and may have to be sourced externally; data quality has to be ensured in that case as well.
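As an illustration only, the currency, quality and diversity expectations above could be turned into simple automated checks. The sketch below is not part of the TTD controls. It assumes each record is a dictionary with hypothetical `label`, `collected` and `value` fields, and the thresholds `max_age_days` and `min_class_share` are placeholder values that an organisation would set for itself.

```python
from collections import Counter
from datetime import date, timedelta


def check_dataset(train, test, today, max_age_days=365, min_class_share=0.05):
    """Return a list of findings; an empty list means no issue was detected.

    Field names and thresholds are hypothetical and chosen for illustration.
    """
    findings = []
    records = train + test

    # Currency: count records collected before the allowed cut-off date.
    cutoff = today - timedelta(days=max_age_days)
    stale = sum(1 for r in records if r["collected"] < cutoff)
    if stale:
        findings.append(f"{stale} record(s) older than {max_age_days} days")

    # Quality: count records whose feature value is missing.
    missing = sum(1 for r in records if r["value"] is None)
    if missing:
        findings.append(f"{missing} record(s) with a missing value")

    # Diversity: flag labels that make up too small a share of training data.
    counts = Counter(r["label"] for r in train)
    total = sum(counts.values())
    for label, n in sorted(counts.items()):
        if n / total < min_class_share:
            findings.append(f"label '{label}' is under-represented in training data")

    return findings
```

A call such as `check_dataset(train, test, date.today())` returns plain-text findings that could be recorded as evidence for a data quality review. The same checks apply whether the data was collected internally or sourced externally.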
Controls related to this risk category are listed below:
TTD 01 - Data Management Procedures
TTD 02 - Data Collection Assessment
TTD 03 - Dataset Governance Policies
TTD 04 - Dataset Annotations and Labels Information
TTD 05 - Dataset Cleaning, Enrichment and Aggregation
TTD 06 - Dataset Description, Assumptions and Purpose
TTD 07 - Dataset Transformation Rationale
TTD 08 - Dataset Bias Identification and Mitigation
TTD 09 - Dataset Bias Analysis Action and Assessment