The EU AI Act's framework for high-risk AI systems begins to apply on 2 August 2026. For regulated sectors, the most operationally demanding requirement is simple to state but hard to meet: being able to reconstruct, precisely and on demand, the exact composition of the dataset that trained each model in production.
Article 10 sets specific data governance obligations for datasets used to train, validate, and test high-risk systems. The gap between what organizations typically document and what the regulation will require them to demonstrate comes down to architecture, not policy.
This analysis covers three things: what Article 10 requires to be reproducible, what a technically sufficient audit response looks like, and what architectural conditions get you there.
What EU AI Act Article 10 Actually Requires
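Regulation (EU) 2024/1689 requires providers of high-risk AI systems to apply appropriate data governance and management practices to training, validation, and test datasets. Under Article 10(2), those practices must cover in particular:
- The relevant design choices
- Data collection processes and the origin of the data (for personal data, also the original purpose of collection)
- Data-preparation operations such as annotation, labelling, cleaning, updating, enrichment, and aggregation
- Assumptions about what the data are supposed to measure and represent
- An assessment of the availability, quantity, and suitability of the datasets needed
- Examination for possible biases likely to affect health and safety, harm fundamental rights, or lead to prohibited discrimination
- Appropriate measures to detect, prevent, and mitigate those biases
- Identification of relevant data gaps or shortcomings, and how they can be addressed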
Datasets must also be relevant, sufficiently representative, and, to the best extent possible, free of errors and complete in view of the intended purpose. They must take into account the specific geographical, contextual, behavioural, or functional setting where the system will be used.
The practical consequence: expect compliance with EU AI Act Article 10 data governance requirements to be assessed against reproducible technical evidence. A well-written policy document, on its own, is not enough.
The Operational Challenge: Reconstructing Dataset Composition
In practice, training datasets are usually built through:
- Extractions from production databases into non-production environments
- Manual or semi-automated transformations (anonymization, masking, aggregation)
- Consolidation into training, validation, and test pipelines
- Successive dataset versions accumulated as the model evolves
Article 10 does not prescribe a mechanism, but answering its questions about data origin and preparation operations effectively requires field-level provenance traceability: knowing the origin of each field, what transformations were applied, and when. Most current pipelines produce datasets that work for training but whose exact composition isn't recorded in a reproducible way.
What an Audit Query Actually Looks Like
The clearest way to understand what Article 10 demands is to look at a concrete question a supervisory authority might ask:
"For the fraud detection model deployed in production since March 12th, state how many records containing personal data from non-EU citizens entered the training dataset. Detail what anonymization transformations were applied to each field and under which version of the data governance policy in effect at the time."
Two architectures produce very different responses.
Architecture A — Parallel Documentation
Written policies, spreadsheet records, pipeline descriptions in design documents. Responding means convening the data team, recovering the pipeline version, reconstructing the transformations. Estimated time: 3 to 6 weeks. Result: partial, with uncertainty about accuracy.
Architecture B — Reproducible Evidence
The system records, for each sensitive data point, its origin (tap, table, column), the transformations applied (rule, parameters, policy version), and its destination (sink, model, environment). The response is a direct query to the lineage register. Estimated time: minutes. Result: complete and verifiable.
The difference isn't documentary quality. It's structural: whether field-level traceability is built into how the system works, or requires manual reconstruction every time.
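To make the Architecture B response concrete, here is a minimal sketch of the audit question above answered as a query. It assumes a hypothetical `lineage_events` table in SQLite with one row per field-level transformation. The schema, the `subject_region` flag, the model identifier, and the sample rows are all illustrative assumptions, not a prescribed format.

```python
import sqlite3

# Hypothetical lineage register: one row per field-level transformation event.
# Table name, column names, and sample values are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE lineage_events (
        record_id       TEXT,
        source_table    TEXT,
        source_column   TEXT,
        subject_region  TEXT,  -- 'EU' or 'non-EU', assumed to be captured at origin
        transformation  TEXT,
        policy_version  TEXT,
        model_id        TEXT,
        dataset_version TEXT
    )
""")
conn.executemany(
    "INSERT INTO lineage_events VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("r1", "customers", "iban",  "non-EU", "tokenize", "v2.3.0", "fraud-model", "ds-7"),
        ("r1", "customers", "email", "non-EU", "mask",     "v2.3.0", "fraud-model", "ds-7"),
        ("r2", "customers", "iban",  "EU",     "tokenize", "v2.3.0", "fraud-model", "ds-7"),
        ("r3", "customers", "iban",  "non-EU", "tokenize", "v2.3.0", "fraud-model", "ds-7"),
        ("r3", "customers", "email", "non-EU", "mask",     "v2.3.0", "fraud-model", "ds-7"),
    ],
)


def count_non_eu_records(conn, model_id):
    """Number of distinct non-EU records that entered the model's training data."""
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT record_id) FROM lineage_events "
        "WHERE model_id = ? AND subject_region = 'non-EU'",
        (model_id,),
    ).fetchone()
    return count


def transformations_by_field(conn, model_id):
    """Per field: which transformation ran, under which policy version, how often."""
    return conn.execute(
        "SELECT source_column, transformation, policy_version, COUNT(*) "
        "FROM lineage_events "
        "WHERE model_id = ? AND subject_region = 'non-EU' "
        "GROUP BY source_column, transformation, policy_version "
        "ORDER BY source_column",
        (model_id,),
    ).fetchall()


print(count_non_eu_records(conn, "fraud-model"))
print(transformations_by_field(conn, "fraud-model"))
```

The point of the sketch is that both halves of the auditor's question reduce to filters and aggregations over data that already exists, rather than a reconstruction exercise.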
Three Technical Conditions for EU AI Act Training Data Compliance
Three capabilities, together, make it possible to produce the evidence Article 10 requires without adding overhead to the model development cycle. None of them are new — what's new is that auditors will now ask for them.
1. Field-Level Lineage
For each sensitive data point entering a training, validation, or test dataset, the system records:
- Origin: tap, table, column, record identifier
- Transformations applied: anonymization, masking, synthesis, aggregation — with parameters and rule version
- Destination: sink, dataset version, model, environment
This record must outlast the pipeline that generated it. If the pipeline is deleted or rewritten, the lineage data still needs to answer retrospective queries. The standard approach is to separate the lineage layer from the execution layer, with auditable and immutable event storage.
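One way to approximate auditable, immutable event storage is an append-only log in which each entry carries a hash of its predecessor, so that any later modification becomes detectable. The sketch below is an in-memory illustration only. The `LineageEvent` and `LineageLog` names and their fields are assumptions, and a real deployment would persist entries in storage separate from the pipeline.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


@dataclass(frozen=True)
class LineageEvent:
    """Hypothetical field-level lineage record: origin, transformation, destination."""
    origin: dict          # e.g. tap, table, column, record identifier
    transformation: dict  # e.g. rule, parameters, policy version
    destination: dict     # e.g. sink, dataset version, model, environment


class LineageLog:
    """Append-only log; each entry is chained to the previous one by SHA-256."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(event_dict, prev_hash):
        payload = json.dumps({"event": event_dict, "prev": prev_hash}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        event_dict = asdict(event)
        entry_hash = self._digest(event_dict, prev_hash)
        self._entries.append({"event": event_dict, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute the chain; returns False if any entry was altered or reordered."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(entry["event"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


log = LineageLog()
log.append(LineageEvent(
    origin={"tap": "crm", "table": "customers", "column": "email", "record_id": "r1"},
    transformation={"rule": "mask", "params": {"keep": 1}, "policy_version": "v2.3.0"},
    destination={"sink": "training", "dataset_version": "ds-7", "model": "fraud-model"},
))
print(log.verify())
```

Because the log holds plain event data rather than references to pipeline code, it can still answer retrospective queries after the pipeline that produced it is deleted or rewritten.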
2. Versioned Policy as Code
Governance rules — which fields are sensitive, which transformation applies, which exceptions exist — need to live in versionable artefacts, not documents. When an auditor asks what policy governed a transformation fourteen months ago, you need to be able to retrieve the exact version, show who changed it and when, and reproduce its behavior on a specific data point from that period.
In practice: policy artefacts stored in Git, versioned with semantic tags, integrated into the governance CI/CD pipeline, with a register linking each operation to the policy version active at the time.
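The register linking operations to policy versions can be sketched as follows. For self-containment, the policy versions live in an in-memory dictionary standing in for tagged Git artefacts. The version tags, field names, rule names, and the `apply_policy` and `replay` helpers are all hypothetical.

```python
import hashlib

# Stand-in for policy artefacts retrieved from Git by semantic version tag.
# Versions, fields, and rule names are illustrative assumptions.
POLICIES = {
    "v1.0.0": {"email": "mask", "iban": "hash"},
    "v1.1.0": {"email": "hash", "iban": "hash"},
}


def mask(value):
    """Keep the first character and replace the rest with asterisks."""
    return value[0] + "*" * (len(value) - 1) if value else value


def sha256_hex(value):
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


RULES = {"mask": mask, "hash": sha256_hex}

# Register linking each operation to the policy version active at the time.
OPERATION_REGISTER = []


def apply_policy(policy_version, field, value, operation_id):
    """Transform one value and record which policy version governed the operation."""
    rule_name = POLICIES[policy_version][field]
    OPERATION_REGISTER.append({
        "operation_id": operation_id,
        "field": field,
        "rule": rule_name,
        "policy_version": policy_version,
    })
    return RULES[rule_name](value)


def replay(operation_id, value):
    """Reproduce a past transformation using the policy version on record."""
    entry = next(e for e in OPERATION_REGISTER if e["operation_id"] == operation_id)
    rule_name = POLICIES[entry["policy_version"]][entry["field"]]
    return RULES[rule_name](value)


original = apply_policy("v1.0.0", "email", "a@b.eu", operation_id="op-1")
print(original)
print(replay("op-1", "a@b.eu") == original)
```

Replaying against the recorded version, rather than whatever policy is current, is what allows behavior from fourteen months ago to be reproduced after the rules have changed.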
3. Origin-Based Execution
Sensitive data transformation happens before the dataset leaves the production perimeter — not as a later step on already-extracted copies. This matters especially where the EU AI Act intersects with NIS2 on the ICT supply chain. Non-production environments with real, uncontrolled data fall within the regulatory perimeter.
Origin-based execution closes that gap and directly addresses EU AI Act data quality requirements.
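As a rough illustration of origin-based execution, the export step itself can apply the transformations and refuse to emit any sensitive field that has no rule, so untransformed values never reach a non-production copy. The field classification, the salt handling, and the `export_from_production` function below are assumptions made for this sketch.

```python
import hashlib

# Hypothetical classification of sensitive fields; in practice this would
# come from the versioned governance policy.
SENSITIVE_FIELDS = {"email", "iban"}


def pseudonymize(value, salt="demo-salt"):
    """Salted hash, truncated. This is pseudonymization, not anonymization.
    The hard-coded salt is for illustration only."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


TRANSFORMS = {"email": pseudonymize, "iban": pseudonymize}


def export_from_production(rows, transforms=TRANSFORMS):
    """Yield rows that are safe to leave the production perimeter.

    Sensitive fields are transformed before export. A sensitive field with
    no transformation defined aborts the export instead of leaking through.
    """
    for row in rows:
        safe_row = {}
        for field, value in row.items():
            if field in SENSITIVE_FIELDS:
                if field not in transforms:
                    raise ValueError(
                        f"no transformation defined for sensitive field {field!r}"
                    )
                safe_row[field] = transforms[field](value)
            else:
                safe_row[field] = value
        yield safe_row


production_rows = [{"id": 1, "email": "a@b.eu", "amount": 10.0}]
for safe in export_from_production(production_rows):
    print(safe)
```

Failing closed is the design choice that matters here: a newly added sensitive column blocks the export until a rule exists for it, instead of flowing into a training environment unnoticed.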
EU AI Act, NIS2, and DORA: One Architecture, Three Regulations
The three European frameworks with the most operational impact on non-production data — EU AI Act, NIS2, and DORA — share a common technical core: auditable evidence of how sensitive data is handled across the chain.

