EDPB Opinion on AI Models and GDPR Principles
Introduction
The European Data Protection Board (EDPB) has published an opinion addressing data protection in AI models. For tech companies operating in the bloc, it covers how to assess whether an AI model is anonymous, the legal bases for processing personal data, and mitigation measures for limiting impacts on data subjects.
The opinion was published in response to a request from Ireland’s Data Protection Commission, the lead supervisory authority under the GDPR for many multinationals.
What were the key points of the guidance?
The DPC sought more information about:
- When and how an AI model can be considered “anonymous” — that is, very unlikely to identify individuals whose data was used in its creation, and therefore exempt from privacy laws.
- When companies can say they have a “legitimate interest” in processing individuals’ data for AI models and, therefore, don’t need to seek their consent.
- The consequences of the unlawful processing of personal data in the development phase of an AI model.
When an AI model can be considered ‘anonymous’
An AI model can be considered anonymous if the chance that personal data used for training will be traced back to any individual — either directly or indirectly, for example through a prompt — is deemed “insignificant.” Anonymity is assessed by supervisory authorities on a “case-by-case” basis, and “a thorough evaluation of the likelihood of identification” is required.
However, the opinion does provide a list of ways that model developers might demonstrate anonymity, including:
- Taking steps during source selection to avoid or limit the collection of personal data, such as excluding irrelevant or inappropriate sources.
- Implementing strong technical measures to prevent re-identification.
- Ensuring data is sufficiently anonymised.
- Applying data minimisation techniques to avoid unnecessary personal data.
- Regularly assessing the risks of re-identification through testing and audits.
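The source-selection and data-minimisation measures above can be sketched in code. The following is a minimal, illustrative Python example of filtering obvious personal identifiers out of training text before ingestion; the regex patterns and the `scrub` function are hypothetical simplifications — a real pipeline would use far more robust detection (e.g. named-entity recognition) and would still not, on its own, demonstrate anonymity to a supervisory authority.

```python
import re

# Illustrative patterns for two common identifier types only.
# Real-world PII detection needs much broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(text: str) -> str:
    """Replace matched identifiers with a type placeholder, e.g. [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

sample = "Contact jane.doe@example.com or +44 20 7946 0958 for details."
print(scrub(sample))
```

Filtering at collection time reduces the amount of personal data that ever enters the training set, which supports both the minimisation and the re-identification-risk measures listed above.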
Kathryn Wynn, a data protection lawyer from Pinsent Masons, said that these requirements would make it difficult for AI companies to claim anonymity.
When AI companies can process personal data without the individuals’ consent
The EDPB opinion outlines that AI companies can process personal data without consent under the “legitimate interest” basis if they can demonstrate that their interest, such as improving models or services, outweighs the individual’s rights and freedoms.
This is particularly important to tech firms, as seeking consent for the vast amounts of data used to train models is neither practical nor economically viable. But to qualify, companies will need to pass these three tests:
- Legitimacy test: A lawful, legitimate reason for processing personal data must be identified.
- Necessity test: The data processing must be necessary for its purpose. There can be no alternative, less intrusive ways of achieving the company’s goal, and the amount of data processed must be proportionate.
- Balancing test: The legitimate interest in the data processing must outweigh the impact on individuals’ rights and freedoms. This takes into account whether individuals would reasonably expect their data to be processed in this way, such as if they made it publicly available or have a relationship with the company.
Even if a company fails the balancing test, it may still not be required to gain the data subjects’ consent if it applies mitigating measures to limit the processing’s impact. Such measures include:
- Technical safeguards: Applying safeguards that reduce security risks, such as encryption.
- Pseudonymisation: Replacing or removing identifiable information to prevent data from being linked to an individual.
- Data masking: Substituting real personal data with fake data when actual content is not essential.
- Mechanisms for data subjects to exercise their rights: Making it easy for individuals to exercise their data rights, such as opting out, requesting erasure, or requesting data correction.
- Transparency: Publicly disclosing data processing practices through media campaigns and transparency labels.
- Web scraping-specific measures: Implementing restrictions to prevent unauthorised personal data scraping, such as offering an opt-out list to data subjects or excluding sensitive data.
Consequences of unlawfully processing personal data in AI development
If a model is developed by processing personal data in a way that violates the GDPR, this will affect how the model is allowed to operate. The relevant supervisory authority evaluates “the circumstances of each individual case,” but the opinion provides examples of possible considerations:
- If the same company retains and processes personal data, the lawfulness of both the development and deployment phases must be assessed based on case specifics.
- If another firm processes personal data during deployment, the EDPB will consider if that firm did an appropriate assessment of the model’s lawfulness beforehand.
- If the data is anonymised after unlawful processing, subsequent processing of the non-personal data is not subject to the GDPR. However, any subsequent personal data processing would still be subject to the regulation.
Why AI firms should pay attention to the guidance
The EDPB’s guidance is crucial for tech firms. Although it is not legally binding, it influences how privacy laws are enforced in the EU.
Indeed, companies can be fined up to €20 million or 4% of their global annual turnover — whichever is larger — for GDPR infringements. They might even be required to change how their AI models operate or delete them entirely.
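The “whichever is larger” rule above is a simple maximum, which the following sketch makes concrete; the function name is ours, and the figures are upper bounds on fines, not amounts actually imposed.

```python
def max_gdpr_fine(annual_turnover_eur: float) -> float:
    """Upper bound on a GDPR fine for the most serious infringements:
    the greater of a flat 20 million euros or 4% of annual turnover."""
    return max(20_000_000, 0.04 * annual_turnover_eur)

# A firm with 10 billion euros in turnover faces a cap of 400 million euros;
# for smaller firms, the flat 20 million euro figure dominates.
print(max_gdpr_fine(10_000_000_000))
print(max_gdpr_fine(100_000_000))
```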
Conclusion
The EDPB’s opinion provides a comprehensive framework for AI companies to ensure the lawful processing of personal data. By understanding the key points of the guidance, AI firms can avoid potential legal issues and maintain the trust of their users.
FAQs
Q: What is the main purpose of the EDPB’s opinion on AI models?
A: The main purpose is to provide guidance on how AI companies can ensure the lawful processing of personal data in the development and deployment of AI models.
Q: What are the three tests that AI companies must pass to process personal data without consent?
A: The three tests are the legitimacy test, necessity test, and balancing test.
Q: What are the mitigating measures that AI companies can apply to limit the impact of processing personal data?
A: The mitigating measures include technical safeguards, pseudonymisation, data masking, mechanisms for data subjects to exercise their rights, transparency, and web scraping-specific measures.
Q: What are the consequences of unlawfully processing personal data in AI development?
A: The consequences include the impact on how the model will be allowed to operate, potential fines, and requirements to change or delete the model.