Italian DPA and ChatGPT

by Inge K. Brodersen, Caroline Bonde, Rune Hendriksen, Ebba Borg and Thomas Nygren



It seemed a watershed moment when US company OpenAI unleashed ChatGPT on the world in November 2022. Artificial intelligence has been around in various shapes and forms for quite a while, but arguably not on this level.

The chatbot's core function is to imitate a human engaging in a conversation with the user, which many would argue it does exceedingly well. ChatGPT builds on a series of increasingly capable GPT models (GPT is short for "generative pre-trained transformer") and represents a sophisticated development of a class of machine learning models called large language models ("LLMs"). LLMs are characterized by their processing of vast amounts of text and their ability to predict the next word in a sequence with increasing accuracy. They depend on access to huge data sets, such as books and articles. ChatGPT's "secret sauce" is the way it improves the quality of these predictions: a "transformer" neural network uses a "self-attention" mechanism to weigh the importance of different parts of the input and thereby better capture the context.
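For readers who want to see what "weighing the importance of different parts of the input" can look like in practice, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration only, not OpenAI's implementation: the three example words and their two-number vectors are made up, and the sketch skips the learned projections that real models apply before comparing words.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention over a list of word vectors.

    Simplifying assumption: each vector serves as its own query, key
    and value. Real transformer models first apply learned projections.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Compare this word with every word in the input (itself included).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new representation is a weighted mix of all word vectors.
        mixed = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical two-dimensional vectors for three words (illustration only).
words = ["the", "bank", "river"]
embeddings = [[0.1, 0.0], [1.0, 1.0], [0.9, 1.1]]

outputs, weights = self_attention(embeddings)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

In this toy input, the word "bank" places more weight on "river" than on "the", which is the sense in which the mechanism uses context to interpret each word.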

In a 2015 interview, one of OpenAI's founders, Elon Musk, said that the best way to ensure a good future is "to empower as many people as possible to have AI. If everyone has AI powers, then there's not any one person or a small set of individuals who can have AI superpower." At the outset ChatGPT seemed to live up to such lofty ideals; it jumped to a record-breaking one million users in just five days. From the get-go, ChatGPT has impressed users with its fluent responses, uncanny precision, and yes, even haiku poems.

Triple-trouble in March 2023

In March 2023, however, OpenAI/ChatGPT entered troubled waters. On 20 March OpenAI suffered a data breach which, according to the description on OpenAI's website, "caused the unintentional visibility of payment-related information of 1.2% of the ChatGPT Plus subscribers who were active during a specific nine-hour window". OpenAI further states that "it was possible for some users to see another active user’s first and last name, email address, payment address, credit card type and the last four digits (only) of a credit card number, and credit card expiration date."

Two days later, on 22 March, the Future of Life Institute published an open letter calling on all AI labs to "immediately pause for at least 6 months the training of AI systems more powerful than GPT-4" (GPT-4 being the latest model behind ChatGPT). Major AI players, including the aforementioned Elon Musk as well as Apple co-founder Steve Wozniak and MIT professor Max Tegmark, have signed the letter, which currently has almost 30 000 digital signatures.

The third blow came on 31 March from the Italian data protection authority (il Garante). The Italian DPA imposed an "immediate temporary limitation on the processing of Italian users' data by OpenAI", requiring compliance with the measures ordered by the DPA within 20 days to avoid the risk of fines of up to EUR 20 million or 4% of OpenAI's total worldwide annual turnover, whichever is higher.
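To illustrate how the fine ceiling mentioned above works, here is a small sketch in Python. It assumes the rule in GDPR article 83 (5), under which the applicable maximum is the higher of the fixed amount and the turnover-based amount. The function name and the turnover figures are hypothetical and are not OpenAI's actual numbers.

```python
def max_gdpr_fine_eur(annual_turnover_eur):
    """Upper limit of a fine under GDPR article 83 (5), in EUR.

    Assumption: the ceiling is the higher of EUR 20 million and 4% of
    the total worldwide annual turnover of the preceding financial year.
    """
    fixed_cap_eur = 20_000_000
    turnover_cap_eur = annual_turnover_eur * 4 / 100
    return max(fixed_cap_eur, turnover_cap_eur)


# Hypothetical turnover figures, for illustration only.
small_company_cap = max_gdpr_fine_eur(100_000_000)    # 4% would be EUR 4 million
large_company_cap = max_gdpr_fine_eur(1_000_000_000)  # 4% would be EUR 40 million
print(small_company_cap, large_company_cap)
```

For the smaller hypothetical company the fixed EUR 20 million ceiling applies, while for the larger one the turnover-based ceiling of EUR 40 million takes over. The figure is a maximum only; the actual amount of any fine is for the supervisory authority to determine.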

The Italian DPA and OpenAI

The order by the Italian DPA for the "temporary limitation" stated that the processing of personal data in ChatGPT violated the principles in GDPR articles 5-8, the information requirements in GDPR article 13, as well as the "privacy by design" requirement in GDPR article 25. However, other than listing the GDPR articles, the Italian DPA gave the public little insight into its legal reasoning. A press release accompanying the order provided some guidance, with the Italian DPA stating that:

  1. "no information is provided to users and data subjects whose data are collected by Open AI"
  2. "more importantly, there appears to be no legal basis underpinning the massive collection and processing of personal data in order to ‘train’ the algorithms on which the platform relies"
  3. "inaccurate personal data is processed" as tests had shown that ChatGPT "does not always match factual circumstances"
  4. "the lack of [any] age verification mechanism exposes children to receiving responses that are absolutely inappropriate to their age and awareness"

Further, the DPA referred to the aforementioned data breach of 20 March, possibly indicating that the breach played a role in its decision.

An apparently constructive dialogue was established between OpenAI and the Italian DPA in the following days, resulting in a new decision by the DPA on 11 April under which OpenAI was required to implement nine specific measures (measures 1-7 by 30 April, and measures 8-9 at later dates). Seven of the measures had a clear emphasis on the information obligation and facilitation of the data subjects' rights, including "promoting a non-marketing-oriented information campaign by 15 May 2023, on all the main Italian mass media". One measure was the deployment of an age verification tool, and one measure was "changing the legal basis of the processing of users’ personal data for the purpose of algorithmic training, by removing any reference to contract and relying on consent or legitimate interest as legal bases".

Provided that OpenAI successfully implemented the seven measures due by 30 April, the Italian DPA stated that it would suspend the enforcement of the temporary limitation decision.

On 28 April, various media reported that ChatGPT was back up and running again in Italy, and OpenAI confirmed it had fulfilled the conditions of the Italian DPA.

Some reflections on the application of data protection principles by the Italian DPA

As a general note, and as mentioned above, the Italian DPA has held its legal analysis close to its chest, making it difficult for external bystanders to analyze the legal implications in detail. With that caveat, below are some of our initial reflections.

From one perspective, one might be tempted to say that the outcome of the process so far has been a touch anti-climactic. It started out with an outright ban on one of the most advanced AI solutions the world has seen, only to be reduced to a discussion on (mostly) fulfilling information requirements, a common theme for any tool that processes personal data. While measures related to information requirements and data subjects' rights are by no means easy to implement, they are seldom infeasible, and it should at least in theory be more than possible for AI providers to implement them within relatively short deadlines.

The emphasis on requirements pertaining to information obligations and facilitation of data subjects' rights is perhaps also a bit surprising given the wording of the initial press release by the DPA, which stated that the question of "legal basis" was more important. As stated, the topic of "legal basis" was covered by a single measure, which merely required OpenAI to remove any reference to "contract" as legal basis (cf. GDPR article 6 (1) (b)) and instead rely on "consent" or "legitimate interest" (cf. GDPR article 6 (1) (a) and (f)). In our view the statement in the press release appears correct; at the very least, we believe the question of "legal basis" is legally more interesting.

Relying on consent requires an actual option to withdraw the consent, and it is hard to see how that could work in practice for a huge AI model. As such, we would expect "legitimate interests" to be the most practical and applicable legal basis. However, it would be interesting to know exactly how OpenAI may apply the balancing test underpinning any use of the "legitimate interest" basis, i.e. the weighing of the interests of the business against the protection of the individual's privacy. Adding to the complexity is the topic of inaccurate personal data, which was mentioned initially by the Italian DPA but is not mentioned (at least not directly) in the required measures of 11 April. For example: how can you avoid inaccuracy and responses that do not "match factual circumstances" if the user specifically asks for a fictitious response about a person? Follow-up question: would that fictitious response necessarily be to the detriment of an actual person? This opens up a broader discussion on the extent to which data protection principles can be applied to conversational AI solutions such as ChatGPT.

As for the age verification tool requirement, one may argue that the internet abounds with solutions that lack an age verification mechanism and may therefore expose children "to receiving responses that are absolutely inappropriate to their age and awareness". Several websites come to mind (no need to mention names here), which may lead to the conclusion that the decision by the Italian DPA appears somewhat arbitrary and even overreaching on this point.

The EU Artificial Intelligence Act

In addition to GDPR compliance, OpenAI will also be subject to future regulation, most notably the EU Artificial Intelligence Act (the "EU AI Act"), which was first proposed by the European Commission on 21 April 2021, with the latest draft from the Council of the European Union published on 6 December 2022.

The aim of the EU AI Act is to balance the numerous risks and benefits that the use of AI can entail. The EU AI Act thus proposes a harmonized measure “preventing Member States from imposing restrictions on the development, marketing and use of AI systems” while banning the use of certain AI systems with an unacceptable risk (such as harmful manipulation). High-risk AI will be subject to strict obligations but not banned, while other AI systems with lower risk levels will come with transparency obligations.

The EU AI Act will become law once both the Council (representing the 27 EU Member States) and the European Parliament agree on a common version of the text.

Concluding comments

We note that while supervisory authorities in several countries around the world have initiated investigations related to OpenAI's solutions (e.g. France, Germany, Ireland, Canada and South Korea), none of them has so far gone as far as the Italian DPA did. From a Scandinavian perspective, the Norwegian DPA and the Danish DPA advised that they would follow the development and discussions around ChatGPT closely, while the Swedish DPA had no specific plans to follow up on the Italian decision. On 13 April, the European Data Protection Board ("EDPB") announced that it "had discussed the recent enforcement action undertaken by the Italian data protection authority against Open AI about the Chat GPT service" and that it had launched a dedicated task force to "foster cooperation and to exchange information on possible enforcement actions conducted by data protection authorities". The Danish DPA is part of the task force established by the EDPB.

It is not surprising by any means that huge AI solutions such as ChatGPT are under the scrutiny of regulators. Artificial intelligence on this level is a new phenomenon, and academics and professionals alike have warned against the dangers of AI since, well, Frankenstein. However, the Italian DPA's use of the GDPR to regulate AI may seem like bringing a knife to a gunfight. If anything, the ongoing controversy surrounding ChatGPT illustrates a need for a comprehensive legal framework for artificial intelligence. It remains to be seen if the upcoming EU AI Act, which apparently is still in a quagmire of legal negotiations and political bargaining, will claim that role. In any event, we all need a solid legal framework and, equally important, a coordinated regulatory practice to ensure another watershed moment is avoided, i.e. the point of technological singularity when artificial intelligence becomes uncontrollable.
