AI and the GDPR: Providing Meaningful Information about the Logic Involved

The General Data Protection Regulation (GDPR), applicable in the EU Member States since 25 May 2018, is a comprehensive piece of legislation that aims to adequately protect the personal data of data subjects while taking into account the latest technological trends and developments. Still, some may argue that certain novel technologies, such as blockchain, were not addressed: take, for example, the inherent non-compliance of an append-only, immutable blockchain with “the right to be forgotten”, or, in the case of a public blockchain globally distributed across a myriad of nodes, with the GDPR requirements for the transfer of personal data to third countries.

Machine learning, a subset of AI on which most AI applications are based, is addressed by the GDPR in the context of “automated decision-making”. This term is described by Article 22 GDPR as “[making] a decision based solely on automated processing, including profiling, which produces legal effects concerning [the data subject] or similarly significantly affects [the data subject]”. Automated decision-making is generally prohibited unless one of the exceptions in Article 22 applies, such as when the data subject has given their explicit consent, provided that suitable measures are in place to safeguard the data subject’s rights, freedoms and legitimate interests, including at least “the right to obtain human intervention […], to express [their] point of view and to contest the decision”.

One of the GDPR requirements that sparked a debate in the context of machine learning is a concept that some have called the “right to explanation”. Recital 71 GDPR states that “[automated decision-making] should be subject to suitable safeguards, which should include […] [the data subject’s right] to obtain an explanation of the decision reached”, while Articles 13(2)(f) and 14(2)(g) GDPR oblige the data controller to provide, where automated decision-making exists, “meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject” (our emphases). Where the personal data have been collected from the data subject, such information must be provided by the controller “at the time when personal data are obtained [from the data subject]”, i.e. prior to the processing of the personal data. Where the personal data have not been obtained from the data subject, the general requirement is that such information must be provided within a “reasonable period” after obtaining the personal data, and no later than one month, “having regard to the specific circumstances in which the personal data are processed”. In addition, the individual has the right to request this information ex post as part of the data subject’s “right of access” prescribed by Article 15 GDPR.

The debate has focused on whether, and to what extent, the above provisions grant the data subject a “right to explanation”. Some legal scholars have gone so far as to argue that such a right does not exist, which has attracted criticism. Still, semantic intricacies and ambiguities aside, the GDPR is clear that, where an exception to the general prohibition of automated decision-making applies, (i) the data controller must provide the data subject with a meaningful explanation of an automated decision (where the personal data have been obtained from the data subject, such an explanation must be provided before the personal data are processed for making that decision), and (ii) the data subject has the right to obtain this information from the data controller. This means that a data controller that uses machine learning, e.g. a vendor providing an AI-based job matching web application that automatically shares candidates’ profiles with companies based on certain criteria without human involvement in the process, must ensure that the decisions reached by the app’s machine learning model are, to a reasonable extent, transparent to the user and reviewable. Specifically, the data controller must provide “meaningful information” on how the automated decision regarding the individual is arrived at, thus preventing a “black box” scenario.

What makes information about automated decision-making meaningful to the data subject? Such information must be clear and understandable to a data subject who presumably has no technical expertise. For example, in the case of the above-mentioned job matching web app, an automated decision to reject the applicant’s profile for sharing with a potential employer needs to be explained to the data subject in a plain and intelligible manner, so that the data subject understands why and how the system arrived at this decision. Elaborating on this example, a general explanation of this category of automated decisions could be incorporated into the app’s privacy policy and state, at a minimum, that certain criteria, e.g. the candidate’s professional experience and language proficiency, are matched by the app’s algorithm against the requirements specified by the employer to determine whether the applicant qualifies for a job and that, if the candidate’s qualifications fall below the threshold specified by the employer, the candidate’s profile is not shown by the system to this employer. This information lets the data subject understand that, if their profile is not shared by the app with the employer, the reason lies in the candidate’s lack of expertise. This way, the algorithmic logic on which the decision is based is explained to the data subject in general terms. (One could also note that, in this example, the decision not to share the candidate’s profile with the employer qualifies as an automated decision producing “similarly significant effects” on the data subject. As the Article 29 Data Protection Working Party (WP29) states in its Guidelines on Automated Individual Decision-Making and Profiling (Guidelines), “the threshold for significance must be similar to that of a decision producing a legal effect”, and continues that “decisions that deny someone an employment opportunity or put them at a serious disadvantage” qualify as such automated decisions.)
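Purely as an illustration, the job matching logic in this example could be sketched in Python as below. This is a minimal sketch under stated assumptions, not a description of any real system: the class and function names, the numeric language proficiency scale and the wording of the reasons are all hypothetical, and nothing in the GDPR or the Guidelines prescribes this form.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    years_experience: float
    language_level: int  # hypothetical scale: 1 (basic) to 6 (fluent)


@dataclass
class JobRequirements:
    min_years_experience: float
    min_language_level: int


def decide_and_explain(candidate, job):
    """Return (shared, reasons): the automated decision on whether the
    profile is shared with the employer, plus plain-language reasons."""
    reasons = []
    # Each criterion is compared against the employer's threshold, and
    # every unmet criterion is recorded in non-technical wording.
    if candidate.years_experience < job.min_years_experience:
        reasons.append(
            f"Your professional experience ({candidate.years_experience:g} years) "
            f"is below the {job.min_years_experience:g} years the employer requires."
        )
    if candidate.language_level < job.min_language_level:
        reasons.append(
            f"Your language proficiency (level {candidate.language_level}) "
            f"is below level {job.min_language_level}, which the employer requires."
        )
    shared = not reasons  # the profile is shared only if no criterion failed
    if shared:
        reasons.append("Your profile meets all of the employer's requirements.")
    return shared, reasons


# Hypothetical usage: a candidate with too little experience.
shared, reasons = decide_and_explain(Candidate(2, 5), JobRequirements(3, 4))
print(shared)
for reason in reasons:
    print(reason)
```

The point of the sketch is that the decision and its reasons come from the same comparisons, so the explanation given to the data subject reflects the criteria actually applied without exposing anything beyond them.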

Some AI vendors could argue that the “right to explanation” puts their trade secrets and intellectual property at risk. Recital 63 GDPR states that “[the data subject’s right of access] should not adversely affect the rights or freedoms of others, including trade secrets or intellectual property […]”. At the same time, WP29 notes in the Guidelines that “controllers cannot rely on the protection of their trade secrets as an excuse to deny access or refuse to provide information to the data subject”. This means that information on automated decision-making provided to the data subject does not necessarily have to contain specific descriptions of proprietary algorithms or the “know-how” that underpins the process. The most important quality of such information is that it gives the data subject a clear understanding of how the decision is arrived at. This can be achieved by a non-technical description specifying what personal data are used and, in general terms, how the system processes these data when making an automated decision. As WP29 states in the Guidelines, “[t]he controller should find simple ways to tell the data subject about the rationale behind, or the criteria relied on in reaching the decision. The GDPR requires the controller to provide meaningful information about the logic involved, not necessarily a complex explanation of the algorithms used or disclosure of the full algorithm. The information provided should, however, be sufficiently comprehensive for the data subject to understand the reasons for the decision”.

Sergii Shcherbak