The first day of CPDP offered an interesting panel on algorithmic transparency, which I summarise here. (There were many panels on closely related topics, and I have folded some of the remarks made there into this summary.)

The panel kicked off with Mark Rotenberg from EPIC, which has already done considerable work in the area of algorithmic transparency. He quoted Edison as saying:

What man creates with his hands, he should control with his head.

(Update: apparently that is a misquote; the actual quote is: ‘Whatever the mind of man creates, should be controlled by man’s character’.) The convergence of state authority, capitalism and IT (a.k.a. “surveillance capitalism”, a term coined by Shoshana Zuboff) is caused by a lack of data protection and transparency, and by weak democratic institutions. EPIC is concerned that we are losing control over decision-making systems because of this.

Mark believes an increase in transparency is helpful, but not in the way people typically understand it (opening up the source code, or making the decision-making algorithms public; the latter is typically impossible with deep learning algorithms). Instead, decision-making systems should become transparent in the following two ways.

  • Such systems should not only output the decision itself, but also an explanation of the decision (i.e. what data it was based on, and which reasoning was used to arrive at it).
  • The decision-making system should always identify itself to the person about whom it is deciding. (This is often not the case.)
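
To make these two requirements a bit more concrete, here is a minimal Python sketch of a decision that carries its own explanation and names the system that made it. Everything in it is my own illustration and not something shown at the panel: the `Decision` record, the `decide_loan` function, the `SYSTEM_ID` label and the income and debt thresholds are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """A decision bundled with what is needed to audit it (hypothetical record)."""
    outcome: str        # the decision itself
    decided_by: str     # identifies the automated system to the person concerned
    inputs_used: dict   # the data the decision was based on
    reasoning: list = field(default_factory=list)  # human-readable steps


# Assumed values, chosen only for illustration.
SYSTEM_ID = "loan-scorer v0 (automated, hypothetical)"
MIN_INCOME = 30_000
MAX_DEBT = 10_000


def decide_loan(applicant: dict) -> Decision:
    """Toy rule-based decision that records which data and rules it used."""
    income, debt = applicant["income"], applicant["debt"]
    income_ok = income >= MIN_INCOME
    debt_ok = debt <= MAX_DEBT
    reasoning = [
        f"income {income} >= {MIN_INCOME}: {income_ok}",
        f"debt {debt} <= {MAX_DEBT}: {debt_ok}",
    ]
    return Decision(
        outcome="approve" if income_ok and debt_ok else "reject",
        decided_by=SYSTEM_ID,
        # Only the fields that were actually consulted are reported.
        inputs_used={"income": income, "debt": debt},
        reasoning=reasoning,
    )


decision = decide_loan({"name": "Alice", "income": 25_000, "debt": 5_000})
print(decision.decided_by, "->", decision.outcome)
for step in decision.reasoning:
    print("  because", step)
```

A rule-based toy like this is of course the easy case; for a learned model, producing the `reasoning` part is exactly the hard problem the panel was talking about.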

These two steps will go a long way towards making decision systems more auditable. This is important because, really:

A system that is not auditable is a system you should not use.

This idea of “explainability” came up several times at CPDP. A recent paper by Sandra Wachter et al. shows that a right to explanation of automated decision-making does not exist in the General Data Protection Regulation (GDPR). She explained that the GDPR does not demand that a rationale for a decision be given (at least not in the legally binding text, although it is mentioned in one of the recitals), and that the GDPR imposes only a weak requirement to reveal how a system works. This right is often restricted to protect trade secrets.

Joris van Hoboken reminded us that the issue of algorithmic transparency is nothing new and has been discussed in different guises over the past decades. Ever since algorithms started ranking and indexing the web, they have effectively been mediating access to information (cf. the filtering debate), and they have been legally scrutinized, for example in the famous case against Google in Spain regarding the right to be forgotten. He also stressed that the discussion should be broadened beyond technological issues (they should neither be the focus of regulation nor the sole source of possible solutions or interventions). Joris wondered, for example, whether it would be possible to develop “evidence standards”, and noted that open source approaches have not found their way into cloud, (web) service and even mobile app contexts.

Krishna Gummadi talked about “algocracy” and the perceived risk of a black box society when using data-driven algorithms, which he defined as self-learning algorithms that adapt when fed with data on previous decisions in the problem space.

(Someone I spoke with wondered what would happen if such self-learning algorithms were fed data related to decisions they themselves had made earlier, thus creating a feedback loop. Both our intuitions flagged this as potentially problematic, but we would both be interested in research addressing this issue more thoroughly.)

Gummadi provided some technical arguments why such data-driven algorithms can actually be more transparent than humans, in two ways. (He stressed that humans often make very biased decisions, are often unable to reliably ‘explain’ those decisions, and are also hard to de-bias.)

  • Data mining algorithms can be used to detect changes in decision making processes, and be used (for example) to detect discrimination.
  • Well-designed algorithms can be fair; however, it is important to realise that there are many notions of fairness, and that a thorough mathematical formalisation of these notions has shown that some of them are incompatible: they cannot all be satisfied at the same time.
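
The first of these points can be illustrated with a very simple check on a log of past decisions: compare how often each group receives the favourable outcome. This is my own minimal Python sketch, not a method presented by Gummadi, and it is far cruder than real discrimination-detection techniques. The function names and the decision log are made up for the example.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Return the share of favourable outcomes per group.

    `decisions` is an iterable of (group, favourable) pairs.
    """
    totals = defaultdict(int)
    favourable = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        favourable[group] += int(ok)
    return {group: favourable[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest; 1.0 means equal rates.

    Assumes at least one group has a non-zero rate.
    """
    return min(rates.values()) / max(rates.values())


# Invented decision log: group A is approved 60 times out of 100,
# group B only 30 times out of 100.
log = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)

rates = selection_rates(log)
print(rates)
print("disparity ratio:", disparity_ratio(rates))
```

A low ratio does not prove discrimination on its own (the groups may differ in relevant ways), but it is the kind of signal that tells an auditor where to look.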

Gummadi gave the example of COMPAS, a tool used in US courts to predict the probability of recidivism. The tool was deemed fair by its manufacturer (Northpointe) according to one metric, but was found unfair in a later study by ProPublica according to another metric.
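
A small worked example shows how two reasonable metrics can disagree about the same predictions. The sketch below is mine, with invented numbers (it is not data from the tool or from either study), and the two metrics I picked, precision of the "high risk" label and the false positive rate, are my assumption about the kind of metrics involved in such disputes. When the two groups have different underlying reoffending rates, the label can be equally reliable in both groups while people who do not reoffend are still wrongly flagged far more often in one group than in the other.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion matrix for one group (counts of people)."""
    tp: int  # flagged high risk, did reoffend
    fp: int  # flagged high risk, did not reoffend
    fn: int  # not flagged, did reoffend
    tn: int  # not flagged, did not reoffend

    def base_rate(self) -> float:
        """Share of the group that actually reoffends."""
        return (self.tp + self.fn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self) -> float:
        """Of those flagged high risk, the share that reoffends."""
        return self.tp / (self.tp + self.fp)

    def false_positive_rate(self) -> float:
        """Of those who do not reoffend, the share wrongly flagged."""
        return self.fp / (self.fp + self.tn)


# Invented numbers: 100 people per group, different base rates.
group_a = Confusion(tp=40, fp=10, fn=10, tn=40)  # 50% reoffend
group_b = Confusion(tp=16, fp=4, fn=4, tn=76)    # 20% reoffend

for name, g in (("A", group_a), ("B", group_b)):
    print(
        f"group {name}: base rate {g.base_rate():.2f}, "
        f"precision {g.precision():.2f}, "
        f"false positive rate {g.false_positive_rate():.2f}"
    )
```

By the first metric the predictor treats both groups identically; by the second it is four times as harsh on group A. Neither calculation is wrong, which is precisely why the choice between them is not a purely technical one.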

I found this last point especially interesting: it reveals that, in the end, the decision about which notion of fairness to implement is highly political, especially if the decision-making system is applied in societally sensitive contexts. Society needs to be made more aware of this.