The challenges of privacy engineering

August 2, 2017

The Internet Privacy Engineering Network (IPEN/EDPS), the University of Leuven (KU Leuven), and the Future of Privacy Forum (FPF) will host a transatlantic workshop dedicated to Privacy Engineering Research and the GDPR on Friday, 10 November, 2017 at the University of Leuven in Belgium. In preparation, they asked a few people for a shortlist of the most pressing issues to discuss at the workshop. I started thinking and came up with a shortlist, which then grew longer as I started explaining what I meant. I'm sharing the result in the hope of receiving feedback and sharpening my thinking.

Please note that this is my personal and hence very subjective perspective on our field.

Privacy engineering research

Let's start with some issues related to research in the area of privacy engineering and the related field of privacy enhancing technologies (PETs).

First and foremost, I think most PETs focus on data minimisation or unlinkability, aiming to unconditionally protect the privacy of the user. The underlying threat model is based on not trusting anyone (government or businesses) with our personal data. This threat model is not always appropriate, and may be too strict to achieve reasonable solutions in practice. First of all, this model can often only be satisfied with very inefficient protocols that perform poorly in practice. Secondly, this particular threat model disregards the (legitimate) needs of other stakeholders.

Moreover, this focus on data minimisation totally ignores other requirements stipulated by the General Data Protection Regulation (GDPR), like subject access rights, transparency, accountability, etc. There are interesting research questions worth studying in these areas, for example how to resolve the apparent paradox of providing greater transparency and accountability (e.g. through logging) while preserving privacy at the same time. The same goes for securely providing data subjects access to review the personal data stored about them, especially when the data subject and the data processor do not have a direct relationship with each other. Many companies that I talk to are really worried that by providing such data subject access they are facilitating their next data breach.
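To make this paradox a bit more concrete, here is a minimal sketch (my own illustration, not an existing tool; all names are hypothetical) of one possible direction: a tamper-evident audit log whose entries are chained together with hashes for accountability, while data subject identifiers are replaced by keyed pseudonyms so the log itself reveals no identities.

```python
# Sketch only: a hash-chained audit log with pseudonymised subject identifiers.
import hashlib
import hmac
import json

class PseudonymisedAuditLog:
    def __init__(self, pseudonym_key: bytes):
        self.key = pseudonym_key          # held by the controller, not the auditor
        self.entries = []
        self.last_hash = b"\x00" * 32     # genesis value for the hash chain

    def _pseudonym(self, subject_id: str) -> str:
        # Keyed hash: a stable pseudonym, but not linkable to the identity
        # without the key (re-identification only for authorised access).
        return hmac.new(self.key, subject_id.encode(), hashlib.sha256).hexdigest()[:16]

    def append(self, subject_id: str, action: str) -> None:
        entry = {"subject": self._pseudonym(subject_id), "action": action}
        payload = json.dumps(entry, sort_keys=True).encode()
        # Chain each entry to its predecessor: altering or deleting an
        # earlier entry breaks every hash after it (tamper evidence).
        entry["hash"] = hashlib.sha256(self.last_hash + payload).hexdigest()
        self.last_hash = bytes.fromhex(entry["hash"])
        self.entries.append(entry)

    def verify(self) -> bool:
        prev = b"\x00" * 32
        for entry in self.entries:
            payload = json.dumps(
                {"subject": entry["subject"], "action": entry["action"]},
                sort_keys=True).encode()
            if hashlib.sha256(prev + payload).hexdigest() != entry["hash"]:
                return False
            prev = bytes.fromhex(entry["hash"])
        return True
```

An auditor can then verify the integrity of the whole log without learning whose data was processed, while the controller, who holds the pseudonym key, can still answer a subject access request for one specific person.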

Finally, current research (perhaps with the exception of differential privacy) ignores the explosive growth of big data and smart algorithms in practice. Of course we need to push back on the hosanna stories surrounding big data and warn about the risks of big data and smart algorithms. But beyond that there are also really interesting research questions that we need to explore. For example: how do we protect and sensibly process such large amounts of data? In particular, how do you make that kind of processing transparent? How do you design deep learning algorithms that are still capable of explaining their decisions, in terms understandable to the subjects of those decisions? I'm only scratching the surface here...
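To make the reference to differential privacy concrete, here is a minimal sketch of the standard Laplace mechanism for a counting query; the dataset and the query are hypothetical illustrations.

```python
# Sketch only: the Laplace mechanism from differential privacy.
import random

def dp_count(records, predicate, epsilon: float) -> float:
    """Differentially private count: true count plus Laplace(1/epsilon) noise.

    A counting query has sensitivity 1 (one person changes the count by at
    most 1), so noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two i.i.d. exponentials with rate lambda is
    # Laplace-distributed with scale 1/lambda.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# Hypothetical example: count patients over 65, with epsilon = 0.1.
patients = [{"age": a} for a in (23, 67, 71, 45, 80, 34, 69)]
print(dp_count(patients, lambda p: p["age"] > 65, epsilon=0.1))
```

Smaller values of epsilon give noisier answers and stronger privacy; the interesting research questions start when many such queries have to share a single privacy budget.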

Dissemination

We need to do better at disseminating our results, and that holds both for academic research results and for best practices and experiences from industry. We have said so before (IPEN was supposed to deliver a cookbook, and yes, I was supposed to be the editor... lack of time and resources is partly to blame), but a clear, concise, and easily accessible portfolio of the current state of the art in privacy engineering and best (or good-enough) practices is still lacking.

There are loads of books about the privacy problem, but very few, if any, that discuss solutions. In particular, a good book to use for a bachelor or master course in privacy engineering within a computer science or information science curriculum is missing. Perhaps some of us could write one. I'd be happy to volunteer to coordinate.

Privacy engineering in practice

For many people in practice, it is unclear what privacy by design means, and what they need to do in order to meet the requirements of the GDPR. There are a few methodologies or approaches out there that were developed in an academic context (the PRIPARE methodology, the ULD privacy protection goals, and my own privacy design strategies), but it is unclear to me to what extent they are applied in practice. In particular, it would be good to investigate if and how they can be integrated into existing software development approaches (especially agile software development).

In my experience, the best privacy friendly solutions to a particular problem are special purpose. They have been specifically designed for that purpose, and often involve new research insights or the innovative application of existing academic research. Examples are privacy friendly identity management (based on attribute-based credentials, sketched below), privacy friendly road pricing, etc. In other words, these systems really need to be 'invented'. It is not clear how to create an atmosphere or a collaborative environment among scientists, engineers, business people and government officials to realise such 'inventions' in a more or less systematic fashion.
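As a rough illustration of the kind of idea involved, here is a toy sketch of selective disclosure, the core mechanism behind attribute-based credentials. This is my own simplification: real schemes such as Idemix or IRMA use blind signatures and zero-knowledge proofs, whereas this sketch uses plain hash commitments and a symmetric key that stands in for the issuer's signature.

```python
# Toy sketch only: selective disclosure via hash commitments.
import hashlib
import hmac
import os

ISSUER_KEY = os.urandom(32)  # stand-in for the issuer's key; a real scheme
                             # would use a public-key signature instead.

def commit(value: str, nonce: bytes) -> str:
    return hashlib.sha256(nonce + value.encode()).hexdigest()

def issue(attrs: dict) -> dict:
    """Issuer: commit to every attribute and sign the commitment list."""
    nonces = {k: os.urandom(16) for k in attrs}
    comms = {k: commit(v, nonces[k]) for k, v in attrs.items()}
    msg = "".join(comms[k] for k in sorted(comms)).encode()
    return {"comms": comms, "nonces": nonces,
            "sig": hmac.new(ISSUER_KEY, msg, hashlib.sha256).hexdigest()}

def present(cred: dict, attrs: dict, disclose: set) -> dict:
    """Holder: reveal only the selected attributes (value plus nonce)."""
    return {"comms": cred["comms"], "sig": cred["sig"],
            "revealed": {k: (attrs[k], cred["nonces"][k]) for k in disclose}}

def verify(proof: dict) -> bool:
    """Verifier: check the signature, then check each opened commitment."""
    msg = "".join(proof["comms"][k] for k in sorted(proof["comms"])).encode()
    if not hmac.compare_digest(
            proof["sig"], hmac.new(ISSUER_KEY, msg, hashlib.sha256).hexdigest()):
        return False
    return all(commit(v, n) == proof["comms"][k]
               for k, (v, n) in proof["revealed"].items())

attrs = {"name": "Alice", "birth_year": "1990", "country": "NL"}
cred = issue(attrs)
proof = present(cred, attrs, {"birth_year"})
print(verify(proof))  # True
```

The verifier learns the holder's birth year and the fact that the issuer vouches for it, but nothing about the other attributes.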

In practice, responsibilities for the different aspects of privacy by design are not clearly assigned. So don't blame the engineers if something goes wrong, or if it is unclear what needs to be done. Think about how to engage the business developers, the privacy office and the engineers, how to assign responsibilities, and how to let them communicate and cooperate effectively.

The functionality of truly privacy friendly products cannot easily be changed. Of course this is on purpose/by design (for example, to avoid function creep). But from a business perspective, investing in the development of such rigid, inflexible, and less future-proof systems does not make sense. For businesses this is a huge issue. How do we resolve this?

Finally, I think usability is still something we as engineers struggle with. We tend to design systems as if we ourselves are the users. As a result, the systems have many bells and whistles and can be configured and tuned at will, but at the same time they have become completely unusable for the average user, who wants something that works out of the box, is privacy friendly out of the box, and does not require any fiddling to make it more secure. We should really avoid the PGP/GPG nightmare and look more to the likes of Apple, whose iMessage app is end-to-end encrypted, yet nobody really knows or cares... because it 'just works'.

Other issues

Privacy is often perceived as a stumbling block, a hurdle that needs to be overcome to successfully complete a project. Sometimes it is seen as totally contrary to the project's main objective. For example, security and privacy are often seen as a zero-sum game: you either have privacy or you have security, but not both. Similar arguments exist about privacy versus accountability, among others. Often these seemingly contradictory requirements can in fact all be met at the same time, but this requires some more critical thinking, sometimes more research, and definitely a broader understanding of the possibilities existing technologies provide.

And let's not forget the elephant in the room: business models. Often, privacy invasive systems are designed on purpose, to increase profits, increase efficiency, or because the whole business model depends on it. In those cases privacy by design is not relevant. Surveillance is still surveillance even if it applies privacy by design, and perhaps we should push back on the use of privacy by design in those contexts.

Our community

Finally, I think that there are also some issues within our community that deserve some attention. And by 'our community' I do not only mean the IPEN mailing list (on which discussions can in certain cases become overheated), but the privacy enhancing technologies research community in general, as well as the 'hacker' community, which is typically very committed to privacy.

For some of us, privacy is an absolute right. Some of us do not trust governments or business at all. Some of us are very good at pointing out privacy problems with existing systems, but are less keen on committing to a solution that by necessity strikes a balance between the needs of all stakeholders involved. Sometimes the 'perfect' is the enemy of the 'good (enough)'. This is counterproductive.

To be clear: I do not wish to deny anybody their own point of view. In fact I think the extreme points of view are very valuable, both because they give us a clear vision of what we want to achieve in the end, and because they create space for the more nuanced approaches to be heard, appreciated, and taken up in practice. But sometimes the heated debates and animosity within our community drain too much energy, and also make our community a less welcoming place for new people who may want to contribute to making the world more privacy friendly. Moreover, it may lead to scattered approaches and even competition between projects that share a common goal but fundamentally differ in the way they think that goal should be achieved. Even if, from the perspective of a relative outsider, the differences are not so large and fundamental at all...

In case you spot any errors on this page, please notify me!
Gilles Ampt, 2017-08-02 17:00:44

I would suggest the topic of identifiers, identifying data, pseudonymisation and anonymisation. These are mentioned in the GDPR. What I see lacking is a common opinion among experts on the strength of pseudonymisation and anonymisation.

Thomas, 2017-08-17 15:39:09

I too am on the mailing list for IPEN. I have posted an announcement on my listing site, but still lack a link to the event website. Please let me know if you find one.