Privacy Is Hard and Seven Other Myths. Achieving Privacy through Careful Design.

October 12, 2021

Nobody at the turn of this century, except perhaps a few die-hard civil rights activists, expected privacy to become such a dominant news item a decade or so later. But after the Snowden revelations, the Cambridge Analytica scandal, and many other incidents and data breaches, tech companies have finally come under growing scrutiny. Hardly a day goes by without yet another news story covering how this or that company tramples our privacy in such and such ways. As a result, legal protection of privacy has started to improve. Unfortunately, this has so far not led to any significant changes in the way technology is designed and used. Apart from isolated efforts and fringe services offered by enthusiasts, the bulk of the services we use are still privacy invasive at their core. When the COVID-19 pandemic suddenly pushed everything online, we grabbed the first tools we could find. Alas, the privacy invasive ones were closest at hand. This needs to change.

(This is the main message of my book Privacy Is Hard and Seven Other Myths. Achieving Privacy through Careful Design, which was published by MIT Press on October 5, 2021. For all other posts related to my book, see here.)

We live in an increasingly digital world. We shop online. We share our lives digitally. We are tethered twenty-four hours a day through our smartphones, wearables, and tablets, connected to our family, friends, work, and everything that happens in the world. Governments apply new digital technologies to perform their tasks more efficiently, increase our safety and security, improve our well-being, and combat fraud. Businesses similarly embrace these technologies for new systems and services that are more efficient and more personalized, disrupting existing brick-and-mortar businesses in the process.

All these systems collect huge amounts of personal data and use that information to monitor or influence us, without many of us being fully aware of this. Common myths cloud our vision and lull us into indifference. It is time we wake up, challenge these myths, recognize poorly designed systems, and replace them with more privacy-friendly ones.

We are not collecting personal data

Consider the myth, upheld by many organisations and businesses, that “We are not collecting personal data”. There is a common misconception that personal data is only data that can be directly linked to a particular, named person. Such a limited definition of personal data would offer little privacy protection, however, because even data that is not immediately linked to a person can be combined with other data to construct a rich and intimate personal profile that at some point can be linked to some particular person, whether you know their name or not. Therefore, data is also (legally) considered personal if it can be linked to someone indirectly, for example by looking up and linking several pieces of information scattered over different databases. This rather broad definition makes sense because it is often easy to make a connection between such random-looking identifiers and the actual person using them. Phone companies keep records of their subscribers. IP addresses are often static and linked to their owner by the Internet Service Provider.
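
To see what such indirect linking looks like in practice, here is a minimal sketch in Python (a hypothetical illustration of my own; the data and field names are made up): two datasets that each seem harmless on their own, but that identify a person once joined on a shared, random-looking identifier.

```python
# Two separate, hypothetical datasets. Neither stores a name next to behaviour:
# one records where licence plates were seen, the other who owns which plate.
parking_records = [
    {"plate": "XJ-482-P", "location": "Main St", "time": "2021-10-12 09:14"},
    {"plate": "KD-071-Z", "location": "Station Rd", "time": "2021-10-12 09:47"},
]
vehicle_registry = {
    "XJ-482-P": {"owner": "A. Person", "address": "12 Example Lane"},
}

# Joining the two on the licence plate turns a 'random-looking identifier'
# into information about an identifiable person: personal data after all.
for record in parking_records:
    holder = vehicle_registry.get(record["plate"])
    if holder:
        print(f'{holder["owner"]} parked at {record["location"]} at {record["time"]}')
```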

Once you start paying attention, look under the hood, and try to understand how digital services actually work, you realise that many of them (electronic payment, travelling by public transport, using a mobile phone, even parking your car, reading an ebook or a newspaper online, to name but a few) collect personal data as well. As Yogi Berra said, “You can observe a lot by just watching.” But we do need to train our eyes and learn where to look in order to really see what is going on. This is important, because the collection of personal data creates significant privacy problems: information on parked cars is used to detect tax fraud, your financial transactions are used to determine your credit score and even to feed you targeted advertising, and your reading patterns are used to influence how novelists and journalists write their stories. And these are but a few examples that have an impact on millions of people worldwide.

Unfortunately, many organisations are unaware of this broad definition of what personal data is, and therefore believe they are not collecting personal data at all: after all, they are only collecting some uninteresting, random-looking identifiers. For many it is also a form of denial: many online services necessarily process IP addresses, for example, which means they have to adhere to European data protection law (the General Data Protection Regulation, GDPR). As this creates obligations they would rather not be bothered with, many stick their heads in the sand hoping the storm will blow over. This is, in fact, not entirely unreasonable: enforcement of data protection law is scarce, as data protection authorities are underfunded and understaffed, and investigating individual cases takes a lot of time. But a more mature and sustainable approach, of course, is to own the fact that you are processing personal data, and to learn how to do so responsibly.

You have zero privacy anyway - get over it

Or consider the myth that “you have zero privacy anyway - get over it”, as Scott McNealy, then CEO of Sun Microsystems, famously proclaimed in January 1999. There is a tendency, especially among technology start-ups and Silicon Valley veterans, to pretend that technological developments are unavoidable, like acts of God. This is nonsense. Technology does not develop in isolation; it does not have an independent, inherent purpose or destiny of its own. Instead, technology is made by people and is shaped according to their agendas and beliefs which, in the case of digital products and services, are the agendas and beliefs of those start-up founders and Silicon Valley veterans. These beliefs are embedded in how technology functions and determine what technology affords us to do, what it prevents us from doing, and what it does regardless of our own intents and wishes. Just as systems can be designed to collect personal data to serve an extractive business model, systems can be designed in a privacy-friendly fashion, with respect for our autonomy and human dignity, without a negative impact on their functionality or usability. If we want this to happen, though, we need to educate ourselves, get involved, and influence these agendas and beliefs. And dethrone the self-proclaimed gods of Silicon Valley.

The way systems are designed has a tremendous impact on our privacy - and we, as a society, actually have a choice here. Privacy is often ignored when designing systems. Sometimes this is out of ignorance. More often, it is on purpose because of the huge economic value of all that personal data. This approach is no longer sustainable. Stricter regulations and growing privacy concerns among the general public call for a different approach. But purely regulatory approaches to protect our privacy are not enough. Privacy-friendly design is essential.

Privacy isn’t hard if you try

A common myth is that “privacy is hard”. Indeed, designing totally ‘private’ systems is next to impossible, even under ideal circumstances. (The same is true for designing 100% secure systems, by the way.) But we should not let the perfect be the enemy of the good. A little bit of effort and consideration can prevent a lot of privacy harm. In fact, just as technology can be used to invade our privacy, it can also be used to protect our privacy, by applying privacy by design. Existing privacy-friendly technologies and privacy-by-design approaches can be used to create privacy-friendly alternatives to the systems we commonly use today.

A very simple example (due to Jason Cronk) helps to illustrate the point. Suppose you are the manager of a popular restaurant. People call to book a table in advance. When they call, you write down their name, the size of the group, and the date and time they would like to eat. You store the reservation list to check bookings and for later analysis. This allows you to determine when exactly the peak hours are, what the average waiting time is, and whether this depends on the season or the day of the week. You might also be interested to learn how these figures change over the years, and whether there is any correlation with, say, the staff you employed during those periods. If you keep a reservation list like this, you process personal information: you record the names of people visiting your restaurant, and you keep these records for later analysis.

But there is another way to keep a reservation list, one that does not process any personal data at all and yet allows you to perform the exact same analysis. The idea is trivial: keep a numbered reservation list. Instead of recording the name of someone making a reservation, you give them a reservation number (corresponding to the number on the reservation list you keep). When entering the restaurant, people mention this number instead of their name. You again store the reservation list for later analysis. But in this case you do not process any personal data at all. All you record is the size of the group, and the date and time the group booked a table at the restaurant. All this data is anonymous, and it allows you to answer the same questions that the name-based reservation list allowed you to answer - except, of course, questions pertaining to the visits of particular groups of people, which you really have no business asking to begin with.
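
To make this concrete, here is a minimal sketch in Python of such a numbered reservation list (my own illustration; the names and structure are assumptions, not code from the book). Note that no name or contact detail is ever stored, yet the peak-hour analysis works just as well.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from itertools import count

@dataclass
class Reservation:
    number: int      # handed to the guest instead of asking for a name
    group_size: int
    slot: datetime   # requested date and time

class ReservationList:
    """A numbered reservation list: it stores no personal data."""

    def __init__(self) -> None:
        self._numbers = count(1)
        self._reservations: list[Reservation] = []

    def book(self, group_size: int, slot: datetime) -> int:
        """Record a booking and return the reservation number for the guest."""
        number = next(self._numbers)
        self._reservations.append(Reservation(number, group_size, slot))
        return number

    def peak_hours(self) -> Counter:
        """Count bookings per hour of the day, computed on anonymous data."""
        return Counter(r.slot.hour for r in self._reservations)

# Usage: the guest only ever hands back the number they were given.
reservations = ReservationList()
number = reservations.book(group_size=4, slot=datetime(2021, 10, 15, 19, 30))
print(f"Your reservation number is {number}")
print(reservations.peak_hours())
```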

See? That wasn’t too hard now, was it?

Admittedly this example is rather trivial, but the reality is that almost all truly privacy-friendly designs are based on similar, almost trivial, insights that turn the design of the overall system on its head. The trick is to learn to think outside the box, and to be aware of the possibilities that have been developed over the years. And to recognize and challenge the other myths, like “I’ve got nothing to hide”, “it’s merely metadata”, “we always need to know who you are”, “your data is safe with us”, or “privacy and security are a zero-sum game”.

The way forward

Significantly improving the privacy of the apps and services we use should be our first priority. This may require some effort, and may squash some extractive business models, but it is not nearly as hard as many would have us believe. We need to be prepared for the next crisis (like COVID-19, the Snowden revelations, or even something trivial like WhatsApp changing its privacy policy) that prompts people to reconsider the digital tools they use, to ensure that we have some good, usable, privacy-friendly ones ready!

But this is only a first step, one that focuses on the short term. The next step is much more fundamental, and sorely needed to guarantee proper protection of privacy and other human rights and societal values in the long run. This step requires us to dig deeper into the technology stack, look beyond the products and services we use, and reconsider the designs of the underlying computers and networks, both at the hardware and the operating system level. These designs are half a century old by now and have never fundamentally changed, while the world in which they are used has changed beyond recognition. We are stretching the boundaries of their use beyond the breaking point - not only in terms of privacy, by the way, but also in terms of security and reliability. It’s time to start redoing the plumbing, instead of applying Band-Aids to temporarily stop some leakage while we frantically mop the floor against all odds.
