First I will discuss some existing (mostly abandoned) approaches, and some earlier analysis of these approaches. After that I will synthesise the findings and propose a minimum set of privacy characteristics I think should be represented by an appropriate icon set. I will also discuss some important conditions that make it more likely that such a set of icons becomes successful in practice.
Overview of existing approaches
- the ‘Iconset for Data-Privacy Declarations’ by Martin Mehldau,
- the ‘Policy Coding Methodology’ of the KnowPrivacy project,
- the Mozilla Privacy Icons,
- the Disconnect.me icons,
- PrimeLife’s icons, and
- a proposal by TNO.
I will briefly summarise and discuss each of these approaches below.
Martin Mehldau’s Iconset for Data-Privacy Declarations
One early approach by Martin Mehldau from 2007, the Iconset for Data-Privacy Declarations, distinguished the following characteristics that each deserve a separate icon, grouped into four different categories:
- what data is processed: real name, username, address, IP address/files/time, email address, comments/conversations, mails/messages, contacts/friends, favourites/interests, edits, and cookies.
- how is it processed (and by whom): deleted, saved, anonymised, encrypted, published, passed on, for friends/contacts, for friends of friends, and whether you have a choice.
- for what purpose: statistics, advertisement, or shopping.
- for how long: for a session (end of usage/logout), until the end of a contract, for hours, for days, for weeks, for months, for years, for the time being (i.e. forever).
It is interesting to note that Mehldau considers IP addresses to be in the same characteristic as files and time information, and that he considers things like conversations and favourites as separate characteristics that deserve a separate icon. Location is missing as a characteristic.
Mehldau makes no distinction between how the data is handled, and with/by whom: both types of icons are in the same category.
I particularly like the distinction Mehldau makes between different kinds of data retention periods:
- for the current session,
- for the current contract,
- for some fixed period of time, or
- for the time being (i.e. forever).
Apart from that, the list of characteristics is quite large and rather unstructured.
The KnowPrivacy Policy Coding Methodology
The KnowPrivacy project published a set of icons in 2009, as part of their Policy Coding Methodology. They distinguished the following three categories of privacy characteristics.
- types of data collected: contact information (name, address, email address, phone number), computer data (IP address, browser type, OS information), interactive data (browsing behaviour, search history), financial data (account status, credit information, purchase history), or content (personal communications, stored documents or media).
- general data collection practices: ad customisation (user data used to customise advertising), third party tracking (third parties are allowed to track user behaviour), public display (information contributed by the user may be displayed publicly), user control (users can access and correct personal information), data retention (including an indication of the retention period).
- data sharing practices: shared with affiliates (bound by the same privacy practices), contractors (bound by the same privacy practices), or third parties (not bound by the same privacy practices).
I like the consolidation into only five different types of data (compared to Mehldau’s more unwieldy and less structured set of data types). Unfortunately, location is again missing. Moreover, legally important data types like medical data or special data (information about race, religion or sex) are missing as well.
I also like the fact that the classification underlines the significance of whether historical data (search queries, financial records) is collected.
The Mozilla Privacy Icons
The Mozilla Privacy Icons propose a much simpler set of icons, distinguishing only the following characteristics:
- retention period,
- third party use,
- ad networks, and
- law enforcement access.
Given the other approaches discussed so far, this set of characteristics is disappointingly minimal. It is interesting, however, because it recognises the importance of law enforcement access, which does not appear in any of the other icon sets proposed. Mozilla proposes to distinguish two cases: a statutory process (where organisations require the government to comply, at a minimum, with the legal process provided by the law before getting users’ data) and a transparent process (where organisations always follow a publicly documented and consistent process).
Such an icon could also be accompanied by a link to an (annual) transparency report, or warrant canary.
The Disconnect.me Icons
The Disconnect.me Icons evolved from a Mozilla-led working group. It is unclear, unfortunately, whether this is the same group that designed the icons mentioned in the section above. They distinguish the following characteristics:
- expected use (does the service use data in ways other than you would reasonably expect given the site’s service?)
- expected collection (does the service allow other companies like ad providers and analytics firms to track users on the site?)
- precise location (does the service track a user’s actual location?)
- data retention (how long does the service retain personal data?)
- do not track (does it honour user do-not-track preferences?)
- children’s privacy (has this website received TRUSTe’s Children’s Privacy Certification?)
They also distinguish, quite oddly and rather specifically, whether the service offers SSL support and whether it is affected by the Heartbleed bug. Again the proposed set of icons is minimalistic.
What I do like in the Disconnect.me approach is the quite innovative idea to use a reasonable expectation of privacy of the average user of the service as a point of departure. Of course the question then becomes what exactly is reasonable and who determines that. This may be very hard to make objective and easy to understand in practice.
The PrimeLife Approach
The EU-funded PrimeLife research project investigated the use of icons to represent privacy policies in some depth around 2010. They took the icon sets designed by Mary Rundle (whose icons seem to have disappeared from the web) and Martin Mehldau as their point of departure. PrimeLife distinguishes the following two important categories.
- data types: personal data, sensitive data, medical data, payment data.
- processing steps (including references to common purposes): legal obligations, shipping, user tracking, profiling, storage (including an indication of the length of the data retention period), deletion, pseudonymisation, anonymisation, data disclosure to third parties, data collection from third parties.
PrimeLife aimed to design a limited set of icons, restricted to data types and purposes users often have to deal with in the online world. Explicitly included are icons representing positive steps taken by data controllers, like the use of encryption or anonymisation techniques.
According to PrimeLife it makes sense to design a general set of icons applicable in all application domains, as well as additional icons for specific application domains, e.g. social networks. For this they introduced icons for the following additional category:
- groups of recipients: friends, friends of friends, selected individuals, the public.
Interestingly enough, PrimeLife did not consider making similar distinctions between recipients of data in the general case (although it did suggest specifying this with an optional text string alongside the ‘data disclosure to third parties’ or ‘data collection from third parties’ icons).
The TNO Approach
As a last contribution I will briefly discuss a recent but not very well known approach from Johanneke Siljee of TNO. She proposed a totally different approach in 2015, doing away with separate icon categories, and instead proposing to signal the following characteristics using an icon:
- can the service be used anonymously or not; does the site collect anonymous usage statistics?
- does the service implement user choice through opt-in or opt-out, or not at all?
- access rights: can you see which personal data the service collects, and can you have it corrected?
- does the service collect (the legally important) sensitive or special data?
- does the service perform or allow profiling and data mining?
- does the service disclose or sell personal data to third parties?
- does the service disclose personal data to other countries, or countries outside the EU?
- how long does the service retain your data?
What I like here is the recognition of both user choice and access rights. Also, the important question whether data is shared with other countries, especially those outside the EU, is covered in this icon set.
Earlier analysis of icon based approaches
Some of the suggested approaches to summarise privacy policies using icons have been analysed by other scholars before. I will briefly summarise both the analysis of Van den Berg and Van der Hof as well as the analysis of Edwards and Abel here.
The analysis of Van den Berg and Van der Hof
According to the survey of Van den Berg and Van der Hof, the kinds of information users are most interested in are:
- which of their personal data are collected,
- how these data are used,
- whether or not their data are passed on to third parties,
- whether their data is handled securely, and
- whether they can object to the use of their data.
Based on their findings, Van den Berg and Van der Hof proceed to develop a ‘Privacy Wheel’, which looks more like a privacy labelling approach than an icon based approach, and which is loosely based on the OECD Fair Information Processing Practices. For us, the above list of five types of information most relevant to users is the most significant result when considering which privacy characteristics to display using icons.
The analysis of Edwards and Abel
Edwards and Abel have written a very nice report discussing the icon based approaches mentioned above. The following list summarises their main findings.
- Icons allow for quick comprehension regardless of social and cultural backgrounds of users.
- Critical mass is essential, yet hard to achieve; government mandates or co-sponsorship may be beneficial.
- Icons need to be simple (and hence sacrifice legal detail).
- A standardised graphical approach across multiple jurisdictions is best (it creates more trust, less confusion, and creates the best opportunities to create critical mass).
- Layered privacy policies are less prominent now than several years ago.
My own analysis
Practical applications and real-world experiments using privacy icons are sorely needed to support this claim and to better understand what does and does not work in practice. This research needs to be done by scientists with the appropriate background and skill sets to perform these kinds of user studies, and should be based on icon sets designed by graphic design professionals.
All of the icon sets analysed have issues expressing the purpose of the processing. They all focus on expressing only very specific, sometimes only ‘harmful’, purposes (like shopping, advertising or profiling). This is not surprising given the fact that personal data may be processed for a great variety of reasons, ranging from optimising or personalising the service, through big data analytics or collecting information to complete an order and ship the items bought, all the way to profiling and tracking users. Some attempts to capture some of these aspects in icons resulted in very complex designs that were poorly understood (e.g. the PrimeLife icons). I conclude that the purpose of the processing cannot be expressed graphically and should instead be explained using a short sentence in everyday speech.
What definitely needs to be expressed is what type of personal data is collected. This is what almost all analysed schemes do, in varying degrees of detail. The proposal outlined below strikes a common sense balance between detail and simplicity, making sure that legally significant classes of personal data (like health data and so-called ‘special’ data) are clearly distinguished.
Intuitively, it makes sense to define a characteristic that specifies who processes the data (where processing includes having access to the data). However, it matters a great deal whether data is processed locally (on a user device but by an externally provided app) or remotely, even if in both cases the same (external) party is responsible for the data processing (i.e. is the data controller). So instead we define a slightly broader characteristic that specifies where the data is processed.
Based on the results of the survey of Van den Berg and Van der Hof, I believe it is also important to express how the data is processed: is this done in a secure fashion, and in accordance with certain legal requirements (as suggested by Edwards and Abel)? This category should also contain information about the retention period (where I draw on the ideas of Mehldau). I also think it makes sense to include information about how governmental data access requests are being dealt with (as originally proposed by Mozilla), although this is the least important characteristic in my opinion. If in the end the number of characteristics proves to be unwieldy, this one could be omitted.
A proposal for a set of essential privacy characteristics
Given the overview of existing proposals (and their shortcomings), and based on the analysis above, I propose the following set of privacy characteristics to be shown using icons as a minimum:
- what personal data is processed,
- where it is processed (including by whom), and
- how (i.e. with which safeguards) it is processed.
As explained above, the purpose of the processing cannot be adequately expressed by an icon. As a result there is no why category for icons. This should be expressed by a short statement in everyday language instead.
Note that the scope of the icons is broadened by referring to data processing (as defined in the Data Protection Directive (DPD) as well as the upcoming General Data Protection Regulation (GDPR)) instead of only referring to data collection (as most of the previous proposals have done, although they probably intended to include all forms of personal data processing).
As said before, this is only a proposal for a set of essential privacy characteristics. It does not include actual icons to graphically represent these. I would love to include those however. So: if you are a professional graphics designer and would like to contribute, please send me your designs!
With that out of the way, let me describe the privacy characteristics in a bit more detail.
What: which type of personal data is processed?
Show an icon for each of the following types of personal data if it is processed.
- Contact data: name, address, email address, phone number.
- Financial data: account status, credit information, purchase history.
- Medical data: DNA, medicine/drug usage, patient dossier, biometric data.
- Special data: religious beliefs, criminal records, gender, race.
- Behavioural data: browsing and search history, energy consumption patterns, location data. In other words: what you did and where you did it. (This corresponds to metadata, or the observed data class from the WEF classification.)
- Content: personal communications, stored documents or media.
- Tracking data: cookies, IP address, browser type, OS information.
Where: at which location/device is the data processed, and under whose responsibility?
Indicate where the personal data is processed:
- locally, at the user device
- centrally, at the data controller
- shared with third parties
How: what are the safeguards related to the data processing?
Show icons for each of the following safeguards related to the data processing taking place.
- Retention period: Indicate the period for which the personal data is retained:
  - the current session,
  - the current contract,
  - some fixed period of time, or
  - indefinitely.
- Security: Indicate whether personal data is processed securely.
- Consent: Indicate whether personal data is only processed after explicit consent of the data subject.
- Governmental access: Indicate whether access to personal data by law enforcement, tax agencies, intelligence services and the like is restricted using a statutory process and/or a transparent process.
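To make the three categories concrete, here is a minimal Python sketch of how a service's characteristics could be recorded and turned into a list of icons to show. The class, field and icon names are illustrative assumptions and not part of the proposal. The `INDEFINITE` retention option is likewise an assumption, mirroring Mehldau's "for the time being". The purpose is kept as a plain sentence because it gets no icon.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class DataType(Enum):
    """What: the types of personal data listed above."""
    CONTACT = "contact"
    FINANCIAL = "financial"
    MEDICAL = "medical"
    SPECIAL = "special"
    BEHAVIOURAL = "behavioural"
    CONTENT = "content"
    TRACKING = "tracking"


class Location(Enum):
    """Where: the place the data is processed."""
    LOCAL = "local"                   # at the user device
    CENTRAL = "central"               # at the data controller
    THIRD_PARTIES = "third_parties"   # shared with third parties


class Retention(Enum):
    """How: the retention period."""
    SESSION = "session"
    CONTRACT = "contract"
    FIXED_PERIOD = "fixed_period"
    INDEFINITE = "indefinite"  # assumption, after Mehldau's "for the time being"


@dataclass
class PrivacyProfile:
    what: Set[DataType] = field(default_factory=set)
    where: Set[Location] = field(default_factory=set)
    retention: Optional[Retention] = None
    secure: bool = False
    explicit_consent: bool = False
    statutory_process: bool = False    # governmental access
    transparent_process: bool = False  # governmental access
    purpose: str = ""                  # short sentence in everyday language, no icon

    def icons(self) -> List[str]:
        """Return the icon identifiers to display, in what/where/how order."""
        shown = [f"what:{d.value}" for d in DataType if d in self.what]
        shown += [f"where:{loc.value}" for loc in Location if loc in self.where]
        if self.retention is not None:
            shown.append(f"how:retention:{self.retention.value}")
        if self.secure:
            shown.append("how:security")
        if self.explicit_consent:
            shown.append("how:consent")
        if self.statutory_process:
            shown.append("how:government:statutory")
        if self.transparent_process:
            shown.append("how:government:transparent")
        return shown


# Hypothetical example: a web shop that stores order details centrally.
webshop = PrivacyProfile(
    what={DataType.CONTACT, DataType.FINANCIAL},
    where={Location.CENTRAL},
    retention=Retention.CONTRACT,
    secure=True,
    explicit_consent=True,
    purpose="We use your details to process and ship your order.",
)
print(webshop.icons())
print(webshop.purpose)
```

Iterating over the enums rather than over the sets keeps the icon order stable, so the same service always shows its icons in the same sequence.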
An alternative approach that I haven’t explored yet is related to the idea behind the Disconnect.me approach to express deviations from the expected collection and expected use of personal data. As discussed above, this particular idea is hard to make objective and (thus) understandable for the average user. However, the idea could be transformed into a benchmark approach. Different services could be scored against a benchmark and given icons to represent whether they perform better or worse, compared to the average. In this case, no icons means that the privacy protection is average. Green icons could be used to indicate a service performs better than average on a certain characteristic. Red could be used when a service performs worse than average on a characteristic.
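The benchmark idea can be sketched in a few lines of Python. The scoring scheme is an assumption made purely for illustration: each service gets a numeric score per characteristic, where a higher score means more privacy friendly, and the benchmark is simply the mean over all services considered. The function name and the `tolerance` parameter are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def benchmark_icons(
    service: Dict[str, float],
    all_services: List[Dict[str, float]],
    tolerance: float = 0.0,
) -> Dict[str, Optional[str]]:
    """Colour each characteristic of `service` relative to the average.

    Returns "green" (better than average), "red" (worse than average)
    or None (average, so no icon is shown).
    Assumes `service` itself is included in `all_services`.
    """
    result: Dict[str, Optional[str]] = {}
    for characteristic, score in service.items():
        # Average over the services that report this characteristic.
        average = mean(
            s[characteristic] for s in all_services if characteristic in s
        )
        if score > average + tolerance:
            result[characteristic] = "green"
        elif score < average - tolerance:
            result[characteristic] = "red"
        else:
            result[characteristic] = None
    return result


# Hypothetical scores for three services (higher = more privacy friendly).
services = [
    {"retention": 1, "security": 3},
    {"retention": 3, "security": 3},
    {"retention": 5, "security": 3},
]
print(benchmark_icons(services[0], services))
print(benchmark_icons(services[2], services))
```

A non-zero `tolerance` would widen the band that counts as average, which avoids flagging services that differ from the benchmark only marginally.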
Another approach is to only specify privacy characteristics that pose a risk. Then the best (i.e. most privacy friendly) service is one that has no icons (because it induces no risks). In this approach care has to be taken to convert privacy protective measures (e.g. the use of anonymisation) into their opposite, privacy risk inducing, qualities.
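The conversion of protective measures into risks could look roughly like the following Python sketch. The mapping and the icon names are hypothetical and only meant to illustrate the inversion: an icon is shown for every protective measure that is absent, so a service with all measures in place shows no icons at all.

```python
from typing import Dict, List, Set

# Hypothetical mapping from a protective measure to the risk icon
# that is shown when that measure is NOT in place.
PROTECTIVE_TO_RISK: Dict[str, str] = {
    "anonymised": "identifiable",
    "encrypted": "unencrypted",
    "explicit_consent": "no_explicit_consent",
}


def risk_icons(measures_in_place: Set[str]) -> List[str]:
    """Return one risk icon for each protective measure that is missing."""
    return [
        risk
        for measure, risk in PROTECTIVE_TO_RISK.items()
        if measure not in measures_in_place
    ]


print(risk_icons({"anonymised", "encrypted"}))
print(risk_icons({"anonymised", "encrypted", "explicit_consent"}))
```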
A great analysis of current approaches to using icons to represent privacy policies is provided by:
- Lilian Edwards, Wiebke Abel, “The Use of Privacy Icons and Standard Contract Terms for Generating Consumer Trust and Confidence in Digital Services”, CREATe Working Paper 2014/15 (October 2014).
For PrimeLife icons, see
- Leif-Erik Holtz, Katharina Nocun, Marit Hansen, “Towards displaying privacy information with icons”, Privacy and Identity Management for Life, 6th IFIP Summer School, Helsingborg, Sweden, August 2–6, Springer, 2010, pp. 338-348.
- “Policy Icons and Tests”, Chapter 3, PrimeLife Deliverable D4.3.2.
For the TNO approach, see
- Johanneke Siljee, “Privacy Transparency Patterns”, Chapter 9, The Privacy & Identity Lab, 4 years later, ISBN: 978-90-82483 5-0-5, November 2015.
- Johanneke Siljee, “Privacy Transparency Patterns”, EuroPLoP ’15, July 08 – 12, 2015, Kaufbeuren, Germany.
See also the following references for additional material
- Privacy Nutrition Labels developed by CMU for a slightly different approach.
- Privicons, that offer a simple set of icons to express privacy preferences for emails that you send.
- B. van den Berg and S. van der Hof, “What Happens to my Data? A Novel Approach to Informing Users of Data Processing Practices” (2012) 7:2 First Monday.