The European Commission published a common toolbox for contact tracing and some guidance on such apps in relation to data protection. Here is a critical analysis.

The guidance

It is clear that the Commission aims to take privacy seriously, if only because it published a separate guidance document devoted solely to the data protection issues raised by contact tracing apps. But none of the other fundamental rights that are also at stake here (freedom of movement, non-discrimination, freedom to conduct a business, and freedom of assembly and of association) are seriously addressed. They are mentioned in passing only once in the guidance. The guidance does impose the condition that “the installation of the app on [a user] device should be voluntary and without any negative consequences for the individual who decides not to download/use the app”, but nowhere does it make clear what steps should be taken to ensure this. Compared to the many concrete recommendations the Commission offers in both the guidance and the toolbox document to protect privacy, it is clear the Commission has not given this any thought whatsoever. This is a serious omission.

Regarding data protection, the guidance is sound. It makes clear that the app should be voluntary, should not be bundled with other apps, and should not collect time and data of collection; that its use should be temporary; and that individuals should be able to exercise their rights under the GDPR (although I cannot see how that would work when I want to request that your phone deletes the data it collects about me). Also, deactivation of the app should not depend on de-installation by the user (although I wonder how that would be possible given that it broadcasts ephemeral identifiers over Bluetooth autonomously).

There are shortcomings too. The fact that contact tracing relies on fast and extensive testing is only mentioned in the context of necessity and proportionality of processing personal data, even though this is also crucial to protect the other fundamental rights. Accuracy of the information is important, but the discussion in the guidance is limited to epidemiological distance and duration, while experience in Singapore has shown that relevance is highly contextual: a few minutes close together in an elevator may be highly relevant epidemiologically; talking on the street for the same amount of time and at the same distance is not. And this even ignores the fact that the actual distance between persons in particular is hard to measure reliably using Bluetooth. These two parameters may therefore be poor proxies for actual infection risk.

The toolbox

I have even more issues with the toolbox, even though it contains many of the recommendations mentioned in the guidance (which actually appeared one day after the document describing the toolbox).

My main objection is that the toolbox favours a centralised approach, where the (health) authorities receive information about who has been in contact with an infected person. Even though it describes both centralised and decentralised architectures for contact tracing, and concedes that a decentralised architecture is favourable in terms of privacy, the following quotes spread throughout the document clearly presuppose a centralised architecture:

The aim of contact tracing and warning is for public health authorities to rapidly identify as many contacts as possible with a confirmed case of COVID-19 (p6)

and

The functionality in such apps, if rolled out on a large scale so that they reach well over 50% of the population, could be useful for Member States to rapidly detect contacts of cases, collect information on these contacts and to inform contacts on the need for follow-up and testing if required. (p7)

and

Public health authorities will continue using currently available software (e.g. national contact tracing systems and Go.Data from WHO/GOARN) to manage the contact tracing and contact management process. (p7)

and

the health authorities may contact directly the user that was in close contact with a COVID-19 patient. (p13)

and

A common approach for the use of anonymised and aggregated mobility data will be developed (p24)

It appears the toolbox is in favour of the Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT) approach, claiming it aims for an open protocol and an architecture that ensures that personal data stays entirely on an individual’s phone. The protocol is still not public, and contacts claim that personal data is not limited to the individual’s phone and that the architecture is in fact centralised.

The executive summary states that the system should be privacy-preserving by ensuring personal data is securely encrypted. This is absolutely insufficient: if all phones upload the data they collect about users in their proximity to the central server, securely encrypted against the key of the health authorities (as many of the centralised proposals apparently do), these authorities get to see the full social graph of who has been in contact with whom.

The approach to also rein in the ‘digitally excluded’ is extremely worrisome:

Complementary, location-based solutions could be used to increase the coverage of digital excluded people (e.g. elderly, children, health and care workers). Standalone devices or wearables which do not need a smartphone to operate could be considered for these groups. […] It is also possible to include in such groups Domotics and home based ICT solutions to broaden the number of people reached by the solutions. (p20)

Interoperability

The toolbox document rightly stresses that interoperability at the European level is essential. This should ensure that when two citizens from different countries, who each have their own national app installed, meet and later one of them turns out to be infected, the other is somehow informed. This is described in detail on pages 16 and 17 of the toolbox document.

But if this is an important requirement, then the question is whether it is really possible for one country to deploy a centralised solution while the other country deploys a decentralised solution for their contact tracing app. In other words: is there a risk that either all member states go for a centralised solution, or all go for a decentralised one? As the current rumour is that Germany goes for the (centralised) PEPP-PT approach, this essentially would imply all European countries have to go for this centralised approach as well.

At first I thought such interoperability was impossible, but a little analysis shows that under certain assumptions it is possible (although it may in practice be unworkable and require a lot of coordination between system developers).

Regardless of whether the design is centralised or decentralised, the idea of both approaches is that phones send out ephemeral identifiers (i.e. random looking numbers that change regularly) over the Bluetooth network. Other phones within 1 to 2 meters pick up these identifiers and store them locally. These phones of course also broadcast their own ephemeral identifiers in return. Tight standardisation is required to make this work across systems.
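To make "random looking numbers that change regularly" a bit more concrete, here is a minimal sketch of one way a phone could derive such identifiers. It assumes each phone holds a secret key and derives one identifier per time epoch with HMAC-SHA256, truncated to 16 bytes. The function name `ephemeral_id`, the key length and the identifier length are my own illustrative choices, not something the toolbox or any particular protocol prescribes.

```python
import hashlib
import hmac
import secrets


def ephemeral_id(secret_key: bytes, epoch: int) -> bytes:
    """Derive a 16-byte, random-looking identifier for one time epoch.

    Assumption: identifiers are HMAC-SHA256(secret_key, epoch), truncated.
    Without the key, identifiers from different epochs cannot be linked.
    """
    message = epoch.to_bytes(8, "big")
    return hmac.new(secret_key, message, hashlib.sha256).digest()[:16]


# Hypothetical phone: one secret key, a fresh identifier every epoch.
key = secrets.token_bytes(32)
ids = [ephemeral_id(key, epoch) for epoch in range(3)]
for eid in ids:
    print(eid.hex())
```

Deriving the identifiers from a key instead of drawing them independently at random is what would later allow a phone to report them "much more efficiently": it could disclose the key rather than every single identifier.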

Roughly speaking, a decentralised app works as follows. When its user is infected, a phone sends all the ephemeral identifiers it itself broadcast earlier to the central server. (In reality it does so much more efficiently, but that’s not relevant for this discussion.) This server therefore only learns the identity/identities of the infected phones. Other phones regularly query the server for newly uploaded ephemeral identifiers, and the app compares these new values with the ones it recently received over the Bluetooth channel. If there is a match, the app knows it was in close contact with an infected person. Notice how the server does not learn this fact!
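The decentralised flow can be sketched in a few lines. This is a toy model under simplifying assumptions: identifiers are plain random bytes, there is no Bluetooth, and the class and method names (`Server`, `Phone`, `report_infection`, `check_exposure`) are hypothetical. The point is only to show where the matching happens.

```python
import secrets


class Server:
    """Decentralised server: stores only identifiers of infected phones."""

    def __init__(self):
        self.infected_ids = set()

    def report_infection(self, broadcast_ids):
        self.infected_ids.update(broadcast_ids)

    def fetch_infected_ids(self):
        return set(self.infected_ids)


class Phone:
    def __init__(self):
        self.sent = []     # identifiers this phone broadcast itself
        self.seen = set()  # identifiers received from nearby phones

    def broadcast(self):
        eid = secrets.token_bytes(16)
        self.sent.append(eid)
        return eid

    def receive(self, eid):
        self.seen.add(eid)

    def check_exposure(self, server):
        # The comparison runs on the phone; the server never sees self.seen.
        return bool(self.seen & server.fetch_infected_ids())


def meet(a, b):
    """Two phones in range exchange their current identifiers."""
    b.receive(a.broadcast())
    a.receive(b.broadcast())


alice, bob, carol = Phone(), Phone(), Phone()
server = Server()
meet(alice, bob)
server.report_infection(alice.sent)  # Alice tests positive
print(bob.check_exposure(server))    # True
print(carol.check_exposure(server))  # False
```

In this sketch the server ends up holding Alice's identifiers and nothing else. It never receives what Bob or Carol have seen.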

A centralised app, on the other hand, works as follows: both phones regularly upload the ephemeral identifiers they have seen, as well as the ephemeral identifiers they themselves broadcast earlier, to the central server. This allows the server to immediately determine who has been in close contact with whom (infected or not), which clearly allows the authorities to immediately notify all people that have recently been in close contact with a person that turns out to be infected. (There are centralised versions where only infected phones upload the ephemeral identifiers of all other phones they were in close contact with, see e.g. my earlier blog post on this.)
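To see why this hands the server the full contact graph, consider the following toy model of the centralised variant described above. It assumes the server knows which user each upload comes from. The names (`CentralisedServer`, `upload`, `contact_graph`, `contacts_of`) are hypothetical, and real proposals differ in many details.

```python
class CentralisedServer:
    """Toy centralised server: every phone uploads what it sent and saw."""

    def __init__(self):
        self.owner_of = {}  # ephemeral identifier -> user who broadcast it
        self.seen_by = {}   # user -> identifiers that user's phone received

    def upload(self, user, sent_ids, seen_ids):
        for eid in sent_ids:
            self.owner_of[eid] = user
        self.seen_by.setdefault(user, set()).update(seen_ids)

    def contact_graph(self):
        """Who has been close to whom, for all users, infected or not."""
        graph = {}
        for user, seen in self.seen_by.items():
            peers = {self.owner_of[eid] for eid in seen if eid in self.owner_of}
            graph[user] = peers - {user}
        return graph

    def contacts_of(self, user):
        """Everyone the authorities could notify if `user` tests positive."""
        graph = self.contact_graph()
        heard = graph.get(user, set())
        heard_by = {other for other, peers in graph.items() if user in peers}
        return heard | heard_by


server = CentralisedServer()
server.upload("alice", sent_ids={b"a1"}, seen_ids={b"b1"})
server.upload("bob", sent_ids={b"b1"}, seen_ids={b"a1"})
server.upload("carol", sent_ids={b"c1"}, seen_ids=set())
graph = server.contact_graph()
print(server.contacts_of("alice"))  # {'bob'}
```

Note that `contact_graph` works without anyone being infected. Encrypting the uploads in transit does nothing to prevent this, because the server is the intended recipient.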

To make this interoperable, ephemeral identifiers could be tagged with a country code. (This raises privacy issues as it signals your country of residence.) Let D be a country using a decentralised app, and C be a country using a centralised app.

For a user from D to learn that it was possibly infected by a user from C, it must detect that a phone from C was at some point close. It can do so using the country tag broadcast along with the ephemeral identifier. The phone of a user from D then knows it should also collect information of infected phones from the central server of C. (BTW: this mode of cross border communication is not sketched in the figure in the toolbox document on page 17; but the alternative where all ephemeral identifiers collected by all phones from C must be sent to the central server of country D is very inefficient. Moreover, this is also not envisioned by the figure on page 17, as it only considers cross-border exchange of users that tested positive.)

For a user from C to learn that it was possibly infected by a user from D, all that needs to happen is that the central server of D reports the ephemeral identifiers of infected phones it collects to the central server of C as well.
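Both directions can be sketched as follows. This is a rough illustration under my own assumptions: identifiers are represented as `(country, identifier)` pairs, the server of C can list the identifiers of its infected users, and the function names are hypothetical. Nothing here is taken from the toolbox document.

```python
def servers_to_query(home_country, seen_tagged):
    """Phone from D: poll its own server plus that of every country tag seen."""
    return {home_country} | {country for country, _ in seen_tagged}


def exposed_decentralised(seen_tagged, infected_by_country):
    """Phone from D: match locally against each country's infected list."""
    return any(
        eid in infected_by_country.get(country, set())
        for country, eid in seen_tagged
    )


def exposed_centralised_users(seen_by_user, forwarded_infected):
    """Server of C: match identifiers forwarded by D against users' uploads."""
    return {
        user
        for user, seen in seen_by_user.items()
        if seen & forwarded_infected
    }


# A phone from D met one phone from D and one from C.
seen_d = {("D", b"d7"), ("C", b"c3")}
print(servers_to_query("D", seen_d) == {"C", "D"})              # True
print(exposed_decentralised(seen_d, {"C": {b"c3"}, "D": set()}))  # True

# The server of D forwards infected identifiers to the server of C.
uploads_c = {"carla": {("D", b"d1")}, "chris": {("C", b"c9")}}
print(exposed_centralised_users(uploads_c, {("D", b"d1")}))     # {'carla'}
```

The asymmetry is visible in the code: the user from D does the matching on their own phone, while for the user from C the matching happens on the server of C, which therefore also learns about that user's cross-border contacts.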

We see that interoperability is possible in theory, but at a price. Whether it will work in practice remains to be seen.