Privately (and unlinkably) exchanging messages using a public bulletin board

June 22, 2015

In this blog post I present a secure and privacy friendly asynchronous point-to-point message exchange protocol using a public bulletin board that makes individual send or receive events unlinkable to one another. While the clients must securely run in the user's endpoint device, the bulletin board can be hosted on an arbitrary public cloud at no additional risk. In fact the protocol provides the same unlinkability guarantees as the underlying mixing network thus also protecting the social graph. The protocol is efficient, and the central bulletin board can adaptively be scaled and distributed depending on the load.

This blog post describes the details of the design of the Burnrchat app that was the result of a design sprint I previously discussed on this blog.

Motivation

Messaging apps that allow people to asynchronously exchange messages are hugely popular. However, people have become increasingly aware of the extent of the dragnet surveillance performed by intelligence agencies and show a decreasing trust in the providers of such messaging services service themselves. This has created a market for more secure and privacy friendly implementations of such messaging services. Ideally these services should both hide the content of the messages exchanged as well as the sender and recipient (i.e the metadata) associated with the communication. Moreover, both content and metadata should be hidden from both an external (active) adversary as well as the service provider itself. In the extreme case, you want to totally hide the social graph, and in fact be able to anonymously establish a secure contact that allows you to securely and anonymously communicate with that contact at a later stage.

Substantial improvements to protect the content of messages exchanged has been made. And in fact several so called secure messaging apps are readily available. However, none of these apps protect the metadata. This is problematic because this metadata allows adversaries (or the service providers themselves) to reconstruct the so-called 'social graph', which is in itself quite sensitive.

One way to limit metadata collection by the service provider is to avoid any centralised server and instead implement a peer-to-peer protocol. Yet a truly peer-to-peer solution is often impractical, because devices may not always be on line (mobile phones), have non-fixed IP addresses (again mobile phones, especially when roaming or when connected using a cellular network) or may live behind a firewall. Also, many people may be reluctant or incapable to install and run a local server that could serve as a proxy to implement such a P2P protocol.

In short, for large scale adoption of a messaging app a more centralised approach that does not require any user-side setup is preferred. Yet any centralised solution runs, almost by definition, the risk of exposing the social graph. The system outlined below resolves this apparent paradox that.

Goal

We want to implement an asynchronous point to point messaging system, that protects both the content of the messages, as well as the metadata. An external adversary, and even the provider of the messaging service itself, is not allowed to learn who is communicating with whom. The so called social graph (or friend graph) is protected.

An asynchronous point to point messaging system allow users to send messages to other users even if they (i.e. the recipients) are off line. Asynchronous messaging is different from traditional instant messaging or chat systems that are inherently synchronous and assume that both users are on line at the same time.

The point-to-point message exchange protocol should satisfy the following requirements. Suppose Alice successfully sends a message m to Bob. Then

correctness: Bob eventually receives m (unaltered). Moreover, Bob receives its messages in the same order as Alice sends them.
confidentiality: Nobody else (except Alice and Bob) learns m.
integrity: If Bob receives a message m', m' was sent to Bob (by one of his contacts).
authenticity: Alice is guaranteed that only Bob receives m, and Bob is guaranteed that Alice was the sender of m (if he receives m).
unlinkability of events: The adversary cannot link any two send and/or receive events.
unlinkability of relationships: No two users can determine whether they are sharing a contact.
forward security: If a user device is compromised, the adversary is only able to obtain the contents of (or tamper with) future messages to or from this user.
availability: A sender will eventually be able to send a message, and a message sent will eventually be delivered.

Primitives used

The protocol uses the following primitives.

We use an authenticated encryption scheme that given a key K shared between Alice and Bob, allow Alice to encrypt a message m using c := encrypt(K,m). Bob can decrypt this message using m := open(K',c). An authenticated encryption scheme guarantees that if K <> K', or if c was tampered with in any way, then m = nil. In other words, Bob is guaranteed that Alice was the sender, and can detect errors or malicious changes to the message. (Unless Bob uses the same key to send messages to Alice. Note that in the protocol outlined below Bob does not do this.)

We assume that the encryption does not leak information about the message or the key used to encrypt the message. In particular, an adversary should not be able to detect that two messages are encrypted using the same key. That would break the unlinkability properties of our protocol.

We use a key derivation function KDF such that given K' := KDF(K), one cannot go back and recompute K. This is used to make the protocols forward secure.

We also use a preimage resistant hash function B such that given a value b it is prohibitively hard to compute a t such that b = B(t).

System

burnrchat The system consists of users devices (e.g. smart phones, that run the protocols to send or receive messages), a public bulletin board implementing a set of functions to access and modify its contents, and a mixing network that allows users to unlinkably send commands to the bulletin board and obtain their responses. Users establish contact by bumping their smart phones. This is all sketched in the figure on the right.

The bulletin board

The bulletin board is a collection of N cells, that each contain a set of (value, tag) pairs. Initially all sets are empty. The bulletin board implements the following two functions.

write(i,v,t):: Add the value, tag pair (v,t) to the set of pairs stored at cell i on the bulletin board.
get(i,b):: If the set at cell i contains a pair (v,t) for some value v and t with t = B(b), then return v and remove (v,t) from the set. Otherwise return nil and leave the cell unchanged.

In other words, to read (and delete) a value from a cell, you need to know the preimage of the tag used to store that value in the cell.

The mixing network

Users send their commands to the bulletin board using an anonymising mixing network. The connection from a user device to the bulletin board (over the mixing network) is assumed to be reliable and authentic: a command send to the bulletin board will eventually be executed, and will not have been tampered with by the adversary. The mixing network is not tied to this particular application. An arbitrary mixing network can be used, which means the traffic is mixed with traffic from other applications. This increases the level of privacy offered.

Threat model

The bulletin board is public. The server implementing is assumed to be honest but curious: it will try to learn whatever it can from the contents of the bulletin board and the access attempts made. But it will not modify its contents or deviate from its protocol.

We do nor specify the power of the adversary explicitly in terms of what it can and cannot do, as is typically done. Instead we assume that the adversary (whatever its particular capabilities) has a certain maximal probability of linking messages entering or leaving the mixing network (given a particular choice of mixing network). In what follows we show that the protocol does not give the adversary a better chance to link communicating users.

The idea

Suppose Alice and Bob share a symmetric key K_AB, an index idx_AB and a tag tag_AB, used to send messages from Alice to Bob. Different values are used for messages from Bob to Alice, and for any other pairs of contacts. When Alice and Bob establish contact, they generate random initial values for K_AB, idx_AB, tag_AB and exchange these by bumping their devices, for example, or by scanning a QR code displayed on the other person's phone.

Alice uses idx_AB as the index of the cell where the next value for Bob will be stored, with tag tag_AB. Bob uses his copy of idx_AB as the index of the cell to look for any new message from Alice. He needs his copy of tag_AB to be able to access this message.

When Alice wants to send a message m to Bob, she first selects a new random index idx' and tag tag' for the next message she may want to send to Bob. These are piggybacked to the current message for Bob. All three items are encrypted using an authenticated encryption scheme, using the current key K_AB, and assigned to v: v := encrypt ( K_AB, m || idx' || tag' ). This value is written to the bulletin board at index idx_AB using tag B(tag_AB). After that, Alice updates her key using the key derivation function by K_AB := KDF ( K_AB ). This makes the scheme forward secure. If Alice is compromised, all keys she used to encrypt earlier messages have been erased. The plaintext messages can therefore never be recovered by the adversary, even if he would continously make backup copies of the bulletin board.

When Bob wants to receive a message, he uses the get function of the bulletin board to check whether a message is waiting for him at index idx_AB. He needs the tag tag_AB for this (the bulletin checks whether this is a preimage of the tag stored at that location). If there is such a message, Bob uses the piggybacked values to update his own copies of idx_AB and tag_AB. Bob also updates his key using the key derivation function.

The protocols

To send a message to Bob, Alice executes randomly select idx' from {0,...,N-1} randomly select tag' from T v := encrypt ( K_AB, m || idx' || tag' ) write ( idx_AB, v, B(tag_AB) ) idx_AB := idx' tag_AB := tag' K_AB := KDF ( K_AB ) To receive a message from Alice, Bob executes v := get ( idx_AB, tag_AB ) IF v <> nil AND ( m || idx' || tag') := open ( K_AB, v ) is successful THEN idx_AB := idx'; tag_AB := tag' ; K_AB := KDF ( K_AB ) ; RETURN m ELSE RETURN nil

Engineering issues

To deploy this protocol in practice, several engineering issues need to be addressed. The system is highly scalable, but availability and synchronisation between senders and receivers need to be improved.

Scalability

The system is scalable because it can easily adapt to an increase or decrease in server communication and processing load, as well as changes in storage requirements.

If the load on the server gets too high, the bulletin board can be partitioned into chunks that are each served by a different server. The partitioning is communicated to the users, which allows all read and write operations to be directed at the appropriate server depending on the index of the cell to be accessed.

To change the size of the bulletin board a new bulletin board with the appropriate size has to be created, and all send function calls will be instructed to direct their locks and writes to the new bulletin board. At the same point in time, all receives are instructed to look for any remaining (old) messages on the old board once, and to start looking for messages on the new board from that point onward. Once the old bulletin board only contains empty cells, it can be discarded. This can be monitored by letting the bulletin board keep track of the difference between successful write and get calls, as this in fact equals the number of messages the bulletin board currently contains.

This difference can also be used to determine whether the bulletin board size should be increased or decreased. A bulletin board with very few occupied cells wastes storage space and should be decreased. A bulletin board with many occupied cells will make reads become slower (as the expected size of the set of messages stored in a cell increases).

Availability

The protocol as it stands assumes honest users and is highly susceptible to denial-of-service attacks. Although deletion of entries from cells is prevented by requiring the knowledge of the preimage of the corresponding tag value, anybody can write to a cell. This allows the adversary to completely fill the bulletin board essentially blocking new writes.

Unfortunately any fundamental solution to this problem does not appear to exist, in essence because an adversary can always spawn an overwhelming number of users that individually behave within the limits deemed acceptable, while collectively bringing the system down to its knees.

Keeping senders and receivers synchronised

Keys, indices and tags can get corrupted (if a client app crashes, or needs to be reinstalled, for example). A sender will not notice this as the protocol blindly uses whatever values it has to store the next message in some cell with some tag encrypted against some key. The receiver will notice if the tag band index are right but the key is corrupted (because the decryption of the message will unexpectedly fail).

To ensure that a client app always detects such corruptions, one can add redundancy to the local state of each user client. One possibility is to store a hash of the state along with the state itself, and verify this hash at the start of each send or receive. The obvious solution to recover from such data compromise is to allow clients to encrypt their state (keys, index and tag for each of their communicating parties) against some long term static key (with an offline copy) and store this somewhere. However this idea does increase the risk of linkability: every time a user performs an action his state needs to be updated, and these events can be correlated. Using the mixing network to also obfuscate this storage of the state partially resolves this, but still gives the adversary additional information for the potential identity of the sender or the receiver of a particular message. This increases his chances.

Analysis

To see that the scheme is correct we observe the following. When Alice sends messages, she essentially creates a linked list of cells containing future messages for Bob to receive. Bob receives them in the correct order by traversing this list.

Both the send and the receive protocol accesses the bulletin board exactly once. The send protocol clearly takes constant time to complete. The time complexity of the receive protocol (or rather the get operation implemented by the bulletin board) equals the expected number of entries in a cell. This depends on the size of the bulletin board N and the number of messages t it currently contains. As the index of a cell to use for a message is selected at random, the expected size of a cell equals t / N. (We note that in almost all cases, any cell either contains no entries, or just a single one.)

To see the scheme satisfies the security requirements , we observe the following. All messages are encrypted using an authenticated encryption scheme, with different keys for each connected pair of users. This assures confidentiality, authenticity and integrity of the messages exchanged. Unlinkability follows from the random selection of the next cell to be used to exchange a message. Consider any two different events that are not a send and the corresponding receive, or two different tries to receive the same message. These use totally unrelated cells because of the random selection process. The encryption used in the exchange of the next index to use ensures that the adversary cannot predict the next index in advance.

Closing

The presented protocol allows people to exchange message asynchronously, in private, without revealing even who they are communicating with. In essence one could say the protocol transforms a synchronous communication protocol (a mix network) into an asynchronous one with the same privacy guarantees.

Update 18-08-2015: the full scientific paper

J.-H. Hoepman. Privately (and Unlinkably) Exchanging Messages Using a Public Bulletin Board. In ACM Workshop on Privacy in the Electronic Society (WPES 2015), Denver, CO, USA, October 12 2015.

is now available.

In case you spot any errors on this page, please notify me!

Or, leave a comment.