Tracking across multiple websites using only first party cookies

October 4, 2013

Using only first party cookies (and the HTTP redirection mechanism), website visitors can be tracked and profiled across multiple sites. My student, Koen van Ingen, stumbled upon this while studying the cookie wall in the Netherlands. You can read more about this in his bachelor thesis (in Dutch). Here is how it works.

The HTTP redirect mechanism allows a website to send visitors who request a particular web page on to another page (possibly hosted by a different server). The contents of that second page are then displayed instead. So when browsing www.firstparty.com/page.html you may be redirected to (and shown the contents of) www.thirdparty.com/display.html. If the site you are redirected to sets cookies, your browser treats them as first party cookies (for the domain www.thirdparty.com). Therefore, they are not blocked or deleted when you instruct your browser to kill third party cookies! This allows websites to track users across several domains in the following way.
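Before turning to the tracking scheme itself, here is a minimal sketch of why the redirect matters. It assumes a deliberately simplified blocker that calls a cookie "third party" whenever the host setting it differs from the host of the top-level page. The function name is_third_party and the pixel.gif URL are made up for this illustration, and real browsers compare registrable domains rather than full host names.

```python
from urllib.parse import urlsplit


def is_third_party(top_level_url, response_url):
    """Simplified check: a cookie set by response_url counts as third party
    when its host differs from the host of the page in the address bar.
    (Assumption: real browsers compare registrable domains, not full hosts.)"""
    return urlsplit(top_level_url).hostname != urlsplit(response_url).hostname


# An image embedded in a first party page: its cookies are third party.
embedded = is_third_party(
    "http://www.firstparty.com/page.html",
    "http://www.thirdparty.com/pixel.gif",
)

# After a top-level redirect the address bar shows the redirect target,
# so cookies set by that response are first party.
redirected = is_third_party(
    "http://www.thirdparty.com/display.html",
    "http://www.thirdparty.com/display.html",
)

print(embedded, redirected)
```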

Let tracker.com be a profiling company that wants to allow sites like www.a.com and www.b.com to track their visitors across all sites affiliated with tracker.com. Each of the affiliated sites uses its own (local) first party cookie uid-a (and uid-b etc.) to track visitors on its own website. tracker.com ensures that these cookies in fact contain the same unique identifier for a particular visitor across all affiliated domains. To this end the tracker sets its own 'global' cookie uid-tracker.

A visitor to www.a.com for whom the cookie uid-a is not set is redirected to the tracker using the following URL:

http://tracker.com/track.html?return-url=www.a.com/source-page

If the tracker cookie uid-tracker is already set (with value user-1234, say), it is sent to the tracker with this redirect. If not, the tracker generates a new value (say user-1234) for uid-tracker.

The page at http://tracker.com/track.html has no real content of its own. Instead it redirects back to the page that was the source of the first redirect, using the following URL:

http://www.a.com/source-page?cookie=user-1234

With this redirect, the tracker also sets the cookie uid-tracker for its own domain to user-1234 (if it was not set yet). When returning to the website of a.com through this URL, the web server reads the proposed value user-1234 for the cookie uid-a from the parameter cookie embedded in the URL, and sets this cookie with this value when returning the contents of the web page visited. After these two redirects, the uid-tracker and uid-a cookies have the same value. By embedding the cookie value in the redirect, the tracker is able to pass this value to other domains, thus violating the constraint that cookie values should only be visible within their own domain. The exchange of messages is also depicted in this figure.
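The two redirects can be sketched as a small simulation without any real network traffic. This is only an illustration under stated assumptions: the helper names make_site, tracker and browse, the second page www.b.com/other-page, and the way identifiers are generated are all made up here. Only the URLs, the return-url and cookie parameters, and the cookie names uid-a, uid-b and uid-tracker come from the description above.

```python
import uuid
from urllib.parse import urlencode, urlsplit, parse_qs

TRACKER_URL = "http://tracker.com/track.html"


def make_site(cookie_name):
    """Build a handler for an affiliated site that keeps its own local cookie."""
    def handler(url, cookies):
        parts = urlsplit(url)
        query = parse_qs(parts.query)
        if cookie_name in cookies:
            return 200, None, {}  # already identified, serve the page
        if "cookie" in query:
            # Coming back from the tracker: adopt the proposed value
            # as a local first party cookie.
            return 200, None, {cookie_name: query["cookie"][0]}
        # No local cookie yet: bounce the visitor via the tracker.
        params = urlencode({"return-url": parts.netloc + parts.path})
        return 302, TRACKER_URL + "?" + params, {}
    return handler


def tracker(url, cookies):
    """Reuse the global identifier if present, else mint one, then redirect back."""
    query = parse_qs(urlsplit(url).query)
    uid = cookies.get("uid-tracker") or "user-" + uuid.uuid4().hex[:8]
    target = "http://" + query["return-url"][0] + "?" + urlencode({"cookie": uid})
    return 302, target, {"uid-tracker": uid}


HANDLERS = {
    "www.a.com": make_site("uid-a"),
    "www.b.com": make_site("uid-b"),
    "tracker.com": tracker,
}


def browse(url, jar):
    """Follow redirects like a browser, with a separate cookie jar per host."""
    for _ in range(10):
        host = urlsplit(url).netloc
        cookies = jar.setdefault(host, {})
        status, location, set_cookies = HANDLERS[host](url, cookies)
        cookies.update(set_cookies)
        if status != 302:
            return url
        url = location
    raise RuntimeError("too many redirects")


jar = {}
browse("http://www.a.com/source-page", jar)
browse("http://www.b.com/other-page", jar)
print(jar)
```

Every cookie in this simulation is stored under the host that set it while that host was the top-level page, so each one is a first party cookie. After both visits the three jars nevertheless hold the same identifier.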

Usage statistics can now be collected by tracker.com across all affiliated domains using the same mechanisms that traditionally used the third party cookie set by tracker.com directly. This time, however, the value of the local cookie has to be embedded explicitly in a URL referencing tracker.com. For example, site a.com could embed a web bug as http://tracker.com/bug.gif?cookie=value-of-cookie-a. The tracker uses the Referer header of the request for this one-pixel image to determine the exact page on www.a.com that is currently visited by the user.
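A rough sketch of both halves of the web bug, again without real network traffic: the affiliated site builds the image URL from its local cookie value, and the tracker files each request under that value using the Referer header. The function names web_bug_url and log_hit and the in-memory profile dictionary are assumptions made for this illustration, not something taken from an actual tracker.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def web_bug_url(local_uid):
    """URL an affiliated site embeds as a one-pixel image in its pages."""
    return "http://tracker.com/bug.gif?" + urlencode({"cookie": local_uid})


def log_hit(request_url, headers, profile):
    """Tracker side: record which page was viewed under the visitor's identifier."""
    uid = parse_qs(urlsplit(request_url).query)["cookie"][0]
    page = headers.get("Referer", "unknown")
    profile.setdefault(uid, []).append(page)


profile = {}
bug = web_bug_url("user-1234")

# The same identifier arrives from pages on two different affiliated sites.
log_hit(bug, {"Referer": "http://www.a.com/source-page"}, profile)
log_hit(bug, {"Referer": "http://www.b.com/other-page"}, profile)
print(profile)
```

Because uid-a and uid-b carry the same value, page views on both sites end up in a single profile even though the tracker never reads its own cookie in these requests.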

To ensure consistency of the local cookie with the global cookie, a site should redirect a user to the tracker every once in a while as if no local cookie is set yet.

This trick is used in practice. Google uses a similar method, for example, to ensure that if you are signed in to Gmail, you are automatically signed in to YouTube as well. The details are in Koen's thesis.

This shows that, using only first party cookies and the HTTP redirection mechanism, users can still be tracked by a 'third' party across all websites affiliated with this party. Or, as Roesner et al. phrase it:

One tracker with client-side state can enable tracking by partners without client-side state

In other words: there is no point in trying to regulate the use of third party cookies. The same effect, namely persistent tracking of website visitors across multiple domains, is also possible using only first party cookies. No third party cookies, or JavaScript, are required. Instead of trying to outlaw specific methods (means) used to profile citizens, the regulator should focus its activities on restricting the profiling itself (aims). Law should be technology neutral.

P.S.: And yes, I am aware of using IP addresses or browser fingerprinting as a means to recognise a returning visitor; but these methods are less precise than using cookies.

P.P.S.: (Added 2019-11-25: This technique also showed up in an investigation the Dutch DPA performed in 2013)
