Browser fingerprints – the invisible cookies you can’t delete
Dear reader, it seems that you are causing headaches in dark corners of the web.
I pinpoint you specifically, as a reader of Naked Security, because I assume that if you’re a regular on this site then you’re more likely than most to care about who’s watching you online.
For the people trying to track you, profile you and sell to you, you’re a problem.
Historically, techniques for tracking people’s movements around the web have relied on HTTP cookies – small messages that ‘tag’ your browser so it can be uniquely identified.
Unfortunately for snoopers, profilers and marketers, cookie-based tracking leaves the final decision about whether you’re followed or not in your hands because you can delete their cookies and disappear.
It’s no secret that some vendors have moved on from cookies – local storage, Flash cookies and ETags have all been used in the wild, either as cookie replacements or as backups from which cookies can be ‘respawned’.
These techniques have been successful because they’re obscure but they all have the same fundamental weakness as cookies – they rely on things that you can delete.
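To show what ‘respawning’ means in practice, here is a minimal Python sketch. It is a simulation under stated assumptions: plain dictionaries stand in for the cookie jar and the backup stores (local storage, a Flash cookie), and the `load_or_respawn_id` helper and the `uid` key are hypothetical names of my own, not any vendor's code.

```python
import uuid


def load_or_respawn_id(cookies: dict, backup_stores: list, key: str = "uid") -> str:
    """Return the tracking ID, restoring it from any store that still has it.

    Hypothetical helper: dictionaries stand in for real browser storage.
    """
    stores = [cookies] + list(backup_stores)
    # Look for a surviving copy of the ID in any of the stores.
    surviving = next((store[key] for store in stores if key in store), None)
    # Only mint a new ID if every copy has been deleted.
    tracking_id = surviving if surviving is not None else uuid.uuid4().hex
    # Write the ID back everywhere, which 'respawns' a deleted cookie.
    for store in stores:
        store[key] = tracking_id
    return tracking_id


cookies, local_storage, flash_cookie = {}, {}, {}
first_visit = load_or_respawn_id(cookies, [local_storage, flash_cookie])
cookies.clear()  # the user deletes their HTTP cookies
second_visit = load_or_respawn_id(cookies, [local_storage, flash_cookie])
print(first_visit == second_visit)  # True: the ID came back from a backup
```

The weakness is visible in the sketch too: clear every store and the next visit gets a brand new ID.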
The holy grail for tracking is to find a unique ID that you can’t delete, something that identifies you uniquely based on who or what you are, not what you have.
FINGERPRINTING BROWSERS
In July I wrote about Panopticlick, a fingerprinting tool that does exactly that. It was created by the Electronic Frontier Foundation (EFF) for its research paper How Unique Is Your Web Browser?
Panopticlick asks your browser a few questions, such as what fonts you have installed, what HTTP headers your browser sends, your screen size and your timezone.
That collection of information varies so much from one browser to the next that it’s enough to tell any two browsers apart with startling accuracy.
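As a rough illustration, here is a minimal Python sketch of how a handful of answers like those could be collapsed into a single identifier. The attribute names, the sample values and the `fingerprint_id` helper are all assumptions made for the example, not Panopticlick's actual code, and a real tracker would gather the values with JavaScript in the browser rather than receive them in a dictionary.

```python
import hashlib
import json


def fingerprint_id(attributes: dict) -> str:
    """Collapse a set of browser attributes into one fixed-length ID.

    Hypothetical helper: the keys used below are examples only.
    """
    # Serialise with sorted keys so the same attributes always give the
    # same string, whatever order they were collected in.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


# Two made-up browsers that differ only in one installed font.
browser_a = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1440x900x24",
    "timezone_offset": -60,
    "fonts": ["Arial", "Courier New", "Helvetica"],
}
browser_b = dict(browser_a, fonts=["Arial", "Courier New", "Comic Sans MS"])

print(fingerprint_id(browser_a))
print(fingerprint_id(browser_b))  # a different ID from browser_a
```

A single differing font is enough to produce a completely different ID, which is why long, varied lists such as installed fonts are so useful to a fingerprinter.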
The EFF used Panopticlick to show that in the population of internet users it tested (a group likely to be more privacy-conscious than average) users had a 1 in 286,777 chance of sharing their fingerprint with somebody else.
That’s certainly good enough to use as a fall-back ‘respawning’ technique but perhaps not quite good enough to work as a cookie replacement.
Since Panopticlick was only designed to show that fingerprinting was viable it didn’t exhaust all the possible browser features that might be exploited for truly bomb-proof fingerprinting.
That such unexplored features exist was alluded to by the authors in their conclusion (my emphasis):
We implemented and tested one particular browser fingerprinting method. It appeared, in general, to be very effective, though as noted in Section 3.1 there are many measurements that could be added to strengthen it.
FINGERPRINTING BEYOND THE BROWSER
As chance would have it, at the same time as I was writing about Panopticlick, a well-known internet company with a foothold on 13 million websites was caught experimenting with one of those ‘missing’ techniques: canvas fingerprinting.
AddThis is the internet’s premier purveyor of social media sharing widgets.
Its code is embedded in millions of websites, which gives it a huge platform on which to run its anonymous personalization and audience technology.
Between February and July 2014 that technology included a live test for a canvas fingerprinting technique.
The <canvas> element is a feature of HTML5, the language used to build web pages. It’s a ‘drawing surface’ on to which small computer programs, written in JavaScript and embedded in the same page, can paint pictures, animations and other visual elements (our Asteroids game is a fine example – just search our site for Asteroids).
Often the most sensible and efficient way for web browsers to handle canvas graphics is to hand over font rendering and 2D compositing to the underlying operating system and hardware GPU.
Different graphics cards and operating systems work slightly differently, which means that different browsers given identical instructions on what to draw will draw slightly different pictures.
To illustrate the point I’ve included two pictures of the letter T below with their SHA1 hashes. One was rendered by Firefox 33 on OS X and the other by Safari 8 on the same machine.
55b2257ad0f20ecbf927fb66a15c61981f7ed8fc
17bc79f8111e345f572a4f87d6cd780b445625d3
In 2012, researchers Keaton Mowery and Hovav Shacham published a research paper entitled Pixel Perfect: Fingerprinting Canvas in HTML5 which showed that there was enough variation to create a reliable browser fingerprint.
In their own words:
...the behavior of <canvas> text and WebGL scene rendering on modern browsers forms a new system fingerprint. The new fingerprint is consistent, high-entropy, orthogonal to other fingerprints, transparent to the user, and readily obtainable.
Remarkably, they didn’t have to try very hard to tease out the differences between graphics cards…
Our experiments show that graphics cards leave a detectable fingerprint while rendering even the simplest scenes.
…nor the way that even common fonts are rendered.
Even Arial, a font which is 30 years old, renders in new and interesting ways depending on the underlying operating system and browser. In the 300 samples collected for the text_arial test, there are 50 distinct renderings.
Since the technique relies on rendering pictures you might think that there would be something you could see that gives the game away, right? Not so.
Our tests can be performed, offscreen, in a fraction of a second. There is no indication, visual or otherwise, that the user's system is being fingerprinted.
Finally, the messy business of comparing pictures is neatly accomplished by converting the picture rendered on the canvas into a string of base64 data (using the toDataURL() method) and running it through a hashing function to create a short, fixed length ID.
This makes dealing with canvas fingerprints almost as easy as dealing with cookies.
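The hashing step can be sketched in a few lines of Python. This is an illustration under assumptions rather than anybody's production code: in a browser the string would come from toDataURL(), whereas here a hypothetical `fake_data_url` helper wraps placeholder bytes (not real PNG data) so the example runs on its own. The choice of SHA-1 and the name `canvas_id` are mine.

```python
import base64
import hashlib


def fake_data_url(pixels: bytes) -> str:
    """Stand-in for toDataURL(): wrap raw bytes as a base64 data URL."""
    encoded = base64.b64encode(pixels).decode("ascii")
    return "data:image/png;base64," + encoded


def canvas_id(data_url: str) -> str:
    """Hash a canvas data URL into a short, fixed-length hex ID (SHA-1)."""
    return hashlib.sha1(data_url.encode("ascii")).hexdigest()


# Placeholder 'renderings' that differ in a single byte, standing in for
# the tiny pixel differences between two machines.
rendering_one = fake_data_url(bytes([0, 0, 0, 255] * 16))
rendering_two = fake_data_url(bytes([0, 0, 0, 255] * 15 + [0, 0, 1, 255]))

print(canvas_id(rendering_one))
print(canvas_id(rendering_two))  # a different 40-character ID
```

However large the picture, the ID is always 40 hex characters, so it can be stored and compared exactly as a cookie value would be.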
Mowery and Shacham estimated the entropy of their fingerprint to be about 10 bits, which is impressive but fewer than the 18.1 bits found in the Panopticlick fingerprint.
Just as the Panopticlick researchers did, they conclude that there’s more entropy to be found:
We were surprised at the amount of variability we observed in even very simple tests ... We conjecture that it is possible to distinguish even systems for which we obtained identical fingerprints, by rendering complicated scenes that come closer to stressing the underlying hardware
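For a sense of scale, the entropy figures quoted above convert directly into ‘one in N’ odds, and back again. The short Python sketch below does the arithmetic; the function names are mine, chosen for illustration.

```python
import math


def one_in_to_bits(n: float) -> float:
    """Convert 'one in N' odds of a match into entropy in bits."""
    return math.log2(n)


def bits_to_one_in(bits: float) -> float:
    """Convert entropy in bits into 'one in N' odds of a match."""
    return 2.0 ** bits


print(round(one_in_to_bits(286_777), 1))  # 18.1 (Panopticlick's 1 in 286,777)
print(bits_to_one_in(10))                 # 1024.0 (the 10-bit canvas estimate)
```

In other words, Panopticlick's 1 in 286,777 and its 18.1 bits are the same measurement, while 10 bits of canvas entropy on its own narrows things down to roughly 1 browser in 1,024.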
FINGERPRINTING IN THE WILD
The potential for canvas fingerprinting was obvious but Mowery and Shacham had only shown that it was possible, not that it was being used in the real world.
In 2014, a group of researchers from Princeton and the University of Leuven set out to see if canvas fingerprinting was being used in the wild.
They crawled the home pages of the 100,000 most popular websites and found 20 distinct implementations of canvas fingerprinting.
Nine of them appeared to be home-brewed implementations unique to a single site while 11 of them were third party scripts shared across a number of sites.
The lion’s share of the sites they found though, some 95% of the 5,542 unique sites that were using canvas fingerprinting, were using code provided by AddThis.
I should be absolutely clear that neither site owners nor users were aware that they were part of an AddThis test bed.
The AddThis code that the researchers found was there to provide social media sharing functionality. The fingerprinting code bundled with it, unannounced, was being used by AddThis for its own ends and not by its customers.
The results of the research were published in a paper, The Web Never Forgets, in July 2014, and caused a bit of a stir in the computer security press.
By a happy and remarkable coincidence, the six-month “preliminary initiative to evaluate alternatives to browser cookies” ended at exactly the same time.
AddThis came clean in a blog post shortly after concluding the test and was at pains to reassure users that their privacy had been protected.
... this data was never used for personalization or targeted advertising.... We don't identify individuals ... and we honor user opt-out preferences any time we act on our data.... We adhere to industry standards, and have an opt-out process that complies with our membership in the NAI and the DAA. We honored our opt-out policy during this test, and the data was only used for internal research.
In the comments, a representative from AddThis revealed that the test wasn’t wrapped up as a matter of conscience, or even damage limitation, but because it didn’t work very well.
Had the identification actually been good, we would have kicked off a whole new investigation ... But given the results, we're halting the project.
Disappointingly, the post also seeks to justify the company’s actions by invoking an excuse familiar to parents of teenagers the world over – everyone else is doing it, so why can’t we:
Many other companies are working on cookie alternatives, and we wanted to see if this approach worked.
THE BOTTOM LINE
What AddThis didn’t address in its mea culpa is the fundamental thing that makes fingerprinting and other exotic tracking techniques so obnoxious:
They only exist to rob users of the ability to control who tracks them.
Cookies provide a perfectly decent way to identify users – they’re reliable, benign, well understood by users, easy to implement and easy for users to control.
The only ‘problem’ that super cookies, evercookies, fingerprints and other methods ‘solve’ is that of users having opinions about who tracks them.
Users who delete cookies are sending out a clear message that they don’t wish to be tracked. Vendors who use fingerprinting are looking for ways to drown out that message.
HOW TO PROTECT YOURSELF
Fingerprinting is a viable alternative to cookies that’s being used in the wild.
The techniques shown by Mowery, Shacham and the EFF are individually useful but both sets of researchers pointed to ways their techniques might be made better still. The most obvious way to strengthen either technique is to combine it with the other since the two don’t overlap.
That work has already been done and an off-the-peg fingerprinting library that incorporates both techniques is available for free on GitHub.
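Combining the two is simple in principle. The Python sketch below is my own illustration, not the library's code: the `combined_id` helper is hypothetical and the input hashes are made-up placeholders.

```python
import hashlib


def combined_id(attribute_hash: str, canvas_hash: str) -> str:
    """Merge two independent fingerprints into a single ID (hypothetical)."""
    # The separator stops ("ab", "c") and ("a", "bc") producing the same input.
    joined = attribute_hash + "|" + canvas_hash
    return hashlib.sha1(joined.encode("ascii")).hexdigest()


# Made-up hashes: two browsers whose canvases render identically (same
# canvas hash) but whose fonts, headers and so on differ.
laptop = combined_id("a" * 40, "c" * 40)
desktop = combined_id("b" * 40, "c" * 40)

print(laptop != desktop)  # True: the attribute hash tells them apart
```

If the two sources really were fully independent, their entropy would simply add, giving at most about 28 bits (18.1 plus 10). That figure is an upper bound that assumes independence, not a measured result.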
Existing counter-measures are of limited use: Private Browsing and Incognito mode don’t alter a browser’s fingerprint and so, according to the author of the code I mentioned above, they have no effect.
Privacy-conscious users who deploy browser plugins to manage cookies and other tracking mechanisms are also likely to make their fingerprints more distinct, not less.
There is no single, good way to protect yourself but there are things that you can do to make your fingerprint less distinct.
Turning off Flash, Java, WebGL and JavaScript will reduce your fingerprint massively but you may find the web unusable if you do. A reasonable compromise would be to disable Flash and Java and use a plugin like NoScript.
Privacy plugins like Ghostery should protect you from fingerprinting code served from known, third party domains used for advertising or tracking.
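The general idea behind blocking by domain can be sketched as follows. This is a simplified Python illustration and not how Ghostery itself is implemented: the blocklist entries are invented `.example` domains and `is_blocked` is a hypothetical helper.

```python
from urllib.parse import urlparse

# Invented entries for illustration; a real plugin ships a maintained list.
BLOCKLIST = {"tracker.example", "ads.example"}


def is_blocked(script_url: str, page_url: str, blocklist=BLOCKLIST) -> bool:
    """Return True for third-party scripts served from a listed domain."""
    script_host = urlparse(script_url).hostname or ""
    page_host = urlparse(page_url).hostname or ""
    if script_host == page_host:
        return False  # first-party script: this check never blocks it
    # Match the listed domain itself or any subdomain of it.
    return any(
        script_host == domain or script_host.endswith("." + domain)
        for domain in blocklist
    )


print(is_blocked("https://cdn.tracker.example/fp.js", "https://news.example/"))  # True
print(is_blocked("https://news.example/app.js", "https://news.example/"))        # False
```

The second result shows the limit of the approach: a home-brewed fingerprinting script served by the site itself, or from a domain nobody has listed yet, passes straight through.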
According to the EFF the browser most resistant to fingerprinting is the Tor Browser because of its bland User-Agent string and aggressive approach to blocking JavaScript.
Tor also asks for a user’s permission before giving websites access to data on canvas elements, which completely disrupts canvas fingerprinting. The same functionality is available in plugins for Chrome and Firefox.
The EFF is also promising that future versions of its Privacy Badger plugin will include countermeasures against fingerprinting.