You Can Be Anonymised But You Can’t Hide

If you think there is safety in numbers when it comes to the privacy of your personal information, think again. A recent study in Nature Communications found that, given a large enough dataset, anonymised personal information is only an algorithm away from being re-identified.

Anonymised data is data that has been stripped of identifying information, such as a name or email address. Under many privacy laws, anonymising data allows organisations and public bodies to use and share information without infringing an individual’s privacy and without having to obtain the authorisations or consents that would otherwise be required.

But what happens when that anonymised data is combined with other data sets?

Researchers behind the Nature Communications study found that 99.98% of Americans could be re-identified in any incomplete dataset using only 15 demographic attributes. Data analysts may find this fascinating, but individuals may be alarmed to hear that their anonymised data can be re-identified so easily and then potentially accessed or disclosed by others in ways they had not envisaged.
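
To see why a handful of attributes can single people out, consider the following minimal Python sketch. It is not the statistical model used in the study; it simply counts how many records in a small, made-up "anonymised" table become unique as more attributes are combined, and then shows how an outsider who already knows a few facts about a person could link them to a record. The records, field names and the helper functions `unique_fraction` and `link` are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical "anonymised" records: names removed, demographic attributes kept.
records = [
    {"zip": "10001", "birth_year": 1980, "gender": "F", "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1980, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "10001", "birth_year": 1975, "gender": "F", "diagnosis": "flu"},
    {"zip": "10002", "birth_year": 1980, "gender": "F", "diagnosis": "migraine"},
    {"zip": "10002", "birth_year": 1980, "gender": "F", "diagnosis": "asthma"},
]


def unique_fraction(rows, attributes):
    """Share of rows whose combination of `attributes` appears exactly once."""
    keys = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


def link(rows, known):
    """Return the rows that match every attribute an outsider already knows."""
    return [row for row in rows if all(row[k] == v for k, v in known.items())]


# Each extra attribute makes more records unique, and so easier to single out.
print(unique_fraction(records, ["zip"]))                          # 0.0
print(unique_fraction(records, ["zip", "birth_year"]))            # 0.2
print(unique_fraction(records, ["zip", "birth_year", "gender"]))  # 0.6

# Someone who knows a neighbour's postcode, birth year and gender
# narrows the table down to a single record, diagnosis included.
print(link(records, {"zip": "10001", "birth_year": 1975, "gender": "F"}))
```

In this toy table no record is unique by postcode alone, but three attributes are already enough to isolate most of them. With 15 attributes and a real population the same effect is far stronger, which is the intuition behind the study's finding.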

Re-identification techniques were recently used by the New York Times. In March this year, the newspaper pulled together various public data sources, including an anonymised dataset from the Internal Revenue Service, to reveal a decade’s worth of figures from Donald Trump’s income tax returns showing negative adjusted income. His tax returns had been the subject of great public speculation.

What does this mean for business? Depending on the circumstances, simply removing personal information such as names and email addresses may not be enough to anonymise data, and using or sharing data treated this way may breach many privacy laws.

To address these risks, companies like Google, Uber and Apple use “differential privacy” techniques, which add “noise” to datasets so that individuals cannot be re-identified, while still giving the companies access to the aggregate insights they need.
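
As a rough illustration of what adding "noise" means, the sketch below applies the Laplace mechanism, one standard textbook way of achieving differential privacy, to a simple counting query. It is not a description of how any of the companies named above implement it. The function names `laplace_noise` and `private_count`, the sample ages and the chosen `epsilon` value are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Noisy count of the values matching `predicate`.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1 / epsilon is the usual calibration for an
    epsilon-differentially-private count.
    """
    rng = rng or random.Random()
    true_count = sum(1 for value in values if predicate(value))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of the people in a dataset.
ages = [23, 35, 41, 52, 29, 67, 38]

# Smaller epsilon means more noise: stronger privacy, less accurate answers.
noisy = private_count(ages, lambda age: age > 40, epsilon=0.5)
print(f"Noisy count of people over 40: {noisy:.2f}")
```

Each answer is blurred enough that the presence or absence of any one person is hard to infer from it, yet the noise averages out to zero, so totals and trends across many people remain useful. The trade-off between privacy and accuracy is governed by `epsilon`, and choosing it is a policy decision as much as a technical one.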

Many businesses that use data anonymisation as a quick and cost-effective way to de-personalise data may be surprised to learn that more may be needed to protect individuals’ personal information.

If you would like to know more about other similar studies, check out our previous blog post ‘The Co-Existence of Open Data and Privacy in a Digital World’.

Copyright 2019 K & L Gates
This article is by Cameron Abbott of K&L Gates.
For more on internet privacy, see the National Law Review Communications, Media & Internet law page.

Published by

National Law Forum
