Participants in Personal Genome Project Identified by Privacy Experts

Posted: May 2, 2013 at 7:46 am

Privacy experts have identified participants in the Personal Genome Project using de-identified data.

One of the biggest questions in biology is the nature versus nurture debate, the relative roles that genetic and environmental factors play in determining human traits.

In 2006, George Church at Harvard University and a few others started the Personal Genome Project (PGP) to help answer this question. The goal is to collect genomic information from 100,000 informed members of the public along with their health records and other relevant phenotypic data. The idea is to use this information to help tease apart the relative contributions of genetic and environmental factors.

The project does not guarantee privacy for those who sign up. Indeed, the participants can reveal as much information as they like, including their ZIP code, birth date and sex.

However, the data is de-identified in the sense that the owners' names and addresses are not included in their profiles on the PGP website, and this generates a veneer of privacy.

Today, Latanya Sweeney and colleagues at Harvard show that even this is practically useless in keeping owners' identities private. They say a relatively simple comparison of the list of PGP participants with other databases such as voter lists reveals the identity of a significant number of them with remarkable accuracy.

The de-anonymisation procedure is simple. Voter lists contain names and addresses, but also ZIP code, birth date and sex. So it is straightforward to compare these lists with the profiles of PGP participants who have also included their ZIP code, birth date and sex.

When there is a match, the question is whether the ZIP code, birth date and sex uniquely identify an individual. Sweeney has argued in the past that they do, with an accuracy of up to 87 per cent, depending on factors such as the density of people living in the ZIP code in question.
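
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the code Sweeney and colleagues used: the field names (`zip`, `birth_date`, `sex`, `name`, `id`), the function `link_records` and all of the records are invented for the example. A profile counts as re-identified only when exactly one voter shares its ZIP code, birth date and sex.

```python
from collections import defaultdict


def link_records(profiles, voters):
    """Map profile id -> voter name where (zip, birth_date, sex) matches exactly one voter."""
    # Index the voter list by the three shared fields.
    index = defaultdict(list)
    for voter in voters:
        key = (voter["zip"], voter["birth_date"], voter["sex"])
        index[key].append(voter["name"])

    matches = {}
    for profile in profiles:
        key = (profile["zip"], profile["birth_date"], profile["sex"])
        candidates = index.get(key, [])
        # Keep only unambiguous matches; zero or several candidates identify nobody.
        if len(candidates) == 1:
            matches[profile["id"]] = candidates[0]
    return matches


# Entirely fabricated toy data, for illustration only.
voters = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1970-01-15", "sex": "F"},
    {"name": "Bob Example", "zip": "02138", "birth_date": "1982-06-30", "sex": "M"},
    {"name": "Carl Example", "zip": "02138", "birth_date": "1982-06-30", "sex": "M"},
]
profiles = [
    {"id": "pgp-001", "zip": "02138", "birth_date": "1970-01-15", "sex": "F"},
    {"id": "pgp-002", "zip": "02138", "birth_date": "1982-06-30", "sex": "M"},
    {"id": "pgp-003", "zip": "02139", "birth_date": "1990-03-03", "sex": "F"},
]

print(link_records(profiles, voters))  # {'pgp-001': 'Alice Example'}
```

In this toy data the second profile is left unmatched because two voters share its details, which is why sparsely populated ZIP codes make unique matches more likely.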

These results seem to prove her right. Sweeney and co submitted the results to the PGP organisation and asked them to check how accurate the de-anonymisation process had been. It turns out they accurately identified people with a success rate of up to 97 per cent.
