Supplementary Materials for "How to Avoid Reidentification with Proper Anonymization"- Comment on "Unique in the shopping mall: on the reidentifiability of credit card metadata"

18 Nov 2015  ·  David Sánchez, Sergio Martínez, Josep Domingo-Ferrer ·

The study by De Montjoye et al. ("Science", 30 January 2015, p. 536) claimed that most individuals can be reidentified from a deidentified credit card transaction database and that anonymization mechanisms are not effective against reidentification. Such claims deserve detailed quantitative scrutiny, as they might seriously undermine the willingness of data owners and subjects to share data for research. In a recent Technical Comment published in "Science" (18 March 2016, p. 1274), we demonstrate that the reidentification risk reported by De Montjoye et al. was significantly overestimated (due to a misunderstanding of the reidentification attack) and that the alleged ineffectiveness of anonymization is due to the choice of poor and undocumented methods and to a general disregard of 40 years of anonymization literature. The technical comment also shows how to properly anonymize data, in order to reduce unequivocal reidentifications to zero while retaining even more analytical utility than with the poor anonymization mechanisms employed by De Montjoye et al. In conclusion, data owners, subjects and users can be reassured that sound privacy models and anonymization methods exist to produce safe and useful anonymized data. Supplementary materials detailing the data sets, algorithms and extended results of our study are available here. Moreover, unlike the De Montjoye et al.'s data set, which was never made available, our data, anonymized results, and anonymization algorithms can be freely downloaded from http://crises-deim.urv.cat/opendata/SPD_Science.zip

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper