Researchers at the Massachusetts Institute of Technology found that just four fairly vague pieces of information - the dates and locations of four purchases - are enough to identify 90 per cent of the people in a data set recording three months of credit-card transactions by 1.1 million users.
When the researchers also considered coarse-grained information about the prices of purchases, just three data points were enough to identify an even larger percentage of people in the data set.
This is true, the researchers said, even in cases where no one in the data set is identified by name, address, credit card number, or anything else that we typically think of as personal information.
"If we show it with a couple of data sets, then it's more likely to be true in general," said Yves-Alexandre de Montjoye, an MIT graduate student in media arts and sciences, and first author of the study.
The data set the researchers analysed included the names and locations of the shops at which purchases took place, the days on which they took place, and the purchase amounts.
Purchases made with the same credit card were all tagged with the same random identification number.
For each identification number - each customer in the data set - the researchers selected purchases at random, then determined how many other customers' purchase histories contained the same data points.
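The counting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' actual code: the function name `unicity`, the toy `histories` data, and the representation of each purchase as a `(shop, day)` pair are all invented for the example.

```python
import random


def unicity(histories, k, seed=0):
    """Estimate the share of customers singled out by k random purchases.

    histories: dict mapping an anonymous ID to a set of (shop, day) pairs.
    The name and data layout are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    eligible = 0
    unique = 0
    for customer, purchases in histories.items():
        if len(purchases) < k:
            continue  # not enough purchases to draw k points from
        eligible += 1
        # Pick k of this customer's purchases at random.
        sample = set(rng.sample(sorted(purchases), k))
        # Count other customers whose histories contain all k points.
        matches = sum(
            1
            for other, other_purchases in histories.items()
            if other != customer and sample <= other_purchases
        )
        if matches == 0:
            unique += 1
    return unique / eligible if eligible else 0.0


# Invented toy data: three customers, three purchases each.
histories = {
    "a1": {("bakery", "day-01"), ("cafe", "day-02"), ("gym", "day-03")},
    "b2": {("bakery", "day-01"), ("cafe", "day-02"), ("cinema", "day-04")},
    "c3": {("bookshop", "day-01"), ("cafe", "day-05"), ("gym", "day-03")},
}

share_k3 = unicity(histories, k=3)
share_k1 = unicity(histories, k=1)
```

In this toy data every customer is unique once all three purchases are known, so `share_k3` is 1.0, while a single purchase leaves at least customer "a1" indistinguishable from someone else. Coarse price information could be modelled the same way by adding a price bucket to each pair.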
At the other extreme, five points with price information were enough to identify almost everyone, the researchers found.
