The Easy Aggregation of Personal Data

18 May ’05

The NYT reports today on a fun school project at Johns Hopkins – for $50, see how much personal data on others you can collect. Answer: lots. And lots. And the truly fun part is the picture many different pieces of information create when they are linked together. Quote:

Working with a strict requirement to use only legal, public sources of information, groups of three to four students set out to vacuum up not just tidbits on citizens of Baltimore, but whole databases: death records, property tax information, campaign donations, occupational license registries. They then cleaned and linked the databases they had collected, making it possible to enter a single name and generate multiple layers of information on individuals. Each group could spend no more than $50.

Several groups managed to gather well over a million records, with hundreds of thousands of individuals represented in each database.

“Imagine what they could do if they had money and unlimited time,” Dr. Rubin said.

An object lesson in the risk created by combining the persistence of data, the power of current and future data storage and processing capacity, and the access provided by the Internet.
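To make the "cleaned and linked" step in the quote concrete, here is a minimal sketch of how separate record sets can be joined on a normalized name. It is not the students' method, which the article does not describe; the dataset names, field names and records below are all invented for illustration.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, reorder 'Last, First' to 'first last', drop punctuation."""
    name = name.strip().lower()
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    cleaned = "".join(ch for ch in name if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def build_index(sources: dict) -> dict:
    """Group records from every source under one normalized-name key."""
    index = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            key = normalize(record["name"])
            index[key].setdefault(source_name, []).append(record)
    return index


# Hypothetical, made-up records standing in for public databases.
sources = {
    "property_tax": [{"name": "Doe, Jane", "address": "1 Example St"}],
    "campaign_donations": [{"name": "Jane Doe", "amount": 25}],
    "licenses": [
        {"name": "JANE  DOE", "license": "example"},
        {"name": "John Roe", "license": "example"},
    ],
}

index = build_index(sources)

# A single name now returns every layer collected about that person.
profile = index.get(normalize("Jane Doe"), {})
for source_name, records in profile.items():
    print(source_name, records)
```

Matching on the name alone will merge different people who share a name, so a real linkage effort would presumably need more fields (address, date of birth) to be reliable. The point of the sketch is only how little machinery the linking itself requires.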

Interestingly, the article explores the opposing views on whether privacy is necessarily the better default position, given the benefits to society of openness and transparency (for example, in government). It’s always bugged me, actually, that the debate is framed in that way; apparently, we are all freer now, and live in a more efficiently governed and managed society, because all of this data can be collected and harvested. Well, no, actually, we don’t. And it seems to me that the need to distinguish between information that on balance ought to be available for publication, aggregation and analysis, and information that shouldn’t be, is being submerged (because the task is perceived as too difficult?) under the rising tide of the debate between openness and privacy.
