Three-Hopping the Corporate Store, in Theory

Stanford University has been running a project to better understand what phone metadata can show about users, MetaPhone, in which Android users can make their metadata available for analysis.

They just published a piece that suggests we could be underestimating the intrusiveness of the government’s phone dragnet program. That’s because most assumptions about degrees of separation consider only human contacts, and not certain hub phone numbers that quickly unite us.

A common approach for calculating these figures has been to simply assume an average number of call relationships per phone line (“degree”), then multiply out the number of hops. If a single phone number has average degree d, and the NSA can make h hops, then a single query gives expected access to about dh complete sets of phone records.34

We turned to our crowdsourced MetaPhone dataset for an empirical measurement. Given our small, scattershot, and time-limited sample of phone activity, we expected our graph to be largely disconnected. After all, just one pair from our hundreds of participants had held a call.

Surprisingly, our call graph was connected. Over 90% of participants were related in a single graph component. And within that component, participants were closely linked: on average, over 10% of participants were just 2 hops away, and over 65% of participants were 4 or fewer hops away!

In spite of the fact that just 2 of its participants had called each other, the fact that so many people had called TMobile’s voicemail number connected 17% of participants at two hops.

Already 17.5% of participants are linked. That makes intuitive sense—many Americans use T-Mobile for mobile phone service, and many call into voicemail. Now think through the magnitude of the privacy impact: T-Mobile has over 45 million subscribers in the United States. That’s potentially tens of millions of Americans connected by just two phone hops, solely because of how their carrier happens to configure voicemail.

And from this, the piece concludes that NSA could get access to a huge number of numbers with just one seed.

But our measurements are highly suggestive that many previous estimates of the NSA’s three-hop authority were conservative. Under current FISA Court orders, the NSA may be able to analyze the phone records of a sizable proportion of the United States population with just one seed number.

This analysis doesn’t account for one thing: NSA uses Data Integrity Analysts who take out high volume numbers — numbers like the TMobile voice mail number.

Here’s how the 2009 End-to-End review of the phone dragnet described their role.

As part of their Court-authorized function of ensuring BR FISA metadata is properly formatted for analysis, Data Integrity Analysts seek to identify numbers in the BR FISA metadata that are not associated with specific users, e.g., “high volume identifiers.” Read more