PCLOB Estimates 120 Million Phone Numbers in Corporate Store

PCLOB’s report confirms something ACLU’s Patrick Toomey and I have been harping on: one of the biggest risks of the phone dragnet stems not from the initial queries themselves, but from NSA’s permanent storage of query results in the “corporate store,” where they can be accessed without the restrictions required for access to the full database and exposed to all the rest of NSA’s neat toys.

According to the FISA court’s orders, records that have been moved into the corporate store may be searched by authorized personnel “for valid foreign intelligence purposes, without the requirement that those searches use only RAS-approved selection terms.”71 Analysts therefore can query the records in the corporate store with terms that are not reasonably suspected of association with terrorism. They also are permitted to analyze records in the corporate store through means other than individual contact-chaining queries that begin with a single selection term: because the records in the corporate store all stem from RAS-approved queries, the agency is allowed to apply other analytic methods and techniques to the query results.72 For instance, such calling records may be integrated with data acquired under other authorities for further analysis. The FISA court’s orders expressly state that the NSA may apply “the full range” of signals intelligence analytic tradecraft to the calling records that are responsive to a query, which includes every record in the corporate store.73

PCLOB doesn’t say it, but NSA’s SID Director Theresa Shea has: those other authorities include content collection, which means coming up in a query can lead directly to someone reading your content.

Section 215 bulk telephony metadata complements other counterterrorist-related collection sources by serving as a significant enabler for NSA intelligence analysis. It assists the NSA in applying limited linguistic resources available to the counterterrorism mission against links that have the highest probability of connection to terrorist targets. Put another way, while Section 215 does not contain content, analysis of the Section 215 metadata can help the NSA prioritize for content analysis communications of non-U.S. persons which it acquires under other authorities. Such persons are of heightened interest if they are in a communication network with persons located in the U.S. Thus, Section 215 metadata can provide the means for steering and applying content analysis so that the U.S. Government gains the best possible understanding of terrorist target actions and intentions. [my emphasis]

Plus, those authorities will include datamining, including with other data collected by NSA, like a user’s Internet habits and financial records.

Then, PCLOB does some math to estimate how many numbers might be in the corporate store.

If a seed number has seventy-five direct contacts, for instance, and each of these first-hop contacts has seventy-five new contacts of its own, then each query would provide the government with the complete calling records of 5,625 telephone numbers. And if each of those second-hop numbers has seventy-five new contacts of its own, a single query would result in a batch of calling records involving over 420,000 telephone numbers.


If the NSA queries around 300 seed numbers a year, as it did in 2012, then based on the estimates provided earlier about the number of records produced in response to a single query, the corporate store would contain records involving over 120 million telephone numbers.74

74 While fewer than 300 identifiers were used to query the call detail records in 2012, that number “has varied over the years.” Shea Decl. ¶ 24.
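
To make PCLOB’s math easy to check, here is a minimal Python sketch of it. The function names are mine, not PCLOB’s or NSA’s, and the sketch keeps the report’s simplifying assumptions: every number has the same count of new contacts, and results from different seeds never overlap.

```python
def numbers_at_hop(contacts_per_number: int, hop: int) -> int:
    """Numbers reached at a given hop, assuming uniform fan-out (PCLOB's simplification)."""
    return contacts_per_number ** hop


def corporate_store_estimate(seeds_per_year: int, contacts_per_number: int, hops: int = 3) -> int:
    """Seeds times last-hop numbers, assuming no overlap between seeds (hypothetical helper)."""
    return seeds_per_year * numbers_at_hop(contacts_per_number, hops)


# PCLOB's illustration: 75 new contacts per number, three hops.
for hop in (1, 2, 3):
    print(f"hop {hop}: {numbers_at_hop(75, hop):,} numbers")

# Around 300 seeds a year, as in 2012.
print(f"corporate store: {corporate_store_estimate(300, 75):,} numbers")
```

That works out to 421,875 numbers at the third hop and 126,562,500 across 300 seeds, which matches the report’s “over 420,000” and “over 120 million.”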

Some might quibble with these numbers: other estimates use 40 contacts per person (though remember, there are 5 years of data), and the estimate doesn’t seem to account for mutual contacts. Plus, remember these are unique phone numbers: we should expect the total to cover fewer people, because people, especially people trying to hide, change phones regularly. Further, remember a whole lot of foreign numbers will be in there.
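
To get a feel for how much those quibbles move the total, here is a rough Python sketch. Everything in it is an assumption for illustration: the helper names are mine, the 20 percent share of mutual contacts is a made-up figure rather than anything PCLOB or NSA has reported, and like PCLOB’s math it counts only third-hop numbers and ignores overlap between seeds.

```python
def new_contacts(contacts: int, shared_percent: int = 0) -> int:
    """Contacts that are genuinely new after discounting a hypothetical share of mutual ones."""
    return contacts * (100 - shared_percent) // 100


def store_estimate(seeds: int, contacts: int, shared_percent: int = 0, hops: int = 3) -> int:
    """Seeds times numbers at the last hop, under uniform fan-out (illustrative only)."""
    return seeds * new_contacts(contacts, shared_percent) ** hops


# PCLOB's assumption: 75 new contacts per number, 300 seeds.
print(f"75 contacts, no overlap:  {store_estimate(300, 75):,}")
# The alternative estimate of 40 contacts per person.
print(f"40 contacts, no overlap:  {store_estimate(300, 40):,}")
# 75 contacts with an assumed (not sourced) 20% of contacts shared.
print(f"75 contacts, 20% overlap: {store_estimate(300, 75, shared_percent=20):,}")
```

Under these assumptions, 40 contacts per person gives 19.2 million numbers, and a 20 percent overlap on PCLOB’s 75 gives 64.8 million. Because the fan-out is cubed, small changes in the contact count swing the total a lot.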

But other things suggest it might be conservative. As a recent Stanford study showed, if the NSA isn’t diligent about removing high-volume numbers, then queries could quickly include everyone; certainly, NSA could have deliberately populated the corporate store by leaving such identifiers in. We know there were 27,000 people cleared for RAS in 2008 and 17,000 on an alert list in 2009, meaning the query numbers for earlier years are effectively much, much higher (which seems to be the point of footnote 74).

Plus, remember that PCLOB gave their descriptive sections to the NSA to review for accuracy. So I assume NSA did not object to the estimate.

So 120 million phone numbers might be a reasonable estimate.

That’s a lot of Americans exposed to the level of data analysis permissible in the corporate store.

3 replies
  1. TheMagicIsOver says:

    How, at its base level, is this guilt by association even considered an acceptable, useful or moral way to actually identify and investigate terrorists?

  2. thatvisionthing says:

    ew, can you please clarify this? How long is stuff retained?

    Question comes from Corrente re Edward Snowden’s Live Q&A Thursday: http://www.correntewire.com/comment/227406#comment-227406
    Snowden says 5 years, Appelbaum said Risen and Poitras said 10+5 for metadata and content, and I find this at NYT to support that:


    If the N.S.A. does not immediately use the phone and e-mail logging data of an American, it can be stored for later use, at least under certain circumstances, according to several documents.

    One 2011 memo, for example, said that after a court ruling narrowed the scope of the agency’s collection, the data in question was “being buffered for possible ingest” later. A year earlier, an internal briefing paper from the N.S.A. Office of Legal Counsel showed that the agency was allowed to collect and retain raw traffic, which includes both metadata and content, about “U.S. persons” for up to five years online and for an additional 10 years offline for “historical searches.”

    I do not get the online-offline historical searches thing. I don’t know what briefing paper the NYTimes was talking about. The only OLC ref (DOJ) I see in PCLOB is p42 – 2003.

    I tried searching “retention” in the PCLOB report — I see 18 months for phone companies though one retained up to 26 years? (p145) Also per FISA court, 5 years for “collection store data”, but no limits at all for “corporate store”? (p169-170) Wondering too about diff between encrypted and unencrypted, does encrypted ever get tossed, or has it never been “collected” until it can be cracked? One of the board’s suggestions is to reduce retention of bulk phone records from 5 years to 3 years… but what/who/where does that leave out? (If there’s no limit on corporate store, then is there effectively no limit at all?)

