To Justify Dragnet, FBI Implies It Can’t File 300 More NSLs in a Year

June 15, 2013/11 Comments/in FISA, PATRIOT /by emptywheel

So Mark Hosenball just reported this, uncritically.

The U.S. government only searched for detailed information on calls involving fewer than 300 specific phone numbers among the millions of raw phone records collected by the National Security Agency in 2012, according to a government paper obtained by Reuters on Saturday.

As Jim Sensenbrenner noted the other day, if the government is doing only what it says it is with the database (finding US persons who are in contact with suspected terrorists), the FBI could use a grand jury subpoena or a National Security Letter to do so. Collecting all the phone records of Americans would only be required if the FBI were doing so many checks that such a process became onerous.

Except that the FBI routinely gets upwards of 10,000 NSLs a year. Adding these 300 would be a drop in the bucket.

So the difficulty of getting NSLs can’t be the problem.

Which suggests the 300 claim is an implicit acknowledgment they’re doing something more with this data than they’re letting on.


11 replies

ess emm says:
June 15, 2013 at 11:15 pm

You’re indefatigable, Marcy.
And right on the money.
orionATL says:
June 15, 2013 at 11:28 pm

For me at least, this is difficult to understand, uncharacteristically unclear.
ess emm says:
June 15, 2013 at 11:30 pm

Here’s Marcy on twitter:

Now, if Reuters were interested in journalism, they’d ask, “well, if you’re only using this DB 300 times, what justifies it v. NSLs?”
Garrett says:
June 15, 2013 at 11:35 pm

The Mark Hosenball article has got a pretty studious avoidance of the words “talking points.”

(Editing by Bill Trott)

I guess coming up with the word “paper” was the writing part.
Orestes Ippeau says:
June 16, 2013 at 12:26 am

“Millions of phone records were collected in 2012, but the paper says U.S. authorities only looked in detail at the records linked to fewer than 300 phone numbers.”

That’s from Hosenball’s report.

Anyone involved in the last 20 years in a court case involving something like what David Simon followed as a reporter and later portrayed in The Wire knows that it’s actually mundane for even a limited-number order in a single city to gather up hundreds of thousands, even millions, of calls covered by a SERIES of court orders related back to the originating search warrant of a particular investigation — depending on how long the orders carry on.

The FAA of 2008 works somewhat differently. Rather than the investigators having to return to the court periodically to get judicial authorizations extended, or extended and widened, as can (and does) occur in typical domestic crime investigation, in FAA applications, the government lays out a proposed plan showing what sorts of calls the analysts will be told to follow; and doing that means the government doesn’t have to return to the court to get the order widened at the front end. Depending on the definition of the term ‘linked to’, the terms of the originating FAA grant can allow investigators and analysts to perform their own widening, such as in series (e.g. Number “A” is ‘named’ in the order, then is found to be used to connect to Number “B”, and “B” to “C”, and so on and so on.), without having to return to the court to get further authority to carry on that sort of chase.

The EFFECT of that sort of single originating FAA order would end up being like a series of a dozen or even several dozens of domestic court orders, all tracing back to a single order into say a drug-trafficking-related homicide that identifies even just a handful of names and suspected numbers or email addresses.

The same problem pertains to Rep. Nadler’s recent exchange with FBI Director Mueller: we don’t have enough information, and maybe not even an appropriate public forum to get it, from the sort of cryptic briefing session Nadler was discussing, to determine what’s actually happening. Hosenball’s report, to me at least, is too bare bones to make sense of; it COULD BE that the government has only ever gone to the FISA court for the okay for a ‘plan’ to chase 300 or so numbers, but the magic of scale from how the term ‘linked to’ is interpreted plus the passage of time has turned The 300 into “millions”.
ess emm says:
June 16, 2013 at 12:44 am

This comment is sort of for PJ Evans.
From Gelman’s article in WaPo

The other two types of collection, which operate on a much smaller scale, are aimed at content. One of them intercepts telephone calls and routes the spoken words to a system called NUCLEON.

Smaller scale means what, not all Americans but all Muslim-Americans?
Adam Colligan says:
June 16, 2013 at 12:46 am

I think in addition to the drive to gather and covertly mine the whole table, this is also about the forward- versus backward-looking aspect of the database.

If the database users are, as I suspect and as I think you noted at some point, running a multiple-hop standard of articulable suspicion, then chasing down the exponentially-expanding number of “leads” from each target number becomes daunting and would explode the number of NSLs/etc issued. You go to the telecom and pull all the metadata for the numbers with direct connections, then you have to go and square that number, and if you want three hops, you’ll have to have issued an astronomical number of “directives”, or whatever we’re calling them these days. Most of them you don’t care about.
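A rough sketch of that arithmetic, purely illustrative: it assumes every number has the same count of distinct contacts and that no contacts overlap, which overstates the real total. The function name and the figure of 100 contacts per number are hypothetical, not drawn from any government document.

```python
def requests_needed(contacts_per_number: int, hops: int) -> int:
    """Upper bound on per-number records requests needed to see `hops` hops out.

    To see the contacts at a given hop you must first request the records of
    every number found at the hops before it: 1 request for the target itself,
    then one for each first-hop contact, then one for each second-hop contact.
    """
    return sum(contacts_per_number ** level for level in range(hops))


# Hypothetical figures: 100 contacts per number, three hops, 300 targets.
per_target = requests_needed(100, 3)   # 1 + 100 + 10,000
total_for_300_targets = 300 * per_target
print(per_target, total_for_300_targets)
```

Under those assumptions a single target needs on the order of ten thousand requests, which is roughly a full year of reported NSLs for one number.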

But if you already have the entire historical database in-house, you gain two things.

Firstly, you get to dramatically expand the number of phone numbers that would fall within any given degree of distance from an official “target” or “suspect” number, as I mentioned yesterday, because over time the minimum distance between any two numbers in the phone web will approach 1.

But there’s another advantage as well. When you apply your fancypants semantic and relational indexer algorithms to the data as it comes in, you can add dozens or hundreds of extra columns of data for each entry (row) that you got from the telecom. These columns would automatically describe all kinds of properties and relationships that that row has in relation to all the others, and you can periodically update the entire database based on new entries that are coming in.

What this means is that when you get a target number, you can query the database for, at a minimum, all of the numbers within a certain calculated “distance” of your target number, either in terms of contact hops or geolocation convergence or association with PRISM data or whatever you want. You don’t have to do a cascading search, with each level requiring a new round of orders to a telco, because all the relationships are in a sense readily pre-calculated or seen in-house. But you can go beyond even that. You can use the additional columns and relational data spit out by your indexing system to just highlight numbers that your algorithms find *interesting* in a particular way in relation to a target number, by some combination of factors.
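To make the contrast with cascading requests concrete, here is a minimal sketch of a contact-hop query over records already held in-house. It is an assumption about how such a lookup could work, not a description of any actual system; `numbers_within` and the toy records are hypothetical.

```python
from collections import deque


def numbers_within(call_records, target, max_hops):
    """Return {number: hop_distance} for numbers within max_hops of target.

    call_records is an iterable of (caller, callee) pairs. Because the whole
    table is local, one breadth-first pass replaces a new round of requests
    to a telco at every hop.
    """
    # Build an undirected contact graph from the raw call pairs.
    neighbors = {}
    for caller, callee in call_records:
        neighbors.setdefault(caller, set()).add(callee)
        neighbors.setdefault(callee, set()).add(caller)

    distance = {target: 0}
    queue = deque([target])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in neighbors.get(current, ()):
            if other not in distance:
                distance[other] = distance[current] + 1
                queue.append(other)

    del distance[target]  # report only the other numbers
    return distance


# Toy data: A called B, B called C, C called D; E and F are unrelated.
records = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
print(numbers_within(records, "A", 2))
```

The only "query" counted here is the one against the target number, even though the result set reaches every number inside the hop limit.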

This means that even in cases where your other tricks and obfuscations don’t hold water and you actually have to add to the reported database usage metrics, you can do so in a way that looks very conservative. You’ve only picked out a small slice of numbers for human review, even if the pool they were drawn from is very large. (I’m leaving aside for the moment the immense problem with an agency claiming it hasn’t done something simply because it’s automated that thing. This seems to be their position from what I can tell: this is yet another reason to focus on Clapper’s use of “voyeuristically” in his attempted walk-back. He is highlighting not a level of intrusiveness in the management of data but rather a state of mind, one that machines are not capable of).

A further legal or PR consequence is that this tactic could allow the database users, even when human, to claim that they have not queried “the data” or “the metadata” from the telecoms when in fact they have. This is because they are reading the columns that have been added in by the indexing and targeting system rather than the columns that were originally fed in under the orders. Ironically, this would probably be even more invasive, because the additional auto-added columns probably contain much more socially relevant information about the names, occupations, emails, etc. of the persons who control the phone numbers. Maybe you heard something different at the hearings this week, but I noted that they said they needed a more robust judicial order to get subscriber info from the telcos; I didn’t hear that they needed any such process in order to allow their computers to automatically infer identities.
Valley Girl says:
June 16, 2013 at 8:32 am

The U.S. government or U.S. authorities… are they perhaps leaving out the number of searches done by contractors, like at Booz A?
emptywheel says:
June 16, 2013 at 9:45 am

@Valley Girl: See this post.

http://www.emptywheel.net/2013/06/13/what-does-nctc-do-with-nsa-and-fbis-newly-disclosed-databases/
P J Evans says:
June 16, 2013 at 11:26 am

@ess emm:
All the people they’re actively watching?
klynn says:
June 17, 2013 at 10:51 am

@ess emm:

EW you have a way with asking the perfect questions.

And the non-answers from those being asked the questions…ARE the answers.

Non-denial denial.
