The Triage Document

Accompanying a new story on GCHQ/NSA cooperation yesterday, the Intercept released one of the most revealing documents about NSA spying yet. It describes efforts to use Identifier Scoreboard to triage leads such that analysts spend manual time only with the most promising leads. Basically, the NSA aims to use this process to differentiate the 75% of metadata they collect that is interesting but not of high interest into different categories for further analysis.

It does so by checking the leads — which are identifiers like email addresses and phone numbers — against collected data (and this extends beyond just stuff collected on the wires; it includes captured media) to see what kind of contacts with existing targets there have been. Not only does the system pull up what prior contacts of interest exist, but also what time frame those occurred and in what number. From there, the analyst can link directly to either the collected knowledge about a target or the content.

Before I get into the significance, a few details.

First, the system works with both phone and Internet metadata. That’s not surprising, and it does not yet prove they’re chaining across platforms. But it is another piece of evidence supporting that conclusion.

More importantly, look at the authorities in question:

Screen shot 2014-05-01 at 10.46.51 AM

First, FAA. The CP and CT are almost certainly certificates, the authority to collect on counterproliferation and counterterrorism targets. But note what’s not there? Cybersecurity, the third known certificate (there was a third certificate reapproved in 2011, so it was active at this time). Which says they may be using that certificate differently (which might make sense, given that you’d be more interested in forensic flows, but this triage system is used with things like TAO which presumably include cyber targets).

There is, however, a second kind of FAA, “FG.” That may be upstream or it may be something else (FG could certainly stand for “Foreign Government, which would be consistent with a great deal of other data). If it’s something else, it supports the notion that there’s some quirk to how the government is using FAA that differs from what they’ve told PCLOB and the Presidential Review Group, which have both said there are just those 3 certificates.

Then there’s FAA 704/705B. This is collection on US person overseas. Note that FAA 703 (collection on US person who is located overseas but the collection on whom is in the US) is not included. Again, this shows something about how they use these authorities.

Finally, there are two EO12333s. In other slides, we’ve seen an EO12333 and an EO123333 SPCMA (which means you can collect and chain through Americans), and that may be what this is. Update: One other possibility is that this distinguishes between EO12333 data collected by the US and by second parties (the Five Eyes).

Now go to what happens when an identifier has had contact with a target — and remember, these identifiers are just random IDs at this point.

Screen shot 2014-05-01 at 10.49.50 AM

The triage program automatically pulls up prior contacts with targets. Realize what this is? It’s a backdoor search, conducted off an identifier about which the NSA has little knowledge.

And the triage provides a link directly from that the metadata describing when the contact occurred and who initiated it to the content.

When James Clapper and Theresa Shea describe the metadata serving as a kind of index that helps prioritize what content they read, this is part of what they’re referring to. That — for communications involving people who have already been targeted under whatever legal regime — the metadata leads directly to the content. (Note, this triage does not apparently include BR FISA or PRTT data — that is, metadata collected in the US — which says there are interim steps before such data will lead directly to content, though if that data can be replicated under EO 12333, as analysts are trained to do, it could more directly lead to this content.)

So they find the identifiers, search on prior contact with targets, then pull up that data, at least in the case of EO12333 data. (Another caution, these screens date from a period when NSA was just rolling out its back door search authorities for US persons, and there’s nothing here that indicates these were US persons, though it does make clear why — as last year’s audit shows — NSA has had numerous instances where they’ve done back door searches on US person identifiers they didn’t know were US person identifiers.)

Finally, look at the sources. The communications identified here all came off EO12333 communications (interestingly, this screen doesn’t ID whether we’re looking at EO12333_X or _S data). As was noted to me this morning, the SIGADS that are known here are offshore. But significantly, they include MUSCULAR, where NSA steals from Google overseas.

That is, this screen shows NSA matching metadata with metadata and content that they otherwise might get under FAA, legally, within the US. They’re identifying that as EO12333 data. EO12333 data, of course, gets little of the oversight that FAA does.

At the very least, this shows the NSA engaging in such tracking, including back door searches, off a bunch of US providers, yet identifying it as EO12333 collection.

Update: Two more things on this. Remember NSA has been trying, unsuccessfully, to replace its phone dragnet “alert” function since 2009 when the function was a big part of its violations (a process got approved in 2012, but the NSA has not been able to meet the terms of it technically, as of the last 215 order). This triage process is similar — a process to use with fairly nondescript identifiers to determine whether they’re worth more analysis. So we should assume that, while BR FISA (US collected phone dragnet) information is not yet involved in this, the NSA aspires to do so. There are a number of reasons to believe that moving to having the providers do the initial sort (as both the RuppRoge plan offered by the House Intelligence Committee and Obama’s plan do) would bring us closer to that point.

Finally, consider what this says about probable cause (especially if I’m correct that EO12333_S is the SPMCA that includes US persons). Underlying all this triage is a theory of what constitutes risk. It measures risk in terms of conversations –how often, how long, how many times — with “dangerous” people. While that may well be a fair measure in some cases, it may not be (I’ve suggested, for example, that people who don’t know they may be at risk are more likely to speak openly and at length, and those conversations then serve as a kind of camouflage for the truly interesting, rare by operational security conversations). But this theory (though not this particular tool) likely lies behind a lot of the young men who’ve been targeted by FBI.

Marcy has been blogging full time since 2007. She’s known for her live-blogging of the Scooter Libby trial, her discovery of the number of times Khalid Sheikh Mohammed was waterboarded, and generally for her weedy analysis of document dumps.

Marcy Wheeler is an independent journalist writing about national security and civil liberties. She writes as emptywheel at her eponymous blog, publishes at outlets including the Guardian, Salon, and the Progressive, and appears frequently on television and radio. She is the author of Anatomy of Deceit, a primer on the CIA leak investigation, and liveblogged the Scooter Libby trial.

Marcy has a PhD from the University of Michigan, where she researched the “feuilleton,” a short conversational newspaper form that has proven important in times of heightened censorship. Before and after her time in academics, Marcy provided documentation consulting for corporations in the auto, tech, and energy industries. She lives with her spouse and dog in Grand Rapids, MI.

1 reply
  1. bloopie2 says:

    Well, nobody has commented yet on this post (perhaps because of its nature/content – important, I assume as one cog in a vast machine for which no one can point me to an overall explanation of its operation), so I will take the occasion to go Off Topic and give you at least One Comment, specifically, some thoughts that came to me in the last day or so and for which I can’t find a better forum.
    In the SCOTUS oral arguments on the cell phone cases, much ado was made about fitting cell phones into the “incident to arrest” exception to the warrant requirement. Yet no one asked, “How important is ‘incident to arrest’, per se?” No one asked, “Mr. Government, Sir, do you have any statistics as to how many officers’ lives are saved by allowing warrantless searches for weapons within the reach of the arrestee? Or any statistics as to how much evidence is saved from destruction by allowing warrantless searches for evident within the reach of the arrestee?”
    My guess is, very little in the overall scheme of things. I certainly would not want to get rid of the exception altogether, but For God’s Sake, don’t, ye Justices, consider it to be supremely significant, something that must be upheld and enhanced at all costs.
    And Mr. Government also said that encryption of information on cell phones posed a “grave danger”. Yet no one asked, “A danger to whom?” Of course, the only correct answer to that question is, “To the ability of law enforcement to track people etc.” (You can see why he didn’t say that.) And that answer, of course, is bullshit. You want us to believe that cops can’t catch and put away crooks (whom they’ve already arrested based on other evidence) if they can’t access the crooks’ cell phones? Right. Pray tell, how did they ever catch and put away crooks, before cell phones existed? This just can’t be a significant tool in their arsenal. The badge and the gun and the patrol car are the essential tools; access to cell phones is not.
    Really, though, the Supremes do tend see through a lot of this garbage that the lawyers put up, so I remain hopeful. Worst case scenario, I think, is very narrow rulings limited to the facts of these cases. They are not going to proclaim a blanket rule covering digital information as a class; they don’t operate that way.
    So there. Good night now.

Comments are closed.