Posts

How Hal Martin Stole 75% of NSA’s Hacking Tools: NSA Failed to Implement Required Security Fixes for Three Years after Snowden

The other day, Ellen Nakashima reported that Hal Martin, the Booz Allen contractor who has been in custody for months based on allegations he stole terabytes of NSA’s hacking tools, may be indicted this week. The story raises some interesting questions — such as how, absent some proof that Martin leaked this information to a third party, prosecutors intend to distinguish Martin’s hoarding from David Petraeus’ sharing of code word information with his girlfriend Paula Broadwell. One detail Nakashima included — that Martin had stolen “operational plans against ‘a known enemy’ of the United States” — may suggest prosecutors plan to insinuate Martin stole the information to alert that known enemy (especially if the known enemy is Russia).

All that said, the detail in Nakashima’s story that has attracted the most notice is the claim that Martin stole 75% of NSA’s hacking tools.

Some U.S. officials said that Martin allegedly made off with more than 75 percent of TAO’s library of hacking tools — an allegation which, if true, would be a stunning breach of security.

Frankly, this factoid feels a lot like the claim that Edward Snowden stole 1.5 million documents from NSA, a claim invented at least in part because Congress wanted an inflammatory detail they could leak and expand budgets with. That’s especially true given that the 75% number comes from “US officials,” which sometimes include members of Congress or their staffers.

Still, the stat is pretty impressive: even in the wake of the Snowden leak, a contractor was able to walk out the door, over time, with most of NSA’s most dangerous hacking tools.

Except it should in no way be a surprise. Consider what the House Intelligence Report on Snowden revealed, which I mentioned here. Buried way back at the end of the report, it describes how in the wake of Snowden’s leaks, NSA compiled a list of security improvements that would have stopped Snowden, which it dubbed, “Secure the Net.” This initiative included the following, among other things:

  • Imposing two-person control for transferring data by removable media (making it harder for one individual to put terabytes of data on a thumb drive and walk out the door with it)
  • Reducing the number of privileged and authorized data transfer agents (making it easier to track those who could move terabytes of data around)
  • Moving towards a continuous evaluation model for background investigations (which might reveal that someone had debt problems, as Martin did)

By July 2014, the report reveals, even some of the simplest changes included in the initiative had not been implemented. On August 22, 2016, nine days after an entity calling itself Shadow Brokers first offered to auction off what have since been verified as NSA tools, NSA reported that four of the initiatives associated with Secure the Net remained unfulfilled.

All the while, according to the prosecutors’ allegations, Martin continued to walk out of NSA with TAO’s hacking tools.

Parallel to NSA’s own Secure the Net initiative, in the intelligence authorization for 2016 the House directed the DOD Inspector General to assess NSA’s information security. I find it interesting that HPSCI had to order this review and that they asked DOD’s IG, not NSA’s IG, to do it.

DOD IG issued its report on August 29, 2016, two days after a search of Martin’s home had revealed he had taken terabytes of data and the very day he was arrested. The report revealed that NSA needed to do more than its proposed fixes under the Secure the Net initiative. Among the things it discovered, for example, is that NSA did not consistently secure server racks and other sensitive equipment in data centers, and did not extend two-stage authentication controls to all high risk users.

So more than three years after Snowden walked out of the NSA with thousands of documents on a thumb drive, DOD Inspector General discovered that NSA wasn’t even securing all its server racks.

“Recent security breaches at NSA underscore the necessity for the agency to improve its security posture,” the HPSCI report stated dryly, referring obliquely to Martin and (presumably) another case Nakashima has reported on.

Then the report went on to reveal that CIA didn’t even require a physical token for general or privileged users of its enterprise or mission systems.

So yes, it is shocking that a contractor managed to walk out the door with 75% of NSA’s hacking tools, whatever that means. But it is also shocking that even the Edward Snowden breach didn’t lead NSA to implement some really basic security procedures.

Marcy has been blogging full time since 2007. She’s known for her live-blogging of the Scooter Libby trial, her discovery of the number of times Khalid Sheikh Mohammed was waterboarded, and generally for her weedy analysis of document dumps.

Marcy Wheeler is an independent journalist writing about national security and civil liberties. She writes as emptywheel at her eponymous blog, publishes at outlets including the Guardian, Salon, and the Progressive, and appears frequently on television and radio. She is the author of Anatomy of Deceit, a primer on the CIA leak investigation, and liveblogged the Scooter Libby trial.

Marcy has a PhD from the University of Michigan, where she researched the “feuilleton,” a short conversational newspaper form that has proven important in times of heightened censorship. Before and after her time in academics, Marcy provided documentation consulting for corporations in the auto, tech, and energy industries. She lives with her spouse and dog in Grand Rapids, MI.

One-Fifth of Documents Edward Snowden Stole Were Blank

Charlie Savage has a great review in the New Yorker, pitting Oliver Stone’s Snowden movie against Edward Jay Epstein’s book (and astutely noting that these two have battled before over JFK history, which presumably explains the use of “Soviet” in the title).

In it, he addresses something fact-based commentators have had to deal with over and over: the claim Snowden stole 1.5 million documents.

Another complication for judging Snowden’s actions is that we do not know how many and which documents he took. Investigators determined only that he “touched” about 1.5 million files—essentially those that were indexed by a search program he used to trawl NSA servers. Many of those files are said to pertain to military and intelligence tools and activities that did not bear on the protection of individual privacy. Snowden’s skeptics assume that he stole every such file. His supporters assume that he did not. In any case they believe his statements that after giving certain NSA archives to the journalists in Hong Kong, he destroyed his hard drives and brought no files to Russia.

But it’s time, once and for all, to reject this frame entirely.

That’s true for several reasons. First, as the House Intelligence Report on Snowden discloses, the Intelligence Community actually has two different counts of what documents Snowden “took.” The 1.5 million number comes from the Defense Intelligence Agency.

The IC more generally, though, has a different (undisclosed) number, based off three tiers of damage assessment: those documents that had been released to the public by August 31, 2015, those documents that, “based on forensic analysis, Snowden would have collected in the course of collecting [the documents already released], but have not yet been disclosed to the public.” (PDF 29) The IC believes these documents are in the hands of Glenn Greenwald and Laura Poitras and Bart Gellman. The last tier consists of documents that Snowden accessed in some way. The rest of the description of this category is redacted, but the logic involved in the section suggests the IC has good reason to question whether the third tier ever got delivered to journalists.

By May 2016 (much to HPSCI’s apparent chagrin), the IC had stopped doing damage assessment on documents not released to the public. That strongly suggests they believed Russia and other adversaries hadn’t and probably wouldn’t obtain them. That in turn suggests the IC believes either that the journalists’ operational security is adequate against Russia and China or that the documents have already been destroyed (or both), and that the documents certainly didn’t go with Snowden to Russia and get delivered to Vladimir Putin.

Particularly given the later date for the IC assessment, I’d suggest the IC likely has listened for years for signs the wider universe of documents has been released, and has found no sign the documents have. Otherwise they’d be doing a damage assessment on them.

But the 1.5 million number is problematic for two more reasons. First, as Jason Leopold reported in 2015, the 1.5 million number comes from a period when HPSCI was actively soliciting dirt on Snowden that they could (and did) leak to the press. It was designed to be as damning as possible. And, as I added at the time, the number also came at a time when Congress was scrambling to give DOD more money to deal with mitigation of Snowden’s leak. In other words, for several reasons Congress was asking the IC to give it the biggest possible number.

But there’s another problem with the 1.5 million number, revealed in the HPSCI report released last month. The 1.5 million isn’t actually all the documents Snowden is known to have touched, or even downloaded. Rather, it is all the documents he touched and downloaded, less some 374,000 “blank documents Snowden downloaded from the Department of the Army Intelligence Information Service (DAIIS) Message Processing System.”

So the real number of documents that Snowden “touched” is almost 1.9 million. But in coming up with its most inflammatory number, DIA eliminated the almost 20% of the documents that it had determined were blank.
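The arithmetic behind those two figures can be checked in a few lines. This is only a sketch using the numbers quoted in this post; the 1.5 million is itself a round figure, so the results are approximate, and the function names are made up for illustration.

```python
# Figures quoted in the post. The 1.5 million is a round number, so treat
# everything derived from it as approximate.
DIA_REPORTED_COUNT = 1_500_000  # the count DIA publicized
BLANK_DOCUMENTS = 374_000       # blank DAIIS documents left out of that count


def total_touched(reported: int, blanks: int) -> int:
    """Documents touched before the blank ones were subtracted."""
    return reported + blanks


def blank_share(reported: int, blanks: int) -> float:
    """Fraction of all touched documents that were blank."""
    return blanks / total_touched(reported, blanks)


if __name__ == "__main__":
    total = total_touched(DIA_REPORTED_COUNT, BLANK_DOCUMENTS)
    share = blank_share(DIA_REPORTED_COUNT, BLANK_DOCUMENTS)
    print(f"total touched: {total:,}")   # total touched: 1,874,000
    print(f"blank share: {share:.1%}")   # blank share: 20.0%
```

So "almost 1.9 million" is 1,874,000, and "almost 20%" is about 19.96 percent of that total.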

But consider what that tacitly admits. It admits that one-fifth of the documents that Snowden not just touched, but actually downloaded, were absolutely useless for the purposes of leaking, because they were blank. But if Snowden downloaded 374,000 blank documents, it is proof he downloaded a bunch of things he didn’t intend to leak.

Of course, fear-mongering about Snowden wandering the world with 374,000 blank documents risks making someone look crazy. So maybe that’s the reason the Snowden skeptics have chosen to edit their number down, even while doing so is tacit admission they know he “touched” a lot of things he had no intention of leaking.

If Edward Jay Epstein wants to write the definitive screed against Snowden, he should adopt, instead, that 1.9 million number. But in so doing, he should also admit he’s raising concerns about Snowden leaking blank documents.


The Dragnet Donald Trump Will Wield Is Not Just the Section 215 One

I’ve been eagerly anticipating the moment Rick Perlstein uses his historical work on Nixon to analyze Trump. Today, he doesn’t disappoint, calling Trump more paranoid than Nixon and warning of what Trump will do with the powerful surveillance machine lying ready for his use.

Revenge is a narcotic, and Trump of all people will be in need of a regular, ongoing fix. Ordering his people to abuse the surveillance state to harass and destroy his enemies will offer the quickest and most satisfying kick he can get. The tragedy, as James Madison could have told us, is that the good stuff is now lying around everywhere, just waiting for the next aspiring dictator to cop.

But along the way, Perlstein presents a bizarre picture of what happened to the Section 215 phone dragnet under Barack Obama.

That’s not to say that Obama hasn’t abused his powers: Just ask the journalists at the Associated Press whose phone records were subpoenaed by the Justice Department. But had he wanted to go further in spying on his enemies, there are few checks in place to stop him. In the very first ruling on the National Security Administration’s sweeping collection of “bulk metadata,” federal judge Richard Leon blasted the surveillance as downright Orwellian. “I cannot imagine a more ‘indiscriminate’ and ‘arbitrary’ invasion than this collection and retention of personal data,” he ruled. “Surely, such a program infringes on ‘that degree of privacy’ that the founders enshrined in the Fourth Amendment.”

But the judge’s outrage did nothing to stop the surveillance: In 2015, an appeals court remanded the case back to district court, and the NSA’s massive surveillance apparatus—soon to be under the command of President Trump—remains fully operational. The potential of the system, as former NSA official William Binney has described it, is nothing short of “turnkey totalitarianism.”

There are several things wrong with this.

First, neither Richard Leon nor any other judge has reviewed the NSA’s “sweeping collection of ‘bulk metadata.’” What Leon reviewed, in Larry Klayman’s lawsuit challenging the collection of phone metadata authorized by Section 215 and revealed by Edward Snowden, was just a small fraction of NSA’s dragnet. In 2013, the Section 215 program collected domestic and international phone records from domestic producers, but even there, Verizon had found a way to exclude collection of its cell records.

But NSA collected phone records overseas as well (indeed, many of the very same phone records, as they collected a great deal of international records). In addition, NSA collected a great deal of Internet metadata, as well as financial records and anything else it could get. Basically, anything the NSA can collect “overseas” (which is interpreted liberally) it does, and because of the way modern communications work, those records include a significant portion of the metadata of Americans’ everyday communications.

It is important for people to understand that the focus on Section 215 was an artificial creation, a limited hangout, an absolutely brilliant strategy (well done, Bob Litt, who has now moved off to retirement) to get activists to focus on one small part of the dragnet that had limitations anyway and NSA had already considered amending. It succeeded in pre-empting a discussion of just what the full dragnet entailed.

Assessments of whether Edward Snowden is a traitor or a saint always miss this, when they say they’d be happy if Snowden had just exposed the Section 215 program. Snowden didn’t want the focus to be on just that little corner of the dragnet. He wanted to expose the full dragnet, but Litt and others succeeded in pretending the Section 215 dragnet was the dragnet, and also pretending that Snowden’s other disclosures weren’t just as intrusive on Americans.

Anyway, another place where Perlstein is wrong is in suggesting there was just one Appeals Court decision. The far more important one is the one authored by Gerard Lynch in the Second Circuit, which ruled that the phone dragnet was not lawfully authorized by Section 215. It was a far more modest decision, as it did not reach constitutional questions. But Lynch better understood that the principle involved more than phone records; what really scared him was the mixing of financial records with phone records, which is what the dragnet really is.

That ruling, on top of better understanding the import of dragnets, is important because it is one of the things that led to the passage of USA Freedom Act, a law that, contrary to Perlstein’s claim, did change the phone dragnet, both for good and ill.

The USA Freedom Act, by imposing limitations on how broadly dragnet orders (for communications but not for financial and other dragnets) can be targeted, adds a check at the beginning of the process. It means only people 2 degrees away from a terrorism suspect will be collected under this program (even while the NSA continues to collect in bulk under EO 12333). So the government will have in its possession far fewer phone records collected under Section 215 (but it will still suck in massive amounts of phone records via EO 12333, including massive amounts of Americans’ records).

All that said, Section 215 now draws from a larger collection of records. It now includes the Verizon cell records not included under the old Section 215 dragnet, as well as some universe of metadata records deemed to be fair game under a loose definition of “phone company.” At a minimum, it probably includes iMessage, WhatsApp, and Skype metadata, but I would bet the government is trying to get Signal and other messaging metadata (note, Signal metadata cannot be collected retroactively; it’s unclear whether it can be collected with standing daily prospective orders). This means the Section 215 collection will be more effective in finding all the people who are 2 degrees from a target (because it will include any communications that exist solely in Verizon cell or iMessage networks, as well as whatever other metadata they’re collecting). But it also means far more innocent people will be impacted.
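To make the "2 degrees" point concrete, here is a toy sketch of two-hop contact chaining over a made-up contact graph. Everything in it is hypothetical: the identifiers, the records, and the function names are invented for illustration, and nothing here describes how NSA actually runs its queries. It only shows why adding more sources of records widens the set of people swept in.

```python
from collections import defaultdict


def build_graph(records):
    """Build an undirected contact graph from (party_a, party_b) pairs."""
    graph = defaultdict(set)
    for a, b in records:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def within_two_hops(graph, seed):
    """Return every identifier one or two hops from the seed, seed excluded."""
    first = set(graph.get(seed, ()))
    second = set()
    for contact in first:
        second |= graph.get(contact, set())
    return (first | second) - {seed}


# Hypothetical records: one set standing in for the older, narrower
# collection, the other for the additional sources described above.
landline = [("suspect", "pizzeria"), ("pizzeria", "customer_a")]
messaging = [
    ("suspect", "cousin"),
    ("cousin", "coworker"),
    ("pizzeria", "customer_b"),
]

narrow = within_two_hops(build_graph(landline), "suspect")
broad = within_two_hops(build_graph(landline + messaging), "suspect")

print(sorted(narrow))
print(sorted(broad))
```

In this toy data the narrower graph reaches two identifiers and the combined graph reaches five, including a coworker and pizza customers who never contacted the seed. That is the trade-off described above: more complete chaining, and more innocent people pulled in.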

To understand why that’s important, it’s important to understand what purpose all this metadata collection serves.

It was never the case that the collection of metadata, however intrusive, was the end goal of the process. Sure, identifying someone’s communications shows when you’ve been to an abortion clinic or when you’re conducting an affair.

But the dragnet (the one that includes limited Section 215 collection and EO 12333 collection limited only by technology, not law) actually serves two other primary purposes.

The first is to enable the creation of dossiers with the click of a few keys. Because the NSA is sitting on so much metadata (not just phone records, but Internet, financial, travel, location, and other data), it can put together a snapshot of your life as soon as it begins to correlate all the identifiers that make up your identity. One advantage of the new kind of collection under USAF, I suspect, is that it will draw from the more certain correlations you give to your communications providers, rather than relying more heavily on algorithmic analysis of bulk data. Facebook knows with certainty what email address and phone number tie to your Facebook account, whereas the NSA’s algorithms only guess that with (this is an educated guess) ~95+% accuracy.

This creation of dossiers is the same kind of analysis Facebook does, but instead of selling you plane tickets the goal is government scrutiny of your life.

The Section 215 orders long included explicit permission to subject identifiers found via 2-degree collection to all the analytical tools of the NSA. That means, for any person — complicit or innocent — identified via Section 215, the NSA can start to glue together the pieces of dossier it already has in its possession. While not an exact analogue, you might think of collection under Section 215 as a nomination to be on the equivalent of J Edgar Hoover’s old subversives list. Only, poor J Edgar mostly kept his list on index cards. Now, the list of those the government wants to have a network analysis and dossier on is kept in massive server farms and compiled using supercomputers.

Note, the Section 215 collection is still limited to terrorism suspects — that was an important win in the USA Freedom fight — but the EO 12333 collection, with whatever limits on nominating US persons, is not. Plus, it will be trivial for Trump to expand the definition of terrorist; the groundwork is already being laid to do so with Black Lives Matter.

The other purpose of the dragnet is to identify which content the NSA will invest the time and energy into reading. Most content collected is not read in real time. But Americans’ communications with a terrorism suspect will probably be, because of the concern that those Americans might be planning a domestic plot. The same is almost certainly true of, say, Chinese-Americans conversing with scientists in China, because of a concern they might be trading US secrets. Likewise it is almost certainly true of Iranian-Americans talking with government officials, because of a concern they might be dealing in nuclear dual use items. The choice to prioritize Americans makes sense from a national security perspective, but it also means certain kinds of people (Muslim immigrants, Chinese-Americans, Iranian-Americans) will be far more likely to have their communications read without a warrant than whitebread America, even if those whitebread Americans have ties to (say) NeoNazi groups.

Of course, none of this undermines Perlstein’s ultimate categorization, as voiced by Bill Binney, who created this system only to see the privacy protections he believed necessary get wiped away: the dragnet — both that authorized by USAF and that governed by EO 12333 — creates the structure for turnkey totalitarianism, especially as more and more data becomes available to NSA under EO 12333 collection rules.

But it is important to understand Obama’s history with this dragnet. Because while Obama did tweak the dragnet, two facts about it remain. First, while there are more protections built in on the domestic collection authorized by Section 215, that came with an expansion of the universe of people that will be affected by it, which must have the effect of “nominating” more people to be on this latter-day “Subversives” list.

Obama also, in PPD-28, “limited” bulk collection to a series of purposes. That sounds nice, but the purposes are so broad that they would permit bulk collection in any area of the world, and once you’ve collected in bulk, it is trivial to then call up that data under a broader foreign intelligence purpose. In any case, Trump will almost certainly disavow PPD-28.

Which makes Perlstein’s larger point all the more sobering. J Edgar and Richard Nixon were out of control. But the dragnet Trump will inherit is far more powerful.


Working Thread: HPSCI’s Full Unbelievably Shitty Snowden Report

In September, I did a post asking why the House Intelligence Committee report on Edward Snowden was so unbelievably shitty. My post was just based off a summary released by the Committee. HPSCI has now released the full report.

This will be a working thread.

Summary: The summary, with all its obvious errors, remains unchanged. So see my earlier post for the problems with that.

PDF 6: The report starts with a claim that Snowden’s leaks were the “most massive and damaging in history.” But the claim was made in 2014. Since then we’ve had two more damaging leaks, the OPM leak and the Shadow Brokers leak.

PDF 6: In my earlier post, I wrote about how the deference given to the ongoing criminal investigation into Snowden seemed very similar to — but was far less defensible than — the approach Stephen Preston used when he was General Counsel at CIA. He was General Counsel at DOD when this report started, suggesting he adopted the same approach. Worse, we now know from emails released this year that the exec had actually moved on by May 2014, meaning the claim was not sustainable when made in August 2014.

PDF 7: On the education paragraph, see this post.

PDF 7: Rather than asking the military why Snowden was discharged, the committee asked NSA’s security official. As Bart Gellman notes, his official Army record backs Snowden, not the security official. Then they say (in the footnote) that they “found no evidence that Snowden was involved in a training accident.”

PDF 9: This page cites from a CIA IG report on Snowden’s complaints about the treatment of TISOs overseas. It actually shows him trying to complain through channels.

PDF 10: Note that HPSCI claimed a paragraph based on information classified confidential was classified secret.

PDF 11: I’m curious why they redacted footnote 43.

PDF 11: Report notes a new derogatory report was submitted after Snowden left Geneva but also after his next employer hired him. It doesn’t seem too serious. Report notes that the alert function for Scattered Castles got updated after that.

PDF 12: The reports that he went to Thailand and China are second-hand, based off what an NSA lawyer said his former co-workers said. Both support an awareness that Snowden was making his privacy concerns known, including this quote (which is likely out of context and may refer to an individual program):

… Snowden expressing his view that the U.S. government had overreached on surveillance and that it was illegitimate for the government to obtain data on individuals’ personal computers.

PDF 13: Why would HPSCI (or NSA, for that matter) depend on the comments of co-workers to learn what Snowden did during a leave of absence? Also note, this is classified Secret, which means it must have some security function.

PDF 13: Note they had an interview with a lawyer and a security official on the same day.

PDF 13: His co-workers claimed Snowden frequently showed up late. That would mean he’d be home for the entirety of the East Coast day.

PDF 13: Snowden expressed concern that SOPA/PIPA would lead to online censorship, but his co-worker was dismissive bc he hadn’t read the bill.

PDF 14: The claim that Snowden went to a hackers conference in China is sourced to a co-worker who didn’t like Snowden much.

PDF 14: Note in the patch discussion, they hide the kind of person that the interviewee for this information is.

PDF 14: Snowden did something after being called out for bringing in a manager.

PDF 15: The report claims that Snowden started downloading docs in July 2012. Snowden has said that was part of transferring docs. But it also coincides with the period when he was trouble shooting a 702 template, so they may think this is how he got the FISA data.

PDF 15: Snowden had access to wget on NSA’s networks for the same reason Chelsea Manning did, IIRC: because the networks were unreliable. Snowden said he did this to move files from MD to HI. There’s a redacted paragraph that is sourced to a “HPSCI recollection summary paper,” which seems odd and unreliable.

PDF 15: The paper on the methods Snowden used is classified REL to USA, FVEY, presumably because Snowden was grabbing GCHQ documents.

PDF 16: Here’s the funny quote about Snowden violating privacy. Note the first redacted sentence here is not sourced to an NSA document, but instead to an NSA Legislative Affairs document.

PDF 18: The end of this betrays NSA’s efforts to make light of glaring security holes: the CD-ROM/USB port on Snowden’s computer, and the ability for him to download data w/o a buddy (they currently require a buddy).

PDF 19: The complaints about Snowden’s “resumé inflation” are a valid point. But what does it say that no one at NSA checks these things?

PDF 20: After Snowden moved to Booz, he went back to his old computer to be able to download the files he had new access to. I had been wondering about that.

PDF 20: All the details about Snowden’s flight are taken from public reports, not FBI or CIA reports or even NSA’s timeline, which must cover it. Did NSA’s timeline, which is dated . That is bizarre.

PDF 21: Note the classification mark for 132, which seems to conclude that Snowden’s motivation was to inform the public.

PDF 21: The report says Snowden left some encrypted hard drives behind, sourced to a 2/4/14 briefing not cited elsewhere. Working from memory I think this is the Flynn one.

PDF 21: The description of what others had said about Snowden’s interest in privacy conflicts with what NSA said internally. 

PDF 22: I will return to the description of the 702 training.

PDF 22: Note they source the training issue to someone unnamed. This appears to be the same person who described the patch issue (PDF 14), with an interview on October 28. That means it couldn’t have been the training person, and surely didn’t have first-hand knowledge.

PDF 23: The report cites the emails (without describing who they were addressed to) and the I Con the Record report on the email. Which means I’ve reviewed this issue more closely than HPSCI.

PDF 23: The section on whether Snowden was a whistleblower doesn’t cite his CIA IG contact.

PDF 25: Some of the foreign influence section obviously says there was none (see the Keith Alexander comment). Plus, this doesn’t cite other public comments saying there is no evidence of any foreign tie.

PDF 26: FN 166 is the bad briefing. Note that 1/5 of the documents Snowden took were blank.

PDF 29: This section describes the damage assessment. I find it very significant the NCSC has stopped reviewing T3 and T2 documents, which must suggest, in part, that they trust the security of the documents and/or have confirmed via some means that there aren’t more out there.

PDF 34: Yet another complaint about not fixing the removable media problem.

PDF 34: A description of the Secure the Net initiative, with four measures outstanding, and taking over a year to get to buddy system with SysAdmins.

PDF 35-36: There’s a list of things HPSCI ordered the IC to do after Snowden.


How HPSCI’s Staffers Used Miscitations to Turn Edward Snowden into a Lying Flunkie

I want to take a close look at this paragraph (from PDF 7) of the House Intelligence report on Snowden, to show how they’re (mis)using information.

In its first claim, HPSCI says Snowden was “by his own account,” a “poor student.” It cites this Greenwald and Poitras intro to Snowden, which says something different: “By his own admission, he was not a stellar student.”

The next claim says he dropped out of high school in his sophomore year and then took community college classes, which relies on this report, which in turn cites the public schools as well as the Guardian story.

1991-1998: Snowden attends schools in the Anne Arundel County Public School System in Maryland from the elementary level to high school, where he dropped out his sophomore year. He’ll later say he earned his GED. (Source: Anne Arundel County Public Schools, The Guardian)

1999-2005: Snowden takes a variety of classes from Anne Arundel Community College in Arnold, Maryland. He does not take any cyber security or computer science classes, however, and he never earns a certificate or degree. (Source: Anne Arundel Community College)

Note, the committee has said it didn’t do an investigation because of the ongoing criminal investigation into Snowden. But there is no reason they couldn’t have called Anne Arundel County Public Schools rather than relying on an ABC piece; it wouldn’t have required a long distance call!

The third claim is that Snowden hoped the (community college) classes would permit him to earn a GED, “but nothing the Committee found indicates he did so.” That’s not sourced. Again, it doesn’t say whether or not they called Maryland.

This is what Bart Gellman said in September about Snowden’s claim to have gotten a GED.

I do not know how the committee could get this one wrong in good faith. According to the official Maryland State Department of Education test report, which I have reviewed, Snowden sat for the high school equivalency test on May 4, 2004. He needed a score of 2250 to pass. He scored 3550. His Diploma No. 269403 was dated June 2, 2004, the same month he would have graduated had he returned to Arundel High School after losing his sophomore year to mononucleosis. In the interim, he took courses at Anne Arundel Community College.

The fourth claim is that Snowden told TAO he did have a GED, claiming to have received it on 6/21/2001 from “Maryland High School.”

Finally, the report says that Snowden stated that he did not have a degree of any type, citing this NYT profile rather than citing the forum itself or even the Ars Technica article that first reported it. It is absolutely true that Snowden said he didn’t have a high school diploma, but in context, Snowden was responding to someone focused primarily on a college degree.

Visigothan: No college degree.

Over 10 years work experience in my field

No communicable or other diseases

Not a religious wackjob

I think I’m good on everything except the college degree.

TheTrueHOOHA: First off, the degree thing is crap, at least domestically. If you really have ten years of solid, provable IT experience (and given that you say you’re 25, I think it’d probably be best to underestimate), you CAN get a very well paying IT job. You just need to be either actively looking now or get the fuck out of California. I have no degree, nor even a high school diploma, but I’m making much more than what they’re paying you even though I’m only claiming six years of experience. It’s tough to “break in,” but once you land a “real” position, you’re made.

Now, unless the forum has changed over the years (in which case the date could be wrong), the NYT miscited Snowden, claiming he said “I don’t have a degree of ANY type. I don’t even have a high school diploma,” when in fact the forum itself says he said, “I have no degree, nor even a high school diploma.” Moreover, in context, Snowden is distinguishing between a “degree” and a “diploma,” which may suggest he’s thinking of the actual class work versus the (GED) degree.

That claim is modified by this footnote, citing an unnamed “associate” (is this Pulitzer Prize winning Bart Gellman they’re talking about?) describing that Snowden did get a GED in 2004. [Update: Indeed it is! HPSCI hid how credible the source for this was and what he based it off of!!]

But having acknowledged that there are official records they could consult but have not, they instead just present the admittedly conflicting claims made in secondary sources (assuming they got the dates correct, but there are dates that are absolutely incorrect elsewhere in this report). There’s no actual attempt to contact local schools to get to the bottom of it all.

And yet, they then use these conflicting claims (based on inaccurate citations) to claim, in the summary, that Snowden is a “serial exaggerator.”

To make that claim with respect to his high school education, you would actually have had to do the work to ascertain the truth. The report made no effort to do so.

DISTANTFISH and Correlations

For some time, I’ve been trying to track how the NSA does correlations, as a 2008 FISA Court opinion that almost certainly approves correlation has been withheld from release. By “correlation,” NSA means the matching of known strong identifiers to a particular stream of traffic. All such identifiers need to be tracked to track a target (indeed, France was not able to prevent the Bataclan attack because they had lost track of one of the key attackers).

One of the SIDToday newsletters the Intercept released today describes how a key tool to correlate identities, DISTANTFISH, works.

Here’s how it describes DISTANTFISH’s two functions:

(S//SI) PSC works by processing application layer protocols to extract certain metadata fields that work as strong selectors for the client of the current application. These selectors are usually login names, client e-mail addresses, user numbers, and other unique metadata. If a selector is found to be that of a known terrorist, that session, as well as all others generated by the terrorist, is forwarded to NSA for analysis. The DISTANTFISH association algorithms are the primary way of determining which sessions the terrorist generated when the access is traditional passive collection. The collection of all user sessions is called the Aggregate Session and can be achieved by other methods, especially active efforts.

(S//SI) However, PSC assumes that the strong selectors for a terrorist are known. The second objective for DISTANTFISH is to associate all strong selectors for SIGINT targets and store them in a database. Intelligence analysts use the database to discover new identities to add to the selectors for that terrorist. Work on this database has begun, but much work remains.

And here’s how it worked to collect all the web activity of a particular target in Iraq in 2004.

(S//SI) Project DISTANTFISH was created to target terrorist traffic on the Internet by providing two important services. First, it provides a database for discovering account identities for known terrorists to use as strong selectors (i.e. login names, e-mail addresses, or other elements that can be associated with a particular individual). Second, it provides information on which the same user generated computer sessions. Thus, if one session contains a strong selector for a terrorist, then all sessions can be collected. At the heart of this capability is an association service that can track an individual computer by the way it generates packets.

(S//SI) From this association service, the DISTANTFISH team members were able to determine that the terrorist generated 107 computer sessions over eleven minutes, thus separating this traffic from that of the other 16 people in the web café. As most of the supporting software is still under development, the data was manually examined resulting in the discovery of two additional MSN Messenger accounts and two Yahoo web mail accounts that the terrorist used, but that NSA had been unaware of. Since terrorists often abandon accounts for new ones, having a complete picture of the accounts used is critical for targeting the terrorists’ traffic.
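The association step described in those passages can be sketched in a few lines. This is only a toy illustration, not how DISTANTFISH is implemented: the `fingerprint` field is a hypothetical stand-in for whatever packet-level trait ties sessions to one computer, and the session records and addresses are invented for the example.

```python
from collections import defaultdict

# Hypothetical session records: (fingerprint, selectors seen in that session).
# "fingerprint" is an assumed stand-in for the packet-level trait that links
# sessions to a single machine; the passages do not say what that trait is.
sessions = [
    ("machine-A", {"known_target@example.com"}),
    ("machine-A", {"second_account@example.org"}),
    ("machine-B", {"bystander@example.net"}),
    ("machine-A", set()),
]


def aggregate_sessions(sessions, known_selectors):
    """Collect every session from any machine that produced at least one
    session containing a known selector, and list the selectors in those
    sessions that were not previously known."""
    by_machine = defaultdict(list)
    for fingerprint, selectors in sessions:
        by_machine[fingerprint].append(selectors)

    collected, discovered = [], set()
    for fingerprint, machine_sessions in by_machine.items():
        # One hit on a known selector pulls in all sessions from that machine.
        if any(s & known_selectors for s in machine_sessions):
            for s in machine_sessions:
                collected.append((fingerprint, s))
                discovered |= s - known_selectors
    return collected, discovered


collected, discovered = aggregate_sessions(sessions, {"known_target@example.com"})
```

In this sketch a single known address pulls in all three sessions from the same machine, including one with no selector at all, and surfaces an account that was not previously on the list. The bystander on the other machine is left out.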

Remember, the USA Freedom Act requires “phone” companies, broadly defined, to turn over “session identifiers” under the guise of call records. Any such session identifier can be used to correlate identities in this fashion. I have long argued that is the point of USAF: to get tech companies to do correlations with a near perfect degree of accuracy rather than (in fact, in addition to) having the NSA correlate the IDs.

One Thing Edward Snowden Is Not a Fucking Idiot About

Gizmodo’s Matt Novak is outraged that fucking idiot Edward Snowden told a conference some stupid things. I agree that this was a pretty stupid comment.

Snowden also addressed his tweet from October 21st in which he said that, “There may never be a safer election in which to vote for a third option.” Snowden told us that he more or less stands by his tweet and that anything else “freezes us into a dynamic of ‘you must always choose between two bad options’” which is a “fundamentally un-American idea.”

The thing that really outraged Novak, however, is that Snowden said technical means are more important than policy as a way to protect liberty.

What got me so riled up about Snowden’s talk? He firmly believes that technology is more important than policy as a way to protect our liberties. Snowden contends that he held this belief when Obama was in office and he still believes this today, as Donald Trump is just two months away from entering the White House. But it doesn’t make him right, no matter who’s in office.

“If you want to build a better future, you’re going to have to do it yourself. Politics will take us only so far. And if history is any guide, they are the least effective means of seeing change we want to see,” Snowden said on stage in Oakland from Russia, completely oblivious to how history might actually be used as a guide.

Snowden spoke about how important it is for individuals to act in the name of liberty. He continually downplayed the role of policy in enacting change and trotted out some libertarian garbage about laws being far less important than the encryption of electronic devices for the protection of freedoms around the world.

“Law is simply letters on a page,” Snowden said. It’s a phrase that’s still ringing in my ears, as a shockingly obtuse rejection of civilized society and how real change happens in the world.

How do we advance the cause of liberty around the world? Encrypt your devices, according to Snowden. Okay, now what? Well, Snowden’s tapped out of ideas if you get beyond “use Signal.”

Novak went on to recite big legislation, notably the Civil Rights and Voting Rights Acts, that has been critical to advancing the cause of liberty within the boundaries of the US. I agree that they have.

That said, I’m all but certain I spend more time working on surveillance policy than Novak. I’m no slouch in the work to improve surveillance policy.

But there are several things about surveillance that are different. First (as Snowden pointed out), “Technology knows no jurisdiction.” One aspect of the government’s dragnet is that it spies on Americans with data collected overseas under EO 12333. And Congress has been very reluctant to — and frankly pretty ineffective at — legislating surveillance that takes place outside the relatively narrow (geographic and legal) boundaries of FISA. Without at least reinterpretation of Supreme Court precedent, it’s not clear how much Congress can legislate the spying currently conducted under EO 12333.

Either we need to come up with a way to leverage other jurisdictions so as to limit surveillance overseas (which will require technology in any case, because the NSA is better at spying than any other jurisdiction out there), or we need to find some way to make it harder for the government to spy on us by doing it overseas. The latter approach involves leveraging technology.

And all that assumes the Trump Administration won’t use the very same approach the Bush Administration did: to simply blow off the clear letter of the law and conduct the spying domestically anyway. At least now, it would be somewhat harder to do because Google has adopted end-to-end encryption and Signal exists (we’re still fighting policy battles over terms under which Google can be coerced into turning over our data, but Signal has limited the amount to which it can be coerced in the same way because of its technological choices).

The other important point is, especially going forward, it will be difficult to work on policy without using those technological tools. “Use Signal” may not be sufficient to protect liberties. But it is increasingly necessary to do so.

It may be that Novak is aware of all that. Nothing in his article, however, reflects any such awareness.

Edward Snowden may be a fucking idiot about some things. But anyone who imagines we can protect liberties by focusing exclusively on policy is definitely a fucking idiot.

NSA Conducts FISA Section 704 Collection Using Transit Collection

The Intercept has a fascinating new story confirming what many people already intuited: AT&T’s spooky building at 33 Thomas Street is a key NSA collection point, and the NSA has equipment inside the building (it’s almost certainly not just NSA; this is probably also where AT&T collects much of their Hemisphere database and it likely includes AT&T’s special service center for FBI NSLs).

The Intercept released a bunch of documents with the story, including this one on FAIRVIEW.

It shows that FISA Section 704/705a are among the authorities used with FAIRVIEW, ostensibly collected under “Transit” authority, but with the collection done at TITANPOINT (which is the code name for 33 Thomas Street).

As I explain in this post, there are three authorities in the FISA Amendments Act that are supposed to cover US persons: 703 (spying with the help of domestic partners on Americans who are overseas), 704 (spying on Americans who are overseas, using methods for which they would have an expectation of privacy), and 705, which is a hybrid.

But Snowden documents — and this IG Report — make it clear only 704 and 705b are used.

Unsurprisingly, the disclosure standards are higher for 703 — the authority they don’t use — than they are for 704. In other words, they’re using the authority to spy on Americans overseas that is weaker. Go figure.

But here’s the other problem. 704/705b are two different authorities and — as reflected in Intelligence Oversight Board reports — they are treated as such. Which means they are using 704 to spy on targets that are overseas, not just defaulting to 705b hybrid orders (which would require the person to be in the US some of the time).

But they are doing it within the US, using the fiction that the collection is only “transiting” the US (that is, transiting from one foreign country to another). This seems to indicate the NSA is conducting electronic surveillance on US persons located overseas — which seems clearly to fall under 703 — but doing it under 704 by claiming traffic transiting the US isn’t really collection in the US. Correction: Because the person is located overseas, it doesn’t count as electronic surveillance. In any case, this seems to be effectively a way around the intent of 703.

In Latest Russian Plot, WikiLeaks Reveals Hillary Opposes ISDS

Among the emails released as part of the Podesta leaks yesterday, WikiLeaks released this one showing that, almost a year before she was making the same argument in debates with Bernie Sanders, Hillary was opposed to the Investor State Dispute Settlement that is part of the Trans Pacific Partnership. (h/t Matt Stoller) ISDS is the means by which corporations have used trade agreements to operate above the domestic laws of party countries (if you haven’t, read this three part series from BuzzFeed to learn about the more exotic ways businesses are profiting off of ISDS).

The email also appears to echo her later public concern that she had changed her mind on TPP because of KORUS.

After our last talk with HRC, we revised our letter to oppose ISDS and include her caution about South Korea.

Sure, other Podesta emails show Hillary supporting a broad region of free trade (and labor) in the Americas. But this more recent email confirms that the views she expressed in debate were more than just an attempt to counter Bernie’s anti-trade platform.

Whether or not this is newsworthy enough to justify the WL dump, it is noteworthy in light of NYT’s rather bizarre article from some weeks back suggesting that WL always sides with Putin’s goals. As I noted, the article made a really strained effort to claim that WL exposed TPP materials because it served Putin’s interests. Now, here, WL is releasing information that makes Hillary look better on precisely that issue.

That doesn’t advance the presumed narrative of helping Trump defeat Hillary!

Then, as I noted yesterday, in spite of all the huff and puff from Kurt Eichenwald, the release of a Sid Blumenthal email used by Trump is another case where the WL release, as released, doesn’t feed the presumed goals of Putin.

Which brings me to this Shane Harris piece, which describes four different NatSec sources revealing there’s still a good deal of debate about WL’s ties to Russia.

Military and intelligence officials are convinced that WikiLeaks is an ongoing threat to U.S. national security and privacy owing to its leaks of classified documents and emails. But its precise relationship with Russia has been a subject of internal debate. Some do see the group as being in cahoots with the Kremlin. But others find that WikiLeaks is acting mainly as the beneficiary of stolen documents, not unlike a journalistic organization.

There are some funny aspects to this story. Nothing in it considers the significant evidence that WL is (and has reason to be) affirmatively anti-Hillary, which means its interests may align with Russia, even if it doesn’t take orders from Russia.

It also suggests that if the spooks can prove some tie between WL and Russia, they can spy on it as an agent of foreign power.

But those facts don’t mean WikiLeaks isn’t acting at Russia’s behest. And that’s not a trivial matter. If the United States were to determine that WikiLeaks is an agent of a foreign power, as defined in U.S. law, it could allow intelligence and law enforcement agencies to spy on the group—as they do on the Russian government. The U.S. can also bring criminal charges against foreign agents.

WL has been intimately involved in two separate leaking-as-espionage cases in the US, those of Chelsea Manning and Edward Snowden. The government has repeatedly told courts that it has National Security/Criminal investigations, plural, into WikiLeaks, and when pressed for details about how and whether the government is collecting on supporters and readers of WikiLeaks, the government has in part hidden those details under a b3 FOIA exemption, meaning a statute prevents disclosing it, while extraordinarily refusing to reveal what statute that is. We certainly know that FBI has used multiple informants to spy on WL and used a variety of collection methods against Jacob Appelbaum, including (according to Appelbaum) physical tails.

So there’s not only no doubt that the US government believes it can spy on WikiLeaks (which is, after all, headed by a foreigner and not a US organization), but that it already does, and has been doing for at least six years.

Perhaps Harris’ sources really mean they’ve never found a way to indict Julian Assange before, but if they can claim he’s working for Putin, then maybe they’ll overcome past problems of indicting him because it would criminalize journalism. If that’s the case, it may be shading analysis of WL, because the government would badly like a reason to shut down WL (as the comments about the direct threat to the US in the story back up).

As I’ve said before, the role of WL in this and prior leak events is a pretty complex one, one that if approached too rashly (or too sloppily) could have ramifications for other publishers. While a lot of people are rushing to collapse this (in spite of what sounds like a continuing absence of directly incriminating evidence) into a nation-state conflict, things like this TPP email suggest it’s not that simple.

The Yahoo Scan: On Facilities and FISA

There are now two competing explanations for what Yahoo was asked by the government to do last year.

Individual FISA order or 702 directive?

NYT (including Charlie Savage, who FOIAed all the FISC opinions and then wrote a book about them) explains Yahoo got an individual FISA order to search for a “signature” that the FBI had convinced the FISA Court was associated with a state-sponsored terrorist group.

A system intended to scan emails for child pornography and spam helped Yahoo satisfy a secret court order requiring it to search for messages containing a computer “signature” tied to the communications of a state-sponsored terrorist organization, several people familiar with the matter said on Wednesday.

Two government officials who spoke on the condition of anonymity said the Justice Department obtained an individualized order from a judge of the Foreign Intelligence Surveillance Court last year. Yahoo was barred from disclosing the matter.

To comply, Yahoo customized an existing scanning system for all incoming email traffic, which also looks for malware, according to one of the officials and to a third person familiar with Yahoo’s response, who also spoke on the condition of anonymity.

With some modifications, the system stored and made available to the Federal Bureau of Investigation a copy of any messages it found that contained the digital signature.

Reuters — in a story emphasizing the upcoming debate about reauthorization — says that the order was a Section 702 order.

The collection in question was specifically authorized by a warrant issued by the secret Foreign Intelligence Surveillance Court, said the two government sources, who requested anonymity to speak freely.

Yahoo’s request came under the Foreign Intelligence Surveillance Act, the sources said. The two sources said the request was issued under a provision of the law known as Section 702, which will expire on Dec. 31, 2017, unless lawmakers act to renew it.

The FISA Court warrant related specifically to Yahoo, but it is possible similar such orders have been issued to other telecom and internet companies, the sources said.

Yet it also reports that both Intelligence Committees are investigating more about this request (which tells you something about Reuters’ potential sources and how much the spooks’ overseers actually know about this).

The intelligence committees of both houses of Congress, which are given oversight of U.S. spy agencies, are now investigating the exact nature of the Yahoo order, sources said.

For what it’s worth, at least until 2012, I think NSA and FBI might have been able to request this scan under 702; there are a bunch of court decisions, including one associated with what got reported as an upstream violation in 2012, that we haven’t seen on this point though. But particularly given Reuters’ discussion of a “warrant” — which is more often used with traditional FISA — I suspect NYT is correct on this.

“Hard” and “soft,” and “upstream,” “about,” and “PRISM” are confusing the debate

The source of the confusion seems to stem from two separate sets of vocabulary that are unhelpful in understanding how FISA works.

The first set has to do with “hard” and “soft” selectors, language used in XKeyscore, which basically conducts boolean searches of buffered Internet traffic. Hard selectors are name, email, or phone identifiers associated with a specific person. Soft selectors are characteristics that can range from geographic location to specific code — so a search might ask for users of the encryption tool Mujahadeen Secrets in Syria, for example, which will return a bunch of people whose identities may not be known but whose activities warrant interest. Soft selectors can include searches on what counts as “content,” but they also search on what counts as metadata.
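The difference between the two kinds of search can be shown with a toy example. This is a sketch over invented data, not XKeyscore's query language: the record fields (`email`, `country`, `tool`) and the function names are hypothetical.

```python
# Hypothetical traffic records; the field names are illustrative only.
records = [
    {"email": "a@example.com", "country": "SY", "tool": "toolX"},
    {"email": "b@example.com", "country": "SY", "tool": "browser"},
    {"email": "c@example.com", "country": "FR", "tool": "toolX"},
]


def hard_match(records, selector):
    """Single-selector lookup: records tied to one known identifier."""
    return [r for r in records if r["email"] == selector]


def soft_match(records, **traits):
    """Boolean AND over characteristics: records matching every trait,
    regardless of whose they are."""
    return [r for r in records if all(r.get(k) == v for k, v in traits.items())]


hard = hard_match(records, "c@example.com")
soft = soft_match(records, country="SY", tool="toolX")
```

The hard search starts from a known identity and returns that person's traffic. The soft search starts from a description and returns whoever fits it, which is how it can surface people whose identities were not known beforehand.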

I think the hard/soft distinction is misleading because — as far as I know — FISA has always operated on single selectors, not boolean searches. NSA isn’t asking providers — whether they’re phone companies or Internet providers — to go find people who are in interesting places and use interesting crypto (though AT&T may be an exception to this rule). Rather, they’re asking for communications obtained by searching on specific selectors.

To be sure, for each target, there will be a range of selectors, often a huge number of them. Even for one person, as I have noted, NSA and FBI probably know of at least a hundred selectors. One Google subpoena response I examined, for example, included 15 “hard” identifiers for just one person (and multiply that by any major Internet service a person used). For a targeted organization like “Russian GRU hackers,” the NSA will probably have still more. But, again as far as we know, FISA providers are asked to return data based off known selectors. As I’ll show below, though, they’ve been asked to return data off selectors that would count as both hard and soft under XKeyscore.

The other set of confusing vocabulary comes from public debates about FISA (including PCLOB’s report on Section 702). Some debates have made a distinction between “upstream” and “PRISM.” Upstream is when NSA gives the telecoms a selector to collect information from scans conducted at switches, but it fundamentally refers to how something is collected, not who does it (and it’s possible there are backbone providers we haven’t thought of who also participate). PRISM is when NSA/FBI give Internet providers selectors to return activity on; it’s a description of from whom the information is collected. But even there, a PRISM provider will provide far more than just the email associated with a given selector.

Sometimes “upstream” collection is referred to as “about” collection. That’s misleading. “About” collection — that is, communications that contain a selector in what counts as content areas of the communication — is a subset of upstream collection. But what is really happening is that when the telecoms sniff packets to find a given selector, they need to sniff both the header and content to get all the communications they’re after, which is what PCLOB is saying here.

With regard to the NSA’s acquisition of “about” communications, the Board concludes that the practice is largely an inevitable byproduct of the government’s efforts to comprehensively acquire communications that are sent to or from its targets. Because of the manner in which the NSA conducts upstream collection, and the limits of its current technology, the NSA cannot completely eliminate “about” communications from its collection without also eliminating a significant portion of the “to/from” communications that it seeks. The Board includes a recommendation to better assess “about” collection and a recommendation to ensure that upstream collection as a whole does not unnecessarily collect domestic communications.

One hazard of using “about” to refer to “upstream” collection is it leads people to forget that the NSA needs to use upstream collection to comprehensively collect non-PRISM Internet traffic, even when working just from “hard” selectors like email addresses. Some of this collection (as the PCLOB passage above makes clear) is just looking for any emails involving a target, not emails talking “about” that target. But at least according to PCLOB, because of the way this collection is done, even if NSA is only searching for a hard selector email, it will get “about” traffic.
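To make the “to/from” versus “about” distinction concrete, here is a minimal sketch in Python. The `Message` type and the `classify` function are hypothetical names I am using for illustration, not anything drawn from NSA or telecom systems; the only thing taken from the PCLOB language is the definition itself: a communication is “to/from” when the selector is a party to it, and “about” when the selector appears only in the content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical, simplified view of one intercepted communication."""
    sender: str
    recipients: list[str]
    body: str


def classify(msg: Message, selector: str) -> Optional[str]:
    """Label a message relative to one tasked selector.

    Returns "to/from" if the selector is the sender or a recipient,
    "about" if it only shows up in the content, and None otherwise.
    """
    sel = selector.lower()
    parties = [msg.sender.lower()] + [r.lower() for r in msg.recipients]
    if sel in parties:
        return "to/from"
    if sel in msg.body.lower():
        return "about"
    return None
```

Note that a scanner has to read the body to produce either label reliably, which is the PCLOB point: a filter that looks at content to find every “to/from” message will also see the “about” ones.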

As you can see, however, this language is already going to be insufficient to discuss the Yahoo request, which is effectively an “upstream” search on a PRISM provider’s content (though I’m not clear whether it happens at the packet level or not). We also don’t yet know whether the signature involved counts as content, but the filters Yahoo adapted for the process clearly scan the content.

Public discussions have hidden how 702 includes non-email selectors

But the bigger problem with this discussion is that people are confused about what FISA permits the government to search on.

One huge shortcoming of the PCLOB report — one I pointed out at the time — is that it pretended that Section 702 was not used for cybersecurity. That’s unfortunate because cybersecurity is the area where Section 702 most obviously includes non-email selectors, what would be called “soft” selectors in XKeyscore. When I first confirmed that NSA was using 702 for cybersecurity back when I briefly worked at the Intercept, it was based off the search on a cyber “signature,” not an email. The target was a (state-sanctioned) hacker, but the search was not for the hacker’s email, but for his tools.

Here’s how PCLOB briefly alluded to this activity.

Although we cannot discuss the details in an unclassified public report, the moniker “about” collection describes a number of distinct scenarios, which the government has in the past characterized as different “categories” of “about” collection. These categories are not predetermined limits that confine what the government acquires; rather, they are merely ways of describing the different forms of communications that are neither to nor from a tasked selector but nevertheless are collected because they contain the selector somewhere within them.

The Semiannual reports are one place where the government has officially admitted that it searches on more than just email addresses.

Section 702 authorizes the targeting of non-United States persons reasonably believed to be located outside the United States. This targeting is effectuated by tasking communication facilities (also referred to as “selectors”), including but not limited to telephone numbers and electronic communications accounts, to Section 702 electronic communication service providers. [my emphasis]

As I said, the Snowden documents confirm that NSA has searched on malware signatures. Given the obvious application and the non-denials I have gotten from various quarters, I would bet a great deal of money that NSA has also searched on some signature associated with AQAP’s Inspire magazine, effectively allowing it to track anyone who downloads (or decrypts) the magazine.

In a series of tweets yesterday, Snowden confirmed that the scope is even more broad.

In practical terms, this means anything you can convince FISC to stamp. At NSA, I saw live examples of the following:

The usual suspects (emails, IPs, usernames, etc), but also cryptographic hashes that identify known files (MD5/SHA1), sub-strings from base-64 encoded email attachments (derived from things like embedded corporate logos), and any uncommon artifacts arising from a target’s tooling, for example if their app transmits a UUID (like a registration code or serial).

The possibilities here are basically limitless, and we can’t infer the specific nature of the string without more info.

The point is, “upstream” collection — whether done at a telecom switch or a tech server — can (and will, so long as FISC will authorize it) search on any string that will return the communications of interest, with “communications” extending to include “cyberattacks conducted by disembodied code.”
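The selector types Snowden lists can be sketched in a few lines of Python. This is an illustration under my own assumptions, not a description of any real NSA or provider system: the function names are hypothetical, and the matching is a plain substring test. One detail worth showing is that a byte string embedded in a base64-encoded attachment can appear as three different text fragments, depending on where it falls relative to base64’s 3-byte grouping, so a scanner would need all three.

```python
import base64
import hashlib

# Hypothetical helper names, for illustration only.


def file_hash_selectors(data: bytes) -> dict[str, str]:
    """Hashes that identify a known file regardless of its filename."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def base64_fragments(artifact: bytes) -> set[str]:
    """Base64 substrings the artifact can appear as inside a larger file.

    Base64 encodes 3 bytes into 4 characters, so the text form depends on
    whether the artifact starts 0, 1, or 2 bytes into a group. For each
    alignment, keep only the characters determined by the artifact alone.
    """
    fragments = set()
    for offset, skip in ((0, 0), (1, 2), (2, 3)):
        padded = b"\x00" * offset + artifact
        encoded = base64.b64encode(padded).decode("ascii")
        stable_end = (len(padded) // 3) * 4  # drop the trailing partial group
        fragment = encoded[skip:stable_end]
        if fragment:  # very short artifacts may leave nothing stable
            fragments.add(fragment)
    return fragments


def matches(stream: str, selectors: set[str]) -> bool:
    """True if any selector string occurs in the scanned text."""
    return any(s in stream for s in selectors)


# Usage: a made-up "logo" embedded at an arbitrary position in an attachment.
logo = b"ACME-CORP-LOGO-BYTES"
attachment = b"header" + logo + b"trailer"
encoded_attachment = base64.b64encode(attachment).decode("ascii")
found = matches(encoded_attachment, base64_fragments(logo))
```

The sketch ignores real-world complications such as MIME line wrapping of base64 text, which a scanner would have to undo before matching.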

To understand FISA collection, then, it is best to think in terms of selectors or facilities that will return a desired target. Here’s some language from a Semiannual report that explains the distinction between target and facility (and why the classified numbers in the report are undoubtedly much larger than the unclassified 92,000 “target” number we’re given to explain the scope of FISA collection).

The provided number of facilities on average subject to acquisition during the reporting period remains classified and is different from the unclassified estimated number of targets affected by Section 702 released on June 26, 2014, by ODNI in its 2013 Transparency Report: Statistical Transparency Report Regarding Use of National Security Authorities (hereafter the 2013 Transparency Report). The classified number provided in the table above estimates the number of facilities subject to Section 702 acquisition, whereas the unclassified number provided in the 2013 Transparency Report estimates the number of targets affected by Section 702 (89,138). As noted in the 2013 Transparency Report, the “number of 702 ‘targets’ reflects an estimate of the number of known users of particular facilities (sometimes referred to as selectors) subject to intelligence collection under those Certifications.” Furthermore, the classified number of facilities in the table above accounts for the number of facilities subject to Section 702 acquisition during the current six month reporting period (e.g., June 1, 2013 – November 30, 2013), whereas the 2013 Transparency Report estimates the number of targets affected by Section 702 during the calendar year 2013.

As explained above, for any given target, there may be a slew of selectors or facilities that NSA can collect on (though they probably only collect on a limited selection of all the selectors they know; they use the other selectors to make sure they can find all the online activity of someone). The government tracks this internally by counting how many average selectors or facilities are targeted in a given day. These numbers will get more interesting, by the way, once the numbers incorporate USA Freedom Act compliance, which (in my opinion) significantly serves to require providers to provide all known selectors, that is, to even further expand the universe of known selectors.

A history of the word “facility”

But to understand the background to the Yahoo thing, it is absolutely necessary to understand how the word “facility” has evolved within FISC (and we only have access to some of this). As far as we know, the meaning of the word started to change in 2004 when Colleen Kollar-Kotelly approved the installation of “Pen Registers” (really, packet sniffers) at switches to accomplish with the Internet dragnet what Stellar Wind had been doing (that is, the collection of Internet metadata in bulk), based on the logic that al Qaeda was using those facilities to communicate. Her ruling changed the definition of facility from meaning an individual user (a phone number or email address) to many users including the target. When Kollar-Kotelly first approved it, she required the government to tell her which specific switches they were going to target, that is, which switches were likely to carry traffic from target countries like Yemen and Afghanistan. But when John Bates reauthorized the Internet dragnet in 2010, he let the government decide on a rolling basis which facilities it would collect metadata from.

Thus, starting in 2004 and expanding in 2010, “facilities” (the things targeted under FISA) were no longer required to tie to an individual user or even to a location exclusively used by targeted users.

When Kollar-Kotelly authorized the Internet dragnet, she distinguished what she was approving, which did not require probable cause, from content surveillance, where probable cause was required. That is, she tried to imagine that the differing standards of surveillance would prevent her order from being expanded to the collection of content. But in 2007, when FISC was looking for a way to authorize Stellar Wind collection — which was the collection on accounts identified through metadata analysis — Roger Vinson, piggybacking Kollar-Kotelly’s decision on top of the Roving Wiretap provision, did just that. That’s where “upstream” content collection got approved. From this point forward, the probable cause tied to a wiretap target was freed from a known identity, and instead could be tied to probable cause that the facility itself was used by a target.

There are several steps between how we got from there to the Yahoo order that we don’t have full visibility on (which is why PCLOB should have insisted on having that discussion publicly). There’s nothing in the public record that shows John Bates knew NSA was searching on non-email or Internet messaging strings by the time he wrote his 2011 opinion deeming any collection of a communication with a given selector in it to be intentional collection. But he — or FISC institutionally — would have learned that fact within the next year, when NSA and FBI tried to obtain a cyber certificate. (That may be what the 2012 upstream violation pertained to; see this post and this post for some of what Congress may have learned in 2012.) Nor is there anything in the 2012 Congressional debate that shows Congress was told about that fact.

One thing is clear from NSA’s internal cyber certificate discussions: by 2011, NSA was already relying on this broader sense of “facility” to refer to a signature of any kind that could be associated with a targeted user.

The point, however, is that sometime in the wake of the 2011 John Bates opinion on upstream, FISC must have learned more about how NSA was really using the term. It’s not clear how much of Congress has been told.

The leap from that — scanning on telephone switches for a given target’s known “facility” — to the Yahoo scan is not that far. In his 2010 opinion reauthorizing the Internet dragnet, Bates watered down the distinction between content and metadata by stripping protection for content-as-metadata that is also used for routing purposes. There may be some legal language authorizing the progression from packets to actual emails (though there’s nothing that is unredacted in any Bates opinion that leads me to believe he fully understood the distinction). In any case, FISCR has already been blowing up the distinction between content and metadata, so it’s not clear that the Yahoo request was that far out of the norm for what FISC has approved.

Which is not to say that the Yahoo scan would withstand scrutiny in a real court unaware of the FISC precedents (including the ones we haven’t yet seen). It’s just to say we started down this path 12 years ago, and the concept of “facilities” has evolved such that a search for a non-email signature counts as acceptable to the FISC.

If a facility is not a user, then how do you determine foreignness?

[Update: I realize this discussion is, given the increasing certainty that the Yahoo scan was done under an individual FISA order, irrelevant for the Yahoo case, because FBI has been cleared to collect on signatures in the US. But the issue is still an important one when discussing “facilities” that have been divorced from a geographically located user.]

There’s one final thing we don’t have visibility on.

When Kollar-Kotelly started down this path, she focused on facilities that were foreign-facing. That is, there was a high likelihood messages transiting those switches were one-side foreign, and therefore targetable, certainly for a PRTT. But as I noted, that foreign-facing distinction got badly watered down in 2010. And Yahoo’s entire universe of emails would not be particularly foreign focused (though a lot of foreigners use Yahoo).

The question is, if NSA or FBI is targeting a facility that is not tied to a given user, but is instead tied to an organization that is located overseas, how does the government determine foreignness on a signature? NSA’s General Counsel would permit analysts to collect on but not target metadata of, say, bots in the US based on the assumption that the ultimate source of the bot was overseas. If the signature that FBI searches on derives from overseas (as in the case where Inspire magazine is produced overseas), does that by itself deem a communication involving that signature to be “located” overseas, and therefore targetable?

I suspect that may be why NYT’s sources emphasized that the target of the Yahoo search was a state-sponsored terrorist organization, rather than just a terrorist organization, because by definition that state would be overseas. But I also suspect that a lot of the recent troubles at NSA pertaining to “roving” selectors stem from the ambiguity that arises when you start targeting selectors that are not by definition geographically bounded.

The way the government targets facilities is constitutionally problematic in any case. But this question of foreignness seems to present both statutory and constitutional problems.

Marcy has been blogging full time since 2007. She’s known for her live-blogging of the Scooter Libby trial, her discovery of the number of times Khalid Sheikh Mohammed was waterboarded, and generally for her weedy analysis of document dumps.

Marcy Wheeler is an independent journalist writing about national security and civil liberties. She writes as emptywheel at her eponymous blog, publishes at outlets including the Guardian, Salon, and the Progressive, and appears frequently on television and radio. She is the author of Anatomy of Deceit, a primer on the CIA leak investigation, and liveblogged the Scooter Libby trial.

Marcy has a PhD from the University of Michigan, where she researched the “feuilleton,” a short conversational newspaper form that has proven important in times of heightened censorship. Before and after her time in academics, Marcy provided documentation consulting for corporations in the auto, tech, and energy industries. She lives with her spouse and dog in Grand Rapids, MI.