Marco Rubio Leaks that the Phone Dragnet Has Expanded to “A Large Number of Companies”

Last night, Marco Rubio went on Fox News to try to fear-monger over the phone dragnet again.

He repeated the claim the AP also uncritically parroted: that the government can only get three years of records for the culprits in the San Bernardino attack.

In the case of these individuals that conducted this attack, we cannot see any phone records for the first three years in which — you can only see them up to three years. You’ll not be able to see the full five-year picture.

Again, he’s ignoring that AT&T backbone records covering virtually all of Syed Rizwan Farook’s 28-year life are available, that the 215 phone dragnet could never have covered Tashfeen Malik’s time in Pakistan and Saudi Arabia, and that EO 12333 collection would cover not only Malik’s time before she came to the US but also Farook’s international calls going back well over five years.

So he’s either an idiot or he’s lying on that point.

I’m more interested in what he said before that, because he appears to have leaked a classified detail about the ongoing USA Freedom dragnet: that they’ve been issuing orders to a “large and significant number of companies” under the new dragnet.

There are large and significant number of companies that either said, we are not going to collect records at all, we’re not going to have any records if you come asking for them, or we’re only going to keep them on average of 18 months. When the intelligence community or law enforcement comes knocking and subpoenas those records, in many cases there won’t be any records because some of these companies already said they’re not going to hold these records. And the result is that we will not be able in many cases to put together the full puzzle, the full picture of some of these individuals.

Let me clear: I’m certain this fact, that the IC has been asking for records from “a large number of companies,” is classified. For a guy trying to run for President as an uber-hawk, leaking such details (especially in an appearance where he calls cleared people who leak, like Edward Snowden, “traitors”) ought to be entirely disqualifying.

But that detail is not news to emptywheel readers. As I noted in my analysis of the Intelligence Authorization the House just passed, James Clapper would be required to do a report 30 days after the authorization passes telling Congress which “telecoms” aren’t holding your call records for 18 months.

Section 307: Requires DNI to report if telecoms aren’t hoarding your call records

This adds language doing what some versions of USA Freedom tried to do: requiring the DNI to report on which “electronic communications service providers” aren’t hoarding your call records for at least 18 months. He will have to do a report 30 days after passage listing all that don’t (bizarrely, the bill doesn’t specify what size company this covers, which given the number of ECSPs in this country could be daunting), and also report to Congress within 15 days if any of them stop hoarding your records.

That there would be so many companies included that Clapper would need a list surprised me a bit. When I analyzed the House Report on the bill, I predicted USAF would pull in anything that might be described as a “call.”

We have every reason to believe the CDR function covers all “calls,” whether telephony or Internet, unlike the existing dragnet. Thus, for better and worse, far more people will be exposed to chaining than under the existing dragnet. It will catch more potential terrorists, but also more innocent people. As a result, far more people will be sucked into the NSA’s maw, indefinitely, for exploitation under all its analytical functions. This raises the chances that an innocent person will get targeted as a false positive.
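To make the exposure math concrete: contact chaining is just breadth-first expansion over a call graph, and each hop multiplies the number of people swept in. A minimal sketch in Python (all identifiers and numbers invented for illustration, not drawn from any actual NSA implementation):

```python
from collections import defaultdict

def contact_chain(call_records, seeds, hops=2):
    """Return everyone reachable from the seed identifiers
    within `hops` hops of the call graph."""
    # Build an undirected adjacency list from (caller, callee) pairs.
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)

    exposed = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        frontier = {contact for ident in frontier
                    for contact in graph[ident]} - exposed
        exposed |= frontier
    return exposed

# Toy records: one seed talks to 3 people, each of whom talks to 4 others.
records = [("seed", f"a{i}") for i in range(3)]
records += [(f"a{i}", f"b{i}{j}") for i in range(3) for j in range(4)]

print(len(contact_chain(records, {"seed"})))  # → 16: seed + 3 + 12
```

Even in this toy case a standard two-hop chain exposes 16 people from a single seed; real contact lists run into the hundreds, so the growth is far steeper.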

At the same time, I thought that the report’s usage of “phone company” might limit collection to the providers that had been included (AT&T, Verizon, and Sprint), plus the providers serving whatever cell companies aren’t already using those three companies’ backbones, as well as the big tech companies that, by dint of being handset manufacturers (that is, “phone” companies), could be obligated to turn over messaging records, things like iMessage and Skype metadata.

Nope. According to uber-hawk who believes leakers are traitors Marco Rubio, a “large number” of companies are getting requests.

From that I assume that the IC is sending requests to the entire universe of providers laid out by Verizon Associate General Counsel Michael Woods in his testimony to SSCI in 2014:

[Screenshot: chart of communications providers from Michael Woods’ 2014 SSCI testimony]

Woods describes Skype (the application that carried 34% of international minutes in 2012), as well as applications like iMessage, smaller outlets of particular interest like Signal, and conferencing apps.

So it appears the intelligence committees, because they’re morons who don’t understand technology (and ignored Woods), got themselves in a pickle: they didn’t realize that if you want full coverage of all “phone” communication, you’re going to have to go well beyond even AT&T, Verizon, Sprint, Apple, Microsoft, and Google (all of which have compliance departments and the infrastructure to keep such records). They are going to try to obtain all the call records, from every little provider, whether or not those providers actually have the means to keep such records and comply with such requests. Some (Signal might be among them) simply aren’t going to keep records, which is what Rubio is complaining about.

That’s a daunting task — and I can see why Rubio, if he believes that’s what needs to happen, is flustered by it. But, of course, it has nothing to do with the end of the old gap-filled dragnet. Indeed, that daunting problem arises because the new program aspires to be more comprehensive.

In any case, I’m grateful Rubio has done us the favor of laying out precisely what gaps the IC is currently trying to fill, but hawks like Rubio will likely call him a traitor for doing so.

Marcy has been blogging full time since 2007. She’s known for her live-blogging of the Scooter Libby trial, her discovery of the number of times Khalid Sheikh Mohammed was waterboarded, and generally for her weedy analysis of document dumps.

Marcy Wheeler is an independent journalist writing about national security and civil liberties. She writes as emptywheel at her eponymous blog, publishes at outlets including the Guardian, Salon, and the Progressive, and appears frequently on television and radio. She is the author of Anatomy of Deceit, a primer on the CIA leak investigation, and liveblogged the Scooter Libby trial.

Marcy has a PhD from the University of Michigan, where she researched the “feuilleton,” a short conversational newspaper form that has proven important in times of heightened censorship. Before and after her time in academics, Marcy provided documentation consulting for corporations in the auto, tech, and energy industries. She lives with her spouse and dog in Grand Rapids, MI.

31 replies
  1. Trevanion says:

    This latest bit of stupidity also underscores how the way is paved for IC malevolence in a society fixated on an extreme business model of The Big and winner-take-all. Another outcome of thirty years of flapdoodle passing as anti-trust policy.

  2. Denis says:

    MW: “Let me clear [sic]: I’m certain this fact, that the IC has been asking for records from ‘a large number of companies,’ is classified.”
    .
    This seems to be your punch-line — as in a punch in Rubio’s face, but surely it can’t be accurate.
    .
    Could you source your allegation that the fact that the IC has been seeking records is a classified fact? It’s a very important allegation to make and then just let hang in the air.
    .
    And the reason I say it is that it seems that you, yourself, have expended untold thousands of brilliant, entertaining keystrokes on this blog informing your readers — and complaining about — IC seeking records from corporations.
    .
    Besides, if it really is a “classified” fact, then your mere repetition of what Rubio said would likely be illegally problematic, too.
    .
    So I’m like sitting here all WTF? Please clarify and provide citation of law/reg that renders the fact that IC is seeking records from a large number of companies a classified fact. Durn, now I’ve gone and disclosed the same “classified fact.”
    .
    Knock, knock, knock . . . hold on, someone’s at the door.

    • emptywheel says:

      It couldn’t be illegal for me, because I don’t have clearance. Rubio does. Rubio has some idea what providers are getting data.

      USAF was passed (and some of its boosters believed it) as a program accessing only phone records. That’s the public record. Rubio just made it clear it goes well beyond that.

      • Denis says:

        I believe you’ll find that the rule is that if you receive classified information and pass it on knowing it’s classified, doesn’t matter what your security clearance is, you’re cooked.

        But my point is not that I think you’re cooked; it’s that the general fact that the IC has been asking for records from a large number of companies cannot possibly be classified information, as you allege. Asking for records from any specific company — that, I can see, might be classified info, particularly if specific records and/or the nature of specific records are disclosed. But that’s not what Rubio spilled.

        My guess is that both you and he are safe from prosecution, at least over this particular matter.

        • emptywheel says:

          Sorry. You’re just wrong. It has been highly classified for 2 years that they’re getting records from 3 companies. It was highly classified during the summer that they added a 4th. It is surely classified they’ve added far more.

  3. orionATL says:

    denis writes “So I’m like sitting here all WTF?”

    aren’t you our self-styled lawyer-neuroscientist?


      • orionATL says:

        you get a license to practice, if you can, then we might do business.

        in the meantime see if you can bring the quality of your writing up to the level of your alleged professional accomplishments.

  4. haarmeyer says:

    …because they’re morons who don’t understand technology (and ignored Woods) got themselves in a pickle…

    Could you please explain to me how someone who has written several articles now without apparent understanding of the term “federated query” but with lots of views about it, gets to say others “don’t understand technology”. I thought about all sorts of other ways of saying this, but for my part, I don’t understand “getting through to you”.

    • emptywheel says:

      Haarmeyer,

      It took until this morning for beat journalists covering this stuff to understand that a query from the same interface would pull from various repositories, including 215-obtained data. That’s after 2 years of me trying to get them to pay attention to it. That that’s true even though I’ve dumbed down the language for them is rather telling about what vocabulary can be used.

      We’ve actually had discussions about federated queries here before, though–before your time.

      In any case, there is good reason to believe that in 2007, 2009, and 2011, data was intermingled in ways that were not permitted (I will one day argue that remained true after 2011, but not yet). In other words, while there is a great deal of language about the structured repositories at NSA, there’s also reason to believe the reality was different than what the concept was.

      • haarmeyer says:

        Good. Then you’re aware that the reason for federating is to cut down on, not to expand, the overall database size, and to distribute it. I sometimes get the feeling you think the opposite in what you write. I have also done my share of work with such things, and with some of the other technologies you mention here (metadata, face recognition, etc.). When some of these are used, there is one more reason for federating queries, because they aren’t queried in the same way and can’t be stored efficiently for query in the same database.

        If you want my advice, and that isn’t a given and I understand that, you should pay far more attention to the advanced querying capabilities (e.g. Intro to Context Sensitive Scanning with X-Keyscore Fingerprints) of the federated database, and waste less time trying to figure out precisely how many people’s records are in it. Hoovering data is a very scary meme, but it masks the far scarier meme of what happens to someone who becomes a selector.
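For readers unfamiliar with the term: a federated query means one interface fanning a single selector out to several separately-maintained repositories and merging the tagged results, rather than copying everything into one master database. A minimal, purely illustrative sketch (the repository names and record fields are invented, not taken from any actual system):

```python
def federated_query(selector, repositories):
    """Fan one query out to several independently-stored
    repositories and merge the results, tagging each hit
    with the repository it came from."""
    results = []
    for name, records in repositories.items():
        hits = [r for r in records if selector in r["contacts"]]
        results.extend({**hit, "source": name} for hit in hits)
    return results

# Hypothetical repositories, kept and queried separately.
repos = {
    "s215": [{"id": "A", "contacts": {"X", "Y"}}],
    "eo12333": [{"id": "B", "contacts": {"X"}},
                {"id": "C", "contacts": {"Z"}}],
}

for hit in federated_query("X", repos):
    print(hit["id"], hit["source"])  # A from s215, B from eo12333
```

The point of the pattern is that each repository keeps its own storage and access rules; only the merged answer, not the underlying data, crosses the interface.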

        • orionATL says:

          the word “selector” at the end of your comment doesn’t feel right.

          a “selector” would be one who controls selections. what is scary for one, if he controls, about doing so?

          did you mean one who is selected? “selectee”? as the needle in the haystack? that certainly could become scary.

          • haarmeyer says:

            A selector in their terms (NSA), is one or more algorithmic criteria for a search. Because a person is a set of defining characteristics in the database, by extension a selector is also a name for someone they’re surveilling.

        • jerryy says:

          .
          Both options are equally bad, but having the larger database to troll and chum through is probably the worse.
          .
          The difference is the ease with which someone is being, or is about to be, roasted, along with how many others are going to be dragged down with them. How to query related data sets is a skill that comes through usage, but one needs to be able to tie things to others to get patterns for groups. Having larger numbers of people in the data sets allows for fishing trips that WILL yield results regardless of validity (p < .05 is a marvelous device for doing just that).
          .
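The p < .05 point is easy to demonstrate: run enough tests against pure noise and roughly 5% of them come back “significant” anyway. A minimal simulation (all parameters invented for illustration), comparing two groups of coin-flip attributes with a crude two-proportion z-test:

```python
import random

random.seed(0)

def spurious_hits(n_people=200, n_attributes=400):
    """Count how many of n_attributes pure-noise comparisons
    look 'significant' at p < .05 under a normal-approximation
    two-proportion z-test."""
    hits = 0
    for _ in range(n_attributes):
        # Two groups of random coin flips: no real difference exists.
        a = sum(random.random() < 0.5 for _ in range(n_people))
        b = sum(random.random() < 0.5 for _ in range(n_people))
        p1, p2 = a / n_people, b / n_people
        p = (a + b) / (2 * n_people)
        se = (2 * p * (1 - p) / n_people) ** 0.5
        if se and abs(p1 - p2) / se > 1.96:  # |z| > 1.96 ~ p < .05
            hits += 1
    return hits

print(spurious_hits())
```

Out of 400 attributes that are pure noise, on the order of 20 will clear the p < .05 bar, which is exactly the fishing-trip yield described above.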

          • orionATL says:

            “…  allows for fishing trips that WILL yield results regardless of validity…”

            somewhere i’ve heard that the nsa doesn’t much care for the term “fishing trips”.

            they prefer the term “pitchforking” :))

            (“o. k., folks. get with it. start turning that hay”)

          • haarmeyer says:

            Assuming validity is set statistically for all queries, and assuming you know how the query engines are run, is a big mistake on your part, given that they accept queries ranging all the way from keywords and boolean connectors to arbitrary-length snippets of C++ code.

            Return may or may not be set statistically, and even so, priors assigned to widely varying kinds of information and in what capacity or by what search change a lot of things.

            Put simply, most of the blather out here about false positives is victim to not knowing what the query methods and means are.

            But also victim to any sort of query method we don’t know about are the interpretations of, and implementation of, minimization.

            Put simply, how many people have data in any of the databases doesn’t make a productive line of inquiry compared to knowing what queries look like and how they are interpreted, and what happens to their selectors once any kind of link has been found. All of that information will tell you what sort of risk someone is exposed to if they are a person even briefly focussed on in these databases, so you know what risk it is to be in these databases. You will also know when and if any metric, like the statistical significance you mention, is being used and for what.

            If at that point you decide you know what’s going on in the queries, and want to figure out how many people are exposed to it, then it makes sense. But since those code queries are allowed, and other documents suggest that the queries are vetted before being exposed to the larger database, and vetted again for efficacy, you have no idea whether someone pinged in the course of a query is being used as a possible person of interest and future selector, or is in the data set to train the code, as a known non-interesting person, or neither.

            So you have no idea how to interpret the figures you get on how many people and who those people are who are in these databases. None at all.

            • emptywheel says:

              Adding, it’s not clear they were supposed to be able to train on the full set under 215, which is one of the reasons it was really limited, as compared to EO 12333 data.

              I fully expect that will change with what they get in USAF.

            • jerryy says:

              Your first paragraph illustrates the point.
              .
              You took different phrases (items from different data sets), strung them together, and came up with a paragraph that is loaded with negative connotations yet absolutely meaningless.
              .
              That the data is available to be chained is the problem, vetting is a side issue as is the programming language used by the developer to create the tools used by the inquisitor.
              .

              • haarmeyer says:

                Disagree entirely. You seem to believe you know how the government does its queries and therefore what purpose the information in the database serves. You don’t. Nothing was meaningless about what I said. If you think it was, then show it line by line and counter by counter. Otherwise, you haven’t really supplied a reason I should believe you understood anything I said.

                Shorter version: Saying something is meaningless but failing to show it is, itself, meaningless.

            • orionATL says:

              re #21
              .

              “Put simply, most of the blather out here about false positives is victim to not knowing what the query methods and means are.”

              i hadn’t heard the “blather” about false positives when it comes to the nsa/fbi.

              .
              it does seem, when it comes to the nsa/fbi, that identifying false positives is not their big electronic spying problem. their problem seems to be that they can’t identify a would-be shooter/bomber before that individual gives the media good reason to label him/her a terrorist.

              this failure isn’t even one of a false negative (as far as we can know); it is not a failure to note an individual and then wrongly label him “not a threat”. it is a failure to even notice that individual among those in the data set.

            • orionATL says:

              at #21

              “Put simply, most of the blather out here about false positives is victim to not knowing what the query methods and means are.”

              i just can’t stop wondering what you understand by the phrase “false positives”. is it, perhaps, a part of the jargon of a workplace where you’ve been?

              i ask because “false positive” has a very particular meaning in experimental science and medicine (at least).

              what do you mean to refer to when you use the phrase “false positives” in the context of this weblog?

              this weblog often inveighs against “hoovering up” citizens for whom there is no obvious (to experts) reasonable justification for inclusion in nsa/fbi searchable data bases.

              would “false positives” as you understand the term be a reference to such citizens?

              • haarmeyer says:

                No there’s no discrepancy. People make general statements of what the number of false positives are, people make general statements of what criteria are being used to statistically check meaningful correlations, people make all sorts of statements. Here’s a newsflash. Unless you know what they are doing and what their standards are, and where they are starting, and a whole lot about parameters assigned in problems and how the problems are characterized, you aren’t capable of making statements about fishing trips. They agree with you that they are looking for a needle in a haystack. You don’t know what their tools for shaking haystacks are.

                • Dean says:

                  What does that have to do with false positives? False positives occur because of “methods and means.” Those latter don’t provide a context for interpreting false positives.

                • orionATL says:

                  my question about “false positives” was directed at you with the hope i’d learn more about how individuals were brought into, inducted into, swept into, commandeered into nat sec nsa/fbi data sets.

                  specifically, if the inclusion of a person in a spying data set was without any criterion initially and without any warrant for that person or his class, would that person, if later found to be of no value to an investigation, be an example of what you refer to as a “false positive”, jargon for “an unhelpful or useless inclusion”?

                • orionATL says:

                  re #28

                  “No there’s no discrepancy. People make general statements of what the number of false positives are, ”

                  you are going to have to restate this sentence with an ordinary english substitution for “false positives” if i am to understand it.

                  at this blog i rarely if ever see the phrase “false positives” written, nor do i read references to some number of false positives. i do see references to how many people may be swept into spying data sets (in efforts) to find various classes of malefactors.

                  how might/does the term “false positives” relate to an individual placed into, swept into, commandeered into, a spying (or other) data set?

        • emptywheel says:

          It’s a fair point and probably one of particular interest as they transition from 215 to USAF.

          Under the former, they were only supposed to do straight queries on contact chains. They did more–at least 2 more kinds of queries, one of which was just a check to see if someone’s ID counted as RAS-approved (but which would be a test of correlation, I think).

          That said, as I mentioned, they weren’t abiding by the rules until 2009 at least, and maybe not after.

          I think that only the results of such queries could be subjected to advanced querying–the initial query basically made an identifier eligible for such querying, but once it was eligible, then it was no holds barred.

          Under USAF, they will be doing different kinds of queries (connection chaining rather than contact chaining). Even really smart people who are read in in Congress don’t know what this means but my bet is that, to the extent it is possible, they’ll move closer to an XKS model (in part bc they’ll also no longer have to deal with high volume identifiers). They will strive to do those in real time with their existing querying.

  5. Anon says:

    “So he’s either an idiot or he’s lying on that point.”
    .
    In actuality there is a third option: he just does not care.
    .
    Since no major politician pays a price for making false statements, particularly politicians campaigning for the Republican Party, why should he care? In the end it won’t cost him anything, but it will make his subsequent actions seem “serious”.
