Can Congress — or Robert Mueller — Order Facebook to Direct Its Machine Learning?

The other day I pointed out that two articles (WSJ, CNN) — both of which infer that Robert Mueller obtained a probable cause search warrant on Facebook based off an interpretation that under Facebook’s privacy policy a warrant would be required — actually ignored two other possibilities. Without something stronger than inference, then, these articles do not prove Mueller got a search warrant (particularly given that both miss the logical step of proving that the things Facebook shared with Mueller count as content and not business records).

In response to that and to this column arguing that Facebook should provide more information, some of the smartest surveillance lawyers in the country discussed what kind of legal process would be required, but were unable to come to any conclusions.

Last night, WaPo published a story that made it clear Congress wanted far more than WSJ and CNN had suggested (which largely fell under the category of business records and the ads posted to targets, the latter of which Congress had been able to see but not keep). What Congress is really after is details about the machine learning Facebook used to identify the malicious activity identified in April and the ads described in its most recent report, to test whether Facebook’s study was thorough enough.

A 13-page “white paper” that Facebook published in April drew from this fuller internal report but left out critical details about how the Russian operation worked and how Facebook discovered it, according to people briefed on its contents.

Investigators believe the company has not fully examined all potential ways that Russians could have manipulated Facebook’s sprawling social media platform.

[snip]

Congressional investigators are questioning whether the Facebook review that yielded those findings was sufficiently thorough.

They said some of the ad purchases that Facebook has unearthed so far had obvious Russian fingerprints, including Russian addresses and payments made in rubles, the Russian currency.

Investigators are pushing Facebook to use its powerful data-crunching ability to track relationships among accounts and ad purchases that may not be as obvious, with the goal of potentially detecting subtle patterns of behavior and content shared by several Facebook users or advertisers.

Such connections — if they exist and can be discovered — might make clear the nature and reach of the Russian propaganda campaign and whether there was collusion between foreign and domestic political actors. Investigators also are pushing for fuller answers from Google and Twitter, both of which may have been targets of Russian propaganda efforts during the 2016 campaign, according to several independent researchers and Hill investigators.

“The internal analysis Facebook has done [on Russian ads] has been very helpful, but we need to know if it’s complete,” Schiff said. “I don’t think Facebook fully knows the answer yet.”

[snip]

In the white paper, Facebook noted new techniques the company had adopted to trace propaganda and disinformation.

Facebook said it was using a data-mining technique known as machine learning to detect patterns of suspicious behavior. The company said its systems could detect “repeated posting of the same content” or huge spikes in the volume of content created as signals of attempts to manipulate the platform.

The push to do more — led largely by Adam Schiff and Mark Warner (both of whom have gotten ahead of the evidence at times in their respective studies) — is totally understandable. We need to know how malicious foreign actors manipulate the social media headquartered in Schiff’s home state to sway elections. That’s presumably why Facebook voluntarily conducted the study of ads in response to cajoling from Warner.

But the demands they’re making are also fairly breathtaking. They’re demanding that Facebook use its own intelligence resources to respond to the questions posed by Congress. They’re also demanding that Facebook reveal those resources to the public.

Now, I’d be surprised (pleasantly) if either Schiff or Warner made such detailed demands of the NSA. Hell, Congress can’t even get NSA to count how many Americans are swept up under Section 702, and that takes far less bulk analysis than Facebook appears to have conducted. And Schiff and Warner surely would never demand that NSA reveal the extent of machine learning techniques that it uses on bulk data, even though that, too, has implications for privacy and democracy (America’s and other countries’). And yet they’re asking Facebook to do just that.

And consider how two laws might offer guidelines, but (in my opinion) fall far short of authorizing such a request.

There’s Section 702, which permits the government to oblige providers to provide certain data on foreign intelligence targets. Section 702’s minimization procedures even permit Congress to obtain data collected by the NSA for their oversight purposes.

Certainly, the Russian (and now Macedonian and Belarusian) troll farms Congress wants investigated fall squarely under the definition of permissible targets under the Foreign Government certificate. But there’s no public record of NSA making a request as breathtaking as this one, that Facebook (or any other provider) use its own intelligence resources to answer questions the government wants answered. While the NSA does draw from far more data than most people understand (including, probably, providers’ own algorithms about individually targeted accounts), the most sweeping request we know of involves Yahoo scanning all its email servers for a signature.

Then there’s CISA, which permits providers to voluntarily share cyber threat indicators with the federal government, using these definitions:

(A) IN GENERAL.—Except as provided in subparagraph (B), the term “cybersecurity threat” means an action, not protected by the First Amendment to the Constitution of the United States, on or through an information system that may result in an unauthorized effort to adversely impact the security, availability, confidentiality, or integrity of an information system or information that is stored on, processed by, or transiting an information system.

(B) EXCLUSION.—The term “cybersecurity threat” does not include any action that solely involves a violation of a consumer term of service or a consumer licensing agreement.

(6) CYBER THREAT INDICATOR.—The term “cyber threat indicator” means information that is necessary to describe or identify—

(A) malicious reconnaissance, including anomalous patterns of communications that appear to be transmitted for the purpose of gathering technical information related to a cybersecurity threat or security vulnerability;

(B) a method of defeating a security control or exploitation of a security vulnerability;

(C) a security vulnerability, including anomalous activity that appears to indicate the existence of a security vulnerability;

(D) a method of causing a user with legitimate access to an information system or information that is stored on, processed by, or transiting an information system to unwittingly enable the defeat of a security control or exploitation of a security vulnerability;

(E) malicious cyber command and control;

(F) the actual or potential harm caused by an incident, including a description of the information exfiltrated as a result of a particular cybersecurity threat;

(G) any other attribute of a cybersecurity threat, if disclosure of such attribute is not otherwise prohibited by law; or

(H) any combination thereof.

Since January, discussions of Russian tampering have certainly collapsed Russia’s efforts on social media with their various hacks. Certainly, Russian abuse of social media has been treated as exploiting a vulnerability. But none of this language defining a cyber threat indicator envisions the malicious use of legitimate ad systems.

Plus, CISA is entirely voluntary. While Facebook thus far has seemed willing to be cajoled into doing these studies, that willingness might change quickly if they had to expose their sources and methods, just as NSA clams up every time you ask about their sources and methods.

Moreover, unlike the sharing provisions in 702 minimization procedures, I’m aware of no language in CISA that permits sharing of this information with Congress.

Mind you, part of the problem may be that we’ve got global companies that have sources and methods that are as sophisticated as those of most nation-states. And, inadequate as they are, Facebook is hypothetically subject to more controls than nation-state intelligence agencies because of Europe’s data privacy laws.

All that said, let’s be aware of what Schiff and Warner are asking for, however justified it may be from an investigative standpoint. They’re asking for things from Facebook that they, NSA’s overseers, have been unable to ask from NSA.

If we’re going to demand transparency on sources and methods, perhaps we should demand it all around?

Marcy Wheeler is an independent journalist writing about national security and civil liberties. She writes as emptywheel at her eponymous blog, publishes at outlets including Vice, Motherboard, the Nation, the Atlantic, Al Jazeera, and appears frequently on television and radio. She is the author of Anatomy of Deceit, a primer on the CIA leak investigation, and liveblogged the Scooter Libby trial.

Marcy has a PhD from the University of Michigan, where she researched the “feuilleton,” a short conversational newspaper form that has proven important in times of heightened censorship. Before and after her time in academics, Marcy provided documentation consulting for corporations in the auto, tech, and energy industries. She lives with her spouse in Grand Rapids, MI.

7 replies
  1. Rugger9 says:

    It seems to me that the targeting described here would be very much like a warrant if Mueller does it, which would presumably require judicial approval (but, IANAL).  If the indictments to come can be painted as a witch hunt because of this targeting, why would Mueller do it by Facebook?  He’s careful enough to get a warrant.  Congresscritters aren’t that careful, and may want to muddy the waters (Hi, Devin!) for other reasons.  Cohen’s testimony today was derailed by an unauthorized public statement (apparently he promised not to do one before the hearing in the Senate today) which to me points out the plan to publicly throw sand in the gears of justice.  For the short term, Cohen will get subpoenas from the Senate in open session and possibly Mueller.

    The only reason Facebook would agree to this idea IMHO is to dodge later charges of being uncooperative.

    Lawfare has an interesting rundown on the news from the NYT and CNN:

    https://www.lawfareblog.com/latest-scoops-cnn-and-new-york-times-quick-and-dirty-analysis

  2. Bay State Librul says:

    In the hearts of many Americans, we want Mueller to bring charges, before Don the Con blows up the world. I say this with all sincerity, fucking Trump is nutso.
    With all deliberate speed, pleeze.

  3. pseudonymous in nc says:

    It’s long been obvious that Facebook’s algorithmic operations are not transparent even to itself. By transparent I mean “logged at the time” or “easily replicable from contemporaneous datasets.” So it’s not a surprise that they’d need to come up with custom forensics tools that will inevitably operate in the fuzzy Bayesian space where altering certain inputs and thresholds will produce very different outputs.

    The things that should be presumed loggable: the datasets uploaded for Custom Audiences; promoted posts and their microtargeting criteria; clickthrough metrics and other analytics associated with the marketing/ad side. Basically, anything on the platform with a dollar value and generated by explicit user choice. That ought to be the foundation of Schiff and Warner’s requests.

    “part of the problem may be that we’ve got global companies that have sources and methods that are as sophisticated as those of most nation-states.”

    And equally opaque, even internally. Just with slightly different privileges.

  4. SpaceLifeForm says:

    Two sides of coin.

    “They’re asking for things from Facebook that they, NSA’s overseers, have been unable to ask from NSA.”

    Maybe because they know that NSA is stonewalling.

    “I’m aware of no language in CISA that permits sharing of this information with Congress.”

    I’m aware of no language in CISA that prohibits sharing of this information with Congress.

    I believe the 1st Amendment is applicable here.

    If you see something, say something.

    Facebook should be fully within their rights to report not just to Congress but to the public.

    Remember, corporations are people too.

  5. greengiant says:

    The 2015 activities of the IRA seem to be a distraction, and Congress in December allocated 70 million or so to do the same, manipulate social media outside the US. Above the surface are the same anonymous/Assange, Trump, Brexit, Putin and Le Pen supporters aka Trump operatives. Look for Russian activity in areas that cannot be sourced in the US, because only the young and naive in the US want to risk a prison term. The URL shorteners used in the phishes, any help/spin given CWA and actors such as @LauriLove, and hacking passwords for the election.
    The question is how did Trump.org turn Wi, Mi, Pa and what about NH. So far we have a handful of testimonials of how people were turned away from Clinton, disconnects between election results and exit polls and ballots that can not be recounted. Consider the Theresa Wong interview http://www.bbc.com/news/av/magazine-40852227/the-digital-guru-who-helped-donald-trump-to-the-presidency and other media coverage of the San Antonio Alamo project. Where did Parscale spend that 95 or whatever million? He says twitter gave little edge. Who were the facebook, google and twitter people in San Antonio and what did they do? So far this sounds like Bernie Madoff secret sauce, and if it is too good to be true then well. Note the persistent misdirection on Mi and Pa polls and trends. The noise level on New Hampshire sounds as if the campaign effort did not anticipate same day registration and was actually counting for whatever reason a win. Lastly pay attention to search engine manipulation. You know how hard it is to get a link to an image of a Nazi flag in Charlottesville?
    You have not experienced media warfare until you see an anti Monsanto link followed by 4 pro Monsanto advert links.
