Posts

Tuesday Morning: Garbage in, Garbage out [UPDATE]

Why’d I pick this music video, besides the fact I like the tune? Oh, no reason at all other than it’s trash day again.

Speaking of trash…

Facebook furor just frothy foam?
I didn’t add yesterday’s Gizmodo piece on Facebook’s news curation, or the earlier May 3 piece, because I thought the work was sketchy. Why?

  • The entire curation system appears to be contractors — Where is a Facebook employee in this process?

    “…News curators aren’t Facebook employees—they’re contractors. One former team member said they received benefits including limited medical insurance, paid time off after 6 months and transit reimbursement, but were otherwise excluded from the culture and perks of working at Facebook. […] When the curators, hired by companies like BCForward and Pro Unlimited (which are then subcontracted through Accenture to provide workers for Facebook), arrive at work each day, they read through a list of trending topics ranked by Facebook’s algorithm from most popular (or most engaged) to least. The curators then determine the news story the terms are related to.

    The news curation team writes headlines for each of the topics, along with a three-sentence summary of the news story it’s pegged to, and choose an image or Facebook video to attach to the topic. The news curator also chooses the “most substantive post” to summarize the topic, usually from a news website. […] News curators also have the power to “deactivate” (or blacklist) a trending topic—a power that those we spoke to exercised on a daily basis. …” (emphasis mine)

    I see a Facebook-generated algorithm, but no direct employees in the process — only curator-contractors.

  • Sources may have a beef with Facebook — This doesn’t sound like a happy work environment, does it?

    “…Over time, the work became increasingly demanding, and Facebook’s trending news team started to look more and more like the worst stereotypes of a digital media content farm.

    […]

    Burnout was rampant. ‘Most of the original team isn’t there anymore,’ said another former news curator. ‘It was a stop-gap for them. Most of the people were straight out of [journalism school]. At least one of them was fired. Most of them quit or were hired by other news outlets.’ …” (emphasis mine)

    It’s not as if unhappy contractors won’t have newsworthy tips, but what about unhappy Facebook employees? Where are they in either of Gizmodo’s pieces?

  • Details in the reporting reveal bias in the complainant(s) — So far I see one reference to a conservative curator, not multiple conservative curators.

    “Facebook workers routinely suppressed news stories of interest to conservative readers from the social network’s influential “trending” news section, according to a former journalist who worked on the project.

    […]

    Other former curators interviewed by Gizmodo denied consciously suppressing conservative news, and we were unable to determine if left-wing news topics or sources were similarly suppressed. The conservative curator described the omissions as a function of his colleagues’ judgements; there is no evidence that Facebook management mandated or was even aware of any political bias at work. …”

    Note the use of “a” in front of “former journalist” and “the” in front of “conservative curator.” (Note also Gizmodo apparently needs a spell check app.)

  • No named sources confirming the validity of the complaints or other facts in Gizmodo’s reporting — Again, where are Facebook employees? What about feedback from any of the companies supplying contractors; did they not hear complaints from contractors they placed? There aren’t any apparent attempts to contact them to find out, let alone anonymous confirmation from these contract companies. There are updates to the piece yesterday afternoon and this morning, including feedback from Vice President of Search at Facebook, Tom Stocky, which had been posted at Facebook. Something about the lack of direct or detailed feedback to Gizmodo seems off.
  • Though named in the first of two articles, Facebook’s managing editor Benjamin Wagner does not appear to have been asked for comment. The May 3 piece quotes an unnamed Facebook spokesperson:

    When asked about the trending news team and its future, a Facebook spokesperson said, “We don’t comment on rumor or speculation. As with all contractors, the trending review team contractors are fairly compensated and receive appropriate benefits.”

I’m disappointed that other news outlets picked up Gizmodo’s work without doing much analysis or followup. Reuters, for example, even parrots the same phrasing Gizmodo used, referring to the news curators as “Facebook workers” and not contract employees or contractors. Because of this ridiculous unquestioning regurgitation by outlets generally better than this, I felt compelled to write about my concerns.

And then there’s Gizmodo itself, which made a point of tweeting its report was trending on Facebook. Does Gizmodo have a beef with Facebook, too? Has it been curated out of Facebook’s news feed? Are these two pieces really about Facebook’s laundering of Gizmodo?

I don’t know; I can’t tell you because I don’t use Facebook. Not going to start now because of Gizmodo’s sketchy reporting on Facebook, of all things.

Miscellany
Just some odd bits read because today is as themeless as yesterday — lots of garbage out there.

Skepticism: I haz it
As I read coverage about news reporting and social media leading up to the general election, I also keep in the back of my mind this Bloomberg report, How to Hack an Election:

As for Sepúlveda, his insight was to understand that voters trusted what they thought were spontaneous expressions of real people on social media more than they did experts on television and in newspapers. […] On the question of whether the U.S. presidential campaign is being tampered with, he is unequivocal. “I’m 100 percent sure it is,” he says.

Be more skeptical. See you tomorrow morning!

UPDATE — 1:30 P.M. EDT —

@CNBCnow
JUST IN: Senate Commerce Commtitte chair sends letter to Facebook’s Mark Zuckerberg seeking answers on alleged manipulation of trending news

ARE YOU FUCKING KIDDING ME WITH THIS? THE SENATE GOING TO WASTE TAX DOLLARS ON THIS WHEN EVERY. SINGLE. NEWS. OUTLET. USES EDITORIAL JUDGMENT TO DECIDE WHAT TO COVER AS NEWS?

Cripes, Gizmodo’s poorly sourced hit piece says,

“…In other words, Facebook’s news section operates like a traditional newsroom, reflecting the biases of its workers and the institutional imperatives of the corporation. …”

Yet the Senate is going to pursue this bullshit story after Gizmodo relied on ONE conservative curator-contractor — and their story actually says an algorithm is used?

Jeebus. Yet the Senate will ignore Sheldon Adelson’s acquisition of the biggest newspaper in Las Vegas in a possible attempt to denigrate local judges?

I can’t with this.

UPDATE — 3:35 P.M. EDT —
The Guardian reports the senator wasting our tax dollars questioning a First Amendment exercise by Facebook is John Thune. Hey! Guess who’s running for re-election as South Dakota’s senior senator? Why it’s John Thune! Nothing like using your political office as a free press-generating tool to augment your campaign. I hope Facebook’s algorithm suppresses this manufactured non-news crap.

Would NSA’s New Big Social Media Data Approach Have Noticed the Arab Spring?

Screen Shot 2014-01-27 at 10.02.29 PM

Sometime in 2011, I was on a panel with Democracy Now’s Sharif Kouddous, whose tweeting from Tahrir Square played an important role in keeping the world informed after Hosni Mubarak shut down the Internet. I mentioned that DiFi had been bitching for months because the CIA and other intelligence agencies had missed the Arab Spring.

Who had followed Sharif on Twitter, I asked? (Probably half the rather large room raised their hands.) Because if you had, you knew more about the Arab Spring than the CIA did.

Which is the underlying context to the NBC/Greenwald report that GCHQ collects data from Facebook and YouTube to try to monitor the mood of the world.

The demonstration showed that by using tools including a version of commercially available analytic software called Splunk, GCHQ could extract information from the torrent of electronic data that moves across fiber optic cable and display it graphically on a computer dashboard. The presentation showed that analysts could determine which videos were popular among residents of specific cities, but did not provide information on individual social media users.

The presenters gave an example of their real-time monitoring capability, showing the Americans how they pulled trend information from YouTube, Facebook and blog posts on Feb. 13, 2012, in advance of an anti-government protest in Bahrain the following day.

More than a year prior to the demonstration, in a 2012 annual report, members of Parliament had complained that the U.K.’s intelligence agencies had missed the warning signs of the uprisings that became the Arab Spring of 2011, and had expressed the wish to improve “global” intelligence collection.

During the presentation, according to a note on the documents, the presenters noted for their audience that “Squeaky Dolphin” was not intended for spying on specific people and their internet behavior. The note reads, “Not interested in individuals just broad trends!”

What we’re seeing is how NSA would go about amassing public data to try to learn what the rest of us can read by following Twitter attentively. [see update]

I won’t comment much on the technical ability here (which involves contractors to collect the data), and I’ll only applaud that Facebook has finally been exposed as the perfect surveillance app it is.

But there seem to be several problems with the analysis they’re doing (though MSNBC did not include the script for its PowerPoint). Aside from what seems to be an Orientalism built into the analysis…

Screen Shot 2014-01-27 at 10.29.41 PM

And some half-assed PsychoLOLogy…

Screen Shot 2014-01-27 at 10.32.51 PM

Nowhere does this presentation distinguish between the propaganda social media accounts and the legitimate ones, a known problem of social media analysis going back years (which has, because of all the competing parties involved, been particularly acute in Syria). Perhaps they deal with this, but this analysis seems ripe for spamming by propaganda, particularly if it came from frenemies who know GCHQ and NSA use such analysis.

Now, presumably someone somewhere else in the combined Intelligence Communities of the US and UK would actually sit down and read the social media of a potential hotspot, which is the way a bunch of Tweeps in their pajamas can get a sense of what’s going on without collecting all the social media data for an entire country first. Such an approach uses the hive mind you acquire on social media, with the built in assurances from trusted interlocutors.

After the Arab Spring, the Intelligence Communities of a number of nations got their asses kicked because none of them are well suited to figure out what non-elites are doing. But from the looks of things, they just hired some contractors with bad attitudes to have something to offer up, no matter how dubiously effective.

Update: My statement was inaccurate. They got this data by tapping the cables.

Important: Changes to Section 215 Dragnet Will Not Change Treatment of EO 12333 Metadata

In their Angry Birds stories, both the Guardian and NYT make what I believe is a significant error. They suggest changes in the handling of the Section 215-collected phone metadata will change the way NSA handles EO 12333-collected phone metadata.

Guardian:

Data collected from smartphone apps is subject to the same laws and minimisation procedures as all other NSA activity – procedures which US president Barack Obama suggested may be subject to reform in a speech 10 days ago. But the president focused largely on the NSA’s collection of the metadata from US phone calls and made no mention in his address of the large amounts of data the agency collects from smartphone apps.

NYT:

President Obama announced new restrictions this month to better protect the privacy of ordinary Americans and foreigners from government surveillance, including limits on how the N.S.A. can view “metadata” of Americans’ phone calls — the routing information, time stamps and other data associated with calls. But he did not address the avalanche of information that the intelligence agencies get from leaky apps and other smartphone functions.

Here’s what the President actually said, in part, about phone metadata:

I am therefore ordering a transition that will end the Section 215 bulk metadata program as it currently exists, and establish a mechanism that preserves the capabilities we need without the government holding this bulk meta-data.

That is, Obama was speaking only about NSA’s treatment of Section 215 metadata, not the data — which includes a great amount of US person data — collected under Executive Order 12333.

To be clear, both Guardian and NYT were distinguishing Obama’s promises from the treatment extended to the leaky mobile data app. But they incorrectly suggested that all phone metadata, regardless of how it was collected, receives the same protections.

Section 215 metadata has different and significantly higher protections than EO 12333 phone metadata because of specific minimization procedures imposed by the FISC (arguably, the program doesn’t even meet the minimization procedure requirements mandated by the law). We’ve seen the implications of that, for example, when the NSA was caught watch-listing 3,000 US persons without extending First Amendment protection. It responded not by stopping that tracking, but by simply cutting off the watch-list’s ability to draw on Section 215 data.

Basically, the way NSA treats data collected under FISC-overseen programs (including both Section 215 and FISA Amendments Act) is to throw the data in with data collected under EO 12333, but add query screens tied to the more strict FISC-regulations governing production under it. This post on federated queries explains how it works in practice. As recently as 2012 at least one analyst improperly searched on US person FAA-collected content because she didn’t hit the right filter on her query screen.

[T]he NSA analyst conducted a federated query using a known United States person identifier, but forgot to filter out Section 702-acquired data while conducting the federated query.

That’s it. If the data is accessed via one of the FISC-overseen programs, US persons benefit from the additional subject matter, dissemination, and First Amendment protections of those laws or FISC’s implementation of them (and would benefit from the minor changes Obama has promised to both Section 215 and FAA).

But if NSA collected the data via one of its EO 12333 programs, it does not get those protections. To be clear, it does get some dissemination protection and can only be accessed with a foreign intelligence purpose, but that is much less than what the FISC programs get. Which leaves the NSA a fair amount of leeway to spy on US persons, so long as it hasn’t collected the data to do so under the programs overseen by FISC. And when it collects data under EO 12333, it is a lot easier for the NSA to spy on Americans.
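To make the mechanics concrete, here is a minimal Python sketch of the arrangement described above: one pooled store, with the stricter FISC rules enforced only by a filter the analyst has to apply. Everything in it is hypothetical. The `Record` type, the authority labels, and `federated_query` are names I made up for illustration; NSA’s actual query tools aren’t public, and this is not a description of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    identifier: str  # selector the record is filed under (hypothetical)
    authority: str   # collection authority label (hypothetical values)


def federated_query(store, identifier, exclude=frozenset()):
    """Search one pooled store for an identifier.

    Records are dropped only if their authority is listed in `exclude`.
    With the default empty set, nothing is filtered out.
    """
    return [
        r for r in store
        if r.identifier == identifier and r.authority not in exclude
    ]


# A pooled store mixing data collected under different authorities.
POOL = [
    Record("selector-A", "EO_12333"),
    Record("selector-A", "FAA_702"),
    Record("selector-B", "SECTION_215"),
]

# Forgetting the filter returns everything filed under the selector,
# including the record labeled FAA_702.
unfiltered = federated_query(POOL, "selector-A")

# Applying the filter leaves only the EO 12333 record.
filtered = federated_query(
    POOL, "selector-A", exclude={"FAA_702", "SECTION_215"}
)
```

The thing to notice is that, in this sketch, the protection lives in the `exclude` argument rather than in the store itself. Leave the argument off and the query simply runs against everything.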

The metadata from leaky mobile apps almost certainly comes from EO 12333 collection, not least given the role of GCHQ and CSEC (Canada’s Five Eyes partner) in the collection. The Facebook and YouTube data GCHQ collects (just reported by Glenn Greenwald working with NBC) surely counts as EO 12333 collection.

NSA’s spokeswoman will say over and over that “everyday” or “ordinary” Americans don’t have to worry about their favorite software being sucked up by NSA. But to the extent that collection happens under EO 12333, they have relatively little protection.

Side by Side: Timeline of NSA’s Communications Collection and Cyber Attacks

In all the reporting and subsequent hubbub about the National Security Agency’s ongoing collection of communications, two things stood out as worthy of additional attention:

— Collection may have been focused on corporate metadata;

— Timing of NSA’s access to communications/software/social media firms occurred alongside major cyber assault events, particularly the release of Stuxnet, Flame, and Duqu.

Let’s compare timelines; keep in mind these are not complete.

Date | NSA/Business | Cyber Attacks
11-SEP-2007 | Access to MSFT servers acquired |
15-NOV-2007 | | Stuxnet 0.5 discovered in wild
XX-DEC-2007 | | File name of Flame’s main component observed
12-MAR-2008 | Access to Yahoo servers acquired |
All 2008 (into 2009) | | Adobe applications suffer from 6+ challenges throughout the year, including attacks on Tibetan Government in Exile via Adobe products.
11-JAN-2009 | | Stuxnet 0.5 “ends” calls home
14-JAN-2009 | Access to Google servers acquired |
Mid-2009 | | Operation Aurora attacks begin; dozens of large corporations confirming they were targets.
03-JUN-2009 | Access to Facebook servers acquired |
22-JUN-2009 | | Date Stuxnet version 1.001 compiled
04-JUL-2009 | | Stuxnet 0.5 terminates infection process
07-DEC-2009 | Access to PalTalk servers acquired |
XX-DEC-2009 | | Operation Aurora attacks continue through Dec 2009
12-JAN-2010 | | Google discloses existence of Operation Aurora, said attacks began in mid-December 2009
13-JAN-2010 | | Iranian physicist killed by motorcycle bomb
XX-FEB-2010 | | Flame operating in wild
10-MAR-2010 | | Date Stuxnet version 1.100 compiled
14-APR-2010 | | Date Stuxnet version 1.101 compiled
15-JUL-2010 | | Langner first heard about Stuxnet
19-SEP-2010 | | DHS, INL, US congressperson informed about threat posed by “Stuxnet-inspired malware”
24-SEP-2010 | Access to YouTube servers acquired |
29-NOV-2010 | | Iranian scientist killed by car bomb
06-FEB-2011 | Access to Skype servers acquired |
07-FEB-2011 | AOL announces agreement to buy HuffingtonPost |
31-MAR-2011 | Access to AOL servers acquired |
01-SEP-2011 | | Duqu worm discovered
XX-MAY-2012 | | Flame identified
08-JUN-2012 | | Date on/about “suicide” command issued to Flame-infected machines
24-JUN-2012 | | Stuxnet versions 1.X terminate infection processes
XX-OCT-2012 | Access to Apple servers acquired (date NA) |

Again, this is not everything that could be added about Stuxnet, Flame, and Duqu, nor is it everything related to the NSA’s communications collection processes. Feel free to share in comments any observations or additional data points that might be of interest.

Please also note the two deaths in 2010; Stuxnet and its sibling applications were not the only efforts made to halt nuclear proliferation in Iran. These two events cast a different light on the surrounding cyber attacks.

Lastly, file this under “dog not barking”:

Why aren’t any large corporations making a substantive case to their customers that they are offended by the NSA’s breach of their private communications through their communications providers?