Air Travel, Disrupted: Welcome to the New Normal

[graphic: Live radar from 15-AUG-2015, via @FlightRadar24]

Air travelers along the U.S. east coast experienced flight cancellations and delays this past Saturday, due to initially unspecified “technical issues” attributed to the air traffic control system.

Beginning sometime in the late morning, hundreds of flights were affected by the problem. The FAA’s service was restored around 4:00 p.m. EDT, though it would take hours longer for the airlines to reschedule flights and flyers.

Although 492 flights were delayed and 476 flights were canceled, the FAA’s Twitter account did not mention the outage or mass flight disruptions until 4:06 p.m., when it said service had been restored.

In a tweet issued long after the outage began, the Federal Aviation Administration said, “The FAA is continuing its root cause analysis to determine what caused the problem and is working closely with the airlines to minimize impacts to travelers.”

The FAA’s Safety Briefing Twitter account made no mention at all of the outage, though it has advised of GPS system testing at various locations across the country.

News outlets gave conflicting accounts: first airports were blamed, then the FAA, and the public knew nothing at all except that they were stuck for an indeterminate period.

Get used to this. There’s no sign the FAA will change its communications methodology after several air travel disruptions this year alone “due to technical issues” or whatever catchy nondescript phrase airlines, airports, or the government choose to use.

Is this acceptable? Hell no. Just read the last version of WaPo’s article about the outage; the lack of communication causes as much difficulty as the loss of service. How can travelers make alternative plans when they hear nothing at all about the underlying problem? They’re stuck wherever they are, held hostage by crappy practices if not policies.

It doesn’t help that the media struggles to cover what appears to be a technology problem. The Washington Post went back and forth as to the underlying cause. The final version of its article about this disruption is scrubbed of any mention of the FAA’s En Route Automation Modernization (ERAM) system, though earlier versions named an upgrade to, or a component of, that system as the suspect.

The most recent statement, issued by the FAA on Monday evening, blames an upgrade to ERAM for the outage. It’s troubling that an upgrade applied to a key regional air traffic control facility wasn’t identified more promptly as the problem. Software testing and change management processes are also in question: why was this problem found in a production environment and not a test environment? Was the upgrade implemented during the slowest traffic period? And why did it take so long to cut over to local, lower-level air traffic control when they knew there had been a recent upgrade? Didn’t they anticipate a possible fail-over?

Equally troubling is how little the government communicates long after an air travel disruption, and how little follow-up coverage the media provides, in spite of what appears to be an uptick in such disruptions.

What happened after United Airlines’ July 8th outage? Was there an investigation? If so, what were the findings?

Was the outage, characterized in some reports as automation-related, like this past weekend’s outage? It, too, was characterized by at least one news outlet as an automation issue.

Whatever the cause, how will outages like that on July 8th be prevented in the future?

And what happened after United Airlines’ June 2nd outage? After the hour-long grounding of unspecified origin, was there an investigation and corrective action?

Were these two outages manifestations of the same problem — upgrades to existing software — or were they hardware problems?

Without follow-through by the FAA and follow-up by the media, the public can’t be certain that network equipment failures caused by hackers aren’t a wider problem. Nor can it know whether airlines, airports, and the government have done an adequate job of shielding air travel systems from solar storm radiation.

[graphic: NOAA Space Weather Prediction Center]

Oh, yeah, that… there was a G2-3 level solar storm on Saturday, too, combined with the effects of an earlier coronal mass ejection, which affected electronics including ham radio. This is at least the second time in a year that an air travel-related outage occurred at about the same time as a solar event. The most obvious was New Zealand’s June 23 loss of aviation radar during one of the largest solar storms of this solar cycle. Was it a coincidence that the radar went down during the storm? We don’t know because we haven’t heard anything specific about the outage’s cause.

It’s little comfort to know that New Zealand’s aviation system isn’t any better at communicating than the FAA.

You’d think the American public would learn more and faster about the loss of air traffic control of flights in and out of Washington, D.C.

Or didn’t any of the piles of tax dollars we spent post-9/11 actually do anything about communicating U.S. air traffic conditions?

13 replies
  1. orionATL says:

    “dear american public,

    we believe in transparency. really, we do.

    look up in the sky. transparency will come. really it will.

    happy travels,

    your faa”

  2. Denis says:

    Solar storm. Thank you, Rayne. Someone had to say it. The MSM took no notice of what was happening. Not only did NZ air services go down on that day, they went down near the middle of the day NZ time when solar impacts are max.

    As for the delay in FAA getting out the word re: Saturday’s f’up. My sources inside the FAA tell me the delay was caused by having to negotiate with Twitter for the 157-character tweet. No . . . really, that’s the real story here. Has anyone figured out how they busted the 140 cap?

    Count ’em:

    The FAA is continuing its root cause analysis to determine what caused the problem and is working closely with the airlines to minimize impacts to travelers.

  3. bloopie2 says:

    We’re bitching about air travel only because it is the largest system of moving physical things that is totally computer reliant. Just wait until the highways fill up with computer-regulated cars, and some jerk plops down some jammers by the side of the road, or DHS decides to temporarily shut off the roads (as it can do with cell phone service) for “security” reasons. Or your house (heat, lights, etc.) goes haywire because your remote control of its electronic systems is disrupted under any one of a half dozen very plausible scenarios. Or drones start falling out of the sky because of some software bug. It’s tough enough to keep one home or office local network running smoothly all the time — what on earth makes people think that you can connect a whole lot of really complicated moving things, all over the world, with infallible electronic signals? You think it’s not going to get worse as more and more Things move into the cloud? Time to start organizing your life around the fact that you can’t count on stuff like that working.

  4. lefty665 says:

    “the FAA on Monday evening blames an upgrade to ERAM for the outage. ”
    .
    Hey Rayne does that mean they’re actually more or less off vacuum tube gear? Leesburg’s been the site of SNAFUs off and on over the years.
    .
    When you look at all the screw ups, OPM, IRS, Obamacare rollout, ERAM, etc it may be that 7 years in this administration has demonstrated that it is not competent to be trusted with software, or hardware.
    .
    But I’m sure the solution is to elect a replacement who when asked if her server had been “wiped” answered, “You mean like with a cloth?” snark

    • orionATL says:

      a new americanism is born:

      – “her overseas bank records were missing, wiped like with cloth”

      – “his company only made 78k last year” [ :)) wiped like with a cloth]

      – “she had five traffic tickets, that’s all” ……..

    • Rayne says:

      Lefty, don’t pin it on this administration alone, nor on this White House when Congress is up to their ears in this. I worked for one of the biggest IT integrators when it had massive contracts with the US military and with air carriers. EVERYTHING was late, over-budget, and bolloxed to hell and back. This is the way it has been for most of my adult life when IT meets government. The Obama admin is simply the latest to the game. Like this fucked-all-to-hell contract — it didn’t start with Obama.

      What really bothers me is that the administration should know this — and I mean people who work in the trenches who are employees, too, not just political appointees or electeds. They know better than to try and hide the problem by now by ignoring the public, now that the public has become far more technologically savvy.

      As for HRC’s server comment: do make sure you refresh your memory about the tens of thousands of WHITE HOUSE emails that either went missing, or were run through RNC servers (and possibly others) during Bush’s admin. Let’s also not forget the hack of Congressional Dems’ emails and work server during the Bush years. Let’s not forget the problem with the 2004 vote in Ohio. HRC’S private server is a puny piker compared to the scale of bullshit the Bush administration got away with.

      • Joanne says:

        Thank you. I’m no fan of HRC but why no one has brought up the email/computer shenanigans of the Bushies is incomprehensible.

  5. arbusto says:

    Lockheed-Martin, the prime contractor on the software, said it was tested, though not well enough it seems. The new/upgraded software is part of FAA’s move to NextGen air traffic control to include mandatory Automatic Dependent Surveillance-Broadcast Out (ADS-B Out) which broadcasts an aircraft’s ID, current position, altitude, and velocity from on-board equipment. This will augment/replace/expand ATC coverage, though at an added expense to aircraft owners.

    This outage is auspicious for industry and some politicos in Congress who want to privatize Air Traffic Control. Lockheed-Martin anyone?

    • Rayne says:

      That map, that hole, the one with Washington DC right smack in center? I don’t buy it was an upgrade problem. Whether it is or not, the integrator can’t possibly believe they’d be trusted with a contract for the whole works.

  6. bloopie2 says:

    And if we’re talking about “Is the Internet a Reliable Thing?”, there’s the stunning Ashley Madison data dump to consider as well. As Chicago once sang, it’s “Only The Beginning”.

    • Rayne says:

      What I want to know: Did an intelligence agency hack and leak Ashley Madison content because some of the government employees (or persons using government-issued email accounts) posed a security risk to the US?

      That underserved market would find itself a free market response once DC Madam was outed, right? She had clients who were potential security risks back when, even if the media elected not to report on them. In steps Ashley Madison, the Uber of sex, serving the DC area handily, creating a new generation of security risks.
