Sunday, March 28, 2004

Another system failure

Apparently, New York City's 911 system was down for a few hours [NY Times] on Friday night. The cause? An engineer changing phone numbers for a bank accidentally changed the phone number for the dispatch center instead (the numbers were similar). Oops.

A similar thing happened to Pittsburgh a few years ago. The cause there? They added an overlay area code (which, ironically, has never been used) to the 412/724 area codes. FCC rules state that, when this happens, the affected area has to switch to ten-digit dialing (where you always have to dial the area code, even if you are calling within the same area code). Nobody reprogrammed the 911 system to use ten-digit numbers; when they shut off seven-digit dialing, it could no longer relay the calls.
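To make the Pittsburgh failure concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the function names, the `ten_digit_only` flag, and the phone number are assumptions, not details of how the real 911 system or phone switches work. It only shows the shape of the problem: the relay itself never changes, but it stops working once the system it depends on changes its rules.

```python
def switch_accepts(number: str, ten_digit_only: bool) -> bool:
    """Hypothetical phone switch: after the cutover, only ten-digit numbers connect."""
    digit_count = sum(ch.isdigit() for ch in number)
    if ten_digit_only:
        return digit_count == 10
    return digit_count in (7, 10)


def relay_911_call(dispatch_number: str, ten_digit_only: bool) -> str:
    """Hypothetical 911 relay that forwards a call to a stored dispatch number."""
    if switch_accepts(dispatch_number, ten_digit_only):
        return "connected"
    return "failed"


# Made-up seven-digit number stored in the relay before the overlay.
STORED_NUMBER = "555-0100"

before = relay_911_call(STORED_NUMBER, ten_digit_only=False)
after = relay_911_call(STORED_NUMBER, ten_digit_only=True)
fixed = relay_911_call("412-" + STORED_NUMBER, ten_digit_only=True)

print(before, after, fixed)
```

The relay code is identical in all three calls. Only the switch's rule and the stored number differ, which is why a change made entirely outside the 911 system was enough to break it.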

Both failures were caused by changes to systems outside the scope of the basic 911 system. This illustrates why it's hard to make good systems (especially software systems): the interdependencies cause failures to propagate rapidly.
