City: Tech glitch was frustrating, but not costly [Virginian - Pilot]
(Virginian - Pilot Via Acquire Media NewsEdge) If you were among those turned away while trying to pay a bill at City Hall early last month because the computers were down, you probably would have been interested to hear the city's explanation of what went wrong.
The head of the Information Technology Department gave a detailed report on the outage, which also crippled the city's phone and email systems, to the City Council at its Aug. 13 televised work session.
Only problem was, the broadcast was disrupted by a second malfunction, this one involving the city's television equipment. If you tuned in, you saw talking heads and PowerPoint images, but heard no sound.
So, for those left wondering, here is the lowdown on the communication meltdown that beset City Hall in the wee hours of Aug. 7.
It began with the best of intentions.
As part of an ongoing effort to "improve stability and reliability" of the city's computer system, IT staff members set out Aug. 6 to update routers that shuttle data around the network, according to a memo to city leaders.
The update began around 10 p.m. Bad stuff started happening about a half hour later, and computer services began shutting down.
By 1:30 a.m., the staff decided to halt the update and roll back to the previous configuration until the trouble could be sorted out.
Two core switches failed during the rollback, shutting down the city's Internet connection and access to its primary data center.
Normally, the memo explained, one switch acts as a backup to the other. "Having them both fail at once like this is very rare."
As a result, several key operations were knocked offline: payment processing by the treasurer and commissioner of the revenue; city email and some phone service; a system used to handle requests for city services; and online access to real estate assessment data and permit applications.
The emergency 911 call system and the public safety communications network, which are on separate switches, were not affected, although police access to some online services, such as background checks, was temporarily disrupted.
City technicians and outside contractors worked feverishly through the day and night to get things up and running. The network was restored at 4 a.m. Aug. 8.
So what's to prevent this from happening in the future?
"The key," according to the memo, "is to eliminate as many 'single points of failure' as feasible." That means creating redundant servers for all critical systems and "nearly full replication" of the entire data network in a second location. That will take time, and money.
Other than frustration, the outage didn't cost the city anything, spokesman Mark Cox said. Hardware and software replacement was covered under maintenance agreements.
As for the audio malfunction, Cox said, the culprit was a 1980s-era transformer "that finally bit the dust." It was replaced for $30.
(c) 2013 ProQuest Information and Learning Company; All Rights Reserved.