On one sunny summer day, this IT manager's network turned slow and lazy -- to his consternation.
I still vividly remember the day. I came into work in a good mood on a sunny summer morning in Vancouver, and was getting ready to do a regular check of the firewall log. As an IT Manager at the University of British Columbia, I managed a network of 400 nodes and supported applications on a variety of platforms. These ranged from large HP/Compaq and IBM enterprise servers at the top end, to Windows Server 2003, Novell NetWare 6.x, Unix and Citrix servers in the mid-range, all the way down to Windows and Macintosh desktops at the client end.
That sunny summer day started going badly when my assistant reported that she had received more than 20 calls from users at different departments (including Payroll). Users complained that they either could not log in to the Novell server or their Microsoft Outlook e-mails were extremely slow. My assistant mentioned that she had tried to reset (delete and recreate) the Outlook profiles of a few users, but this restored normal operation for just a few minutes.
I checked both the Novell server and Exchange 2003 server -- everything was functioning properly. A review of the server log and multiple virus scans produced nothing. I turned my attention to the network traffic monitoring software, which showed that the network was unexpectedly busy.
"What could be causing it?" I wondered desperately, as I stared at the switches in the machine room. The phone calls were piling up and the situation was getting worse with pay day the next day and the payroll systems still down. I tried to hide my growing frustration as I patiently explained to managers that we were working hard on the problem.
I was approaching the point of outright panic when suddenly I remembered there had been a power outage the day before.
Our network employed a gigabit backbone and high-speed switched Ethernet connections at both the core and the edge. Ethernet switches located in floor distribution wiring closets divided the network into 10 geographic sections. The result was a tree structure starting at the core and expanding to every floor wiring closet and eventually to every workstation.
The network, however, had been implemented with virtual local area network (VLAN) technology, to provide flexibility. By layering a logical network structure atop the physical network, client computers could participate in a departmental subnet regardless of physical location. Just as important, the virtualized structure compartmentalized traffic, preventing congestion.
Remembering the power outage, I quickly went through the settings for each routing switch. I soon discovered that a Cisco 2900 routing switch had not retained its VLAN settings when the power failed. As a result, three VLANs had collapsed into a single default VLAN, and the unmanaged traffic was choking the network.
Once I discovered the problem, it took me 20 minutes to reconfigure the switch and restore the network to normal operation.
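A check like the one described above, going through each switch to find ports that have dropped out of their departmental VLAN, could in principle be automated. The following is only a minimal sketch in Python under stated assumptions: the port names, VLAN numbers and the `find_collapsed_ports` helper are hypothetical, the expected and actual assignments are supplied as plain dictionaries rather than read from a real switch, and the fallback VLAN is assumed to be VLAN 1.

```python
# Assumption: a switch that loses its configuration puts every port in VLAN 1.
DEFAULT_VLAN = 1


def find_collapsed_ports(expected, actual, default_vlan=DEFAULT_VLAN):
    """Return (port, expected_vlan, actual_vlan) for every port that should
    belong to a departmental VLAN but is currently in the default VLAN."""
    collapsed = []
    for port, want in sorted(expected.items()):
        # A port missing from the actual data is treated as being in the default VLAN.
        have = actual.get(port, default_vlan)
        if want != default_vlan and have == default_vlan:
            collapsed.append((port, want, have))
    return collapsed


# Hypothetical data for illustration only: the documented plan versus
# what the switch reports after the power outage.
expected = {"Fa0/1": 10, "Fa0/2": 20, "Fa0/3": 30, "Fa0/4": 1}
actual = {"Fa0/1": 1, "Fa0/2": 1, "Fa0/3": 1, "Fa0/4": 1}

for port, want, have in find_collapsed_ports(expected, actual):
    print(f"{port}: expected VLAN {want}, found VLAN {have}")
```

Keeping the documented plan separate from what the switch reports means the comparison can be rerun after any power event, before the first user calls in.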
It was a difficult and challenging day, no doubt. From this "Never Again" experience, I learned that problems can often arise from forgotten events, and that the solutions we employ to boost productivity can fail in ways that destroy productivity.
Hong-Lok Li, MCSE 2003, MCSA, MCDBA, MCSD, is an information technology manager at the University of British Columbia, in Vancouver, Canada.