Making Sure the Backup Backs Up
Testing and backup problems result in an all-nighter for a health care IT staff.
Two days after what appeared to be a run-of-the-mill migration from a Windows 2000 file server and primary domain controller (PDC) to a shiny new virtualized Windows Server 2003 file server and domain controller (DC), our Citrix farm crashed around 3 p.m. Because our entire medical practice relied on Citrix to deliver the electronic medical record system to our providers and staff, we gang-tackled the problem immediately.
With my supervisor's help, I spent the next 16 hours trying to diagnose the problem. We had five Citrix boxes that were responsive, but Citrix threw cryptic errors whenever anyone tried to connect. As the night wore on, we reinstalled Citrix on the licensed server and ran updates, and even reinstalled it on the servers, thinking that if we could get just one of them up we could worry about the others later.
Information in a few forums suggested that the problem was in the Web interface module of Citrix. So we created a new Web interface server around 4 a.m., with the clinic scheduled to reopen at 8 a.m.
After configuring the Web interface to point to an emergency server, we made our first successful connection around 7 a.m. The next most pressing problem was that all of the clients still pointed to the old Web server. So for the next two hours we answered calls and tried to update everyone's URL at once, because our version of Citrix didn't support automatic updating.
Consequently, we had to go to each computer and enter the new URL by hand for 75 increasingly impatient users. All of this was done by 10 a.m., and we finally went home after 26 hours of continuous work.
Well, it turns out our original file server contained a special pointer for the Web interface. It had been installed by my predecessor and my supervisor's predecessor, neither of whom still worked for the company. When we switched servers, the pointer was erased along with all the settings needed to connect to our Citrix app farm.
The moral of the story: Allow ample time to test, and make sure that the backup works.
Chris Gahlsdorf is a systems administrator with Northwest Human Services.