SQL Slip Up
What happens when a query goes wrong -- very, very wrong.
One of our largest customers was using our software to support its training department. The software let thousands of employees register for classes, checkout resources and see their progress toward certification. They needed some customizations done to support a government client, so as the only person with experience in Microsoft SQL Server, I traveled to the customer site. Our company president, who also doubled as our head programmer, came to support the database side.
The customizations had been completed, tested and saved to floppy disk before we left our office in Chicago. Before we could install the code changes, however, we had to massage the live production data. Simple enough. The first step was to run a SELECT query that looked for records in the STUDENT table that would not fit into the new customizations. We needed to make sure our final query would successfully update the table with the correct information.
My boss worked up a query that returned all students who did not have an employee ID registered. A second UPDATE query then altered the data in the LAST_NAME field for all the matched records. This way, the records would remain in the database, but would not display in the application.
My boss was flying through the process, which, to be honest, he was very good at. However, he accidentally placed a carriage return in the SQL query text, causing the UPDATE query look to overwrite the last name for every record in the customer's database and made it TERMINATED.
I watched him prep the query and remember wanting to scream "Nooooo!" as his hand hovered over the Enter button. I said to my boss, "David, do you realize what you just did?" His eyes flicked back and forth over the screen and he whispered, "Oh my god."
It was bad. We had been working on the production database. That's never a good idea, especially when the database belongs to your largest client.
My boss was frantic, trying to keep the customer from finding out what happened while we tried to recover the lost data. I searched for recent database dumps. There were none. After a couple of hours, we knew we had to tell the client.
They were ... unhappy. They had to go to their tape storage facility and perform a full restore of their database. And back then, restoring from tape was really hard.
Needless to say, all of our professional services during that trip were gratis, including our expenses. I think the only reason they didn't kick us to the curb right then was that our software was so heavily embedded into their business. To appease them, we offered deep discounts and free professional services for a year. We lost a lot of money on that trip, but we did keep the customer.
Like most disasters, a lot of little mistakes were made on the way to a big calamity. But one error stands out: The decision to run an untested query on a production database. We also failed to confirm the presence of a recent data backup. Had we taken that step, we might have been able to quietly recover from our error without involving the client. Instead, we learned a hard lesson in customer retention.
As for the president of the company, he stopped working on live data, providing only pre-tested scripts to customers that need data customizations.
Jim Desmond, CISSP, works in information security and contingency planning in the San Francisco Bay area.