Posey's Tips & Tricks

Where Air Gapped Backups Actually Fail, Part 2

Air gapped backups can still fail when configuration drift, lost encryption keys and routine human mistakes go unnoticed until recovery is needed.


In my previous post in this series, I explained that as important as air gapped backups might be, they can break down as a result of human error stemming from things like untested restorations or even rotation drift. However, these are not the only things that can keep an air gapped backup from working when it is needed. In this follow-up, I want to talk about a few more things that can derail your air gapped backups.

Silent Configuration Drift
One of the big issues that I talked about in the first part of this series was the trouble that comes from not testing the restoration process. However, there is an additional problem associated with the lack of testing that I forgot to mention in that article: silent configuration drift.

Here is the way that this one plays out. An organization creates an air gapped backup. They test it a few times and everything works exactly as intended. Once the backup has been proven to work properly, the organization eventually stops testing it.

The problem here is that things inevitably change over time. New workloads are added, while older workloads are retired. The infrastructure changes, and the air gapped backup that once worked perfectly is now inadequate. At best, the backup still delivers some degree of protection but leaves newer workloads unprotected. At worst, the backup process simply spews a series of errors that nobody is paying attention to.
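One way to catch this kind of drift is to periodically compare what actually exists in the environment against what the backup job covers. The following is a minimal Python sketch of that idea, not a feature of any particular backup product. The function name and the workload names are hypothetical, and in practice both lists would have to come from your own inventory and backup tooling.

```python
def find_protection_gaps(current_workloads, protected_workloads):
    """Compare the workloads that exist today against those the backup covers."""
    current = set(current_workloads)
    protected = set(protected_workloads)
    return {
        # Workloads that exist but are not included in the backup
        "unprotected": sorted(current - protected),
        # Workloads the backup still references but that have been retired
        "stale": sorted(protected - current),
    }


# Hypothetical inventory data, for illustration only
gaps = find_protection_gaps(
    current_workloads=["sql-01", "file-01", "web-02"],
    protected_workloads=["sql-01", "file-01", "web-01"],
)
print(gaps)
```

A non-empty result in either list is a sign that the backup definition no longer matches the environment and that it is time to test the restoration process again.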

Mismanaged Encryption Keys
Many organizations encrypt their backups as a safeguard against data theft. However, the encryption and decryption processes are based around the use of keys. An encrypted backup is useless without the necessary keys.

Quite a few years back, someone asked me to help them restore a backup from tape. The problem was that the organization's tape drive had recently failed and been replaced. The old tape drive had been paired with poor or misunderstood key management. When the original drive was replaced, the only usable copy of the encryption keys effectively disappeared with it. The end result was that backups created on the old drive could not be restored.
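A simple safeguard against this situation is to confirm, on a regular basis, that a copy of the key stored somewhere other than the backup hardware is still the key that the backups were encrypted with. The sketch below illustrates one possible approach using Python's standard library: record a SHA-256 fingerprint of the key when the backup is created, then check the escrowed copy against that fingerprint later. The function names and the sample key are hypothetical, and the sketch assumes a randomly generated, high-entropy key. It does not replace a real restore test.

```python
import hashlib
import hmac


def key_fingerprint(key_bytes: bytes) -> str:
    """Return a SHA-256 fingerprint that can be recorded alongside the backup."""
    return hashlib.sha256(key_bytes).hexdigest()


def escrow_copy_matches(escrowed_key: bytes, recorded_fingerprint: str) -> bool:
    """Check that the escrowed key is the one the backup was encrypted with."""
    return hmac.compare_digest(key_fingerprint(escrowed_key), recorded_fingerprint)


# Placeholder key material, for illustration only
backup_key = b"example-key-material-not-a-real-key"
recorded = key_fingerprint(backup_key)

print(escrow_copy_matches(backup_key, recorded))
print(escrow_copy_matches(b"some-other-key", recorded))
```

If the check fails, or if nobody can locate an escrowed copy to check in the first place, that is far better discovered before the hardware holding the original key is replaced.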

The Forgotten Human Layer
Most failures tied to air gapped backups are probably linked to simple human error. Air gapped backups are, by design, kept offline except for when they are being updated. This means that the tasks associated with creating or updating a backup are going to require at least some degree of manual interaction. This might include things like transporting the storage, swapping media, verifying that a backup job has completed, and making sure that completed backups are stored securely.

There are any number of problems that can occur as a result of retrieving, transporting, and storing media. At one of the places where I used to work, one of the employees would take backup tapes to an offsite location each day. That method worked well enough, until on one particularly hot day the employee left the tape in their car and it melted.

That same organization then contracted with a courier service to transport backup tapes to and from a secure facility. That arrangement caused a couple of unforeseen problems of its own. At least one tape was lost, and there was a situation in which a backup needed to be restored, but could not be because the tape was with the courier service.

The lesson here is that air gapped backups can easily be misplaced, damaged, dropped, or might be inaccessible for any number of reasons. As such, it's important to adopt a tiered approach to backups in which your air gapped backup is treated as supplementary (in case your primary backup fails), rather than being the only backup.

Human error can also come into play with regard to employees forgetting to create the backup. I have seen backups be skipped because someone was out of the office or because everyone was distracted by a crisis (or by the company picnic).

Unfortunately, it's tough to completely prevent these types of situations, but there are various ways to minimize the risks. I recently heard of an organization that created a nag screen that was designed to prevent a particular user from doing anything else until they acknowledged that they had created a backup. I'm not saying that this was the best idea, but I have always admired those who like to find creative solutions to problems.
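A less intrusive option is an automated check that flags any backup that has not completed within its expected window, so that a skipped backup gets noticed even when everyone is distracted. Here is a minimal Python sketch of that idea. The job names, dates, and allowed intervals are all hypothetical, and the last-success timestamps would need to come from wherever your organization logs completed backups.

```python
from datetime import datetime, timedelta


def overdue_backups(jobs, now):
    """Return the names of jobs whose last success is older than allowed.

    jobs maps a job name to a (last_success, max_age) tuple.
    """
    return sorted(
        name
        for name, (last_success, max_age) in jobs.items()
        if now - last_success > max_age
    )


# Hypothetical schedule and log data, for illustration only
jobs = {
    "weekly-tape-rotation": (datetime(2024, 6, 7, 18, 0), timedelta(days=8)),
    "monthly-offsite-copy": (datetime(2024, 4, 28, 18, 0), timedelta(days=35)),
}
overdue = overdue_backups(jobs, now=datetime(2024, 6, 10, 9, 0))
print(overdue)
```

The list of overdue jobs could then be sent to more than one person, so that the reminder does not depend on the same individual who forgot the backup.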

Ultimately, the real danger with air gapped backups is that the work of creating them happens infrequently enough to be forgettable, yet frequently enough that it is easy to fall into the trap of assuming that everything is working properly.

About the Author

Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.
