Posey's Tips & Tricks

Hyper-V Snapshots vs. Backups: Addressing the Confusion

Microsoft's Hyper-V snapshots feature is a convenience, not a backup replacement. But with different cloud vendors using the term "snapshots" in different ways, that's easy to forget.

The public cloud is easily one of the most transformative IT technologies of all time. Among many other things, the cloud has had a huge impact on the way that we perform backups and disaster recovery operations. Not only have backup methods and technologies changed, but so has some of the terminology.

That being the case, I wanted to take a moment to talk about one particular term that can be especially problematic: snapshots.

Ever since Microsoft introduced Hyper-V back in 2008, it has been well established that Hyper-V snapshots are not a viable substitute for traditional backups.

For those who might not be familiar with Hyper-V snapshots, they are a tool that can be used to revert a virtual machine (VM) to a previous state, but without the hassles of restoring a backup. Whereas it might take hours to restore a VM from backup (depending on the VM size and the backup infrastructure being used), a snapshot can be restored in a matter of seconds.

Even so, Hyper-V snapshots exist primarily as a convenience, not as a backup replacement.

I say this because Hyper-V snapshots work in a completely different way from backups. When you create a backup, you are creating a copy of whatever it is that you're backing up. When you create a Hyper-V snapshot, by contrast, Hyper-V does not copy the VM or its data. What you are really creating is a differencing disk.

As you probably know, Hyper-V VMs generally have one or more virtual hard disks associated with them. When you create a snapshot, Hyper-V creates a special-purpose virtual hard disk called a differencing disk. From that point on, all of the VM's write operations are directed to the newly created differencing disk, and the VM's primary virtual hard disk is treated as read-only.

This is why a Hyper-V snapshot can be restored so quickly. When you restore a snapshot, you are not copying data from a backup. Instead, Hyper-V simply deletes the differencing disk and makes the VM's original virtual disk read/write. Since that virtual hard disk was previously read-only, it is entirely unchanged from the time that the snapshot was created.

Hence, when you restore a snapshot, you are not actually restoring data. You are simply throwing away all of the write operations that have occurred since the time that the snapshot was created.
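To make the mechanism concrete, here is a minimal Python sketch of the behavior described above. It is a toy model and not Hyper-V's actual implementation. The ToyVM class, its method names, and the block contents are all made up for illustration. It shows writes landing on the differencing disk, a restore that simply discards those writes, and why losing the primary disk leaves the snapshot with nothing to fall back on.

```python
class ToyVM:
    """Toy model of one virtual disk with a single Hyper-V-style snapshot.

    All names here are hypothetical; this only mimics the behavior
    described in the article, not real Hyper-V internals.
    """

    def __init__(self, blocks):
        self.base = dict(blocks)  # the VM's primary virtual hard disk
        self.diff = None          # differencing disk (exists only after a snapshot)

    def snapshot(self):
        # No data is copied. An empty differencing disk is created and
        # the base disk is treated as read-only from here on.
        self.diff = {}

    def write(self, block, data):
        # After a snapshot, every write is redirected to the differencing disk.
        if self.diff is None:
            self.base[block] = data
        else:
            self.diff[block] = data

    def read(self, block):
        # Reads prefer the differencing disk and fall through to the base disk.
        if self.diff is not None and block in self.diff:
            return self.diff[block]
        return self.base.get(block)

    def revert(self):
        # "Restoring" just throws away the writes made since the snapshot.
        self.diff = None


vm = ToyVM({0: "boot", 1: "app v1"})
vm.snapshot()
vm.write(1, "app v2 (bad patch)")
after_patch = vm.read(1)    # the VM sees the new data
vm.revert()
after_revert = vm.read(1)   # back to the data on the untouched base disk

# A snapshot holds no copy of the data: if the base disk is lost,
# there is nothing left to revert to.
vm.snapshot()
vm.base.clear()             # simulate losing the primary virtual hard disk
after_loss = vm.read(1)
```

In this sketch, after_revert comes back as "app v1" without any data being copied, while after_loss is None. That second result is the whole argument against treating a snapshot as a backup.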

Obviously, I'm oversimplifying things a bit since Hyper-V supports the creation of large snapshot chains that behave a little differently from a single snapshot, but the point remains that Hyper-V snapshots are not backups.
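For the curious, one way to picture a snapshot chain is as a stack of differencing disks layered on top of the base disk. The Python sketch below rests on that assumption and is only a simplified model. The read_block function and the sample layers are hypothetical, and real Hyper-V chains involve more bookkeeping than this.

```python
def read_block(chain, block):
    """Return the newest version of a block.

    `chain` is ordered oldest to newest: [base, diff1, diff2, ...].
    Hypothetical helper for illustration only.
    """
    for layer in reversed(chain):
        if block in layer:
            return layer[block]
    return None


base = {0: "boot", 1: "config A"}   # primary virtual hard disk
diff1 = {1: "config B"}             # writes made after the first snapshot
diff2 = {2: "new data"}             # writes made after the second snapshot
chain = [base, diff1, diff2]

current_config = read_block(chain, 1)     # found in diff1
current_boot = read_block(chain, 0)       # falls all the way through to base

# Reverting to the first snapshot amounts to ignoring every later layer.
reverted_chain = chain[:1]
reverted_config = read_block(reverted_chain, 1)
```

Every layer in this model depends on the layers beneath it, so the chain is still not a collection of independent copies.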

One of the reasons that snapshots can be a bit confusing is that different cloud providers use the term "snapshots" in different ways.

(In all fairness, Microsoft did rename the Hyper-V snapshot feature to "checkpoints." Even so, the checkpoint feature is still commonly referred to by its old name, snapshots. In fact, some of the Microsoft documentation still refers to checkpoints as snapshots. The page on "using checkpoints to revert virtual machines to a previous state," for example, states, "A snapshot is not a full backup and can cause data consistency issues with systems that replicate data between different nodes such as Active Directory.")

Amazon Web Services (AWS) treats Elastic Block Store (EBS) snapshots made within the AWS cloud as an actual point-in-time backup. In fact, AWS documentation states, "You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time snapshots. Snapshots are incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved."

When you first begin reading AWS' snapshot-related documentation, it initially seems that AWS snapshots work similarly to Hyper-V snapshots, and that AWS is just calling the snapshots backups. However, AWS Elastic Compute Cloud (EC2) snapshots are actual backup copies. Yesterday, for instance, I made a snapshot of a 30GB VM instance, and the snapshot itself was 30GB in size. (The first snapshot of a volume has to capture everything. The incremental savings that the AWS documentation describes apply to the snapshots taken after it.)
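The AWS wording quoted above ("point-in-time" and "incremental") can be illustrated with a small Python sketch. This is a toy model under stated assumptions and not how AWS stores snapshots internally. The SnapshotStore class, its methods, and the three-block volume are invented for the example. The point is that the first snapshot saves every block, later snapshots save only what changed, and the data is still recoverable after the original volume is gone.

```python
class SnapshotStore:
    """Toy model of incremental, point-in-time snapshots kept in separate storage.

    Hypothetical names throughout; real EBS snapshots are more sophisticated.
    Assumes a fixed set of blocks (blocks are changed, never removed).
    """

    def __init__(self):
        self.snapshots = []  # each entry holds only the blocks saved by that snapshot

    def take(self, volume):
        # Compare the volume against the most recent restorable state and
        # save only the blocks that differ. Returns the number of blocks saved.
        previous = self.restore(len(self.snapshots) - 1) if self.snapshots else {}
        changed = {b: d for b, d in volume.items() if previous.get(b) != d}
        self.snapshots.append(changed)
        return len(changed)

    def restore(self, index):
        # Rebuild the full volume as it looked when snapshot `index` was taken.
        volume = {}
        for snap in self.snapshots[: index + 1]:
            volume.update(snap)
        return volume


store = SnapshotStore()
volume = {0: "a", 1: "b", 2: "c"}

first_saved = store.take(volume)    # first snapshot: every block is saved
volume[1] = "B"
second_saved = store.take(volume)   # second snapshot: only the changed block

volume.clear()                      # simulate losing the original volume
latest = store.restore(1)
original = store.restore(0)
```

Unlike the Hyper-V case, the data survives the loss of the source volume here, because each snapshot stores real copies of blocks somewhere else.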

Somewhat surprisingly, Microsoft Azure works similarly to AWS. I say surprisingly because Azure VMs run on Hyper-V. Even so, the Azure documentation states, "A snapshot is a full, read-only copy of a virtual hard drive (VHD). You can take a snapshot of an OS or data disk VHD to use as a backup, or to troubleshoot virtual machine (VM) issues."

The point is that snapshots can mean different things based on the virtualization platform that is being used. Hyper-V snapshots should not be treated as a backup substitute. On the other hand, Azure and AWS EC2 VM snapshots actually are full-blown backup copies. Even so, many organizations opt to use third-party backup applications rather than depending solely on snapshots.

About the Author

Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.

