Posey's Tips & Tricks
Strategies for Expanding Storage on Hyper-V Replica Servers
Here's how to get back up and running in minimal time when increasing storage space.
A few years ago, I made the decision to simplify my network infrastructure as much as I possibly could in an effort to reduce costs and improve reliability. In doing so, I moved everything to the cloud except for my file server, which contains a massive amount of data. I didn't want to cluster the file server. Instead, I set up two Hyper-V servers (each with direct attached storage), virtualized my file server, and then used Hyper-V replication to ensure that I would have a spare copy of the server if I ever needed it.
The two Hyper-V servers that I mentioned are the only two servers on my production network (although I have a bunch of lab servers). At the time I set them up, I installed more storage than I thought I would ever need. Remember, my goal was to simplify things, and I didn't want to have to upgrade storage anytime soon.
A couple of weeks ago, one of the drives in one of my Hyper-V servers failed and needed to be replaced. As I assessed the situation, I realized that the idea that my servers contained more storage than I would ever need was a fantasy. The volume containing my virtual machine was a parity volume with 9TB of usable space, and I had already consumed about 70 percent of that space. With a couple of big projects coming up that were sure to consume a significant amount of space, I made the decision to expand the server's storage rather than simply replace the failed drive.
Each server was previously equipped with a 250GB system drive and four 3TB data drives configured as a parity array. I ordered 5TB drives to replace the 3TB drives. I also ordered replacements for the system drives, since those drives were a few years old and were likely to fail eventually. The real question then became how to replace the drives in a minimally disruptive way.
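To put the capacity numbers in perspective, here is a quick back-of-the-envelope calculation in Python. The drive counts and sizes come from the article; the formula assumes a single-parity layout, in which one drive's worth of capacity goes to parity data.

```python
# Usable capacity of a single-parity array: (n - 1) drives' worth of data.
def parity_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    return (drive_count - 1) * drive_size_tb

old_usable = parity_usable_tb(4, 3)    # 9 TB, matching the article
new_usable = parity_usable_tb(4, 5)    # 15 TB after moving to 5TB drives
used = 0.70 * old_usable               # roughly 6.3 TB already consumed

print(f"Old array: {old_usable:.0f} TB usable, about {used:.1f} TB in use")
print(f"New array: {new_usable:.0f} TB usable, about {new_usable - used:.1f} TB free")
```

In other words, swapping the 3TB drives for 5TB drives takes the volume from 9TB to roughly 15TB of usable space without adding any drive bays.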
My first instinct was to take a round-robin approach to replacing the 3TB disks. Since those disks were configured as a parity array, I could remove a disk and replace it with one of the new disks. The array would then populate the new disk with data. Once the array had been rebuilt, I could remove and replace the next disk, repeating the procedure until all of the disks had been replaced. At that point, I could expand my virtual hard disks and the volumes they contain, back up the system drive, and then wrap everything up by replacing the system drive and restoring my backup.
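The virtual hard disk and volume expansion at the end of that plan maps onto two standard cmdlets: Resize-VHD on the host and Resize-Partition inside the guest. The Python sketch below simply shells out to them as an illustration; the VHDX path, drive letter, and target size are hypothetical placeholders, the two commands run in different places (host versus guest), and the VHDX either needs the VM shut down or has to be attached to a SCSI controller for an online resize.

```python
import subprocess

def ps(command: str) -> None:
    """Run a single PowerShell command and raise an error if it fails."""
    subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)

# On the Hyper-V host: grow the virtual hard disk (path and size are placeholders).
ps(r"Resize-VHD -Path 'D:\VMs\FileServer\Data.vhdx' -SizeBytes 12TB")

# Inside the guest OS: extend the data volume to fill the now-larger disk.
ps("Resize-Partition -DriveLetter F "
   "-Size (Get-PartitionSupportedSize -DriveLetter F).SizeMax")
```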
It was a good plan, and I am confident that it would have worked. Ultimately, however, I decided against it. My reason for abandoning the plan had to do with the amount of time it would take to complete. When you replace a drive in a parity array, it takes some time for the array to be rebuilt and brought back to a healthy status. Based on my hardware's benchmarked performance, I estimated that it would probably take a little over two days to populate each drive. That meant that it would have taken at least eight or nine days to replace all eight drives (four in each server). When you factor in everything else that needed to be done, my storage upgrade could realistically have turned into a two-week project.
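For anyone who wants to sanity-check that estimate, the math works out roughly as follows. The two-day-per-drive figure and the drive counts come from the article; the rebuild throughput is a hypothetical value chosen to land in that neighborhood, and the total assumes the four drives in each host are swapped one at a time, with the two hosts handled in parallel.

```python
# Back-of-the-envelope rebuild-time estimate for the round-robin plan.
DRIVE_TB = 5             # capacity of each replacement drive
REBUILD_MB_PER_SEC = 25  # hypothetical sustained rebuild rate

seconds_per_drive = (DRIVE_TB * 1e12) / (REBUILD_MB_PER_SEC * 1e6)
days_per_drive = seconds_per_drive / 86_400       # about 2.3 days per drive

drives_per_host = 4
days_per_host = drives_per_host * days_per_drive  # about 9.3 days of back-to-back rebuilds

print(f"{days_per_drive:.1f} days per drive, {days_per_host:.1f} days per host")
```

Even with both hosts rebuilding in parallel, that is still eight or nine days of nonstop rebuild activity before any of the other work begins.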
Taking two weeks to perform a storage upgrade wouldn't necessarily have been a problem, but there were other issues to consider. For starters, an array's performance is diminished during the rebuild process, so I would have had a couple of weeks of subpar performance.
The bigger issue was that of reliability. Most of the drives in my two servers had never been replaced. Since these drives were the same age and had endured similar levels of activity, the odds of another drive failing in the near future were pretty good. That was especially true considering that the remaining drives would be under increased stress from the continuous array rebuilding. I didn't want to run the risk of data loss.
I decided to increase my storage by taking advantage of the fact that my virtual machine was replicated to a secondary Hyper-V server. I performed a planned failover so that the VM was running on the server that had not experienced a drive failure and then performed a full backup of both servers. Once the backup was complete, I removed replication. Next, I replaced all of the drives in the idle host server (including the system drive) and installed a clean copy of the host operating system.
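For readers who prefer to script these steps rather than click through Hyper-V Manager, the sequence maps onto a handful of standard Hyper-V cmdlets. The Python sketch below simply shells out to them to illustrate the mapping (the article does not say which tool was actually used); the VM name is hypothetical, the comments indicate which host each command runs on, and the full backups are assumed to be handled by whatever backup product is already in place.

```python
import subprocess

def ps(command: str) -> None:
    """Run one PowerShell command on the local host and raise if it fails."""
    subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)

VM = "FileServer"  # hypothetical VM name

# On the primary host (the one with the failed drive):
ps(f"Stop-VM -Name '{VM}'")                      # a planned failover requires the VM to be off
ps(f"Start-VMFailover -VMName '{VM}' -Prepare")  # send the final changes to the replica

# On the replica host (the healthy server):
ps(f"Start-VMFailover -VMName '{VM}'")           # fail over to the replica copy
ps(f"Start-VM -Name '{VM}'")                     # bring the file server back online

# After the full backups are finished, remove the replication relationship on
# each host so the idle server can be torn down and rebuilt.
ps(f"Remove-VMReplication -VMName '{VM}'")
```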
Within a couple of hours, the host was back online. I configured the host as a replica server and allowed my virtual machine to be replicated to the recently upgraded host. Once the replication process was completed, I performed a planned failover, disabled replication, and replaced all of the drives in the other server. After doing so, I installed Windows, brought the server back online, and re-enabled replication.
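Re-enabling replication toward the rebuilt host follows the same pattern. Again, this is only a sketch with hypothetical names: the freshly installed server first has to have the Hyper-V role installed and be configured to accept replication (Set-VMReplicationServer), and Kerberos over port 80 is just one of the supported authentication options.

```python
import subprocess

def ps(command: str) -> None:
    subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)

VM = "FileServer"            # hypothetical VM name
REPLICA = "hv2.example.com"  # hypothetical FQDN of the rebuilt host

# On the host currently running the VM: point replication at the rebuilt
# server and kick off the initial (full) replication.
ps(f"Enable-VMReplication -VMName '{VM}' -ReplicaServerName '{REPLICA}' "
   "-ReplicaServerPort 80 -AuthenticationType Kerberos")
ps(f"Start-VMInitialReplication -VMName '{VM}'")

# Check on progress; repeat until the replication health reports Normal.
ps(f"Measure-VMReplication -VMName '{VM}'")
```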
In retrospect, I was really happy with the way things went. The entire process took about four and a half days to complete, and the performance loss during that time was barely noticeable.
About the Author
Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.