Taking Hyper-V Health Monitoring to the Next Level, Part 1 -- Redmondmag.com

A new PowerShell-based Hyper-V health tool goes beyond replication status to estimate whether a VM is actually ready for a successful failover.

By Brien Posey
06/30/2026

A couple of months ago, I ran into some problems with Hyper-V replication and consequently decided to build a PowerShell script that was designed to detect replication failures and other common problems. Now that I have been using the script for a while, I can confirm that it works beautifully and does exactly what it is supposed to.

Even so, I came to realize that when it comes to Hyper-V replication, there are two questions that really matter. My previous script did a great job of answering the first one: "Is the replication process working the way that it should?" However, that script does little to answer the second one: "If I had to fail over a VM right now, what are the odds that the failover would actually work?"

I decided to build a PowerShell script to help me answer that second question. The script proved to be far lengthier and more complex than I expected, but I am happy with the results. More importantly, I am eventually going to use the functions contained within my script in an entirely new tool that I am developing. That tool is going to assist with server monitoring and capacity planning. I mention this because although the tool that I have created works, its output isn't nearly as fancy as that of some of the other tools that I have created.

So what does my new tool do? I made a list of everything I could think of that might prevent a failover from being successful. I then wrote a script that checks for all of those conditions and produces a score estimating the odds that a failover will succeed. The script also details any conditions that might need to be addressed in order to improve the odds of a successful failover.

If you look at Figure 1, you can see a summary displaying the health information for a couple of my VMs. Notice that each VM has received a score ranging from 0 to 100, indicating how likely the VM would be to fail over successfully. Just beneath the score, you will see the VM's current status (whether or not it is ready for a failover) and any issues that have been detected.

*Figure 1.* These virtual machines can likely be failed over successfully.

So now that I have spent some time talking about what my script does, I want to turn my attention to the eight health checks that the script performs in an effort to determine whether or not the VM can be failed over.

Replication Health
The first thing that the script checks is replication health, as reported by Hyper-V. If Hyper-V reports that the VM's replication health is normal, then the script awards 30 points. If the VM is in a warning state, then only 10 points are awarded. If the VM is in a critical state, then no points are awarded. In that case, the script also sets a $CriticalFailure variable to True. I use this variable to indicate that a condition exists that renders a failover impossible. I actually had a tough time deciding whether to designate a critical state as a critical failure because technically, a failover may still be possible, albeit with at least some data loss. Ultimately, however, I decided to treat a critical state as a critical failure.
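The replication health rule described above could be sketched roughly as follows. This is only an illustration, not the script's actual source. The function name `Get-ReplicationHealthScore`, the shape of the returned object, the issue messages, and the handling of unrecognized values are all assumptions made for the example.

```powershell
# Hypothetical helper: maps a replication health value to points.
# Point values follow the article; everything else is an assumption.
function Get-ReplicationHealthScore {
    param(
        [Parameter(Mandatory)]
        [string]$Health   # expected: 'Normal', 'Warning' or 'Critical'
    )

    $result = [pscustomobject]@{
        Points          = 0
        CriticalFailure = $false
        Issue           = $null
    }

    switch ($Health) {
        'Normal'   { $result.Points = 30 }
        'Warning'  {
            $result.Points = 10
            $result.Issue  = 'Replication health is in a warning state.'
        }
        'Critical' {
            # No points, and the VM is flagged as a critical failure.
            $result.CriticalFailure = $true
            $result.Issue           = 'Replication health is critical.'
        }
        default    {
            # Assumption: unknown values earn no points but are reported.
            $result.Issue = "Unrecognized replication health value: $Health"
        }
    }

    $result
}
```

On a Hyper-V host, the input would presumably come from the `Health` property returned by `Get-VMReplication` for the VM in question, although the article does not say which cmdlet the script uses.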

The second condition that my script checks for is replication freshness. In other words, how long has it been since the last successful replication cycle? If a replication cycle has successfully completed within the last five minutes, then 10 points are awarded. If it has been between five and 10 minutes, then five points are awarded, but a warning is issued. If the last successful replication was more than 10 minutes ago, then no points are awarded.
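A minimal sketch of the freshness thresholds might look like the following. It is not the script's actual source. The function name `Get-ReplicationFreshnessScore`, the `-Now` parameter (included only to make the function easy to test), the treatment of exactly five and exactly 10 minutes, and the message text are assumptions.

```powershell
# Hypothetical helper: scores how recent the last successful replication was.
function Get-ReplicationFreshnessScore {
    param(
        [Parameter(Mandatory)]
        [datetime]$LastReplicationTime,

        # Assumption: injectable clock so the thresholds can be tested.
        [datetime]$Now = (Get-Date)
    )

    $ageMinutes = ($Now - $LastReplicationTime).TotalMinutes
    $result = [pscustomobject]@{ Points = 0; Issue = $null }

    if ($ageMinutes -le 5) {
        # Replicated within the last five minutes.
        $result.Points = 10
    }
    elseif ($ageMinutes -le 10) {
        # Between five and 10 minutes: partial credit plus a warning.
        $result.Points = 5
        $result.Issue  = 'Last successful replication was 5 to 10 minutes ago.'
    }
    else {
        # More than 10 minutes: no points. Reporting an issue here is an assumption.
        $result.Issue = 'Last successful replication was more than 10 minutes ago.'
    }

    $result
}
```

For example, calling the function with a `-LastReplicationTime` seven minutes before `-Now` yields five points and a warning message.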

The third test checks for network connectivity. This test is a little bit on the complicated side, but effectively, I am checking to see whether or not each NIC is connected to an external virtual switch. The score is calculated by dividing the number of valid NICs (NICs that are connected to a switch) by the total number of NICs that have been detected and then multiplying the answer by 10.
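The arithmetic behind the network connectivity score can be illustrated with a short sketch. This is not the script's actual source. The function name `Get-NetworkConnectivityScore`, its parameters, and the decision to return zero when a VM has no NICs are assumptions. The sketch also takes the stricter reading of the description above and counts a NIC as valid only if its switch is in the list of external switches.

```powershell
# Hypothetical helper: (valid NICs / total NICs) * 10.
function Get-NetworkConnectivityScore {
    param(
        # Objects exposing a SwitchName property, one per NIC.
        [object[]]$Adapters = @(),

        # Names of the virtual switches considered external.
        [string[]]$ExternalSwitchNames = @()
    )

    $total = @($Adapters).Count
    if ($total -eq 0) {
        # Assumption: no NICs means no points (and avoids dividing by zero).
        return 0
    }

    # A NIC is valid when it has a switch and that switch is external.
    $valid = @($Adapters | Where-Object {
        $_.SwitchName -and ($ExternalSwitchNames -contains $_.SwitchName)
    }).Count

    ($valid / $total) * 10
}
```

With two NICs, one of which is attached to an external switch, the sketch returns 5. On a Hyper-V host, the inputs could plausibly be gathered with `Get-VMNetworkAdapter -VMName` for the adapters and `Get-VMSwitch -SwitchType External` for the switch names, though the article does not state how the script collects them.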

In part 2 of this series, I will tell you about the remaining health checks that my script performs. I will also share the source code with you at the end of the post.

About the Author

Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.
