Posey's Tips & Tricks

Troubleshooting Windows 10's Blue Screen of Death, Part 1

There are three initial paths you can take to diagnose a BSOD problem.

When a Windows 10 machine experiences the dreaded blue screen of death (BSOD), the best course of action is to reimage the machine. However, while reimaging often solves the problem, it is rarely convenient for anything other than domain-joined corporate desktops.

Last week, for example, my normally reliable Surface Book began producing BSOD errors while I was 2,000 miles from home. Had I reimaged the machine, I would have lost my applications and the data stored on its hard disk.

Since reimaging a machine isn't always a good option, I wanted to show you a few techniques I like to use when troubleshooting blue screen errors.

Understanding Where Blue Screen Errors Come From
When a blue screen error occurs, it means that something has happened at the kernel level that has compromised the operating system's ability to function. A buggy application won't usually cause a BSOD. Blue screen errors are normally related to hardware problems, corrupt operating system files or other low-level problems.

Remember Occam's Razor
Occam's razor is a problem-solving principle that is often paraphrased as, "The simplest explanation is usually correct." While the accuracy of this is debatable, the concept applies really well to blue screen errors.

For example, if you recently upgraded a machine's hardware and suddenly started getting blue screen errors, then the simplest explanation -- and most likely the correct one -- would be that the errors are being caused by a hardware problem.

Of course, in the real world, things aren't always that cut and dried. In my case, I had not changed anything on my Surface Book. I didn't install any new applications or updates, and there were no hardware upgrades. As such, it was necessary to look deeper.

The Windows Reliability Monitor
When it comes to tracking down the root cause of a BSOD, it may be helpful to take a look at the Windows Reliability Monitor -- which is just what I did for my situation. For those who aren't familiar with the Reliability Monitor, it is a native Windows tool that tracks a PC's reliability over time. When problems occur, the Reliability Monitor is sometimes able to correlate those problems with recent events.

To launch the Reliability Monitor, enter the Control command at the Windows Run prompt. This will cause Windows to open the Control Panel. Next, click on System and Security, followed by Security and Maintenance. Then expand the Maintenance section and click on the View Reliability History link.
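If you would rather skip the Control Panel clicks, the Reliability Monitor can usually be opened directly with the perfmon /rel command. The following is a minimal Python sketch of that shortcut, not a definitive tool. The function names are my own, and routing the launch through cmd's start command is an assumption I made so that Windows handles the launch the same way the Run prompt would.

```python
import platform
import subprocess


def reliability_monitor_command():
    """Return the command line that opens the Reliability Monitor view."""
    # "perfmon /rel" opens Performance Monitor's reliability history view.
    # It is launched through cmd's "start" (the empty string is the window
    # title argument that "start" expects) -- an assumption, see lead-in.
    return ["cmd", "/c", "start", "", "perfmon", "/rel"]


def open_reliability_monitor():
    """Open the Reliability Monitor; return False when not on Windows."""
    if platform.system() != "Windows":
        return False
    subprocess.run(reliability_monitor_command(), check=True)
    return True


if __name__ == "__main__":
    if not open_reliability_monitor():
        print("The Reliability Monitor is only available on Windows.")
```

Typing perfmon /rel at the Run prompt does the same thing without any script.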

You might be wondering why I started with the Reliability Monitor as opposed to starting with another technique, such as examining the event logs. There are actually two reasons why I chose to begin the troubleshooting efforts with the Reliability Monitor. The first is that using the Reliability Monitor helps to simplify the troubleshooting process. That's because the Reliability Monitor is designed to surface the information that is most relevant to the problems that are occurring. While this information does indeed exist within the event logs, it can sometimes be tricky to sift through all of the various events to find what you really need.
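To give a sense of what sifting through the event logs involves, here is a hedged Python sketch that asks the built-in wevtutil tool for the most recent critical and error events in the System log. The XPath filter (Level 1 for Critical, Level 2 for Error), the default of 20 events and the function names are my own choices for illustration, and the output will still include plenty of entries that have nothing to do with a given BSOD.

```python
import platform
import subprocess

# Level 1 = Critical, Level 2 = Error (an illustrative filter, not a
# BSOD-specific one).
CRITICAL_OR_ERROR = "*[System[(Level=1 or Level=2)]]"


def build_query_command(max_events=20):
    """Build a wevtutil command that lists recent System log errors."""
    return [
        "wevtutil", "qe", "System",   # qe = query events from the System log
        "/q:" + CRITICAL_OR_ERROR,    # XPath filter on the event level
        "/c:" + str(max_events),      # maximum number of events to return
        "/rd:true",                   # reverse direction: newest first
        "/f:text",                    # plain-text output instead of XML
    ]


def recent_system_errors(max_events=20):
    """Return the query output as text, or None when not on Windows."""
    if platform.system() != "Windows":
        return None
    result = subprocess.run(
        build_query_command(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    output = recent_system_errors()
    print(output if output is not None else "wevtutil requires Windows.")
```

Even with a filter like this, you still have to read through each entry and decide whether it matters, which is the work the Reliability Monitor tries to do for you.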

The second reason I chose to start my troubleshooting efforts with the Reliability Monitor is that I suspected a hardware problem was the most likely cause of the BSODs.

I suspected a hardware problem -- aside from past experience with other hardware problems on other machines -- because my Surface Book began acting strangely just prior to the blue screen errors. Windows was having difficulty turning on the camera used for Windows Hello authentication. I also noticed that the machine was slowing to a crawl, especially when I would try to copy files to an external hard disk. I had been using the computer almost exclusively offline, so I was relatively confident that I could rule out any sort of malware attack and that my problems were most likely hardware-related.

In case you're wondering, the Windows Reliability Monitor confirmed my suspicions. In Part 2 of this series, I will show you what the Windows Reliability Monitor found and what I did to correct the problem.

About the Author

Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.

