Joey on SQL Server
Spectre and Meltdown: How Do They Impact SQL Server?
IT needs to patch everything, but the Spectre/Meltdown patches come with their own problems, too. Here's how to limit the fallout.
- By Joey D'Antoni
- 01/30/2018
During my career in enterprise IT, the first couple of weeks of the year were frequently some of the most productive. Nearly everyone would be in the office after holiday vacation, there were no conferences, and planning for new objectives would be in full swing. This all combined to make January one of the more productive months in an IT organization.
This year, that was all thrown into flux by the announcement of the Spectre and Meltdown security bugs across all major CPU platforms. One colleague compared it to the Y2K crisis, if Y2K had actually caused problems. The only patching effort I can compare it to is the U.S. daylight saving time change in 2007, when every piece of infrastructure required patching to have the correct time.
What Are These Bugs?
Modern CPUs use what's called speculative execution: if the CPU is processing instructions A, B and C, it may execute instruction B before it has the results of instruction A, then discard that work if it turns out not to be needed. Through a side-channel attack, an attacker can use the traces that discarded work leaves in the CPU cache to read memory that should be off limits, including kernel memory. That memory can contain protected information like your passwords (in plain text), cached data (think the SQL Server buffer pool) and various other things that could be used to attack your systems.
All that is bad enough, but the fact that Spectre/Meltdown is a CPU exploit means two things: You have to patch everything in your infrastructure from the ground up, and it becomes really difficult to identify exploits of this vulnerability, which means if someone is attacking you, you might not know until it is too late.
The final fun fact about these vulnerabilities is that speculative execution boosts CPU performance, so when you apply these patches, your systems will likely take a performance hit. Depending on your patterns and infrastructure, anecdotal evidence points towards around a 5 to 30 percent impact. That's a big range, so it's important you have a good understanding of your baseline performance.
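To make that range concrete, here is a rough back-of-the-envelope sketch. It assumes the patch overhead simply multiplies the CPU cost of your existing workload, which is a simplification; the 60 percent starting point is a made-up figure, not a measurement.

```python
def projected_cpu(baseline_cpu_pct: float, overhead_pct: float) -> float:
    """Estimate CPU utilization after patching.

    Assumes the same work costs overhead_pct percent more CPU,
    which is a simplification and not a measured model.
    """
    return baseline_cpu_pct * (1 + overhead_pct / 100.0)


# Hypothetical server that averages 60% CPU before patching.
for overhead in (5, 30):
    print(f"{overhead}% overhead: {projected_cpu(60, overhead):.1f}% CPU")
```

A server with little headroom today is the one most likely to feel the upper end of that range.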
What Do We Need To Patch?
The scope of these patches goes beyond just about anything I have ever seen. From the ground up, you need to patch the following:
- Physical server BIOS
- Hypervisors
- Operating systems
- Anti-virus
- Database software
- Browsers
As if that weren't bad enough, not all versions of software are getting fixed. A good example of this comes from VMware. If you are running version 5.5 of VMware, the patches supplied do not fully address all of the vulnerabilities. This layer of the infrastructure is critical, because it is possible to escape virtual machine (VM) boundaries with this exploit, meaning a single unpatched guest VM could potentially access the memory of the hypervisor host.
Microsoft has patched all versions of SQL Server dating back to 2008 (if you are on Windows Server 2003 and SQL Server 2005, it is really time to upgrade). The SQL Server product team has also released some hardening guidance for its extensibility mechanisms such as SQL Common Language Runtime (which allows for execution of .NET code from within SQL Server), xp_cmdshell (which allows for command line access), and R and Python execution (which are part of the SQL Server Machine Learning Services). This all comes down to limiting the possible scope of where exploitable code can be run. All of the aforementioned services should already be controlled, but locking them down further is always a best practice. You can see the current Microsoft guidance for SQL Server in this KB article.
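As a starting point for that review, the sketch below checks which of these extensibility features are switched on. It is an illustration and not part of Microsoft's guidance: the query reads the sys.configurations catalog view, while the `enabled_features` helper and the sample rows are hypothetical. Running the query against a real instance would need a database driver, which is left out here.

```python
# T-SQL a caller could run through any database driver. sys.configurations
# is the SQL Server catalog view that lists server-level options.
AUDIT_QUERY = """
SELECT name, CAST(value_in_use AS int) AS value_in_use
FROM sys.configurations
WHERE name IN ('clr enabled', 'xp_cmdshell', 'external scripts enabled');
"""


def enabled_features(rows):
    """Given (name, value_in_use) rows, return the names that are switched on."""
    return sorted(name for name, value in rows if int(value) == 1)


# Hypothetical result set, standing in for rows fetched with AUDIT_QUERY.
rows = [("clr enabled", 1), ("xp_cmdshell", 0), ("external scripts enabled", 0)]
print(enabled_features(rows))
```

Anything the helper reports as enabled is a candidate for a conversation about whether it is still needed.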
The answer to the earlier question of "What do I need to patch?" is that you need to patch everything. That includes your phone and possibly your mouse, if it has a CPU.
Performance Impact
Reports from colleagues and various users on the Internet show a wide range of results. Some see nearly unnoticeable impact, which has been my experience with a customer running on a physical server without the microcode changes in place. There have also been anecdotal bits of data from other users (including users of non-SQL Server databases) that imply a much larger performance impact.
The most important thing you can do from a performance perspective is have a baseline of your server performance. If you don't know exactly what your workloads look like currently, you will have no idea how much impact these changes (or frankly any other changes you make to your system) will have. Baselines can mean different things to different people, but I would usually recommend the following Windows and SQL Server Performance counters to someone getting started:
- Processor(_Total)\% Processor Time
- System\Processor Queue Length
- PhysicalDisk(_Total)\Avg. Disk sec/Read
- PhysicalDisk(_Total)\Avg. Disk sec/Write
- SQLServer:Buffer Manager\Page life expectancy
- SQLServer:SQL Statistics\Batch Requests/sec
- SQLServer:SQL Statistics\SQL Compilations/sec
- SQLServer:SQL Statistics\SQL Re-Compilations/sec
Those are a handful of counters that I really like to use to understand what's going on with a server. The specific counters I would focus on for these patches would be SQL Server batch requests per second and overall CPU utilization. If your CPU use goes up while your throughput stays flat or drops (batch requests/sec is a pretty good indicator of throughput, at least over time), then you are very likely being impacted by this patch.
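One way to put those two counters together is to track CPU cost per unit of throughput before and after patching. The sketch below is a minimal illustration of that idea, not a standard formula; the function names and sample values are made up.

```python
from statistics import mean


def cpu_per_batch(cpu_pct_samples, batch_req_samples):
    """Average CPU percent consumed per batch request/sec over a sample window."""
    avg_batches = mean(batch_req_samples)
    if avg_batches == 0:
        raise ValueError("no batch requests recorded in this window")
    return mean(cpu_pct_samples) / avg_batches


def patch_impact_pct(before, after):
    """Percent change in CPU cost per unit of throughput.

    before and after are (cpu_samples, batch_request_samples) tuples
    collected over comparable workload windows.
    """
    base = cpu_per_batch(*before)
    return (cpu_per_batch(*after) - base) / base * 100.0


# Made-up samples: similar throughput, higher CPU after patching.
before = ([40, 42, 38], [1000, 1050, 950])
after = ([46, 48, 44], [1000, 1040, 960])
print(f"CPU cost per batch request changed by {patch_impact_pct(before, after):.1f}%")
```

Comparing a ratio instead of raw CPU helps separate a real patch penalty from a day that was simply busier than usual.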
In order to have successful baselines, you need to keep those running, and ideally return the data from all of your servers into a centralized repository. You can use the relog utility from Microsoft to output this data to your chosen location for analysis. Alternatively, you can use a tool from a third-party vendor to aggregate that data for you.
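For example, relog can convert a binary counter log to CSV (`relog input.blg -f CSV -o output.csv`), and a short script can then summarize it. The sketch below assumes the usual Performance Monitor CSV layout, with a timestamp in the first column and one `\\HOST\Object\Counter` column per counter. The host name SQL01, the sample values and the `counter_averages` function are hypothetical.

```python
import csv
import io
from statistics import mean


def counter_averages(csv_text):
    """Return {counter path without host name: average value} from perfmon CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    averages = {}
    for col in range(1, len(header)):  # column 0 holds the timestamp
        # Drop the leading \\HOST so counters from different servers line up.
        name = header[col].lstrip("\\").split("\\", 1)[-1]
        values = []
        for row in samples:
            cell = row[col].strip() if col < len(row) else ""
            try:
                values.append(float(cell))
            except ValueError:
                continue  # blank or non-numeric sample
        if values:
            averages[name] = mean(values)
    return averages


# Hypothetical three-sample log from a server named SQL01.
sample = "\n".join([
    r'"(PDH-CSV 4.0) (Eastern Standard Time)(300)","\\SQL01\Processor(_Total)\% Processor Time","\\SQL01\SQLServer:SQL Statistics\Batch Requests/sec"',
    r'"01/30/2018 09:00:00.000"," ","1000"',
    r'"01/30/2018 09:00:15.000","40","1050"',
    r'"01/30/2018 09:00:30.000","44","950"',
])
for name, value in counter_averages(sample).items():
    print(f"{name}: {value:.1f}")
```

Stripping the host name is a design choice that makes it easier to load many servers into one repository; keep the host in a separate field if you need per-server comparisons.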
Wrapping Up
This was not the news any IT professional wanted to wake up to in January. Patching all of your systems is bad enough, but when you add in the potential for performance impact, it makes all of us want to go on vacation to someplace warm and sunny.
There is no avoiding these patches -- you are putting your entire infrastructure and all of your data at risk if you don't apply them. So ensure you have baselines, test out the patches in your lower environments like QA and Test, and have a good expectation of what is going to happen in your production environment. You might need more CPUs or tighter code, or maybe there will be minimal impact to your environment. Good luck and start patching.
About the Author
Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.