Joey on SQL Server
Top Tips To Make SQL Server Run Faster on Azure VMs
Storage and performance can be quite different in the cloud. It's important to choose the right VM size and configure your storage properly to keep SQL Server workloads running smoothly.
- By Joey D'Antoni
- 04/14/2021
Microsoft recently published updated guidance for running SQL Server on Azure virtual machines (VMs). I wanted to share some of the highlights of the documentation, combined with my experiences with several clients running Azure VMs.
While there are a lot of benefits to running SQL Server in a platform-as-a-service (PaaS) model with Azure SQL Database or Azure SQL Managed Instance, a sizable percentage of my customers still have workloads that require VMs. Their reasons for sticking with VMs include transaction log throughput (which can be limited in PaaS offerings), the amount of memory available and, in some cases, other SQL Server services (like Analysis Services) that run on the same VM as the database server to reduce overall costs.
With all of that in mind, let's talk about the best practices you should follow when running SQL Server in Azure VMs. (You should note that most of these practices apply to Amazon Web Services as well, under slightly different names.)
Choosing a VM Size
Azure VM sizes range from a single virtual CPU with 768MB of RAM all the way up to 416 CPUs with nearly 12TB of RAM, so the computing power to meet your needs is available if you have the budget. That is a lot to sort through; the numbers and types of VMs are constantly changing, and choosing the right one can be intimidating.
To help you narrow this list down, for the purposes of running production database servers, you can focus on three VM series: D, E and M. Specifically, check out the Dasv4 and Easv4 VM series, which have a lot of memory and good I/O throughput. Don't worry -- there's still a lot of choice within those series.
When choosing your VM, there are a few important dimensions beyond the number of CPUs and the amount of RAM that are critical to your performance. Each VM has a local SSD drive that provides fast ephemeral (i.e., disappears at reboot) storage, which, if it is large enough, can host your TempDB. Beyond the local SSD, each VM has an IOPS limit (i.e., how many reads and writes the VM can perform per second). No matter how much storage you allocate or how fast that storage is, your VM will never exceed its IOPS limit. Finally, you should make note of the amount of network bandwidth your VM has, particularly if you are using features that can consume significant bandwidth, like Availability Groups or replication.
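To make those caps concrete, here is a minimal Python sketch that checks a workload's peak requirements against a VM size's documented limits. The VM names and figures below are illustrative placeholders, not authoritative specs; always pull the current numbers from the Azure VM size documentation.

```python
# Minimal sketch: check a workload's needs against a VM size's documented caps.
# The figures below are illustrative placeholders -- always confirm the current
# numbers for your VM size in the Azure documentation.

vm_caps = {
    # name: vCPUs, RAM GB, uncached disk IOPS limit, network Gbps (hypothetical)
    "Standard_E16as_v4": {"vcpus": 16, "ram_gb": 128, "max_iops": 25_600, "net_gbps": 8},
    "Standard_E32as_v4": {"vcpus": 32, "ram_gb": 256, "max_iops": 51_200, "net_gbps": 16},
}

def fits(workload: dict, caps: dict) -> bool:
    """Return True if the workload stays under every VM-level cap."""
    return (workload["peak_iops"] <= caps["max_iops"]
            and workload["ram_gb"] <= caps["ram_gb"]
            and workload["net_gbps"] <= caps["net_gbps"])

workload = {"peak_iops": 30_000, "ram_gb": 200, "net_gbps": 4}
for name, caps in vm_caps.items():
    print(name, "fits" if fits(workload, caps) else "does not fit")
```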
One other persistent concern is licensing. When you run SQL Server on Azure, you have a couple of options. You can bring your own license or you can use a pay-as-you-go license, where you effectively rent your SQL Server license as part of your Azure bill. Either way, SQL Server is generally licensed by the CPU core, which means that in Azure, it may seem like you need to buy a lot more cores to get the amount of RAM your workload requires.
Enter constrained-core VMs, which give you the same amount of RAM, storage and I/O capacity as their parent VM size, with a reduced number of CPU cores. Your Azure compute costs remain the same for these VMs. However, you can dramatically reduce your SQL Server licensing costs by using this option.
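Here is a rough, back-of-the-envelope sketch of why constrained cores matter for the bill. The dollar figures are hypothetical placeholders, not published Azure or SQL Server rates; substitute your own pricing.

```python
# Back-of-the-envelope comparison of per-core SQL Server licensing costs on a
# full-core VM versus its constrained-core sibling. Prices are placeholders,
# not published rates -- plug in your own figures.

SQL_LICENSE_PER_CORE_MONTH = 274.0   # hypothetical per-core monthly rate
COMPUTE_COST_MONTH = 1100.0          # identical for both VMs by design

def monthly_cost(active_cores: int) -> float:
    return COMPUTE_COST_MONTH + active_cores * SQL_LICENSE_PER_CORE_MONTH

full = monthly_cost(16)        # e.g., a 16-core VM
constrained = monthly_cost(8)  # e.g., its 8-core constrained variant, same RAM
print(f"Full: ${full:,.2f}/mo, constrained: ${constrained:,.2f}/mo, "
      f"savings: ${full - constrained:,.2f}/mo")
```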
It's All About the Storage
The biggest difference between running your workloads on-premises and in the cloud is storage. Virtually all cloud storage is networked, which means you will have slightly higher latency than in a perfect, on-premises world.
There are trade-offs here. You will never need to wait two months for your storage vendor to deliver the few extra terabytes you need to complete your backups. On the other hand, you will probably not get the performance of a fibre channel-attached, all-flash SAN with its sub-millisecond latencies.
However, there are a lot of things you can do to deliver very good storage performance for your Azure VMs. My first recommendation is to use only premium managed disks or ultra disks for your data and transaction log file storage. You can use Azure Blob Storage for backups, which are sequential in nature, but for data and log files, standard storage (and even standard SSD) will simply not deliver adequate performance for even the smallest database workloads.
In terms of premium storage, the best cost-to-IOPS ratio typically comes from P30 disks, which are 1TB and rated at 5,000 IOPS. As disk sizes go up, the monthly cost scales mostly linearly with capacity, but the IOPS do not: stepping up from a 1TB P30 to a 2TB P40, for example, only takes you from 5,000 to 7,500 IOPS, a 50 percent increase. Barring extremely high storage volumes (greater than 50TB), you should allocate enough P30 disks to meet your IOPS and capacity needs. You should create separate volumes for data files, transaction log files and TempDB (more on TempDB below).
To create larger volumes, you can use Storage Spaces in Windows to pool disks, which gives you the sum of their IOPS and capacity. If you pool six P30 disks, you get 6TB and 30,000 IOPS. You can -- and should -- squeeze additional effective IOPS out of that storage by enabling read caching on your data file volumes and disabling caching on your transaction log volumes. Ultra disk simplifies this calculus by letting you choose IOPS, bandwidth and capacity independently; note, however, that ultra disk does not support caching.
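A quick sketch of that sizing math, using the P30 figures above: the pool needs enough disks to cover whichever target, IOPS or capacity, is the more demanding one.

```python
# Sketch of the P30 sizing math from above: how many 1TB P30 disks (5,000 IOPS
# each) does a Storage Spaces pool need to satisfy both the IOPS and capacity
# targets? The pool sums both dimensions across its member disks.
import math

P30_TB, P30_IOPS = 1, 5_000

def p30_disks_needed(capacity_tb: float, iops: int) -> int:
    return max(math.ceil(capacity_tb / P30_TB), math.ceil(iops / P30_IOPS))

# The example from the article: six P30s give 6TB and 30,000 IOPS.
n = p30_disks_needed(capacity_tb=6, iops=30_000)
print(n, "disks ->", n * P30_TB, "TB and", n * P30_IOPS, "IOPS")
```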
Remember that your VM has a local SSD drive that gets wiped clean every time the VM is deallocated. While this is not a good fit for your database or log files, it is a great location for TempDB, which is recreated every time SQL Server restarts. However, you should note that the local SSD also hosts the read cache for your data disks. There's no simple way to monitor how much of that disk the cache is using, so I typically recommend monitoring the latency of your TempDB files. If you see it exceed 20 to 30 milliseconds, consider either moving to a larger VM or relocating your TempDB files to a dedicated storage volume.
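If you want to automate that latency check, the sys.dm_io_virtual_file_stats DMV exposes cumulative stall and I/O counts per file. Below is one possible approach in Python using pyodbc; the server name and connection details are placeholders, and because the counters accumulate from the last restart, the numbers are long-run averages rather than a point-in-time reading.

```python
# Spot-check TempDB file latency via sys.dm_io_virtual_file_stats. These
# counters are cumulative since the last SQL Server restart, so treat the
# output as an average, not a live sample.
import pyodbc

QUERY = """
SELECT f.name,
       vfs.io_stall_read_ms  * 1.0 / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.io_stall_write_ms * 1.0 / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS vfs
JOIN tempdb.sys.database_files AS f ON f.file_id = vfs.file_id;
"""

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"  # placeholder server
    "DATABASE=master;Trusted_Connection=yes;")
for name, read_ms, write_ms in conn.execute(QUERY):
    flag = "  <-- investigate" if max(read_ms or 0, write_ms or 0) > 20 else ""
    print(f"{name}: read {read_ms or 0:.1f} ms, write {write_ms or 0:.1f} ms{flag}")
```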
Anything else you can do to reduce your I/O workload is helpful. As a DBA, I am more inclined to use features like data compression (which reduces the number of pages a query has to read) and instant file initialization (which skips zeroing out newly allocated data files), both of which reduce your required I/O. I also lean toward being more aggressive with indexing data than I might be if I had faster storage.
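As a hedged illustration, here is how page compression might be enabled on a single index from Python. The table and index names are invented for the example, and ONLINE = ON assumes an edition that supports online rebuilds; it's worth running sp_estimate_data_compression_savings first to confirm the trade-off.

```python
# Enable page compression on one index to cut read I/O. Object names are
# placeholders; verify expected savings with sp_estimate_data_compression_savings
# before rebuilding anything large.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"  # placeholder server
    "DATABASE=MyDatabase;Trusted_Connection=yes;", autocommit=True)
conn.execute("""
    ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
""")
```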
Moving to the cloud has many challenges. However, many administrators think moving from on-premises VMs to cloud-based VMs is trivial. While a lot of things will stay the same in a lift-and-shift migration, storage and performance can be quite different in the cloud. It's important to understand how to choose the right VM size and how to configure your storage properly to ensure your SQL Server workloads run smoothly.
About the Author
Joseph D'Antoni is an architect with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert, and a frequent speaker at PASS Summit, Ignite, Code Camps and SQL Saturday events around the world.