Joey on SQL Server
The State of Microsoft Data, Part 1: Transactional Database Solutions
Let's kick off 2024 by checking in with the current status of some of Microsoft's more popular database options, like Azure SQL Database, Azure Managed Instance and more.
- By Joey D'Antoni
- 01/04/2024
Happy New Year's to my readers. To start out 2024, I wanted to review the current state of Microsoft Data Platform, including Azure services, and to discuss options on Amazon Web Services (AWS). One of the challenges of cloud services is the rapid pace of change, which means that the best solution for your organization right now might not be the best solution if you are building in six months. The topic was big enough to merit two columns split between analytics and transactional system, since they have significantly different requirements and platform options.
With that, let's take a look at how Microsoft's transactional database solutions are faring.
Cosmos DB
Cosmos DB is a non-relational database that runs as a platform as a service offering on Azure -- there is no on-premises solution (there is a Docker container to allow you to develop locally), so choosing the service means you are going to be on it for the long haul. One of the cooler features of Cosmos is that it supports multiple data models -- graph, key/value, relational and document databases -- all running under the same service and deployment model.
Cosmos most recently added support for a PostgreSQL API. The traditional challenge with Cosmos DB was a pay-as-you-go pricing model. However, recently Cosmos began offering vCore pricing on selected APIs, making the cost of the service more predictable. Cosmos is best used for applications that can benefit from low latency and different consistency models like IoT services, Web, mobile application and gaming.
The closest AWS service to Cosmos DB is Amazon's Dynamo DB service, which offers similar NoSQL options and a similar pricing model.
Azure Database for PostgreSQL and MySQL
While I frequently caveat my recommendations for platform-as-a-service databases with a lot of questions, in the case of PostgreSQL and MySQL, I have one basic question: do you have a dedicated engineering team to managing these databases? If the answer is no, I nearly always recommend using a cloud service, no matter what cloud you are in (Amazon's equivalent services here are RDS and Aurora). The biggest reasons for my recommendation of a managed service here is that backups and high availability are complex and require a lot of hands and experience, at least in the open source versions of those databases.
Azure SQL Database
Azure SQL Database is one of the longest standing Azure services and has matured into one of the best platform as a service databases, in my opinion. Azure SQL DB is a fully managed service, where backups, patching and high availability are all managed by the service. Azure SQL DB is like traditional SQL Server, with a few major caveats: since cross-database queries aren't directly supported, there is no SQL Agent (there are several options for scheduling jobs), CLR is not supported and you can't restore a backup from SQL Server. This means the best use case for Azure SQL DB is for new database development, where you can manage the code within those limitations.
Azure SQL Database and Managed Instance both have similar pricing models. The main service offerings are General Purpose and Business Critical. There is an important difference in storage between the two tiers -- General Purpose uses network-attached SSD storage, while Business Critical uses local SSD storage Hyperscale, which allows databases to grow beyond the 16 TB that General Purpose and Business Critical allow.
Microsoft recently lowered prices on the Hyperscale tier, meaning it should have roughly the equivalent pricing or lower as General Purpose. There is also a serverless option which allows for auto-pause, and auto-scaling of the service. Serverless is excellent for infrequently used development databases, but you should note that the cost is higher per second than provisioned tiers, so if your database is running all the time, you would likely be better served than a provisioned tier. One final note on Azure SQL DB, is once your data is in the service, there is no equivalent on-premises or VM option to easily move back to.
There is no real equivalent of Azure SQL Database on AWS.
Azure Managed Instance
I wrote
an article a few months ago on both Managed Instance and Amazon RDS -- both of these are platform as a service offerings which aim to be a managed version of SQL Server. I feel like one of the challenges of the design is, unlike Azure SQL DB, which is mostly designed around a single database model, Managed Instance is designed for migrating existing SQL Servers into a cloud service. This works well if your SQL Server fits nicely into the memory, IOPs, tempdb and CPU allocations for your service tier.
As I mentioned in my column, the biggest challenge of both Managed Instance and RDS are the pricing, as the service scales. To scale any one of those dimensions, you must scale the entire service offering. For example, if you need to add storage capacity or performance, you need to add additional cores (which come with the effective cost of SQL Server licensing).
The practical application of this is that the service is good for median workloads. But for edge-case workloads that are pushing the envelope in terms of performance may be better suited to VMs, where you have more granular sizing control. Managed Instance does allow you to move easily back to SQL Server as long as you are using SQL Server 2022.
SQL Server on VMs
I want to highlight this service on Azure—this team has been doing a great job of both publishing good performance information and building new features to make bring this offering closer to the platform as a service offerings. You can easily deploy Availability Groups from a template, have managed SQL Server backups and some level of automated patching (this work is ongoing). Additionally, as mentioned, there is more granular control around storage performance, as well as memory and cores -- this means using VMs can be more cost-effective than a managed service.
AWS EC2 VMs offer similar performance and cost options to Azure VMs, however, they do lack some of the automation features around patching and backups.
Cloud database services are constantly evolving, and it's important to understand what the current best practices are. Choosing the right solution requires a mix of experience, understanding, and prognostication, as well as risk tolerance.
Stay tuned for my next column as we delve into the current state of analytical systems in the Microsoft data platform.
About the Author
Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.