Joey on SQL Server
Q&A: How Microsoft Is Raising Azure Arc's Data Services Game
Ignite 2020 saw the public preview of Azure Arc enabled data services, the latest step in Microsoft's bid to demystify multicloud. Principal program manager Travis Wright explains how it works.
- By Joey D'Antoni
- 09/29/2020
Microsoft Ignite 2020 was last week, and while it wasn't a release year for SQL Server, there were still several interesting announcements at the virtual conference.
The general availability of Azure SQL Edge was announced, along with some performance improvements to the Azure SQL Managed Instance Platform as a Service (PaaS) offering. One other announcement -- and the focus of this article -- was the public preview of Azure Arc enabled data services, which allows you to run either Azure SQL Managed Instance or Azure PostgreSQL Hyperscale across on-premises datacenters, multicloud scenarios and edge computing scenarios.
Azure Arc services can be a bit confusing to newcomers, but the general premise across all of the services is that Azure Arc provides a single control plane that uses Azure services to manage all of your Azure Arc enabled resources, no matter where those resources live. Currently, Azure Arc supports management of virtual machines, Kubernetes clusters and the aforementioned Azure data services. This allows you to take advantage of features in Azure Resource Manager like role-based access control, resource tagging and automation, and to have PaaS resources that can run anywhere.
I recently had a chance to talk with Travis Wright, principal program manager of SQL Server at Microsoft, about Azure Arc enabled data services.
D'Antoni: What are the scenarios where you have been seeing customers implement Azure Arc enabled data services? Is it mostly on-premises or are you seeing multicloud deployments?
Wright: At this point, I'd say that the most common use case we see is for on-premises Database as a Service [DBaaS], but one of the first customer implementations that will go into production will be on AWS EKS [Amazon Elastic Kubernetes Service].
While customers may start in one place like on-premises, part of the appeal of Arc enabled data services is that they can deploy and manage in multiple clouds as they proceed in their hybrid cloud journey. Arc enabled data services also future-proofs things a bit. For example, if a customer later acquires a company that runs on another cloud, that company's databases can still be managed in the same way.
D'Antoni: One of the touted benefits of Azure Arc data services is evergreen SQL. Can you explain what that means and how it's implemented in Azure Arc and the Kubernetes framework?
Wright: If you think about SQL Server, it is a versioned product. The features in a given release of SQL Server are what they are. We release updates to that version over time, but those are really only bug fixes, not features. After five years, a major SQL Server version goes into extended support, meaning there are only security fixes for another five years and then it goes out of support.
Ten years is a long time these days, but we still have a lot of customers that, for various reasons, are "stuck" on an older version of SQL Server and can't get off of it. The idea with "evergreen SQL" is twofold: first, to provide customers with continuous updates, both bug fixes and new features, like we do in the cloud; and second, to make the process of upgrading as painless as possible, because it is a very small, incremental update each month as opposed to a big upgrade process every two to three years. The process of updating is fully automated with near-zero downtime.
D'Antoni: Can you explain the data controller and how that works to help provision other resources?
Wright: The data controller is really just a set of Kubernetes pods that provide the orchestration services for things like provisioning, deprovisioning, scaling, backup/restore, monitoring, HA [high availability], et cetera. One of those pods, called the "bootstrapper," is responsible for monitoring for requests to create custom resources like SQL managed instances or PostgreSQL Hyperscale server groups. When requests to deploy those custom resources are submitted to the Kubernetes API server, Kubernetes hands them off to the bootstrapper to process.
The bootstrapper validates the request and applies some logic to it to determine the right thing to do, and then tells Kubernetes what to do, using Kubernetes primitives like statefulsets, services and persistent volumes. This simplifies the user experience because people making these requests don't have to understand the primitives. They just say, "I want to create a SQL managed instance with 16 cores and 256GB of RAM." The data controller takes care of translating that request into something Kubernetes can understand.
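To make that translation step concrete, here is a minimal Python sketch of the idea. It is not the data controller's actual code or API. The `SqlInstanceRequest` type, the `validate` and `translate` functions, the container image name, the mount path and the 100GB storage default are all hypothetical, chosen only for illustration. It shows how one simple request ("16 cores and 256GB of RAM") could be validated and expanded into the kind of Kubernetes primitives Wright mentions: a statefulset with a persistent volume claim template, and a service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlInstanceRequest:
    """Hypothetical high-level request, e.g. '16 cores and 256GB of RAM'."""
    name: str
    cores: int
    memory_gb: int
    storage_gb: int = 100  # assumed default, not from the article


def validate(request: SqlInstanceRequest) -> None:
    """Reject obviously invalid requests before anything is created."""
    if not request.name:
        raise ValueError("name must not be empty")
    if request.cores < 1 or request.memory_gb < 1 or request.storage_gb < 1:
        raise ValueError("cores, memory_gb and storage_gb must be positive")


def translate(request: SqlInstanceRequest) -> list[dict]:
    """Expand one request into StatefulSet and Service manifests."""
    validate(request)
    labels = {"app": request.name}
    resources = {"cpu": str(request.cores), "memory": f"{request.memory_gb}G"}
    stateful_set = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": request.name},
        "spec": {
            "serviceName": request.name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "sql",
                        # Placeholder image name, purely illustrative.
                        "image": "example.invalid/sql-instance:placeholder",
                        "resources": {"requests": resources, "limits": resources},
                        "volumeMounts": [{"name": "data", "mountPath": "/var/opt/data"}],
                    }],
                },
            },
            # Each replica gets its own persistent volume claim.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": f"{request.storage_gb}G"}},
                },
            }],
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": request.name},
        # 1433 is the default SQL Server port.
        "spec": {"selector": labels, "ports": [{"port": 1433}]},
    }
    return [stateful_set, service]


if __name__ == "__main__":
    for manifest in translate(SqlInstanceRequest("sql1", cores=16, memory_gb=256)):
        print(manifest["kind"], manifest["metadata"]["name"])
```

The person making the request only fills in a name and a size; everything about statefulsets, services and volumes stays inside `translate`, which is the division of labor the data controller is described as providing.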
In my opinion, Azure Arc is a natural evolution from Azure Stack, which requires customers to purchase specific hardware through partners, is challenging to implement outside of large enterprises, and is slow to keep pace with Azure feature enhancements. Since Azure Arc is a way to run Azure services anywhere and uses container-based deployment, it can stay up-to-date as features get added to the service. Even if you need your deployment to be disconnected from the Internet, Azure Arc supports a private container repository that can sync with Azure.
The ability to host your own PaaS services with features like system-managed backups and built-in high availability provided by Kubernetes -- and not having to ever worry about SQL Server upgrades and patches -- will be very attractive to a lot of customers. The other benefit of using Azure Arc enabled services (whether they be data, virtual machines or Kubernetes) is that the Azure Portal and Azure tools like Monitor and Security can be used to manage resources wherever they are.
I'm generally skeptical of organizations implementing multicloud solutions, as they are really complex -- even the networking alone. However, one of the promises of using Kubernetes as a deployment platform is that you can deploy your containers and pods on any Kubernetes cluster, whether it be on a Raspberry Pi on your desktop or across multiple public cloud providers. Azure Arc extends this by providing a single control plane and managed services options, greatly reducing the complexity of a multicloud or hybrid deployment. Microsoft sees a lot of growth here and I would expect to see continued investment from Microsoft in this space.
About the Author
Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.