Joey on SQL Server

How Telemetry Is Making SQL Server and Azure SQL Better: Part 1

In the first of this two-part series, Microsoft's lead SQL architect explains how the company collects -- and secures -- telemetry data from Azure SQL Database customers to improve its engineering process.

In recent releases of SQL Server, Microsoft has delivered many feature enhancements and performance improvements. This rapid delivery has been facilitated by several development changes at Microsoft: the use of software telemetry in the on-premises SQL Server product, enhancements in the automation of the testing process, and the use of Azure SQL Database as a testing ground, both for new features and for overall performance analysis.

Azure SQL Database typically receives new features ahead of SQL Server, which allows Microsoft to see how these features operate in the wild and to understand how widely they are being used. While some customers may be concerned about this level of examination from Microsoft, Microsoft has been completely transparent about its telemetry services and the data it collects. This telemetry collection is audited, just like all of the other security aspects of the Azure platform.

I sat down for a two-part interview with Conor Cunningham, partner architect for SQL Server/Azure SQL at Microsoft, to talk about telemetry and how it is used to improve the product. In Part 1, we focus on the aspects of Azure SQL Database (also known as SQL Azure) that help improve the engineering process for the entire product.

D'Antoni: Azure SQL Database offers the ability to have some deeper security controls over SQL Server (e.g., threat detection, universal authentication). Can you talk about those features and how they make the service more secure?
Cunningham: Universal Authentication is an evolution of Integrated Authentication in SQL since there are now cloud-hosted Active Directories and ADFS-enabled mechanisms to share log-in information in hybrid topologies. Customers like this because it gives them control over the log-ins to SQL from a central administrative endpoint when there are multiple databases in a solution (or multiple tiers in a solution). We have been building a lot of integration with Azure mechanisms here to give customers options when they work in Microsoft's cloud.

Threat Detection is a feature that watches the patterns of queries being executed and looks for anomalies like people trying to do SQL injection attacks against dynamic SQL in your application. It then notifies you if there were attempts like this against your database. Early customer feedback has been great, and we will continue to invest to give customers a premium security experience in our PaaS SQL offerings. When you buy SQL Azure, you're not just buying a VM hosting SQL -- you are buying a service with smarts that are there to back you up.

In terms of telemetry, can you talk about the collection process within Azure?
We have a similar classification matrix in SQL Azure to SQL Server, though there are some differences. If we "collect" data that is sensitive, we have that stored in a highly protected system for restricted purposes (troubleshooting, generally). For example, if a customer is getting an error in SQL Azure, we record those into a system that keeps the data behind a two-factor restricted firewall so that our engineers can respond to customer requests when customers open service requests. Error messages can, in a few cases, contain values that customers were trying to insert, so we treat them all as if they were customer content. 

We separately also have a set of telemetry around error messages where we emit the values as you see them in sys.messages (in other words, without any customer values at all), and we use this more broadly -- engineering teams can use this in their daily activities to see if the feature they are building is working. So, the Query Store team looks at the most common errors that are being emitted and then funds work on bugs based on how commonly an issue is hit by customers.
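To make the split Cunningham describes concrete, here is a minimal Python sketch of how error events might be reduced to value-free telemetry and ranked by frequency. It is an illustration only, not Microsoft's implementation: the `TEMPLATES` table is a made-up stand-in for a few rows of sys.messages, and the message IDs, `ErrorEvent` type and function names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical stand-in for sys.messages: message id -> template text.
# The ids and wording are invented for this sketch.
TEMPLATES = {
    50001: "Cannot insert duplicate key '%s' in object '%s'.",
    50002: "Conversion failed when converting value '%s'.",
}


@dataclass(frozen=True)
class ErrorEvent:
    message_id: int
    # The formatted text may embed customer values, so in this sketch it
    # would only ever be written to the restricted troubleshooting store.
    formatted_text: str


def to_broad_telemetry(event: ErrorEvent) -> tuple[int, str]:
    """Keep only the message id and its template, dropping customer values."""
    return event.message_id, TEMPLATES.get(event.message_id, "<unknown message>")


def rank_errors(events: list[ErrorEvent]) -> list[tuple[tuple[int, str], int]]:
    """Count value-free error records, most frequent first."""
    return Counter(to_broad_telemetry(e) for e in events).most_common()


events = [
    ErrorEvent(50001, "Cannot insert duplicate key 'alice' in object 'dbo.Users'."),
    ErrorEvent(50002, "Conversion failed when converting value 'abc'."),
    ErrorEvent(50001, "Cannot insert duplicate key 'bob' in object 'dbo.Users'."),
]

for (message_id, template), count in rank_errors(events):
    print(message_id, count, template)
```

The ranked output carries only the template text, so a team could use it to prioritize bug work without ever seeing the values a customer tried to insert.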

We do not have a way for customers to directly see the telemetry being collected in SQL Azure today, but we do have a large set of certifications that govern how we collect and treat customer data, including "telemetry" data. You can see this at the Azure Trust Center. Certifications are really about the whole solution, from datacenter design and security, all the way up to the processes and controls we use to manage the service. There is a TechNet document that lists each Azure service and which certifications they have.

Can you give some concrete examples of features that have been introduced in Azure SQL Database/SQL Server because of telemetry?
All features we build today use telemetry to validate proper function in Azure or in SQL Server. Some features specifically use telemetry as part of the feature. 

One cool example in SQL Azure is that we can recommend, create and validate indexes for your application automatically. This works by watching the telemetry from the query patterns for each database to infer what kinds of indexes would help improve performance. While we've had that "recommendation" capability for many releases in SQL Server as Dynamic Management Views, the feature in SQL Azure builds a model of performance before and after and "proves" that the performance of the application improves when the index is added. If it doesn't improve, it reverts the index for you, as well. 
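The create, validate and revert loop described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the service's actual algorithm: it compares median query latency before and after the index, treats a 10 percent drop as "improved," and uses a hypothetical `FakeDatabase` class in place of a real workload.

```python
from statistics import median
from typing import Callable, Sequence


def validate_index(
    create_index: Callable[[], None],
    drop_index: Callable[[], None],
    measure_latencies_ms: Callable[[], Sequence[float]],
    min_improvement: float = 0.10,  # assumed threshold, not a documented value
) -> bool:
    """Create an index, keep it only if median latency improves enough."""
    before = median(measure_latencies_ms())
    create_index()
    after = median(measure_latencies_ms())
    improved = after <= before * (1 - min_improvement)
    if not improved:
        drop_index()  # revert: the index did not pay for itself
    return improved


class FakeDatabase:
    """Hypothetical stand-in for a database with a measurable workload."""

    def __init__(self, without_index: Sequence[float], with_index: Sequence[float]):
        self.has_index = False
        self._without = list(without_index)
        self._with = list(with_index)

    def create_index(self) -> None:
        self.has_index = True

    def drop_index(self) -> None:
        self.has_index = False

    def measure(self) -> Sequence[float]:
        return self._with if self.has_index else self._without


helpful = FakeDatabase(without_index=[40, 42, 41], with_index=[10, 11, 12])
unhelpful = FakeDatabase(without_index=[40, 42, 41], with_index=[40, 41, 43])

kept = validate_index(helpful.create_index, helpful.drop_index, helpful.measure)
reverted = validate_index(unhelpful.create_index, unhelpful.drop_index, unhelpful.measure)
print(kept, helpful.has_index)        # index kept when latency drops
print(reverted, unhelpful.has_index)  # index dropped when it does not help
```

A real system would need a longer observation window and would have to account for write overhead and workload drift; the sketch only captures the "prove it or revert it" idea.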


Check out Part 2 of this interview, where we focus more on on-premises SQL Server. We discuss how these processes have improved the engineering process, and how Microsoft is focusing on the customer value being delivered, rather than just building new features. 

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.
