Joey on SQL Server

How Telemetry Is Making SQL Server and Azure SQL Better: Part 2

More from Microsoft's lead SQL architect about how telemetry data is helping the company deliver regular, targeted and highly requested feature improvements at massive scale.

In the second part of this interview with Microsoft partner architect Conor Cunningham, we talk about Microsoft's use of telemetry to improve SQL Server and Azure SQL. We also discuss the changes the company has made to its development processes that have allowed more automated testing and faster development.

SQL Server 2016 and 2017 were both built with this process, and both delivered in a major way in terms of new features. SQL Server 2016 introduced many new features, including Query Store, which provides a flight data recorder for your query performance; PolyBase, which allows for interaction with Big Data systems like Hadoop; and support for JavaScript Object Notation (JSON) data. In SQL Server 2017, Microsoft brought SQL Server to a whole new operating system, Linux, and introduced support for machine learning using Python.
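For readers who haven't tried these features, here is a minimal T-SQL sketch of Query Store and the JSON support, assuming a hypothetical database named Sales (the Query Store catalog views shown are the documented ones):

```sql
-- Enable Query Store on a hypothetical database named Sales
ALTER DATABASE Sales SET QUERY_STORE = ON;

-- Find the five slowest queries Query Store has captured
SELECT TOP (5)
       qt.query_sql_text,
       rs.avg_duration
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON p.plan_id = rs.plan_id
ORDER BY rs.avg_duration DESC;

-- Built-in JSON support: shape a result set as JSON
SELECT name, object_id
FROM sys.tables
FOR JSON PATH;
```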

In addition to all of those features, Microsoft delivered an automated tuning feature, similar to what is in Azure SQL Database.
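As a sketch of what that automatic tuning feature looks like from the DBA's side (syntax as documented for SQL Server 2017 and Azure SQL Database):

```sql
-- Let the engine force the last known good plan when it detects
-- a plan-choice regression (SQL Server 2017+ / Azure SQL Database)
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- Review the tuning recommendations the engine has generated
SELECT reason, score, details
FROM sys.dm_db_tuning_recommendations;
```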

D'Antoni: What data are you collecting in both on-premises SQL Server and Azure [SQL Database], and is it possible for the customer to see their telemetry data?
Cunningham: For SQL Server, we generically collect anonymized usage data to help us improve the product. As an example, for a new feature such as the Clustered Columnstore index, we track some of the internal metrics of the feature such as segment size and compression ratio. We found that there were some insert patterns that caused poor compression ratios which resulted in scan performance being about half of target. We were able to roll in some fixes that improved the compression in this case (thus improving performance). For this kind of information, we don't know who has the Columnstore, but we have a view of how many customers use them and some metrics about the aggregate experience.

SQL Server 2016 and above expose an audit feature where you can see everything being collected. We also updated the privacy statement for SQL Server 2016+ to have an explicit statement of the kinds of information we do and do not collect. We specifically don't collect data from inside user tables, user names, logins, passwords or other sensitive information from customers running SQL Server.

Effectively, this is a mechanism that we can use to get better engagement between our engineering team and our customers, and to make sure that we're delivering features that work for our customers, since we can integrate this into our engineering experience. This is a much faster loop for us than waiting for customers to give feedback after we release the product.

Can you talk about the process of how features make it from development into both products?
SQL Server, SQL Azure and SQL DW [Data Warehouse] all share the same development branch. We have feature switches (like trace flags) that control what capabilities are enabled in a given product. Engineers make changes in the development branch and keep the functionality "off" during a deployment of new code to clusters, and then it is later enabled in a subsequent step (and then often just for a few customers at first to make sure that things are working properly). 
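The customer-visible analogue of these internal feature switches is the trace flag, which can be inspected and toggled with DBCC. A minimal sketch (trace flag 7412, which enables lightweight query profiling in SQL Server 2016 SP1+, is shown only as an example; check the documentation before enabling any flag in production):

```sql
-- List all trace flags currently enabled globally
DBCC TRACESTATUS (-1);

-- Example: enable lightweight query profiling globally (2016 SP1+)
DBCC TRACEON (7412, -1);
```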

The capabilities we build are tested against all flavors of SQL by a large automated test infrastructure that runs all the time over thousands of machines, with different gates and levels of testing. So, a daily test run has many hundreds of thousands of tests, and we constantly monitor the output on each team. Regressions are usually found and fixed in the development branch. We then prepare a branch of the code specifically for deployment to a service (or product, in the case of SQL Server) and start doing validation runs on another set of thousands of machines for that code branch. We port fixes into that branch until we are ready and then, eventually, start rolling the code out into production (for services) or releasing a completed SQL Server installer.

This mechanism lets us create most features and test them in production first in SQL Azure and have them roll into the next major SQL Server release. We also get feedback from the monthly CTPs [community technology previews] for SQL Server. Sometimes, we will make a business choice to ship something only in one release vehicle, but we try to ship things everywhere whenever we can.

How has Azure SQL Database changed the way development occurs?
We have made major changes in our engineering team over the past five years. We went to a unified engineering model at Microsoft (instead of separate developer and test roles). This took some adjustment, but we were able to make it work through a combination of better, faster automated testing, as well as using telemetry to get faster feedback from customers about whether they liked the features we were building. Now we don't stop building a feature until we believe it has delivered actual value. That's very different from 10 years ago, when we would build features and largely hope that we were smart enough to figure out what customers would need once things got into the market.


As you can see from this series, Microsoft has made a great effort to improve and modernize its development processes, and to deliver new features more quickly. SQL Server and Azure SQL Database have introduced a lot of new features, not to mention checked off a lot of long-requested minor enhancements (e.g., partition-level truncation, the STRING_SPLIT function and many others).
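Both of those long-requested enhancements are one-liners in practice; a sketch, assuming a hypothetical partitioned table named dbo.SalesHistory:

```sql
-- Partition-level truncation (SQL Server 2016+): empty only
-- partitions 1 through 3 of a hypothetical partitioned table
TRUNCATE TABLE dbo.SalesHistory WITH (PARTITIONS (1 TO 3));

-- STRING_SPLIT (SQL Server 2016+, compatibility level 130):
-- split a delimited string into rows
SELECT value
FROM STRING_SPLIT('red,green,blue', ',');
```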

Delivering enterprise software at this release cadence is impressive, and this discussion with Cunningham gives some deeper insight into what it takes, from both a process and a data perspective, to deliver high-value, high-quality software at scale.

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.
