In-Depth

Tour de SQL Part III: Performance Tuning

SQL Server 2005 gives you new tools to help your database performance stay ahead of the pack.

Tour de SQL Series

> Tour de SQL Part I: Versions
> Tour de SQL Part II: Administrative Features
> Tour de SQL Part III: Performance Tuning (currently viewing)
> Tour de SQL Part IV: Replication

The riders in the Tour de France are probably the most monitored athletes in the world. In preparation for the race, every ounce of food a rider consumes is specially formulated to give him maximum performance. They spend hours in wind tunnels, constantly tweaking bikes, helmets and other gear to make sure they're as aerodynamic as possible. Heart rate, oxygen intake and dozens of other bodily functions are closely watched to ensure top performance during the most grueling race on the planet.

Performance tuning like that wasn't available in past versions of Microsoft's SQL Server, when tuning was regarded as a black art. An index here, a few design changes there, toss in some query hints, do a quick SQL dance and voilà! You had a tuned database. However, like so many other aspects of SQL Server 2005, the ability to adjust performance has been mightily enhanced (but not with steroids).

On the third leg of our Tour de SQL, we'll show you how to pump up the performance of Microsoft's best-ever database, giving you the edge you need to stand on the podium and kiss supermodels, like the winners of each leg of the Tour.

A Sharper Profile
An old SQL tool you probably won't recognize in its SQL 2005 incarnation is the SQL Server Profiler, shown in Figure 1 below. Profiler still has the same goal as before -- tracing your SQL Server so that you can tune or troubleshoot -- but the look, feel, and capabilities have come a long way.

You can now trace Analysis Services and Integration Services packages, and almost everything is XML based. Trace results can be saved as .XML files, and trace definitions are in XML, expanding Profiler's capabilities to other programming interfaces. While tracing to files, Profiler will roll over to additional files when the specified maximum file size is reached. While this isn't new, the ability to replay all rolled over files in order is new to Profiler.

Figure 1. The SQL Profiler now integrates with Windows Performance Monitor to provide a more complete picture of server activity.

But that's not even the best thing Profiler can do: It can now build correlations to performance monitor counters taken at the same time as the trace. Once you have a trace and a performance monitor log saved to a file, you simply open the trace and import the performance data. The result is a split screen with your trace results on top and performance counter on the bottom.

Now clicking on the trace takes you to the corresponding section of your performance graph, and vice versa. This allows DBAs to get a quick picture of what was running when the CPU spiked to 100 percent.

A More Capable Advisor
Another performance tool that has long needed a makeover is the Index Tuning Wizard. SQL Server 2005 introduces a new tool, the Database Engine Tuning Advisor, which replaces the functionality of the Index Tuning Wizard and provides even more information than its predecessor.

In a nutshell, the Advisor (Figure 2) examines workloads (T-SQL scripts or SQL Profiler results) and provides recommendations to add, remove or modify physical structures. The recommendations are made regarding clustered indexes, non-clustered indexes, indexed views and partitioning schemes. In addition, the Advisor examines more events and physical structures than its predecessor, allowing it to make better, more complete recommendations.

Figure 2. The all-new Database Engine Tuning Advisor boasts many more options for performance enhancement than its predecessor.

It goes even further, with the ability to analyze workloads run against multiple databases. An adage of the tuning trade is that too many indexes can be just as bad as too few. The old Index Tuning Wizard didn't apply this principle very well, and would often recommend adding index after index until the performance started to suffer. The Advisor, however, is designed to take into account multiple structures and can analyze the interaction of these structures. After the Advisor is run, the user is given T-SQL scripts that can be used to implement all, or a subset of, the recommendations.

Indexing: Bigger and Better
Assuming you have a decent database design and the queries running against your database weren't written by monkeys, most of your tuning will be done with indexing. Many improvements to the indexing process in SQL 2005 not only allow for better performance, but also greater availability.

First off, the syntax for working with indexes has changed a little. There's now an ALTER INDEX command that can rebuild, reorganize, disable or change indexes. This has effectively replaced the old DBCC index commands. Now, instead of using DBCC DBREINDEX, you use ALTER INDEX REBUILD. Likewise, DBCC INDEXDEFRAG has been replaced by ALTER INDEX REORGANIZE.
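To make the old-to-new mapping concrete, here's a sketch of both styles side by side. The table and index names (dbo.Orders, IX_Orders_CustomerID) are hypothetical stand-ins for your own objects:

```sql
-- Old style (still supported in SQL 2005, but on its way out):
DBCC DBREINDEX ('dbo.Orders', 'IX_Orders_CustomerID');
DBCC INDEXDEFRAG (0, 'dbo.Orders', 'IX_Orders_CustomerID');

-- New style: rebuild or reorganize via ALTER INDEX
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REORGANIZE;

-- ALTER INDEX can also work on every index on a table at once:
ALTER INDEX ALL ON dbo.Orders REBUILD;

-- Or disable an index without dropping its definition:
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders DISABLE;
```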

The most widely anticipated change to indexing is the addition of online indexing (available only in Enterprise Edition). Using the ONLINE option of the index commands, most indexing operations no longer block users, and the underlying table or view stays online and accessible. This can greatly reduce the amount of downtime in your SQL environment.
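An online rebuild is just one extra option on the same ALTER INDEX command. Again, the object names here are hypothetical:

```sql
-- Rebuild without taking the table offline (Enterprise Edition only):
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (ONLINE = ON);
```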

But be aware that the performance of online indexes involves tradeoffs. During an online indexing operation performance is degraded -- how much depends on the user load and the object and index size. Online indexing is recommended by Microsoft in most cases, but every environment is different; I urge you to carefully test this feature before implementation.

Index operations on SQL 2005 Enterprise Edition can also take advantage of multiprocessor servers. Utilizing this parallelism can improve the overall performance of your indexing operations and minimize the aforementioned period of degraded performance.
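The degree of parallelism can be capped per operation with the MAXDOP option; a minimal sketch, assuming the same hypothetical index as above:

```sql
-- Let the rebuild use up to four processors while staying online:
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (ONLINE = ON, MAXDOP = 4);
```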

Indexed views are still around in SQL 2005, and indexing in general has been improved:

  • The query optimizer can make better decisions, providing better performance than before.
  • Two new index options allow for more granular locking control. By using ALLOW_ROW_LOCKS or ALLOW_PAGE_LOCKS you can control at what level locking will occur for your index.
  • The XML datatype has been introduced, allowing XML documents to be stored in columns. XML documents can be very large, since they themselves usually represent entire data entities. Leaving no stone unturned, SQL Server 2005 provides the ability to create indexes on your XML columns.
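The last two bullets can be sketched as follows. The table, column and index names are hypothetical, and an XML index requires that the table already have a clustered primary key:

```sql
-- Granular locking control: disallow row locks on this index
CREATE NONCLUSTERED INDEX IX_Orders_Status
ON dbo.Orders (OrderStatus)
WITH (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = ON);

-- Index an XML column to avoid shredding the whole document at query time
CREATE PRIMARY XML INDEX PXML_Orders_Details
ON dbo.Orders (OrderDetailsXml);
```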

While indexing has changed, it shouldn't seem completely foreign. In the short term, all the old commands are still in place for index maintenance and operation, allowing you to ease into the new syntax.

Table and Index Partitioning
Previously, data stored in SQL Server could only be split into different physical files at the table level. In other words, each table could be placed on a different filegroup, but the table in its entirety had to be in that one filegroup. The only exception was that you could put non-clustered indexes on a different file from the data, but the same held true for an index -- it could exist on only one filegroup. SQL 2005 introduces table and index partitioning, allowing you to horizontally split rows in tables or indexes across multiple filegroups.
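Partitioning is defined in two layers: a partition function maps column values to partition numbers, and a partition scheme maps those partitions to filegroups. A minimal sketch with hypothetical names (a production scheme would typically map partitions to multiple filegroups rather than ALL TO one):

```sql
-- Four partitions, split on month boundaries:
CREATE PARTITION FUNCTION pfOrderDate (datetime)
AS RANGE RIGHT FOR VALUES ('2006-01-01', '2006-02-01', '2006-03-01');

-- Map every partition to the PRIMARY filegroup (simplest possible scheme):
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

-- The table is created on the scheme instead of a single filegroup:
CREATE TABLE dbo.Orders (
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL
) ON psOrderDate (OrderDate);
```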

Performance Tip

It's a well-known mantra of disk management: "The more spindles, the better." Because of this, it might be tempting to simply partition large tables so they end up on multiple hard drives; here, however, that could actually degrade performance. The problem is that during sorting operations, SQL sorts partitions one at a time, leaving all the other partitions idle. A better solution is to stripe partitions onto multiple disks, allowing SQL to take advantage of the extra spindles even though it's still working one partition at a time. -- E.J.

This provides several benefits, the first of which is manageability. When your data is divided into logical partitions, data loading can be faster and maintenance tasks can be performed in a more timely fashion.

In order to realize this gain, you need to know how a table is being used and, most importantly, the size of that table. Partitioning will bear the most fruit on very large tables in which subsets of data in that table are used differently.

Take, for example, a table that stores order information for a large e-commerce firm at a rate of 100,000 records per day. That works out to approximately 3 million records per month. Once an order is taken, it's inserted once and updated several times over the first few weeks, as it goes through warehouse fulfillment and ships to the customer. After that, the record will typically only be read for reporting or historical purposes, and only a few records will be updated in the event of an exception, such as an order return.

If you partition this table by month, the indexes built can be geared toward the record operations used for that data. The current month can be optimized for insert and update operations, while previous months are optimized for selects. So you can rebuild indexes on the current month's data and not the entire table, saving valuable time and resources.
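Rebuilding only the hot partition's index is a one-liner once you know the partition number. This sketch assumes the hypothetical pfOrderDate function shown earlier and an index named IX_Orders_OrderDate:

```sql
-- Find which partition holds the current month's rows...
SELECT $PARTITION.pfOrderDate('2006-03-15') AS PartitionNumber;

-- ...then rebuild just that partition instead of the whole index:
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD PARTITION = 4;
```

Note that in SQL 2005, a single-partition rebuild can't be combined with the ONLINE option; that restriction is another reason to test your maintenance plan before rollout.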

Using table partitioning can also result in increased performance, but it's a tricky matter that requires an in-depth knowledge of how your objects are being queried.

Partitions should be configured so that data joined from two tables lands on similar partitions, in most cases by using the same partition criteria for both. If two tables are dissimilar in their partition schemes, query performance can degrade.

Snapshot Isolation Level
SQL 2005 introduces a new transaction isolation level called Snapshot Isolation (not to be confused with a database snapshot). Whenever a transaction updates a row, SQL 2005 copies an image of that row into a "version store" in TempDB before the update happens. If any other process tries to read the row being updated, it pulls the latest copy of the row from the version store. This prevents users attempting to read data from being blocked by the update. In addition, read transactions using Snapshot Isolation don't take shared locks on the data, again reducing locking contention. If two transactions attempt to update the same row at the same time, the old standard applies: The first transaction gets to make the update and the second one is blocked until the first update completes.

Keep a few things in the back of your mind if you plan to enable Snapshot Isolation. First, TempDB has to have enough space to maintain the version store; if it fills up, updates can continue but version images will stop being kept. Second, an image stays in the store only long enough for the transaction to complete. While this is not a dirty read, you are reading data that's about to be changed: the SELECT statement returns the data as it existed after the last committed transaction, but assuming the update in progress commits, the data will change.
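Snapshot Isolation is off by default; it's enabled per database and then requested per session. A minimal sketch, with a hypothetical database name and query:

```sql
-- Enable the feature at the database level:
ALTER DATABASE Sales SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A reading session can then opt in per transaction:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    -- Sees the last committed version of the row; takes no shared locks
    SELECT OrderStatus FROM dbo.Orders WHERE OrderID = 42;
COMMIT TRANSACTION;
```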

Closing in on the Finish Line
We're nearing the end of our Tour de SQL, and like the cyclists heading into Paris for the final lap, we'll be tackling replication for our closing chapter. We'll be discussing the new Replication Monitor, tracer tokens, replication agents and more. Our final installment also arrives close to the start of the real Tour de France, which I hope you'll be watching on the Outdoor Life Network!
