Tour de SQL Part III: Performance Tuning
SQL Server 2005 gives you new tools to help your database performance stay ahead of the pack.
The riders in the Tour de France are probably the most monitored athletes in
the world. In preparation for the race, every ounce of food a rider consumes
is specially formulated to give him maximum performance. Riders spend hours in
wind tunnels, constantly tweaking bikes, helmets and other gear to make sure
they're as aerodynamic as possible. Heart rate, oxygen intake and dozens of
other bodily functions are closely watched to ensure top performance during
the most grueling race on the planet.
Performance tuning like that wasn't available on past versions of Microsoft's
SQL Server, when tuning was regarded as a black art. An index here, a few design
changes there, toss in some query hints, do a quick SQL dance and voilà! You
had a tuned database. However, like so many other aspects of SQL Server 2005,
the ability to adjust performance has been mightily enhanced (but not with steroids).
On the third leg of our Tour de SQL, we'll show you how to pump up the performance of Microsoft's best-ever database, giving you the edge you need to stand on the podium and kiss supermodels, like the winners of each leg of the Tour.
A Sharper Profile
An old SQL tool you probably won't recognize in its SQL 2005 incarnation is
the SQL Server Profiler, shown in Figure 1 below. Profiler still has the same
goal as before -- tracing your SQL Server so that you can tune or troubleshoot
-- but the look, feel, and capabilities have come a long way.
You can now trace Analysis Services and Integration Services packages, and
almost everything is XML-based. Trace results can be saved as XML files, and
trace definitions are stored in XML, making them easy to consume from other programming
interfaces. While tracing to files, Profiler will roll over to additional files
when the specified maximum file size is reached. While this isn't new, the ability
to replay all rolled over files in order is new to Profiler.
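If you'd rather work with trace files in T-SQL than in the Profiler GUI, the fn_trace_gettable function returns a saved trace as a rowset; passing DEFAULT as its second argument tells it to read the initial file plus all of its rollover files in sequence. A quick sketch (the file path and duration filter are hypothetical):

```sql
-- Load a trace file -- and any rollover files after it -- as a rowset:
SELECT TextData, Duration, CPU, Reads, Writes, StartTime
FROM fn_trace_gettable('C:\traces\tuning_trace.trc', DEFAULT)
WHERE Duration > 1000000  -- Duration is reported in microseconds in SQL 2005
ORDER BY Duration DESC;
```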
Figure 1. The SQL Profiler now integrates with Windows Performance Monitor to provide a more complete picture of server activity.
But that's not even the best thing Profiler can do: It can now build correlations
to performance monitor counters taken at the same time as the trace. Once you
have a trace and a performance monitor log saved to a file, you simply open
the trace and import the performance data. The result is a split screen with
your trace results on top and your performance counters on the bottom.
Now clicking on the trace takes you to the corresponding section of your performance graph, and vice versa. This allows DBAs to get a quick picture of what was running when the CPU spiked to 100 percent.
A More Capable Advisor
Another performance tool that has long needed a makeover is the Index Tuning Wizard.
SQL Server 2005 introduces a new tool, the Database Engine Tuning Advisor, which
replaces the functionality of the Index Tuning Wizard and provides even more information
than its predecessor.
In a nutshell, the Advisor (Figure 2) examines workloads (T-SQL scripts or
SQL Profiler results) and provides recommendations to add, remove or modify
physical structures. The recommendations are made regarding clustered indexes,
non-clustered indexes, indexed views and partitioning schemes. In addition,
the Advisor examines more events and physical structures than its predecessor,
allowing it to make better, more complete recommendations.
Figure 2. The all-new Database Engine Tuning Advisor boasts many more options for performance enhancement than its predecessor.
It goes even further, with the ability to analyze workloads run against multiple
databases. An adage of the tuning trade is that too many indexes can be just
as bad as too few. The old Index Tuning Wizard didn't apply this principle very
well, and would often recommend adding index after index until the performance
started to suffer. The Advisor, however, is designed to take into account multiple
structures and can analyze the interaction of these structures. After the Advisor
is run, the user is given T-SQL scripts that can be used to implement all, or
a subset of, the recommendations.
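The Advisor also ships with a command-line counterpart, the dta utility, which is handy for scripting tuning runs as part of a maintenance routine. A hedged sketch of an invocation -- the server, database, workload and output file names here are all hypothetical:

```
dta -S MyServer -E -D Sales -if workload.sql -s TuningSession1 -of recommendations.sql
```

Here -E requests a trusted (Windows) connection, -if points at the workload file, -s names the tuning session, and -of writes the recommendations out as a T-SQL script.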
Indexing: Bigger and Better
Assuming you have a decent database design and the queries running against your
database weren't written by monkeys, most of your tuning will be done with indexing.
Many improvements to the indexing process in SQL 2005 not only allow for better
performance, but also greater availability.
First off, the syntax for working with indexes has changed a little. There's now an ALTER INDEX command that can rebuild, reorganize, disable or change indexes. This has effectively replaced the old DBCC index commands. Now, instead of using DBCC DBREINDEX, you use ALTER INDEX REBUILD. Likewise, DBCC INDEXDEFRAG has been replaced by ALTER INDEX REORGANIZE.
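For example, assuming a hypothetical IX_Orders_CustomerID index on an Orders table, the new syntax looks like this:

```sql
-- Rebuild one index (replaces DBCC DBREINDEX):
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;

-- Defragment it in place (replaces DBCC INDEXDEFRAG):
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REORGANIZE;

-- Or operate on every index on the table at once:
ALTER INDEX ALL ON dbo.Orders REBUILD;
```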
The most widely anticipated change to indexing is the addition of online indexing (available only in Enterprise Edition). Using the ONLINE option of the index commands, most indexing operations no longer block users, and the underlying table or view stays online and accessible. This can greatly reduce the amount of downtime for your SQL environment.
But be aware that the performance of online indexes involves tradeoffs. During an online indexing operation performance is degraded -- how much depends on the
user load and the object and index size. Online indexing
is recommended by Microsoft in most cases, but every environment is different; I urge you to carefully test this feature before implementation.
Index operations on SQL 2005 Enterprise Edition can also take advantage of multiprocessor servers. Utilizing this parallelism can improve the overall performance of your indexing operations and minimize the aforementioned period of degraded performance.
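Putting those last two features together, an online, parallelism-limited rebuild might look like this (the index and table names are hypothetical):

```sql
-- Rebuild online (Enterprise Edition only), capping the operation
-- at four processors so other work isn't starved:
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (ONLINE = ON, MAXDOP = 4);
```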
Indexed views are still around in SQL 2005 and have been improved, along with indexing in general:
- The query optimizer can make better decisions, providing better performance
- Two new index options allow for more granular locking control. By using
ALLOW_ROW_LOCKS or ALLOW_PAGE_LOCKS you can control at what level locking
will occur for your index.
- The XML datatype has been introduced, allowing XML documents to be stored
in columns. XML documents can be very large, since they themselves usually
represent entire data entities. Leaving no stone unturned, SQL Server 2005
provides the ability to create indexes on your XML columns.
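To illustrate those last two points, here's a sketch using hypothetical object names:

```sql
-- Restrict a hypothetical index to page-level locks only:
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
SET (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = ON);

-- Index a hypothetical XML column; a primary XML index
-- must be created before any secondary XML indexes:
CREATE PRIMARY XML INDEX PXML_Orders_Details
ON dbo.Orders(OrderDetails);
```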
While indexing has changed, it shouldn't seem completely foreign. In the short term, all the old commands are still in place for index maintenance and operation, allowing you to ease into the new syntax.
Table and Index Partitioning
Previously, data stored in SQL Server could only be split into different physical
files at the table level. In other words, each table could be placed on a different
filegroup, but the table in its entirety had to be in that one filegroup. The
only exception was that you could put non-clustered indexes on a different file
from the data, but the same held true for an index -- it could exist on only one
filegroup. SQL 2005 introduces table and index partitioning, allowing you to horizontally
split rows in tables or indexes across multiple filegroups.
It's a well-known mantra of disk management: "The more spindles, the better." Because of this, it might be tempting to simply partition large tables so they end up on multiple hard drives; here, however, that could degrade performance. The problem is that during sorting operations, SQL sorts partitions one at a time, leaving all the other partitions idle. A better solution is to stripe each partition across multiple disks, allowing SQL to take advantage of the extra spindles even though it's still working one partition at a time. -- E.J.
This provides several benefits, the first of which is manageability. When your
data is divided into logical partitions, data loading can be faster and maintenance
tasks can be performed in a more timely fashion.
In order to realize this gain, you need to know how a table is being used and, most importantly, the size of that table. Partitioning will bear the most fruit on very large tables in which subsets of data in that table are used differently.
Take, for example, a table that stores order information for a large e-commerce firm at a rate of 100,000 records per day. That works out to approximately 3 million records per month. Once an order is taken, it's inserted once and updated several times over the first few weeks, as it goes through warehouse fulfillment and ships to the customer. After that, the record will typically only be read for reporting or historical purposes, and only a few records will be updated in the event of an exception, such as an order return.
If you partition this table by month, the indexes built can be geared toward the record operations used for that data. The current month can be optimized for insert and update operations, while previous months are optimized for selects. So you can rebuild indexes on the current month's data and not the entire table, saving valuable time and resources.
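A minimal sketch of that monthly scheme, with hypothetical names, boundary dates and filegroups:

```sql
-- A monthly partition function (boundary dates are illustrative):
CREATE PARTITION FUNCTION pfOrdersByMonth (datetime)
AS RANGE RIGHT FOR VALUES ('20050201', '20050301', '20050401');

-- A scheme mapping the four resulting partitions to filegroups:
CREATE PARTITION SCHEME psOrdersByMonth
AS PARTITION pfOrdersByMonth TO (fgJan, fgFeb, fgMar, fgApr);

-- The table is created on the scheme, partitioned by order date:
CREATE TABLE dbo.Orders (
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL
    -- ...other columns...
) ON psOrdersByMonth (OrderDate);

-- Rebuild an index on just the current month's partition,
-- leaving the historical partitions untouched:
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD PARTITION = 4;
```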
Using table partitioning can also result in increased performance, but it's a tricky matter that requires an in-depth knowledge of how your objects are being queried.
Partitions should be configured in such a way that data joined from two tables resides on similar partitions, in most cases by using the same partitioning criteria. If two tables are dissimilar in their partition schemes, query performance can degrade.
Snapshot Isolation Level
SQL 2005 introduces a new transaction isolation level called Snapshot
Isolation (not to be confused with a database snapshot). Whenever a transaction
updates a row, SQL 2005 copies an image of that row into a "version store" in
TempDB before the update happens. If any other process tries to read the row being
updated, it pulls the latest copy of the row from the version store. This prevents
users attempting to read data from being blocked by the update. In addition, read
transactions using Snapshot Isolation will not take shared locks on the data,
again reducing locking contention. If two transactions attempt to update the same
row at the same time, the old standard applies: The first transaction gets to
make the update and the second one is blocked until the first update completes.
Keep a few things in the back of your mind if you plan to enable Snapshot Isolation. First, TempDB has to have enough space to maintain the version store; if it fills up, updates can continue, but new row versions can no longer be stored, and snapshot reads that depend on them will fail. A row's image is only kept in the store long enough for the transactions that need it to complete. Second, remember what a snapshot read means: While it's not a dirty read, you are reading data that may be about to change. The SELECT statement returns the data as it existed after the last committed transaction; but assuming the update in progress commits, the data will have changed.
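Snapshot Isolation is off by default; it must be enabled at the database level, and each session opts in explicitly. A minimal sketch, assuming a hypothetical Sales database and Orders table:

```sql
-- Enable row versioning for the database (off by default):
ALTER DATABASE Sales SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In a reader session: no shared locks are taken, and the read
-- isn't blocked by updates in flight:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT OrderStatus FROM dbo.Orders WHERE OrderID = 42;
COMMIT TRANSACTION;
```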
Closing in on the Finish Line
We're nearing the end of our Tour de SQL, and like the cyclists heading into
Paris for the final lap, we'll be tackling replication for our closing chapter.
We'll be discussing the new Replication Monitor, tracer tokens, replication
agents and more. Our final installment also arrives close to the start of the
real Tour de France, which I hope you'll be watching on the Outdoor Life Network!