Speed Up SQL Server Apps

Learn 10 useful tips and tricks that help boost SQL Server application performance.

Technology Toolbox: SQL Server 2000, ADO, ADO.NET

Developers love database programming tips, tricks, and workarounds—especially those that slash development time or boost application performance. Here's a collection of such nuggets for developing SQL Server applications.

1) Replace COUNT(*) With EXISTS When Checking for Existence
Developers often use the value from the COUNT(*) function when enforcing business rules in Transact-SQL code. However, try the EXISTS clause instead if you're using the COUNT(*) value only to evaluate whether you have at least one row that meets certain conditions. For example, consider this code from the Northwind database:

IF (SELECT COUNT(*) FROM Orders 
   WHERE ShipVia = 3) > 0
PRINT 'You cannot delete this shipper'

The execution plan shows that SQL Server has to read every row in the Orders table and tally all 255 matching rows before evaluating the IF expression. You can achieve the same result more efficiently with EXISTS because the IF condition evaluates to true as soon as SQL Server finds the first occurrence of 3 in the ShipVia column:

IF EXISTS (SELECT * FROM Orders 
   WHERE ShipVia = 3)
PRINT 'You cannot delete this shipper'

The difference in total execution time isn't much in a sample database such as Northwind, but use this efficient query against an Orders table with millions of rows and you'll see a major speed improvement.
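
If you want to verify the difference on your own data, SET STATISTICS IO reports the logical reads each form performs. Here's a minimal sketch comparing both versions of the Northwind query:

SET STATISTICS IO ON
   --Compare the reads reported for each form
IF (SELECT COUNT(*) FROM Orders 
   WHERE ShipVia = 3) > 0
PRINT 'You cannot delete this shipper'
IF EXISTS (SELECT * FROM Orders 
   WHERE ShipVia = 3)
PRINT 'You cannot delete this shipper'
SET STATISTICS IO OFF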

2) Be Careful When Using WHERE IN and WHERE NOT IN
SQL Server doesn't always choose an optimal execution plan when you have a substantial list of values in the WHERE IN clause. Using WHERE IN and WHERE NOT IN clauses in T-SQL code can produce an execution plan involving one or more nested loops, which multiplies the number of comparisons SQL Server must perform. Use the WHERE IN clause only if you have a short list of values you need to evaluate:

USE Northwind
   --This query takes 9 ms to execute
SELECT * 
   FROM Customers 
   WHERE CustomerID NOT IN 
   (SELECT CustomerID FROM Orders)

Replace the WHERE IN or WHERE NOT IN clause with an OUTER JOIN if you're using a subquery to generate a potentially large list. Doing so can improve performance significantly:

USE Northwind
    --This query takes 3 ms to execute
SELECT c.*
   FROM Customers c
   LEFT OUTER JOIN Orders o
   ON o.CustomerID = c.CustomerID
   WHERE o.CustomerID IS NULL

In this case, the second query uses LEFT OUTER JOIN, producing an execution plan that lets it run about three times faster than the first query.

The LEFT OUTER JOIN selects all rows from the Customers table—whether or not a customer placed any orders—and joins them with the Orders table. Then the WHERE clause keeps only the rows where the CustomerID column from the Orders table is NULL, which happens precisely when a customer has no matching orders. Either way, you get a list of customers who placed no orders, but the second way gives SQL Server a lot less work to do. I rewrote a query recently using this technique, and the execution time went from 50 seconds to about 500 ms.

3) Randomize Resultset Orders With NewID()
You occasionally might need to randomize the order of the resultset retrieved from SQL Server. This is often the case in database searches where certain products or services would otherwise gain an unfair advantage over others based simply on their names. I've seen a few clever (and not so clever) solutions for randomizing resultsets, but the solution is actually simple: use the NewID() function in Transact-SQL to generate a GUID for each row, then order the results by the generated GUID:

SELECT * FROM Products
   ORDER BY NEWID()

SQL Server returns products in a different order every time you run the query. You also can use this technique to return a random row from a table:

SELECT TOP 1 * FROM Products
   ORDER BY NEWID()

However, be careful when using this technique with large tables. You're only asking for one random row, but the execution plan shows that SQL Server delivers it only after reading every row in the table, generating a GUID for each one, then sorting all the rows. Consequently, SQL Server needs several seconds to return a random row from a table with a few million rows. So don't use the "SELECT TOP 1 …" technique on huge tables. Instead, restrict the random selection to a subset of the large table: select a random range, then use "SELECT TOP 1 …" within that range.
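
Here's a minimal sketch of that approach, assuming a table with a reasonably dense integer key (the BigTable and ID names are hypothetical); it picks a random 1,000-row slice with RAND(), then lets NEWID() sort only that slice:

DECLARE @lo int
   --Pick a random starting point below the highest key value
SELECT @lo = CAST(RAND() * (MAX(ID) - 1000) AS int)
   FROM BigTable
   --Sort only the 1,000-row slice, not the whole table
SELECT TOP 1 * FROM BigTable
   WHERE ID BETWEEN @lo AND @lo + 999
   ORDER BY NEWID()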

4) Increase Default Packet Size for Large Data Fields
SQL Server client applications communicate with instances of SQL Server through Net-Libraries. Client Net-Libraries and Server Net-Libraries communicate over the network by exchanging network packets. The size of network packets depends on the data access API you're using. The default packet size is 4,096 bytes in ADO and 8,192 bytes in ADO.NET.

These sizes work well in most scenarios, but sometimes you can speed up data exchange considerably by increasing the packet size—especially if you're sending or receiving large amounts of XML data, or if you're storing images in SQL Server. The client and server libraries must exchange and process fewer packets when you increase packet size. The maximum packet size is 32,767 bytes. You can increase the packet size in ADO or ADO.NET simply by including the Packet Size property in the connection string:

"?;Packet Size=32767;?"

The speed gained in data upload and download depends on the size of your data fields and on your network topology. One of my applications stores XML data in SQL Server, with each XML field about 500K in size. My benchmarks show that the application exchanged XML data with SQL Server about twice as fast after increasing the packet size to 32,767 bytes.

5) Name Your SQL Server Applications
When you examine SQL Server activity in Enterprise Manager or with SQL Server Profiler, you can easily identify which process belongs to which application, provided the Application column contains a meaningful value instead of being blank. SQL Server receives the application name from the client data access API. You've probably noticed that .NET applications often show ".Net SqlClient Data Provider" as their name. This happens when you don't specify the application name when you create an ADO.NET connection: ".Net SqlClient Data Provider" is the unhelpful default value ADO.NET winds up sending to SQL Server.

You can easily populate the Application Name property with a unique name for your application. Set the application name in ADO or ADO.NET by including Application Name in the connection string:

"?;Application Name=MyApplication;?"

Populating the Application Name property simplifies identifying your application activity on the server; you can use this name as the value of the Application Name filter in SQL Server Profiler traces. This can help you with distributed applications where your app might be running on multiple servers and possibly using multiple databases. Use Application Name as the filter, then run a single trace and capture the workload specific to your app coming from multiple client computers.
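
For a quick check outside Profiler, you can also query the sysprocesses system table in the master database, which exposes each connection's application name in its program_name column:

SELECT spid, program_name, hostname
   FROM master.dbo.sysprocesses
   WHERE program_name = 'MyApplication'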

6) Utilize SQL Server Aliases
SQL Server Client Network Utility allows you to create alias names for SQL Server instances. The utility also lets you associate each alias with a specific network protocol. Once you create an alias, any connections that use the aliased name will be directed to the corresponding server name or IP address.

Aliases are beneficial in many scenarios. For example, you can alias an IP address of a remote SQL Server to a friendly name, or you can create an alias if your application requires the Named Pipes protocol instead of TCP/IP. The biggest benefit of using SQL Server aliases comes from being able to redirect all connections from a client computer to another SQL Server quickly and easily. One common scenario is using redirection to a standby SQL Server for high availability. Consider what happens if your primary server goes down. You need to point all applications to the secondary server in a flash. This might require modifying many connection strings if you have a Web server with many Web sites—or you can simply alias the name of the primary server to the name of your secondary server (see Figure 1).

SQL Server Client Network Utility is part of the SQL Server client tools, but you don't have to install the client tools on every computer where you need it. The MDAC pack also includes this utility; if the SQL Server client tools aren't installed, you can run it by executing "cliconfg" from the command prompt.

7) Disable Triggers Instead of Dropping Them
Business rules in table triggers often expect your application to update the table one row at a time. In particular, some triggers generate an error when the code in the trigger assigns to a local variable a value obtained by selecting a column from the inserted virtual table. The assignment fails if you're updating multiple rows, because the inserted table then contains more than one row, so the subquery returns more than a single value.
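
For example, a trigger shaped like this hypothetical sketch (the MyTable, MyTrigger, and Quantity names are made up) works fine for single-row updates, but fails with a "subquery returned more than one value" error as soon as an UPDATE touches several rows:

CREATE TRIGGER MyTrigger ON MyTable
FOR UPDATE
AS
DECLARE @Qty int
   --Fails when inserted holds more than one row
SELECT @Qty = (SELECT Quantity FROM inserted)
IF @Qty < 0
   RAISERROR('Quantity cannot be negative', 16, 1)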

Multirow updates need special handling in such a scenario. Developers often wind up dropping a trigger before multirow updates, then re-creating it afterward to resume single-row updates. However, I recommend disabling such a trigger instead:

ALTER TABLE MyTable
   DISABLE TRIGGER MyTrigger

You can re-enable the trigger once you finish your multirow data updates:

ALTER TABLE MyTable
   ENABLE TRIGGER MyTrigger

8) Don't Rename Script-Based Objects
You'll often create a script-based object, such as a view, a stored procedure, or a user-defined function. SQL Server then stores the script that created the object in the text column of the syscomments system table. SQL Server also creates a row in the sysobjects table, as it does for most database objects.

The problem arises when you rename the object in Enterprise Manager or with the sp_rename stored procedure. Unfortunately, SQL Server then only modifies the object name in the sysobjects table. It doesn't modify the script in the syscomments table, so the script remains stored with the old name.

The new object will work, but you'll get an error the next time you try to modify the object in Enterprise Manager, because the object name in the script doesn't match the new name. You might also inadvertently deploy the object to another server with the old name, because Enterprise Manager generates scripts by reading the syscomments table, so the "CREATE …" part of the script winds up using the old name. Instead of renaming such an object, save the original script, delete the object, then re-create it with the new name.
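
You can see the mismatch for yourself. This sketch (with hypothetical procedure names) renames a stored procedure, then pulls its stored script, which still begins with the old name:

EXEC sp_rename 'OldProcName', 'NewProcName'
   --The text column still reads CREATE PROCEDURE OldProcName
SELECT c.text
   FROM syscomments c
   JOIN sysobjects o ON o.id = c.id
   WHERE o.name = 'NewProcName'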

9) Don't Trust Generated Table Modification Scripts
Enterprise Manager, Visual Studio, and third-party tools let you add, drop, and modify table columns in a visual, easy-to-use environment. You click on Save; then they generate and execute Transact-SQL scripts under the hood. It's tempting to modify tables this way—after all, they shield you from the complexities of data definition language.

However, the scripts these visual tools produce often generate more SQL than required, or scripts that might not be suitable for a production environment. In particular, they often drop and re-create the table you're modifying instead of using the ALTER TABLE command. Consequently, modifying a large table (with millions of rows) can take a long time, because SQL Server has to copy every row out of the table and back into it.

You don't need to stop using these tools, though. Fortunately, they let you save modification scripts instead of executing them immediately. Use this option, and inspect the generated script before executing it. I suspect you'll find that a few simple ALTER TABLE commands often accomplish the same modifications faster and obviate the need to drop and re-create the table.
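
For example, adding a nullable column needs only a one-line ALTER TABLE (the column name here is hypothetical) instead of a full drop-and-copy script:

ALTER TABLE Orders
   ADD Comments varchar(500) NULL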

10) Examine Your Apps With SQL Server Profiler
Allocate a few hours at the end of the development cycle to run a trace in SQL Server Profiler and capture your application's workload. This lets you identify any potential long-running queries and create additional indexes for optimizing data retrieval.

As a rule, I create a primary key and a clustered index on each table. Next, I index each foreign key, then I create additional indexes to speed up certain queries. Doing a SQL Server trace allows me to quickly identify any opportunities for useful indexes I might have missed.
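
As a sketch, covering a foreign key you missed is usually a one-liner (the index name here is hypothetical):

CREATE INDEX IX_Orders_CustomerID
   ON Orders (CustomerID)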

My trace includes the Stored Procedures-RPC:Completed event and the TSQL-SQL:BatchCompleted event. Both of these events populate the Duration column. I usually investigate trace events that take longer than 100 ms: I paste the code into Query Analyzer and examine the execution plan. Reading and interpreting execution plans takes some practice, but after a while you should be able to spot potential issues quickly. Always run a trace against your production database, too, to make sure your application performs well with tables containing production data volumes rather than the small samples in your development database.
