What To Watch For When Troubleshooting

Performance lagging on the SQL Server? Back to troubleshooting basics.

I'd like to talk about the first few things to look at when experiencing performance problems on your SQL Server. Inevitably, every DBA will be faced with a phone call or email stating that "the database server is slow". For most of us, this conversation usually entails trying to extract the meaning of "slow" from the caller/emailer, hoping to glean some sort of clue as to where to start. However, anytime there are performance issues reported, there are a few key places to look to start narrowing down the problem.

Hardware. Most DBAs start here, because there can be very obvious indicators in this area that help us narrow down the focus of the issue. Usually, it's a good idea to figure out what the state of the major resource consumption is. Particularly, you need to figure out what the usage looks like for these
components; CPU, RAM, Disk. Fortunately, that's only everything in the system, right?

CPU. Obviously, CPU is easy to figure out. If logging into the server (via console OR remote desktop) is slow, you've probably already guessed that the CPU is pegged. Once you're in, you can launch the Task Manager, or preferably PerfMon (Windows Performance Monitor), and you can quickly tell if the CPU (or CPUs) is maxed out. With Task Manager, you can flip to the Processes tab and click on the CPU column to sort the processes by CPU consumption. If you're a DBA, you're looking to see where sqlservr.exe lands. If it shows nearly 100% use, then you can be sure SQL Server needs to be investigated further.

Using PerfMon, add the "% Processor Time object and the _Total instance to see what the overall usage looks like. You can also look under the Process object for the sqlservr.exe process, and add it to review a ton of information around SQL Server's CPU usage.

RAM. This resource can be a bit trickier; though you're initial review should determine whether or not the physical memory in the server is being 100% allocated.  First, you can again look at Task Manager, under the Performance tab, and see what the Physical Memory(MB) box says. If the the Free column is extremely low, AND the Available row is extremely low, you've probably got an issue. You can also use PerfMon (using the Memory object) to look at % Committed Bytes In Use, Available MBytes, and Page Faults/sec to get an idea of how quickly your memory is being used, as well as how much of it is being used. A high number of Page Faults combined with a low number of Available MBytes could be a sign that you don't have enough memory.

Many larger systems have the bulk of memory alocated from startup; usually all of it will be assigned to SQL Server, with a minimum reserve for the OS. If this is the case in your environment, you can skip to the next step.

Disk. This area can get very complicated. Many systems still use locally attached storage, but more and more often high powered SQL Servers are using Storage Area Networks (SAN). In this case, you'll need to get your SAN administrator on the phone immediately, and have them start looking for bottlenecks in the SAN fabric (network), particularly on the controllers the SQL Server you are working on uses. Otherwise, you'll need to spend some time in PerfMon looking at the Logical Disk object, and in particular the following counters:

% Disk Read Time
% Disk Write Time
% Idle Time
Disk Reads/sec
Disk Writes/sec
Avg. Disk sec/Read
Avg. Disk sec/Write

The first three counters tell you how much of the total I/O time is being spent reading, writing, or idle. Any disk that is spending more time idle is probably not a concern. But, pay attention to the disks that host your data files and log files; if the % counters are high, look at the other 4 counters for each of the corresponding disks. Look for particularly high sec/Read and sec/Write; these could mean your disks aren't working as fast as SQL Server needs them to.

SQL Server Settings. Once you've had a look at the hardware, you need to start looking inside of SQL Server. The first thing you want to look at is the current "health" of the system. You'll want to review the SQL Server Error Log (as well as the Windows Application and System Event Logs) to see if anything out of the ordinary shows up. Along with tons of "ignorable" messages, you may see anything from crashing SPIDs to stack dumps (memory dumps) that can tell you if the SQL Server is having problems.

Simultaneously, you can open a query window to the server and run a few key statements:

SELECT * FROM sys.dm_exec_requests
SELECT * FROM sys.dm_tran_locks
sp_configure

The first statement allows you to see all of the current requests being executed on the server. Here, you're going to look for blocking processes (look at the blocking_session_id column), as well as any processes that are high in the wait_time, cpu_time, reads and/or writes columns. These processes will need further investigation. You can obtain the T-SQL being executed by those processes by simply issuing BDCC INPUTBUFFER(<spid>), replacing <spid> with the actual SPID of the process in question.

The second statement gives you an idea of how many locks, and what types of locks, are being held on objects inside the databases. This can be helpful if you are trying to identify a specific process that is blocking other processes, and why it's blocking (blocking is usually because of table locks). This can help you decide whether or not you want to kill a process, or let it finish it's workload.

Finally, sp_configure let's you examine your server configuration to make sure nothing has changed from your baseline. You do have a baseline, right?

Tables. Finally, you'll want to look at the most important part of your database; the part that holds the data. Quite often, performance issues are reported only AFTER they've reached an unacceptable level. Usually these issues have been slowly developing, it just takes awhile for someone to report it to you. So, one other area you want to address is the table structure. More often than not performance issues stem from either poorly written T-SQL or poorly maintained tables. And while looking at the T-SQL is outside the bounds of this article, we can talk about table structure. In particular, indexing.

First, you'll want to make sure that there ARE indexes on the data. If the system has grown significantly, and no indexes are in place, SQL Server will seem to go from performing well (when there was a tiny amount of data in the tables) to very poorly. Double check that there are clustered indexes where there should be, and non-clustered indexes to cover the majority of queries being run on the system. If there aren't, you've got your work cut out for you.

However, for many, indexes will already be in place. If they are, look and see when they were last rebuilt. If it hasn't been recently (or, gasp, at all), that'll be the very next thing to address.

Here's a script that can help you determine the fragmentation of indexes in a given database:

DECLARE @db_id SMALLINT;
SET @db_id = DB_ID(N'DatabaseName'); --<<<CHANGE TO YOUR DB NAME
SELECT SCHEMA_NAME(so.[schema_id]) + '.' + object_name(so.[object_id]) as [Object_Name],
index_id,
index_type_desc,
index_level,
avg_fragmentation_in_percent,
fragment_count,
page_count
FROM sys.dm_db_index_physical_stats (@db_id, NULL, NULL, NULL, 'DETAILED') ips
inner join sys.objects so
    on ips.object_id = so.object_id
where avg_fragmentation_in_percent > 50 --Filter by level of fragmentation
and fragment_count > 100 --filter by number of fragments; more fragments = more pages = larger tables
order by avg_fragmentation_in_percent desc;
GO

Once you've identified highly fragmented indexes, go about issuing ALTER INDEX...REBUILD statements as needed to fix the fragmentation. You may also need to do some ongoing monitoring to make sure that the indexes don't become too fragmented between maintenance windows. If they do, you'll have to adjust how frequently the rebuilds are taking place, or consider running ALTER INDEX...REORGANIZE between maintenance windows to try and keep the indexes in the best shape you can.

Summary
So there you have it. Once the above information has been gathered, you can quite often find the necessary steps to improve your server's performance. Of course, this isn't everything, so if these steps don't help, don't give up, there are tons of resources online to help you track down your performance issues. Hope this helps!