Denny's SQL Secrets
Finding Storage Performance Problems in SQL Server
The first step is to understand the nature of performance monitoring software.
- By Denny Cherry
- 09/08/2014
Database administrators can find storage performance problems easily using Performance Monitor. So why is it so hard to get storage administrators to find and fix those same problems?
To answer that, we need to understand how the reporting engines for storage platforms work. Finding these performance problems typically comes down to a timing issue. Performance Monitor, the tool most database administrators use, tracks storage performance in 15-second intervals by default. Storage monitoring tools typically sample at much longer intervals, usually somewhere around five minutes, which leaves large gaps in the data. When the storage monitoring software records storage performance, the data it reports is a point in time, much like the data displayed by Performance Monitor. That means for a problem to show up in the storage administrator's monitoring tools, it must be happening at the exact moment the tool takes its snapshot of the performance.
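To see how much difference the sampling interval makes, it helps to capture latency yourself at the same 15-second cadence Performance Monitor uses. Here's a minimal sketch that samples the cumulative counters in sys.dm_io_virtual_file_stats twice, 15 seconds apart, and works out the average read and write latency per database file for just that window; the interval and the temp table name are arbitrary choices for illustration.

```sql
-- Sketch: two snapshots of the cumulative I/O counters, 15 seconds apart,
-- then the delta gives average latency per file for that short window.
IF OBJECT_ID('tempdb..#io_snap') IS NOT NULL DROP TABLE #io_snap;

SELECT database_id, file_id, num_of_reads, io_stall_read_ms,
       num_of_writes, io_stall_write_ms
INTO #io_snap
FROM sys.dm_io_virtual_file_stats(NULL, NULL);

WAITFOR DELAY '00:00:15';   -- match Performance Monitor's 15-second interval

SELECT DB_NAME(v.database_id)              AS database_name,
       mf.physical_name,
       v.num_of_reads  - s.num_of_reads    AS reads_in_window,
       v.num_of_writes - s.num_of_writes   AS writes_in_window,
       CASE WHEN v.num_of_reads - s.num_of_reads = 0 THEN 0
            ELSE (v.io_stall_read_ms - s.io_stall_read_ms) * 1.0
                 / (v.num_of_reads - s.num_of_reads) END    AS avg_read_latency_ms,
       CASE WHEN v.num_of_writes - s.num_of_writes = 0 THEN 0
            ELSE (v.io_stall_write_ms - s.io_stall_write_ms) * 1.0
                 / (v.num_of_writes - s.num_of_writes) END  AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS v
JOIN #io_snap AS s
  ON s.database_id = v.database_id AND s.file_id = v.file_id
JOIN sys.master_files AS mf
  ON mf.database_id = v.database_id AND mf.file_id = v.file_id
ORDER BY avg_read_latency_ms DESC;
```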
Performance problems are much easier to find when they persist over a longer period of time, such as several minutes. By their nature, however, most database workload problems come and go over very short periods, which is why Performance Monitor graphs of disk performance look so spiky.
Making things worse, within the storage administrator's monitoring software more and more detail is lost as the data ages. A system that samples every five minutes may, after 24 hours, roll that data up into an hourly average instead of keeping the five-minute detail. This makes tracking problems down even harder, because even if a spike in response time was logged, it will be flattened out very quickly once the data starts being averaged.
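As a quick illustration of how the roll-up hides the problem (the numbers here are made up): take twelve five-minute latency samples from one hour, one of which caught a 300 ms spike. Keep only the hourly average and the spike effectively disappears.

```sql
-- Illustrative numbers only: one 300 ms spike among twelve five-minute samples.
SELECT AVG(latency_ms) AS hourly_avg_ms,   -- roughly 30 ms, looks unremarkable
       MAX(latency_ms) AS hourly_max_ms    -- 300 ms, lost if only the average is kept
FROM (VALUES (5),(6),(5),(7),(5),(6),(300),(5),(6),(5),(7),(5)) AS samples(latency_ms);
```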
To combat this data-averaging problem, when a performance problem comes up the detail data should be exported from the storage platform so that it is preserved. Sometimes this is an export to Excel, or even just a screenshot of the graph from the storage monitoring system.
When I'm working with clients who are trying to identify performance problems, I typically start by looking for when long-running commands will be running and comparing the Performance Monitor data to the storage performance data for that window. Good candidates are index rebuilds and DBCC CHECKDB, as those are very IO-intensive commands that will usually push the storage platform as hard as it can be pushed. They also usually run long enough for the storage platform to see the performance problems; a quick way to catch them in the act is shown below. Another option is to look at the time windows of exports or imports to or from the data warehouse, as these are also very IO-intensive workloads.
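Here's a rough sketch of finding those windows while the maintenance is actually running. It assumes the work was written as ALTER INDEX ... REBUILD or DBCC CHECKDB statements; the LIKE patterns are an assumption about how the commands were written on your system, so adjust them to match your maintenance jobs.

```sql
-- Sketch: currently running requests that look like index rebuilds or
-- DBCC CHECKDB, with start time, elapsed time and current wait, so the
-- window can be lined up against the storage team's data.
SELECT r.session_id,
       r.start_time,
       r.total_elapsed_time / 1000 AS elapsed_seconds,
       r.command,
       r.wait_type,
       t.text AS running_batch
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID      -- don't match this query itself
  AND (t.text LIKE '%ALTER INDEX%REBUILD%'
       OR t.text LIKE '%DBCC CHECKDB%');
```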
When looking at storage performance problems, the performance of the disks for a single SQL Server instance (physical or virtual) may not tell the entire story. Often the specific LUNs are not a problem on their own, but once I begin aggregating LUNs together, suddenly the amount of IO being pushed to the array is more than the array can handle.
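One way to get a feel for that aggregation on a single instance is to roll the cumulative file-level counters up by drive letter. The sketch below assumes each LUN is presented as its own Windows drive letter, and the numbers are totals since the instance last started rather than a point-in-time rate; repeating the same exercise across every instance that shares the array is usually where the real total shows up.

```sql
-- Sketch: total I/O and stall time per drive letter, across every database
-- on this instance (cumulative since the instance last started).
SELECT LEFT(mf.physical_name, 2) AS drive,
       SUM(v.num_of_bytes_read + v.num_of_bytes_written) / 1048576 AS total_mb_transferred,
       SUM(v.io_stall)           AS total_io_stall_ms,
       CASE WHEN SUM(v.num_of_reads + v.num_of_writes) = 0 THEN 0
            ELSE SUM(v.io_stall) * 1.0 / SUM(v.num_of_reads + v.num_of_writes)
       END                       AS avg_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS v
JOIN sys.master_files AS mf
  ON mf.database_id = v.database_id AND mf.file_id = v.file_id
GROUP BY LEFT(mf.physical_name, 2)
ORDER BY total_io_stall_ms DESC;
```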
Storage performance problems can be very tricky to identify, but with perseverance and a good understanding of how often the storage platform is logging its data, the problems can be found. And if they can be found, they can be fixed.
About the Author
Denny Cherry is the owner and principal consultant for Denny Cherry & Associates Consulting and has over a decade of experience working with platforms such as Microsoft SQL Server, Hyper-V, vSphere and enterprise storage solutions. Denny's areas of technical expertise include system architecture, performance tuning, security, replication and troubleshooting. Denny currently holds several Microsoft certifications related to SQL Server for versions 2000 through 2008, including the Microsoft Certified Master, and has been a Microsoft MVP for several years. Denny has written several books and dozens of technical articles on SQL Server management and how SQL Server integrates with various other technologies.