Joey on SQL Server

What Is TempDB and Why Does It Matter?

Time to hit the basics and dive into how theSQL Server database engine works.

One reoccurring SQL Server question that I see: Why is my TempDB database so large and what can I go to reduce the size of TempDB?

The answer requires some basic understanding of the internals of how SQL Server's database engine works. It's also the answer to one of the questions I always ask when interviewing prospective DBA candidates, which is "what is TempDB and what does it do?"

What Is TempDB?
On any SQL Server (or Azure SQL/Amazon RDS SQL Server) you work on, you will have a TempDB, and it will always be have a Database_ID of 2. TempDB is a workhorse of SQL Server performing a number of functions to support both system and internal operations. With this workload TempDB will process a large number of database writes, which requires low-latency, high throughput underlying storage. Beyond that, there can be contention for busy system pages in TempDB, so creating multiple TempDB database files is a best practice (which is enabled by default starting with SQL Server 2016) to improve throughput. Since SQL Server uses a proportional fill mechanism (larger files get proportionally more writes based on file size) all of your TempDB data files should be sized equally. However, if you are 2016 or higher, that's also taken care of you. But really, what does TempDB do for SQL Server?

The Basics
If you have ever written any T-SQL code, you have probably encountered the need to store a bit of data for a later operation. By using the nomenclature #localtable or ##globaltable SQL Server knows to create a table in TempDB. If you use a local temp table, it is only available for your current session. If you create a global table it is accessible by any user session.

In either case, as soon as the session the created the temporary table ends, the table goes away and is deallocated. While user-defined temporary tables are the most obvious use case of TempDB, SQ: Server uses it for way more than just that.

The Page File for the Database
When you execute a query that joins data to another table, or sorts a result set with an ORDER BY command, the database engine allocates a certain amount of memory for that joining operation. The amount memory granted is based on the column and index statistics, which means if there are a lot more records than the engine expects. There will not be enough memory to complete the operation.

Since SQL Server is in the business of returning your query results and not throwing errors, it uses TempDB to complete the operation. You can see this in Figure 1. A spill is represented by the warning yellow triangle with the exclamation point, and the detail of the spill is shown in the warning. In this instance this query is writing 9.6 megabytes to TempDB.

 

[Click on image for larger view.] Figure 1. Example of a memory spill.

While writing to disk is orders of magnitude slower than an in-memory operation, it's still better than failing the query. This is the most common cause of cause of sudden growth in TempDB size, large spills happen from time to time causing those files to grow. While the above example shows a 10 MB spill, spills of multiple gigabytes can happen with poor statistics or bad query constructs.

Beyond just spills, SQL Server uses TempDB for a number of internal temporary objects, sorting for online re-indexing operations, multiple active result sets (MARS), along with version stores for data. What is a version store? If you are used read-committed snapshot isolation (RCSI) a common tactic to prevent read operations from blocking write operations, each reading session gets a unique snapshot of the data, which is stored in, you guessed it, a version store table in TempDB. All this usage means TempDB can be the busiest or the second busiest database on your SQL Server.

TempDB is Taking a Lot of Space. Should I Shrink It?
One of the interesting things about TempDB is that the files are recreated every time your SQL Server restarts. Since all of the data is temporary, there is no need to persist the files like a user database. Also, TempDB is very active, which means if you try to shrink the database (which you shouldn't do), it will likely fail anyway since most of the pages are in use. If you have had unexpected and unusual TempDB growth, the proper way to resize is to use alter databasecommands as shown here:

  ALTER DATABASE tempdb MODIFY FILE ( NAME =  N'tempdev', SIZE = 8192MB , FILEGROWTH = 64MB );

Those commands will be issues and the file sizes will be modified in the catalog in the master database. The next time the service restarts, TempDB will be the size specified in the resize statements. However, you should note that typically, if TempDB grows to specific size, it will grow back to that size under normal usage. So any space savings you gain will be short lived. I typically recommend provisioning TempDB on a dedicated volume and allocating the file size to be 80 percent of that volumes. You can see that in this PowerShell script I built with the help of DBATools to automatically configure TempDB on Azure VMs based on the size of the D: drive. The reason why I chose 80 percent is that a lot of infrastructure based alerts fire at 90 percent, and I hadn't yet written this article to point sysadmins back to, when they asked "Why is TempDB taking up so much space?”

TempDB plays a significant role in any number of SQL Server operations. However, to the untrained eye, having hundreds of gigabytes of storage assigned to a temporary resource, can seem like a massive waste of space. SQL Server cannot start without TempDB (or master and model) and is massively dependent on it, for a large number of both system and user operations. In newer versions of SQL Server, the TempDB configuration is dynamic based on the number of CPUs.  If you are on an older version, you want to ensure that you have the same number of data files as you do CPU cores, up to 8.

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.

Featured

comments powered by Disqus

Subscribe on YouTube