Joey on SQL Server
Understanding Dirty Reads: What NOLOCK Does to Your Queries
Let's break down when it's ideal to use the NOLOCK hint and when you should never use it.
- By Joey D'Antoni
- 01/21/2022
As a consultant, I work on wide swaths of database systems, which means I get to see many, many SQL statements -- from the good, the bad and the ugly. One common pattern I see in a lot of queries of all sort, is the use of the NOLOCK hint.
While to the uninitiated, the NOLOCK phrasing may imply that a query using this hint will take no locks against any tables referred to in the query. That is not the case. What the NOLOCK hint does is enforce SQL Server's Read Uncommitted isolation level, which allows for what is known as dirty reads. What does this mean for your application? Let's take a deeper dive into database transaction handling to understand when and where you should use NOLOCK (and when you should never use it).
One of the core principles of the relational database model is ACID, which stands for Atomicity, Consistency, Isolation and Durability. There are several database engine components, like latches, locks, and transaction logging, that allow a database to fulfill these principles, but changes to certain settings can take databases out of ACID compliance. Specifically, in this post we are talking about the consistency and isolation properties of ACID and how they affect the throughput of your application and the accuracy of your data. SQL Server allows isolation level to be configured at the connection or transaction level and it can be set at the database level. However, any settings at the connection or transaction level will override those set at the database level.
So What Does Nolock Actually Do?
Let's look the most basic example. I have a table called Sales, with two columns, TransactionID and SaleAmount.
In the left most query, you can see a transaction starting that inserts a record with a transaction ID of 3 and a SalesAmount of 43.44. However, that transaction is never committed -- in this case it is an example of poor transaction handling in code, in a real-world scenario this could happen for a number of reasons, including system errors, rejected credit card, validation failing, etc. The queries on the right are both selecting that record. One is using the NOLOCK query hint. The other is using the READ UNCOMMITTED isolation level, which both allow the query to read uncommitted data. While the query completes, since this data is uncommitted it is invalid, and should not be accounted for, or used to drive any other business decisions, because it could go away or change.
If I execute the same query without the NOLOCK hint, or the isolation level set, the query will not complete.
To better illustrate this, I removed the WHERE clause from my query. It still does not complete. This behavior is known as "blocking" and has a negative reputation, because as you can see the query will wait until that open transaction is committed to complete. However, we have a solution that allows you to both read accurate data and have higher concurrency in your application.
By default, SQL Server uses an isolation level of Read Committed, which means a query cannot read data modified by another transaction that has not yet committed, thus preventing dirty reads. However, by changing that default isolation level to Read Committed Snapshot Isolation (better known as RCSI) each query gets the last committed version of the data. As shown in this example:
We have changed the statement in the left to an update just to better illustrate the point. An update is run without a COMMIT or ROLLBACK. When we run the queries on the right with RCSI enabled for the database, you can see that the first query, starting on line 7, returns the correct amount, while the query starting on line 13, with the NOLOCK hint returns the incorrect uncommitted value.
Does NOLOCK Prevent Blocking?
Use of the NOLOCK hint or READ UNCOMMITTED isolation level tells the database engine that your query or session is ready and willing to read uncommitted data. What it does not do is prevent the database engine from taking a lock on any underlying tables. A very basic example is this is shown in the below query:
In this session I've enabled trace flags 1200 and 3604, which will show all locks taken by a given query (you should never use these trace flags on a production server). Since the query I issued does not have a cached execution plan, SQL Server needs to generate one. In order to generate an execution plan, SQL Server needs to read statistics and other metadata objects. In doing so, it takes a schema stability lock (SCH-S) while it is generating the plan. While this take milliseconds on a simple query, if you have a complex query, plan generation may take more than a minute, blocking other query activities.
The inverse is also possible. Another transaction with a schema lock can block a query with a NOLOCK hint.
In this case, an alter table in an open transaction is blocking the query we ran earlier with the NOLOCK hint.
Should You Use With NOLOCK Hints?
If you are software developer and reading this article, no you should never use a NOLOCK hint. You can stop reading this article now. For the DBAs who are still reading, there are some administrative scenarios where you can and should use NOLOCK. For example, if you have used
Glenn Berry's diagnostic queries, you may notice that most of them use the NOLOCK hint. This is because as a DBA, you do not want to block user activity as you are administering your databases.
Beyond just the "do no harm" mantra there are some scenarios where you may want to read system data that is uncommitted. A couple of examples of this include rebuilding an index and querying the sys.partitions view to monitor the progress of that operation. Another example might be to get row count on a table that is currently being bulk loaded by multiple processes.
What I'd like you to take away from this column is to think judiciously about when and where you use the NOLOCK hint. Far too often, I see it bluntly applied across a code base without any thought of the impact of reading dirty data. Secondly, consider using read committed snapshot isolation on all of your user databases. The trade-off is that there is a small chance of repeated read and higher TempDB utilization on your instance. If you are using Accelerated Database Recovery on SQL Server 2019, the version is stored in your database, reducing TempDB load. However, it saves a lot of headaches, and Microsoft uses it as a default on Azure SQL Database, which means you should probably think about it in your environment.
About the Author
Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.