Lifting the Lid on Database Snapshots (Guest Post by Mark Broadbent)
I have, for a very long time, been a big fan of database snapshots as a mechanism to achieve higher levels of concurrency in SQL Server when using pessimistic isolation–which is, of course, the default concurrency behaviour. As many of you most probably know, when running under pessimistic isolation, readers block writers and writers block readers; this is no great secret to almost anyone who has ever written a T-SQL statement. However, what is not necessarily as well known is that there is a very good way to get around this issue: by using rolling database snapshots, ensuring that your readers are no longer blocked by your writers and guaranteeing that your queries are still returning committed data only. Granted, the use of snapshots does raise one very interesting question, but before I address it, I first want to look at a couple of other techniques that are commonly used under SQL Server’s pessimistic isolation levels when using
The Truth About READCOMMITTED
The first and most obvious strategy–and the modus operandi of many a development team–is to use the
READUNCOMMITTED table hint. This will (to the more astute of you) be recognised as equivalent to using the infamous
NOLOCK table hint, for both are identical in behaviour; the only difference between them is simply that the latter uses a legacy hint name. Therefore, should you wish to use this form of isolation,
READUNCOMMITTED keyword should be preferred over
The reason that using
READUNCOMMITTED isolation is a very bad idea (and this also will not be a surprise to most people) is that, by using this behaviour, we are able to query dirty, uncommitted data. This means that our result set is going to be inaccurate (or at best uncertain) and surely goes against the whole point of using an ACID-based database engine.
The second and perhaps less obvious strategy is to use the
READPAST table hint (one of many subjects I discuss during my “READPAST & Furious: Locking, Blocking, and Isolation” presentation). One of the biggest problems of using this table hint to achieve high levels of concurrency is the fact that the result set returned from a query could (depending upon the current locks taken on your data) be incomplete. This is not really the kind of behaviour we should be aiming for, and although there are a few good possibilities that the
READPAST hint can provide–especially when used in conjunction with other hints–it is probably best that for now we leave it alone.
At the beginning of this post, I mentioned that there is one very important question that needs to be raised by our use of database snapshots. Before I say what this is, it will probably be useful for me to make a few statements about the behaviours a database snapshot allows and exhibits, since these things are sometimes misunderstood and not explicitly documented.
- A database snapshot can be created irrespectively of whether open transactions are in flight or not.
- The snapshot will provide the ability to query all data that has already been committed in a database from the very point the snapshot was created.
- A database snapshot is treated (to all intents and purposes) as a separate database. This means that querying the snapshot data will not cause lock incompatibility with the source database since the lock manager is only interested in comparing lock compatibility of exactly the same resource.
- A database snapshot will implement a copy on write mechanism so that when changes are made to the source data, the original data page is copied to the database snapshot’s sparse file.
- The database snapshot will create sparse files to house the original data pages of its source database, which will obviusly fill over time as more and more source data is changed from the point at which the snapshot was created.
Okay, now that I have now told you what a database snapshot will provide, I hope the benefits of creating them on the fly will be obvious, especially when you have a long-running reporting query and require it to run with the minimal amount of effort and overhead. I am sure you will be able to recognise the fact that the query will not be blocked by any activity or by any locks taken or already present in the source database (for the reasons already mentioned). Secondly, our long-running query will not cause blocking itself, which is obviously a big win for all concerned… but what about the overhead?
Well, do not forget we are still running under pessimistic concurrency. Therefore, we are avoiding using the version store, which in turn means avoiding the use of TempDB. Trying to eliminate the use of shared resources is obviously going to be a very good thing. However, there’s going to be overhead for any data pages that are changed during the lifetime of the database snapshot(s) since, as we have said earlier, the source pages will be copied to the sparse files on write. It is really because of this overhead for maintaining a snapshot on data change that I believe that we need to limit its lifetime for only as long as it is required–hence the use of rolling snapshots.
We also have another overhead to be concerned about. It is something that I have never heard mentioned before (anywhere) and brings me neatly to the question that needs to be asked: “If we have many transactions in flight and a checkpoint occurs on the database prior to creating a database snapshot, how will the transaction log be affected in terms of read activity? More specifically, whilst all transactions are still in flight, will a query of the snapshot cause read overhead on the transaction log? If not, why?”
Let me first examine part of this question. It is important for me to clarify, since a common misconception of many DBAs is that only data pages that have been committed can be checkpointed to the database data file(s). This is a very common trap, one in which I have fallen into myself in the past. In actual fact, all dirty pages in the buffer cache are written to the database data file(s) on
CHECKPOINT, which means that in the scenario posed by my question, our snapshot will only have one place to obtain those committed pages that were overwritten by the checkpointed uncommitted pages. Yes, you’ve guessed it: the transaction log.
Snapshots in Action
So now we can move forward and investigate exactly what happens in this scenario. We can do this by performing a series of operations, ensuring that before and after each step we capture the log file IO usage statistics.
- Update all rows in a table in an open-ended transaction, therefore taking out Exclusive locks on all the data.
- Create database snapshot.
- Query the table in question.
In order to keep track of our database’s log file usage statistics, I can use a very handy dynamic management view (DMV) called sys.dm_io_virtual_stats and store the results to a temporary table so we can compare and contrast log file activity before and after database snapshot creation.
SELECT file_id, num_of_reads, num_of_bytes_read, num_of_writes,num_of_bytes_written, size_on_disk_bytes FROM sys.dm_io_virtual_file_stats(DB_ID('READPAST & Furious'),2)
To make the capturing of statistics easier to perform and read, I have created a simple stored procedure that saves the results to a temporary table and includes an optional description for the step which is currently being performed.
I shall now start running through each step:
EXEC usp_capturelogstats 'pre-update' --before proceeding, ensure that the following open-ended transaction --is executed within a new connection to update table BEGIN TRAN UPDATE orders SET status = 3; --leave this transaction open and change back to the previous connection EXEC usp_capturelogstats 'post-update' --checkpoint database to write all dirty pages to disk CHECKPOINT EXEC usp_capturelogstats 'post-checkpoint' --create snapshot CREATE DATABASE [READPAST & Furious Snapshot] ON (NAME = 'READPAST_and_Furious', FILENAME = 'D:\READPAST_and_Furious_Snapshot.mdf') AS SNAPSHOT OF [READPAST & Furious] EXEC usp_capturelogstats 'post-snapshot' --query snapshot SELECT * FROM [READPAST & Furious Snapshot].dbo.orders EXEC usp_capturelogstats 'post-query'
So we have flushed our uncommitted changes to disk, created a snapshot, and queried it whilst still holding open all the exclusive locks across the pages of the entire table and collected all the log file usage statistics at each stage. So it is now time to take a look at the results.
It is plain to see from the results that there is significant IO read overhead that occurs on creation of the database snapshot (iteration 4). These log file usage statistics demonstrate that at this stage, SQL Server performs recovery upon the log file in order to populate the snapshot’s sparse files with the previous image of the data pages that had been overwritten in the data file(s) by the checkpoint operation. This is quite a clever implementation by the SQL Server Database Engine team because it means that log file recovery would only have to occur once per snapshot (rather than once per query). Another very important point to highlight is that contrary to popular belief, the database snapshot files are not completely sparse; their size is wholly dependent upon how much uncommitted dirty data for the database exists at that point in time. Snapshot creation would obviously also take at least the time for recovery of those data pages in the log file to complete.
You will see from the following picture that our database snapshot’s sparse file was created with a size of 83.2 MB, but actually occupies an incredible 21.2 MB on disk already!
For me, database snapshots are an incredibly underused technology in SQL Server. They can provide significant concurrency benefits when used in the right way under pessimistic isolation. However, the way in which SQL Server creates a snapshot is currently very poorly documented within MSDN and Books Online, so it is very important you understand the impact you could cause to your SQL Server on a very busy server with high volume transaction throughput. In that scenario, it is even more important that your transaction’s duration is short-lived if you intend upon using database snapshots for concurrency benefits. Otherwise? Well, your log file may feel the pain.