Nucleus Datakonsult Why you want to be restrictive with shrink of database files

Home

About me and Nucleus
Partners

Articles, tips etc.
My blog
Links
Microsoft articles
Forums

Courses and training




Microsoft Most Valuable Professional


Overview
If you want the really short story, then check out this analogy - hopefully you come back here and read the full story. Introduced in SQL Server 7.0 was the ability automatically grow and to shrink the physical size of database data and transaction log files. Auto grow of files doesn't happen before the file is full, it happens when new space is needed (like when an insert is performed), so the user will wait the time it takes to grow until the modification is completed.

Auto grow and shrink can be very useful under special circumstances, for example after archiving data to some other location. However, we often see DBA's doing shrink on a regular basis and the purpose of this article is to explain some of the downsides of shrink and what actually happens when you shrink a database file. Also, it is worth noticing that the auto grow functionality was mainly added so the dba wouldn't be paged in the middle of the night because the database is full. It was never intended to be a high performance feature or to replace the need to manage space usage, especially at the high end

Acknowledgements
I like to thank Mark Allison and Kalen Delaney who provided valuable suggestions and input for this article.

Versions of SQL Server
This article applies to SQL Server 2000, 2005 and 2008, where not noted otherwise.

More information
You can shrink a database file using either DBCC SHRINKDATABASE (which targets all files for the database) or DBCC SHRINKFILE (which targets a particular database file). I prefer SHRINKFILE. I will not go through the details of the commands here; they are documented in SQL Server Books Online. Let us first determine what actually happens when you shrink a database file:

Shrinking of data file
When you shrink a data file, SQL Server will first move pages towards the beginning of the file. This frees up space at the end of the file and the file can then be shrunk (or as I like to view it: "cut off at the end").

Management of transaction log file
This section is only to serve as a brief introduction to the topic of transaction log file management. See below for reference.
Each modification performed in the database is reflected in the transaction log file. Needless to say, these log records need to be removed (or actually: overwritten) sooner or later - or else we would end up with a huge log file, or a full log file.
One way is to set the database to simple recovery model. This mean that SQL Server will automatically mark inactive log records as "OK to overwrite" (a.k.a. "truncate the log" or as I like to call it: "empty the log").
Another option is to have the database in full or bulk_logged recovery model and perform regular transaction log backups (BACKUP LOG). The transaction log is emptied when you backup the log. Note that the log is *not* emptied for other backup types (like BACKUP DATABASE).
See the SQL Server Books Online reference http://msdn.microsoft.com/en-us/library/ms345583.aspx for details. Make sure you read all the subsections. It is worth some 30 minutes of your time to understand how to manage the transaction log. Trust me on this.

Shrinking of transaction log file
SQL Server cannot move log records from the end of the log file toward the beginning of the log file. This means that SQL Server can only cut down the file size if the file is empty at the end of the file. The end-most log record sets the limit of how much the transaction log can be shrunk. A transaction log file is shrunk in units of Virtual Log Files (VLF). You can see the VLF layout using the undocumented DBCC LOGINFO command, which returns one row per virtual log file for the database:

DBCC LOGINFO('myDatabase')
FileId FileSize StartOffset FSeqNo Status Parity CreateLSN
2 253952 8192 11 0 128 0
2 253952 262144 13 0 128 0
2 270336 516096 12 0 128 7000000025000288
2 262144 786432 14 2 128 9000000008400246

The interesting column is "Status". 0 means that the VLF is not in use and 2 means that it is in use. In my example, I have 2 at the end of the file (read result from top to bottom) and this means that the file cannot currently be shrunk. How to handle this depends on your recovery model. You should adjust and replace options and database appropriately for below code. If you are uncertain, then check the command in the product documentation:
  • Simple recovery model
    USE dbname
    CHECKPOINT
    --First param below is fileno for log file, often 2. Check with sys.database_files
    --Second is desired size, in MB.
    DBCC SHRINKFILE(2, 500)
    DBCC SQLPERF(LOGSPACE) --Optional
    DBCC LOGINFO --Optional

    Now repeate above commands as many times as needed!

  • Full or bulk_logged recovery model
    USE dbname
    BACKUP LOG dbname TO DISK = 'C:\x\dbname.trn'
    --First param below is fileno for log file, often 2. Check with sys.database_files
    --Second is desired size, in MB.
    DBCC SHRINKFILE(2, 500)
    DBCC SQLPERF(LOGSPACE) --Optional
    DBCC LOGINFO --Optional

    Now repeate above commands as many times as needed!
What you end up doing is empty the log (CHECKPOINT or BACKUP LOG) and DBCC SHRINKFILE several times, so that SQL Server can move the head of the log to beginning of the file and also so the end of the file becomes unused. Investigate the layout of the log file using DBCC LOGINFO in between.

If you have loads of VLF (many rows returned from DBCC LOGINFO), you probably had a small file size for the log initially and then had lots of small autogrow. Having lots of VLF is a a performance hit, especially for database recovery and batch modifications. See this and this great blog posts by Linchi Shea. If this is your case, consider shrinking the log file to a very small size and then expand the file size to some comfortable larger size. Here are some articles specifically about management of log file size:
Shrinking the Transaction Log in SQL Server 2000 with DBCC SHRINKFILE
Log File Grows too big
Considerations for Autogrow and AutoShrink

So what is the problem? Why shouldn't I shrink database files on a regular basis?
Have a look at below list and then you can determine for yourself whether or not you want to shrink database files regularly:
  • Each page moved will be logged to the transaction log.
    Say you have a database with 50GB used space (data and indexes pages), and the shrink will shuffle 40 GB towards the beginning of the file. The log file will for this shrink operation need 40GB, likely auto-grow to that size (if you don't have 40GB free space in the log file already). The following log backup wll be 40GB in size, plus the "regular" log records. This doesn't seem to happen if the database is in simple recovery mode, possibly beacuse CHECKPOINT will prune the log regurarly during the shrink.
    (Applies to shrinking of data files.)

  • After the shrink, as users add rows etc in the database, the file has to grow again.
    Growing a database file is an expensive operation, it takes time and it also hurts performance (lots of resource usage). Also, some modifications will be blocked until the grow operation has finished.
    (Applies to shrinking of both data and log files.)

    SQL Server 2005 and later:
    As of SQL Server 2005, we have "instant file initialization" which means that database files can be created and also grow very fast since Windows doesn't "zero out" the data in the database file. Instant file initialization is only available for data files, not log files. Also, instant file initialization requires that the service account for the SQL Server service has the SE_MANAGE_VOLUME_NAME windows privilige, which can be assigned using the Perform Volume Maintenance Tasks security policy. This is by default only granted to Administrators.

  • There are situations where autogrow doesn't "catch up" with the space usage requirements.
    This will result in an error message from SQL Server when the modification is performed, returned to the client application: error 1105 if data is full and 9002 if log is full.
    (Applies to shrinking of both data and log files.)

  • Moving datapages around will fragment your database.
    Say you rebuild your indexes (which will requires free space in the database), and then shrink the database. The shrink will essentially undo the index rebuild, leaving you with fragmented indexes. Don't believe me? This is easy to test for yourself.
    What if you do it the other way around, shrink first, then rebuild? Well, the rebuld need free space in the database for the largest index that you rebuild, and it is likely you have a large table with a clustered index. A friend of mine had a 4GB used space db, where almost all space was one 4GB table. He did a shrink and then rebuild, where the re-build immediately "bumped up" the db size to 8GB.
    (Applies to shrinking of data files.)

  • Heavy shrinking and growing of database files will fragment your file system, which will further hurt performance.
    (Applies to shrinking of both data and log files.)

  • Repeatedly srinking and subsequent growing of transaction log files will typically result in many virtual log files which can cause long startup time for the database. This can manifest itself as long startup time for a database, long restore time, delays for transaction replication etc. Check out this blog post for more information.
Conclusion
In the end, only you can determine whether you want to shrink or not. With above information, you hopefully have enough to go on when making that decision. You probably know by now what I think of regular shrinking. Not to talk about autoshrink. :-)


Transact-SQL code
Below you find the T-SQL code I ran to prove that shrinking of a database will generate the same amount of transaction log records as the amount of data which was moved. (The script uses the 'NUL' filename as backup destination. This will "write" the backup data to the "bit bucket". I.e., the data isn't saved anywhere. This is very practical for testing but risking to state te obvious, do not use this backup destination for your production work!)

--Script to show that shrink produces a lot of log record. 
SET NOCOUNT ON
USE 
master
IF DB_ID('shrink_test'IS NOT NULL DROP DATABASE shrink_test
GO

CREATE DATABASE shrink_test
ON PRIMARY 
(NAME shrink_test_dataFILENAME N'c:\shrink_test_data.mdf'
,SIZE 5MBMAXSIZE 200MBFILEGROWTH 10%) 
LOG ON  
(NAME shrink_test_logFILENAME N'c:\shrink_test_log.ldf'
,SIZE 3MBMAXSIZE 200MBFILEGROWTH 10%) 
GO 

ALTER DATABASE shrink_test SET RECOVERY FULL

--Make sure the database isn't in auto-truncate mode for the log
BACKUP DATABASE shrink_test TO DISK = 'NUL' 

USE shrink_test 
CREATE TABLE 
(
 
c1 INT IDENTITY CONSTRAINT PK_shrink_test PRIMARY KEY NONCLUSTERED
,c2 CHAR(3000DEFAULT 'hello'


DECLARE @i INT 
SET 
@i 
WHILE @i <= 40000  
BEGIN 
  INSERT INTO 
DEFAULT VALUES 
  IF 
@i%500 --Truncate log on every 500'th row 
    
BACKUP LOG shrink_test TO DISK = 'NUL' 
  
SET @i @i 
END 
SELECT 
COUNT(c1AS "Number of rows before delete, should be 40000"
FROM t  
GO 

--Delete some rows, in a loop so log doesn't grow! 
DECLARE @i INT 
SET 
@i 
WHILE @i <= 20000  
BEGIN 
  DELETE FROM 
WHERE c1 @i 
  
IF @i%500 --Truncate log on every 500'th row 
    
BACKUP LOG shrink_test TO DISK = 'NUL' 
  
SET @i @i 
END 
SELECT 
COUNT(c1AS "Number of rows after delete, shuld be 20000"
FROM t  

--Empty the log
BACKUP LOG shrink_test TO DISK = 'NUL' 

--Database file should be a bit larger (160MB on my machine), logfile smaller (3MB)
SELECT namesize*8192/(1024*1024AS sizeInMB FROM sysfiles 
GO 

--This shrink might now produce a lot of log record as 20000 rows will be moved! 
CHECKPOINT
DBCC 
SHRINKFILE (shrink_test_data40)

--Database file should now be small, but logfile large
SELECT namesize*8192/(1024*1024AS sizeInMB FROM shrink_test..sysfiles

--My result, the data file shrunk to 80MB and the log file grew from 3MB to 146MB!!!

USE master
GO
DROP DATABASE shrink_test