Database Corruption and DBCC commands
Data corruption could be a nightmare for every DBA. Database security and availability is the topmost priority job of every DBA.
In this blog, we will see what is data corruption, which commands we can use to identify corruption and how to get the database in a usable state.
What is Data Corruption?
Database corruption typically occurs as a result of hardware issues, but can also occur due to virtualization technologies.
It is important to implement database consistency checks as part of your database management strategy.
The following DBCC commands perform consistency checks on a database:
DBCC CHECKALLOC Checks the consistency of disk space allocation structures with a database.
The DBCC CHECKDB operation performs the following operations:
1) DBCC CHECKALLOC
[ ( database_name | database_id | 0
[ , NOINDEX
| , { REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD } ]
) ]
[ WITH
{
[ ALL_ERRORMSGS ]
[ , EXTENDED_LOGICAL_CHECKS ]
[ , NO_INFOMSGS ]
[ , TABLOCK ]
[ , ESTIMATEONLY ]
[ , { PHYSICAL_ONLY | DATA_PURITY } ]
[ , MAXDOP = number_of_processors ]
}
]
]
* Some of the underlying structures to be checked are more complex.
DBCC CHECKDB could be time consuming operation. In order to reduce time that DBCC takes, follow below techniques.
1) Use the NOINDEX
2) Use the PHYSICAL_ONLY
3) Take advantage of DBCC CHECKFILEGROUP to check a subset of the database’s filegroups.
4) Take advantage of DBCC CHECKTABLE and rotate the consistency checks across all the tables in the database across successive nights, similarly to DBCC CHECKFILEGROUP.
Now let’s see how we
can Identify database corruption.
1) Scheduling regular database consistency checks through SQL Server Agent jobs or maintenance plans helps identify database corruption.
The database engine automatically logs database consistency errors in certain places those are ERRORLOG and Windows Application Log.
2) Schedule SQL Server Agent alerts on the following error messages:
* 823 This error message is generated whenever the Windows environment issues a cyclic redundancy check. It represents a disk error.
* 824 This error message is generated when a logical consistency check fails after reading or writing a database page. It also typically indicates that there is a problem with the storage subsystem, hardware, or driver that is in the I/O path.
* 825 This error message is called the Read Retry error
and is generated whenever the database engine has had to retry a read or write
operation. The database engine retries the I/O operation up to four times
before failing it.
* 832 This error message is generated when a page in memory has been corrupted. In other words, the checksum has changed since it was readfrom disk into memory. It is a rare error. It possibly indicates bad memory.
Now let’s see how we can recover database from corruption.
When a DBCC operation runs and detects any corruption it describes the corruptions encountered and recommends a course of action and the minimum level of repair that is required.
Please find below screenshot where the minimum repair
command that are required to fix the errors encountered.
1) The primary purpose of DBCC’s repair operations is to get a corrupted database into a non-corrupted state, not to protect your data.
2) That is why Microsoft recommends that in the case of a database corruption your primary corruption recovery path should be to restore the database from the last good backup.
3) Set the corrupted database to single user mode before performing any repair operations. You don’t want users accessing a database that has corrupt data. Nor do you want them interfering with your repair attempts.
Below are few examples of repair database using DBCC commands
Repairing a database using the REPAIR_REBUILD operation
use [master]
--Get Database in SIngle User Mode
ALTER DATABASE [trans_DB] SET SINGLE_USER
-- Perform Safe repair i.e. REPAIR_REBUILD
DBCC CHECKDB (trans_DB) WITH REPAIR_REBUILD
Repairing a database using the REPAIR_ALLOW_DATA_LOSS
operation and perform an emergency repair
Perform an emergency repair using the REPAIR_ALLOW_DATA_LOSS option as an absolute last resort when you need to recover from a database corruption incident.
IMP:
1) The emergency repair process cannot be rolled back. It is a one-way operation.
2) Before performing a repair with the REPAIR_ALLOW_DATA_LOSS, it is highly recommended that you perform a file level backup of a database. Or even at the virtual machine level if you using a virtualized environment. Do not forget to include all files associated with the database, including any containers that are being used by FILESTREAM, full text indexes, and memory-optimized tables
Ex: Repairing a database using the REPAIR_ALLOW_DATA_LOSS operation
USE [master];
ALTER DATABASE [trans_DB] SET EMERGENCY;
ALTER DATABASE [trans_DB] SET SINGLE_USER;
DBCC CHECKDB ('trans_DB', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;
Ref:
Exam Ref 70-764 Book
This is very useful. Thanks
ReplyDelete