One day I and my colleague was working on a data issue. There was some expected data missing in the table hence, we were analyzing the SQL job which puts the data in the table. However, while discussing this issue we got comment from someone like is it a kind of a data corruption issue? I and my friend looked at each other with shocked eyes. we told that person to keep calm and don’t be panic as this word could create a panic situation unnecessarily. We could able to solve the data issues that day but that person’s comment inspired me to write a blog on data corruption and its resolution.

Data corruption could be a nightmare for every DBA. Database security and availability is the topmost priority job of every DBA.

In this blog, we will see what is data corruption, which commands we can use to identify corruption and how to get the database in a usable state.

What is Data Corruption?

Data corruption is when there is an error while reading or updating data in the database.

Database corruption typically occurs as a result of hardware issues, but can also occur due to virtualization technologies.

It is important to implement database consistency checks as part of your database management strategy.

The following DBCC commands perform consistency checks on a database:

DBCC CHECKALLOC Checks the consistency of disk space allocation structures with a database.

DBCC CHECKCATALOG Checks the consistency of the system tables with a database.

DBCC CHECKDB Checks the logical and physical integrity of all the objects of a database.

DBCC CHECKFILEGROUP Checks the allocation and structural integrity of all tables and indexed views in a filegroup.

DBCC CHECKTABLE Checks the allocation and structural integrity a table.

Note:

The DBCC CHECKDB operation performs the following operations:

1) DBCC CHECKALLOC

2) DBCC CHECKTABLE

3) DBCC CHECKCATALOG

4) Performs a consistency check on all indexed views within a database

5) Performs a consistency check on FILESTREAM data within a database

6) Performs a consistency check on the Service Broker data within a database

DBCC CHECKDB
    [ ( database_name | database_id | 0
        [ , NOINDEX
        | , { REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD } ]
    ) ]
    [ WITH
        {
            [ ALL_ERRORMSGS ]
            [ , EXTENDED_LOGICAL_CHECKS ]
            [ , NO_INFOMSGS ]
            [ , TABLOCK ]
            [ , ESTIMATEONLY ]
            [ , { PHYSICAL_ONLY | DATA_PURITY } ]
            [ , MAXDOP = number_of_processors ]
        }
    ]
]

1) NOINDEX: No check on nonclustered indexes.

2) REPAIR_ALLOW_DATA_LOSS: Tries to repair all reported errors. These repairs can cause some data loss.

3) REPAIR_FAST: Maintains syntax for backward compatibility only. No repair actions are performed.

4) REPAIR_REBUILD: Performs repairs that have no possibility of data loss. This can include quick repairs, such as repairing missing rows in nonclustered indexes, and more time-consuming repairs, such as rebuilding an index.This argument does not repair errors involving FILESTREAM data.

5) ALL_ERRORMSGS: All error messages are displayed by default

6) EXTENDED_LOGICAL_CHECKS: Performs logical consistency checks on an indexed view, XML indexes, and spatial indexes, were present.

7) NO_INFOMSGS: Suppresses all informational messages.

8) TABLOCK: Causes DBCC CHECKDB to obtain locks instead of using an internal database snapshot.

9) ESTIMATEONLY: Displays the estimated amount of tempdb space that is required to run DBCC CHECKDB with all the other specified options. The actual database check is not performed.

10) PHYSICAL_ONLY: Limits the checking to the integrity of the physical structure of the page and record headers and the allocation consistency of the database.

11) DATA_PURITY: This option specifies that column values should be checked to ensure that they are valid for the domain and not out of range.

12) MAXDOP: Controls CPU uses.

A full run of DBCC CHECKDB may take considerably longer to complete than earlier versions. This behaviour occurs because:

* The logical checks are more comprehensive.

* Some of the underlying structures to be checked are more complex.

* Many new checks have been introduced to include the new features.

Therefore, using the PHYSICAL_ONLY option may cause a much shorter run-time for DBCC CHECKDB on large databases and is recommended for frequent use on production systems.

DBCC CHECKDB could be time consuming operation. In order to reduce time that DBCC takes, follow below techniques.

1) Use the NOINDEX

2) Use the PHYSICAL_ONLY

3) Take advantage of DBCC CHECKFILEGROUP to check a subset of the database’s filegroups.

4) Take advantage of DBCC CHECKTABLE and rotate the consistency checks across all the tables in the database across successive nights, similarly to DBCC CHECKFILEGROUP.

Now let’s see how we can Identify database corruption.

1) Scheduling regular database consistency checks through SQL Server Agent jobs or maintenance plans helps identify database corruption.

The database engine automatically logs database consistency errors in certain places those are ERRORLOG and Windows Application Log.

2) Schedule SQL Server Agent alerts on the following error messages:

* 823 This error message is generated whenever the Windows environment issues a cyclic redundancy check. It represents a disk error.

* 824 This error message is generated when a logical consistency check fails after reading or writing a database page. It also typically indicates that there is a problem with the storage subsystem, hardware, or driver that is in the I/O path.

* 825 This error message is called the Read Retry error and is generated whenever the database engine has had to retry a read or write operation. The database engine retries the I/O operation up to four times before failing it.

* 832 This error message is generated when a page in memory has been corrupted. In other words, the checksum has changed since it was readfrom disk into memory. It is a rare error. It possibly indicates bad memory.

Now let’s see how we can recover database from corruption.

When a DBCC operation runs and detects any corruption it describes the corruptions encountered and recommends a course of action and the minimum level of repair that is required.

Please find below screenshot where the minimum repair command that are required to fix the errors encountered.

Note:

1) The primary purpose of DBCC’s repair operations is to get a corrupted database into a non-corrupted state, not to protect your data.

2) That is why Microsoft recommends that in the case of a database corruption your primary corruption recovery path should be to restore the database from the last good backup.

3) Set the corrupted database to single user mode before performing any repair operations. You don’t want users accessing a database that has corrupt data. Nor do you want them interfering with your repair attempts.

Below are few examples of repair database using DBCC commands

Repairing a database using the REPAIR_REBUILD operation

use [master]

--Get Database in SIngle User Mode
ALTER DATABASE [trans_DB] SET SINGLE_USER

-- Perform Safe repair i.e. REPAIR_REBUILD
DBCC CHECKDB (trans_DB) WITH REPAIR_REBUILD

Repairing a database using the REPAIR_ALLOW_DATA_LOSS operation and perform an emergency repair

Perform an emergency repair using the REPAIR_ALLOW_DATA_LOSS option as an absolute last resort when you need to recover from a database corruption incident.

IMP:

1) The emergency repair process cannot be rolled back. It is a one-way operation.

2) Before performing a repair with the REPAIR_ALLOW_DATA_LOSS, it is highly recommended that you perform a file level backup of a database. Or even at the virtual machine level if you using a virtualized environment. Do not forget to include all files associated with the database, including any containers that are being used by FILESTREAM, full text indexes, and memory-optimized tables

.
Ex: Repairing a database using the REPAIR_ALLOW_DATA_LOSS operation

USE [master];

ALTER DATABASE [trans_DB] SET EMERGENCY;

ALTER DATABASE [trans_DB] SET SINGLE_USER;

DBCC CHECKDB ('trans_DB', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

Ref:
Exam Ref 70-764 Book

Search This Blog

Devendra-Techs

Database Corruption and DBCC commands

What is Data Corruption?

Comments

Post a Comment

Popular posts from this blog

What is DACPAC and how to use it?

Always on setup end to end - Part 1 || Pre-requisites

Insufficient disk space in filegroup