When a database is corrupted, we will alwayssuggest restoring last known good database backup instead of running DBCCCHECKDB command to fix it. Can you explain why? Try to use a demo to show whyDBCC CHECKDB command is not always
a first choice for a corrupted database.
Answer:
REPAIR_ALLOW_DATA_LOSS
· The purpose of repair is not to save user data.The purpose of repair is to make the database structurally consistent as fast as possible(to limit downtime) and correctly (to avoid making things worse).
· This means that repairs have to be engineered tobe fast and reliable operations that will work in every circumstance. Thesimple way to do this is to delete what's broken and fix up everything thatlinked to (or was linked from) the thing being deleted
- whether a record orpage.
· This means that repairs have to be engineered tobe fast and reliable operations that will work in every circumstance. Thesimple way to do this is to delete what's broken and fix up everything thatlinked to (or was linked from) the thing being deleted
- whether a record orpage.
· Things to look for that mean repair won’t be ableto fix everything are:
- CHECKDB stops early and complains about system table pre-checks failing (errors 7984 – 7988 inclusive)
- CHECKDB reports any metadata corruption (8992, 8995 errors)
- CHECKDB reports any errors on PFS page headers (8939, 8946 errors with a possible 8998 error as well)
Test1: Damage the data page
--create a test database
usemaster
createdatabase testcheckdb
usetestcheckdb
--create table t3
droptable t3
createtable t3
(
c1 int primarykey,
c2 nvarchar(50)
)
--insert a row (0, 'ABCDEFG')
select*
from t3
insertinto t3
values(0,'ABCDEFG')
go
--insert 1000 rows
declare@count int
set@count = 1
WHILE@count <= 1000
BEGIN
insert into t3values(@count,'abc')
set@count = @count+1
END;
GO
--check data
select*
from t3
--check page
dbcctraceon(3604)
DBCCind('testcheckdb', t3,-1)
--//154-IAMpage;155-index root page;153,156,157-data page
DBCCpage('testcheckdb', 1,154,3)--//
DBCCpage('testcheckdb', 1,155,3)--//
DBCCpage('testcheckdb', 1,153,3)--//
m_slotCnt = 351

DBCCpage('testcheckdb', 1,156,3)--//
m_slotCnt = 352
DBCCpage('testcheckdb', 1,157,3)--//
m_slotCnt = 298
--set db offline
usemaster
alterdatabase testcheckdbsetoffline

--modify the data page
alterdatabase testcheckdbsetonline
--check data again
usetestcheckdb
select*
from t3
• SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x54781bb2;actual: 0x54781c32). Itoccurredduring a read of page (1:153)
in database ID 12 at offset0x00000000132000 in file 'C:\Program Files\Microsoft SQLServer\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\testcheckdb.mdf'. Additional messages in the SQL Server errorlog or system event log may provide more detail. This is a severe errorcondition
that threatens database integrity and must be corrected immediately.Complete a full database consistencycheck (DBCC CHECKDB). This error can be caused by many factors; formore information, see SQL Server Books Online.
USEtestcheckdb
DBCCCHECKDB(12)
GO

usemaster
ALTERDATABASE testcheckdbSETSINGLE_USER;
GO
DBCCCHECKDB(12, REPAIR_ALLOW_DATA_LOSS)
WITHALL_ERRORMSGS,NO_INFOMSGS
GO
Repair: TheClustered index successfully rebuilt for the object "dbo.t3" indatabase "testcheckdb".
Repair: Thepage (1:153) has been deallocated from object ID 2105058535, index ID 1,partition ID 72057594038779904, alloc unit ID 72057594039697408 (type In-rowdata).
Msg 8945,Level 16, State 1, Line 1
Table error:Object ID 2105058535, index ID 1 will be rebuilt.
The error has been repaired.
Msg 8928,Level 16, State 1, Line 1
Object ID2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data): Page (1:153) could not be processed. See other errors for details.
The error has been repaired.
Msg 8939,Level 16, State 98, Line 1
Table error:Object ID 2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data), page (1:153). Test (IS_OFF (BUF_IOERR,pBUF->bstat)) failed. Values are 12584969 and -4.
The error has been repaired.
Msg 8980,Level 16, State 1, Line 1
Table error:Object ID 2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data). Index node page (1:155), slot 0 refers tochild page (1:153) and previous child (0:0), but they were not encountered.
The error has been repaired.
Msg 8978,Level 16, State 1, Line 1
Table error:Object ID 2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data). Page (1:156) is missing a reference fromprevious page (1:153). Possible chain linkage problem.
The error has been repaired.
CHECKDBfound 0 allocation errors and 4 consistency errors in table 't3' (object ID2105058535).
CHECKDBfixed 0 allocation errors and 4 consistency errors in table 't3' (object ID2105058535).
CHECKDBfound 0 allocation errors and 4 consistency errors in database 'testcheckdb'.
CHECKDBfixed 0 allocation errors and 4 consistency errors in database 'testcheckdb'.
usetestcheckdb
--check data again
select*
from t3

--check page again
dbcctraceon(3604)
DBCCind('testcheckdb', t3,-1)
--//former: 154-IAMpage;155-index root page;153,156,157-data page
--//now: 159-IAMpage;169-index rootpage;158,168-data page
DBCCpage('testcheckdb', 1,159,3)--//
DBCCpage('testcheckdb', 1,169,3)--//
DBCCpage('testcheckdb', 1,158,3)--//
m_slotCnt = 352
DBCC page('testcheckdb', 1,168,3)--//
m_slotCnt = 298
Test2: Damage the PFS page
--//check the PFS page
DBCCpage('testcheckdb', 1,1,2)-
--thenI modify it physically.

--then online the db , selectand insert.
--check the data, valid.
usetestcheckdb
select*
from t3
--insert a row, valid
insertinto t3
values(1001,'ABCD')
--insert another 1000 rows, just 52 rows inserted ok. The following failed
declare@count int
set@count = 1002
WHILE@count <= 2000
BEGIN
insert into t3values(@count,'abc')
set@count = @count+1
END;
GO


Thestatement has been terminated.
Msg824, Level 24, State 2, Line 5
SQLServer detected a logical consistency-based I/O error: incorrect checksum(expected: 0x20c60630; actual: 0x80660ace). It occurred during a read of page(1:1) in database ID 12 at offset 0x00000000002000 in file 'C:\ProgramFiles\Microsoft
SQLServer\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\testcheckdb.mdf'. Additional messages in the SQL Server errorlog or system event log may provide more detail. This is a severe errorcondition that threatens database integrity and must be corrected immediately.Complete
a full database consistency check (DBCC CHECKDB). This error can becaused by many factors; for more information, see SQL Server Books Online.
--then try to check page
dbcctraceon(3604)
DBCCind('testcheckdb', t3,-1)
--//former: 154-IAMpage;155-index root page;153,156,157-data page
--//now: 159-IAMpage;169-index rootpage;158,168-data page
dbcctraceon(3604)
DBCCpage('testcheckdb', 1,159,3)--//
DBCCpage('testcheckdb', 1,169,3)--//
DBCCpage('testcheckdb', 1,158,3)--//
former m_slotCnt = 352, now m_slotCnt = 352
DBCC page('testcheckdb', 1,168,3)--//
former m_slotCnt =298, now m_slotCnt = 351
--checkdb
DBCCCHECKDB(12)
WITHALL_ERRORMSGS,NO_INFOMSGS
GO
Msg 8921, Level 16, State 1, Line 1
Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.
Msg 8998, Level 16, State 2, Line 1
Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 12 pages from (1:0) to (1:8087). See other errors for cause.
CHECKDB found 1 allocation errors and 0 consistency errors not associated with any single object.
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:86) is pointed to by the next pointer of IAM page (0:0) in object ID 3, index ID 1, partition ID 196608, alloc unit ID 196608 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 'sys.sysrscols' (object ID 3).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:131) is pointed to by the next pointer of IAM page (0:0) in object ID 5, index ID 1, partition ID 327680, alloc unit ID 327680 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 'sys.sysrowsets' (object ID 5).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:17) is pointed to by the next pointer of IAM page (0:0) in object ID 7, index ID 1, partition ID 458752, alloc unit ID 458752 (type In-row data), but it was not detected in the scan.
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:54) is pointed to by the next pointer of IAM page (0:0) in object ID 7, index ID 2, partition ID 562949953880064, alloc unit ID 562949953880064 (type In-row data), but it was not detected in the scan.
CHECKDB found 2 allocation errors and 0 consistency errors in table 'sys.sysallocunits' (object ID 7).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:12) is pointed to by the next pointer of IAM page (0:0) in object ID 8, index ID 0, partition ID 524288, alloc unit ID 524288 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 'sys.sysfiles1' (object ID 8).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:26) is pointed to by the next pointer of IAM page (0:0) in object ID 19, index ID 1, partition ID 281474977955840, alloc unit ID 281474977955840 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 'sys.sysfgfrag' (object ID 19).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:28) is pointed to by the next pointer of IAM page (0:0) in object ID 23, index ID 1, partition ID 281474978217984, alloc unit ID 281474978217984 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 'sys.sysphfg' (object ID 23).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:30) is pointed to by the next pointer of IAM page (0:0) in object ID 24, index ID 1, partition ID 281474978283520, alloc unit ID 281474978283520 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 'sys.sysprufiles' (object ID 24).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:92) is pointed to by the next pointer of IAM page (0:0) in object ID 27, index ID 1, partition ID 281474978480128, alloc unit ID 281474978480128 (type In-row data), but it was not detected in the scan.
…………
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:74) is pointed to by the next pointer of IAM page (0:0) in object ID 93, index ID 2, partition ID 562949959516160, alloc unit ID 562949959516160 (type In-row data), but it was not detected in the scan.
CHECKDB found 2 allocation errors and 0 consistency errors in table 'sys.sysxmlplacement' (object ID 93).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:137) is pointed to by the next pointer of IAM page (0:0) in object ID 97, index ID 1, partition ID 281474983067648, alloc unit ID 281474983067648 (type In-row data), but it was not detected in the scan.
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:139) is pointed to by the next pointer of IAM page (0:0) in object ID 97, index ID 2, partition ID 562949959778304, alloc unit ID 562949959778304 (type In-row data), but it was not detected in the scan.
CHECKDB found 2 allocation errors and 0 consistency errors in table 'sys.sysbinsubobjs' (object ID 97).
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 99, index ID 0, partition ID 0, alloc unit ID 6488064 (type System allocation data), page (1:1). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12584969 and -4.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 99, index ID 0, partition ID 0, alloc unit ID 6488064 (type System allocation data), page (1:1). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12584969 and -4.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 99, index ID 0, partition ID 0, alloc unit ID 6488064 (type System allocation data), page (1:1). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12584969 and -4.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 99, index ID 0, partition ID 0, alloc unit ID 6488064 (type System allocation data), page (1:1). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12584969 and -4.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 99, index ID 0, partition ID 0, alloc unit ID 6488064 (type System allocation data), page (1:1). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12584969 and -4.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 99, index ID 0, partition ID 0, alloc unit ID 6488064 (type System allocation data), page (1:1). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12584969 and -4.
CHECKDB found 5 allocation errors and 1 consistency errors in table '(Object ID 99)' (object ID 99).
Msg 2575, Level 16, State 1, Line 1
The Index Allocation Map (IAM) page (1:159) is pointed to by the next pointer of IAM page (0:0) in object ID 2105058535, index ID 1, partition ID 72057594038845440, alloc unit ID 72057594039762944 (type In-row data), but it was not detected in the scan.
CHECKDB found 1 allocation errors and 0 consistency errors in table 't3' (object ID 2105058535).
CHECKDB found 56 allocation errors and 1 consistency errors in database 'testcheckdb'.
|
|
--try to repair, fail
DBCCCHECKDB(12, REPAIR_ALLOW_DATA_LOSS)
WITHALL_ERRORMSGS,NO_INFOMSGS
GO

|
"CHECKDB uses the PFS pages to determine which pages are allocated - and so which pages to read to drive the various consistency checks.The only repair for
a PFS page is to reconstruct it - they can't simply be deleted as they're a fixed part of the fabric of the database.PFS pages cannot be rebuilt because there is no infallible way to determine which pages
are allocated or not.There are various algorithms I've experimented with to rebuild them, with optimistic or pessimistic setting of page allocation statuses and then re-running the various consistency checks to try to sort out the incorrect choices,
but they allrequire very long run-times. Given the frequency with which these corruptions are seen, and the engineering effort required to come up with an (imperfect) solution,
I made the choice to leave this as unrepairable, and I don’t think that will change in future." ——Paul.S. Randal
|
|