SQLMCM Training – Day 11

Today Jonathan presented on Replication and Paul about Database Snapshots, and his baby called “CHECKDB” – he has been married with CHECKDB for almost 5 years at Microsoft.

Module 8: Replication

Distributes data to different locations
Offloading reporting to another database
Transactional Replication
- Used in server-to-server scenarios that require high throughput
- Different Agents
  - Snapshot Agent
  - Log Reader Agent
    - Is also used by CDC & Change Data Capture
    - See log_reuse_wait_desc column in sys.database
    - Only committed transactions are copied to the Distribution Database
  - Distribution Agent
    - Runs on the Distributor for Push Subscriptions
    - Runs on the Subscriber for Pull Subscriptions
    - Copies transactions from the Distribution Databases to the Subscriber Databases
- Peer-to-Peer Replication maintans copies of data across multiple server
  - If you enable it, you can’t go back!
  - P2P Replication must be set per publication
Merge Replication
- Primarily for mobile and distributed server that have possible data conflicts
- Starts with a snapshot
- Incremential changes for data and schema are tracked with triggers
- Conflict Detection
- POS (Point Of Sale) appliations
- Different Agents
  - Snapshot Agent
  - Merge Agent
    - Applies initial snapshots to subscribers
    - Merges incremential changes
    - Detects and resolve conflicts
      - Priority based – highest priority wins
      - SQLCLR/COM- based on complex business rules
- Only the final changes to the data are applied
  - E.g. multiple Updates
  - Net changes are tracked at publisher and subscriber are merged periodically with conflict detection
- Needs planning for TempDb & VersionStore
  - Depends on the number of articles
Snapshot Replication
- You can a snapshot of the data delivered to the subscription
- Can be used as a preparation for Transactional- and Merge Replication
  - For large databases a backup would be a better option
  - You can also use Bulk Copy the initial data
- Snapshot takes locks when the snapshot is generated
  - Shared locks on the table are acquired
  - RCSI is not supported
Distribution database is needed, because there can be several subscribers
- When you make changes in the Distribution Database, you are not supported by Microsoft
  - Like Indexing
- Large Distribution Databases should be moved to its own server
SQL Server uses Table Scans inside the Distribution database, which can perform badly when the distribution database gets larger and large
- You can set the retention period
Updatable Subscriptions
- Allows subscribers to replicate changes back to the publisher
Database Mirroring & Replication
- Publisher: Full Support
- Subscriber. Limited Support
- Distributor: No Support
  - Failover Clustering is the only High Availability option
Failover Clustering & Replication
- Supported on each Replication role (Publisher, Subscriber, Distributor)
Replication Monitor
- Provides health about the Replication topology
Tracer Token
- Allows you to measure latency

Module 9: Database Snapshots

Consistent point of view of database
Can be created on a mirror to access database
Sparse file as a mapping- and data area
SQL Server is pushing 8kb once a time into the snapshot, not Extents
Once a page is pushed in the snapshot, it is never pushed again into the snapshot
Buffer Pool has an in-memory map which pages are in the snapshot and which are not
- The first time when the snapshot is accessed the in-memory map is generated
When the database is grown, the new pages are never in the snapshot, because they never existed in the snapshot when the database was created
When the original page is pushed into the snapshot, the page is latched (the BUF_LATCH structure)
3 different Snapshots of a database means 3 synchronous writes when a page gets updated
- The original page must be pushed synchronously in each of the 3 snapshots
Snapshot on a Snapshot will not work
The snapshot stores its own copy of a page in the Buffer Pool, when the page is read
- This can lead to memory pressure in the Buffer Pool
When there are open transactions during the creation of the snapshot
- Crash Recovery is of the source database is run into the snaphot
- E.g. When a transaction is active, a rollback of this transaction is occuring in the context of the snapshot
When you create a snapshot on a mirror, the consistent point of time of the snapshot is the time, when the last CHECKPOINT operation occured on the mirror
First step is a CHECKPOINT operation, you will not get in the snapshot the following:
- All transactions that have comitted AFTER the CHECKPOINT operation, they are rollbacked
- All transactions that were running at the CHECKPOINT operation, and were not committed at the time the CHECKPOINT operation occured
FILESTREAM file group can’t be included in a snapshot, but it does not prevent creating the snapshot
- But you can’t REVERT the snapshot, if there is a FILESTREAM file group
Snapshot goes suspect, when there is no more space for the snapshot
- Source database is un-effected, and snapshot can’t be used anymore
You can’t detach or restore the source code, when there is a snapshot defined for that database
http://blogs.msdn.com/b/psssql/archive/2009/03/04/sparse-file-errors-1450-or-665-due-to-file-fragmentation-fixes-and-workarounds.aspx
sys.databases/source_database_id IS NOT NULL return all Snapshots
When you begin a transaction and finally rollback the transaction, the changed pages from that transaction stays in the snapshot
REVERTing a snapshot sets the log file to 0,5MB and 2 VLFs
Database Snapshot Performance Considerations under I/O Intensive Workloads
- http://tinyurl.com/44te4ca

Module 10: Corruption Detection and Recovery

I/O Errors
- 823: Hard I/O error
  - OS can’t read data from disk
- 824 :Soft I/O error
  - Page Checksum failure
  - Torn Page Detection
    - Bits are corrected as soon as the pages are read into the Buffer Pool
  - Wrong PageID
- 825: Read-Retry Error
  - Retried 4 more times
  - Buffer Pool does the retry
  - Severity 10 informational message
  - When one of the Read-Retries fails, then you get the 825
  - http://www.sqlskills.com/BLOGS/PAUL/post/A-little-known-sign-of-impending-doom-error-825.aspx
- Logged in msdb.suspect_pages
Page Protection options doesn’t protect you
- They just detect corruption
Page Checksum is stored in the Page Header
When Page Checksum is enabled, nothing is done
- A page is only checksumed, when the page gets dirty
- Not available for TempDb since SQL Server 2008
  - You have to enable Checksum explictely on TempDb when upgrading from SQL Server 2005 to SQL Server 2008
Automatic Page Repair is available in both Standard and Enterprise Editions
CHECKDB doesn’t take any locks
Last Known Good (last time when CHECKDB run without any problems) is reported in the SQL Server Error Log, when you start SQL Server
EXTENDEND_LOGICAL_ACCESS
- Indexed Views
- XML Indexes
- Spatial Indexes
Statistics Blobs are not checked
Repair can break foreign-key constraints
VLDBs
- DBCC CHECKFILEGROUP
  - E.g. One partition per each night
You can run CHECKTABLE on the system tables
REPAIR_REBUILD option need single user mode
TABLERESULTS
- Returns the output as a table
- Undocumented, because the output can change from release to release
Online Index Rebuild reads the old index during the rebuild
- Offline Index Rebuild also reads almost every time the old index during the rebuild
- You have to drop and recreate the NCI
- It could be problematic if the NCI enforces a constraint, in the mean time when the NCI is dropped, users can insert data that is not enforced by the NCI, and afterwards you can’t recreate the NCI
DBCC CHECKDB can be run on a suspect database
- Suspect means that recovery started, but couldn’t be completed, so it is not transactional consistent
- But you must put it in EMERGENCY mode
Not all pages can be single page restored (because not all operations on them are fully logged)
- Boot page
- Fileheader page
- Allocation bitmap (not including IAM pages)
- Certain pages in hidden, critical system catalogs
- Logged in msdb.suspect_pages
- Up to 100 pages can be restored during a Single Page Restore
- Online Single Page Restore is an Enterprise feature
- If you have subsequent log backups, they must be also restored
  - You also need to do a tail-log backup and restore it also
- And the log system knows that only log records for the specific page must be replayed/rollbacked
Allocation System belongs to Page-ID 99
Repair is going to delete data!
Repair is always offline
When you repair a Replication, you must re-initialize the Subscriptions
- Repair is physically, Replication is logically
Msg 8921 when TempDb run out of space
Do not try to repair system tables!!!
Emergency Mode for rebuilding the transaction log
XVI32 Hex Editor 😉
- http://www.chmaas.handshake.de/delphi/freeware/xvi32/xvi32.htm
Damaged PFS page
- Allocation Order Scan uses the PFS page
  - E.g. SELECT * FROM TableA
- You can retrieve the data, even when the PFS page is damaged, by using an Clustered Index Scan by ordering by the Clustered Key, or using an index hint that forces an Clustered Index Scan
A suspect or recovery pending database can’t be detached on SQL Server 2008, because it can’t be re-attached any more…
- When you enforce it, it can be detached
- To attach it again
  - Create a new database with the SAME log
  - Set the database offline
  - Delete the MDF/LDF file
  - Copy the MDF/LDF file from the detached database
  - Set the database online
  - Database is again in the SUSPECT state
ATTACH_REBUILD_LOG

Thanks for reading and stay tuned for the last week J

-Klaus

SQLMCM Training – Day 11

SQLpassion Performance Tuning Training Plan – Signup

Do you want to master PostgreSQL like an expert?

PostgreSQL for the SQL Server Professional

More information about my Live Training