SQL MCM Training – Day 1

Today was my first day of SQL MCM training here in Bellevue. We started at 08:30am and now (05:30pm) Kimberly is currently talking about how to use the provided VPC J. The main theme for today’s day was about learning the fundamental about performance tuning. We have covered the following 3 modules:

SQL Server IO
Waits & Queues
IO – Core Concepts

In the morning Paul and Kimberly provided a foundation about how SQL Server is doing IO and the inner architecture of SQL Server. There was a big overlap with the first day of the first week (which I have attended in London), but this was “by design” J. Afterwards Paul jump into Waits & Queues and he has shown a lot of different demos about the various “well-known” wait types like CXPACKET, ASYNC_NETWORK_IO, WRITELOG, etc. Paul also walked down to Latches and Spinlocks, which was really great. Afterwards Jonathan kicked in and told us a lot of IO Core Concepts like the difference between traditional hard disks and SSDs.

I’ve written down a lot of personal notes for the first day, and the great thing is that I want to share all those notes (and all notes that are coming within the next 3 weeks) with you, so that you can get a picture about the content that we covered. Also a big thanks to Paul, that he allows that! J

Module 1: SQL Server IO

File Header Page in the LDF is a mirror of the Boot Page of the MDF
- SQL Server can write the File Header Page in parallel but NOT the VLFs
- SQL Server has to read all File Header Pages during startup, therefore it is not recommended to have too much databases on a SQL Server instance
DEFAULT file group should be changed so that new objects go automatically to another file group instead of PRIMARY file group
When IFI (Instant File Initialization) is enabled you can have a larger Auto Growth setting instead of when IFI is not enabled
When Auto Growth for the Log File kicks in, NO additional log records can be flushed to disk and therefore have to wait
- But only for the database where Auto Growth kicked in, and NOT for the WHOLE instance
FGCB_ADD_REMOVE Latch (File Group Control Block)
- Needed for add/remove files, and Auto Growth
- Can be checked if Auto Growth is a performance issue
The IO subsystem can intersect the partition alignment of Windows Server 2008
- The partition can be therefore ALSO misaligned!!!
File System Level Defragmentation
- Shutdown SQL Server first – corruption can occur!
- Gains maybe 1 – 2% performance (when doing large range scans)
Reads from the TxLog can be sequential or random
- Sequential
  - Log Backup
  - Log Reader Agent for Replication
- Random
  - Crash Recovery
  - Rollback
- Log Records are chained together backwards through their LSN
- Rollback always goes backward, therefore Random IO
- Redo goes forward
- Rollback goes backward
Minimally Logged Operation
- Changed Data Pages are flushed to disk during the COMMIT
- The flush is Extent based
Allocation Weightings
- Weightings are per filegroup
- One file always has a weighting of 1
- Weightings are recalculated
  - Grown, Shrunk, Added, Removed or 8192 allocations took place
- During an allocation the weighting is reduced by 1, e.g. 8192 to 8191
- FGCB_ADD_REMOVE Latch when weightings are recalculated
- Proportional Fill doesn’t work on TempDb
- Can be tracked through Extended Events – see Jonathans Blog
TF 1117
- All files within a file group can growth the same size
- See: http://www.sqlskills.com/BLOGS/PAUL/post/Tempdb-configuration-survey-results.aspx
Backups/Restores
- Backups are doing sequential IO
- Backups are not going through the Buffer Pool
  - They are going directly to the data files
- Fast Recovery
  - Only available on Enterprise Edition
  - Database is accessible after the REDO phase, and before the UNDO phase begins
- Separate threads when you do backup striping
  - Backup and Restore
  - It’s only drive letter aware and mount point aware
  - See whitepaper “A Case Study: Fast and Reliable Backup and Restore of a VLDB over the Network” (Thomas Grohser)
  - See: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Backup-Resources-Where-oh-where-can-they-be.aspx
TempDb
- Reset to the previous size during restart
- The log file gets a different amount of VLFs, when the log was grown previously
  - Less VLFs than earlier, because of the formula used
  - VLF count is reset
- See http://sqlskills.com/blogs/jonathan/post/TSQL-Tuesday-11-e28093-Misconceptions-The-Tempdb-Log-File-and-VLF-Counts.aspx
- CHECKPOINT in TempDb happens when the log is 70% full – not influenced by the Recovery Interval setting
  - See http://www.sqlskills.com/BLOGS/PAUL/post/What-does-checkpoint-do-for-tempdb.aspx
- Only the LazyWriter writes dirty pages out to the storage system in TempDb, but not the CHECKPOINT process
  - Therefore SSDs doesn’t make always sense for TempDb
Perfmon counter “Disk Queue Length” is not relevant
- Outstanding IOs are ok, just look at the latency times from DMV sys.dm_io_virtual_file_stats

Module 2: Waits & Queues

Waits are just symthoms
LCK_M_S: Lock Mode Shared
APPEND_ONLY_STORAGE_FIRST_ALLOC
- Latch on the Version Store
FGCB_PRP_FILL: proportional fill latch
Dead Latches are possible
- Bugs inside SQL Server
Dead Spins are not possible
- You can’t acquire a Spin, when it is already acquired
When a Spinlock is acquired, nothing else can’t be latched or locked
SUSPENDED threads are not a ordered list, anyone can be signaled to go into the RUNNABLE state
RUNNABLE is a FIFO queue, expect when Resoure Governor is used
- Workload starvation could be theoretical possible for low priority workload
Propagated Page Split
- Scans from the Virtual Root down to ensure that the Shared Latches are present
CXPACKET
- Parallel Index Rebuild uses the statistics to determine how to divide the work
- Resource Governor can be used to set MAXDOP for resource pools
  - DBCC CHECKDB can’t use MAXDOP query hint, so Resource Governor can be used with a dedicated resource pool
OLEDB
- DBCC CHECKDB uses OLEDB internally
- Microsoft Office (Access, Excel)
WRITELOG
- SSD: 1 – 2ms
- Traditional Hard Disk: < 10ms
Reasons for CXPACKET
- Physical Cores are a little bit faster than logical cores, therefore CXPACKET waits also can occur
  - Therefore Hyper Threading is not recommend for SQL Server
- Scan of partitioned tables, when one partition is on a faster storage system than another partition
- Out-Of-State Statistics
Service Broker Wait Types
- http://blogs.msdn.com/b/sql_service_broker/archive/2008/12/01/service-broker-wait-types.aspx
Preemptive Wait Types of Win32 IO are indicating that FILESTREAM is used
System table contentation can be removed if you use multiple insertion points (like in TempDb)
- Create a dedicated table for each SPID in TempDb (not a temp table)

Module 3: IO Core Concepts

“General” SQL Server Latency recommendation
- < 8ms: Excellent
- < 12ms: Good
- < 20ms: Fair
- > 20ms: Poor
Parallel disk head positions are not possible
- All disk heads are positioned on the same position
Short Stroking
- Storing data in the outer tracks because of the higher data density and lower seek times
- Disks are storing data from the outer side to the inner side
SSD Seek Times
- From 0.0ms to 0.16ms
RAID1 can improve read performance by reading from each disk
RAID10 and RAID01 provides the same performance, but a different redundancy
RAID01
- 1 drive failure fails the whole RAID 0 array

Thanks for reading

-Klaus

1 thought on “SQL MCM Training – Day 1”

Mahesh
08/09/2011 at 4:30 AM

Thanks Klaus,

This is really amazing start I humbly request you t keep posting the similar articles thruout the training as these helps a lot os other DBA’s to understand, what all they don;t know.

Really appreciated your time and efforts to helps the cmmunity.

Comments are closed.

SQL MCM Training – Day 1

1 thought on “SQL MCM Training – Day 1”

SQLpassion Performance Tuning Training Plan – Signup

Your SQL Server is slow and nobody knows why?

Check out my brand new SQL Rescue Sprint!

No risk. No endless projects. No surprises.