Today was my first day of SQL MCM training here in Bellevue. We started at 08:30am and now (05:30pm) Kimberly is currently talking about how to use the provided VPC J. The main theme for today’s day was about learning the fundamental about performance tuning. We have covered the following 3 modules:
- SQL Server IO
- Waits & Queues
- IO – Core Concepts
In the morning Paul and Kimberly provided a foundation about how SQL Server is doing IO and the inner architecture of SQL Server. There was a big overlap with the first day of the first week (which I have attended in London), but this was “by design” J. Afterwards Paul jump into Waits & Queues and he has shown a lot of different demos about the various “well-known” wait types like CXPACKET, ASYNC_NETWORK_IO, WRITELOG, etc. Paul also walked down to Latches and Spinlocks, which was really great. Afterwards Jonathan kicked in and told us a lot of IO Core Concepts like the difference between traditional hard disks and SSDs.
I’ve written down a lot of personal notes for the first day, and the great thing is that I want to share all those notes (and all notes that are coming within the next 3 weeks) with you, so that you can get a picture about the content that we covered. Also a big thanks to Paul, that he allows that! J
Module 1: SQL Server IO
-
File Header Page in the LDF is a mirror of the Boot Page of the MDF
- SQL Server can write the File Header Page in parallel but NOT the VLFs
- SQL Server has to read all File Header Pages during startup, therefore it is not recommended to have too much databases on a SQL Server instance
- SQL Server can write the File Header Page in parallel but NOT the VLFs
- DEFAULT file group should be changed so that new objects go automatically to another file group instead of PRIMARY file group
- When IFI (Instant File Initialization) is enabled you can have a larger Auto Growth setting instead of when IFI is not enabled
-
When Auto Growth for the Log File kicks in, NO additional log records can be flushed to disk and therefore have to wait
- But only for the database where Auto Growth kicked in, and NOT for the WHOLE instance
- But only for the database where Auto Growth kicked in, and NOT for the WHOLE instance
-
FGCB_ADD_REMOVE Latch (File Group Control Block)
- Needed for add/remove files, and Auto Growth
- Can be checked if Auto Growth is a performance issue
- Needed for add/remove files, and Auto Growth
-
The IO subsystem can intersect the partition alignment of Windows Server 2008
- The partition can be therefore ALSO misaligned!!!
- The partition can be therefore ALSO misaligned!!!
-
File System Level Defragmentation
- Shutdown SQL Server first – corruption can occur!
- Gains maybe 1 – 2% performance (when doing large range scans)
- Shutdown SQL Server first – corruption can occur!
-
Reads from the TxLog can be sequential or random
-
Sequential
- Log Backup
- Log Reader Agent for Replication
- Log Backup
-
Random
- Crash Recovery
- Rollback
- Crash Recovery
- Log Records are chained together backwards through their LSN
- Rollback always goes backward, therefore Random IO
- Redo goes forward
- Rollback goes backward
-
-
Minimally Logged Operation
- Changed Data Pages are flushed to disk during the COMMIT
- The flush is Extent based
- Changed Data Pages are flushed to disk during the COMMIT
-
Allocation Weightings
- Weightings are per filegroup
- One file always has a weighting of 1
-
Weightings are recalculated
- Grown, Shrunk, Added, Removed or 8192 allocations took place
- Grown, Shrunk, Added, Removed or 8192 allocations took place
- During an allocation the weighting is reduced by 1, e.g. 8192 to 8191
- FGCB_ADD_REMOVE Latch when weightings are recalculated
- Proportional Fill doesn’t work on TempDb
- Can be tracked through Extended Events – see Jonathans Blog
- Weightings are per filegroup
-
TF 1117
- All files within a file group can growth the same size
- See: http://www.sqlskills.com/BLOGS/PAUL/post/Tempdb-configuration-survey-results.aspx
- All files within a file group can growth the same size
-
Backups/Restores
- Backups are doing sequential IO
-
Backups are not going through the Buffer Pool
- They are going directly to the data files
- They are going directly to the data files
-
Fast Recovery
- Only available on Enterprise Edition
- Database is accessible after the REDO phase, and before the UNDO phase begins
- Only available on Enterprise Edition
-
Separate threads when you do backup striping
- Backup and Restore
- It’s only drive letter aware and mount point aware
- See whitepaper “A Case Study: Fast and Reliable Backup and Restore of a VLDB over the Network” (Thomas Grohser)
- See: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Backup-Resources-Where-oh-where-can-they-be.aspx
- Backup and Restore
- Backups are doing sequential IO
-
TempDb
- Reset to the previous size during restart
-
The log file gets a different amount of VLFs, when the log was grown previously
- Less VLFs than earlier, because of the formula used
- VLF count is reset
- Less VLFs than earlier, because of the formula used
- See http://sqlskills.com/blogs/jonathan/post/TSQL-Tuesday-11-e28093-Misconceptions-The-Tempdb-Log-File-and-VLF-Counts.aspx
-
CHECKPOINT in TempDb happens when the log is 70% full – not influenced by the Recovery Interval setting
-
Only the LazyWriter writes dirty pages out to the storage system in TempDb, but not the CHECKPOINT process
- Therefore SSDs doesn’t make always sense for TempDb
- Therefore SSDs doesn’t make always sense for TempDb
- Reset to the previous size during restart
-
Perfmon counter “Disk Queue Length” is not relevant
- Outstanding IOs are ok, just look at the latency times from DMV sys.dm_io_virtual_file_stats
- Outstanding IOs are ok, just look at the latency times from DMV sys.dm_io_virtual_file_stats
Module 2: Waits & Queues
- Waits are just symthoms
- LCK_M_S: Lock Mode Shared
-
APPEND_ONLY_STORAGE_FIRST_ALLOC
- Latch on the Version Store
- Latch on the Version Store
- FGCB_PRP_FILL: proportional fill latch
-
Dead Latches are possible
- Bugs inside SQL Server
- Bugs inside SQL Server
-
Dead Spins are not possible
- You can’t acquire a Spin, when it is already acquired
- You can’t acquire a Spin, when it is already acquired
- When a Spinlock is acquired, nothing else can’t be latched or locked
- SUSPENDED threads are not a ordered list, anyone can be signaled to go into the RUNNABLE state
-
RUNNABLE is a FIFO queue, expect when Resoure Governor is used
- Workload starvation could be theoretical possible for low priority workload
- Workload starvation could be theoretical possible for low priority workload
-
Propagated Page Split
- Scans from the Virtual Root down to ensure that the Shared Latches are present
- Scans from the Virtual Root down to ensure that the Shared Latches are present
-
CXPACKET
- Parallel Index Rebuild uses the statistics to determine how to divide the work
-
Resource Governor can be used to set MAXDOP for resource pools
- DBCC CHECKDB can’t use MAXDOP query hint, so Resource Governor can be used with a dedicated resource pool
- DBCC CHECKDB can’t use MAXDOP query hint, so Resource Governor can be used with a dedicated resource pool
- Parallel Index Rebuild uses the statistics to determine how to divide the work
-
OLEDB
- DBCC CHECKDB uses OLEDB internally
- Microsoft Office (Access, Excel)
- DBCC CHECKDB uses OLEDB internally
-
WRITELOG
- SSD: 1 – 2ms
- Traditional Hard Disk: < 10ms
- SSD: 1 – 2ms
-
Reasons for CXPACKET
-
Physical Cores are a little bit faster than logical cores, therefore CXPACKET waits also can occur
- Therefore Hyper Threading is not recommend for SQL Server
- Therefore Hyper Threading is not recommend for SQL Server
- Scan of partitioned tables, when one partition is on a faster storage system than another partition
- Out-Of-State Statistics
-
-
Service Broker Wait Types
- Preemptive Wait Types of Win32 IO are indicating that FILESTREAM is used
-
System table contentation can be removed if you use multiple insertion points (like in TempDb)
- Create a dedicated table for each SPID in TempDb (not a temp table)
- Create a dedicated table for each SPID in TempDb (not a temp table)
Module 3: IO Core Concepts
-
“General” SQL Server Latency recommendation
- < 8ms: Excellent
- < 12ms: Good
- < 20ms: Fair
- > 20ms: Poor
- < 8ms: Excellent
-
Parallel disk head positions are not possible
- All disk heads are positioned on the same position
- All disk heads are positioned on the same position
-
Short Stroking
- Storing data in the outer tracks because of the higher data density and lower seek times
- Disks are storing data from the outer side to the inner side
- Storing data in the outer tracks because of the higher data density and lower seek times
-
SSD Seek Times
- From 0.0ms to 0.16ms
- From 0.0ms to 0.16ms
- RAID1 can improve read performance by reading from each disk
- RAID10 and RAID01 provides the same performance, but a different redundancy
-
RAID01
- 1 drive failure fails the whole RAID 0 array
- 1 drive failure fails the whole RAID 0 array
Thanks for reading
-Klaus
1 thought on “SQL MCM Training – Day 1”
Thanks Klaus,
This is really amazing start I humbly request you t keep posting the similar articles thruout the training as these helps a lot os other DBA’s to understand, what all they don;t know.
Really appreciated your time and efforts to helps the cmmunity.
Comments are closed.