Mike Ruthruff SQLServer on SAN SQLCAT

MICROSOFT SQL SERVER® ON SAN BEST PRACTICES

Mike Ruthruff, Program Manager
Contributor: Prem Mehra
SQL Server Customer Advisory Team
Microsoft® Corporation

SQL Server Customer Advisory Team (SQLCAT)

Works on the largest, most complex SQL Server projects worldwide
– US: NASDAQ, USDA, Verizon, Raymond James…
– Europe: London Stock Exchange, Barclay’s Capital
– Asia/Pacific: Korea Telecom, Western Digital, Japan Railways East
– ISVs: SAP, Siebel, Sharepoint, GE Healthcare

Drives product requirements back into SQL Server from our customers and ISVs

Shares deep technical content with SQL Server community
– SQLCAT.com
– http://blogs.msdn.com/sqlcat
– http://blogs.msdn.com/mssqlisv
– http://technet.microsoft.com/en-us/sqlserver/bb331794.aspx


SQL Server Design Win Program

Target the Most Challenging and Innovative Applications on SQL Server

Investing in Large Scale, Referenceable SQL Server Projects Across the World
– Provide SQLCAT technical & project experience
– Conduct architecture and design reviews covering performance, operation, scalability and availability aspects
– Offer use of HW lab in Redmond with direct access to SQL Server development team

Work with Marketing Team Developing Case Studies


Agenda

Characteristics of SQL Server I/O operations
Best practices
– SQL Server Design Practices
– Storage Configuration
– Common Pitfalls
Monitoring performance of SQL Server on SAN
Emerging Storage Technologies

Lots of Additional Material in Appendix Section (not covered during session)
– How to validate a configuration using I/O load generation tools
– General SQL Server I/O characteristics
– How to diagnose I/O bottlenecks
– Sample Configurations

SQL Server’s I/O Characteristics: Typical I/O Patterns

Generalizing SQL Server I/O patterns is difficult, which makes sizing storage for a SQL Server deployment non-trivial in some cases
OLTP (Online Transaction Processing)
– Typically heavy on random reads / writes (8K most common)
– Some amount of read-ahead is typical
RDW (Relational Data Warehousing)
– Typically 64KB+ sequential reads (table and range scans)
– 128-256KB sequential writes (bulk load)
Operational Activities
– Backup/Restore, Index Rebuild, etc. (see appendix)
Many “mixed” workloads observed in customer deployments
Analysis Services I/O patterns
– Up to 64KB random reads

See appendix for more details on I/O characteristics of certain SQL Server operations

SQL Server’s I/O Characteristics: Log Manager

• User threads fill log buffers and request the log manager to flush all records up to a certain LSN; the log manager thread writes the buffers
• Sequential in nature
• Individual write size varies
– Dependent on nature of transaction
– Transaction “Commit” forces log buffer to be flushed to disk
– Up to 60KB in size
• Log manager throughput considerations

Version | Limits (per database)
SQL Server 2000 SP4 & SQL Server 2005 RTM | Limit log writes to 8 outstanding
SQL Server 2005 SP1 or later | Outstanding log writes: 32-bit = limit of 8, 64-bit = limit of 32; no more than 480K “in-flight”
SQL Server 2008 | Outstanding log writes: 32-bit = limit of 8, 64-bit = limit of 32; no more than 3840K “in-flight”

Monitoring SQL Server I/O: Log Manager

• How do I determine if I am hitting log bottlenecks?
• First look for associated wait types (sys.dm_os_wait_stats): WRITELOG & LOGBUFFER
• Suboptimal disk response times (most common issue)
– Logical Disk counters: Avg. Disk sec/Write
– SQL Server:Databases: (Log Flush Wait Time) / (Log Flushes/sec)
• Log manager limits (SQL Server:Databases log counters)
– Amount of “in-flight” I/O (compare with the limit) = Avg. Bytes per Flush * Current Queue Length
– Avg. Bytes per Flush = (Log Bytes Flushed/sec) / (Log Flushes/sec)
– Number of outstanding I/Os (compare with the limit) = Current Disk Queue Length or sys.dm_io_pending_io_requests
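The counter arithmetic on this slide can be sketched in a few lines. This is a minimal Python sketch, not part of the original deck: the function names and the sample counter values are hypothetical, and the limits used are the SQL Server 2008 64-bit values quoted on the log manager slide (32 outstanding writes, 3840K in flight).

```python
# Limits quoted on the log manager slide for SQL Server 2008 (64-bit), per database.
MAX_OUTSTANDING_LOG_WRITES = 32
MAX_IN_FLIGHT_BYTES = 3840 * 1024


def avg_bytes_per_flush(log_bytes_flushed_per_sec, log_flushes_per_sec):
    """Avg. Bytes per Flush = (Log Bytes Flushed/sec) / (Log Flushes/sec)."""
    if log_flushes_per_sec == 0:
        return 0.0
    return log_bytes_flushed_per_sec / log_flushes_per_sec


def log_limit_report(log_bytes_flushed_per_sec, log_flushes_per_sec, queue_length):
    """Estimate in-flight log I/O and compare it with the log manager limits."""
    per_flush = avg_bytes_per_flush(log_bytes_flushed_per_sec, log_flushes_per_sec)
    in_flight = per_flush * queue_length
    return {
        "avg_bytes_per_flush": per_flush,
        "in_flight_bytes": in_flight,
        "at_outstanding_limit": queue_length >= MAX_OUTSTANDING_LOG_WRITES,
        "at_in_flight_limit": in_flight >= MAX_IN_FLIGHT_BYTES,
    }


# Hypothetical sample: 56,000,000 log bytes/sec in 14,000 flushes/sec, 32 writes queued.
report = log_limit_report(56_000_000, 14_000, 32)
print(report)
```

In this made-up sample the small average flush size means the outstanding-write count is reached long before the in-flight byte limit, which is the pattern the next slide describes.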

Monitoring SQL Server I/O: Log Manager, example of 32 outstanding I/O limit

High rate of inserts ~14,000 inserts/sec

Observed high waits on WRITELOG

During checkpoint activities log manager encounters periods of 32 outstanding I/O’s

SQL Server’s I/O Characteristics: Checkpoint

• Periodically sweeps buffer pool and flushes dirty buffers to disk
– Up to 32 pages in a single I/O request (WriteFileGather)
– More random in nature, although it attempts to find adjacent pages
• Types of Checkpoints
– Background/automatic checkpoints: triggered by log volume or recovery interval and performed by the checkpoint thread
– User-initiated checkpoints: initiated by the T-SQL CHECKPOINT command
– Reflexive checkpoints: automatic as part of some operations, such as recovery, restore, snapshot creation, etc.
• Concurrency
– Background/automatic checkpoints take place one at a time; however, any number of user-initiated or reflexive checkpoints may occur simultaneously as long as they are for different databases
– On NUMA systems, checkpoints spread work to the lazy writer on each node

SQL Server’s I/O Characteristics: Checkpoint (2) / Lazy Writer

• Checkpoint Throttling
– Checkpoint measures I/O latency impact and automatically adjusts checkpoint I/O to keep the overall latency from being unduly affected
– CHECKPOINT [checkpoint_duration]
   • CHECKPOINT now allows an optional numeric argument, which specifies the number of seconds the checkpoint should take
   • Checkpoint makes a “best effort” to perform in the time specified
   • If specified time is too low it runs at full speed
• Lazy Writer
– Background process which attempts to locate buffer pages which can be returned to the free list
   • LRU-2 algorithm in SQL 2005 / 2008 (time of next-to-last reference)
   • Determined by the reference count of the page in SQL 2000

SQL Server’s I/O Characteristics: Read-ahead

• Attempts to retrieve data pages that will be used in the immediate future
• Single read-ahead request
– I/O size determined by logical vs. physical ordering, target size of 64 pages (any multiple of 8K up to 512K)
– Standard: limited to 128 pages, Enterprise: up to 512 pages
– Cumulative outstanding limit of 5000 (pages)
• Occurs independent of parallel plan selection, however:
– Parallel plan may drive I/O harder due to multiple workers (scanner threads)
– Parallel page supplier segments the data requests in the case of parallel plans
• Until the buffer pool is (nearly) full, all single-page requests bring in the entire 8-page extent (Enterprise only)
– Helps the server come up to speed quicker


SQL Server Design Practices: How many data files should I have (per FILEGROUP)?

More data files does not necessarily equal better performance
– Determined mainly by hardware capacity & characteristics of access patterns
– Data files can be used to maximize # of spindles (striping)
– Number of data files per FILEGROUP
   • In the range of .25 to 1 per CPU core depending on nature of the workload (also consider growth: will the number of CPU cores grow over time?)
   • Scalability / performance consideration for allocation intensive workloads – see slide on “Diagnosing Allocation Contention”
Consider disaster recovery requirements
– Will the target environment for a disaster recovery restore accommodate the file sizes?
Best practices:
– Align data files with CPU cores (considering access patterns)
– Pre-size data/log files
– Use equal size for files within a single FILEGROUP
– Grow all files in a single FILEGROUP together when possible
– Rely on AUTOGROW as little as possible
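As a rough illustration of the sizing guidance on this slide, the sketch below turns the .25 to 1 files-per-core range and the equal-size rule into arithmetic. It is a hypothetical Python helper, not something from the deck; the function names and the 400,000 MB example are assumptions, and the right point within the range still depends on the workload.

```python
import math


def data_file_range(cpu_cores):
    """Return (low, high) data file counts for one FILEGROUP.

    Applies the 0.25 to 1 files-per-core range from the slide, with a floor of one file.
    """
    low = max(1, math.ceil(cpu_cores * 0.25))
    high = max(1, cpu_cores)
    return low, high


def equal_file_size_mb(filegroup_size_mb, file_count):
    """Pre-size every file in the FILEGROUP to the same size (rounded up to whole MB)."""
    return math.ceil(filegroup_size_mb / file_count)


low, high = data_file_range(16)
print(low, high)                          # file count range for a 16-core host
print(equal_file_size_mb(400_000, high))  # MB per file for a 400,000 MB filegroup
```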


SQL Server Design Practices: How Many FILEGROUPS?

Performance
– Filegroups can be used to separate tables/indexes, allowing selective placement of these at the disk level (use with caution)
– Separate objects requiring more data files due to high page allocation rate
Administration considerations (primarily)
– Backup can be performed at the filegroup or file level
– Partial availability
   • Database is available if primary filegroup is available; other filegroups can be offline
   • A filegroup is available if all its files are available
– Tables and Indexes
   • Can specify separate filegroups for in-row data and large-object data
– Partitioned Tables
   • Each partition can be in its own filegroup
   • May provide better archiving strategy as partitions can be SWITCHED in/out of the table

SQL Design Practices: Tempdb Considerations

Tempdb placement (dedicated vs. shared spindles)
– In many modern storage scenarios it may be better to place tempdb on common spindles with data files, utilizing more cumulative disks
– Depends on how well you know your workload’s use of tempdb (i.e. RDW workloads may differ from OLTP)
Understand your own tempdb usage
– Many underlying technologies within SQL Server utilize tempdb (index rebuild with sort in tempdb, RCSI, etc.)
– More details (Working with tempdb in SQL Server 2005): http://www.microsoft.com/technet/prodtechnol/sql/2005/workingwithtempdb.mspx
Best Practice: 1 data file per CPU core on host server
– Applies most to allocation intensive workloads with heavy tempdb utilization
– Same practices as data files with respect to sizing and growth

SQL Design Practices: Diagnosing allocation contention

High rate of allocations to data files can result in contention on allocation structures
– PFS/GAM/SGAM are structures within a data file which manage free space
– Especially a consideration on servers with many CPU cores
– More data files scale out these structures and reduce the contention potential
Allocation contention is diagnosed by looking for waits on PAGELATCH_UP
– Either real time in sys.dm_exec_requests or tracked in sys.dm_os_wait_stats
– Resource description in the form DBID:FILEID:PAGEID
– Can be cross referenced with sys.dm_os_buffer_descriptors to determine type of page
More details here: http://sqlcat.com/technicalnotes/archive/2008/03/07/How-many-files-should-a-database-have-part-1-olap-workloads.aspx
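A quick way to interpret a DBID:FILEID:PAGEID resource description without querying the buffer descriptors is to check where the page falls in the file. The Python sketch below assumes the commonly documented allocation page layout (PFS on page 1 and every 8,088 pages, GAM on page 2 and SGAM on page 3, each repeating every 511,232 pages); those intervals are not stated in this deck, and the function names are hypothetical.

```python
PFS_INTERVAL = 8088      # pages tracked by one PFS page (assumed, not from the deck)
GAM_INTERVAL = 511232    # pages tracked by one GAM or SGAM page (assumed)


def parse_resource(resource_description):
    """Split 'DBID:FILEID:PAGEID' into three integers."""
    db_id, file_id, page_id = (int(part) for part in resource_description.split(":"))
    return db_id, file_id, page_id


def allocation_page_type(page_id):
    """Classify a page id as PFS, GAM, SGAM, or 'other' by its position in the file."""
    if page_id == 0:
        return "other"  # page 0 is the file header
    if page_id == 1 or page_id % PFS_INTERVAL == 0:
        return "PFS"
    if page_id == 2 or page_id % GAM_INTERVAL == 0:
        return "GAM"
    if page_id == 3 or (page_id - 1) % GAM_INTERVAL == 0:
        return "SGAM"
    return "other"


db_id, file_id, page_id = parse_resource("2:1:3")
print(db_id, file_id, allocation_page_type(page_id))
```

PAGELATCH_UP waits that keep landing on PFS, GAM or SGAM pages of the same file are the signal that more data files may help.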


SQL Server on SAN: General Considerations

Storage technologies are evolving rapidly and traditional best practices may not apply to all configurations
– However, many still do apply (specifically physical isolation practices), especially at the high end (high volume OLTP, large scale DW)
There is no one single “right” way to configure storage for SQL Server on SAN
SAN deployment can be complex
– Generally involves multiple organizations to put a configuration in place
Dependent on things such as:
– Workload characteristics (OLTP, RDW, Mixed)
– Deployment scenario (i.e. SQL Server consolidation)
– Scale of deployment (database size)
– Performance requirements
– Array architecture
– Backup/restore & disaster recovery requirements, etc.

Shared Storage Environments: Considerations for SQL Server

Shared storage environments are becoming more common
– Already many shared components (ports / switches, array cache, controllers, etc.)
– Spindle sharing is becoming a more common practice
– Different considerations on different classes of arrays
Physical design matters
– Cache does not solve all performance problems
– Think about splitting workloads with very different I/O characteristics at the physical level (yes, there is a benefit)
– Isolation at the physical level can provide 1) predictability and 2) better performance (in some cases)
Array Cache
– Best to tune for writes (when possible): low log latency and absorbing checkpoint operations
– In shared storage environments it can be overused across hosts, impacting all users

Storage Design Best Practices: Optimal LUN Design for SQL Server

LUN design should be driven by:
– Optimal configuration for particular storage array
– Array architecture varies greatly and may impact LUN design / growth strategy
– Management/Growth strategy
– Windows/SQL Server considerations
– Array feature utilization (snapshots, replication, etc.)
LUNs should be dedicated to SQL Server data files
Separating data/log/tempdb at the logical level, even if shared at the physical level, facilitates easier monitoring
Root of any mount point volumes should be dedicated for that purpose
Use a single partition per LUN so you can extend/grow

SQL Server on SAN: Challenges

There are barriers between DBA’s and storage administrators
– DBA’s need to have knowledge of physical storage configuration
– Storage Administrators need some understanding of SQL Server I/O patterns
Sizing only on “capacity” is a common problem
– When performance matters, size based on spindle count, not capacity; consider physical isolation at spindle level
Shared storage environments
– Shared components can impact everyone
– Heterogeneous I/O workloads sharing physical spindles can be problematic
– Workloads with overlapping periods of heavy I/O lead to unpredictable performance
– Performance degradation over time as capacity utilization increases (increased seek times)
Poorly tuned queries issuing more I/O than necessary

SQL Server on SAN: Challenges (2)

SQL Server predeployment best practices not followed
– Validation of configuration prior to deployment
   • SQLIO/IOMeter (see appendix)
– Proactive monitoring strategy in place and trending of response times
– Proper host storage configuration
   • Queue depth set too low
   • Multipathing improperly configured
   • Not using vendor recommended drivers
– Volume alignment performed at partition creation time
   • Disk Partition Alignment: Increase I/O Throughput by up to 10%, 20%, 30% or more – Jimmy May: http://blogs.msdn.com/jimmymay/archive/2008/10/14/disk-partition-alignment-for-sql-server-slide-deck.aspx
   • Disk Partition Alignment Essentials (Cheat Sheet) – Jimmy May: http://blogs.msdn.com/jimmymay/archive/2008/12/04/disk-partition-alignment-sector-alignment-for-sql-server-part-4-essentials-cheat-sheet.aspx

SQL Server Predeployment Best Practices Whitepaper: http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/pdpliobp.mspx








Monitoring I/O Performance

Understand potential throughput of the hardware
– Each component in the path has an associated speed/bandwidth
– Know where the potential bottlenecks exist

[Diagram labels: Host, PCI Bus, HBA, Switch, Fiber Channel Ports, Front End Ports, Array Controllers/Processors, Cache, Disks]

Monitoring I/O Performance: SQL Server Tools

Tool | Monitors | Granularity
Performance Monitor | Disk counters, Logical or Physical (when necessary) | Volume or LUN
sys.dm_os_wait_stats | PAGEIOLATCH waits | SQL Server instance level
sys.dm_io_virtual_file_stats | Latency, number of I/O’s | Database files
sys.dm_exec_query_stats | Number of …Reads (logical or physical), number of writes (logical) | Query or batch
sys.dm_db_index_usage_stats | Number of I/O’s and type of access (seek, scan, lookup, write) | Index or table
sys.dm_db_index_operational_stats | PAGEIOLATCH waits | Index or table
sys.dm_io_pending_io_requests | Pending I/O requests at any given point in time | File (per I/O request)

Tools available to monitor SQL Server I/O behavior…
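To show how the file-level latency in the table is derived, the sketch below divides cumulative stall time by I/O count. It is a minimal Python illustration, not from the deck: the dictionary keys mirror column names of sys.dm_io_virtual_file_stats, while the function name and the sample values are hypothetical.

```python
def file_latency(stats):
    """Compute average read/write latency (ms) and I/O size (bytes) for one file.

    `stats` holds cumulative counters named after sys.dm_io_virtual_file_stats columns.
    """
    reads, writes = stats["num_of_reads"], stats["num_of_writes"]
    return {
        "avg_read_ms": stats["io_stall_read_ms"] / reads if reads else 0.0,
        "avg_write_ms": stats["io_stall_write_ms"] / writes if writes else 0.0,
        "avg_read_bytes": stats["num_of_bytes_read"] / reads if reads else 0.0,
        "avg_write_bytes": stats["num_of_bytes_written"] / writes if writes else 0.0,
    }


# Hypothetical cumulative counters for a single data file.
sample = {
    "num_of_reads": 1000,
    "io_stall_read_ms": 12000,
    "num_of_bytes_read": 8_192_000,
    "num_of_writes": 500,
    "io_stall_write_ms": 1000,
    "num_of_bytes_written": 30_720_000,
}
latency = file_latency(sample)
print(latency)
```

Because the counters are cumulative, the same calculation is more informative when applied to the difference between two samples taken some time apart.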

Performance Monitor: Windows (Performance Monitor)

Counter | Description
Avg. Disk sec/Read & Avg. Disk sec/Write | Measures disk latency. Numbers will vary; optimal values for averages over time: 1 - 5 ms for Log (ideally 1 ms or better); 5 - 20 ms for Data (OLTP) (ideally 10 ms or better); <= 25 - 30 ms for Data (DSS or RDW) (consider aggregate throughput)
Current Disk Queue Length | Hard to interpret due to virtualization of storage. Consider in combination with response times.
Disk Reads/sec & Disk Writes/sec | Measures the number of I/O’s per second. Discuss with vendor the sizing of spindles of different type and rotational speeds. Impacted by disk head movement (i.e. short stroking the disk will provide more I/O per second capacity).
Disk Read Bytes/sec & Disk Write Bytes/sec | Measure of total disk throughput. Ideally larger block scans should be able to heavily utilize connection bandwidth.
Avg. Disk Bytes/Read & Avg. Disk Bytes/Write | Measures the size of I/O’s being issued. Larger I/O’s tend to have higher latency (example: BACKUP/RESTORE).

Windows view of the I/O world…
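The latency guidance in the counter table can be applied mechanically to sampled values. The following is a hedged Python sketch: the thresholds come from the table, but the volume-type labels, the three rating names and the function are assumptions made for illustration.

```python
# Upper bounds in milliseconds, taken from the latency guidance in the counter table.
ACCEPTABLE_MS = {"log": 5, "oltp_data": 20, "dw_data": 30}
IDEAL_MS = {"log": 1, "oltp_data": 10}


def rate_latency(volume_type, avg_disk_sec):
    """Rate an Avg. Disk sec/Read or sec/Write sample (Perfmon reports seconds)."""
    latency_ms = round(avg_disk_sec * 1000, 3)
    if volume_type in IDEAL_MS and latency_ms <= IDEAL_MS[volume_type]:
        return "ideal"
    if latency_ms <= ACCEPTABLE_MS[volume_type]:
        return "acceptable"
    return "investigate"


print(rate_latency("log", 0.001))        # 1 ms log write
print(rate_latency("oltp_data", 0.017))  # 17 ms data read
print(rate_latency("log", 0.008))        # 8 ms log write
```

These ratings are only a first filter; as the table notes, numbers vary and averages over time matter more than single samples.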

Performance Monitoring: Array Side Monitoring – End to End Picture

Backend monitoring of the array (array specific tools)
– Only way to get the complete story
– Trending over time, generally with less granularity (1 min)
Typical components monitored*
– Front end port usage
   • Bandwidth utilization, # of concurrent requests on port
– Throughput at the LUN level / physical disk level
– Physical disk I/O rates
   • Exposes spindle sharing / undersized spindle count / RAID choice issues
– Cache utilization
   • % of cache utilized
   • Write pending %: impacts how aggressive the array is in flushing writes to physical media
– Storage controller(s) utilization
   • Similar to monitoring CPU utilization on any server

*Terminology may vary by vendor








Advanced and Emerging Technologies

Storage level replication is more commonplace (SRDF, TrueCopy, Continuous Access, SnapMirror, etc.)
– Synchronous & Asynchronous
– Storage based replication vs. database mirroring
   • Many of the same considerations apply
   • Sometimes data outside the database needs to be in a consistent state with the database
Snapshot based Backup/Restore Technologies
– Fully materialized (sometimes referred to as a clone) & those that maintain only deltas (sometimes referred to as snapshots; space efficient)
– Requires vendor provided tools & integration with SQL/Windows (VDI/VSS)
Thin Provisioning
– Capacity on demand / supports “Green Computing”
– Requires NTFS quick format & SQL Server instant file initialization

Advanced and Emerging Technologies (continued)

Virtualization of heterogeneous storage environments
– Ability to manage all storage resources through a single platform
– Migration of data transparent to the application
Solid State Disks
– Flash memory based storage (no moving parts)
– Potentially much higher performance, especially for random I/O patterns
Geographically dispersed clusters
– Enabling technologies provided by storage allow clusters over a geographical distance

Solid State Devices (SSD)

Storage device based on NAND flash
– Fits into regular HDD slot
– Utilizes the same command set and interface
Advantages
– Performance, weight, power & cooling consumption, more durable
Disadvantages
– Cost per GB
– Shifting bottleneck
– Limited experience for enterprise use
– Seeks are “free”, writes are expensive relative to reads
Most appealing for
– IOPs intensive “Tier 0” storage (particularly random I/O)
– Mobile devices

Solid State Performance: OLTP Workload on SSD

EMC DMX4 Array, RAID5 - 3+1
– 4 physical devices
– Log/Data on same physical devices
Database size (~300GB)
Random read and write for checkpoints / sequential log writes
16 core server completely CPU bound
– Sustained 12K IOPs < 4ms latency

Counter | Average
Avg. Disk sec/Read (total) | 0.004
Disk Reads/sec (total) | 10100
Avg. Disk sec/Write (total) | 0.001
Disk Writes/sec (total) | 1900
Processor Time | 98
Batches/sec | 5200

Solid State Performance: OLTP Workload on Spinning Media

EMC DMX4 Array, RAID 1+0
– 34 physical devices for data
– 4 physical devices for log
Same workload/database as SSD configuration (OLTP)
Nearly the same sustained I/O’s with ~10x the number of spindles
– Higher latency
– Slightly lower throughput
– “Short stroking” the spinning media

Counter | Average
Avg. Disk sec/Read (total) | 0.017
Disk Reads/sec (total) | 10259
Avg. Disk sec/Write (total) | 0.002
Disk Writes/sec (total) | 2103
Processor Time | 90
Batches/sec | 4613

PASS Community Summit 2008 DBA-323 SQL Server 2008 on SAN - Best Practices and Lessons Learned

References

SQL CAT Blog: www.sqlcat.com
SQL Server 2000 I/O Basics on TechNet: http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx
SQL Server 2005 I/O Basics on TechNet: http://www.microsoft.com/technet/prodtechnol/sql/2005/iobasics.mspx
SQL Server PreDeployment Best Practices: http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/pdpliobp.mspx
Storage Top 10 Best Practices: http://sqlcat.com/top10lists/archive/2007/11/21/storage-top-10-best-practices.aspx
SQL Server AlwaysOn Partner program: http://www.microsoft.com/sql/alwayson/default.mspx
SQL Server Best Practices Site: http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/default.mspx
SQL Server 2008 website: http://www.microsoft.com/sqlserver/2008/en/us/default.aspx
Windows Server System Storage Home: http://www.microsoft.com/windowsserversystem/storage/default.mspx
ICE: http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032341825&Culture=en-US
SQL Server Case Studies: http://www.microsoft.com/sql/casestudies/default.mspx
Microsoft SQL Server I/O subsystem requirements for the tempdb database: http://support.microsoft.com/kb/917047
FIX: Concurrency enhancements for the tempdb database: http://support.microsoft.com/kb/328551


Questions?


THANK YOU FOR ATTENDING THIS SESSION AND THE PASS COMMUNITY SUMMIT 2008

PASS Community Summit 2008, November 18 – 21, 2008, Seattle WA


Appendix


SQL Server’s I/O Characteristics

• Difficult to generalize I/O patterns of SQL Server
• SQL is a platform on which applications are built, hence I/O patterns may differ significantly from one application to another
• Monitoring of I/O is necessary to determine specifics of each scenario
• Understanding the I/O characteristics of common SQL Server operations/scenarios can help determine how to configure storage
• General I/O characteristics of common scenarios:

Operation | Random / Sequential | Read / Write | Size Range
OLTP – Log | Sequential | Write | Sector aligned, up to 60K
OLTP – Log | Sequential | Read | Sector aligned, up to 120K
OLTP – Data (Index Seeks) | Random | Read | 8K
OLTP – Lazy Writer | Random | Write | Any multiple of 8K up to 256K
OLTP – Checkpoint | Random | Write | Any multiple of 8K up to 256K
Read Ahead (DSS, Index/Table Scans) | Sequential | Read | Any multiple of 8K up to 256K (512K for ENT Edition)
Bulk Insert | Sequential | Write | Any multiple of 8K up to 128K

Note these values may change as optimizations are made to take advantage of modern storage enhancements

SQL Server’s I/O Characteristics (continued)

Operation | Random / Sequential | Read / Write | Size Range
CREATE DATABASE | Sequential | Write | 512KB (SQL 2000), up to 4MB (SQL 2005) (only log file is initialized in SQL Server 2005)
BACKUP | Sequential | Read/Write | Multiple of 64K (up to 4MB)
RESTORE | Sequential | Read/Write | Multiple of 64K (up to 4MB)
DBCC – CHECKDB | Sequential | Read | 8K – 64K
ALTER INDEX REBUILD - replaces DBREINDEX (Read Phase) | Sequential | Read | Any multiple of 8K up to 256K
ALTER INDEX REBUILD - replaces DBREINDEX (Write Phase) | Sequential | Write | Any multiple of 8K up to 128K
DBCC – SHOWCONTIG (deprecated, use sys.dm_db_index_physical_stats) | Sequential | Read | 8K – 64K

SAN vs. DAS: Storage Area Network vs. Direct Attach Storage

SAN
– Better flexibility provided by virtualization of storage
   • Speed of deployment (once initial configuration is in place)
   • Online configuration changes
– Better overall utilization of storage resources
– More features
   • Storage Replication/Disaster Recovery, Snapshots/Clones via VDI/VSS Integration, Thin Provisioning, etc.
– May have increased redundancy / reliability
– Likely higher cost
   • May be lower depending on individual components (i.e. SATA vs. SCSI)
   • May be offset due to better overall utilization of storage resources
– Contrary to some common perceptions, SAN might not equal better performance

DAS
– Simple and well understood
– Likely cheaper for the same performance
– Less flexible: better get it right the first time

Monitoring SQL Server I/O: Diagnosing Bottlenecks

• Questions to ask when determining if I/O is a performance problem:
– Are my top SQL Server wait types related to I/O?
– Are my disk response times healthy? Are they reasonable for my physical configuration? Do I need to investigate the physical level?
– What type of I/O operations is SQL Server performing?
   • Random in nature: focus on I/O’s per second and response time.
   • Sequential in nature: focus on aggregate throughput.
   • How large are the I/O’s (size will impact latency)?
– Which queries are issuing the most I/O and are they properly tuned?
– Which data files are incurring the most I/O and highest response times?

Monitoring SQL Server I/O: Diagnosing Bottlenecks (2)

• What is the process of diagnosing I/O performance issues?
• Logical Disk counters:
– First line of defense (see previous slide)
• Wait types (sys.dm_os_wait_stats): PAGEIOLATCH_SH/EX, WRITELOG
– Consider averages for wait statistics
– Accumulated from last server start or flush of stats, so consider deltas
• Virtual File Stats: sys.dm_io_virtual_file_stats
– File level statistics allowing for average size, latency, number of I/O’s and total amount of I/O
• Identify I/O intensive queries
– sys.dm_exec_query_stats ordered by total_physical_reads
– Investigate query plans and index tuning
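Since wait statistics accumulate from the last server start or stats flush, the useful numbers are the deltas between two samples. The sketch below shows that subtraction in Python; it is an illustration rather than part of the deck. The tuple fields correspond to the waiting_tasks_count and wait_time_ms columns of sys.dm_os_wait_stats, and the snapshot values are invented.

```python
def wait_deltas(before, after):
    """Return per-wait-type deltas between two sys.dm_os_wait_stats snapshots.

    Each snapshot maps wait_type -> (waiting_tasks_count, wait_time_ms).
    """
    result = {}
    for wait_type, (count_after, ms_after) in after.items():
        count_before, ms_before = before.get(wait_type, (0, 0))
        tasks = count_after - count_before
        wait_ms = ms_after - ms_before
        result[wait_type] = {
            "waits": tasks,
            "wait_ms": wait_ms,
            "avg_wait_ms": wait_ms / tasks if tasks else 0.0,
        }
    return result


# Hypothetical snapshots taken a few minutes apart.
before = {"WRITELOG": (1000, 4000), "PAGEIOLATCH_SH": (500, 9000)}
after = {"WRITELOG": (3000, 14000), "PAGEIOLATCH_SH": (700, 12000)}
deltas = wait_deltas(before, after)
print(deltas)
```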

Validate Storage Configurations: Tools

SQLIOSim.exe
– Use: Ensure correct functionality of underlying I/O subsystem. Simulates various patterns of SQL Server I/O. Replacement for SQLIOStress.exe.
– http://blogs.msdn.com/sqlserverstorageengine/archive/2006/10/06/SQLIOSim-available-for-download.aspx
SQLIOStress.exe (deprecated – use SQLIOSim)
– Use: Ensure correct functionality of underlying I/O subsystem. Simulates various patterns of SQL Server I/O.
SQLIO.exe
– Use: Test throughput of I/O subsystem or establish benchmark of I/O subsystem performance
– http://www.microsoft.com/downloads/details.aspx?familyid=9a8b005b-84e4-4f24-8d65-cb53442d9e19&displaylang=en
IOMeter
– Use: Test throughput of I/O subsystem or establish benchmark of I/O subsystem performance
– Open source tool; allows combinations of I/O types to run concurrently against test file
– No support for mount point volumes
– http://www.iometer.org/

Validate Storage Configurations: SQLIO / IOMeter

Test and validate the performance of each storage configuration before deploying SQL Server application (common pitfall)
– SQLIO is an unsupported tool provided by Microsoft that can be used for this
– IOMeter is an external tool providing ability to stress storage subsystem with a variety of I/O patterns concurrently
Benchmark performance and “shake out” hardware/driver/multi-pathing problems early in the configuration
Share the results with your vendor – good method for comparing different configurations

Validate Storage Configurations: Considerations

Things to consider when running tests
– Test a variety of I/O types and sizes
– Make sure your test files are significantly larger than the amount of cache on the array
   • Exception: if you are testing channel throughput, use files that will fit in array cache
   • To get a true representation of disk performance, use test files of approximately the size of the planned data files; small test files (even if they are larger than cache) may result in smaller seek times and skew results
– Test each I/O path individually and then combinations of the I/O paths
– Relatively short tests are okay; however, longer runs may give a more complete understanding of how the storage will perform
– Allow time in between tests to allow the hardware to reset (cache flush)
– Keep all of the benchmark data to refer to after the SQL implementation has taken place
– Maximum throughput (IOPs or MB/s) has been obtained when latency continues to increase while throughput is near constant
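The saturation rule in the last bullet can be checked against a series of test results. Below is a minimal Python sketch under stated assumptions: the input format, the 5% "near constant" tolerance and the sample numbers are hypothetical and are not output from SQLIO or IOMeter.

```python
def find_saturation(results, throughput_tolerance=0.05):
    """Return the first load level where latency rises but throughput stays near constant.

    `results` is a list of (outstanding_ios, iops, latency_ms) tuples ordered by
    increasing load. Returns the outstanding_ios value at saturation, or None.
    """
    for previous, current in zip(results, results[1:]):
        _, prev_iops, prev_latency = previous
        load, iops, latency = current
        throughput_gain = (iops - prev_iops) / prev_iops
        if latency > prev_latency and throughput_gain <= throughput_tolerance:
            return load
    return None


# Hypothetical results from a run that doubles outstanding I/Os at each step.
runs = [
    (1, 2000, 0.5),
    (2, 3900, 0.5),
    (4, 7500, 0.53),
    (8, 11800, 0.68),
    (16, 12100, 1.3),
    (32, 12150, 2.6),
]
print(find_saturation(runs))
```

With these made-up numbers the step from 8 to 16 outstanding I/Os adds under 3% throughput while latency nearly doubles, so 16 is reported as the saturation point.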


Validate Storage Configurations: Considerations (2)

Things to consider when looking at results
– Consult your storage admin or vendor. They should know if these results are reasonable for the particular storage configuration
– Once you reach saturation (i.e. latency increases but throughput does not):
   1. Ensure any multipathing is functional and you are not bound by the capacity of a single HBA, switch port, etc.
   2. Ensure the “queue depth” setting on the HBA is set high enough. If too low, it will seem as though the disk is saturated before it actually is
      (common pitfall) Default values for queue depth are generally not ideal for SQL Server; consider increasing
– If test results vary wildly, check whether you are sharing spindles with others on the array or whether shared components are an issue
– Monitoring at the host and on the array during tests is ideal

Storage Design Best Practices: RAID Levels

RAID Levels
– RAID 1+0 preferred for log, tempdb
   • Data files, when HA & performance really matter
– RAID 5 is frequently observed in deployments
   • There is a write-performance penalty, but it may be acceptable for the scenario
   • Cache may help, but not for sustained write workloads
– Other RAID levels are becoming more popular
   • Example: RAID-DP (NetApp proprietary)
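The write-performance penalty mentioned for RAID 5 can be estimated with a common rule of thumb: each host write costs roughly 2 back-end disk I/Os on RAID 1+0 and roughly 4 on RAID 5. Those factors are an assumption brought in for illustration and do not appear in the deck; the Python sketch below also ignores array cache, and its function name and workload numbers are hypothetical.

```python
# Back-end disk I/Os generated per host write (rule of thumb, not from the deck).
WRITE_PENALTY = {"RAID1+0": 2, "RAID5": 4}


def backend_iops(read_iops, write_iops, raid_level):
    """Estimate physical disk IOPS for a host workload, ignoring array cache."""
    return read_iops + write_iops * WRITE_PENALTY[raid_level]


# Hypothetical OLTP workload: 10,000 reads/sec and 2,000 writes/sec at the host.
raid10_load = backend_iops(10_000, 2_000, "RAID1+0")
raid5_load = backend_iops(10_000, 2_000, "RAID5")
print(raid10_load, raid5_load)
```

The gap widens as the write share of the workload grows, which is why the slide prefers RAID 1+0 for log and tempdb.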


Sample Configuration: High Availability / Disaster Recovery

17TB single OLTP database (mixed workload)
– Second copy for reporting using transactional replication
Complex storage configuration, deployed in production
Achieves disaster recovery through storage level replication across distance
– Distance ~30Km
– Average 3-5ms latency
Backup/Restore
– Through snapshot/VDI technologies
– Used to quickly reestablish replication

Storage Implementation @ Scale: MSN® Online Services, Large Scale OLTP / BI / DW

Large scale online services
– 8 petabytes total across entire organization (not all SQL Server)
– Generally simple LUN design; larger LUNs reduce number of LUNs per host
   • Deep queues per LUN, proactive monitoring of response time
   • Sized for IOPs: sustain 8K Random / 18K sequential per LUN
– Mixed environment OLTP / DW
Mix of EMC Clariion and DMX Storage
– CX700, CX3-80, CX-400i, DMX 3 - 4500, DMX 4
– Using SRDF for DR / Using clone technologies to enable scale out of reporting as well as backup/restore
One example: Business Intelligence Data Warehouse
– Storage based snapshots per day for scale out reporting
– 16 servers in an active/active cluster
– Each server allocated with ~9TB = 144TB in total

Storage Implementation @ Scale: MSN Online Services, Large Scale OLTP / BI / DW (2)

Monitoring Strategy
– Focus on response time, considered the most important metric
   • Writes less than 6ms, reads less than 25ms
   • Customers start to notice issues above 50ms disk response time
– Utilize vendor specific tools for monitoring and trending on the backend
SQL Server configuration
– Deploy in both shared and dedicated models based on the requirements of the application
– Utilize simple LUN design: fewer, larger LUN’s to simplify management (clones)
How do they succeed?
– Work closely with SQL developers to optimize storage for I/O
– “Professional respect. I will really say that it is a team effort”
– “A big part of it is learning to speak a common language. Hardware guys speak in I/O’s – SQL folks talk in spids and queries. We learned over the years to translate.”

Example Configuration: Data Warehouse - ICE/CX700

Information Security Consolidated Event DW
– Internal tool used by Microsoft Information Security team
– Collects inbound and outbound e-mail traffic, login events, and Web browsing into a single database which is then used to provide forensic evidence
– Provides analysis and query capabilities
– Gathers data from 85+ sources around the world
– Up to 10 concurrent users running ad-hoc queries and fixed reports
– SSIS, SSRS and the DW on the same box
– Uses table partitioning to load new data into a new partition quickly
– Achieved with minimal HW and operation cost

Example Configuration: Data Warehouse - ICE/CX700 (2)

Dedicated storage environment, single database
• 4-way single core 2.2 GHz HP Proliant DL 585 G1, x64 with 8 GB RAM
• 40 TB across two CX700 arrays
• Currently @ 30TB, loading/deleting 500GB-1TB / day (60 day retention period)
• SQL Data, Log, TempDB volumes on RAID1+0, backup volumes on RAID5
• 200GB LUN’s backed by 12 spindles

[Diagram labels]
EMC CX700-1: 240 x 300GB disks (30 disks @ RAID5, 210 disks @ RAID1+0)
EMC CX700-2: 142 x 146GB disks (142 disks @ RAID1+0)
ICE Application Server: Windows Server 2003 x64, 4 x 2.2GHz CPU, 8GB RAM, 2 x 2Gb FC HBA, 5 x 200GB LUNs, 3 x 1000GB LUNs
ICE SQL Server: Windows Server 2003 x64, 4 x 2.2GHz CPU, 32GB RAM, 2 x 2Gb FC HBA, 20 x 200GB fileshares, 160 x 200GB mountpoints

Example Configuration: Air Products SAP ERP

ERP database migrated from IBM Mainframe to HP Integrity rx8640 this year
– 12 dual-core Itanium CPUs with 192GB RAM as database server
– Database volume around 5TB
– 6 teamed 4Gb HBAs
Application server layer:
– 10 x DL380, 2 x Intel 5160 3.0GHz
Workload during day created by over 1500 users
Workload during night created by heavy batch activities
Up to 30,000 random IOPS monitored during high load phase or parallel index create