DAT 402: Designing High Performance I/O System with SQL Server, Dubi Lebel

Page 1: IO Dubi Lebel

DAT 402: Designing High Performance I/O System with SQL Server

Dubi Lebel

Page 2: IO Dubi Lebel

Query A runs two thousand times every day and takes up to 1 second to return its result set.

Query B runs twice a week and takes about 10 minutes to return its result set.

Which of these queries has the greater effect on disk I/O?
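
One way to start reasoning about the question is to total the run time of each query over a week. The sketch below uses only the figures from the slide; treating elapsed time as a stand-in for disk I/O is an assumption, since a query served from cache may touch the disk very little.

```python
# Weekly cumulative run time of the two queries from the slide.
# Assumption: elapsed time is used as a rough proxy for I/O load.

def weekly_seconds(runs_per_week: float, seconds_per_run: float) -> float:
    """Total seconds per week spent running a query."""
    return runs_per_week * seconds_per_run

# Query A: 2000 runs a day, up to 1 second each.
query_a = weekly_seconds(runs_per_week=2000 * 7, seconds_per_run=1)
# Query B: 2 runs a week, about 10 minutes each.
query_b = weekly_seconds(runs_per_week=2, seconds_per_run=10 * 60)

print(f"Query A: {query_a:.0f} s/week")
print(f"Query B: {query_b:.0f} s/week")
print(f"A/B ratio: {query_a / query_b:.1f}")
```

The short, frequent query accumulates far more time per week than the long, rare one, which is why both frequency and duration belong in the comparison.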

Page 3: IO Dubi Lebel

Insert 100 single rows (100 transactions)

Insert 100 rows wrapped in 1 transaction

Which of these inserts is faster? Which one has the greater effect on disk I/O?

Page 4: IO Dubi Lebel

Amdahl's Law: system speed-up limited by the slowest part!

CPU Performance: 60% per year

Disk Performance < 10% per year (IO per sec)

I/O system performance limited by mechanical delays (disk I/O)
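
The gap between the two growth rates compounds quickly. The sketch below is a worked example, assuming the slide's 60% per year for CPU and taking 10% per year as the upper bound for disk; the 50/50 split between CPU time and disk time is a hypothetical workload chosen only to illustrate Amdahl's Law.

```python
def compound_growth(rate_per_year: float, years: int) -> float:
    """Cumulative improvement factor after compounding a yearly rate."""
    return (1.0 + rate_per_year) ** years

def amdahl_speedup(fraction_improved: float, factor: float) -> float:
    """Overall speed-up when a fraction of the run time becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / factor)

years = 10
cpu_gain = compound_growth(0.60, years)   # CPU: 60% per year
disk_gain = compound_growth(0.10, years)  # Disk: at most 10% per year

print(f"CPU after {years} years:  {cpu_gain:.0f}x")
print(f"Disk after {years} years: {disk_gain:.1f}x")

# Hypothetical workload: half the time on CPU, half waiting on disk.
# Speeding up only the CPU half can never double the overall speed.
print(f"Overall speed-up from CPU alone: {amdahl_speedup(0.5, cpu_gain):.2f}x")
```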

Page 5: IO Dubi Lebel

Who am I?

D.B.A. (Don’t Bother Asking) Architect at Logic Ind.

Working as a database developer and administrator since 1990.

Working with SQL Server since the first (Sybase) version 3.6

Spent 8½ years at S.R.L. (R.I.P.)

Spent 7½ years at Microsoft as technical manager of SQL pre-sales (managed the DB track at Tech-Ed three times)

Co-manages the Israeli SQL Server User Group with Ami Leven

Page 6: IO Dubi Lebel

Thanks

Thomas Kejser - SQL CAT member

http://sqlcat.com/members/tkejser.aspx

Thomas with Henk Van der Valk (BI405 29/11/10, 11:30 - 12:45 )

world record SSIS -ETL performance - 1.18 TB in under 10 minutes.

Page 7: IO Dubi Lebel

Key Takeaway

This is NOT going to be easy…182 slides

You can either dive here or in the sea.

But here you will see what you can’t see in the sea.

The lessons in this session were written in sweat and blood

Page 8: IO Dubi Lebel

What is IO for me?
Terminology.
Tools.
The path from client application to the storage and back.
What affects the disk performance?
Benchmark and Sizing Methodology.
Workload - Design for Performance.

Page 9: IO Dubi Lebel

What is IO for me?

Page 11: IO Dubi Lebel

1956: IBM 305 RAMAC Computer with Disk Drive

Page 12: IO Dubi Lebel

Seagate ST4053 40 MByte

This was the disk in my first desktop

Page 14: IO Dubi Lebel

What is IO for me?

Throughput

Latency

Capacity

How do you Measure it?

Page 15: IO Dubi Lebel

What is IO for me - Throughput

The amount of data successfully transferred between storage and the computer in a given amount of time

Measured in MB/sec or IOPS

Performance Monitor: Logical Disk
– Disk Read Bytes/sec
– Disk Write Bytes/sec
– Disk Reads/sec
– Disk Writes/sec
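
The two units are linked by the I/O size: MB/sec equals IOPS multiplied by the block size. A minimal sketch of that conversion follows; the function names are illustrative, and 1 MB is taken as 1024 KB.

```python
def throughput_mb_per_sec(iops: float, block_kb: float) -> float:
    """Convert an I/O rate and block size into MB/sec (1 MB = 1024 KB)."""
    return iops * block_kb / 1024.0

def iops_for_throughput(mb_per_sec: float, block_kb: float) -> float:
    """Inverse conversion: the I/O rate needed to sustain a given MB/sec."""
    return mb_per_sec * 1024.0 / block_kb

# 150 random 8KB reads per second is only about 1.2 MB/sec.
print(f"{throughput_mb_per_sec(150, 8):.2f} MB/sec")
# The same 150 IOPS at 64KB per I/O moves eight times as much data.
print(f"{throughput_mb_per_sec(150, 64):.2f} MB/sec")
# Sustaining 100 MB/sec with 64KB I/Os takes 1600 IOPS.
print(f"{iops_for_throughput(100, 64):.0f} IOPS")
```

This is why a throughput figure is only meaningful when the block size is stated alongside it.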

Page 16: IO Dubi Lebel

What is IO for me - Latency

A synonym for delay: how much time it takes for a packet of data to get from one designated point to another

Measured in milliseconds (ms)

Performance Monitor: Logical Disk
– Avg. Disk sec/Read
– Avg. Disk sec/Write

More on healthy latency values later

Page 17: IO Dubi Lebel

What is IO for me - Capacity

Capacity is just Capacity

Measured in GB/TB
– The easy one!
– Is it important?

Key Takeaway:

Don’t think about disk I/O as disk capacity

Page 19: IO Dubi Lebel

Terminology - Basic

Disk -

Spindles – Physical disks in the Storage Array

Array - Box with the Spindles in it

Page 20: IO Dubi Lebel

Terminology – Acronyms

JBOD - Just a Bunch of Disks

SAME – Stripe and Mirror Everything

RAID - Redundant Array of Inexpensive Disks

DAS - Direct Attached Storage

NAS - Network Attached Storage

SAN - Storage Area Network

CAS - Content Addressable Storage

Page 21: IO Dubi Lebel

Terminology – Adv.

LUN - Logical Unit Number

Host - The Server or Servers a LUN is presented to.

Disk - How the OS sees a LUN when presented

IOPS - I/O operations per second; each one is a physical operation to disk

Sequential IO - Reads or writes which are sequential on the spindle

Random IO - Reads or writes which are located at random positions on the spindle

Page 22: IO Dubi Lebel

Cardiothoracic Surgery

Page 23: IO Dubi Lebel

The Full Stack

SQL Server → Windows → CPU → PCI Bus → I/O Controller / HBA → Cabling → Array Cache → Spindle

Page 24: IO Dubi Lebel

The Traditional Hard Disk Drive

[Figure: labeled photo of a hard disk drive showing the base casting, spindle, platters, slider (and head), actuator arm, actuator axis, actuator, flex circuit (attaches heads to logic board), SATA interface connector, power connector, case mounting holes and cover mounting holes (cover not shown). Source: Panther Products]

Page 25: IO Dubi Lebel

Disk Arm and Head

Disk arm
– A disk arm carries the disk heads

Disk head
– Reads and writes on the disk surface

Read/write operation
– Disk controller receives a command with <track#, sector#>
– Seek the right cylinder (tracks)
– Wait until the right sector comes
– Perform read/write

Page 26: IO Dubi Lebel

Mechanical Component of A Disk Drive

Tracks – Concentric rings around the disk surface, with bits laid out serially along each track

Cylinder – The set of tracks at the same radius across all platters; 1000-5000 cylinders per zone, 1 spare per zone

Sectors – Each track is split into arcs (sectors), the minimum unit of transfer

Page 27: IO Dubi Lebel
Page 28: IO Dubi Lebel

Numbers to Remember - Spindles

Traditional spindle throughput in random 8/16K I/O
– 10K RPM: 100-130 IOPS at ‘full stroke’
– 15K RPM: 150-180 IOPS at ‘full stroke’
– Can achieve 2x or more when ‘short stroking’ the disks (using less than 20% of the capacity of the physical spindle)

Aggregate throughput with sequential access:
– Between 90MB/sec and 125MB/sec for a single drive
– If truly sequential, any block size over 8K will give you these numbers
– Depends on drive form factor; 3.5” drives are slightly faster than 2.5”

Approximate latency: 3-5ms
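
These figures follow from the mechanics: each random I/O pays an average seek plus, on average, half a rotation. The sketch below is a rough model under that assumption; the seek times of 3.5 ms and 4.5 ms are illustrative values, not numbers from the slides, and transfer time is ignored.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the sector to arrive: half of one full rotation."""
    return 0.5 * 60_000.0 / rpm

def estimated_random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Random IOPS if every I/O costs one average seek plus half a rotation."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))

# Seek times below are assumed for illustration only.
for rpm, seek_ms in ((10_000, 4.5), (15_000, 3.5)):
    print(f"{rpm} RPM: rotation wait {avg_rotational_latency_ms(rpm):.1f} ms, "
          f"about {estimated_random_iops(rpm, seek_ms):.0f} IOPS")
```

With these assumed seek times the model lands near the upper end of the ranges quoted above. Short stroking helps because it shrinks the seek term, while the rotational term stays fixed.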

Page 29: IO Dubi Lebel

Scaling of Spindle Count - Short vs. Full Stroke

Each group of 8 disks exposes a single 900GB LUN
– RAID group capacity ~1.1TB

Test data set is fixed at 800GB
– A single 800GB test file for a single LUN (8 disks), two 400GB test files across two LUNs, etc.

IOPS per physical disk drop as more of each disk's capacity is used (longer seek times)

[Chart: Short vs. Full Stroke Impact, Random 8K Reads. X axis: 8, 16, 32, 48 and 64 disks. Left axis: I/Os per second, 0 to 400 (series: reads per disk, random 8K). Right axis: total capacity (%) used across all disks, 0 to 100. Reads per disk as labeled: 233, 270, 311, 327, 336.]
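
The capacity-used series in this test can be recomputed from the setup: a fixed 800GB data set spread over one 900GB LUN per 8 disks. The sketch below assumes the percentage is measured against the exposed LUN size rather than the raw RAID group capacity.

```python
LUN_GB = 900         # each group of 8 disks exposes one 900GB LUN
DATA_SET_GB = 800    # the test data set is fixed
DISKS_PER_LUN = 8

def capacity_used_pct(disks: int) -> float:
    """Share of the exposed LUN capacity occupied by the fixed data set."""
    luns = disks // DISKS_PER_LUN
    return 100.0 * DATA_SET_GB / (luns * LUN_GB)

for disks in (8, 16, 32, 48, 64):
    print(f"{disks:2d} disks: {capacity_used_pct(disks):.1f}% of capacity used")
```

As the used share drops from roughly 89% toward 11%, the heads travel over a smaller band of each platter, which matches the rising reads-per-disk figures in the test.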

Page 30: IO Dubi Lebel

The “New” Hard Disk Drive (SSD)

Solid State Drive

No moving parts!

Page 32: IO Dubi Lebel

SSD - Game Changer!

No moving parts
– Power consumption (20%)
– Read operations: random = sequential
– Low latency on access

Page 33: IO Dubi Lebel

SSD - NAND Flash

Throughput, especially random, is much higher than a traditional drive
– Typically 10**4 IOPS for a single drive
– Examples: Intel X25 and FusionIO

Storage is organized into cells of 512KB
– Each cell consists of 64 pages, each page 8KB

When a cell needs to be rewritten, the 512KB block must first be erased
– This is an expensive operation and can take a very long time
– The disk controller will attempt to locate free cells before trying to delete existing ones
– Writes can be slow
  • A DDR ”write cache” is often used to ”overcome” this limitation

When blocks fill up, NAND becomes slower with use
– But only up to a certain level; it eventually levels off
– Still MUCH faster than typical drives

Page 34: IO Dubi Lebel

SSD - Battery Backed DRAM

Throughput close to the speed of RAM
– Typically 10**5 IOPS for a single drive

Drive is essentially DRAM
– on a PCI card (example: FusionIO)
– ...or with a fiber interface (example: DSI3400)

Battery backed up to persist storage
– Be careful about downtime: how long can the drive survive with no power?

As RAM prices drop, these drives are becoming larger

Extremely high throughput; watch the path to the drives

Page 35: IO Dubi Lebel

SSD Directly on PCI-X Slot

> 10,000 IOPs

Mixed read/write

Latency < 1ms

PCI bus bottleneck

Page 36: IO Dubi Lebel

Overview of Drive Characteristics

Characteristic | 7500 rpm SATA | 15,000 rpm SAS | SSD (NAND Flash) | SSD (DDR)
Seek latency | 8-10ms | 3-4.5ms | 70-90µs | 15µs
Seq. read speed, 64KB | ? | 100-120MB/sec | 800MB/sec | 3GB/sec
Random read speed, 8KB | ? | 1-3MB/sec | 800MB/sec | 3GB/sec
Seq. write speed, 64KB | ? | 25MB/sec | >150MB/sec | 3GB/sec
Random write speed, 8KB | ? | 1-3MB/sec | 100MB/sec | 3GB/sec
Peak transfer speed | ? | 130MB/sec | 800MB/sec | 3GB/sec
Max size / drive | 5TB | 600GB | 512GB | N/A
Cost per GB | Low | Medium | Medium-High | High / Very High
MTTF | 1.4M hours | 1M hours | 2M hours | ?

Page 37: IO Dubi Lebel

Question

SSD is an evolution of the DiskOnKey.

What is the most dangerous event that can cause you to lose all your DiskOnKey data?

Don’t put all your eggs in one basket!

Page 39: IO Dubi Lebel

Tools

SQLIO

IOMETER

SQLIOStress - SQLIOSim

Page 40: IO Dubi Lebel

SQLIO

What is it

Demo

How to read results

Page 41: IO Dubi Lebel

How to Run SQLIO

Brent Ozar – MVP, Quest Software
http://www.brentozar.com/

http://sqlserverpedia.com/wiki/SAN_Performance_Tuning_with_SQLIO

Page 42: IO Dubi Lebel

Write This Down. It’s Important.

sqlio -kW -t2 -s120 -dM -o1 -frandom -b64 -BH -LS Testfile.dat
sqlio -kR -t2 -s120 -dM -o2 -frandom -b64 -BH -LS Testfile.dat
sqlio -kW -t2 -s120 -dM -o8 -frandom -b64 -BH -LS Testfile.dat
sqlio -kW -t2 -s120 -dM -o16 -frandom -b64 -BH -LS Testfile.dat
sqlio -kR -t2 -s120 -dM -o64 -frandom -b64 -BH -LS Testfile.dat
sqlio -kR -t2 -s120 -dM -o128 -frandom -b64 -BH -LS Testfile.dat

sqlio -kW -t4 -s120 -dM -o1 -fsequential -b64 -BH -LS Testfile.dat
sqlio -kR -t4 -s120 -dM -o2 -fsequential -b64 -BH -LS Testfile.dat
sqlio -kR -t4 -s120 -dM -o4 -fsequential -b64 -BH -LS Testfile.dat
sqlio -kW -t4 -s120 -dM -o8 -fsequential -b64 -BH -LS Testfile.dat

Page 43: IO Dubi Lebel

The most important parameters are:

-kW means writes (as opposed to reads)

-t2 means two threads

-s120 means test for 120 seconds

-dM means drive letter M

-o1 means one outstanding request (not piling up requests)

-frandom means random access (as opposed to -fsequential)

-b64 means 64KB IOs
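
Since a test run is just the same command with a few parameters varied, the command lines can be generated rather than typed. The helper below is a hypothetical convenience sketch, not part of SQLIO; it only assembles the flags explained above and keeps -BH and -LS fixed as in the example commands.

```python
def sqlio_command(kind: str, threads: int, seconds: int, drive: str,
                  outstanding: int, pattern: str, block_kb: int,
                  test_file: str = "Testfile.dat") -> str:
    """Assemble one sqlio command line from the parameters described above."""
    if kind not in ("R", "W"):
        raise ValueError("kind must be 'R' (reads) or 'W' (writes)")
    if pattern not in ("random", "sequential"):
        raise ValueError("pattern must be 'random' or 'sequential'")
    return (f"sqlio -k{kind} -t{threads} -s{seconds} -d{drive} "
            f"-o{outstanding} -f{pattern} -b{block_kb} -BH -LS {test_file}")

# Sweep the number of outstanding requests for random 64KB writes on drive M.
for outstanding in (1, 2, 8, 16, 64, 128):
    print(sqlio_command("W", 2, 120, "M", outstanding, "random", 64))
```

Generating the matrix this way makes it easy to keep every parameter constant except the one being studied.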

Page 44: IO Dubi Lebel

The Output

E:\Program Files (x86)\SQLIO>sqlio -kW -t2 -s120 -dM -o1 -frandom -b64 -BH -LS Testfile.dat
sqlio v1.5.SG
using system counter for latency timings, -1361967296 counts per second
2 threads writing for 120 secs to file M:Testfile.dat
        using 64KB random IOs
        enabling multiple I/Os per thread with 1 outstanding
        buffering set to use hardware disk cache (but not file cache)
using current size: 24576 MB for file: M:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec:  1539.50
MBs/sec:    96.21
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 572
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 66 32 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
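
When many runs are collected, the summary figures can be pulled out of the output with a small parser. The sketch below is an illustration that assumes the metric lines look exactly like the sample above; other SQLIO versions may format them differently.

```python
import re

# Metric lines copied from the sample output above.
SAMPLE = """\
throughput metrics:
IOs/sec:  1539.50
MBs/sec:    96.21
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 572
"""

METRIC = re.compile(
    r"^(IOs/sec|MBs/sec|Min_Latency\(ms\)|Avg_Latency\(ms\)|Max_Latency\(ms\)):"
    r"[ \t]*([\d.]+)[ \t]*$",
    re.MULTILINE,
)

def parse_sqlio_summary(text: str) -> dict:
    """Return the throughput and latency metrics found in sqlio output."""
    return {name: float(value) for name, value in METRIC.findall(text)}

summary = parse_sqlio_summary(SAMPLE)
print(summary)

# Sanity check: MB/sec divided by IOs/sec should give back the 64KB block size.
implied_block_kb = summary["MBs/sec"] * 1024 / summary["IOs/sec"]
print(f"Implied I/O size: {implied_block_kb:.1f} KB")
```

Note that the sample reports an average latency of 0 ms next to a maximum of 572 ms, so the histogram is worth reading along with the averages.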

Page 46: IO Dubi Lebel

IOMETER

What is it

Demo

How to read results

Page 47: IO Dubi Lebel

http://www.iometer.org/

Developed by Intel Corp. in 1998

An I/O subsystem measurement and characterization tool for single and clustered systems.

Given to the Open Source Development Lab (OSDL) in November 2001

Last update 2008-06-22-rc2

Page 48: IO Dubi Lebel
Page 49: IO Dubi Lebel

Heuristics:
• One manager per server.
• One worker per processor.

Note:
• If you leave this field at “0”, IOMeter will use all available disk space.

Can play a significant role in observed performance.

Page 50: IO Dubi Lebel
Page 51: IO Dubi Lebel
Page 52: IO Dubi Lebel
Page 53: IO Dubi Lebel
Page 54: IO Dubi Lebel

SQLIOStress - SQLIOSim

What is it

Demo

How to read results

Page 56: IO Dubi Lebel

HP StorageWorks

Page 57: IO Dubi Lebel
Page 59: IO Dubi Lebel

SQL Server → Windows → CPU → PCI Bus → I/O Controller / HBA → Cabling → Array Cache → Spindle

Page 60: IO Dubi Lebel

Hardware between the CPU and the physical drive

Different topologies, depending on vendor and technology

Best Practices:
– Understand the topology, potential bottlenecks and theoretical throughput of components in the path
– Engage storage engineers early in the process
– The deeper the topology, the more latency

Page 61: IO Dubi Lebel

Controller

Network components between disks and server

Multiple disks connected to a computer system through a controller

Failure detection and recovery (checksum, bad sector remapping)

Page 62: IO Dubi Lebel

Disk interface standard

Fibre Channel (FC)
– Fastest; bus speeds between 2 and 4 Gbit/sec

SCSI - Small Computer System Interface
– Older technology, slower bus speeds

(S)ATA - (Serial) AT Attachment
– Newer technology, even slower bus speeds

Enterprise Flash Disks (EFDs)
– Newest technology, same bus speeds as FC

Page 63: IO Dubi Lebel

Cache

System cache - Buffers data between disk and interface

Disk cache - Uses DRAM to cache recently accessed blocks

Blocks are usually replaced in LRU order

Minimum 8-16MB

Needs an operational battery for reliable writes

Page 64: IO Dubi Lebel

Cache size, checkpoint 2GB vs. 8GB

Key Takeaway: Write cache helps, up to a certain point

Page 65: IO Dubi Lebel

I/O Systems

[Diagram: the processor and its cache connect over the memory-I/O bus to main memory and to several I/O controllers, which serve the disks, graphics and network; the controllers signal the processor through interrupts.]

Page 66: IO Dubi Lebel

DAS vs. SAN

DAS – Direct Attached Storage
– Standards: (SCSI), SAS, SATA
– RAID controller in the machine
– PCI-X or PCI-E direct access

SAN – Storage Area Networks
– Standards: iSCSI or Fibre Channel (FC)
– Host Bus Adapters or Network Cards in the machine
– Switches / Fabric access to the disk

Page 67: IO Dubi Lebel

Path to the drives - DAS

[Diagram: two PCI buses, each feeding a controller with its own cache; each controller connects to the disk shelves through shelf interfaces.]

Page 68: IO Dubi Lebel

Path to the Drives – DAS ”on chip”

[Diagram: two PCI buses, each feeding a controller directly.]

Page 69: IO Dubi Lebel

Path to the drives - SAN

[Diagram: components shown are the PCI buses, HBAs, switches, Fibre Channel ports, controllers/processors and cache.]

Best Practice: Make sure you have the tools to monitor the entire path to the drives. Understand the utilization of individual components.

Page 70: IO Dubi Lebel

Numbers to Remember - DAS

SAS cable speed
– Theoretical: 1.5GB/sec
– Typical: 1.2GB/sec

PCI-E v1 bus
– x4 slot: 750MB/sec
– x8 slot: 1.5GB/sec
– x16 slot: fast enough, around 3GB/sec

PCI-E v2 bus
– x4 slot: 1.5-1.8GB/sec
– x8 slot: 3GB/sec

Be aware that a PCI-E bus may be “v2 compliant” but still run at v1 speeds.

Page 71: IO Dubi Lebel

Numbers to Remember - SAN

HBA speed
– 4Gbit: theoretical around 500MB/sec
– Realistically: between 350 and 400MB/sec

8Gbit will do twice that
– But remember the limits of the PCI-E bus
– An 8Gbit card will require a PCI-E x4 v2 slot or faster

Max throughput per storage controller
– Varies by SAN vendor, check specifications

Drives are still drives: there is no magic
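
The slot requirement can be checked with simple arithmetic: convert the HBA line rate to MB/sec and compare it with what the slot can carry. The sketch below uses the bus figures from the DAS numbers slide, taking the lower bound where a range was given; the function and table names are illustrative.

```python
def line_rate_mb_per_sec(gbit: float) -> float:
    """Theoretical HBA bandwidth: 4Gbit is treated as about 500MB/sec."""
    return gbit * 1000.0 / 8.0

# Slot bandwidth in MB/sec, from the DAS numbers (lower bound of each range).
SLOT_MB_PER_SEC = {
    ("v1", "x4"): 750.0,
    ("v1", "x8"): 1500.0,
    ("v2", "x4"): 1500.0,
    ("v2", "x8"): 3000.0,
}

def slot_can_feed(hba_gbit: float, version: str, width: str) -> bool:
    """True if the slot can carry the HBA's theoretical line rate."""
    return SLOT_MB_PER_SEC[(version, width)] >= line_rate_mb_per_sec(hba_gbit)

print(slot_can_feed(4, "v1", "x4"))   # 500MB/sec fits in 750MB/sec
print(slot_can_feed(8, "v1", "x4"))   # 1000MB/sec does not fit in 750MB/sec
print(slot_can_feed(8, "v2", "x4"))   # 1000MB/sec fits in 1500MB/sec
```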

Page 72: IO Dubi Lebel

DAS vs. SAN

Feature | SAN | DAS
Cost | High, but may be offset by better utilization | Low, but may waste space
Flexibility | Virtualization allows online configuration changes | Better get it right the first time!
Skills required | Steep learning curve, can be complex | Simple and well understood
Additional features | Snapshots, storage replication | None
Performance | Contrary to popular belief, SAN is not a performance technology | High performance for small investment
Reliability | Very high reliability | Typically less reliable; may be offset by higher redundancy on RAID levels
Clustering support | Yes | No

Page 74: IO Dubi Lebel

Monitoring - Windows View of I/O

Make sure to capture all of these for the complete picture…

Logical Disk Counter | Storage Guy’s Term | Description
Disk Reads/sec, Disk Writes/sec | IOPS | Measures the number of I/Os per second. Discuss sizing of spindles of different types and rotational speeds with the vendor. Impacted by disk head movement (i.e. short stroking the disk will provide more I/O per second capacity).
Avg. Disk sec/Read, Avg. Disk sec/Write | Latency | Measures disk latency. Numbers will vary; optimal values for averages over time: 1-5 ms for log (ideally 1 ms or better), 5-20 ms for data (OLTP) (ideally 10 ms or better), <= 25-30 ms for data (DSS).
Avg. Disk Bytes/Read, Avg. Disk Bytes/Write | Block size | Measures the size of the I/Os being issued. Larger I/Os tend to have higher latency (example: BACKUP/RESTORE).
Avg. / Current Disk Queue Length | Outstanding or waiting IOPS | Should not be used to diagnose good/bad performance. Provides insight into the application's I/O pattern.
Disk Read Bytes/sec, Disk Write Bytes/sec | Throughput or aggregate throughput | Measure of total disk throughput. Ideally larger block scans should be able to heavily utilize connection bandwidth.
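
The latency guidance above can be turned into a simple check on collected counter values. The sketch below is illustrative only: it uses the upper end of each range as the limit, the workload labels are made-up keys, and the counter value is expected in seconds because that is how Avg. Disk sec/Read and Avg. Disk sec/Write report it.

```python
# Upper bounds in milliseconds, taken from the latency row above.
TARGET_MAX_MS = {
    "log": 5.0,         # 1-5 ms for log
    "data_oltp": 20.0,  # 5-20 ms for OLTP data
    "data_dss": 30.0,   # <= 25-30 ms for DSS data
}

def within_target(workload: str, avg_disk_sec: float) -> bool:
    """Compare an Avg. Disk sec/Read or sec/Write sample with the guidance."""
    return avg_disk_sec * 1000.0 <= TARGET_MAX_MS[workload]

print(within_target("log", 0.004))        # 4 ms on the log volume
print(within_target("log", 0.012))        # 12 ms on the log volume
print(within_target("data_oltp", 0.012))  # 12 ms on an OLTP data volume
```

The same 12 ms sample is acceptable for an OLTP data file but too slow for the log, so the check only makes sense per volume and per workload.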

Page 75: IO Dubi Lebel

Validating a System for High Throughput OLTP

Cached files vs. Disk Access

From Disk

By the way – Notice the queue depth

From Cache

Page 76: IO Dubi Lebel

Random or Sequential?

Knowing whether your workload is random or sequential in nature can be a hard question to answer
– Depends a lot on application design

SQL Server Access Methods can give some insight
– High values of Readahead pages/sec indicate a lot of sequential activity
– High values of index seeks/sec indicate a lot of random activity
– Look at the ratio between the two

The transaction log is always sequential
– Best Practice: Isolate the transaction log on dedicated drives

Page 77: IO Dubi Lebel

Configuring Disks in Windows

The one-slide best practice

Use disk alignment at 1024KB

Use GPT if MBR is not large enough

Format partitions at 64KB allocation unit size

One partition per LUN

Only use Dynamic Disks when there is a need to stripe LUNs using Windows striping (i.e. Analysis Services workload)

Tools:
– Diskpar.exe, DiskPart.exe and DmDiag.exe
– Format.exe
– Disk Manager

Page 78: IO Dubi Lebel

Ensure Disks are Formatted Correctly

– The worst scenario? Random operations using 64K IO and 64K chunk size. One sector off and you are hitting two disks for every IO, thus halving the random performance potential.

Note: On a RAID array this means accessing two different stripe units on two separate disks.

Graphics Source: Jimmy May
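
The effect of a misaligned partition can be shown by counting how many stripe units a single I/O touches. The sketch below is a simplified model; the 63-sector starting offset is the commonly cited default for partitions created on older Windows versions and is an assumption here, not a figure from the slides.

```python
KB = 1024

def stripe_units_touched(partition_offset: int, io_offset: int,
                         io_size: int, stripe_unit: int) -> int:
    """Number of stripe units covered by one I/O, all values in bytes."""
    start = partition_offset + io_offset
    end = start + io_size - 1
    return end // stripe_unit - start // stripe_unit + 1

legacy_offset = 63 * 512    # assumed legacy default: 63 sectors = 31.5KB
aligned_offset = 1024 * KB  # the recommended 1024KB alignment

# A 64KB I/O at the start of the partition, with a 64KB stripe unit.
print(stripe_units_touched(legacy_offset, 0, 64 * KB, 64 * KB))   # 2
print(stripe_units_touched(aligned_offset, 0, 64 * KB, 64 * KB))  # 1
```

With the misaligned offset every 64KB I/O straddles a stripe boundary and lands on two disks; with the 1024KB offset it stays within one.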

Page 79: IO Dubi Lebel

11 / 20; Using Unaligned Partitions


Page 80: IO Dubi Lebel

Do multiple data files make a difference?

Paul S. Randal
http://www.sqlskills.com/BLOGS/PAUL/post/Benchmarking-do-multiple-data-files-make-a-difference.aspx

– More drives typically yield better speed

– True for both SAN and DAS

– ... Less so for SSD, but still relevant (especially for NAND)

Page 81: IO Dubi Lebel

How Many Data Files Do I Need?

More data files do not necessarily equal better performance
– Determined mainly by 1) hardware capacity and 2) access patterns

The number of data files may impact the scalability of heavy write workloads
– Potential for contention on allocation structures (PFS/GAM/SGAM)
– Mainly a concern for applications with a high rate of page allocations on servers with >= 8 CPU cores
– More of a consideration for tempdb (most cases)

Can be used to maximize the number of spindles: data files can be used to “stripe” the database across more physical spindles

Best practice: Pre-size data/log files, use equal sizes for files within a single filegroup, and manually grow all files within the filegroup at the same time (vs. AUTOGROW)

Page 82: IO Dubi Lebel

How Many Filegroups?

Performance
– Filegroups can be used to separate tables/indexes, allowing selective placement of these at the disk level
– Separate objects requiring more data files due to a high page allocation rate
– Can be used to separate I/O patterns

Administration consideration (primarily)
– Backup can be performed at the filegroup or file level
– Partial availability
  • Database is available if the primary filegroup is available; other filegroups can be offline
  • A filegroup is available if all its files are available
– Tables and indexes
  • Can specify separate filegroups for in-row data and large-object data
  • Best Practice: Place LOB data in a dedicated filegroup
– Partitioned tables
  • Each partition can be in its own filegroup
  • Partition per filegroup may provide a better archiving strategy
  • Partitions can be moved in and out of the table
– Best Practice: Do not place data in the PRIMARY filegroup; allocate a new filegroup and set it as the default

Page 83: IO Dubi Lebel

Shared storage environments

At the disk level and other shared components (i.e. service processors, cache, etc…)

Page 85: IO Dubi Lebel

Monitoring - SQL Server View of I/O

Tool | Monitors | Granularity
sys.dm_io_virtual_file_stats | Latency, number of I/Os | Database files
sys.dm_os_wait_stats | PAGEIOLATCH, WRITELOG | SQL Server instance level (cumulative since last start; most useful to analyze over time periods)
sys.dm_exec_query_stats | Number of reads (logical, physical), number of writes | Query or batch
sys.dm_db_index_usage_stats | Number of I/Os and type of access (seek, scan, lookup, write) | Index or table
sys.dm_db_index_operational_stats | I/O latch wait time, page splits | Index or table
XEvents | PAGEIOLATCH | Query and database file
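
Because these figures accumulate from the last restart, they are most useful when two snapshots are compared. The sketch below shows the arithmetic for sys.dm_io_virtual_file_stats, using its num_of_reads, io_stall_read_ms, num_of_writes and io_stall_write_ms columns; the snapshot values are made up, and fetching the rows from the server is left out.

```python
from dataclasses import dataclass

@dataclass
class FileStats:
    """One row of sys.dm_io_virtual_file_stats (subset of columns)."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int

def avg_latency_ms(before: FileStats, after: FileStats) -> tuple:
    """Average read and write latency for the interval between two snapshots."""
    reads = after.num_of_reads - before.num_of_reads
    writes = after.num_of_writes - before.num_of_writes
    read_ms = (after.io_stall_read_ms - before.io_stall_read_ms) / reads if reads else 0.0
    write_ms = (after.io_stall_write_ms - before.io_stall_write_ms) / writes if writes else 0.0
    return read_ms, write_ms

# Hypothetical snapshots of one data file, taken some minutes apart.
first = FileStats(num_of_reads=1000, io_stall_read_ms=8000,
                  num_of_writes=500, io_stall_write_ms=1000)
second = FileStats(num_of_reads=3000, io_stall_read_ms=32000,
                   num_of_writes=1500, io_stall_write_ms=3000)

read_ms, write_ms = avg_latency_ms(first, second)
print(f"reads: {read_ms:.1f} ms, writes: {write_ms:.1f} ms")
```

Dividing the cumulative totals directly would blend this interval with everything since startup, which is why the difference between snapshots is used.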

Page 87: IO Dubi Lebel

Storage Selection General Pitfalls

There are organizational barriers between DBAs and storage administrators
– Each needs to understand the other's “world”

Shared storage environments
– At the disk level and other shared components (i.e. service processors, cache, etc.)

Sizing only on “capacity” is a common problem
– Key Takeaway: Take latency and throughput requirements (MB/s, IOPS and max latency) into consideration when sizing storage

One-size-fits-all configurations
– The storage vendor should have knowledge of SQL Server and Windows best practices when the array is configured
– Especially when advanced features are used (snapshots, replication, etc.)

Page 88: IO Dubi Lebel

Disk Subsystem - SQL Server I/O Pattern

Understanding the I/O characteristics of common SQL Server operations/scenarios can help you determine how to configure storage

Operation | Random / Sequential | Read / Write | Size Range
OLTP – Log | Sequential | Write | Up to 64K
OLTP – Data (Index Seeks) | Random | Read | 8K
OLTP – Lazy Writer | Random | Write | Any multiple of 8K up to 128K
OLTP – Checkpoint | Random | Write | Any multiple of 8K up to 128K
Read Ahead (DSS, Index/Table Scans) | Sequential | Read | Any multiple of 8K up to 256K
Bulk Insert | Sequential | Write | Any multiple of 8K up to 128K
BACKUP / RESTORE | Sequential | Read/Write | Multiple of 64K (up to 4MB)
DBCC CHECKDB | Sequential | Read | 8K – 64K
ALTER INDEX REBUILD (Read Phase) | Sequential | Read | (see Read Ahead)
ALTER INDEX REBUILD (Write Phase) | Sequential | Write | Any multiple of 8K up to 128K

Page 89: IO Dubi Lebel

OLTP Workloads

High number of small Tlog writes (often single digit KB)

T-Log buffer is written because Commit is issued by the application

Concurrency around writing into T-Log Buffer

Majority ‘random’ single page reads

Page 90: IO Dubi Lebel

OLAP - DWH Workloads

Smaller number of Tlog Writes with longer writes (often 64K)

T-Log buffers get written because buffer is full

Often ‘sequential’ Read-Ahead with 64K or more from data files
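
The difference between the two log patterns can be illustrated with a toy model that counts log writes: a write happens when a commit is issued or when the log buffer fills. This is a simplified sketch, not a description of SQL Server internals; the 200 bytes of log per row and the 64KB buffer are assumed values, and the commit record itself is ignored.

```python
def count_log_flushes(rows_per_transaction: list,
                      row_log_bytes: int = 200,
                      buffer_bytes: int = 64 * 1024) -> int:
    """Count log writes: one per commit, plus one each time the buffer fills."""
    flushes = 0
    buffered = 0
    for rows in rows_per_transaction:
        for _ in range(rows):
            buffered += row_log_bytes
            if buffered >= buffer_bytes:   # buffer full: written out
                flushes += 1
                buffered = 0
        if buffered > 0:                   # commit: pending log is written out
            flushes += 1
            buffered = 0
    return flushes

# 100 single-row transactions versus 100 rows in one transaction.
print(count_log_flushes([1] * 100))  # 100 small writes, one per commit
print(count_log_flushes([100]))      # 1 larger write
```

In this model the same 100 rows cost 100 small log writes when committed one by one and a single write when wrapped in one transaction, which matches the OLTP and data warehouse patterns described above.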

Page 91: IO Dubi Lebel

Backup / Restore

Backup and restore operations use internal buffers for the data being read/written

The number of buffers is determined by:
– The number of data file volumes
– The number of backup devices
– Or by explicitly setting BUFFERCOUNT

If database files are spread across a few (or a single) logical volume(s), and there are a few (or a single) output device(s), optimal performance may not be achievable by default

Tuning can be achieved by using the BUFFERCOUNT parameter for BACKUP / RESTORE

More Information:
– http://sqlcat.com/technicalnotes/archive/2008/04/21/tuning-the-performance-of-backup-compression-in-sql-server-2008.aspx

Page 92: IO Dubi Lebel

FILESTREAM

Writes to varbinary(max) go through the buffer pool and are flushed during checkpoint

Reads and writes to FILESTREAM data do not go through the buffer pool (either T-SQL or Win32)

T-SQL uses buffered access to read and write data

Win32 can use either buffered or non-buffered access
– Depends on the application's use of APIs

FILESTREAM I/O is not tracked via sys.dm_io_virtual_file_stats
– Best practice: place it on a separate logical volume for monitoring purposes

Writes to FILESTREAM generate less transaction log volume than varbinary(max)
– Actual FILESTREAM data is not logged
– FILESTREAM data is captured as part of database backup and transaction log backup
– May increase the throughput capacity of the transaction log
  • http://sqlcat.com/technicalnotes/archive/2008/12/09/diagnosing-transaction-log-performance-issues-and-limits-of-the-log-manager.aspx

Page 93: IO Dubi Lebel

TEMPDB

User group 92, November 2009:
– Nothing is not more permanent than the temporary

http://www.slideshare.net/sqlserver.co.il/nothing-is-not-more-permanent-than-the-temporary

Page 94: IO Dubi Lebel

Tools

SQLIO
– Used to stress an I/O subsystem
– Test a configuration’s performance
– http://www.microsoft.com/downloads/details.aspx?FamilyId=9A8B005B-84E4-4F24-8D65-CB53442D9E19&displaylang=en

SQLIOSim
– Simulates SQL Server I/O
– Used to isolate hardware issues
– 231619 HOW TO: Use the SQLIOStress Utility to Stress a Disk Subsystem http://support.microsoft.com/?id=231619

Fiber Channel Information Tool
– Command line tool which provides configuration information (Host/HBA)
– http://www.microsoft.com/downloads/details.aspx?FamilyID=73d7b879-55b2-4629-8734-b0698096d3b1&displaylang=en

Page 95: IO Dubi Lebel

KB Articles

KB 824190: Troubleshooting Storage Area Network (SAN) Issues
– http://support.microsoft.com/?id=824190

KB 304415: Support for Multiple Clusters Attached to the Same SAN Device
– http://support.microsoft.com/?id=304415

KB 280297: How to Configure Volume Mount Points on a Clustered Server
– http://support.microsoft.com/?id=280297

KB 819546: SQL Server support for mounted volumes
– http://support.microsoft.com/?id=819546

KB 304736: How to Extend the Partition of a Cluster Shared Disk
– http://support.microsoft.com/?id=304736

KB 325590: How to Use Diskpart.exe to Extend a Data Volume
– http://support.microsoft.com/?id=325590

KB 328551: Concurrency enhancements for the tempdb database
– http://support.microsoft.com/?id=328551

KB 304261: Support for Network Database Files
– http://support.microsoft.com/?id=304261

Page 96: IO Dubi Lebel

General Storage References

Microsoft Windows Clustering: Storage Area Networks
– http://www.microsoft.com/windowsserver2003/techinfo/overview/san.mspx

StorPort in Windows Server 2003: Improving Manageability and Performance in Hardware RAID and Storage Area Networks
– http://www.microsoft.com/windowsserversystem/wss2003/techinfo/plandeploy/storportwp.mspx

Virtual Device Interface Specification
– http://www.microsoft.com/downloads/details.aspx?FamilyID=416f8a51-65a3-4e8e-a4c8-adfe15e850fc&DisplayLang=en

Windows Server System Storage Home
– http://www.microsoft.com/windowsserversystem/storage/default.mspx

Microsoft Storage Technologies – Multipath I/O
– http://www.microsoft.com/windowsserversystem/storage/technologies/mpio/default.mspx

Storage Top 10 Best Practices
– http://sqlcat.com/top10lists/archive/2007/11/21/storage-top-10-best-practices.aspx

Page 97: IO Dubi Lebel

Recommended Courses: SQL 2008
• Maintaining a Microsoft SQL Server 2008 Database
• Implementing a Microsoft SQL Server 2008 Database

And now a special offer! Register here for one of the SQL courses on offer.

Buy a complementary certification exam:
• 70-432: TS: Microsoft SQL Server 2008, Implementation and Maintenance
• 70-433: TS: Microsoft SQL Server 2008, Database Development

and receive an annual TechNet subscription and a free exam retake option.

For more details and registration, contact the certified colleges.

Page 99: IO Dubi Lebel

Feedback and Facebook

To be completed - Merav

Page 100: IO Dubi Lebel