IO PERFORMANCE.ppt


  • 8/10/2019 IO PERFORMANCE.ppt

    1/35

    I/O Performance Measures:

    Austin Orgah

    Chapter 8, Sections 8.6 to 8.9


    Examples from Disk and File Systems

    How should we compare I/O systems?

    - This is complex because I/O performance depends on many aspects of the I/O system.

    - A design can also make complex trade-offs between response time and throughput, making it impossible to measure just one aspect in isolation.


    Examples from Disk and File Systems (cont'd)

    For example:

    Handling a request as early as possible generally minimizes response time, although greater throughput can be achieved by handling related requests together.

    Throughput on a disk may be increased by grouping requests that access locations that are close together.

    This grouping will increase response time for some requests, probably leading to a larger variation in response time.
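    The trade-off can be illustrated with a minimal sketch. It assumes a toy disk model that is not part of the slides: all requests are queued at time 0, the head starts at track 50, and a seek costs one time unit per track moved. The track numbers and the helper name `service` are hypothetical.

```python
def service(order, start):
    """Serve (request_id, track) pairs in the given order.

    Returns the total time to finish all requests and a dict mapping
    each request_id to its completion (response) time.
    """
    pos, clock, done = start, 0, {}
    for req_id, track in order:
        clock += abs(track - pos)  # assumed cost: one time unit per track moved
        pos = track
        done[req_id] = clock
    return clock, done

# Hypothetical queue: the request id is the arrival order, the value is the track.
tracks = [90, 10, 85, 15, 80, 20]
arrival_order = list(enumerate(tracks))
# Group nearby requests by serving them in one sweep from the lowest track up.
grouped_order = sorted(arrival_order, key=lambda r: r[1])

fcfs_total, fcfs_done = service(arrival_order, start=50)
grouped_total, grouped_done = service(grouped_order, start=50)

print("FCFS:    total time", fcfs_total, "request 0 done at", fcfs_done[0])
print("Grouped: total time", grouped_total, "request 0 done at", grouped_done[0])
```

    Under these assumptions, grouping finishes the whole queue sooner (higher throughput), yet the request that arrived first waits longer than it would under first-come, first-served.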


    Examples from Disk and File Systems (cont'd)

    Although grouping requests increases throughput, some benchmarks constrain the maximum response time of any request, making these disk and file optimizations potentially problematic.


    Several benchmarks have been proposed for determining the performance of disk systems. These benchmarks are affected by a variety of system features, such as:

    - Disk technology
    - How the disks are connected
    - The memory system
    - The processor
    - The file system provided by the operating system


    Important Note (cont'd)

    In base 10: 1K = 1000

    In base 2: 1K = 1024

    For calculations, treating the two as if they were equal, instead of converting between them, introduces little error (about 2.4% for K).
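    As a quick check on the size of that error, the short sketch below computes the relative difference between binary and decimal prefixes. The extension to M and G is an addition for illustration; the slide itself only mentions K.

```python
# Relative error from treating binary prefixes (powers of 1024)
# as if they were decimal prefixes (powers of 1000).
errors = {
    name: 1024**power / 1000**power - 1
    for name, power in [("K", 1), ("M", 2), ("G", 3)]
}

for name, error in errors.items():
    print(f"{name}: {error:.1%}")
```

    The error grows with each larger prefix, so the approximation is safest for small quantities.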


    Benchmarks

    Transaction Processing I/O

    File System and Web I/O


    Transaction Processing I/O Benchmarks

    Transaction processing (TP): a type of application that involves handling small, short operations (transactions) that require both I/O and computation. TP applications typically have both response time requirements and a performance measurement based on the throughput of transactions.

    TP applications are mainly concerned with I/O rate, measured as the number of disk accesses per second, rather than data rate, measured in bytes of data per second.


    Transaction Processing I/O Benchmarks

    I/O rate: a performance measure of I/Os per unit time, such as reads per second.

    Data rate: a performance measure of bytes per unit time, such as GB/sec.

    TP applications involve changes to a large database, with the system meeting some response time requirements as well as gracefully handling certain types of failures. For example, banks use TP systems.
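    The difference between the two measures can be seen in a small sketch. The device parameters (6 ms positioning latency, 75 MB/sec transfer rate) and the access sizes are assumptions chosen for illustration, and the function name `rates` is hypothetical.

```python
def rates(access_bytes, latency_s, bandwidth_bytes_per_s):
    """Return (I/O rate in accesses/sec, data rate in bytes/sec) for one device."""
    time_per_access = latency_s + access_bytes / bandwidth_bytes_per_s
    io_rate = 1.0 / time_per_access
    return io_rate, io_rate * access_bytes

# Assumed device: 6 ms positioning latency, 75 MB/sec transfer rate.
LATENCY_S = 0.006
BANDWIDTH = 75e6

# Small accesses, as in transaction processing.
small_io_rate, small_data_rate = rates(4_000, LATENCY_S, BANDWIDTH)
# Large sequential accesses, as in bulk data transfer.
large_io_rate, large_data_rate = rates(1_000_000, LATENCY_S, BANDWIDTH)

print(f"4 KB accesses: {small_io_rate:.0f} I/Os/sec, {small_data_rate / 1e6:.2f} MB/sec")
print(f"1 MB accesses: {large_io_rate:.0f} I/Os/sec, {large_data_rate / 1e6:.2f} MB/sec")
```

    With small accesses the device completes many more operations per second but moves far fewer bytes, which is why a TP workload is better characterized by its I/O rate than by its data rate.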


    Transaction Processing I/O Benchmarks

    The best-known set of benchmarks is developed by the Transaction Processing Council (TPC).

    TPC-C: created in 1992, simulates a complex query environment.

    TPC-H: models ad hoc decision support; the queries are unrelated, and knowledge of past queries cannot be used to optimize future queries.


    Transaction Processing I/O Benchmarks

    TPC-R: simulates a business decision support system where users run a standard set of queries.

    TPC-W: a web-based transaction benchmark that simulates the activities of a business-oriented transactional web server.

    For more information, visit www.tpc.org.


    File System and Web I/O Benchmarks

    File systems stored on disks have a different access pattern. Measurements of UNIX file systems in an engineering environment show that:

    - 80% of accesses are to files smaller than 10 KB.
    - 90% of all file accesses are to data with sequential addresses on the disk.
    - 67% of the accesses are reads.
    - 27% are writes.
    - 6% are read-modify-write accesses, which read, modify, and rewrite data to the same location.

    These measurements have led to the creation of synthetic file system benchmarks.


    File System and Web I/O Benchmarks

    A popular synthetic file system benchmark has five phases and uses 70 files:

    - MakeDir: Constructs a directory subtree that is identical in structure to the given directory subtree.
    - Copy: Copies every file from the source subtree to the target subtree.
    - ScanDir: Recursively traverses a directory subtree and examines the status of every file in it.
    - ReadAll: Scans every byte of every file in a subtree once.
    - Make: Compiles and links all the files in a subtree.
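    The first four phases can be sketched with the Python standard library, as below. This is only an illustration of what each phase does, not the benchmark itself: the function names are hypothetical, the Make phase is omitted because it depends on a compiler toolchain, and the demonstration tree holds 3 small files rather than 70.

```python
import os
import shutil
import tempfile
import time

def make_dir(src, dst):
    """MakeDir: recreate the directory structure of src under dst (no files)."""
    for root, _dirs, _files in os.walk(src):
        rel = os.path.relpath(root, src)
        os.makedirs(os.path.normpath(os.path.join(dst, rel)), exist_ok=True)

def copy_files(src, dst):
    """Copy: copy every file from the source subtree to the target subtree."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        for name in files:
            shutil.copyfile(os.path.join(root, name), os.path.join(dst, rel, name))

def scan_dir(tree):
    """ScanDir: traverse the subtree and stat every file; return the file count."""
    count = 0
    for root, _dirs, files in os.walk(tree):
        for name in files:
            os.stat(os.path.join(root, name))
            count += 1
    return count

def read_all(tree):
    """ReadAll: read every byte of every file once; return the byte count."""
    total = 0
    for root, _dirs, files in os.walk(tree):
        for name in files:
            with open(os.path.join(root, name), "rb") as f:
                total += len(f.read())
    return total

def run_phases(src, dst):
    """Time each phase in order; returns {phase: (seconds, result)}."""
    phases = [
        ("MakeDir", lambda: make_dir(src, dst)),
        ("Copy", lambda: copy_files(src, dst)),
        ("ScanDir", lambda: scan_dir(dst)),
        ("ReadAll", lambda: read_all(dst)),
    ]
    report = {}
    for label, phase in phases:
        t0 = time.perf_counter()
        result = phase()
        report[label] = (time.perf_counter() - t0, result)
    return report

# Tiny hypothetical source tree: 3 files of 100 bytes each.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = os.path.join(tmp, "src"), os.path.join(tmp, "dst")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", "b.txt", os.path.join("sub", "c.txt")):
        with open(os.path.join(src, rel), "wb") as f:
            f.write(b"x" * 100)
    report = run_phases(src, dst)

for label, (seconds, result) in report.items():
    print(f"{label:8s} {seconds:.6f} s  result={result}")
```

    Timings from a tree this small mostly reflect caching and call overhead; a meaningful measurement would need a realistic file set and a cold cache.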


    File System and Web I/O Benchmarks

    In addition to processor benchmarks, SPEC offers a file server benchmark (SPECSFS) and a web server benchmark (SPECWeb).

    SPECSFS is a benchmark for measuring NFS (Network File System) performance using a script of file server requests. It tests the performance of the I/O system, including both disk and network I/O, as well as the processor. It is a throughput-oriented benchmark with important response time requirements.

    SPECWeb is a web server benchmark that simulates multiple clients requesting both static and dynamic pages from a server, as well as clients posting data to the server.


    I/O Performance Versus Processor Performance

    Impact of I/O on system performance: Suppose we have a benchmark that executes in 100 s of elapsed time, where 90 s is CPU time and the rest is I/O time. If CPU time improves by 50% per year for the next five years but I/O time doesn't, how much faster will our program run at the end of five years?

    Elapsed time = CPU time + I/O time

    100 = 90 + I/O time

    Therefore: I/O time = 10 s.


    After n years   CPU time             I/O time   Elapsed time   % I/O time
    0               90 secs              10 secs    100 secs       10%
    1               90/1.5 = 60 secs     10 secs    70 secs        14%
    2               60/1.5 = 40 secs     10 secs    50 secs        20%
    3               40/1.5 = 27 secs     10 secs    37 secs        27%
    4               27/1.5 = 18 secs     10 secs    28 secs        36%
    5               18/1.5 = 12 secs     10 secs    22 secs        45%


    The CPU improvement over 5 years is:

    90/12 = 7.5

    The improvement in elapsed time is:

    100/22 = 4.5

    So the improvement in elapsed time is well below the improvement in CPU time, and the I/O time's share of the elapsed time increased from 10% to 45%.
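    The projection can be reproduced with a few lines of code. The sketch below keeps full precision instead of rounding the CPU time each year as the slides do, so its final figures come out slightly higher (about 7.6 and 4.6 rather than 7.5 and 4.5). The function name `project` is hypothetical.

```python
def project(cpu_s=90.0, io_s=10.0, cpu_gain=1.5, years=5):
    """Return a list of (year, cpu_time, io_time, elapsed, io_share) tuples."""
    rows = []
    for year in range(years + 1):
        elapsed = cpu_s + io_s
        rows.append((year, cpu_s, io_s, elapsed, io_s / elapsed))
        cpu_s /= cpu_gain  # CPU time shrinks by a factor of 1.5 each year
    return rows

rows = project()
for year, cpu, io, elapsed, share in rows:
    print(f"{year}  CPU {cpu:5.1f} s  I/O {io:4.1f} s  "
          f"elapsed {elapsed:5.1f} s  I/O share {share:.0%}")

cpu_speedup = rows[0][1] / rows[-1][1]
elapsed_speedup = rows[0][3] / rows[-1][3]
print(f"CPU speedup {cpu_speedup:.2f}, elapsed-time speedup {elapsed_speedup:.2f}")
```

    The gap between the two speedups is the point of the example: the part of the program that does not improve comes to dominate the elapsed time.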


    Designing an I/O System

    Designers encounter two primary types of specifications in I/O systems:

    - Latency constraints
    - Bandwidth constraints

    Knowledge of the traffic pattern affects the design and analysis.


    Latency constraints involve ensuring that the latency to complete an I/O operation is bounded by a certain amount.

    Designing an I/O system to meet a set of bandwidth constraints given a workload involves the following steps:

    - Find the weakest link in the I/O system, which is the component in the I/O path that will constrain the design. Depending on the workload, this component can be anywhere, including the CPU, the memory system, the backplane bus, the I/O bus, the I/O controllers, or the devices. The workload and configuration limits may dictate where the weakest link is located.
    - Configure this component to sustain the required bandwidth.
    - Determine the requirements for the rest of the system and configure them to support this bandwidth.


    I/O System Design Example

    Consider a system with the following components:

    - A CPU that sustains 3 billion instructions/sec and averages 100,000 instructions in the operating system per I/O operation.
    - A memory backplane bus capable of sustaining a transfer rate of 1000 MB/sec.
    - SCSI Ultra320 controllers with a transfer rate of 320 MB/sec and accommodating up to 7 disks.
    - Disk drives with read/write bandwidths of 75 MB/sec and an average seek plus rotational latency of 6 ms.

    If the workload consists of 64 KB reads (where the block is sequential on a track) and the user program needs 200,000 instructions per I/O operation, find the maximum sustainable I/O rate and the number of disks and SCSI controllers required. Assume that the reads can always be done on an idle disk if one exists (i.e., ignore disk conflicts).
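    One way to work the example is sketched below, following the weakest-link approach: compute the I/O rate each fixed component can sustain, take the minimum, then size the disks and controllers to match. The sketch treats 1 KB as 1000 bytes and 1 MB as 10^6 bytes, and the variable names are chosen here for illustration.

```python
import math

CPU_RATE = 3e9                      # instructions/sec
INSTR_PER_IO = 200_000 + 100_000    # user program + operating system
BUS_BW = 1000e6                     # bytes/sec on the memory backplane bus
CTRL_BW = 320e6                     # bytes/sec per SCSI Ultra320 controller
DISKS_PER_CTRL = 7
DISK_BW = 75e6                      # bytes/sec per disk
DISK_LATENCY = 0.006                # seconds (seek plus rotational latency)
IO_SIZE = 64e3                      # bytes per read

# Step 1: find the weakest link among the fixed components.
cpu_io_rate = CPU_RATE / INSTR_PER_IO
bus_io_rate = BUS_BW / IO_SIZE
max_io_rate = min(cpu_io_rate, bus_io_rate)

# Step 2: size the disks so that they can sustain that rate.
time_per_io = DISK_LATENCY + IO_SIZE / DISK_BW
disk_io_rate = 1 / time_per_io
disks = math.ceil(max_io_rate / disk_io_rate)

# Step 3: check that 7 busy disks do not saturate a controller, then count controllers.
ctrl_load = DISKS_PER_CTRL * disk_io_rate * IO_SIZE
controllers = math.ceil(disks / DISKS_PER_CTRL)

print(f"CPU limit: {cpu_io_rate:.0f} I/Os/sec, bus limit: {bus_io_rate:.0f} I/Os/sec")
print(f"Each disk sustains {disk_io_rate:.0f} I/Os/sec")
print(f"Disks: {disks}, controllers: {controllers}")
print(f"Load per full controller: {ctrl_load / 1e6:.1f} MB/sec of {CTRL_BW / 1e6:.0f} MB/sec")
```

    Under these assumptions the CPU is the weakest link at 10,000 I/Os per second, which calls for 69 disks and 10 controllers; each fully populated controller carries about 65 MB/sec, well under its 320 MB/sec limit.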


    Real Stuff: A Digital Camera

    Digital cameras are embedded computers with removable, writable, nonvolatile storage and interesting I/O devices. See the Sanyo VPC-SX500.


    Digital Camera (cont'd)

    When powered on, the microprocessor first runs diagnostics on all components and writes any error messages to the liquid crystal display (LCD). When a picture is about to be taken, the photographer holds the shutter button halfway down so that the microprocessor can take a light reading. The microprocessor then keeps the shutter open to get the necessary light, which is captured by a charge-coupled device (CCD) as red, green, and blue pixels.


    Digital Camera (cont'd)

    The pixels are then scanned out row by row and passed through routines for white balance, color, and aliasing correction, and then stored in a 4 MB frame buffer. The next step is to compress the image into a standard format such as JPEG and store it in the removable flash memory. The microprocessor then updates the LCD display to show that there is room for one less picture. The camera has many other features, such as video recording, a sleep mode, and an LCD display.


    Digital Camera (cont'd)

    The camera allows the use of a Microdrive disk instead of CompactFlash memory. Fig 8.15 compares the two.


    Digital Camera (cont'd)

    The electronic brain of the Sanyo camera is an embedded computer with several special functions embedded on the chip. Chips of this kind are called systems on a chip (SOC). An SOC integrates into a single chip all the parts that were found on a small printed circuit board in the past. SOCs reduce size and lower power compared to less integrated solutions. The SOC enables the camera to operate on half the number of batteries and to offer a smaller form factor than competitors' cameras.

    Fig 8.16


    The SOC has two buses. The 16-bit bus is for the many slower I/O devices, such as the SmartMedia interface, program and data memory, and DMA. The 32-bit bus is for the SDRAM, the signal processor (which is connected to the CCD), the Motion JPEG encoder, and the NTSC/PAL encoder (which is connected to the LCD). Unlike desktop microprocessors, the SOC must integrate a large variety of I/O buses. This 700 mW chip contains 1.8M transistors in a 10.5 x 10.5 mm die implemented using a 0.35-micron process.


    Fallacies and Pitfalls

    Fallacy: The rated mean time to failure (MTTF) of disks is 1,200,000 hours, or almost 140 years, so disks practically never fail.

    This number far exceeds the lifetime of a disk. To make sense of such a large MTTF, manufacturers argue that the calculation corresponds to a user who buys a disk and then keeps replacing it every 5 years (the lifespan of the disk).
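    A more useful reading of the rated MTTF is the fraction of disks expected to fail each year. The sketch below does that arithmetic, assuming a constant failure rate, disks that run around the clock, and a hypothetical installation of 1000 disks (a number chosen here for illustration).

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years

mttf_hours = 1_200_000
mttf_years = mttf_hours / HOURS_PER_YEAR

# Assuming a constant failure rate, the expected fraction of disks failing per year.
annual_failure_rate = HOURS_PER_YEAR / mttf_hours

disks_in_service = 1000  # hypothetical installation size
expected_failures_per_year = disks_in_service * annual_failure_rate

print(f"MTTF: {mttf_years:.0f} years")
print(f"Annual failure rate: {annual_failure_rate:.2%}")
print(f"Expected failures per year among {disks_in_service} disks: "
      f"{expected_failures_per_year:.1f}")
```

    Seen this way, a 1,200,000-hour MTTF means that a large installation should still expect several disk failures every year.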


    Fallacy: Magnetic disk storage is on its last legs and will be replaced shortly.

    This is both a fallacy and a pitfall. Magnetic bubble memories, optical storage, and holographic storage have all been unsuccessful contenders. None has matched the combination of characteristics that favor magnetic disks: high reliability, nonvolatility, low cost, reasonable access time, and so on. Magnetic storage, meanwhile, keeps improving at the same or a faster pace than it has sustained over the past 25 years.


    Fallacy: A 100 MB/sec bus can transfer 100 MB of data in 1 sec.

    First, you cannot use 100% of any computer resource. For a bus, you would be fortunate to get 70% to 80% of the peak bandwidth. The time to send the address, the time to acknowledge the signals, and stalls while waiting to use a busy bus all prevent 100% utilization of a bus. Second, the MB of storage and the MB/sec of bandwidth do not agree: storage is usually measured in base 2 (1 MB = 2^20 bytes), while bandwidth is usually quoted in base 10 (1 MB/sec = 10^6 bytes/sec).
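    Both effects can be put into numbers with a short sketch. It assumes the usual conventions that a MB of storage is 2^20 bytes and a MB/sec of bandwidth is 10^6 bytes/sec, and it uses the 70% to 80% utilization range quoted above; the function name `transfer_time` is hypothetical.

```python
data_bytes = 100 * 2**20        # 100 MB of storage, base 2
peak_bytes_per_s = 100 * 10**6  # 100 MB/sec of bandwidth, base 10

def transfer_time(utilization):
    """Seconds to move the data when only a fraction of peak bandwidth is usable."""
    return data_bytes / (peak_bytes_per_s * utilization)

t_ideal = transfer_time(1.0)
t_80 = transfer_time(0.8)
t_70 = transfer_time(0.7)

print(f"100% utilization: {t_ideal:.2f} s")
print(f" 80% utilization: {t_80:.2f} s")
print(f" 70% utilization: {t_70:.2f} s")
```

    Even at an unattainable 100% utilization the transfer takes slightly more than 1 second because of the unit mismatch, and at realistic utilization it takes roughly 1.3 to 1.5 seconds.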


    Pitfall: Using the peak transfer rate of a portion of the I/O system to make performance projections or performance comparisons.

    The components of an I/O system, from the devices to the controllers to the buses, are specified using their peak bandwidths. These peak bandwidth measurements are often based on unrealistic assumptions about the system or are unattainable because of other system limitations. Amdahl's law tells us that the throughput of an I/O system will be limited by the lowest-performance component in the I/O path.


    Pitfall: Using magnetic tapes to back up disks.

    This is both a fallacy and a pitfall. Tapes use technology similar to that of disks. The cost difference between disks and tapes is based on the fact that rotating disks have lower access times than sequential tape access. In the past, a tape could hold the contents of many disks, and since tape was 10 to 100 times cheaper per gigabyte than disk, it was a useful backup medium. Today, disks have improved much more rapidly than tapes, and tapes have compatibility problems that are not imposed on disks.


    Pitfall: Trying to provide features only within the network versus end to end.

    The concern is providing at a lower level features that can only be accomplished at the highest level, thus only partially satisfying the communication demand.

    Pitfall: Moving functions from the CPU to the I/O processor, expecting to improve performance without a careful analysis.

    A frequent instance of this pitfall is the use of intelligent I/O interfaces, which, because of the higher overhead to set up an I/O request, can turn out to have worse latency than a processor-directed activity.