
Black-Box Storage Device Modeling

Mengzhi Wang

May 8, 2004

Thesis Proposal

Thesis Statement

Entropy plots and machine learning methods help create inexpensive “black-box” models that describe workload behavior on modern storage devices. Black-box models are appropriate for application planning tasks, such as automatic data layout on a set of disks.

Abstract

Autonomous systems aim to reduce the ever-increasing system management costs. Storage device models are one of the key elements in autonomous storage systems in making decisions on resource re-allocation. This thesis proposes a black-box device modeling approach, in which storage devices are treated as black boxes and modeled as mappings of I/O workloads to their performance. The model construction algorithm observes the modeled storage device on a set of training workloads and builds the model based on the observations. The main advantage of this approach is easy model construction and high efficiency in both computation and storage requirements. No specific knowledge of the internals of storage devices is needed in constructing such device models.

This thesis tackles the black-box device modeling problem in the following four steps. 1) The entropy plot metric is developed to capture the spatio-temporal behavior of I/O workloads. The metric is able to map workloads to vectors of scalars, which serve as the input to the storage device models. 2) This step introduces two types of storage device models that work at the request and workload level respectively, and a hybrid approach that combines the strengths of both. 3) This step provides a framework to generate device-specific training workloads using a feedback mechanism. 4) A sample application, automatic data layout, is used to demonstrate the usability of such black-box device models.

1 Introduction

Moore’s law dictates the pace of hardware advances. The exponential growth in processing power predicted by Moore’s law leads to larger-scale systems in practice. Growing exponentially at the same time are the complexity of the software running on these systems and the amount of data stored on them. All this exponential growth makes management costs a dominant part of overall system costs. Autonomous systems, such as self-tuning database systems [38, 53, 18] and self-managed storage systems [6, 28, 67, 77], aim at reducing the management cost by automating administration tasks. Besides identifying performance bottlenecks, these systems can also eliminate bottlenecks by re-allocating resources without human intervention. Performance prediction in self-adaptive systems is extremely important because any decision on resource re-allocation needs to be evaluated. Such evaluation is, however, infeasible by running workloads on actual devices for two reasons. First, determining the optimal or a near-optimal solution requires evaluating a significant number of possible configurations, and the number of possible configurations grows exponentially with the system size. It is prohibitively slow to evaluate configurations by actually running them. Second, the devices are in use in online planning applications. Therefore, statistical performance models are essential because of their computational efficiency. These device models can be pre-built by the device manufacturers and shipped with the devices themselves.

The difficulty of storage device modeling lies in the complexity of both storage devices and I/O workloads. On one hand, storage devices show strong non-linear behavior because of the disks’ physical machinery. Performance optimizations, such as prefetching and caching, further complicate the problem. On the other hand, I/O workloads usually exhibit bursty arrival patterns and strong spatial locality, in addition to complex correlations between requests. A lack of burstiness and locality metrics makes it even more difficult to build accurate device models.

Previous work has not yet given satisfying answers to how to model devices and how to characterize I/O workloads to suit storage device modeling. Existing device models, such as disk simulators and analytical models, usually rely on detailed information about the storage device implementation and a deep understanding of the interaction between storage devices and workloads. New storage device techniques sometimes invalidate existing models. The accuracy of the models is further limited by the lack of accurate workload characteristic metrics. On the other hand, most recent work in workload characterization targets the arrival processes of network traffic after the groundbreaking discovery of self-similarity in Ethernet traffic [44]. It is, however, inadequate to adapt the network traffic work to I/O traffic because of the difference in locality models of the two types of traffic. Target addresses of network traffic are categorical, so the LRU stack distance is a good locality metric for such traffic. For I/O workloads, however, the huge performance difference between sequential and random accesses calls for a more sophisticated locality metric than the LRU stack distance. Furthermore, strong correlations involving request type and size are also relevant to the workload performance. Previous workload characterization work usually describes workloads using a large number of parameters, and some parameters are distributions rather than scalars. The black-box storage device modeling approach, however, uses machine learning tools to describe storage device behavior, requiring a more compact representation of I/O workloads because all the tools we have examined take only vectors of scalars as input.

The target of this thesis is to develop storage device models for self-managed storage systems by treating the devices as “black boxes”. That is, neither the model itself nor the model construction algorithm is aware of the internals of the modeled storage device. The model learns the device behavior by observing it for a certain period of time, also known as “training”. The behavior is captured by a mapping of workloads to their performance on the device. The advantage of this approach is that constructing a device model involves little human intervention beyond training on the device. Furthermore, strong non-linear device behavior is no longer a worry because there are quite a few machine learning tools for non-linear mapping approximation.

Black-box device modeling is difficult for two reasons. First, machine learning tools usually require the input data to be data points in a multi-dimensional Cartesian space. Therefore, efficient workload characteristic metrics are needed in order to represent workloads as multi-dimensional data points. Previous work has not provided metrics suitable for this purpose. Second, a suite of high-quality training workloads is essential in order for the models to capture the device behavior under all possible workloads. Moreover, the “best” training workloads may be device-specific because different devices may perform differently on certain types of workloads. This thesis addresses these two issues.

The contributions of this thesis are: (i) We propose the entropy plot metric to quantify the spatio-temporal behavior of I/O workloads. This metric captures both the temporal burstiness and spatial locality in three distinct values, making it possible to apply black-box device modeling. In addition, the entropy plot metric allows easy replication of the workload characteristics in synthetic workloads to facilitate other applications, such as system design evaluation. (ii) We develop black-box device models based on the entropy plot metric. The target device models, the workload-level models, take the entropy plot measures of workloads as input and predict their performance. To shorten the long training required to construct the workload-level models, a hybrid approach is introduced. The hybrid approach combines the two types of device models to share the best of both worlds, and is able to deliver accurate and efficient device models with a reasonable amount of training workloads. (iii) We provide a framework for generating device-specific training workloads for constructing black-box storage device models. The framework uses a feedback mechanism to augment the training set with workload scenarios for which the models have not seen enough samples. As a result, the training set provides good coverage of the input space, and the constructed models are able to predict any type of workload. (iv) We demonstrate the power of the storage device models using a sample application, automatic data layout across devices.

The proposal is organized as follows. Section 2 reviews previous work in related areas. Section 3 presents a detailed description of this thesis work and the steps towards completing it. Sections 4 and 5 present my previous research and planned work, respectively. Section 6 gives a tentative schedule.

2 Related Work

This section lists some representative work in the areas of storage device modeling, workload characterization, and storage system planning.

Storage device modeling. Storage device modeling targets either specific device components or entire devices. For the former, there are publications on seek time [61, 63], scheduling algorithms [35, 64], and disk head position [79].

At the device level, there are three categories of models: simulators, analytical models, and black-box models. Disk simulators, such as DiskSim [16] and Pantheon [76], simulate the behavior of a storage device in software, and can deliver accurate response times for individual disk requests. Replaying a trace on a disk simulator, however, is expensive in terms of time and resources, making it inappropriate for online planning applications. Analytical device models [20, 43, 48, 49, 65, 70], on the other hand, describe the behavior of storage devices by a set of formulas and are computationally efficient. Building a disk simulator or an analytical device model is challenging because the model developers must have a deep understanding of the storage device implementations and their interaction with I/O workloads. Furthermore, a new disk model or optimization technique may require a new cycle of performance analysis and model development.

In contrast, the black-box device modeling approach models storage devices as mappings from workloads to their performance on the devices. The model construction does not require any prior knowledge of the storage device implementation. The model is built by observing the modeled device under a set of workloads. To our knowledge, the only such model is the table-based prediction used in Ergastulum [7, 8]. In this approach, each device has a table to remember its maximum throughput under a large set of random synthetic workloads. Although that is a good step toward the goal, the workload characteristics in the table-based approach are largely simplified. The models themselves are tables that could potentially take a large amount of storage space when more advanced characteristics are introduced to improve the prediction accuracy. The device behavior is approximated with spline functions that require a table search and approximation for each prediction.

This work proposes black-box device modeling based on more accurate workload characteristics. In addition, we employ a machine learning tool to deliver models that are efficient both computationally and space-wise. This work further develops a framework to generate training workloads that are tailored to the modeled device, leading to fully automated model construction.

Workload characterization. Workload characterization has been an active area ever since the birth of computer systems and involves workloads of various applications on different components of the systems, such as data references at the processor level [12, 21, 40], file systems [54, 50, 25, 9, 31], I/O workloads for various operating systems and applications [13, 37, 29, 60], I/O access patterns of parallel applications [33, 66, 52], network tracing [17, 23, 44, 55], and trace visualization [33, 46, 34, 10, 3, 45].

Offline trace gathering, analysis, and visualization are addressed in [32]. [14] attempts to solve the performance guarantee problem assuming intermixed, phased workloads.

Chen [19] suggested a self-scaling I/O benchmark. Unfortunately, the workload characteristics are overly simplified in that work, and the unrealistic assumption of little or no correlation between these characteristics compromises its validity.

Early models, such as traditional queuing theory [22] and Markov models [1, 26], usually assume Poisson arrivals.

The recent discovery of self-similarity in network and other types of traffic [23, 31, 29, 44] has invalidated the popular Poisson arrival assumption [57], leading to work that involves self-similar processes. Self-similar stochastic processes, such as fractional Brownian motion [56, 44], fractional ARIMA [68], and the Multifractal Wavelet model [58], have been introduced to capture the self-similarity in computer-generated traffic. One conjecture is that the cause of the self-similarity is the heavy-tailed distributions of user active and idle times, leading to structural ON/OFF models [11, 23, 78]. Most efforts, however, focus on reproducing the self-similarity in arrival time, while ignoring spatial locality or using simple locality models such as the Independent Reference Model [15] and LRU stack distances [5].

Although these locality models might be adequate for network traffic, they are not powerful enough for I/O workloads. In applications such as web cache evaluation, the addresses of requests can be treated as categorical, and LRU stack distances are appropriate. In contrast, the service time for a disk request depends heavily on the correlation between the current request and previous ones. The performance difference between sequential and random disk accesses can be several orders of magnitude. Therefore, the LRU stack distance is inadequate for describing the spatial locality of I/O workloads.

I/O workload characterization requires models that capture the complex correlations of I/O workloads besides the self-similarity in the arrival patterns. Spatial locality has been verified by multiple studies [27, 41] as one of the most important characteristics in determining the performance of I/O workloads. Self-similar processes have been introduced to model the arrival patterns of I/O workloads; for example, Gomez et al. [29, 30] mixed ON/OFF models with traditional queuing models, and Tran et al. [68] employed ARIMA models for predicting the optimal prefetching size. The locality model, on the other hand, is usually simulated by the empirical distribution of the access frequencies of the disk blocks mixed with sequential scans or Markov models [68, 41], which ignores the correlation between subsequent requests. Hong et al. [36] have argued that self-similarity does not significantly affect disk behavior with respect to response times beyond a certain time scale, for example, 5 seconds for HP97560 and HP2204A disks under two different workloads. In addition, the request type and size also play an important role in devices with large caches, as shown by Kurmas et al. [42]. None of these studies, however, provides a good metric for quantifying the correlation structures of I/O workloads. Even sequential access detection is a hard problem when sequential accesses have jumps and are intermixed [39].

It is nontrivial for black-box device models to take advantage of the work on synthetic workload generation. Existing workload generators usually involve empirical distributions derived from real workloads. Our black-box device modeling approach uses machine learning tools as building blocks and thus cannot take distributions as input. This work develops a metric that allows I/O workloads to be represented as vectors. The metric captures the burstiness in the arrival patterns and the correlations between disk request attributes. The parameter extraction requires no tuning or human intervention. An additional value of the metric is that it leads naturally to a synthetic workload generator.

Storage planning. Data layout happens at two levels: inside a storage device and across storage devices. Within a single disk, the goal of reorganizing data is usually to reduce the seek time for future accesses. The reorganization happens at the cylinder or block level. Examples include placing related data contiguously for sequential disk access [47, 75], placing hot data near the center of the disk [4, 71, 59], and replicating data on disk to provide quicker-to-access options for subsequent reads [79, 51]. Disk arrays introduce more options for layout optimization. For example, [69] examines the problem of inter-disk layout for a variety of disks. [62] considers the problem of choosing the best stripe size, assuming files are always striped across all the disks.

Data layout across storage devices usually happens in logical units, such as logical volumes for file systems and relations for database systems. Recent work by Anderson et al. [6, 7] formulates the problem as an optimization problem and applies a randomized greedy search algorithm to find a near-optimal solution. In the database community, Agrawal et al. [2] examines the problem in the context of the AutoAdmin project at Microsoft [38] and uses undirected graphs to model the interaction among relations. The data layout problem is then reduced to graph partitioning.

In both approaches, simple device models are used for performance prediction. This thesis work complements them by providing device models that are more accurate and efficient in terms of computation and storage requirements.

3 Problem Definition

This section defines the problem of black-box storage device modeling and outlines solutions.

3.1 Black-Box Storage Device Modeling Overview

The goal of this thesis is to develop storage device models to predict the performance of an arbitrary workload on a device. The device could be a single disk or a disk array. A workload is a sequence of disk requests, with each request, ri, specified by four attributes: arrival time, logical block number (LBN), request size, and operation type.¹ The workload performance is an aggregate performance measure, such as the average or 90th-percentile response time.
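To make this representation concrete, the sketch below (names and types are ours, not the proposal's) shows one way the four request attributes and an aggregate performance measure might be expressed:

    from dataclasses import dataclass
    from typing import List
    import math

    @dataclass
    class Request:
        arrival_time: float   # seconds since trace start
        lbn: int              # logical block number
        size: int             # request size in blocks
        is_read: bool         # operation type

    # A workload is simply an ordered sequence of requests.
    Workload = List[Request]

    def aggregate_performance(response_times: List[float], percentile: float = 0.9) -> dict:
        """Aggregate per-request response times into workload-level measures."""
        ordered = sorted(response_times)
        idx = min(len(ordered) - 1, math.ceil(percentile * len(ordered)) - 1)
        return {
            "mean": sum(ordered) / len(ordered),
            f"p{int(percentile * 100)}": ordered[idx],
        }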

We build device models by treating the devices as “black boxes”, meaning that neither the model nor the construction algorithm requires prior knowledge of the device internals. Such device models are constructed through “training”, in which the construction algorithm observes the device behavior under training workloads and builds the models based upon these observations, as shown in Figure 1 (a). The device behavior is, then, modeled as a mapping from workloads to their performance on the device. The resulting model takes a workload as input and outputs its performance on the device.

The black-box modeling approach has three advantages over other models. First, by treating the modeled device as a black box, the construction algorithm is isolated from the implementation details of the device. Modeling a new device involves only a new training run on that device and requires little human intervention. Second, we can take advantage of existing, efficient machine learning tools to approximate the device models. These tools usually have strong mathematical foundations and have enjoyed huge success in practice. Third, black-box models are desirable when proprietary information is a concern.

¹We assume a workload consists of all the activities on a single logical volume. It is straightforward to extend the model to handle more than one logical volume. For example, we can append the LBNs of several logical volumes together to form a large LBN space.


Figure 1. Black-box storage device modeling overview. (a) Black-box model construction: the algorithm builds the model based on the training workloads' performance on the actual device; RTi is the response time of disk request ri. (b) Performance prediction using black-box models: a workload is compressed into a vector, W, and fed into the CART model, which predicts the performance P from the vector W.

Manufacturers can ship models with their storage devices without revealing their device technologies. All these reasons make black-box models attractive in practice.

We employ a machine learning tool, Classification And Regression Trees (CART) [24], to approximate the mapping from workloads to performance. First, machine learning tools have solid mathematical foundations, and they have successfully solved a wide range of problems. Second, CART models are efficient in both computation and storage requirements, making them a good candidate for modeling storage devices. In addition, CART models require little parameter tuning and offer automatic model construction. A CART model approximates a non-linear function over vectors with a piece-wise constant function. Note that there are other tools that can provide the same or even better performance. This work does not compare these tools; our focus here is how to apply such tools to storage device modeling.

Performance prediction with black-box models has two steps, as shown in Figure 1 (b). The model receives a workload and represents it as a vector, W. The CART model then predicts the performance P of the workload from the vector. The transformation from workloads to vectors is necessary because all the machine learning tools we have examined require the input data to be embedded in a multi-dimensional Cartesian space. The accuracy of black-box models is subject to two issues. First, in order for the CART models to distinguish workloads with different behavior, the vector W should contain all the important workload characteristics. We also need to keep the dimensionality of W small enough to achieve a manageable training time because the training time grows exponentially with the input dimensionality. We refer to W as the workload description for the rest of this document. Second, the model performs well for workloads similar to the training workloads, but poorly for others. Therefore, the training should explore the vector space as much as possible for the constructed model to predict arbitrary workloads with high confidence. Solving these two problems is the key to building successful black-box models for storage devices.
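As an illustration of this two-step prediction, the following sketch uses scikit-learn's DecisionTreeRegressor as a stand-in for the CART tool; the workload descriptions and performance values are placeholder data, not measurements from this work:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor  # CART-style regression tree

    # Hypothetical training data: one row of workload characteristics (W) per
    # training workload, paired with its measured performance (P).
    W_train = np.random.rand(200, 8)           # placeholder workload descriptions
    P_train = np.random.rand(200) * 50.0       # placeholder mean response times (ms)

    # The regression tree approximates the workload -> performance mapping with a
    # piece-wise constant function, as described above.
    model = DecisionTreeRegressor(min_samples_leaf=5)
    model.fit(W_train, P_train)

    # Prediction step: compress a new workload into its description W, then
    # look up the predicted performance P.
    W_new = np.random.rand(1, 8)
    P_pred = model.predict(W_new)[0]
    print(f"predicted performance: {P_pred:.2f} ms")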

Unfortunately, previous work has not provided satisfying answers to the two problems. Compressing a workload into a workload description is not straightforward because of the complexity of I/O workloads. I/O workloads exhibit long-range dependence along time and strong correlations between time, LBN, size, and type.


Figure 2. Detailed procedure of model construction in black-box device modeling. The shaded boxes correspond to the steps described in Section 3.1: Step 1, workload characterization; Step 2, black-box storage device modeling; and Step 3, generating device-specific training workloads. ri are disk requests and RTi are the corresponding response times. W and P denote the vector of workload characteristics and the performance of a workload, respectively.

The compression should introduce as little information loss as possible, yet the workload description should also be compact. Existing models, however, usually involve a large set of parameters, and some parameters are empirical distributions rather than scalars. We need new metrics that can quantify workload characteristics using a few scalars.

Second, training workload generation differs subtly from traditional synthetic workload generation. Synthetic workload generation focuses on generating synthetic workloads that are similar to “real” workloads, with “real” meaning a sample workload collected from a running system. Model parameter values are estimated from the sample. Training workload generation, in contrast, stresses generating workloads of all possible characteristics to exercise the modeled device. In most cases, we do not have a sample workload from which to estimate parameters. Efficiently sampling the parameter space is important because enumerating all possible workload characteristics makes the training prohibitively long. Moreover, differences among storage devices call for customized training traces, so a “universal” training set may be impractical. We need a systematic way to generate training workloads efficiently.

This thesis addresses both the workload characterization and training workload generation problems. In particular, we propose the entropy plot metric to quantify I/O workloads and develop storage device models iteratively using a feedback mechanism. Figure 2 illustrates the model construction algorithm in detail. The four steps towards finishing the thesis are outlined as follows.

Step 1 Workload characterization and synthetic workload generation. This step proposes the “entropy plot” metric, which describes both the characteristics along one attribute and the correlations between attributes. In particular, the entropy plot quantifies the spatio-temporal behavior of I/O workloads using three scalars, making it well suited for generating workload descriptions. To demonstrate the accuracy of the entropy plot metric, we develop synthetic workload generators that generate workloads similar to a given one. The similar response time distributions indicate that the entropy plot captures the characteristics accurately.

Step 2 Storage device modeling. We develop black-box models based on the workload characteristics quantified by the entropy plot metric. This step develops a construction algorithm that delivers an accurate and efficient device model with a short training time.


Step 3 Training workload generation. This step builds a framework to automate training workload generation. The framework uses device models as an evaluation tool for a training set. A feedback mechanism keeps augmenting the training set by generating workloads whose behavior is not captured by the existing training set, until the quality of the constructed model meets the accuracy threshold.

Step 4 Case study: automated data layout. This step evaluates the storage device models in the context of a sample application, namely automated data layout. We plan to compare the solution found using our device models to the hand-optimized solution and the solution found using the actual devices. The goal is to show that such black-box device models are efficient and accurate in finding a near-optimal solution.

3.2 Step 1: Workload Characterization and Synthetic Workload Generation

The target of this step is to develop a metric that allows efficient workload representation. Previous work usually involves models using empirical distributions derived from real workloads. Analytical or black-box models, however, need a more compact representation of workloads. The difficulty lies in the complex correlation structures of I/O workloads. On one hand, along a single attribute, workloads exhibit long-range dependence in arrival time and skew in the access frequencies of disk blocks, which we call “vertical” characteristics. On the other hand, there are strong correlations between attributes, which we call “horizontal” characteristics. For example, a sequential scan usually consists of requests of the same operation type and size, suggesting strong correlation between request arrival time, LBN, type, and size. Both types of characteristics are important in determining workload performance.

Figure 3 further illustrates the importance of these characteristics. For a sample disk trace, we observe bursty arrivals and skewed access frequencies of disk blocks, as well as strong correlation between arrival time and logical block numbers. The importance of spatio-temporal correlation is clear from the performance difference between the real trace and a trace with no spatio-temporal correlation. The desired metric should capture both types of characteristics in order to develop accurate device models.

This step develops the entropy plot metric to quantify both the vertical and horizontal characteristics. We start by modeling the temporal burstiness and then extend the metric to handle other characteristics. The value of the entropy plot metric is further confirmed by synthetic workload generators. The simplicity of the entropy plot enables us to generate synthetic workloads with the same entropy plot measures as real workloads, and the synthetic and real workloads share similar performance. The work in this step is divided into three parts.

Step 1-a Model temporal burstiness. This part uses the entropy plot to quantify the arrival process of I/O workloads. More specifically, the arrival process is bursty and self-similar. The entropy plot takes advantage of the self-similarity and uses a single scalar to describe the temporal burstiness of a workload. The corresponding workload generator, the “B-model”, is able to generate a workload arrival process with a given temporal burstiness.

Step 1-b Model spatio-temporal correlation. This part extends the work in the previous step to handle the vertical characteristics on LBN and the correlation between arrival time and LBN. As a result, three scalars capture the entire spatio-temporal behavior of workloads. The corresponding workload generator, the “PQRS model”, can generate not only the arrival process, but also the location of the requests.

Step 1-c Model other characteristics. We use the entropy plot to capture other vertical and horizontal correlations.


Figure 3. Comparison of a sample disk trace with an independent trace: (a) sample disk trace, (b) independent trace, (c) queuing behavior. The real trace shows strong temporal burstiness, skew along disk block numbers, and strong spatio-temporal correlation. The correlation is even clearer when we compare the real trace with the independent trace. The independent trace has the same marginals on arrival time and location, but assumes no correlation between them. The two traces differ both visually and performance-wise, as shown in (c).

3.3 Step 2: Storage Device Modeling

This step studies the device model construction algorithm and evaluates the effectiveness of the workload description in characterizing workloads. The construction algorithm should deliver efficient and accurate device models with a short training time.

Black-box device modeling views storage device behavior as a mapping from workloads to their performance on the modeled device. Ideally, the model should operate at the workload level as shown in Figure 1 (b). The model first transforms a workload into the workload description and predicts the workload performance from the description. The workload description contains both the vertical and horizontal workload characteristics quantified by the entropy plot metric.

A straightforward construction algorithm is the “naive” approach shown in Figure 4. We call this approach CART-workload because the construction algorithm also works at the workload level. The algorithm represents training workloads and their performance as (workload description, performance) pairs and builds a device model from these pairs. The resulting model predicts quickly. The training time, however, could be extremely long because each workload yields only one sample.

An alternative device model operates at the request level. The request-level model characterizes individual requests as request descriptions, R, and predicts their response times. We refer to such device models as CART-request. Building CART-request models requires a much shorter training time because each request yields a sample. We can use CART-request to generate the response time distribution of an arbitrary workload and derive the aggregate performance measures. The prediction process, however, is slow since each request requires a run against the model.
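A minimal sketch of the CART-request idea follows, again with a scikit-learn regression tree standing in for the CART tool and placeholder per-request features; the aggregation step mirrors the description above:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Hypothetical per-request training data: one row of request features (R)
    # per disk request, paired with its measured response time.
    R_train = np.random.rand(10_000, 6)          # placeholder request descriptions
    rt_train = np.random.rand(10_000) * 20.0     # placeholder response times (ms)

    request_model = DecisionTreeRegressor(min_samples_leaf=20).fit(R_train, rt_train)

    def predict_workload_performance(request_descriptions, percentile=90):
        """CART-request prediction: score every request, then aggregate."""
        rts = request_model.predict(request_descriptions)   # one pass over the workload
        return {"mean": rts.mean(), f"p{percentile}": np.percentile(rts, percentile)}

    # Each prediction touches every request in the workload, which is why this
    # approach is slow at prediction time.
    print(predict_workload_performance(np.random.rand(5_000, 6)))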

To combine the best of CART-workload and CART-request, we design a hybrid model construction algorithm, as shown in Figure 4. The hybrid approach first builds a CART-request model and then derives a CART-workload model from the request-level model. As a result, both training and prediction are fast. We refer to this approach as CART-hybrid. Table 1 summarizes the three types of device models.

The work in this step is divided into the following three parts.

Step 2-a Build CART-workload device models. This part develops CART-workload models. This allows us to evaluate the effectiveness of the workload description.


Figure 4. Two ways of constructing workload-level device models: naive model construction and hybrid model construction.

Feature                CART-workload   CART-request   CART-hybrid
Storage requirement    low             low            low
Training               long            short          short
Predicting             fast            slow           fast

Table 1. Comparison of three device model construction algorithms.


Step 2-b Build CART-request device models. This part identifies the request description and builds CART-request models.

Step 2-c Build CART-hybrid device models. This part studies the relationship between the request and workload descriptions and builds CART-hybrid models.

3.4 Step 3: Training Workload Generation

The target of this step is to provide a systematic approach to generating workloads for device model construction. We choose to develop a framework rather than a suite of “universal” training workloads because no such universal workloads exist, given the differences among storage devices. The framework, on the other hand, is able to generate workloads iteratively, leading to device-specific training workloads.

The framework uses a feedback mechanism, as illustrated in Figure 2, to achieve this goal. First, a device model is built from a small set of training workloads. By evaluating the model, more workloads are generated to cover the parts of the input space that have not been adequately represented by the existing training set. The next phase uses both the existing training set and the new workloads to construct a better device model. The iteration continues until the device model meets the desired accuracy. Theoretically, the initial training set could be any workloads because the feedback loop will pick up scenarios that are not covered by the existing training set. A high-quality initial training set, on the other hand, reduces the number of iterations needed. The key issues here are how to identify such regions and how to generate workloads for them.
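The loop below is a schematic of this feedback mechanism, not the thesis implementation; every callable it takes (device measurement, model training, model evaluation, and workload generation) is a placeholder for machinery developed in Steps 2 and 3:

    def build_model_with_feedback(initial_workloads, measure_on_device, train_model,
                                  evaluate_model, generate_workloads_for,
                                  accuracy_target, max_rounds=20):
        """Feedback-loop sketch: all callables are supplied by the surrounding system."""
        training_set = list(initial_workloads)
        model = None
        for _ in range(max_rounds):
            # Observe the device under the current training set and rebuild the model.
            observations = [(w, measure_on_device(w)) for w in training_set]
            model = train_model(observations)
            accuracy, uncovered_regions = evaluate_model(model)
            if accuracy >= accuracy_target:
                break
            # Augment the training set with workload scenarios the current model
            # has not seen enough samples of (Step 3-b generates them).
            for region in uncovered_regions:
                training_set.extend(generate_workloads_for(region))
        return model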

The work in this step is further divided into three parts.

Step 3-a Evaluate device models. The evaluation examines the device model and determines the regions in the input vector space that need more training workloads.


Step 3-b Generate workloads. This part generates workloads within a given region. The given region can be specified as a request description for CART-request and CART-hybrid or a workload description for CART-workload.

Step 3-c Evaluate framework. This part evaluates the effectiveness of the feedback mechanism in producing accurate device models.

3.5 Step 4: Case Study (Automated Data Layout)

This step evaluates the usability of the device models built in this work. A sample application, automated data layout, is used to illustrate the power of such device models.

The automated data layout algorithm takes as input a workload on a set of data objects and a set of devices, and outputs the optimal or a near-optimal data layout in terms of some target function. More specifically, a workload on a set of data objects, {oi}, is a sequence of disk requests, and all the requests are tagged with object identifiers. A data layout is an assignment of data objects to the storage devices such that

1. a data object can be assigned to several devices, but every bit of the data object resides on one and only one device.

2. a storage device holds no more bits of data than its capacity allows.

The optimal solution is defined in terms of a target function, for example, minimizing the average response time or reducing the overall storage cost. In either case, the problem is essentially an optimization problem. Storage device models provide an efficient way to estimate the performance of each data layout so that the search algorithm can locate the desired solution.
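As an illustration only, a randomized local search of the kind such an infrastructure might use could look like the sketch below; it simplifies the problem by assigning whole objects to single devices, and the model-backed performance predictor and capacity check are supplied as placeholder callables:

    import random

    def layout_search(objects, devices, predict_layout_performance, capacity_ok,
                      iterations=1000, seed=0):
        """Randomized local search sketch guided by a device-model-backed predictor."""
        rng = random.Random(seed)
        # Start from an arbitrary assignment: object -> device.
        layout = {obj: rng.choice(devices) for obj in objects}
        best_cost = predict_layout_performance(layout)
        for _ in range(iterations):
            # Move one randomly chosen object to a randomly chosen device.
            obj = rng.choice(objects)
            candidate = dict(layout)
            candidate[obj] = rng.choice(devices)
            if not capacity_ok(candidate):
                continue
            cost = predict_layout_performance(candidate)   # model call, not a device run
            if cost < best_cost:
                layout, best_cost = candidate, cost
        return layout, best_cost

Because each candidate layout is scored by the model rather than by replaying the workload on real devices, many configurations can be explored quickly, which is exactly where efficient device models pay off.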

Step 4-a Build infrastructure. This step designs and realizes the experimental setup for the problem, such as the search algorithm and the workloads.

Step 4-b Evaluate device models. The storage device models will be plugged into the infrastructure for evaluation. The purpose is to show the effectiveness of the models in guiding the search algorithm and identifying the desired solution.

3.6 Current Status

Table 2 lists my progress in the thesis work. The next two sections briefly describe my previous work and the ongoing work on this thesis.

4 Completed Work

I have completed Step 1 and part of Step 2. This section presents the completed work.

4.1 Workload Characterization and Synthetic Workload Generation

Step 1 in Section 3.1 is completed. The target is to develop a compact and accurate metric to capture both the vertical and horizontal workload characteristics. I have designed the entropy plot metric to meet this goal. In particular, the entropy plot can capture the spatio-temporal behavior of I/O workloads using three scalars. Detailed information about this work is available in [72, 73].


Step                                                                  Status
Step 1: Workload characterization and synthetic workload generation
  1-a) Model temporal burstiness                                      Done
  1-b) Model spatio-temporal correlation                              Done
  1-c) Model other characteristics                                    Done
Step 2: Storage device modeling
  2-a) Build CART-workload device models                              Done
  2-b) Build CART-request device models                               Done
  2-c) Build CART-hybrid device models                                In progress
Step 3: Training workload generation                                  In progress
Step 4: Case study (automated data layout)                            In progress

Table 2. Current status in thesis work.

Figure 5. Entropy plot on time and the B-model: (a) entropy plot on the sample trace, (b) trace generation in the B-model, (c) queue length distribution. (a) shows the entropy plot on time for the sample trace in Figure 3; the entropy plot has a slope of 0.73. (b) illustrates the B-model construction algorithm; parameter b controls the temporal burstiness of the synthetic workloads. (c) demonstrates the effectiveness of the entropy plot by comparing a B-model trace with the sample trace; both have similar long tails in the queuing time distribution.

Step 1-a: Model temporal burstiness. I/O workloads show strong burstiness in arrival time, as illustrated in Figure 3. The entropy plot quantifies the temporal burstiness using only one scalar. It plots the entropy value on arrival time against the granularity of the entropy calculation. More specifically, to calculate the entropy value at scale n, the whole time range of the workload is divided into 2^n intervals of equal length, and the probability that a request arrives in each of these intervals is estimated from the given workload. The entropy value can then be calculated from these probabilities. Plotting the entropy value against the scale gives us the entropy plot. Because of the self-similarity in I/O workloads, the entropy plots are linear, with the slopes indicating the temporal burstiness of the workloads. The slope ranges from 0 for highly bursty traffic to 1 for smooth traffic. Figure 5 (a) shows the entropy plot on arrival time for the sample trace in Figure 3. The entropy plot shows strong linearity and has a slope of 0.73, indicating that the traffic is quite bursty.
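A minimal sketch of this computation (our own illustration, not the thesis code) follows; it bins normalized arrival times at successive scales, computes the base-2 entropy at each scale, and fits the slope:

    import numpy as np

    def entropy_plot(arrival_times, max_scale=20):
        """Entropy plot on arrival time, as described above.

        At scale n the trace's time span is divided into 2**n equal intervals,
        the fraction of requests falling into each interval is treated as a
        probability, and the base-2 entropy of that distribution is recorded.
        """
        t = np.asarray(arrival_times, dtype=float)
        t = (t - t.min()) / (t.max() - t.min() + 1e-12)   # normalize to [0, 1)
        scales, entropies = [], []
        for n in range(1, max_scale + 1):
            counts, _ = np.histogram(t, bins=2 ** n, range=(0.0, 1.0))
            p = counts / counts.sum()
            p = p[p > 0]
            entropies.append(float(-(p * np.log2(p)).sum()))
            scales.append(n)
        return np.array(scales), np.array(entropies)

    def temporal_burstiness(arrival_times):
        """Slope of the entropy plot: near 1 for smooth traffic, near 0 for very bursty traffic."""
        scales, entropies = entropy_plot(arrival_times)
        slope, _ = np.polyfit(scales, entropies, 1)
        return slope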

To demonstrate the accuracy of the entropy plot metric, I have developed the B-model to generate workload arrival times with a given temporal burstiness. In the B-model, the parameter b, calculated from the entropy plot slope of a given trace, controls the burstiness of the synthetic traces. The generation algorithm is a recursive process, as illustrated in Figure 5 (b). The recursive construction guarantees that the synthetic traces have linear entropy plots and that the slopes of the entropy plots are controlled by the value of b. Figure 5 (c) compares the queuing time distribution of the sample trace to a B-model trace and a Poisson arrival trace with the same arrival rate. The B-model trace approximates the real trace much better because it captures the temporal burstiness of the real traffic.
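The sketch below illustrates the recursive splitting idea behind the B-model under our own simplifying assumptions (which half receives the larger share of requests at each split is chosen at random); it is not the thesis implementation:

    import numpy as np

    def b_model_counts(total_requests, levels, b, rng=None):
        """Recursively distribute request counts over 2**levels time slots.

        At every split, a fraction b of the requests goes to one half of the
        interval and 1 - b to the other. b close to 0.5 gives smooth traffic;
        b close to 1 gives very bursty traffic.
        """
        rng = rng or np.random.default_rng()
        counts = np.array([float(total_requests)])
        for _ in range(levels):
            left = np.where(rng.random(len(counts)) < 0.5, b, 1.0 - b)
            counts = np.column_stack([counts * left, counts * (1.0 - left)]).ravel()
        return counts

    def b_model_arrivals(total_requests, levels, b, duration=1.0, rng=None):
        """Turn slot counts into arrival times spread uniformly within each slot."""
        rng = rng or np.random.default_rng()
        counts = np.round(b_model_counts(total_requests, levels, b, rng)).astype(int)
        slot = duration / len(counts)
        times = [i * slot + slot * rng.random(c) for i, c in enumerate(counts) if c > 0]
        return np.sort(np.concatenate(times)) if times else np.array([])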


Figure 6. The PQRS model for generating both the arrival times and access locations of disk traces: (a) entropy plot on the sample trace, (b) PQRS trace generation, (c) queuing time comparison.


Step 1-b: Model spatio-temporal correlation. This step extends the entropy plot and the B-model to handle the entire spatio-temporal behavior of I/O workloads. The entropy plot on arrival time describes the temporal burstiness; applying the tool to disk block number gives us the skew in access frequencies on disk blocks. Extending the tool to two-dimensional space produces the joint entropy between arrival time and disk block number, as shown in Figure 6. The difference between the sum of the two one-dimensional entropy plots and the joint entropy is the correlation between arrival time and disk block numbers. We observe that the four lines are reasonably linear and there is a strong spatio-temporal correlation. The slopes of the four lines characterize the spatio-temporal behavior of the sample disk trace. These are really only three values because the fourth is a linear combination of the other three.

The PQRS model is the two-dimensional extension of the B-model. The four parameters, (p, q, r, s), characterize the spatio-temporal behavior of a given workload and can be computed from the entropy plot slopes. Constructing a PQRS trace is a recursive process, as illustrated in Figure 6 (b). The experiments in [73] have demonstrated that PQRS traces have performance comparable to real traces, as shown in Figure 6 (c).
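The following sketch illustrates the recursive quadrant-splitting idea behind such a model under our own simplifying assumptions (each request independently picks a quadrant of the (time, LBN) square at every level); it is not the thesis implementation:

    import numpy as np

    def pqrs_points(n_requests, levels, p, q, r, s, rng=None):
        """Sketch of a PQRS-style construction.

        Each request descends `levels` recursive splits of the (time, LBN)
        square, picking one quadrant per level with probabilities p, q, r, s,
        so the four parameters control burstiness, spatial skew, and their
        correlation. Returns (arrival_time, lbn) pairs normalized to [0, 1).
        """
        assert abs(p + q + r + s - 1.0) < 1e-9
        rng = rng or np.random.default_rng()
        # Quadrant offsets: (time_half, lbn_half) for p, q, r, s respectively.
        offsets = np.array([(0, 0), (0, 1), (1, 0), (1, 1)], dtype=float)
        probs = np.array([p, q, r, s])
        t = np.zeros(n_requests)
        lbn = np.zeros(n_requests)
        for level in range(1, levels + 1):
            choice = rng.choice(4, size=n_requests, p=probs)
            t += offsets[choice, 0] / 2 ** level
            lbn += offsets[choice, 1] / 2 ** level
        return t, lbn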

Step 1-c: Model other characteristics. The entropy plot is also able to model other workload characteristics. Modeling the vertical characteristics on request type and request size is straightforward. There are only two request types, so the entropy plot has only one point; this value is essentially equivalent to the percentage of reads in the workload. Figure 7 (a) shows the entropy plot on request size. Most of the requests access fewer than 32 disk blocks even though the largest ones go beyond 600 disk blocks, so we see near-zero values at the first two scales and a sudden surge after that.

Modeling the horizontal correlations involving operation type and request size requires a minor modification of the entropy plot. The limited value range of these two attributes confines the number of scales of the joint entropy. Instead of scaling both dimensions, we can fix the scale on request size and operation type at the finest granularity and change the scale on the other dimension. Figure 7 shows the correlations between arrival time or LBN and size or type. As expected, strong correlations exist.

Not all of these entropy plots are linear. In the workload description, we therefore use the average entropy difference between successive scales to characterize these attributes.

4.2 Storage Device Modeling

This part of the work addresses Step 2 in Section 3.3. This section briefly describes the CART-workload and CART-request device models that I have developed. Detailed information about this work is available in [74].

[Figure 7. Entropy plots to quantify other workload characteristics: (a) entropy plot on request size; (b) correlations of arrival time and LBN with operation type; (c) correlations of arrival time and LBN with request size.]

Step 2-a: Build CART-workload device models. This step builds workload-level device models and demonstrates the effectiveness of the workload description in capturing workload characteristics. The workload description contains the following information (a minimal sketch of assembling it into a feature vector follows the list).

• General information includes the average arrival rate, percentage of reads, percentage of sequential accesses, and average request size.

• Vertical characteristics include the entropy plot slopes on arrival time, logical block number, and request size.

• Horizontal characteristics include the correlations between attributes, quantified by the entropy plot.
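
A minimal sketch of how such a description might be assembled and used to train a CART model follows. It reuses the entropy_plot and entropy_slope routines sketched in Section 4.1, uses scikit-learn's DecisionTreeRegressor as a stand-in for the CART implementation, and its field list is illustrative rather than the exact one in [74].

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def workload_description(times, lbns, sizes, is_read):
        """Fixed-length feature vector for one workload interval (illustrative fields)."""
        times, lbns = np.asarray(times, float), np.asarray(lbns, float)
        sizes, is_read = np.asarray(sizes, float), np.asarray(is_read, float)
        sequential = float(np.mean(lbns[1:] == lbns[:-1] + sizes[:-1]))
        general = [len(times) / (times[-1] - times[0] + 1e-9),  # average arrival rate
                   float(np.mean(is_read)),                     # percentage of reads
                   sequential,                                  # percentage of sequential accesses
                   float(np.mean(sizes))]                       # average request size
        vertical = [entropy_slope(entropy_plot(times)),
                    entropy_slope(entropy_plot(lbns)),
                    entropy_slope(entropy_plot(sizes, max_scale=7))]
        # Horizontal characteristics (the entropy-plot correlations) would be
        # appended here in the same way.
        return np.array(general + vertical)

    # One description and one aggregate response time per training interval:
    # X = np.stack([workload_description(*w) for w in training_intervals])
    # y = np.array(average_response_times)
    # model = DecisionTreeRegressor(min_samples_leaf=10).fit(X, y)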

Step 2-b: Build CART-request device models. This step designs the request description and builds request-level device models. The request description models the three components of request response time.

• Queuing time: the time that the disk request spends in the queue waiting to be served. The queuing time is captured by the differences in arrival time between the current request and previous requests.

• Locating time: the time that the storage device spends locating the disk blocks that the request is going to access. This component is characterized by both the logical block number of the current request and the differences in logical block number between the current request and previous ones.

• Data operation time: the time spent reading or writing the disk blocks. The disk request size is used for this part of the response time.

Two parameters control the history lengths used for the queuing time and locating time characterizations; we use 8 and 3, respectively, in our experiments.
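
The request description can be sketched in the same style (the feature ordering and the padding for the first few requests are my own choices):

    import numpy as np

    def request_description(times, lbns, sizes, is_read, i, k_time=8, k_lbn=3):
        """Feature vector for request i: arrival-time history, LBN history, size, and type.

        k_time and k_lbn are the history lengths for the queuing-time and
        locating-time features (8 and 3 in our experiments)."""
        t_hist = [times[i] - times[i - j] if i - j >= 0 else 0.0
                  for j in range(1, k_time + 1)]
        l_hist = [lbns[i] - lbns[i - j] if i - j >= 0 else 0
                  for j in range(1, k_lbn + 1)]
        return np.array(t_hist + l_hist + [lbns[i], sizes[i], float(is_read[i])])

    # The CART-request model is trained on these vectors with the per-request
    # response time as the target, in the same way as the workload-level model.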

Initial experimental results. Figure 8 shows the prediction accuracy of our predictors on four traces; Table 3 gives a summary of the four traces. The device being modeled is a 9GB Atlas 10K disk, and the models predict the average response time over one-minute intervals. All the models are trained on the first two weeks of cello99a; the prediction accuracy on cello99a is measured on the last two weeks. We compare the two CART-based device models against three predictors, listed below (a minimal sketch of the constant and periodic baselines follows the list).


Trace name   Length    # of requests   Average size   % reads   Average RT
cello92      4 weeks   7.8 million     12.9 KB        35.4%     83.78 ms
cello99a     4 weeks   43.7 million    7.1 KB         20.9%     115.71 ms
cello99b     4 weeks   13.9 million    118.0 KB       41.6%     259.68 ms
cello99c     4 weeks   24.0 million    8.5 KB         26.4%     4.77 ms

Table 3. Trace summary. cello92 originally contains 8 disks; we pick the three with the most activity to fill the 9GB Atlas 10K disk that we are modeling. The response time information is generated by replaying the traces on DiskSim 3.0 with the 9GB Atlas 10K disk model.

[Figure 8. Average response time prediction accuracy: (a) median relative error and (b) median absolute error (ms) of the constant, periodic, linear, CART-request, and CART-workload predictors on cello99a, cello99b, cello99c, and cello92. The device being modeled is a 9GB Atlas 10K disk. All predictors are trained on two weeks of cello99a and tested on the indicated workloads.]

• constant makes predictions using the average or quantile response time of the training trace.

• periodic divides a week into 24 × 7 × 60 intervals and remembers the average response time of the training workload for each interval. Prediction uses the corresponding value for the same interval.

• linear does linear regression on the workload description.
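
For concreteness, the constant and periodic baselines amount to little more than the following sketch (the linear baseline is ordinary least-squares regression on the workload description):

    import numpy as np

    class ConstantPredictor:
        """Always predicts one summary statistic of the training response times."""
        def fit(self, y_train):
            self.value = float(np.mean(y_train))
            return self
        def predict(self, n_intervals):
            return np.full(n_intervals, self.value)

    class PeriodicPredictor:
        """Remembers the average response time for each minute of the week."""
        def fit(self, minute_of_week, y_train):
            self.table = np.full(24 * 7 * 60, float(np.mean(y_train)))
            for m in np.unique(minute_of_week):
                self.table[m] = float(np.mean(y_train[minute_of_week == m]))
            return self
        def predict(self, minute_of_week):
            return self.table[minute_of_week]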

Overall, the two CART-based predictors are robust across all the workloads. Not surprisingly, constant and periodic incur sky-high prediction errors when the workload differs from the one they are trained on, because they model workloads rather than the storage device. In contrast, the consistent prediction accuracy shows that the CART-based predictors are not bound to particular workloads. The CART-request predictor does extremely well, achieving a mean absolute error of about 10 milliseconds for all the workloads. The CART-workload predictor, on the other hand, experiences a larger relative error but a small absolute error.

5 Current and Future Work

This section outlines the ongoing and future work in order to complete the thesis.

5.1 Build CART-hybrid Device Models

This part of the work will fulfill Step 2-c) as described in Section 3.3. The target is to derive a workload-level device model from a request-level device model. This approach is made possible by the close relationship between the request and workload descriptions.


The ideal case is to derive the distribution over request descriptions for a given workload description analytically, for example, deriving the distribution of the queuing-time characteristics in the request description from a given temporal burstiness and arrival rate. Another option is to use the PQRS model to generate workloads for a given workload description and run them against the request-level device model. The latter takes more time than the former, but is still significantly faster than running the workloads on the real device.
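
The second option could look roughly like the following sketch, where synthesize_trace is a hypothetical PQRS-based generator and request_description is the routine sketched in Step 2-b; both names are placeholders rather than existing interfaces.

    import numpy as np

    def workload_level_prediction(workload_desc, request_model, n_samples=10, rng=None):
        """Predict aggregate response time for a workload description by averaging
        the request-level model's predictions over synthetic traces that match it."""
        rng = np.random.default_rng() if rng is None else rng
        predictions = []
        for _ in range(n_samples):
            trace = synthesize_trace(workload_desc, rng)   # hypothetical PQRS-based generator
            X = np.stack([request_description(trace.times, trace.lbns, trace.sizes,
                                              trace.is_read, i)
                          for i in range(len(trace.times))])
            predictions.append(float(request_model.predict(X).mean()))
        return float(np.mean(predictions))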

5.2 Training Workload Generation

This part of the work will fulfill Step 3 as described in Section 3.4. The target is to provide a systematic way of generating training workloads for device model construction. The framework, as shown in Figure 2, uses a feedback mechanism to improve the quality of the training set. It evaluates the constructed model and identifies workload scenarios that are not well captured by the model. New traces are then generated and added to the set of training workloads. The iteration continues until a satisfactory device model is achieved.
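
The feedback loop itself is simple. The sketch below assumes hypothetical helpers (fit_model, evaluate, generate_workloads_for, and a device object with a run method) standing in for the model construction, evaluation, and workload-generation steps described in Steps 3-a and 3-b.

    def train_with_feedback(device, initial_workloads, max_iterations=10, target_error=0.1):
        """Iteratively grow the training set until the device model is accurate enough."""
        training = list(initial_workloads)
        model = None
        for _ in range(max_iterations):
            observations = [(w, device.run(w)) for w in training]   # measure the device
            model = fit_model(observations)
            error, weak_regions = evaluate(model, device)            # Step 3-a
            if error <= target_error:
                break
            for region in weak_regions:                              # Step 3-b
                training.extend(generate_workloads_for(region))
        return model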

Step 3-a: Evaluate device models. Two types of regions need more samples. A region may contain no samples or it may have complex device behavior. In both cases, more samples are needed in order for the device models to approximate the device behavior.

Given a region, the ideal number of samples depends on factors such as the region size and the variance of the device behavior within it. In addition, these samples should be spread out across the region rather than clustered. We will explore and leverage these measures.

Step 3-b: Generate workloads. Generating workload scenarios is rather straightforward. The request description is simple enough that a disk request matching a given request description can be produced directly. The workload description requires more effort; the PQRS model described in Step 1-b) is a good candidate. In addition, once the CART-hybrid model is developed, there will be no need to generate workloads from workload descriptions.

Step 3-c: Evaluate the framework. A variety of workloads, both synthetic and real, will be used as the testing workloads to evaluate the device models trained using the feedback mechanism. The expected observation is that all the workloads experience smaller and smaller prediction errors as the number of iterations increases.

5.3 Case Study: Automatic Data Layout

This part of the work will fulfill Step 4 as described in Section 3.5. The goal is to evaluate the device models in the context of a real application. We use automated data layout as the target application.

Step 4-a: Build infrastructure. This step decides the problem setup and builds the infrastructure. The problem setup includes the workloads, the search algorithm, and the devices.

The workloads in use are the TPC-C workload and possibly other complex database workloads. The data objects are the relations and indexes. The target is to find an assignment of the objects to the devices that minimizes the average response time. To simplify the problem, the target layout is limited to the ones that meet the following constraints.

1. All the devices are disks of the same type.

2. The devices are divided into groups of either a single disk or four disks.

3. Each object is assigned to only one device group, and its data will be striped across all the devices in that group.

[Figure 9. Two alternatives to predict the performance of a device group: treating the group as a set of single devices, each modeled by its own CART device model whose predictions are aggregated, or treating the whole group as one device with a single CART model.]

This set of data layouts is similar to the ones supported by Oracle's Automatic Storage Management (ASM). ASM allows DBAs to divide devices into groups, and all the relations within a group are striped and mirrored. We introduce the above constraints only to limit the search space; the approach itself is independent of the constraints.

This work will use a randomized greedy search algorithm similar to Ergastulum [7]. The randomized greedy algorithm greedily assigns data objects to devices. It occasionally scrambles the current solution by moving devices or data objects from one group to another to avoid local optima.
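
A sketch of such a search is given below; it is not Ergastulum itself, only an illustration of the greedy-plus-occasional-scramble idea, with cost standing for the device-model-based evaluation described next.

    import random

    def randomized_greedy_layout(objects, groups, cost, iterations=1000, scramble_prob=0.05):
        """Assign data objects to device groups, greedily minimizing the predicted
        average response time, with occasional random moves to escape local optima."""
        layout = {obj: random.choice(groups) for obj in objects}
        best = dict(layout)
        for _ in range(iterations):
            obj = random.choice(objects)
            if random.random() < scramble_prob:          # occasional scramble
                layout[obj] = random.choice(groups)
            else:                                         # greedy reassignment
                layout[obj] = min(groups, key=lambda g: cost({**layout, obj: g}))
            if cost(layout) < cost(best):
                best = dict(layout)
        return best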

The search algorithm consults the cost model to evaluate a data layout. The evaluation takes the data layout and the workload as input and estimates the performance of the workload on that layout using device models. The performance of individual device groups is estimated and then combined to produce the overall performance of the layout. There are two options for estimating the performance of device groups, as illustrated in Figure 9. The first is to model the devices separately using the single-device model. For device groups larger than one device, the accesses are split across the devices according to the striping mechanism, the performance of each individual device is estimated by running the split workloads against the device model, and aggregating these estimates produces the average for the device group. The advantage of this approach is its simplicity: only one device model is needed even though a disk group can have an arbitrary number of devices. The disadvantage is that splitting the workloads may require a significant amount of computation. The other alternative is to model the group of devices as a new type of device. This eliminates the splitting process, but may provide inferior prediction accuracy because device groups are more complex to model. This work will investigate the two alternatives and choose one to use.
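
The splitting step in the first option can be approximated as below; the RAID-0-style mapping and the decision to assign each request to the disk holding its first block are simplifications of whatever striping the real device group uses.

    def split_by_striping(requests, n_disks, stripe_unit):
        """Split a device group's request stream into per-disk streams.

        Each request is a (time, lbn, size, is_read) tuple; stripe_unit is the
        stripe unit in disk blocks. Requests that cross a stripe-unit boundary
        are not broken up here, which the real splitter would have to do."""
        per_disk = [[] for _ in range(n_disks)]
        for time, lbn, size, is_read in requests:
            disk = (lbn // stripe_unit) % n_disks
            local_lbn = (lbn // (stripe_unit * n_disks)) * stripe_unit + lbn % stripe_unit
            per_disk[disk].append((time, local_lbn, size, is_read))
        return per_disk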

The current infrastructure does not model the effect of mixing several data objects on the workload characteristics. Instead, when a device holds several data objects, the model extracts the accesses on these data objects, mixes them according to the new layout, and computes the characteristics from the mixture. Doing so allows the evaluation to focus on the quality of the device models by eliminating errors that a model of the mixing process would introduce. Investigating the effect of mixing accesses, however, is an interesting topic for future research. The ideal case is to derive the characteristics of the mixture from the characteristics of the individual data objects and the correlations between these objects.

Step 4-b: Evaluate device models. The storage device models will be plugged into the infrastructure for evaluation. The purpose is to show the effectiveness of these models in guiding the search algorithm and identifying the desired solution. The expected result is that the search algorithm terminates in a reasonable amount of time and that the solution it locates is the same as, or close to, the optimal solution.

[Figure 10. Timeline toward finishing the thesis: Step 2-c, Step 3-a, Step 3-b, Step 3-c, Step 4, and dissertation writing, scheduled between May 2004 and May 2005.]

We will compare the solution found using the device models to three other solutions: a) the solution found using the actual devices, which serves as the best solution we can achieve under the chosen search algorithm; b) the solutions found using other predictors, such as the linear predictor; and c) the solution achieved by hand optimization, which shows the benefit of automating the data layout. Both the computation time and the quality of the solutions will be used to demonstrate that the black-box models are accurate and efficient for practical use.

6 Timeline

This section lists the tentative schedule for completing the thesis work. The expected graduation time is May 2005. Figure 10 shows the schedule.

I plan to start with Step 3. The quality of the training workloads is important in device model construction, and finishing Step 3 makes it possible to evaluate the device models in an objective way. I allocate two months, on average, to each sub-step in Step 3.

Step 2-c and Step 4 will happen in parallel with Step 3. More specifically, Step 2-c will run in parallel with Step 3-b because the result of Step 2-c determines whether there is any need to generate workloads with workload-level characteristics. Step 4 will run in parallel with Step 3-c. Ideally, the device models built with the feedback mechanism will be ready for evaluation when the sample application is built. The evaluation of the device models is expected to be completed before April.

The dissertation writing will happen mostly during the last four months of my Ph.D. program.

References

[1] A. K. Agrawal, J. M. Mohr, and R. M. Bryant. An approach to the workload characterization problem. Computer, 9(6):18–32, 1976.

[2] Sanjay Agrawal, Surajit Chaudhuri, Abhinandan Das, and Vivek Narasayya. Automating layout of relational databases. In ICDE, pages 607–618, 2003.


[3] Alexander Aiken, Jolly Chen, Michael Stonebraker, and Allison Woodruff. Tioga-2: A direct manipulation database visualization environment. In ICDE, pages 208–217, 1996.

[4] Sedat Akyurek and Kenneth Salem. Adaptive block rearrangement. ACM Transactions on Computer Systems, 13(2):89–121, 1995.

[5] Virgílio Almeida, Azer Bestavros, Mark Crovella, and Adriana de Oliveira. Characterizing reference locality in the WWW. In Proceedings of the IEEE Conference on Parallel and Distributed Information Systems (PDIS), pages 92–103, Miami Beach, FL, 1996.

[6] E. Anderson, M. Hobbs, K. Keeton, S. Spence, M. Uysal, and A. Veitch. Hippodrome: Running circles around storage administration. In FAST, pages 175–188, 2002.

[7] E. Anderson, M. Kallahalla, S. Spence, R. Swaminathan, and Q. Wang. Ergastulum: Quickly finding near-optimal storage system designs. Technical Report HPL-SSP-2001-05, HP Labs, 2001.

[8] Eric Anderson. Simple table-based modeling of storage devices. Technical Report HPL-SSP-2001-4, HP Labs, 2001.

[9] Mary Baker, John H. Hartman, Michael D. Kupfer, Ken Shirriff, and John K. Ousterhout. Measurement of a distributed file system. In Proceedings of the 13th ACM Symposium on Operating Systems Principles (SOSP), pages 198–212, 1991.

[10] Thomas Ball and Stephen G. Eick. Software visualization in the large. IEEE Computer, 29(4):33–43, 1996.

[11] Paul Barford and Mark Crovella. Generating representative web workloads for network and server performance evaluation. In SIGMETRICS’98, pages 151–160, 1998.

[12] Luiz Andre Barroso, Kourosh Gharachorloo, and Edouard Bugnion. Memory system characterization of commercial workloads. In Proceedings of the 25th International Symposium on Computer Architecture, 1998.

[13] Ken Bates. VAX I/O subsystems: optimizing performance. Cardinal Business Media, 1991.

[14] Elizabeth Borowsky, Richard A. Golding, P. Jacobson, Arif Merchant, L. Schreier, Mirjana Spasojevic, and John Wilkes. Capacity planning with phased workloads. In WOSP, pages 199–207, 1998.

[15] Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. Web caching and Zipf-like distributions: Evidence and implications. In INFOCOM (1), pages 126–134, 1999.

[16] John Bucy, Greg Ganger, and contributors. The DiskSim simulation environment version 3.0 reference manual. Technical Report CMU-CS-03-102, Carnegie Mellon University, 2003.

[17] Ramon Caceres, Peter B. Danzig, Sugih Jamin, and Danny Mitzel. Characteristics of wide-area TCP/IP conversations. In Proceedings of ACM SIGCOMM, pages 101–112, 1991.

[18] S. Chaudhuri and G. Weikum. Rethinking database system architecture: towards a self-tuning RISC-style database system. In VLDB, pages 1–10, 2000.

[19] Peter M. Chen and David A. Patterson. A new approach to I/O performance evaluation – self-scaling I/O benchmarks, predicted I/O performance. In SIGMETRICS, pages 1–12, 1993.


[20] Shenze Chen and Don Towsley. A performance evaluation of RAID architectures. IEEE Transactions on Computers, pages 1116–1130, 1996.

[21] Trishul M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality. In PLDI, pages 191–202, 2001.

[22] Robert B. Cooper. Introduction to Queueing Theory (2nd edition). North-Holland (Elsevier), 1981.

[23] Mark E. Crovella and Azer Bestavros. Self-similarity in World Wide Web traffic: evidence and possible causes. In Proc. of the 1996 ACM SIGMETRICS Intl. Conf. on Measurement and Modeling of Computer Systems, May 1996.

[24] L. Breiman et al. Classification and Regression Trees. Wadsworth, 1984.

[25] K. K. Ramakrishnan et al. Analysis of file I/O traces in commercial computing environments. In SIGMETRICS, pages 78–90, 1992.

[26] D. Ferrari. On the foundation of artificial workload design. In SIGMETRICS, pages 8–14, 1984.

[27] Gregory R. Ganger. Generating representative synthetic workloads: an unsolved problem. In Proceedings of the Computer Measurement Group (CMG) Conference, pages 1263–1269, 1995.

[28] Gregory R. Ganger, John D. Strunk, and Andrew J. Klosterman. Self-* storage: brick-based storage with automated administration. Technical Report CMU-CS-03-178, Carnegie Mellon University, 2003.

[29] Maria E. Gomez and Vicente Santonja. Analysis of self-similarity in I/O workload using structural modeling. In 7th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pages 234–243, 1999.

[30] Maria E. Gomez and Vicente Santonja. A new approach in the modeling and generation of synthetic disk workload. In Proceedings of the 9th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pages 199–206, 2000.

[31] Steven D. Gribble, Gurmeet Singh Manku, Drew Roselli, Eric A. Brewer, Timothy J. Gibson, and Ethan L. Miller. Self-similarity in file systems. In SIGMETRICS’98, pages 141–150, 1998.

[32] Knuth Stener Grimsrud, James K. Archibald, Richard L. Frost, and Brent E. Nelson. Locality as a visualization tool. IEEE Transactions on Computers, 45(11):1319–1326, 1996.

[33] M. T. Heath and J. A. Etheridge. Visualizing the performance of parallel programs. IEEE Software, 8(5):29–39, 1991.

[34] William L. Hibbard, Brian E. Paul, David A. Santek, Charles R. Dyer, Andre L. Battaiola, and Marie-Francoise Voidrot-Martinez. Interactive visualization of earth and space science computations. IEEE Computer, 27(7):65–72, 1994.

[35] M. Hofri. FCFS vs. SSTF revisited. Communications of the ACM, 23(11):645–653, 1980.

[36] Bo Hong and Tara Madhyastha. The relevance of long-range dependence in disk traffic and implications for trace synthesis. Technical Report UCSC-CRL-02-13, University of California at Santa Cruz, 2002.


[37] Windsor W. Hsu, Alan Jay Smith, and Honesty C. Young. I/O reference behavior of production database workloads and the TPC benchmarks – an analysis at the logical level. Technical Report CSD-99-1071, University of California, Berkeley, 1999.

[38] Microsoft Inc. AutoAdmin project at Microsoft Research, http://research.microsoft.com/dmx/autoadmin.

[39] K. Keeton, G. Alvarez, E. Riedel, and M. Uysal. Characterizing I/O-intensive workload sequentiality on modern disk arrays. In CAECW-01, 2001.

[40] K. Keeton, D. A. Patterson, Y. Q. He, R. C. Raphael, and W. E. Baker. Performance characterization of a quad Pentium Pro SMP using OLTP workloads. In ISCA’98, pages 15–26, 1998.

[41] Zachary Kurmas, Kimberly Keeton, and Ralph Becker-Szendy. I/O workload characterization. In 4th Workshop on Computer Architecture Evaluation Using Commercial Workloads, 2001.

[42] Zachary Kurmas, Kimberly Keeton, and Kenneth Mackenzie. Synthesizing representative I/O workloads using iterative distillation. In Proceedings of the 11th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pages 6–15, 2003.

[43] Edward K. Lee and Randy H. Katz. An analytic performance model of disk arrays. In Proceedings of the 1993 ACM SIGMETRICS, pages 98–109, 1993.

[44] Will E. Leland, Murad S. Taqqu, Walter Willinger, and Daniel V. Wilson. On the self-similar nature of Ethernet traffic. In Deepinder P. Sidhu, editor, ACM SIGCOMM, pages 183–193, San Francisco, California, 1993.

[45] Miron Livny, R. Ramakrishnan, K. Beyer, G. Chen, D. Donjerkovic, S. Lawande, J. Myllymaki, and Kent Wenger. DEVise: integrated querying and visual exploration of large datasets. In SIGMOD, pages 301–312, 1997.

[46] A. D. Malony, D. H. Hammerslag, and D. J. Jablonowski. Traceview: a trace visualization tool. IEEE Software, 8(5):19–28, 1991.

[47] Jeanna Neefe Matthews, Drew Roselli, Adam M. Costello, Randolph Y. Wang, and Thomas E. Anderson. Improving the performance of log-structured file systems with adaptive methods. In ACM Symposium on Operating Systems Principles, pages 238–251, 1997.

[48] Arif Merchant and Guillermo A. Alvarez. Disk array models in Minerva. Technical Report HPL-2001-118, HP Laboratories, 2001.

[49] Arif Merchant and P. S. Yu. Analytical modeling of clustered RAID with mapping based on nearly random permutation. IEEE Transactions on Computers, pages 367–373, 1996.

[50] Ethan L. Miller and Randy H. Katz. Analyzing the I/O behavior of supercomputer applications. In Digest of Papers, IEEE Symposium on Mass Storage Systems, pages 51–55, 1991.

[51] Spencer W. Ng. Improving disk performance via latency reduction. IEEE Transactions on Computers, 40(1):22–30, 1991.

[52] J. P. Oly and D. A. Reed. Markov model predictions of I/O requests for scientific applications. In Proc. 16th ACM Conf. Supercomputing, pages 147–155, 2002.


[53] Oracle Corporation. Oracle Database 10g.

[54] John K. Ousterhout, Herve Da Costa, David Harrison, John A. Kunze, Michael D. Kupfer, and James G. Thompson. A trace-driven analysis of the UNIX 4.2 BSD file system. In SOSP, pages 15–24, 1985.

[55] V. Paxson. Empirically-derived analytic models of wide-area TCP connections. IEEE/ACM Transactions on Networking, 2(4):316–336, 1994.

[56] V. Paxson. Fast, approximate synthesis of fractional Gaussian noise for generating self-similar network traffic. Computer Communications Review, 27(5):5–18, 1997.

[57] V. Paxson and S. Floyd. Wide-area traffic: the failure of Poisson modeling. IEEE/ACM Transactions on Networking, 3(3):226–244, 1995.

[58] Rudolf H. Riedi, Matthew S. Crouse, Vinay J. Ribeiro, and Richard G. Baraniuk. A multifractal wavelet model with application to network traffic. IEEE Transactions on Information Theory, 45(3):992–1018, April 1999.

[59] C. Ruemmler and J. Wilkes. Disk shuffling. Technical Report HPL-91-156, HP Laboratories, 1991.

[60] Chris Ruemmler and John Wilkes. UNIX disk access patterns. In Winter USENIX Conference, pages 405–420, 1993.

[61] Chris Ruemmler and John Wilkes. An introduction to disk drive modeling. IEEE Computer, 27(3):17–28, 1994.

[62] P. Scheuermann, G. Weikum, and P. Zabback. Data partitioning and load balancing in parallel disk systems. VLDB Journal, 7(1):48–66, 1998.

[63] Jiri Schindler and Gregory R. Ganger. Automated disk drive characterization. Technical Report CMU-CS-99-176, Carnegie Mellon University, 1999.

[64] M. Seltzer, P. Chen, and J. Ousterhout. Disk scheduling revisited. In Proc. of the 1990 Winter USENIX, pages 313–323, 1990.

[65] Elizabeth Shriver, Arif Merchant, and John Wilkes. An analytical behavior model for disk drives with readahead caches and request reordering. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems, pages 182–191, 1998.

[66] E. Smirni and D. A. Reed. Workload characterization of input/output intensive parallel applications. In Proc. Conf. Modeling Techniques and Tools for Computer Performance Evaluation, volume 1245, pages 169–180. Springer-Verlag, 1997.

[67] John D. Strunk and Gregory R. Ganger. A human organization analogy for self-* systems. In First Workshop on Algorithms and Architectures for Self-Managing Systems, in conjunction with the Federated Computing Research Conference, pages 1–6, 2003.

[68] Nancy Tran and Daniel A. Reed. Automatic ARIMA time series modeling for adaptive I/O prefetching. IEEE Transactions on Parallel and Distributed Systems, 15(4):362–377, 2004.

[69] Peter Triantafillou, Stavros Christodoulakis, and Costas Georgiadis. Optimal data placement on disks: a comprehensive solution for different technologies. IEEE Transactions on Knowledge and Data Engineering, 12(2):324–330, 2000.


[70] Mustafa Uysal, Guillermo A. Alvarez, and Arif Merchant. A modular, analytical throughput model for modern disk arrays. In Proceedings of the 9th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pages 183–192, 2001.

[71] Paul Vongsathorn and Scott D. Carson. A system for adaptive disk rearrangement. Software – Practice and Experience, 20(3):225–242, 1990.

[72] M. Wang, T. Madhyastha, N. H. Chan, S. Papadimitriou, and C. Faloutsos. Data mining meets performance evaluation: fast algorithms for modeling bursty traffic. In ICDE, pages 507–516, 2002.

[73] Mengzhi Wang, Anastassia Ailamaki, and Christos Faloutsos. Capturing the spatio-temporal behavior of real traffic data. Performance Evaluation, 49(1/4):147–163, 2002.

[74] Mengzhi Wang, Kinman Au, Anastassia Ailamaki, Anthony Brockwell, Christos Faloutsos, and Gregory R. Ganger. Storage device performance prediction with CART models. Technical Report CMU-PDL-04-103, Carnegie Mellon University, 2004.

[75] Randolph Y. Wang, David A. Patterson, and Thomas E. Anderson. Virtual log based file systems for a programmable disk. In Symposium on Operating Systems Design and Implementation, pages 29–43, 1999.

[76] J. Wilkes. The Pantheon storage-system simulator. Technical Report HPL-SSP-95-14, Hewlett-Packard Laboratories, 1995.

[77] J. Wilkes, R. Golding, C. Staelin, and T. Sullivan. The HP AutoRAID hierarchical storage system. ACM Transactions on Computer Systems, 14(1):108–136, 1996.

[78] W. Willinger, V. Paxson, and M. S. Taqqu. A Practical Guide to Heavy Tails: Statistical Techniques and Applications, chapter Self-similarity and heavy tails: structural modeling of network traffic. Birkhauser, 1998.

[79] Xiang Yu, Benjamin Gum, Yuqun Chen, Randolph Y. Wang, Kai Li, Arvind Krishnamurthy, and Thomas E. Anderson. Trading capacity for performance in a disk array. In OSDI, pages 243–258, 2000.
