On Fairness, Optimizing Replica Selection in Data Grids
Husni Hamad E. AL-Mistarihi and Chan Huah Yong
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 20, NO. 8, AUGUST 2009
Presented by Chen, Ting-Wei


Table of Contents
– Introduction
– System Requirements and Design
– Performance Metrics and Evaluation
– Results and Discussion
– Conclusions and Future Works

Introduction

Problem
– How to select the best replica location from among many replica locations, with minimum response time and a high level of QoS?
– How to establish fairness among the users in selecting the replica location, such that each user gains an equitable portion of QoS and response time in relation to other users?

Introduction (cont.)

Replica selection
– One of the major functions of data replication; it decides which replica location is the best for the users based on some criteria

Replicas for grid users
– Minimum response time
– High level of Quality of Service (QoS)
– Allocated among the users fairly

Introduction (cont.)

Criteria in the selection decision
– Response time
– Security
– Reliability

The criteria are conflicting and heterogeneous

Introduction (cont.)

Achieves the following objectives
– Provides the Grid users with the required replica in minimum response time and with maximum QoS
– Establishes fairness among the users by providing a new method for resource allocation

Introduction (cont.)

– Provides an elaborated method, termed the “fairness method”, that generates the decision-maker preferences (weights) automatically
– Deploys the AHP model in the replica selection engine

Introduction (cont.)

Evaluated
– Using the authors' own simulator, which is an extension of the OptorSim simulator
– Compared with the random algorithm
  • Because there is no previous work similar to theirs
– Fairness among users is measured
  • By calculating the Standard Deviation (SD) across Grid users for each criterion value

System Requirements and Design

Focus on
– Replica selection decision
– Establishing fairness among users
– The most important resource

System Requirements and Design (cont.)

[Figure: a data file with five copies (Copy 1 to Copy 5) stored at Grid sites 1 to 5; each site is characterized by its Reliability, Security, and Response Time.]

System Requirements and Design (cont.)

Selection engine decides which is the best site
– The most secure site
– The most reliable site
– The lowest response time between the local site and the remote site

Best replica
– The highest level of QoS

System Requirements and Design (cont.)

Analytical Hierarchy Process
– The weighted sum approach
  • Step 1: Identify the underlying criteria. Pair-wise comparisons are then made and converted into quantitative numbers.

Criterion       Scale of measurement
Response Time   From 30 minutes to 2000 minutes. Excellent=(30~100); Very Good=(101~500); Good=(501~1000); Indifferent=(1001~1500); Bad=(1501~2000)
Reliability     From 30 to 100. Excellent=(90~100); Very Good=(80~89); Good=(65~79); Indifferent=(50~64); Bad=(30~49)
Security        From 1 to 5. Excellent=5; Very Good=4; Good=3; Indifferent=2; Bad=1
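
The scale of measurement can be read as a lookup from a raw criterion value to a rating label. The following is a minimal Python sketch of that lookup, assuming integer inputs; the names `RATINGS`, `SCALES`, and `rate` are hypothetical and do not come from the paper.

```python
# Rating labels in the order used by the scale-of-measurement table.
RATINGS = ["Excellent", "Very Good", "Good", "Indifferent", "Bad"]

# Inclusive (low, high) bounds per rating, copied from the table.
SCALES = {
    "response_time": [(30, 100), (101, 500), (501, 1000), (1001, 1500), (1501, 2000)],
    "reliability": [(90, 100), (80, 89), (65, 79), (50, 64), (30, 49)],
    "security": [(5, 5), (4, 4), (3, 3), (2, 2), (1, 1)],
}


def rate(criterion, value):
    """Return the rating label whose band contains an integer value."""
    for label, (low, high) in zip(RATINGS, SCALES[criterion]):
        if low <= value <= high:
            return label
    raise ValueError(f"{value} is outside the {criterion} scale")


print(rate("response_time", 250))  # Very Good
print(rate("reliability", 95))     # Excellent
print(rate("security", 1))         # Bad
```

Values outside the tabulated ranges raise an error rather than being clamped, since the slides do not say how such values are handled.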

System Requirements and Design (cont.)

Step 2
– The pair-wise comparisons are organized into a symmetric matrix
– Multiplying this matrix by itself gives the judgment matrix
– The sum of each row in the judgment matrix is calculated to produce the AHP_Eigenvector value
  • Weight

Step 3
– For each criterion, the relative importance among the alternatives is organized into a symmetric matrix, and steps 1 and 2 are repeated
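
Step 2 can be sketched in a few lines of Python: square the pair-wise matrix, sum each row, and scale the sums so they add up to 1. The division by the grand total follows the "Total" row of the security matrix example later in the deck. The function name `ahp_eigenvector` and the 2x2 input are illustrative assumptions, not values from the paper.

```python
def ahp_eigenvector(matrix):
    """Approximate AHP weights: square the matrix, then normalize its row sums."""
    n = len(matrix)
    # Multiply the pair-wise matrix by itself to get the judgment matrix.
    squared = [
        [sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    row_sums = [sum(row) for row in squared]
    total = sum(row_sums)
    # Each weight is a row's share of the grand total.
    return [s / total for s in row_sums]


# Hypothetical comparison: the first item is twice as important as the second.
pairwise = [
    [1.0, 2.0],
    [0.5, 1.0],
]
weights = ahp_eigenvector(pairwise)
print(weights)
```

For this input the squared matrix is [[2, 4], [1, 2]], so the row sums are 6 and 3 and the weights come out at roughly two thirds and one third.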

System Requirements and Design (cont.)

Step 4
– Local ratings
– Multiply by the weights of the criteria of the judgment matrix (the first matrix)
– Aggregate to get global ratings
– The decision is taken in favor of the highest-ranked alternative site

Disadvantage of AHP
– Error prone
– Hinders the dynamic nature of autonomous Grid systems
– The Fairness Method is proposed to overcome this disadvantage

System Requirements and Design (cont.)

Fairness
– Contributes toward the replication management system in the Grid
– Contributes to other domains which have a similar optimization problem in selecting one possible solution

System Requirements and Design (cont.)

System Detailed Design
– Data Grid architecture

[Figure: Data Grid architecture.
High Level Components: Replica Selection Service, Replica Management.
Core Services: Storage System (DPSS … HPSS), Metadata Repository (LDAP … MCAT), Resource Management (LSF … DIFFSERV), Security (Kerberos), Instrumentation (NWS … NetLogger).
The core services are labeled as Data Grid Specific Services and Generic Grid Services.]

System Requirements and Design (cont.)

System consists of two main components
– Replica Manager (RM)
  • Manages the historical data file
  • Queries the Replica Location Service for the related physical file names and their site locations
  • Queries the NWS and GridFTP for site-related information and network status
– Replica Selector (RS)
  • Located at each grid site (node); receives the requests from the users' jobs

System Requirements and Design (cont.)

– RS gets the related information from the RM in order to take the appropriate decision
– RS computes the fairness values and comes up with the best replica location decision

System Requirements and Design (cont.)

Implementation steps (Fairness method)
– Step 1: Calculate the User Criteria Average (UCA) from the historical data file for each criterion

\[ UCA = \frac{\sum_{i=1}^{n} \mathrm{Criterion}_i}{n} \]
\[ RCA_{User} = \frac{\sum_{i=1}^{n} \mathrm{Reliability}_i}{n} \]
\[ SCA_{User} = \frac{\sum_{i=1}^{n} \mathrm{Security}_i}{n} \]
\[ RTCA_{User} = \frac{\sum_{i=1}^{n} \mathrm{Response\_Time}_i}{n} \]
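
Step 1 amounts to a per-criterion arithmetic mean over one user's history. A minimal Python sketch follows, assuming the historical data file has already been parsed into a list of records; the record layout, the key names, and the function name `user_criteria_average` are hypothetical and the sample values are made up.

```python
def user_criteria_average(history):
    """Average each criterion over the n records of one user's history."""
    n = len(history)
    criteria = history[0].keys()
    return {c: sum(record[c] for record in history) / n for c in criteria}


# Two made-up past requests of a single user.
history = [
    {"reliability": 80, "security": 4, "response_time": 100},
    {"reliability": 90, "security": 2, "response_time": 300},
]
averages = user_criteria_average(history)
print(averages)
```

The three dictionary entries correspond to the RCA, SCA, and RTCA values of that user.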

System Requirements and Design (cont.)

– Step 2: Calculate the System Criteria Average (SCA) for all users in the Grid system

\[ SCA = \frac{\sum_{i=1}^{m} \mathrm{Criterion}_i}{m} \]
\[ RCA_{System} = \frac{\sum_{i=1}^{m} \mathrm{Reliability}_i}{m} \]
\[ SCA_{System} = \frac{\sum_{i=1}^{m} \mathrm{Security}_i}{m} \]
\[ RTCA_{System} = \frac{\sum_{i=1}^{m} \mathrm{Response\_Time}_i}{m} \]

System Requirements and Design (cont.)

– Step 3: User Fairness (UF) is calculated for each criterion

\[ UF = \frac{\mathrm{CriterionAvg}_{System}}{\mathrm{CriterionAvg}_{User}} \]
\[ \mathrm{ReliabilityFairness}\ (RF) = \frac{RCA_{System}}{RCA_{User}} \]
\[ \mathrm{SecurityFairness}\ (SF) = \frac{SCA_{System}}{SCA_{User}} \]
\[ \mathrm{ResponseTimeFairness}\ (RTF) = \frac{RTCA_{System}}{RTCA_{User}} \]
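
Steps 2 and 3 can be sketched together in Python. The sketch assumes that the system average is the mean of the per-user averages over the m users, and that User Fairness is the system average divided by the user average, which is how the slide's fraction reads. The function names, user names, and numbers are hypothetical.

```python
def system_criteria_average(user_averages):
    """Average each criterion over the m users (assumed reading of Step 2)."""
    m = len(user_averages)
    criteria = next(iter(user_averages.values())).keys()
    return {c: sum(avg[c] for avg in user_averages.values()) / m for c in criteria}


def user_fairness(system_avg, user_avg):
    """Per-criterion ratio of system average to user average (Step 3)."""
    return {c: system_avg[c] / user_avg[c] for c in system_avg}


# Made-up per-user averages, as Step 1 would produce them.
user_averages = {
    "alice": {"reliability": 80.0, "security": 4.0, "response_time": 200.0},
    "bob": {"reliability": 60.0, "security": 2.0, "response_time": 600.0},
}
system_avg = system_criteria_average(user_averages)
alice_fairness = user_fairness(system_avg, user_averages["alice"])
print(system_avg)
print(alice_fairness)
```

A ratio of 1 means the user's historical average equals the system-wide average for that criterion.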

System Requirements and Design (cont.)

– Step 4: Calculate the correlated criteria weights
  • The equation is computed nine times by varying both i and j to fill the weights

\[ \mathrm{Weights}: \quad W_{i,j} = \frac{\mathrm{CriterionFairness}_i}{\mathrm{CriterionFairness}_j} \]
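
With three criteria, varying i and j gives nine ratios, which fit naturally into a 3x3 matrix. The Python sketch below illustrates this under that assumption; the function name `fairness_weight_matrix` and the input fairness values are hypothetical.

```python
def fairness_weight_matrix(fairness):
    """Build W with W[i][j] = fairness[i] / fairness[j]."""
    return [[f_i / f_j for f_j in fairness] for f_i in fairness]


# Made-up fairness values for three criteria.
fairness = [0.5, 1.0, 2.0]
weights = fairness_weight_matrix(fairness)
for row in weights:
    print(row)
```

By construction the diagonal is 1 and each entry is the reciprocal of its mirror entry, the same pattern as in the pair-wise matrices AHP expects.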

System Requirements and Design (cont.)

Implementation steps (AHP)
– Step 5: Produce the matrices
  • Fairness Matrix
  • Security Matrix
  • Reliability Matrix
  • Response Time Matrix

System Requirements and Design (cont.)

– Step 6
  • Calculate the AHP_Eigenvector

Security Matrix        Row sum   Eigenvector
1     0.4   0.67       2.07      0.2
2.5   1     1.67       5.17      0.5
1.5   0.6   1          3.1       0.3
                       Total=10.34
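
The numbers in the security matrix example can be checked by hand. The worked arithmetic below uses only the values shown on the slide, with the eigenvector entries rounded to one decimal as in the table.

```latex
\begin{align*}
\text{Row sums:}\quad & 1 + 0.4 + 0.67 = 2.07, \qquad 2.5 + 1 + 1.67 = 5.17, \qquad 1.5 + 0.6 + 1 = 3.1 \\
\text{Total:}\quad & 2.07 + 5.17 + 3.1 = 10.34 \\
\text{Eigenvector:}\quad & \frac{2.07}{10.34} \approx 0.2, \qquad \frac{5.17}{10.34} = 0.5, \qquad \frac{3.1}{10.34} \approx 0.3
\end{align*}
```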

System Requirements and Design (cont.)

– Step 7
  • Aggregate the AHP_Eigenvectors for reliability, security, and response time into one matrix
  • Multiply this matrix by the AHP_Eigenvector of the fairness matrix
  • The result is a one-dimensional array, the rank array
  • The maximum value of the resulting rank array indicates the best site
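
Step 7 is a matrix-vector product followed by picking the largest entry. A minimal Python sketch, assuming one row per site and one column per criterion in the order reliability, security, response time; the function names and all numbers are illustrative, except that the security column reuses the eigenvector from the security matrix example.

```python
def rank_sites(site_scores, criteria_weights):
    """Weighted sum per site: site_scores[s][c] times criteria_weights[c]."""
    return [
        sum(score * weight for score, weight in zip(row, criteria_weights))
        for row in site_scores
    ]


def best_site(site_scores, criteria_weights):
    """Index of the site with the highest rank value."""
    ranks = rank_sites(site_scores, criteria_weights)
    return max(range(len(ranks)), key=ranks.__getitem__)


# Rows are sites; columns are reliability, security, response time.
site_scores = [
    [0.50, 0.2, 0.3],
    [0.25, 0.5, 0.3],
    [0.25, 0.3, 0.4],
]
# Hypothetical AHP_Eigenvector of the fairness matrix.
criteria_weights = [0.5, 0.25, 0.25]

ranks = rank_sites(site_scores, criteria_weights)
print(ranks)
print(best_site(site_scores, criteria_weights))
```

With these made-up weights the first site wins because reliability carries half of the total weight and that site has the largest reliability share.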

Performance Metrics and Evaluation

Evaluate the system performance
– Measure
– Analyze
– Compare with other models

Quality of Service (QoS) and Response Time
– High level of security
– The security value specified on each site in the replica selection decision

Performance Metrics and Evaluation (cont.)

Fairness Metric
– Measures the portion of resources gained by a specified user
– SD metrics are the appropriate metrics to measure the fairness level

Evaluation
– OptorSim
– Some changes were made to suit the case
– Compared with the random algorithm
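
The SD-based fairness metric can be sketched as a standard deviation taken across users for each criterion. The Python sketch below assumes the population standard deviation (`statistics.pstdev`); the slides do not say whether the population or the sample form is used, and the function name and data are hypothetical.

```python
from statistics import pstdev


def fairness_sd(values_by_criterion):
    """Standard deviation across users, computed separately per criterion."""
    return {c: pstdev(values) for c, values in values_by_criterion.items()}


# Made-up criterion values obtained by three users.
values_by_criterion = {
    "security": [3, 3, 3],
    "reliability": [60, 80, 100],
}
spread = fairness_sd(values_by_criterion)
print(spread)
```

A spread of zero, as for security here, means every user received the same value; a larger spread means the values were distributed less evenly.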

Results and Discussion

Test case (1): Fairness
– Before / After [three pairs of figures; images not recoverable]

Results and Discussion (cont.)

Test case (2): Best Replica Selection and Scalability
[Figures; images not recoverable]
– Overall comparison of the fairness algorithm with the random algorithm

Conclusions and Future Works

Conclusions
– Best replica selection
– Establishing fairness among users
– Advantage
  • The system allows Grid users to participate and share Grid resources fairly
  • The system achieves better satisfaction for Grid users
– Reliability and security are maximized
– Response time is minimized

Conclusions and Future Works (cont.)

Future Works
– Improve the replica selection process by involving the users in determining their preferences
– Add another component to the system that provides searching and matching services for the users
– The stock market shares concept will be adapted so that each user can sell fairness values to, or buy them from, other users

Conclusions and Future Works (cont.)

– Expand the system
– Propose a new replication strategy
  • Support replica management
  • Replica deletion
  • Replica placement
  • Reduce both job execution time and network traffic
– The future replication strategy will be compared with OptorSim

Thank You for Your Attention