MEDWRAP: Consistent View Maintenance over Distributed Multi-Relation Sources
Aparna Varde and Elke Rundensteiner
Database Systems Research Group (DSRG), Department of Computer Science
Worcester Polytechnic Institute (WPI), Worcester, Massachusetts, USA
(aparna | rundenst) @ cs.wpi.edu
DEXA-2002



Page 2: Concurrency Conflict in View Maintenance, e.g. Travel Information Management

R1: Flights

Flt   From     To     Dep.     Arr.
A1    D.C.     NYC    4 p.m.   5 p.m.
A2    Miami    LA     9 a.m.   3 p.m.
A3    Boston   D.C.   6 a.m.   8 a.m.

ΔR1 (new flight inserted into R1)

Flt   From     To     Dep.     Arr.
A4    Boston   D.C.   7 p.m.   9 p.m.

R2: Special Offers

From      To     Price
D.C.      NYC    100
Atlanta   LA     150

ΔR2 (new offer inserted into R2)

From     To     Price
Boston   D.C.   125

View: Discounted Flight

Flt   From   To    Price
A1    D.C.   NYC   100

V = Π Flt, From, To, DisPrice (R1 join R2)
Maintenance Query: MQ = ΔR1 join R2
Correct Result: MQR = null
Incorrect MQR = ΔR1 join (R2 + ΔR2)
Concurrency Conflict: ΔR1 join ΔR2

Incorrect MQR

Flt   From     To     Price
A4    Boston   D.C.   125
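The conflict on this slide can be replayed in a few lines of Python. This is a minimal sketch, not the authors' code: the `join` helper and the variable names are hypothetical, the Dep./Arr. columns are left out because the view does not use them, and ΔR1 and ΔR2 are taken to be the new flight A4 and the new Boston to D.C. offer.

```python
def join(flights, offers):
    """Join flights and offers on (From, To); keep Flt, From, To, Price."""
    return [
        (flt, frm, to, price)
        for (flt, frm, to) in flights
        for (offer_from, offer_to, price) in offers
        if (frm, to) == (offer_from, offer_to)
    ]

# R2 as it was when the update to R1 was made.
r2 = [("D.C.", "NYC", 100), ("Atlanta", "LA", 150)]

# Assumed updates: the new flight (dR1) and the concurrent new offer (dR2).
delta_r1 = [("A4", "Boston", "D.C.")]
delta_r2 = [("Boston", "D.C.", 125)]

# What the maintenance query should return: dR1 join R2.
correct_mqr = join(delta_r1, r2)

# What it returns if dR2 reaches the source before the query is answered.
incorrect_mqr = join(delta_r1, r2 + delta_r2)

# The conflict term dR1 join dR2, removed by compensation.
conflict = join(delta_r1, delta_r2)
compensated = [row for row in incorrect_mqr if row not in conflict]

print(correct_mqr, incorrect_mqr, compensated)
```

Removing the conflict term from the late answer gives back the empty result, which is the "Correct Result: MQR = null" case.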

Page 3: State of the Art VM Algorithms

[Figure: two architectures side by side, each with a DW holding view V above a Mediator; relations R1, R2, R3, R4 under one mediator and R1, R2, R3 under the other]

Single-Source VM Algorithms (e.g. ECA, CCA)
• One MQ per update goes to all Rs

Multi-Source VM Algorithms (e.g. Strobe, SWEEP)
• Each IS needs a separate MQ

Page 4: Research Issue

[Figure: DW holding view V, above a Mediator, above three sources: IS1 (R11, R12), IS2 (R21, R22, R23), IS3 (R31)]

• Modern data warehouses generally have multiple sources
• Each source typically has multiple relations
• Goal: To solve concurrency conflicts in a multi-IS, multi-R DW

Page 5: Where State of the Art fails

One Relation per Source (IS1 holds R1; IS2 holds R2)
MQ1 = ΔR1 join R2
MQR1 = ΔR1 join (R2 + ΔR2)
Conflict = ΔR1 join ΔR2
ΔR1 join ΔR2: Can calculate from mediator

Multiple Relations per Source (IS1 holds R1; IS2 holds R2 and R3)
MQ1 = ΔR1 join (R2 join R3)
MQR1 = ΔR1 join (R2 + ΔR2) join R3
Conflict = ΔR1 join ΔR2 join R3
ΔR1 join ΔR2 join R3: Can't calculate from mediator (problem!!!)
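The conflict term in the multi-relation case follows from distributing the join over the update. The derivation below is a sketch in LaTeX, writing \Delta R_i for an update to R_i (an assumed notation), + for union and \bowtie for join.

```latex
\begin{align*}
MQR_1 &= \Delta R_1 \bowtie (R_2 + \Delta R_2) \bowtie R_3 \\
      &= (\Delta R_1 \bowtie R_2 \bowtie R_3) + (\Delta R_1 \bowtie \Delta R_2 \bowtie R_3)
\end{align*}
```

The first term is the intended answer to MQ1 and the second is the conflict. The second term still involves R3, which is stored only at IS2. With one relation per source the corresponding term is \Delta R_1 \bowtie \Delta R_2 and contains no stored relation at all.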

Page 6: Potential Solution???

[Figure: IS1 holds R1; IS2 holds R2 and R3; a wrapper sits between IS2 and the mediator]

At the mediator, the multi-source VM algorithm can't calculate ΔR1 join ΔR2 join R3
Wrapper calculates (ΔR2 join R3)
Relational concurrency conflict: ΔR3
Mediator cannot correct this

Page 7: The MEDWRAP Approach: MEDiator and WRAPper Compensation

[Figure: Data Warehouse holding view V, above a Mediator (Multi-Source VM Algorithm), above three Wrappers (Single-Source VM Algorithm), one each for IS1 (R11), IS2 (R21, R22, R23) and IS3 (R31, R32)]

Page 8: Conditions for VM algorithms

• Both Mediator and Wrapper VM algorithms have to be compensation-based
• If both algorithms maintain complete consistency, MEDWRAP maintains complete consistency
• If either algorithm maintains lower than complete consistency, MEDWRAP maintains the consistency level of the weaker algorithm
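The "weaker algorithm" rule amounts to taking a minimum over an ordered set of consistency levels. The sketch below assumes the four-level ordering commonly used in the view maintenance literature (convergence, weak, strong, complete); the slide itself names only "complete", and the enum and function names are hypothetical.

```python
from enum import IntEnum

class Consistency(IntEnum):
    # Assumed ordering of levels, weakest first.
    CONVERGENCE = 1
    WEAK = 2
    STRONG = 3
    COMPLETE = 4

def medwrap_consistency(mediator_level, wrapper_level):
    """Level of the combined system: that of the weaker algorithm."""
    return min(mediator_level, wrapper_level)

both_complete = medwrap_consistency(Consistency.COMPLETE, Consistency.COMPLETE)
one_weaker = medwrap_consistency(Consistency.COMPLETE, Consistency.STRONG)
print(both_complete.name, one_weaker.name)
```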

Page 9: Global Query Processing

[Figure: Mediator with view V, a Wrapper (Single-Source VM) and an IS holding R1, R2, R3; messages shown: GMQ, LMQ1, LMQR1 and an update to R1]

VM algorithm at wrapper
• Does update-processing only
• For this it sends queries to IS

What if it receives a query from the mediator?
Global Query Processing needed at wrapper

Page 10: Common Concurrency Conflicts

[Figure: Timing diagram for order of messages, showing GMQ, LMQ1, LMQR1, GMQR and the updates ΔR1, ΔR2, ΔR3]

ΔR2 and ΔR3 are conflicts for LMQ1 and LMQR1
ΔR2 and ΔR3 are also conflicts for GMQ1 and GMQR1
Hence, the same updates can cause conflicts in both Local and Global Query Processing

Hence we need
• A common queue for all messages at each wrapper
• A common processor for queries and updates
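A toy model shows why one queue is enough to spot conflicts for both kinds of query. This is a sketch under assumptions, not the MEDWRAP implementation: the `Wrapper` class, its methods and the message arrival order are all hypothetical, chosen so that the updates to R2 and R3 arrive after both queries.

```python
from collections import deque

class Wrapper:
    """Toy wrapper: one shared queue for updates, local and global queries."""

    def __init__(self):
        self.lqueue = deque()

    def receive(self, kind, name):
        # kind is "update", "LMQ" or "GMQ"; every message joins the same queue.
        self.lqueue.append((kind, name))

    def conflicts_for(self, query_name):
        """Updates queued after the named query, i.e. candidates for compensation."""
        seen_query = False
        conflicts = []
        for kind, name in self.lqueue:
            if name == query_name and kind != "update":
                seen_query = True
            elif seen_query and kind == "update":
                conflicts.append(name)
        return conflicts

w = Wrapper()
w.receive("update", "dR1")   # assumed arrival order
w.receive("LMQ", "LMQ1")
w.receive("GMQ", "GMQ")
w.receive("update", "dR2")
w.receive("update", "dR3")

local_conflicts = w.conflicts_for("LMQ1")
global_conflicts = w.conflicts_for("GMQ")
print(local_conflicts, global_conflicts)
```

Because both query types sit in the same queue as the updates, a single scan finds the same conflicting updates for the local and the global query.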

Page 11: MEDWRAP: At Wrapper

[Figure: Inside the Wrapper, an LQueue feeds an LProcessor; the Source sits below. Messages shown: GMQ, GMQR, ΔIS, ΔR, LMQ, LMQR, CQ, CQR. Legend: Query, Update, Result]

Page 12: MEDWRAP Example

IS1 = ΠB (R11)
IS2 = Πw (R21 join R22 join R23)
V = Πw (IS1 join IS2), Initial V = null

IS1
        R11 (A,B)
Ini     [5,4]
ΔR11    -[5,4]

IS2
        R21 (W,X)   R22 (X,Y)   R23 (Y,Z)
Ini     [1,2]
ΔR21    +[4,2]
ΔR22                +[2,5]
ΔR23                            +[5,3]

Wrapper1: ΔR11 = -[5,4], giving ΔIS1 = -[4]
Mediator: GMQ = Πw (ΔIS1 join IS2)
Wrapper2: +[4,2]; GMQ, +[2,5], +[5,3], GMQR
CQ22 = Πw (R21 join [2,5] join R23)
CQR22 = ([1],[4])
Incorrect GMQR = ([1],[4])
Correct GMQR = (Incorrect GMQR - All CQRs) = null
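The numbers on this slide can be checked with a short script. It is a sketch under stated assumptions rather than the authors' algorithm: relations are modelled as Python sets, `project_w` is a hypothetical helper for Πw (R21 join R22 join R23), the insert +[4,2] is assumed to be applied before GMQ arrives, and only the IS2 side of the query is evaluated.

```python
def project_w(r21, r22, r23):
    """Pi_W (R21 join R22 join R23): join on X, then on Y, keep W."""
    return {
        w
        for (w, x) in r21
        for (x2, y) in r22 if x2 == x
        for (y2, z) in r23 if y2 == y
    }

# Wrapper1: the delete -[5,4] on R11(A,B) projects to -[4] on IS1 = Pi_B(R11).
delta_is1 = {b for (a, b) in {(5, 4)}}

# IS2 when GMQ arrives (assumed): R21 holds Ini [1,2] and +[4,2].
r21 = {(1, 2), (4, 2)}
r22_at_query, r23_at_query = set(), set()

# Concurrent inserts that reach IS2 before the query is answered.
delta_r22 = {(2, 5)}
delta_r23 = {(5, 3)}
r22_late = r22_at_query | delta_r22
r23_late = r23_at_query | delta_r23

intended = project_w(r21, r22_at_query, r23_at_query)
incorrect = project_w(r21, r22_late, r23_late)

# Compensation query CQ22 = Pi_W (R21 join [2,5] join R23).
cqr22 = project_w(r21, delta_r22, r23_late)
corrected = incorrect - cqr22

print(delta_is1, incorrect, cqr22, corrected)
```

Set difference works here because every tuple occurs once; a version that tracks duplicate counts would be needed for bag semantics.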

Page 13: MEDWRAP Example (Contd.)

[Figure: Wrapper1 over IS1 (R11) and Wrapper2 over IS2 (R21, R22, R23), both below the Mediator; messages shown: ΔIS1, GMQ, ΔIS2, GMQR]

ΔIS1 = -[4]
GMQ = Πw (ΔIS1 join IS2)
GMQR = null
IS2 = [1],[4]
Mediator: V = Πw (IS1 join IS2 join IS3) (ΔV = final correct GMQR)

Page 14: Advantages of MEDWRAP

• Allows sources to be semi-autonomous: they do not participate in DW maintenance beyond reporting updates and processing queries
• Benefit of software reuse: techniques from existing VM algorithms are used in the MEDWRAP design
• Generic for any two compensation-based VM algorithms

Page 15: Evaluation of MEDWRAP

Study alternative approaches

Alternative 1: “Materialized MRE” Approach
• Stores an IS View at each wrapper, built by encapsulating all relations in the source

Alternative 2: “Rel-as-Source” Approach (Simplistic)
• If multiple relations per source, treats each relation as a separate source, processing separate queries/updates from each source

Two critical resources for comparison
• Space: Storage at wrappers
• Time: Processing of queries and updates

Page 16: Comparative Cost Analysis

[Figure: "Space and Time Cost" plot. X-axis: Space (Storage of Views), 0 to 120. Y-axis: Time (Delays in Processing), 0 to 300. Series: MEDWRAP, MRE, Rel-as-Source]

• Rel-as-Source: Needs too much processing time
• MRE: Needs too much storage space
• MEDWRAP: Good utilization of space/time on the whole

Page 17: Related Work

RV: Re-computation of View (Traditional)
• Rewrite all tuples, not only affected ones
• Highly inefficient if done for every update

SM: Self Maintenance (Quass et al., 96; Gupta et al., 96)
• DW stores copies of source relations for maintenance
• Huge storage at warehouse

Version Control (Kulkarni et al., 99; Chen et al., 00)
• Versions of transactions / tuples stored at wrappers
• Latest version used to answer queries
• Huge storage at source wrappers

Page 18: Conclusions

MEDWRAP solves concurrency conflicts in incremental VM in a multi-source, multi-relation DW
• Allows sources to be semi-autonomous
• Benefit of software reuse
• Generic for any two compensation-based VM algorithms

Provides a cost-efficient solution
• Materialized MRE Approach: Too space-consuming
• Rel-as-Source Approach: Too time-consuming
• MEDWRAP Approach: Good use of space and time overall

MEDWRAP is being implemented and integrated into DyDa (Dynamic Data Warehousing system) at WPI.