DSRG, WPI 1
DEXA-2002
MEDWRAP: Consistent View Maintenance over Distributed Multi-Relation Sources
Aparna Varde and Elke Rundensteiner
Database Systems Research Group
Department of Computer Science
Worcester Polytechnic Institute
Worcester, Massachusetts, USA
(aparna | rundenst) @ cs.wpi.edu

Concurrency Conflict in View Maintenance: e.g. Travel Information Management

R1: Flights
Flt   From     To     Dep.     Arr.
A1    D.C.     NYC    4 p.m.   5 p.m.
A2    Miami    LA     9 a.m.   3 p.m.
A3    Boston   D.C.   6 a.m.   8 a.m.

R2: Special Offers
From      To    Price
Atlanta   LA    150
D.C.      NYC   100

View: Discounted Flight
Flt   From   To    Price
A1    D.C.   NYC   100

V = Π Flt, From, To, DisPrice (R1 join R2)

ΔR1:  A4   Boston   D.C.   7 p.m.   9 p.m.
ΔR2:  Boston   D.C.   125

Maintenance Query: MQ = ΔR1 join R2
Correct Result: MQR = null
Incorrect MQR = ΔR1 join (R2 + ΔR2) = (A4, Boston, D.C., 125)
Concurrency Conflict: ΔR1 join ΔR2
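
The conflict on this slide can be replayed in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: relations are modeled as lists of Python dicts, the Dep./Arr. columns are left out because they play no part in the join, `join` is a hypothetical helper that matches on (From, To), and compensation is taken to mean removing the conflict term ΔR1 join ΔR2 from the incorrect result.

```python
def join(flights, offers):
    """Join flight tuples with offer tuples on (From, To). Hypothetical helper."""
    return [
        (f["Flt"], f["From"], f["To"], o["Price"])
        for f in flights
        for o in offers
        if (f["From"], f["To"]) == (o["From"], o["To"])
    ]


# R2 as it was when the flight update happened.
r2 = [
    {"From": "Atlanta", "To": "LA", "Price": 150},
    {"From": "D.C.", "To": "NYC", "Price": 100},
]
delta_r1 = [{"Flt": "A4", "From": "Boston", "To": "D.C."}]    # update to Flights
delta_r2 = [{"From": "Boston", "To": "D.C.", "Price": 125}]   # concurrent update to Special Offers

correct_mqr = join(delta_r1, r2)                # what MQ should have returned
incorrect_mqr = join(delta_r1, r2 + delta_r2)   # MQ evaluated after the offer update arrived
conflict = join(delta_r1, delta_r2)             # the concurrency conflict term

# Compensation (assumed here): drop the conflict term from the incorrect answer.
compensated = [t for t in incorrect_mqr if t not in conflict]
```

In this run `correct_mqr` is empty, `incorrect_mqr` holds the single tuple for flight A4 at price 125, and `compensated` is empty again, which matches the null result the slide gives as correct.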

State of the Art VM Algorithms
• Single-Source VM Algorithms (e.g. ECA, CCA): one MQ per update goes to all Rs
• Multi-Source VM Algorithms (e.g. Strobe, SWEEP): each IS needs a separate MQ
[Figure: two architectures side by side, each with a DW holding view V above a Mediator; the relations shown are R1, R2, R3, R4 in one and R1, R2, R3 in the other.]

Research Issue
• Modern Data Warehouses generally have multiple sources
• Each source typically has multiple relations
• Goal: To solve concurrency conflicts in a multi-IS, multi-R DW
[Figure: DW holding view V above a Mediator; sources IS1 (R11, R12), IS2 (R21, R22, R23), IS3 (R31).]

Where State of the Art Fails

One Relation per Source (IS1: R1; IS2: R2)
• MQ1 = ΔR1 join R2
• MQR1 = ΔR1 join (R2 + ΔR2)
• Conflict = ΔR1 join ΔR2
• Can calculate ΔR1 join ΔR2: both ΔR1 and ΔR2 are available from the mediator

Multiple Relations per Source (IS1: R1; IS2: R2, R3)
• MQ1 = ΔR1 join (R2 join R3)
• MQR1 = ΔR1 join (R2 + ΔR2) join R3
• Conflict = ΔR1 join ΔR2 join R3
• Can't calculate ΔR1 join ΔR2 join R3 from the mediator (problem!)

Potential Solution???
• At the mediator: ΔR1 and ΔR2 are known, but the Multi-source VM Algorithm can't calculate ΔR1 join ΔR2 join R3
• Wrapper calculates (ΔR2 join R3)
• Relational Concurrency Conflict: ΔR3 can occur while the wrapper calculates (ΔR2 join R3)
• Mediator cannot correct this
[Figure: IS1 holding R1; IS2 holding R2 and R3.]

The MEDWRAP Approach: MEDiator and WRAPper Compensation
[Figure: MEDWRAP architecture.]
• Data Warehouse: view V
• Mediator (Multi-Source VM Algorithm)
• One Wrapper (Single-Source VM Algorithm) per source
• Sources: IS1 (R11), IS2 (R21, R22, R23), IS3 (R31, R32)

Conditions for VM algorithms
• Both Mediator and Wrapper VM algorithms have to be compensation-based
• If both algorithms maintain complete consistency, MEDWRAP maintains complete consistency
• If either algorithm maintains lower than complete consistency, MEDWRAP maintains the consistency level of the weaker algorithm
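
The last two conditions amount to taking the weaker of two levels. A minimal sketch, assuming the usual ordering of consistency levels from the view maintenance literature (convergence, weak, strong, complete); the slides name only complete consistency, so the other level names and the function name `medwrap_consistency` are assumptions made for illustration.

```python
from enum import IntEnum


class Consistency(IntEnum):
    """Consistency levels, weakest first (names assumed, not taken from the slides)."""
    CONVERGENCE = 1
    WEAK = 2
    STRONG = 3
    COMPLETE = 4


def medwrap_consistency(mediator_level, wrapper_level):
    """Return the level of the combined system: the weaker of the two algorithms."""
    return min(mediator_level, wrapper_level)


both_complete = medwrap_consistency(Consistency.COMPLETE, Consistency.COMPLETE)
one_weaker = medwrap_consistency(Consistency.COMPLETE, Consistency.STRONG)
```

Here `both_complete` is `Consistency.COMPLETE` and `one_weaker` is `Consistency.STRONG`.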

Global Query Processing
• VM algorithm at wrapper does update-processing only
• For this it sends queries to the IS (LMQ1, answered by LMQR1)
• What if it receives a query from the mediator (GMQ)?
• Global Query Processing needed at wrapper
[Figure: Mediator (V) above a Wrapper (Single-Source VM) above an IS holding R1, R2, R3.]

Common Concurrency Conflicts
[Figure: timing diagram for order of messages: GMQ, LMQ1, LMQR1, GMQR, with updates ΔR1, ΔR2, ΔR3.]
• ΔR2 and ΔR3 are conflicts for LMQ1 and LMQR1
• ΔR2 and ΔR3 are also conflicts for GMQ1 and GMQR1
• Hence, the same updates can cause conflicts in both Local and Global Query Processing
Hence we need
• A common queue for all messages at each wrapper
• A common processor for queries and updates
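
The two requirements above can be sketched as a single FIFO queue drained by a single processing routine. This is a toy model, not the MEDWRAP implementation: the class name `Wrapper`, the message kinds `"update"` and `"query"`, and the rule that every update still queued behind a message counts as a potential conflict for it are all assumptions made for illustration.

```python
from collections import deque


class Wrapper:
    """Toy wrapper: one queue and one processor for both updates and queries."""

    def __init__(self):
        self.queue = deque()

    def receive(self, kind, name):
        """Enqueue a source update or a mediator query in arrival order."""
        if kind not in ("update", "query"):
            raise ValueError(f"unknown message kind: {kind}")
        self.queue.append((kind, name))

    def process_next(self):
        """Handle the oldest message and report updates queued behind it."""
        kind, name = self.queue.popleft()
        # Assumption: updates that arrived later may conflict with this message.
        conflicts = [n for k, n in self.queue if k == "update"]
        action = "local maintenance" if kind == "update" else "global query answer"
        return action, name, conflicts


w = Wrapper()
w.receive("update", "dR1")   # triggers local maintenance (LMQ1)
w.receive("query", "GMQ")    # query from the mediator
w.receive("update", "dR2")
w.receive("update", "dR3")

first = w.process_next()
second = w.process_next()
```

Because both kinds of message pass through the same queue, `first` and `second` report the same later updates (`dR2` and `dR3`), which mirrors the slide's point that the same updates conflict with both local and global query processing.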

MEDWRAP: At Wrapper
[Figure: wrapper placed between the mediator and the source (IS), containing an LQueue and an LProcessor. Messages shown: GMQ, GMQR, ΔR, LMQ, LMQR, CQ, CQR. Legend: Query, Update, Result.]

MEDWRAP Example

Mediator: V = Πw (IS1 join IS2), Initial V = null
IS1 = ΠB (R11)
IS2 = Πw (R21 join R22 join R23)

       R11 (A,B)   R21 (W,X)   R22 (X,Y)   R23 (Y,Z)
Ini    [5,4]       [1,2]
Δ      -[5,4]      +[4,2]      +[2,5]      +[5,3]

Wrapper1: ΔR11 = -[5,4], ΔIS1 = -[4]
Mediator: GMQ = Πw (ΔIS1 join IS2)
Wrapper2: ΔR21 = +[4,2]
Message order at Wrapper2: GMQ, +[2,5], +[5,3], GMQR
Incorrect GMQR = ([1],[4])
CQ22 = Πw (R21 join [2,5] join R23)
CQR22 = ([1],[4])
Correct GMQR = (Incorrect GMQR - All CQRs) = null
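
The compensation arithmetic at Wrapper2 can be checked with a short sketch. It is a simplification under several assumptions: relations are Python sets of tuples (set semantics, inserts only), +[4,2] is taken to be in R21 before GMQ arrives, only the IS2 part of GMQ is evaluated, and the second compensation query (`cqr23`, not shown on the slide) is evaluated against R22 without the already compensated update. The helper name `project_w` is hypothetical.

```python
def project_w(r21, r22, r23):
    """Return the W values of R21(W,X) join R22(X,Y) join R23(Y,Z)."""
    return {
        w
        for (w, x) in r21
        for (x2, y) in r22
        for (y2, _z) in r23
        if x == x2 and y == y2
    }


r21 = {(1, 2), (4, 2)}    # initial [1,2] plus the earlier update +[4,2]
r22_at_gmq = set()        # R22 when GMQ arrived
r23_at_gmq = set()        # R23 when GMQ arrived
delta_r22 = {(2, 5)}      # concurrent update +[2,5]
delta_r23 = {(5, 3)}      # concurrent update +[5,3]

# The answer GMQ should have seen, on the state at its arrival.
expected = project_w(r21, r22_at_gmq, r23_at_gmq)

# The source evaluates the query after both concurrent updates were applied.
r22_now = r22_at_gmq | delta_r22
r23_now = r23_at_gmq | delta_r23
incorrect_gmqr = project_w(r21, r22_now, r23_now)

# Compensation queries, one per concurrent update.
cqr22 = project_w(r21, delta_r22, r23_now)      # CQ22 from the slide
cqr23 = project_w(r21, r22_at_gmq, delta_r23)   # assumed second compensation query

correct_gmqr = incorrect_gmqr - cqr22 - cqr23
```

With these inputs `incorrect_gmqr` and `cqr22` are both `{1, 4}`, `cqr23` is empty, and `correct_gmqr` is the empty set, in line with the null result on the slide.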

MEDWRAP Example (Contd.)
Mediator: V = Πw (IS1 join IS2 join IS3) (V = final correct GMQR)
Wrapper1 (R11) to Mediator: ΔIS1 = -[4]
Mediator to Wrapper2 (R21, R22, R23): GMQ = Πw (ΔIS1 join IS2)
Wrapper2 to Mediator: GMQR = null, ΔIS2 = [1],[4]

Advantages of MEDWRAP
• Allows sources to be semi-autonomous: they do not participate in DW maintenance beyond reporting updates and processing queries
• Benefit of software reuse: techniques from existing VM algorithms are used in the MEDWRAP design
• Generic for any two compensation-based VM algorithms

Evaluation of MEDWRAP
Study alternative approaches
• Alternative 1: “Materialized MRE” Approach. Stores an IS View at each wrapper, built by encapsulating all relations in the source
• Alternative 2: “Rel-as-Source” Approach (Simplistic). If there are multiple relations per source, treats each relation as a separate source, processing separate queries/updates from each source
Two critical resources for comparison
• Space: Storage at wrappers
• Time: Processing of queries and updates

Comparative Cost Analysis
[Figure: Space and Time Cost. x-axis: Space (Storage of Views), 0 to 120; y-axis: Time (Delays in Processing), 0 to 300; series: MEDWRAP, MRE, Rel-as-Source.]
• Rel-as-Source: Needs too much processing time
• MRE: Needs too much storage space
• MEDWRAP: Good utilization of space/time on the whole

Related Work
• RV: Re-computation of View (Traditional)
  Rewrites all tuples, not only affected ones
  Highly inefficient if done for every update
• SM: Self Maintenance (Quass et al., 96; Gupta et al., 96)
  DW stores copies of source relations for maintenance
  Huge storage at warehouse
• Version Control (Kulkarni et al., 99; Chen et al., 00)
  Versions of transactions / tuples stored at wrappers
  Latest version used to answer queries
  Huge storage at source wrappers

Conclusions
• MEDWRAP solves Concurrency Conflicts in Incremental VM in a multi-source, multi-relation DW
  Allows sources to be semi-autonomous
  Benefit of software re-use
  Generic for any 2 compensation-based VM algorithms
• Provides a cost-efficient solution
  Materialized MRE Approach: Too space-consuming
  Rel-as-Source Approach: Too time-consuming
  MEDWRAP Approach: Good use of space and time overall
• MEDWRAP is being implemented and integrated into DyDa (Dynamic Data Warehousing system) at WPI