Self Maintenance of materialized XML views with non-cooperative data sources
DBDBD – 2006
Virginie Sans –ETIS/CNRS Laboratory– MIDI Team
Summary
1) Issue and context
   1) Pre-requisite
   2) The issue
   3) Context
   4) State of the art
2) Contributions
   1) View computation with the XAlgebra
   2) Detection and identification of source updates
   3) View maintenance
   4) Applications and performances
Conclusion
Mediation architecture
Introduced by Wiederhold
The architecture: mediator, wrappers, sources, query language
1.1 Pre-requisite
Mediation architecture (continued)
Mediator
• Handles the user request: canonization, atomization
• Sends each atomic request to a source via its wrapper
Wrappers
• Translate a query coming from the mediator into a query in the native language of the web source
• Give the mediator an answer in XML
Data sources
• Heterogeneous
• Distributed
• In a web context: partially unavailable
[Figure: Mediator, Wrapper and Source; labels: atomic request, XML, SQL, tuples]
Views
What about views?
• Data integration
• Access control, security
• Data warehouses
Why?
• Interoperability
• Heterogeneous data
Materializing views
• Fast access to complex queries
• Better availability
• Request optimization
[Figure: materialized views at the mediator, above wrappers over RDB, SQL and HTML sources]
Issue: View maintenance
When data sources are updated, the view must be kept consistent with them.
Maintenance process
• Recomputation: recompute the whole view from scratch
• Incremental maintenance: compute the changes to the view in response to changes to the base sources
[Figure: Source t gives View t by view computation; after an update, View t+1 is obtained either by recomputation from Source t+1 or by incremental maintenance of View t]
1.2 Issue
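The two strategies can be contrasted with a small Python sketch. It is an illustration only, not the XAlgebra implementation: the view is a hypothetical "cheap books" selection over (title, price) records, the prices and the inserted record are made up, and `compute_view` and `maintain_incrementally` are names invented for the example.

```python
# Hypothetical view: titles of books cheaper than 60 (selection + projection).
def compute_view(source):
    """Recomputation: evaluate the whole view from scratch."""
    return [title for title, price in source if price < 60]

def maintain_incrementally(view, inserted, deleted):
    """Incremental maintenance: apply only the source changes to the view."""
    view = list(view)
    for title, price in deleted:
        if price < 60 and title in view:
            view.remove(title)
    for title, price in inserted:
        if price < 60:
            view.append(title)
    return view

# Made-up source states at t and t+1.
source_t = [("TCP/IP Illustrated", 65.95), ("Data on the Web", 39.95)]
view_t = compute_view(source_t)

inserted = [("XML Basics", 25.00)]
deleted = [("Data on the Web", 39.95)]
source_t1 = [r for r in source_t if r not in deleted] + inserted

# Both strategies must agree; only the incremental one avoids rereading the source.
view_t1 = maintain_incrementally(view_t, inserted, deleted)
```

The incremental path needs the list of changes, which a non-cooperative source does not provide; identifying those changes is what the rest of the talk addresses.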
Context: semi-structured XML data
XML views are materialized at the mediator level
Hierarchical data
No schema, except the query schema
<bib>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
  </book>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <price>39.95</price>
    <title>Data on the Web</title>
    <title>Données sur le Web</title>
  </book>
</bib>
1.3 Context
Context: XQuery
XQuery
• Dedicated to XML data
• Relational operators (projection, selection, join, union, ...)
• XML operators (tagging, unnesting, aggregation, ...)
• FLWOR syntax (pronounced "flower")
<result>
  {
    for $b in doc("bib.xml")/bib/book
    let $a := $b/author
    where $b/price/text() < 60
    order by $b/year
    return <cheap_book>{ $b/title }</cheap_book>
  }
</result>
FLWOR syntax
  for $var in forest [, $var in forest]*
  let $var := subtree
  where condition
  return result
Context: Other specificities
Views are computed using the XAlgebra (cf. View computation)
Wrappers have limited resources
• Few computation possibilities
• A component named the logger stores the last modification date and a checksum of each source
Non-cooperative web sources
• No information about their updates
• Not always available
• Not enough granularity
State of the art (1/2)
Relational views
• Not fit for semi-structured data
Abiteboul et al.
• OEM (Object Exchange Model)
• LOREL language
• Some operators are missing
VOX (Rainbow team)
• Needs to know the exact position in the XML tree where the update was made
1.4 State of the art
State of the art (2/2)
Cobena et al.
• XDiff: an algorithm for comparing XML files
• Needs a copy of the source at the wrapper level
Bonnet et al. / Papadimos et al.
• Parachute queries
• Mutant query plans
What about when sources are really unavailable?
Our goal:
• Reduce source accesses to the minimum
• Use the information stored in the view
View maintenance: The process
View computation
• An algebraic approach using the XAlgebra
• Extension of the XAlgebra (identifiers)
Update detection
• Comparison of the source's information with that stored in the logger
Update identification
• Recovering process
• Diff algorithm
View maintenance
• Propagation rules for each operator
2.1 View computation
View computation
[Figure: view computation steps]
The XAlgebra data model
Data structures: XRelation, XTuple, XAttributes
Operators: XSource, XConstruct, XUnion, ...
XSource operator: Step 1
XQuery analysis
  for $f in doc("informations.xml")/personnes/personne
  let $a := $f/nom
  where $f/age < 27 and $a = "Durand"
  return (<nom>{$a}</nom>, <prenom>{$f/prenom}</prenom>)
We obtain:
• A context
• A set of patterns
Path extraction: optional, mandatory, hidden
XSource operator: Steps 2 and 3
From XML sub-trees to the tabular structure
• 1 sub-tree => 1 XTuple
• XRelation = set of XTuples
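The mapping from sub-trees to a tabular structure can be sketched with Python's standard `xml.etree.ElementTree`. This is a simplified illustration under stated assumptions: an XTuple is modelled as a plain dict, the context and patterns are given as simple element paths, and `to_xrelation` is a hypothetical helper, not an XAlgebra operator.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title><title>Données sur le Web</title></book>
</bib>
"""

def to_xrelation(xml_text, context, patterns):
    """Return one dict (standing in for an XTuple) per sub-tree matched by `context`.

    Each attribute holds the list of text values found under the pattern,
    so a missing path gives [] and a repeated path gives several values.
    """
    root = ET.fromstring(xml_text)
    xrelation = []
    for subtree in root.findall(context):
        xtuple = {p: [(e.text or "").strip() for e in subtree.findall(p)]
                  for p in patterns}
        xrelation.append(xtuple)
    return xrelation

# Context: each book under the root; patterns: title and price.
xrelation = to_xrelation(BIB, "book", ["title", "price"])
```

Keeping a list of values per attribute is one way to cope with hierarchical data that has no fixed schema: the second book has no price and the third has two titles, yet each still yields exactly one tuple.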
XSource operator: Extending the algebra
Adding identifiers: XTids
An XTid is a set of pairs:
{(idsource, idfragment), ...}
View computation: XOperator
XProject
View computation: XOperator
XJoin
XTid propagation: card(XTID) > 1 for some nodes
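XTid propagation through a join can be sketched as follows. The representation is an assumption made for the example: each XTuple is a dict mapping an attribute to a (value, XTid set) pair, `xjoin` is a hypothetical equi-join, and the sample data (including the `ville` attribute) is made up.

```python
def xjoin(xr1, xr2, attr):
    """Equi-join two relations on `attr`, propagating XTids per value.

    Each XTuple is a dict: attribute -> (value, frozenset of (idsource, idfragment)).
    """
    result = []
    for t1 in xr1:
        for t2 in xr2:
            v1, ids1 = t1[attr]
            v2, ids2 = t2[attr]
            if v1 != v2:
                continue
            joined = {**t1, **t2}
            # The join value comes from both sources, so its XTid set is the union.
            joined[attr] = (v1, ids1 | ids2)
            result.append(joined)
    return result

# Made-up fragments: fragment 11 of source 1 and fragment 4 of source 2.
xr1 = [{"nom": ("Durand", frozenset({(1, 11)})),
        "prenom": ("Marcel", frozenset({(1, 11)}))}]
xr2 = [{"nom": ("Durand", frozenset({(2, 4)})),
        "ville": ("Cergy", frozenset({(2, 4)}))}]
view = xjoin(xr1, xr2, "nom")
```

After the join, the `nom` value carries two pairs while `prenom` and `ville` still carry one, which is how a view value can later be traced back to every source fragment it depends on.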
Update detection and identification
Detection
Comparison of the source's information with that stored in the logger:
• The last modification date
• The checksum of the source
Identification
• Partial recovery of the source information based on XTids
• Comparison of the recovered XRelation with the updated source
• Δ computation
2.2 Update detection and identification
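The detection step can be sketched in a few lines of Python. Several points are assumptions, since the slides give none of these details: the logger entry is modelled as a dict, SHA-256 is used as the checksum, and the policy (check the date first, then confirm with the checksum) is only one possible choice. `fingerprint` and `source_updated` are hypothetical names.

```python
import hashlib

def fingerprint(content: bytes, last_modified: str) -> dict:
    """What the (hypothetical) logger stores for one source."""
    return {"last_modified": last_modified,
            "checksum": hashlib.sha256(content).hexdigest()}

def source_updated(logged: dict, content: bytes, last_modified: str) -> bool:
    """Assumed policy: cheap date comparison first, checksum to confirm."""
    if last_modified == logged["last_modified"]:
        return False
    return hashlib.sha256(content).hexdigest() != logged["checksum"]

logged = fingerprint(b"<bib><book/></bib>", "2006-01-10")

# Same date: nothing to do.
unchanged = source_updated(logged, b"<bib><book/></bib>", "2006-01-10")
# New date but identical content: no real update.
touched = source_updated(logged, b"<bib><book/></bib>", "2006-02-01")
# New date and new content: an update to identify.
updated = source_updated(logged, b"<bib><book/><book/></bib>", "2006-02-01")
```

Detection only says that something changed. Finding out what changed is the identification step.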
XRecover
Step 1: project XRv on the patterns of XR1
Step 2: filter the XTuple values
Step 3: re-order the XTuples
• XTidUnnest: XTuples are unnested depending on their XTids
• XTidNest: XTuples are nested by their XTids, then re-ordered
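One possible reading of the three XRecover steps is sketched below in Python. It is not the authors' operator. It assumes that a view XTuple maps each attribute to a list of (value, XTid set) pairs, that filtering means keeping the values tagged with the source being recovered, and that sorting on idfragment restores the source order. `xrecover` and the sample data are invented for the illustration.

```python
from collections import defaultdict

def xrecover(view, src, patterns):
    """Partially rebuild the relation of source `src` from a materialized view.

    view: list of XTuples, each a dict attribute -> list of (value, xtids),
    where xtids is a set of (idsource, idfragment) pairs.
    """
    by_fragment = defaultdict(lambda: defaultdict(list))
    for xtuple in view:
        for attr in patterns:                        # step 1: projection
            for value, xtids in xtuple.get(attr, []):
                for idsource, idfragment in xtids:   # step 3: unnest on XTids
                    if idsource != src:              # step 2: filter values
                        continue
                    values = by_fragment[idfragment][attr]
                    if value not in values:
                        values.append(value)
    # step 3 (continued): nest by XTid and re-order by fragment identifier
    return [(frag, dict(by_fragment[frag])) for frag in sorted(by_fragment)]

# Made-up view over two sources; source 2 contributes the "ville" values.
view = [
    {"nom": [("Durand", {(1, 11), (2, 4)})],
     "prenom": [("Marcel", {(1, 11)}), ("Eric", {(1, 11)})],
     "ville": [("Cergy", {(2, 4)})]},
    {"nom": [("Leclerc", {(1, 3), (2, 7)})],
     "prenom": [("John", {(1, 3)})],
     "ville": [("Paris", {(2, 7)})]},
]
recovered = xrecover(view, 1, ["nom", "prenom"])
```

The recovery is partial by construction: only what the view kept from source 1 can be rebuilt, which is why the result is compared against a filtered version of the updated source.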
Update identification: Comparison algorithm
Comparison of XR1_{t+1} with XR_t'
• XR1_{t+1} is the XRelation obtained by applying XSource to source 1 at t+1
• XR_t' is the partial recovery of the XRelation of source 1 at t
Remark: XR1_{t+1} can also be filtered using predicates before comparison
The Diff algorithm is based on Unix diff (Hunt & McIlroy). The symbol compared is the XTuple instead of the line.
Update identification: Diff algorithm
Delta with hunks:
• Insert(pos; XTuple)
• Delete(pos; XTuple)
• Replace(pos; XTuple_old, XTuple_new)
Insert(2, {{Leclerc, Avide, {(1,3)}}, {John, Avide, {(1,3)}}})
Delete(4, {{Durand, Avide, {(1,11)}}, {Marcel, Avide, {(1,11)}}, {Eric, Avide, {(1,11)}}})
Etc.
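A diff whose symbol is the XTuple rather than the line can be sketched with Python's standard `difflib.SequenceMatcher`. This is a stand-in: `difflib` uses its own matching algorithm, not Hunt & McIlroy, and the hunk format, the 0-based positions and the `xdiff` name are assumptions made for the example. XTuples are modelled as hashable tuples.

```python
from difflib import SequenceMatcher

def xdiff(old, new):
    """Compare two sequences of hashable XTuples and return a list of hunks.

    Positions are 0-based indexes into `old`.
    """
    hunks = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            hunks.append(("insert", i1, new[j1:j2]))
        elif tag == "delete":
            hunks.append(("delete", i1, old[i1:i2]))
        elif tag == "replace":
            hunks.append(("replace", i1, old[i1:i2], new[j1:j2]))
    return hunks

# Made-up relations: the recovered state at t and the source state at t+1.
recovered_t = [("Dupont", "Avide"), ("Martin", "Avide"), ("Durand", "Avide")]
source_t1 = [("Dupont", "Avide"), ("Leclerc", "Avide"), ("Martin", "Avide")]
delta = xdiff(recovered_t, source_t1)
```

Each hunk then feeds the maintenance rules: an insert hunk triggers the insertion rules, a delete hunk the deletion rules, and a replace hunk the modification rules.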
Maintenance rules: From Delta to view maintenance
Case of a deletion: delete(pos, xtuple)
The deleted XTuple is associated with an XTid {(x)} such that card = 1. Each XValue of the view has a set of XTids, noted XTID.
1) We delete the pair x from the XTID of each XValue such that x ∈ XTID
   Example: the XTuple whose XTid is x = (1,3) has been deleted. The XValue {Alain}(1,3);(1,4) becomes the XValue {Alain}(1,4)
2) We delete each XValue such that card(XTID) = 0
   Example: if the XValue {Alain}(1,3) becomes the XValue {Alain}, we delete the XValue entirely
3) If the XValue was concerned by the predicate, we delete the XTuple (join and restriction case)
2.3 View maintenance
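The three deletion rules can be sketched in Python. The data layout is an assumption for illustration: a view XTuple maps each attribute to a list of (value, XTid set) pairs, and `predicate_attrs` names the attributes used by a join or restriction predicate. `propagate_delete` is a hypothetical name.

```python
def propagate_delete(view, x, predicate_attrs=()):
    """Apply the deletion of the source XTuple identified by pair `x` to the view."""
    new_view = []
    for xtuple in view:
        kept, drop_tuple = {}, False
        for attr, xvalues in xtuple.items():
            remaining = []
            for value, xtids in xvalues:
                xtids = xtids - {x}                 # rule 1: remove the pair x
                if xtids:
                    remaining.append((value, xtids))
                elif attr in predicate_attrs:       # rule 3: predicate value lost
                    drop_tuple = True
                # rule 2: a value with an empty XTID is simply not kept
            kept[attr] = remaining
        if not drop_tuple:
            new_view.append(kept)
    return new_view

# {Alain} depends on fragments (1,3) and (1,4); {John} only on (1,3).
view = [{"prenom": [("Alain", {(1, 3), (1, 4)}), ("John", {(1, 3)})]}]
after = propagate_delete(view, (1, 3))

# Here "nom" is used by the predicate, so losing it removes the whole XTuple.
view2 = [{"nom": [("Leclerc", {(1, 3)})], "prenom": [("John", {(1, 3)})]}]
after2 = propagate_delete(view2, (1, 3), predicate_attrs=("nom",))
```

No source access is needed for a deletion: the XTids stored in the view are enough to decide what to remove.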
Maintenance rules: From Delta to view maintenance
Case of an insertion: insert(pos; xtuple)
1) A new XTid is created. Goal: preserve the XTuple order for a later recovery
2) Depending on the operator, we obtain various maintenance instructions
• Projection: insertion of the projection of the xtuple
• Select: if the xtuple satisfies the predicate => insertion
• Join XR1 * XR2: computation of XT = xtuple * XR2. If XT ≠ ∅ => insertion of XT. Depending on the predicate, we can query either XR2 or its recovery
• Union and Intersect: duplicates are kept. Union is handled like a Select whose predicate is always true; Intersect is handled like a Join
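The per-operator insertion rules can be sketched as small Python functions. For brevity XTuples are plain dicts without XTids, and the three function names and the sample data are invented for the example. In the join case, `xr2` may be either the second source or its recovery from the view.

```python
def insert_project(view, xtuple, attrs):
    """Projection: insert the projection of the new XTuple."""
    return view + [{a: xtuple[a] for a in attrs}]

def insert_select(view, xtuple, predicate):
    """Select: insert the XTuple only if it satisfies the predicate."""
    return view + [xtuple] if predicate(xtuple) else view

def insert_join(view, xtuple, xr2, attr):
    """Join XR1 * XR2: compute XT = xtuple * XR2 and insert it if non-empty."""
    xt = [{**xtuple, **other} for other in xr2 if other[attr] == xtuple[attr]]
    return view + xt

# Made-up inserted XTuple and second relation.
x = {"nom": "Durand", "age": 25}
xr2 = [{"nom": "Durand", "ville": "Cergy"}, {"nom": "Martin", "ville": "Paris"}]

projected = insert_project([], x, ["nom"])
selected = insert_select([], x, lambda t: t["age"] < 27)
joined = insert_join([], x, xr2, "nom")
```

Only the join needs data beyond the inserted XTuple itself, which is where missing information and source access become an issue.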
Maintenance rules: From Delta to view maintenance
Case of a modification: Replace(pos; XTuple_old, XTuple_new)
XTuple modification = XValue modification OR XValue deletion followed by insertion
• Project and Union: modification of the concerned XValues
• Select: if the modification applies to an XValue that must satisfy the condition, deletion of the XTuple; else modification of the XValues
• Intersect: handled like Select
• Join: deletion followed by insertion
Maintenance rules: Missing information
Missing information (join?)
• Source recovery
• Multi-view strategy
• Source request
Goal: limited access to the sources
Example: View = S1 * S2
• xtuple x is inserted in S1
• Computation of S2'
• Insertion: x * S2'
[Figure: materialized views at the mediator, above wrappers over SQL and HTML sources]
Applications
• On the web
  When necessary sources are unavailable
  Goal: limited access to them
• With sensors (ANR project)
  With wireless sensors
  Goal: preserve power resources
2.4 Applications and performances
Performances
• Comparison between XRecover and recomputation
Contributions
Maintenance process in the context of non-cooperative web sources
Contribution to the XAlgebra
• New operators: XRecover, XTidUnnest, XTidNest
• New data structure: XTids
Future work
• Order-sensitive view maintenance
• A better Diff algorithm
Conclusion
Thanks for your attention!
Any questions?