Ling Wang, Mukesh Mulchandani
Advisor: Elke A. Rundensteiner Co-Advisor: Kathi Fisler
Updating XML Views over Relational Data
Outline
• Motivation (Why?)
• Background:
  - XML View, Update Extension for XQuery
• Problem Definition:
  - Correct Translatability
  - General Classification of XML View Update (XVUP)
  - Typical Case Study: RUP, PUP
• Update Strategy for Round-Trip Update Problem (RUP)
• Update Strategy for Publish-based Update Problem (PUP)
• XVUP System Architecture
• Contribution
• Related Work
Motivation
• XML is a standard for information exchange over the Internet, but RDBMSs are mature
  - Mature query optimization techniques
  - High query performance
• Research topics on handling XML with relational technology:
  - Publishing XML over relational databases: SilkRoute (AT&T), XPERANTO (IBM), RAINBOW
  - Storing XML into RDBs: LegoDB (Bell Labs), RAINBOW
• Support update features
• Our work will focus on content updates using the XQuery language
What should we do?
• Step 1: Expressing updates in XQuery
  - Extension to XQuery
  - Extension to the XML query parser to support update features
• Step 2: Update the RDB through the XML view
  - Keep consistency
  - Translate XML view updates (XQuery) into relational table updates (SQL)
[Figure: RDBMS, View Query, XML View, XML Update Query, SQL Update, RDBMS]
Where we are
Motivation (Why?)
Background:
XML View, Update Extension for XQuery
• Problem Definition:
- Correct Translatability
- General Classification of XML View Update(XVUP)
- Typical Case Study: RUP, PUP
• Update Strategy for Round-Trip update problem ( RUP)
• Update Strategy for Publish-based update problem ( PUP)
• XVUP System Architecture
• Contribution
• Related Work
XML Schema (Bib.xsd)
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="bib">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="bookid" type="xs:string" nillable="false"/>
              <xs:element name="title" type="xs:string" nillable="false"/>
              <xs:element name="author">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="aname" type="xs:string" maxOccurs="unbounded"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="prices" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="source" type="xs:string"/>
                    <xs:element name="currency" type="xs:string"/>
                    <xs:element name="value" type="xs:double"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="publisher">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="pname" type="xs:string"/>
                    <xs:element name="location" type="xs:string"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="review" type="xs:string" nillable="true"/>
            </xs:sequence>
            <xs:attribute name="year" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XML document (Bib.xml)
<bib>
  <book year="1994">
    <bookid>98001</bookid>
    <title>TCP/IP Illustrated</title>
    <author>
      <aname>W. Stevens</aname>
    </author>
    <prices>
      <source>www.amazon.com</source><currency>USD</currency><value>65.95</value>
    </prices>
    <publisher>
      <pname>Addison-Wesley</pname><location>San Francisco</location>
    </publisher>
    <review>One of the best books on TCP/IP.</review>
  </book>
  <book year="1992">
    <bookid>98002</bookid>
    <title>Advanced Programming in the Unix environment</title>
    <author>
      <aname>Bram Stoker</aname>
    </author>
    <prices>
      <source>www.amazon.com</source><currency>USD</currency><value>65.95</value>
    </prices>
    <prices>
      <source>www.bn.com</source><currency>USD</currency><value>64.75</value>
    </prices>
    <publisher>
      <pname>Addison-Wesley</pname><location>Boston</location>
    </publisher>
    <review>A clear and detailed discussion of UNIX programming.</review>
  </book>
  <book year="2000">
    <bookid>98003</bookid>
    <title>Data on the Web</title>
    <author>
      <aname>Serge Abiteboul</aname><aname>Peter Buneman</aname><aname>Dan Suciu</aname>
    </author>
    <prices>
      <source>www.amazon.com</source><currency>DEM</currency><value>34.95</value>
    </prices>
    <publisher>
      <pname>Morgan Kaufmann Publishers</pname><location>New York</location>
    </publisher>
    <review>A very good discussion of semi-structured database systems and XML.</review>
  </book>
</bib>
XQuery Example
XQuery:
<books>
FOR $book IN document("bib.xml")/book
LET $titles = $book/title
WHERE $book/@year <= 2000
RETURN
  $book/title,
  <total>count($titles)</total>
</books>
Query Result:
<books>
  <title>TCP/IP Illustrated</title>
  <title>Advanced Programming in the Unix environment</title>
  <total>2</total>
</books>
XQuery Update Grammar
FOR $binding1 IN Xpath-expr, …
LET $binding := Xpath-expr, …
WHERE predicate1, …
updateOp, …

where updateOp is defined as:
  UPDATE $binding { subOp {, subOp}* }
and subOp is:
  DELETE $child |
  RENAME $child TO new_name |
  INSERT ( $bind [BEFORE | AFTER $child]
         | new_attribute(name, value)
         | new_ref(name, value)
         | content [BEFORE | AFTER $child] ) |
  REPLACE $child WITH ( new_attribute(name, value)
                      | new_ref(name, value)
                      | content ) |
  FOR $sub_binding IN Xpath-subexpr, …
  WHERE predicate1, …
  updateOp
Update query example: Insert Update
FOR $book IN document("bib.xml")/book
LET $author := $book/author
WHERE $book/title = "TCP/IP Illustrated"
UPDATE $author {
  INSERT <aname>"Peter Naughton "</aname>
}
Updated document:
<bib>
  <book year="1994">
    <bookid>98001</bookid>
    <title>TCP/IP Illustrated</title>
    <author>
      <aname>W. Stevens</aname><aname>"Peter Naughton "</aname>
    </author>
    <prices>
      <source>www.amazon.com</source><currency>USD</currency><value>65.95</value>
    </prices>
    <publisher>
      <pname>Addison-Wesley</pname><location>San Francisco</location>
    </publisher>
    <review>One of the best books on TCP/IP.</review>
  </book>
  ……
</bib>
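The effect of the insert update can be reproduced directly on an XML document. The following is a minimal sketch in Python using the standard `xml.etree.ElementTree` module; it is not the XQuery update engine described in these slides. The helper name `insert_author` and the shortened sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the bib document (only the elements the update touches).
BIB_XML = """<bib>
  <book year="1994">
    <bookid>98001</bookid>
    <title>TCP/IP Illustrated</title>
    <author><aname>W. Stevens</aname></author>
  </book>
</bib>"""


def insert_author(bib_xml, title, new_name):
    """Hypothetical helper: mimic
    FOR $book ... WHERE $book/title = title UPDATE $author { INSERT <aname> }."""
    root = ET.fromstring(bib_xml)
    for book in root.findall("book"):            # FOR $book IN .../book
        if book.findtext("title") == title:      # WHERE $book/title = ...
            author = book.find("author")         # LET $author := $book/author
            aname = ET.SubElement(author, "aname")  # INSERT as last child
            aname.text = new_name
    return root


updated = insert_author(BIB_XML, "TCP/IP Illustrated", "Peter Naughton")
names = [a.text for a in updated.iter("aname")]
```

After the call, `names` holds the original author followed by the inserted one, matching the updated document shown above.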
Where we are
Motivation (Why?)
Background:
XML View, Update Extension for XQuery
Problem Definition:
- Correct Update Translatability
- General Classification of XML View Update(XVUP)
- Typical Case Study: RUP, SUP
• Update Strategy for Round-Trip update problem ( RUP)
• Update Strategy for Publish-based update problem ( SUP)
• XVUP System Architecture
• Contribution
• Related Work
Correct Update Translatability
• No side effects
• One-step changes
  - Each database tuple is affected by at most one step of the update operation
  - Implications: no order between update operations; the same table could be affected several times
• Minimal changes
  - No valid translation is a subset of the current translation
  - No extraneous updates
• Replacements cannot be simplified
  - If two replacements give the same result, pick the simpler one
  - Replace the minimum attribute set
• No insert-delete pairs
  - A replace is cheaper than an insert/delete pair
General Classification of XML View Update (XVUP)
Four dimensions of XVUP to be studied:
• Information Dimension:
  - The amount of information available for XVUP
    e.g.: Constraints, Keys, Virtual View Definition, Underlying Relation
• Modification Dimension:
  - What modifications can the XVUP handle?
    e.g.: Content (Insert/Delete/Replace/Move/Rename…), Schema
• Language Dimension:
  - View form
    e.g.: Algebra, XML Query language, SPJ, Duplicates, Recursion, Aggregation
• Instance Dimension:
  - Requirements for the DB
    e.g.: BCNF, 3NF, others?
General Classification of XVUP
[Figure: four axes.
 Information: Virtual View Definition, Underlying Relation, RDB Schema, Integrity Constraints (Key, FK), Local Constraints (Not Null, domain).
 Modification: Deletion, Insertion, Replacement, Rename, Move, Set of each, Group Update, Schema Change.
 Language: Duplicates, Aggregation, Recursion, Hierarchy Consistency, Key Exposition, Non-correlation predicate attribute exposition, Correlation predicate attribute exposition.
 Instance: BCNF, 3NF, 2NF, 1NF.]
• Information dimension
  - Why Local Constraints? ---- Valid Update
    An update to an XML view is a valid update iff it never violates any XML semantic constraint.
  - Why Integrity Constraints? ---- Update Propagation
    Knowledge of the dependencies in the RDB keeps global integrity
• Instance dimension
  - BCNF ---- preserves data dependencies
• Modification dimension
  - Content update (insertion/deletion/replacement)
• Only think about the language dimension!
Why?
XML Schema Graph (XSG)
- Records the hierarchical information of an XML view or XML document
[Figure: XSG of the bib schema. bib has child book; book has children bookid, title, author, prices, publisher, review and attribute year; author has child aname; prices has children source, currency, value; publisher has children pname, location. Edges are annotated with cardinalities such as (0,n), (1,n), (1,1), (0,1).]
Duplicate
- Two vertices in the XSG are exposed from the same relational attribute.
- Partial updates touching duplicate elements are not translatable.
  Why? They are ambiguous or leave the underlying relation inconsistent.
[Figure: a view with book, publisher, author, pname, aname in which title appears twice; both copies are exposed from Book/title, and an update touches one of them.]
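The duplicate rule above can be sketched as a small check. This is a minimal illustration in Python, assuming the view is summarised as a mapping from view paths to the relational attribute each path is exposed from; the function names and the sample mapping are hypothetical and not part of the original system.

```python
from collections import defaultdict


def find_duplicates(exposed_from):
    """Group view paths by source attribute; keep attributes exposed more than once."""
    by_attr = defaultdict(list)
    for view_path, rel_attr in exposed_from.items():
        by_attr[rel_attr].append(view_path)
    return {attr: sorted(paths) for attr, paths in by_attr.items() if len(paths) > 1}


def partial_update_allowed(touched_paths, exposed_from):
    """A partial update is rejected as soon as it touches a duplicated view path."""
    duplicated = {p for paths in find_duplicates(exposed_from).values() for p in paths}
    return not (set(touched_paths) & duplicated)


# Hypothetical view: title is exposed twice from the same attribute Book.title.
view_mapping = {
    "book/publisher/title": "Book.title",
    "book/publisher/pname": "Book.pname",
    "book/author/title": "Book.title",
    "book/author/aname": "Author.name",
}
duplicates = find_duplicates(view_mapping)
```

Updating only `book/author/title` would leave the other copy of `Book.title` stale, which is why the check refuses it.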
Exposition Features
• Key Exposition
  - The primary key of the underlying relation has to be exposed
  - Except automatically generated keys (implication: the user has the right to update it)
• Non-correlation predicate attribute exposition (select condition)
  - Variables involved in predicates have to be exposed
    e.g.: $book/bookid = "98004"
    bookid has to be exposed in the view result
• Correlation predicate attribute exposition (join condition)
  - Variables involved in predicates have to be exposed
    e.g.: $book/authorid = $author/id
    then authorid in the book table and id in the author table have to be exposed
• Complete Exposition
Hierarchy Consistency
• Why? Flexibility in constructing the view could go against the RDB.
• Hierarchy in relational semantics:
  - Table vs. attribute
  - Key vs. foreign key
    BOOK(bookid, title, year, pname, location, review)
    AUTHOR(bookid, authorid, name)
    Then book is the parent of all its attributes, and book is an ancestor of author
  - ID pairs (recursive table, like an edge table) ???
    source  position  name    target
    1.0     1.0       book    6.0
    6.0     1.0       bookid  98003
Note: Same implication as default XML view generation
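Hierarchy Consistency
- Transitivity holds: if Ai is an ancestor of Aj and Aj is an ancestor of Ak, then Ai is an ancestor of Ak
[Figure: ancestor edges among Ai, Aj, Ak; relation R with PK, UK, NK]
• Consistent edge in XSG: the edge has the same ancestor-descendant relationship as the underlying relation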
Hierarchy Consistency
Inconsistent edge: un-updatable
View query:
<BIB>
FOR $book IN document("default.xml")/books/Row,
    $author IN document("default.xml")/author/Row
WHERE $book/Author_IID = $author/PID
RETURN
  <author> $author/aname, <book> $book/bookid, $book/title </book> </author>
</BIB>
View result:
<BIB>
  <author>
    <name>David Sklansky</name>
    <book>
      <bookid>98001</bookid><title>TCP/IP Illustrated</title>
    </book>
  </author>
  ……
</BIB>
[Figure: XSG with BIB, AUTHOR, ANAME, BOOK, BOOKID, TITLE; ANAME is exposed from Author/aname, BOOKID and TITLE from Book/bookid and Book/title; the inconsistent edge is marked.]
Hierarchy Consistency
• Consistent construction
  A view is a consistent construction iff all edges in its XSG are consistent edges.
• Inconsistent construction
  A view is an inconsistent construction if it contains an inconsistent edge.
• Translatability
  Updates on the sub-tree rooted at an inconsistent edge are not translatable.
Assumption & Case Study
• General assumptions
  - RDB has no cyclic dependencies
  - No order issue
• Typical view update problems
  - Round-Trip Update Problem (RUP)
  - Semi-structured Update Problem (SUP)
[Figure: RDBMS, View Query, XML View, XML Doc + Schema. RUP: View = Schema. SUP: View ≠ Schema.]
Where we are
Motivation (Why?)
Background:
XML View, Update Extension for XQuery
Problem Definition:
- Correct Update Translatability
- General Classification of XML View Update(XVUP)
- Typical Case Study: RUP, PUP
Update Strategy for Publish-based update problem ( SUP)
• Update Strategy for Round-Trip update problem ( RUP)
• XVUP System Architecture
• Contribution
• Related Work
Semi-structured Update Problem (SUP)
• Update Translatability
Exposition complete   Consistency   Duplication   Complete Update   Partial Update
Y                     Y             Y             Y                 Case 1
Y                     Y             N             Y                 Y
Y                     N             Y             Y                 Case 3
Y                     N             N             Y                 Case 2
N                     Y             Y             N                 N
N                     Y             N             N                 N
N                     N             Y             N                 N
N                     N             N             N                 N
Semi-structured Update Problem (SUP)
• Case 1: Complete Exposition + Consistent + Duplication
  - A partial update that touches a duplicate is not translatable
• Case 2: Complete Exposition + Inconsistent + No Duplication
  - The sub-tree rooted at an inconsistent edge is not updatable
• Case 3: Complete Exposition + Inconsistent + Duplication
  - Same as case 2 for the inconsistent part
  - A partial update that touches a duplicate is not translatable
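The translatability table and the three cases can be condensed into one decision function. The sketch below is one reading of the table, under the assumption that the caller already knows whether a partial update touches a duplicate or falls under an inconsistent edge; the function and parameter names are hypothetical.

```python
def is_translatable(exposition_complete, complete_update,
                    touches_duplicate=False, under_inconsistent_edge=False):
    """Decision table for SUP, as read from the slide above.

    - Incomplete exposition: nothing is translatable.
    - Complete update with complete exposition: translatable.
    - Partial update: rejected if it touches a duplicate (cases 1 and 3)
      or lies in a sub-tree rooted at an inconsistent edge (cases 2 and 3).
    """
    if not exposition_complete:
        return False
    if complete_update:
        return True
    return not (touches_duplicate or under_inconsistent_edge)


# Case 1: consistent view with duplication, partial update on a duplicate.
case1 = is_translatable(True, False, touches_duplicate=True)
# Case 2: partial update below an inconsistent edge.
case2 = is_translatable(True, False, under_inconsistent_edge=True)
# Second table row: complete exposition, consistent, no duplication.
plain = is_translatable(True, False)
```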
Where we are
Motivation (Why?)
Background:
XML View, Update Extension for XQuery
Problem Definition:
- Correct Update Translatability
- General Classification of XML View Update(XVUP)
- Typical Case Study: RUP, PUP
Update Strategy for Publish-based update problem ( SUP)
Update Strategy for Round-Trip update problem ( RUP)
• XVUP System Architecture
• Contribution
• Related Work
Round-Trip Update Problem
Loading Features
• Structure Preserving (hierarchy information of XML)
  Complete Structure Loading ---- each edge e(v1,v2) in the XSG is mapped to a hierarchical relationship defined in relational semantics.
  Lossless Structure Loading ---- the XML view can be reconstructed with the same structure information as the original XML document.
Complete Structure Loading Lossless Structure Loading
Complete Structure Loading for example XML schema (Basic Inline)
BIB(IID, PID, BOOK)
BOOK(IID, PID, BOOKID, TITLE, AUTHOR_IID, PUBLISHER_IID, YEAR)
AUTHOR(IID, PID, ANAME)
PRICE(IID, PID, SOURCE, CURRENCY, VALUE)
PUBLISHER(IID, PID, PNAME, LOCATION)

Lossless Structure Loading for example XML schema (Shared Inline)
BIB(IID, PID, BOOK)
BOOK(IID, PID, BOOKID, TITLE, AUTHOR_IID, PNAME, LOCATION, YEAR)
AUTHOR(IID, PID, ANAME)
PRICE(IID, PID, SOURCE, CURRENCY, VALUE)
Round-Trip Update Problem
• Semantic Preserving (constraint information of XML)
  - Five kinds of constraints: Domain Constraints, Not Null Constraints, Key Constraints, Cardinality Constraints, Inclusion Dependency (IDREF)
  - Cardinality Constraints:
    (0,1) at most ---- NULL + UNIQUE
    (0,n) any ---- e.g.: separate table / overflow table
    (1,1) only ---- NOT NULL + UNIQUE
    (1,n) at least ---- NOT NULL
  - Inclusion Dependency (IDREF):
    Keep as duplicate
    Separate table with K-FK connection
Complete Semantic Loading ---- keep all semantic constraints in the RDB schema
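The cardinality rules above map naturally onto SQL column constraints. Here is a minimal sketch of that mapping in Python; the table `CARDINALITY_RULES` and the helper `column_definition` are hypothetical names, and returning `None` for (0,n) is an assumed convention meaning "load into a separate or overflow table".

```python
# Cardinality (min, max) -> SQL column constraints, following the slide.
CARDINALITY_RULES = {
    (0, 1): "UNIQUE",             # at most one: NULL allowed + UNIQUE
    (1, 1): "NOT NULL UNIQUE",    # exactly one
    (1, "n"): "NOT NULL",         # at least one
}


def column_definition(name, sql_type, cardinality):
    """Return a column definition, or None when the element needs its own table."""
    if cardinality == (0, "n"):
        return None  # assumed convention: caller creates a separate/overflow table
    return f"{name} {sql_type} {CARDINALITY_RULES[cardinality]}"


title_column = column_definition("title", "TEXT", (1, 1))
review_column = column_definition("review", "TEXT", (0, 1))
prices_column = column_definition("prices", "TEXT", (0, "n"))
```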
Round-Trip Update Problem
• Loading strategy features for RUP
  Lossless Structure Loading + Complete Semantic Loading
• Update Translatability
  Any valid update is translatable in RUP
Where we are
Motivation (Why?)
Background:
XML View, Update Extension for XQuery
Problem Definition:
- Correct Update Translatability
- General Classification of XML View Update(XVUP)
- Typical Case Study: RUP, PUP
Update Strategy for Publish-based update problem ( SUP)
Update Strategy for Round-Trip update problem ( RUP)
XVUP System Architecture
• Update Strategy
• Contribution
• Related Work
System Architecture
[Figure: components Parser, View Analyser, Valid Update Checker, Translatability Checker, Update Decomposer, Translator, Update Propagation, Execution Engine; labels View, XQuery, DB Trigger, SQL Update]
Where we are
Motivation (Why?)
Background:
XML View, Update Extension for XQuery
Problem Definition:
- Correct Update Translatability
- General Classification of XML View Update(XVUP)
- Typical Case Study: RUP, PUP
Update Strategy for Publish-based update problem ( SUP)
Update Strategy for Round-Trip update problem ( RUP)
XVUP System Architecture
Update Strategy
• Contribution
• Related Work
Fundamental ---- connection
We divide foreign keys into three types:
                 Ownership connection       Subset connection          Referencing connection
X1 & X2          X1=NK/PK(R1), X2 PK(R2)    X1=NK/PK(R1), X2=PK(R2)    X1=NK/PK(R1), X2 NK/UK(R2)
Cardinality      1:n                        1:[0,1]                    1:n
Representation   R1 R2                      R1 R2                      R1 R2
Edge directions relative to a relation R: inner-going, outer-going
Fundamental ---- Ownership Connection
• R1 is the owner of R2 if:
  (a) Every tuple in R2 must be connected to an owning tuple in R1
  (b) Deletion of an owning tuple in R1 requires deletion of all tuples connected to that tuple in R2
  (c) Modification of X1 in an owning tuple of R1 requires either
      - propagation of the modification to the matching attributes X2 of all owned tuples in R2, or
      - deletion of those tuples.
Fundamental ---- Reference Connection
• R1 is referencing to R2 if:
  (a) Every tuple in R2 must either be connected to a referenced tuple in R1 or have a null value for X1 (the latter is allowed only when X1 NK(R1)).
  (b) Deletion of a tuple in R1 requires either
      - deletion of its referencing tuples in R2, or
      - assignment of null values to attributes X2 of all the referencing tuples in R2.
  (c) Modification of X1 in a referenced tuple of R1 requires one of
      - propagation of the modification to attributes X2 of all referencing tuples in R2
      - assignment of null values to attributes X2 of all referencing tuples in R2 (unique + NULL)
      - deletion of those tuples (unique + NOT NULL)
Fundamental ---- Subset Connection
• R1 and R2 are in a subset connection if:
  (a) Every tuple in R2 must be connected to one tuple in R1.
  (b) Deletion of a tuple in R1 requires deletion of the connected tuple in R2 (if the latter exists)
  (c) Modification of X1 in a tuple of R1 requires either
      - propagation of the modification to attributes X2 of its connected tuple in R2, or
      - deletion of the R1 tuple (reject update)
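Rule (b) of the ownership and reference connections resembles the standard SQL referential actions `ON DELETE CASCADE` and `ON DELETE SET NULL`. The sketch below uses Python's built-in `sqlite3` module to show that analogy; treating the connection types as these two referential actions is an assumption made for illustration, and the three-table schema is a simplified, hypothetical version of the bib example.

```python
import sqlite3


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE publisher (id INTEGER PRIMARY KEY, pname TEXT);
        -- Reference connection: book references publisher, nulled on delete.
        CREATE TABLE book (
            id INTEGER PRIMARY KEY,
            title TEXT,
            publisher_id INTEGER REFERENCES publisher(id) ON DELETE SET NULL
        );
        -- Ownership connection: book owns price, owned tuples are deleted.
        CREATE TABLE price (
            id INTEGER PRIMARY KEY,
            book_id INTEGER NOT NULL REFERENCES book(id) ON DELETE CASCADE,
            value REAL
        );
        INSERT INTO publisher VALUES (1, 'Addison-Wesley');
        INSERT INTO book VALUES (98001, 'TCP/IP Illustrated', 1);
        INSERT INTO price VALUES (1, 98001, 65.95);
    """)
    # Deleting the referenced tuple nulls the foreign key in the referencing tuple.
    conn.execute("DELETE FROM publisher WHERE id = 1")
    publisher_after = conn.execute(
        "SELECT publisher_id FROM book WHERE id = 98001").fetchone()[0]
    # Deleting the owning tuple removes all owned tuples.
    conn.execute("DELETE FROM book WHERE id = 98001")
    prices_left = conn.execute("SELECT COUNT(*) FROM price").fetchone()[0]
    conn.close()
    return publisher_after, prices_left


publisher_after, prices_left = run_demo()
```

The book survives the deletion of its publisher with a null foreign key, while the price row disappears together with the book that owns it.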
XML View Mapping Graph (VMG)
Graph G(V,E) is represented as follows:
Nodes:
  - Core Relation: relations underlying the view
  - Extended Relation: relations connected with a Core Relation by FK
  - Involved Relation: relations connected with an Extended Relation or another Involved Relation by FK
Edges: connections between two relation nodes.
From DAG to Set-Tree
Observation:
  - DAG: no recursion in the view
  - Set of trees: replicate the subtrees rooted at vertices having multiple incoming edges.
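The DAG-to-trees observation can be sketched as a recursive copy: each time a vertex is reached through a different incoming edge, its subtree is built again. This Python sketch assumes the DAG is given as an adjacency mapping; the function names and the sample graph (in which Publisher has two incoming edges) are hypothetical.

```python
def dag_to_trees(children):
    """Expand a DAG (node -> list of child nodes) into a list of trees.

    A vertex with several incoming edges is replicated once per edge,
    because expand() builds a fresh subtree on every visit.
    """
    has_parent = {c for kids in children.values() for c in kids}
    roots = [n for n in children if n not in has_parent]

    def expand(node):
        return {"relation": node,
                "children": [expand(c) for c in children.get(node, [])]}

    return [expand(r) for r in roots]


def count_nodes(tree):
    return 1 + sum(count_nodes(c) for c in tree["children"])


# Hypothetical DAG: Publisher is reachable from both Book and Author.
dag = {"Bib": ["Book", "Author"], "Book": ["Publisher"], "Author": ["Publisher"]}
trees = dag_to_trees(dag)
total = count_nodes(trees[0])
```

The DAG has four vertices, but the resulting tree has five nodes because Publisher appears once under Book and once under Author.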
XML View Mapping Graph (VMG)
VMG for Basic Inline Loading Strategy
BIB(IID, PID, BOOK)
BOOK(IID, PID, BOOKID, TITLE, AUTHOR_IID, PUBLISHER_IID, YEAR)
AUTHOR(IID, PID, ANAME)
PRICE(IID, PID, SOURCE, CURRENCY, VALUE)
PUBLISHER(IID, PID, PNAME, LOCATION)
Fundamental ---- Pivot Relation
• Pivot Relations (PR)
  - Core Relation
  - Key is exposed in the view
  - Not included in another tree rooted at a pivot relation
  - Out-going ownership/subset connections to other Core Relations = 0
• Implication:
  - Starting point of a sub-tree of the VMG
Example for PR
[Figure: three diagrams over the relations Bib, Book, Price, Publisher, Author, with the pivot relations marked]
Dependency Island / Reference Peninsula
• Dependency Island (DI) of root relation R
  - Rooted at R.
  - Maximal sub-tree.
  - All inner-going ownership and subset connections of R.
• Referencing Peninsula (RP) of root relation R
  - A relation Rj
  - Directly connected to any relation Rk of the dependency island via a reference connection between Rk and Rj
Referenced Continent
• Referenced Continent (RC) of root relation R
  - Rooted at R.
  - Maximal sub-tree.
  - All outer-going ownership / subset / reference connections of R.
[Figure: relation R with its DI, RP, and RC]
Step1: View Analyser
analysisVMG() {
  new VMG(V,E)
  for each underlying relation Ri
    put Ri into V as a node of VMG
    for each relation Rj in DB
      if (Foreign key from Rj->Ri) then {
        identifyConnectionType(Ri,Rj)
        put edge e(Rj->Ri) with connection type (o/s/r) into E
      } else if (Foreign key from Ri->Rj) then {
        identifyConnectionType(Ri,Rj)
        put edge e(Ri->Rj) with connection type (o/s/r) into E
      } else {}
  return VMG
}
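The view analyser step can be sketched in Python as follows. This is a simplified illustration, not the system's implementation: it assumes each foreign key is supplied as a (referencing, referenced, kind) triple whose kind ('o', 's' or 'r') has already been identified, and, unlike the pseudocode, it also records the relation at the other end of each edge as a node. The name `analyse_vmg` and the sample foreign keys are hypothetical.

```python
def analyse_vmg(core_relations, foreign_keys):
    """Build VMG nodes and typed edges from the relations underlying a view.

    foreign_keys: iterable of (referencing, referenced, kind) triples,
    kind being 'o' (ownership), 's' (subset) or 'r' (reference).
    Only foreign keys with at least one end in a core relation are kept,
    mirroring the two branches of the pseudocode.
    """
    core = set(core_relations)
    nodes = set(core)
    edges = set()
    for src, dst, kind in foreign_keys:
        if src in core or dst in core:
            nodes.update((src, dst))       # assumption: add both endpoints as nodes
            edges.add((src, dst, kind))
    return nodes, edges


# Hypothetical schema: Review -> Magazine is unrelated to the view over Book.
fks = [("Price", "Book", "o"), ("Book", "Publisher", "r"), ("Review", "Magazine", "r")]
nodes, edges = analyse_vmg(["Book"], fks)
```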
Step2: Update Translatability Checking
updateTranslatabilityChecking(XAT_tree, update, VMG) {
  if (expositionCompletenessChecking(VMG))
    if (completeUpdate(update))
      return true;
    else {
      if (duplicationChecking(VMG, update))
        if (expositionConsistencyChecking(VMG))
          return true;
      return false;
    }
  return false;
}
Step3: Update Decomposition
updateDecomposition(XAT_tree) {
  XAT_leave = get all leaf nodes of XAT_tree
  resultUpdate = array of RelationalUpdate
  for all XAT_leave do {
    node = ith XAT_leave
    update = ith resultUpdate
    set updateType by looking at the root of XAT_tree
    while node != null {
      update = opUpdateDecomp(node, update)
      node = parent node
    }
  }
  distinctResultUpdate();
}
[Figure: example XAT tree with Tagger, Join, Nav, Nav, Source, Source nodes]
opUpdateDecomp(XAT_node, update) {
  if XAT_node is a Navigate node
    update.tableName = get table name from node if it has one
  elseif XAT_node is a Select node or a Join node
    add the condition into whereClause of update
    break any complex binary conditions into simple binary conditions and store the conditions
      to be referred to while extending tuples in case of insert and replace updates
  elseif XAT_node is a Tagger node
    if type of update is Delete
      extract names of attributes from tagger pattern
      fill in the updateColumn vector of update
    if type of update is Insert
      if the DOM pattern of the element represented by the tagger matches the DOM pattern of the element to be inserted
        extract names of attributes from tagger pattern
        extract values of attributes from pattern of element to be inserted
        fill in the updateColumn vector of update
    if type of update is Replace
      if the DOM pattern of the element represented by the tagger matches the DOM pattern of the replacing element
        extract names of attributes from tagger pattern
        extract old values by querying the relational database
        extract new values from pattern of the replacing element
        fill in the updateColumn vector of update
  else do nothing
  return update
}
Step4: Delete Propagation
Delete tuple t from relation R
• Algorithm (Step1-3):
  - Isolate the Dependency Island (DI) of R
  - Delete matching t from R
  - Identify Referencing Peninsulas (RP)
  - Replace the foreign key of the matching tuple in each peninsula, else delete the corresponding tuple
• Global Integrity Maintenance (Step4):
  - For each relation involved in deletions:
    Delete propagation in its DI, repeatedly if necessary
    Foreign key replacement in its RP
[Figure: VMG for Basic Inline Loading Strategy with relations BIB, BOOK, AUTHOR, PRICE, PUBLISHER; edges are labelled "delete" and "Delete/Replace"]
Step4: Insertion Propagation
Insert tuple t into the relation R
• Algorithm
  - Extend the view tuple with values for the attributes that have been exposed in the view definition
  - If the new tuple is already present in the instance, reject the update
  - Otherwise, perform an insertion in the underlying database relation.
• Global Integrity Maintenance:
  - Insertion check for RC:
    if there, do nothing
    else roll back and reject the insertion
Step5: Update Translation
updateTranslation(resultUpdate) {
  update = first of resultUpdate
  while update != null {
    updateType(update)
    formatWhereConditions
    formatOtherConditions
    update = next of resultUpdate
  }
}
updateType(update) {
  if update-type is replace
    do nothing
  elseif delete or insertion
    if update-columns include all attributes of table
      do nothing
    else update-type = replace
}
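The updateType rule says that a delete or insert covering only some columns of a table is really a replacement on the existing tuple. A minimal Python sketch of that rule follows; the function name and the column lists are hypothetical.

```python
def normalise_update_type(update_type, update_columns, table_columns):
    """Downgrade a partial-column delete/insert to a replace, as in updateType."""
    if update_type == "replace":
        return "replace"
    if set(table_columns) <= set(update_columns):
        return update_type   # whole tuple affected: keep delete/insert
    return "replace"         # only some attributes affected


# Hypothetical AUTHOR relation from the Basic Inline loading.
author_columns = ["IID", "PID", "ANAME"]
full_delete = normalise_update_type("delete", ["IID", "PID", "ANAME"], author_columns)
partial_delete = normalise_update_type("delete", ["ANAME"], author_columns)
```

Under this rule, deleting only ANAME from an AUTHOR tuple becomes an SQL UPDATE rather than a DELETE.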
General Classification of XVUP
[Figure: the four-dimension classification diagram (Information, Modification, Language, Instance) shown earlier]