XML To Relational Model

XML To Relational Model. Key Index – Forward Traversal Backward Traversal


Key Index – Forward Traversal

Backward Traversal


Binary Approach

Bname(source, ordinal, flag, target)

Create as many tables as there are different subelement and attribute names in the XML document

Partition the Edge table by name

Universal table: take the outer join of all binary tables
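The partitioning step can be sketched as follows. This is a minimal illustration, assuming the Edge table carries the same columns as a binary table plus a `name` column; the sample rows, the flag values and the function name `partition_by_name` are made up for the example.

```python
from collections import defaultdict

# Hypothetical Edge rows: (source, ordinal, name, flag, target).
# The extra "name" column and the flag values are assumptions for illustration.
EDGE = [
    (1, 1, "Name", "string", "v1"),
    (1, 2, "Department", "ref", 2),
    (1, 3, "Department", "ref", 3),
    (2, 1, "Name", "string", "v2"),
]

def partition_by_name(edge_rows):
    """Split Edge rows into one binary table per name: Bname(source, ordinal, flag, target)."""
    binary = defaultdict(list)
    for source, ordinal, name, flag, target in edge_rows:
        binary[name].append((source, ordinal, flag, target))
    return dict(binary)

tables = partition_by_name(EDGE)
```

Each key of `tables` plays the role of one binary table, so a query that names a specific subelement only touches the rows stored under that name.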


Universal Table with Overflow


Converting Ordered XML to Relations


Skynet Hitech. Company

<Company>
  <Name>Skynet Hitech</Name>
  <Department>
    <Name>Research</Name>
    <Manager>John Smith</Manager>
    <Employee>Tom Jackson</Employee>
  </Department>
  <Department>
    <Name>Sales</Name>
    <Manager>Linda White</Manager>
    <Employee>Kevin Lee</Employee>
  </Department>
</Company>


Ordered XML model for Skynet Hitech. Company

Company (1)
  Name (1): Skynet Hitech
  Department (2)
    Name (1): Research
    Manager (2): John Smith
    Employee (3): Tom Jackson
  Department (3)
    Name (1): Sales
    Manager (2): Linda White
    Employee (3): Kevin Lee

(The number after each node is its position among its siblings.)


Schema of the storing table

Attributes

ID: the unique index for each tuple

DID: the document ID

Path: the path from the root to the leaf node, used to find a particular node

Surrogate Pattern: number representation of nodes

Value: text value associated with each node


Numbering nodes

[Figure: the Skynet Hitech tree with the nodes on the path to “Linda White” numbered: Company 1[1], second Department 2[2], Manager 2[1]]


Tuple that stores “Linda White”

ID: 00334
DID: 501
Path: Company/Department/Manager
Surrogate Pattern: 1[1]2[2]2[1]
Value: Linda White
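As a rough sketch, the storing table and this tuple could be set up in an in-memory SQLite database as shown below. The table name `xml_node`, the column types and the lookup query are assumptions for illustration; the slides only give the attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names follow the slide; the table name and types are assumed.
conn.execute(
    "CREATE TABLE xml_node ("
    " ID TEXT PRIMARY KEY,"
    " DID INTEGER,"
    " Path TEXT,"
    " SurrogatePattern TEXT,"
    " Value TEXT)"
)

# The tuple that stores "Linda White".
conn.execute(
    "INSERT INTO xml_node VALUES (?, ?, ?, ?, ?)",
    ("00334", 501, "Company/Department/Manager", "1[1]2[2]2[1]", "Linda White"),
)

# Find all managers in document 501 by matching on Path.
managers = conn.execute(
    "SELECT Value FROM xml_node WHERE DID = ? AND Path = ?",
    (501, "Company/Department/Manager"),
).fetchall()
```

Storing the ID as text keeps the leading zeros of `00334` intact.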


Old Skynet file stored in the RDBMS

OLD

Path                          Surrogate Pattern   Value
Company/Name                  1[1]1[1]            Skynet Hitech
Company/Department/Name       1[1]2[1]1[1]        Research
Company/Department/Manager    1[1]2[1]2[1]        John Smith
Company/Department/Employee   1[1]2[1]3[1]        Tom Jackson
Company/Department/Name       1[1]2[2]1[1]        Sales
Company/Department/Manager    1[1]2[2]2[1]        Linda White
Company/Department/Employee   1[1]2[2]3[1]        Kevin Lee
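The slides do not spell out the numbering rule, so the sketch below infers one that reproduces the seven rows above: each step is written k[n], where k ranks the tag among the distinct child tags of its parent (in order of first appearance) and n counts occurrences among same-named siblings. The function name `shred` and the inferred rule are assumptions, not the authors' stated method.

```python
import xml.etree.ElementTree as ET

SKYNET = """
<Company>
  <Name>Skynet Hitech</Name>
  <Department>
    <Name>Research</Name><Manager>John Smith</Manager><Employee>Tom Jackson</Employee>
  </Department>
  <Department>
    <Name>Sales</Name><Manager>Linda White</Manager><Employee>Kevin Lee</Employee>
  </Department>
</Company>
"""

def shred(xml_text):
    """Return (path, surrogate pattern, value) for every leaf element, in document order."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, path, pattern):
        children = list(elem)
        if not children:
            rows.append((path, pattern, (elem.text or "").strip()))
            return
        rank_of_tag = {}   # tag -> k, assigned in order of first appearance
        occurrences = {}   # tag -> n, running count of same-named siblings
        for child in children:
            k = rank_of_tag.setdefault(child.tag, len(rank_of_tag) + 1)
            occurrences[child.tag] = occurrences.get(child.tag, 0) + 1
            n = occurrences[child.tag]
            visit(child, f"{path}/{child.tag}", f"{pattern}{k}[{n}]")

    visit(root, root.tag, "1[1]")  # the root is numbered 1[1]
    return rows

rows = shred(SKYNET)
```

Under this rule the second Department gets 2[2] because Department is the second distinct tag under Company and this is its second occurrence.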


[Figure: DTD graph. Node labels: book, booktitle, author, monograph, title, contactauthor, authorID, editor, name, address, firstname, lastname, authorid, article, name; operator nodes “*” and “?”]


<!ELEMENT book (booktitle, author)>
<!ELEMENT booktitle (#PCDATA)>
<!ELEMENT author (name, address)>
<!ATTLIST author id ID #REQUIRED>
<!ELEMENT name (firstname?, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT address ANY>
<!ELEMENT article (title, author*, contactauthor)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT contactauthor EMPTY>
<!ATTLIST contactauthor authorID IDREF #IMPLIED>
<!ELEMENT monograph (title, author, editor)>
<!ELEMENT editor (monograph*)>
<!ATTLIST editor name CDATA #REQUIRED>


Basic Inline Algorithm

A relation is created for the root element of the graph

All of that element’s descendants are inlined into the relation, with two exceptions:

Children below a “*” node are made into separate relations; this corresponds to creating a new relation for a set-valued child

Each node that has a backpointer edge pointing to it is made into a separate relation


Drawbacks

Grossly inefficient for many queries: “List all authors having first name Jack” will have to be executed as the union of 5 separate queries

Creates a large number of relations


To determine the set of relations to be created for an element, we construct an element graph as follows:

Do a DFS traversal of the DTD graph, starting at the element node for which we are constructing relations

Each node is marked as “visited” the first time it is reached and is unmarked once all its children have been traversed

If an unmarked node in the DTD graph is reached during the DFS, a new node bearing the same name is created in the element graph

A regular edge is created from the most recently created node in the element graph with the same name as the DFS parent of the current DTD node to the newly created node

If an attempt is made to traverse an already marked DTD node, a backpointer edge is added from the most recently created node in the element graph to the most recently created node in the element graph with the same name as the marked DTD node
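The construction can be sketched in a few lines. This is an illustration under stated assumptions: the DTD graph is a plain dict of element names with the “*” and “?” operator nodes and attributes left out, the backpointer is taken to start at the node created for the DFS parent, and the names `DTD` and `element_graph` are hypothetical.

```python
# Simplified DTD graph: element -> child elements (operators and attributes omitted).
DTD = {
    "editor": ["monograph"],
    "monograph": ["title", "author", "editor"],
    "author": ["name", "address"],
    "name": ["firstname", "lastname"],
}

def element_graph(dtd, start):
    """Build the element graph for `start` by a DFS over the DTD graph."""
    nodes = []      # node id -> element name
    regular = []    # (parent id, child id)
    back = []       # (from id, to id) backpointer edges
    latest = {}     # element name -> id of the most recently created node
    marked = set()  # DTD nodes currently on the DFS path

    def dfs(name, parent):
        if name in marked:
            # Already marked: add a backpointer instead of a new node.
            back.append((latest[parent], latest[name]))
            return
        marked.add(name)
        node_id = len(nodes)
        nodes.append(name)
        latest[name] = node_id
        if parent is not None:
            regular.append((latest[parent], node_id))
        for child in dtd.get(name, []):
            dfs(child, name)
        marked.discard(name)  # unmark once all children are traversed

    dfs(start, None)
    return nodes, regular, back

nodes, regular, back = element_graph(DTD, "editor")
```

For `editor`, the traversal reaches `editor` again below `monograph`, so the only backpointer runs from the `monograph` node back to the root `editor` node.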


Fragmentation: Example

Results in 5 relations

Just retrieving the first and last names of an author requires three joins!

<!ELEMENT author (name, address)>
<!ATTLIST author id ID #REQUIRED>

<!ELEMENT name (firstname?, lastname)>

<!ELEMENT firstname (#PCDATA)>

<!ELEMENT lastname (#PCDATA)>

<!ELEMENT address ANY>

author (authorID: integer, id: string)

name (nameID: integer, authorID: integer)

firstname (firstnameID: integer, nameID: integer, value: string)

lastname (lastnameID: integer, nameID: integer, value: string)

address (addressID: integer, authorID: integer, value: string)
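To make the three joins concrete, the sketch below loads the five relations into an in-memory SQLite database and retrieves one author's names. The sample row values are made up, and the LEFT JOIN on `firstname` is a choice made here because the DTD marks that element as optional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (authorID INTEGER PRIMARY KEY, id TEXT);
CREATE TABLE name (nameID INTEGER PRIMARY KEY, authorID INTEGER);
CREATE TABLE firstname (firstnameID INTEGER PRIMARY KEY, nameID INTEGER, value TEXT);
CREATE TABLE lastname (lastnameID INTEGER PRIMARY KEY, nameID INTEGER, value TEXT);
CREATE TABLE address (addressID INTEGER PRIMARY KEY, authorID INTEGER, value TEXT);

-- Made-up sample data for one author.
INSERT INTO author VALUES (1, 'a1');
INSERT INTO name VALUES (1, 1);
INSERT INTO firstname VALUES (1, 1, 'Jack');
INSERT INTO lastname VALUES (1, 1, 'Smith');
""")

# author -> name -> firstname and lastname: three joins for two strings.
names = conn.execute("""
SELECT a.id, f.value, l.value
FROM author AS a
JOIN name AS n ON n.authorID = a.authorID
LEFT JOIN firstname AS f ON f.nameID = n.nameID
JOIN lastname AS l ON l.nameID = n.nameID
""").fetchall()
```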


Shared Inlining Method

Relations are created for:

All elements in the DTD graph whose nodes have an in-degree greater than one (nodes with an in-degree of one are inlined)

Elements that have an in-degree of zero

Elements below a “*” node

One element out of each set of mutually recursive elements that all have in-degree one

Each element node X that is a separate relation inlines all nodes Y that are reachable from it such that the path from X to Y does not contain a node that is to be made a separate relation
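The selection rules can be tried out on the example DTD with the sketch below. It is a simplification under assumptions: the DTD graph is flattened to (parent, child, below_star) triples, attributes and “?” nodes are omitted, and the names `EDGES` and `shared_relations` are hypothetical.

```python
# DTD graph edges as (parent, child, below_star); attributes and "?" are left out.
EDGES = [
    ("book", "booktitle", False), ("book", "author", False),
    ("author", "name", False), ("author", "address", False),
    ("name", "firstname", False), ("name", "lastname", False),
    ("article", "title", False), ("article", "author", True),
    ("article", "contactauthor", False),
    ("monograph", "title", False), ("monograph", "author", False),
    ("monograph", "editor", False), ("editor", "monograph", True),
]

def shared_relations(edges):
    """Return the set of elements that become separate relations under Shared."""
    nodes = {n for parent, child, _ in edges for n in (parent, child)}
    children = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for parent, child, _ in edges:
        children[parent].append(child)
        in_degree[child] += 1

    # In-degree zero or greater than one, plus everything below a "*".
    relations = {n for n in nodes if in_degree[n] != 1}
    relations |= {child for _, child, below_star in edges if below_star}

    def on_unbroken_cycle(start):
        # True if start can reach itself without passing through a relation.
        stack, seen = list(children[start]), set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node in relations or node in seen:
                continue
            seen.add(node)
            stack.extend(children[node])
        return False

    # Mutually recursive elements with in-degree one: pick one per cycle.
    for node in sorted(nodes):
        if node not in relations and on_unbroken_cycle(node):
            relations.add(node)
    return relations

relations = shared_relations(EDGES)
```

On this DTD the monograph/editor recursion is already broken by the “*” rule, so the last loop adds nothing; it only matters for cycles whose members all have in-degree one and sit under no “*”.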


Issues with Sharing Elements

Parent of elements not fixed at schema level

Need to store type and ids of parents

parentCODE field (type of parent)

parentID field (id of parent)

No foreign key relationship
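A small sketch of how the two fields might be used to find the parent of a shared element. The code values, the sample row and the helper `parent_of` are all hypothetical; the slides do not say how parent codes are assigned.

```python
# Hypothetical mapping from parentCODE to the parent's relation.
PARENT_RELATION = {1: "article", 2: "monograph"}

# A made-up row of the shared "title" relation.
title_row = {
    "titleID": 10,
    "title.parentID": 7,
    "title.parentCODE": 2,
    "title": "Sample title",
}

def parent_of(row):
    """Resolve (relation name, id) of the parent from parentCODE and parentID."""
    return PARENT_RELATION[row["title.parentCODE"]], row["title.parentID"]

parent = parent_of(title_row)
```

Because the target relation depends on a value in the row, the lookup has to happen in the query or in application code; a declared foreign key cannot express it.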


Hybrid

Same as Shared except that it inlines some elements not inlined in Shared

Inlines elements with in-degree greater than one that are not recursive or reached through a “*” node

Set sub-elements and recursive elements are treated as in Shared
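Which elements Hybrid inlines in addition can be worked out from the DTD graph. The sketch below uses a flattened edge list of (parent, child, below_star) triples for the example DTD, with attributes and “?” nodes omitted; the names `DTD_EDGES` and `extra_inlined_by_hybrid` are hypothetical.

```python
# (parent, child, below_star) edges of the example DTD; attributes omitted.
DTD_EDGES = [
    ("book", "booktitle", False), ("book", "author", False),
    ("author", "name", False), ("author", "address", False),
    ("name", "firstname", False), ("name", "lastname", False),
    ("article", "title", False), ("article", "author", True),
    ("article", "contactauthor", False),
    ("monograph", "title", False), ("monograph", "author", False),
    ("monograph", "editor", False), ("editor", "monograph", True),
]

def extra_inlined_by_hybrid(edges):
    """Elements with in-degree > 1 that are neither recursive nor below a "*"."""
    nodes = {n for parent, child, _ in edges for n in (parent, child)}
    children = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for parent, child, _ in edges:
        children[parent].append(child)
        in_degree[child] += 1
    below_star = {child for _, child, starred in edges if starred}

    def is_recursive(start):
        # True if start is reachable from itself in the DTD graph.
        stack, seen = list(children[start]), set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(children[node])
        return False

    return {
        n for n in nodes
        if in_degree[n] > 1 and n not in below_star and not is_recursive(n)
    }

inlined = extra_inlined_by_hybrid(DTD_EDGES)
```

Here `author` also has an in-degree greater than one, but it is reached through a “*” from `article`, so it stays a separate relation.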


book (bookID: integer, book.booktitle.isroot: boolean, book.booktitle : string)

article (articleID: integer, article.contactauthor.isroot: boolean, article.contactauthor.authorid: string)

monograph (monographID: integer, monograph.parentID: integer, monograph.parentCODE: integer, monograph.editor.isroot: boolean, monograph.editor.name: string)

title (titleID: integer, title.parentID: integer, title.parentCODE: integer, title: string)

author (authorID: integer, author.parentID: integer, author.parentCODE: integer, author.name.isroot: boolean, author.name.firstname.isroot: boolean, author.name.firstname: string, author.name.lastname.isroot: boolean, author.name.lastname: string, author.address.isroot: boolean, author.address: string, author.authorid: string)


Shared Inline


Hybrid
