Lectures
1. Introduction to data models
2. Query languages for relational databases
3. Models and query languages for object databases
4. Models and query languages for semistructured data, XML
5. Embedded query languages
6. Guest lecture on Object Role Modelling
Why do we like types?
Types facilitate understanding
Types enable compact representations
Types enable query optimisation
Types facilitate consistency enforcement
Background assumptions for typed data
Data stable over time
Organisational body to control data
Exercise: Give an example of a context where these assumptions do not hold
Semistructured data
Semistructured data is schemaless and self-describing
The data and the description of the data are integrated
An example
{name: {first: “John”, last: “Smith”}, tel: 112233, email: “[email protected]”}
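[Figure: the same data drawn as a tree. The root has edges name, tel and email; the name node has edges first and last; the leaves are “John”, “Smith”, 112233 and “[email protected]”.]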
Another example
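[Figure: a graph with two person edges from the root to the objects &o1 and &o2. &o1 has name “Eva” and age 40, &o2 has name “Abel” and age 20, and a child edge leads from &o1 to &o2.]
{person: &o1{name: “Eva”, age: 40, child: &o2},
 person: &o2{name: “Abel”, age: 20}}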
An object identifier, such as &o1, before a structure, binds the object identifier to the identity of that structure. The object identifier can then be used to refer to the structure.
Terminology
The following is an ssd-expression:
&o1{name: “Eva”, age: 40, child: &o2}
Here &o1 is the object identifier, name, age and child are labels, and “Eva”, 40 and &o2 are values.
A database
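[Figure: the database drawn as a tree. The root db has an edge biblio, which leads to a paper node n1 and two book nodes n2 and n3, followed by further entries. n1 has authors Crick and Wallace, title DNA spiral and date 1956; n2 has author Darwin, title Origin and date 1848; n3 has author Marx, title Kapital and date 1860.]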
Path expressions
A path expression is a sequence of labels: l1.l2…ln
A path expression results in a set of nodes
Path properties are specified by regular expressions on two levels: on the alphabet of labels and on the alphabet of characters that comprise labels
A path expression
biblio.book.author
A path expression
biblio.(book | paper).author
Examples of path expressions
biblio.book.author - authors of books
biblio.paper.author - authors of papers
biblio.(book | paper).author - authors of books or papers
biblio._.author - authors of anything
biblio._*.author - nodes at the ends of paths starting with biblio, ending with author, and having an arbitrary sequence of labels between
Example of a label pattern
((b | B)ook | (a | A)uthor)(s)? - book, Book, author, Author, books, Books, authors, Authors
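As a quick check, the label pattern can be written in the syntax of Python's re module. This is only a sketch: the candidate labels and the helper name label_matches are made up for illustration.

```python
import re

# The label pattern from the slide in Python's regular expression syntax.
LABEL_PATTERN = re.compile(r"((b|B)ook|(a|A)uthor)(s)?")

def label_matches(label):
    # fullmatch requires the whole label to match, not just a prefix.
    return LABEL_PATTERN.fullmatch(label) is not None

candidates = ["book", "Book", "author", "Author", "books", "Books",
              "authors", "Authors", "title", "bookss"]
matching = [label for label in candidates if label_matches(label)]
```

Only the first eight candidates end up in matching; title and bookss are rejected.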
An exercise
biblio._*.author.(“[s | S]ection”)
Which ones of the following paths match the path expression above?
1. Biblio.author.Section
2. Biblio.cat.rat.hat.author.section
3. Biblio.author
4. Biblio.cat.author.section.Section
A query with a condition
select row: X
from biblio._ X
where “Crick” in X.author
Result:
{row: {author: “Crick”,
       author: “Wallace”,
       date: 1956,
       title: “The spiral DNA”}, …}
Two exercises
select row: {title: Y, date: Z}
from biblio.paper X, X.title Y, X.date Z

select row: {author: Y, date: Z}
from biblio.book X, X.author Y, X.date Z
Three exercises
Which authors have written a book or a paper in 1992?
Which authors have written a book together with Jones?
Which authors have written both a book and a paper?
Expressing relations
r1
a  b  c
1  2  3
3  2  2
4  3  1

r2
b  d  e
1  1  3
3  4  2
2  3  1

{ r1: { row: {a: 1, b: 2, c: 3}, row: {a: 3, b: 2, c: 2}, row: {a: 4, b: 3, c: 1} },
  r2: { row: {b: 1, d: 1, e: 3}, row: {b: 3, d: 4, e: 2}, row: {b: 2, d: 3, e: 1} } }
Expressing relational joins
select a: A, d: D
from r1.row X, r2.row Y, X.a A, X.b B, Y.b B’, Y.d D
where B = B’
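The join can be mimicked in Python as a nested loop over the rows of the tables r1 and r2. This is a minimal sketch: rows are encoded as dictionaries, which is an assumption that works here because no label repeats within a row, and join_on_b is a hypothetical helper name.

```python
# Rows of the two relations, one dictionary per row.
r1 = [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 2, "c": 2}, {"a": 4, "b": 3, "c": 1}]
r2 = [{"b": 1, "d": 1, "e": 3}, {"b": 3, "d": 4, "e": 2}, {"b": 2, "d": 3, "e": 1}]

def join_on_b(rows1, rows2):
    """select a: A, d: D for every pair of rows that agree on b."""
    return [{"a": x["a"], "d": y["d"]}
            for x in rows1      # r1.row X
            for y in rows2      # r2.row Y
            if x["b"] == y["b"]]  # where B = B'

joined = join_on_b(r1, r2)
```

Each row of r1 pairs with exactly one row of r2 here, so joined has three entries.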
Label variables
select L: X
from biblio._*.L X
where matches(“.*Shakespeare.*”, X)
Here L is a label variable.
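[Figure: a database tree in which db has an edge biblio leading to two book nodes n2 and n3, followed by further entries. n2 has author Shakespeare, title Macbeth and date 1622; n3 has author Smith, title Best of Shakespeare and date 1992.]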
Result:
{author: “Shakespeare”, title: “Best of Shakespeare”}
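A label variable can be thought of as ranging over the labels of the edges that the path reaches. The Python sketch below illustrates this for the Shakespeare query; the list-of-edges encoding and the function names are assumptions, and matches is approximated with re.fullmatch.

```python
import re

# Assumed encoding: a complex node is a list of (label, child) edges.
db = [("biblio", [
    ("book", [("author", "Shakespeare"), ("title", "Macbeth"), ("date", 1622)]),
    ("book", [("author", "Smith"), ("title", "Best of Shakespeare"), ("date", 1992)]),
])]

def edges_below(node):
    """Yield every (label, child) edge at any depth below node."""
    if isinstance(node, list):
        for label, child in node:
            yield label, child
            yield from edges_below(child)

def select_matching(root, pattern):
    """select L: X from biblio._*.L X where matches(pattern, X)"""
    regex = re.compile(pattern)
    result = []
    for label, biblio in root:
        if label == "biblio":
            for l, x in edges_below(biblio):  # L is bound to the edge label
                if isinstance(x, str) and regex.fullmatch(x):
                    result.append((l, x))
    return result

found = select_matching(db, ".*Shakespeare.*")
```

The title Macbeth does not contain the string Shakespeare, so only the author of the first book and the title of the second are returned.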
Turning labels into data
select publ: {type: L, author: A}
from biblio.L X, X.author A
[Figure: the database tree with the paper node n1 (authors Crick and Wallace, title DNA spiral, date 1956) and the book node n2 (author Darwin, title Origin, date 1848) below biblio.]
{publ: {type: “paper”, author: “Crick”},
 publ: {type: “paper”, author: “Wallace”},
 publ: {type: “book”, author: “Darwin”}}
Basic XML syntax
XML is a textual representation of data
An element is a text bounded by tags
<name> John </name>
Here <name> is the start-tag, </name> is the end-tag, John is the content, and the three together form an element.
<name> </name> can be abbreviated as <name/>
Basic XML syntax
Elements may contain subelements
<person>
<name> John </name>
<tel> 112233 </tel>
<email> [email protected] </email>
</person>
XML attributes
An attribute is defined by a name-value pair within a tag
<price currency = “dollar”> 500 </price>
<length unit = “cm”> 25 </length>
XML attributes and elements
<product>
<name> widget </name>
<price> 10 </price>
</product>

<product price = “10”>
<name> widget </name>
</product>

<product name = “widget” price = “10”/>
XML and ssd-expressions
<person>
<name> John </name>
<tel> 112233 </tel>
<email> [email protected] </email>
</person>

{person: {name: “John”, tel: 112233, email: “[email protected]”}}
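The correspondence between XML elements and ssd-expressions can be sketched with Python's standard xml.etree.ElementTree module. The function to_ssd and the list-of-edges encoding are assumptions for illustration, and the email element is left out of the sample.

```python
import xml.etree.ElementTree as ET

def to_ssd(element):
    """Turn an element into a list of (label, value) edges.

    An element without subelements becomes its text content; attributes
    are ignored in this sketch.
    """
    subelements = list(element)
    if not subelements:
        return (element.text or "").strip()
    return [(child.tag, to_ssd(child)) for child in subelements]

xml_text = "<person><name> John </name><tel> 112233 </tel></person>"
root = ET.fromstring(xml_text)
ssd = [(root.tag, to_ssd(root))]
```

Note that tel comes out as the string "112233" rather than a number, since plain XML content carries no base types.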
XML references
<person id = “p1”>
<name> John </name>
<tel> 112233 </tel>
</person>

<person id = “p2”>
<name> Peter </name>
<tel> 998877 </tel>
<boss idref = “p1”/>
</person>

Here id is an element identifier and idref is a reference attribute.
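How a reference might be followed can be sketched with Python's xml.etree.ElementTree. The wrapping db element, the dictionary by_id and the function boss_name are assumptions added for illustration; the parser itself attaches no special meaning to id or idref.

```python
import xml.etree.ElementTree as ET

# The two person elements, wrapped in an assumed root element db.
xml_text = """<db>
<person id="p1"><name> John </name><tel> 112233 </tel></person>
<person id="p2"><name> Peter </name><tel> 998877 </tel><boss idref="p1"/></person>
</db>"""

root = ET.fromstring(xml_text)

# Index every element that carries an id attribute.
by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}

def boss_name(person):
    """Follow the boss reference of a person, if there is one."""
    boss = person.find("boss")
    if boss is None:
        return None
    target = by_id[boss.get("idref")]
    return target.findtext("name").strip()

peters_boss = boss_name(by_id["p2"])
johns_boss = boss_name(by_id["p1"])
```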
Document Type Definitions
<!DOCTYPE db [
<!ELEMENT db (person*)>
<!ELEMENT person (name, age, email)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
An exercise on DTDs as schemas
<db>
<r1> <a> a1 </a> <b> b1 </b> </r1>
<r1> <a> a2 </a> <b> b2 </b> </r1>
<r2> <c> a1 </c> <d> b1 </d> </r2>
<r2> <c> c2 </c> <d> d2 </d> </r2>
<r3> <a> a1 </a> <c> b1 </c> </r3>
</db>
Write down a DTD for the data above!
Attributes in DTDs
<product>
<name language = “Swedish” department = “music”>
trumpet </name>
<price currency = “dollar”> 500 </price>
<length unit = “cm”> 25 </length>
</product>
<!ATTLIST name language CDATA #REQUIRED
               department CDATA #IMPLIED>
<!ATTLIST price currency CDATA #REQUIRED>
<!ATTLIST length unit CDATA #REQUIRED>
Reference attributes in DTDs
<!DOCTYPE people [
<!ELEMENT people (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person id ID #REQUIRED
                 boss IDREF #REQUIRED
                 friends IDREFS #IMPLIED>
]>
An exercise
<people>
<person id = “sven” boss = “olle”>
<name> Sven Svensson </name>
</person>
<person id = “olle” friends = “nils eva”>
<name> Olle Olsson </name>
</person>
<person id = “pelle” boss = “nils eva”>
<name> Per Persson </name>
</person>
</people>
Does this XML element conform to the previous DTD?
Limitations of DTDs as schemas
DTDs impose order
No base types
The types of IDREFs cannot be constrained
XSL - extensible stylesheet language
<bib>
<book>
<title> t1 </title>
<author> a1 </author>
<author> a2 </author>
</book>
<paper>
<title> t2 </title>
<author> a3 </author>
<author> a4 </author>
</paper>
<book>
<title> t3 </title>
<author> a5 </author>
<author> a6 </author>
</book>
</bib>
Template rules and XSL patterns
<xsl:template>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match = “bib/*/title”>
<result>
<xsl:value-of/>
</result>
</xsl:template>
Each xsl:template element is a template rule; the value of its match attribute, here bib/*/title, is an XSL pattern.
Result:
<result> t1 </result>
<result> t2 </result>
<result> t3 </result>
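The effect of the pattern bib/*/title can be imitated with the limited path syntax of Python's xml.etree.ElementTree. This is not an XSL processor, only a sketch of which elements the pattern selects; the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("""<bib>
<book><title> t1 </title><author> a1 </author><author> a2 </author></book>
<paper><title> t2 </title><author> a3 </author><author> a4 </author></paper>
<book><title> t3 </title><author> a5 </author><author> a6 </author></book>
</bib>""")

# bib is the root element here, so bib/*/title becomes */title relative to it:
# any child of bib, then a title child of that element.
titles = [t.text.strip() for t in bib.findall("*/title")]

# Wrap each selected title the way the second template rule does.
results = [f"<result> {t} </result>" for t in titles]
```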
Two exercises
select row: {title: Y, date: Z}
from biblio.paper X, X.title Y, X.date Z
{row: {title: “The spiral DNA”, date: 1956}, {title: “Origin”, date: 1848}, {title: “Kapital”, date: 1860}}

select row: {author: Y, date: Z}
from biblio.book X, X.author Y, X.date Z
Which authors have written a book or a paper in 1992?
select author: X
from biblio.(book | paper) Y, Y.author X
where Y.date = 1992

Which authors have written a book together with Jones?
select author: X
from biblio.book Y, Y.author X
where “Jones” in Y.author

Which authors have written both a book and a paper?
select author: A
from biblio.book B, biblio.paper P, B.author A
where B.author = P.author

select author: A1
from biblio.book B, biblio.paper P, B.author A1, P.author A2
where A1 = A2

List all publications in 1992, their types, and titles.
select publ: {type: L, title: T}
from biblio.L X, X.title T
where X.date = 1992
<!DOCTYPE db [
<!ELEMENT db (r1*, r2*, r3*)>
<!ELEMENT r1 (a, b)>
<!ELEMENT r2 (c, d)>
<!ELEMENT r3 (a, c)>
<!ELEMENT a (#PCDATA)>
<!ELEMENT b (#PCDATA)>
<!ELEMENT c (#PCDATA)>
<!ELEMENT d (#PCDATA)>
]>

<db>
<r1> <a> a1 </a> <b> b1 </b> </r1>
<r1> <a> a2 </a> <b> b2 </b> </r1>
<r2> <c> a1 </c> <d> b1 </d> </r2>
<r2> <c> c2 </c> <d> d2 </d> </r2>
<r3> <a> a1 </a> <c> b1 </c> </r3>
</db>