Dependency-Preserving Normalization of Relational and XML Data

Solmaz Kolahi

Department of Computer Science

University of Toronto

DBPL 2005



Motivation

Schema design: coming up with a “good” way of grouping the attributes of interest to avoid insertion, update, and deletion anomalies.

Goals:
  • Eliminating redundancies.
  • Preserving data.
  • Preserving data dependencies and constraints.

Well-known approaches for relational database design: BCNF, 4NF, and 3NF.

BCNF: eliminates all redundancies, may lose dependencies.

3NF: does not eliminate all redundancies, preserves all dependencies.
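The BCNF/3NF trade-off can be checked mechanically on a small schema. The sketch below is an illustration only: the function names are hypothetical, the key search is brute force, and the schema R(A, B, C) with AB → C and C → B is assumed here as a standard example of a schema that is in 3NF but not in BCNF.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = frozenset(combo)
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def violations(schema, fds, normal_form):
    """FDs from fds that violate the given normal form ("BCNF" or "3NF")."""
    prime = frozenset().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FD
        if closure(lhs, fds) == schema:
            continue  # left-hand side is a superkey
        if normal_form == "3NF" and (rhs - lhs) <= prime:
            continue  # 3NF tolerates FDs whose right-hand side is prime
        bad.append((lhs, rhs))
    return bad

schema = frozenset("ABC")
fds = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]
print(violations(schema, fds, "BCNF"))  # C -> B is reported
print(violations(schema, fds, "3NF"))   # nothing is reported
```

Here C → B breaks BCNF because C is not a superkey, while 3NF accepts it because B belongs to the candidate key AB. Decomposing to reach BCNF would lose AB → C.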


Motivation (cont’d)

How much redundancy does 3NF tolerate in order to preserve dependencies?
  • Applying an information-theoretic measure to 3NF.

Is it possible to achieve redundancy elimination and dependency preservation by representing relational data in XML documents?
  • Characterizing cases when an XML normal form, called XNF, guarantees both.
  • Providing a PTIME algorithm.

How do we achieve dependency preservation in XML normalization techniques?
  • Defining an equivalent of 3NF for XML.


Outline

Characterizing 3NF using an information-theoretic measure.

Converting relational data into redundancy-free XML documents.

XML dependency preservation and XML Third Normal Form.

Final Remarks.


Measure of Information Content

Proposed by Arenas & Libkin in PODS’03.

Used to measure the redundancy of a data value in a database instance with respect to a set of constraints.

Defined using information theory.

Intuitively, $\mathrm{RIC}_I(p \mid \Sigma)$ measures the information content of position $p$ in instance $I$ with respect to constraints $\Sigma$.

Example: $R(A, B, C)$, $\Sigma = \{A \to B\}$

  A  B  C
  1  2  3
  1  2  4

$\mathrm{RIC}_I(p \mid \Sigma) = 0.875$

  A  B  C
  1  2  3
  1  2  4
  1  2  5

$\mathrm{RIC}_I(p \mid \Sigma) = 0.781$

Measure of Information Content (cont’d)

A database specification $(R, \Sigma)$ is defined as well-designed if for every instance $I$ of $(R, \Sigma)$ and every position $p$ in $I$, $\mathrm{RIC}_I(p \mid \Sigma) = 1$. That is, every position in every instance carries the maximum amount of information.

It is known that:
  • If $\Sigma$ contains only FDs, $(R, \Sigma)$ is well-designed iff it is in BCNF.
  • If $\Sigma$ contains FDs and MVDs, $(R, \Sigma)$ is well-designed iff it is in 4NF.

We would like to apply this measure to 3NF.


Characterizing 3NF

Theorem. The specification $(R, \Sigma)$ is in 3NF iff for every instance $I$ of $(R, \Sigma)$ and every position $p = (t, A)$ in $I$, $\mathrm{RIC}_I(p \mid \Sigma) < 1$ implies $A$ is a prime attribute.

Question: Can this number be arbitrarily small when we have 3NF? In other words, how much redundancy is allowed by 3NF?


Characterizing 3NF (cont’d)

Theorem. For every $\varepsilon \in (0, 1]$, there exists a relation schema $R$, a set of FDs $\Sigma$ over $R$, an instance $I$ of $(R, \Sigma)$, and a position $p$ in $I$ such that $(R, \Sigma)$ is in 3NF, and $\mathrm{RIC}_I(p \mid \Sigma) < \varepsilon$.

$R(A, B, B_1, \ldots, B_m)$, $\Sigma = \{AB \to B_1 \cdots B_m,\; B_1 \to A,\; \ldots,\; B_m \to A\}$

  A  B  B_1  ...  B_m
  1  1  1    ...  1
  1  2  1    ...  1
  1  3  1    ...  1
  ...
  1  k  1    ...  1

As $k$ and $m$ grow, $\mathrm{RIC}_I(p \mid \Sigma)$ tends to 0, so $\mathrm{RIC}_I(p \mid \Sigma) < \varepsilon$ can be reached for any $\varepsilon \in (0, 1]$.
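To make the construction concrete, the following sketch builds the k-row instance and checks it against a set of FDs. It is only an illustration under stated assumptions: the columns are taken to be (A, B, B_1, ..., B_m), and the FDs are assumed to be AB → B_1...B_m together with B_i → A for each i. The function names are hypothetical.

```python
def build_instance(k, m):
    """Rows over (A, B, B_1, ..., B_m): A = 1, B = 1..k, every B_i = 1."""
    return [(1, b) + (1,) * m for b in range(1, k + 1)]

def satisfies(rows, lhs, rhs):
    """True if the FD lhs -> rhs (tuples of column indexes) holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        val = tuple(row[i] for i in rhs)
        # Two rows agreeing on lhs must agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True

def assumed_fds(m):
    """Assumed FDs: AB -> B_1..B_m, and B_i -> A for every i (column indexes)."""
    b_cols = tuple(range(2, m + 2))
    return [((0, 1), b_cols)] + [((c,), (0,)) for c in b_cols]

rows = build_instance(k=4, m=3)
all_hold = all(satisfies(rows, lhs, rhs) for lhs, rhs in assumed_fds(3))
print(rows)
print(all_hold)
```

Under these assumptions every attribute is prime (AB and each B B_i are keys), yet the value of A in any row is forced by each of the m FDs B_i → A together with the other k − 1 rows, which is the redundancy the theorem exploits.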


Converting into XML: Example

$R(A, B, C)$, $F = \{AB \to C,\; C \to B\}$

  A  B  C
  1  1  1
  1  2  2
  2  1  1
  3  2  2

Hierarchical translation into XML:

  r
    B (@b = 1)
      A (@a = 1)
        C (@c = 1)
      A (@a = 2)
        C (@c = 1)
    B (@b = 2)
      A (@a = 1)
        C (@c = 2)
      A (@a = 3)
        C (@c = 2)

DTD:

  r → B*
  B → A*, with attribute @b
  A → C*, with attribute @a
  C → ε, with attribute @c

Functional Dependencies:

  r.B.@b → r.B
  {r.B, r.B.A.@a} → r.B.A
  {r.B.A, r.B.A.C.@c} → r.B.A.C
  {r.B.A.@a, r.B.@b} → r.B.A.C.@c
  r.B.A.C.@c → r.B.@b
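The hierarchical translation can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal illustration, not the talk's algorithm: the function name `hierarchical_xml` is hypothetical, and the nesting order B, A, C is assumed from the example figure.

```python
import xml.etree.ElementTree as ET

def hierarchical_xml(rows, order):
    """Nest rows (dicts) by the attribute names in order, outermost first."""
    root = ET.Element("r")
    for row in rows:
        parent = root
        for attr in order:
            value = str(row[attr])
            # Reuse a child that already carries this value, else create it,
            # so each value is stored once per parent.
            child = next(
                (c for c in parent.findall(attr) if c.get(attr.lower()) == value),
                None,
            )
            if child is None:
                child = ET.SubElement(parent, attr, {attr.lower(): value})
            parent = child
    return root

rows = [dict(zip("ABC", t)) for t in [(1, 1, 1), (1, 2, 2), (2, 1, 1), (3, 2, 2)]]
doc = hierarchical_xml(rows, order="BAC")
print(ET.tostring(doc, encoding="unicode"))
```

Each B value appears once at the top level, so the pairing of a C value with its B value is no longer repeated per tuple.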


Converting into XML: Problem Statement

Given: relation specification $(R, F)$, where:
  • $R(A_1, \ldots, A_m)$ is a relation.
  • $F$ is a set of FDs over $R$.

Question: is there an XML representation $(D, \Sigma)$, where:
  • $D$ is a DTD.
  • $\Sigma$ is a set of XML FDs over $D$.
such that:
  • $(D, \Sigma)$ is a dependency-preserving hierarchical translation of $(R, F)$.
  • $(D, \Sigma)$ does not allow redundancy.

By dependency preservation we mean:

A relational instance is valid w.r.t. $(R, F)$ iff its hierarchical XML representation is valid w.r.t. $(D, \Sigma)$.


XML Functional Dependencies

Proposed by Arenas & Libkin in PODS’02.

Based on a relational representation of XML trees: tree tuples.

[Figure: the XML tree of the example (root r; B elements with @b; A elements with @a; C elements with @c), with one tree tuple highlighted at a time.]

A tree tuple in a DTD $D$ is a mapping

$t : \mathit{paths}(D) \to \mathit{nodes} \cup \mathit{Strings} \cup \{\bot\}$

in a consistent way.

XML Functional Dependencies (cont’d)

An XML FD is an expression of the form:

$\{p_1, \ldots, p_n\} \to p$, where $p_1, \ldots, p_n, p \in \mathit{paths}(D)$.

An XML tree $T$ satisfies $\{p_1, \ldots, p_n\} \to p$ iff for every two tree tuples $t_1, t_2$ in $T$:

$(\forall i \in [1, n] \;\; t_1.p_i = t_2.p_i \neq \bot) \Rightarrow t_1.p = t_2.p$

[Figure: the XML tree of the example (root r; B elements with @b; A elements with @a; C elements with @c).]

Example FD on this tree:

r.B.A.C.@c → r.B.@b
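The satisfaction condition can be tested directly once tree tuples are written down. The sketch below is an illustration under assumptions: tree tuples are represented as dicts from paths to node identifiers or strings, a missing path stands for ⊥, and the node identifiers (B1, A1, ...) are made up for the example tree with root r.

```python
def tree_tuple(b_node, b, a_node, a, c_node, c):
    """One tree tuple of the example tree: a value for every path of the DTD."""
    return {
        "r": "n0",
        "r.B": b_node, "r.B.@b": b,
        "r.B.A": a_node, "r.B.A.@a": a,
        "r.B.A.C": c_node, "r.B.A.C.@c": c,
    }

def satisfies_fd(tuples, lhs, rhs):
    """True if every two tuples that agree (non-null) on lhs also agree on rhs."""
    for i, t1 in enumerate(tuples):
        for t2 in tuples[i + 1:]:
            agree = all(
                t1.get(p) is not None and t1.get(p) == t2.get(p) for p in lhs
            )
            if agree and t1.get(rhs) != t2.get(rhs):
                return False
    return True

# The four tree tuples of the example tree, one per C leaf.
tuples = [
    tree_tuple("B1", "1", "A1", "1", "C1", "1"),
    tree_tuple("B1", "1", "A2", "2", "C2", "1"),
    tree_tuple("B2", "2", "A3", "1", "C3", "2"),
    tree_tuple("B2", "2", "A4", "3", "C4", "2"),
]

print(satisfies_fd(tuples, ["r.B.A.C.@c"], "r.B.@b"))  # the example FD
print(satisfies_fd(tuples, ["r.B.A.@a"], "r.B.@b"))    # an FD that fails
```

The second FD fails because the value @a = 1 occurs under both B elements.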


XNF: a Normal Form for XML

$(D, \Sigma)$ is in XNF if for each non-trivial FD $X \to p.@l \in (D, \Sigma)^+$, $X \to p$ is also in $(D, \Sigma)^+$.

XNF generalizes BCNF for the case of XML.

XNF guarantees zero redundancy.

Question: Given $R(A_1, \ldots, A_m)$ and a set of FDs $F$, can we translate $(R, F)$ into $(D, \Sigma)$ in XNF and preserve FDs?

Answer: Yes, in some cases, for which we have a precise characterization.


Converting into XML: Conditions

Given $R(A_1, \ldots, A_m)$ and FDs $F$ over it:

Condition: We can put the attributes of $R$ in an order $A_{i_1}, \ldots, A_{i_m}$ s.t. for every non-trivial FD $X \to A_{i_j} \in F^+$ and every $k < j$, the FD $X \to A_{i_k}$ is also in $F^+$.

Theorem. $(R, F)$ has an FD-preserving XNF representation iff the above condition holds.

We provide a PTIME algorithm that checks the condition and produces the XML representation.
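The condition can be explored on small schemas with a brute-force search. The sketch below is not the PTIME algorithm from the talk: it enumerates all attribute orderings and all attribute subsets, so it is exponential. It also assumes the condition reads as "whenever X determines an attribute non-trivially, X also determines every attribute placed before it". The function names are hypothetical.

```python
from itertools import combinations, permutations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def ordering_ok(order, schema, fds):
    """Check the ordering condition for every non-trivial FD X -> a in F+."""
    pos = {a: i for i, a in enumerate(order)}
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            cl = closure(x, fds)
            for a in cl - x:  # X -> a is non-trivial and in F+
                if not set(order[:pos[a]]) <= cl:
                    return False  # some earlier attribute is not determined by X
    return True

def find_ordering(schema, fds):
    """First ordering (in sorted permutation order) meeting the condition, or None."""
    for order in permutations(sorted(schema)):
        if ordering_ok(order, schema, fds):
            return order
    return None

fds_abc = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]
print(find_ordering(frozenset("ABC"), fds_abc))

fds_abcd = [(frozenset("A"), frozenset("B")), (frozenset("C"), frozenset("D"))]
print(find_ordering(frozenset("ABCD"), fds_abcd))
```

For R(A, B, C) with AB → C and C → B the search returns the nesting order B, A, C. For R(A, B, C, D) with A → B and C → D no ordering exists, because whichever of B and D comes later is preceded by an attribute its determinant does not fix.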


XNF Normalization

[Figure: original XML tree. The root company has two branch children (@bid "mr201", @type "marketing"; @bid "ad005", @type "admin"). Each branch has a clients child whose client elements carry @name, @postal_code, and @city: cl1 ("M4Y 2R5", "Toronto"), cl2 ("M4Y 2R5", "Toronto"), cl3 ("K1A 0H9", "Ottawa"), cl4 ("K2B 1S5", "Ottawa").]

FDs on the original document:

company.branch.clients.client.@postal_code → company.branch.clients.client.@city

{company.branch.clients.client.@city, company.branch.@type} → company.branch

[Figure: normalized XML tree. company now has two city_info children next to the two branch children: city_info (@city "Toronto") with one code child (@val "M4Y 2R5"), and city_info (@city "Ottawa") with two code children (@val "K1A 0H9", @val "K2B 1S5"). The client elements keep only @name and @postal_code.]

The FD

{company.city_info.@city, company.branch.@type} → company.branch

does not hold for the new document.

XML Dependency Preservation

The concept of dependency preservation is more involved for XML: implication of FDs has to be considered in the presence of the DTD.

There are XML specifications $(D, \Sigma)$ for which there is no dependency-preserving XML representation in XNF.

Complete redundancy elimination cannot be achieved for some XML documents without losing constraints.

We need an equivalent of 3NF for XML.


Prime Attribute Path

Given a DTD $D$ and a set of XML FDs $\Sigma$ over $D$:

Definition. Attribute path $p.@l \in \mathit{paths}(D)$ is called prime if there is a nontrivial FD $S' \to p' \in (D, \Sigma)^+$ such that:
  • $p.@l \in S'$;
  • $p'$ is an element path;
  • $S'$ is minimal: $(S' - \{p.@l\}) \to p' \notin (D, \Sigma)^+$.

FDs of the company example:

company.branch.clients.client.@postal_code → company.branch.clients.client.@city

{company.branch.clients.client.@city, company.branch.@type} → company.branch


Third Normal Form for XML

Given a DTD $D$ and a set of XML FDs $\Sigma$ over $D$:

Definition. $(D, \Sigma)$ is in XML third normal form (X3NF) iff for every nontrivial FD $S \to p.@l \in (D, \Sigma)^+$:

1. the FD $S \to p$ is also in $(D, \Sigma)^+$; or

2. attribute path $p.@l$ is prime.

The company example is in X3NF.

X3NF generalizes 3NF for XML.


Final Remarks

3NF is good, because it preserves dependencies and eliminates redundancies for non-prime attributes.

But it admits arbitrary redundancy on prime attributes.

We can sometimes achieve redundancy elimination and dependency preservation by converting into XML.

Normalizing XML to achieve redundancy elimination can result in losing FDs.

Future Work:
  • Formally defining FD-preservation for XML.
  • Verifying that decomposition based on the X3NF definition will preserve FDs.


Backup Slides


Converting into XML: Algorithm

Question: Can we convert $(R, F)$ into hierarchical XML form $(D, \Sigma)$ in XNF?

[Figure: worked example of the algorithm on a relation schema $R$ with FD set $F$. The example shows two sub-problems $(R_1, F_1)$ and $(R_2, F_2)$ and an attribute ordering for each; the attribute names, the FDs, and the resulting tree are not legible in this copy.]

Converting into XML (cont’d)

So far we talked about hierarchical translation of relations into XML.

What if we allow XML elements to represent more than one relational attribute (a semi-hierarchical translation)?

  r
    AB (@a = 1, @b = 1)
      C (@c = 1)
    AB (@a = 1, @b = 2)
      C (@c = 2)
    AB (@a = 2, @b = 1)
      C (@c = 1)
    AB (@a = 3, @b = 2)
      C (@c = 2)

Given a relation $R$ and FDs $F$ over it:

Theorem. $(R, F)$ has a redundancy-free semi-hierarchical translation iff it has a redundancy-free hierarchical translation.