Upload
elaine-williamson
View
264
Download
1
Embed Size (px)
Citation preview
PART 2
RELATIONAL DATABASES
Chapter 7
Relational-Database Design normalization of relational
schemas
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
3
Main Contents in This Chapter VII-1 Why normalization needed?
pitfalls in relational-database design (§7.1 ) principles of relation normalization
VII- 2 Functional dependency (§7.4) VII- 3 Definitions of normal forms
the first normal form(§7.2 ), second normal form, third normal form(§7.3.4 ), BCNF(§7.3.2 )
VII- 4 Decomposition properties (§7.4.4, §7.4.5) VII- 5 3NF decomposition (§7.5.2) VII- 6 BCNF decomposition (§7.5.1)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
4
VII-1-1 Pitfalls in Relational-Database design (§7.1) As shown in Fig. 7.0.1, logical DBS design consists of
initial relational schema generating (§2.9) relational schema normalizing
A bad DB design, i.e. schema not being normalized well, may result in
repetition of information inability to represent certain information
VII-1 Why Normalization Needed ?
Application area/problem in real world
requirements analysis
Specification of functional requirements
conceptual DBS design
Conceptual DBS schema , i.e. E-R schema
logical DBS design
logical DBS schema , i.e. relational data schema
physical DBS design
physical DBS schema about physical storage structure and access method
DBMSindependent
DBMS dependent
Initial relational schema generating(§2.9)
Relational schema normalizing (chapter 7)
Fig.7.0.1 DBS design
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
6 return
Fig.7.0.2 Case study used in chapter 7
VII-1-1 Pitfalls in Relational-Database design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
7
For all schema concerning loans in Fig.7.0.2 Branch-schema, Borrower-schema, Loan-schema combine these three relations into one single relation
Lending-schema
= (branch-name, branch-city, assets, customer-name, loan-number, amount)
relation lending is shown in Fig.7.1
VII-1-1 Pitfalls in Relational-Database design (cont.)
Fig.7.1 Sample lending(Lending-schema) relation
Perryridge, Horseneck, 1700000, Adams, L-31, 1500t1
New-York, Long-island, 200000, null, null, null t2
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
9
Inserting problem and information redundancy adding a new loan to the DB
the loan is made by the Perryridge branch to Adams in the amount of $1500, and loan-number is L-31
tuple t1 = (Perryridge, Horseneck, 1700000, Adams, L-31, 1500) is inserted into DB
data for branch-name, branch-city, assets are repeated for each loan in the 3rd row, the 10th row, and the last row that a branch “Perryridge”makes, space is wasted
VII-1-1 Pitfalls in Relational-Database design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
10
Deleting problem e.g. the branch Perryridge is canceled, all loans in this
branch should then be removed in Fig 7.1, every tuples containing Perryridge branch
should be deleted the 3rd row and the 10th row
VII-1-1 Pitfalls in Relational-Database design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
11
Updating problem and information redundancy information redundancy in DB complicates updating,
introducing possibility of inconsistency of assets value e.g. change assets of Perryridge branch from 1700000 to
1900000, in Fig7.1, every tuples belonging to Perryridge
branch should be updated
VII-1-1 Pitfalls in Relational-Database design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
12
Information Representation Problem to represent directly information about a new-opened branch
(branch-name, branch-city, assets) in which there no exists loan, a tuple containing null vaules, such as
t2 = (New-York, Long-island, 200000, null, null, null) , is inserted into the relation lending null values in DB complicate data handling in DBS
Why ? lending(branch-name, branch-city, assets, customer-name,
loan-number, amount) includes two types of information about only bank and
information about loan
VII-1-1 Pitfalls in Relational-Database design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
13
in E-R diagram of the banking enterprise, the relationship from bank to loan is one-to-many, i.e. a bank branch may make several loans
An improved design: decompose the relation schema Lending-schema into:
Branch-schema = (branch-name, branch-city,assets)
Loan-info-schema = (customer-name, loan-number, branch-name, amount)
VII-1-1Pitfalls in Relational-Database design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
14
在逻辑 DBS 设计过程中,将概念 DB 设计结果 E—R 图进行转换,得到面向特定应用领域的初始关系模式集
这些初始关系模式集中可能存在多种(作为完整性约束的)关系模式属性间的 数据依赖 (Data Dependencies) 关系
函数依赖 (functional dependencies, FD, §7.4) 多值依赖 (Multivalued Dependencies, MVD, §7.6,
Appendix C/C.1 ) 连接依赖 (Join Dependencies, JD, Appendix C/C.2)
VII-1-2 Principles of Relation Normalization
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
15
如果直接根据初始关系模式构造 DBS,由于初始关系模式中数据依赖关系的存在,可能会违反 DB 的完整性约束,导致DBS使用过程中出现如下问题,影响 DBS的正确性、性能、效率
数据冗余问题、插入问题、更新问题、删除问题(pitfalls,§7.1)
VII-1-2 Principles of Relation Normalization (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
16
因此,对初始关系模式集,需要根据关系规范化理论,在保证关系模式的
函数无损连接性( lossless join) , 和 /或 函数依赖保持性 (dependency preservation)
约束前提下,对关系模式集进行规范化处理——等价变换 /模式分解
关系模式规范化主要步骤为 根据函数依赖的 Armstrong’s 公理系统 ( §7.4.1 )和多值
依赖的公理系统 (Appendix C/C.1),从初始关系模式集中已知的函数依赖和多值依赖出发,推导出初始关系模
式集中所有的函数依赖( §7.4.1/7.4.2/7.4.3)和多值依赖
VII-1-2 Principles of Relation Normalization (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
17
对具有函数依赖和多值依赖的初始关系模式集,采用 模式分解算法 ,
, 对其进行(等价)分解和变换,将其转换为各种范式形式,包括:
1NF(§7.2) 、 2NF (Exercise 7.26) 、 BCNF(§7.3.2 ) 、 3NF (§7.3.4) 、 4NF(§7.6.2)
, 以消除模式集中的函数依赖和多值依赖带来的负面影响 , 保证数据库系统的完整性
关系模式规范化处理的基本要求为 : 静态关系具有第一范式形式 (理论上)动态关系最好具有第三范式形式
VII-1-2 Principles of Relation Normalization (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
18
3 种数据依赖间的关系 函数依赖是特殊的多值依赖 多值依赖又是连接依赖的特例
范式 1NF 、 2NF 、 3NF 、 BCNF 可以看作由符合范式要求的各种关系模式组成的关系模式的集合
e.g. 1NF = { R | R 满足第一范式的定义 } 各种范式间的关系,参见 Fig.7.A.1
1NF 2NF 3NF BCNF 4NF 5NF
VII-1-2 Principles of Relation Normalization (cont.)
1NF
2NF
3NF
BCNF
4NF
5NF
消除非主属性对键的部分函数依赖
消除非主属性对键的传递函数依赖
消除主属性对键的部分和传递函数依赖
消除非平凡且非函数依赖的多值依赖
消除非平凡连接依赖
函数依赖保持性
无损连接
性
Fig.7.0.3 关系范式间相互关系
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
20
给定一个关系模式,可以采用规范化算法将其转换为1NF 、 2NF 、 3NF 、 BCNF
对连接依赖和第五范式,无相应的模式规范化算法
VII-1-2 Principles of Relation Normalization (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
21
Function: f: X →Y,
x∈ X, y∈ Y, y = f (x)
e.g. y = 2x for x1, x2∈X, if x1= x2, then f (x1) = f (x2)
Contents in section VII-2 concepts about FD, §7.3.1 Armstrong Axioms to derive all implied FD, i.e. closure
of FD, §7.4.1 an efficient algorithm to compute the closure§7.4.1 “minimal” closure of FD, §7.4.3
VII-2 Functional Dependencies (§7.3/7.4 )
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
22
Definition1. Functional dependency(FD) holds on R
/* 函数依赖 FD 在关系模式 R 上成立 / 保持 */
Let R be a relation schema, and
R , R
, the functional dependency holds on schema R
if and only if
for any legal relations r(R), whenever any two tuples ti and tj in r agree on the attributes , they also agree on the attributes , that is,
ti[] = tj [] ti[ ] = tj [ ]
VII-2-1 Basic Concepts
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
23
Keys in relational schema can be defined in terms of FD K is a superkey for relation schema R, if and only if K R K is a candidate key for R, if and only if
K R, and for no K, R
VII-2-1 Basic Concepts (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
24
Definition2. (A particular) r(R) satisfy FD, or FD is satisfied by r(R)
/* 关系 r(R) 满足函数依赖集 FD, FD 被 r 满足 */
Given FD ={ } holding on R, and a relation r(R) on R, if r is legal under functional dependency set FD, r is said to
satisfy FD note:
r is legal under FD={ } means for ti, tj r(R), if t∈ i[]= tj [], then ti[]= tj []
FD requires that the values for a certain set of attributes determines uniquely the value for another set of attributes
VII-2-1 Basic Concepts (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
25
Given relation r(R) shown below, which FD is satisfied by r A. A →B B. AC → B C. BC →A D. B → C
Fig. 7.0.4
Example One
A B C
1 4 2
3 5 6
3 4 6
7 3 8
9 1 0
t1
t2
t3
t4
t5
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
26
t2[A]=t3[A]=3, t2[B]=5 ≠ t3[B]=4,
A →B is not satisfied
t2[AC]=t3[AC]=36, t2[B]=5 ≠ t3[B]=4, AC →B is not satisfied
BC →A is satisfied
t1[B]=t3[B]=4, t1[C]=2 ≠ t3[C]=6,
B →C is not satisfied
Example One (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
27
Consider the schema R=(employee_ID, date, turnover per-day, department_name, manager) that describes the information about the turnover per-day ( 日营业额 ) for each employee everyday, the department that each employee works at, and the manager of the department the employee works at.
it is assumed that at every day, each employee has only one turnover per-day each employee works at only one department each department is managed by only one manager
According to the descriptions mentioned above, list all the functional dependencies that hold on R
Example Two
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
28
Answer F = { employee_ID, date → turnover per-day, employee_ID →department_name, department_name →manager }
Example Two (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
29
For a schema R, there may be more than one relation instance r(R), i.e. r1(R) , r2(R) , r3(R) ,…, rm(R) , defined on R
e.g. consider R= (A, B, C, D) , and with respect to the instances r1(R) and r2(R) in Fig.7.5
Relation r(R) satisfies vs. holds on schema R if holds on R, then every legal r(R) satisfies this R but for schema R and , if only some ri(R) satisfies R, may not hold on R.
e.g. in Fig.7.5, A C and AB D are satisfied by r1(R) , but A C is not satisfied by r2(R) , so A C does not holds on R
FD holds on R vs FD is satisfied by r(R)
Fig.7.5 instance r1 and r2 defined on schema R
t2
b3
r1
t2
r2
a1 b4 c2 d5 t6
FD1= {A C, AB D}, satisfied by r1
FD2= {AB D}, satisfied by r2
A C does not hold on R
b3
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
31
E.g. True or false ? for a relation r(R) defined on schema R, if r satisfies functional
dependency FD ={ }, then FD holds on schema R answer: false
With respect to Fig. 2.4 in general, customer-street customer-city does not hold on
customer schema, because two different cities may have the same street
but customer-street customer-city is satisfied by relation instance customer as shown in Fig.2.4
VII-2-1 Basic Concepts (cont.)
Fig. 2.4 The customer Relation
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
33
Functional dependencies allow us to express constraints that cannot be expressed using superkeys
E.g. consider the schema in Fig.7.2: bor_loan= (customer-id, loan-number, amount). with respect to the primary key, we have
customer-id, loan-number amount but the following FD also holds on bor_loan loan-number customer-name
VII-2-1 Basic Concepts (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
34
Def. Trivial FD ( 平凡依赖 ) A functional dependency is trivial , if for any schema R, where R, R, is satisfied by all
relations/instances r on R or: on R is trivial if it holds on any schema R, where
R, R E.g. A A is trivial, customer-name customer-name In general, is trivial if
e.g. customer-name, loan-number customer-name
if t1[customer-name, loan-number ]= t2[customer-name, loan-number ]
then t1[customer-name]= t2[customer-name]
VII-2-1 Basic Concepts (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
35
Def. Transitive dependency ( 传递函数依赖 )
E.g. Student(sno, sname, address, depart) for transitive dependency sno→address, there are snosname, sname address
VII-2-1 Basic Concepts (cont.)
a functional dependency is transitive if:
,( ), , ; and is called transitive dependent on
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
36
Def. Partial dependency ( 部分函数依赖 ) a functional dependency is partial if there is a proper
subset of , i.e. , such that ; and is partially dependent on /* 非最小化 / 冗余
E.g. Student(sno, sname, address, depart) for partial dependency
(sno, sname)→address
, there are sname→address, sno→address
VII-2-1 Basic Concepts (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
37
Given a set F set of functional dependencies, there may be some other functional dependencies that are logically implied by F
for example, R(A, B, C)
if F ={A B, B C}, then it is inferred that
A C e.g. Fig. 7.0.5 in next slide
For a given FD set F on R, how to derive all the other functional dependencies on R ?
closures of functional dependencies
VII-2-2 Closure of a Set of Functional Dependencies (cont.) (§7.4.1)
A B C
1 4 2
3 5 6
4 4 2
7 3 8
9 1 0
t1
t2
t3
t4
t5
Fig. 7.0.5
F ={A B , B C };
A C is logically implied by F
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
39
Def. Given a schema R, a functional dependency f on R is logically implied by a set of FD F on R , if
every instance r(R) that satisfies F also satisfies f e.g. Fig. 7.0.5
Def. Given a set F of functional dependencies, the closure of F, denoted as F+
F+ = { f | f is logically implied by F } e.g. in Fig. 7.0.5, {A B , B C }+
= {A B , B C, A C }
VII-2-2 Closure of a Set of Functional Dependencies (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
40
F+ can be derived on the basis of Armstrong’s axioms Three basic Armstrong’s Axioms e.g.
if , then reflexivity, 自反律
if , then augmentation, 增广律
if , and , then transitivity ,传递律
For the proof of reflexivity and augmentation, refer to Appendix A
VII-2-2 Closure of a Set of Functional Dependencies (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
41
Other three additive Armstrong’s axioms union:
if holds and holds, then holds decomposition:
if holds, then holds and holds pseudotransitivity:
if holds and δ holds, then δ holds These three additive rules can be derived from the three basic
Armstrong’s Axioms mentioned above
VII-2-2 Closure of a Set of Functional Dependencies (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
42
Question Which rule about functional dependencies shown below is
right A. if then B if AC , BCD then ABD C. if ABC then BC D if , then
Answer: B
An Example of Functional Dependency
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
43
R = (A, B, C, G, H, I) F = { A B
A C CG H CG I B H }
Some members of F+
A H by transitivity from A B and B H
An Example - Closure of Functional Dependency
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
44
AG I by augmenting A C with G, to get AG CG then by transitivity with AG CG and CG I
CG HI by union with CG H and CG I
An Example - Closure of Functional Dependency (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
45
F+ = Frepeat
for each functional dependency f in F+
apply reflexivity and augmentation rules on f add the resulting functional dependencies to F+
for each pair of functional dependencies f1and f2 in F+
if f1 and f2 can be combined using transitivity then add the resulting functional dependency to
F+
until F+ does not change any further
VII-2-2 Closure of a Set of Functional Dependencies (cont.)
Fig.7.8 Algorithm to compute F+
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
46
Def. An attribute B is functionally determined by if B Def. Given a set of attributes , the closure of under F ,
denoted by +, is
{ | is functionally determined by under F} is functionally determined by under F , if and only if
∈ F+
+ is in F+
E.g. R(A, B, C, D) and F={A B, B C} F+= {A B, B C, A C, …} (A) += (ABC) , A ABC ∈ F+
VII-2-3 Closure of Attributes (§7.4.2)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
47
Input: , F
Output: +
result := ; while (changes to result) do
for each in F do begin
if result /* result =(, …)
then result := result end
Fig.7.9 An efficient algorithm to compute + under F
VII-2-3 Closure of Attributes (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
48
R = (A, B, C, G, H, I) F = {A B,
A C,
CG H, CG I B H }
Computing (AG)+
1. result = AG /* or denoted as {A, G }
2. result = ABCG /* A C , A B
3. result = ABCGH /* CG H
4. result = ABCGHI /* CG I
/* or { A, B, C, G, H, I }
An Example
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
49
(AG)+ = R, AG is a superkey of R
Is AG a candidate key? step1. is AG a super key?
does AG R? == Is (AG)+ = R yes
step2. is any subset of AG a superkey? does A R? == is (A)+ = R ?, no does G R? == is (G)+ = R ?, no
so, AG is a candidate key
An Example (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
50
Usage-I. Testing for superkey To test whether is a superkey of R under F , i.e. whether
R, we check if R = +
Usage-II. Testing functional dependencies To determine whether or not holds on R under F, we
check if +
Usage-III. Computing closure F+
for each γ R, computeγ+={S} under F;
for each S γ+, output γS as a functional dependency in F+
VII-2-3 Closure of Attributes (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
51
Def. For functional dependencies F and G, if F+= G+
then F and G are equivalent
E.g. F={A B , B C } is equivalent to
G={A B , B C, A C }
VII-2-3 Closure of Attributes (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
52
As integrity constraints, functional dependency set F on DB should be checked when DB are modified
checking all in F is time-consuming F may have redundant dependencies that can be inferred from
the others It is desirable to test a a “minimal” set of functional dependencies
equivalent to F Intuitively, a canonical cover ( 正则覆盖 ) of F, denoted as Fc, is
a “minimal” set of functional dependencies equivalent to F without redundant functional dependencies or attributes F+ = Fc
+
VII-2-4 Canonical Cover (§7.3.4)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
53
F, F+, and FC
FC
FF+
F+ = Fc+
grow
shrink
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
54
For {A B, B C, A C}, A C is redundant
On LHS, F={A B, B C, AC D} is equivalent and can be simplified into
Fc={A B, B C, A D}
i.e. C in AC D is redundant, because F+ = Fc
+
On RHS of FD, F={A B, B C, A CD} is equivalent and can be simplified into
Fc={A B, B C, A D}
Examples of Redundancy in F
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
55
Def. Consider functional dependency set F and in F, attribute A is extraneous in extraneous in , , if
A , and F implies/is equivalent to (F – { }) {( – A)
} that is, F+ (F – { }) {( – A) }
attribute A is extraneous in extraneous in , , if A , and the set of functional dependencies (F – { }) { ( – A)} implies/is equivalent to
F that is, ((F – { }) {) ( – A)})+ F
VII-2-4 Canonical Cover (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
56
Consider a set F of functional dependencies and the functional dependency in F ,
to test if attribute A is extraneous
1. compute (-A)+ using the dependencies in F
2. check ∈ ( – A)+ ?
if it does, then ( –A) holds, A is extraneous to test if attribute A is extraneous in
1. compute + using only the dependencies in (F – { }) { ( – A)},
2. check A ∈ + ? if it does, A holds, A is extraneous
VII-2-4 Canonical Cover (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
57
For F = {AB CD, A E, E C }, C in AB CD is extraneous, because
F implies {AB D }; or (AB)+ under F’={AB D, A E, E C } is {ABCDE},
which includes C, thus C is extraneous
Examples of Extraneous Attributes
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
58
For F = {A C, AB C }, B is extraneous in AB C, because F implies {A C }; or (AB – {B})+ = (A) + under F is {AC}, and {AC} contains the
RHS of AB C, A C can be inferred from F
For F = {A C, AB CD}, C is extraneous in AB CD, because
{A C, AB D} implies F; or (AB) + under {A C, AB D}={ABCD}, and
{ABCD}contains C
Examples of Extraneous Attributes (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
59
Def. A canonical cover Fc for F is a set of dependencies such that
F is equivalent to Fc, i.e. F+= Fc+ , that is
F logically implies all dependencies in Fc
Fc logically implies all dependencies in F
no functional dependency in Fc contains an extraneous attribute /* 最简化
each left side of functional dependency in Fc is unique, that is, there are no two dependencies 1 1, 2 2 in Fc , such that 1 = 2
VII-2-4 Canonical Cover (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
60
Fc,=F;
repeat 1. Use the union rule to replace any dependencies in F
1 1 and 1 2 with 1 1 2
2. Find a functional dependency with an extraneous attribute either in or in
if an extraneous attribute is found, delete it from
until Fc does not change
Fig.7.10 Algorithms of computing Fc
Algorithms of computing Fc
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
61
Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
Note:
for a set F of functional dependencies, there may be several Fc,
VII-2-4 Canonical Cover (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
62
Computing a Canonical Cover for
R = (A, B, C) F1 = { A BC ,
B C,
A B,
AB C} applying the Union rule to combine A BC and A B into
A BC, Fc becomes
F = {A BC, B C, AB C} A is extraneous in AB C, because
F logically implies {A BC, B C} {B C}; or
Example of Canonical Cover
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
63
{AB-A}+ ={B}+ under F is {BC}, and contains C, Fc is now {A BC, B C}
C is extraneous in A BC , because {A B, B C} logically implies {A BC, B C}, or (A) + under {A B, B C} is {BC}, and contains C
Fc is: A B B C
Example of Canonical Cover (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
64
VII-3 Normal Forms
According to functional dependencies, multivalued dependencies, and join dependencies existing in relational schemas, schemas can be classified as
First Normal Form, 1NF Second Normal Form, 2NF Third Normal Form, 3NF Boyce-Codd Normal Form, BCNF Forth Normal Form, 4NF Fifth Normal Form, 5NF
Normal forms can be obtained by several decomposition algorithms on schemas
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
65
VII-3-1 First Normal Form (§7.2)
A attribute domain is atomic if its elements are considered to be indivisible units
Examples of non-atomic domains domain of multi-valued attribute
e.g. dependent-name {{Smith}, {John, Alice}, {Johnson, Brook}}
domain of composite attribute e.g identification numbers of a student, like CS101, can be
broken up into parts, “CS” and “101”
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
66
VII-3-1 First Normal Form (cont.)
Def. A relational schema R is in first normal form if the domains of all attributes of R are atomic
1NF = { R | R is in first normal form}
In relational DBS, it is desired that all relations are in first normal Non-atomic (e.g. composite or multi-valued ) domains in relations
complicate query, result in redundant/repeated storage of data E.g.1 The identification number of the employee is defined as the
string of the form CS0012 or EE1127 first two characters are extracted to find the department, the
domain of identification numbers is not atomic
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
67
to find a employee’s department, application programs should extract information about department from the DB query results
E.g.2. suppose relationship between account(account-number, branch-name, balance, set-of-owners)
customer (customer-name, set-of-accounts) is many-to-many, without independent depositor, set of accounts stored with
each customer, and set of owners stored with each account t1=(John, {#101, #201}) in customer t2=(#101, Perryridge, 200, {John, Smith}) in account redundancy in these two tuples complicates modification on
DBS
VII-3-1 First Normal Form (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
68
Def.1. Relation schema R is in second normal form (2NF) if R is in 1NF, and each attribute A in R meets one of the following criteria
it appears in a candidate key; or
/* A is a primary attribute */ it is not partially dependent on a candidate key
/* A is completely dependent on a candidate key */
Def.2. 如果关系模式 R 是 1NF ,而且每一个非主属性都完全依赖于 R 的主 / 候选键,则 R 为 2NF
VII-3-2 Second Normal Form(P309 7.15)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
69
2NF 性质 不存在非主属性对候选键的部分依赖
VII-3-2 Second Normal Form (cont.)
1NF 2NF消除非主属性对候选键的部分函数依赖
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
70
2NF Example
E.g. SLC(S# , SD, SL, C# , G) (S#, C# ) G, SD SL,
S# SD, (S#, C# ) SD, S# SL, (S#, C# ) SL}
SLC is not in 2NF, because for non-primary attribute SD, there exits S# SD, so SD is
partially dependent on the key (S#, C# ) for non-primary attribute SL, there exits S# SL, so SL is
partially dependent on the key (S#, C# )
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
71
2NF Example (cont.)
Decompose SLC(S# , SD , SL, C# , G) into
SC(S# , C# , G), SSS(S# , SD , SL) FSC = { (S#, C# ) G}
FSSS= {S# SD, SD SL, S# SL } schema SC and SSS are in 2NF
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
72
Def.1. A relation schema R is in third normal form (3NF) with respect to a set F of functional dependencies if,
for all in F+, at least one of the following holds is trivial, e.g., is a superkey for R each attribute Ai in – is contained in a candidate key for
R note: if R has several candidate keys, each attribute in
– Ai may be in a different candidate key
VII-3-3 Third Normal Form (§7.3.4)
E.g. B ABC, (ABC) – (B) =AC
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
73
Def.2. 如果关系模式 R 是 2NF ,而且每一个非主属性都不传递依赖于 R 的任何候选键,则 R 是 3NF
If R is in 3NF, R is also in 2NF
3NF 性质 不存在非主属性对候选键的部分和传递依赖
2NF 3NF消除非主属性对键的传递函数依赖
VII-3-3 Third Normal Form (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
74
cust_banker-branch = (customer-id, employee-id, branch-name, type)
Functional dependencies
employee-id branch-name
/* 每个雇员只能工作在一个支行 customer-id, branch-name employee-id
/* 每个客户在一个支行中最多只有一个 personal banker the candidate keys:
{employee-id, customer-id, type}, {customer-id, branch-name, , type}
3NF — Example One
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
75
In accordance with 3NF definition 1, cust_banker-branch is in 3NF, because
for customer-id, branch-name employee-id, LHS customer-id, branch-name is in the candidate key
for employee-id branch-name, RHS branch-name is in the candidate key
In accordance with 3NF definition 2, cust_banker-branch is also in 3NF, because
customer-id, employee-id and branch-name are all the primary attributes
type is not dependent on the two candidate keys
3NF — Example One (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
76
Schema SSS (S# , SD , SL)
FSSS= {S# SD, S# SL , SDSL}
the candidate is S#In accordance with 3NF definition1, SSS is not in 3NF, because
for SD SL, RHS SL in {SL}-{SD}= {SL} is not in candidate key {S#}
According to 3NF definition2, SSS is not in 3NF, because for S# SL, RHS SL is non-primacy attribute, and is transitive dependent on the candidate key candidate key {S#}
3NF—Example Two
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
77
Boyce-Codd Normal Form, BCNF, enhanced 3NF Def.1. A relation schema R is in BCNF with respect to a set F of
functional dependencies, if
for all functional dependencies in F+, where R and R, at least one of the following holds
is trivial, e.g. is a superkey for R
/* compared with 3NF
Def .2. 设 R 是 1NF ,如果对 R 中的每个属性(包括主属性、非主属性),不存在对键的传递或部分依赖,则 R 是 BCNF 。
VII-3-4 BCNF (§7.3.2)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
78
BCNF 性质 不存在属性(主属性,非主属性)对候选键的传递和部
分依赖 If R is in BCNF, R is also in 3NF
以 FD 作为模式规范化程度度量, BCNF 达到了最高规范化程度。如果 DBS 中所有模式都属于 BCNF ,则消除了插入和删除异常
消除主属性对键的部分和传递函数依赖3NF BCNF
VII-3-4 BCNF (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
79
cust_banker-branch = (customer-id, employee-id, branch-name, type), with respect to
F = {employee-id branch-name;
customer-id, branch-name employee-id}, The candidate keys:
{employee-id, customer-id}, {customer-id, branch-name} According to Def.1, cust_banker-branch is not in BCNF, because
for employee-id branch-name, LHS employee-id is not a superkey
In line with Def.2, Customer-schema is also in BCNF, because primary attributes branch-name is partially dependent on the
candidate key {employee-id, customer-id}
Example
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
80
Customer-schema = (customer-name, customer-street, customer-city), with respect to
F = {customer-name customer-street customer-city}, According to Def.1, Customer-schema is in BCNF, because
F+ =F∪{customer-name customer-city, customer-name customer-street}
LHS {customer-name} is the superkey for Customer-schema In line with Def.2, Customer-schema is also in BCNF, because
for non-primary attributes customer-street,
customer-city and primary attribute customer-name, there are not transitively and partially dependent on key {customer-name}
BCNF- Example One
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
81
Branch-schema = (branch-name, assets, branch-city),
as for
F = {branch-name assets branch-city},
According to Def.1, Branch-schema is in BCNF, because LHS {branch-name} is the superkey for Branch-schema
In line with Def.2, Branch-schema is in BCNF, because for non-primary attributes assets, branch-city and primary
attribute branch-name, there are not transitively and partially dependent on key {branch-name}
BCNF- Example Two
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
82
Loan-info-schema = (branch-name, customer-name, loan-number, amount)
assuming one loan can be made to more than one customer , as far as loan-number amount is concerned According to Def.1., Loan-info-schema is not in BCNF, because
LHS {loan-number } is not the superkey {customer-name, loan-number}
In accordance with Def.2., Loan-info -schema is not in BCNF, because
there exists loan-number amount, and primary attribute loan-number is partially dependent on key {customer-name, loan-number}
BCNF- Example Three
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
83
The non-BCNF Loan-info-schema may result in information repetitions in relation Loan-info.
For example, if a loan titled L-44 are made to two customer John and Smith, then there are two tuples in Loan-info
(Downtown, John, L-44, 1000) (Downtown, Smith, L-44, 1000)
BCNF- Example Three (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
84
SJP(S, J, P): S-student, J-course, P-order {S, J } P /* 每个学生的每门课程有一个名次 */ {J, P } S /* 每门课程的每一排名均有一个学生 */
Candidate keys {S, J} , {J, P}
On the basis of Def.1, SJP is in BCNF, because LHSs of {S, J}P and {J, P} S are super keys
BCNF- Example 4
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
85
STJ(S, T, J ): S-student, J-course, T- teacher T J /* 每门课程只有一个教师教 */ {S, J }T /* 每个学生选定课程后,任课教师随之确定 */ {S, T} J
Candidate keys: {S, J }, {S, T } On basis of Def.1of BCNF, STJ is not in BCNF, because
for T J, LHS {T } is not a key On the basis of Def.1of 3NF, STJ is in 3NF, because
for T J, {J}-{T}={J} is in key{S, J} Non-BCNF STJ(S, T, J ) can be decomposed into two BCNF
schema ST(S, T), TJ(T, J )
BCNF- Example 5
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
86
On basis of mutivalued dependency, 4NF is defined (§7.6.2)
On the basis of join dependency, 5NF is defined
VII-3-5 Other Normal Forms
消除非平凡且非函数依赖的多值依赖BCNF 4NF
4NF 5NF消除非平凡连接依赖
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
87
VII-4-1 Decomposition A relational schema R can be decomposed into several subschema
{R1, R2, ..., Rn}, where R1, R2, ..., Rn R, to guarantee DBS integrities
all attributes of R must appear in the decomposition (R1, R2, ..., Rn):
R = R1 R2 ... Rn E.g. non-BCNF cust_banker-branch
= (customer-id, employee-id, branch-name, type)
can be decomposed into
R1= (employee-id, branch-name)
R2= (customer-id, employee-id, branch-name, type)
VII-4 Decomposition Properties
R(A1, A2, …, Ai, …, An ), R = R1 R2 R3
A1 A2 Ai An-1 An
* * * * * * *
* * * * * * *
* * * * * * *
* * * * * * *
r1=R1(r) r2=R2(r) ... r3=R3(r)
r(R)
Fig. Decomposition of R and r(R)
r (R) = r1 r2 r3 ?
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
89
A good decomposition should be lossless decomposition ( 无损分解 ) dependency preserving ( 函数依赖保持 )
VII-4-1 Schema Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
90
Def. The decomposition {R1, R2, ..., Rn} of R is lossless decomposition ( 无损分解 ) if ,
for all possible relations r(R) in legal database instances,
r = R1 (r) R2 (r) ... Rn (r) also known as lossless-join ( 无损连接分解 ) decomposition
E.g. The decomposition of cust_banker-branch in VII-4-1
/* 无损连接分解中, R 所对应的 r 被分成若干垂直片段 Ri (r) , 各垂直片段通过自然连接可恢复 r 中的数据,保证了数据的完整性 / 分解可恢复性
VII-4-2 Lossless Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
91
Def. Lossy decomposition /* 有损连接分解 */
r ≠ R1 (r) R2 (r) ... Rn (r) also known as lossy-join ( 有损连接分解 ) decomposition
Lossy decompositions may result in information loss
VII-4-2 Lossless Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
92
How to determine a lossless decomposition ?
Def./Thm. For schema R and F holds on R, a decomposition of R into R1 and R2 is lossless join, if and only if
at least one of the following dependencies is in F+
R1 R2 R1
R1 R2 R2
,i.e. R1 R2 is the superkey of R1 or R2
E.g. The decomposition of cust_banker-branch is lossless
VII-4-2 Lossless Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
93
bor_loan = (customer-id, loan-number, amount), with respect to
loan-number amount The decomposition
loan ( loan-number, amount) borrower (customer-id, loan-number)
Example 2
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
94
R(A, B, C, D), F ={A →B, B →C, A →D, B→D} on R F+ = F {A →C} lossless decomposition:
R1(A, B, C), F1 ={A →B, B→C, A→C} on R1,
R2(B, D), F2= {B→D} on R2
note: A →D is lost on R1and R2
Example 3
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
95
Def. For a schema R, F holds on R, and the decomposition {R1, R2,.., Rn} of R,
the restriction of F to Ri, denoted as Fi is defined as
Fi = { | F+ AND Ri} i.e. the set of dependencies in F+ that include only attributes
in Ri /*限制、投影 e.g. example 3 in the previous slide
VII-4-3 Dependency Preserving
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
96
Def. For a schema R, F holds on R, decomposition {R1, R2,.., Rn} is dependency preserving if :
(F1 F2 … Fn)+ = F+
/* 对 R 进行分解后,不会丧失 R 中原有的属性间的依赖关系 , 但某些显示的依赖可能转换为 F+ 中隐式的依赖
/* 将在 R 上检查全部的 F 是否成立转换为在各个子模式Ri 上检查各自的 Fi
A efficient method to test whether or not {R1, R2,.., Rn}is dependency preserving
no need to compute F+
on the basis of the attribute closure
VII-4-3 Dependency Preserving (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
97
for F= {}, apply the following procedure for each result =
while (changes to result) do for each Ri in the decomposition
t = (result Ri)+ Ri (with respect to F)
result = result t
if result contains all attributes in , then the functional dependency is preserved
The decomposition is preserved, if and only if all in F are preserved.
/* 利用只包含在各个子模式 Ri 中的中间结果result去推导
Testing of Dependency Preserving
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
98
Student(sno, dept, head), F = {sno→dept, dept→head}, F+ = F {∪ sno→ head}
Decomposition 1 R1(sno, dept), F1={sno→ dept} R2(sno, head), F2 ={sno → head}
lossless, because R1 R1={sno}, and is the key of R1 and R2
non-dependency preservation, because (F1 F∪ 2) + ≠ F+, dept → head is lost,
Example of Dependency Preserving
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
99
for dept →head in F,
(1) with respect to R1,
result=(dept R1)+ R1 = {dept}+ {sno, dept}
= {dept, head} {sno, dept}
= {dept} ;
(2) with respect to R2,
result=(dept R2)+ R2
= φ+ {sno, head}= φ
dept→head is not preserved
Example of Dependency Preserving (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
100
Decomposition 2 R1(sno, dept), F1={sno→ dept} R2(dept, head), F2 ={dept→ head}
lossless-join, because R1 R2={dept}, and is the key of R2
dependency preservation, because ( F1 F∪ 2 ) + = F+
Example of Dependency Preserving (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
101
VII-5 3NF Decomposition
A decomposition algorithm that gives lossless and dependency preserving schemas in 3NF is shown in Fig.7.13
Note: all candidate keys should be founded out a Fc should be computed at first
1. Find out a canonical cover Fc for F;
i := 0; /*R0 is null*/2. for each functional dependency in Fc do
if none of the schemas Rj, 1 j i contains then begin
i := i + 1;Ri :=
end3. if none of the schemas Rj, 1 j i contains a candidate key for R
then begini := i + 1;Ri := any candidate key for R;
end4. return (R1, R2, ..., Ri)
Fig.7.13
对 Fc 中每一个未包括在已有子模式中的函数依赖,构造
一个 BCNF/3NF子模式
对 R 的候选健单独构造一个关系模式
求最简 FD: 去除掉传递依赖和冗余属性
0. Find out all candidate keys for R
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
103
Consider schema R(X, Y, Z, W), and F = {XZ, ZX, YXZ, WXZ} that holds on R. Give the lossless, dependency-preserving decomposition of this schema into 3NF
The solution is as follows: Step1. {YW}+ =R, {YW} is the unique candidate key Step2. With respect to F = {XZ, ZX, YXZ, WXZ},
considering X in YXZ and Z in WXZ, {XZ, ZX, YZ, WX} implies F, so X in YXZ and Z in
WXZ are all extraneous attributes, and one of canonical covers for F is
Fc = {XZ, ZX, YZ, WX}
Example 3NF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
104
the other canonical covers for F are
Fc = {XZ, ZX, YX, WX}, or
Fc = {XZ, ZX, YX, WZ}, or
Fc = {XZ, ZX, YZ, WZ}
Step3. As far as Fc = {XZ, ZX, YZ, WX} is concerned, on the basis of 3NF decomposition algorithm, the subschema
responding to each functional dependency in Fc are
R1={XZ} , R2={YZ}, R3={XW} for the candidate key {YW}, R4={YW}is generated, so one of
the 3NF decomposition is { XZ , YZ , XW , YW}
Example 3NF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
105
With respect to the other canonical covers
Fc = {XZ, ZX, YX, WX}, or
Fc = {XZ, ZX, YX, WZ}, or
Fc = {XZ, ZX, YZ, WZ}
and the candidate key {YW},
the corresponding 3NF decomposition are
{ XZ , YX , WX , YW}
{ XZ , YX , WZ , YW}
{ XZ , YZ , WZ , YW}
For more examples about 3NF decomposition, refer to Appendix B
Example 3NF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
106
VII-6 BCNF Decomposition
A decomposition algorithm that gives lossless schema in BCNF is shown in Fig.7.12
F+ is computed at first to determine whether or not is the superkey of subschema Ri
if other methods can be used to determine whether or not is the superkey of subschema Ri , then F+ need not be computed
An alternative easily-understood and straight-forward description of BCNF decomposition is given below
Fig.7.13-2 BCNF Decomposition AlgorithmDecomposition Algorithm
Given: R, F holding on R
result := {R}; done := false;
while (not done) do
if (there is a non-BCNF subschema Ri in result) then begin
let be a nontrivial functionaldependency that holds on Ri , such that
(1) is not the superkey for Ri ,i.e. is not
in F+, with respect to the restriction Fi of F to
Ri !!!
(2) = ; /* 中无冗余属性 */
result :=(result – Ri) {(Ri – )} {(, )}; end
else done := true;
Ri 中分解为 2 部分,其中 组成单独的 BCNF 模式
找出 non-BCNF子模式 Ri , 进一步分
解
Ri is non-BCNF because of
remainsin Ri
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
108
Note to determine whether or not Ri is in BCNF, the restriction of
F to Ri and the candidate keys of Ri should be computed !!!
Ri is not in BCNF, because with respect to that holds on Ri, is not the super key of Ri
the algorithm replaces non-BCNF Ri with (Ri – ) and (, )
non-BCNF Ri is decomposed into (Ri – ) and BCNF (, )
the restriction of F to the schema (, ) is holds, is the superkey for (, ), so (, ) is in BCNF
VII-6 BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
109
Considering the schema R(C, T, H, R, S, G),
and F={CS G, C T, TH R, HR C, HS R}
, give a decomposition of R into BCNF
Step1. With respect to R and F, the unique candidate key is HS,
and R is not in BCNF Step2. Initially, result := {R}={CTHRSG}
R is not in BCNF, because there is CS G in F, and CS is not the candidate key of R
R(CTHRSG) is decomposed into R1(CSG) and R2(CTHRS)
R1(CSG) is in BCNF, F1= {CS G}
Example OneBCNF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
110
Step3. With respect to R2(CTHRS)
!!!the restriction of F to R2(CTHRS) is
F2={C T, TH R, HR C, HS R} !!the candidate key is HS R2 (CTHRS) is not in BCNF, because there is C T in F2, and
C is not the candidate key of R2
R2(CTHRS) is decomposed into R21(CT) and R22(CHRS)
R21(CT) is in BCNF , F21= {C T}
Example OneBCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
111
Step4. With respect to R22(CHRS) !!the restriction of F to R22(CHRS) is
F22={HR C, HS R, CH R} /* CH TH, TH R !!the candidate key is HS R22(CHRS) is not in BCNF, because there is HR C in F22, and
HR is not candidate key of R22
R22(CHRS) is decomposed into R221(HRC) and R222(HRS) R221(HRC) is in BCNF
Step5. With respect to R222(HRS), the restriction of F to R222(HRS) is
F222={HS R}, and the candidate key is HS R222(HRS) is in BCNF
Example OneBCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
112
Finally, the BCNF decomposition of R(C, T, H, R, S, G) is R1(CSG) , R21(CT) , R221(HRC) , R222(HRS)
Example OneBCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
113
Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)
the set of functional dependencies on the schema
branch-name branch-city, assets
loan-number amount, branch-name a candidate key is {customer-name, loan-number}
Lending-schema is not in BCNF in view of F={branch-name branch-city, assets}, branch-
name is not a candidate key, or Lending-schema is partly dependent on its candidate key
with respect to loan-number amount, branch-name, is not a candidate key
Example TwoBCNF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
114
Applying the algorithm to R= Lending-schema step1. selecting branch-name branch-city, assets in R
this FD is non-trivial, holds on R, branch-name R is not in F+, i.e. branch-name is not a candidate key
R is decomposed into
R1: branch -schema = (branch-name, branch-city, assets ) R2: lending-info-schema
= (branch-name, customer-name, loan-number, amount)
Example Two BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
115
R1: branch -schema is in BCNF branch-name branch-city, assets is the only dependency
holding on branch -schema branch-name is a candidate key for branch –schema
lending-info-schema is not in BCNF
Example Two BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
116
step2. Selecting loan-number amount, branch-name on R2: lending-info-schema
this FD is non-trivial, holds on R, loan-number R2 is not in F+
R2: lending-info-schema = (branch-name, customer-name, loan-number, amount) is decomposed into
R3: loan-schema = {loan-number, amount, branch-name }
R4 : borrower-schema= {loan-number, customer-name}
Example Two BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
117
The Lending-schema is decomposed into R1, R3, R4, each of which is in BCNF.
the decomposition is lossless-join. the decomposition is also dependency preservation, because
branch-name branch-city, assets holds on branch -schema , loan-number amount, branch-name holds on loan-schema
For more examples about BCNF decomposition, refer to Appendix C
Example Two BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
118
With respect to the 3NF algorithm mentioned in VII-5, it is always possible to decompose a schema R into 3NF, and
decomposition is lossless dependencies are preserved
3NF decomposition algorithm is lossless and dependency-preserving
VII-7 Properties of 3NF and BCNF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
119
With respect to the BCNF algorithm given in VII-6, it is always possible to decompose a schema R into 3NF, and decomposition is lossless but it may not be possible to preserve dependencies, that is,
(F1 F2 … Fn)+ = F+
may not be always holds
BCNF decomposition algorithm does not guarantee that the decomposition is dependency preserving
VII-7 Properties of 3NF and BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
120
Reflexivity: if , then U, for any two tuples t1 and t2 of r , if
t1[]=t2[], and , then t1[]=t2[] ,
Augmentation : if , then , U, for any two tuples t1 and t2 of r ,
if t[ ]=s[ ], then t[]=s[] , t[ ]=s[ ] ,
and , t[ ]=s[ ] ,
so t[ ]=s[ ] ,
Appendix A
Proof of Reflexivity and Augmentation
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
121
Schema Banker-info-schema =
(branch-name, customer-name, banker-name, office-number)
functional dependency set F is as followsbanker-name branch-name office-numbercustomer-name branch-name banker-name
Candidate key is {customer-name, branch-name} This schema is not in 3NF, because banker-name is not
superkey, and branch-name, office-number is not contained in the candidate key
Step0. There is no extraneous attributes in the two functional dependencies of F, so Fc = F
Step1. initially, i= 0, R0 is null
Appendix B Example Two3NF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
122
Step2. the for loop in the algorithm results in the following schemas
R1 = Banker-office-schema = (banker-name, branch-name,
office-number)R2 = Banker-schema = (customer-name, branch-name,
banker-name) Step3. Since Banker-schema contains the candidate key
{customer-name, branch-name} for Banker-info-schema, step3 in the algorithm do not proceeds
/* 对 R 的候选健不必再处理 */ The decomposition result is Banker-office-schema and Banker-
schema
Appendix B Example Two 3NF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
123
R={SNO , SD , Head , CNO , G}
F={SNOSD , SDHead, SNOHead, (SNO,CNO)G} /* 存在传递函数依赖 */
Candidate keys {SNO, CNO} R is not in 3NF
there is transitive FD SNOHead, or
SNOSD does not satisfy 3NF definition
Appendix B Example Three 3NF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
124
Decomposition
Step1. FC={SNOSD, SDHead, (SNO,CNO)G}
Step2. R1 = {SNO, SD}, F1={SNOSD}
R2 = {SD, Head}, F2={SD Head}
R3 = {SNO, CNO, G}, F3={(SNO,CNO)G}
Step3. candidate key {SNO, CNO} has been contained in R3
Appendix B Example Three 3NF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
125
E.g.1 R={SNO , SD , Head , CNO , G} F={SNOSD , SNOHead , SDHead , (SNO,CNO)G}
step1. as far as SNOSD is concerned, R is decomposed into BCNF R1={SNO, SD} , on which F1={SNOSD} holds non-BCNF R2={SNO, Head, CNO, G}, on which
F2={SNOHead, (SNO, CNO)G} holds step2. as for SNOHead, R2 is decomposed into
BCNF R3 = {SNO, Head}, on which F2={SNOHead} holds
BCNF R4 = {SNO, CNO, G}, on which F3 = {(SNO,CNO)G} holds
Appendix C Examples about BCNF Decomposition
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
126
E.g.2
R = (A, B, C), F = {A B, B C} key = {A}, R is not in BCNF with respect to B C , decomposition R1 = (A, B), R2 = (B,
C) R1 and R2 in BCNF
lossless decomposition dependency preserving
Appendix C Examples about BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
127
E.g.3
R = (J, K, L), F = {JK L , L K} two candidate keys: JK and JL R is not in BCNF decomposition
R1 = (K, L), F1 = {L K} R2 = (J, L)
any decomposition of R will fail to preserve
JK L
Appendix C Examples about BCNF Decomposition (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
128
定理。如果属性 A 只出现在 F 中函数依赖的左部 , 则 A 一定是主属性 , 必然出现在候选键中
输入:关系模式 R及其函数依赖集 F输出: R 的所有候选关键字 方法:
step1. 将 R 的所有属性分为四类: L类:仅出现在 F 中函数依赖左部的属性 R类:仅出现在 F 中函数依赖右部的属性 N类:在 F 中函数依赖左右两边均未出现的属性 LR类:在 F 中函数依赖左右两边均出现的属性
Appendix D Computing of Candidate Keys
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
129
并令X_set代表 L 、 N类, Y_set代表 LR类;
step2. 求 X_set + ,若包含了 R 的所有属性,则 X_set即为 R 的唯一候选关键字,转 step5 ;
否则转 step3 step3. 在 Y_set 中取一属性 A ,求( X _set A)+, 若它包含了 R 的
所有属性,则转 step4; 否则,调换一属性反复进行这一过程,直到试完所有 Y_set
中的属性 step4. 如果已找出所有的候选关键字,则转 step5; 否则,在 Y_set 中依此取两个、三个,…,求他们的属性闭
包,直到其闭包包含 R 的所有的属性 step5. 停止,输出结果
Appendix D Computing of Candidate Keys (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
130
E.g.1. R(X, Y, Z, W), and F = {XZ, ZX, YXZ, WXZ} L类: Y, W R 、 N: 空 LR: X, Z X_set={Y, W} Y_set={X, Z} (YW)+=R, so YW is the candidate key
Appendix D Computing of Candidate Keys (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
131
E.g.2. R(B, D, I, O, Q, S), and
F = {BQ, IS, ISQ, SD} L类: B, I R类: D, Q N 类 : O LR 类 : S X_set={B, I, O} Y_set={S} (BIO)+=R, so BIO is candidate key
Appendix D Computing of Candidate Keys (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
132
Sometimes, it is desirable to use non-normalized schema for DB query performance
E.g. To display customer-name along with account-number and balance, the expensive join of normalized 3NF/BCNF relation account(account-number, branch-name, balance) with depositor(account-number, customer-name) is needed
Alternative 1: Use denormalized relation
cus-account (account-number, branch-name, balance,
customer-name)
for faster query demerits: extra space and extra execution time for updates
Appendix E Denormalization in Database Design
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
133
Alternative 2: use a materialized view defined as
account depositor benefits and drawbacks same as above
Appendix E Denormalization in Database Design (cont.)
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
134
给定关系表 r(R) 和 , 判断 r 是否满足 判断关于函数依赖的一些公式是否成立
例如 , 如果 A B, B C, 则 A C
根据文字描述,抽象出函数依赖关系
求候选健 计算属性闭包 + 计算函数依赖集 F 的最小正则集 Fc
Appendix F 本章习题类型
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
135
给定关系模式 R 和定义在 R 上的函数依赖集 F ,判断 R 属于第几范式,为什么?
只考虑 1NF, 3NF, BCNF从文字描述中抽象出函数依赖判断一个模式分解是否为无损连接判断一个模式分解是否为函数依赖保持
给定非 3NF 的关系模式 R 和定义在 R 上的函数依赖集 F ,将 R 分解为第三范式
给定非 BCNF 的关系模式 R 和定义在 R 上的函数依赖集F ,将 R 分解为 BCNF 范式
Appendix G 本章习题类型 (续 )
April 2008 Database System Concepts - Chapter 7 Relational-Database Design -
136
Have a break