Christoph F. Eick
Introduction Data Management
Today
1. Introduction to Databases
2. Questionnaire
3. Course Information
4. Grading and Other Things
Spring 2003 Schedule COSC 6340
Exams:
Undergraduate Material Review Exam: Th., Feb. 13 (in class)
Midterm Exam: Tu., March 25 (in class)
Final Exam: Tu., May 6, 11a
Qualifying Exam Part2: Fr., May 9, 10:30-noon
Project and Graded Homeworks:
Project1 (Feb. 15-March 15), Project2 (March 30-April 20)
Homework1 (deadline: Feb. 27; March 11), Homework2 (deadline: April 17)
Last day of lecture: Th., April 24, 2003
Spring Break: March 4+6
Elements of COSC 6340
I: Basic Database Management Concepts. Review of basic database concepts, techniques, and languages (4 weeks; Chapters 1-5, 7-11, and 18 of the textbook)
II: Implementation of Relational Operators and Query Optimization (1.5 weeks; Chapters 12+13)
III: Relational Database Design (1.5 weeks; Chapters 15+16)
IV: Introduction to KDD and Making Sense of Data (3 weeks; Chapters 1, 2, 6, and 7 of the Han/Kamber book, centering on data warehouses, OLAP, and data mining)
V: Object-oriented Databases, PL/SQL, Object-relational Database Systems, and SQL3 (1.5 weeks; other material)
VI: Internet Databases and XML (1 week; Chapter 22 of the textbook and other teaching material)
Textbooks for COSC 6340
Required Text: Raghu Ramakrishnan and Johannes Gehrke, Database Management Systems, McGraw Hill, Third Edition, 2002 (complication: the chapter numbers in the new edition are different!!)
Recommended: Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001, ISBN 1-55860-489-8 (4 chapters will be covered)
Other books with relevant material: Ramez Elmasri and Shamkant Navathe, Fundamentals of Database Systems, Third Edition, Addison Wesley, ISBN 0-8053-1755-4
Schedule for Part1 of COSC 6340
Jan. 14: Introduction to COSC 6340
Fast Review of Undergraduate Material (Jan. 16-Feb. 13)
Jan. 16: Entity-Relationship Data Model
Jan. 21: Entity-Relationship Data Model
Jan. 23: Relational Data Model
Jan. 28: Mapping E/R to Relations
Jan. 30: Files, B+-trees, and hashing (chapters 8, 9, 10)
Feb. 4: Files, B+-trees, and hashing (chapters 8, 9, 10)
Feb. 6: Relational Algebra and SQL (very brief!!)
Feb. 11: Transaction Management (chapter 18)
Feb. 13: Exam0 (Undergraduate Review Exam)
Why are integrated databases popular?
Avoidance of uncontrolled redundancy
Making knowledge accessible that would otherwise not be accessible
Standardization: uniform representation of data, facilitating import and export
Reduction of software development effort (through the availability of data management systems)
[Figure: Integrated Database; Bookkeeping Device; Car Salesman]
Popular Topics in Databases
Efficient algorithms for data collections that reside on disks (or which are distributed over multiple disk drives, multiple computers, or over the internet)
Study of data models (knowledge representation, mappings, theoretical properties)
Algorithms to run a large number of transactions on a database in parallel; finding efficient implementations for queries that access large databases; database backup and recovery, …
Database design
How to use database management systems as an application programmer / end user
How to use database management systems as a database administrator
How to implement database management systems
Data summarization, knowledge discovery, and data mining
Special purpose databases (genomic, geographical, internet, …)
Data Model
[Figure: Data Model; Schema (defines a set of database states); Current Database State; arrows labeled "is used to define"]
Schema for the Library Example using the E/R Data Model
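[Figure: E/R diagram. Entity types Person (ssn, name, phone) and Book (B#, title, author), connected by the relationship type Check_out (when), with cardinalities (0,35) for Person and (0,1) for Book. Legend: Many-to-Many, 1-to-1, 1-to-Many, Many-to-1.]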
Relational Schema for Library Example in SQL/92
CREATE TABLE Book (
  B# INTEGER,
  title CHAR(30),
  author CHAR(20),
  PRIMARY KEY (B#));
CREATE TABLE Person (
  ssn CHAR(9),
  name CHAR(30),
  phone INTEGER,
  PRIMARY KEY (ssn));
CREATE TABLE Checkout (
  book INTEGER,
  person CHAR(9),
  since DATE,
  PRIMARY KEY (book),
  FOREIGN KEY (book) REFERENCES Book,
  FOREIGN KEY (person) REFERENCES Person);
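The library schema can be tried out with a small sketch using Python's built-in sqlite3 module. It is only an illustration under stated assumptions: SQLite is a stand-in for an SQL/92 system, the column B# is renamed to bno (a hypothetical name, chosen because # is not portable in identifiers), and the helper names are made up for this example. With book as the key of Checkout, a book can be checked out at most once at a time, which matches the (0,1) cardinality.

```python
import sqlite3

def build_library(conn):
    """Create the three library tables (bno is a hypothetical rename of B#)."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Book (bno INTEGER, title CHAR(30), author CHAR(20),
                           PRIMARY KEY (bno));
        CREATE TABLE Person (ssn CHAR(9), name CHAR(30), phone INTEGER,
                             PRIMARY KEY (ssn));
        CREATE TABLE Checkout (book INTEGER, person CHAR(9), since DATE,
                               PRIMARY KEY (book),
                               FOREIGN KEY (book) REFERENCES Book,
                               FOREIGN KEY (person) REFERENCES Person);
    """)

def try_checkout(conn, book, person, since):
    """Return True if the checkout is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Checkout VALUES (?, ?, ?)", (book, person, since))
        return True
    except sqlite3.IntegrityError:
        # Raised for a duplicate key (book already out) or a dangling reference.
        return False
```

Calling try_checkout twice for the same book should fail the second time because of the primary key, and a checkout of a book number that is not in Book should fail because of the foreign key.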
Referential Integrity in SQL/92
SQL/92 supports all 4 options on deletes and updates.
Default is NO ACTION (delete/update is rejected)
CASCADE (also delete all tuples that refer to deleted tuple)
SET NULL / SET DEFAULT (sets foreign key value of referencing tuple)

CREATE TABLE Enrolled (
  sid CHAR(20),
  cid CHAR(20),
  grade CHAR(2),
  PRIMARY KEY (sid, cid),
  FOREIGN KEY (sid) REFERENCES Students
    ON DELETE CASCADE
    ON UPDATE SET DEFAULT);
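The difference between NO ACTION and CASCADE on deletes can be observed with a minimal sketch, again using Python's sqlite3 module as a stand-in for an SQL/92 system. The function name, the sample rows, and the name column of Students are assumptions made for this illustration; the slide does not show the Students table.

```python
import sqlite3

def enrolled_after_delete(on_delete):
    """Delete a referenced student and report what happens to Enrolled.

    on_delete is the referential action, e.g. "CASCADE" or "NO ACTION".
    Returns "rejected" if the delete is refused, otherwise the number of
    Enrolled rows that remain.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs
    # Hypothetical Students table; only sid matters for the example.
    conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(30))")
    conn.execute(f"""
        CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2),
            PRIMARY KEY (sid, cid),
            FOREIGN KEY (sid) REFERENCES Students ON DELETE {on_delete})""")
    conn.execute("INSERT INTO Students VALUES ('s1', 'Sample Student')")
    conn.execute("INSERT INTO Enrolled VALUES ('s1', 'COSC6340', 'A')")
    try:
        conn.execute("DELETE FROM Students WHERE sid = 's1'")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
```

With "NO ACTION" the delete is refused because an Enrolled tuple still refers to the student; with "CASCADE" the delete goes through and the referring tuple is removed as well.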
Example of an Internal Schema for the Library Example
INTERNAL Schema Library12 references Library.
Book is stored sequentially, index on B# using hashing, index on Author using hashing.
Person is stored using hashing on ssn.
Check_out is stored sequentially, index on since using B+-tree.
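The storage choices in the internal schema can be pictured with a toy in-memory sketch. It is not how a DBMS implements these structures: all class names are hypothetical, a Python dictionary stands in for a hash index, and a sorted list searched with bisect stands in for the leaf level of a B+-tree. The point is only that hashing serves equality lookups (such as a given B#), while an ordered index also serves range lookups (such as all check-outs since a given date).

```python
import bisect
from collections import defaultdict

class SequentialFile:
    """Records kept in arrival order; a record id (rid) is the position."""
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1

class HashIndex:
    """Equality lookups only: key -> list of rids."""
    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, key, rid):
        self.buckets[key].append(rid)

    def lookup(self, key):
        return list(self.buckets.get(key, []))

class OrderedIndex:
    """Sorted (key, rid) pairs; a simplified stand-in for a B+-tree."""
    def __init__(self):
        self.entries = []

    def add(self, key, rid):
        bisect.insort(self.entries, (key, rid))

    def range(self, low, high):
        """Return rids whose key lies in [low, high], in key order."""
        start = bisect.bisect_left(self.entries, (low, -1))
        stop = bisect.bisect_right(self.entries, (high, float("inf")))
        return [rid for _, rid in self.entries[start:stop]]
```

A sequential file with no index would need a full scan for either kind of lookup, which is why the internal schema adds indexes on the attributes that queries are expected to use.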
Modern Relational DBMS
[Figure: features of a modern DBMS]
Support for Web-Interfaces, XML, and Data Exchange
Efficient Implementation of Queries (Query Optimization, Join & Selection & Indexing techniques)
Transaction Concepts; capability of running many transactions in parallel; support for backup and recovery
Support for special data types: long fields, images, html-links, DNA-sequences, spatial information, …
Support for higher level user interfaces: graphical, natural language, form-based, …
Support for OLAP and Data Warehousing
Support for Data Mining operations
Support for OO; capability to store operations
Support for data-driven computing