Christoph F. Eick

Introduction Data Management

Today

1. Introduction to Databases

2. Questionnaire

3. Course Information

4. Grading and Other Things

Spring 2003 Schedule COSC 6340

Exams:
  Undergraduate Material Review Exam: Th., Feb. 13 (in class)
  Midterm Exam: Tu., March 25 (in class)
  Final Exam: Tu., May 6, 11a
  Qualifying Exam Part 2: Fr., May 9, 10:30-noon

Project and Graded Homeworks:
  Project1 (Feb. 15-March 15), Project2 (March 30-April 20)
  Homework1 (deadline: Feb. 27; March 11), Homework2 (deadline: April 17)

Last day of lecture: Th., April 24, 2003
Spring Break: March 4+6

Elements of COSC 6340

I: Basic Database Management Concepts --- Review of basic database concepts, techniques, and languages (4 weeks, Chapters 1-5, 7-11, and 18 of the textbook)
II: Implementation of Relational Operators and Query Optimization (1.5 weeks, Chapters 12+13)
III: Relational Database Design (1.5 weeks, Chapters 15+16)
IV: Introduction to KDD and Making Sense of Data (3 weeks, Chapters 1, 2, 6, and 7 of the Han/Kamber book centering on data warehouses, OLAP, and data mining)
V: Object-oriented Databases, PL/SQL, Object-relational Database Systems, and SQL3 (1.5 weeks; other material)
VI: Internet Databases and XML (1 week, Chapter 22 of the textbook and other teaching material)

Textbooks for COSC 6340

Required Text: Raghu Ramakrishnan and Johannes Gehrke, Database Management Systems, McGraw Hill, Third Edition, 2002 (complication: the chapter numbers in the new edition are different!!)

Recommended: Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001, ISBN 1-55860-489-8 (4 chapters will be covered)

Other books with relevant material: Ramez Elmasri and Shamkant Navathe, Fundamentals of Database Systems, Third Edition, Addison Wesley, ISBN 0-8053-1755-4

Schedule for Part1 of COSC 6340

Jan. 14: Introduction to COSC 6340

Fast Review of Undergraduate Material (Jan. 16-Feb. 13)
  Jan. 16: Entity-Relationship Data Model
  Jan. 21: Entity-Relationship Data Model
  Jan. 23: Relational Data Model
  Jan. 28: Mapping E/R to Relations
  Jan. 30: Files, B+-trees, and hashing (chapters 8, 9, 10)
  Feb. 4: Files, B+-trees, and hashing (chapters 8, 9, 10)
  Feb. 6: Relational Algebra and SQL (very brief!!)
  Feb. 11: Transaction Management (chapter 18)
  Feb. 13: Exam0 (Undergraduate Review Exam)

Why are integrated databases popular?

Avoidance of uncontrolled redundancy
Making knowledge accessible that would otherwise not be accessible
Standardization --- uniform representation of data facilitating import and export
Reduction of software development (through the availability of data management systems)

[Figure: diagram with the labels Integrated Database, Bookkeeping Device, and Car Salesman]

Popular Topics in Databases

Efficient algorithms for data collections that reside on disks (or which are distributed over multiple disk drives, multiple computers, or over the internet)
Study of data models (knowledge representation, mappings, theoretical properties)
Algorithms to run a large number of transactions on a database in parallel; finding efficient implementations for queries that access large databases; database backup and recovery,…
Database design
How to use database management systems as an application programmer / end user
How to use database management systems as a database administrator
How to implement database management systems
Data summarization, knowledge discovery, and data mining
Special purpose databases (genomic, geographical, internet,…)

Data Model

[Figure: Data Model, Schema (defines a set of database states), and Current Database State, connected by the label "is used to define"]
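The idea that a schema defines a set of database states can be tried out with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as the DBMS and a one-table Person schema; the helper name is_legal_state and the sample rows are invented for illustration and are not part of the course material.

```python
import sqlite3

# Assumed one-table schema, used only for this illustration.
SCHEMA = """
CREATE TABLE Person (
    ssn   CHAR(9),
    name  CHAR(30),
    phone INTEGER,
    PRIMARY KEY (ssn)
);
"""


def is_legal_state(person_rows):
    """Return True if the given rows form a state that SCHEMA admits."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        # Loading the rows succeeds only if no constraint is violated.
        conn.executemany("INSERT INTO Person VALUES (?, ?, ?)", person_rows)
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    one_person = [("123456789", "Ann", 5550001)]
    same_ssn_twice = one_person + [("123456789", "Bob", 5550002)]
    print(is_legal_state([]))              # the empty database
    print(is_legal_state(one_person))      # one Person tuple
    print(is_legal_state(same_ssn_twice))  # violates the primary key
```

The empty database and the one-person database are both states the schema admits; the collection with a repeated ssn is not, so it can never become the current database state.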

Schema for the Library Example using the E/R Data Model

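[Figure: E/R diagram. Entity types Person and Book are connected by the relationship type Check_out, annotated with the cardinality constraints (0,35) and (0,1). Attributes shown: B#, title, author, ssn, name, phone, when. Relationship kinds listed: Many-to-Many, 1-to-1, 1-to-Many, Many-to-1.]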

Relational Schema for Library Example in SQL/92

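-- B# contains '#', which is not allowed in a regular SQL/92 identifier,
-- so it is written as the delimited identifier "B#".
CREATE TABLE Book (
    "B#"   INTEGER,
    title  CHAR(30),
    author CHAR(20),
    PRIMARY KEY ("B#"));

CREATE TABLE Person (
    ssn   CHAR(9),
    name  CHAR(30),
    phone INTEGER,
    PRIMARY KEY (ssn));

-- The key of Checkout is its own column book (not B#):
-- a book can be checked out by at most one person at a time.
CREATE TABLE Checkout (
    book   INTEGER,
    person CHAR(9),
    since  DATE,
    PRIMARY KEY (book),
    FOREIGN KEY (book) REFERENCES Book,
    FOREIGN KEY (person) REFERENCES Person);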
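The library schema can be exercised with a short script. This is a sketch under stated assumptions, not the course's reference setup: it uses SQLite through Python's built-in sqlite3 module rather than a strict SQL/92 system, writes B# as the quoted identifier "B#", takes book as the primary key of Checkout, and invents the sample rows and the helper names open_library and check_out.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Book (
    "B#" INTEGER, title CHAR(30), author CHAR(20),
    PRIMARY KEY ("B#"));
CREATE TABLE Person (
    ssn CHAR(9), name CHAR(30), phone INTEGER,
    PRIMARY KEY (ssn));
CREATE TABLE Checkout (
    book INTEGER, person CHAR(9), since DATE,
    PRIMARY KEY (book),
    FOREIGN KEY (book) REFERENCES Book,
    FOREIGN KEY (person) REFERENCES Person);
"""


def open_library():
    """Create an in-memory library database with a few invented rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO Book VALUES (1, 'Database Management Systems', 'Ramakrishnan')"
    )
    conn.executemany(
        "INSERT INTO Person VALUES (?, ?, ?)",
        [("111111111", "Ann", 5550001), ("222222222", "Bob", 5550002)],
    )
    return conn


def check_out(conn, book, person, since):
    """Try to record a checkout; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Checkout VALUES (?, ?, ?)", (book, person, since))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = open_library()
    print(check_out(conn, 1, "111111111", "2003-01-14"))   # first checkout of book 1
    print(check_out(conn, 1, "222222222", "2003-01-15"))   # book 1 is already out
    print(check_out(conn, 99, "111111111", "2003-01-15"))  # no such book
```

With book as the key of Checkout, the second call is rejected because a book can appear in at most one Checkout tuple, and the third is rejected by the foreign key on book. An upper bound such as 35 books per person is not expressed by these declarations and would need an additional check.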

Referential Integrity in SQL/92

SQL/92 supports all 4 options on deletes and updates.
  Default is NO ACTION (delete/update is rejected)
  CASCADE (also delete all tuples that refer to deleted tuple)
  SET NULL / SET DEFAULT (sets foreign key value of referencing tuple)

CREATE TABLE Enrolled (
    sid   CHAR(20),
    cid   CHAR(20),
    grade CHAR(2),
    PRIMARY KEY (sid, cid),
    FOREIGN KEY (sid) REFERENCES Students
        ON DELETE CASCADE
        ON UPDATE SET DEFAULT)
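The difference between CASCADE and the default NO ACTION can be observed directly. The following is a minimal sketch, assuming SQLite via Python's sqlite3 module; the Students columns, the Waitlisted table, the sample rows, and the helper names are hypothetical and exist only to contrast the two behaviours. The ON UPDATE clause is left out to keep the example small.

```python
import sqlite3


def build_db():
    """Create Students plus one CASCADE child and one NO ACTION child."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(30));

        CREATE TABLE Enrolled (
            sid CHAR(20), cid CHAR(20), grade CHAR(2),
            PRIMARY KEY (sid, cid),
            FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE);

        -- Hypothetical table with no ON DELETE clause: default NO ACTION.
        CREATE TABLE Waitlisted (
            sid CHAR(20), cid CHAR(20),
            PRIMARY KEY (sid, cid),
            FOREIGN KEY (sid) REFERENCES Students);

        INSERT INTO Students VALUES ('s1', 'Ann'), ('s2', 'Bob');
        INSERT INTO Enrolled VALUES ('s1', 'COSC6340', 'A');
        INSERT INTO Waitlisted VALUES ('s2', 'COSC6340');
    """)
    return conn


def delete_student(conn, sid):
    """Delete a student; return False if the DBMS rejects the delete."""
    try:
        conn.execute("DELETE FROM Students WHERE sid = ?", (sid,))
        return True
    except sqlite3.IntegrityError:
        return False


def count(conn, table):
    """Number of rows in one of the three tables created above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = build_db()
    print(delete_student(conn, "s1"), count(conn, "Enrolled"))  # cascade
    print(delete_student(conn, "s2"), count(conn, "Students"))  # rejected
```

Deleting s1 also removes the Enrolled tuple that refers to it, while deleting s2 is rejected because a Waitlisted tuple still refers to it.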

Example of an Internal Schema for the Library Example

INTERNAL Schema Library12 references Library.
  Book is stored sequentially, index on B# using hashing, index on Author using hashing.
  Person is stored using hashing on ssn.
  Check_out is stored sequentially, index on since using B+-tree.
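One plausible reading of these choices is that hashing serves equality lookups (a given B#, author, or ssn), while a B+-tree keeps keys in order and so also serves range queries on since. The sketch below illustrates that distinction in plain Python; HashIndex and SortedIndex are invented names, a sorted list with binary search merely stands in for a B+-tree, and the records are made up.

```python
import bisect
from collections import defaultdict


class HashIndex:
    """Equality-only index: maps a key value to matching record positions."""

    def __init__(self, records, key):
        self._buckets = defaultdict(list)
        for pos, rec in enumerate(records):
            self._buckets[rec[key]].append(pos)

    def lookup(self, value):
        return list(self._buckets.get(value, []))


class SortedIndex:
    """Ordered index standing in for a B+-tree: supports range scans."""

    def __init__(self, records, key):
        self._entries = sorted((rec[key], pos) for pos, rec in enumerate(records))
        self._keys = [k for k, _ in self._entries]

    def range(self, low, high):
        """Positions of records with low <= key <= high, in key order."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_right(self._keys, high)
        return [pos for _, pos in self._entries[lo:hi]]


# Invented sample records, stored sequentially (in list order).
books = [
    {"B#": 1, "title": "Book A", "author": "Date"},
    {"B#": 2, "title": "Book B", "author": "Ullman"},
    {"B#": 3, "title": "Book C", "author": "Date"},
]
check_outs = [
    {"book": 3, "person": "111111111", "since": "2003-02-01"},
    {"book": 1, "person": "222222222", "since": "2003-01-20"},
    {"book": 2, "person": "111111111", "since": "2003-03-05"},
]

by_author = HashIndex(books, "author")
by_since = SortedIndex(check_outs, "since")

if __name__ == "__main__":
    print(by_author.lookup("Date"))                  # equality lookup
    print(by_since.range("2003-01-01", "2003-02-28"))  # range scan
```

The dates are ISO-formatted strings, so their string order matches their chronological order; the hash index has no comparable way to answer the range query without scanning every bucket.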

Modern Relational DBMS

[Figure: a central "Modern DBMS" box surrounded by the following capabilities]

Support for Web-Interfaces, XML, and Data Exchange
Efficient Implementation of Queries (Query Optimization, Join & Selection & Indexing techniques)
Transaction Concepts; capability of running many transactions in parallel; support for backup and recovery
Support for special Data-types: long fields, images, html-links, DNA-sequences, spatial information,…
Support for higher level user interfaces: graphical, natural language, form-based,…
Support for OLAP and Data Warehousing
Support for Data Mining operations
Support for OO; capability to store operations
Support for data-driven computing