Database Systems I: Foundations of Databases
Summer term 2010
Melanie Herschel
Database Systems Group, University of Tübingen
Foundations of Databases | Summer term 2010 Melanie Herschel | University of Tübingen
Chapter 0: Overview
• Overview
• Administrativa
• A little bit of History
Credit: Michael Marcol, http://www.freedigitalphotos.net/images/view_photog.php?photogid=371
Welcome Everyone!
Grew up in Bavaria & Lorraine
2000 - 2003: Student at the University of Cooperative Education Stuttgart (Information Technology)
2003 - 2007: Research assistant at Humboldt University in Berlin & at the Hasso-Plattner-Institute Potsdam (data quality & data integration)
2007: PhD defense
2008 - 2009: Post-doc researcher at the IBM Almaden Research Center (data provenance / query understanding)
Since 06/2009: Research assistant at Tübingen University (“debugging” queries with Nautilus)
First of all, let me introduce myself...
Melanie Herschel
B315, Sand 13
Tel +49 7071 29-75481
Email melanie.herschel@uni-tuebingen.de
Web http://www-db.informatik.uni-tuebingen.de/team/herschel
Welcome Everyone!
... and now it is your turn.
Bachelor vs. Diplom?
Computer science, bioinformatics, other studies?
Prior experience with databases?
Locals or “Neigschmeckte” (Swabian for newcomers)?
Which semester?
Where to Meet Databases
What is this course about?
• Convince you that there is more to database technology than just open-file(), read()/write(), and close-file().
• Make you see how versatile the strictly tabular data model supported by relational databases can be.
• Make you best friends with SQL, the principal language spoken by relational database systems.
• We will encounter a healthy mix of good, clean theory and highly relevant CS practice.
What is this course about? Structure
1. Introduction
2. ER-Modeling
3. Relational model(ing)
4. Relational algebra
5. SQL
6. Programming
• What are databases?
‣ Motivation, history, data independence, database usage
• Modeling databases using the Entity-Relationship Model
‣ Entities, relationships, cardinalities, diagrams
• Developing relational databases
‣ Relational model, ER -> relational, normal forms
• Relational algebra
‣ Criteria for query languages, operators
• SQL
‣ SQL DDL, SQL DML, SELECT... FROM... WHERE...
• Programming for databases
‣ JDBC
What is this course about? ER-Modeling
[ER diagram: entity type Customer (attributes Firstname, Lastname, DOB) and entity type Account (attributes Number, Type, Balance), connected by the relationship "owns"; cardinality annotations 1,1 and 1,n]
• We want to model customers and their accounts (savings account, checking account, credit card, ...).
• Customers and accounts have attributes, e.g., a customer has a first name, a last name, and a date of birth.
• A customer can have one or more accounts.
• We assume that an account can only have one owner.
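The two cardinality constraints above can also be checked mechanically. The following Java sketch is only an illustration under stated assumptions: the record types, the `violations` helper, and the sample rows are hypothetical names and data introduced here, not part of the course material. It checks that every account refers to one existing customer and that every customer owns at least one account.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the two entity types and the "owns" relationship.
final class OwnsCardinalityCheck {
    record Customer(int custId, String firstname, String lastname, String dob) {}
    // ownerId holds a single value per account, which encodes "an account has one owner".
    record Account(int number, String type, double balance, int ownerId) {}

    // Returns a list of cardinality violations (empty if the data is consistent).
    static List<String> violations(List<Customer> customers, List<Account> accounts) {
        List<String> problems = new ArrayList<>();
        Set<Integer> knownCustomers = new HashSet<>();
        for (Customer c : customers) {
            knownCustomers.add(c.custId());
        }

        // Every account must reference an existing customer.
        Set<Integer> customersWithAccount = new HashSet<>();
        for (Account a : accounts) {
            if (knownCustomers.contains(a.ownerId())) {
                customersWithAccount.add(a.ownerId());
            } else {
                problems.add("Account " + a.number() + " has no valid owner");
            }
        }
        // Every customer must own at least one account.
        for (Customer c : customers) {
            if (!customersWithAccount.contains(c.custId())) {
                problems.add("Customer " + c.custId() + " owns no account");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
            new Customer(1, "John", "Doe", "1.1.1970"),
            new Customer(2, "Jane", "Smith", "2.2.1977"));
        List<Account> accounts = List.of(
            new Account(123, "Checking", 2000, 1),
            new Account(987, "Checking", 100, 2));
        System.out.println(violations(customers, accounts));
    }
}
```

Storing the owner as a single field on the account is one possible design choice; it makes "at most one owner" hold by construction, so only the remaining constraints need an explicit check.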
What is this course about? Relational Model
The ER diagram (Customer, owns, Account) is mapped to two relations:
Customer (Cust_ID, Firstname, Lastname, DOB)
Account (Number, Type, Balance, Cust_ID)
CREATE TABLE Account (
    Number  INTEGER,
    Type    CHAR(25),
    Balance DOUBLE,
    Cust_ID INTEGER,
    PRIMARY KEY (Number),
    FOREIGN KEY (Cust_ID) REFERENCES Customer
)
What is this course about? Relational Model
Customer
Cust_ID  Firstname  Lastname  DOB
1        John       Doe       1.1.1970
2        Jane       Smith     2.2.1977
4        Peter      Miller    3.3.1983
...      ...        ...       ...
Account
Number  Type      Balance  Cust_ID
123     Checking  2000     1
124     Saving    5000     1
987     Checking  100      2
975     Credit    500      2
777     Saving    6000     4
...     ...       ...      ...
What is this course about? Relational Algebra & SQL
Declarative queries
• Not “How do I generate the query result?”
• But “What data does the query result contain?”
Natural language query
Name of account owner and account number of all accounts with a balance of more than 1000 Euros.
Relational Algebra
π_{lastname, number} (σ_{balance > 1000} (σ_{c.cust_id = a.cust_id} (Customer × Account)))
SQL
SELECT c.lastname, a.number
FROM Customer c, Account a
WHERE a.balance > 1000 AND a.cust_id = c.cust_id
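To make the difference between "what" and "how" concrete, here is a minimal Java sketch of one possible way to evaluate the algebra expression step by step on the sample data from the Customer and Account tables. The class, record, and method names are assumptions made for illustration, and a real database system would typically choose a far more efficient evaluation strategy than materializing the full cross product.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative in-memory evaluation of the algebra expression, one operator at a time.
final class OwnerQuerySketch {
    record Customer(int custId, String firstname, String lastname) {}
    record Account(int number, String type, double balance, int custId) {}
    record Pair(Customer c, Account a) {}          // one tuple of Customer × Account
    record Result(String lastname, int number) {}  // one tuple of the projection

    static List<Result> run(List<Customer> customers, List<Account> accounts) {
        // Cross product: Customer × Account
        List<Pair> product = new ArrayList<>();
        for (Customer c : customers) {
            for (Account a : accounts) {
                product.add(new Pair(c, a));
            }
        }

        // Selection: c.cust_id = a.cust_id
        List<Pair> matching = product.stream()
            .filter(p -> p.c().custId() == p.a().custId())
            .toList();

        // Selection: balance > 1000
        List<Pair> highBalance = matching.stream()
            .filter(p -> p.a().balance() > 1000)
            .toList();

        // Projection: lastname, number (distinct() mimics set semantics)
        return highBalance.stream()
            .map(p -> new Result(p.c().lastname(), p.a().number()))
            .distinct()
            .toList();
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
            new Customer(1, "John", "Doe"),
            new Customer(2, "Jane", "Smith"),
            new Customer(4, "Peter", "Miller"));
        List<Account> accounts = List.of(
            new Account(123, "Checking", 2000, 1),
            new Account(124, "Saving", 5000, 1),
            new Account(987, "Checking", 100, 2),
            new Account(975, "Credit", 500, 2),
            new Account(777, "Saving", 6000, 4));
        // Prints one line per qualifying (lastname, number) pair.
        run(customers, accounts).forEach(System.out::println);
    }
}
```

On the sample data, the accounts 123, 124, and 777 have a balance above 1000, so the result pairs Doe with 123 and 124 and Miller with 777. The DOB attribute is left out of the sketch because the query does not use it.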
What is this course about? Programming (if time permits)
• Accessing the database from an external program
• For instance, a Java program can communicate with a database via JDBC.
‣ JDBC Driver
‣ Open Connection
‣ Querying the database
‣ Processing query results
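...
// Send the SQL text in myQueryString to the database
ResultSet r = statement.executeQuery(myQueryString);
// Process the result row by row
while (r.next()) {
    String lastname = r.getString(1);   // first result column, read as a string
    int accountNumber = r.getInt(2);    // second result column, read as an integer
    Address a = getAddress(lastname, accountNumber);
    initCommercial(a);
}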
Administrativa: Course schedule
Lectures
When?                     Where?
Monday, 10:15 - 11:45     Sand 6/7, small lecture hall (kleiner Hörsaal)
Tuesday, 10:15 - 11:45    Sand 6/7, large lecture hall (großer Hörsaal)
Practicals
When?                     Where?
Thursday, 14:15 - 15:45   Sand 13, A104
• First assignment available on April 15, 2010.
• First assignment will be discussed April 22, 2010.
Administrativa: Keep up to date
http://www-db.informatik.uni-tuebingen.de/teaching/ss10/db1
Please visit regularly, as the latest slides and news will be posted there.
http://twitter.com/DBatUTuebingen
Stay tuned to the latest database group news.
https://cis.informatik.uni-tuebingen.de/db1-ss-10/
Register in order to view assignments and access your scores.
Administrativa: Evaluating your Performance
End-term exam
• 90-minute examination on Monday, July 12, 10:15 - 11:45.
• No supplemental material is allowed.
• Passing earns you 6 ECTS.
Assignments and grading
• We will distribute, collect, and grade weekly assignments.
• Assignments will be available on our website or via CIS.
• You have one week to complete each assignment.
• You may - and you should - work in teams of two.
• You should hand in your assignments in paper form at Manuel Mayr’s office.
• Scoring 2/3 of the overall points in the assignments earns you an additional 2 ECTS.
• Your scores will be available via CIS only.
These slides...
Quizzes
Definition
Code snippets
Examples
Read a book, write some SQL
Any introductory book is fine; two suggestions are:
• Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill. (The lecture closely follows this book.)
• Alfons Kemper and André Eickler. Datenbanksysteme: Eine Einführung. Oldenbourg Verlag.
Install IBM DB2 V9.5 Express-C
• We will bring it with us for almost any lecture.
Download at http://www-01.ibm.com/software/data/db2/express/
Further literature
Questions and Feedback
• Questions anytime!
‣ During the lecture
‣ In the dedicated office hour: Monday, 15:00 - 16:00
‣ Email, phone
• Feedback and suggestions are highly appreciated
‣ Slides
‣ Information on the website
‣ ...
Before we get started... Commercials
Student research projects and Bachelor's / Diplom / Master's theses: http://www-db.informatik.uni-tuebingen.de/teaching/studentische-arbeiten
Selected Fun Problems of the ACM Programming Contest
Proseminar, summer term 2010 (SS 2010), www-db.informatik.uni-tuebingen.de
Preliminary meeting: 15.04.2010, 09:30, B305b
A little bit of History: Data Collection and Storage Technology
• Herman Hollerith: “punch card tabulating machine”
• Companies (precursors of IBM)
‣ Tabulating Machine Company - 1896
‣ Computing-Tabulating-Recording Company (C-T-R) - 1911
‣ International Business Machines Corporation (IBM) - 1924
Credit: IBM Archive, Computer History Museum, Mountain View, CA
Computer History Museum, Mountain View, CA
A little bit of History: Hard-Drive Technology
• Magnetic disk drive: RAMAC 350 by IBM - 1955/56
‣ “Random Access Method of Accounting and Control”
‣ Developed in San Jose, CA
‣ Technical lead: Reynold B. Johnson (1906-1998)
• Since then, HDD development has followed a trend comparable to Moore’s Law
‣ Cost per MB has halved approx. every 18 months.
‣ The areal density (bits/inch²) has doubled approx. every 18 months.
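As a rough illustration of what such a halving rate implies, the trend can be written as cost(t) = cost(0) · 0.5^(t / 18) with t in months. The Java sketch below only works through this arithmetic; the class and method names and the starting value of 1.0 are assumptions for illustration, and an exact halving every 18 months is an idealization of the approximate trend stated above.

```java
// Illustrative arithmetic only: assumes an exact halving every 18 months.
final class HalvingTrendSketch {
    // Projected cost per MB after the given number of months.
    static double projectedCostPerMb(double initialCostPerMb, double months) {
        return initialCostPerMb * Math.pow(0.5, months / 18.0);
    }

    public static void main(String[] args) {
        // Hypothetical starting cost of 1.0 (arbitrary currency unit) per MB.
        System.out.println(projectedCostPerMb(1.0, 36));   // after two halvings
        System.out.println(projectedCostPerMb(1.0, 180));  // after ten halvings
    }
}
```

Under this idealization, 15 years (ten halvings) reduce the cost per MB by a factor of 1024.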
Credit: IBM Archive, Computer History Museum, Mountain View, CA
A little bit of History: Timeline
[Timeline figure, axis 1960 to 2000, showing in order:]
• Database systems based on the hierarchical model and the network model
• Database systems based on the relational model
• Scaling database systems to the very large and the very small
• Object-oriented databases
• Specialization to new types of data
A little bit of History: Database Systems in the Sixties
Early 1960s
• Data is stored in files.
• Application-dependent organization of the data
• Integrated Data Store (IDS) by General Electric
Late 1960s
• File management systems (SAM, ISAM)
• Basic operations on the data are possible, e.g., sorting
• Information Management System (IMS) by IBM
‣ Still in use on mainframes today, with more than $1 billion in revenue.
A little bit of History: Relational Database Management Systems
1970s
• Database systems emerge
• Ted Codd (IBM): relational data model as conceptual basis for relational database systems - 1970
• System R (IBM): first prototype of a relational database management system - 1974
‣ Roughly 80,000 lines of code (PL/1, PL/S, Assembler)
‣ SEQUEL as a query language
‣ First installation in 1977
• Ingres (University of California, Berkeley) - 1975
‣ QUEL as a query language
‣ Precursor of Postgres, Sybase, ...
• Oracle Version 2 - 1979
Credit: Prof. Freytag, Ringvorlesung 2005
A little bit of History: Database Development in the 1980s and 1990s
• Systems become smaller and smaller.
‣ DBMS run on smaller hardware.
‣ DBMS become part of standard installations.
• Systems process more and more data.
‣ Gigabytes, terabytes
‣ Large and complex (multimedia) objects
‣ Persistent storage using hard drives and tertiary storage (tapes, DVDs)
‣ Distributed and parallel processing
• Object-oriented database systems
A little bit of History: Database Development in the New Millennium
• Support for new types of data
‣ XML and semi-structured data
‣ Multimedia (pictures, audio, video)
• Federated databases: integrating data from heterogeneous sources (databases, files, web sources)
• Mobile databases
‣ Managing data on handheld devices (PDA, mobile phone, ...)
• Data warehouses: from transaction processing to analytical processing
• Information retrieval
• Distributed processing
• ...