15
SQL introduction

SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

Embed Size (px)

Citation preview

Page 1: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

SQLintroduction

Page 2: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 2

Getting data out of databases

• Databases are ”just” containers of data

• We could – in principle – just put data in a text file instead

• The real strength of databases is the ability to efficiently retrieve a specified subset of data

Page 3: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 3

Getting data out of databases

• Suppose we have information about a set of persons:– Name– Address– Date of birth– Occupation– Income– …

• We wish to find all persons, who are older than 40 years, and have an income of more than $50,000

• Is that an easy or hard task…?

Page 4: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 4

Getting data out of databases

Name Address Date of birth Occupation Income

John --- 12-05-1980 None $20,000

Mira --- 30-01-1956 Doctor $120,000

Alex --- 09-10-1971 Programmer $240,000

Steven --- 19-06-1966 Teacher $60,000

Joanne --- 03-12-1961 Doctor $90,000

Hannah --- 22-03-1977 Nurse $45,000

Susanne --- 01-08-1960 None $20,000

Page 5: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 5

Getting data out of databases

Name Address Date of birth Occupation Income

John --- 12-05-1980 None $20,000

Mira --- 30-01-1956 Doctor $120,000

Alex --- 09-10-1971 Programmer $240,000

Steven --- 19-06-1966 Teacher $60,000

Joanne --- 03-12-1961 Doctor $90,000

Hannah --- 22-03-1977 Nurse $45,000

Susanne --- 01-08-1960 None $20,000

”Older than 40 years”

Page 6: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 6

Getting data out of databases

Name Address Date of birth Occupation Income

Mira --- 30-01-1956 Doctor $120,000

Steven --- 19-06-1966 Teacher $60,000

Joanne --- 03-12-1961 Doctor $90,000

Susanne --- 01-08-1960 None $20,000

Page 7: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 7

Getting data out of databases

Name Address Date of birth Occupation Income

Mira --- 30-01-1956 Doctor $120,000

Steven --- 19-06-1966 Teacher $60,000

Joanne --- 03-12-1961 Doctor $90,000

Susanne --- 01-08-1960 None $20,000

”Income of more than $50,000”

Page 8: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 8

Getting data out of databases

Name Address Date of birth Occupation Income

Mira --- 30-01-1956 Doctor $120,000

Steven --- 19-06-1966 Teacher $60,000

Joanne --- 03-12-1961 Doctor $90,000

Page 9: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 9

Getting data out of databases

• Pretty easy; we could almost do it just by looking at the table

• What if we had 100,000 records…?

• Maybe I could write a small program to pick out the relevant records

Page 10: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 10

Getting data out of databases

for (Records r : recordList)

{

if ((r.getAge() > 40) and

(r.getIncome() > 50000))

{

selectedRecords.add();

}

}

Page 11: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 11

Getting data out of databases

• Fine, but…

• The code is very task-specific; can only solve this specific problem

• The code might be inefficient; we examine all records to find the specified subset

• Enter SQL!

Page 12: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 12

SQL

• SQL – Structured Query Language

• A language which enables us to specify subsets of data in a database

• Subsets in terms of– Tables– Fields– Conditions on fields

Page 13: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 13

SQL

• SQL is the standard for database languages

• It is non-procedural; you specify what data to find, not how to find it

• Fairly easy to learn and understand

Page 14: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 14

SQL

• SQL enables us to– Create databases and tables– Perform ”data management” like inserting,

updating and deleting records– Perform queries on data; i.e. retrieve

specified subsets of data

• We concentrate on queries initially

Page 15: SQL introduction. RHS – SOC 2 Getting data out of databases Databases are ”just” containers of data We could – in principle – just put data in a text

RHS – SOC 15

Exercise

• The piece of Java code to retrieve records was potentially very inefficient, since all records might be examined in the process

1. Consider how the data in the recordList could be organised, in order to make it more efficient to find all records for persons older than 40

2. Will the improved structure also make the original query more efficient (persons older than 40 and income more than $50,000)?