26
INFO 340 Lecture 2 Intro to Databases

INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Embed Size (px)

Citation preview

Page 1: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

INFO 340

Lecture 2

Intro to Databases

Page 2: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

• Book– Need it– Order individually through UW Bookstore– Or Amazon

Page 3: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Previous Class

• How many cards did your system require total?

• How difficult was it to answer the standard queries?– The ad-hoc queries?

Page 4: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

What if -- iTunes Store via File Based Approach

• Imagine all details of a song encapsulated in a single file (or nest of files)• Problems

– Difficulty in accessing data• We need that particular application to access the data

– Duplication of data • do we want a separate copy of the album cover for every single song ?

– Data dependence • What if we want to get at the same data, but with a different application?

– File format incompatibilities• What if the applications use different file formats? Ie one uses text, another

XML, another binary, etc– Fixed queries

• Largely dependent on app developer– Results in fixed (non- ‘ad-hoc’) queries and/or– Multiple applications

Page 5: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Limitations of Files

• Separation and isolation of data• Duplication of data• Data Dependence• Incompatible file formats• Fixed queries / proliferation of apps• Two main problems emerge:

– Data definition is embedded in application– No control over access and data manipulation.

Page 6: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Introducing Databases

• Solves many problems introduced in a file system.

• Multi-user, simultaneous connections

• Security

• Separation of application and data

• Common query language

• One copy of data

Page 7: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Database Def (pg 15)

A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization

Page 8: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Types of databases

• Relational (we’ll focus on this type in this class)

• Network and Hierarchical model

• Object Oriented

Page 9: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Database are “Self Describing”

• System catalog• Data dictionary• Meta-data

It is this local or in-place description of the data that allows “program-data independence”

(BTW good quiz item)

Page 10: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Data abstraction

• Similar to programming classes– You can change the internal coding to the

object (instantiated class) as long as the external behavior (behavior presented to the outside world) doesn’t change

• Similarly, in databases, we can separate the data from the application

Page 11: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Logically related

• Entities -- distinct object

• Attributes -- describes some aspect of the object

• Relationships -- associations between the entities

Page 12: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

DBMSDatabase Management System

• Provides facility to define the database via a Data Definition Language -- DDL

• Provides facility to store, retrieve, & manipulate data through Data Manipulation Language– SQL (Structured Query Language) is the standard– Note SQL & MySQL are 2 different things

Page 13: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

• Controlled access to database:– Authorization & possibly authentication– Control of integrity/consistency– Handle multiple simultaneous users (concurrency)– Handle recovery -- return to a known, consistent

state– User-accessible catalog

DBMS

Page 14: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Database Application Program

• Program that interacts with a database.• Database can expose Views to the application.

Views help to– Present the data to the user in a useful format– Choose who sees what

• ie not all should see payroll

– Customize data presentation• Same db can have many different looks/appearances• Generally better than having multiple databases

– Allows db to hide any evolutionary structural changes to the database ‘behind the scenes’

Page 15: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Three (or four) levels of the DBMS

Page 16: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Database Design

• Data structure first & application second– Data structure will be the basis– Data structure is also harder to change on

the fly– At the design phase, want to think it

through as much as possible– Data design decisions can make

application design easier (or harder)

Page 17: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

• Data Administrator -- “Data Monk”• Database Administrator• Logical Database Designer

– Business rules• Ie only HR or finance person can see payroll

• Physical Database Designer• Application developers• User types

– Naïve– Sophisticated

Database Design

Page 18: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

DBMS Advantages

• Control (though not elimination) of redundancy• Consistency - reduce points of

update/modification– Caused by ‘same’ data existing in multiple places

• Data sharing• Integrity - use of constraints• Security - users, roles, privileges• Standardization forced• Concurrency• Backup & recovery

Page 19: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

DBMS Disadvantages

• You pay for robustness

• Complexity - Lots of forethought

• Cost (upfront, but generally not over time)

• Higher impact of failure - centralization

• Conversion cost

Page 20: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Schema & Instance

Schema -- Description of the database

Instance -- A functioning database at any point in time

-- sometimes refers to a particular running installation

Page 21: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

DDL & DML

• DDL– Language to specify the db

• System uses sets of tables itself to support user tables/schema

– System catalog– Data dictionary

• DML– Language to update the db & retrieve data– Procedural

• How– Non-Procedural

• What• SQL

Page 22: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

DBMS Functions

• Data storage, retrieval, update• User-accessible catalog

– Data about data -- metadata– System catalog

• Transaction support -- ie ‘rollback’– Failure of db, network, etc before operation is complete

• Concurrency support• Authorization services• Data comm support• Integrity services• Utility services - imports, exports, analysis, monitoring/alerts

Page 23: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Some Behind the Scenes Processes of DBMS

• Query Processor• File Manager• DML preprocessor• DDL compiler - creates metadata tables• Authorization control• Integrity checker• Query optimizer• Transaction processor• Recovery manager

Page 24: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Entity Relationship Modeling/Entity Relationship Diagram (ERD) Introduction

Example - Used car business with multiple lots, multiple cars, multiple sales people

car

salesperson

Sales lots

color make

year

Engine sizename

address

Phone number

location

size

Lighted?

has

M

1

has

M

1

Page 25: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

In-class work

• Groups of 5• Create ERD for following 3 scenarios

– iTunes (w/customers, songs, artists)– Live inventory (w/computers, users, & ‘on-

net’ snapshots)– Hacker attack incident database

• Attacker, victim, locations, computers (addresses, OS’s, other pertinent info,etc)

• Present your results for each

Page 26: INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore –Or Amazon

Assignment

• Read Chapter 3-4 of Connolly & Begg