28
INFO SYSTEMS SPRING 2013

Info systems Spring 2013

  • Upload
    mora

  • View
    45

  • Download
    0

Embed Size (px)

DESCRIPTION

Info systems Spring 2013. Focus of class. Cutting edge topics in information systems Conceptual material, as opposed to coding If you don’t have a good knowledge of relational databases and SQL, now is a good time to study it… A good MySQL book: murach mysql. Class blog & reaching me. - PowerPoint PPT Presentation

Citation preview

Page 1: Info systems Spring 2013

INFO SYSTEMS SPRING 2013

Page 2: Info systems Spring 2013

Focus of class• Cutting edge topics in information systems• Conceptual material, as opposed to coding• If you don’t have a good knowledge of relational

databases and SQL, now is a good time to study it…• A good MySQL book: murach mysql

Page 3: Info systems Spring 2013

Class blog & reaching me• http://rogerking.me

• All assignments and announcements will be on the blog.• I will not send out class email.• Only read blog entries starting on December 30.

• To contact me• [email protected]• Do not use my CU mail, please. It is inundated with Spam.• 303 437 7419.• To talk to me, see me after class, or send email for an appointment.• rogerbuzzking on Skype, but I do not stay logged into Skype; you

must call or email first.

Page 5: Info systems Spring 2013

Assignments and exams• Two exams – material will be from

• 2 required books• Slides posted on blog, along with corresponding lectures• Handouts posted on blog

• Exam and projects• Midterm – 35 % of final grade

• On March 11• Second exam – 35 % of final grade

• This will be on the last schedule day of class, NOT at the final slot• i.e., it will be on May 1

• Small projects/assignments – 30% of final grade total

Page 6: Info systems Spring 2013

The First Assignment• Build a simple application using Mongo• Groups of 1 or 2• Due date to be posted on blog• The application will hold very large video and audio objects• More later…

Page 7: Info systems Spring 2013

Suggestions…• Please come to class.• Feel free to suggest references and/or topics.• Speak up at any time – it would be nice to have an

interactive class.

Page 8: Info systems Spring 2013

Likely and possible topics• Post relational database systems• Full text processing• “Big” data• Databases and XML• Media and databases (images, video, audio, animation)• Personal information systems (not just databases)• Integrating diverse data sets• ???

Page 9: Info systems Spring 2013

Review of relational databases• Schema based technology

• Small schema• Lots of data

• SQL, set-based data manipulation• Multi-language application development• Strict 2 phase ACID transactions• Rigidly structured, identically structured, simply structured

data• User defined types extensions for “objects in the small”

Page 10: Info systems Spring 2013

Relational basics• Tables (relations)• Columns (attributes)• Domains• Rows (tuples)• Primary and candidate keys• Foreign keys• Null values• Triggers• SQL as DDL and DML

Page 11: Info systems Spring 2013

Relational applications• A lot of data, but in amounts no longer considered huge• Transactions manipulating sets of rows• Complete accuracy is required• Not real time or interactive• If distributed, only a modest number of centrally controlled

servers• If distributed, there is a single, unified schema• Complex objects built by connecting tables• Logic of data is mostly in schema and SQL, not app• Mix of reads and writes, without writes being minimal• SQL programs (run as transactions) compiled once, and

optimized, run many times

Page 12: Info systems Spring 2013

Important facts about SQL DBs• Elegant, set theoretic underpinnings• Originally defined with an algebra:

• Projection, join, union/difference/intersection, selection• Closed under this algebra

• SQL is a tuple calculus based on the algebra• Declarative• Far more compact than algebra

• Equally elegant algebra for optimization• Query results are legal view tables• Serious impedance mismatch between o-o languages and

SQL

Page 13: Info systems Spring 2013

Syntax of SQLSELECT select_listFROM table_sourceWHERE search_condition

Page 14: Info systems Spring 2013

ExampleSELECT invoice_total, payment_total, credit_total, invoice_total - payment_total - credit_total AS balance_dueFROM invoicesWHERE NOT (invoice_total >= 5000 OR NOT invoice_date <= '2011-08-01')

Page 15: Info systems Spring 2013

ExampleSELECT 'Paid' AS source, invoice_number, invoice_date, invoice_totalFROM paid_invoicesWHERE invoice_date >= '2011-06-01'ORDER BY invoice_total DESC

Page 16: Info systems Spring 2013

Manipulating multiple tables• You can use the Where clause, T.color = R.color• You can use an explicit join

• Equijoin, outerjoin

SELECT vendor_name, invoice_number, invoice_totalFROM vendors LEFT JOIN invoices ON vendors.vendor_id = invoices.vendor_idORDER BY vendor_name

Page 17: Info systems Spring 2013

AggregatesAVG([ALL|DISTINCT] expression)SUM([ALL|DISTINCT] expression)MIN([ALL|DISTINCT] expression)MAX([ALL|DISTINCT] expression)COUNT([ALL|DISTINCT] expression)COUNT(*) – counts nulls

Page 18: Info systems Spring 2013

ExampleSELECT COUNT(*) AS number_of_invoices, SUM(invoice_total – payment_total – credit_total) AS total_dueFROM invoicesWHERE invoice_total – payment_total – credit_total > 0

Page 19: Info systems Spring 2013

Strict 2 phase transactions

• Each SQL program is within a begin and end transaction pair• Each transaction has its own workspace for DB items it is going

to update• Any transactions that overlap in execution time will appear to

have run in some serial order• This is done by transactions requesting read and write locks

(also known as shared and exclusive locks)• Read locks can be shared with other readers• Write locks cannot be shared with readers or writers• All locks held until the end of the transaction• They are released and then the changes that a transaction has

made are moved to the DB

Page 20: Info systems Spring 2013

Serializability & transactions• The net effect is that the transactions that overlap in

execution time appear to have run in some serial order• Transactions can be undone by throwing away the local

store (conceptually, at least)• The write period at the end of the transaction must be

atomic• The two phases:

• Request read, write, upgrade locks (and wait on locks) and process

• Release locks and move updates to the DB• There is a notion of “serializability” which means that the

actual schedule of executed steps corresponds to some serial order of running the transactions

Page 21: Info systems Spring 2013

Major concept: DB “state”• Changes at the end of each transaction (potentially)• Relies on 2 Phase transactions• We tradeoff throughput in favor of having safe state

changes

Page 22: Info systems Spring 2013

Normalization• Start with an entity model• Map to tables• Create PKs and FKs• Create other constraints• Normalize tables• Goals

• Minimize redundant data• Minimize “update anomalies”• Minimize misinterpreted db by documenting more constraints

• Problems• Skinny tables and more joins• Increases impedance mismatch

Page 23: Info systems Spring 2013

Dependencies• FD

• We say that ai FD-> aj• Or “ai functionally determines aj”

• MVD->• We say that ai MVD-> aj• Or “ai multivalued determines aj”

• Note: the right side of an FD or an MVD can be a set of attributes

Page 24: Info systems Spring 2013

Third normal form• First (1NF) The value stored at the intersection of each

row and column must be a scalar value, and a table must not contain any repeating columns.

• Second (2NF) Every non-key column must depend on the entire primary key.

• Third (3NF) Every non-key column must depend only on the primary key.

Page 25: Info systems Spring 2013

Fixed 3NF and 4NF• Boyce-Codd (BCNF) A non-key column can’t be

dependent on another non-key column. • Fourth (4NF) A table must not have more than one

multivalued dependency, where the primary key has a one-to-many relationship to non-key columns.

Page 26: Info systems Spring 2013

ExampleCustomer ID Address ZIP18 112 First 80304 17 123 Ash 80303 16 123 Ash 80303

Break into two tables:

Customer ID AddressAddress Zip

Page 27: Info systems Spring 2013

ExampleMothers_Phone Fathers_Phone Child_Name

Break into: Mothers_Phone Child_Name 3030000000 Sue 3031111111 SueAnd Fathers_Phone Child_Name 3032222222 Sue

3033333333 Sue

Note: both fields needed for PK

Page 28: Info systems Spring 2013

Stored programs• Stored procedures (can be called by an application)• Stored functions (can be called by an SQL program)• Triggers (tied to an operation like INSERT)• Events (tied to a clock)