48
1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

Embed Size (px)

Citation preview

Page 1: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

1

ICS 214B: Transaction Processing and Distributed Data Management

Course Project

Page 2: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 2

Client ClientClientClient

Distributed Travel Reservation System

ResourceManager

Flights

ResourceManager

Hotels

ResourceManager

Cars

ResourceManager

Customers

WorkflowController

TransactionManager

Page 3: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 3

Project Plan• Two steps:

– Build centralized travel reservation system Components: Resource Manager

– Add distributed functionality Components: Workflow Controller,

Transaction Manager

Page 4: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 4

Client ClientClientClient

Part 1: Simple Travel Resource Manager

ResourceManager

Flights, Hotels,Cars, Customers

interface ResourceManager

start();queryFlightPrice();reserveFlight();queryCarPrice();reserveCar();…commit();

Page 5: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 5

Resource Manager Functionality

• Data Manipulation– Query and Update flight, car, and hotel

data– Make reservations

• Transactions with ACID properties• Technical/Testing Support

Page 6: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 6

Data Definition (fixed)• Stores the following tables:

– FLIGHTS(flightNum, price, numSeats, numAvail)– HOTELS(location, price, numRooms, numAvail)– CARS(location, price, numCars, numAvail)– CUSTOMERS(custName)– RESERVATIONS(custName, resvType, resvKey)

• Simple (unrealistic) assumptions– One hotel, rental car agency per location– One airline– All seats in a flight cost the same– All rooms and cars in a location cost the same

Page 7: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 7

Data Manipulation Operations

• Flights– Add flights– Add seats to flight– Remove seats from a flight– Cancel flight– Query number of seats available on a flight– Query price of a flight– Reserve seat in flight for a given customer

• Similar operations for Cars and Hotels– Described in interface ResourceManager.java

Page 8: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 8

Important concepts review• We assume you’ve completed

ICS214(A) or equivalent• Here we present a short overview just

to refresh your memory

Page 9: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 9

Consistent DB Consistent DB’T

Important concepts review

Page 10: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 10

Big assumption:

If T starts with consistent state + T executes in isolation

T leaves consistent state

Page 11: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 11

What can go wrong?• Transaction bug• DBMS bug• Hardware failure

e.g., disk crash alters balance of account

• System crash– CPU halts, memory wiped out, disk intact

• Concurrencye.g.: T1: give 10% raise to programmers T2: change programmers systems analysts

Page 12: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 12

Key problem: Unfinished transaction

Example Constraint: A=B T1: A A 2 B B 2

Page 13: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 13

T1: Read (A,t); t t2Write (A,t);Read (B,t); t t2Write (B,t);Output (A);Output (B);

A: 8B: 8

A: 8B: 8

memory disk

1616

16

failure!

Page 14: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 14

• Need atomicity: execute all actions of a transaction or none at all

Page 15: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 15

Implementing Atomicity

• In ICS214 (Fall 2002):– Logging

• In this programming project, use– Shadow paging– Why?

Shadow paging is simpler to implement But performance is lower than logging

Page 16: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 16

Shadow Paging Overview• Each file is managed via a page

table P– Each transaction T updates the file via

a private page table– Commit T by replacing the public

page table by a private one– Example: suppose DB has two files,

“a” and “b”

Page 17: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 17

P1anew

Pt´[a] 1 2 3...

Pt´[b] 1 2 3...

P2anew

P2bnew

Master Pointer

ab

P1aold

Pt[a,1]123...

Pt[b,1]123...

P2aold

P1bold

P2bold

DISK

Master´ a b

Master

MAIN MEMORY FOR T

Page 18: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 18

P1anew

Pt´[a] 1 2 3...

Pt´[b] 1 2 3...

P2anew

P2bnew

Master Pointer

ab

P1aold

Pt[a,1]123...

Pt[b,1]123...

P2aold

P1bold

P2bold

DISK

Master´ a b

Master

Page 19: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 19

P1anew

Pt´[a] 1 2 3...

Pt´[b] 1 2 3...

P2anew

P2bnew

Master Pointer

ab

P1aold

Pt[a,1]123...

Pt[b,1]123...

P2aold

P1bold

P2bold

DISK

Master´ a b

Master

Page 20: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 20

•It is a technique pioneered in System R •changes are made to a copy of a page (block)•When a transaction commits, the copy becomes the current page and the original is discarded

SHADOW PAGING

Page 21: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 21

Suppose transaction A starts up:•the current page table (directory) is copied to the shadow page table (shadow directory) •if the transaction updates a page, the original page is not altered, rather a copy is created and that is modified•the copy is pointed to by the current page table - the shadow page table is never modified

Single transaction

Page 22: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 22

Write operationWhen transaction Ti issues a write(A) command, the write operation is executed as follows (assuming data item A resides on page PA):

1. If page PA is not already in main-memory then issue input(PA)

2. If this is the first write operation on page PA by transaction Ti then:(a) allocate a new disk page (call it tPA)

(b) copy PA into tPA

(c) modify the current page table so that the entry corresponding to PA now points to tPA

3. perform the update on the page pointed to by tPA

Page 23: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 23

CommitWhen transaction Ti commits:

1.Ensure all buffer pages in memory that have been modified are flushed to the disk.

2.output the current page table to disk3.change the current page table to become

the new page table4.free the pages of the old shadow page

table that are no longer necessary5.read the old shadow page table and free

its pages

Page 24: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 24

What is required if the system crashes while a transaction is executing?

•free up all modified pages•discard the current page table•reinstate the shadow page table as the current page table

Crash recovery

Page 25: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 25

•appears simple for single transaction environments•complexity increases for concurrent transactions•clustering diminishes quickly•not aware of any commercial implementations

Pros and cons

Page 26: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 26

• What if two transactions update different pages of a file?– If they share their main memory copy of the page table,

then committing one will commit the other’s updates too!

• Use a private copy of page table, per transaction. To commit T, within a critical section:– get a private copy of the last committed value of the page

table of each file modified by T– update their entries for pages modified by T– store the updated page tables on disk– write a new master record and master pointer, thereby

installing the update just for T (// end of critical section)

Multiple transactions

Page 27: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 27

Shadow Paging in Practice• Reference: R. Lorie, “Physical Integrity in

a Large Segmented Database”ACM Trans. on DB Sys., March 1977.

• Used in the Gemstone OO DBMS.• Not good for high-performance TP

systems– count disk updates per transaction– how to do record-level locking?

Page 28: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 28

Concurrency Control T1 T2 … Tn

DB(consistencyconstraints)

• Use the 3 rules (well-formed xacts, legal scheduler, 2PL)

• Support shared locks

Page 29: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 29

• System transparently acquires locks when transactions access data

• Holds all locks until transaction commits– Called “Strict 2-phase locking”– Strict 2PL automatically avoids cascading

rollbacks

#locks

time

Simple Locking System

Page 30: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 30

Implementing the RM• For simplicity, implement the

tables using hashtables– Each row is a hashtable entry

Create a class for every kind of row e.g., Flight

Primary key is hashkey

• For the RESERVATIONS table, you can merge with the CUSTOMERS table: include a List of reservations in each hashtable entry.

Page 31: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 31

Implementing the RM• Java has convenient Hashtable and

Vector classes• For durability, write the database

to disk– You can use Java serialization to

directly write the hashtables to disk– The class java.io.ObjectOutputStream

might be helpful

Page 32: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 32

Implementing the RM• Using serialized hashtables makes it

for easy persistence• But does not use block-based I/O

model– Low-performance

• Need to adapt shadow paging for this model

Page 33: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 33

Shadowing with hashtables

• Have two copies of file on disk, with master pointer pointing to last committed copy

• Last committed copy also cached in memory

• Each transaction has private update list: hashtable of entries it has read or written

• On commit:– Merge update list into in-memory table– Table written to disk using different filename– Master pointer updated to point to new file

Page 34: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 34

In-Mem CopyA:3B:5C:2...

Master Pointer

File0A:3B:5C:2...

DISK

MAIN MEMORY

Update List for T

B:8……...

Page 35: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 35

In-Mem CopyA:3B:8C:2...

Master Pointer

File0A:3B:5C:2...

File1A:3B:8C:2...

DISK

MAIN MEMORY

Update List for T

B:8……...

Page 36: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 36

In-Mem CopyA:3B:8C:2...

Master Pointer

File0A:3B:5C:2...

File1A:3B:8C:2...

DISK

MAIN MEMORY

Update List for T

B:8……...

Page 37: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 37

Concurrency Control• We provide a lock manager• Supports two operations

– lock(xid, itemName, READ|WRITE)– unlockAll(xid)

• Implement two-phase locking using the supplied lock manager

Page 38: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 38

Resource Manager Transactions• A client starts a transaction by

calling the start() method– Returns a transaction id (xid)– Client includes xid in all data

manipulation operation requests• Client calls commit(xid) or

abort(xid) to finish a transaction• System crash or deadlock may

forcibly abort transactions

Page 39: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 39

Client ClientClientClient

Part 2: Distributed Travel Reservation System

ResourceManager

Flights

ResourceManager

Hotels

ResourceManager

Cars

ResourceManager

Customers

WorkflowController

TransactionManager

Page 40: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 40

Part 2: Distributed Transactions• Add Workflow Control and

Transaction Manager components• Implement 2-phase commit

– To be covered later

Page 41: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 41

Code base• project directory with two packages• lockmgr: do not change

– lock & unlockAll in LockManager.java• transaction: your code here

– ResourceManager.java: do not change

– ResourceManagerImpl.java– Client.java– Makefile

Page 42: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 42

Java RMI• RMI: Remote Method Invocation

– The way Client talks to the RM

• 1. Start rmiregistry– Use your own port number

• 2. Start server (RM)– RM binds to a name at the registry

• 3. Start Client– Client queries registry using the bind name

• All taken care of in the code base

Page 43: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 43

Java synchronized constructs• Convenient critical section

implementation• Synchronized block:

– Associated with any object– Before entering, thread obtains exclusive

lock on obj.– no two threads inside synch. blocks

belonging to same obj at same time• Synchronized methods: lock on “this”

Page 44: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 44

Java synchronized statements

• Synchronized block example:AAAsynchronized (ht) {BBB}CCC

• Synchronized method example:public synchronized void foo() { SAME AS…}

public void foo() {synchronized (this) {…}}

Page 45: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 45

Java Hashtable• Table of (key,value) pairsHashtable ht = new Hashtable();…// Insertionht.put(“Flight214, flt);…// Lookupflt = (Flight) ht.get(“Flight214”);

Page 46: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 46

Java serialization• Object Serialization is process of writing the

state of an object to a byte stream– Serializable objects can be written out to disk

and restored easily.– Needed by RMI

• Hashtable implements Serializable, so:ObjectOutputStream outS = new ObjectOutputStream(new

FileOutputStream(“data/RMimage1”);outS.writeObject(ht);----------------------------------------------ObjectInputStream inS = new ObjectInputStream(new

FileInputStream(“data/RMimage1”);Hashtable ht = (Hashtable) inS.readObject();

Page 47: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 47

Project Logistics• Getting started: class home page

– http://www.ics.uci.edu/~ics214b/

• Work on ICS SUN machines• Collaboration policy

– As always, each group can have up to 3 students

• Due dates: – part 1: Feb 15, Saturday– part 2: March 14, Friday

• Submit: source code + short README• Grading: correct functionality

– We’ll use our own Client.java to test your code

Page 48: 1 ICS 214B: Transaction Processing and Distributed Data Management Course Project

ICS214B Notes 03 48

Acknowledgements

• This project is based on a similar course project developed by Anand Rajaraman at Stanford, which was based on a similar project developed by Phil Bernstein at the University of Washington