
ECE 462 Object-Oriented Programming

using C++ and Java

Design Parallel Programs

Yung-Hsiang Lu
yunglu@purdue.edu

YHL Design Parallel Programs 1

Compare Java and C++ (Qt) Threads

Java                                    C++ (Qt)
extends Thread or implements Runnable   : public QThread
public void run()                       public: void run()
thread1.start();                        thread1.start();
synchronized                            mutex.lock() … mutex.unlock()
try { wait(); } catch …                 cond.wait(&mutex);

Design Principles

• separate data and threads
  – data: such as a bank account, shared among threads; needs to be protected by synchronized (Java) or a mutex (Qt)
  – threads: such as a depositor; extends Thread (Java) or : public QThread (C++); modify data
• choose the granularity of data
  – too coarse: most threads are serialized
  – too fine: mutex overhead is dominant
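The separation of data and threads can be sketched in Java as follows. This is a minimal illustration, not code from the slides: the class names BankAccount, Depositor, and SeparateDataAndThreads and the method runDepositors are hypothetical. The shared data protects itself with synchronized, and the thread class only modifies the data.

```java
// Shared data: one account used by many threads, protected by synchronized.
class BankAccount {
    private int balance = 0;

    synchronized void deposit(int amount) {
        balance += amount;
    }

    synchronized int getBalance() {
        return balance;
    }
}

// Thread: modifies the shared data but does not own it.
class Depositor extends Thread {
    private final BankAccount account;
    private final int times;

    Depositor(BankAccount account, int times) {
        this.account = account;
        this.times = times;
    }

    @Override
    public void run() {
        for (int i = 0; i < times; i++) {
            account.deposit(1);
        }
    }
}

class SeparateDataAndThreads {
    // Runs several depositors on one account and returns the final balance.
    static int runDepositors(int numThreads, int depositsPerThread) {
        BankAccount account = new BankAccount();
        Depositor[] depositors = new Depositor[numThreads];
        for (int i = 0; i < numThreads; i++) {
            depositors[i] = new Depositor(account, depositsPerThread);
            depositors[i].start();
        }
        for (Depositor d : depositors) {
            try {
                d.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println("final balance = " + runDepositors(4, 10000));
    }
}
```

Here the granularity is a single account. Locking one object that holds every account would be coarser, and locking each field separately would be finer.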


Serialization

• If all threads need to access one shared object often, the program essentially becomes a serial program.
• The program may actually run more slowly due to mutex and thread overhead.
• solution: reduce the "coupling" among threads.

[Figure: threads waiting for a single mutex]


Improve Thread Efficiency

reduce the "coupling" among threads
• do not share objects
• do not execute concurrently when objects are shared
• shrink the granularity of shared objects, e.g. all bank accounts ⇒ individual customer's account
• shrink critical sections ⇒ only the operations that are related to the shared objects are in the mutex
• use private (not shared) data for short-term storage
• do not guarantee correctness, remedy later
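Two of the points above, shrinking the critical section and using private data for short-term storage, can be illustrated with a small Java sketch. The class LocalSum and its method sum are hypothetical names chosen for this example. Each worker accumulates into a private local variable and enters the synchronized block only once, to add its partial result to the shared total.

```java
class LocalSum {
    // Sums data with numThreads workers; each worker locks only once.
    static long sum(int[] data, int numThreads) {
        final long[] total = {0};          // shared result
        final Object lock = new Object();  // protects total[0]
        Thread[] workers = new Thread[numThreads];
        int chunk = (data.length + numThreads - 1) / numThreads;

        for (int t = 0; t < numThreads; t++) {
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                long local = 0;            // private, short-term storage
                for (int i = from; i < to; i++) {
                    local += data[i];      // no lock needed here
                }
                synchronized (lock) {      // short critical section
                    total[0] += local;
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return total[0];
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println("sum = " + sum(data, 4));
    }
}
```

Locking inside the inner loop would give the same result but would serialize the workers on every element.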


Producers and Consumers with a Finite Buffer

[Figure: producers and consumers sharing a circular buffer (FIFO)]

A producer can put an element in the buffer if it is not full. A consumer can get an element from the buffer if it is not empty.


Circular Buffer

[Figure: circular buffer with head and tail indexes]

• head = tail ⇒ buffer empty
• tail > head and (tail – head) == size – 1 ⇒ buffer full
• head > tail and (head – tail) == 1 ⇒ buffer full
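The empty and full conditions above can be sketched as a bounded buffer in Java. This is a hedged illustration rather than the code from the slides: the class CircularBuffer and its methods are hypothetical, and it assumes that head is the index of the next element to get and tail is the index of the next free slot. The two "full" cases collapse into the single test (tail + 1) % size == head, so a buffer of size n holds at most n - 1 elements.

```java
class CircularBuffer {
    private final int[] data;
    private int head = 0;  // index of the next element to get (assumed)
    private int tail = 0;  // index of the next free slot (assumed)

    CircularBuffer(int size) {
        data = new int[size];
    }

    synchronized boolean isEmpty() {
        return head == tail;
    }

    synchronized boolean isFull() {
        // Covers both cases: tail - head == size - 1 and head - tail == 1.
        return (tail + 1) % data.length == head;
    }

    // Producer side: blocks while the buffer is full.
    synchronized void put(int value) throws InterruptedException {
        while (isFull()) {
            wait();
        }
        data[tail] = value;
        tail = (tail + 1) % data.length;
        notifyAll();
    }

    // Consumer side: blocks while the buffer is empty.
    synchronized int get() throws InterruptedException {
        while (isEmpty()) {
            wait();
        }
        int value = data[head];
        head = (head + 1) % data.length;
        notifyAll();
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        CircularBuffer buffer = new CircularBuffer(8);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 100; i++) {
                    buffer.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 100; i++) {
            sum += buffer.get();
        }
        producer.join();
        System.out.println("sum of consumed elements = " + sum);
    }
}
```

The wait() calls sit inside while loops so that a woken thread rechecks the condition before it touches the buffer.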


Check Correctness

• wrong result ⇒ the program has bugs
• correct result ⇒ maybe you are lucky
• know the correct results, even though the interleaving may be different
• head ≠ tail after adding an element
• randomize interleaving if possible
• assign invalid values to data that should not appear
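A small Java sketch of two of these checks, knowing the correct result in advance and perturbing the interleaving, is given below. The class InterleavingCheck and the method run are hypothetical. Thread.yield() is only a hint to the scheduler, so this perturbs the schedule without controlling it. The expected result is numThreads × perThread regardless of the interleaving.

```java
import java.util.Random;

class InterleavingCheck {
    // Increments a shared counter from several threads with random yields.
    static int run(int numThreads, int perThread, long seed) {
        final int[] counter = {0};         // shared data
        final Object lock = new Object();
        Thread[] threads = new Thread[numThreads];

        for (int t = 0; t < numThreads; t++) {
            final Random random = new Random(seed + t);  // one per thread
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    if (random.nextInt(4) == 0) {
                        Thread.yield();    // perturb the schedule (hint only)
                    }
                    synchronized (lock) {
                        counter[0]++;
                    }
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        int expected = 4 * 5000;           // known correct result
        for (long seed = 0; seed < 5; seed++) {
            int actual = run(4, 5000, seed);
            System.out.println("seed " + seed + ": "
                    + (actual == expected ? "ok" : "WRONG " + actual));
        }
    }
}
```

A passing run does not prove correctness. Removing the synchronized block may still pass on some runs, which is the "maybe you are lucky" case.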


Deadlock and Livelock

• deadlock: none of the participating threads can execute any more code
• livelock: some (or all) of the participating threads can execute some more code, but no progress is made; for example, dining philosophers


Deadlock

four necessary conditions for deadlock
– mutual exclusion: a car's current location cannot be occupied by another car
– hold and wait: a car must "hold" the current location while waiting
– no preemption: a car cannot be removed by a scheduler
– circular wait: a car can move only if another car moves, but that car is waiting for this car


Programs without Deadlock


Java Example of Deadlock


John has a checking account and a savings account with a bank. He transferred $500 from the checking account to the savings account for a higher interest rate. Later in the day, he wrote a check and transferred $200 from the savings account back to the checking account. In the evening, he went to an ATM to withdraw cash. The ATM said, "Account not Found." What happened?


class Account {
    private int balance;

    // Holds this account's key, then needs the source account's key.
    synchronized void transfer(Account source, int amount) {
        if (source.canWithdraw(amount)) {
            source.withdraw(amount);
            balance += amount;
        }
    }

    synchronized boolean canWithdraw(int amount) {
        if (balance >= amount) { return true; }
        return false;
    }

    // Not shown on the slide; assumed here so that transfer() compiles.
    synchronized void withdraw(int amount) {
        balance -= amount;
    }
}


thread 1                 thread 2
sav.transfer(chk)        chk.transfer(sav)
acquire sav key          acquire chk key
call chk.canWithdraw     call sav.canWithdraw
acquire chk key          acquire sav key

(time runs downward)

Each thread holds one key while waiting for the key held by the other thread, so neither can continue.
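One common way to break the circular wait in this example is to acquire the two keys in a fixed global order. The sketch below uses that technique. It is not taken from the slides, and the class OrderedAccount, its id field, and the method runDemo are hypothetical. Both threads lock the account with the smaller id first, so the situation in the timeline above cannot occur.

```java
import java.util.concurrent.atomic.AtomicInteger;

class OrderedAccount {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();
    private final int id = NEXT_ID.getAndIncrement();  // defines lock order
    private int balance;

    OrderedAccount(int balance) {
        this.balance = balance;
    }

    synchronized int getBalance() {
        return balance;
    }

    // Moves amount from source into this account if source has enough.
    void transfer(OrderedAccount source, int amount) {
        // Always lock the account with the smaller id first.
        OrderedAccount first = (this.id < source.id) ? this : source;
        OrderedAccount second = (first == this) ? source : this;
        synchronized (first) {
            synchronized (second) {
                if (source.balance >= amount) {
                    source.balance -= amount;
                    balance += amount;
                }
            }
        }
    }

    // Two threads transfer in opposite directions; returns the total money.
    static int runDemo(int rounds) {
        OrderedAccount sav = new OrderedAccount(1000);
        OrderedAccount chk = new OrderedAccount(1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                sav.transfer(chk, 1);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                chk.transfer(sav, 1);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sav.getBalance() + chk.getBalance();
    }

    public static void main(String[] args) {
        System.out.println("total = " + runDemo(100000));
    }
}
```

The total stays constant because each transfer either moves the amount or does nothing, and both threads finish because no thread ever holds the larger-id key while waiting for the smaller-id key.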


Thread Performance

Performance Measurement

ts1 = current time;
execute the program without creating threads;
ts2 = current time;

tp1 = current time;
execute the program with multiple threads;
tp2 = current time;

\[ \text{improvement} = \frac{ts2 - ts1}{tp2 - tp1} \]
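The measurement above might look like this in Java, using System.nanoTime() for "current time". The class SpeedupTimer, the workload busyWork, and the two-thread split are hypothetical choices made for illustration. The measured ratio depends on the machine and on JIT warm-up, so treat it as a rough estimate.

```java
class SpeedupTimer {
    static volatile long sink;  // keeps the JIT from discarding the work

    // Returns t2 - t1 for one execution of the given program.
    static long elapsedNanos(Runnable program) {
        long t1 = System.nanoTime();
        program.run();
        long t2 = System.nanoTime();
        return t2 - t1;
    }

    // improvement = (ts2 - ts1) / (tp2 - tp1)
    static double improvement(long serialNanos, long parallelNanos) {
        return (double) serialNanos / (double) parallelNanos;
    }

    // Hypothetical CPU-bound workload.
    static long busyWork(long from, long to) {
        long acc = 0;
        for (long i = from; i < to; i++) {
            acc += i % 7;
        }
        return acc;
    }

    public static void main(String[] args) {
        final long n = 200_000_000L;

        long serial = elapsedNanos(() -> {
            sink = busyWork(0, n);
        });

        long parallel = elapsedNanos(() -> {
            Thread a = new Thread(() -> { sink = busyWork(0, n / 2); });
            Thread b = new Thread(() -> { sink = busyWork(n / 2, n); });
            a.start();
            b.start();
            try {
                a.join();
                b.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        System.out.println("improvement = " + improvement(serial, parallel));
    }
}
```

The parallel timing includes thread creation and join, which matches the slide's intent of timing the whole multithreaded execution.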


Structure of Multithread Programs

1. generate data or read data from disks
2. partition data into independent portions
3. create threads and assign tasks
4. start thread execution
5. collect data and analyze results
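The five stages can be sketched with a small Java program that finds the maximum of an array. The class PartitionedMax and the method parallelMax are hypothetical names for this example. Each worker writes only to its own slot of the partial array, so the portions are independent and no mutex is needed.

```java
import java.util.Random;

class PartitionedMax {
    static int parallelMax(int[] data, int numThreads) {
        // 2. partition data into independent portions
        int chunk = (data.length + numThreads - 1) / numThreads;
        int[] partial = new int[numThreads];
        Thread[] workers = new Thread[numThreads];

        // 3. create threads and assign tasks
        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                int best = Integer.MIN_VALUE;
                for (int i = from; i < to; i++) {
                    best = Math.max(best, data[i]);
                }
                partial[id] = best;  // each worker owns one slot
            });
        }

        // 4. start thread execution
        for (Thread w : workers) {
            w.start();
        }

        // 5. collect data and analyze results
        int result = Integer.MIN_VALUE;
        for (int t = 0; t < numThreads; t++) {
            try {
                workers[t].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            result = Math.max(result, partial[t]);
        }
        return result;
    }

    public static void main(String[] args) {
        // 1. generate data
        Random random = new Random(1);
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = random.nextInt();
        }
        System.out.println("max = " + parallelMax(data, 4));
    }
}
```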

Amdahl's Law

• If a fraction x of a program is parallel code, 1 - x is sequential code
• the speedup of using n threads (no sync) is

\[ \text{speedup} = \frac{1}{(1 - x) + \frac{x}{n}} \]

• if x = 0.9 and n → ∞, speedup = 10

[Figure: speedup (0 to 20) versus number of processors (0 to 16) for x = 0.9, 0.99, 0.999, and 1]
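As a worked check of the formula, the following Java sketch evaluates the speedup for a few values of n. The class Amdahl and the method speedup are hypothetical names. With x = 0.9 the value approaches, but never reaches, 10.

```java
class Amdahl {
    // speedup = 1 / ((1 - x) + x / n), x = parallel fraction, n = threads
    static double speedup(double x, double n) {
        return 1.0 / ((1.0 - x) + x / n);
    }

    public static void main(String[] args) {
        double x = 0.9;
        for (int n = 1; n <= 16; n *= 2) {
            System.out.printf("x = %.1f, n = %2d, speedup = %.2f%n",
                    x, n, speedup(x, n));
        }
    }
}
```

For example, x = 0.9 and n = 16 gives 1 / (0.1 + 0.05625) = 6.4, well below the 16 that perfectly parallel code (x = 1) would reach.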

Multi-Thread = High Performance?

• multi-thread ≠ high performance (faster)
• If a program is IO-bound, multi-thread (or multi-core) does not really help.
• Finding sufficient parallelism (making x closer to 1) can be difficult.
• Reduce the sequential parts as much as possible
  – read data from multiple, parallel sources
  – partition data as they arrive
  – create long-living threads and reuse them
  – reduce competition for mutex keys
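One way to "create long-living threads and reuse them" in Java is a fixed-size thread pool from java.util.concurrent. The slides do not name this API, so it is an assumption here, and the class ReusedThreads and the method sumOfSquares are hypothetical. The pool creates at most numThreads threads and reuses them for all submitted tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ReusedThreads {
    // Runs numTasks small tasks on a pool of numThreads reusable threads.
    static long sumOfSquares(int numTasks, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= numTasks; i++) {
                final long value = i;
                // Each task returns its own result; no shared data, no mutex.
                Future<Long> future = pool.submit(() -> value * value);
                results.add(future);
            }
            long total = 0;
            for (Future<Long> future : results) {
                total += future.get();  // waits for that task to finish
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();            // let the pool threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println("sum of squares = " + sumOfSquares(1000, 4));
    }
}
```

Because each task returns a value through its Future instead of updating a shared total, the tasks also avoid competing for a mutex key.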


Superlinear Speedup

• Sometimes, a multithread program's speedup exceeds the number of processors.

\[ \frac{\text{execution time of single thread}}{\text{execution time of multiple threads}} > \text{number of processors} \]

• All modern processors are built upon a series of storage devices, called the memory hierarchy.

[Figure: processor, L1 cache, L2 cache, memory]


Superlinear or Sublinear

[Figure: speedup curves labeled superlinear, linear, and sublinear]


Memory Hierarchy

• Smaller and faster memory (cache) is installed closer to the processor. When data cannot fit into L1 cache, the data are stored in L2 cache, then memory.
• Multiple processors can have larger L1 cache collectively to accommodate more data for faster access. As a result, the program is faster.
• Multi-thread programs can also cause frequent data contention and trigger expensive (i.e. slow) consistency checking code. When this happens, the program can be much slower.


Tile-Based Platform Game

Changing Background

• background map: provides a changing background without a very large actual background
• sky
• tiles


Tile Map

• The tiles do not need individual copies of the images
• collision detection based on the type of a tile
• complex map ⇒ a special map editor
• simple map ⇒ text editor sufficient
• Each object's location is relative to the map (not to the screen) so that a player can scroll left or right easily.
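The last point, keeping locations relative to the map, can be illustrated with a short Java sketch that converts map coordinates to screen coordinates while scrolling. The class MapView and its methods are hypothetical, and keeping the player horizontally centered is an assumption made for this example rather than something the slides specify.

```java
class MapView {
    // Leftmost visible map x-coordinate, keeping the player centered
    // (an assumption) without scrolling past either edge of the map.
    static int offsetX(int playerMapX, int screenWidth, int mapWidth) {
        int offset = playerMapX - screenWidth / 2;
        offset = Math.min(offset, mapWidth - screenWidth);
        offset = Math.max(0, offset);
        return offset;
    }

    // Converts a map-relative x-coordinate to a screen x-coordinate.
    static int toScreenX(int mapX, int offsetX) {
        return mapX - offsetX;
    }

    public static void main(String[] args) {
        int mapWidth = 3200;
        int screenWidth = 800;
        int playerMapX = 1600;
        int offset = offsetX(playerMapX, screenWidth, mapWidth);
        System.out.println("offset = " + offset
                + ", player on screen at x = " + toScreenX(playerMapX, offset));
    }
}
```

Objects never store screen positions. Scrolling changes only the offset, and every object is drawn at its map position minus that offset.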


Tile-Based Collision Detection

If the sprite's next location is (x,y), checking whether collision occurs is easy:

tileMap.get(x,y) ⇒ if a tile exists, collision
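A minimal Java sketch of this check is shown below. The class TileMap, the text format (a space means no tile, any other character is a tile type), and the helper pixelsToTiles are assumptions for illustration. The slide's (x,y) is treated here as tile coordinates, so pixel locations must be converted first.

```java
class TileMap {
    static final char EMPTY = ' ';
    private final char[][] tiles;  // tiles[row][column], one char per tile

    // Builds the map from text rows, as a simple map written in a text editor.
    TileMap(String... rows) {
        tiles = new char[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            tiles[r] = rows[r].toCharArray();
        }
    }

    // Returns the tile type at column x, row y, or EMPTY if there is none.
    char get(int x, int y) {
        if (y < 0 || y >= tiles.length || x < 0 || x >= tiles[y].length) {
            return EMPTY;
        }
        return tiles[y][x];
    }

    // If a tile exists at the next location, a collision occurs.
    boolean collides(int x, int y) {
        return get(x, y) != EMPTY;
    }

    // Converts a pixel coordinate to a tile index (rounds toward -infinity).
    static int pixelsToTiles(int pixels, int tileSize) {
        return Math.floorDiv(pixels, tileSize);
    }

    public static void main(String[] args) {
        TileMap map = new TileMap(
                "    ",
                " ## ",
                "####");
        int tileSize = 32;
        int nextX = pixelsToTiles(40, tileSize);  // pixel 40 is in column 1
        int nextY = pixelsToTiles(35, tileSize);  // pixel 35 is in row 1
        System.out.println("collision: " + map.collides(nextX, nextY));
    }
}
```

Because the map stores one character per tile, many tiles of the same type can share a single image, and the check is one array lookup instead of a comparison against every object.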
