
Page 1: Title

Introduction to Parallel Processing

CS 147, November 12, 2004
Johnny Lai

Page 2: Computing Elements

[Diagram: a multi-processor computing system as a layered stack. From top to bottom: Applications; Programming paradigms; Threads interface; Microkernel; Operating system; Hardware. The hardware layer holds multiple processors (P), each hosting processes and threads.]

Page 3: Two Eras of Computing

[Timeline, 1940 to 2030: the Sequential Era followed by the overlapping Parallel Era. Each era develops through Architectures, then System Software/Compilers, then Applications, then Problem-Solving Environments (P.S.E.s), and each stage passes through R & D, Commercialization, and Commodity phases.]

Page 4: History of Parallel Processing

Parallel processing can be traced to a tablet dated around 100 BC that has three calculating positions. From the multiple positions we can infer that they were used to gain reliability and/or speed.

Page 5: Why Parallel Processing?

Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (e.g., earthquakes), etc.

Sequential architectures are reaching physical limitations (speed of light, thermodynamics).

Page 6: Human Architecture! Growth Performance

[Graph: growth versus age (5, 10, 15, ... 45 years and beyond): vertical growth dominates early in life, horizontal growth later.]

Page 7: Computational Power Improvement

[Graph: computational power improvement (C.P.I.) versus number of processors (1, 2, ...): the multiprocessor curve rises with processor count while the uniprocessor curve stays nearly flat.]

Page 8: Why Parallel Processing?

The technology of parallel processing is mature and can be exploited commercially; there is significant R & D work on the development of tools and environments.

Significant developments in networking technology are paving the way for heterogeneous computing.

Page 9: Why Parallel Processing?

Hardware improvements such as pipelining and superscalar execution are not scalable and require sophisticated compiler technology.

Vector processing works well only for certain kinds of problems.

Page 10: Parallel Program has & needs ...

Multiple "processes" active simultaneously, solving a given problem, in general on multiple processors.

Communication and synchronization between its processes (this forms the core of parallel programming efforts). A minimal sketch of this pattern is shown below.
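As an illustration (not part of the original slides), here is a minimal C++ sketch of the pattern just described, assuming a toy problem of summing an array: several threads attack the problem simultaneously and synchronize their updates to a shared result with a mutex.

    // Minimal sketch: several threads work on one problem at the same
    // time (summing an array) and synchronize their updates via a mutex.
    #include <iostream>
    #include <mutex>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> data(1000);
        std::iota(data.begin(), data.end(), 1);   // data = 1, 2, ..., 1000

        const int num_threads = 4;                // 1000 divides evenly by 4
        const std::size_t chunk = data.size() / num_threads;
        long long total = 0;
        std::mutex total_mutex;                   // guards the shared total
        std::vector<std::thread> workers;

        for (int t = 0; t < num_threads; ++t) {
            workers.emplace_back([&, t] {
                // Each "process" sums its own chunk of the problem...
                auto first = data.begin() + t * chunk;
                long long partial = std::accumulate(first, first + chunk, 0LL);
                // ...then communicates its result under synchronization.
                std::lock_guard<std::mutex> lock(total_mutex);
                total += partial;
            });
        }
        for (auto& w : workers) w.join();         // wait for every worker

        std::cout << "total = " << total << '\n'; // prints: total = 500500
    }

On real MIMD hardware each thread would typically run on its own processor; the mutex-protected update is the communication/synchronization step the slide refers to.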

Page 11: Processing Elements Architecture

Page 12: Processing Elements

Flynn's simple classification (by number of instruction and data streams):

SISD - conventional
SIMD - data parallel, vector computing
MISD - systolic arrays
MIMD - very general, multiple approaches

Current focus is on the MIMD model, using general-purpose processors (no shared memory).

Page 13: SISD: A Conventional Computer

Speed is limited by the rate at which the computer can transfer information internally.

[Diagram: an instruction stream drives a single processor, which turns a data input into a data output.]

Ex: PC, Macintosh, workstations

Page 14: The MISD Architecture

More of an intellectual exercise than a practical configuration. Few have been built, and none are commercially available.

[Diagram: a single data input stream passes through processors A, B, and C, each executing its own instruction stream (A, B, C), to produce the data output stream.]

Page 15: SIMD Architecture

All processors execute the same instruction stream in lockstep on their own data, e.g. Ci <= Ai * Bi.

[Diagram: one instruction stream feeds processors A, B, and C; each processor has its own data input stream (A, B, C) and data output stream (A, B, C).]

Ex: Cray vector-processing machines, Thinking Machines CM*, Intel MMX (multimedia support)

A vectorized sketch of Ci <= Ai * Bi is shown below.
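To make Ci <= Ai * Bi concrete, here is a short sketch (not from the original slides) of the SIMD idea on a modern x86 CPU using SSE intrinsics: a single multiply instruction operates on four float elements at once. Array contents and names are illustrative.

    // SIMD sketch: one instruction (_mm_mul_ps) multiplies four float
    // lanes at once, so each iteration computes C[i] = A[i] * B[i]
    // for four values of i simultaneously.
    #include <immintrin.h>   // SSE intrinsics (x86)
    #include <cstdio>

    int main() {
        alignas(16) float A[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        alignas(16) float B[8] = {8, 7, 6, 5, 4, 3, 2, 1};
        alignas(16) float C[8];

        for (int i = 0; i < 8; i += 4) {
            __m128 a = _mm_load_ps(&A[i]);  // load 4 elements of A
            __m128 b = _mm_load_ps(&B[i]);  // load 4 elements of B
            __m128 c = _mm_mul_ps(a, b);    // one multiply, 4 products
            _mm_store_ps(&C[i], c);         // store 4 results
        }

        for (int i = 0; i < 8; ++i) printf("C[%d] = %g\n", i, C[i]);
    }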

Page 16: MIMD Architecture

Unlike SISD and MISD machines, a MIMD computer works asynchronously:

Shared memory (tightly coupled) MIMD
Distributed memory (loosely coupled) MIMD

[Diagram: processors A, B, and C each execute their own instruction stream (A, B, C) on their own data input stream (A, B, C), producing independent data output streams (A, B, C).]

Page 17: Shared Memory MIMD Machine

[Diagram: processors A, B, and C each attach through a memory bus to one global memory system.]

Communication: a source PE writes data to global memory and the destination PE retrieves it.

Easy to build; conventional SISD operating systems can easily be ported.

Limitations: reliability and expandability. A memory component or processor failure affects the whole system, and increasing the number of processors leads to memory contention.

Ex: Silicon Graphics supercomputers....

A sketch of this communication style is shown below.
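A minimal C++ sketch, assuming two threads standing in for two processing elements (not from the original slides): the source writes into memory visible to both, and the destination retrieves it once an atomic flag signals the data is ready.

    // Shared-memory communication: the writer thread stores a value in
    // memory visible to both threads; the reader retrieves it once an
    // atomic flag signals that the value is ready.
    #include <atomic>
    #include <iostream>
    #include <thread>

    int shared_data = 0;              // a cell in the "global memory"
    std::atomic<bool> ready{false};   // synchronization flag

    int main() {
        std::thread writer([] {
            shared_data = 42;         // source PE writes to global memory
            ready.store(true, std::memory_order_release);
        });
        std::thread reader([] {
            while (!ready.load(std::memory_order_acquire)) {
                // spin until the writer publishes the data
            }
            std::cout << "destination PE read " << shared_data << '\n';
        });
        writer.join();
        reader.join();
    }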

Page 18: Distributed Memory MIMD

[Diagram: processors A, B, and C each attach through a memory bus to a private memory system (A, B, C); the nodes communicate over IPC channels.]

Communication: IPC over a high-speed network, which can be configured as a tree, mesh, cube, etc.

Unlike shared-memory MIMD machines, these systems are easily and readily expandable, and highly reliable: a CPU failure does not affect the whole system.

A message-passing sketch is shown below.
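As an illustration (not from the original slides), here is a message-passing sketch using MPI, a standard API for distributed-memory MIMD machines: each rank owns private memory, so rank 0 must explicitly send its value to rank 1 over the network.

    // Distributed-memory communication: each rank owns private memory,
    // so data moves only by explicit messages over the network.
    // Typical build/run: mpic++ demo.cpp -o demo && mpirun -np 2 ./demo
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int value = 0;
        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, /*dest=*/1, /*tag=*/0,
                     MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, /*source=*/0, /*tag=*/0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d over the IPC channel\n", value);
        }
        MPI_Finalize();
    }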

Page 19: Laws of caution.....

The speed of a computer is proportional to the square of its cost (speed = cost^2), i.e. cost = sqrt(speed).

The speedup achieved by a parallel computer increases as the logarithm of the number of processors: speedup = log2(no. of processors). For example, this rule of thumb predicts a speedup of only log2(1024) = 10 from 1024 processors.

Page 20: Caution....

Very fast development in parallel processing and related areas has blurred concept boundaries, causing a lot of terminological confusion: concurrent computing/programming, parallel computing/processing, multiprocessing, distributed computing, etc.

Page 21:

It's hard to imagine a field that changes as rapidly as computing.

Page 22: Caution....

Even well-defined distinctions like shared memory and distributed memory are merging due to new advances in technology.

Good environments for development and debugging are yet to emerge.

Page 23: Caution....

There are no strict delimiters for contributors to the area of parallel processing: computer architecture, operating systems, high-level languages, databases, and computer networks all have a role to play.

This makes it a hot topic of research.

Page 24: Types of Parallel Systems

Shared Memory Parallel: the smallest extension to existing systems; program conversion is incremental.

Distributed Memory Parallel: completely new systems; programs must be reconstructed.

Clusters: a form of distributed memory parallelism with slow communication.

Page 25: Operating Systems for PP

MPP systems with thousands of processors require an OS radically different from current ones.

Every CPU needs an OS to manage its resources and hide its details.

Traditional systems are heavy and complex, and not suitable for MPP.

Page 26: Operating System Models

A framework that unifies the features, services, and tasks performed.

Three approaches to building an OS:

Monolithic OS
Layered OS
Microkernel-based OS (client-server OS, suitable for MPP systems)

Simplicity, flexibility, and high performance are crucial for an OS.

Page 27: Monolithic Operating System

[Diagram: user-mode application programs call into one block of kernel-mode system services running directly on the hardware.]

Better application performance; difficult to extend.

Ex: MS-DOS

Page 28: Layered OS

Easier to enhance; each layer of code accesses the lower-level interface; low application performance.

[Diagram: user-mode application programs call kernel-mode layers: system services, memory and I/O device management, and process scheduling, with the hardware at the bottom.]

Ex: UNIX

Page 29: Traditional OS

[Diagram: user-mode application programs run on one OS, built as a single unit by the OS designer, in kernel mode on top of the hardware.]

Page 30: New trend in OS design

[Diagram: only a small microkernel runs in kernel mode on the hardware; servers and application programs all run in user mode above it.]

Page 31: Microkernel/Client-Server OS (for MPP Systems)

A tiny OS kernel provides the basic primitives: process, memory, and IPC.

Traditional services become user-level subsystems: OS = Microkernel + User Subsystems, with application performance competitive with a monolithic OS.

[Diagram: client applications, each with a thread library, exchange send/reply messages with user-mode servers (file server, network server, display server) through the microkernel, which runs on the hardware.]

Ex: Mach, PARAS, Chorus, etc.

A send/reply sketch is shown below.
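A minimal sketch, assuming in-process threads standing in for a client and a file server (not from the original slides): the client sends a request and waits for the reply, in the style of the send/reply exchange pictured above. The Channel class is a hypothetical stand-in for a microkernel IPC port, not any real kernel API.

    // Send/reply sketch: a client thread sends a request to a server
    // thread through a tiny queue standing in for a microkernel IPC port.
    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    template <typename T>
    class Channel {                   // hypothetical stand-in for an IPC port
        std::queue<T> q_;
        std::mutex m_;
        std::condition_variable cv_;
    public:
        void send(T msg) {
            { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(msg)); }
            cv_.notify_one();
        }
        T receive() {                 // blocks until a message arrives
            std::unique_lock<std::mutex> lock(m_);
            cv_.wait(lock, [this] { return !q_.empty(); });
            T msg = std::move(q_.front());
            q_.pop();
            return msg;
        }
    };

    int main() {
        Channel<std::string> request, reply;

        std::thread file_server([&] {          // user-mode "file server"
            std::string name = request.receive();
            reply.send("contents of " + name);
        });

        request.send("readme.txt");            // client: Send
        std::cout << reply.receive() << '\n';  // client: wait for Reply
        file_server.join();
    }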

Page 32: A Few Popular Microkernel Systems

MACH (CMU)
PARAS (C-DAC)
Chorus
QNX
(Windows)

Page 33: References

http://www.cs.mu.oz.au
http://www.whatis.com
Computer Systems Organization & Architecture, John D. Carpinelli
http://www.google.com (^_^)