CPEG 852: Advanced Topics in Computing Systems

Program Execution Models (PXMs)

Stéphane Zuckerman

Computer Architecture & Parallel Systems Laboratory
Electrical & Computer Engineering Dept.
University of Delaware
140 Evans Hall, Newark, DE 19716, United States

[email protected]

September 1, 2015

S.Zuckerman CPEG852 – Fall ’15


Outline

1 Introduction
    Components of a Modern HPC System: Hardware
    Components of a Modern HPC System: Software

2 An Introduction to Program Execution Models
    What are Program Execution Models?
    Components of a Program Execution Model
    An Example of PXM: The Von Neumann Model


Modern High-Performance Computer Systems: Hardware & Software


Modern High-Performance Computer Systems: Hardware


Components of a High-Performance Computer System I: The Hardware

HPC Systems: a High-Level View

• One or more compute nodes, connected through fast interconnection networks, e.g., 10 Gbps Ethernet or InfiniBand.

• Each node contains one or more chips, connected through some kind of fast on-board interconnect.

• Chips contain billions of transistors, which they use to provide various ways of parallelizing program execution.

• Possibly additional accelerators/co-processors.


Components of a High-Performance Computer System II: The Hardware

Micro-Processors Overview

• Chips contain billions of transistors, which they use to provide various ways of parallelizing program execution.

    • At the instruction level (ILP)

        • From a micro-architectural point of view: super-scalar and VLIW

        • By leveraging vector registers and instructions (e.g., SSE/AVX for x86, or AltiVec for PowerPC/POWER)

    • At the thread level (TLP)

        • Micro-processor chips may embed multiple hardware threads: coarse-grain (CG-MT), fine-grain (FG-MT), or simultaneous (SMT) multithreading

        • Micro-processor chips may embed multiple cores: chip multiprocessing (CMP)


Components of a High-Performance Computer System III: The Hardware

Types of Accelerators

• AMD or NVIDIA GPUs

    • Each GPU board embeds thousands of hardware threads

    • They are great at “embarrassingly” parallel tasks, but tend to be bad at handling branching

• Intel Xeon Phi

    • Made up of revamped 1st generation Intel Pentium cores (P54C), with an additional 512-bit vector unit per core (which uses its own vector instruction set rather than SSE/AVX)

    • Currently: up to 61 cores / 244 hardware threads

        • 60 cores / 240 threads are usable, as the last core is used for the OS

    • However, optimizing code for high performance on the Xeon Phi requires as much care as on a regular GPU.

• FPGA boards

    • Used to provide some application-specific hardware acceleration

    • May be reconfigured to fit another application

    • But: reconfiguring an FPGA is slow.


Components of a High-Performance Computer System I: The Software

HPC Systems Software: Overview

Writing bare-metal code is doable, but extremely complex. As a result, a whole software ecosystem has been developed over the years.

• The operating system

• The compiler toolchain

• Various runtime systems

• Other tools which bridge everything mentioned above


Components of a High-Performance Computer System II: The Software

The Operating System (OS)

The OS provides an interface between the hardware/software resources and the user.

• Handles resource management: I/O, computing resource allocation, memory management, etc.

• In recent years, OSes have been heavily modified to take CMP and hardware multithreading (HW MT) technologies into account

• OSes have to be fair: some processes may have higher priorities than others, but eventually, every process must have a chance to run on the hardware.


Components of a High-Performance Computer System III: The Software

Compiler Toolchain

• Input: a program written in a programming language

    • Usual input: a program written in a high-level language

    • For some definition of “high-level”: technically, assembly language is high-level compared to machine code

• Output: a program written in a different language

    • Usual output: a program written in a low-level language

    • For some definition of “low-level”

• The compiler must know about the underlying OS, so that it can produce binaries and libraries that the OS can launch.

    • One example of direct interaction between the compiler and the OS: complying with a specific Application Binary Interface (ABI)
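As a toy illustration of the input/output view above, the sketch below translates arithmetic expressions (the “high-level” input) into instructions for a hypothetical stack machine (the “low-level” output). The opcode names `PUSH`, `ADD`, `SUB`, `MUL` and the functions `compile_expr` and `run` are invented for this example; parsing is delegated to Python's standard `ast` module. It is only a sketch and ignores everything a real toolchain has to deal with (ABI, object formats, optimization).

```python
import ast

# Hypothetical stack-machine opcodes, invented for this sketch.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source):
    """Translate an integer arithmetic expression into stack-machine instructions."""
    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
            # Post-order traversal: both operands first, then the operator.
            return emit(node.left) + emit(node.right) + [(OPCODES[type(node.op)], None)]
        raise ValueError("unsupported construct")
    return emit(ast.parse(source, mode="eval").body)

def run(program):
    """Execute the low-level program on the hypothetical stack machine."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
        else:
            right, left = stack.pop(), stack.pop()
            if opcode == "ADD":
                stack.append(left + right)
            elif opcode == "SUB":
                stack.append(left - right)
            else:  # "MUL"
                stack.append(left * right)
    return stack.pop()
```

For instance, `compile_expr("2 + 3 * 4")` yields three `PUSH` instructions followed by `MUL` and `ADD`, and `run` evaluates that program to 14.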


Components of a High-Performance Computer System IV: The Software

Run-Time Systems (RTS)

Runtime systems are very diverse, and each tends to specialize in certain programming tasks. For example:

• MPI, the Message Passing Interface, provides an efficient way to send and receive data in a point-to-point or collective manner over a network

• PThreads allow a programmer to create several threads of execution which can all share the same memory

• Some runtime environments simply provide a way to manage memory using garbage collection

• etc.

RTSes do not have to be fair: they have a specific set of goals to achieve, within the frame provided by the underlying OS (if it exists).
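To give a feel for the point-to-point style mentioned above, here is a minimal sketch in which two “ranks” communicate only through explicit send and receive operations. This is not MPI: the helpers `send` and `recv`, the per-rank mailboxes, and `run_ping_pong` are assumptions made for the example, and the ranks are ordinary Python threads rather than processes spread over a network.

```python
import queue
import threading

def run_ping_pong(rounds):
    """Two 'ranks' exchange values through per-rank mailboxes."""
    mailbox = [queue.Queue(), queue.Queue()]  # mailbox[r] holds messages sent to rank r
    log = []                                  # only rank 0 writes to it

    def send(dest, value):
        mailbox[dest].put(value)

    def recv(me):
        return mailbox[me].get()              # blocks until a message arrives

    def rank0():
        for i in range(rounds):
            send(1, i)                        # ping
            log.append(recv(0))               # wait for the pong

    def rank1():
        for _ in range(rounds):
            send(0, recv(1) * 10)             # reply with a transformed value

    workers = [threading.Thread(target=rank0), threading.Thread(target=rank1)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return log
```

Because every receive blocks until the matching send has happened, the exchange is deterministic even though the two ranks run concurrently.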


Components of a High-Performance Computer System V: The Software

Other Vital System Software

• Linkers take several object files produced by a compiler, and link them into a single executable entity, or a single library of functions.

• Loaders are given binaries to execute. They replace the symbolic addresses contained within the binary with the true addresses of the functions to call.
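The division of labor described above can be sketched with a toy model. The object-file layout (a dictionary with `defines` and `code` entries), the `CALL` instruction, and the functions `link` and `load` are all hypothetical, and real linkers and loaders split the work differently and handle far more (relocation types, shared libraries, and so on).

```python
def link(object_files):
    """Concatenate the code of several object files and record where each symbol lands."""
    image, symbols = [], {}
    for obj in object_files:
        base = len(image)                      # offset of this object inside the image
        for name, offset in obj["defines"].items():
            symbols[name] = base + offset
        image.extend(obj["code"])
    return image, symbols

def load(image, symbols, load_address):
    """Replace symbolic call targets with absolute addresses."""
    resolved = []
    for opcode, operand in image:
        if opcode == "CALL" and isinstance(operand, str):
            operand = load_address + symbols[operand]
        resolved.append((opcode, operand))
    return resolved
```

With one object defining `main` (which calls `helper`) and another defining `helper`, `link` places `helper` right after the code of `main`, and `load` rewrites the symbolic `CALL` target into the load address plus that offset.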

However, how does one describe the execution of a program in such a complex system?


What Are Program Execution Models?

PXMs: A “Horizontal” Definition

A program execution model specifies the interface between a given set of hardware features and resources, and the software that tells them what to do.

PXMs: A “Vertical” Definition

A program execution model specifies the way programs behave across all layers that compose a computer system: hardware, software, and even user-related interactions.


Components of a Program Execution Model

PXMs: Components

• A threading (or concurrency) model

• A memory model

• A synchronization model


Components of a Program Execution Model: The Threading Model

The threading model describes how to create concurrency. In particular, it specifies:

• How to create a new thread, or a group of threads

• How to end an existing thread, or a group of threads

• How and when to schedule existing threads

• How and when to suspend and resume a thread
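As one concrete instance of these questions, the sketch below uses Python's standard `threading` module: a group of threads is created, handed to the scheduler with `start()`, and ended by letting each thread function return, which the creator observes with `join()`. The function `run_group` is a name chosen for this example, and when each thread actually runs is left entirely to the interpreter and the OS.

```python
import threading

def run_group(n_threads):
    """Create a group of threads, let them be scheduled, then wait for all of them to end."""
    results = [None] * n_threads

    def body(tid):
        results[tid] = tid * tid       # each thread writes only its own slot

    # Creation: the threads exist but are not running yet.
    group = [threading.Thread(target=body, args=(tid,)) for tid in range(n_threads)]
    for thread in group:
        thread.start()                 # the thread becomes eligible for scheduling
    for thread in group:
        thread.join()                  # suspend the caller until that thread has ended
    return results
```

Calling `run_group(4)` returns `[0, 1, 4, 9]` regardless of the order in which the threads were scheduled, because no two threads touch the same slot.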


Components of a Program Execution Model: The Memory Model

Traditionally, memory models cover two aspects.

• The addressing mode. This includes:

    • Shared vs. private memory. Example: threads vs. processes in UNIX

    • Shared memory vs. distributed memory. Example: PThreads vs. MPI

    • Physically vs. logically shared memory, a.k.a. Distributed Shared Memory (DSM)

    • Physical vs. virtual memory

        • Embedded systems tend to access memory directly through physical addresses

        • More general-purpose systems tend to rely on virtual memory

• The memory consistency model, i.e.:

    • If a memory location is concurrently accessed by two or more memory operations in a parallel system, what is the observed order?

    • What are the rules to alter memory ordering?

    • Can multiple values pertaining to the same memory location be observed by different actors?
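The “observed order” question can be made concrete with a classic two-thread test: thread 0 stores 1 to `x` and then loads `y`, while thread 1 stores 1 to `y` and then loads `x`. The sketch below enumerates every interleaving allowed by sequential consistency, one particular consistency model chosen here as an assumption, and collects the possible values of the two loads. The names `interleavings`, `THREAD0`, `THREAD1`, and `sequentially_consistent_outcomes` are invented for the example.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Each operation is (kind, memory location, value to store or register to fill).
THREAD0 = [("store", "x", 1), ("load", "y", "r0")]
THREAD1 = [("store", "y", 1), ("load", "x", "r1")]

def sequentially_consistent_outcomes():
    """Return the set of (r0, r1) pairs reachable under sequential consistency."""
    outcomes = set()
    for schedule in interleavings(THREAD0, THREAD1):
        memory = {"x": 0, "y": 0}      # both locations start at 0
        registers = {}
        for kind, location, operand in schedule:
            if kind == "store":
                memory[location] = operand
            else:
                registers[operand] = memory[location]
        outcomes.add((registers["r0"], registers["r1"]))
    return outcomes
```

Under this model the pair (0, 0) never appears: at least one of the stores precedes both loads in any interleaving. Weaker models, for example ones in which stores may sit in a per-core buffer before reaching memory, can also allow (0, 0), which is exactly why the consistency model has to be stated.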


Components of a Program Execution Model: The Synchronization Model

The synchronization model determines how concurrent and parallel threads can coordinate with each other. Some examples:

• When a thread is woken up after an event was triggered

• Whether some data transfer is blocking or non-blocking

• How to acquire a lock and thus guarantee mutually exclusive access to a region of code

• How to force a group of threads or processes to wait for each other before continuing the program’s execution; etc.
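Three of these mechanisms can be seen side by side in a short sketch built on Python's standard `threading` module: an `Event` that wakes waiting threads, a `Lock` that protects a shared counter, and a `Barrier` that makes the whole group wait for each other. The function `synchronized_run` and its parameters are names chosen for this example; other programming models expose different primitives.

```python
import threading

def synchronized_run(n_threads, increments):
    """Increment a shared counter from several threads, then read it after a barrier."""
    counter = {"value": 0}
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)
    go = threading.Event()
    seen_after_barrier = []

    def body():
        go.wait()                          # sleep until the event is triggered
        for _ in range(increments):
            with lock:                     # one thread at a time updates the counter
                counter["value"] += 1
        barrier.wait()                     # nobody continues until everyone has arrived
        with lock:
            seen_after_barrier.append(counter["value"])

    threads = [threading.Thread(target=body) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    go.set()                               # wake every thread waiting on the event
    for thread in threads:
        thread.join()
    return counter["value"], seen_after_barrier
```

Since every thread finishes its increments before passing the barrier, each of them reads the final total afterwards: with 4 threads and 1000 increments each, all four observe 4000.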


PXMs in One Picture


An Example of PXM: The Von Neumann Model

Basic Components of the von Neumann Model

The von Neumann model relies on an abstract machine composed of:

• A control unit

• A central processing unit

• A memory unit

• An input/output “hub”

• Data, address, and control buses

Execution Rules

1 The program counter (PC) contains the address of the next instruction to execute.

2 The PC is automatically incremented by the size of the instruction currently executing.

3 Branching instructions may overwrite the PC with arbitrary values.
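The three execution rules fit in a few lines of code. The sketch below simulates a hypothetical accumulator machine: the instruction set (`LOAD`, `ADD`, `STORE`, `JNZ`, `HALT`), the two-cell instruction encoding, and the functions `run` and `sum_down_from` are all assumptions made for the example. Code and data live in the same memory, as in the von Neumann model.

```python
def run(memory, max_steps=1000):
    """Fetch-execute loop; each instruction occupies two cells: opcode, operand."""
    pc, acc = 0, 0
    for _ in range(max_steps):
        opcode, operand = memory[pc], memory[pc + 1]   # rule 1: fetch at the PC
        pc += 2                                        # rule 2: PC += instruction size
        if opcode == "LOAD":
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "JNZ":                          # rule 3: a branch overwrites the PC
            if acc != 0:
                pc = operand
        elif opcode == "HALT":
            return memory
    raise RuntimeError("step limit reached")

def sum_down_from(n):
    """Compute n + (n - 1) + ... + 1 on the simulated machine (expects n >= 1)."""
    code = ["LOAD", 16, "ADD", 17, "STORE", 16,    # total = total + n
            "LOAD", 17, "ADD", 18, "STORE", 17,    # n = n - 1 (the accumulator holds n)
            "JNZ", 0,                              # loop back to address 0 while n != 0
            "HALT", 0]
    data = [0, n, -1]                              # cells 16, 17, 18: total, n, constant -1
    return run(code + data)[16]
```

For example, `sum_down_from(3)` loops three times and leaves 6 in the `total` cell.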


The Von Neumann Model

[Figure: block diagram of the von Neumann model. Labeled elements: the CPU (Central Processing Unit), made of a Control Unit (Instruction Register, Program Counter) and a Processing Unit (Arithmetic & Logic Unit, Accumulator); the Memory, with its Memory Address Register (MAR) and Memory Data Register (MDR); the I/Os, with inputs (keyboard, mouse, HDD / SSD, ...) and outputs (display, LED, HDD / SSD, ...); and the address, data, and control buses.]


Extending the Von Neumann Model for Parallelism

• Need to duplicate the program counters (PCs)

• What happens to memory accesses?

• What happens to inter-core/CPU synchronization?

No two parallel systems are alike when it comes to answering those questions.
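A small simulation shows why these questions matter. In the sketch below, two cores each have their own PC and accumulator but share a single memory, and both run the same three-instruction program that increments a counter. The instruction set, the memory layout, and the functions `step` and `run_two_cores` are hypothetical, and the `schedule` argument (the order in which cores execute one instruction each) stands in for whatever the hardware would actually do.

```python
def step(core, memory):
    """Execute one instruction on the given core against the shared memory."""
    opcode, operand = memory[core["pc"]], memory[core["pc"] + 1]
    core["pc"] += 2
    if opcode == "LOAD":
        core["acc"] = memory[operand]
    elif opcode == "ADD":
        core["acc"] += memory[operand]
    elif opcode == "STORE":
        memory[operand] = core["acc"]
    elif opcode == "HALT":
        core["pc"] -= 2                    # a halted core stays on its HALT instruction

def run_two_cores(schedule):
    """Run 'counter = counter + 1' on two cores; schedule lists which core steps next."""
    memory = ["LOAD", 8, "ADD", 9, "STORE", 8, "HALT", 0,   # code, cells 0 to 7
              0, 1]                                         # data: counter, constant 1
    cores = [{"pc": 0, "acc": 0}, {"pc": 0, "acc": 0}]      # one PC and accumulator per core
    for core_id in schedule:
        step(cores[core_id], memory)
    return memory[8]
```

If core 0 finishes before core 1 starts (`[0, 0, 0, 1, 1, 1]`), the counter ends at 2. If the two cores alternate (`[0, 1, 0, 1, 0, 1]`), both load 0 and both store 1, so one update is lost. Without rules for memory accesses and synchronization, the result depends on the interleaving.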
