45
Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Jay Swaminathan

Senior Application Engineer

Software Solutions Group

Intel

Multi-core Programming

Page 2: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Topics

• CPU (semiconductor) History

• Moore’s Law & Multi-Core

• Transistor scaling & its impact

• What then? The new era…

• Software Parallelization

• Performance/ Threading Methodology

• Software tools

Page 3: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

What can we do with faster Computers?

Solve problems faster

• Reduce turn-around time of big jobs

• Increase responsiveness of

interactive apps

Get better solutions in the same

amount of time

• Increase resolution of models

• Make model more sophisticated

0

20

40

60

80

100

120

0 5 10

Processors

Tim

e

0

100

200

300

400

500

600

700

0 5 10

ProcessorsP

rob

lem

Siz

e

Page 4: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Why Parallel Computing?

“The free lunch is over: A Fundamental Turn Toward

Concurrency” – Herb Sutter, Dr. Dobb’s Journal, March 2005

We want applications to execute faster…

10 GHz

1 GHz

100 MHz

10 MHz

1 MHz

’79 ’87 ’95 ’03 ’11

Clock speeds no

longer increasing

exponentially

Page 5: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Computing Landscape

What can we expect in the future?

Many-core arrayCMP with 10s-100s low

power coresScalar coresCapable of TFLOPS+Full System-on-ChipServers, workstations, embedded…

Dual core

Symmetric multithreading

Multi-core arrayCMP with ~10 cores

Large, Scalar coresfor high single-thread performance

Scalar plus many cores for highly

threaded workloads

Micro2015: Evolving Processor Architecture, Intel® Developer Forum, Mar 2005

Page 6: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Transformation of Landscape

Tier-1 compute resourceA cluster of multicores, and soon heterogeneous manycores

Tier-2 compute resource General purpose computation using:- GPU (Graphics Processing Unit)- FPGA (Field Programmable Gate Array)- Cell Processor

Page 7: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

How can we accomplish this?

Through parallel-computing…

Attempt to speed solution of a particular task by1. Dividing task into sub-tasks

2. Executing sub-tasks simultaneously on multiple processors

Successful attempts require both1. Understanding of where parallelism can be effective

2. Knowledge of how to design and implement good solutions

Page 8: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Topics

• CPU (semiconductor) History

• Moore’s Law & Multi-Core

• Transistor scaling & its impact

• What then? The new era…

• Software Parallelization

• Performance/ Threading Methodology

• Identifying Threading Issues using Analysis Tools

Page 9: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Moore’s Law

Page 10: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Historical Driving Forces

0.01

0.1

1

10

1970 1980 1990 2000 2010 2020

FeatureSize(um)

Shrinking Geometry

2005Montecito

1.7B Transistors

19714004 Processor

2300 Transistors

19788008 Processor

IBM PC

1986i386 Processor

32-bit

1993Pentium Processor3.1M transistors

1

10

100

1000

10000

100000

1970 1980 1990 2000 2010 2020

Increased Frequency

Frequency(MHz)

Page 11: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Architectural Innovations

• Serial, sequential execution

• Overlapped execution (pipelining)

• Multi-stage, deep pipelining

• Control-speculative execution

• Data-speculative execution

• Super-scalar execution

• Out-of-order execution

• Vector computing

• Addressing extensions

• Application specific instructions

• Multi-level on-chip caching

• Memory disambiguation

• Register renaming

• Score-boarding

• Hardware data prefetching

• …

Many decades of computer architecture focused on

Instruction-Level Parallelism (ILP) enhancement

Page 12: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Processor Resources

- Caches: L0, L1, L2 etc (Different levels of caches)

- General Purpose Registers (For SW programming)

- Segment Registers & TLB (for memory management)

- FP registers, XMM registers

- System Flags

- Control and Data registers, Debug registers

- Many more

Page 13: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Why Multi-core? The Challenges

30nm

45nm65nm

90nm

0.13um

0.18um

0.25um

0.35um

0.5um

0.7um

0.1

1

10

1990 1993 1997 2001 2005 2009

~30%SupplyVoltage

(V)

Diminishing Voltage ScalingPower Limitations

Power = Capacitance x Voltage2 x Frequencyalso

Power ~ Voltage3

Page 14: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Heat DissipationP

ow

er

Den

sit

y

(W/c

m2

)

4004

8008

8080

8085

8086

286386

486

Pentium®

processors

1

10

100

1,000

10,000

’70 ’80 ’90 ’00 ’10

Hot Plate

Nuclear Reactor

Rocket Nozzle

Sun’s Surface

Projected

Page 15: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Max Frequency

Power

Performance

1.00x

What then?

Page 16: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Over-clocked

(+20%)

1.73x

1.13x1.00x

Max Frequency

Power

Performance

Over-clocking

Page 17: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Over-clocked

(+20%)

Under-clocked

(-20%)

0.51x

0.87x1.00x

1.73x

1.13x

Max Frequency

Power

Performance

Under-clocking

Page 18: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Over-clocked

(+20%)

1.00x

Relative single-core frequency and Vcc

1.73x

1.13x

Max Frequency

Power

Performance

Dual-core

(-20%)

1.02x

1.73x

Dual-Core

Multi-Core

Energy-Efficient Performance

Page 19: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Dual core with voltage scaling

Area = 1

Voltage = 1

Freq = 1

Power = 1

Perf = 1

Area = 2

Voltage = 0.85

Freq = 0.85

Power = 1

Perf = ~1.8

Frequency

Reduction

Power

Reduction

Performance

Reduction

15% 45% 10%

A 15%

Reduction

In Voltage

Yields

SINGLE CORE DUAL CORE

RULE OF THUMB

Page 20: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

A New Era…

PerformanceEquals Frequency

Unconstrained Power

Voltage Scaling

PerformanceEquals IPC

THE OLD

THE NEW

Multi-Core

MicroarchitectureAdvancements

Power Efficiency

Page 21: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Multi-Core: what next?

Page 22: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Tera-scale Computing

Terabytes

TIPS

Gigabytes

MIPS

Megabytes

GIPS

Perf

orm

ance

Dataset SizeKilobytes

KIPS

Mult-Media

3D &Video

Text

RMS Personal Media Creation and Management

Entertainment

Learning & Travel

HealthSingle Core

Multi-core

Tera-scale

IPS =

Instr

uction p

er

second

“RMS” ApplicationsRecognition

MiningSynthesis

Page 23: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Intel Polaris (80-core)

Page 24: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Multi-core: Architectural Challenges

- Instruction-level parallelism v/s Thread-level parallelism tradeoffs

- Shared resource management (functional units, caches, TLB, BTB)

- Multi-threading v/s Multi-core tradeoffs

- On and Off-chip bandwidth requirements

- Latencies (execution, cache, and memory) reduction

- Memory Coherence/Consistency (for high speed on-die cache hierarchies)

- Partitioning resources (between threads/cores)

- Fault tolerance (at device, storage, execution, core level) (aka reliability)

- On-die interconnect (optimized along latency, HW, modularity, power, ...)

Page 25: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Multi-core: Eco-system challenges

Underlying Software assumptions on resource sharing

• Lack of standard mechanisms to share “resource sharing info” between

hw and OS

Lack of “Resource sharing” aware SW

• Compilers, Schedulers, Configuration/Management (Power!) etc

Legacy SW architectural requirements left on Multi-Core CPUs

• Compatibility requirements

Many more…unknowns (to CPU Design world)

Page 26: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Topics

• CPU (semiconductor) History

• Moore’s Law & Multi-Core

• Transistor scaling & its impact

• What then? The new era…

• Software Parallelization

• The need of the day…

• Performance/ Threading Methodology

• Identifying Threading Issues using Analysis Tools

Page 27: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Games & Multithreading

Today

• Few current game platforms have multi-core architectures

• Multithreading pain often not worth performance gain

• Most games are single-threaded (or mostly single-threaded)

Future

• Game platforms will have multi-core architectures (PCs & Game consoles)

• Games wanting to maximize performance will be multithreaded

Need to address this change! Thread for efficiency.

Page 28: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Several Applications can be Threaded

• Parallel implementation of MPEG-2 & H264 Encoding

• Divide frame into multiple blocks & encode them in parallel

• Threading High Performance Computing Applications

• Threading CAD and S/W Modeling Applications

• Threading Interactive Applications for better UI feel

• Encryption, Game Theory and Number Crunching

• And many more…

Page 29: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Designing Threaded Programs

Partition

• Divide problem into tasks

Communicate

• Determine amount and pattern of communication

Agglomerate

• Combine tasks

Map

• Assign agglomerated tasks to created threads

TheProblem

Initial tasks

Communication

Combined Tasks

Final Program

Page 30: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Ways of Exploiting Parallelism

• Domain Decomposition (a.k.a data decomposition)

• Task Decomposition (a.k.a functional decomposition)

• Pipelining (a.k.a “assembly line” parallelism)

Page 31: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Domain Decomposition

1. Decide how data elements should be divided

among processors

2. Decide which tasks each processor should be doing

Example: Find largest element in an array

CPU 0 CPU 1 CPU 2 CPU 3

Page 32: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Task Decomposition

1. Divide tasks among processors

2. Decide which data elements are going to be accessed (read

and/or written) by which processors

Ex: Event handler for GUIf()

s()

r()q()h()

g()

CPU 1

CPU 0

CPU 2

f()

g()

r()

h() q()r()

s()

Page 33: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Pipelining

Special kind of task parallelism

a.k.a “Assembly line” parallelism

Ex: 3D rendering in computer graphics

Stage 4Stage 3Stage 2Stage 1

Processing 1

data set

4 stepsProcessing 2

data sets

5 steps

Page 34: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

How Do I Parallelize An

Application?

Page 35: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Five Major Parallel Methods

Which one do I use?

• Message passing• MPI, PVM

• Explicit threading• Windows threading API, Pthreads, Solaris threads, Java thread class

• Compiler-directed• Automatic parallelization, OpenMP, Intel Threading Building Blocks

• Parallel programming languages• HPF, CAF, UPC, CxC, Cilk

• Parallel math libraries• ScaLAPACK, PARDISO, PLAPACK, PETsc

Page 36: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Common Performance Issues

Parallel Overhead

• Due to thread creation, scheduling..

Synchronization

• Excessive use of global data, contention for the same synchronization object

Load balance

• Improper distribution of parallel work

Granularity

• No sufficient parallel work

System Issues

• Memory bandwidth, false sharing

Page 37: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

What is scalability?

Handle growing amounts of work in a graceful manner

What resources might be increased?

• Cores and threads

• Memory capacity

• Data, problem size

“What is it that we really mean by scalability? A service is said to be scalable if when we increase the resources in a system, it results in increased performance in a manner proportional to resources added.”

-- Werner Vogels

CTO - Amazon.com

Page 38: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Amdahl’s Law

Describes the upper bound of parallel speedup (scaling)

Helps think about the effects of overhead

Tserial

P(1-P)

(1-

P)

P/2

n=2n=∞P/∞

Tparallel = {(1-P) + P/n} Tserial + O

where, n = number of processors

Scaling = Tserial / Tparallel

0.5 + 0.25

= 1.0/0.75 = 1.33

0.5 + 0

= 1.0/0. 5 = 2

Serial code limits scaling…

Page 39: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Effect of Serial Code on Scalability

Amdahl's Law

1

10

100

1 2 4 8 16 32

Number of Processors

Sc

ala

bilit

y

0% 10% 20% 30% 50%

Page 40: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Performance Tuning: Top-Down Approach

System Config

Topology

Network I/O

Disk I/O

Database Tuning

OS

Application Design

Threading/ APIs usage

Data Locality

Locks/ Heap Contention

Cache Utilization

Architecture Pitfalls

Branches and Loop Code

Execution Efficiency

System

Application

Architecture

Page 41: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Optimization Methodology

Baseline

Collect

Data

Identify

Bottlenecks

Identify

Alternatives

Apply

Solution

Test

u

v

wx

y

Address one

issue at a time

Page 42: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Development Cycle – Optimization Tools

Analysis

–VTune™ Performance Analyzer

Design (Introduce Threads)

–Intel® Performance libraries: IPP and MKL

–OpenMP* (Intel® Compiler)

–Explicit threading (Win32*, Pthreads*)

Debug for correctness

–Intel® Thread Checker

–Intel Debugger

Tune for performance

–Intel® Thread Profiler

–VTune™ Performance Analyzer

Page 43: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Optimization ToolsMaximize Multi-Threaded Application Performance & Scalability

with Intel® Software Development Products

Intel® Compilers The best way to get C++ and Fortran application performance on Intel®

multi-core processors

+ SSE4 support through intrinsics and inline _asm

+ Code generation optimized for Nehalem

+ OpenMP improvements

Intel® VTune™

Performance

Analyzers

Identify bottlenecks in source code and optimize multi-core performance

+ Performance Tuning Utility (PTU) introduces Data access modeling

prototype on whatif.intel.com

Intel® Performance

Libraries

Highly optimized, thread-safe, multimedia and HPC math functions

+ SSE4 used where applicable

+ Accommodate microarchitecture changes for increased performance

Intel® Threading

Analysis Tools

Find threading errors and optimize threaded applications for maximum

performance

Intel® Threading

Building Blocks

C++ template-based runtime library that simplifies

writing multithreaded applications for performance and scalability

Page 44: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

Copyright © 2007, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Rules of Thumb – For Parallel Programming

• Think parallel.

• Program using abstraction.

• Program in tasks (chores), not threads (cores).

• Design with the option to turn concurrency off.

• Avoid using locks.

• Use tools and libraries designed to help with concurrency.

• Use scalable memory allocators.

• Design to scale through increased workloads

Page 45: Multi-core Programming - Intel Developer Zone€¦ · Jay Swaminathan Senior Application Engineer Software Solutions Group Intel Multi-core Programming

End of Session

Thank you

Q&A