Software Challenges to Multi-Core. Prof. Weimin Zheng, Dept. of Computer Science and Technology, Tsinghua University


Page 1: Software Challenges to Multi- Core

Software Challenges to Multi-Core

Prof. Weimin Zheng
Dept. of Computer Science and Technology
Tsinghua University

Page 2: Software Challenges to Multi- Core

Outline

1. Biggest Challenge
  – Parallel Programming
  – Debugging Parallel Programs

2. Efforts on Multi-core curriculum development

Page 3: Software Challenges to Multi- Core

Software Stacks for Multicore Processors

(Layers of the stack, from top to bottom)

• Application Frameworks: application frameworks and libraries, e.g., ESSL, graphics libraries, imaging libraries, security libraries
• Domain-specific prog models & langs: domain-specific programming models that encapsulate concurrency, e.g., Map-Reduce (Google Sawzall), Pub-Sub (JMS), Stream Processing Engines (Distillery)
• Middleware: middleware in support of domain-specific programming models & frameworks, e.g., J2EE containers, relational databases, Async Beans
• General-purpose prog languages: infrastructure languages that simplify concurrent programming for C/C++/Java/C# programmers, e.g., MS Concur, IBM X10, OpenMP
• Programming Tools: tools for improved functional/performance quality of concurrent software
• Static Compilers: speculative parallelism, assist threads, SIMDization, code partitioning
• Dynamic compilers, VMs, Lang Runtime: dynamic code specialization, speculative parallelism, assist threads, SIMDization, code partitioning
• System Libraries: libraries that encapsulate concurrency, e.g., Java Concurrency Utilities, Transactional Memory, Cell libs, TI OMAP
• OS and Hypervisors: dynamic scalable management of heterogeneous resources per core (frequency, power)

By Vivek Sarkar, IBM

Page 4: Software Challenges to Multi- Core

Parallel Programming Models

• What's the situation now
  – Scientific and Engineering Computing
    • The programmers are Ph.D.s with computer science or science/engineering degrees
    • They program with MPI and OpenMP. It is painful, but they are clever enough to do it.
  – Server / Datacenter
    • Multithreaded programming with Pthreads, Java threads or Windows threads
    • Application frameworks, such as J2EE, hide the complexity of synchronization.
    • Programmers are generally happy ☺

Page 5: Software Challenges to Multi- Core

Parallel Programming Models

  – Desktop and Embedded Systems
    • Apart from a few applications, such as media / image processing / 3D, most software is still sequential code, which cannot gain any speedup from the (free) extra cores in multicore processors
    • A typical VB programmer got some programming training after graduating from another major and doesn't know much about locks, thread safety, etc. How can they start to write parallel code?

Page 6: Software Challenges to Multi- Core

Parallel Programming Models

• Impact of multi-core processors
  – Scientific / Engineering Computing
    • MPI can execute on multi-core, but may need to be optimized
    • OpenMP may become more popular
    • A simpler programming model is still desired to reduce development time and to support fault tolerance
  – Server / Datacenter
    • Multicore is just a cheaper SMP machine; old code can run without (big) problems.

Page 7: Software Challenges to Multi- Core

Parallel Programming Models

• Desktop / Embedded
  – OpenMP is too complex for ordinary programmers to write correct/fast code

Page 8: Software Challenges to Multi- Core

Pi Calculation – Sequential Code

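    #include <stdio.h>

    static long num_steps = 100000;
    double step;

    int main(void)
    {
        int i;
        double x, pi, sum = 0.0;

        step = 1.0 / (double) num_steps;
        for (i = 0; i < num_steps; i++) {
            x = (i + 0.5) * step;             /* midpoint of the i-th interval */
            sum = sum + 4.0 / (1.0 + x * x);
        }
        pi = step * sum;
        printf("pi = %.10f\n", pi);
        return 0;
    }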
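For reference, the loop is a midpoint-rule approximation of a standard integral for π. Written out, with N standing for num_steps and Δ for step (symbols introduced here only for illustration):

```latex
\pi \;=\; \int_0^1 \frac{4}{1+x^2}\,dx
\;\approx\; \Delta \sum_{i=0}^{N-1} \frac{4}{1+x_i^2},
\qquad x_i = (i + 0.5)\,\Delta, \quad \Delta = \frac{1}{N}.
```

Each term of the sum depends only on i, which is why the iterations can be distributed across cores as long as the accumulation into sum is handled correctly.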

Page 9: Software Challenges to Multi- Core

OpenMP Version

    #include <omp.h>
    #include <stdio.h>

    static long num_steps = 100000;
    double step;
    #define NUM_THREADS 2

    int main(void)
    {
        int i;
        double x, pi, sum = 0.0;

        step = 1.0 / (double) num_steps;
        omp_set_num_threads(NUM_THREADS);

        /* each thread gets its own x and its own partial sum */
        #pragma omp parallel for reduction(+:sum) private(x)
        for (i = 0; i < num_steps; i++) {
            x = (i + 0.5) * step;
            sum = sum + 4.0 / (1.0 + x * x);
        }
        pi = step * sum;
        printf("pi = %.10f\n", pi);
        return 0;
    }

On the original slide the OpenMP directives are shown in red. However, if you omit any part of

    reduction(+:sum) private(x)

you will get wrong results in SOME executions, without any warning!
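To make the role of the two clauses more concrete, here is a minimal sketch of roughly what they ask the compiler to arrange: one private x and one private partial sum per thread, combined under mutual exclusion at the end. The function name pi_manual_reduction and the variable local_sum are made up for this illustration; this is a hand-written approximation of the idea, not what any particular OpenMP implementation generates.

```c
static long num_steps = 100000;

/* Hand-written approximation of reduction(+:sum) private(x).
 * If compiled without OpenMP support, the pragmas are ignored and the
 * function simply runs sequentially. */
double pi_manual_reduction(void)
{
    double step = 1.0 / (double) num_steps;
    double sum = 0.0;                  /* shared by all threads */

    #pragma omp parallel
    {
        double x;                      /* declared in the region: one copy per thread */
        double local_sum = 0.0;        /* per-thread partial sum (hypothetical name) */
        long i;                        /* per-thread loop index */

        #pragma omp for
        for (i = 0; i < num_steps; i++) {
            x = (i + 0.5) * step;
            local_sum += 4.0 / (1.0 + x * x);
        }

        /* Combine the partial sums one thread at a time. */
        #pragma omp critical
        sum += local_sum;
    }
    return step * sum;
}
```

Calling pi_manual_reduction() from main and printing the result should give a value close to 3.14159 for any thread count. Dropping the private copy of x or the critical section reintroduces the silent race the slide warns about.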

Page 10: Software Challenges to Multi- Core

Trends in Parallel Programming Models

– Transactional Memory
– MapReduce
– Auto-parallelization

Page 11: Software Challenges to Multi- Core

A lock-based program

Page 12: Software Challenges to Multi- Core

Transactional memory
• Use Atomic to specify the atomicity requirement in code
• Easy to write correctly.
• The runtime automatically detects violations of the atomicity rule.

    void pushFlow(Vertex v1, Vertex v2, double f) {
        Atomic {
            if (v2.excess > f) {
                v1.excess += f;
                v2.excess -= f;
            }
        }
    }
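For contrast with the Atomic block, a lock-based version of the same operation might look like the following sketch in C with POSIX mutexes. The Vertex layout (only an excess field plus a lock), the name push_flow_locked, and the address-ordering rule are assumptions made for this illustration; the code on the original "lock-based program" slide is not recoverable from this transcript.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical vertex type: the slide only shows the `excess` field. */
typedef struct {
    double excess;
    pthread_mutex_t lock;
} Vertex;

/* Lock-based counterpart of pushFlow. Both locks are always taken in
 * address order, so concurrent calls with (a, b) and (b, a) cannot
 * deadlock. Assumes v1 and v2 are different vertices. */
void push_flow_locked(Vertex *v1, Vertex *v2, double f)
{
    int v1_first = (uintptr_t) v1 < (uintptr_t) v2;
    Vertex *first  = v1_first ? v1 : v2;
    Vertex *second = v1_first ? v2 : v1;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);

    if (v2->excess > f) {
        v1->excess += f;
        v2->excess -= f;
    }

    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}
```

The programmer has to remember the lock order and the unlock calls on every path, which is the bookkeeping the Atomic block is meant to take over.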

Page 13: Software Challenges to Multi- Core

Map-Reduce: The first instruction for the Data Center

• Map-Reduce was proposed by Google
  – Programmers only write sequential code
  – The runtime library schedules the tasks to run in parallel, and handles fault tolerance and load balancing
  – Runs in both multicore and distributed environments
  – Applications
    • Reverse index
    • Machine learning algorithms
    • Machine translation
    • Multimedia retrieval
    • Clustering
    • …
    • More than 2000 Map-Reduce applications at Google alone
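The division of labour described above can be sketched in plain C: the programmer supplies a sequential map function and a sequential reduce function, and a runtime decides how to run them. Everything named here (map_count, reduce_sum, run_map_reduce, the letter-counting task) is hypothetical and chosen for illustration; it is not Google's API, and the "runtime" below is a sequential stand-in with no scheduling or fault handling.

```c
#include <stddef.h>

#define NUM_LETTERS 26

/* User-written map: count the letters a-z in one input chunk. */
static void map_count(const char *chunk, long counts[NUM_LETTERS])
{
    for (size_t i = 0; chunk[i] != '\0'; i++) {
        char c = chunk[i];
        if (c >= 'A' && c <= 'Z')
            c = (char) (c - 'A' + 'a');
        if (c >= 'a' && c <= 'z')
            counts[c - 'a']++;
    }
}

/* User-written reduce: merge one partial result into the total. */
static void reduce_sum(long total[NUM_LETTERS], const long partial[NUM_LETTERS])
{
    for (int k = 0; k < NUM_LETTERS; k++)
        total[k] += partial[k];
}

/* Stand-in for the runtime. A real one would run the map calls in
 * parallel and re-run failed tasks; here they run one after another. */
void run_map_reduce(const char *chunks[], size_t n, long total[NUM_LETTERS])
{
    for (int k = 0; k < NUM_LETTERS; k++)
        total[k] = 0;
    for (size_t j = 0; j < n; j++) {
        long partial[NUM_LETTERS] = {0};
        map_count(chunks[j], partial);
        reduce_sum(total, partial);
    }
}
```

Because each map call touches only its own chunk and its own partial array, the calls are independent, which is what lets a runtime distribute them over cores or machines without the programmer writing any synchronization.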

Page 14: Software Challenges to Multi- Core

Code snippets of Map-Reduce

We are trying to use Map-Reduce to support future heterogeneous multi/manycore processors, including GPUs such as Intel Larrabee

Page 15: Software Challenges to Multi- Core

Auto-Parallelization

• Hard, but still very important
  – The only possible approach to porting legacy code to multicore processors to get speedup
  – Auto-parallelization on multicore does not need linear speedup: a 30% speedup on dual-core is not too bad if you get it automatically and consume less power
  – Speculative parallelization is on its way to go one step further
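Part of what makes auto-parallelization hard is that the compiler must prove loop iterations independent before it may run them concurrently. The two loops below are a minimal sketch of the difference; the function names are made up for this illustration, and whether a given compiler actually parallelizes the first loop depends on the compiler and its options.

```c
/* Independent iterations: a[i] depends only on b[i]. The restrict
 * qualifiers promise that a and b do not overlap, so the iterations
 * may run in any order or in parallel. */
void scale(double *restrict a, const double *restrict b, int n, double k)
{
    for (int i = 0; i < n; i++)
        a[i] = k * b[i];
}

/* Loop-carried dependence: iteration i reads a[i - 1], which was
 * written by iteration i - 1, so the iterations cannot simply be
 * split across cores. */
void prefix_sum(double *a, int n)
{
    for (int i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

Legacy code is full of loops where the dependence is not this obvious (pointers that may alias, calls with side effects), which is where speculative techniques try to help.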

Page 16: Software Challenges to Multi- Core

Debugging parallel code is difficult
• Non-deterministic behavior of parallel code
  – Caused by data races
• Parallel code may have deadlocks

By Yuanyuan Zhou et al., "Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics", ASPLOS 2008
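Why a data race makes the result depend on timing can be shown without real threads. The sketch below simulates two "threads" that each perform counter++ as three separate steps (load, add, store) and replays two possible interleavings by hand. All names are hypothetical, and this is a single-threaded simulation for illustration, not a race detector.

```c
/* One simulated thread: `reg` plays the role of a CPU register. */
typedef struct { int reg; } FakeThread;

static void load(FakeThread *t, const int *counter)  { t->reg = *counter; }
static void add1(FakeThread *t)                      { t->reg += 1; }
static void store(const FakeThread *t, int *counter) { *counter = t->reg; }

/* Interleaving in which both threads load before either one stores. */
int racy_interleaving(void)
{
    int counter = 0;
    FakeThread t1, t2;
    load(&t1, &counter);
    load(&t2, &counter);
    add1(&t1);
    add1(&t2);
    store(&t1, &counter);
    store(&t2, &counter);    /* overwrites t1's update */
    return counter;          /* 1: one increment is lost */
}

/* Interleaving in which t1 finishes before t2 starts. */
int serial_interleaving(void)
{
    int counter = 0;
    FakeThread t1, t2;
    load(&t1, &counter); add1(&t1); store(&t1, &counter);
    load(&t2, &counter); add1(&t2); store(&t2, &counter);
    return counter;          /* 2: both increments survive */
}
```

On real hardware the scheduler picks the interleaving, so the same program can return either value from run to run, which is what makes such bugs hard to reproduce.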

Page 17: Software Challenges to Multi- Core

How to detect parallel bugs

• Runtime
  – VTune: too slow
  – Deterministic replay: promising, but not enough
• Compile time
  – Too limited
• Annotation based
  – Too many annotations required
• We are investigating this problem with Google to find a hybrid infrastructure that solves it

Page 18: Software Challenges to Multi- Core

Efforts on Multi-core curriculum development

• Curriculum development on multi-core in Tsinghua Univ. (2006 ~)

• Tsinghua Multi-core workshop (2008.4.14 ~ 2008.4.20)

Page 19: Software Challenges to Multi- Core

Multi-core curriculum development
• 3 new courses opened since 2006.9
  – Advanced technologies for HPC
  – Fundamental of Parallel Computing
  – Advanced CPU logic design: Multi-threading and Multi-core
• The course "Advanced Computer Architecture" has been updated since 2006.9
• Seminar on related research topics by Prof. Frans Kaashoek of MIT (2008.1 ~ 2008.6)
  – Linux scalability, Tornado, Phoenix MapReduce, etc.

Page 20: Software Challenges to Multi- Core

Multi-core curriculum development
• Advanced technologies for HPC
  – Introduction to research topics on micro-architecture, parallel system architecture and distributed systems, based on our research experience
  – ~50% multi-core related content introduced
  – Interactive teaching with paper reading and discussion
  – Student voice: "Make us touch the new technologies (such as multi-core) in HPC nowadays and in the future"

Page 21: Software Challenges to Multi- Core

Multi-core curriculum development
• Fundamental of Parallel Computing
  – Parallel programming training for undergraduates from the Engineering Colleges of Tsinghua
  – ~40% multi-core related content introduced
  – Practice is the focus of the course
    • Elaborate labs and a course project on a multi-core platform
  – Student voice: "We can keep up with the advance of computer technology, which will benefit us in our future research and work largely"

Page 22: Software Challenges to Multi- Core

Multi-core curriculum development
• Advanced CPU logic design: Multi-threading and Multi-core
  – Short course by invited Prof. Li Yamin from Hosei University
  – Introduction to design methods for multithreading and multi-core processors through lectures and practice
  – Student voice: "understanding the advanced technologies on CPU design"

Page 23: Software Challenges to Multi- Core

Multi-core curriculum development

• Advanced Computer Architecture
  – Module 1: Introduction to multi-core technologies and software challenges for graduate students
  – Module 2: Cache coherency and memory consistency on multi-core
  – Students are interested in these topics, and quite a few of them chose multi-core related topics for their final projects

Page 24: Software Challenges to Multi- Core

Tsinghua Multi-core workshop
• 2008.4.14 ~ 2008.4.20 at Tsinghua Univ.
• Held by Intel & Tsinghua

Page 25: Software Challenges to Multi- Core

Tsinghua Multi-core workshop
• Participants
  – 25 invited professors from 22 universities in North and Northeast China

Page 26: Software Challenges to Multi- Core

TH Multi-core workshop Schedule

| Day | Morning | Afternoon | Evening |
| 1 | 1. Introduction to Intel University Program and this workshop; 2. Intel Architecture for Performance and Efficiency (1) | Intel Architecture for Performance and Efficiency (2) | N/A |
| 2 | Multi-threading programming: 1. Concept of thread; 2. Windows multi-threading programming; 3. Pthread programming | Introduction and lab on Intel Compiler | Lab |
| 3 | OpenMP programming | Introduction and lab on Intel VTune and Intel Thread Profiler | Lab |
| 4 | MPI programming (1) | MPI programming (2) | Lab |
| 5 | Advanced topic 1: Cache coherency | Lab on multi-threading programming; Introduction and lab on Intel Thread Checker | Lab |
| 6 | 1. Advanced MPI programming; 2. Performance analysis and tuning for parallel programs | 1. Introduction to student projects; 2. Discussion on course development | Advanced topic 2: Software Challenge for multi-core era |

Slide annotations:
• Modules for Computer Architecture
• Complete Modules for Parallel Programming: method and practice
• Some research topics might be interesting

Page 27: Software Challenges to Multi- Core

TH Multi-core workshop courseware

• Pictures during workshop
• Workshop courseware including slides, homework and student projects (in Chinese)
• Related courseware including those from Intel Software College, Berkeley, GIT, UTK and Tsinghua Univ.
• Over 800MB of valuable e-resources for course development

Page 28: Software Challenges to Multi- Core

Values of TH Multi-core Workshop
• It is helpful
  – All the professors were satisfied with the workshop
    • They learned more about multi-core technologies, which are useful not only for teaching but also for future research
    • We shared our experience on how to develop a multi-core course
    • It makes things easier for professors preparing multi-core courses such as Computer Architecture or Parallel Programming

Page 29: Software Challenges to Multi- Core

Values of TH Multi-core Workshop
• Built an academic community among universities
  – Close communication even after the workshop
    • Help universities improve their proposals
    • Q&A for donation applications, software installation and course preparation
  – Better cooperation between Intel and the universities, and among the universities

Page 30: Software Challenges to Multi- Core
Page 31: Software Challenges to Multi- Core
Page 32: Software Challenges to Multi- Core

Thanks!