9
C++11 and three compilers: performance considerations CERN openlab Summer Students Lightning Talks Sessions Supervised by Pawel Szostek Stephen Wang 19/08/2014

C++11 and three compilers: performance considerations

  • Upload
    dai

  • View
    32

  • Download
    0

Embed Size (px)

DESCRIPTION

C++11 and three compilers: performance considerations. CERN openlab Summer Students Lightning Talks Sessions Supervised by Pawel Szostek Stephen Wang. 19/08/2014. Motivation. “Surprisingly, C++11 feels like a new language” by Stroustrup . - PowerPoint PPT Presentation

Citation preview

Page 1: C++11 and  three  compilers:  performance  considerations

C++11 and three compilers:

performance considerationsCERN openlab Summer Students

Lightning Talks SessionsSupervised by Pawel Szostek

Stephen Wang

› 19/08/2014

Page 2: C++11 and  three  compilers:  performance  considerations

Stephen Wang 2

Motivation› “Surprisingly, C++11 feels like a new language” by Stroustrup.

› Improvements: Language usability, Mulithreading and other stuff.

› How about performance?

› Four questions: Time measurement methods, For-loop efficiency, std::async, STL algorithms in parallel mode

› GCC 4.9.0, ICC 15.0.0, Clang+LLVM 3.4› -O2, -O3, -Ofast

19/08/2014

Page 3: C++11 and  three  compilers:  performance  considerations

Stephen Wang 3

Contribution› Make four micro-benchmarks for each feature.› Code: https://github.com/wangyichao/CPP11_Benchmarks

› Automatize the process, make results repeatable› Python and bash script are used.› Compile -> Run -> Table (3 compiler * 3 options * containers * algorithms …)

› Use profiling tools to check performance such as vectorization and threads.

› Perf, Intel Vtune, Likwid, even PMU

› Answer basic questions.

19/08/2014

Page 4: C++11 and  three  compilers:  performance  considerations

Stephen Wang 4

Conclusions

auto start = std::chrono::steady_clock::now();auto end = std::chrono::steady_clock::now();int elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();overhead.push_back(elapsed);

19/08/2014

RDTSCPstd::chrono

Page 5: C++11 and  three  compilers:  performance  considerations

5

Conclusions› std::chrono (C++11) is a reliable time measurement.

19/08/2014 Stephen Wang

auto start = std::chrono::steady_clock::now();microseconds_sleep(index);auto end = std::chrono::steady_clock::now();int elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();

Page 6: C++11 and  three  compilers:  performance  considerations

6

Conclusions› Performance of for-loop varies with iteration method and

container.› Which containers are vectorized?

19/08/2014 Stephen Wang

array std::array std::list std::set std::vector

GCC

ICC

Clang

Table range-based for loop

Page 7: C++11 and  three  compilers:  performance  considerations

7

Conclusions› std::async spawns threads when called and the computation

is done immediately. ( stdlibc++ from GCC)

19/08/2014 Stephen Wang

double sum = 0;auto handle1 = std::async(std::launch::async, fsum, v.begin(), v.begin()+distance);Auto handle2 = std::async(…);……sum = handle1.get()+handle2.get()+handle3.get()+handle4.get();

Page 8: C++11 and  three  compilers:  performance  considerations

Stephen Wang 8

Conclusions› GNU libstdc++ parallel speeds up part of STL algorithms,

whose performance varies with containers.

19/08/2014

Page 9: C++11 and  three  compilers:  performance  considerations

9

› Thanks

19/08/2014 Stephen Wang