Copyright © 2002, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners www.intel.com/software/products Improve Application Performance on Windows*

Improve Application Performance on Windows*

What is the world’s biggest semiconductor company doing building software products?

Intel® Software Development Products

Intel® Compilers
– Best way to get application performance on Intel processors

Intel® VTune™ Analyzers
– Quickly identify “hot spots” and how to fix them

Intel® Performance Libraries
– Highly optimized, ready to use building-block functions

Intel® Threading Tools
– Speeds, simplifies development & maintenance of threaded apps

Intel® Cluster Tools
– Create, analyze, optimize and deploy cluster-based applications

Intel Software Development Products for Intel® Personal Internet Client Architecture processors, Pentium® M, Pentium® 4, Intel® Xeon™ and Itanium® 2 Processors

Intel® Software Development Products

Performance
– Enable developers to deliver higher performance software

Compatibility
– Compatible with the leading tools and development environments already used by many software developers
– Easy to incorporate into the development process

Support
– Premier Customer Support
– Technical training offered through Intel Software College

Intel Compilers

http://www.intel.com/software/products

Compilers for Intel PCA, Intel® 32-bit, EM64T & Itanium® 2 Processors

Intel compilers for the Intel PCA processor line support Intel® Wireless MMX™ technology

Intel 32-bit processor support: SSE3, Intel NetBurst® microarchitecture, Hyper-Threading

Itanium® 2 processor support: software pipelining, improved branch prediction, branch reduction through predication

Advanced optimization features of Intel compilers
– Profile Guided Optimization, Inter-Procedural Optimization
– Parallelism: auto-parallelization, vectorization, OpenMP* support
– Data prefetching
– Processor dispatch on IA-32 processors

Intel® Premier Support: compiler updates, support, expertise, customer interaction via compiler forums, architectural information, white papers and more

Intel Compilers
Optimize for Specific Processors

Instruction Scheduling
– Schedule instructions to be optimal for a specific processor
– How? On Windows*: /G1, /G2, /G5, /G7…

Build target for specific processor
– For the target processor, uses processor-specific opcodes & features like SSE, SSE2, vectorization
– Runs only on the target processor
– How? On Windows*: /QxK, /QxW, /QxB…

Automatic Processor Dispatch
– Runs on all x86 processors
– How? On Windows*: /QaxK, /QaxW, /QaxB…

Intel Compilers
High-Level Optimizations

High-Level Optimizer
– Performs loop-level optimizations, aids optimal memory access
– How? On Windows*: /O3

Inter-Procedural Optimization
– Enables inter-procedural optimizations for single/multiple files
– How? On Windows*: /Qip, /Qipo

Profile Guided Optimization
– Use execution-time feedback to guide optimization
– Aids paging, branch prediction, basic block reordering
– How? On Windows*: /Qprof_gen, /Qprof_use

Intel Compilers
Using Parallel Programming Directives

Auto-Parallelization
– Automatically converts loops to use multiple processors
– How? On Windows*: /Qparallel

OpenMP Support
– Intel compilers support multi-platform shared-memory parallel programming in C/C++ and FORTRAN on all platforms & OS
– How? On Windows*: /Qopenmp

OpenMP usage example:

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        dy[i] = dy[i] + da * dx[i];
    }
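The loop on this slide can be wrapped in a complete function. The following is a minimal sketch, not the presenter's code: the function name `daxpy` and its signature are assumptions made for illustration. Built without an OpenMP switch the pragma is ignored and the loop runs serially, with the same result.

```c
#include <stddef.h>

/* Hypothetical wrapper around the slide's loop: dy[i] = dy[i] + da * dx[i].
 * Every iteration writes a different element of dy, so the iterations are
 * independent and OpenMP may divide them among threads. */
void daxpy(int n, double da, const double *dx, double *dy)
{
    int i; /* a signed integer loop index, as OpenMP 2.0 requires */

    if (dx == NULL || dy == NULL) {
        return;
    }

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        dy[i] = dy[i] + da * dx[i];
    }
}
```

With the Intel compiler on Windows the slide's /Qopenmp switch enables the directive; the arithmetic is identical either way, only the division of work among threads changes.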

Intel® Code Coverage Tool

Example of code coverage summary for a project. The workload applied in this test exercised 34 of 143 blocks, representing 5 of 19 functions in 2 of 3 modules. In the file SAMPLE.C, 4 of 5 functions were exercised.

Clicking on SAMPLE.C produces a listing that highlights the code that was exercised. In this example, the pink-highlighted code was never exercised, the yellow was run but not exercised by any of the tests set up by the developer, and the beige was partially covered.

Intel® Test Prioritization Tool

Helps guide and speed software testing
– Helps produce better code more quickly
– Helps improve programmer productivity

Example:
– These 3 tests achieve 52.17% block and 50.00% function coverage
– Test 3 alone covers 45.65% of basic blocks, or 87.50% of total block coverage from all tests
– By adding Test 2, cumulative block coverage goes to 52.17%, or 100% of the total block coverage of Test 1, Test 2, and Test 3
– Eliminating Test 1 has no negative impact on block coverage and saves time

Number of Tests   %Rat Cvrg   %Blk Cvrg   %Func Cvrg   Test Names @ Options
1                 87.50       45.65       37.50        Test3.dpi
2                 100.00      52.17       50.00        Test2.dpi

Total Number of Tests = 3
Total Block Coverage ~ 52.17%
Total Function Coverage ~ 50.00%
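The %Rat Cvrg column appears to be the cumulative block coverage divided by the total block coverage of all tests. Assuming that reading, the figures in the table can be checked as follows:

```latex
\[
\frac{45.65\%}{52.17\%} \approx 87.50\% \quad \text{(Test 3 alone)},
\qquad
\frac{52.17\%}{52.17\%} = 100\% \quad \text{(Test 3 plus Test 2)}
\]
```

Because the ratio already reaches 100% after two tests, Test 1 adds no blocks that the other two do not cover, which is why it can be dropped.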

Intel® Compilers 8.1

C++ and Fortran
IA-32, Intel® Itanium® 2, EM64T & Intel® PCA processor-based systems
Intel® Code-Coverage & Intel® Test-Prioritization tools
Threaded application support (Hyper-Threading Technology)
– OpenMP* 2.0 standard support
– Auto-Parallel feature that automatically generates threaded code
Windows specific:
– Integrates into MS Visual Studio .NET* IDE
– Support for MSVC.NET* language features (no support for C# or managed code)
– Compaq Visual Fortran* language features with Intel code generation and optimization technology

Intel VTune Performance Analyzer

http://www.intel.com/software/products

Performance Tuning

Detecting common issues
– Where to add threads, what to optimize?
– Load imbalance?
– Wait, blocked, or idle time?
– Excessive overhead?
– Processor architecture issues?
– Application issues?

No particular order: address issues as needed

Intel® VTune™ Performance Analyzer

VTune analyzer’s intimate knowledge of the processor enables it to provide extensive insights into how software utilizes CPU resources

Allows you to identify and locate performance bottlenecks in your code
– Collects and displays software performance data
– Features that help you identify and address performance issues:
    Sampling that uses non-intrusive technologies
    Call Graph that displays graphically the program’s flow of control
    Analyzer that has detailed knowledge of the processor’s microarchitecture
    Intel Tuning Assistant that suggests optimization techniques for your Windows code

“The Intel VTune Performance Analyzer took a multi-day task and turned it into a sub-day task.”
– Randy Camp, V.P. Software Research and Development, MUSICMATCH, Inc.

Sampling – Identifying Performance Bottlenecks

“Sample” the CPU’s execution context

As the program runs, gather occasional CPU context snapshots triggered by the CPU’s performance monitoring registers
– Interrupt-based sampling using CPU registers
– Low intrusion: doesn’t change performance of the software
– No special builds required

Sample rate set to provide statistically meaningful data
– Based on CPU clock speed or can be auto-calibrated

Can measure performance-sensitive CPU events
– Cache misses, branch mispredictions, etc.

How to use Intel VTune Performance Analyzer

Build the application
– Build the application in Release mode with compiler optimizations

Find “Hotspots” using VTune
– A “Hotspot” in an application or a system is a section of code where there is a significant amount of activity.
– Finding “hotspots” helps you determine the compiler/code optimizations required to gain performance.

Symbols required for VTune Analyzer
– Required Intel compiler switch (on Windows*): /Zi

Start New Project using Sampling Wizard

Intel VTune Performance Analyzer

[Screenshot callouts: Select Application Type to Profile; Select Application to Launch]

Understanding VTune Interface

[Screenshot of system-wide performance data. Callouts: Choose Project/Activity/Run; Different Views; Most Instructions Retired; Statistics Summary; Events Measured; Sampling Analysis; Per CPU Analysis; Status Output]

Hotspot Drill Down

[Screenshot of LINPACK performance data. Callouts: Function Statistics; Symbols required for Hotspot Drill-down; Events Measured; Is this the Hotspot?]

More analysis needed. Use the VTune Call Graph feature to obtain flow info!

Source Level View

[Screenshot callouts: “Hotspot” source; Efficiency (CPI); View Assembly]

Using Sampling & Call Graph Together

Why?
– Use sampling to find which functions have hotspots.
– Use call graph to find out who is calling these functions.

What Are Users Saying

“SGI develops applications for its computers that employ many levels of parallelism, demanding the highest level of performance. The VTune Performance Analyzer for Windows provided invaluable insights to the correction of performance bottlenecks in these applications at the process, thread, and basic block levels.”
– Arthur Raefsky, Technical Lead, SGI, Mountain View, CA

Intel Threading Tools

http://www.intel.com/software/products

Threads Defined

OS creates a process for each program loaded
– Each process executes as a separate thread

Additional threads can be created within the process

All threads share code and data
– Each thread has its own Stack and Instruction Pointer (IP)

[Diagram: one Process holding shared Code and Data, with thread1(), thread2(), … threadN(), each having its own Stack and IP]

Threading Overview

Amdahl’s Law
Threading Overview

If only 1/2 of the code is parallel, 2X speedup is unlikely

T_{Parallel} = \{ (1 - P) + P/N + O \} \, T_{Total}

P = parallel portion of process
N = number of processors (cores)
O = parallel overhead

[Diagram: a time bar of length T_{Total} split into a serial part (1-P) and a parallel part P]
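The formula can be evaluated directly to see why a 2X speedup is unlikely. The following is a minimal sketch, assuming O is expressed as a fraction of T_Total; the function names `parallel_time` and `speedup` are hypothetical.

```c
/* T_Parallel = {(1 - P) + P/N + O} * T_Total
 * p: parallel portion of the process (0..1)
 * n: number of processors (cores), must be >= 1
 * o: parallel overhead, assumed here to be a fraction of t_total */
double parallel_time(double p, int n, double o, double t_total)
{
    return ((1.0 - p) + p / (double)n + o) * t_total;
}

/* Speedup relative to the serial run: T_Total / T_Parallel. */
double speedup(double p, int n, double o)
{
    return 1.0 / ((1.0 - p) + p / (double)n + o);
}
```

With P = 1/2, N = 2 and O = 0, `parallel_time` gives 0.75 of T_Total, a speedup of about 1.33X. Even with unlimited processors the serial half remains, so the speedup only approaches 2X, and any overhead O > 0 lowers it further.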

Correctness Bugs: Data Races
Threading Overview: Challenges Unique to Threading

Thread1: x = a + b
Thread2: b = 42

Suppose: a=1, b=2

What is the value of x if:
– Thread1 runs before Thread2? x = 3
– Thread2 runs before Thread1? x = 43

Data race: concurrent read, modify, write of same address

Outcome depends on thread execution order
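The two outcomes can be reproduced without real threads by replaying each execution order sequentially. This is a minimal sketch: the function names are hypothetical, and a genuinely racy program would not be deterministic like this.

```c
/* Replays the order "Thread1, then Thread2" in a single thread. */
int x_if_thread1_first(void)
{
    int a = 1, b = 2, x;

    x = a + b; /* Thread1 reads b while it is still 2 */
    b = 42;    /* Thread2 writes b afterwards */
    (void)b;   /* b is not read again in this replay */
    return x;
}

/* Replays the order "Thread2, then Thread1" in a single thread. */
int x_if_thread2_first(void)
{
    int a = 1, b = 2, x;

    b = 42;    /* Thread2 writes b first */
    x = a + b; /* Thread1 now reads b == 42 */
    return x;
}
```

The first function returns 3 and the second returns 43, matching the slide. With real threads the scheduler decides which order occurs.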

Solving Data Races: Synchronization

Thread1:
    Acquire(L)
    a = 1
    b = 2
    x = a + b
    Release(L)

Thread2:
    Acquire(L)
    b = 42
    Release(L)

Acquisition of mutex L ensures atomic access
– Only one thread can hold lock at a time

Example APIs:
– EnterCriticalSection(), LeaveCriticalSection()
– pthread_mutex_lock(), pthread_mutex_unlock()

Threading Overview: Challenges Unique to Threading
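The Acquire/Release pattern above maps onto the Pthreads calls named on the slide. The following is a minimal sketch, assuming a POSIX system; the helper `run_both` and the use of file-scope variables are illustrative choices, not part of the original deck.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t L = PTHREAD_MUTEX_INITIALIZER;
static int a, b, x;

static void *thread1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&L);   /* Acquire(L) */
    a = 1;
    b = 2;
    x = a + b;                /* no other thread can change b here */
    pthread_mutex_unlock(&L); /* Release(L) */
    return NULL;
}

static void *thread2(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&L);   /* Acquire(L) */
    b = 42;
    pthread_mutex_unlock(&L); /* Release(L) */
    return NULL;
}

/* Hypothetical driver: runs both threads and returns x, or -1 on error. */
int run_both(void)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, thread1, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, thread2, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return x;
}
```

Because Thread1 assigns a and b and computes x inside one critical section, x is 3 whichever thread gets the lock first. The final value of b still depends on the order, so the lock removes the race on x but not every order dependence.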

Performance Penalty: Synchronization

Thread blocked waiting for mutex
– Thread not running, so no parallelism

Mutex Release, Acquire takes time
– Release marks mutex free
– Acquire must check for free
    If free, mark as in use
    If not free, thread put to sleep
– Costs context switch out and in of processor

Threading Overview: Challenges Unique to Threading

Problem Statement

Developing threaded applications is hard

A new class of problems is caused by the interaction between concurrent threads
– Correctness problems (data races, deadlocks, etc.)
– Performance problems (contention, imbalance, etc.)

Threading Overview

Software Development Cycle

Introduce Threads
– Intel® Performance libraries: IPP and MKL
– OpenMP* (supports incremental threading)
– Explicit threading (Win32*, Pthreads*)

Debug for correctness
– Intel® Thread Checker
– Intel Debugger

Tune for performance
– Thread Profiler
– VTune™ Performance Analyzer

Scope of the Tools

Intel® Software Development Products

Intel® Thread Checker and Thread Profiler

VTune™ Performance Analyzer
– Prerequisite for Intel® Threading Tools
– VTune analyzer has thread support

Intel® Compilers support OpenMP* and the Threading tools
– More detailed results are generated with the Intel compilers

Intel Performance Libraries are thread safe
– Many functions are threaded

Common Threading Errors/Bugs

Race conditions
– Unprotected concurrent access to shared variables by multiple threads
– Most common error

Deadlocks
– Multiple threads waiting on resources that are held by other threads

Thread stalls
– Threads waiting on resources indefinitely

Intel® Thread Checker Intro

Identifies threading bugs in applications threaded with:
– Windows* threads on Windows* systems
– OpenMP* on Windows* systems

Plugs into VTune™ environment
– Windows* for IA-32 systems

Intel® Thread Checker

Intel® Thread Checker Analysis

Dynamic monitoring as software runs
– Data (workload)-driven execution

Includes monitoring of:
– Thread and Sync APIs used
– Thread execution order
    Scheduler impacts results
– Memory accesses between threads

Only executed code path is analyzed

Intel® Thread Checker

Thread Checker Usage

Dynamic correctness tool
– Dataset selection is important
    Must touch all code paths
– Multiple runs exercising different data paths yield best results
– Use small data set for each path
    Monitoring of all memory references is time consuming

Intel® Thread Checker

Starting Thread Checker

Start VTune™ Performance Analyzer

[Screenshot with numbered callouts 1 and 2]

Intel® Thread Checker

Diagnostics List
Intel® Thread Checker

Location in Source Code

Each entry in the diagnostics list links to its source code line(s)

Intel® Thread Checker

Common Performance Issues

Parallel Overhead
– Due to thread creation, scheduling, etc.

Synchronization
– Excessive use of global data, contention for the same synchronization object
– Implicit synchronization

Load balance
– Improper distribution of parallel work

Granularity
– Insufficient parallel work

Thread Profiler

Plugs in to the VTune™ performance environment
Identifies performance issues in OpenMP* or unstructured threaded applications using the Win32*
Pinpoints performance bottlenecks that directly affect execution time
Uses binary instrumentation technology

Intel® Threading Tools: Thread Profiler

Thread Profiler

Uses critical path analysis

Provides a breakdown of execution time along the critical path
– Provides insight into system utilization
    Under-subscribed vs. over-subscribed
– Thread state transitions
    Blocked->Running, call stack information

Allows comparison of multiple runs

Intel® Threading Tools: Thread Profiler

Execution Flows and Critical Path

Multiple execution flows in applications
Flow splits when a thread creates new threads or signals another thread to continue
Flow ends when a thread stalls or terminates

[Timeline diagram: Thread 1, Thread 2 and Thread 3 over time steps T0 to T15, annotated with "Acquire lock L", "Wait for Threads 2 & 3", "Wait for L", "Release L", "Wait for L", "Release L"]

Longest flow is the critical path

Intel® Threading Tools: Thread Profiler

Why use Critical Path?

Goal is to shorten the execution time
Shorten the critical path and you shorten the total execution time
Events recorded are events that impact the critical path
– Lock/Unlock
– Thread creation, suspension, resume, termination
– Blocking calls, external events

Intel® Threading Tools: Thread Profiler

Critical Path Analysis

System Utilization
– Idle, serial, parallel and oversubscribed
– This is relative to the system the application is running on
Time categories along critical path (CP)
– Cruise, overhead, blocking and impact time
Resulting view is a combination of utilization and execution time along CP


System Utilization

Examines processor utilization to determine parallel activity of the application
Concurrency is the number of threads that are active

Thread Profiler: Critical Path Analysis

[Figure: timeline of Threads 1, 2 and 3 from T0 to T15 with the lock L events (Acquire lock L, Wait for Threads 2 & 3, Wait for L, Release L). Each interval is shaded by utilization category: Idle, Serial, Under-subscribed, Parallel, Over-subscribed.]

Categorization shown for a system configuration with 2 processors
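One way to read the utilization categories is as a comparison between concurrency (the number of active threads) and the number of processors. The sketch below assumes the following thresholds, which the slides do not spell out: zero active threads is idle, one is serial, fewer than the processor count is under-subscribed, equal to it is parallel, and more than it is over-subscribed. The function name and the sample data are hypothetical.

```python
from collections import Counter


def utilization_category(active_threads, processors):
    """Classify one time interval by concurrency relative to processor count.

    The thresholds are an assumption made for illustration.
    """
    if active_threads == 0:
        return "idle"
    if active_threads == 1:
        return "serial"
    if active_threads < processors:
        return "under-subscribed"
    if active_threads == processors:
        return "parallel"
    return "over-subscribed"


# Hypothetical concurrency samples, one per time unit, on a 2-processor system.
samples = [1, 2, 2, 3, 1, 0]
summary = Counter(utilization_category(n, 2) for n in samples)
print(dict(summary))
# {'serial': 2, 'parallel': 2, 'over-subscribed': 1, 'idle': 1}
```

Under these assumed thresholds, the same trace on a 4-processor system would report the two-thread intervals as under-subscribed rather than parallel, which matches the point that the categorization is relative to the system the application runs on.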


Execution Time Categories

Analyze critical path by “colorizing” the time spent along it.
Associate spans of time with the objects that caused the critical path transitions

[Figure: timeline of Threads 1, 2 and 3 from T0 to T15 with the lock L events (Acquire lock L, Wait for Threads 2 & 3, Wait for L, Release L). Spans along the critical path are colored by category: Cruise time, Overhead, Blocking time, Impact time.]


Critical Path View

Start with the critical path
Break down by system utilization
Add overhead
Further categorize by behavior

[Figure: the thread timeline (Threads 1, 2 and 3, T0 to T15, lock L events) shown next to the Critical Path View, a single bar on a Time axis from 0 to 15. The bar is divided by utilization category (Idle, Serial, Under-subscribed, Parallel, Over-subscribed) and by time category (Cruise time, Overhead, Blocking time, Impact time).]

Categorization shown for a system configuration with 2 processors


Thread Profiler Views

Critical Path View
– Shows breakdown of the critical path
Profile View
– Shows the breakdown of selected critical paths
– User can select other views of the selected profile
– Concurrency level, threads, objects, ...
Timeline View
– Shows thread activity and critical path transitions for the entire application
Source View
– Transition source view, creation source view


Intel® Thread Checker

Locates threading bugs:
– Data races (storage conflicts)
– Deadlocks (potential and actual)
Isolates bugs to source code line
Describes possible causes of errors and suggests resolutions
Categorizes errors by severity level
Identifies threading bugs in applications threaded with:
– Windows* threads on Windows* systems
– OpenMP* on Windows* systems
Plugs into VTune™ environment
– Windows* for IA-32 systems
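As a reminder of what a data race (storage conflict) looks like, the sketch below has several threads update one shared counter, first without synchronization and then under a lock. It is a generic Python illustration with hypothetical names, not Thread Checker output and not code for Windows* threads or OpenMP*.

```python
import threading


def run(increments, workers, lock=None):
    """Have several threads increment one shared counter and return its value."""
    counter = {"value": 0}

    def work():
        for _ in range(increments):
            if lock is not None:
                with lock:                 # serialized read-modify-write
                    counter["value"] += 1
            else:
                counter["value"] += 1      # unsynchronized: updates may be lost

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


# With the lock the result is always increments * workers.
print(run(10_000, 4, threading.Lock()))  # 40000

# Without the lock the result depends on thread scheduling and may be lower.
print(run(10_000, 4))
```

The unsynchronized variant may still print the expected total on a given run, which is part of what makes data races hard to find by testing alone.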


Thread Profiler 2.1

Plugs in to the VTune™ performance environment
Identifies performance issues in OpenMP* or unstructured threaded applications using the Win32* API
Pinpoints performance bottlenecks that directly affect execution time
Uses binary instrumentation technology


Intel Software College

http://www.intel.com/software/college


Expert Training @ Intel® Software College

High-quality training by expert trainers worldwide
– Take advantage of the latest Intel processors, platforms, tools and technologies
Flexible training offerings
– On-line, On-site, or at Intel facility
Classroom-based or online, self-paced or custom course offerings

Visit the Intel Software College website: www.intel.com/software/college

"I attended the VTune and Compiler courses at the ISC … I am able to apply what I learned at the ISC to optimizing applications that matter to my company's business. The ISC courses were probably the best that I have had as a professional in terms of delivering on what they said they would teach."

—— Keith Fish - ISV Technical Consultant, Hewlett-Packard Company


“Registering for support was easy, and we value the security of knowing that Intel is there to help, even though we haven’t needed it so far.”

—— Rob Hoffmann - Director of Marketing, NewTek, Inc.

Intel Premier Support

Every purchase of an Intel software development product includes a year of support services
Provides access to Intel® Premier Support and all product updates during that time
Premier Support includes online access to Intel’s Premier Support Website
– Primary support for all Intel Software products
– Issue submission & tracking
– Product updates & related downloads
– FAQs & other proactive notices
– 128-bit encrypted communication protects confidentiality
– Dedicated expert staff review submissions and respond within 4 Intel business hours

https://premier.intel.com


Intel® Software Development Products

From Supercomputers to Cell Phones, Intel Software Development Products Enable Application Development Across Intel Processors

[Figure: Intel Software Development Products availability matrix. Product rows: Compilers (C++, Fortran); Performance Analyzers (VTune™ Performance Analyzer); Libraries (Math Kernel Library, Integrated Performance Primitives); Threading Tools (Thread Checker); Cluster Tools (Trace Analyzer / Collector); Debuggers (C++). Platform columns include MS Windows*, Win CE, Palm*, Symbian* and Nucleus*. Cells are marked Shipping, Future or NA.]


Next Steps

Evaluate the Products
– Download at: www.intel.com/software/products
Contact Vivek Venkatesh with questions
– 98456 79348
– [email protected]