Intel Compilers 9.x on the Intel® Core Duo™ Processor

Windows version

Intel Software College

Copyright © 2006, Intel Corporation. All rights reserved.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Objectives

At the successful completion of this module, you will be able to:

• Use key compiler optimization switches

• Optimize software for the Intel® Core™ Duo architecture

• Enhance performance with vectorization and other techniques

Agenda

Introduction

Compiler Switches

Dual Core

Vectorization

Key to optimizing: Intel® Core™ Duo

Exploiting Architectural Power requires Sophisticated Compilers

Optimal use of

• Registers & functional units

• Dual-Core/Multi-processor

• SSE instructions

• Cache architecture

C++ Compatibility with Microsoft

Source & binary compatible with VC 2003 under /Qvc71.

Source & binary compatible with VC 2005 under /Qvc8.

Microsoft* & Intel OpenMP binaries are not compatible.

• Use one compiler for all modules compiled with OpenMP

For more information, refer to the User’s Guide

Use Intel Compiler in Microsoft IDE (C++)

Agenda

Introduction

Compiler Switches

• Intel® C++ compiler

Dual Core

Vectorization

General Optimizations

Windows*   Linux*   Mac*    Description
/Od        -O0      -O0     Disables optimizations
/Zi        -g       -g      Creates symbols
/O1        -O1      -O1     Optimizes for binary size: server code
/O2        -O2      -O2     Optimizes for speed (default)
/O3        -O3      -O3     Optimizes for data cache: loopy floating-point code

Multi-pass Optimization: Interprocedural Optimizations (IPO)

ip: Enables interprocedural optimizations for single file compilation

ipo: Enables interprocedural optimizations across files

Can inline functions in separate files

Enhances optimization when used in combination with other compiler features

Windows*   Linux*   Mac*
/Qip       -ip      -ip
/Qipo      -ipo     -ipo

Multi-pass Optimization - IPO Usage: Two-Step Process

Pass 1: Compiling (produces virtual object files)

Windows*   icl -c /Qipo main.c func1.c func2.c
Linux*     icc -c -ipo main.c func1.c func2.c
Mac*       icc -c -ipo main.c func1.c func2.c

Pass 2: Linking (produces the executable)

Windows*   icl /Qipo main.obj func1.obj func2.obj
Linux*     icc -ipo main.o func1.o func2.o
Mac*       icc -ipo main.o func1.o func2.o

Profile Guided Optimizations (PGO)

Use execution-time feedback to guide many other compiler optimizations

Helps I-cache, paging, branch-prediction

Enabled optimizations:

• Basic block ordering

• Better register allocation

• Better decisions about which functions to inline

• Function ordering

• Switch-statement optimization

• Better vectorization decisions

Multi-pass Optimization - PGO: Three-Step Process

Step 1: Instrumented Compilation (produces the instrumented executable)

(Mac*/Linux*) icc -prof_gen[x] prog.c
(Windows*)    icl -Qprof_gen[x] prog.c

Step 2: Instrumented Execution (produces a DYN file containing dynamic info: .dyn)

Run program on a typical dataset

Step 3: Feedback Compilation (uses the merged DYN summary file: .dpi)

(Mac/Linux) icc -prof_use prog.c
(Windows)   icl -Qprof_use prog.c

Delete old .dyn files if you do not want their info included

Agenda

Introduction

Compiler Switches

Dual Core

• Auto Parallelization

• OpenMP

• Threading Diagnostics

Vectorization

Auto-parallelization

Auto-parallelization: Automatic threading of loops without having to manually insert OpenMP* directives.

• Compiler can identify “easy” candidates for parallelization, but large applications are difficult to analyze.

Windows*           Linux*            Mac*
/Qparallel         -parallel         -parallel
/Qpar_report[n]    -par_report[n]    -par_report[n]

OpenMP* Threading Technology

Pragma based approach to parallelism

Usage:

OpenMP switches: -openmp : /Qopenmp

OpenMP reports: -openmp-report : /Qopenmp-report

#pragma omp parallel for
for (i=0;i<MAX;i++)
    A[i]= c*A[i] + B[i];
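The fragment above can be placed in a complete function. The following is a minimal sketch, not the course's own code: the function name scale_add and the explicit length parameter n are illustrative assumptions. Without an OpenMP switch the pragma is ignored and the loop runs serially, producing the same result.

```c
/* Hypothetical wrapper around the slide's loop: a[i] = c*a[i] + b[i]. */
void scale_add(float *a, const float *b, float c, int n)
{
    int i;
    /* Each iteration reads and writes only a[i] and reads b[i], so the
       iterations are independent and can be divided among threads. */
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        a[i] = c * a[i] + b[i];
}
```

Compile with /Qopenmp (Windows*) or -openmp (Linux*, Mac*) to enable threading of the loop.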

OpenMP: Workqueuing Extension Example

Intel Compiler’s Workqueuing extension

• Create Queue of tasks…

Works on…

• Recursive functions

• Linked lists, etc.

#pragma intel omp parallel taskq shared(p)
{
    while (p != NULL) {
        #pragma intel omp task captureprivate(p)
        do_work1(p);
        p = p->next;
    }
}

Parallel Diagnostics

Source Instrumentation for Intel Thread Checker

• Allows Thread Checker to diagnose threading correctness bugs

• To use -tcheck or /Qtcheck you must have Intel Thread Checker installed

• See Thread Checker documentation

• http://www.intel.com/support/performancetools/sb/CS-009681.htm

Windows*    Linux*     Mac*
/Qtcheck    -tcheck    No support

Agenda

Introduction

Compiler Switches

Dual Core

Vectorization

• SSE & Vectorization

• Vectorization Reports

• Explanations of a few specific vectorization inhibitors

SIMD – SSE, SSE2, SSE3 Support

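[Figure: data elements that fit in one 128-bit SSE register]

Integer: 16x bytes, 8x words, 4x dwords, 2x qwords, 1x dqword

Floating point: 4x floats, 2x doubles

Instruction sets shown: MMX*, SSE, SSE2, SSE3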

* MMX actually used the x87 Floating Point Registers - SSE, SSE2, and SSE3 use the new SSE registers

SSE3 Instructions

SIMD FP using AOS format*: HADDPD, HSUBPD, HADDPS, HSUBPS

Thread Synchronization: MONITOR, MWAIT

Video encoding: LDDQU

Complex arithmetic: ADDSUBPD, ADDSUBPS, MOVDDUP, MOVSHDUP, MOVSLDUP

FP to integer conversions: FISTTP

* Also benefits Complex and Vectorization

Using SSE3 - Your Task: Convert This…

[Figure: 128-bit registers during scalar execution. Each addition uses a single element of the registers (first A[0] + B[0] = C[0], then A[1] + B[1] = C[1]); the remaining elements are not used.]

for (i=0;i<=MAX;i++)
    c[i]=a[i]+b[i];

… Into This …

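[Figure: 128-bit registers during vectorized execution. A[1] A[0] and B[1] B[0] are added together to give C[1] C[0], then A[3] A[2] and B[3] B[2] to give C[3] C[2]; two elements are processed per register.]

for (i=0;i<=MAX;i++)
    c[i]=a[i]+b[i];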
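The effect of this transformation can be pictured in plain C as a loop that handles two elements per iteration, followed by a remainder loop. This is only a scalar sketch under stated assumptions: the arrays are taken to hold doubles (two doubles fill one 128-bit register), and the function name add_two_wide and the length parameter n are illustrative. The compiler itself emits packed SSE instructions rather than this code.

```c
#include <stddef.h>

void add_two_wide(double *c, const double *a, const double *b, size_t n)
{
    size_t i = 0;

    /* Main loop: two independent additions per iteration, mirroring the
       two 64-bit elements of a 128-bit register. */
    for (; i + 1 < n; i += 2) {
        c[i]     = a[i]     + b[i];
        c[i + 1] = a[i + 1] + b[i + 1];
    }

    /* Remainder loop: handles the last element when n is odd. */
    for (; i < n; i++)
        c[i] = a[i] + b[i];
}
```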

Compiler Based Vectorization: Processor Specific

Use W

Description: Generate instructions and optimize for Intel® Pentium® 4 compatible processors including MMX, SSE and SSE2.

Windows*: /QxW
Linux*: -xW
Mac*: Does not apply

Use P

Description: Generate instructions and optimize for Intel® processors with SSE3 capability including Core Duo. These processors support SSE3 as well as MMX, SSE and SSE2.

Windows*: /QxP, /QaxP
Linux*: -xP, -axP
Mac*: Vectorization occurs by default

Compiler Based Vectorization Automatic Processor Dispatch – ax[?]

Single executable

• Optimized for Intel® Core Duo processors and generic code that runs on all IA32 processors.

For each target processor it uses:

• Processor-specific instructions

• Vectorization

Low overhead

• Some increase in code size

Why Loops Don’t Vectorize

Independence

• Loop Iterations generally must be independent

Some relevant qualifiers:

• Some dependent loops can be vectorized.

• Most function calls cannot be vectorized.

• Some conditional branches prevent vectorization.

• Loops must be countable.

• Outer loop of nest cannot be vectorized.

• Mixed data types cannot be vectorized.

Why Didn’t My Loop Vectorize?

Windows*         Linux*          Macintosh*
-Qvec_reportn    -vec_reportn    -vec_reportn

Set diagnostic level dumped to stdout

n=0: No diagnostic information

n=1: (Default) Loops successfully vectorized

n=2: Loops not vectorized – and the reason why not

n=3: Adds dependency Information

n=4: Reports only non-vectorized loops

n=5: Reports only non-vectorized loops and adds dependency info

Why Loops Don’t Vectorize

• “Existence of vector dependence”

• “Nonunit stride used”

• “Mixed Data Types”

• “Unsupported Loop Structure”

• “Contains unvectorizable statement at line XX”

• There are more reasons loops don’t vectorize, but we will discuss the reasons above

“Existence of Vector Dependency”

Usually indicates a real dependency between iterations of the loop, as shown here:

for (i = 0; i < 100; i++)
    x[i] = A * x[i + 1];
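Why a dependency between iterations matters can be seen by emulating vector execution by hand. The sketch below assumes a variant of the loop in which each iteration reads the value written by the previous one (x[i] = a * x[i - 1]); both function names are hypothetical. The second function loads two elements before storing either, as a 2-wide SIMD instruction would, and therefore produces a different result from the sequential loop.

```c
#include <stddef.h>

/* Sequential reference: iteration i reads the value iteration i-1 wrote. */
void scale_chain_scalar(double *x, size_t n, double a)
{
    for (size_t i = 1; i < n; i++)
        x[i] = a * x[i - 1];
}

/* Hand-written emulation of a naive 2-wide vector version: both inputs
   are loaded before either result is stored. */
void scale_chain_naive_simd(double *x, size_t n, double a)
{
    size_t i = 1;
    for (; i + 1 < n; i += 2) {
        double lane0 = x[i - 1];   /* value from the previous pair      */
        double lane1 = x[i];       /* stale: x[i] is not yet updated    */
        x[i]     = a * lane0;
        x[i + 1] = a * lane1;
    }
    for (; i < n; i++)             /* remainder iteration */
        x[i] = a * x[i - 1];
}
```

Starting from five elements all equal to 1 with a = 2, the sequential loop yields 1, 2, 4, 8, 16, while the emulated vector version yields 1, 2, 2, 4, 2.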

Defining Loop Independence

Iteration Y of a loop is independent of when (or whether) iteration X occurs.

int a[MAX], b[MAX];

for (j=0;j<MAX;j++) {
    a[j] = b[j];
}

“Nonunit stride used”

for (I=0;I<=MAX;I++)
    for (J=0;J<=MAX;J++) {
        c[I][J]+=1;                  // Unit Stride
        c[J][I]+=1;                  // Non-Unit
        A[J*J]+=1;                   // Non-Unit
        A[B[J]]+=1;                  // Non-Unit
        if (A[MAX-J]==1) last1=J;    // Non-Unit
    }

End result: loading a vector from non-contiguous memory may take more cycles than executing the operations sequentially.
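One common remedy, where the algorithm allows it, is to arrange the loop nest so that the inner loop index runs over the last array subscript. The following is a minimal sketch with hypothetical array dimensions ROWS and COLS and hypothetical function names; both functions compute the same sum, but only the first accesses memory with unit stride in its inner loop.

```c
#include <stddef.h>

#define ROWS 4
#define COLS 8

/* Inner loop walks along a row: consecutive iterations touch adjacent
   elements in memory (unit stride). */
long sum_unit_stride(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            total += m[i][j];
    return total;
}

/* Inner loop walks down a column: consecutive iterations are COLS
   elements apart in memory (non-unit stride). */
long sum_non_unit_stride(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            total += m[i][j];
    return total;
}
```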

“Mixed Data Types”

An example:

int howmany_close(double *x, double *y)
{
    int withinborder = 0;   // int counter
    double dist;            // double distance

    for (int i = 0; i < MAX; i++) {
        dist = sqrtf(x[i]*x[i] + y[i]*y[i]);   // float sqrtf() applied to double data
        if (dist < 5) withinborder++;
    }
    return withinborder;
}

Mixed data types are possible – but complicate things

• e.g.: 2 doubles vs 4 ints per SIMD register

Some operations with specific data types won’t work
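One way to reduce the type mixing in an example like this, assuming only the result of the comparison is needed, is to compare squared distances so that all floating-point work stays in double and no float square root is involved. The sketch below is illustrative: the function name count_close and the parameters n and radius are assumptions, not part of the course code. Whether the compiler then vectorizes the loop still has to be confirmed with the vectorization report.

```c
#include <stddef.h>

/* Counts points (x[i], y[i]) whose distance from the origin is below
   radius, using only double arithmetic inside the loop. */
int count_close(const double *x, const double *y, size_t n, double radius)
{
    const double limit = radius * radius;   /* compare squared distances */
    int within = 0;

    for (size_t i = 0; i < n; i++) {
        double dist2 = x[i] * x[i] + y[i] * y[i];
        if (dist2 < limit)
            within++;
    }
    return within;
}
```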

“Unsupported Loop Structure”

Example:

struct _xx {
    int data;
    int bound;
};

void doit1(int *a, struct _xx *x)
{
    for (int i=0; i<x->bound; i++)   // bound is re-read through x on every iteration
        a[i] = 0;
}

An unsupported loop structure means the loop is not countable, or the compiler for whatever reason can’t construct a run-time expression for the trip count.
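A common workaround for a loop like this is to copy the bound into a local variable before the loop, so that the trip count no longer depends on memory the loop might write through a. The following is a sketch under the assumption that a never aliases the structure; struct bounded and zero_prefix are hypothetical stand-ins for the names in the example.

```c
/* Hypothetical stand-in for the struct in the example above. */
struct bounded {
    int data;
    int bound;
};

void zero_prefix(int *a, const struct bounded *x)
{
    /* Read the trip count once. Stores to a[i] can no longer change it,
       so the loop is countable. Assumes a does not point into *x. */
    const int n = x->bound;

    for (int i = 0; i < n; i++)
        a[i] = 0;
}
```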

“Contains unvectorizable statement”

for (i=1;i<nx;i++) {
    B[i] = func(A[i]);
}

[Figure: 128-bit registers holding A[3] A[2] A[1] A[0] and B[3] B[2] B[1] B[0]; func is applied to each element.]
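When the called function is small and its body is visible to the compiler, inlining can remove the call from the loop and leave plain arithmetic that is a candidate for vectorization. This is a minimal sketch: scale_and_shift is a hypothetical stand-in for func, and the function name apply and the length parameter n are illustrative.

```c
#include <stddef.h>

/* Hypothetical element-wise function. Declaring it static inline in the
   same file lets the compiler substitute its body into the loop. */
static inline float scale_and_shift(float v)
{
    return 2.0f * v + 1.0f;
}

void apply(float *b, const float *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        b[i] = scale_and_shift(a[i]);   /* after inlining: 2*a[i] + 1 */
}
```

For functions defined in other source files, interprocedural optimization (/Qipo) serves the same purpose.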

Reference

Web-based and classroom training

• www.intel.com/software/college

White papers and technical notes

• www.intel.com/ids

• www.intel.com/software/products

Product support resources

• www.intel.com/software/products/support

Activity 1 - raytrace2: Initial Compilation

Set up environment and compile with both Microsoft* Visual C++ .NET (MSVC*) and Intel® C++ Compiler (icl)

Activity 2 - raytrace2: O3 Compilation

Use Intel compiler’s High Level Optimizer (-O3) for loop centric codes

Activity 3 - raytrace2: IPO Compilation

Use Intel compiler’s Inter-procedural Optimization (-Qipo)

Activity 4 - raytrace2: PGO Compilation

Use Intel compiler’s Profile-guided Optimization

Activity 5 – raytrace2: Vectorization

Use Intel compiler’s Vectorization optimization (-QxP)

Activity 6 - raytrace2: Putting it all together

Use all previous optimizations in tandem (-O3, -QxP, IPO and PGO)