SPEC2006 Static Register Usage … · • Literature suggests GCC/LLVM allocators fail on 470.lbm w/16 FP registers, but better allocations exist • Pre-Allocation Instruction Scheduling

Source: Andrew Waterman’s Thesis

SPEC2006 Static Register Usage

[Figure: RV64GC vs RV64RC. Y-axis: normalized to RV64GC (0.90 to 1.10). Metrics: Static Code Size, Dynamic Instructions, Dynamic Instruction Bytes, In-Order Execution Time, Out-of-Order Execution Time. Benchmarks: dhrystone, coremark, whetstone.]

• Static code size reliably increases due to register spills/loads
• Smaller increase in dynamic instructions; whetstone is a special case (call overhead)
• Compressed instructions are generally used more frequently
• In-order execution time is impacted; out-of-order execution time is not hurt

RVR static code
• 5.7% larger on integer benchmarks
• 6.2% larger on floating-point benchmarks

[Figure: static code size normalized to the non-reduced variant (y-axis 1.01 to 1.09). Series: rv32rc, rv64rc.]
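The percentages quoted for static code size and the "normalized" axes used in these plots are two views of the same ratio. A minimal Python sketch of that arithmetic follows. The function names and the byte counts are hypothetical, chosen only so that the ratio matches the 5.7% figure quoted for the integer benchmarks; they are not measurements from the slides.

```python
def normalized(value: float, baseline: float) -> float:
    """Return value / baseline, the ratio plotted on a 'normalized to baseline' axis."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline


def percent_change(ratio: float) -> float:
    """Convert a normalized ratio into a percent change relative to the baseline."""
    return (ratio - 1.0) * 100.0


# Hypothetical byte counts (assumption for illustration, not slide data).
baseline_bytes = 100_000   # code size of the non-reduced variant
reduced_bytes = 105_700    # code size of the reduced-register variant

ratio = normalized(reduced_bytes, baseline_bytes)
print(f"ratio = {ratio:.3f}, change = {percent_change(ratio):+.1f}%")
```

Read this way, a bar at 1.06 on a normalized axis means a 6% increase over the baseline, and a bar below 1.00 means a reduction.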

[Figure: SPEC2006 integer benchmarks (400.perlbench, 401.bzip2, 403.gcc, 429.mcf, 445.gobmk, 456.hmmer, 462.libquantum, 471.omnetpp, average). Y-axis: normalized to RV64GC (0 to 1.2). Series: Dynamic Instructions, Dynamic Instruction Bytes, In-Order Execution Time, Out-of-Order Execution Time. Data labels on the plot: 1.005, 0.987, 0.984, 0.986.]

400.perlbench and 471.omnetpp
• Highest call-frequency benchmarks: a call every 35 and 23 instructions, respectively
• perlbench: 2.8% fewer dynamic function calls due to GCC's inlining heuristics (call overhead)

429.mcf
• Inlining heuristics greatly impacted
• 28.7% fewer dynamic function calls with similar performance
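To compare the call frequencies quoted for 400.perlbench and 471.omnetpp, the "one call every N instructions" figures can be converted into calls per 1000 dynamic instructions. The Python sketch below does only that conversion. The function name is hypothetical, and the inputs are the two intervals quoted above.

```python
def calls_per_thousand_instructions(instructions_per_call: float) -> float:
    """Convert 'one call every N instructions' into calls per 1000 instructions."""
    if instructions_per_call <= 0:
        raise ValueError("instructions_per_call must be positive")
    return 1000.0 / instructions_per_call


# Intervals quoted on the slide: a call every 35 and every 23 instructions.
for name, interval in (("400.perlbench", 35), ("471.omnetpp", 23)):
    rate = calls_per_thousand_instructions(interval)
    print(f"{name}: {rate:.1f} calls per 1000 instructions")
```

The more frequent the calls, the larger the share of dynamic instructions spent on call overhead such as saving and restoring registers, which is why these two benchmarks are singled out.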

[Figure: SPEC2006 floating-point benchmarks (410.bwaves, 433.milc, 436.cactusADM, 437.leslie3d, 444.namd, 453.povray, 470.lbm, average). Y-axis: normalized to RV64GC (0.000 to 1.400). Series: Dynamic Instructions, Dynamic Instruction Bytes, In-Order Execution Time, Out-of-Order Execution Time. Data labels on the plot: 1.123, 1.060, 1.052, 1.045.]

470.lbm
• Literature suggests GCC/LLVM allocators fail on 470.lbm with 16 FP registers, but better allocations exist
• "Pre-Allocation Instruction Scheduling with Register Pressure Minimization Using a Combinatorial Optimization Approach", G. Shobaki et al.
• x86-64 + LLVM: reduces register spills from 12 to 2, leading to a 21% improvement
• Benchmark performance is bottlenecked by the memory system (data cache miss rates)

Source: Krste Asanović, CS152 UC Berkeley
