Transcript
Page 1: Operation Reuse on Handheld Devices

Operation Reuse on Handheld Devices

Yonghua Ding and Zhiyuan Li

For LCPC 2003

Page 2: Operation Reuse on Handheld Devices

Outline

Introduction
Computation reuse
Branch reuse by IF-merging
Conclusions

Page 3: Operation Reuse on Handheld Devices

Introduction

Handheld devices have
  Limited processing power
  Limited energy resources

Operation reuse
  Computation reuse
  Branch reuse

Hardware solutions
Software solutions

Page 4: Operation Reuse on Handheld Devices

Computation Reuse

Can be viewed as an extension of CSE (common subexpression elimination)
Redundancy among different instances of a code segment
Code segment with repetitive inputs
A hash table records the input values and the computed output values
Replace the computation with a table look-up if the input is in the table

Page 5: Operation Reuse on Handheld Devices

An Example Code

int quan(int val)
{
    int i;
    for (i = 0; i < 15; i++) {
        if (val < power2[i])
            break;
    }
    return (i);
}

Page 6: Operation Reuse on Handheld Devices

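Transformation Code

int quan(int val)
{
    int i, key;
    if (check_hash(val, hash_tab, &key) == 0) {
        /* miss: compute the result and record it in the table */
        for (i = 0; i < 15; i++) {
            if (val < power2[i])
                break;
        }
        hash_tab[key].output = i;
    } else {
        /* hit: reuse the recorded output */
        i = hash_tab[key].output;
    }
    return (i);
}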
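The slide does not show check_hash, hash_tab, or power2. The following is a minimal sketch of how they might look, assuming a direct-mapped table indexed by a modulo hash, a table size of 256, and power2 holding the first 15 powers of two. All three are assumptions made for illustration, not the authors' implementation; quan_original and quan_reuse are hypothetical names for the two versions of quan.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_SIZE 256                 /* assumed table size */

struct hash_entry {
    int valid;                        /* nonzero once the slot records an input */
    int input;                        /* recorded input value */
    int output;                       /* output computed for that input */
};

static struct hash_entry hash_tab[HASH_SIZE];

/* Assumed contents: the first 15 powers of two. */
static const int power2[15] = {
    1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384
};

/* Hypothetical helper: returns 1 on a hit, 0 on a miss.
 * On a miss the slot is claimed for val; the caller must store the output. */
static int check_hash(int val, struct hash_entry *tab, int *key)
{
    assert(tab != NULL && key != NULL);
    *key = (int)((unsigned)val % HASH_SIZE);
    if (tab[*key].valid && tab[*key].input == val)
        return 1;
    tab[*key].valid = 1;
    tab[*key].input = val;
    return 0;
}

/* The original segment, kept for comparison. */
int quan_original(int val)
{
    int i;
    for (i = 0; i < 15; i++)
        if (val < power2[i])
            break;
    return i;
}

/* The transformed segment: look up first, compute only on a miss. */
int quan_reuse(int val)
{
    int i, key;
    if (check_hash(val, hash_tab, &key) == 0) {
        i = quan_original(val);
        hash_tab[key].output = i;
    } else {
        i = hash_tab[key].output;
    }
    return i;
}
```

In this sketch a colliding input simply overwrites the slot, so the table never returns a stale output for a different input; other replacement policies are possible.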

Page 7: Operation Reuse on Handheld Devices

Framework of the Scheme

Identify candidate code segments

Data flow analysis to determine input/output

Estimate hashing overhead

Granularity analysis

Choose code segments for value profiling

Determine code segments to transform

Page 8: Operation Reuse on Handheld Devices

Important factors

Computation granularity (C)
Hashing overhead (O)
  Hashing function complexity
  The size of input/output
Reuse rate (R)
  R = 1 - Nds/N
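The slide does not define Nds or N. The sketch below assumes N is the number of times the code segment executes and Nds is the number of distinct input values seen across those executions, so that R is the fraction of executions whose input has already been seen. The function name reuse_rate and the single-integer input are hypothetical simplifications.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison for qsort. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Computes R = 1 - Nds/N for a profiled trace of n input values.
 * Assumption: Nds is the number of distinct values in the trace.
 * Note: the trace is sorted in place. */
double reuse_rate(int *trace, size_t n)
{
    size_t distinct = 1;
    assert(trace != NULL && n > 0);
    qsort(trace, n, sizeof trace[0], cmp_int);
    for (size_t i = 1; i < n; i++)
        if (trace[i] != trace[i - 1])
            distinct++;
    return 1.0 - (double)distinct / (double)n;
}
```

For a trace of 8 inputs containing 3 distinct values, this gives R = 1 - 3/8 = 0.625.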

Page 9: Operation Reuse on Handheld Devices

Cost-Benefit Analysis

Cost of computation reuse
  (C+O)(1-R) + O·R

The gain of computation reuse
  C - [(C+O)(1-R) + O·R] = C - (C + O - R·C) = R·C - O

Criteria to choose code segments
  R·C - O > 0, or equivalently R > O/C
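The cost-benefit criterion can be written out as a small sketch. The function names are hypothetical, and C and O are assumed to be expressed in the same unit (for example cycles per execution of the segment).

```c
#include <assert.h>

/* Expected cost per execution with reuse:
 * a miss pays computation plus hashing, a hit pays hashing only. */
double reuse_cost(double C, double O, double R)
{
    assert(R >= 0.0 && R <= 1.0);
    return (C + O) * (1.0 - R) + O * R;
}

/* Expected gain per execution; algebraically equal to R*C - O. */
double reuse_gain(double C, double O, double R)
{
    return C - reuse_cost(C, O, R);
}

/* Selection criterion from the slide: transform only if R*C - O > 0. */
int worth_transforming(double C, double O, double R)
{
    return R * C - O > 0.0;
}
```

With C = 100, O = 20, and R = 0.5, the cost is 70 and the gain is 30, so the segment qualifies; with R = 0.1 the gain is negative (R is below O/C = 0.2) and the segment would be left unchanged.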

Page 10: Operation Reuse on Handheld Devices

Experimentation Setup

Compaq iPAQ 3650 PDA
  206 MHz StrongARM SA1110 processor
  32 MB RAM
  16 KB I-cache and 8 KB D-cache
Digital multimeter HP 3458A
6 MediaBench programs and a GNU GO game

Page 11: Operation Reuse on Handheld Devices

Performance Improvement

Programs        Original (s)   Reuse (s)   Speedup
G721_encode             2.01        1.53      1.31
G721_decode             3.69        2.76      1.34
MPEG2_encode          120.63      113.30      1.06
MPEG2_decode           83.02       46.06      1.80
RASTA                  14.92       12.66      1.18
UNEPIC                  1.73        0.76      2.28
GNU GO                788.05      654.51      1.20
Harmonic Mean                                 1.37
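The harmonic mean reported in the last row can be reproduced from the per-program speedups. The helper below is a sketch for checking the arithmetic; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Harmonic mean of n positive values: n divided by the sum of reciprocals. */
double harmonic_mean(const double *x, size_t n)
{
    double sum = 0.0;
    assert(x != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        assert(x[i] > 0.0);
        sum += 1.0 / x[i];
    }
    return (double)n / sum;
}
```

Applied to the seven speedups in the table (1.31, 1.34, 1.06, 1.80, 1.18, 2.28, 1.20), the result rounds to the 1.37 shown.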

Page 12: Operation Reuse on Handheld Devices

Energy Saving

Programs        Original (J)   Reuse (J)   Saving
G721_encode             4.59        3.56    22.4%
G721_decode             8.43        6.47    23.3%
MPEG2_encode          281.67      265.12     5.9%
MPEG2_decode          193.85      108.01    44.3%
RASTA                  36.60       31.02    15.2%
UNEPIC                  4.03        1.81    55.1%
GNU GO               1936.23     1613.69    16.7%

Page 13: Operation Reuse on Handheld Devices

Performance Improvement for Different Input Files

Programs        Sources of Inputs       Speedups
G721_encode     MiBench                     1.35
G721_decode     MiBench                     1.36
MPEG2_encode    Tektronix                   1.19
MPEG2_decode    Tektronix                   1.48
RASTA           Rasta_testsuite_1998        1.18
UNEPIC          EPIC web-site               4.25
GNU GO          “-b 9 –r 2”                 1.20
Harmonic Mean                               1.43

Page 14: Operation Reuse on Handheld Devices

Related Work

Richardson’s result cache
Sodani and Sohi’s instruction reuse
Huang and Lilja’s basic block level reuse
Connors and Hwu’s code region level reuse

Page 15: Operation Reuse on Handheld Devices

Branch Reuse by IF-Merging

Motivation
  Branch instructions degrade the efficiency of deep pipelining
  Branches reduce the size of basic blocks
  Branches introduce control dependences
Source-level code transformation

Page 16: Operation Reuse on Handheld Devices

An Example Code

if (sign) {
    diff = -diff;
}
……
if (sign)
    valpred -= vpdiff;
else
    valpred += vpdiff;

Page 17: Operation Reuse on Handheld Devices

Transformation by IF-merging

if (sign) {
    diff = -diff;
    ……
    valpred -= vpdiff;
} else {
    ……
    valpred += vpdiff;
}
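The elided statements between the two IFs are not shown on the slides. To make the transformation concrete, the sketch below substitutes a single hypothetical intermediate statement (vpdiff = diff / 2) and wraps both versions in hypothetical functions. The merge is valid here because the intermediate statement does not modify sign.

```c
/* Before IF-merging: the same condition is tested twice. */
int update_separate(int sign, int diff, int valpred)
{
    int vpdiff;
    if (sign)
        diff = -diff;
    vpdiff = diff / 2;      /* hypothetical stand-in for the elided statements */
    if (sign)
        valpred -= vpdiff;
    else
        valpred += vpdiff;
    return valpred;
}

/* After IF-merging: one test; the intermediate statement is duplicated
 * into both branches so each path executes the same work as before. */
int update_merged(int sign, int diff, int valpred)
{
    int vpdiff;
    if (sign) {
        diff = -diff;
        vpdiff = diff / 2;
        valpred -= vpdiff;
    } else {
        vpdiff = diff / 2;
        valpred += vpdiff;
    }
    return valpred;
}
```

The merged version executes one conditional branch per call instead of two, at the price of duplicating the intermediate code.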

Page 18: Operation Reuse on Handheld Devices

Three Schemes of IF-Merging

A basic IF-merging scheme
  Merge IF statements with identical conditions
An IF-condition factoring scheme
  Factor and merge common sub-predicates
A path profiling scheme
  IF-merging with path profiling information

Page 19: Operation Reuse on Handheld Devices

A Basic IF-Merging Scheme

Symbolic analysis to identify IF statements with identical IF conditions

Data dependence analysis to determine intermediate statements

Page 20: Operation Reuse on Handheld Devices

A Factoring Scheme

Non-identical conditions have common sub-predicates (a&&b, a&&c)

Factor the common sub-predicates to construct a common IF statement

The new IF statement encloses the original IF statements with the remaining sub-predicates as conditions
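The factoring of a&&b and a&&c can be sketched as follows. The function names and the bodies of the IF statements are hypothetical; the sketch assumes that evaluating a has no side effects and that the body of the first IF does not change a or c.

```c
/* Before factoring: the sub-predicate a is tested in both conditions. */
int flags_separate(int a, int b, int c)
{
    int flags = 0;
    if (a && b)
        flags |= 1;
    if (a && c)
        flags |= 2;
    return flags;
}

/* After factoring: a common IF on a encloses the original IFs,
 * which keep only their remaining sub-predicates b and c. */
int flags_factored(int a, int b, int c)
{
    int flags = 0;
    if (a) {
        if (b)
            flags |= 1;
        if (c)
            flags |= 2;
    }
    return flags;
}
```

When a is false the factored version executes a single test and skips both inner IFs.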

Page 21: Operation Reuse on Handheld Devices

A Path Profiling Scheme

Merge IF statements that have a high rate of all being taken

Exchange nested IF statements whose conditions are dependent

Page 22: Operation Reuse on Handheld Devices

Experimental Results

Programs          Speedups   Energy Saving
ADPCM_coder          1.104            9.3%
ADPCM_decoder        1.076            8.0%
G721_encode          1.069            5.8%
G721_decode          1.066            6.1%
GSM_toast            1.067            6.0%
GSM_untoast          1.085            8.2%
PEGWIT_encrypt       1.029            2.5%
PEGWIT_decrypt       1.017            1.5%
Average              1.063            5.9%

Page 23: Operation Reuse on Handheld Devices

Related Work

Kreahling et al.’s profile-based condition merging
Branch prediction
Predicated execution
Muller and Whalley’s avoiding branches by code replication
Yang et al.’s branch reordering

Page 24: Operation Reuse on Handheld Devices

Conclusions

Operation reuse techniques are desirable for both program speed and energy saving on handheld devices
  Computation reuse
  Branch reuse by IF-merging