TRACES
Code Padding to Improve the WCET Calculability
Christine Rochange and Pascal Sainrat
Institut de Recherche en Informatique de Toulouse, Toulouse


WCET evaluation

Static WCET analysis

IPET: Implicit Path Enumeration Technique

flow analysis, low-level analysis → WCET computation

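Implicit Path Enumeration Technique

[Figure: control-flow graph with blocks A, B, C, D, E and edges A→B, B→C, B→E, C→D, E→D, D→A]

$x_A = 1 + x_{DA} = 1 + x_{AB}$
$x_B = x_{AB} = x_{BC} + x_{BE}$
$x_C = x_{BC} = x_{CD}$
$x_D = x_{CD} + x_{ED} = x_{DA}$
$x_E = x_{BE} = x_{ED}$

$x_{BC} = x_{BE}$
$x_{DA} \le N$

$T = \sum_i x_i \cdot t_i^{max} + \sum_{i,j} x_{ij} \cdot \delta_{ij}$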
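The flow constraints above can be explored with a small Python sketch. It enumerates the execution counts allowed by the structural constraints and the loop bound N, and keeps the assignment that maximizes the block-time part of T. The block times in `t_max` are hypothetical values chosen for illustration. The edge terms of T and the flow fact on x_BC and x_BE are left out to keep the example short, and brute-force enumeration stands in for the ILP solver a real IPET tool would use.

```python
from itertools import product


def ipet_wcet(t_max, loop_bound):
    """Maximize sum(x_i * t_max[i]) over the five-block loop A, B, C, D, E.

    t_max      -- hypothetical worst-case time of each block, in cycles
    loop_bound -- N, the bound on the back edge D -> A (x_DA <= N)
    """
    best_total, best_counts = -1, None
    for x_bc, x_be in product(range(loop_bound + 1), repeat=2):
        x_ab = x_bc + x_be            # x_B = x_AB = x_BC + x_BE
        if x_ab > loop_bound:         # x_DA = x_AB and x_DA <= N
            continue
        counts = {
            "A": 1 + x_ab,            # x_A = 1 + x_DA
            "B": x_ab,
            "C": x_bc,                # x_C = x_BC
            "D": x_ab,                # x_D = x_CD + x_ED
            "E": x_be,                # x_E = x_BE
        }
        total = sum(counts[b] * t_max[b] for b in counts)
        if total > best_total:
            best_total, best_counts = total, counts
    return best_total, best_counts


# Hypothetical block times, loop bound N = 3.
wcet, counts = ipet_wcet({"A": 2, "B": 3, "C": 10, "D": 4, "E": 5}, 3)
print(wcet, counts)
```

With these assumed times the longest path takes the B→C branch on every iteration, since C is slower than E.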

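Pipelined execution

[Figure: pipeline with a fetch stage (F), two functional units (FU1, FU2) and a completion stage (C). Reservation tables (rows FETCH, FU1, FU2, COMPL.) for two blocks executed separately (cycles 1 to 5 each) and for the sequence B1,B2 (cycles 1 to 6). Labels: 5, 5, and -4 for B1,B2.]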

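Long Timing Effects (1)

[Figure: same pipeline (F, FU1, FU2, C). Reservation tables (rows FETCH, FU1, FU2, COMPL.) for three blocks executed separately (cycles 1 to 5 each), for the two-block sequences (cycles 1 to 6 each) and for the sequence A-B-C (cycles 1 to 8). Labels: block times 5, 5, 5; pairwise effects -4, -4; T_A-B-C = 7 from these values, but the observed time is 8, a long timing effect of +1.]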

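Long Timing Effects (2)

$t_{1 \ldots n} = \sum_{i=1}^{n} t_i + \sum_{1 \le j < k \le n} \delta_{j \ldots k}$

[Figure: for a sequence A-B-C-D-E, the block times t_A, t_B, t_C, t_D, t_E and the timing effects δ_AB, δ_BC, δ_CD, δ_DE; δ_ABC, δ_BCD, δ_CDE; δ_ABCD, δ_BCDE; δ_ABCDE, which add up to the sequence times t_AB, t_ABC, t_ABCD (after J. Engblom)]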
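A short Python sketch may help read this decomposition. It recovers each timing effect from measured sequence times by subtracting the block times and the effects of all shorter contiguous subsequences. The function name and the dictionary layout are illustrative assumptions. The sample values are the ones that appear on the previous slide (three 5-cycle blocks, two 6-cycle pairs, 8 cycles for A-B-C).

```python
def timing_effects(times):
    """Derive the timing effect of every multi-block sequence.

    times maps a tuple of block names to the measured execution time of
    that sequence; every contiguous subsequence must be present.
    """
    delta = {}
    for seq in sorted(times, key=len):        # shorter sequences first
        if len(seq) < 2:
            continue                          # single blocks have no effect term
        base = sum(times[(b,)] for b in seq)  # sum of the block times t_i
        inner = sum(
            delta[seq[j:k]]
            for j in range(len(seq))
            for k in range(j + 2, len(seq) + 1)
            if seq[j:k] != seq                # only strictly shorter subsequences
        )
        delta[seq] = times[seq] - base - inner
    return delta


times = {
    ("A",): 5, ("B",): 5, ("C",): 5,
    ("A", "B"): 6, ("B", "C"): 6,
    ("A", "B", "C"): 8,
}
print(timing_effects(times))
```

The pairwise effects come out as -4 each and the three-block effect as +1, which is the long timing effect that a pairwise-only model would miss.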

Motivation

Long timing effects are:
    difficult to quantify (they might span over very long sequences)
    difficult to integrate into WCET computation

Long timing effects increase the variability of execution times

Our goal: eliminate long timing effects

Outline

Our approach: code padding

Implementation
    software framework
    analysis algorithms
        to identify resource requirements
        to compute safe padding lengths

Experimental results

Concluding remarks

Code padding

[Figure: reservation tables (rows FETCH, FU1, FU2, COMPL.) over cycles 1 to 8 for a block sequence without and with a filler instruction, and a table over cycles 1 to 6]

Example (1)

block i (requires a 4-cycle delay)
    inst i1
    inst i2
    …
    inst i_ni

block j (requires a 3-cycle delay)
    inst j1
    inst j2
    …
    inst j_nj

block k (requires a 1-cycle delay)
    inst k1
    …
    inst k_nk

Example (2)

block i
    inst i1
    inst i2
    …
    inst i_ni
    nop nop nop nop nop nop nop nop    (4-cycle delay)

block j
    inst j1
    inst j2
    …
    inst j_nj
    nop nop nop nop nop nop    (3-cycle delay)

block k
    inst k1
    …
    inst k_nk
    nop nop    (1-cycle delay)

Example (3)

block i
    inst i1
    inst i2
    …
    inst i_ni
    bl delay4    (4-cycle delay)

block j
    inst j1
    inst j2
    …
    inst j_nj
    bl delay3    (3-cycle delay)

block k
    inst k1
    …
    inst k_nk
    nop nop    (1-cycle delay)

filler block
    delay4: nop nop
    delay3: nop nop
    delay2: blr
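The two padding styles of the examples can be sketched as a small Python generator. This is an illustration under stated assumptions, not the authors' tool: the issue width of 2 (two nops per cycle) is read off the nop counts in Example (2), and the 2-cycle cost of a `bl`/`blr` pair is inferred from the `delay2` label. All function names are hypothetical.

```python
def nop_padding(delay_cycles, issue_width=2):
    """Inline padding: one group of `issue_width` nops per delay cycle."""
    return ["nop"] * (delay_cycles * issue_width)


def call_padding(delay_cycles, issue_width=2, call_overhead=2):
    """Padding through a call into the shared filler block.

    Assumes the bl/blr pair alone costs `call_overhead` cycles, so shorter
    delays fall back to inline nops.
    """
    if delay_cycles < call_overhead:
        return nop_padding(delay_cycles, issue_width)
    return [f"bl delay{delay_cycles}"]


def filler_block(max_delay, issue_width=2, call_overhead=2):
    """Shared filler block: one nop group per cycle above the call overhead."""
    lines = []
    for d in range(max_delay, call_overhead, -1):
        lines.append(f"delay{d}:")
        lines.extend(["nop"] * issue_width)
    lines.append(f"delay{call_overhead}:")
    lines.append("blr")
    return lines


print(nop_padding(4))     # eight nops, as in Example (2)
print(call_padding(4))    # a single call, as in Example (3)
print(filler_block(4))
```

The call-based form trades a fixed call cost for a much smaller code size increase when many blocks need long delays, since the nops are shared.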

Code padding framework

[Figure: tool chain]
C source code → gcc compiler → assembly code → gas assembler → object code → CFG extractor → list of basic blocks → cycle-level simulator → execution traces of block sequences → interference analysis → padding lengths → code padding → safe padded assembly code

Analyzing resource requirements (1)

Requirements of a basic block

foreach block B do {
    ff[B] ← first fetch cycle of B;
    lf[B] ← last fetch cycle of B + 1;
    foreach resource R do {
        n[R] ← cycle at which R is needed;
        r[R] ← cycle at which R is released;    // 0 if R not used by B
        n[R,B] ← n[R] - ff[B];
        r[R,B] ← r[R] - lf[B];                  // 0 if R not used by B
    }
    d[B] ← 0;
}

[Figure: reservation table (rows FETCH, FU1, FU2, COMPL.) of block B2 over cycles 1 to 5]

ff[B2] = 1
lf[B2] = 2
n[FU1,B2] = 0
r[FU1,B2] = 0
n[FU2,B2] = 1
r[FU2,B2] = 3
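A minimal Python sketch of this per-block analysis follows. It assumes each block is described by the cycles in which it is fetched and the cycles in which it occupies each resource, and that a resource is "released" in the cycle after its last busy cycle. Both the data layout and the busy cycles used for B2 (FU2 busy in cycles 2 to 4) are assumptions chosen to reproduce the values on the slide.

```python
def block_requirements(fetch_cycles, usage):
    """Return (ff, lf, req) for one block.

    fetch_cycles -- cycles in which the block's instructions are fetched
    usage        -- resource name -> list of busy cycles (empty if unused)
    req          -- resource name -> (n[R,B], r[R,B]), both 0 if R is unused
    """
    ff = min(fetch_cycles)          # first fetch cycle of B
    lf = max(fetch_cycles) + 1      # last fetch cycle of B + 1
    req = {}
    for res, cycles in usage.items():
        if not cycles:
            req[res] = (0, 0)       # resource not used by the block
            continue
        needed = min(cycles)        # cycle at which R is needed
        released = max(cycles) + 1  # assumed: released right after last use
        req[res] = (needed - ff, released - lf)
    return ff, lf, req


# Assumed schedule for B2: fetched in cycle 1, FU2 busy in cycles 2 to 4.
print(block_requirements([1], {"FU1": [], "FU2": [2, 3, 4]}))
```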

Analyzing resource requirements (2)

Requirements of a sequence

foreach sequence B1-…-Bx (x < n) do {
    lf[Bx] ← last fetch cycle of Bx + 1;
    foreach resource R do {
        r[R] ← cycle at which R is released;    // 0 if R not used by any Bi
        r[R,B1-…-Bx] ← r[R] - lf[Bx];
    }
}

[Figure: reservation table (rows FETCH, FU1, FU2, COMPL.) of the sequence B1-B2 over cycles 1 to 6]

lf[B2] = 3
r[FU1,B1-B2] = 2
r[FU2,B1-B2] = 3

r[FU1,B2] = 0
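The sequence-level analysis can be sketched the same way in Python. The function below measures, for each resource, how long it stays busy after the last block of the sequence has been fetched. The busy cycles given for B1-B2 (FU1 in cycles 2 to 4, FU2 in cycles 3 to 5, B2 fetched in cycle 2) are assumptions picked to match the values on the slide, and "released" is again taken to mean the cycle after the last use.

```python
def sequence_release(last_block_fetch, usage):
    """Return r[R, B1-...-Bx] for every resource R.

    last_block_fetch -- cycles in which the last block Bx is fetched
    usage            -- resource name -> busy cycles over the whole sequence
    """
    lf = max(last_block_fetch) + 1            # last fetch cycle of Bx + 1
    return {
        res: (max(cycles) + 1 - lf) if cycles else 0   # 0 if R is never used
        for res, cycles in usage.items()
    }


# Assumed schedule for the sequence B1-B2.
print(sequence_release([2], {"FU1": [2, 3, 4], "FU2": [3, 4, 5]}))
```

Comparing the result with the per-block value r[FU1,B2] = 0 shows that FU1 is still held by B1 for two cycles after B2 has been fetched.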

Computing padding lengths (1)

Depth-1 strategy

objective: r[R,A-B] == r[R,B]

algorithm:

foreach sequence A-B do
    foreach resource R do
        if r[R,A-B] ≠ r[R,B] then {
            d ← StrictDelay(R,A-B);    // computes the padding length (iterative trials)
            if d > d[B] then
                d[B] ← d;
        }

example: r[FU1,B1-B2] = 2 > r[FU1,B2] = 0
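The depth-1 loop can be written as a short Python sketch. The slide says that StrictDelay finds the padding length by iterative trials, which is not reproduced here: `strict_delay` is a caller-supplied function, and the stand-in used in the usage lines (difference between the two release offsets) is purely an assumption for illustration.

```python
def depth1_padding(pairs, r_seq, r_block, strict_delay):
    """Compute d[B] for every block B that ends a two-block sequence A-B.

    pairs        -- iterable of (A, B) block pairs
    r_seq        -- (A, B) -> {resource: r[R, A-B]}
    r_block      -- B -> {resource: r[R, B]}
    strict_delay -- callable (R, A, B) -> padding length in cycles
    """
    d = {}
    for a, b in pairs:
        d.setdefault(b, 0)                        # d[B] starts at 0
        for res, r_alone in r_block[b].items():
            if r_seq[(a, b)][res] != r_alone:     # A changes B's release offset
                d[b] = max(d[b], strict_delay(res, a, b))
    return d


r_seq = {("B1", "B2"): {"FU1": 2, "FU2": 3}}
r_block = {"B2": {"FU1": 0, "FU2": 3}}


def stand_in_delay(res, a, b):
    # Hypothetical stand-in for StrictDelay, not the iterative-trial procedure.
    return r_seq[(a, b)][res] - r_block[b][res]


print(depth1_padding([("B1", "B2")], r_seq, r_block, stand_in_delay))
```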

Computing padding lengths (2)

Depth-n strategy

analyze (n+1)-block sequences (B0-B1-…-Bn)

objectives:
    for i < n:
        if r[R,B0-…-Bi] > n[R,Bi+1] : r[R,B0-…-Bi] == r[R,B1-…-Bi]
    r[R,B0-…-Bn] == r[R,B1-…-Bn]

Computing padding lengths (3)

Example: depth-4 algorithm

foreach sequence A-B-C-D-E do
    foreach resource R do
        if (n[R,C] > 0) && (r[R,A-B] > n[R,C]) && (r[R,A-B] > r[R,B]) then {
            d ← MinimumDelay(R,A-B-C);
            if d > d[B] then
                d[B] ← d;
        }
        elsif (n[R,D] > 0) && (r[R,A-B-C] > n[R,D]) && (r[R,A-B-C] > r[R,B-C]) then {
            d ← MinimumDelay(R,A-B-C);
            ...
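The chain of tests in the depth-4 example follows one pattern, which the Python sketch below makes explicit. It only covers the conditional tests visible on the slide. The final objective on the full sequence and the MinimumDelay computation are not modelled, and the function names, data layout and sample values are assumptions for illustration.

```python
def needs_padding(r_prefix, r_suffix, n_next):
    """One test of the chain, e.g. r[R,A-B], r[R,B] and n[R,C].

    True when the next block uses R (n_next > 0), the prefix still holds R
    when the next block needs it, and the first block is what extends the hold.
    """
    return n_next > 0 and r_prefix > n_next and r_prefix > r_suffix


def first_conflict(seq, r, n):
    """Index i of the first prefix B0-...-Bi that triggers padding, else None.

    seq -- list of block names B0..Bn
    r   -- tuple of block names -> release offset of the resource
    n   -- block name -> cycle offset at which the block needs the resource
    """
    for i in range(1, len(seq) - 1):
        prefix = tuple(seq[:i + 1])      # B0-...-Bi
        suffix = tuple(seq[1:i + 1])     # B1-...-Bi
        if needs_padding(r[prefix], r[suffix], n[seq[i + 1]]):
            return i
    return None


# Hypothetical offsets for one resource over the sequence A-B-C-D-E.
r = {
    ("A", "B"): 2, ("B",): 0,
    ("A", "B", "C"): 3, ("B", "C"): 1,
    ("A", "B", "C", "D"): 0, ("B", "C", "D"): 0,
}
n = {"C": 0, "D": 2, "E": 0}
print(first_conflict(["A", "B", "C", "D", "E"], r, n))
```

In this made-up case block C does not use the resource, so the first test is skipped and the conflict is found one block later, on A-B-C against D.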

Experimental results (1)

Code size increase    2-way      4-way
matmul                35.24%     76.19%
ludcmp                16.51%     28.20%
jfdctint              11.37%    126.97%
bsort                 31.25%     76.25%
heapsort              25.00%     51.47%
insertsort            23.81%     59.52%
MEAN                  23.86%     69.77%
(depth-1)

[Figure: increase of code size (0% to 80%) versus depth of the analysis (depth-1 to depth-4), for 2-way and 4-way pipelines]

Experimental results (2)

WCET increase

[Figure: increase of the real WCET (0% to 60%) versus depth of the analysis (depth-1 to depth-4), for 2-way and 4-way pipelines]

Concluding remarks

Inter-block long timing effects make the WCET analysis complex and pessimistic

Code padding prevents long timing effects and limits the variability of partial execution times

The cost of padding can be acceptable:
    code size (≈ 20% for a 2-way pipeline)
    real WCET increase (≈ 20%)

Future work: cost on the estimated WCET?

Thank you!

Traces stands for Research group on Architectures and Compilers for Embedded Systems

www.irit.fr/TRACES