
Source: cseweb.ucsd.edu/~abrahimi/pubs/ISLPED12_Abbas_poster.pdf


Procedure Hopping: a Low Overhead Solution to Mitigate Variability in Shared-L1 Processor Clusters

Abbas Rahimi†, Luca Benini‡ and Rajesh Gupta†
†UC San Diego and ‡Università di Bologna

Procedure-level Vulnerability (PLV)

* PLV exposes fast dynamic voltage variation and its effects to the compiler, for use in runtime migration.
* At compile time, we quantify the effect of the full range of operating conditions on the dynamic voltage variation of every procedure.

Sources of Variability

[Figure: sources of variability. Labeled magnitudes: 10% VCC, 160°C ∆ temperature, 40% VTH.]

[Figure: PLV characterization flow across design time, compile time, and run time. The recoverable labels are listed below.]

Design time:
* Open-source Leon3 core (VHDL), Core i
* TSMC 45nm LIBs at four corners: (0.81V, -40°C), (0.81V, 125°C), (0.90V, 25°C), (0.99V, 125°C)
* Tools: Design Compiler, IC Compiler, PrimeTime PX, PrimeRail, ModelSim (Vsim)
* Intermediate data: timing constraints, Verilog netlist, parasitics, SDF, switching activity

Compile time:
* Source code and VA-procedures' source code, BCC compiler, object code, (V,T) executables
* VA-Proc generation: ProcX / ProcX@Caller / ProcX@Callee
* Generating metadata: ProcX power @(Vi,Tj) and dynamic voltage droop/rise @(Vi,Tj), stored as PLV characterized metadata

Run time (operating condition (V,T) monitor):
For ProcX@Caller:
  Read current (V,T) sensors of Core i
  Read characterized metadata for ProcX
  If PLVX > PLV_threshold
    Invoke Procedure Hopping (ProcX@Callee)
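The run-time check performed by ProcX@Caller can be illustrated with a minimal Python sketch. The metadata layout, the PLV values, and the 4.0 threshold are all assumptions made for illustration; the poster does not give the metadata format or the per-procedure PLV numbers. The sketch also assumes the (V,T) sensors report one of the four characterized corners exactly.

```python
# Hypothetical metadata: one PLV value per characterized (V, T) corner.
# These numbers are placeholders for illustration, not data from the poster.
PLV_METADATA = {
    "ProcX": {
        (0.81, -40): 2.0,
        (0.81, 125): 2.5,
        (0.90, 25): 3.5,
        (0.99, 125): 5.0,
    }
}

PLV_THRESHOLD = 4.0  # assumed threshold


def should_hop(proc_name, voltage, temperature,
               metadata=PLV_METADATA, threshold=PLV_THRESHOLD):
    """Return True when the caller should invoke procedure hopping.

    Mirrors the run-time steps: read the (V, T) sensors, read the
    characterized metadata for the procedure, compare PLV to the threshold.
    """
    plv = metadata[proc_name][(voltage, temperature)]
    return plv > threshold


if __name__ == "__main__":
    for corner in PLV_METADATA["ProcX"]:
        print(corner, "hop" if should_hop("ProcX", *corner) else "run locally")
```

Keeping the lookup as a plain table keyed by corner reflects the idea that the expensive characterization happens before run time, so the run-time cost is a sensor read and a comparison.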

NSF Expedition in Computing, Variability-Aware Software for Efficient Computing with Nanoscale Devices, http://variability.org

[Figure: sources of variability. Labels: across-wafer frequency, VCC droop, temperature, other uncertainty; clock, actual circuit delay, guardband.]

Variation-tolerant Shared-L1 Cluster

Variation-aware VDD-hopping to mitigate process variation

[Figure: 16-core shared-L1 cluster. Labels: Core 0 ... Core 15, each with VA-VDD-hopping, CPM, and PSS; level shifters; logarithmic interconnects to the I$ banks (B0 ... Bi-1) and to the TCDM banks (B0 ... Bj-1); Low VDD, Typical VDD, High VDD; DFS with clocks f and f+180°; SHM.]

Each core increases its voltage if its delay is high.

Procedure hopping to mitigate dynamic voltage variation

[Figure: the same cluster diagram, shown for procedure hopping.]

Each procedure hops from one core to another if it causes voltage variation.

VDD-hopping

* At VDD = 0.81V, three cores (f4, f8, f9) cannot meet the target frequency of 830MHz.
* Once VA-VDD-hopping raises those three cores to 0.99V, all cores of the same cluster meet the target frequency of 830MHz.
* VA-VDD-hopping tunes each core's voltage based on its delay, as reported by the CPMs.

Intra-cluster Procedure Hopping

(Voltage, Temp.)   0.99V, 125°C   0.90V, 25°C   0.81V, 125°C   0.81V, -40°C
Power density      0.66 μW/μm²    0.21 μW/μm²   0.18 μW/μm²    0.16 μW/μm²
Max IR-drop        44 mV          < 35 mV       < 31 mV        < 31 mV

Inter-corner voltage droop of the FIR procedure: FIR does not face any voltage emergency (droop < 4%) at the corners with voltages of 0.81V to 0.90V, due to their lower power densities.
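The 4% criterion can be checked against the FIR figures above with a few lines of arithmetic. This sketch assumes that the droop percentage is the maximum IR-drop divided by the nominal VDD of the corner; the poster does not spell out that formula. The 35 mV and 31 mV entries are upper bounds, so the computed percentages for those corners are upper bounds as well.

```python
# Max IR-drop of FIR per supply voltage, in mV, taken from the table above.
# The 0.90 V and 0.81 V entries are upper bounds ("< 35 mV", "< 31 mV").
FIR_MAX_IR_DROP_MV = {0.99: 44.0, 0.90: 35.0, 0.81: 31.0}

EMERGENCY_PERCENT = 4.0  # voltage emergency threshold, percent of VDD


def droop_percent(ir_drop_mv, vdd_volts):
    """IR-drop expressed as a percentage of the nominal supply (assumed formula)."""
    return 100.0 * ir_drop_mv / (vdd_volts * 1000.0)


def is_voltage_emergency(ir_drop_mv, vdd_volts):
    """True when the droop exceeds the 4% threshold."""
    return droop_percent(ir_drop_mv, vdd_volts) > EMERGENCY_PERCENT


if __name__ == "__main__":
    for vdd, drop in FIR_MAX_IR_DROP_MV.items():
        pct = droop_percent(drop, vdd)
        verdict = "emergency" if is_voltage_emergency(drop, vdd) else "safe"
        print(f"VDD = {vdd:.2f} V: {pct:.2f}% of VDD, {verdict}")
```

Under this assumption, 44 mV at 0.99V is about 4.4% of VDD, while the bounds at 0.90V and 0.81V both stay under 4%, which agrees with the caption.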

Per-core frequency (MHz) under each supply configuration:

Core   VDD = 0.81V   VDD = 0.99V   VA-VDD-hopping (0.81V, 0.99V)
f0     862           1408          862
f1     909           1389          909
f2     870           1408          870
f3     847           1370          847
f4     826           1370          1370
f5     855           1408          855
f6     877           1408          877
f7     893           1408          893
f8     820           1370          1370
f9     826           1370          1370
f10    909           1389          909
f11    847           1370          847
f12    901           1408          901
f13    917           1408          917
f14    847           1389          847
f15    901           1389          901
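The VA-VDD-hopping column can be reproduced from the two fixed-voltage columns with a simple selection rule: keep a core at 0.81V if it meets the 830MHz target there, otherwise raise it to 0.99V. The Python sketch below encodes that rule. The rule itself is an assumption inferred from the frequency data; the poster only states that the voltage is tuned based on the delay reported by the CPMs, and it also shows a typical VDD level that this two-level sketch ignores.

```python
TARGET_MHZ = 830

# Per-core frequencies (MHz) for cores f0..f15, copied from the poster data.
FREQ_AT_0V81 = [862, 909, 870, 847, 826, 855, 877, 893,
                820, 826, 909, 847, 901, 917, 847, 901]
FREQ_AT_0V99 = [1408, 1389, 1408, 1370, 1370, 1408, 1408, 1408,
                1370, 1370, 1389, 1370, 1408, 1408, 1389, 1389]


def assign_vdd(freq_low, freq_high, target_mhz):
    """Pick the lowest supply at which each core meets the target.

    Returns a list of (core_index, vdd_volts, frequency_mhz) tuples.
    """
    plan = []
    for core, (low, high) in enumerate(zip(freq_low, freq_high)):
        if low >= target_mhz:
            plan.append((core, 0.81, low))
        elif high >= target_mhz:
            plan.append((core, 0.99, high))
        else:
            raise ValueError(f"core {core} misses {target_mhz} MHz at both voltages")
    return plan


if __name__ == "__main__":
    plan = assign_vdd(FREQ_AT_0V81, FREQ_AT_0V99, TARGET_MHZ)
    raised = [core for core, vdd, _ in plan if vdd == 0.99]
    print("cores raised to 0.99 V:", raised)
    print("resulting frequencies:", [freq for _, _, freq in plan])
```

Only the three slow cores pay the power cost of the higher supply; the other thirteen stay at 0.81V.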

[Figure: max voltage droop (%), y-axis from 0 to 8, at the corners (0.81V, -40°C), (0.81V, 125°C), (0.90V, 25°C), (0.99V, 125°C).]

[Figure: peak power (W), y-axis from 0.0 to 4.0, at the corners (0.81V, -40°C), (0.81V, 125°C), (0.90V, 25°C), (0.99V, 125°C).]

3.5× inter-corner peak power variation, and 1.28× intra-corner peak power variation

* At (0.99V, 125°C), all procedures except tblook face a voltage droop/rise > 4% of VDD.
* At (0.90V, 25°C), only four procedures (IFFT, IDCT, matrix, ttsprk) face voltage emergencies.
* All procedures running on cores at 0.81V have a maximum voltage droop/rise < 4% of VDD.

* Low-cost runtime procedure hopping migrates procedures within the processor cluster, using the compile-time characterization of PLV (captured as metadata).
* This is made possible by the shared-L1 I$ and TCDM, which eliminate the penalty of filling a private storage.

Caller side (Core i):

…
call ProcX            //conventional compile
call ProcX@Caller     //VA-compile
…
ProcX@Caller:
  if (calculate_PLV ≤ PLV_threshold)
    call ProcX
  else
    create_shared_stack_layout
    set_PHIT_for_ProcX
    send_broadcast_req
    set_timer
    wait_on_ack_or_timer
…
Broadcast_ack_ISR:
  if (statusX_PHIT == done)
    load_context&return_from_SSPX

Callee side (Core k):

Broadcast_req_ISR:
  ProcX@Callee = search_in_PHIT
  call ProcX@Callee
…
ProcX@Callee:
  if (calculate_PLV ≤ PLV_threshold)
    set_statusX_PHIT = running
    load_context&param_from_SSPX
    set_all_param&pointers
    call ProcX
    store_context_to_SSPX
    set_statusX_PHIT = done
    send_broadcast_ack
  else
    resume_normal_execution
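The caller/callee handshake in the pseudocode can be simulated in a few lines of Python. This is a single-threaded sketch under several assumptions: the class names `Cluster` and `Core` are hypothetical, each core's PLV is modeled as one fixed number, the broadcast is a plain loop over the other cores, and an unanswered request raises `TimeoutError` because the poster does not say what the caller does when the timer expires. Interrupts, the real shared-stack layout, and context save/restore are not modeled.

```python
PLV_THRESHOLD = 4.0  # assumed threshold


class Cluster:
    """State shared by all cores: the PHIT and the shared stack (both in TCDM)."""

    def __init__(self):
        self.phit = {}          # procedure name -> "pending" | "running" | "done"
        self.shared_stack = {}  # procedure name -> {"args": ..., "result": ...}
        self.cores = []


class Core:
    def __init__(self, cluster, name, plv):
        self.cluster = cluster
        self.name = name
        self.plv = plv  # simplification: one PLV value per core
        cluster.cores.append(self)

    def call_va(self, proc, *args):
        """ProcX@Caller: run locally if safe, otherwise request a hop."""
        if self.plv <= PLV_THRESHOLD:
            return proc(*args), self.name
        key = proc.__name__
        shared = self.cluster
        shared.shared_stack[key] = {"args": args, "result": None}  # create_shared_stack_layout
        shared.phit[key] = "pending"                               # set_PHIT_for_ProcX
        for other in shared.cores:                                 # send_broadcast_req
            if other is not self and other.on_broadcast_req(proc):
                # Broadcast_ack_ISR: the status is "done", so load the result.
                return shared.shared_stack[key]["result"], other.name
        raise TimeoutError(f"no core accepted {key}")              # timer expired

    def on_broadcast_req(self, proc):
        """Broadcast_req_ISR plus ProcX@Callee; returns True if an ack is sent."""
        key = proc.__name__
        shared = self.cluster
        if self.plv > PLV_THRESHOLD or shared.phit.get(key) != "pending":
            return False                          # resume_normal_execution
        shared.phit[key] = "running"
        frame = shared.shared_stack[key]          # load context and parameters
        frame["result"] = proc(*frame["args"])    # call ProcX, store the result back
        shared.phit[key] = "done"
        return True                               # send_broadcast_ack


def fir(samples):
    """Stand-in workload; not the actual FIR benchmark."""
    return sum(samples)


if __name__ == "__main__":
    cluster = Cluster()
    caller = Core(cluster, "core_i", plv=5.0)   # above threshold, must hop
    callee = Core(cluster, "core_k", plv=3.0)   # below threshold, can accept
    print(caller.call_va(fir, [1, 2, 3]))
```

Because both the arguments and the result live in the shared stack, the hop moves no code or data between private memories, which is the property the shared-L1 I$ and TCDM provide.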

[Figure: memory view during a hop. Labels: Caller Core i and Callee Core k, each with an operating condition monitor and an interrupt controller; shared L1 I$ holding ProcX, ProcX@Caller, and ProcX@Callee; TCDM holding the heap, the stacks, the shared stack, and the PHIT.]