
Extensible Networking Platform 1 Liquid Architecture

Cycle Accurate Performance Measurement

Richard Hough

Phillip Jones, Scott Friedman, Roger Chamberlain, Jason Fritts, John Lockwood, and Ron Cytron

[email protected]

http://liquid.arl.wustl.edu/

Funded by NSF Grant ITR-0313203

Extensible Networking Platform 2 Liquid Architecture

Outline

• Introduction
• Motivation
• Background
• Architecture
• Usage
• Results
• Future Work
• Related Work
• Conclusion

Extensible Networking Platform 3 Liquid Architecture

Introduction – What Are We Doing?

• Creating a module for capturing cycle-accurate profiles of hardware events during the runtime of programs on real systems

[Figure: the Statistics Module, labelled with Program Runtime, Program Bottlenecks, Cache Hits, Memory Accesses, and ISA Decoding]

Extensible Networking Platform 8 Liquid Architecture

Background – FPX

• Designed and implemented on the FPX platform
• The FPX platform:
  – Is designed for developing pluggable network circuits
  – Contains a Virtex 2000E FPGA for design deployment
  – Has a smaller FPGA used as a network interface device
• Can potentially operate at gigabit line rates

Extensible Networking Platform 9 Liquid Architecture

Background – LEON2

• Developed by Gaisler Research
  – SPARC V8
  – Open-source VHDL
  – Widely used
    • European Space Agency, etc.
  – Second in popularity only to the MicroBlaze

Extensible Networking Platform 10 Liquid Architecture

Motivation – Why Not Use Software?

• Software profiling is:
  – Inaccurate
    • Many data points estimated
    • Time slices not absolute
    • Profiling affects results
  – Inefficient
    • Unreasonable for real-system deployment
  – Ineffective
    • Difficult to separate OS overhead

Extensible Networking Platform 11 Liquid Architecture

Motivation – Why Not Use Simulation?

• Simulation is:
  – Slow
    • A simple simulation could require 100X more time than running the program
  – Bound by the quality of the model
    • The model used may be inaccurate
    • Processors are often tweaked without updating the documentation [Larus]

Extensible Networking Platform 12 Liquid Architecture

Motivation – Why Use FPGAs?

• ASICs are expensive
  – FPGAs provide a good blend of cost and accuracy
• Software simulation of processors is incredibly slow
• FPGAs allow for easy prototyping
  – Test new caching methods, tweak the ISA, etc.

Extensible Networking Platform 13 Liquid Architecture

Motivation – Why Put Statsmod In An FPGA?

• The Statistics Module allows you to:
  – Pull event signals from anywhere
  – Evaluate both software and hardware optimizations
    • Tweak the architecture
    • Integrate hardware-accelerated modules into software solutions
    • Adjust the software algorithm
  – Gather repeatable and reliable results

Extensible Networking Platform 14 Liquid Architecture

Architecture – Naïve Solution

• Interested in 10 events and 10 methods
  – Naïve solution implements a counter for each possibility
    • 100 counters!
  – Not scalable for large systems

[Figure: Event 0 through Event 9 against Method 0 through Method 9]

Extensible Networking Platform 15 Liquid Architecture

Architecture – Our Solution

• Better approach
  – Associate counters with events and methods at run time
  – Covers the problem space, but uses less chip area

[Figure: Event 0 through Event 9 against Method 0 through Method 9]
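To make the contrast between the two approaches concrete, here is a minimal software sketch. It is not the hardware design: the pool size of 4 and the names `CounterPool`, `bind`, and `observe` are assumptions made for illustration, and only the 10 events and 10 methods come from the slides.

```python
from dataclasses import dataclass

NUM_EVENTS = 10   # from the slide
NUM_METHODS = 10  # from the slide


def naive_counter_count(num_events, num_methods):
    """One dedicated counter for every (event, method) pair."""
    return num_events * num_methods


@dataclass
class CounterBinding:
    event: int
    method: int
    count: int = 0


class CounterPool:
    """A small pool of counters bound to (event, method) pairs at run time.

    Hypothetical model: the pool size is chosen by the user, not taken
    from the slides.
    """

    def __init__(self, size):
        self.size = size
        self.bindings = []

    def bind(self, event, method):
        """Reserve one counter for the given pair and return its index."""
        if not (0 <= event < NUM_EVENTS and 0 <= method < NUM_METHODS):
            raise ValueError("event or method index out of range")
        if len(self.bindings) >= self.size:
            raise RuntimeError("no free counters in the pool")
        self.bindings.append(CounterBinding(event, method))
        return len(self.bindings) - 1

    def observe(self, event, method):
        """Record one occurrence of `event` while `method` is executing."""
        for binding in self.bindings:
            if binding.event == event and binding.method == method:
                binding.count += 1


pool = CounterPool(size=4)      # 4 counters instead of 100
idx = pool.bind(event=3, method=7)
pool.observe(3, 7)
pool.observe(3, 7)
pool.observe(3, 6)              # not bound, so it is ignored
```

The trade-off is that only the bound pairs are measured in a given run; covering all 100 pairs with a pool this small would take several runs with different bindings.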

Extensible Networking Platform 16 Liquid Architecture

Architecture – An In Depth Look

[Figure: block diagram with the following labels: The Internet, Configuration UDP Packets, Mux Control Register, Event Signals, Event MUX, Selected Event, PC-ARR Comparison Signals, ARR MUX, Selected ARR, 32-Bit Counter, CLK, Timer, Statistic Result Packets]
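The following is a rough cycle-level model of a single counter in the diagram above. Several details are assumptions rather than facts from the slides: that the counter increments when the selected event and the selected address-range match are both asserted on a clock edge, that address ranges are half-open, and that reconfiguring clears the count. The class and method names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # the slide specifies a 32-bit counter


class StatCounter:
    """Software model of one event MUX, one ARR MUX, and one 32-bit counter."""

    def __init__(self, address_ranges):
        # Each address range register (ARR) is modelled as (start, end),
        # with `end` exclusive. This encoding is an assumption.
        self.address_ranges = list(address_ranges)
        self.event_select = 0
        self.arr_select = 0
        self.count = 0

    def configure(self, event_select, arr_select):
        """Model a write to the mux control register."""
        self.event_select = event_select
        self.arr_select = arr_select
        self.count = 0  # assumption: reconfiguration clears the counter

    def clock(self, pc, event_signals):
        """Advance one clock cycle given the current PC and event lines."""
        selected_event = bool(event_signals[self.event_select])
        start, end = self.address_ranges[self.arr_select]
        selected_arr = start <= pc < end  # PC-ARR comparison
        if selected_event and selected_arr:
            self.count = (self.count + 1) & MASK32  # wrap at 32 bits


counter = StatCounter([(0x1000, 0x1100), (0x2000, 0x2040)])
counter.configure(event_select=1, arr_select=0)
counter.clock(pc=0x1004, event_signals=[0, 1, 0])  # counted
counter.clock(pc=0x2004, event_signals=[0, 1, 0])  # PC outside selected range
counter.clock(pc=0x1008, event_signals=[1, 0, 0])  # selected event not asserted
```

Because the model is evaluated once per clock, the count is exact for the modelled cycles, which mirrors the cycle-accurate goal of the module.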

Extensible Networking Platform 17 Liquid Architecture

Architecture – Scalability

[Figure: "Four Input LUT Usage" chart. X-axis: N (0 to 140); Y-axis: 4 Input LUTs (0 to 20,000); series: Address Range Registers, Counters, Events, Naïve Approach]

Extensible Networking Platform 18 Liquid Architecture

Usage

Extensible Networking Platform 19 Liquid Architecture

Results – What do we get?

• The next few slides contain data from the Linpack benchmark running on the FPGA
  – Linpack is an FPU-intensive benchmark
• While the following slides focus on runtime, it is important to remember that the graphs could in principle be of *any* event

Extensible Networking Platform 20 Liquid Architecture

Results

[Figure: "Runtime Per Method" chart. X-axis: Method Name; Y-axis: Clock Cycles (0 to 350,000,000); labelled value: 323,686,726 clock cycles]

Extensible Networking Platform 21 Liquid Architecture

Results

[Figure: "Runtime per Time Slice" chart. X-axis: Time(s) (2 to 78); Y-axis: Clock Cycles (0 to 12,000,000); series: run_benchmark(), matgen(), dgefa(), dgesl(), daxpy(), ddot(), dscal(), idamax()]

Extensible Networking Platform 22 Liquid Architecture

Results

[Figure: "dscal() Runtime" chart. X-axis: Time(s) (8 to 88); Y-axis: Clock Cycles (0 to 250,000)]

Extensible Networking Platform 23 Liquid Architecture

Results

[Figure: "dscal() Runtime" chart. X-axis: Time(s) (2 to 78); Y-axis: Clock Cycles (0 to 120,000)]

Extensible Networking Platform 24 Liquid Architecture

Future Work – Where can we go?

• As of a week ago, the StatsMod was successfully integrated into a Linux 2.6.11 OS running on LEON
  – Changes have been made to allow a clear separation between process IDs
    • OS, background tasks, threads
  – A device driver allows any program, including the program being profiled, to gather the statistics

Extensible Networking Platform 25 Liquid Architecture

Future Work – Where can we go?

• Programs could now potentially collect statistics on themselves and perform runtime introspection
  – Adjust operation to conserve power, memory accesses, etc.
  – Deeper integration could occur at the kernel level to affect scheduler decisions
    • Adds a new dimension for slicing resources
      – Network activity, device activity, page faults, etc.

Extensible Networking Platform 26 Liquid Architecture

Related Work

• SnoopP
  – Developed by Lesley Shannon and Paul Chow at the University of Toronto
  – Collects timing characteristics of programs running on a MicroBlaze processor
    • Focuses on clock cycles only
  – Integrated into the EDK

Extensible Networking Platform 27 Liquid Architecture

Conclusion

In closing, I would like to thank:
  – Phillip Jones for his hard work and support
  – Ron Cytron for his mentoring and persistence
  – Scott Friedman for his work on the web interface
  – The rest of the Liquid Architecture team
  – And WISA for the invitation to present

Extensible Networking Platform 28 Liquid Architecture

Questions?

Extensible Networking Platform 29 Liquid Architecture

Background – Liquid

Extensible Networking Platform 30 Liquid Architecture

Usage

1. Connect to a secure web server controlling the FPGA hardware

2. Upload the desired binary executable, associated mapfile, and desired programming bitfile

3. A Perl script parses the map file and provides a graphical interface for selecting the desired address ranges and events

4. Statistic results are tabulated at the end of the program’s execution
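As a rough illustration of step 3, the sketch below turns a map file into per-function address ranges. The slides do not give the map file format, so this assumes `nm -S`-style lines (hex address, hex size, symbol type, name). The project's actual tool is a Perl script; Python is used here only for illustration. The function name `parse_map` and the sample addresses are invented; the symbol names `matgen` and `dgefa` are taken from the Linpack results.

```python
def parse_map(text):
    """Return {function_name: (start, end)} for text (code) symbols.

    Assumes each useful line has four whitespace-separated fields:
    hex address, hex size, symbol type, symbol name. Lines that do
    not match this assumed layout are skipped. `end` is exclusive.
    """
    ranges = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        addr_hex, size_hex, kind, name = fields
        if kind not in ("T", "t"):  # keep code symbols only
            continue
        try:
            start = int(addr_hex, 16)
            size = int(size_hex, 16)
        except ValueError:
            continue
        ranges[name] = (start, start + size)
    return ranges


# Made-up sample input in the assumed format.
SAMPLE_MAP = """\
40001000 00000120 T matgen
40001120 00000200 T dgefa
40003000 00000010 D some_data
"""

function_ranges = parse_map(SAMPLE_MAP)
for name, (start, end) in function_ranges.items():
    print(f"{name}: 0x{start:08x}-0x{end:08x}")
```

Each resulting range could then be offered to the user as a candidate address range to profile.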