CHREC F3: Target Tracking Rafael Garcia 11/26/08

CHREC F3: Target Tracking

Rafael Garcia

11/26/08


F3 Goals, Motivations, & Challenges

Goals
  Develop applications & design strategies for scalable architectures from case studies
  Analyze & examine available multi-FPGA platforms and tools for scalable system design

Motivations
  Meet performance requirements in HPC/HPEC scenarios by mapping across multiple FPGAs
  Exploit multi-FPGA platforms to develop larger, complex designs and algorithms
  Increase understanding of performance prediction, power, and usability for scalable apps

Challenges
  Perform multilevel algorithm partitioning, analysis, and optimization for multi-FPGA systems
  Determine influence of application characteristics on selection of platforms, tools and languages

[Figure: F3 Insights (Formulation, Translation, Design, Execution)]

Kalman Filter Overview
  Traditional Kalman filters estimate the state of a dynamic system in a noisy environment
  Commonly used in target prediction and can be extended to multiple dimensions, targets, and models
  Excellent target tracker when an accurate model is known
  Useful even if an accurate model is not known

Current Architecture
  4 tightly coupled FPGAs mapped to 4 quadrants
  System is driven by two global clocks
    100 MHz inter-FPGA communication links
    50 MHz data-processing clock
  2-step processing cycle returns results at 25 MSa/s
  Inter-FPGA communication occurs when a target crosses a quadrant boundary
    Current state of target is passed along
  Non-pipelined design
    2-step cycle where one step depends on the previous one and the other step depends on pseudo-sensor data from the host CPU
    Low frequency and lack of pipeline registers are expected to lower power consumption
    2-cycle design simplifies communication network
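The 25 MSa/s result rate follows from the clock figures on this slide. A quick check, assuming one result is produced per complete 2-step cycle of the data-processing clock (the function and variable names are illustrative only):

```python
# Throughput check for the non-pipelined 2-step design.
# Assumption: one result is produced per complete 2-step cycle.
DATA_CLOCK_HZ = 50_000_000   # 50 MHz data-processing clock (from the slide)
STEPS_PER_RESULT = 2         # 2-step processing cycle (from the slide)

def result_rate_sa_per_s(clock_hz: int, steps: int) -> float:
    """Results per second when each result takes `steps` clock cycles."""
    return clock_hz / steps

rate = result_rate_sa_per_s(DATA_CLOCK_HZ, STEPS_PER_RESULT)
print(f"{rate / 1e6:.0f} MSa/s")
```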

Current Architecture
  Continuously receiving pseudo-sensor data and returning condensed information
  Limited to a single target per quadrant
  Set sensor sampling rate of 25 MSa/s

  Resource utilization on Stratix II EP2S180F1020C3
    M4K RAMs   DSPs   ALUTs
    1%         15%    2%
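The quadrant-per-FPGA mapping and the state hand-off at a boundary crossing can be sketched as below. This is only an illustration: the origin-centered coordinate convention, the quadrant numbering, and the names quadrant_of and needs_handoff are assumptions, not details of the actual design.

```python
# Hypothetical sketch: map a target position to one of four quadrants
# (one per FPGA) and detect when its state must be handed to a neighbor.
# The coordinate convention and quadrant numbering are assumptions.

def quadrant_of(x: float, y: float) -> int:
    """Return 0..3 for a field centered on the origin."""
    if x >= 0 and y >= 0:
        return 0
    if x < 0 and y >= 0:
        return 1
    if x < 0 and y < 0:
        return 2
    return 3

def needs_handoff(prev_xy, new_xy) -> bool:
    """True when the target crossed a quadrant boundary between samples."""
    return quadrant_of(*prev_xy) != quadrant_of(*new_xy)

track = [(1.0, 2.0), (0.5, 1.0), (-0.2, 0.4), (-0.6, -0.1)]
owners = [quadrant_of(x, y) for x, y in track]
handoffs = [needs_handoff(a, b) for a, b in zip(track, track[1:])]
print(owners)
print(handoffs)
```

With one target per quadrant, a hand-off only needs to forward the current state estimate to the FPGA that owns the new quadrant.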

Simplified Algorithm
  Assumes steady-state operation
    Target must closely follow given movement model for accurate results
    Allows for precomputed covariance and Kalman-gain terms
  Model tracks four parameters
    Horizontal position
    Vertical position
    Horizontal velocity
    Vertical velocity
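A minimal sketch of what a steady-state tracker with precomputed gains can look like for the four parameters listed above. It uses a constant-velocity model with fixed gains (the alpha-beta form); the gain values, the sample period, and the choice to measure position only are placeholder assumptions, not values from the design.

```python
# Minimal sketch of a steady-state (fixed-gain) tracker for one target.
# State: horizontal/vertical position and velocity, as on the slide.
# ALPHA, BETA, and DT are placeholder values, not taken from the design;
# in a real steady-state Kalman filter the gains would be precomputed
# offline from the covariance recursion.
DT = 1.0                  # sample period (arbitrary units)
ALPHA = 0.5               # fixed gain applied to the position residual
BETA = 0.1                # fixed gain applied to the velocity correction
BETA_OVER_DT = BETA / DT  # precomputed once, so the loop never divides

def step(state, measurement):
    """One predict/correct cycle using only multiplies, adds, and subtracts."""
    x, y, vx, vy = state
    mx, my = measurement
    # Predict: constant-velocity motion model.
    px, py = x + vx * DT, y + vy * DT
    # Correct: blend in the measurement residual with the fixed gains.
    rx, ry = mx - px, my - py
    return (px + ALPHA * rx,
            py + ALPHA * ry,
            vx + BETA_OVER_DT * rx,
            vy + BETA_OVER_DT * ry)

state = (0.0, 0.0, 0.0, 0.0)
for k in range(1, 51):
    # Noise-free target moving at (+2, -1) per sample.
    state = step(state, (2.0 * k, -1.0 * k))
print([round(v, 3) for v in state])
```

Because the gains are constants, the per-sample work contains no covariance update and no division, which is the saving the steady-state assumption buys.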

Algorithm Changes
  Remove the hardcoded terms, increasing prediction accuracy during non-steady-state situations
  Modify model to include Z-axis parameters for airborne targets

New Module Types

Sensor         Target    Precision  Resource  Kernel
Low Power      Slow      Fixed      Low       Kalman Filter
Fast Sampling  Fast      Fixed      Low       Kalman Filter
Multi-Scale    Airborne  Floating   High      MKS
High-Noise     Noisy     Floating   Medium    Kalman Filter
Selective      Multiple  Floating   High      Feature Selection

RCML Representation

[Diagram blocks]
  Start/Initialize
  BCast
  Time-Update (“Predict”): for each C value in PredictionVector (i=4)
    Update error covariance
    Next-state prediction
  Gather
  Report Current Results
  Measurement-Update (“Correct”): for each D value in MeasurementVector (i=4)
    Compute Kalman gain
    Correct prediction
    Update error covariance
  Time-Step Advance
  Generate Sensor Readings

  Data Set: PredictionVector
    Element Type: fixed Position, fixed Acceleration
    Num Elements := 4
  Data Set: MeasurementVector
    Element Type: fixed Position, fixed Acceleration
    Num Elements := 4
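The two phases named in the diagram can be illustrated with a scalar filter. This is a sketch under simplifying assumptions, not the RCML model itself: a one-dimensional state modeled as constant between samples, with illustrative noise terms q and r and hypothetical function names.

```python
# Scalar sketch of the two-phase loop named in the diagram.
# Assumptions: a one-dimensional state that is modeled as constant between
# samples; q (process noise) and r (sensor noise) are illustrative values.

def time_update(x, p, q):
    """Predict: project the state and its error covariance forward."""
    x_pred = x            # next-state prediction (constant model)
    p_pred = p + q        # update error covariance
    return x_pred, p_pred

def measurement_update(x_pred, p_pred, z, r):
    """Correct: fold one measurement z into the prediction."""
    k = p_pred / (p_pred + r)          # compute Kalman gain
    x = x_pred + k * (z - x_pred)      # correct prediction
    p = (1.0 - k) * p_pred             # update error covariance
    return x, p, k

x, p = 0.0, 1.0                         # start/initialize
for z in (2.0, 2.0, 2.0):               # stand-in for generated sensor readings
    x_pred, p_pred = time_update(x, p, q=0.0)
    x, p, k = measurement_update(x_pred, p_pred, z, r=1.0)
    print(f"gain={k:.3f} estimate={x:.3f} covariance={p:.3f}")
```

The gain computation is the only step that divides; precomputing the gain removes that division from the per-sample loop.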

Kalman Filter
  Estimates state of a dynamic system in a noisy environment
    In this case, the ‘dynamic system’ is a moving target
  Commonly used in target prediction and can be extended to multiple dimensions, targets, and models
  Assumes sensor noise is white Gaussian noise
  Requires a pre-programmed model describing the target’s motion
  Works in a continuous 2-cycle loop
  Developed in 1960 by Rudolf E. Kalman (a UF professor from 1971 to 1992!)

Kalman Filter can be viewed as a simple black box
  An input stream of samples measuring a target’s position is contaminated with noisy samples
  The output is a stream of samples with most of the noisy samples filtered out

[Figure: Accurate Samples and Noisy Samples feed the Kalman Filter, which outputs Mostly Accurate Samples. Kalman System Models shown: -9.8 m/s, NE wind at 23 mph, Follows Road]
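The black-box behavior can be reproduced with simulated data. The sketch below assumes a stationary target, white Gaussian sensor noise, and illustrative noise settings; none of these values come from the design, and kalman_stream is a hypothetical helper.

```python
import random

# Black-box view: noisy position samples in, smoothed estimates out.
# Assumptions: a stationary target at TRUE_POS, white Gaussian sensor noise,
# and illustrative noise settings Q and R; none of these come from the design.
TRUE_POS = 10.0
SIGMA = 1.0          # standard deviation of the simulated sensor noise
Q, R = 1e-4, SIGMA ** 2

def kalman_stream(samples, x=0.0, p=1e3):
    """Yield one filtered estimate per incoming sample (scalar filter)."""
    for z in samples:
        p += Q                      # predict (constant-position model)
        k = p / (p + R)             # Kalman gain
        x += k * (z - x)            # correct with the new sample
        p *= (1.0 - k)              # shrink the error covariance
        yield x

def rms_error(values):
    return (sum((v - TRUE_POS) ** 2 for v in values) / len(values)) ** 0.5

rng = random.Random(0)              # fixed seed so the run is repeatable
noisy = [TRUE_POS + rng.gauss(0.0, SIGMA) for _ in range(200)]
filtered = list(kalman_stream(noisy))
print(f"raw RMS error:      {rms_error(noisy):.3f}")
print(f"filtered RMS error: {rms_error(filtered):.3f}")
```

The filtered stream has the same length as the input, one estimate per sample, which matches the streaming interface described on the slide.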

Reasons for sensor noise
  Battery power
    Variable battery voltage
    Voltage regulators cost money, draw power, and are not perfect
  Sensors
    Low-quality sensors
      Cost-cutting for mass production sometimes requires cheap sensors
    Incorrectly deployed sensors
      Bad orientation, obstructed sensor
  Environment
    Environmental conditions
      Rain, dust, night-time tracking, snow
  Multiple targets
    Misinterpreted samples from neighboring targets during multiple-target tracking
    Sensor processing stage must ensure proper target isolation
  Wireless signal
    Bad data from neighboring sensors due to a weak wireless signal

Kalman Filter example

PR Virtual Architecture with Kalman Filters
  Sensor records samples
  Image processing step extracts specific features
    Target size, vertical position, horizontal position, target bearing, elevation, etc.
  Kalman filters remove sensor noise
  Results are sent to a central location to be displayed

[Figure: Communication architecture on a VLX25, showing a Sensor Interface, a Display Interface, Switch 1 through Switch 5, and five Module interfaces, each holding a Kalman filter]

FPGA and PR benefits for the Kalman Filter
  FPGA-amenable features
    Low memory requirements
      Simple filter with streaming inputs and outputs
    Can be implemented using only logic and MAC units
      Requires only multiplication and addition
      No complex time-consuming operations such as division, square-root, differentiation, etc.
    Low bandwidth requirements
      Filter receives/produces a stream of coordinates, not a stream of images
  PR-amenable features
    Optimum resource usage
      The right filter type for the right job
    Swapping modules does not halt execution
      Active filters are never disturbed
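To illustrate why multiply-accumulate (MAC) units are sufficient, here is a sketch of a fixed-point MAC applied to one prediction term. The Q16.16 number format and the helper names are assumptions chosen for illustration; the word widths of the actual design are not stated here.

```python
# Sketch of a fixed-point multiply-accumulate (MAC).
# The Q16.16 format is an assumption chosen for illustration; the source
# does not state the word widths used in the design.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def to_fixed(value: float) -> int:
    """Convert a float to Q16.16 (host-side helper, not part of the MAC)."""
    return int(round(value * ONE))

def to_float(value: int) -> float:
    """Convert Q16.16 back to a float (host-side helper)."""
    return value / ONE

def mac(acc: int, a: int, b: int) -> int:
    """acc + a*b, with the product shifted back down to Q16.16."""
    return acc + ((a * b) >> FRAC_BITS)

# Example: predicted position = position + velocity * dt
position = to_fixed(12.5)
velocity = to_fixed(-3.25)
dt = to_fixed(0.5)
predicted = mac(position, velocity, dt)
print(to_float(predicted))
```

The update itself is one integer multiply, one shift, and one add, which maps directly onto a dedicated multiplier followed by an adder.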

Experimental FPGA Power Measurements

GiDEL Host Specifications
  Dual Xeon 3.00 GHz processors (Pentium 4 era)
  2 GB RAM
  Single 500 GB hard drive
  CD drive
  600 W max power supply (Kappa clone)
ProcStar II Power Characteristics
  Main board supply rated at 7.6 A at 3.3 V
    7.6 A × 3.3 V = 25.08 W maximum power available to:
      Stratix II EP2S180 FPGA (4x)
      2 GB SODIMM DDR memory (2x) (only 1 used for tests)
      64 MB SRAM memory (8x)
      Miscellaneous oscillators, peripherals, controllers, etc.
    This means roughly 5 W max available to each FPGA
Test Design Characteristics
  Kalman tracking filters
    Heavy multiplier usage, no block RAMs, minimal logic usage (w/ dedicated multipliers)
  In all cases, design runs at 33 MHz
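The board power budget can be checked directly from the rated supply figures. The even four-way split below is only an upper bound that ignores every other component; reading the roughly 5 W figure as that bound minus the share taken by memories and peripherals is an interpretation, not something the source states.

```python
# Board power budget from the rated supply figures on the slide.
SUPPLY_CURRENT_A = 7.6
SUPPLY_VOLTAGE_V = 3.3
NUM_FPGAS = 4

board_budget_w = SUPPLY_CURRENT_A * SUPPLY_VOLTAGE_V

# Upper bound per FPGA if nothing else on the board drew power.
# Assumption: memories, oscillators, and controllers take the remainder,
# which would bring the usable figure down toward the ~5 W quoted above.
per_fpga_upper_bound_w = board_budget_w / NUM_FPGAS

print(f"{board_budget_w:.2f} W total")
print(f"{per_fpga_upper_bound_w:.2f} W per FPGA (upper bound)")
```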

Experimental Setup

Methodology
  GiDEL host system measured without FPGA board
    P3 Kill-A-Watt AC power meter used for measurements
      0.2% documented accuracy
      Accurate to within 1 Watt
    7 different test cases with varying power utilization
  GiDEL host system measured with FPGA board
    Same 7 test cases were used (without loading an FPGA design)
    This provides minimum power-use baseline for ProcStar II
  GiDEL board is loaded with a computationally intensive FPGA design
    CPU is kept idle
    Power consumption under regular design is measured (@ 33 MHz)
      2% logic use (per FPGA)
      15% multiplier use (per FPGA)
      1 filter instance per FPGA
    Power consumption under maximum-multiplier-use design is measured (@ 33 MHz)
      4% logic use
      88% multiplier use
      7 filter instances per FPGA
    Power consumption under maximum-logic-use design is measured (@ 33 MHz)
      77% logic use
      0% multiplier use
      34 filter instances per FPGA

Results: Baseline ProcStar II

Test Cases                                 Without ProcStar II   With ProcStar II
1. Server off (not standby)                8 W                   8 W
2. Idle                                    127 W                 137 W
3. Idle with CDROM spinning                131 W                 141 W
4. Full HDD load (defrag)                  132 W                 143 W
5. Full CPU load (1 thread)                188 W                 198 W
6. Full CPU load (4 threads)               255 W                 257 W
7. Full CPU/HDD load (3 threads, defrag)   258 W                 264 W

  Threads are simple while(1) loops
  Although only 2 cores are present, 4 threads were used to bypass Hyper-Threading and OS scheduling
    HDD load is an exception since defrag requires its own thread to be effective
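The extra draw of the unconfigured board in each test case is the difference between the two columns of the baseline table. A short sketch using the readings exactly as reported (the dictionary layout and shortened labels are illustrative):

```python
# Host power readings (watts) from the baseline table:
# test case -> (without ProcStar II, with ProcStar II and no design loaded)
baseline_w = {
    "Server off": (8, 8),
    "Idle": (127, 137),
    "Idle, CD-ROM spinning": (131, 141),
    "Full HDD load": (132, 143),
    "Full CPU load, 1 thread": (188, 198),
    "Full CPU load, 4 threads": (255, 257),
    "Full CPU/HDD load": (258, 264),
}

# Extra draw attributable to the unconfigured board in each test case.
board_overhead_w = {
    name: with_board - without_board
    for name, (without_board, with_board) in baseline_w.items()
}

for name, delta in board_overhead_w.items():
    print(f"{name:26s} +{delta} W")
```

Since each reading is only accurate to within 1 W, each difference carries an uncertainty of up to about 2 W.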

Results: Kalman Filters on ProcStar II
  Power estimates
    12.5% toggle rate assumed @ 33 MHz
  Experimental numbers below assume FPGAs consume all power (i.e., ProcStar II memories, glue logic, etc. consume 0 W)

  Design 1
    140 W total power
    ~3.25 W per FPGA
    15% mult., 2% logic
    1 filter instance, high Fmax

  Design 2 (maximum multiplier use)
    ~3.25 W per FPGA
    88% mult., 4% logic
    7 filter instances, high Fmax

  Design 3 (maximum logic use)
    ~6.25 W per FPGA
    0% mult., 77% logic
    34 filter instances, low Fmax

Results: Kalman Filter in ProcStar II
  *Measured power is derived by subtracting baseline power consumption on ProcStar II board from measured power consumption and dividing by 4
    Power consumed by board components is not accounted for, so actual FPGA power consumption is lower
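The derivation in the footnote can be written out as a one-line calculation. Which baseline reading was subtracted is not stated explicitly; the sketch assumes the 127 W idle reading of the host without the board, because that value reproduces the ~3.25 W per FPGA reported for the 140 W design. The function name is hypothetical.

```python
NUM_FPGAS = 4

def per_fpga_power_w(measured_total_w: float, baseline_w: float) -> float:
    """Attribute all power above the baseline equally to the four FPGAs."""
    return (measured_total_w - baseline_w) / NUM_FPGAS

# Assumption: the baseline is the 127 W idle reading of the host without
# the board, since that value reproduces the ~3.25 W quoted for Design 1.
design1_w = per_fpga_power_w(measured_total_w=140, baseline_w=127)
print(design1_w)
```

Attributing the whole difference to the FPGAs makes this an upper estimate, since memories and glue logic on the board also draw power.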

Questions?
