October 26, 2006

Parallel Image Processing: Programming and Architecture

IST PhD Lunch Seminar

Wouter Caarls

Quantitative Imaging Group

Why Parallel?

• Processing time
  • Smaller timesteps, more scales, faster response times
• Memory
  • Larger images, more dimensions
• Energy consumption
  • More applications, smaller devices

Data parallelism

• Many image processing operations have locality of reference (segmentation, filtering, distance transforms, etc.)
→ Data parallelism
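
The locality argument above can be illustrated with a small sketch: the image is cut into row strips and each strip is processed on its own. Thresholding stands in for a segmentation step; the function names, the thread pool and the pixel values are assumptions made for illustration, not taken from the talk.

```python
from concurrent.futures import ThreadPoolExecutor

def threshold_strip(strip, level):
    """Binarize one strip of rows; needs no data outside the strip."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in strip]

def parallel_threshold(image, level, workers=2):
    """Split the image into row strips and threshold them concurrently."""
    rows_per_strip = max(1, -(-len(image) // workers))  # ceiling division
    strips = [image[i:i + rows_per_strip]
              for i in range(0, len(image), rows_per_strip)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves strip order, so the results can be concatenated.
        results = list(pool.map(lambda s: threshold_strip(s, level), strips))
    return [row for strip in results for row in strip]

# Hypothetical 4x3 grey-value image.
image = [[10, 200, 30], [250, 40, 180], [90, 120, 60], [130, 20, 255]]
binary = parallel_threshold(image, level=128)
```

A pixel operation needs no data from neighbouring strips. A filter with a neighbourhood would also need a border of overlapping rows per strip, which this sketch leaves out.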

Task farm parallelism

• An application consists of many different operations
• Some of these operations are independent (scale spaces, parameter sweeps, noise realizations, etc.)
→ Task farm parallelism
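
As a rough sketch of a task farm, the parameter sweep below hands each independent task to a worker pool and collects the results. The operation `count_above`, the helper `sweep` and all values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def count_above(image, level):
    """One independent task: count the pixels at or above a threshold."""
    return sum(pixel >= level for row in image for pixel in row)

def sweep(image, levels, workers=4):
    """Farm the independent tasks out to a pool and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {level: pool.submit(count_above, image, level)
                   for level in levels}
        return {level: future.result() for level, future in futures.items()}

# Hypothetical image and sweep values.
image = [[10, 200, 30], [250, 40, 180], [90, 120, 60], [130, 20, 255]]
counts = sweep(image, levels=[50, 100, 150, 200])
```

Because no task reads another task's result, the pool is free to run them in any order and on any worker.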

Pipeline parallelism

• An image processing algorithm consists of consecutive stages
• If multiple objects are to be processed, they may be in different stages at the same time
→ Pipeline parallelism
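
The following sketch shows one way to realize such a pipeline, assuming one thread per stage and FIFO queues between stages. While one frame is being counted, the next can already be binarized. The stage functions and frames are made up for illustration.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the stream

def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item until STOP arrives."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # pass the end marker on to the next stage
            return
        outbox.put(func(item))

def run_pipeline(funcs, items):
    """Connect the stages with queues and push all items through them."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [threading.Thread(target=stage,
                                args=(func, queues[i], queues[i + 1]))
               for i, func in enumerate(funcs)]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)
    results = []
    while True:
        out = queues[-1].get()
        if out is STOP:
            break
        results.append(out)
    for thread in threads:
        thread.join()
    return results

def remove_offset(frame):
    return [p - min(frame) for p in frame]

def binarize(frame):
    return [1 if p >= 100 else 0 for p in frame]

# Three hypothetical one-row frames flowing through three stages.
frames = [[10, 120, 250], [50, 60, 70], [0, 100, 200]]
object_sizes = run_pipeline([remove_offset, binarize, sum], frames)
```

Each stage runs in a single thread and the queues are first-in, first-out, so results come out in the order the frames went in.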

Parallel hardware architectures: Fine grained

• Irregular
  • Superscalar (most modern microprocessors)
  • VLIW (DSPs)
• Regular
  • Vector (supercomputers, MMX)
  • SIMD (graphics processors)
• Custom
  • FPGA

Parallel hardware architectures: Coarse grained

• Homogeneous
  • Multi-core, SMP
  • Cluster
• Heterogeneous
  • Embedded systems
  • Grid

Obstacles

• Programming
  • Synchronization, bookkeeping
  • Different systems, languages, optimization strategies
• Choosing an architecture
  • Requires analyzing the program before it is written
  • Additional requirements or unexpected performance may require a rewrite

Architecture-independent parallel programming

• Data parallelism
  • Differentiate between synchronization pattern and computation
  • Library provides pattern, user provides computation
• Task farm & pipeline parallelism
  • Operations do not work on images, but on streams
  • Sequences of operation calls do not imply an order, but a stream graph
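
The split between pattern and computation can be sketched as follows. `pixel_skeleton` plays the role of the library and `invert` the role of the user code. Both names, and the choice of a thread pool as the parallel target, are assumptions and not the talk's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def pixel_skeleton(kernel, image, workers=1):
    """Library side: apply kernel to every pixel; rows may run concurrently."""
    def do_row(row):
        return [kernel(pixel) for pixel in row]
    if workers == 1:
        return [do_row(row) for row in image]       # sequential target
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(do_row, image))        # parallel target

def invert(pixel):
    """User side: the computation for a single pixel, with no bookkeeping."""
    return 255 - pixel

# Hypothetical 2x2 image, run on both targets.
image = [[0, 64], [128, 255]]
sequential = pixel_skeleton(invert, image)
parallel = pixel_skeleton(invert, image, workers=2)
```

The user function is identical for both targets; only the library decides how the pixels are distributed.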

Algorithmic Skeletons

[Figure: two diagrams of the form "A + B = C"; the images are not recoverable from the extraction]

Example skeletons

• Pixel
• Neighbourhood
• Recursive neighbourhood
• Stack
• Filter
• Associative reduction
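
Two of the skeletons listed above are sketched below under simplifying assumptions: a neighbourhood skeleton with clamped image borders, and an associative reduction that reduces row chunks separately before combining them. The function names and the border handling are illustrative choices, not the talk's definitions.

```python
from functools import reduce

def neighbourhood_skeleton(kernel, image, radius=1):
    """Call kernel with the window around each pixel (edges are clamped)."""
    height, width = len(image), len(image[0])
    def window(y, x):
        return [image[min(max(y + dy, 0), height - 1)]
                     [min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)]
    return [[kernel(window(y, x)) for x in range(width)]
            for y in range(height)]

def reduction_skeleton(op, image, chunks=2):
    """Reduce row chunks separately, then combine; valid if op is associative."""
    rows_per_chunk = max(1, -(-len(image) // chunks))  # ceiling division
    partials = [reduce(op, [p for row in image[i:i + rows_per_chunk]
                            for p in row])
                for i in range(0, len(image), rows_per_chunk)]
    return reduce(op, partials)

# Hypothetical 3x3 image.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
dilated = neighbourhood_skeleton(max, image)            # 3x3 maximum filter
total = reduction_skeleton(lambda a, b: a + b, image)   # sum of all pixels
brightest = reduction_skeleton(max, image)
```

Because the operator is associative, the partial results could be computed on different processors without changing the outcome.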

Constructing stream graphs

• By program (dynamic)

capture(orig);
normalize(orig, norm);
dx(orig, x_der, 1.0);
dy(orig, y_der, 1.0);
direction(x_der, y_der, dir);
display(dir);

• Visually (static)

[Figure: stream graph with the nodes capture, normalize, dx, dy, direction and display]
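
A minimal sketch of dynamic graph construction: each call is recorded as a node rather than executed, and edges are derived from the stream names. The `StreamGraph` class is hypothetical, and so is the reading of which arguments are inputs and which are outputs. The numeric parameters are omitted.

```python
class StreamGraph:
    """Records operation calls as graph nodes instead of running them."""

    def __init__(self):
        self.nodes = []      # (operation name, inputs, outputs) in call order
        self.producer = {}   # stream name -> index of the node that writes it

    def call(self, name, inputs=(), outputs=()):
        self.nodes.append((name, tuple(inputs), tuple(outputs)))
        for stream in outputs:
            self.producer[stream] = len(self.nodes) - 1

    def edges(self):
        """Derive producer -> consumer edges from the stream names."""
        return [(self.nodes[self.producer[stream]][0], name)
                for name, inputs, _ in self.nodes for stream in inputs]

# Assumed reading of the call sequence on this slide.
g = StreamGraph()
g.call("capture", outputs=["orig"])
g.call("normalize", inputs=["orig"], outputs=["norm"])
g.call("dx", inputs=["orig"], outputs=["x_der"])
g.call("dy", inputs=["orig"], outputs=["y_der"])
g.call("direction", inputs=["x_der", "y_der"], outputs=["dir"])
g.call("display", inputs=["dir"])
graph_edges = g.edges()
```

In the resulting graph there is no edge between dx and dy, so a runtime is free to execute them in parallel even though the calls were written one after the other.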

Mapping stream graphs to processors

[Figure: stream graph mapped onto Processor 1 and Processor 2]

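Dealing with heterogeneous tasks

[Figure: two alternative mappings of the stream graph onto Processor 1 and Processor 2, annotated with per-task numbers; the first ends with the values 4 and 6, the second with 5 and 5]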
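
The mapping problem can be sketched as an exhaustive search over all task-to-processor assignments, which is only feasible for small graphs. The cost table is entirely hypothetical, tasks are treated as independent, and nothing here is claimed to be the mapping algorithm used in the talk.

```python
from itertools import product

def best_mapping(costs):
    """Try every task-to-processor assignment and keep the smallest makespan.

    costs[task][p] is the (hypothetical) run time of task on processor p.
    Tasks are treated as independent; dependencies are ignored here.
    """
    tasks = list(costs)
    n_procs = len(next(iter(costs.values())))
    best = None
    for assignment in product(range(n_procs), repeat=len(tasks)):
        load = [0] * n_procs
        for task, proc in zip(tasks, assignment):
            load[proc] += costs[task][proc]
        if best is None or max(load) < best[0]:
            best = (max(load), dict(zip(tasks, assignment)), load)
    return best

# Hypothetical run times as (processor 0, processor 1).
costs = {
    "capture": (1, 1),
    "dx": (2, 4),
    "dy": (2, 4),
    "direction": (6, 3),
    "display": (1, 1),
}
makespan, mapping, load = best_mapping(costs)
```

With these numbers the search places the derivative filters on the processor that runs them fastest and `direction` on the other, which is the kind of decision a heterogeneous system requires.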

Dealing with interconnect

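[Figure: two alternative mappings of the stream graph onto Processor 1, Processor 2 and the Interconnect, annotated with per-task numbers; the values are not reliably recoverable from the extraction]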
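
One simple way to account for the interconnect, assumed here for illustration, is to treat it as a third resource: every stream that crosses between processors adds its transfer time to the interconnect load, and the busiest resource limits the throughput. All costs and transfer times below are hypothetical.

```python
def resource_loads(costs, transfer, mapping, n_procs=2):
    """Per-frame load of each processor and of the interconnect."""
    load = [0] * n_procs
    for task, proc in mapping.items():
        load[proc] += costs[task][proc]
    # A stream uses the interconnect only if its two ends are on different
    # processors.
    link = sum(time for (src, dst), time in transfer.items()
               if mapping[src] != mapping[dst])
    return load, link

def bottleneck(costs, transfer, mapping, n_procs=2):
    """Load of the busiest resource, which limits the frame rate."""
    load, link = resource_loads(costs, transfer, mapping, n_procs)
    return max(load + [link])

# Hypothetical run times as (processor 0, processor 1) and transfer times.
costs = {"capture": (1, 1), "dx": (2, 4), "dy": (2, 4),
         "direction": (6, 3), "display": (1, 1)}
transfer = {("capture", "dx"): 2, ("capture", "dy"): 2,
            ("dx", "direction"): 2, ("dy", "direction"): 2,
            ("direction", "display"): 2}

balanced = {"capture": 1, "dx": 0, "dy": 0, "direction": 1, "display": 1}
clustered = {"capture": 0, "dx": 0, "dy": 0, "direction": 1, "display": 1}
balanced_cost = bottleneck(costs, transfer, balanced)
clustered_cost = bottleneck(costs, transfer, clustered)
```

In this example the mapping with the better processor balance is the worse choice overall, because four of its streams cross the interconnect.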

Dealing with dependencies

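[Figure: two alternative mappings of the stream graph onto Processor 1, Processor 2 and the Interconnect; the annotations combine parenthesized and plain numbers, such as "(3)+4" and "3+4", and are not fully recoverable from the extraction]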
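
The effect of dependencies can be sketched by simulating a single frame: a task starts only when its processor is free and all of its inputs have arrived, with a fixed delay for inputs that come from the other processor. The cost table, the transfer delay and the static execution order are assumptions made for this example.

```python
def finish_times(order, costs, preds, mapping, transfer=0):
    """Simulate one frame and return the finish time of every task.

    A task starts when its processor is free and all its inputs have
    arrived. `order` must be a topological order of the graph.
    """
    free = {}   # processor -> time at which it becomes idle
    done = {}   # task -> finish time
    for task in order:
        proc = mapping[task]
        # Inputs from another processor arrive `transfer` time units later.
        ready = max((done[p] + (transfer if mapping[p] != proc else 0)
                     for p in preds.get(task, ())), default=0)
        start = max(ready, free.get(proc, 0))
        done[task] = start + costs[task][proc]
        free[proc] = done[task]
    return done

# Hypothetical run times as (processor 0, processor 1) and dependencies.
order = ["capture", "dx", "dy", "direction", "display"]
costs = {"capture": (1, 1), "dx": (2, 4), "dy": (2, 4),
         "direction": (6, 3), "display": (1, 1)}
preds = {"dx": ["capture"], "dy": ["capture"],
         "direction": ["dx", "dy"], "display": ["direction"]}

single = finish_times(order, costs, preds, {task: 0 for task in order})
split = finish_times(order, costs, preds,
                     {"capture": 0, "dx": 0, "dy": 1,
                      "direction": 1, "display": 1},
                     transfer=1)
```

In the split mapping, `direction` has to wait for `dy` even though its processor is otherwise idle. Waiting time of this kind only appears once dependencies are modelled.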

Choosing an architecture automatically

• An architecture-independent program allows automatic analysis after it is written, but before an architecture is chosen
• Based on certain constraints, an architecture can be chosen automatically to optimize some cost function
• The tradeoff between cost, power and performance must be made by the designer

Design Space Exploration

[Figure: design space exploration diagram with the elements Program, Architecture, Analyze, Metrics and Explore]

Search strategy: Constrained single objective

[Figure: plot with the axes performance and cost, with a minimum performance constraint marked]
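
A constrained single-objective search can be sketched as picking the cheapest candidate that meets a performance floor. The candidate architectures and their numbers are invented for illustration. In practice the metrics would come from analyzing the program on each architecture.

```python
def cheapest_feasible(candidates, min_performance):
    """Return the lowest-cost candidate that meets the performance floor."""
    feasible = [c for c in candidates if c["performance"] >= min_performance]
    return min(feasible, key=lambda c: c["cost"]) if feasible else None

# Hypothetical architectures with made-up metrics.
candidates = [
    {"name": "A", "cost": 10, "performance": 20},
    {"name": "B", "cost": 25, "performance": 45},
    {"name": "C", "cost": 40, "performance": 50},
    {"name": "D", "cost": 30, "performance": 35},
]
choice = cheapest_feasible(candidates, min_performance=40)
no_choice = cheapest_feasible(candidates, min_performance=60)
```

The constraint turns the two-dimensional tradeoff into a one-dimensional minimization, at the price of fixing the performance requirement in advance.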

Search strategy: Multiobjective tradeoff iteration

[Figure: plot with the axes performance and cost]

Search strategy: Strength Pareto

[Figure: plot with the axes performance and cost]
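
Assuming the slide title refers to the strength Pareto family of evolutionary algorithms, the sketch below shows only the dominance test and an SPEA2-style raw fitness, in which each candidate is penalized by the strengths of the candidates that dominate it. The evolutionary search and density estimation are left out, and the candidate data is invented.

```python
def dominates(a, b):
    """a dominates b: no worse in cost and performance, better in one."""
    return (a["cost"] <= b["cost"] and a["performance"] >= b["performance"]
            and (a["cost"] < b["cost"] or a["performance"] > b["performance"]))

def raw_fitness(candidates):
    """SPEA2-style raw fitness: sum of the strengths of all dominators."""
    # Strength of a candidate = number of candidates it dominates.
    strength = [sum(dominates(a, b) for b in candidates) for a in candidates]
    return [sum(strength[j] for j, b in enumerate(candidates)
                if dominates(b, a))
            for a in candidates]

# Hypothetical architectures with made-up metrics.
candidates = [
    {"name": "A", "cost": 10, "performance": 20},
    {"name": "B", "cost": 25, "performance": 45},
    {"name": "C", "cost": 40, "performance": 50},
    {"name": "D", "cost": 30, "performance": 35},
    {"name": "E", "cost": 45, "performance": 40},
]
fitness = raw_fitness(candidates)
pareto_front = [c["name"] for c, f in zip(candidates, fitness) if f == 0]
```

Candidates with a raw fitness of zero are not dominated by any other candidate. They form the Pareto front from which the designer makes the final tradeoff.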

Conclusions

Architecture-independent programming allows
• Parallel programming without bookkeeping
• Targeting heterogeneous systems
• Choosing the most appropriate architecture automatically

http://www.qi.tnw.tudelft.nl/~wcaarls/smartcam