October 26, 2006
Parallel Image Processing: Programming and Architecture
IST PhD Lunch Seminar
Wouter Caarls
Quantitative Imaging Group
Why Parallel?
• Processing time
  • Smaller timesteps, more scales, faster response times
• Memory
  • Larger images, more dimensions
• Energy consumption
  • More applications, smaller devices
Data parallelism
• Many image processing operations have locality of reference (segmentation, filtering, distance transforms, etc.)
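Locality of reference means each output pixel depends only on a small neighbourhood, so the image can be cut into strips that are processed independently. The following is a minimal sketch under assumptions, not the speaker's implementation: the function names (`box3`, `filter_rows`, `parallel_filter`) are hypothetical, the image is a plain list of rows, and threads are used only to show the structure (in CPython they will not speed up pure-Python arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor


def box3(image, y, x):
    """Mean of the 3x3 neighbourhood around (y, x), ignoring out-of-image pixels."""
    h, w = len(image), len(image[0])
    total, count = 0, 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                total += image[yy][xx]
                count += 1
    return total / count


def filter_rows(image, start, stop):
    """Filter output rows start..stop-1; reads at most one extra row on each side."""
    width = len(image[0])
    return [[box3(image, y, x) for x in range(width)] for y in range(start, stop)]


def parallel_filter(image, workers=4):
    """Split the output rows into strips and filter each strip in its own task."""
    h = len(image)
    step = max(1, -(-h // workers))  # ceiling division, at least one row per strip
    bounds = [(s, min(s + step, h)) for s in range(0, h, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        strips = pool.map(lambda b: filter_rows(image, b[0], b[1]), bounds)
        return [row for strip in strips for row in strip]
```

Because every strip only reads a one-row border beyond its own rows, the strips need no synchronization while they run; on a distributed-memory machine that border is the data that would have to be exchanged.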
Task farm parallelism
• An application consists of many different operations
• Some of these operations are independent (scale spaces, parameter sweeps, noise realizations, etc.)
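A parameter sweep is a simple case of a task farm: every parameter value yields a task that shares no state with the others. The sketch below is illustrative only; `smooth` and `scale_space` are hypothetical names, and a 1-D moving average stands in for a real smoothing filter.

```python
from concurrent.futures import ThreadPoolExecutor


def smooth(signal, radius):
    """Moving average with the given radius; each call is independent of the others."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def scale_space(signal, radii, workers=4):
    """Farm out one smoothing task per radius and collect the results by radius."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {r: pool.submit(smooth, signal, r) for r in radii}
        return {r: f.result() for r, f in futures.items()}
```

The workers pull tasks from a shared pool, so the farm balances itself even when some parameter values take longer than others.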
Pipeline parallelism
• An image processing algorithm consists of consecutive stages
• If multiple objects are to be processed, they may be in different stages at the same time
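One way to picture pipeline parallelism is a chain of workers connected by queues, each worker running one stage. This is a minimal sketch with hypothetical names (`stage`, `run_pipeline`), assuming one thread per stage and unbounded queues.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Run the items through funcs, one thread per stage; input order is preserved."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

While the second stage works on the first object, the first stage can already start on the second object, which is the overlap the slide describes.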
Parallel hardware architectures: Fine grained
• Irregular
  • Superscalar (most modern microprocessors)
  • VLIW (DSPs)
• Regular
  • Vector (supercomputers, MMX)
  • SIMD (graphics processors)
• Custom
  • FPGA
Parallel hardware architectures: Coarse grained
• Homogeneous
  • Multi-core, SMP
  • Cluster
• Heterogeneous
  • Embedded systems
  • Grid
Obstacles
• Programming
  • Synchronization, bookkeeping
  • Different systems, languages, optimization strategies
• Choosing an architecture
  • Analyze program before it is written
  • Additional requirements or unexpected performance may require rewrite
Architecture-independent parallel programming
• Data parallelism
  • Differentiate between synchronization pattern and computation
  • Library provides pattern, user provides computation
• Task farm & pipeline parallelism
  • Operations do not work on images, but on streams
  • Sequences of operation calls do not imply an order, but a stream graph.
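The split between pattern and computation can be sketched as higher-order functions: the library owns the traversal, the user passes in a per-pixel or per-window function. The names below (`pixel_skeleton`, `neighbourhood_skeleton`) are hypothetical and are not the API of the library discussed in the talk; the loops are sequential here, but a library could replace them with a parallel traversal without changing the user's function.

```python
def pixel_skeleton(op, image):
    """Pattern: visit every pixel independently. Computation: op(value)."""
    return [[op(v) for v in row] for row in image]


def neighbourhood_skeleton(op, image, radius=1):
    """Pattern: gather each pixel's window, clipped at the edges. Computation: op(window)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(op(window))
        out.append(row)
    return out
```

With these two skeletons, a threshold is `pixel_skeleton(lambda v: v > 4, image)` and a grey-value dilation is `neighbourhood_skeleton(max, image)`; neither call mentions how the work is distributed.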
Example skeletons
• Pixel
• Neighbourhood
• Recursive neighbourhood
• Stack
• Filter
• Associative reduction
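As an illustration of the last entry, an associative reduction can be computed as partial results per chunk that are then combined, because associativity makes the grouping irrelevant. The sketch below assumes that reading of the skeleton; `reduce_skeleton` is a hypothetical name, and rows are used as chunks.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def reduce_skeleton(op, image, workers=4):
    """Reduce all pixels with an associative op: per-row partials, then combine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda row: reduce(op, row), image))
    return reduce(op, partials)
```

For example, `reduce_skeleton(max, image)` gives the image maximum; a non-associative operator such as subtraction would not be safe to split this way.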
Constructing stream graphs
• By program (dynamic)
  capture(orig);
  normalize(orig, norm);
  dx(orig, x_der, 1.0);
  dy(orig, y_der, 1.0);
  direction(x_der, y_der, dir);
  display(dir);
• Visually (static)
  [Figure: stream graph with nodes capture, normalize, dx, dy, direction, display]
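To show how a sequence of calls can describe a graph rather than an execution order, the sketch below records each operation with the streams it reads and writes, and derives the edges from shared stream names. `StreamGraph` and its methods are hypothetical, and which arguments are inputs and which are outputs is an assumed reading of the calls on the slide.

```python
class StreamGraph:
    """Records operations and derives producer-to-consumer edges from stream names."""

    def __init__(self):
        self.nodes = []     # (operation name, input streams, output streams)
        self.producer = {}  # stream name -> index of the node that writes it

    def op(self, name, inputs=(), outputs=()):
        """Register an operation; nothing is executed at this point."""
        index = len(self.nodes)
        self.nodes.append((name, tuple(inputs), tuple(outputs)))
        for stream in outputs:
            self.producer[stream] = index
        return index

    def edges(self):
        """(producer, consumer) pairs implied by shared stream names."""
        result = []
        for name, inputs, _ in self.nodes:
            for stream in inputs:
                result.append((self.nodes[self.producer[stream]][0], name))
        return result


# The calls from the slide, with assumed input/output roles.
g = StreamGraph()
g.op("capture", outputs=["orig"])
g.op("normalize", inputs=["orig"], outputs=["norm"])
g.op("dx", inputs=["orig"], outputs=["x_der"])
g.op("dy", inputs=["orig"], outputs=["y_der"])
g.op("direction", inputs=["x_der", "y_der"], outputs=["dir"])
g.op("display", inputs=["dir"])
```

In the resulting graph, `normalize`, `dx` and `dy` depend only on `capture`, so a scheduler is free to run them side by side even though the calls were written one after another.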
Dealing with heterogeneous tasks
[Figure: example with Processor 1 and Processor 2]
Dealing with interconnect
[Figure: example with Processor 1, Processor 2 and Interconnect]
Dealing with dependencies
[Figure: example with Processor 1, Processor 2 and Interconnect]
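The slides present these three concerns only as figures, and the method used there is not stated. As a generic illustration, a greedy list scheduler can account for all three at once: per-processor task costs (heterogeneous tasks), a transfer delay when data crosses processors (interconnect), and predecessor finish times (dependencies). Everything below is an assumption: the function name, the data layout, and the simplification that the interconnect is a fixed delay without contention.

```python
def schedule(tasks, deps, cost, comm):
    """Greedy list scheduling: place each task where it finishes earliest.

    tasks: task names in a valid topological order
    deps:  task -> list of predecessor tasks
    cost:  cost[task][p] is the run time of task on processor p
    comm:  transfer delay when a predecessor ran on a different processor
    Returns (placement, finish, makespan).
    """
    n_proc = len(next(iter(cost.values())))
    free_at = [0] * n_proc  # time at which each processor becomes idle
    placement, finish = {}, {}
    for t in tasks:
        best = None
        for p in range(n_proc):
            # Inputs are ready once every predecessor is done and transferred.
            ready = max(
                [finish[d] + (comm if placement[d] != p else 0) for d in deps.get(t, [])],
                default=0,
            )
            end = max(ready, free_at[p]) + cost[t][p]
            if best is None or end < best[0]:
                best = (end, p)
        finish[t], placement[t] = best
        free_at[best[1]] = best[0]
    return placement, finish, max(finish.values())
```

With made-up costs `{"a": [2, 4], "b": [3, 1], "c": [2, 2]}`, task `c` depending on `a` and `b`, and a transfer delay of 1, the scheduler puts `a` and `c` on the first processor and `b` on the second. A greedy choice like this is not guaranteed to be optimal.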
Choosing an architecture automatically
• An architecture-independent program allows automatic analysis after it is written, but before an architecture is chosen
• Based on certain constraints, an architecture can be chosen automatically to optimize some cost function
• The tradeoff between cost, power and performance must be made by the designer
Search strategy: Constrained single objective
[Figure: performance plotted against cost, with a minimum performance threshold]
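A constrained single-objective search can be read as: discard every candidate that misses the performance constraint, then minimize cost over what remains. The sketch below assumes that reading and an exhaustive list of already-evaluated candidates; the function name and the dictionary keys are hypothetical, and any numbers passed in are the caller's own estimates.

```python
def choose_architecture(candidates, min_performance):
    """Cheapest candidate whose performance meets the constraint, or None if none does."""
    feasible = [c for c in candidates if c["performance"] >= min_performance]
    return min(feasible, key=lambda c: c["cost"], default=None)
```

Returning `None` when nothing is feasible makes the designer's role explicit: either the constraint is relaxed or more candidate architectures have to be considered.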