Optimizations for intensive signal processing applications on Systems-on-Chip
Calin Glitia
September 6, 2010
PhD defense – Calin Glitia 1/34
Intensive signal processing
Detection systems
Multimedia
Repetitive computations
Considerable amount of data ⇒ Multi-dimensional arrays
Modular decomposition
Data-flow oriented modeling
Logical parallelism
1 Task parallelism and pipeline
2 Data parallelism
Complexity:
Assembly of elementary functions
Complex accesses to the data structures
Execution platforms
Systems-on-Chip:
Increase in the integration capacity
Multiprocessors
Architecture models:
Repetitive topologies
Physical parallelism
[Figure: Multiprocessor SoC block diagram with XAUI, PCIe and GbE interfaces (PHY/MAC, serialize/deserialize), flexible I/O (UART, HPI, I2C, JTAG, SPI) and DDR controllers 0 to 3]
Co-design
Application, Architecture
Mapping
Code Generation
Logical parallelism, Physical parallelism
Distribution
Efficient execution
Optimizations
Outline
Application Architecture
Mapping
Code Generation
Array Oriented Language
Modeling and Analysis of Real-Time Embedded Systems
Inter-repetition dependence
High-level refactoring: data-parallel transformations, strategies
Gaspard2 environment
Some of the existing languages
Data-Flow
1 Synchronous Data Flow
2 Extensions:
Cyclo-Static Data Flow
Multi-dimensional Synchronous Data Flow, Windowed Synchronous Data Flow
Boolean Data Flow, Dynamic Data Flow
3 Array-OL
Functional languages
Alpha: polyhedron, recurrence equations
Sisal, Single Assignment C
PhD defense – Calin Glitia Modeling intensive signal processing applications 7/34
Multidimensional Synchronous DataFlow (MDSDF)
[Figure: MDSDF graph for matrix multiplication. Node and rate labels: B (M,N); transposition T(3, 1, 2) with rates (M,N,1) and (1,M,N); repetition with rates (1,1,1) and (L, 1, 1); A (L,M); repetition with rates (1,1,1) and (1,1,N); Downsample with rates (1,M,1) and (1,1,1); transposition T(1, 3, 2) with rates (L,1,N) and (L,N,1); delay (0, 1, 0)]
Data-flow graph: actors consuming/producing multi-dimensional tokens
Static analysis/schedule
Limitations: multiple data consumptions
Extensions
Array Oriented Language – principles
Similarities with other data-flow languages
Hierarchical decomposition – task parallelism
Data-flow oriented formalism – multi-dimensional data structures
[Figure: Matrix multiplication. Inputs A(L,M) and B(M, N), output C(L, N), repetition space (L, N). The repeated task "Line×Column multiplication" takes two patterns of shape (M) and produces a pattern of shape (1); it is itself decomposed into a × task repeated (M) times on patterns of shape (1), followed by a Sum from (M) to (1)]
Explicit data parallelism: data-parallel repetitions ("loops")
Uniform paving by sub-arrays
Space and time mixed as dimensions of the data structures
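The hierarchical decomposition in the matrix multiplication figure can be read as a nest of data-parallel loops. The following is a minimal Python sketch of that reading, not Array-OL or Gaspard2 code; the function names `line_times_column` and `matmul` and the row-major nested-list layout are assumptions made for illustration.

```python
def line_times_column(line, column):
    """Elementary task: two patterns of shape (M) in, one value out."""
    # Inner repetition: the x task applied (M) times, each instance independent.
    products = [a * b for a, b in zip(line, column)]
    # Sum task: reduces the (M) products to a single element.
    return sum(products)


def matmul(A, B):
    """Repeat line_times_column over the (L, N) repetition space."""
    L, M, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(L)]
    # Every (l, n) instance reads only its own patterns, so the iterations
    # could run in any order or in parallel.
    for l in range(L):
        for n in range(N):
            column = [B[m][n] for m in range(M)]
            C[l][n] = line_times_column(A[l], column)
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)
```

The loop order over `l` and `n` is arbitrary here, which is the point of the data-parallel repetition: the specification fixes the dependences, not the schedule.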
Data-parallel repetitions
Matrix multiplication: N = 3, M = 4, L = 2
[Figure: matrix multiplication with input arrays (3, 4) and (4, 2), output array (3, 2), repetition space (3, 2), input patterns (4) and (4), output pattern (1)]
Line×Column multiplications
Uniformly spaced input/output patterns
DATA-PARALLEL instances
Compact representation: repetition space
Tilers – links between input and output patterns
$F = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $P = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$
$F = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$
$F = (\,)$, $o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $P = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
Uniform sub-array accesses (patterns)
Tiler
o: origin of the reference pattern
F: Fitting matrix – the shape of the tiles in the array
P: Paving matrix – uniform spacing of the tiles
[Figure: a tile placed in a two-dimensional array]
Formal specification:
$\left( \mathbf{o} + \begin{pmatrix} P & F \end{pmatrix} \cdot \begin{pmatrix} \mathbf{r} \\ \mathbf{i} \end{pmatrix} \right) \bmod \mathbf{s}_{\text{array}}$
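Expanding the block product in the formal specification may make the roles of the two matrices easier to see. The bounds on r and i below use the shape names s_repetition and s_pattern from the paving example of this talk; the element name e is an assumption introduced for illustration.

```latex
\[
\forall\, \mathbf{r},\ \mathbf{0} \le \mathbf{r} < \mathbf{s}_{\text{repetition}},\quad
\forall\, \mathbf{i},\ \mathbf{0} \le \mathbf{i} < \mathbf{s}_{\text{pattern}}:\qquad
\mathbf{e}_{\mathbf{r},\mathbf{i}}
  = \left( \mathbf{o} + P\,\mathbf{r} + F\,\mathbf{i} \right) \bmod \mathbf{s}_{\text{array}}
\]
```

Read this way, o + P r is the origin of the tile for repetition index r, F i is the offset of pattern element i inside that tile, and the modulo wraps accesses that cross the array border.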
Common motif-based accesses
Pattern examples
[Figure: five example patterns drawn on small two-dimensional grids]
Paving example
[Figure: nine copies of a 10×5 array (axes 0 to 9 and 0 to 4), one tile per repetition index, from $r = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ to $r = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$]
$F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $s_{\text{pattern}} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}$
$o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $s_{\text{array}} = \begin{pmatrix} 10 \\ 5 \end{pmatrix}$
$P = \begin{pmatrix} 0 & 3 \\ 1 & 0 \end{pmatrix}$, $s_{\text{repetition}} = \begin{pmatrix} 3 \\ 3 \end{pmatrix}$
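The paving example can be replayed with a few lines of Python. This is a minimal sketch assuming the tiler formula (o + P·r + F·i) mod s_array and the numerical values as they read on the slide; the function name `tile_elements` and the constant names are hypothetical.

```python
from itertools import product


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def tile_elements(o, P, F, r, s_pattern, s_array):
    """Array indices read by the tile of repetition index r."""
    # Origin of this tile: reference origin shifted by the paving matrix.
    base = [oi + pi for oi, pi in zip(o, matvec(P, r))]
    elements = []
    for i in product(*(range(n) for n in s_pattern)):
        offset = matvec(F, i)  # position of pattern element i inside the tile
        elements.append(
            tuple((b + f) % s for b, f, s in zip(base, offset, s_array))
        )
    return elements


# Values taken from the paving example (as extracted from the slide).
O = (0, 0)
P = [[0, 3], [1, 0]]
F = [[1, 0], [0, 1]]
S_PATTERN = (5, 3)
S_ARRAY = (10, 5)

first_tile = tile_elements(O, P, F, (0, 0), S_PATTERN, S_ARRAY)
last_tile = tile_elements(O, P, F, (2, 2), S_PATTERN, S_ARRAY)
```

With these values the tile for r = (2, 2) starts at (6, 2) and extends past the right border of the array, so its last column wraps around to index 0 through the modulo.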
Summary ARRAY-OL
Specification
Data-flow oriented visual formalism
Express the regularity of computations/data accesses
Exploit the parallelism
Rules that allow static analysis
Limitations
Numerical values for the multidimensional spaces/accesses
Cycles not allowed in the dependence graph
Extension: inter-repetition dependences
Need for uniform dependences
State construction
Transfer data between different instances of the same repetition.
Examples: Sum, Integrate
[Figure: Integrate, flattened: a chain of additions +[0], +[1], +[2], ... starting from the initial value 0]
Uniform data dependences between instances of a repetition
Uniform dependences
[Figure: Integrate task with input and output arrays of shape (∞), repetition space (∞), a repeated + task with ports pin and pout, dependence vector d = (1) from pout to pin, and a default link providing the initial value 0 of shape ()]
Inter-repetition dependence
1 Data dependence: pout → pin
2 Dependence vector inside the repetition space
3 Initial values: default link for dependences that exit the repetition space
Calin Glitia, Philippe Dumont, and Pierre Boulet. Array-OL with delays, a domain specific specification language for multidimensional intensive signal processing. Multidimensional Systems and Signal Processing, 2009.
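The three ingredients of an inter-repetition dependence (the pout → pin link, the dependence vector d, and the default link) can be mimicked in a short Python sketch. It assumes a finite input list in place of the infinite array of the slide; the function name `integrate` and its parameter names are hypothetical.

```python
def integrate(values, d=1, default=0):
    """Running sum where instance r reads pout of instance r - d."""
    pout = []
    for r, x in enumerate(values):
        source = r - d
        if source >= 0:
            pin = pout[source]  # dependence stays inside the repetition space
        else:
            pin = default       # dependence exits it: use the default link
        pout.append(pin + x)
    return pout


running_sum = integrate([1, 2, 3, 4])        # d = (1), as on the slide
interleaved = integrate([1, 2, 3, 4], d=2)   # two independent chains
```

With d = 2 the even and odd instances form two separate chains, each started from the default value, which hints at how the dependence vector shapes the remaining parallelism.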
Initial values
Initial values – default link
Same initial value
Different values – Tiler
$F = (\,)$, $o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $P = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
Different default links – Exclusive tilers
$F = (1)$, $o = (0)$, $P = \begin{pmatrix} 4 & 0 \end{pmatrix}$
$F = (1)$, $o = (-4)$, $P = \begin{pmatrix} 4 & 0 \end{pmatrix}$
[Figure: repetition space with dependence vector d = (2, 0)]
Complex dependences
Dependence constructions:
Multiple default links
Multiple dependences on a repetition space
Dependences connected through the hierarchy
Dependences on the complete repetition space
Impact on the parallelism
The repetition space is split into parallel hyperplanes
Pipeline execution following the distance vector
Scheduling uniform loops
Alain Darte and Yves Robert. Constructive methods for scheduling uniform loop nests. IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.
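One way to picture the split into parallel hyperplanes is to group the repetition indices by the value of a linear schedule. The sketch below assumes a schedule vector `tau` chosen by hand such that tau·d ≥ 1 for every dependence vector d; it does not implement the constructive method of the cited paper, and the name `wavefronts` is hypothetical.

```python
from collections import defaultdict
from itertools import product


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wavefronts(shape, dependences, tau):
    """Group repetition indices into hyperplanes tau . r = t."""
    for d in dependences:
        if dot(tau, d) < 1:
            raise ValueError(f"schedule {tau} does not respect dependence {d}")
    fronts = defaultdict(list)
    for r in product(*(range(n) for n in shape)):
        fronts[dot(tau, r)].append(r)
    # Fronts run one after another (pipeline); the instances inside one
    # front are independent and may run in parallel.
    return [fronts[t] for t in sorted(fronts)]


one_dep = wavefronts((3, 2), [(1, 0)], tau=(1, 0))
two_deps = wavefronts((3, 3), [(1, 0), (0, 1)], tau=(1, 1))
```

In the second call the fronts are the anti-diagonals of the 3×3 repetition space, so the available parallelism grows and then shrinks as the pipeline fills and drains.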
Outline
Application Architecture
Mapping
Code Generation
Array Oriented Language
Inter-repetition dependence
Modeling and Analysis of Real-Time Embedded Systems
Modeling and Analysis of Real-Time Embedded Systems
UML profile – OMG standard
Model Driven Engineering
Co-design: application, architecture, mapping
Repetitive Structure Modeling
All the ARRAY-OL concepts are included
Proposed by the DaRT team
Model repeated inter-connected architecture topologies
Physical connections between architecture components
Compact expression
Cyclic uniform inter-connections
Summary inter-repetition dependences
1 Expression of state constructions
2 Complex dependences through the hierarchy
3 Parallelism – pipeline
4 Repeated inter-connected architectures
Outline
Application Architecture
Mapping
Code Generation
Array Oriented Language
Modeling and Analysis of Real-Time Embedded Systems
Inter-repetition dependence
High-level refactoring: data-parallel transformations, strategies
PhD defense – Calin Glitia From a high-level specification to the execution 23/34
Execution
Logical space and time as mixed dimensions of multidimensional structures
Specification: expresses the data dependences between all the data elements that transit the system
And a partial execution order between all the executions of the tasks in the system
EFFICIENT execution
Optimized code generation
Projection of the specification into physical space and time
Adapt a specification to the execution
High-level refactoring
Execution that reflects the specification
Projection into space and time
Multi-dimensional structures
repetition spaces ⇓ in space
←→ linked (trade-off)
data structures ⇓ in time
Take into account the execution constraints
Data dependences
Available resources
Projection example
[Figure: before fusion. Horizontal Filter: input array (1920, 1080, ∞), output array (720, 1080, ∞), repetition space (240, 1080, ∞), input pattern (13), output pattern (3). Vertical Filter: output array (720, 480, ∞), repetition space (720, 120, ∞), input pattern (14), output pattern (4)]
Maximal parallelism
Memory size
Infinite data structures – Blocking points
Pipeline
Execution Order
[Figure: after fusion. Input array (1920, 1080, ∞), output array (720, 480, ∞), common repetition space (240, 120, ∞), macro-patterns (14, 13), (3, 14) and (3, 4). Horizontal Filter: repetition (14), patterns (13) and (3). Vertical Filter: repetition (3), patterns (14) and (4)]
Fusion of successive repetitions
Minimize the arrays – macro-patterns
Distribution of the common repetition
Each processor keeps its macro-patterns in memory
Re-computations
When intermediate values are consumed by multiple repetitions
Trade-off
Recompute values
Keep in memory – increase of memory size
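The trade-off between memory size and re-computations can be put into numbers for this example. The arithmetic below is a sketch that assumes the shapes read from the two figures, ignores the infinite time dimension (counts are per frame), and assumes that after fusion the horizontal filter runs 14 times inside each instance of the common repetition; all constant names are hypothetical.

```python
from fractions import Fraction
from math import prod

# Before fusion (per frame).
H_REPETITION = (240, 1080)        # horizontal filter repetition space
INTERMEDIATE_ARRAY = (720, 1080)  # array between the two filters

# After fusion (per frame).
COMMON_REPETITION = (240, 120)    # repetition shared by both filters
H_INNER_REPETITION = 14           # horizontal filter runs per fused instance
INTERMEDIATE_MACRO_PATTERN = (3, 14)

h_runs_before = prod(H_REPETITION)
h_runs_after = prod(COMMON_REPETITION) * H_INNER_REPETITION

# More than 1 means some horizontal results are computed several times.
recompute_factor = Fraction(h_runs_after, h_runs_before)

# Elements stored between the two filters.
stored_before = prod(INTERMEDIATE_ARRAY)
stored_after = prod(INTERMEDIATE_MACRO_PATTERN)

print(h_runs_before, h_runs_after, recompute_factor)  # 259200 403200 14/9
print(stored_before, stored_after)                    # 777600 42
```

Under these assumptions the intermediate storage per fused instance drops from a full 720×1080 frame to 42 elements, at the price of running the horizontal filter 14/9 times as often, because the vertical patterns of 14 lines overlap while being spaced 9 lines apart.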
High-level transformations
Adapt a specification to execution
change the granularity of the repetitions
array size reductions
"High-level" loop transformations
repetition = visual representation of a data-parallel loop nest
fusion, change paving, tiling, collapse, ...
Calin Glitia and Pierre Boulet. High level loop transformations for multidimensional signal processing embedded applications. In International Symposium on Systems, Architectures, Modeling, and Simulation (SAMOS VIII), Samos, Greece, July 2008.
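Since a repetition is a visual form of a data-parallel loop nest, a change of granularity can be pictured as an ordinary loop tiling. The following Python sketch illustrates that analogy on a one-dimensional repetition; it is not the Array-OL transformation itself, and the names `flat_repetition` and `tiled_repetition` are hypothetical.

```python
def flat_repetition(task, n):
    """One repetition of n independent instances."""
    return [task(r) for r in range(n)]


def tiled_repetition(task, n, block):
    """Same instances, regrouped as an (n // block, block) hierarchy."""
    if n % block != 0:
        raise ValueError("block must divide n")
    results = []
    for outer in range(n // block):    # coarse level, e.g. one per processor
        for inner in range(block):     # fine level, run inside one processor
            results.append(task(outer * block + inner))
    return results


def square(r):
    return r * r


flat = flat_repetition(square, 12)
tiled = tiled_repetition(square, 12, 4)
```

Both versions compute the same values; only the hierarchy of the repetition changes, which is what lets the outer level be distributed and the inner level sized to the available memory.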
Optimization strategies – memory size reduction
[Figure: graph of repetitions r1 to r7, with the fused repetitions r12, r14 and r67]
MAXIMAL reduction of the intermediate arrays
Fusion of multiple repetitions
Minimizes only the last intermediate array
Re-computations!
Complete fusion?
Too many re-computations
Limited array reduction
Strategy that limits the re-computations
using the results from the complete fusion and the two-by-two fusions
where re-computations are introduced and the minimal achievable array reduction
Table columns: Repetitions before fusion | Repetitions after fusion | Re-computations (product) | Reduction factor of the output arrays
8× 128× 96
96×
119×(10× 81
)80× 801
9.29 1228.8
119× 96 1 9680× 80× 96 1 96
96 1 1128× 96× 80 128× 96× 80 1 1119× 128× 96 128× 96×
(1191
)1 12288
128× 96 1 1
Calin Glitia, Pierre Boulet, Éric Lenormand, and Michel Barreteau. Repetitive model refactoring strategy for the design space exploration of intensive signal processing applications. Journal of Systems Architecture, Special Issue: Hardware/Software CoDesign.
And the inter-repetition dependences?
Why ?
To allow the use of the refactoring tools on models with uniformdependences
Typically: ⇒
Algorithm
The global accesses and dependences MUST remain unchangedAutomatically compute new dependences after a transformation
Calin Glitia and Pierre Boulet.
Interaction between inter-repetition dependences and high-level transformations inarray-ol.In Conference on Design and Architectures for Signal and Image Processing (DASIP2009), Sophia Antipolis, France, September 2009.
Outline
Application Architecture
Mapping
Code Generation
Array Oriented Language
Modeling and Analysis of Real-Time Embedded Systems
Inter repetition dependence
High-level refactoring
– data-parallel transformations
– strategies
Gaspard2 environment
Gaspard2 environment
SoC visual co-design
1 Allows modeling, simulation and code generation of SoC
2 Model Driven Engineering approach
3 Subset of the MARTE UML profile
Gaspard2 design flow
high-level specification
inter-repetition dependences
refactoring tools: implementation and integration
MDE contributions to Gaspard2
specializations
code generation
Conclusion
Application Architecture
Mapping
Code Generation
Array Oriented Language
Modeling and Analysis of Real-Time Embedded Systems
Inter repetition dependence
High-level refactoring
– data-parallel transformations
– strategies
Gaspard2 environment
Perspectives
. . .
Modeling intensive signal processing applications
  Specification languages
  Array Oriented Language
  Modeling uniform dependences – contribution
From a high-level specification to the execution
  Data-parallel transformations
  Optimization strategies
Gaspard2 – co-design environment for SoC
Conclusion