113
Optimizations for intensive signal processing applications on Systems-on-Chip Calin Glitia September 6, 2010 PhD defense – Calin Glitia 1/34

Optimizations for intensive signal processing applications on … · 2010. 9. 6. · PhD defense Œ Calin Glitia Modeling intensive signal processing applications 9/34. Array Oriented

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

  • Optimizations for intensive signal processingapplications on Systems-on-Chip

    Calin Glitia

    September 6, 2010

    PhD defense – Calin Glitia 1/34

  • Intensive signal processing

    Detection systemsMultimedia

    Repetitive computationsConsiderable amount of data ⇒ Multi-dimensional arrays

    PhD defense – Calin Glitia 2/34

  • Intensive signal processing

    Detection systemsMultimedia

    Repetitive computationsConsiderable amount of data ⇒ Multi-dimensional arrays

    PhD defense – Calin Glitia 2/34

  • Modular decomposition

    Data-fiow oriented modeling

    Logical parallelism1 Task parallelism and pipeline2 Data parallelism

    Complexity:Elementary functions assemblageComplex accesses at the data structures

    PhD defense – Calin Glitia 3/34

  • Modular decomposition

    Data-fiow oriented modeling

    Logical parallelism1 Task parallelism and pipeline2 Data parallelismComplexity:

    Elementary functions assemblageComplex accesses at the data structures

    PhD defense – Calin Glitia 3/34

  • Execution platforms

    Systems-on-Chip:Increase in the integrationcapacityMultiprocessors

    Multiprocessor SoC

    PhD defense – Calin Glitia 4/34

  • Execution platforms

    Systems-on-Chip:Increase in the integrationcapacityMultiprocessors

    Architecture models :Repetitive topologiesPhysical parallelism

    Multiprocessor SoC

    XAUI 1PHY/MAC

    SerializeDeserialize

    XAUI 1PHY/MAC

    SerializeDeserialize

    Flexible I/OPCIe 1PHY/MAC

    Flexible I/O

    UARTHPI, I2C

    JTAG, SPI

    PCIe 0PHY/MAC

    SerializeDeserialize

    SerializeDeserialize

    GbE 0

    GbE 1

    DDR Controller 0 DDR Controller 1

    DDR Controller 3 DDR Controller 2

    PhD defense – Calin Glitia 4/34

  • Co-design

    Application Architecture

    Mapping

    Code Generation

    PhD defense – Calin Glitia 5/34

  • Co-design

    Application Architecture

    Mapping

    Code Generation

    Logical parallelism Physical parallelism

    Distribution

    Eflcient execution

    PhD defense – Calin Glitia 5/34

  • Co-design

    Application Architecture

    Mapping

    Code Generation

    Logical parallelism Physical parallelism

    Distribution

    Eflcient execution

    Optimizations

    PhD defense – Calin Glitia 5/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    PhD defense – Calin Glitia 6/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    PhD defense – Calin Glitia 6/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    PhD defense – Calin Glitia 6/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    Modeling and Analysis of Real-Time Embedded Systems

    PhD defense – Calin Glitia 6/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    PhD defense – Calin Glitia 6/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    Gaspard2 environment

    PhD defense – Calin Glitia 6/34

  • Some of the existing languages

    Data-Flow

    1 Synchronous Data Flow

    2 Extensions:Cyclo-Static Data FlowMulti-dimensional Synchronous Data Flow, Windowed SynchronousData FlowBoolean Data Flow, Dynamic Data Flow

    3 Array-OL

    Functional languages

    Alpha: polyhedron, recurrence equationsSisal, Single Assignment C

    PhD defense – Calin Glitia Modeling intensive signal processing applications 7/34

  • Some of the existing languages

    Data-Flow

    1 Synchronous Data Flow2 Extensions:

    Cyclo-Static Data FlowMulti-dimensional Synchronous Data Flow, Windowed SynchronousData FlowBoolean Data Flow, Dynamic Data Flow

    3 Array-OL

    Functional languages

    Alpha: polyhedron, recurrence equationsSisal, Single Assignment C

    PhD defense – Calin Glitia Modeling intensive signal processing applications 7/34

  • Some of the existing languages

    Data-Flow

    1 Synchronous Data Flow2 Extensions:

    Cyclo-Static Data FlowMulti-dimensional Synchronous Data Flow, Windowed SynchronousData FlowBoolean Data Flow, Dynamic Data Flow

    3 Array-OL

    Functional languages

    Alpha: polyhedron, recurrence equationsSisal, Single Assignment C

    PhD defense – Calin Glitia Modeling intensive signal processing applications 7/34

  • Some of the existing languages

    Data-Flow

    1 Synchronous Data Flow2 Extensions:

    Cyclo-Static Data FlowMulti-dimensional Synchronous Data Flow, Windowed SynchronousData FlowBoolean Data Flow, Dynamic Data Flow

    3 Array-OL

    Functional languages

    Alpha: polyhedron, recurrence equationsSisal, Single Assignment C

    PhD defense – Calin Glitia Modeling intensive signal processing applications 7/34

  • Multidimensional Synchronous DataFlow (MDSDF)

    Matrix multiplication

    B

    (M,N)

    TTransposition

    (M,N,1)

    (1,M,N)

    Repetition

    (1,1,1)

    (L, 1, 1)

    A

    (L,M)

    Repetition

    (1,1,1)

    (1,1,N)

    Downsample

    (1,M,1)

    (1,1,1)

    TTransposition

    (L,1,N)

    (L,N,1)

    (0, 1, 0)

    Data-fiow graph: actors consuming/producingMultiDimensional-tokensStatic analysis/scheduleLimitations: Multiple data consumptionsExtensions

    PhD defense – Calin Glitia Modeling intensive signal processing applications 8/34

  • Multidimensional Synchronous DataFlow (MDSDF)

    Matrix multiplication

    B

    (M,N)

    T(3, 1, 2)

    (M,N,1)

    (1,M,N)

    (1,1,1)

    (L, 1, 1)

    A

    (L,M)

    (1,1,1)

    (1,1,N)

    Downsample

    (1,M,1)

    (1,1,1)

    T(1, 3, 2)

    (L,1,N)

    (L,N,1)

    (0, 1, 0)

    Data-fiow graph: actors consuming/producingMultiDimensional-tokensStatic analysis/schedule

    Limitations: Multiple data consumptionsExtensions

    PhD defense – Calin Glitia Modeling intensive signal processing applications 8/34

  • Multidimensional Synchronous DataFlow (MDSDF)

    Matrix multiplication

    B

    (M,N)

    T(3, 1, 2)

    (M,N,1)

    (1,M,N)

    (1,1,1)

    (L, 1, 1)

    A

    (L,M)

    (1,1,1)

    (1,1,N)

    Downsample

    (1,M,1)

    (1,1,1)

    T(1, 3, 2)

    (L,1,N)

    (L,N,1)

    (0, 1, 0)

    Data-fiow graph: actors consuming/producingMultiDimensional-tokensStatic analysis/scheduleLimitations: Multiple data consumptionsExtensions

    PhD defense – Calin Glitia Modeling intensive signal processing applications 8/34

  • Array Oriented Language – principles

    Similarities with other data-fiow languages

    Hierarchical decomposition – task parallelismData-fiow oriented formalism – multi-dimensional data structures

    Matrix multiplication

    B(M, N)

    A(L,M)

    C(L, N)

    (L, N)

    Line×Column multiplication(M)

    (M)

    (1)

    Ligne×Colonne

    (M)

    (M)(M)

    (M)

    ×(1)

    (1)(1)

    Sum

    (M) (1)

    Explicit data parallelism : data-parallel repetitions (“loops”)Uniform paving by sub-arraysSpace and time mixed as dimensions of the data structures

    PhD defense – Calin Glitia Modeling intensive signal processing applications 9/34

  • Array Oriented Language – principles

    Similarities with other data-fiow languages

    Hierarchical decomposition – task parallelismData-fiow oriented formalism – multi-dimensional data structures

    Matrix multiplication

    B(M, N)

    A(L,M)

    C(L, N)

    (L, N)

    Line×Column multiplication(M)

    (M)

    (1)

    Ligne×Colonne

    (M)

    (M)(M)

    (M)

    ×(1)

    (1)(1)

    Sum

    (M) (1)

    Explicit data parallelism : data-parallel repetitions (“loops”)Uniform paving by sub-arrays

    Space and time mixed as dimensions of the data structures

    PhD defense – Calin Glitia Modeling intensive signal processing applications 9/34

  • Array Oriented Language – principles

    Similarities with other data-fiow languages

    Hierarchical decomposition – task parallelismData-fiow oriented formalism – multi-dimensional data structures

    Matrix multiplication

    B(M, N)

    A(L,M)

    C(L, N)

    (L, N)

    Line×Column multiplication(M)

    (M)

    (1)

    Ligne×Colonne

    (M)

    (M)(M)

    (M)

    ×(1)

    (1)(1)

    Sum

    (M) (1)

    Explicit data parallelism : data-parallel repetitions (“loops”)Uniform paving by sub-arraysSpace and time mixed as dimensions of the data structures

    PhD defense – Calin Glitia Modeling intensive signal processing applications 9/34

  • Data-parallel repetitions

    Matrix multiplication: N = 3,M = 4, L = 2

    (3, 4)

    (4, 2)

    (3, 2)

    (3, 2)

    (4)

    (4)

    (1)

    Matrix multiplication

    PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

  • Data-parallel repetitions

    Matrix multiplication: N = 3,M = 4, L = 2

    Line×Column multiplications

    PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

  • Data-parallel repetitions

    Matrix multiplication: N = 3,M = 4, L = 2

    uniformly spaced input/output patterns

    PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

  • Data-parallel repetitions

    Matrix multiplication: N = 3,M = 4, L = 2

    DATA-PARALLEL instances

    PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

  • Data-parallel repetitions

    Matrix multiplication: N = 3,M = 4, L = 2

    (3, 4)

    (4, 2)

    (3, 2)

    (3, 2)

    (4)

    (4)

    (1)

    Compact representation: repetition space

    PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

  • Data-parallel repetitions

    Matrix multiplication: N = 3,M = 4, L = 2

    (3, 4)

    (4, 2)

    (3, 2)

    (3, 2)

    (4)

    (4)

    (1)

    F =(10

    )o =

    (00

    )P =

    (0 00 1

    )

    F =(01

    )o =

    (00

    )P =

    (1 00 0

    )

    F =( )

    o =(00

    )P =

    (1 00 1

    )

    Tilers – links between input and output patterns

    PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

  • Uniform sub-array accesses (patterns)

    Tiler

    o: origin of the reference pattern

    F : Fitting matrix – the shape of the tiles in the arrayP: Paving matrix – uniform spacing of the tiles

    0 80

    5

    Formal specification:

    o+ (P F) ·(ri

    )mod sarray

    PhD defense – Calin Glitia Modeling intensive signal processing applications 11/34

  • Uniform sub-array accesses (patterns)

    Tiler

    o: origin of the reference pattern

    F : Fitting matrix – the shape of the tiles in the arrayP: Paving matrix – uniform spacing of the tiles

    0 80

    5

    Formal specification:

    o+ (P F) ·(ri

    )mod sarray

    PhD defense – Calin Glitia Modeling intensive signal processing applications 11/34

  • Uniform sub-array accesses (patterns)

    Tiler

    o: origin of the reference patternF : Fitting matrix – the shape of the tiles in the array

    P: Paving matrix – uniform spacing of the tiles

    0 80

    5

    Formal specification:

    o+ (P F) ·(ri

    )mod sarray

    PhD defense – Calin Glitia Modeling intensive signal processing applications 11/34

  • Uniform sub-array accesses (patterns)

    Tiler

    o: origin of the reference patternF : Fitting matrix – the shape of the tiles in the arrayP: Paving matrix – uniform spacing of the tiles

    0 80

    5

    Formal specification:

    o+ (P F) ·(ri

    )mod sarray

    PhD defense – Calin Glitia Modeling intensive signal processing applications 11/34

  • Uniform sub-array accesses (patterns)

    Tiler

    o: origin of the reference patternF : Fitting matrix – the shape of the tiles in the arrayP: Paving matrix – uniform spacing of the tiles

    0 80

    5

    Formal specification:

    o+ (P F) ·(ri

    )mod sarray

    PhD defense – Calin Glitia Modeling intensive signal processing applications 11/34

  • Common motif-based accesses

    Pattern examples

    0 3

    0

    4

    0 5

    0

    4

    0 5

    0

    3

    0 5

    0

    3

    0 5

    0

    5

    Paving example

    0 90

    4

    r =(00

    ) 0 904

    r =(10

    ) 0 904

    r =(20

    )

    0 90

    4

    r =(01

    ) 0 904

    r =(11

    ) 0 904

    r =(21

    )

    0 90

    4

    r =(02

    ) 0 904

    r =(12

    ) 0 904

    r =(22

    )F =

    (1 00 1

    )spattern =

    (53

    )o =

    (00

    )sarray =

    (105

    )P =

    (0 31 0

    )srepetition =

    (33

    )

    PhD defense – Calin Glitia Modeling intensive signal processing applications 12/34

  • Common motif-based accesses

    Pattern examples

    0 3

    0

    4

    0 5

    0

    4

    0 5

    0

    3

    0 5

    0

    3

    0 5

    0

    5

    Paving example

    0 90

    4

    r =(00

    ) 0 904

    r =(10

    ) 0 904

    r =(20

    )

    0 90

    4

    r =(01

    ) 0 904

    r =(11

    ) 0 904

    r =(21

    )

    0 90

    4

    r =(02

    ) 0 904

    r =(12

    ) 0 904

    r =(22

    )F =

    (1 00 1

    )spattern =

    (53

    )o =

    (00

    )sarray =

    (105

    )P =

    (0 31 0

    )srepetition =

    (33

    )

    PhD defense – Calin Glitia Modeling intensive signal processing applications 12/34

  • Summary ARRAY-OL

    Specification

    Data-fiow oriented visual formalismExpress the regularity of computations/data accessesExploit the parallelism

    Rules that allow static analysis

    Limitations

    Numerical values for the multidimensional spaces/accessesCycles not allowed in the dependence graphExtension: inter-repetition dependences

    PhD defense – Calin Glitia Modeling intensive signal processing applications 13/34

  • Summary ARRAY-OL

    Specification

    Data-fiow oriented visual formalismExpress the regularity of computations/data accessesExploit the parallelismRules that allow static analysis

    Limitations

    Numerical values for the multidimensional spaces/accessesCycles not allowed in the dependence graphExtension: inter-repetition dependences

    PhD defense – Calin Glitia Modeling intensive signal processing applications 13/34

  • Summary ARRAY-OL

    Specification

    Data-fiow oriented visual formalismExpress the regularity of computations/data accessesExploit the parallelismRules that allow static analysis

    Limitations

    Numerical values for the multidimensional spaces/accesses

    Cycles not allowed in the dependence graphExtension: inter-repetition dependences

    PhD defense – Calin Glitia Modeling intensive signal processing applications 13/34

  • Summary ARRAY-OL

    Specification

    Data-fiow oriented visual formalismExpress the regularity of computations/data accessesExploit the parallelismRules that allow static analysis

    Limitations

    Numerical values for the multidimensional spaces/accessesCycles not allowed in the dependence graph

    Extension: inter-repetition dependences

    PhD defense – Calin Glitia Modeling intensive signal processing applications 13/34

  • Summary ARRAY-OL

    Specification

    Data-fiow oriented visual formalismExpress the regularity of computations/data accessesExploit the parallelismRules that allow static analysis

    Limitations

    Numerical values for the multidimensional spaces/accessesCycles not allowed in the dependence graphExtension: inter-repetition dependences

    PhD defense – Calin Glitia Modeling intensive signal processing applications 13/34

  • Need for uniform dependences

    State construction

    Transfer data between different instances of the same repetition.

    Examples: Sum, Integrate

    Integrate – Flatten

    ...

    ...

    0

    +[0] +[1] +[2]...

    Uniform data dependences between instances of a repetition

    PhD defense – Calin Glitia Modeling intensive signal processing applications 14/34

  • Need for uniform dependences

    State construction

    Transfer data between different instances of the same repetition.Examples: Sum, Integrate

    Integrate – Flatten

    ...

    ...

    0

    +[0] +[1] +[2]...

    Uniform data dependences between instances of a repetition

    PhD defense – Calin Glitia Modeling intensive signal processing applications 14/34

  • Need for uniform dependences

    State construction

    Transfer data between different instances of the same repetition.Examples: Sum, Integrate

    Integrate – Flatten

    ...

    ...

    0

    +[0] +[1] +[2]...

    Uniform data dependences between instances of a repetition

    PhD defense – Calin Glitia Modeling intensive signal processing applications 14/34

  • Need for uniform dependences

    State construction

    Transfer data between different instances of the same repetition.Examples: Sum, Integrate

    Integrate – Flatten

    ...

    ...

    0

    +[0] +[1] +[2]...

    Uniform data dependences between instances of a repetition

    PhD defense – Calin Glitia Modeling intensive signal processing applications 14/34

  • Uniform dependences

    Integrate

    (∞)(∞)

    (∞)

    +

    pinpout

    default

    Inter-repetition dependence

    1 Data dependence: pout→pin

    2 Dependence vector inside therepetition space

    3 Initial values: default link fordependences that exit therepetition space

    Calin Glitia, Philippe Dumont, and Pierre Boulet.

    Array-OL with delays, a domain specific specification language for multidimensionalintensive signal processing.Multidimensional Systems and Signal Processing, 2009.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 15/34

  • Uniform dependences

    Integrate

    (∞)(∞)

    (∞)

    +

    pinpout

    default

    Inter-repetition dependence

    1 Data dependence: pout→pin

    2 Dependence vector inside therepetition space

    3 Initial values: default link fordependences that exit therepetition space

    Calin Glitia, Philippe Dumont, and Pierre Boulet.

    Array-OL with delays, a domain specific specification language for multidimensionalintensive signal processing.Multidimensional Systems and Signal Processing, 2009.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 15/34

  • Uniform dependences

    Integrate

    (∞)(∞)

    (∞)

    +

    pinpout

    d =(1)

    default

    Inter-repetition dependence

    1 Data dependence: pout→pin

    2 Dependence vector inside therepetition space

    3 Initial values: default link fordependences that exit therepetition space

    Calin Glitia, Philippe Dumont, and Pierre Boulet.

    Array-OL with delays, a domain specific specification language for multidimensionalintensive signal processing.Multidimensional Systems and Signal Processing, 2009.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 15/34

  • Uniform dependences

    Integrate

    (∞)(∞)0

    ()

    (∞)

    +

    pinpout

    default

    Inter-repetition dependence

    1 Data dependence: pout→pin

    2 Dependence vector inside therepetition space

    3 Initial values: default link fordependences that exit therepetition space

    Calin Glitia, Philippe Dumont, and Pierre Boulet.

    Array-OL with delays, a domain specific specification language for multidimensionalintensive signal processing.Multidimensional Systems and Signal Processing, 2009.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 15/34

  • Uniform dependences

    Integrate

    (∞)(∞)0

    ()

    (∞)

    +

    pinpout

    d =(1)

    default

    Inter-repetition dependence

    1 Data dependence: pout→pin

    2 Dependence vector inside therepetition space

    3 Initial values: default link fordependences that exit therepetition space

    Calin Glitia, Philippe Dumont, and Pierre Boulet.

    Array-OL with delays, a domain specific specification language for multidimensionalintensive signal processing.Multidimensional Systems and Signal Processing, 2009.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 15/34

  • Initial values

    Initial values – default link

    Same initial valueDifferent values – TilerDifferent default links – Exclusive tilers

    d =(2, 0

    )

    PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

  • Initial values

    Initial values – default link

    Same initial value

    Different values – TilerDifferent default links – Exclusive tilers

    PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

  • Initial values

    Initial values – default link

    Same initial valueDifferent values – Tiler

    Different default links – Exclusive tilers

    F =( )

    o =(00

    )P =

    (1 00 1

    )

    PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

  • Initial values

    Initial values – default link

    Same initial valueDifferent values – TilerDifferent default links – Exclusive tilers

    F =(1)

    o =(0)

    P =(4 0

    )

    F =(1)

    o =(−4)

    P =(4 0

    )

    PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

  • Initial values

    Initial values – default link

    Same initial valueDifferent values – TilerDifferent default links – Exclusive tilers

    F =(1)

    o =(0)

    P =(4 0

    )

    F =(1)

    o =(−4)

    P =(4 0

    )

    PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

  • Complex dependences

    Dependence constructions:

    Multiple default links

    Multiple dependences on a repetition spaceDependences connected through the hierarchy

    PhD defense – Calin Glitia Modeling intensive signal processing applications 17/34

  • Complex dependences

    Dependence constructions:

    Multiple default linksMultiple dependences on a repetition space

    Dependences connected through the hierarchy

    PhD defense – Calin Glitia Modeling intensive signal processing applications 17/34

  • Complex dependences

    Dependence constructions:

    Multiple default linksMultiple dependences on a repetition spaceDependences connected through the hierarchy

    Dependences on the complete repetition space

    PhD defense – Calin Glitia Modeling intensive signal processing applications 17/34

  • Impact on the parallelism

    The repetition space is split in parallel hyper-planesPipeline execution following the distance vector

    Scheduling uniform loops

    Alain Darte and Yves Robert.

    Constructive methods for scheduling uniform loop nests.IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 18/34

  • Impact on the parallelism

    The repetition space is split in parallel hyper-planes

    Pipeline execution following the distance vector

    Scheduling uniform loops

    Alain Darte and Yves Robert.

    Constructive methods for scheduling uniform loop nests.IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 18/34

  • Impact on the parallelism

    The repetition space is split in parallel hyper-planesPipeline execution following the distance vector

    Scheduling uniform loops

    Alain Darte and Yves Robert.

    Constructive methods for scheduling uniform loop nests.IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 18/34

  • Impact on the parallelism

    The repetition space is split in parallel hyper-planesPipeline execution following the distance vector

    Scheduling uniform loops

    Alain Darte and Yves Robert.

    Constructive methods for scheduling uniform loop nests.IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.

    PhD defense – Calin Glitia Modeling intensive signal processing applications 18/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    PhD defense – Calin Glitia Modeling intensive signal processing applications 19/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    Modeling and Analysis of Real-Time Embedded Systems

    PhD defense – Calin Glitia Modeling intensive signal processing applications 19/34

  • Modeling and Analysis of Real-Time Embedded Systems

    Profile UML – standard OMG

    Model Driven EngineeringCo-design: application, architecture, mapping

    Repetitive Structure Modeling

    All the ARRAY-OL concepts are includedProposed by the DaRT team

    PhD defense – Calin Glitia Modeling intensive signal processing applications 20/34

  • Model repeated inter-connected architecture topologies

    Physical connections between architecture components

    Compact expression

    Cyclic uniform inter-connections

    PhD defense – Calin Glitia Modeling intensive signal processing applications 21/34

  • Model repeated inter-connected architecture topologies

    Physical connections between architecture components

    Compact expressionCyclic uniform inter-connections

    PhD defense – Calin Glitia Modeling intensive signal processing applications 21/34

  • Summary inter-repetition dependences

    1 Expression of state constructions

    2 Complex dependences through the hierarchy

    3 Parallelism – pipeline

    4 Repeated inter-connected architectures

    +

    PhD defense – Calin Glitia Modeling intensive signal processing applications 22/34

  • Summary inter-repetition dependences

    1 Expression of state constructions

    2 Complex dependences through the hierarchy

    3 Parallelism – pipeline

    4 Repeated inter-connected architectures

    +

    PhD defense – Calin Glitia Modeling intensive signal processing applications 22/34

  • Summary inter-repetition dependences

    1 Expression of state constructions

    2 Complex dependences through the hierarchy

    3 Parallelism – pipeline

    4 Repeated inter-connected architectures

    +

    PhD defense – Calin Glitia Modeling intensive signal processing applications 22/34

  • Summary inter-repetition dependences

    1 Expression of state constructions

    2 Complex dependences through the hierarchy

    3 Parallelism – pipeline

    4 Repeated inter-connected architectures

    +

    PhD defense – Calin Glitia Modeling intensive signal processing applications 22/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    PhD defense – Calin Glitia From a high-level specification to the execution 23/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    PhD defense – Calin Glitia From a high-level specification to the execution 23/34

  • Execution

    Logical space and time as mixed dimensions of multidimensionalstructure

    Specification: expresses the data dependencesbetween all the data elements that transits the system

    And a partial execution orderbetween all the execution of the tasks in of the system

    EFFICIENT execution

    Optimized code generationProjection of specification into physical space and time

    Adapt a specification to the execution

    High-level refactoringExecution that refiects the specification

    PhD defense – Calin Glitia From a high-level specification to the execution 24/34

  • Execution

    Logical space and time as mixed dimensions of multidimensionalstructure

    Specification: expresses the data dependencesbetween all the data elements that transits the system

    And a partial execution orderbetween all the execution of the tasks in of the system

    EFFICIENT execution

    Optimized code generationProjection of specification into physical space and time

    Adapt a specification to the execution

    High-level refactoringExecution that refiects the specification

    PhD defense – Calin Glitia From a high-level specification to the execution 24/34

  • Execution

    Logical space and time as mixed dimensions of multidimensionalstructure

    Specification: expresses the data dependencesbetween all the data elements that transits the system

    And a partial execution orderbetween all the execution of the tasks in of the system

    EFFICIENT execution

    Optimized code generationProjection of specification into physical space and time

    Adapt a specification to the execution

    High-level refactoringExecution that refiects the specification

    PhD defense – Calin Glitia From a high-level specification to the execution 24/34

  • Projection into space and time

    Multi-dimensional structures

    repetition spaces

    ⇓in space ←→

    linked(trade-off)

    data structures

    ⇓in time

    Take into account the execution constraints

    Data dependencesAvailable resources

    PhD defense – Calin Glitia From a high-level specification to the execution 25/34

  • Projection into space and time

    Multi-dimensional structures

    repetition spaces⇓

    in space

    ←→linked(trade-off)

    data structures⇓

    in time

    Take into account the execution constraints

    Data dependencesAvailable resources

    PhD defense – Calin Glitia From a high-level specification to the execution 25/34

  • Projection into space and time

    Multi-dimensional structures

    repetition spaces⇓

    in space ←→linked(trade-off)

    data structures⇓

    in time

    Take into account the execution constraints

    Data dependencesAvailable resources

    PhD defense – Calin Glitia From a high-level specification to the execution 25/34

  • Projection into space and time

    Multi-dimensional structures

    repetition spaces⇓

    in space ←→linked(trade-off)

    data structures⇓

    in time

    Take into account the execution constraints

    Data dependencesAvailable resources

    PhD defense – Calin Glitia From a high-level specification to the execution 25/34

  • Projection example

    Horizontal Filter

    (1920, 1080,∞) (720, 1080,∞)(240, 1080,∞)

    (13) (3)

    Vertical Filter

    (720, 480,∞)(720, 120,∞)

    (14) (4)

    PhD defense – Calin Glitia From a high-level specification to the execution 26/34

  • Projection example

    Horizontal Filter

    (1920, 1080,∞) (720, 1080,∞)(240, 1080,∞)

    (13) (3)

    Vertical Filter

    (720, 480,∞)(720, 120,∞)

    (14) (4)

    PhD defense – Calin Glitia From a high-level specification to the execution 26/34

  • Projection example

    Horizontal Filter

    (1920, 1080,∞) (720, 1080,∞)(240, 1080,∞)

    (13) (3)

    Vertical Filter

    (720, 480,∞)(720, 120,∞)

    (14) (4)

    Maximal parallelism

    Memory sizeInfinite data structures – Blocking points

    PhD defense – Calin Glitia From a high-level specification to the execution 26/34

  • Projection example

    Horizontal Filter

    (1920, 1080,∞) (720, 1080,∞)(240, 1080,∞)

    (13) (3)

    Vertical Filter

    (720, 480,∞)(720, 120,∞)

    (14) (4)

    Pipeline

    Execution Order

    PhD defense – Calin Glitia From a high-level specification to the execution 26/34

  • Projection example

    (1920, 1080,∞) (720, 480,∞)

    (240, 120,∞)

    Horizontal Filtre

    (14, 13) (3, 14)

    (14)

    (13) (3)

    Vertical Filter

    (3, 4)

    (3)

    (14) (4)

    Fusion of successive repetitions

    Minimize the arrays – macro-patternsDistribution of the common repetitionEach processor its macro-patterns in memory

    PhD defense – Calin Glitia From a high-level specification to the execution 26/34

  • Projection example

    (1920, 1080,∞) (720, 480,∞)

    (240, 120,∞)

    Horizontal Filtre

    (14, 13) (3, 14)

    (14)

    (13) (3)

    Vertical Filter

    (3, 4)

    (3)

    (14) (4)

    Re-computations

    When intermediate values are consumed by multiple repetitionsTrade-off

    Recompute valuesKeep in memory – increase of memory size

    PhD defense – Calin Glitia From a high-level specification to the execution 26/34

  • High-level transformations

    Adapt a specification to execution

    change the granularity of the repetitionsarray sizes reductions

    “High-level” loop transformations

    repetition = visual representation of data-parallel loop nestfusion, change paving, tiling, collapse, . . .

    Calin Glitia and Pierre Boulet.

    High level loop transformations for multidimensional signal processing embeddedapplications.In International Symposium on Systems, Architectures, Modeling, and Simulation(SAMOS VIII), Samos, Greece, July 2008.

    PhD defense – Calin Glitia From a high-level specification to the execution 27/34

  • High-level transformations

    Adapt a specification to execution

    change the granularity of the repetitionsarray sizes reductions

    “High-level” loop transformations

    repetition = visual representation of data-parallel loop nestfusion, change paving, tiling, collapse, . . .

    Calin Glitia and Pierre Boulet.

    High level loop transformations for multidimensional signal processing embeddedapplications.In International Symposium on Systems, Architectures, Modeling, and Simulation(SAMOS VIII), Samos, Greece, July 2008.

    PhD defense – Calin Glitia From a high-level specification to the execution 27/34

  • Optimization strategies – memory size reduction

    r1

    r2 r3 r4 r5

    r6

    r7

    MAXIMAL reduction of the intermediate arrays

    PhD defense – Calin Glitia From a high-level specification to the execution 28/34

  • Optimization strategies – memory size reduction

    r1

    r2 r3 r4 r5

    r6

    r7

    MAXIMAL reduction of the intermediate arraysFusion of multiple repetitions

    Minimizes only the last intermediate arrayRe-computations!

    PhD defense – Calin Glitia From a high-level specification to the execution 28/34

  • Optimization strategies – memory size reduction

    r1

    r2 r3 r4 r5

    r6

    r7

    MAXIMAL reduction of the intermediate arraysFusion of multiple repetitions

    Minimizes only the last intermediate arrayRe-computations!

    Complete fusion?Too much re-computationsLimited array reduction

    PhD defense – Calin Glitia From a high-level specification to the execution 28/34

  • Optimization strategies – memory size reduction

    r1

    r2 r3 r4 r5

    r6

    r7

    MAXIMAL reduction of the intermediate arrays

    Strategy that limits the re-computationsusing result from complete fusion and two-by-two fusionswhere re-computations are introduces and minimal achievable arrayreduction

    PhD defense – Calin Glitia From a high-level specification to the execution 28/34

  • Optimization strategies – memory size reduction

    r1

    r2 r3 r4 r5

    r6

    r7

    r14r67

    r12

    MAXIMAL reduction of the intermediate arrays

    Repetitions Repetitions Re-computations Reduction factorbefore fusion after fusion (product) of the output arrays8× 128× 96

    96×

    119×(10× 81

    )80× 801

    9.29 1228.8

    119× 96 1 9680× 80× 96 1 96

    96 1 1128× 96× 80 128× 96× 80 1 1119× 128× 96 128× 96×

    (1191

    )1 12288

    128× 96 1 1

    PhD defense – Calin Glitia From a high-level specification to the execution 28/34

  • Optimization strategies – memory size reduction

    r1

    r2 r3 r4 r5

    r6

    r7

    r14r67

    r12

    MAXIMAL reduction of the intermediate arrays

    Calin Glitia, Pierre Boulet, Éric Lenormand, and Michel Barreteau.

    Repetitive model refactoring strategy for the design space exploration of intensivesignal processing applications.Journal of Systems Architecture, Special Issue: Hardware/Software CoDesign.

    PhD defense – Calin Glitia From a high-level specification to the execution 28/34

  • And the inter-repetition dependences?

    Why ?

    To allow the use of the refactoring tools on models with uniformdependences

    Typically: ⇒

    Algorithm

    The global accesses and dependences MUST remain unchangedAutomatically compute new dependences after a transformation

    Calin Glitia and Pierre Boulet.

    Interaction between inter-repetition dependences and high-level transformations inarray-ol.In Conference on Design and Architectures for Signal and Image Processing (DASIP2009), Sophia Antipolis, France, September 2009.

    PhD defense – Calin Glitia From a high-level specification to the execution 29/34

  • And the inter-repetition dependences?

    Why ?

    To allow the use of the refactoring tools on models with uniformdependences

    Typically: ⇒

    Algorithm

    The global accesses and dependences MUST remain unchangedAutomatically compute new dependences after a transformation

    Calin Glitia and Pierre Boulet.

    Interaction between inter-repetition dependences and high-level transformations inarray-ol.In Conference on Design and Architectures for Signal and Image Processing (DASIP2009), Sophia Antipolis, France, September 2009.

    PhD defense – Calin Glitia From a high-level specification to the execution 29/34

  • And the inter-repetition dependences?

    Why ?

    To allow the use of the refactoring tools on models with uniformdependences

    Typically: ⇒

    Algorithm

    The global accesses and dependences MUST remain unchangedAutomatically compute new dependences after a transformation

    Calin Glitia and Pierre Boulet.

    Interaction between inter-repetition dependences and high-level transformations inarray-ol.In Conference on Design and Architectures for Signal and Image Processing (DASIP2009), Sophia Antipolis, France, September 2009.

    PhD defense – Calin Glitia From a high-level specification to the execution 29/34

  • And the inter-repetition dependences?

    Why ?

    To allow the use of the refactoring tools on models with uniformdependences

    Typically: ⇒

    Algorithm

    The global accesses and dependences MUST remain unchangedAutomatically compute new dependences after a transformation

    Calin Glitia and Pierre Boulet.

    Interaction between inter-repetition dependences and high-level transformations inarray-ol.In Conference on Design and Architectures for Signal and Image Processing (DASIP2009), Sophia Antipolis, France, September 2009.

    PhD defense – Calin Glitia From a high-level specification to the execution 29/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 30/34

  • Outline

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    Gaspard2 environment

    PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 30/34

  • Gaspard2 environment

    SoC visual co-design

    1 allows modeling, simulation and code generation of SoC2 approach Model Driven Engineering3 Subset of the MARTE UML profile

    PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 31/34

  • Gaspard2 conception fiow

    high-level specification

    inter-repetitiondependencesrefactoring tools:implementation andintegration

    MDE contributions toGaspard2

    specializations

    code generation

    PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 32/34

  • Gaspard2 conception fiow

    high-level specificationinter-repetitiondependencesrefactoring tools:implementation andintegration

    MDE contributions toGaspard2

    specializations

    code generation

    PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 32/34

  • Conclusion

    Application Architecture

    Mapping

    Code Generation

    PhD defense – Calin Glitia Conclusion 33/34

  • Conclusion

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    PhD defense – Calin Glitia Conclusion 33/34

  • Conclusion

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    PhD defense – Calin Glitia Conclusion 33/34

  • Conclusion

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Inter repetition dependence

    Modeling and Analysis of Real-Time Embedded Systems

    PhD defense – Calin Glitia Conclusion 33/34

  • Conclusion

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    PhD defense – Calin Glitia Conclusion 33/34

  • Conclusion

    Application Architecture

    Mapping

    Code Generation

    Array Oriented Language

    Modeling and Analysis of Real-Time Embedded Systems

    Inter repetition dependence

    High-level refactoring– data-parallel transformations– strategies

    Gaspard2 environment

    PhD defense – Calin Glitia Conclusion 33/34

  • Perspectives

    . . .

    PhD defense – Calin Glitia Conclusion 34/34

    Modeling intensive signal processing applicationsSpecification languagesArray Oriented LanguageModeling uniform dependences – contribution

    From a high-level specification to the executionData-parallel transformationsOptimization strategies

    Gaspard2 – co-design environment for SoCConclusion