C 2030 Lundgren Gedae Gedae for Certification

  • 8/3/2019 C 2030 Lundgren Gedae Gedae for Certification

    1/24

    26 May 2009

    Gedae: Software-Enabled Systems Engineering Facilitates Certification of New Platforms

    William I. Lundgren

    Gedae, Inc.

    [email protected]@gedae.com

    2/24

    Traditional Software Development Process

    Behavioral Specification
    Software Tied to HW
    Complex
    Expensive
    Chasm between teams introduces errors
    Verification difficult
    Many Possible Implementations
    One Specification
    Manual Translation

    3/24

    Concept for Automatic Certification

    Behavioral Specification
    Software
    Automatic Implementation
    Many Specifications
    Specifications Verified
    Architecture Dependent
    All code automatically generated
    Many Implementations
    One Specification
    Implementation Specification
    Minimal Code Verification
    Produces Open Source Code
    Verify at Behavioral Level

    4/24

    Concept is Simple: Can it be Done?

    Requires a quantum leap in technology
    Quantum leap in technology based on discarding the broken von Neumann theory of computing
        Complexity is the killer of von Neumann theory (see next chart)
    Automation addresses complexity
    Abstraction enables automation
        Abstract functional behavior
        Abstract model of the hardware

    5/24

    Multicore Implementation Issues

    Each additional facet of implementation further clouds the meaning of the code

    Functionality
    Data Decomposition
    Thread Creation
    IPC
    Memory Optimization
    Memory Sharing
    Power, Temp and Error Mgt

    The functional description becomes scattered across threads/files
    Intermingling between facets further obscures meaning
    Optimization becomes impractical
    Compromise becomes the norm
    Complexities become intractable

    6/24

    How Does a Quantum Leap Occur?

    Work with many architectures over many years
        DSPs, FPGAs, GPPs, Cell/B.E., X86 multicores
    Work with many applications
        Radar, lidar, sonar, acoustics, image processing, and time-critical control
    Understand and solve the real challenges
        Not developed in a lab
        Partner with customers
        Hardened by implementing demanding production applications
    Hard work and perseverance over 20+ years
        Replaces DARPA hard funding

    7/24

    Management of Complexity

    The separation of functionality and implementation into separate code bases can help manage functionality

    Solutions
        Middleware retains existing software development process
            Predefined optimizes common algorithms and distributions
            Prebuilt for each architecture
            Further optimization not possible
            Trades efficiency for flexibility
            Function API extends existing languages
        Compilation introduces automation
            Custom application for each application/target combination
            Compile time optimizations
            Global optimizations

    8/24

    A Thread Compiler

    Expand the definition of a compiler
        Local optimization, like parallelizing a single loop, insufficient
        Global resource management and planning
            Verification (consistent and implementable behavior)
            Threads (natural threading, additional decomposition for distribution)
            IPC (infrastructure)
            Concurrency control (depends on characteristics of IPC, deadlock avoidance, decentralized control)
            Memory hierarchy (slow and fast memory, cache performance)
            Memory use (packing, buffer reuse)
    Blur the line between application and OS
    Optimizations are generalized
        Based on a hardware model
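The "Memory use (packing, buffer reuse)" item can be illustrated with a small sketch. This is not Gedae's packer. It is a hypothetical first-fit placement, assuming each buffer has a known size and an inclusive lifetime (the first and last schedule step that touches it). The function name `pack_buffers`, the buffer names, and the lifetimes are all made up for illustration.

```python
def pack_buffers(buffers):
    """Assign arena offsets so buffers that are live at the same time never overlap.

    buffers: list of (name, size, first_step, last_step), lifetimes inclusive.
    Returns (offsets_by_name, total_arena_size).
    """
    placed = []   # (offset, size, first_step, last_step) of buffers already placed
    offsets = {}
    # Place larger buffers first; ties keep their input order (sorted is stable).
    for name, size, first, last in sorted(buffers, key=lambda b: -b[1]):
        # Memory ranges held by buffers whose lifetime overlaps this one.
        conflicts = sorted(
            (o, s) for o, s, f, l in placed if f <= last and first <= l
        )
        offset = 0
        for o, s in conflicts:
            if offset + size <= o:
                break                      # fits in the gap before this conflict
            offset = max(offset, o + s)    # otherwise skip past it
        offsets[name] = offset
        placed.append((offset, size, first, last))
    total = max((o + s for o, s, _, _ in placed), default=0)
    return offsets, total


# Hypothetical three-stage chain: "a" feeds "b", "b" feeds "c".
example = [("a", 4, 0, 1), ("b", 4, 1, 2), ("c", 4, 2, 3)]
offsets, total = pack_buffers(example)
```

In this example "c" reuses the space of "a" because their lifetimes do not overlap, so the arena needs 8 units rather than 12.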

    9/24

    The Structure of Gedae

    www.gedae.com

    10/24

    Key Components

    Language: Gedae One and Gedae Two
        Implementation-free expression of functionality enables compiler
        Abstraction helps with the introduction of a new language
        Complete set of language abstractions maximizes productivity
    Abstract hardware model
        Gedae compiler builds custom software matching the software functionality to the hardware target
    Compiler
        Automatic target dependent optimized software structure
    Thread Manager
        Distributed thread scheduler optimizes execution of distributed threads

    11/24

    Gedae One Language Abstractions

    Algebraic Expressions, Matrix Processing, String Processing
    State Machine, Database
    Core Libraries, Data Structures, Object Oriented
    Data Flow, Parameters, Conditionals, Iteration, Persistent Memory, Side Effects, Recursion, Software Reset
    Functional Behavior

    12/24

    Gedae One

    out[i][j](t) = sar(in[i][j](t), Taylor[j], Azker[i2_1]) {
      range i2=2*#i;
      t1[i][j] = Taylor[j] * in[i][j];        /* weight each sample by Taylor[j] */
      range[i][j] = fft(t1[i][j]);
      cturn[j][i] = range[i][j];              /* corner turn: swap i and j */
      adjoin[j][i2](t) = i2 < R ? cturn[j][i2](t) : cturn[j][i2](t-1);  /* current or previous (t-1) data */
      t2[j][i2] = ifft(adjoin[j][i2]);
      t3[j][i2] = Azker[i2] * t2[j][i2];      /* weight by Azker[i2] */
      azimuth[j][i2] = fft(t3[j][i2]);
      out[j][i] = azimuth[j][i];
    }

    13/24

    Gedae One vs. Middleware

    Black box functions hide essential functionality from compiler

    Library is a vocabulary with an implementation
        conv(float *in, float *out, int R, int C,
             float *kernel, int KR, int KC);

    Algebraic language is a specification
        range i=R, j=C, i1=KR, j1=KC;
        out[i][j] += in[i+i1][j+j1] * kernel[i1][j1];

    Other examples:
        As[i][j] += B[i+i1][j+j1]; /* kernel of ones */
        Ae[i][j] ||= B[i+i1][j+j1]; /* dilation */
        Am[i][j] = As[i][j] > (KR*KC/2); /* majority op */
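To make the algebraic statements concrete, here is one plausible reading of them as explicit Python loops. It assumes that `+=` and `||=` reduce over the indices that do not appear on the left-hand side (`i1`, `j1`), and that the input has at least `R+KR-1` rows and `C+KC-1` columns so that `in[i+i1][j+j1]` stays in range. The function names are hypothetical and not part of Gedae.

```python
def correlate2d(inp, kernel, R, C):
    """out[i][j] += in[i+i1][j+j1] * kernel[i1][j1], summed over i1 and j1."""
    KR, KC = len(kernel), len(kernel[0])
    out = [[0.0] * C for _ in range(R)]
    for i in range(R):
        for j in range(C):
            for i1 in range(KR):
                for j1 in range(KC):
                    out[i][j] += inp[i + i1][j + j1] * kernel[i1][j1]
    return out


def window_sum(B, R, C, KR, KC):
    """As[i][j] += B[i+i1][j+j1]  (same as a kernel of ones)."""
    return [[sum(B[i + i1][j + j1] for i1 in range(KR) for j1 in range(KC))
             for j in range(C)] for i in range(R)]


def dilate(B, R, C, KR, KC):
    """Ae[i][j] ||= B[i+i1][j+j1]  (logical OR over the window)."""
    return [[any(B[i + i1][j + j1] for i1 in range(KR) for j1 in range(KC))
             for j in range(C)] for i in range(R)]


def majority(As, KR, KC):
    """Am[i][j] = As[i][j] > (KR*KC/2)."""
    return [[count > KR * KC / 2 for count in row] for row in As]


# Small binary image: 4x4 input, 3x3 window, 2x2 output.
B = [[1, 0, 0, 0],
     [1, 1, 0, 0],
     [1, 1, 1, 0],
     [0, 0, 0, 0]]
ones = [[1.0] * 3 for _ in range(3)]
As = window_sum(B, 2, 2, 3, 3)
```

Here the function bodies differ only in the reduction operator, which is the point the slide makes: the index expression is the specification, and the loop nest is one of many possible implementations.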

    14/24

    Gedae One + Data Flow = Gedae Two

    15/24

    Implementation Tools

    Select from 4 memory packers
    Select whether to prototype application in SDK or as separate executables
    View the hardware model
    Automate setting of queue sizes between dynamically related threads
    Implementation tools for every aspect of the software
    Select location of command program
    Choose structure of product
    View compiler status

    16/24

    Implementation Tools: Partitioning Tool

    Experimentally partition components to find optimum distributions
    Easy scalability using formulas based on indices
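As an illustration of "formulas based on indices", the sketch below maps the index of a component instance to a partition number with two common formulas, block and cyclic. These are generic distribution formulas, not the syntax of the Gedae partitioning tool, and the function names are hypothetical.

```python
def block_partition(i, n, p):
    """Instance i of n goes to one of p partitions in contiguous blocks."""
    return i * p // n


def cyclic_partition(i, p):
    """Instance i goes to partition i mod p (round-robin)."""
    return i % p


# Eight instances on four partitions, then rescaled to two partitions
# by changing only the parameter p.
block_4 = [block_partition(i, 8, 4) for i in range(8)]
block_2 = [block_partition(i, 8, 2) for i in range(8)]
cyclic_4 = [cyclic_partition(i, 4) for i in range(8)]
```

Because the mapping is a formula rather than a hand-written table, changing the processor count means changing one number.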

    17/24

    Implementation Tools: IPC Specification Tool

    Easily set up buffers for nonblocking transfers
    List of all transfers automatically inserted into app
    Easily incorporate DMA and other optimized transfer methods
    Easily incorporate multibuffering
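Multibuffering can be sketched with two threads and a small pool of reusable buffers. This is a generic illustration using Python's standard `threading` and `queue` modules, not code produced by Gedae. The function name `run_pipeline`, the one-slot buffers, and the doubling step are placeholders for a real transfer and a real computation.

```python
import queue
import threading


def run_pipeline(blocks, num_buffers=2):
    """Move blocks from a producer to a consumer through num_buffers reusable buffers."""
    free = queue.Queue()     # buffers the producer may fill
    filled = queue.Queue()   # buffers ready for the consumer
    for _ in range(num_buffers):
        free.put([None])     # a one-slot list stands in for a transfer buffer

    results = []

    def producer():
        for block in blocks:
            buf = free.get()     # waits only when every buffer is in flight
            buf[0] = block       # stands in for a DMA or other transfer
            filled.put(buf)
        filled.put(None)         # end-of-stream marker

    def consumer():
        while True:
            buf = filled.get()
            if buf is None:
                break
            results.append(buf[0] * 2)   # placeholder computation
            free.put(buf)                # return the buffer for reuse

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


doubled = run_pipeline([1, 2, 3, 4, 5])
```

With `num_buffers=2` the producer can fill one buffer while the consumer works on the other. Raising the count lets the producer run further ahead at the cost of memory.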

    18/24

    19/24

    Analysis Tools: Event and Processor Statistics

    Detailed events and event statistics are available

    20/24

    Analysis Tools: Distributed Debugging

    Processor controlled individually or global stop
    Gedae instruments code with more or less detail
    User can add events
    Breakpoints can be added on any sensible event like the 4th firing of the FFT on partition p2
    Probes can be added at any point - debugging code is separate from the application

    21/24

    Cell/B.E. Benchmark Results - Summary

    Monte Carlo Black-Scholes Simulation
        Same as performance of hand optimized code
    Matrix multiply
        Block data layout
        194 gflops
        95% of max processor performance
    SAR (synthetic aperture RADAR)
        End to end timing including 0 flop cornerturn
        Sustained 88 gflops
        87x the algorithm on a 500 MHz quad AltiVec board (13.6x normalized for clock speed)

    Similar results for multi-core and networked X86
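The two derived figures on this slide can be checked with simple arithmetic. The sketch assumes a 3.2 GHz Cell/B.E. clock, which the slide does not state; the 500 MHz comparison clock, the 87x speedup, the 194 gflops result, and the 95% figure are taken from the slide.

```python
CELL_CLOCK_GHZ = 3.2       # assumption: not stated on the slide
ALTIVEC_CLOCK_GHZ = 0.5    # 500 MHz quad AltiVec board, from the slide

raw_speedup = 87.0
clock_ratio = CELL_CLOCK_GHZ / ALTIVEC_CLOCK_GHZ     # 6.4
normalized_speedup = raw_speedup / clock_ratio       # about 13.6

matmul_gflops = 194.0
fraction_of_peak = 0.95
implied_peak_gflops = matmul_gflops / fraction_of_peak   # about 204

print(round(normalized_speedup, 1), round(implied_peak_gflops, 1))
```

Under that clock assumption, dividing the 87x speedup by the clock ratio reproduces the 13.6x figure, and the matrix multiply result implies a peak of roughly 204 gflops.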

    22/24

    Enabling Legacy Code in a Multicore Environment

    Bottom up conversion
        Incrementally convert legacy code
        Target compute-intensive portions of legacy code
    Gedae compiler produces a library of C callable functions from a library of Gedae One functions
        Build your middleware library using portable and maintainable code

    23/24

    Recoding Legacy Code for Multicore Environments

    Top down conversion
    Gedae offers enough productivity to make even the most demanding recoding tasks affordable
        Gedae One provides 2-10x productivity increase over C code for single core processors
        Gedae's compiler offers 4-10x productivity increase by automating the implementation of software for multicores from single processor implementations written in Gedae One
        Aggregate productivity increase of 8-100x is sufficient for even the most demanding applications
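The aggregate range follows from multiplying the two ranges end to end, assuming the language gain and the compiler gain are independent and compound. A minimal check, with a hypothetical helper name:

```python
def aggregate_gain(language_range, compiler_range):
    """Multiply the low ends together and the high ends together."""
    (lang_lo, lang_hi), (comp_lo, comp_hi) = language_range, compiler_range
    return lang_lo * comp_lo, lang_hi * comp_hi


# 2-10x from Gedae One, 4-10x from the compiler (figures from the slide).
low, high = aggregate_gain((2, 10), (4, 10))
```

This gives 8 at the low end and 100 at the high end, matching the 8-100x on the slide.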

    24/24

    Gedae is a technology enabler for rapid, cost-effective software certification