“What and how can the Open64 community collaborate more closely?” - our experiences and ideas. Open64 Developers Forum 2010, Embedded Software Consortium. Jenq Kuen Lee, Chairman, MOE Embedded Software Consortium, Taiwan; Professor, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan. [email protected]


Page 1: “What and how can the Open64  community collaborate more closely?” - our experiences and ideas

“What and how can the Open64 community collaborate more closely?” - our experiences and ideas

Open64 Developers Forum 2010

Embedded Software Consortium

Jenq Kuen Lee
Chairman, MOE Embedded Software Consortium, Taiwan
Professor, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
[email protected]

Page 2

Outline

• Our experience with Open64 for compiler research
  – Programming Language and Compiler Research Lab, Tsing-Hua Univ., Taiwan
  – Major research funding from Taiwan MOEA
• What and how can the Open64 community collaborate more closely?

Workshop on Embedded Systems Education, 2009


Page 3

• Compiler for VLIW DSP processors with distributed register files
  – Local register allocation
    • Register bank assignment and cluster assignment for distributed register architectures
    • “PALF: Compiler Supports for Irregular Register Files in Clustered VLIW DSP Processors” [Lin, CCPE’07]
  – Global register allocation
    • Global decisions on register bank assignment of multiple basic blocks
    • “LC-GRFA: Global Register File Assignment with Local Consciousness for VLIW DSP Processors with Non-uniform Register Files” [Lu, CCPE’09]
  – Improved register spilling
    • Spilling data to unoccupied register banks rather than to memory
    • “Expression Rematerialization for VLIW DSP Processors with Distributed Register Files” [Wu, CPC’09]
  – SIMD intrinsics
    • Means and essential optimizations for users to write high-performance code for VLIW DSP
    • “SIMD Intrinsic Supports for VLIW DSP Processors with Distributed Register Files” [Kuan, CPC’10]
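The "improved register spilling" item above (spill to an unoccupied register bank before falling back to memory) can be illustrated with a toy sketch. This is not the algorithm from the cited papers or the PACDSP compiler: the bank count, the bank size, and the names `RegFiles`, `SpillSlot`, and `choose_spill_slot` are all hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical machine model: 3 register banks with 4 registers each. */
#define NUM_BANKS 3
#define REGS_PER_BANK 4

/* occupied[b][r] is true when register r of bank b holds a live value. */
typedef struct {
    bool occupied[NUM_BANKS][REGS_PER_BANK];
} RegFiles;

/* Where a spilled value ends up. */
typedef struct {
    bool to_memory; /* true: no free register in any other bank */
    int bank;       /* valid only when to_memory is false */
    int reg;        /* valid only when to_memory is false */
} SpillSlot;

/* Pick a spill destination for a value that lives in home_bank:
 * prefer a free register in another bank, use memory as the last resort. */
static SpillSlot choose_spill_slot(RegFiles *rf, int home_bank) {
    for (int b = 0; b < NUM_BANKS; b++) {
        if (b == home_bank) continue; /* the home bank is the one under pressure */
        for (int r = 0; r < REGS_PER_BANK; r++) {
            if (!rf->occupied[b][r]) {
                rf->occupied[b][r] = true; /* claim the register for the spilled value */
                return (SpillSlot){ false, b, r };
            }
        }
    }
    return (SpillSlot){ true, -1, -1 }; /* every other bank is full */
}
```

A real allocator would also weigh the cost of inter-bank copies against memory accesses; the sketch only shows the order of preference.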

[Figure: compiler phase flow for PACDSP, from Source Code to Assembly Code. Phases shown: FrontEnd; WHIRL-Level Optimizer (IPA, WOPT, LNO...); Lowering / Code Selection / Intrinsic; Loop Optimization; Unroll; SWP; Hyperblock Formation / If-Conversion; EBO Pre Process; Control Flow Optimization; EBO Process; LC-GRFA; Global Scheduling (Before RA); GRA; LRA; SA-Based LRFA; PALF-LRFA; EBO Post Process; Global Scheduling (After RA); Local Instruction Scheduling; Global Code Motion; Low-Power Optimization; CIO; Code Emission. Legend: New Phases for PACDSP; Specially Tuned for PACDSP; Ported for Target Dependency; Original Phases of Open64.]

Compiler for VLIW DSP processors with distributed register files

Page 4

• Probabilistic pointer analysis (PPA)
  – Quantitative
    • Aggressive optimizations can be applied
  – Fast
    • With the aid of SSA form, explicit def-sites can be found in linear time
  – χ in SSA form helps find potential def-sites that cannot be known by symbolic checking
  – Implemented in Opt_ssa.cxx in the WOPT phase
  – Incorporated with edge profiling to acquire more accurate points-to information
  – Optimizations
    • Points-to information can be used to guide memory locality optimization in the presence of pointers.
    • Speculative execution
    • Transactional memory
    • Code specialization
    • Data layout assignments

    int *p, *q, v, u;
    p = &v;
    q = &u;
    while ( … )       // condition 1
        if ( … )      // condition 2
            p = q;
        else
            q = p;
    *p = …            // where does p point to?
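To make the "quantitative" aspect concrete, here is a minimal sketch that computes a points-to probability distribution for the pointer example above. It is a toy model, not the implementation in Opt_ssa.cxx and not the algorithm of the cited paper: the names `PointsToDist`, `step`, and `report` are hypothetical, and the branch probability `take_then` (the chance that condition 2 holds) is an assumed input of the kind edge profiling could supply.

```c
#include <stdio.h>

/* Targets a pointer may reference in the slide's example. */
enum { V = 0, U = 1 };

/* prob[i][j]: probability that p points to target i and q points to target j. */
typedef struct {
    double prob[2][2];
} PointsToDist;

/* State after "p = &v; q = &u;": p -> v and q -> u with certainty. */
static PointsToDist initial_dist(void) {
    PointsToDist d = {{{0.0, 0.0}, {0.0, 0.0}}};
    d.prob[V][U] = 1.0;
    return d;
}

/* One loop iteration: "p = q" with probability take_then, otherwise "q = p". */
static PointsToDist step(PointsToDist in, double take_then) {
    PointsToDist out = {{{0.0, 0.0}, {0.0, 0.0}}};
    for (int p = 0; p < 2; p++) {
        for (int q = 0; q < 2; q++) {
            double w = in.prob[p][q];
            out.prob[q][q] += w * take_then;         /* p = q: both point to q's target */
            out.prob[p][p] += w * (1.0 - take_then); /* q = p: both point to p's target */
        }
    }
    return out;
}

/* Marginal probability that p points to the given target. */
static double prob_p_points_to(PointsToDist d, int target) {
    return d.prob[target][V] + d.prob[target][U];
}

/* Print the distribution of p at "*p = ..." after a fixed number of iterations. */
static void report(double take_then, int iterations) {
    PointsToDist d = initial_dist();
    for (int i = 0; i < iterations; i++) {
        d = step(d, take_then);
    }
    printf("P(p -> v) = %.3f, P(p -> u) = %.3f\n",
           prob_p_points_to(d, V), prob_p_points_to(d, U));
}
```

In this model p and q alias after the first iteration, so further iterations leave the distribution unchanged: with `take_then = 0.25` and at least one iteration, p points to u with probability 0.25 and to v with probability 0.75. An optimizer could use such numbers to decide, for example, whether speculating on `*p` being `v` is worthwhile.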

PPA in SSA Form of Open64

[Figure: DSPs accessing internal memory and external memory through a software cache API. Multi-level memory systems: internal memory (small & fast) and external memory (large & slow).]

Interprocedural Probabilistic Pointer Analysis, Peng-Sheng Chen, Yuan-Shin Hwang, Dz-Ching Ju, Jenq Kuen Lee, IEEE Transactions on Parallel and Distributed Systems, Volume 15, Issue 10, pp. 893-907, Oct. 2004.

Page 5

OpenCL Compiler Support Based on Open64 for MPUs+GPUs

• OpenCL is an emerging standard for heterogeneous multicore programming.
• We have incorporated the Open64 compiler into the ATI SDK
  – Syntax support
    • Qualifiers
    • Vector data types
    • Built-in functions
  – Future directions
    • WHIRL/CGIR optimizations
    • Data locality and SIMD optimizations

Our Ongoing Work with OpenCL


[Figure: OpenCL compilation flow showing two paths, the ATI SDK LLVM approach and the Open64 approach, which reuses stub code and metadata. Components shown: kernel.cl; clc; prelink.bc; builtin-x86.bc; internal optimizer and linker; opt.s; llvm-extract/llc; stub/metadata; opencc; OpenCL_kernel.s; lib.c; as; ld; libatiocl.so. Legend: clc: OpenCL-LLVM front-end; .bc: LLVM IR files.]

Page 6

MOE ESW Consortium, Taiwan

[Figure: organization chart. Entities shown: MOE Advisory Office; Advisory Committee; ESW consortium; SoC Consortium; other consortiums; ES Design contest; partner universities; collaboration with TEIA; collaboration with NSC Open Source/Embedded Program.]

Page 7

Develop 25 courses and lab modules on Embedded System Software

Year  Course
2003  Embedded Real-Time Operating System
2004  USB Driver Design & Implementation
      Toolchain for embedded software
      I/O Architecture & Device Drivers
      Embedded Software for Networked SoC Systems
2005  Embedded Compiler Design
      Embedded OS Implementation
      Embedded Microcontroller System
      Embedded System Programming
2006  Interface Design
      Embedded System Implementation
      Embedded Middleware Design
      Embedded System Overview
2007  Embedded Systems and software engineering (1)
      Embedded Systems and software engineering (2)
      Hands-on Lab Development Based on Local Platforms
      Embedded Multi-core System and Software
2008  Embedded Multimedia Design
      Java Software for Embedded system
      Lab modules of Embedded Hw/Sw Co-design and Analysis
2009  Heterogeneous Multi-cores
      OpenSparc Lab modules
2010  Innovative Embedded System Curriculum on Android Platforms
      CPS and wireless sensor network
      Embedded multi-core programming languages

Page 8

Embedded course development flow of the ESW

[Flowchart: the embedded software curricula draw on curricula from ACM, IEEE-CS, and other universities, on the advisory board, and on inputs from other task groups, professors, and industry executives. For each new course or course module, the ESW office seeks project leaders and teams up a course development team. Course development and a course trial run follow, supported by regular course development meetings and regular course promotion workshops, and lead to the deployment phase.]

Page 9

Promote Open64 via Collaboration

Curriculums
• Open64 courses and textbooks
• Hands-on labs
• Make it easy to overcome engineering challenges with Open64, so that students can focus on scientific innovations.

[Figure: proposed teaching materials. Labels shown: “Encyclopedia Compilers”; “Open-64”; “How to devise Compiler to deliver optimal performance on Open-64”; Lecture Notes; Hands-on Labs.]

Page 10

Possible Collaboration on Joint Research Projects

• Potential collaboration items with OpenCL on Open-64

[Figure: candidate collaboration items. Labels shown: OpenCL and CUDA front-end; OpenCL optimizations at WHIRL & CGIR; up-to-date C/C++ front-end; code generation for new targets; ATI GPU; Nvidia GPU; embedded multicore; optimizations for Embedded or Green; code size; low-power; Google Android; IBM & UIUC Blue Waters.]

Page 11

Wish List: Serializable CGIR

As a research compiler
• being able to save/restore the current state is really important
  – a valuable observation may disappear after other team members change a prior phase’s implementation, and then we have to find the same case in other benchmarks/applications again and again
• providing an interface at the entry point of the CG phase is also important
  – sometimes we want to use a different compiler’s optimizations just before the CG phase
    • in order to compare the optimization capabilities of different compilers’ front-ends & middle-ends
    • in order to take advantage of other implementations
      – for example, to use LLVM for optimizing the code, and output directly to the CG phase to perform the existing optimizations & generate code
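The save/restore wish above can be sketched as a round trip of an instruction list through a file. This is only an illustration of the idea under strong assumptions: `ToyOp`, `save_ops`, and `load_ops` are hypothetical names, and Open64's real CGIR operations carry far more state (operands, annotations, basic-block structure) than this toy record.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical toy instruction record, standing in for a CGIR operation. */
typedef struct {
    int32_t opcode;
    int32_t result;     /* virtual register written */
    int32_t operand[2]; /* virtual registers read */
} ToyOp;

/* Write the record count followed by the raw records; returns 0 on success. */
static int save_ops(FILE *f, const ToyOp *ops, uint32_t count) {
    if (fwrite(&count, sizeof count, 1, f) != 1) return -1;
    if (count > 0 && fwrite(ops, sizeof *ops, count, f) != count) return -1;
    return 0;
}

/* Read records back into ops (capacity max); returns the count or -1 on error. */
static long load_ops(FILE *f, ToyOp *ops, uint32_t max) {
    uint32_t count;
    if (fread(&count, sizeof count, 1, f) != 1) return -1;
    if (count > max) return -1; /* refuse to overflow the caller's buffer */
    if (count > 0 && fread(ops, sizeof *ops, count, f) != count) return -1;
    return (long)count;
}
```

A raw binary dump like this is only readable on the same machine and compiler configuration that wrote it; a format meant to be shared between teams or fed from another compiler such as LLVM would need an explicit, versioned encoding.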

Page 12

Wish List: Replace Build System by CMake

• More powerful dependency analysis
  – it enables parallel make easily
    • on an Intel Core 2 Extreme QX9650 3.4GHz (O.C.) machine, building a full PACC32 compiler (based on Open64 4.0) with gmake -j5 takes no more than 5 minutes
  – it’s convenient to release products in binary form
    • rpath can be easily set up with simple CMake commands
    • any required runtime libraries can be included in binary packages automatically
  – system inspection and the build process are faster than with autotools (autoconf/automake/libtool), which are also not used in the Open64 project so far