Computer Architecture and System Research For Future Computing Jaehyuk Huh

Page 1: Computer Architecture and System Research For Future Computing Jaehyuk Huh

Computer Architecture and System Research For Future Computing

Jaehyuk Huh

Page 2: Computer Architecture and System Research For Future Computing Jaehyuk Huh

Computer Architecture

Research Area

Parallel Programming

Exploit multicore/streaming technologies

Parallel programming is still difficult

Future Applications

Workload Characterization

Find parallelism

High-performance applications

Embedded applications

Multicores

GPGPU

Low-power Embedded

• Research Activities:

– Architectural characterization of future applications

– Mapping applications to parallel architecture

– HW enhancement based on characterization

– HW support for parallel programming productivity

– Exploit semiconductor technology advancement

• Research Goal: Explore microarchitectural and software-based solutions to enhance future computing environments

• Research Area: Computer Architecture, Parallel Processing

Page 3: Computer Architecture and System Research For Future Computing Jaehyuk Huh

Multicore Research

4 cores in 2008; 10s-100s of cores in 2018

[Figure: multicore diagram with cores P0-P15, each with I and D caches; labels: Directory for L2 coherence, L2 Banks + Networks]

• Configurable Cache Design: Shared Non-Uniform Cache Architecture

• Communication Latency: Coherence Decoupling

• Communication Bandwidth: Subspace Snooping with Optical Interconnect

[Figure labels: Cores, Coherence Subspace, High BW Optical Interconnect, Coherence Networks]

[Figure labels: Consumer core, Producer core, Data request, Data return, Data prediction]

Page 4: Computer Architecture and System Research For Future Computing Jaehyuk Huh

Past Multicore Research

• Shared Non-Uniform Cache Architecture (ICS05/IEEE TPDS07)

– Novel shared cache for multi-cores: low hit latencies

– Provide configurability: per-application/per-line sharing degree

• Decoupling of Coherence Protocols (ASPLOS04/IEEE Micro Top Picks 04)

– Exploit speculative execution in out-of-order cores

– Decouple coherence performance from correctness

• Subspace Snooping

– Adapt coherence protocols to application patterns

– Reduce unnecessary probes in snoop-based coherence

• Multi-core Design Space Exploration (PACT 01)

• MP-Sauce: Multiprocessor Full-system Simulator

• Cache Burst: Dead Block Prediction (MICRO 08)
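
The "reduce unnecessary probes" point above can be illustrated with a generic snoop filter: rather than broadcasting every request to all cores, a requester probes only the cores recorded as possible holders of the address region. The sketch below is a toy model of that general idea, not the Subspace Snooping design itself. The class name `RegionSnoopFilter`, the 4 KB region size, and the bookkeeping are all assumptions made for illustration.

```python
class RegionSnoopFilter:
    """Toy snoop filter (hypothetical): tracks which cores touched each region."""

    def __init__(self, num_cores, region_bytes=4096):
        self.num_cores = num_cores
        self.region_bytes = region_bytes   # assumed region granularity
        self.sharers = {}                  # region number -> cores that touched it
        self.probes_sent = 0               # probes sent with filtering
        self.probes_broadcast = 0          # probes a full broadcast would send

    def access(self, core, address):
        """Record an access and return the set of cores that must be probed."""
        region = address // self.region_bytes
        holders = self.sharers.setdefault(region, set())
        targets = holders - {core}         # never probe the requester itself
        self.probes_sent += len(targets)
        self.probes_broadcast += self.num_cores - 1
        # Conservative: cores are never removed, so the filter may
        # over-approximate the holders but never misses one.
        holders.add(core)
        return targets


if __name__ == "__main__":
    snoop_filter = RegionSnoopFilter(num_cores=4)
    snoop_filter.access(0, 0x1000)   # first touch of the region: nobody to probe
    snoop_filter.access(1, 0x1008)   # same region: only core 0 is probed
    snoop_filter.access(2, 0x9000)   # different region: nobody to probe
    print(snoop_filter.probes_sent, "probes instead of",
          snoop_filter.probes_broadcast)
```

A hardware scheme would also have to handle evictions and bound the table size; this model ignores both to keep the probe-counting logic visible.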

Page 5: Computer Architecture and System Research For Future Computing Jaehyuk Huh

Future Research

Future Applications

• Computer vision, data mining, physical simulation, synthesis

• Embedded applications

• Find HW resource requirements

• Find available parallelism
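
One standard way to reason about "available parallelism" is Amdahl's law, which bounds speedup by the fraction of execution time that can run in parallel. The sketch below is illustrative only and is not taken from the slides. The function name `amdahl_speedup` and the example fractions and core counts are assumptions.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction: share of execution time that can be parallelized (0..1).
    cores: number of cores the parallel part is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workloads: 50%, 90%, and 99% parallelizable.
    for fraction in (0.5, 0.9, 0.99):
        for cores in (4, 16, 100):
            bound = amdahl_speedup(fraction, cores)
            print(f"p={fraction:.2f} cores={cores:3d} speedup<={bound:.2f}")
```

Under these assumptions, a workload that is 90% parallel is bounded at 6.4x on 16 cores, which is one reason characterizing the serial portion matters as core counts grow.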

Parallel Programming Productivity

• Help programmers write better parallel programs faster

• Transactional memory

• Support for race detection and performance tuning
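
To make "race detection" concrete, the sketch below shows a simplified software version of the well-known lockset idea: a shared variable is flagged when no single lock was held on every recorded access to it. This is a generic illustration, not the hardware support the slides propose. The class `LocksetChecker` and its methods are hypothetical, and the sketch omits the initialization and read-sharing states that practical tools add to cut false positives.

```python
from collections import defaultdict


class LocksetChecker:
    """Simplified lockset check (hypothetical helper, for illustration only)."""

    def __init__(self):
        self.held = defaultdict(set)   # thread id -> locks currently held
        self.candidates = {}           # variable -> locks held on every access so far
        self.warnings = set()          # variables whose candidate set became empty

    def acquire(self, thread, lock):
        self.held[thread].add(lock)

    def release(self, thread, lock):
        self.held[thread].discard(lock)

    def access(self, thread, variable):
        """Record an access; intersect the candidate set with locks held now."""
        locks_now = set(self.held[thread])
        if variable not in self.candidates:
            self.candidates[variable] = locks_now
        else:
            self.candidates[variable] &= locks_now
        if not self.candidates[variable]:
            self.warnings.add(variable)   # no common lock protects this variable


if __name__ == "__main__":
    checker = LocksetChecker()
    # Both threads update "counter" while holding lock "m": consistent locking.
    for thread in (1, 2):
        checker.acquire(thread, "m")
        checker.access(thread, "counter")
        checker.release(thread, "m")
    # Thread 1 locks before touching "total"; thread 2 does not.
    checker.acquire(1, "m")
    checker.access(1, "total")
    checker.release(1, "m")
    checker.access(2, "total")
    print(sorted(checker.warnings))
```

Because it intersects lock sets rather than ordering events, this check reports potential races even when the conflicting accesses never overlapped in the observed run.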

Multicore Architecture

• Adaptive cache systems

• Efficient resource sharing among cores

• Improve communication and coherence mechanisms

• Power constraints for embedded systems

Alternative Parallel Architectures (Streaming/GPGPU)

• CUDA (Nvidia), CTM (AMD)

Computer Architecture

• HW optimization by characteristics

• SW management of HW resources

• How to map applications to parallel architectures

• Exploit efficient resource sharing in multicore

• HW support for parallel programming

• Exploit inherent parallelism in applications

• Heterogeneous Multicores

• Fusion of two programming models

Page 6: Computer Architecture and System Research For Future Computing Jaehyuk Huh

Principal Investigator

• Publications: www.cs.utexas.edu/users/jhhuh

• Experience

– Sr. Design Engineer, Advanced Micro Devices (AMD)

– Summer Intern, IBM Austin Research Laboratory

– Summer Intern, IBM T.J. Watson Research Center

– Research/Teaching Assistant, The University of Texas at Austin

• Education

– Ph.D. in Computer Sciences, The University of Texas at Austin, Advisor: Prof. Doug Burger

– M.S. in Computer Sciences, The University of Texas at Austin

– B.S. in Computer Science and Statistics, Seoul National University