Simon Segars, EVP & GM, ARM, August 18th, 2011. ARM Processor Evolution: Bringing High Performance to Mobile Devices

ARM Processor Evolution - Hot Chips


Page 1: ARM Processor Evolution - Hot Chips

1

Simon Segars EVP & GM, ARM August 18th, 2011

ARM Processor Evolution: Bringing High Performance to Mobile Devices

Page 2: ARM Processor Evolution - Hot Chips

2

1980’s mobile computing

Page 3: ARM Processor Evolution - Hot Chips

3

Hot Chips 1981

 4MHz Z80 Processor

 64KB memory

 Floppy drives

 5” screen

 24.5 lbs

 $1,795

 11,000 units sold

Osborne 1 image courtesy of www.oldcomputers.net

Page 4: ARM Processor Evolution - Hot Chips

4

Hot Chips 1983

 Motorola DynaTAC

 $3995

 30 minutes talk time

 10 hours charge time

 No texting or Bluetooth

Page 5: ARM Processor Evolution - Hot Chips

5

30 Years later…

 MacBook Air

 1.7GHz Processor

 8GB memory

 256GB storage

 13” screen

 2.96 lbs

 $1,599

Page 6: ARM Processor Evolution - Hot Chips

6

Your phone is a lot more useful too

 Samsung Galaxy S-II

 Android 2.3

 4.3” screen

 32GB memory

 Web, weather, Angry Birds

 ~1/20th the DynaTAC cost

Page 7: ARM Processor Evolution - Hot Chips

7

Connectivity driving computing

Source: Morgan Stanley, 2009

Page 8: ARM Processor Evolution - Hot Chips

8

Evolutionary Pressures

Functionality

Cost

Form Factor

Competition

Page 9: ARM Processor Evolution - Hot Chips

9

1990-2011

Page 10: ARM Processor Evolution - Hot Chips

10

A long time ago in a village far away….

 Acorn, Apple and VLSI formed a joint venture

  13 engineers and a CEO

 ARM the company was born

  21 years old in November

 Goal to design low power embedded 32b processors

 But never to manufacture them

Page 11: ARM Processor Evolution - Hot Chips

11

1993: An early mobile computer

 1.4lb

 Personal Applications

 Expansion socket

 Stylus interface

Page 12: ARM Processor Evolution - Hot Chips

12

1992: ARM7TDMI – ‘Thumb’

 A small, low power 32-bit core for wireless baseband

 74,000 transistors, 40MHz, 4.2mm² on 0.5µm

Page 13: ARM Processor Evolution - Hot Chips

13

Early GSM phone

 1 week battery life

 Contact list

 Calculator

 SMS

 Snake…

 Not exactly a mobile computer

Page 14: ARM Processor Evolution - Hot Chips

14

Evolution

Page 15: ARM Processor Evolution - Hot Chips

15

Moore’s law has helped a lot

1985: ARM1, 3µm, 6k gates, 7mm x 7mm

Cortex-M0, 20nm, 8k gates, 0.07mm x 0.07mm: 1/10,000th the size

Dec 2010

Cortex-A9 SoC, 40nm, 100M gates, 7.4mm x 6.9mm

  Blocks shown: Dual Cortex-A9, Mali-400, Cortex-M0 Subsystem, System, IO
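The "1/10,000th size" figure follows directly from the die dimensions quoted for the ARM1 and the Cortex-M0. A minimal sketch of that arithmetic, assuming both dies are square and using only the side lengths given on the slide (the constant and function names are illustrative):

```python
# Die side lengths quoted on the slide, in millimetres.
ARM1_SIDE_MM = 7.0         # ARM1, 1985
CORTEX_M0_SIDE_MM = 0.07   # Cortex-M0 at 20nm


def square_area_mm2(side_mm: float) -> float:
    """Area of a square die with the given side length (assumes a square die)."""
    return side_mm * side_mm


arm1_area = square_area_mm2(ARM1_SIDE_MM)
m0_area = square_area_mm2(CORTEX_M0_SIDE_MM)
area_ratio = arm1_area / m0_area

print(f"ARM1 area:      {arm1_area:.4f} mm^2")
print(f"Cortex-M0 area: {m0_area:.4f} mm^2")
print(f"ARM1 is roughly {area_ratio:,.0f}x the area of the Cortex-M0")
```

A 100x reduction in each side length gives a 10,000x reduction in area, while the gate count stays in the same range (6k versus 8k).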

Page 16: ARM Processor Evolution - Hot Chips

16

The next decade

There may be trouble ahead….

Page 17: ARM Processor Evolution - Hot Chips

17

First the good news

Over 4 Billion people connected via mobile phones

BIG NUMBERS!

Over 13 Billion smartphone apps downloaded in 2.5 years

During the 2010 Holiday period $230M was spent on eBay using smartphones

Smartphones will leapfrog over the PC in the developing world

LTE Deployment starting

In 2010 280M smartphones were shipped

Smartphone data traffic will exceed PC traffic in 2014

Page 18: ARM Processor Evolution - Hot Chips

18

Smartphone growing and growing

Source: Gartner Q4 2010 Forecast

[Chart, 2000 to 2013: Smartphone Volume (Mu, scale 0 to 800) and Relative Performance (scale 0 to 100), divided into Phase 1, Phase 2 and Phase 3]

Processor milestones shown on the chart:

ARM 926 100MHz

ARM 926 200MHz

ARM 11 500MHz

Cortex-A8 650MHz

Cortex-A8 1GHz

2x Cortex-A9 1GHz

4x Cortex-A9 1GHz

2x Cortex-A15 1.5GHz

Page 19: ARM Processor Evolution - Hot Chips

19

The “Superphone” for 2013

  Your primary internet access device

  Always connected broadband

  Enough compute power to replace your laptop

  Your content with you

  From HD to MP3 with cloud access

  Multi-core processor

  Multi-core graphics processor

  Multi-core coherency

  28nm implementation

  Security solution

  12+MP Camera

  1080p playback and capture

  128 Gbyte of storage

  45 hours of 1080p video

  HDMI video out

  Full browser with HTML v5 & Flash Player

  New UI paradigms

  Voice, gestures, Augmented Reality supported by OpenCL

  Full ‘office’ support

  Vibrant applications market

  HD Gaming on device or screen

Device Vision

Enabling Technology

Visual Experience

Applications
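The pairing of "128 Gbyte of storage" with "45 hours of 1080p video" implies an average video bitrate. A rough sketch of that calculation, assuming decimal gigabytes and that the whole 128 Gbyte holds video (neither is stated on the slide):

```python
STORAGE_GBYTE = 128       # from the slide
VIDEO_HOURS = 45          # from the slide
BYTES_PER_GBYTE = 10**9   # assumption: decimal gigabytes

total_bits = STORAGE_GBYTE * BYTES_PER_GBYTE * 8
total_seconds = VIDEO_HOURS * 3600

# Average bitrate needed for 45 hours of video to fill the storage exactly.
average_mbit_per_s = total_bits / total_seconds / 1e6

print(f"Implied average bitrate: {average_mbit_per_s:.2f} Mbit/s")
```

Under these assumptions the slide's figures work out to a little over 6 Mbit/s for 1080p content.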

Page 20: ARM Processor Evolution - Hot Chips

20

Another 1Bn sub $100 Computers

1+ Billion Opportunity

  Consumers, OEMs & MNO all want smartphones

  Emerging markets to bypass the PC

  Full internet connectivity

  Growth controlled by retail price, target $100-$200

  Requirement beyond ARM11™ performance

Access to full range of apps in Android Marketplace

Device Features

  Full Android™ OS

  Access to full range of apps in Google Marketplace

  Full browser

  Flash Player 10 support

  Full HTML 5 support

  3D UI accelerated with OpenGL® ES 2.0

Page 21: ARM Processor Evolution - Hot Chips

21

A whole lot of software

Operating Systems: innovating on a 6-12 month period

  Diversity and rapid mobile OS innovation

Web Technology: support for all standards

  Internet Explorer, Google Chrome, Mozilla Firefox, <HTML> v5, Webkit, Flash Player 10, AIR

  Full connectivity to the Internet and the Cloud

Apps & App Stores: 100,000’s of Apps, 15+ Billion Downloads

  Application diversity delivers the personalized compute experience

Page 22: ARM Processor Evolution - Hot Chips

22

The center of your world

Page 23: ARM Processor Evolution - Hot Chips

23

TrustZone in 3 Steps

1. Define secure hardware architecture

  Two separate domains: normal and secure

  Extends across system: processor, display, keypad, memory, clock, radios

2. Implement in silicon system on chip (SoC)

  Physically enforcing secure/normal separation

3. Combine SoC with Secure OS

  Separate but connected to main operating system

Result: A Trusted Execution Environment

  Ready to develop and deploy trusted services

[Diagram: Trusted Execution Environment. The CPU is shown in two domains, NORMAL running the Rich OS and SECURE running the Secure OS]
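The normal/secure split described in the three steps can be illustrated with a toy model. This is only a sketch of the idea that every access carries the domain of its requester and that secure resources reject normal-domain accesses. It is not the TrustZone programming interface, and the class and region names (`ToySoC`, `key_store`, `framebuffer`) are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class World(Enum):
    NORMAL = "normal"
    SECURE = "secure"


@dataclass(frozen=True)
class Region:
    name: str
    secure_only: bool  # True if only the secure domain may touch it


class ToySoC:
    """Toy model: each access is tagged with the world that issued it."""

    def __init__(self, regions):
        self._regions = {r.name: r for r in regions}
        self._data = {r.name: None for r in regions}

    def _check(self, world, region_name):
        region = self._regions[region_name]
        if region.secure_only and world is not World.SECURE:
            raise PermissionError(
                f"{world.value} world cannot access {region_name}")

    def write(self, world, region_name, value):
        self._check(world, region_name)
        self._data[region_name] = value

    def read(self, world, region_name):
        self._check(world, region_name)
        return self._data[region_name]


soc = ToySoC([Region("framebuffer", secure_only=False),
              Region("key_store", secure_only=True)])

soc.write(World.SECURE, "key_store", "secret-key")
soc.write(World.NORMAL, "framebuffer", "hello")

# The normal world (Rich OS) is refused access to the secure region.
try:
    soc.read(World.NORMAL, "key_store")
    normal_blocked = False
except PermissionError:
    normal_blocked = True

print("normal world blocked from key_store:", normal_blocked)
```

In real hardware the check is enforced in silicon across the bus and peripherals, as step 2 says. The Python exception here only stands in for that enforcement.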

Page 24: ARM Processor Evolution - Hot Chips

24

But there are challenges

Transistor scaling

Battery scaling

Modems

Implementation

Page 25: ARM Processor Evolution - Hot Chips

25

Modem relative performance

 4G modem ~500x more complex than 2G

  Control processor

  Dedicated data processing engines

  More silicon area

  More power consumed

[Bar chart: relative modem complexity for 2G, 3G, 3.9G LTE and 4G, on a scale of 0 to 600]

Page 26: ARM Processor Evolution - Hot Chips

26

Batteries haven’t helped

 Historical 11% capacity growth

  Not well matched to Moore's Law

 Continued innovation required just to maintain 11%

  New Si alloy materials or anode Carbon Nano Tubes may help

Assuming 12 hours of use per day

4.5 kCal

30g

255 kCal

49g

x2.2

Capacity
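The mismatch between 11% battery capacity growth and Moore's Law can be made concrete with compound growth. A small sketch, assuming the 11% figure is annual and that Moore's Law means a doubling every two years (a common reading, not stated on the slide):

```python
BATTERY_GROWTH_PER_YEAR = 0.11   # from the slide, assumed to be annual
DOUBLING_PERIOD_YEARS = 2.0      # assumption: one common reading of Moore's Law


def battery_capacity_factor(years: float) -> float:
    """Capacity multiple after compounding 11% growth for `years` years."""
    return (1 + BATTERY_GROWTH_PER_YEAR) ** years


def transistor_factor(years: float) -> float:
    """Transistor-count multiple after `years` years of periodic doubling."""
    return 2 ** (years / DOUBLING_PERIOD_YEARS)


for years in (1, 5, 10):
    print(f"{years:>2} years: battery x{battery_capacity_factor(years):.2f}, "
          f"transistors x{transistor_factor(years):.1f}")
```

Over a decade the battery figure stays below 3x while the transistor figure reaches 32x under these assumptions, which is the gap the slide points at.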

Page 27: ARM Processor Evolution - Hot Chips

27

Challenge: Implementation Complexity

 ARM7TDMI

  74K transistors

  4.2 mm² in 0.5µm technology

  Largely hand-designed

  3 corners, 1 voltage domain

 Cortex-A9 MP Dual Core

  >20M transistors

  3.4 mm² in 28nm technology

  Complex timing sign-off

  12+ corners, 3 voltage domains

  DFM, Variation modeling, …
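The two data points above allow a quick transistor-density comparison. A rough sketch, treating the Cortex-A9 figure of ">20M" as exactly 20M (so the result is a lower bound) and comparing against ideal area scaling from 0.5µm to 28nm:

```python
# Figures from the slide.
ARM7TDMI_TRANSISTORS = 74_000
ARM7TDMI_AREA_MM2 = 4.2
ARM7TDMI_NODE_NM = 500          # 0.5 micron

A9_DUAL_TRANSISTORS = 20_000_000  # slide says ">20M", so this is a lower bound
A9_DUAL_AREA_MM2 = 3.4
A9_DUAL_NODE_NM = 28

arm7_density = ARM7TDMI_TRANSISTORS / ARM7TDMI_AREA_MM2   # transistors per mm^2
a9_density = A9_DUAL_TRANSISTORS / A9_DUAL_AREA_MM2

density_ratio = a9_density / arm7_density

# Ideal scaling: density grows with the square of the feature-size ratio.
ideal_area_scaling = (ARM7TDMI_NODE_NM / A9_DUAL_NODE_NM) ** 2

print(f"ARM7TDMI:  {arm7_density:,.0f} transistors/mm^2")
print(f"Cortex-A9: {a9_density:,.0f} transistors/mm^2 (lower bound)")
print(f"Density ratio: {density_ratio:.0f}x, ideal scaling: {ideal_area_scaling:.0f}x")
```

Both ratios come out at a few hundred, so the density gain is roughly what feature-size scaling alone would predict. The added difficulty the slide lists is in sign-off corners, voltage domains and variation modeling.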

Page 28: ARM Processor Evolution - Hot Chips

28

Disaggregation – pros and cons

1970s: Vertical Suppliers (Fully Vertically Integrated)

1980s: ASIC Vendors (System Manufacturer, ASIC Vendor)

1990s: Fabless Semis (Design & Distribution, Manufacture, EDA)

Today: IP Driven Design (Design, IP, EDA, Manufacture, Foundry)

Page 29: ARM Processor Evolution - Hot Chips

29

Fabs: not as popular as they used to be

130nm: Altis Semiconductor, Dongbu Hitek, Freescale, Fujitsu, Globalfoundries, Grace Semiconductor, IBM, Infineon, Intel, Panasonic, Renesas (NEC), Samsung, Seiko Epson, SMIC, Sony, ST Microelectronics, Texas Instruments, Toshiba, TSMC, UMC

90nm: Dongbu Hitek, Freescale, Fujitsu, Globalfoundries, Grace Semiconductor, IBM, Infineon, Intel, Panasonic, Renesas (NEC), Samsung, Seiko Epson, SMIC, Sony, ST Microelectronics, Texas Instruments, Toshiba, TSMC, UMC

65nm/55nm: Freescale, Fujitsu, Globalfoundries, IBM, Infineon, Intel, Panasonic, Renesas (NEC), Samsung, SMIC, Sony, ST Microelectronics, Texas Instruments, Toshiba, TSMC, UMC

45/40nm: Fujitsu, Panasonic, Globalfoundries, IBM, Intel, Renesas (NEC), Samsung, SMIC, ST Microelectronics, Toshiba, TSMC, UMC

32/28nm: Globalfoundries, Intel, Panasonic, Samsung, ST Microelectronics, TSMC, UMC

22/20nm: Globalfoundries, Intel, Samsung, ST Microelectronics, TSMC

Page 30: ARM Processor Evolution - Hot Chips

30

Device Scaling

[Roadmap figure: process nodes 65nm, 40nm, 32nm, 28nm, 20nm, 14nm, 10nm, 8nm, ?nm]

Techniques shown along the roadmap: Strain, High-K, 2x Patterning, 3x Patterning? EUV??, ????

Page 31: ARM Processor Evolution - Hot Chips

31

Think different!

[Diagram labels: Cortex-Big Multi-CPU; Multi-Gx; Hardware Platform; Runtime; Object; Compiler; Application; Kernel; CPU-Audio; Modem; Video; Audio; 5G; Web; Modem; WiFi]

Page 32: ARM Processor Evolution - Hot Chips

32

Advanced Processing

Cortex-A15 for High-end, power-efficient processing

  15-Stage Integer Pipeline

 2 extra cycles for multiply, load/store

 2-10 extra cycles for complex media instructions

[Pipeline diagram: Fetch, Decode, Rename and Dispatch (sections labelled 5 stages and 7 stages), followed by Issue and WB stages for the Int., Branch, Multiply, Load/Store and NEON/FPU paths]
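The pipeline figures above can be tabulated per instruction class. A minimal sketch, assuming each "extra cycle" adds one stage on top of the 15-stage integer pipeline (an interpretation of the slide, not a statement from ARM documentation). The class names are illustrative.

```python
INTEGER_PIPELINE_STAGES = 15  # from the slide

# (minimum, maximum) extra cycles per instruction class, from the slide.
EXTRA_CYCLES = {
    "integer": (0, 0),
    "multiply": (2, 2),
    "load/store": (2, 2),
    "complex media": (2, 10),
}


def depth_range(instruction_class: str) -> tuple[int, int]:
    """Shortest and longest effective pipeline depth for a class."""
    low, high = EXTRA_CYCLES[instruction_class]
    return INTEGER_PIPELINE_STAGES + low, INTEGER_PIPELINE_STAGES + high


for name in EXTRA_CYCLES:
    low, high = depth_range(name)
    span = str(low) if low == high else f"{low}-{high}"
    print(f"{name:<14} {span} stages")
```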

Page 33: ARM Processor Evolution - Hot Chips

33

Cortex-A15 System Scalability

Introduces Cache Coherent Interconnect

  Processor to Processor Coherency and I/O coherency

  Memory and synchronization barriers

  Virtualization support with distributed virtual memory signaling

[Block diagram: two Quad Cortex-A15 MPCore clusters, each with four A15 cores, Processor Coherency (SCU) and up to 4MB L2 cache, connected over 128-bit AMBA 4 to the CoreLink CCI-400 Cache Coherent Interconnect, together with IO coherent devices behind an MMU-400 System MMU, and a GIC-400]

Page 34: ARM Processor Evolution - Hot Chips

34

Fully coherent SoCs

2011 Devices

  Full coherency within CPU cluster

  Limited I/O coherency

  Software managed coherency for SoC

2013 Devices

  Full coherency for multiple CPU clusters

  I/O coherency with graphics and other

  Simpler software programming model

2015 Devices

  Full coherency on CPU, GPU and other

  True General Purpose Compute

[Diagram: three device generations, arranged along axes labelled "Greater performance and interaction" and "Lower energy per transaction"]

2011: Applications Processor (Fully Coherent): 2x or 4x Cortex-A9. Graphics and Video (Non-coherent or Software Managed): Mali-400, Video

2013: Applications Processor (Fully Coherent): >4x Cortex-A15. Graphics and Video (I/O Coherent): Mali-T604, Video. CI

2015: Applications Processor (Fully Coherent): Next generation Compute cluster. Graphics and Video: Next Gen, Video. CI
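The difference between software managed coherency and I/O coherency can be shown with a toy memory model. This is a sketch under simplifying assumptions (one CPU cache, one device, whole-cache clean), and the class and method names are hypothetical rather than taken from any ARM interface.

```python
class ToyMemorySystem:
    """Toy model of a CPU write-back cache in front of DRAM."""

    def __init__(self):
        self.dram = {}
        self.cpu_cache = {}  # address -> value for dirty lines not yet written back

    def cpu_write(self, addr, value):
        # Writes land in the cache first; DRAM is not updated yet.
        self.cpu_cache[addr] = value

    def clean_cache(self):
        # Software managed coherency: the driver must write dirty lines back.
        self.dram.update(self.cpu_cache)
        self.cpu_cache.clear()

    def device_read_noncoherent(self, addr):
        # A non-coherent device only ever sees DRAM.
        return self.dram.get(addr)

    def device_read_io_coherent(self, addr):
        # An I/O coherent device read checks the CPU cache before DRAM.
        if addr in self.cpu_cache:
            return self.cpu_cache[addr]
        return self.dram.get(addr)


mem = ToyMemorySystem()
mem.dram[0x1000] = "old"
mem.cpu_write(0x1000, "new")

stale = mem.device_read_noncoherent(0x1000)    # sees DRAM, misses the CPU write
snooped = mem.device_read_io_coherent(0x1000)  # sees the cached CPU write

mem.clean_cache()
after_clean = mem.device_read_noncoherent(0x1000)

print(stale, snooped, after_clean)
```

The explicit `clean_cache` call is the kind of step that hardware coherency removes, which is what the "simpler software programming model" bullet refers to.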

Page 35: ARM Processor Evolution - Hot Chips

35

The holistic SoC Designer

IP Driven Design

Design

IP

EDA

Manufacture

Foundry

Architecture

Implementation

Page 36: ARM Processor Evolution - Hot Chips

36

Conclusion

 30 years have delivered incredible gains

  Mobile computing now ubiquitous

  Smartphones, Superphones, Tablets

 Silicon scaling has driven PPA gains

  But it has to end somewhere, so get ready!

 The future is heterogeneous

  Multi-core CPU, multi-CPU, dedicated engines

 A new battery would help too!

Page 37: ARM Processor Evolution - Hot Chips

37

Thank you.