Vladimir Stojanovic | Chief Architect, Co-Founder
SemiCon West
July 20-23, 2020
TeraPHY Optical I/O The new universal chip to chip I/O solution
© Ayar Labs, Inc. All rights reserved 2
Outline
1) Motivation: why are we working on this?
2) Optical I/O Requirements
3) Microring WDM I/O Architecture
4) TeraPHY – the Terabit/s optical PHY
5) Leveraging the chiplet ecosystem
6) Technology demonstrations
PIPES: Photonics in the Package for Extreme Scalability
[Figure: two log-scale trend plots. Power (W), 1 to 10,000, versus year, 1990 to 2030, with curves for power for off-chip I/O and total power per package. Socket bandwidth, 10 Gbps to 1 Pbps, versus year, 2000 to 2030, with a curve for bandwidth per socket.]
I/O power exceeds 300 W
What’s the problem? I/O bandwidth & power limits
highest-performance CPU, FPGA, GPU, ASIC
NVIDIA Tesla V100 GPU accelerator: 5120 cores, 125 teraflops, 300 W, $5K (in-package)
NVIDIA DGX-2 Enterprise AI: 16 GPUs, 2 petaflops, 10 kW, $400K (board-level)
IBM Summit Top supercomputer: 36,864 GPUs & CPUs, 200 petaflops, 13 MW, $300M (system-level)
Images courtesy of NVIDIA and IBM
DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.
Attacking the data movement bottleneck across microelectronics applications
I/O Trends and Requirements
• Electrical I/O bound by pin-count and signaling limitations
• Power used for I/O is increasing and unsustainable
• Datacenter workloads require new I/O architectures to sustain throughputs
– CPU, GPU, FPGA, Accelerator, resource pooling key to avoid stranded resources
[G. Keeler, DARPA ERI 2019]
I/O power exceeds
package power limit!
HBM2e
112G XSR
Large penalties for leaving the chip, package and board
~4 orders of
magnitude!
112Gbps SerDes likely the last long-range electrical I/O solution
*Source: Gordon Keeler, DARPA MTO, ERI Summit 2019
TeraPHY Optical I/O Extends In-Package I/O to End of Rack/Row
*Source: Gordon Keeler, DARPA MTO, ERI Summit 2019
scale-up and scale-out at the density and energy cost of in-package I/O
Target for Optical I/O:
TeraPHY technology span
HBM2e
112G XSR
Ayar Labs Optical I/O System Architecture
[Diagram: two host SoCs (CPU/GPU/FPGA/ASIC), each connected through electrical I/O on an organic package/interposer to a TeraPHY CMOS optical I/O chip. The TeraPHY chips are linked by optical links over single-mode fiber with up to 2 km reach. Each TeraPHY chip receives its CW light supply over single-mode (SM) fiber from a SuperNova multi-wavelength source. Typical in-package temperature: 80-110 °C. External laser module temperature: <55 °C.]
• Monolithic integration allows flexible electrical I/O interface to host SoC
• Wide parallel or high-speed serial
• Silicon interposer or organic substrate
• Remote laser source simplifies packaging
Requirements for Optical I/O
System requirements
• High-density optical devices and circuits
• Ecosystem: leverage CMOS infrastructure and I/O standards
• Scalable high-volume manufacturing
Technology requirements
• Wavelength division multiplexed (WDM) links
• Chiplet-driven, with tight integration of electronics
• Compatible with advanced packaging and fiber attach
Microring-based WDM Optical Architecture
• Off-chip light source produces continuous-wave (CW) laser light
• Light is coupled from fiber-to-chip through vertical grating couplers
• Microring modulator converts data from electrical domain to optical domain
• Microring detector converts data from optical domain to electrical domain
• Each microring modulator acts as both a modulator and a wavelength multiplexer
• Each microring detector acts as both a detector and a wavelength demultiplexer
• Cascading microrings along the same waveguide increases data per fiber
• Each microring acts as an independent communications channel
• Scalable architecture up to ~64 microrings
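The idea that each cascaded ring is an independent channel on a shared waveguide can be illustrated with a toy model. This is only a sketch under simplifying assumptions: wavelengths are represented as integer indices, rings are treated as ideal filters with no loss or crosstalk, and the function names (`modulate_onto_bus`, `drop_from_bus`, `fiber_bandwidth_gbps`) are hypothetical, not part of any Ayar Labs interface.

```python
from typing import Dict, List

MAX_RINGS = 64  # approximate scaling limit quoted on the slide


def modulate_onto_bus(lane_data: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """Toy transmit side: ring k writes its bits onto wavelength index k only."""
    if len(lane_data) > MAX_RINGS:
        raise ValueError("more lanes than the architecture is said to scale to")
    # Each ring only affects its own resonant wavelength (idealized), so the
    # shared waveguide carries the union of the per-wavelength bit streams.
    return {wavelength: list(bits) for wavelength, bits in lane_data.items()}


def drop_from_bus(bus: Dict[int, List[int]], wavelength: int) -> List[int]:
    """Toy receive side: a detector ring drops its own wavelength, ignores the rest."""
    return list(bus[wavelength])


def fiber_bandwidth_gbps(num_rings: int, rate_per_ring_gbps: float) -> float:
    """Aggregate data rate on one fiber when every ring runs at the same rate."""
    return num_rings * rate_per_ring_gbps


if __name__ == "__main__":
    bus = modulate_onto_bus({0: [1, 0, 1], 1: [0, 0, 1], 2: [1, 1, 0]})
    print(drop_from_bus(bus, 1))          # bits that were sent on wavelength 1
    print(fiber_bandwidth_gbps(8, 25.0))  # 8 rings at an assumed 25 Gbps each
```

Because the lanes do not interact in this idealized picture, adding a ring adds one lane's worth of bandwidth to the same fiber, which is the scaling argument the bullets make.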
Building WDM Systems
• Monolithic integration allows clocking, drivers, TIAs, and control circuitry to be integrated on the same chip as the optical devices
• The small size of microring devices monolithically integrated with CMOS transistors leads to high bandwidth density and energy efficiency
TeraPHY: Main features
• 24 channels of AIB (960 Gbps total data bandwidth)
• 10 photonics Tx/Rx macro pairs
• Configurable to 128-256 Gb/s per macro (1.28-2.56 Tb/s per chip)
• NRZ modulation format on the optical channel – no FEC required!
• <10 ns (AIB -> TeraPHY -> AIB) + 5 ns/m latency
• Configurable crossbar to map AIB channels to optical channels
• Reach: up to 2 km
• Estimated energy efficiency: <5 pJ/bit (all-inclusive)
• Roadmap to >25 Tbps/chip, 1 pJ/bit
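The headline numbers in this list follow from simple arithmetic, which the sketch below reproduces. It is a back-of-the-envelope illustration, not a product model: the function names are hypothetical, the quoted "<10 ns" and "<5 pJ/bit" figures are used as upper bounds, and the latency model simply adds the chip latency to the 5 ns/m fiber delay stated on the slide.

```python
CHIP_LATENCY_NS = 10.0      # upper bound for AIB -> TeraPHY -> AIB (from the slide)
FIBER_DELAY_NS_PER_M = 5.0  # fiber propagation delay per metre (from the slide)


def chip_bandwidth_tbps(macros: int, gbps_per_macro: float) -> float:
    """Aggregate optical bandwidth of one chip in Tb/s."""
    return macros * gbps_per_macro / 1000.0


def aib_channel_rate_gbps(total_gbps: float, channels: int) -> float:
    """Average data rate carried by each AIB channel."""
    return total_gbps / channels


def link_latency_ns(fiber_length_m: float) -> float:
    """Upper-bound latency estimate: chip latency plus fiber propagation delay."""
    return CHIP_LATENCY_NS + FIBER_DELAY_NS_PER_M * fiber_length_m


def io_power_w(bandwidth_tbps: float, energy_pj_per_bit: float) -> float:
    """Power in watts: (1e12 bit/s) * (1e-12 J/bit) cancels to Tb/s * pJ/bit."""
    return bandwidth_tbps * energy_pj_per_bit


if __name__ == "__main__":
    print(chip_bandwidth_tbps(10, 128), chip_bandwidth_tbps(10, 256))  # Tb/s range
    print(aib_channel_rate_gbps(960, 24))    # Gbps per AIB channel
    print(link_latency_ns(2000) / 1000.0)    # microseconds at the 2 km reach limit
    print(io_power_w(2.56, 5.0))             # watts at the <5 pJ/bit bound
```

At the 2 km reach limit the fiber delay (about 10 µs) dominates the sub-10 ns chip latency by three orders of magnitude, so for long links the latency is set almost entirely by distance.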
Monolithic Integration
[Figure labels: silicon waveguide, microring modulator/detector, vertical grating couplers, MOSFETs (source: IBM)]
TeraPHY™ chip
[Figure: chip layout with mixed-pitch bumps, labeled AIB interface, crossbar, TeraPHY macros, and fiber coupler array]
TeraPHY™ optical macro
8 TxRx slices (wavelength SerDes lanes) per optical macro
[Figure: optical macro block diagram, labeled PLL-U, PLL-D, PI, RX, TX, Clock, and TRX slice]
TeraPHY™ TxRx slice
One wavelength lane per slice (ring resonators)
[Figure: TxRx slice block diagram, labeled RX, TIA, EQ, clock distribution, I/Q generation, ILO, 3 x PI, TX, eye monitor, heater, and driver, with an inset of the optical macro (PLL-U, PLL-D, PI, RX, TX, Clock, TRX slice)]
Industry Adoption of System-in-Package Integration
Examples shown: AMD Radeon R9 Fury X, NVIDIA Tesla T100, Intel® AgileX™, Xilinx Virtex-7 HT
• Mix die function
– GPU, CPU, memory, I/O, etc.
• Diverse processes & nodes
– E.g. 16nm, 10nm, DRAM, etc.
• Manage yield
• For optics to use ecosystem, must be like electronics!
Intel® Embedded Multi-Die Interconnect Bridge (EMIB)
Microbump pitch: 55 µm
Flip-chip pitch: >100 µm
• EMIB packaging technology supports mixed bump pitch on the same die
• Embedded silicon bridge is used for dense die-to-die connectivity
• Organic substrate is used for off-package connections (power, I/O, etc.)
Die-to-die interface: Serial vs Parallel
Metric               Serial        Parallel
Bandwidth density    ~1 Tb/s/mm    ~1 Tb/s/mm
Energy               ~2 pJ/bit     ~0.5 pJ/bit
Design complexity    high          low
Package complexity   low           moderate
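To make the energy difference concrete, the sketch below converts the two energy-per-bit figures above into interface power and die-edge length for a given bandwidth. It is an illustrative calculation only: the class and variable names are hypothetical, the approximate ("~") values are treated as exact, and the 0.96 Tb/s target is borrowed from the 960 Gbps AIB bandwidth quoted earlier in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DieToDieOption:
    name: str
    density_tbps_per_mm: float  # bandwidth per mm of die edge
    energy_pj_per_bit: float

    def edge_length_mm(self, bandwidth_tbps: float) -> float:
        """Die edge needed to carry the requested bandwidth."""
        return bandwidth_tbps / self.density_tbps_per_mm

    def power_w(self, bandwidth_tbps: float) -> float:
        """Interface power: (1e12 bit/s) * (1e-12 J/bit) cancels to Tb/s * pJ/bit."""
        return bandwidth_tbps * self.energy_pj_per_bit


# The two options differ only in energy per bit; density is ~1 Tb/s/mm for both.
higher_energy = DieToDieOption("~2 pJ/bit option", 1.0, 2.0)
lower_energy = DieToDieOption("~0.5 pJ/bit option", 1.0, 0.5)

if __name__ == "__main__":
    target_tbps = 0.96  # assumed target: the 960 Gbps AIB figure
    for option in (higher_energy, lower_energy):
        print(option.name,
              option.edge_length_mm(target_tbps), "mm,",
              option.power_w(target_tbps), "W")
```

With equal bandwidth density, both options need the same die edge, so the choice comes down to a 4x difference in interface power against the design and package complexity listed in the table.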
EMIB Substrate
TeraPHY
Location
EMIB Link between
TeraPHY and FPGA
SoC package assembly
TeraPHY Optical I/O Chiplets Intel® FPGA
Other chiplets
800 Gb/s Demonstration
• Uses 4 macros, each modulating 8λ at 25Gbps/λ
• Microring tuning tolerant to imperfect channel spacings
Transmit output
spectra
8λ Stability Test
• Overnight lock stability >15 hours
• Open eyes on all optical transmit wavelengths
[Plot annotations: initial chip/package heating up; lab temperature cools down overnight; lab warms up in the morning]
Receiver Test
• TeraPHY based optical fabric creates new opportunities to build high bandwidth, low latency optical connectivity straight from the package
• Enables shelf, rack, and row system scale out
Logically connected, physically distributed
GPU GPU GPU GPU CPU CPU CPU CPU
FPGA FPGA FPGA FPGA ASIC ASIC ASIC ASIC
TeraPHY™ Scaling
• 2 Tbps chip: 8 macros, 8 λ/macro, 32 Gbps/λ NRZ
• 4 Tbps chip: 8 macros, 8 λ/macro, 56 Gbps/λ NRZ, or 4 λ/macro, 112 Gbps/λ PAM4
• 8 Tbps chip: 8 macros, 16 λ/macro, 56 Gbps/λ NRZ, or 8 λ/macro, 112 Gbps/λ PAM4
• 16 Tbps chip: 8 macros, 16 λ/macro, 112 Gbps/λ PAM4, or 32 λ/macro, 56 Gbps/λ NRZ
• 32 Tbps chip: 8 macros, 32 λ/macro, 112 Gbps/λ PAM4
Host SoC-tailored electrical interfaces (parallel, serial), organic-substrate or 2.5D
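Each scaling step is the product of macros, wavelengths per macro, and per-wavelength rate. The sketch below computes that product for every configuration listed; it is a simple check under the assumption that all lanes run at the stated line rate, and the names `CONFIGS` and `aggregate_gbps` are introduced here for illustration. The raw products land near, but not exactly on, the nominal chip labels, and the slide does not say how the labels were rounded.

```python
from typing import List, Tuple

# (nominal label, macros, wavelengths per macro, Gbps per wavelength, modulation)
CONFIGS: List[Tuple[str, int, int, int, str]] = [
    ("2 Tbps", 8, 8, 32, "NRZ"),
    ("4 Tbps", 8, 8, 56, "NRZ"),
    ("4 Tbps", 8, 4, 112, "PAM4"),
    ("8 Tbps", 8, 16, 56, "NRZ"),
    ("8 Tbps", 8, 8, 112, "PAM4"),
    ("16 Tbps", 8, 16, 112, "PAM4"),
    ("16 Tbps", 8, 32, 56, "NRZ"),
    ("32 Tbps", 8, 32, 112, "PAM4"),
]


def aggregate_gbps(macros: int, wavelengths_per_macro: int, gbps_per_wavelength: int) -> int:
    """Raw line rate of the whole chip: every wavelength lane on every macro."""
    return macros * wavelengths_per_macro * gbps_per_wavelength


if __name__ == "__main__":
    for label, macros, lanes, rate, modulation in CONFIGS:
        total = aggregate_gbps(macros, lanes, rate)
        print(f"{label:>8}: {macros} x {lanes} x {rate} Gbps {modulation} = {total} Gbps")
```

The paired options on each step give the same raw total, which shows the trade being offered: doubling the wavelength count at 56 Gbps NRZ is equivalent to keeping the count and moving to 112 Gbps PAM4.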
Conclusions
• Chip-to-chip communication requires photonics to overcome the I/O bottleneck
• The emerging chiplet ecosystem offers an opportunity for monolithic in-package optics
• In-package optics fundamentally breaks the traditional bandwidth-distance trade-off and supports new high-performance computer architectures
TeraPHY eval kit available now!
To learn more, contact me: