Helio X20:
The First Tri-Gear Mobile SoC
with CorePilot™ 3.0 Technology
August 2016
Tsung-Yao Lin, Ming-Hsien Lee, Loda Chou, Clavin Peng, Jih-Ming Hsu, Jia-Ming Chen, John-CC Chen, Alex Chiou, Artis Chiu, David Lee, Carrie Huang, Kenny Lee, TzuHeng Wang, Wei-Ting Wang, Yenchi Lee, Chi-Hui Wang, Pao-Ching Tseng, Ryan Chen, Kevin Jou
Agenda
Tri-Gear Concept
Challenges
Key Technologies
• Tailored CPU cores for gears
• Enhanced coherent interconnect
• Hybrid scheduler
• Holistic gear allocation
• Adaptive thermal management
Achievements
Summary
User Behavior Changed
Scenarios | Example Application | Task Load | Time Spent % Per Day (2013) | Time Spent % Per Day (2014) | Time Spent % Per Day (2015) | Changes (2014 to 2015)
Web Browsing | Chrome Browser | Heavy ~ Medium | 20% | 14% | 10% | -4%
Gaming | Temple Run 2 | Heavy ~ Light | 32% | 32% | 15% | -17%
Social Messaging | Facebook | Medium | 24% | 28% | 31% | +3%
Entertainment, Utilities, and others | YouTube, | Medium ~ Light | 24% | 26% | 44% | +18%
Source: Flurry Analytics
• Social messaging, entertainment, and utilities (with medium to light
loads) take up to 75% of user time
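The 75% figure follows from the 2015 column: Social Messaging (31%) plus Entertainment, Utilities, and others (44%). A quick check in Python, using only the numbers listed above (the variable names are illustrative):

```python
# Time spent per day (%) by scenario, as listed in the Flurry Analytics table.
time_spent = {
    "Web Browsing": {2013: 20, 2014: 14, 2015: 10},
    "Gaming": {2013: 32, 2014: 32, 2015: 15},
    "Social Messaging": {2013: 24, 2014: 28, 2015: 31},
    "Entertainment, Utilities, and others": {2013: 24, 2014: 26, 2015: 44},
}

# Scenarios whose task load is listed as Medium or Medium ~ Light.
medium_to_light = ["Social Messaging", "Entertainment, Utilities, and others"]

share_2015 = sum(time_spent[name][2015] for name in medium_to_light)
changes = {name: years[2015] - years[2014] for name, years in time_spent.items()}

print(share_2015)  # 75
print(changes)
```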
Task Load Distribution of Scenarios
[Figure: Energy Consumption of Scenarios. Stacked bars (0% to 100%) for Web Browsing, Gaming, Social Messaging, and Entertainment, Utilities & Others, each split into Heavy Load, Medium Load, Light Load, and Idle. Data labels recovered from the chart: 12%, 28%, 17%, 42%, 38%, 48%, 47%, 36%, 33%, 13%.]
• Medium load tasks are important across all scenarios (36% ~ 48%)
• Heavy load tasks are still important for specific scenarios
The Dual-Gear Dilemma
[Figure: task-load axis from Light Tasks through Medium Tasks to Heavy Tasks. LITTLE serves light tasks (always-on, connected); big serves heavy tasks (game, multimedia); medium tasks (sustainable usage) sit between LITTLE and big, where a Mid gear is added.]
Execute medium load tasks on
• big: wasted energy
• LITTLE: cannot meet performance requirement
• Mid: balance between performance and power
Introduction to Tri-Gear
1 New Mid gear introduced
2 Min gear goes for even lower power, Max gear aims for higher performance
3 Reduced power consumption across entire performance range
[Figure: power vs. performance (0% to 100%) curves for the Min, Mid, and Max gears, covering the Low Power, Sustainable Performance, and High Performance regions]
Challenges of Tri-Gear
Previous Dual-Gear
• SW: Scheduler, Power Management, Thermal Management
• HW: big, LITTLE, Coherent Interconnect
• Right task to right CPU: light task on LITTLE, heavy task on big
• Balance power and performance
• Minimize power consumption
• Maximize thermal performance
• Prevent overheating
Evolving to Tri-Gear
• Tailored processors
• Enhanced coherent interconnect
• Revised scheduler
• Improved gear management
• Improved thermal sensing, power budgeting
[Figure: SW and HW layers exchanging Control and Info.]
Tailored CPU Cores for Three Gears
• Max: 2 × A72 at 2.5GHz
• Mid: 4 × A53 at 2.0GHz
• Min: 4 × A53 at 1.4GHz
Min, Max gears extend power/performance ranges
Mid gear for efficient performance
• +30% power-efficiency (Mid vs. Max)
− Multi-bit flip-flop optimization
− Careful use of high-leakage LVT cells
• +40% performance (Mid vs. Min)
− LIB and MEM optimizations
[Figure: energy consumption (0.5X to 2.5X) vs. single-thread performance (0X to 3X) for the Min, Mid, and Max gears]
* Energy and Performance scale relative to the highest point of Min curve
Enhanced Coherent Interconnect
• Enhanced from 2 ACE ports to 3 ACE ports
• Increased logic means extra power
• ~50% power reduction by sub-module Fine-Grain Clock Gating (FGCG)
[Figure: Dual-Gear Coherent Interconnect (LITTLE and big, 2 ACE ports, Memory) vs. Tri-Gear Coherent Interconnect (Min, Mid, and Max, 3 ACE ports, Memory)]
[Figure: Coherent Interconnect Power Comparison, showing -50% power in the common usage range]
* Power is relative to 2-gear at 1GB/s
Hybrid Scheduler
HMP Dual-Gear scheduler
• Limited to Dual-Gear
• Boot CPU is always on and cannot be migrated (Fixed CPU0)
− Typically in LITTLE, so LITTLE cannot be off
Dual-level HMP scheduler for Tri-Gear?
• Might not be optimal
• Fixed CPU0 limits power saving opportunities
HMP (Heterogeneous Multi-Processing)
SMP (Symmetric Multi-Processing)
[Figure: Dual-Gear scheduler (LITTLE C0 to C3, big C0 to C1; SMP within each cluster, HMP across clusters, fixed CPU0) next to a candidate Tri-Gear scheduler (Min C0 to C3, Mid C0 to C3, Max C0 to C1; SMP within each cluster, "HMP?" across clusters, fixed CPU0), with gear power-off states]
Intelligent Core Activation Technology (ICAT)
ICAT assigns CPU0 dynamically
• Min gear can be powered off by migrating its tasks
• 8%~10% CPU power saved for medium load
[Figure: Fixed CPU0: Min is always online to host CPU0 (the boot CPU). Dynamic CPU0 with ICAT: Min can be offline (Power-Off) while Mid (C0 to C3) and Max (C0 to C1) run.]
[Figure: Power/Tj curve. CPU power (0.5X to 2.5X) vs. Tj (45 to 85 °C) for 1 thread and 2 threads, each with and without ICAT]
* Power is relative to 1 thread with ICAT at 65°C
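The effect of a dynamic CPU0 on which gears must stay powered can be illustrated with a toy model. This is only a sketch: the function name, its parameters, and the idle-case rule are assumptions for illustration, not MediaTek's implementation.

```python
def gears_kept_online(task_gears, boot_gear="Min", dynamic_cpu0=False):
    """Return the set of gears that must stay powered (toy model).

    task_gears: gears that currently have tasks to run.
    boot_gear: gear hosting the boot CPU (CPU0) under a fixed assignment.
    dynamic_cpu0: if True, the CPU0 role may move to any active gear.
    """
    online = set(task_gears)
    if not dynamic_cpu0:
        # Fixed CPU0: the boot CPU cannot migrate, so its gear stays on.
        online.add(boot_gear)
    elif not online:
        # Assumption: when nothing is running, some gear still hosts CPU0.
        online.add(boot_gear)
    return online


print(sorted(gears_kept_online(["Mid"])))                     # ['Mid', 'Min']
print(sorted(gears_kept_online(["Mid"], dynamic_cpu0=True)))  # ['Mid']
```

With a medium load running only on Mid, the fixed assignment keeps Min online as well, while the dynamic assignment lets Min be powered off.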
Asymmetric Multi-Processing (AMP) with ICAT
AMP: enhanced HMP with dynamic gear operation for power saving
• Packing tasks to Mid for sustainable performance
• Packing tasks to Min for low power
[Figure: Min (C0 to C3) and Mid (C0 to C3) clusters under HMP and under AMP, with task migration with ICAT]
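The packing idea behind AMP can be sketched as choosing the lowest gear that can host every runnable task, so the other gear can be switched off. The sketch below is a simplification under stated assumptions: one task per core, loads expressed in units of one Min core, and a Mid core taken as 1.4 times a Min core (from the +40% Mid vs. Min figure). The function and constant names are hypothetical.

```python
MIN_CORE_CAPACITY = 1.0   # reference unit: one Min core
MID_CORE_CAPACITY = 1.4   # assumption derived from "+40% performance, Mid vs. Min"
CORES_PER_GEAR = 4        # Min and Mid each have four cores


def pack_tasks(task_loads):
    """Pick one gear to host all tasks, or None if neither Min nor Mid fits.

    task_loads: per-task load in Min-core units; one task per core is assumed.
    """
    if len(task_loads) > CORES_PER_GEAR:
        return None
    heaviest = max(task_loads, default=0.0)
    if heaviest <= MIN_CORE_CAPACITY:
        return "Min"   # pack to Min for low power
    if heaviest <= MID_CORE_CAPACITY:
        return "Mid"   # pack to Mid for sustainable performance
    return None        # left to the rest of the scheduler (for example Max)


print(pack_tasks([0.3, 0.5]))  # Min
print(pack_tasks([1.2, 0.4]))  # Mid
```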
Hybrid Scheduler
Hybrid = SMP + AMP + HMP
Tri-Gear scheduler
• SMP within each gear: Min (C0 to C3), Mid (C0 to C3), Max (C0 to C1)
• AMP (Asymmetric Multi-Processing) across Min and Mid
• HMP across Min, Mid, and Max
HMP for high performance
• Instant boost technology
− Quick response to utilize Max for urgent or heavy tasks
• Inter-gear task migration
− Dynamic threshold control for energy efficiency and responsiveness
− Thread-group migration strategy to increase cluster (L2 cache) locality
[Figure: power vs. performance (0% to 100%). AMP over Min and Mid covers Low Power and Sustainable Performance; HMP over Min, Mid, and Max covers High Performance.]
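Threshold-based inter-gear migration with an instant boost path might look like the sketch below. The threshold values, the normalized load input, and the function name are assumptions chosen for illustration; the slides do not give the actual policy.

```python
GEARS = ["Min", "Mid", "Max"]


def next_gear(current, load, urgent=False, up_threshold=0.8, down_threshold=0.3):
    """Choose the gear for a task (toy model).

    load: utilization of the task on its current gear, from 0.0 to 1.0.
    urgent: instant boost, go straight to Max.
    up_threshold / down_threshold: migration thresholds; passing different
    values at run time stands in for "dynamic threshold control".
    """
    if urgent:
        return "Max"
    index = GEARS.index(current)
    if load > up_threshold and index < len(GEARS) - 1:
        return GEARS[index + 1]   # migrate up for responsiveness
    if load < down_threshold and index > 0:
        return GEARS[index - 1]   # migrate down for energy efficiency
    return current                # stay put between the two thresholds


print(next_gear("Min", 0.9))               # Mid
print(next_gear("Mid", 0.5))               # Mid
print(next_gear("Min", 0.2, urgent=True))  # Max
```

Keeping a gap between the two thresholds gives hysteresis, so a task whose load hovers around one value does not bounce between gears.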
Previous Power Management
• Dynamic Voltage & Frequency Scaling (DVFS)
and Hot-Plug drivers consider inputs separately:
• Power budget, performance requests, and
system status such as load, Thread Level
Parallelism (TLP)
• Big gear on/off controlled by Hot-Plug driver
Centralized Gear Allocation
• A holistic control to handle increased complexity
• Tracking steady states to avoid unnecessary
gear migration overhead
• Linking to user-specified performance, normal,
power-saving modes
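One way to track steady states is to switch gears only after the same request has been seen for several consecutive samples. The class below is a minimal sketch of that idea; the class name, the sample count, and the string gear labels are hypothetical and not taken from the slides.

```python
class GearAllocator:
    """Switch gear only after a request has been stable for N samples."""

    def __init__(self, initial, stable_samples=3):
        self.current = initial
        self.stable_samples = stable_samples
        self._candidate = None
        self._count = 0

    def update(self, requested):
        """Feed one sample of the requested gear; return the gear in use."""
        if requested == self.current:
            # Request matches the current gear: drop any pending candidate.
            self._candidate, self._count = None, 0
            return self.current
        if requested == self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = requested, 1
        if self._count >= self.stable_samples:
            # The request held steady long enough: migrate once.
            self.current = requested
            self._candidate, self._count = None, 0
        return self.current


allocator = GearAllocator("Min")
history = [allocator.update(g) for g in ["Mid", "Min", "Mid", "Mid", "Mid"]]
print(history)  # ['Min', 'Min', 'Min', 'Min', 'Mid']
```

The brief "Mid" request at the start is ignored, which avoids a migration that would have been undone one sample later.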
[Figure: Previous Power Management: power budget requests (thermal, battery...), performance requests (heavy task, scenario...), and status go to CPU DVFS and CPU Hot-Plug separately. Enhanced Power Management: the same inputs go to Centralized Gear Allocation, which controls CPU DVFS and CPU Hot-Plug.]
[Figure: power (0X to 2X) vs. 2-thread performance (0X to 2X). Dual-Gear curves: 2Max, 1Max+1Min, 2Min. Tri-Gear curves: 2Max, 1Max+1Mid, 2Mid, 1Max+1Min, 2Min, 1Mid+1Min.]
Power budgeting by both core limit
and frequency limit for all CPUs
Dual-Gear to Tri-Gear
• More possible solutions from core /
frequency combination meeting power
target
• 1.5X ~ 3X more possible solutions on core
combination alone, depending on TLP
* Power and performance are relative to the highest
point of Max curve
* Each point in a curve represents a choice of gear /
core / freq
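The growth in core combinations can be reproduced with a simple count of how many ways a given number of threads can be spread over the gears. The sketch assumes 2 Max + 4 Min cores for Dual-Gear and 2 Max + 4 Mid + 4 Min for Tri-Gear, and treats cores within a gear as interchangeable; the slides do not state how combinations were counted, so this is one plausible reading.

```python
from itertools import product


def core_combinations(cores_per_gear, threads):
    """Count ways to place `threads` on gears, one thread per core.

    cores_per_gear: mapping of gear name to number of cores in that gear.
    Cores inside a gear are interchangeable, so only per-gear counts matter.
    """
    ranges = [range(count + 1) for count in cores_per_gear.values()]
    return sum(1 for combo in product(*ranges) if sum(combo) == threads)


dual_gear = {"Max": 2, "Min": 4}            # assumed Dual-Gear layout
tri_gear = {"Max": 2, "Mid": 4, "Min": 4}   # Tri-Gear layout

for tlp in (1, 2, 3):
    dual = core_combinations(dual_gear, tlp)
    tri = core_combinations(tri_gear, tlp)
    print(tlp, dual, tri, tri / dual)
```

For a TLP of 1, 2, and 3 this count gives ratios of 1.5, 2.0, and 3.0, in line with the quoted range; for 2 threads it yields the same 3 and 6 combinations shown in the Dual-Gear and Tri-Gear curves.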
Adaptive Thermal Management (ATM)
Previous power allocation
• Simple cost function: power efficiency only
• Large search space: chosen solution might
not meet actual system requirement
Precise power allocation
• Comprehensive cost function: power
efficiency, system requirement (#core,
frequency and power), system overhead
• +10% Performance from considering
system requirement
• -5°C max Tj from reducing system
overhead: hot-plug vs. DVFS latency
* Power and performance are relative to the
highest point of Max curve
* Geekbench v3 Multi-core Performance
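A cost function that combines power efficiency, system requirement, and system overhead could be sketched as follows. The field names, the overhead weight, and the scoring formula are all assumptions made for illustration; the slides name the inputs but not the formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    cores: int          # number of online cores
    freq_ghz: float     # operating frequency
    power: float        # relative power, must be > 0
    performance: float  # relative performance
    hotplug_ops: int    # cores to switch on or off to reach this state


def choose(candidates, power_budget, min_cores, min_freq_ghz, overhead_weight=0.1):
    """Pick the best candidate that meets the budget and the requirement."""
    feasible = [
        c for c in candidates
        if c.power <= power_budget          # power budget
        and c.cores >= min_cores            # system requirement: #core
        and c.freq_ghz >= min_freq_ghz      # system requirement: frequency
    ]
    if not feasible:
        return None
    # Score = power efficiency minus a penalty for hot-plug overhead.
    return max(
        feasible,
        key=lambda c: c.performance / c.power - overhead_weight * c.hotplug_ops,
    )


options = [
    Candidate(cores=4, freq_ghz=1.4, power=1.0, performance=2.0, hotplug_ops=2),
    Candidate(cores=2, freq_ghz=2.0, power=1.0, performance=1.9, hotplug_ops=0),
    Candidate(cores=2, freq_ghz=2.5, power=3.0, performance=3.0, hotplug_ops=0),
]
best = choose(options, power_budget=2.0, min_cores=2, min_freq_ghz=1.0)
print(best)
```

Here the second option wins: the first is slightly more efficient but pays for two hot-plug operations, and the third exceeds the power budget. Filtering by requirement first is also what shrinks the search space.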
[Figure: power (0X to 3X) vs. multi-thread performance (0X to 5X) for 1 heavy + 3 light tasks, with the power budget and frequency limits marked. Previous Power Allocation: large search space. Precise Power Allocation: reduced search space.]
ATM for More Combinations
Energy Saving from Tri-Gear CPU Architecture
• Up to 38% CPU energy saved, measured for scenarios used daily
[Figure: energy consumption (0% to 100%), Dual-Gear (LITTLE, big) vs. Tri-Gear (Min, Mid, Max). Scenarios shown: VideoRecord+EIS (Utilities), Web Rollover (Web Browsing), Burst Photo (Utilities), Facebook (Social Messaging), Heavy Loading Game (Gaming). Energy saving from Dual-Gear to Tri-Gear, labels as they appear: -35%, -38%, -38%, -21%, -12%.]
CorePilot™ Technology Evolvement
• MT6592: SMP (Symmetric Multi-Processing)
− Octa-core with SMP
• MT6595, CorePilot™ 1.0: HMP (Heterogeneous Multi-Processing)
− big.LITTLE HMP
− Global Task Scheduling
• Helio P10, CorePilot™ 2.0: HC (Heterogeneous Computing)
− CPU+GPU Computing
− Dynamic Gear Migration for low power
• Helio X20, CorePilot™ 3.0: Tri-Gear HMP (Hybrid Tri-Gear Multi-Processing)
− Tri-Gear CPU Architecture
− 12% ~ 38% CPU energy saving
[Figure: cluster layouts per generation: LITTLE clusters (C0 to C3) for SMP; big (C0 to C3) and LITTLE (C0 to C3) for HMP; CPU clusters plus GPU for HC; Min (C0 to C3), Mid (C0 to C3), and Max (C0 to C1) plus GPU for Tri-Gear HMP]
Summary
• Majority of tasks are medium and light loads
− Added Mid gear and enhanced Min gear
• CorePilot™ 3.0 Key Technologies
− Tailored CPU cores for gears
− Enhanced coherent interconnect
− Hybrid scheduler
− Holistic gear allocation
− Adaptive thermal management
• Benefit of Tri-Gear
− Up to 38% CPU energy saving for typical scenarios used daily over extended performance range
[Figure: power vs. performance (0% to 100%) curves, ending with the Min, Mid, and Max gears]
Copyright © MediaTek Inc. All rights reserved.