
Page 1: Amber: Enabling Precise Full-System Simulation with

MICRO ‘18

Amber: Enabling Precise Full-System Simulation with Detailed Modeling of All SSD Resources

Donghyun Gouk, Miryeong Kwon, Jie Zhang, Sungjoon Koh, Wonil Choi, Nam Sung Kim, Mahmut Kandemir and Myoungsoo Jung

Computer Architecture and MEmory Systems Lab

SimpleSSD 2.0

Page 2: Amber: Enabling Precise Full-System Simulation with

Executive Summary

SSD simulation / Full-system simulation / Co-simulation

[Figure: memory hierarchy from CPU registers and caches (fast, expensive and small capacity) down through DRAM to solid state drives and hard disk drives (slow, cheap and large capacity).]

[Chart: Bandwidth (MB/s), real device vs. simulator.]

Amber offers full-system simulation support with good performance accuracy compared to real devices.

Page 3: Amber: Enabling Precise Full-System Simulation with

Agenda
• Background
  • Why are trace-based storage simulations not good enough?
• Amber overview
  • What do we have to model for the SSD internal components?
  • Storage complex
  • Computation complex
• Storage interface models
  • UFS / SATA / NVM Express / Open-Channel SSD
• Evaluation results
• Demo
• Conclusion

Page 4: Amber: Enabling Precise Full-System Simulation with

Full System Simulation

Models the whole computer system: CPU, caches, GPU, network, and storage.

Full-system simulators (e.g., Simics) run a real OS on top of detailed hardware models.

Page 5: Amber: Enabling Precise Full-System Simulation with

Simulation results:
• Bandwidth: Avg 660.3 MB/s, Min 237.6 MB/s, Max 1.935 GB/s, ...
• Latency: Avg 110.5 ms, Min 1.66 ms, Max 1.207 s, ...
• GC statistics: # GC: 0, # page copy: 0, ...

SSD Simulator

Trace-based Simulation: Block Trace File(s)
MAJ,MIN  CPUID  SEQ#  SEC.NS       PID   ACTION  OFFSET  LENGTH
259,0    4      2     0.000007455  3003  Q RM    76040   + 8
259,0    4      3     0.000008428  3003  G RM    76040   + 8
259,0    4      4     0.000013460  3003  D RM    76040   + 8
259,0    4      5     0.000128406  0     C RM    76040   + 8
...

I/O path (user, kernel, device): VFS / FS, Block I/O Layer, Device Driver, Storage

I/O Trace File:
0.000013460 Read 76040 + 8
...

Convert for SSD simulators

Converted trace format: Tick, Oper., Offset, Length (fed to the trace replayer as reads and writes)

SSD simulator components:
• Host protocol simulation
• DRAM latency model / data buffer model
• Flash Translation Layer (FTL): address translation, GC, wear-leveling, NAND transaction scheduling
• NAND flash array model: internal parallelism, thermal/power
• Statistic collector
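As a concrete illustration of the converted Tick/Oper./Offset/Length format above, here is a minimal sketch of the parsing step a trace replayer might perform; the TraceEntry struct, the nanosecond tick conversion, and the file layout are illustrative assumptions, not SimpleSSD's actual replayer code.

```cpp
#include <cstdint>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// One converted trace record: <timestamp> <Read|Write> <offset> + <length>.
struct TraceEntry {
  uint64_t tick;    // simulation tick (here: nanoseconds)
  bool isWrite;     // operation type
  uint64_t offset;  // starting sector
  uint64_t length;  // number of sectors
};

// Parse lines such as "0.000013460 Read 76040 + 8" into TraceEntry records.
std::vector<TraceEntry> loadTrace(const std::string &path) {
  std::vector<TraceEntry> entries;
  std::ifstream in(path);
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream ss(line);
    double seconds = 0.0;
    std::string op, plus;
    TraceEntry e{};
    if (!(ss >> seconds >> op >> e.offset >> plus >> e.length)) continue;
    e.tick = static_cast<uint64_t>(seconds * 1e9);  // seconds -> ns ticks
    e.isWrite = (op == "Write");
    entries.push_back(e);
  }
  return entries;
}

int main(int argc, char **argv) {
  if (argc < 2) return 1;
  for (const auto &e : loadTrace(argv[1]))
    std::cout << e.tick << (e.isWrite ? " W " : " R ")
              << e.offset << " +" << e.length << '\n';
  return 0;
}
```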

Page 6: Amber: Enabling Precise Full-System Simulation with

Trace-based Simulation

Real system: the host runs the user-level program, kernel, and hardware (CPU, cache and DRAM); the SSD provides the host interface, embedded CPU & DRAM, SSD firmware, and NAND flash array.

Trace-based simulation: the host side is replaced by an I/O trace file and static values; only the SSD (host interface, embedded CPU & DRAM, SSD firmware, NAND flash array) is modeled.

Trace-based simulations cannot capture the performance effects of the bidirectional communication between the host and the SSD; the performance is simulated from an isolated storage model.

Page 7: Amber: Enabling Precise Full-System Simulation with

Trace-based simulation cannot reflect:
1. Host hardware changes (e.g., CPU @ 2 GHz vs. 4 GHz, DDR3-1600 vs. DDR4-2400)
2. Host software changes (e.g., Linux 3.x vs. Linux 4.x)
3. Protocol differences: NVMe protocol (up to 64K queues with 64K entries per queue) vs. SATA protocol & PHY (NCQ with 32 entries)
4. Active vs. passive storage

[Figure: with active storage, the host runs the app, block I/O layer, and device driver, while the host interface, data buffer, and FTL live on the device; with passive storage, all of these layers, including the host interface, data buffer, and FTL, run on the host.]

Page 8: Amber: Enabling Precise Full-System Simulation with

Accuracy – Bandwidth

Panels: Sequential read, Random read, Sequential write, Random write (bs=4K, QD=1~32)

[Figure: Bandwidth (MB/s) vs. I/O depth for the Intel 750 and trace-based simulators A, B, C, and D in each panel; annotated gaps of 34.8%, 28.7%, 64.9%, and 59.3%.]

Average error rate: A: 128%, B: 65%, C: 74%, D: 42%
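The slides do not state how the error rate is computed; a common definition, assumed in this sketch, is the relative error |simulated - measured| / measured averaged over the swept I/O depths. The bandwidth samples below are made-up placeholders, not the paper's data.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Mean relative error (in percent) between measured and simulated samples,
// assuming error = |sim - real| / real averaged over all I/O depths.
double meanRelativeErrorPct(const std::vector<double> &real,
                            const std::vector<double> &sim) {
  size_t n = std::min(real.size(), sim.size());
  double sum = 0.0;
  for (size_t i = 0; i < n; ++i)
    sum += std::fabs(sim[i] - real[i]) / real[i];
  return 100.0 * sum / n;
}

int main() {
  // Hypothetical bandwidth samples (MB/s) at increasing queue depths.
  std::vector<double> intel750 = {450, 800, 1100, 1300};
  std::vector<double> simulatorA = {900, 1700, 2400, 3000};
  std::printf("average error rate: %.0f%%\n",
              meanRelativeErrorPct(intel750, simulatorA));
  return 0;
}
```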

Page 9: Amber: Enabling Precise Full-System Simulation with

Accuracy – Latency

Panels: Sequential read, Random read, Sequential write, Random write (bs=4K, QD=1~32)

[Figure: Latency (us) vs. I/O depth for the Intel 750 and trace-based simulators A, B, C, and D in each panel; annotated gaps of 37.4%, 50.2%, 140.3%, and 179.5%.]

Average error rate: A: 4560%, B: 507%, C: 4090%, D: 602%

Page 10: Amber: Enabling Precise Full-System Simulation with

Amber*

* Amber is the project name of SimpleSSD 2.0

• Offers full-system simulation by tightly integrating storage and computing resources over diverse interface protocols (e.g., SATA, UFS, NVMe and Open-Channel SSD)

• For storage-specific studies, it also supports a traditional trace-based simulation model

• Provides good accuracy: 4~28% (bandwidth) and 6~36% (latency) error rates against diverse real SSDs

[Figure: Amber full-system architecture. Host side: cores, MCH, PCH, SATA/PCIe PHY, DRAM, and functional (AtomicSimple) or timing (InOrder, O3, SMT) CPU models. SSD side: a device controller with an embedded core behind the SATA/PCIe PHY, a host controller (SATA/UFS) with ICH, HW queue and pointer list, a memory-mapped region with SW queues and data, and the storage complex where HIL, ICL, FTL, and FIL handle LBA-to-PPN translation, caching (flush/page replacement), the internal DRAM, mapping table, GC & WL, I/O queues, internal requests, transactions, and internal parallelism.]

Powered by gem5, a popular full-system simulator; the SSD interface modules are added by Amber.

Page 11: Amber: Enabling Precise Full-System Simulation with

SSD Internals

[Figure: the computation complex contains cores with private L1 caches, a shared L2 cache, DRAM with a DRAM controller, and the host interface over PCIe/SATA; the storage complex contains the flash interface and channels 0..N of NAND packages organized into ways (Way 0, Way 1).]

Computation Complex
• Includes multiple CPUs, caches and DRAM modules
• Runs the diverse components of the SSD firmware

Storage Complex
• Composed of multiple flash packages, channels and flash controllers, including flash interface management

NAND Flash Package
• Multiple components for massive parallelism
• Multiple planes, dies, control logic, etc.
• Some components cannot operate concurrently

[Figure: a NAND package contains control logic, I/O, and multiple dies; each die holds multiple planes, each plane holds blocks, and each block holds pages (Page 0 .. Page N). The embedded cores and caches sit alongside the NAND flash array.]

Modeling all the timing details of the storage complex is required for high accuracy, but unfortunately it can lead to slow simulation speed in full-system simulation!
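To make the package hierarchy concrete, the sketch below describes the storage-complex geometry as a plain struct and derives capacity and the number of independently operable units; the field names and numbers are illustrative placeholders, not Amber's configuration format or any real product's layout.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative geometry of the storage complex; all numbers are placeholders.
struct NandGeometry {
  uint32_t channels = 8;            // independent buses from the flash controller
  uint32_t packagesPerChannel = 4;  // "ways" sharing one channel
  uint32_t diesPerPackage = 2;
  uint32_t planesPerDie = 2;        // planes can serve multi-plane operations in parallel
  uint32_t blocksPerPlane = 1024;
  uint32_t pagesPerBlock = 256;
  uint32_t pageSizeBytes = 16384;
};

int main() {
  NandGeometry g;
  uint64_t pages = uint64_t(g.channels) * g.packagesPerChannel * g.diesPerPackage *
                   g.planesPerDie * g.blocksPerPlane * g.pagesPerBlock;
  // Only channel/package/die/plane-level units add parallelism; pages within a
  // block on the same plane must be accessed one at a time.
  uint64_t parallelUnits = uint64_t(g.channels) * g.packagesPerChannel *
                           g.diesPerPackage * g.planesPerDie;
  std::printf("capacity: %.1f GiB, parallel units: %llu\n",
              pages * double(g.pageSizeBytes) / (1ull << 30),
              (unsigned long long)parallelUnits);
  return 0;
}
```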

Page 12: Amber: Enabling Precise Full-System Simulation with

Storage Complex Abstraction
PAL: Parallelism Abstraction Layer
Goal: capture all the latency caused by activations, conflicts, and idleness of each component, while simplifying unnecessary operation timings/details

(Figure: PAL maps a physical address to a latency.)

• Simplified NAND command protocol

• Conflict modeling

Scenarios: consecutive I/Os on the same channel but different dies vs. consecutive I/Os on the same channel and the same die

[Timing diagrams: the simplified NAND command protocol models each access as pre-DMA, memory operation (mem_op), and post-DMA phases on the channel and die, instead of the full CMD / OpCode / Addr / tADL / Data / CMD sequence. For consecutive I/Os (IO #1, IO #2) on channel 0: when they target different dies, the mem_ops overlap and only the DMA phases conflict on the channel; when they target the same die, the memory operations conflict as well.]

Memory operation latency depends on:
• Idleness of the plane
• Idleness of the block
• Page offset (LSB? CSB? MSB?)
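The sketch below illustrates the conflict idea with a toy three-phase schedule for a batch of I/Os issued at the same time; the latency constants and the batch scheduling are illustrative simplifications, not Amber's PAL implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative latencies in nanoseconds (placeholders, not calibrated values).
constexpr uint64_t kDmaNs = 20000;    // pre-DMA / post-DMA occupy the channel
constexpr uint64_t kMemOpNs = 60000;  // mem_op occupies the die

struct IO { int die; uint64_t preEnd = 0, memEnd = 0, done = 0; };

// Schedule a batch of I/Os issued at time 0 that share one channel.
// The channel is held only during the DMA phases and the die only during
// mem_op, so I/Os to different dies overlap their mem_ops while I/Os to the
// same die serialize them.
void scheduleBatch(std::vector<IO> &ios, int dieCount) {
  uint64_t channelFree = 0;
  std::vector<uint64_t> dieFree(dieCount, 0);
  for (auto &io : ios) {                       // pre-DMAs, in issue order
    io.preEnd = channelFree + kDmaNs;
    channelFree = io.preEnd;
  }
  for (auto &io : ios) {                       // mem_ops, per-die conflicts
    io.memEnd = std::max(io.preEnd, dieFree[io.die]) + kMemOpNs;
    dieFree[io.die] = io.memEnd;
  }
  for (auto &io : ios) {                       // post-DMAs, back on the channel
    io.done = std::max(io.memEnd, channelFree) + kDmaNs;
    channelFree = io.done;
  }
}

int main() {
  std::vector<IO> differentDies = {{0}, {1}};  // same channel, different dies
  std::vector<IO> sameDie = {{0}, {0}};        // same channel, same die
  scheduleBatch(differentDies, 2);
  scheduleBatch(sameDie, 2);
  std::printf("different dies: both done at %llu ns\n",
              (unsigned long long)differentDies[1].done);
  std::printf("same die:       both done at %llu ns\n",
              (unsigned long long)sameDie[1].done);
  return 0;
}
```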

Page 13: Amber: Enabling Precise Full-System Simulation with

Computation Complex Modeling

[Figure: the same Amber architecture as before, with the SSD's computation complex highlighted: the embedded core is simulated with an ARMv8/aarch64 core model, the internal DRAM with the DRAM model from gem5, and the flash timing with the PAL beneath the FTL/FIL.]

HIL: Host Interface Layer
• Handles the communication and controls the datapath between the host and the SSD
• Can be implemented in either hardware or software

ICL: Internal Cache Layer
• Buffers data in a way that respects the adopted storage interface
• Supports different associativities and cache schemes

FTL: Flash Translation Layer
• Address translation (between logical and physical addresses)
• Supports various mapping algorithms (page-level, etc.)
• Reliability management such as garbage collection and wear-leveling
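As a minimal illustration of the translation step, here is a sketch of a flat page-level mapping table; the class name and out-of-place-update behavior are illustrative, and Amber's FTL additionally handles GC, wear-leveling, and NAND transaction scheduling.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>

constexpr uint64_t kInvalidPpn = ~0ull;

// Minimal page-level FTL mapping: one table entry per logical page.
class PageMapFtl {
 public:
  // Write allocates the next free physical page and remaps the LPN
  // (real FTLs pick pages according to GC and wear-leveling policies).
  uint64_t write(uint64_t lpn) {
    uint64_t ppn = nextFreePpn_++;
    table_[lpn] = ppn;
    return ppn;
  }
  // Read returns the mapped PPN, or kInvalidPpn if the page was never written.
  uint64_t read(uint64_t lpn) const {
    auto it = table_.find(lpn);
    return it == table_.end() ? kInvalidPpn : it->second;
  }
 private:
  std::unordered_map<uint64_t, uint64_t> table_;  // LPN -> PPN
  uint64_t nextFreePpn_ = 0;
};

int main() {
  PageMapFtl ftl;
  ftl.write(42);                 // first write of LPN 42
  uint64_t oldPpn = ftl.read(42);
  ftl.write(42);                 // overwrite: out-of-place update remaps LPN 42
  std::printf("LPN 42: old PPN %llu, new PPN %llu\n",
              (unsigned long long)oldPpn, (unsigned long long)ftl.read(42));
  return 0;
}
```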

Page 14: Amber: Enabling Precise Full-System Simulation with

SSD Interface Modeling

H-Type (hardware driven, e.g., eMMC, UFS) vs. S-Type (software driven)

[Figure: H-Type: the device driver places commands in a queue in host DRAM; a dedicated host controller DMAs the data through its internal buffer and PHY to the device controller and internal DRAM in front of the storage complex, introducing a redundant data copy. S-Type: the device driver does everything; data moves from host DRAM through the MCH and the PCIe root port/end point directly to the device controller and internal DRAM via DMA, with doorbell (DB) writes and MSI interrupts for signaling.]

Page 15: Amber: Enabling Precise Full-System Simulation with

SSD Interface Modeling: H-Type (UFS: Universal Flash Storage, SATA)

[Figure: in system memory, the command list (SATA: Command List / UFS: UTP Transfer Request List) holds 32 entries (Entry 0 .. Entry 31). Each entry (SATA: Command Header / UFS: UTP Transfer Request Descriptor, UTRD) carries a command offset and a PRDT offset/length; the PRDT entries point to data buffers, and the command itself (SATA: Command FIS / UFS: UFS Protocol Information Unit, UPIU) carries the OPCODE, SLBA, and NLB. MMIO controller registers expose the doorbell and queue offset. The host controller (SATA: Advanced Host Controller Interface, AHCI / UFS: UFS Host Controller Interface, UFSHCI) uses its internal memory, PHY, and interface to transfer the FIS/UPIU to the SSD.]

Page 16: Amber: Enabling Precise Full-System Simulation with

SSD Interface Modeling: S-Type (NVMe, Open-Channel SSD)

[Figure: in system memory, the NVMe submission queue holds up to 64K entries (Entry 0, Entry 1, ...). Each submission queue entry carries the OPCODE, SLBA, NLB, and PRP 1 / PRP 2; the PRP list points to the data buffers. MMIO controller registers expose the doorbell and queue offset. On the SSD (S-Type) side, the device controller with embedded cores uses its internal memory and interface to drive the storage complex.]
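For illustration, the sketch below models a submission-queue entry and its tail doorbell with only the fields named on the slide; the struct layout, queue depth, and doorbell handling are simplified assumptions, not the full 64-byte NVMe command format or Amber's implementation.

```cpp
#include <cstdint>
#include <cstdio>

// Only the fields named on the slide; the real NVMe command is 64 bytes.
struct SubmissionQueueEntry {
  uint8_t opcode;   // e.g., read or write
  uint64_t prp1;    // physical address of the first data page
  uint64_t prp2;    // second page, or pointer to a PRP list
  uint64_t slba;    // starting logical block address
  uint16_t nlb;     // number of logical blocks (zero-based in NVMe)
};

// Host-side view of one submission queue with its tail doorbell register.
struct SubmissionQueue {
  static constexpr uint16_t kDepth = 64;   // illustrative queue depth
  SubmissionQueueEntry entries[kDepth];
  uint16_t tail = 0;
  volatile uint16_t *doorbell = nullptr;   // memory-mapped tail doorbell

  void submit(const SubmissionQueueEntry &e) {
    entries[tail] = e;
    tail = static_cast<uint16_t>((tail + 1) % kDepth);
    *doorbell = tail;                      // MMIO write: controller fetches the entry
  }
};

int main() {
  uint16_t fakeDoorbell = 0;               // stand-in for the MMIO register
  SubmissionQueue sq;
  sq.doorbell = &fakeDoorbell;
  // Read 8 blocks starting at LBA 76040 (nlb is zero-based, so 7 means 8 blocks).
  SubmissionQueueEntry read{};
  read.opcode = 0x02;
  read.prp1 = 0x1000;
  read.slba = 76040;
  read.nlb = 7;
  sq.submit(read);
  std::printf("tail doorbell = %u\n", static_cast<unsigned>(fakeDoorbell));
  return 0;
}
```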

Page 17: Amber: Enabling Precise Full-System Simulation with

SSD Interface Modeling

[1] "Open-Channel SSD Spec. Revision 2.0," http://lightnvm.io/docs/OCSSD-2_0-20180129.pdf[2] M. Bjøring, et. al, "LightNVM: The Linux Open-Channel SSD Subsystem," FAST 17

NVMe vs. Open-Channel SSD [1]

[Figure: NVMe stack: application (user level); virtual file system, file system, block I/O layer, NVMe driver (kernel level); host interface, flash translation layer, NAND flash array (device level); file I/O becomes block I/O and then NAND I/O. Open-Channel SSD stack: the pblk and LightNVM drivers [2] sit in the kernel between the block I/O layer and the NVMe driver, so the flash translation layer runs on the host; the device level keeps the host interface and NAND flash array.]

Defines or overrides NVMe commands for low-level NAND I/O operations.

Page 18: Amber: Enabling Precise Full-System Simulation with

Power/Energy Modeling

• Embedded Core: McPAT from HP Labs [1]

• DRAM: DRAMPower from gem5 [2]

• NAND Flash Array

[1] S. Li et al., "McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures," MICRO 42
[2] K. Chandrasekar et al., "DRAMPower: Open-source DRAM Power & Energy Estimation Tool," http://www.drampower.info

[Figure: the core model (ARMv8/aarch64) feeds functional instruction counts and latencies to McPAT; the DRAM model (from gem5) feeds commands, latencies, and access counts to DRAMPower; the NAND flash array power model covers the cache registers, page/block accesses, and the I/O bus.]

• Cache registers: current and voltage of SRAM-like six-transistor cells at a specified clock speed, applied over the pre-DMA time of each read and the post-DMA time of each write
• Page/Block access: current and voltage of the flash cell array (page) or block for READ, PROGRAM, and ERASE, applied over the memory operation times measured during simulation
• I/O bus: current and voltage of the NVDDR bus protocol at a configurable clock speed, applied over the active time of the flash bus (between the AMBA channel and the internal registers)
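As a toy illustration of how such per-operation currents, voltages, and times might be combined into energy, consider the sketch below; the electrical parameters and operation counts are placeholders, not Amber's calibrated NAND model.

```cpp
#include <cstdio>

// Placeholder electrical parameters (illustrative, not calibrated values).
struct OpPower { double currentMa; double voltageV; };
constexpr OpPower kRead    = {25.0, 3.3};
constexpr OpPower kProgram = {35.0, 3.3};
constexpr OpPower kErase   = {30.0, 3.3};

// Energy (in microjoules) of one operation given its measured latency.
double energyUj(const OpPower &p, double latencyUs) {
  // E = I * V * t ; mA * V * us = nJ, so divide by 1000 for uJ.
  return p.currentMa * p.voltageV * latencyUs / 1000.0;
}

int main() {
  // Counts and latencies as they might be gathered by the statistic collector.
  const unsigned long reads = 120000, programs = 40000, erases = 500;
  const double tReadUs = 45.0, tProgUs = 660.0, tEraseUs = 3500.0;

  double totalUj = reads * energyUj(kRead, tReadUs) +
                   programs * energyUj(kProgram, tProgUs) +
                   erases * energyUj(kErase, tEraseUs);
  std::printf("NAND array energy: %.1f mJ\n", totalUj / 1000.0);
  return 0;
}
```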

Page 19: Amber: Enabling Precise Full-System Simulation with

Evaluation – Setup

User-level programs:
1) Microbenchmarks: Flexible I/O tester [1]
2) Workloads: reconstructed MSPS [2], replayed with FIO [1]

[1] J. Axboe, "Flexible I/O tester," https://github.com/axboe/fio
[2] M. Kwon et al., "TraceTracker: Hardware/Software Co-Evaluation for Large-Scale I/O Workload Reconstruction," IISWC '17

Unmodified Linux kernel:
• 4.9.92 for performance validation
• 4.14.42 for Open-Channel SSD evaluations

gem5 configuration:
• ISA: ARMv8
• CPU: 4 cores @ 4.4GHz
• L1I: 32KB, 2-way, private
• L1D: 64KB, 2-way, private
• L2: 2MB, 8-way, shared
• DRAM: DDR4-2400, 4GB

Amber configurations:
1) Intel 750 400GB  2) Z-SSD 800GB  3) 983 DCT 1.92TB  4) 850 PRO 256GB

Note that all performance results and validation are obtained by running actual user-level applications on the Linux-enabled system emulation.

Page 20: Amber: Enabling Precise Full-System Simulation with

Evaluation – Validation
Panels: Sequential read, Random read, Sequential write, Random write (bs=4K, QD=1~32)

[Figure: Bandwidth (MB/s) vs. I/O depth, comparing Amber against the four real devices in each panel. Reported accuracies per device, in figure order: Intel 750 72%/93%/88%/81%, 850 PRO 91%/86%/86%/78%, Z-SSD 88%/83%/80%/96%, 983 DCT 72%/94%/91%/88%.]

<Device name> (<Accuracy in %>)

Page 21: Amber: Enabling Precise Full-System Simulation with

Evaluation – Validation
Panels: Sequential read, Random read, Sequential write, Random write (bs=4K, QD=1~32)

[Figure: Latency (us) vs. I/O depth, comparing Amber against the four real devices in each panel; per-device accuracies range from 64% to 96%.]

<Device name> (<Accuracy in %>)

Page 22: Amber: Enabling Precise Full-System Simulation with

Evaluation – Open-Channel SSD
Kernel: 4.14 with Open-Channel SSD 1.2, using the pblk driver (host-side FTL) to expose the Open-Channel SSD as a block device.
SSD: modified Intel 750 (4 channels, 1 package/channel)

How much is the kernel CPU involved in handling I/O requests over the Open-Channel SSD?

[Figures: Bandwidth (MB/s) vs. block size (4 and 64 KiB) for sequential/random reads and writes, NVMe vs. OCSSD; the annotated differences show OCSSD being 11, 11, 26, and 20 MB/s faster in some cases and 29, 29, 11, and 21 MB/s slower in others. Kernel CPU utilization (%) over time: less than 10% with NVMe, over 50% with OCSSD.]

Data buffering and FTL operations consume host CPU cycles (small data buffer, small I/O unit for low-level NAND access). Phases in the utilization trace: FIO init., pblk init., I/O phase.

Page 23: Amber: Enabling Precise Full-System Simulation with

DEMO

Page 24: Amber: Enabling Precise Full-System Simulation with
Page 25: Amber: Enabling Precise Full-System Simulation with

Download

Website of SimpleSSD 2.0
• Provides step-by-step execution instructions and tutorials
• Offers detailed explanations of each simulation model
• Will be updated in November 2018

Source code
• SimpleSSD: the SSD models, including the computation and storage complexes
• SimpleSSD-FullSystem: a fork of gem5 employing the Amber-specific host interfaces
• SimpleSSD-Standalone: for trace-based simulation studies

https://github.com/SimpleSSD

https://simplessd.org/

Page 26: Amber: Enabling Precise Full-System Simulation with

Conclusion
• We introduced Amber, the project name of SimpleSSD 2.0

• Amber models all the necessary SSD internals, both hardware and software

• gem5-integrated Amber supports SATA, UFS, NVMe, and Open-Channel SSD models, which can be used across computing domains ranging from embedded systems to personal computers to servers

• Amber also
  • allows system designers to modify the kernel stack and entire operating systems with diverse sets of SSD devices
  • simulates emerging systems such as Open-Channel SSDs, which are still in development

• The paper contains more simulation-based studies and validations related to power, the computing model, memory usage, etc.

Page 27: Amber: Enabling Precise Full-System Simulation with

Q&A