In-Memory Computing
Did You Know?

Today, a CPU core can cycle three billion times in one second.

In about one second, light travels to the moon …

… but during one CPU cycle, light travels only about 10 cm.
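The claim above is easy to check with back-of-the-envelope arithmetic; the 3 GHz clock and the average Earth-Moon distance used below are the only inputs:

```python
# Back-of-the-envelope check: how far does light travel in one CPU cycle?
SPEED_OF_LIGHT = 299_792_458       # metres per second
CLOCK_HZ = 3e9                     # "three billion times in one second"

cycle_time = 1 / CLOCK_HZ                        # ~0.33 ns per cycle
light_per_cycle = SPEED_OF_LIGHT * cycle_time    # metres per cycle

MOON_DISTANCE = 384_400_000                      # average Earth-Moon distance, m
time_to_moon = MOON_DISTANCE / SPEED_OF_LIGHT    # seconds

print(f"light per cycle: {light_per_cycle * 100:.1f} cm")   # ~10 cm
print(f"light to the moon: {time_to_moon:.2f} s")           # ~1.28 s
```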

Did You Know?

A motherboard with eight 16-core CPUs will soon be available …

That is 128× the computing power of a single core …

… or over 400 billion CPU cycles per second on a single server blade.
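The arithmetic behind that figure, assuming roughly 3.2 GHz per core (the clock rate is an assumption; at a flat 3 GHz the total is about 384 billion):

```python
# Aggregate cycles per second of an 8-socket board with 16-core CPUs.
sockets = 8
cores_per_socket = 16
clock_hz = 3.2e9                 # assumed per-core clock (hypothetical)

total_cores = sockets * cores_per_socket      # 128 cores
cycles_per_second = total_cores * clock_hz    # ~4.1e11 cycles/s

print(f"{total_cores} cores -> {cycles_per_second / 1e9:.0f} billion cycles/s")
```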

But …

… most of that computing power will be wasted …

… waiting for data.

[Diagram: CPU – RAM – Flash – Disk – NIC. 2010 - 2022: 128× increase in transistors per chip]

Moore’s Law will continue for at least 10 years:

Transistors per area will double roughly every 2 years

128× increase in roughly 12 years

2022: 512 Gbit per DRAM chip, 8 Tbit per Flash chip

Frequency gains are difficult:

Power scales roughly quadratically with clock frequency; Pollack’s rule adds that performance grows only with the square root of added complexity

Parallelism with more cores is a must
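A quick check of the growth numbers above; note that a strict two-year doubling yields only 64× in 12 years, so the quoted 128× implies a doubling period just under two years:

```python
import math

growth = 128          # the slide's 2010-2022 increase in transistors per chip
years = 12

doublings = math.log2(growth)        # 7 doublings for 128x
implied_period = years / doublings   # ~1.7 years per doubling

strict_two_year_growth = 2 ** (years / 2)   # 64x if doubling exactly every 2 years

print(f"{doublings:.0f} doublings, one every {implied_period:.2f} years")
print(f"strict 2-year doubling gives only {strict_two_year_growth:.0f}x")
```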


2014: 64 cores, 2016: 128 cores, 2022: 1024 cores

Memory/IO bandwidth needs to grow with processing power

Disks cannot follow!

2010 → 2022:

Cores per chip: 16 → 1024

Memory bandwidth: 40 GB/s → 2.5 TB/s

IO bandwidth: 2 GB/s → 250 GB/s

• No big change: single-core clock rate (will stay < 5 GHz)

• But impressive overall computing power: ~5000 core × GHz


Challenging! But needed to feed the cores!

DISK

Disks are Tape

Forget hard disks!

Disks cannot go faster

Disks cannot follow bandwidth requirements

Random-read scanning of 1 TB of disk space today takes 15 – 150 days (!)

To reach 1 TB/s you would need 10,000 disks in parallel

Disks can only serve as archives from now on (sequential access)
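The 15 – 150 day range is consistent with random 4 KB reads at typical hard-disk seek rates; the 20 – 200 IOPS range and the 100 MB/s sequential rate below are assumptions for this sketch:

```python
# Random-read scan of 1 TB in 4 KB pages at hard-disk seek rates.
TB = 10**12
PAGE = 4096
SECONDS_PER_DAY = 86_400

pages = TB // PAGE                 # ~244 million random reads

def days_at(iops):
    """Days needed to issue `pages` random reads at the given IOPS."""
    return pages / iops / SECONDS_PER_DAY

fast, slow = days_at(200), days_at(20)     # ~14 and ~141 days

# And the parallelism needed for 1 TB/s of sequential bandwidth,
# assuming ~100 MB/s per spinning disk:
disks_for_1tb_per_s = TB / (100 * 10**6)   # 10,000 disks

print(f"random scan: {fast:.0f} - {slow:.0f} days")
print(f"disks for 1 TB/s: {disks_for_1tb_per_s:.0f}")
```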

DRAM, Flash and PCM will replace the “spinning rust”


No big change: Latency


NICs move to PCI Express

May move onto the CPU chip

10 – 100 Gbit/s already today

Latency in cluster: ~1 µs possible (InfiniBand / optical Ethernet)

LAN/WAN latency: 0.1 – 100 ms

Latency and Bandwidth

Throughput: ×2 per year

Access time: falls by 50% per year

Flash moves from SATA to PCI Express

Two determining factors, which won’t change:

RAM – CPU latency: ~0.1 µs

NIC latency via LAN or WAN: 0.1 – 100 ms
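Those two constants differ by three to six orders of magnitude, which is why locality dominates system design; a minimal sketch of the ratio:

```python
# Gap between network round-trip latency and a RAM access.
RAM_LATENCY = 0.1e-6     # seconds (~0.1 us RAM-CPU)
LAN_LATENCY = 0.1e-3     # seconds (best-case LAN/cluster)
WAN_LATENCY = 100e-3     # seconds (worst-case WAN)

lan_ratio = LAN_LATENCY / RAM_LATENCY   # ~1,000x
wan_ratio = WAN_LATENCY / RAM_LATENCY   # ~1,000,000x

print(f"LAN round trip costs {lan_ratio:,.0f} RAM accesses")
print(f"WAN round trip costs {wan_ratio:,.0f} RAM accesses")
```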


Did You Know?

A CPU accesses Level 1 cache memory in 1 – 2 cycles.

It accesses Level 2 cache memory in 6 – 20 cycles.

It accesses RAM in 100 – 400 cycles.

It accesses Flash memory in 5,000 cycles.

It accesses disk storage in 1,000,000 cycles.
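At the assumed 3 GHz clock from the opening slide, those cycle counts translate into wall-clock time roughly as follows:

```python
# Convert the access-latency cycle counts above into wall-clock time at 3 GHz.
CLOCK_HZ = 3e9

def cycles_to_time(cycles):
    """Format a cycle count as nanoseconds or microseconds at CLOCK_HZ."""
    ns = cycles / CLOCK_HZ * 1e9
    return f"{ns:,.1f} ns" if ns < 1000 else f"{ns / 1000:,.1f} us"

for name, cycles in [("L1 cache", 2), ("L2 cache", 20), ("RAM", 400),
                     ("Flash", 5_000), ("Disk", 1_000_000)]:
    print(f"{name:8s} {cycles_to_time(cycles)}")
```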

Translate cycles to miles and assume you were a CPU core …

… then Level 1 cache would be in the building …

… Level 2 cache would be at the edge of this city …

… RAM would be in a different state …

… Flash memory would be in a different country …

… and disk storage would be the planet Mars.


Software Implications

Roundtrip latency (in CPU cycles):

RAM: ~500 cycles

Flash: ~5,000 cycles

Disk (archive): ~1,000,000 cycles

NIC: 1,000 – 500,000,000 cycles


Latency and locality are the determining factors. What could that mean?


Systems may just get smaller!

More users for transaction processing on a single machine - isn’t that great?

Already today most customers could run the ERP load of a company on a single blade

Commodity hardware becomes sufficient for ERP

No threat! (… or maybe becoming a commodity is a threat?)

Why Bother?

… or?

Think in opportunities …