79
Shadi A. Noghabi Prelim Committee: Prof. Roy Campbell (advisor), Prof. Indy Gupta (advisor), Prof. Klara Nahrstedt, Dr. Victor Bahl 1 Building Interactive Distributed Processing Applications at a Global Scale

Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Shadi A. Noghabi

Prelim Committee: Prof. Roy Campbell (advisor), Prof. Indy Gupta (advisor), Prof. Klara Nahrstedt, Dr. Victor Bahl

1

Building Interactive Distributed Processing Applications at a Global Scale

Page 2: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Many Interactive Applications

2

< few seconds100s of PBs data

< few minutesTrillions of events

< few millisecondsMillions - billions of devices

Page 3: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Low Latency at Large Scale

• Latency-sensitive, reaching low latency is an important for interactivity

100ms latency (Amazon) → 1% loss in revenue $4M per millisec! *

• Large scale processing, at large and global scales

My focus: Building Interactive systems at a production scale

3* https://blog.gigaspaces.com/amazon-found-every-100ms-of-latency-cost-them-1-in-sales/

Page 4: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

4

At Scale, Low Latency is Challenging

Heterogeneity• Data• Machines• Operations

Dynamism• Workload • Environment

Distribution

State & Replication

Imbalance

Page 5: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

5

Thesis Statement

Latency-driven designs can be used to build productioninfrastructures for interactive applications at a global

scale

while addressing myriad challenges on heterogeneity,dynamism, state, and imbalance.

Page 6: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Design

Achieving latency has the highest priority

6

Consistency

Availability

Latency

PACELCLatency

Throughput

Latency

Isolation

1. Abadi, “Consistency Tradeoffs in Modern Distributed Database System Design”, Computer 20122. Brewer, Gilbert, and Lynch “Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services”, Sigact News 2002

1,2

Page 7: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Related WorkLatency-Driven in other frameworks

• Trend of NoSQL systems: Cassandra, Amazon’s DynamoDB, Pileus.

Latency-Driven does not fit all• Consistency driven:

– Consistent Storage: BigTable, Spanner, Yahoo’s PNUTS, Espresso– Exactly-once Stream processing: Flink, Millwheel, Spark Streaming, Trident

• Throughput driven:– Batch processing: Hadoop, Spark, Tez, Pig and Hive– Micro-batch streaming: Spark Streaming, media processing– High-throughput Storage: HDFS and GFS

7

Page 8: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

8

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality

Page 9: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven at All Layers

9

Processing

Networking

DC 1 DC 2

DC n

Storage

Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Edge

Edge

Edge

…Edge

Main production system >4 years & 500M users at LinkedInUsed at ~15 Companies, with 100s of applications at LinkedInIntegrated with production Azure and Windows IoTGeneral container networking applicable to Docker, Kubernetes, etc.

Page 10: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

10

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 11: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

11

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 12: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Samza: Stateful Scalable Stream Processing at LinkedIn

VLDB’17

Shadi A. Noghabi*, Kartik Paramasivam^, Yi Pan^, NavinaRamesh^, Jon Bringhurst^, Indranil Gupta*, Roy Campbell*

* University of Illinois at Urbana-Champaign^ LinkedIn Corp. +

Page 13: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stream Processing*

13

Interactive feeds or ads

Security & Monitoring

Internet of Things

*Stream processing is the processing of data in motion, i.e., computing on data immediately as it is produced

Page 14: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stream Applications need State

14

State: durable data read/written along with the processing

Many cases need large state• Aggregation: aggregating counts over a window• Join: two (or more) streams over a window• Other: machine learning model

Access Stream(IP, access time)

DoS Stream(Attacker IP)

DoSIdentifier

IP Count

1.1.1.1 2

2.3.2.1 30

Page 15: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Large scale of input, in Kafka alone:• 2.1 Trillion msg/Day, 16 Million msg/sec peaks• 0.5 PB in, 2 PB out per day

Many applications:• 100s of production applications• Over 10,000 containers

Large scale of state• Several 100s of TBs for a single application

15

Scale of Processing

Page 16: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Stream Processing

16

• Opensource Apache project

• Powers hundreds of apps in LinkedIn’s production

• In use at LinkedIn, Uber, Metamarkets, Netflix, Intuit, TripAdvisor, VMware, Optimizely, Redfin, etc.

Need to handle state with low latency, at large-scale, and with correct (consistent) results

http://samza.apache.org/

Page 17: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Stream Processing

Processing Model• Input partitioning

• Parallel and independent tasks

• Locality aware data passing

17

Stateful• Local state

• Incremental checkpointing

• 3-Tier caching

• Parallel recovery

• Host Stickiness

• Compaction

Page 18: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Stream Processing

Processing Model• Input partitioning

• Parallel and independent tasks

• Locality aware data passing

18

Stateful• Local state

• Incremental checkpointing

• 3-Tier caching

• Parallel recovery

• Host Stickiness

• Compaction

Page 19: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Processing from a Partitioned Source

19

Stream of events

Ex: page access, ad views, etc.

Page 20: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Processing from a Partitioned Source

20

Stream of events

Ex: page access, ad views, etc.

Partition 1

Partition 2

Partition k

Page 21: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Processing from a Partitioned Source

21

Stream of events

Ex: page access, ad views, etc.

Partition 1

Partition 2

Partition k

Page 22: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Processing from a Partitioned Source

22

Stream of events

Ex: page access, ad views, etc.

Partition 1

Partition 2

Partition k

Task 1

Task 2

Task k

Partitioning & parallelism

Page 23: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Multi-Stage Dataflow

Application logic: Count number of ‘Page accesses’ for each ip domain in a 5 minute window

23

Application DAG in tasksPage Accessin stream

DoS Streamout stream

Windowcount

Filter SendTo

Data Parallelism: each task performs the entire pipeline on its own chunk of input data

Locality

Page 24: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Stream Processing

Processing Model• Input partitioning

• Parallel and independent tasks

• Locality aware data passing

24

Stateful• Local state

• Incremental checkpointing

• 3-Tier caching

• Parallel recovery

• Host Stickiness

• Compaction

Page 25: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Stream Processing

Processing Model• Input partitioning

• Parallel and independent tasks

• Locality aware data passing

25

Stateful• Local state

• Incremental checkpointing

• 3-Tier caching

• Parallel recovery

• Host Stickiness

• Compaction

Page 26: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Processing

Store count of accessper ip domain

But, using a remote store is slow

State:Ip access count

Samza Application

Task 1

Task 2

Task k

26

Page 27: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Processing

State:Ip access count

Samza Application

Task 1

Task 2

Task k

27

Page 28: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Processing

State:Ip access count

Samza Application

Task 1

Task 2

Task k

Local State: use the local storage of tasks• Each task, state corresponding to their partition.

• Ex: task 1, processing input [1-100], state [1-100]

[1-100]

[900-1000]

[100-200]

Locality

28

Page 29: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Processing

Samza Application

Task 1

Task 2

Task k

Local Storage:• in-memory: fast, but low capacity (GBs)• on-disk: high capacity (TBs) and persistent, but slow• Disk-mem: Disk (persistent store) + memory (cache)

• Fast, high capacity, and persistent

[1-100]

[900-1000]

[100-200]

Tiered data access

29

Page 30: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Stateful Processing

Samza Application

Task 1

Task 2

Task k

[1-100]

[900-1000]

[100-200]

30

• What about failures?

• How to not lose state?

Page 31: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Fault-Tolerance

...

partition k partition 2 partition 1

Changeloge.g., Kafka log

compacted topic

Task 1

Task 2

Task k

Changes saved to a durable change log

→ Recovery by replay change-log

Periodically batch & flush changes

Page 32: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Fault-Tolerance

...

partition k partition 2 partition 1

Changeloge.g., Kafka log

compacted topic

Task 1

Task 2

Task k

X

Changes saved to a durable change log

→ Recovery by replay change-log

Periodically batch & flush changes

Page 33: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

...

partition k partition 2 partition 1

Changeloge.g., Kafka log

compacted topic

Fault-Tolerance

Task 1

Task 2

Task k

X = 10

Changes saved to a durable change log

→ Recovery by replay change-log

Periodically batch & flush changes

Page 34: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

...

partition k partition 2 partition 1

Changeloge.g., Kafka log

compacted topic

Fault-Tolerance

Task 1

Task 2

Task k

Changes saved to a durable change log

→ Recovery by replay change-log

Periodically batch & flush changes

X = 10

Background processing

Page 35: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

But, what is the cost?

35

Latency Consistency

Availability

Page 36: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

But, what is the cost?

36

Latency Consistency

Availability

Replayable input +Consistent snapshot

Longer failure recovery

Page 37: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Consistency Guarantees

...

partition k partition 2 partition 1

Changeloge.g., Kafka topic

Task 1

Task 2

Task k

X = 10

Offset 1005

Offsetse.g., Kafka topic

Offset1005

Page 38: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Availability Optimizations

• Availability is compromised. To improve we use:

– Parallel recovery

– Background compaction

– Host stickiness

38

Background processing

Locality

Partitioning & parallelism

Page 39: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Task-1 Task-4 Task-2 Task-3

Durable : Task-Container-Host Mapping

Task 1, Task 4 -> Host-ATask 2 -> Host-BTask 3 -> Host-C

Job

Input Stream

Change-log

Fast Restarts with Host Stickiness

Host Stickiness in YARN : - Try to place task on same host

after restart- Minimize state rebuilding

Overhead

Host-B Host-CHost-A

Page 40: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Related Work• Stateless: do not support state

– Apache Storm [SIGMOD’14], Heron [SIGMOD’15], S4 [ICDM’10]

• Remote Store: relying on external storage– Trident, MillWheel [VLDB’13], Dataflow [VLDB’15]

– High latency overhead per input message

– but, fast recovery (better availability)

• Local state, but, full state checkpointing: – Flink [TCDE’15], Spark Streaming [HotCloud’12], IBM Streams

[VLDB’16], Streamscope [NDSI’16], System S [DEBS’11]

– Does not scale for large application state

– but, easier “repeatable results” at failure

40

Page 41: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Evaluation

41

Page 42: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Evaluation Setup

• Production Cluster– 500 node YARN cluster– real world applications

• Small Cluster– 8 node cluster: 64GB RAM, 24 core CPUs, a 1.6 TB SSD

– micro-benchmarks• Read-only workload ~ join with table• Read-write workload ~ aggregation over time

42

Page 43: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Local State -- Latency

43

Stores:• in-mem• on-disk• disk-mem• remote

Fault-tolerance:• None • changelog (Clog)• remote

changelog adds minimal overhead

on disk w/ caching comparable with in memory

> 2 orders of magnitude slower compared to local

state

Page 44: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Failure Recovery

44

Growing linearly with size of state

Almost constant overhead with host stickiness

Overhead independent of % of failures (parallel

recovery)

StickySticky

Page 45: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Summary of State in Samza

• 100x better latency

• Better resource utilization

• no need for remote DB resources

• Parallel and independent

processing

Pros Cons

• Dependent to input partitioning• Does NOT work when state is

large and not co-partitionable

in input stream

• Auto-scaling becomes harder

• Recovery is slower (vs remote)

Page 46: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Conclusion

46

Processing Model• Input partitioning

• Parallel and independent tasks

• Locality aware data passing

Stateful• Local state• Incremental checkpointing• 3-Tier caching• Parallel recovery• Host Stickiness• Compaction

Background processing

Locality

Partitioning & parallelism

Need to handle state with low latency, at large-scale, and with correct (consistent) results

Page 47: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

47

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 48: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

48

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 49: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Steel: Unified and Optimized Edge-Cloud Environment

HotCloud’18

Shadi A. Noghabi*, John Kolb^, Peter Bodik**, Eduardo Cuervo**, Indranil Gupta*, Roy Campbell*

* University of Illinois at Urbana-Champaign

^ University of California, Berkeley

** Microsoft Research49

Page 50: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Cloud is Not an One-Size-Fits-All

50

Latency

< 80ms< 20ms

< 10 ms

…The Cloud is simply too far!

> 70 ms round trip time

Page 51: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Cloud is Not an One-Size-Fits-All

51

Latency

< 80ms< 20ms

< 10 ms

…Y

Resources

… 10s – 100s GB/s

Bandwidth

BatteryEnergyHeating

Not enough resources to use the

Cloud

Page 52: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Cloud is Not an One-Size-Fits-All

52

Latency

< 80ms< 20ms

< 10 ms

…Y

Resources

… 10s – 100s GB/s

Bandwidth

BatteryEnergyHeating

Privacy & Security

Page 53: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Edge Computing

53

Cloud

Edge 1

Edge 2

Edge 3

Satyanarayanan, M., Bahl, P., et. al. “The Case for VM-based Cloudlets in Mobile Computing”, Pervasive Computing 2009

Page 54: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

However…Industry is at its infancy of building Edge-Cloud

applications

• No proper unification of Edge & Cloud

– Hard to configure, deploy, and monitor

– Manual and redundant optimizations

54

Steel: A unified edge cloud framework with modular and automated optimizations

Integrated with productionAzure services

Page 55: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

55

Steel

Abstraction (Logical Spec)

Compile DeployMonitor &

AnalyzeFabric

Placement Communication …Optimization Modules

Load Balancing

Applications

Edge-Cloud Ecosystem

AB

CD

Page 56: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Optimizer Modules

56

PlacementWhere (edge/cloud) to place?

Adapt to long-term changes

Communicationconfigure edge-cloud links

Adapt to short-term spikes

Background processing

Locality

Opportunistic processing

• Location and vicinity aware• Background profiling & what-if analysis • Background batching & compression• Partitioned and parallel optimizations• Change strategy opportunistically

(available resources)

Partitioning & parallelism

Page 57: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

57

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 58: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

58

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 59: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Ambry: LinkedIn’s Scalable Geo-Distributed Object Store

SIGMOD ‘16

Shadi A. Noghabi*, Sriram Subramanian+, Priyesh Narayanan+, Sivabalan Narayanan+, Gopalakrishna Holla+, Mammad Zadeh+, Tianwei Li+, Indranil Gupta*, Roy H. Campbell*

* University of Illinois at Urbana-ChampaignSRG: http://srg.cs.illinois.edu and DPRG: http://dprg.cs.uiuc.edu

+ LinkedIn Corporation Data Infrastructure team 59

Page 60: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Massive Media Objects are Everywhere

60

100s of Millions of users

100s of TBs to PBs

From all around the world

Page 61: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

61

How to handle all these objects?

We need a geographically distributed system that stores and retrieves objects in an low-latency and

scalable manner

AmbryIn production for >4 years and >500 M users!

Open source: https://github.com/linkedin/ambry

Page 62: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Datacenter 1

Geo-Distribution

62

Count

success

Datacenter 2 Datacenter 3

item_iitem_i

Alice

item_i

Locality

Asynchronous writes• Write synchronously only to local

datacenter

• Asynchronously replicate others

• Reduce user perceived latency

• Minimize cross-DC traffic

Page 63: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Datacenter 2 Datacenter 3

Geo-Distribution

63

Asynchronous writes• Write synchronously only to local

datacenter

• Asynchronously replicate others

• Reduce user perceived latency

• Minimize cross-DC traffic

item_i

Count

success

Datacenter 1

item_i

Background

Replication

Alice

item_i

LocalityBackground

processing

Page 64: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Other Techniques

Ambry is a large production system with many other techniques:

• Logical data partitioning

• Caching, indexing, bloom filters

• Background load balancing

•….

64

Tiered data

access

Partitioning &

parallelism

Load balancing

Page 65: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

65

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 66: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

66

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 67: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

FreeFlow: High performance container networking

HotNets ‘16

Tianlong Yu ^, Shadi A. Noghabi *, Shachar Raindel†, Hongqiang Liu†, Jitu Padhye†, and Vyas Sekar ^

* University of Illinois at Urbana-Champaign† Microsoft Research^ Carnegie Mellon University

67

Page 68: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Containers are popular

https://clusterhq.com/assets/pdfs/state-of-container-usage-june-2016.pdf

Containers provide good portability and isolation

68

Page 69: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Network

FreeFlow

Node 2Node 1

FreeFlow

69

Container 3

Shared memory

Container 1

RDMA

Container 2

Locality

Opportunistic processing

Page 70: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Latency-driven Techniques

70

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality Ambry

[SIGMOD’16]

Samza

[VLDB’17]

Steel

[HotCloud’18]

FreeFlow

[HotNets’17]

Sto

rage

Pro

cess

ing

Net

wo

rkin

g

Page 71: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Future Directions• One global system seamlessly across the globe

– Holistic and cross-layer approach

• Latency sensitivity-aware approaches (not treating all objects/apps equally)– E.g., dynamic use of compression, erasure coding,

compaction in storage for old objects– E.g., multi-tenant stream processing environments

• Auto-pilot systems coping with changes automatically and dynamically

71

Page 72: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Thanks to Many Collaborators

• Advisors: Indranil Gupta and Roy Campbell

• LinkedIn: Sriram Subramanian, Priyesh Narayanan, Kartik Paramasivam, Yi Pan, Navina Ramesh, et al.

• Microsoft Research: Victor Bahl, Peter Bodik, Eduardo Cuervo, Hongqiang Liu, Jitu Padhye, et. al.

72

Page 73: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Publications1. “Steel: Unified Management and Optimization of Edge-Cloud Applications”, SA Noghabi, J Kolb, P Bodik, E Cuervo,

I Gupta, RH Campbell, (in progress)

2. “Steel: Simplified Development and Deployment of Edge-Cloud Applications”, SA Noghabi, J Kolb, P Bodik, E Cuervo, HotCloud 2018

3. “Samza: Stateful Scalable Stream Processing at LinkedIn”, SA Noghabi, K Paramasivam, Y Pan, N Ramesh, J Bringhurst, I Gupta, RH Campbell, VLDB 2017

4. “Ambry: LinkedIn’s Scalable Geo-Distributed Object Store”, SA Noghabi, S Subramanian, P Narayanan, S Narayanan, G Holla, M Zadeh, T Li, I Gupta, RH Campbell, SIGMOD 2016

5. “Freeflow: High performance container networking”, T Yu, SA Noghabi, S Raindel, H Liu, J Padhye, V Sekar, HotNets2016

6. “Steel: Unified Management and Optimization of Edge-Cloud IoT Applications”, NSDI poster 2018

7. “To edge or not to edge?”, F Kalim, SA Noghabi, S Verma, SoCC poster 2017

8. “Building a Scalable Distributed Online Media Processing Environment”, SA Noghabi, PhD workshop at VLDB 2016

9. “Toward Fabric: A Middleware Implementing High-level Description Languages on a Fabric-like Network”, SH Hashemi, SA Noghabi, J Bellessa, RH Campbell, ANCS 2016.

10. “Towards enabling cooperation between scheduler and storage layer to improve job performance”, M Pundir, J Bellessa, SA Noghabi, CL Abad, RH Campbell, PDSW 2015

73

Page 74: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Summary of Thesis

74

Latency-driven designs can be used to build production infrastructures for interactive applications at a global scale

Background processing

Tiered data access Latency

DrivenLoad

balancing

Adaptive Scaling

Partitioning & parallelism

Opportunistic processing

Locality ProcessingSamza [VLDB’17]Steel [HotCloud’18]

StorageAmbry [SIGMOD’16]

NetworkingFreeFlow [HotNets’17]

Page 75: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Back up Slides

75

Page 76: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Theoretical Background

Latency-driven design can be modeled theoretically with Queueing Theory*

System: A hierarchical queuing system

Latency: Response time (response_processed - req_enter_time)

Techniques: queueing theory models/techniques • Parallelism & Partitioning →multiple queues, parallel servers

• Adaptive scaling → adding more servers

• …

76* Gross, Donald. Fundamentals of queueing theory. John Wiley & Sons, 2008.

Page 77: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Other Techniques

Batch + Stream in one System

• Reprocess parts/all stream or Database

– bugs, upgrades, logic changes

→Batch as a finite stream

+ conflict resolution, scaling & throttling

77

Adaptive Scaling

Page 78: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Local State -- Throughput

78

remote state 30-150x worse than local state

on disk w/ caching comparable with in memory

changelog adds minimal overhead

Page 79: Building Interactive Distributed Processing Applications ...dprg.cs.uiuc.edu/docs/shadi_phd_thesis/defense.pdf · PACELC Latency Throughput Latency Isolation 1. Abadi, ^ onsistency

Snapshot vs Changelog

79