Public
Auto-scaling Techniques for Elastic Data Stream Processing
Thomas Heinze, Valerio Pappalardo, Zbigniew Jerzak, Christof Fetzer
March 2014
Outline
1. Introduction
2. Auto-scaling Techniques for Elastic Data Stream Processing
3. Evaluation
4. Conclusion and Future Work
Utilization within Cloud Environments
Twitter's cluster[1] has an average CPU utilization below 20%, yet ~80% of the resources are reserved
The Google cluster trace[2] shows an average CPU utilization of 25-35% and memory utilization of 40%
The average utilization within public clouds is estimated to be between 6% and 12%[3]
[1] Benjamin Hindman, et al. "Mesos: A platform for fine-grained resource sharing in the data center." NSDI, 2011.
[2] Charles Reiss, et al. "Heterogeneity and dynamicity of clouds at scale: Google trace analysis." SOCC, ACM, 2012.
[3] Arunchandar Vasan, et al. "Worth their watts? An empirical study of datacenter servers." HPCA, 2010.
Elasticity
Users need to reserve the required resources upfront
Limited understanding of the performance of the system
Limited knowledge of the characteristics of the workload
[Figure: workload (load over time) with static vs. elastic provisioning, illustrating underprovisioning and overprovisioning]
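The gap between static and elastic provisioning can be illustrated with a toy calculation; the load values and the capacity per host below are made-up numbers, not from the slides.

```python
# Toy comparison of static vs. elastic provisioning for a varying load.
# Load values and per-host capacity are illustrative assumptions.
import math

load = [2, 5, 9, 4, 1]            # required capacity per time step
capacity_per_host = 3

# Static provisioning must cover the peak load at all times.
static_hosts = math.ceil(max(load) / capacity_per_host)
static_cost = static_hosts * len(load)                       # host-hours

# Elastic provisioning adapts the host count to the current load.
elastic_hosts = [math.ceil(l / capacity_per_host) for l in load]
elastic_cost = sum(elastic_hosts)                            # host-hours

# Here elastic provisioning uses 9 host-hours vs. 15 for static.
```

Static provisioning pays for the peak even during low load (overprovisioning); a fixed allocation below the peak would instead underprovision.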
Elastic Data Stream Processing
Long-standing continuous queries over potentially infinite data streams
Small memory footprint (MB – GB) for most use cases, allowing fast scale out
Strict requirements on end-to-end latency
Unpredictable workload with high variability (within seconds to minutes)
Load balancing influences running queries
[Figure: input streams flowing into a data stream processing engine, producing output streams]
Auto-scaling Techniques[1]
Different algorithmic approaches for various domains and use cases.
1) Threshold-based approaches
2) Time series analysis
3) Reinforcement learning
4) Queuing theory
5) Control theory
[1] T. Lorido-Botran et al., "Auto-scaling techniques for elastic applications in cloud environments," Tech. Rep., 2012.
Requirements
Workload independence: Independent of workload characteristics; we make no assumptions about the input workload.
Adaptivity: Adapts online to changing conditions such as different workload characteristics.
Configurability: Easy to set up and configure by an end user.
Computational feasibility: The algorithm has to be computationally feasible for scale out within seconds.
Feasible: Threshold-based approaches, Reinforcement Learning, Control Theory.
Not feasible: Time Series Analysis, Queuing Models.
Threshold-based Approach
Upper threshold:
"If the CPU utilization of a host is larger than x for y seconds, the host is marked as overloaded."
Lower threshold:
"If the CPU utilization of a host is smaller than z for w seconds, the host is marked as underloaded."
Additional parameters: target utilization, grace period
Two variants: local thresholds vs. global thresholds
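The two threshold rules above can be sketched as a small classifier; the parameter names and default values are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of the threshold-based approach described above.
# Thresholds (upper/lower) and window lengths (y_secs/w_secs) are
# illustrative defaults, not values from the slides.

def classify_host(cpu_history, upper=0.8, lower=0.3, y_secs=30, w_secs=30):
    """Classify a host from its per-second CPU utilization samples (newest last)."""
    # Upper threshold: utilization above `upper` for the last y_secs seconds
    if len(cpu_history) >= y_secs and all(u > upper for u in cpu_history[-y_secs:]):
        return "overloaded"   # candidate for scale out
    # Lower threshold: utilization below `lower` for the last w_secs seconds
    if len(cpu_history) >= w_secs and all(u < lower for u in cpu_history[-w_secs:]):
        return "underloaded"  # candidate for scale in
    return "normal"
```

With local thresholds each host is classified on its own utilization as shown here; global thresholds would instead apply the rule to an aggregate over all hosts.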
Reinforcement Learning[1]
A lookup table describes the best action for each state
The best action is adapted by an online learning algorithm
Extension for Elastic Data Stream Processing:
Local policy
Initialisation based on threshold-based policy
Grace Period
[1] R. Das et al., "Model-based and model-free approaches to autonomic resource allocation," IBM Research Report, 2005.
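The lookup-table idea above can be sketched as follows; the state discretization, action set, and update rule are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of a lookup-table reinforcement-learning policy: one entry per
# discretized utilization state, adapted online from observed rewards.
import random

class RLScaler:
    ACTIONS = ("scale_out", "scale_in", "no_op")

    def __init__(self, n_states=10, alpha=0.1, epsilon=0.1):
        # q[state][action]: estimated value of taking `action` in `state`
        self.q = {s: {a: 0.0 for a in self.ACTIONS} for s in range(n_states)}
        self.alpha, self.epsilon = alpha, epsilon
        self.n_states = n_states

    def state(self, utilization):
        # Discretize average utilization in [0, 1] into n_states buckets
        return min(int(utilization * self.n_states), self.n_states - 1)

    def choose(self, utilization):
        s = self.state(utilization)
        if random.random() < self.epsilon:           # explore occasionally
            return random.choice(self.ACTIONS)
        return max(self.q[s], key=self.q[s].get)     # exploit best known action

    def learn(self, utilization, action, reward):
        s = self.state(utilization)
        # Move the estimate towards the observed reward (online update)
        self.q[s][action] += self.alpha * (reward - self.q[s][action])
```

The extensions listed above fit this sketch naturally: the table could be seeded from a threshold-based policy instead of zeros, kept per host for a local policy, and updates suppressed during a grace period.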
Elastic Data Stream Processing
[Figure: system architecture — queries and input streams enter a distributed data stream processing engine; operator placement, coordination, and the auto-scaling technique manage elastic CEP engine instances that produce the output streams]

Heinze, Thomas, et al. "Elastic Complex Event Processing under Varying Query Load." BD3, VLDB, 2013.
Integration of Auto-Scaling Techniques

[Figure: control loop — the measured host utilization feeds into the auto-scaling technique, which is parameterized by max./min. host utilization and target utilization and produces a scaling decision]
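The integration described above — measured utilization in, scaling decision out — can be sketched as a simple control loop; the function names and decision strings are assumptions for illustration, not the system's actual API.

```python
# Illustrative control loop for integrating an auto-scaling technique.
# `measure_utilization`, `technique`, and `apply_decision` are assumed
# callables, not the actual interfaces of the system.

def autoscaling_loop(measure_utilization, technique, apply_decision, steps):
    """Feed measured host utilization into the technique; act on its decision."""
    for _ in range(steps):
        utilization = measure_utilization()   # measured host utilization
        decision = technique(utilization)     # "scale_out" | "scale_in" | "no_op"
        if decision != "no_op":
            apply_decision(decision)          # e.g. add/remove a host
```

Any of the techniques — threshold-based, reinforcement learning, or control theory — can be plugged in as `technique` without changing the loop.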
Setup
Based on data taken from Frankfurt Stock Exchange
35 aggregation queries using varying data selection and time ranges
up to 10 processing nodes
Experiment duration: ~1 hour
[Figure: event count (events/s) over time for Day 1 and Day 2 of the workload, ranging from 0 to ~4,500 events/s]
Configurability
Local thresholds are more robust, while global thresholds are highly sensitive.

[Figure: latency (s) and utilization for threshold configurations 0.2-0.8, 0.3-0.8, and 0.3-0.9, comparing global thresholds (left) and local thresholds (right); series: median latency, avg/max/min utilization]
Configurability
Reinforcement learning achieves the best utilization and latency values.

[Figure: latency (s) and utilization for threshold configurations 0.2-0.8, 0.3-0.8, and 0.3-0.9, comparing reinforcement learning (left) and local thresholds (right); series: median latency, avg/max/min utilization]
Different Workloads
Reinforcement learning reduces variability between different workloads.

[Figure: latency (s) and utilization for Day 1, Day 2, and Day 3, comparing reinforcement learning (left) and local thresholds (right); series: median latency, avg/max/min utilization]
Lessons Learned
Elasticity for data stream processing systems poses new challenges for auto-scaling techniques
Global thresholds create many overload situations
Adaptive reinforcement learning is more stable than local thresholds, improves utilization by 5%, and reduces latency by 30%
Need to better handle variations within a workload
Investigate additional parameters like latency and queue length
Study the influence of the placement algorithm