SACBD/ECSA - September 9th, 2019

Measuring Performance Quality Scenarios in Big Data Analytics Applications: A DevOps and Domain-Specific Model Approach

Cristian Camilo Castellanos¹, Carlos A. Varela², Dario Correal¹

¹ Department of Systems Engineering, Universidad de los Andes, Bogotá, Colombia
² Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA


Page 1:

Measuring Performance Quality Scenarios in Big Data Analytics Applications: A DevOps and Domain-Specific Model Approach


Cristian Camilo Castellanos¹, Carlos A. Varela², Dario Correal¹

¹ Department of Systems Engineering, Universidad de los Andes, Bogotá, Colombia
² Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA

Page 2:

● Problem: the Deployment Gap
● Proposal: ACCORDANT, a Software Architecture Model with DevOps
● Solution Domain: detect Near Mid-Air Collisions (NMAC)
● Experimentation

Page 3:


Deployment Gap Phenomenon

● “Despite the increasing interest in BDA adoption, actual deployments are still scarce” [1]

● “50% of companies do not have a specific data science production procedure.” [2]

● Delayed deployment of ready-to-use models (months: 31%, or years: 30%) [3]

● Incompatibility across multiple tools and communication problems. [4]

● It is not yet clear how to define and monitor different QoS in BDA applications [5]

[1] Chen, Kazman & Matthes (2015). Demystifying Big Data Adoption: Beyond IT Fashion and Relative Advantage

[2] Dataiku. (2017). Building Production-Ready Predictive Analytics.

[3] Rexer, K., Gearan, P., & Allen, H. (2016). 2015 Data Science Survey.

[4] Rexer, K., Gearan, P., & Allen, H. (2016). 2015 Data Science Survey.

[5] Rajiv Ranjan. (2014). Streaming Big Data Processing in Datacenter Clouds.

And what if I need multiple iterations and configurations?

Page 4:


Big Data Analytics (BDA) development

Business (Functional Requirements and Quality Scenarios):
● Real-time NMAC (Near Mid-Air Collisions) service
● Response time ≤ 3 s

Data Science/Analytics (Data and Model):
● Filtering and cleaning
● Modeling and evaluation
● Decision Tree model

IT Architecture:
● Latency < 3 s
● Kafka, Python, Spark
● Cloud vs Fog computing
● Lambda Architecture

Deployment Gap ("months: 31%, or years: 30%") → Monitoring

Page 5:

Challenge

Page 6:

How can we reduce the big data analytics deployment gap by specifying and measuring quality scenarios and by speeding up their deployment and performance monitoring?


Page 7:

Proposal

Page 8:


ACCORDANT

An exeCutable arChitectural mOdel foR big Data ANalyTics

1. Strategy (DSM and DevOps)
2. Proposal Process


Page 9:

1- Proposal Strategy

ACCORDANT: A Domain-Specific Model and DevOps Approach


Page 10:

IT Architecture

● Latency < 3s

● Kafka, Python, Spark

● Cloud vs Fog computing

Lambda Architecture

Functional Viewpoint Model

Deployment Viewpoint Model

A Domain Specific Model


Automatic Code Generation:
● Software Components
● Infrastructure as Code

Domain Specific Language (DSL)
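The model-to-text step above can be pictured as a template-based generator. A minimal sketch in Python's standard library follows; it is an illustration, not the actual ACCORDANT generator, and the function name and manifest fields are assumptions:

```python
from string import Template

# Hypothetical model-to-text transformation: render an Infrastructure-as-Code
# artifact from an element of the deployment model. The template fields are
# illustrative, not the real ACCORDANT metamodel.
DEPLOYMENT_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: $replicas
  template:
    spec:
      containers:
      - name: $name
        image: $image
""")

def generate_iac(component: dict) -> str:
    """Fill the IaC template with values taken from a deployment-model element."""
    return DEPLOYMENT_TEMPLATE.substitute(
        name=component["name"],
        image=component["image"],
        replicas=component["replicas"],
    )

manifest = generate_iac({"name": "spark-worker-ex",
                         "image": "ramhiser/spark:2.0.1",
                         "replicas": 3})
print(manifest)
```

Real model-driven generators (e.g., Acceleo or Xtend over an XMI model) work the same way at a larger scale: a template per target artifact, filled from model elements.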

Page 11:

ACCORDANT

Deployment Viewpoint


Functional Viewpoint


Page 12:

ACCORDANT

Deployment Viewpoint

Page 13:


ACCORDANT DSL Example


Page 14:


ACCORDANT

An exeCutable arChitectural mOdel foR big Data ANalyTics

1. Strategy (DSM and DevOps)
2. Proposal Process


Page 15:

2- Proposal Process


Page 16:


BDA Deployment Process

Phases: Requirements → Development → Deployment → Operation

1- Quality Scenarios: the business user defines the QS.
2- Models and Transformations: the data scientist designs the analytics model and imports it as PMML into the ACCORDANT MM.
3- Software Architecture: the QS guide the architecture, designed by the SW architect.
4- Integration
5- Code Generation
6- Code Execution: the deployed BDA solution is monitored.

Page 17:


Process Overview

Deployment Gap

● Specify performance QS integrated with the software architecture.
● Speed up BDA deployment and monitoring.

Page 18:

Experimentation

Page 19:


Avionics BDA deployment

Business (FAA, private pilots), with Functional Requirements and Quality Scenarios:
● Real-time NMAC (Near Mid-Air Collisions) service
● Response time ≤ 3 s

Avionics Data Scientist (Data: ADS-B):
● Filtering and cleaning
● NMAC detection model (Decision Tree)

IT Architecture:
● Latency < 3 s
● Kafka, Python, Spark
● Cloud vs Fog computing
● Lambda Architecture

Deployment Gap ("months: 31%, or years: 30%") → Monitoring

Page 20:


Experimentation in Avionics

● Feasibility using avionics use cases
  ○ UC1: Near Mid-Air Collision Analysis for route planning.
  ○ UC2: Near Mid-Air Collision Detection in operation.
● Deployment Effort
  ○ Time
  ○ Lines of Code (Complexity)

https://wcl.cs.rpi.edu/
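For intuition, the core check behind both use cases can be sketched as a pairwise proximity test. The 500 ft horizontal and 100 ft vertical separation thresholds below follow a common NMAC definition and are assumptions for illustration, not values taken from these use cases:

```python
import math

# Illustrative NMAC (Near Mid-Air Collision) candidate test between two
# aircraft states. Positions and altitudes are in feet; the thresholds are
# assumptions based on a common NMAC definition.
H_THRESHOLD_FT = 500.0
V_THRESHOLD_FT = 100.0

def is_nmac(x1, y1, alt1, x2, y2, alt2):
    """Flag a pair of aircraft positions as an NMAC candidate."""
    horizontal = math.hypot(x2 - x1, y2 - y1)   # horizontal separation
    vertical = abs(alt2 - alt1)                 # vertical separation
    return horizontal < H_THRESHOLD_FT and vertical < V_THRESHOLD_FT

print(is_nmac(0, 0, 10_000, 300, 200, 10_050))  # close pair -> True
print(is_nmac(0, 0, 10_000, 5_000, 0, 10_050))  # well separated -> False
```

UC1 runs such checks over recorded trajectories (batch, deadline-bound); UC2 runs them over live ADS-B streams (latency-bound).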


Page 21:


Business: Data Collection for 2, 20, and 200 nmi around JFK

● 2 nmi: 13,328 comparisons
● 20 nmi: 656,177 comparisons
● 200 nmi: 18,899,217 comparisons

ADS-B Exchange


Automatic dependent surveillance – broadcast
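The growth of these totals reflects the all-pairs nature of the check: n tracked aircraft require n*(n-1)/2 comparisons per snapshot (assuming no spatial indexing). A minimal sketch; the aircraft counts below are made up for illustration, while the totals on this slide come from real ADS-B data:

```python
# All-pairs comparison count: each of n aircraft is checked against every
# other, giving n*(n-1)/2 pairs. Aircraft counts here are illustrative only.
def pair_comparisons(n: int) -> int:
    return n * (n - 1) // 2

for radius_nmi, n_aircraft in [(2, 50), (20, 400), (200, 6000)]:
    print(f"{radius_nmi:>3} nmi: {n_aircraft} aircraft -> "
          f"{pair_comparisons(n_aircraft):,} comparisons")
```

The quadratic growth is why widening the radius from 2 to 200 nmi turns a trivial workload into a big data one.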

Page 22:

Data Scientist: Build Analytics Model for NMAC Detection


Dtree.pmml


ADS-B Exchange

Page 23:


IT Architect: Define Software architecture of two use cases

UC1: Deadline < 3600 s
UC2: Latency < 3 s


Page 24:


IT Architect: Define Deployment Strategies

Functional View (Deadline < 3600 s) → Technology Assignments → Deployments

Page 25:


IT Architect: Specify Functional and Deployment Models

Page 26:

ACCORDANT: Automatic Code Generation


Evaluator evaluator = EvaluatorUtil.createEvaluator(new File("DTree.pmml"));
TransformerBuilder pmmlTransformerBuilder = new TransformerBuilder(evaluator)
        .withTargetCols()
        .withOutputCols()
        .exploded(false);
List<StructField> fields = new ArrayList<StructField>();
fields.add(DataTypes.createStructField("a", DataTypes.IntegerType, true));
...
fields.add(DataTypes.createStructField("sz_norm", DataTypes.FloatType, true));
StructType schema = DataTypes.createStructType(fields);
Transformer pmmlTransformer = pmmlTransformerBuilder.build();
Logging.traceMetrics(Logging.DEADLINE, timestamp); // TRACING

apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: spark-worker-ex
        image: ramhiser/spark:2.0.1
        ports:
        - containerPort: 8081
        resources:
          requests:
            cpu: 0.25

ACCORDANT XMI


Page 27:

ACCORDANT: Monitoring application operation


ACCORDANT XMI


Page 28:


QS Monitoring of UC1

Page 29:

QS Monitoring of UC2


(Results shown for the 2 nmi and 20 nmi radii.)
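Monitoring a latency QS like UC2's amounts to checking traced metrics against the declared threshold. A minimal sketch, assuming a nearest-rank 95th-percentile response measure and made-up latency samples (neither is specified on these slides):

```python
import math

LATENCY_QS_SECONDS = 3.0  # UC2 quality scenario: latency < 3 s

def percentile(values, p):
    """Nearest-rank percentile of a list of observations."""
    ranked = sorted(values)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)
    return ranked[k]

def qs_satisfied(latencies_s, threshold=LATENCY_QS_SECONDS, p=95):
    """True if the p-th percentile of observed latencies meets the QS."""
    return percentile(latencies_s, p) < threshold

# Illustrative traced latencies (seconds), as Logging.traceMetrics might emit.
traced = [0.8, 1.2, 1.1, 2.4, 0.9, 1.7, 2.9, 1.3, 1.0, 1.5]
print("latency QS met:", qs_satisfied(traced))  # prints: latency QS met: True
```

In the deck's setup this comparison runs continuously over the traced metrics, so a deployment that violates its QS is detected during operation rather than after the fact.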

Page 30:

Results


Deployment Gap Reduction: -57.3%, -73.47%, -32.86%

Speeds up BDA deployment and monitoring iterations.

Page 31:

Conclusions

Page 32:


Contributions

● A DSM and DevOps approach to formalize and accelerate BDA solution development and deployment using FV and DV.
● A specification and monitoring approach for performance metrics.
● An evaluation applied to avionics use cases with different deployment strategies and quality scenarios.

We believe this work is a step forward towards reducing the deployment gap!


Page 33:


Future Work

● Train models to predict performance behavior.
● Architectural properties verification.

Open Challenges

● Design vs development effort.
● Adoption in other industry cases.
● Different deployment paradigms such as serverless or fog computing.

Page 34:

Q & A Session

[email protected]

Thanks!!!
