
Performance is not an Option - gRPC and Cassandra


Page 1: Performance is not an Option - gRPC and Cassandra

Abstract

Copyright 2016 Expero, Inc. All Rights Reserved

In today's development ecosystem, building a service-oriented architecture based on microservices is common practice. With the rise of Big Data and Internet of Things applications, making these services highly performant is no longer optional. To meet the scalability and performance requirements that customers expect, we have to think differently about how we architect and build these applications.

This session will demonstrate a method for creating a highly performant service-based application using Google’s gRPC and Apache Cassandra in .NET. We will show how you can use gRPC to minimize communication overhead while leveraging Cassandra to optimize storage of time-series data. We will explore these concepts by creating an Internet of Things (IoT) application that demonstrates how you can effectively meet the performance and scalability challenges posed by these new breeds of applications.

Page 2: Performance is not an Option - gRPC and Cassandra

Performance is not an Option
Building High-Performance Web Services with gRPC and Cassandra

June 9th, 2016

#build4prfrmnc

Page 3: Performance is not an Option - gRPC and Cassandra

What I hope you take away

● What is gRPC
● What is Cassandra
● How to build a simple gRPC microservice
● How to persist time-series data in Cassandra
● Why you might want to use gRPC/Cassandra instead of a more traditional REST/RDBMS for a time-series IoT application

Page 4: Performance is not an Option - gRPC and Cassandra

About me

Dave Bechberger
Senior Architect
[email protected]
@bechbd
https://www.linkedin.com/in/davebechberger

Page 5: Performance is not an Option - gRPC and Cassandra

Expero - Bringing challenging product ideas to reality

Architecture & Development
Product Strategy
User Experience
Domain Expert

Page 6: Performance is not an Option - gRPC and Cassandra

Expero - Select Clients

Austin (HQ) • Houston • New York City • Founded 2003

Page 7: Performance is not an Option - gRPC and Cassandra

What is gRPC?

Page 8: Performance is not an Option - gRPC and Cassandra

What is gRPC?

● gRPC is a general-purpose RPC framework
● Built on standards
● Free and Open Source
● Built for distributed systems

Page 9: Performance is not an Option - gRPC and Cassandra

gRPC Architecture

● Allows clients to call methods on the server as if they were local
● Built for low-latency, highly scalable microservices
● Payload agnostic
● Bi-Directional Streaming
● Pluggable and Extensible

Page 10: Performance is not an Option - gRPC and Cassandra


Simple Model and Service Definition

Page 11: Performance is not an Option - gRPC and Cassandra


Optimized Speed and Performance
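A large part of the speed difference between gRPC and a typical JSON-over-REST service comes from the payload: gRPC uses Protocol Buffers by default, a schema-based binary format, so field names are not repeated in every message. The sketch below is only a rough illustration of that idea. It uses Python's `struct` module as a stand-in for a binary encoding (it is not the Protocol Buffers wire format), and the reading and its field names are hypothetical.

```python
import json
import struct

# Hypothetical sensor reading; the field names are illustrative only.
reading = {"truck_id": 1042, "timestamp": 1465459200, "rpm": 1850.5, "temp_c": 92.25}

# Text encoding: field names and digits travel as characters in every message.
json_bytes = json.dumps(reading).encode("utf-8")

# Binary encoding: two unsigned 32-bit ints and two 32-bit floats, little-endian.
# The schema is agreed on ahead of time, so only the values are sent.
# This is a stand-in for a schema-based format, not the protobuf wire format.
binary_bytes = struct.pack(
    "<IIff",
    reading["truck_id"],
    reading["timestamp"],
    reading["rpm"],
    reading["temp_c"],
)

# Decoding needs the same schema on the receiving side.
decoded = struct.unpack("<IIff", binary_bytes)

print(len(json_bytes), len(binary_bytes))
```

The binary form is 16 bytes here, several times smaller than the JSON text. At tens of thousands of messages per second, that difference adds up in both bandwidth and parsing time.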

Page 12: Performance is not an Option - gRPC and Cassandra


Code Generation

Page 13: Performance is not an Option - gRPC and Cassandra

What is Cassandra?

Page 14: Performance is not an Option - gRPC and Cassandra

What is Cassandra?

● Distributed Datastore
● Open Source Apache Project
● No Single Point of Failure
● Scalable

Page 15: Performance is not an Option - gRPC and Cassandra

CAP Theorem - Pick 2

● Consistency - all nodes see the same data at the same time
● Availability - every request receives a response
● Partition Tolerance - the system continues to operate even during network failures

Page 16: Performance is not an Option - gRPC and Cassandra

ACID vs. BASE

RDBMS World

● Atomic - transactions are “all or nothing”
● Consistent - on completion, the data is left in a valid, consistent state
● Isolated - transactions do not interfere with one another
● Durable - results of a transaction are permanent

NoSQL World

● Basically Available - the datastore works most of the time
● Soft State - stores are not write consistent; data can differ between replicas
● Eventually Consistent - stores become consistent over time

Page 17: Performance is not an Option - gRPC and Cassandra

Cassandra Architecture

Page 18: Performance is not an Option - gRPC and Cassandra

Hash Ring Architecture

● All nodes own a portion of the token ring
● All nodes know which token ranges belong to which nodes
● Partitioner generates a token from the Partition Key
● Tokens determine where data is located on the ring
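The token ring lookup described above can be sketched in a few lines. This is a simplified model, not Cassandra's implementation: it assumes one token per node (no virtual nodes), a 32-bit token space, and MD5 as a stand-in for the real partitioner. The node names and token values are hypothetical.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Hash a partition key to a 32-bit token (MD5 is a stand-in partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class TokenRing:
    def __init__(self, node_tokens):
        # node_tokens maps a node name to the highest token that node owns.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def owner_of_token(self, token):
        # The owner is the first node whose token is >= the key's token;
        # past the last node, the lookup wraps around to the first one.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]

    def owner(self, partition_key):
        return self.owner_of_token(token_for(partition_key))


# Four hypothetical nodes splitting the 32-bit token space evenly.
ring = TokenRing({
    "node-a": 2**30,
    "node-b": 2**31,
    "node-c": 3 * 2**30,
    "node-d": 2**32 - 1,
})
print(ring.owner("Expero"))
```

Because every node (and the client driver) can run the same calculation, any of them can work out which node holds a given partition key without consulting a central directory.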

Page 19: Performance is not an Option - gRPC and Cassandra

How Tokens Work

[Diagram: the client driver passes the partition key (PK: Expero) to the partitioner, which generates a token (Token: 12); the data is written to the node that owns that token.]

Page 20: Performance is not an Option - gRPC and Cassandra

Data Replication

● Data replication is automatic
● Number of replicas is called the Replication Factor or RF
● Data is replicated between Data Centers
● Hinted Handoff

Page 21: Performance is not an Option - gRPC and Cassandra

Data Replication in Action

[Diagram: the client sends Write A through the driver; the partitioner maps the partition key (PK: Expero) to a token (Token: 12); the coordinator has the data written to the node that owns the token and to two replica nodes.]
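Replica placement can be sketched the same way as the token lookup. The example below assumes a simple placement rule: start at the node that owns the token and continue clockwise around the ring until the replication factor is met. It ignores racks, data centers, and virtual nodes, and the ring layout is hypothetical.

```python
import bisect


def replicas_for_token(token, ring, replication_factor):
    """Return the nodes that store a token.

    ring is a list of (token, node) pairs sorted by token, one token per node.
    """
    if replication_factor > len(ring):
        raise ValueError("replication factor cannot exceed the number of nodes")
    tokens = [node_token for node_token, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect.bisect_left(tokens, token) % len(ring)
    # Walk clockwise from the owner to collect the remaining replicas.
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


# Hypothetical four-node ring over a token space of 0..99.
ring = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]

print(replicas_for_token(12, ring, replication_factor=3))
# ['node-b', 'node-c', 'node-d']
```

With a replication factor of 3, token 12 lands on one owning node plus two replicas, matching the one "Data Written" and two "Replica Written" labels in the diagram.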

Page 22: Performance is not an Option - gRPC and Cassandra

What does it mean to be Eventually Consistent?

● Data will “eventually” match on all replicas, usually within milliseconds
● Consistency Level (CL): 11 levels for writes, 10 for reads
● Tuning consistency affects performance and availability
● CL can be tuned for read/write performance on a per-query basis using CQL
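The trade-off behind tuning consistency comes down to simple arithmetic: if the number of replicas that must acknowledge a write plus the number consulted on a read exceeds the replication factor, every read overlaps at least one replica that has the latest write. The sketch below illustrates that rule; the function names are made up for this example.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(write_replicas, read_replicas, replication_factor):
    """True when the read set and write set must share at least one replica."""
    return write_replicas + read_replicas > replication_factor


rf = 3
q = quorum(rf)

# QUORUM writes with QUORUM reads: 2 + 2 > 3, so reads overlap the last write.
print(reads_see_latest_write(q, q, rf))   # True

# ONE write with ONE read: 1 + 1 <= 3, so a read can hit a stale replica.
print(reads_see_latest_write(1, 1, rf))   # False
```

Lower consistency levels respond faster and tolerate more failed nodes, at the cost of possibly reading stale data until the replicas converge.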

Page 23: Performance is not an Option - gRPC and Cassandra

Why use Cassandra over your RDBMS?

● Performance
● Linearly Scalable
● Natively built as a distributed datastore
● Always-On Architecture

Page 24: Performance is not an Option - gRPC and Cassandra

What is DataStax vs. Apache Cassandra?

● Certified Cassandra – Delivers highest quality Cassandra software for confidence and peace of mind for production environments
● Enterprise Security – Full protection for sensitive data
● Automatic Management Services – Automates key maintenance functions to keep the database running smoothly
● OpsCenter – Advanced management and monitoring functionality for production applications
● Expert Support – Answers and assistance from the Cassandra experts for all production needs

Page 25: Performance is not an Option - gRPC and Cassandra

The Problem - Engine Monitoring

Page 26: Performance is not an Option - gRPC and Cassandra

Setup

● Truck Engine Monitoring Software
● Currently ~1,000 trucks taking readings every 10 seconds
● WebAPI REST on a SQL Server 2014 Database

Page 27: Performance is not an Option - gRPC and Cassandra

The Requirements

● You recently landed a huge new client, Expero Trucking Inc.
● Sensor readings are now taken once per second and add geolocation (lat/long) data
● Adding 10,000 trucks
● Minimize costs, with zero downtime

Page 28: Performance is not an Option - gRPC and Cassandra

The Problem

● Load grows from 100 measurements/second to 22,000 measurements/second
● Data load grows from ~35 MB/day to ~2.2 GB/day
● Your architecture needs to change
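The measurement rates follow from the truck counts and reading intervals given in the setup and requirements. The sketch below shows one way to arrive at them. It assumes that, after the change, each reading counts as two measurements (the sensor value plus the geolocation); that factor is an assumption made here to reproduce the 22,000 figure and is not stated in the slides.

```python
def measurements_per_second(trucks, seconds_between_readings, measurements_per_reading=1):
    """Steady-state measurement rate for a fleet reporting at a fixed interval."""
    return trucks * measurements_per_reading / seconds_between_readings


# Before: ~1,000 trucks, one reading every 10 seconds.
before = measurements_per_second(trucks=1000, seconds_between_readings=10)

# After: 10,000 trucks added, one reading per second.
# Assumption: sensor value + geolocation are counted as two measurements.
after = measurements_per_second(
    trucks=1000 + 10000,
    seconds_between_readings=1,
    measurements_per_reading=2,
)

print(before, after, after / before)   # 100.0 22000.0 220.0
```

Under these assumptions the write rate grows by a factor of 220, which is the scale of change that motivates rethinking both the service layer and the datastore.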

Page 29: Performance is not an Option - gRPC and Cassandra

The Solution - gRPC and Cassandra

Code available here: https://github.com/experoinc/NDC-Oslo-2016/tree/master/NDC.Oslo

Page 30: Performance is not an Option - gRPC and Cassandra

Proposed Solution

● Change SQL Server Database to a Cassandra Cluster
● Replace REST-based services with gRPC services

Page 31: Performance is not an Option - gRPC and Cassandra


Defining a Model and Service

Page 32: Performance is not an Option - gRPC and Cassandra

Generating Client/Server Stubs

[Figure panels: Model Definition, Service Definition]

Page 33: Performance is not an Option - gRPC and Cassandra


Creating Cassandra KeySpace and Table
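A common way to model time-series data in Cassandra is to bucket each device's readings by day, so that a partition holds one truck's readings for one day, ordered by time. The sketch below shows that pattern. It is not the schema from the talk's repository: the keyspace, table, and column names are hypothetical, and the CQL is held in Python strings only to keep the example self-contained.

```python
from datetime import datetime, timezone

# Hypothetical keyspace; SimpleStrategy with RF 3 is an illustrative choice.
CREATE_KEYSPACE = """
CREATE KEYSPACE IF NOT EXISTS engine_monitoring
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
"""

# Partition key: (truck_id, day). Clustering column: reading_time.
# Readings for one truck and day are stored together, newest first.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS engine_monitoring.readings (
    truck_id int,
    day text,
    reading_time timestamp,
    rpm float,
    latitude double,
    longitude double,
    PRIMARY KEY ((truck_id, day), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);
"""


def partition_key(truck_id, reading_time):
    """Return the (truck_id, day) bucket for a timezone-aware timestamp."""
    day = reading_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (truck_id, day)


reading_time = datetime(2016, 6, 9, 13, 30, 0, tzinfo=timezone.utc)
print(partition_key(1042, reading_time))   # (1042, '2016-06-09')
```

At one reading per second, a daily bucket holds at most 86,400 rows per truck, which keeps partitions bounded, and a query for "truck X on day Y" touches a single partition.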

Page 34: Performance is not an Option - gRPC and Cassandra

Connecting to Apache Cassandra using DataStax Driver

DataStax Open Source C# Driver - https://github.com/datastax/csharp-driver

Page 35: Performance is not an Option - gRPC and Cassandra


Writing Data to Cassandra

Page 36: Performance is not an Option - gRPC and Cassandra


Reading Data From Cassandra

Page 37: Performance is not an Option - gRPC and Cassandra

Time to See Some Running Code

[Figure: Average Ping Time, Running in Oregon]

Page 38: Performance is not an Option - gRPC and Cassandra

Tradeoffs of gRPC and Cassandra

Page 39: Performance is not an Option - gRPC and Cassandra

Tradeoffs of using gRPC

● Not for browsers
● “Big” data (>1 MB) must be chunked
● No nullable data types
● Not production-ready yet
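Chunking a large payload is straightforward to sketch: split the bytes into pieces below the size limit and send each piece as its own message, for example over a stream. The generator below shows only the splitting step. The 1 MiB default mirrors the ">1 MB" figure above, and the function name is made up for this example.

```python
def chunk_payload(data, chunk_size=1024 * 1024):
    """Yield consecutive slices of data, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


# A 2,500,000-byte payload splits into three chunks at 1 MiB each.
payload = bytes(2_500_000)
chunks = list(chunk_payload(payload))

print(len(chunks), [len(chunk) for chunk in chunks])
# 3 [1048576, 1048576, 402848]
```

The receiver reassembles the message by concatenating the chunks in the order they arrive.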

Page 40: Performance is not an Option - gRPC and Cassandra

Tradeoffs of using Cassandra

● No joins between tables
● No Ad-Hoc queries
● Minimal Aggregations
● Complexity
● Cassandra is Not Relational

Page 41: Performance is not an Option - gRPC and Cassandra

Learning More

● gRPC
○ http://www.grpc.io/
○ https://developers.google.com/protocol-buffers/docs/proto3
● Cassandra
○ http://cassandra.apache.org/
○ http://www.planetcassandra.org/
○ https://academy.datastax.com/
○ http://www.datastax.com/

Page 42: Performance is not an Option - gRPC and Cassandra

Thank you, any Questions?