Copyright 2016 Expero, Inc. All Rights Reserved
Abstract
In today's development ecosystem, building a service-oriented architecture based on microservices is
common practice. With the rise of Big Data and Internet of Things applications, making these services highly
performant is no longer optional. To meet the scalability and performance requirements that customers
expect, we have to start thinking differently about how we architect and build these applications.
This session will demonstrate a method for creating a highly performant service-based application using
Google’s gRPC and Apache Cassandra in .NET. We will show how you can use gRPC to minimize
communication overhead while leveraging Cassandra to optimize storage of time series data. We will explore
these concepts by creating an Internet of Things (IoT) application to demonstrate how you can effectively
meet the performance and scalability challenges posed by these new breeds of applications.
Performance is not an Option
Building High Performance Web Services with gRPC and Cassandra
June 9th, 2016
#build4prfrmnc
What I hope you take away
● What is gRPC
● What is Cassandra
● How to build a simple gRPC Microservice
● How to persist time series data in Cassandra
● Why you might want to use gRPC/Cassandra instead of a more traditional REST/RDBMS for a Time-Series IoT application
About me
Dave Bechberger
Senior Architect
[email protected]
@bechbd
https://www.linkedin.com/in/davebechberger
Expero - Bringing challenging product ideas to reality
Architecture & Development
Product Strategy
User Experience
Domain Expert
Expero - Select Clients
Austin (HQ) • Houston • New York City • Founded 2003
What is gRPC?
● gRPC is a general purpose RPC framework
● Built on standards
● Free and Open Source
● Built for distributed systems
gRPC Architecture
● Allows a client to call methods on the server as if they were local
● Built for low-latency, highly scalable microservices
● Payload agnostic
● Bi-Directional Streaming
● Pluggable and Extensible
Simple Model and Service Definition
Optimized Speed and Performance
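The slide's original figure is not reproduced here. As a rough illustration of why a compact binary payload can reduce communication overhead, the sketch below compares a JSON-encoded sensor reading with a fixed-width binary encoding. It uses Python's `struct` module as a stand-in and is not the Protocol Buffers wire format; the field names (`truck_id`, `timestamp`, `rpm`, `temperature`) and values are hypothetical.

```python
import json
import struct

# Hypothetical sensor reading; field names and values are illustrative only.
reading = {"truck_id": 4211, "timestamp": 1465459200, "rpm": 1850.5, "temperature": 92.25}

# "<IQdd": little-endian, 4-byte unsigned int, 8-byte unsigned int, two 8-byte doubles.
BINARY_LAYOUT = struct.Struct("<IQdd")


def encode_json(r):
    """Text encoding, as a typical REST/JSON service might send it."""
    return json.dumps(r).encode("utf-8")


def encode_binary(r):
    """Fixed-width binary encoding (a stand-in for a binary wire format)."""
    return BINARY_LAYOUT.pack(r["truck_id"], r["timestamp"], r["rpm"], r["temperature"])


def decode_binary(payload):
    """Inverse of encode_binary."""
    truck_id, timestamp, rpm, temperature = BINARY_LAYOUT.unpack(payload)
    return {"truck_id": truck_id, "timestamp": timestamp, "rpm": rpm, "temperature": temperature}


json_bytes = encode_json(reading)
binary_bytes = encode_binary(reading)
print(len(json_bytes), len(binary_bytes))
```

The binary form drops the repeated field names and textual numbers, which is the main source of the size difference; real savings depend on the schema and the encoding used.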
Code Generation
What is Cassandra?
● Distributed Datastore
● Open Source Apache Project
● No Single Point of Failure
● Scalable
CAP Theorem - Pick 2
● Consistency - all nodes see the same data at the same time
● Availability - every request receives a response
● Partition Tolerance - the system continues to operate even during network failures
ACID vs. BASE
RDBMS World
● Atomic - transactions are “all or nothing”
● Consistent - on completion of a transaction, all data is in a consistent state
● Isolated - transactions do not interfere with one another
● Durable - results of a transaction are permanent
NoSQL World
● Basically Available - The datastore works most of the time
● Soft State - Stores are not write consistent, data can differ between replicas
● Eventually Consistent - Stores become consistent over time
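To make "soft state" and "eventually consistent" concrete, here is a minimal sketch of replicas that temporarily disagree and are then reconciled by keeping the newest write per key (last write wins, which is how Cassandra resolves conflicting writes by timestamp). The dictionary-based replica model and the function name `merge_replicas` are assumptions for illustration, not Cassandra's implementation.

```python
def merge_replicas(replicas):
    """Reconcile replicas by keeping, per key, the value with the newest timestamp.

    Each replica is a dict mapping key -> (write_timestamp, value).
    """
    merged = {}
    for replica in replicas:
        for key, (timestamp, value) in replica.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    return merged


# Three replicas in a "soft state": one missed the latest write, one has no data yet.
replica_a = {"truck-1": (1, "idle")}
replica_b = {"truck-1": (2, "running")}
replica_c = {}

converged = merge_replicas([replica_a, replica_b, replica_c])
print(converged)
```

Once every replica adopts the merged state, they all return the same value for `truck-1`, which is the "eventually" in eventual consistency.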
Cassandra Architecture
Hash Ring Architecture
● All nodes own a portion of the token ring
● All nodes know which token ranges belong to which nodes
● Partitioner generates a token from the Partition Key
● Tokens determine where data is located on the ring
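The bullets above can be sketched as a small token ring: hash the partition key to a token, then find the node that owns the range containing that token. This is a simplified model under stated assumptions: the node names and token values are hypothetical, the token space is tiny, and MD5 is used only as a convenient stand-in (Cassandra's default partitioner is Murmur3-based).

```python
import bisect
import hashlib


def token_for(partition_key, ring_size=100):
    """Map a partition key to a token in [0, ring_size). MD5 is illustrative only."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ring_size


class TokenRing:
    """Each node owns the token range that ends at its own token."""

    def __init__(self, node_tokens):
        # node_tokens: {node_name: token}
        self._tokens = sorted(node_tokens.values())
        self._owner = {token: name for name, token in node_tokens.items()}

    def owner_of(self, token):
        """Return the first node whose token is >= the given token, wrapping around."""
        index = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        return self._owner[self._tokens[index]]


# Hypothetical four-node ring.
ring = TokenRing({"node-a": 25, "node-b": 50, "node-c": 75, "node-d": 100})
print(ring.owner_of(12))                    # owner of token 12
print(ring.owner_of(token_for("Expero")))   # owner for a hashed partition key
```

Because every node knows the full token-to-node mapping, any node can route a request to the owner without a central lookup service.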
How Tokens Work
[Diagram: the client driver passes partition key "Expero" to the partitioner, which produces token 12; the data is written to the node that owns that token.]
Data Replication
● Data replication is automatic
● Number of replicas is called the Replication Factor or RF
● Data is replicated between Data Centers
● Hinted Handoff
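As a rough sketch of how a Replication Factor turns into a set of replica nodes, the function below picks the node that owns a token and then walks the ring clockwise until RF nodes are chosen. This approximates simple single-data-center placement only; rack- and data-center-aware placement and hinted handoff are not modeled, and the node names and tokens are hypothetical.

```python
import bisect


def replicas_for(token, node_tokens, replication_factor):
    """Return the owning node plus the next nodes clockwise, up to RF nodes."""
    ring = sorted((t, name) for name, t in node_tokens.items())
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(replication_factor, len(ring))  # cannot have more replicas than nodes
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node ring with RF = 3.
nodes = {"node-a": 25, "node-b": 50, "node-c": 75, "node-d": 100}
print(replicas_for(12, nodes, replication_factor=3))
```

With RF = 3, losing any single node still leaves two copies of every row, which is what allows reads and writes to continue during a node failure.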
Data Replication in Action
[Diagram: the client sends Write A through the driver to a coordinator node; the partitioner maps PK "Expero" to token 12; the data is written to the owning node and to two replica nodes.]
What does it mean to be Eventually Consistent?
● Data will “eventually” match on all replicas, usually within milliseconds
● Consistency Level (CL): 11 levels for writes, 10 for reads
● Tuning consistency affects performance and availability
● CL can be tuned for read/write performance on a per-query basis using CQL
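The tradeoff in tuning CL can be expressed with simple arithmetic: a QUORUM is a strict majority of the replicas, and a read is guaranteed to overlap the latest write when the read and write replica counts add up to more than the Replication Factor (R + W > RF). The sketch below covers only a subset of consistency levels in a single data center; the function names are illustrative.

```python
def quorum(replication_factor):
    """Replicas that must respond at QUORUM: a strict majority of RF."""
    return replication_factor // 2 + 1


def required_replicas(level, replication_factor):
    """Replicas that must acknowledge a request at the given level (subset of levels)."""
    required = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def reads_see_latest_write(write_level, read_level, replication_factor):
    """True when the read and write replica sets must overlap: R + W > RF."""
    w = required_replicas(write_level, replication_factor)
    r = required_replicas(read_level, replication_factor)
    return r + w > replication_factor


print(quorum(3))
print(reads_see_latest_write("QUORUM", "QUORUM", 3))
print(reads_see_latest_write("ONE", "ONE", 3))
```

Lower levels such as ONE respond faster and tolerate more node failures, at the cost of possibly reading stale data; higher levels trade availability and latency for stronger guarantees.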
Why use Cassandra over your RDBMS?
● Performance
● Linearly Scalable
● Natively built as a distributed datastore
● Always-On Architecture
What is DataStax vs. Apache Cassandra?
● Certified Cassandra – Delivers highest quality Cassandra software for confidence and peace of mind for production environments
● Enterprise Security – Full protection for sensitive data
● Automatic Management Services – Automates key maintenance functions to keep the database running smoothly
● OpsCenter – Advanced management and monitoring functionality for production applications
● Expert Support – Answers and assistance from the Cassandra experts for all production needs
The Problem - Engine Monitoring
Setup
● Truck Engine Monitoring Software
● Currently ~1,000 trucks taking readings every 10 seconds
● WebAPI REST on a SQL Server 2014 Database
The Requirements
● You recently landed a huge new client, Expero Trucking Inc.
● Sensor readings are now taken once per second and include geolocation (lat/long) data
● Adding 10,000 trucks
● Minimize costs, with zero downtime
The Problem
● Load grows from 100 measurements/second to 22,000 measurements/second
● Data load grows from ~35 MB/day to ~2.2 GB/day
● Your architecture needs to change
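The current rate follows directly from the setup: 1,000 trucks reporting every 10 seconds is 100 measurements per second. The slides do not show how the 22,000 figure was derived. One reading that reproduces it, offered here purely as an assumption, is that the 10,000 new trucks are added to the existing 1,000 and each truck now sends two measurements per second (the sensor reading plus geolocation).

```python
def measurements_per_second(trucks, interval_seconds, measurements_per_reading=1):
    """Aggregate measurement rate for a fleet reporting at a fixed interval."""
    return trucks * measurements_per_reading / interval_seconds


# From the slides: ~1,000 trucks, one reading every 10 seconds.
current = measurements_per_second(trucks=1_000, interval_seconds=10)

# Assumption (not stated in the slides): 11,000 trucks in total, each sending
# two measurements per second (sensor reading + geolocation).
projected = measurements_per_second(
    trucks=11_000, interval_seconds=1, measurements_per_reading=2
)

growth_factor = projected / current
print(current, projected, growth_factor)
```

Whatever the exact derivation, the load grows by roughly two orders of magnitude, which is the point of the slide.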
The Solution - gRPC and Cassandra
Code Available Here https://github.com/experoinc/NDC-Oslo-2016/tree/master/NDC.Oslo
Proposed Solution
● Change SQL Server Database to a Cassandra Cluster
● Replace REST-based services with gRPC services
Defining a Model and Service
Generating Client/Server Stubs
Model Definition Service Definition
Creating Cassandra KeySpace and Table
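The keyspace and table from this slide were shown as an image and are not reproduced here. As a hedged illustration of a common time-series modeling pattern in Cassandra, the sketch below buckets readings by truck and UTC day, so that each partition holds one truck's readings for one day, ordered by time. The key layout and function names are assumptions, not the schema used in the talk.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(truck_id, timestamp):
    """Bucket by truck and UTC day so no single partition grows without bound."""
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
    return (truck_id, day)


def group_by_partition(readings):
    """Group (truck_id, timestamp, value) tuples; keep each partition sorted by time."""
    partitions = defaultdict(list)
    for truck_id, timestamp, value in readings:
        partitions[partition_key(truck_id, timestamp)].append((timestamp, value))
    for rows in partitions.values():
        rows.sort()  # mimics a clustering column ordered by timestamp
    return dict(partitions)


# Hypothetical readings: (truck_id, unix_timestamp, value).
sample = [(1, 86400, 5.0), (1, 10, 2.0), (1, 5, 1.0)]
grouped = group_by_partition(sample)
print(grouped)
```

Querying one truck for one day then touches a single partition, which is the access pattern this kind of model is designed around.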
Connecting to Apache Cassandra using DataStax Driver
DataStax Open Source C# Driver - https://github.com/datastax/csharp-driver
Writing Data to Cassandra
Reading Data From Cassandra
Time to See Some Running Code
Average Ping Time
Running in Oregon
Tradeoffs of gRPC and Cassandra
Tradeoffs of using gRPC
● Not for Browsers
● Chunk “Big” (>1 MB) Data
● No Nullable Data Types
● Not Production Yet
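The advice to chunk "big" data can be sketched as splitting a payload into pieces no larger than a size limit, which could then be sent as individual messages over a stream and reassembled on the other side. The 1 MB default mirrors the figure on the slide; the function names are hypothetical and the gRPC streaming call itself is not shown.

```python
def chunk_payload(payload, max_chunk_bytes=1024 * 1024):
    """Yield successive slices of payload, each at most max_chunk_bytes long."""
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    for offset in range(0, len(payload), max_chunk_bytes):
        yield payload[offset:offset + max_chunk_bytes]


def reassemble(chunks):
    """Join chunks back together in the order they were received."""
    return b"".join(chunks)


data = bytes(2_500_000)  # 2.5 MB of zero bytes as a stand-in payload
chunks = list(chunk_payload(data))
print([len(c) for c in chunks])
```

Each chunk stays under the limit, and because a stream preserves message order, the receiver can rebuild the payload by concatenating chunks as they arrive.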
Tradeoffs of using Cassandra
● No joins between tables
● No Ad-Hoc queries
● Minimal Aggregations
● Complexity
● Cassandra is Not Relational
Learning More
● gRPC
○ http://www.grpc.io/
○ https://developers.google.com/protocol-buffers/docs/proto3
● Cassandra
○ http://cassandra.apache.org/
○ http://www.planetcassandra.org/
○ https://academy.datastax.com/
○ http://www.datastax.com/
Thank you. Any questions?