Building Highly Scalable Spring Applications using In-Memory Data Grids

  • Published on
    23-Feb-2017


Transcript


Building Highly Scalable Spring Applications using In-Memory Data Grids
By John Blum & Luke Shannon
@john_blum

SpringOne 2GX, Washington, DC

Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/

Presenters

John Blum (@john_blum)
  • Spring Data GemFire Project Lead
  • Apache Geode Committer
  • GemFire Engineer/Technical Lead
  • Pivotal Software, Inc.

Luke Shannon
  • Field/Community Engineer
  • Apache Geode Committer
  • Pivotal Software, Inc.


We are awesome!

Agenda

  • Introduction to Apache Geode
      • Distributed System & In-Memory Database Concepts
  • Overview of Spring Data GemFire
      • How to build highly scalable applications
  • Spring Data GemFire in Action
      • Fast Foot Shoes Demo
      • Caching Demo (?)
  • What's New
  • Q&A


The introduction to Apache Geode will cover the Why, the What and the How.

Let's get started.

Why Apache Geode?

Motivation
  • Volume of Data (Big Data)
  • Velocity of Data (Fast Data)
  • Veracity of Data (Data Accuracy)

Enables new and existing Spring applications to operate at cloud-scale in a consistent, highly-available and predictable manner in order to transact and analyze big, fast data in real-time thereby achieving meaningful and impactful business results.


Need to manage large quantities of data under extreme load with accuracy and resilience in a reliable way.

Big Data == data lake (any and all data)
Fast Data == processing streams of events in (near) real-time

All about Data Access

What is Apache Geode?

In a nutshell:

  • Open Source core of Pivotal GemFire
      https://pivotal.io/big-data/pivotal-gemfire
  • Apache Incubator project
      https://wiki.apache.org/incubator/GeodeProposal
      http://geode.incubator.apache.org/


What is Apache Geode?

"A distributed, in-memory compute and data management platform that elastically scales to achieve high-throughput, low-latency access to big, fast data powering business critical, analytical applications in real-time."
John Blum

[Figure: elastic capacity. Axes: +/- Nodes vs. Ops/Sec. Labels: linear scalability; latency-optimized data distribution.]


Data is stored in-memory for improved performance (lower latency access) and distributed across the cluster for high-availability (high read/write throughput) with the option to persist data to disk (durability).

Scale out rather than up.

Throughput (the number of operations per second) increases as more nodes are added to the cluster.

Data is stored in distributed, highly-concurrent, in-memory data structures to minimize context switching and contention.

Data is replicated & partitioned for fast, predictable read/write throughput.

Apache Geode Use Cases

  • Persistent, OLTP/OLAP Database (System of Record)
  • JSR-107 Cache Provider (Key/Value Store)
  • HTTP Session State Management
  • Distributed L2 Caching for Hibernate
  • Memcached Server (Gemcached)
  • Message Bus with guaranteed message delivery
  • Glorified version of ConcurrentHashMap


Database: ACID properties, local/global (JTA) transaction capable, Indexing, Querying (OQL) and Functions

Cache: Eviction, Expiration, Overflow (to Disk), Read-Through, Write-Through and Write-Behind

Messaging: Apache Geode enables event-based application architectures with Register Interests (RI), and Pivotal GemFire builds on that with Continuous Queries (CQ)

ConcurrentMap: a Region implements the java.util.concurrent.ConcurrentMap interface.
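Because the data structure implements java.util.concurrent.ConcurrentMap, application code written against that interface can rely on its atomic operations. The following is a minimal sketch only: a plain ConcurrentHashMap stands in for a Region so the example runs without a cluster, and the class, method names and seat data are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentMapSketch {

    // Reserves a seat only if nobody holds it yet; returns true on success.
    // putIfAbsent is atomic, so two callers cannot both win the same seat.
    static boolean reserve(ConcurrentMap<String, String> seats, String seat, String passenger) {
        return seats.putIfAbsent(seat, passenger) == null;
    }

    // Transfers a seat only if it is still held by the expected passenger.
    static boolean transfer(ConcurrentMap<String, String> seats, String seat, String from, String to) {
        return seats.replace(seat, from, to);
    }

    public static void main(String[] args) {
        // Stand-in for a Region<String, String> (assumption: no cluster needed).
        ConcurrentMap<String, String> seats = new ConcurrentHashMap<>();

        System.out.println(reserve(seats, "12A", "alice"));           // first reservation
        System.out.println(reserve(seats, "12A", "bob"));             // seat already taken
        System.out.println(transfer(seats, "12A", "alice", "carol")); // conditional update
        System.out.println(seats.get("12A"));
    }
}
```

Code written this way depends only on the ConcurrentMap contract, so swapping the stand-in map for a real Region should not require changing the two helper methods.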

China Railway Corporation

GemFire runs on ten primary x86 servers with over two terabytes of memory, and there are ten backup servers. This has replaced the 72 UNIX boxes and traditional RDBMS with a more efficient, cost-effective approach.

With so many people relying on the website for travel, it must be continuously available. Demand has far exceeded expectations and the future shows as much as 50% growth per year as mobile phone access is added.

https://pivotal.io/big-data/case-study/scaling-online-sales-for-the-largest-railway-in-the-world-china-railway-corporation

China Railway Corporation

  • 20 million users per day; 40,000 visits per second
  • 4.5 million ticket purchases
  • Spikes of 15,000 tickets sold per minute

"The system is operating with solid performance and uptime. Now, we have a reliable, economically sound production system that supports record volumes and has room to grow."
Dr. Jiansheng Zhu, Vice Director of China Academy of Railway Sciences


How Apache Geode Works

  • Stores data In-Memory
      • JVM Heap + Off-Heap
  • Functions as a Distributed System, In-Memory Data Grid (IMDG)
      • Pools system resources across multiple nodes in a cluster to manage both application state and behavior
      • Includes: Memory, CPU, Network & (optionally) Disk
  • (optional) Stores data to Disk
      • In OPLOGS | HDFS for Overflow & Persistence


In a nutshell, this is how Apache Geode is implemented under the hood.

Stores data in-memory with puts.

Stores data to disk (synchronously by default, or asynchronously) on persistence and overflow.

Oplogs are append-only, so compaction is necessary.

HDFS support is new, and Geode can feed Apache Spark processing streams.
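Why an append-only log needs compaction can be shown with a small sketch. This is an illustration of the general technique only, not Geode's actual oplog format or compaction algorithm; the Record type and the compact method are hypothetical names, and a null value is assumed to mark a destroy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OplogCompactionSketch {

    // One appended record: a put (value != null) or a destroy (value == null).
    static final class Record {
        final String key;
        final String value;

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Rewrites the log, keeping only the latest live record for each key.
    static List<Record> compact(List<Record> log) {
        Map<String, Record> latest = new LinkedHashMap<>();
        for (Record r : log) {
            latest.remove(r.key);       // the older record for this key is now garbage
            if (r.value != null) {
                latest.put(r.key, r);   // re-insert so order follows the last write
            }
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Record> log = new ArrayList<>();
        log.add(new Record("a", "1"));
        log.add(new Record("b", "2"));
        log.add(new Record("a", "3"));   // supersedes a=1
        log.add(new Record("b", null));  // destroys b
        log.add(new Record("c", "4"));

        for (Record r : compact(log)) {
            System.out.println(r.key + "=" + r.value);
        }
    }
}
```

Every update and destroy only appends, so the log keeps growing with superseded records until a compaction pass like the one above rewrites it.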

Memory Management

Apache Geode manages memory using:
  • Eviction: LRU
  • Expiration: Time-To-Live (TTL), Idle Timeout (TTI)
  • Auto resource management: critical/eviction HEAP % thresholds
  • (Region) Data Compression: Snappy
  • JVM/GC Tuning


Automatic resource management can actually prevent Cache (Region) put operations.
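The LRU eviction policy named on the Memory Management slide can be illustrated with the JDK alone. This is a minimal sketch, assuming a simple entry-count limit; Geode's own eviction is configured on the Region and is not shown here, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruEvictionSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruEvictionSketch(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the
    // least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruEvictionSketch<String, Integer> cache = new LruEvictionSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the eviction candidate
        cache.put("c", 3); // exceeds the limit and evicts "b"
        System.out.println(cache.keySet());
    }
}
```

Note that this stand-in is not thread-safe; it is only meant to show which entry an LRU policy chooses to evict.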


Where to begin?

[Diagram: an Application and Data Nodes]

What about load?


All about Data Storage & Access

  • Start with a GemFire/Geode Data Node, a single Cache Node
  • Add Distributed Regions to store data
  • Perhaps start a cache server and connect a cache client application, or
  • An application peer cache node (with embedded cache)

Where to begin?

[Diagram: Data Node]


Except, what happens when too many clients overload the node? OutOfMemoryErrors!

Where to begin?

[Diagram: Locator]

High read throughput. What about writes?


Form a GemFire cluster with a Locator (or multicast networking)

  • Scale-out to handle load
  • Data is Highly Available
  • Durable with Replication & Disk Persistence
  • Resilient to node failure; shared-nothing architecture (each node is independent)

Client Connection Pool with Locator
  • Load Balancing
  • Failover
  • Single-hop, low/predictable latency data access

Where to begin?

[Diagram: Locator]

High read/write throughput. What about consistency?


Switch to PARTITION Regions (shard the data)
  • High read/write throughput
  • Control of redundancy level, partitioning policy (default is hash by key; use PartitionResolver to customize), and collocation
  • Automatic rebalance and restore redundancy in the case of peer data node failure
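The idea of hashing a key to a bucket, and of customizing the routing so that related entries are collocated, can be sketched in plain Java. This is an illustration of the technique rather than Geode's implementation: the bucket count of 113, the "customerId:orderId" key layout, and all names below are assumptions, and the BY_CUSTOMER function merely plays the role a PartitionResolver would play.

```java
import java.util.function.Function;

public class PartitionRoutingSketch {

    // Assumed bucket count for the illustration.
    static final int TOTAL_BUCKETS = 113;

    // Maps a routing object to a bucket; floorMod keeps the result non-negative
    // even when hashCode() is negative.
    static int bucketFor(Object routingObject, int totalBuckets) {
        return Math.floorMod(routingObject.hashCode(), totalBuckets);
    }

    // Hypothetical custom routing: keys look like "customerId:orderId" and are
    // routed by the customer part only, so one customer's orders share a bucket.
    static final Function<String, Object> BY_CUSTOMER =
            key -> key.substring(0, key.indexOf(':'));

    public static void main(String[] args) {
        String order1 = "c42:o1";
        String order2 = "c42:o2";

        // Default policy: hash the whole key; the two orders may land anywhere.
        System.out.println(bucketFor(order1, TOTAL_BUCKETS));
        System.out.println(bucketFor(order2, TOTAL_BUCKETS));

        // Custom policy: hash the routing object; both orders land together.
        System.out.println(bucketFor(BY_CUSTOMER.apply(order1), TOTAL_BUCKETS));
        System.out.println(bucketFor(BY_CUSTOMER.apply(order2), TOTAL_BUCKETS));
    }
}
```

Collocating a customer's entries this way is what allows a query or function touching one customer to be served by a single node.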

Partition Region


Consistency is achieved by writing to the Primary PARTITION (and then secondaries) and using a Distributed Ack message policy.

Where to begin?

Locator


Data is rebalanced automatically and redundancy is restored in the case of data node failure.

Network Partition Resolution


Auto-reconnect by disconnected data nodes.

Apache Geode Topologies


Summary of Apache Geode

  • Open Source
  • In-Memory
  • Distributed
  • Scalable (scale-out)
  • High Throughput & Low/Predictable Latency
  • Highly Available
  • Consistent
  • Durable
  • Fault Tolerant (resilient)
  • Data-Aware / Parallel Compute


Consistency, high availability and low latency are important aspects for enabling fast, responsive, resilient and accurate applications at scale.

Other Features of Apache Geode

  • Data Serialization (PDX, Java Serialization)
  • Delta Propagation
  • Transactions (Local & Global JTA-based TX)
  • Querying (OQL) + Continuous Queries
  • Functions (Stored Procedures)
  • Native Client Support (C#/C++)
  • REST API
  • Management & Monitoring (JMX with Gfsh & Pulse)
  • Security (Auth, Secure Transport with SSL, No Encryption)
  • Statistics & Logging (Log4j)


  • Data Structures: User-defined Classes, JSON, PDX
  • Functions == MapReduce; Scatter-Gather
  • Publish / Subscribe with Register Interest (RI) & Continuous Queries (CQ) using reliable, async message queues
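The scatter-gather pattern behind "Functions == MapReduce" can be sketched with the JDK's CompletableFuture. This is an illustration of the pattern only, not Geode's Function Execution API: in-process lists stand in for data partitions held by different nodes, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class ScatterGatherSketch {

    // Runs fn against every partition in parallel (scatter), then sums the
    // partial results (gather).
    static <T> long scatterGather(List<List<T>> partitions, Function<List<T>, Long> fn) {
        List<CompletableFuture<Long>> futures = new ArrayList<>();
        for (List<T> partition : partitions) {
            futures.add(CompletableFuture.supplyAsync(() -> fn.apply(partition)));
        }
        long total = 0;
        for (CompletableFuture<Long> future : futures) {
            total += future.join(); // blocks until that partition's result arrives
        }
        return total;
    }

    public static void main(String[] args) {
        // Three lists stand in for the data held by three nodes.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Arrays.asList(6));

        long sum = scatterGather(partitions,
                p -> p.stream().mapToLong(Integer::longValue).sum());
        System.out.println(sum);
    }
}
```

The point of the pattern is that each partial computation runs where its data lives, and only the small partial results travel back to the caller.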

Spring Data GemFire


"Simple things should be simple; complex things should be possible."
Alan Kay

Why Spring Data GemFire?

Spring Data GemFire (SDG)

Applies Spring's powerful, non-invasive programming model to simplify configuration and development of Apache Geode applications.

Spring Ecosystem Integration
  • Spring Cache Abstraction / Transaction Management
  • Spring Data Commons + REST
  • Spring Integration (Inbound/Outbound Channel Adapters)
  • Spring Session (coming soon)
  • Spring XD (Sources & Sinks)


And Spring's programming model is applied in a consistent, familiar manner with other Spring portfolio projects.

Apache Geode with Spring Data GemFire and Spring's Cache Abstraction is a JSR-107 (JCache) caching provider.


Spring Data GemFire Use Cases

  • Configure & Bootstrap Apache Geode
      • Replacement for cache.xml; can be used with Cluster Configuration
  • Build an Application Peer Cache (Cache)
      • Embedded Cache
  • Build an Application Cache Client (ClientCache)
      • Client/Server


Spring Data GemFire Features

  • Pivotal GemFire / Apache Geode Configuration & Bootstrapping
  • Spring Data Repositories
      • CRUD + Querying (OQL) + (basic) POJO Mapping
  • Annotation-based Function Implementation / Execution
  • Transaction Management with Spring Transaction Infrastructure
  • Exception Translation into Spring DAO Exception Hierarchy
  • Register Interests & Continuous Querying Support
  • JSON Region Proxies
  • Data Snapshots Imports/Exports
  • Client/Server Applications
  • WAN Architecture Support
  • GemFire / Geode caching provider support in Spring's Cache Abstraction


Unfortunately, this slide was not big enough to list all the features.

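Spring Data GemFire / Apache Geode Coordinates

<repository>
  <id>spring-libs-snapshot</id>
  <name>Spring Maven libs-snapshot Repository</name>
  <url>https://repo.spring.io/libs-snapshot</url>
</repository>

<dependency>
  <groupId>org.springframework.data</groupId>
  <artifactId>spring-data-gemfire</artifactId>
  <version>1.7.0.APACHE-GEODE-EA-SNAPSHOT</version>
</dependency>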


Same for Gradle

Configuration & Bootstrapping


Configuring Apache Geode with Spring

  • XML
  • Java-based Configuration
