Home of Redis
Analytics at the Speed of Business with Redis and Spark
Leena Joshi, VP Product Marketing
Noel Yuhanna, Principal Analyst, Forrester
Agenda
• Why Data & Analytics Need to be Real Time
• Drivers and Challenges for Real-Time Analytics
• The Roadmap to Fast Data
• Recommendations
• Brief Introduction to Redis
• Analytics with Redis
• Redis-Spark Integration
• Making Analytics Cost Effective
• Extended analytics with Redis Modules
Noel Yuhanna – 20 min
Leena Joshi – 20 min
Running Analytics At The Speed Of Your Business
Noel Yuhanna, Principal Analyst
RedisLabs Webinar
© 2016 Forrester Research, Inc. Reproduction Prohibited
Data bottlenecks are creating business bottlenecks that are impacting growth and innovation!
Data is the new currency, the new oil
Digital transformation is all about the data…
But what if your data is slow and is not being utilized for analytics, or not in a timely manner?
Today business users think of analytics as a set of boring reports and dashboards … they don’t want yesterday’s data tomorrow!
12% of enterprise data is used for analytics.
Source: Forrester
Performance remains a key database challenge
Trends affecting your database strategy
› Increasing transaction volume
› Data volume explosion
› Continuous 24x7 availability
› Stronger security measures
› All types of data formats
› New analytical requirements
› Faster access to information
› Correlated/unified data access
› More self-service capabilities
› Unpredictable workloads/patterns
Businesses want real-time access to information…
› Mobile devices – we need data now!
› Competitive pressure – to act more quickly
› Pressure from businesses (LOB) - to support real-time data access
› New insights, advanced analytics – real-time BI
› Global business – that needs global real-time access
› IoT applications – sensors, devices…
› Lower cost of memory and computing
TREND – The need for Fast Data
[Diagram labels: Mostly Batch, Real-time, FAST DATA]
What is Fast Data?
Fast Data is combining Systems of Engagement (batch) and Systems of Record (real-time) together quickly to support new next-generation business analytics.
Systems of engagement (SOE)
• Mobile, web, and smart devices
• Frequent changes
• Delight clients
• Delivered frequently
Systems of record (SOR)
• Stable requirements
• Highly transactional
• Less change
• Delivered infrequently
Forrester estimates that 20% of all data in an enterprise is Fast Data, and that this share will double over the next three years.
[Diagram labels: Traditional Data, Real-time data, Fast Data]
Key capabilities you need for a Fast Data strategy
› Distributed In-memory computing layer
› Low-latency access to large volumes of data
› Ability to integrate data from disparate data sources
› Continuous availability of the database/data platform
› Support for scale-out architecture to support extreme scale
› Ability to support hybrid environment – on-prem and cloud
› Easy to deploy, highly automated and with built-in intelligence
Apache Spark offers new possibilities…
› Open source distributed computing framework that uses an in-memory platform to scale, process and provide low-latency access
› Key benefits: i) Performance, ii) Supports streaming and complex analytics, iii) Supports SQL, iv) Easy to write apps using Java, Scala or Python
› Use cases: i) Sensor data processing, ii) Stream processing, iii) Interactive analytics and data processing platform, iv) Interactive algorithms in machine learning, v) IoT analytics, vi) Complex analytics
› Adoption: Current adoption of Apache Spark is estimated at 30% in large enterprises and is likely to double in the next three years
Road Map for your Fast Data Strategy
Recommendations
› In the era of big data, you need to look beyond traditional data architectures to succeed and gain competitive advantage.
› A fast data strategy needs to be on your roadmap, focusing on making data available more quickly to business users and decision makers.
› Look for automation, simplification and ease of use in database solutions that can help support faster time-to-value initiatives.
› Look at in-memory and scale-out architectures to support new business analytics to grow the business and innovate.
› Look at open source that can provide lower cost and deliver a platform to support your fast data strategy.
Thank you
Noel Yuhanna
www.forrester.com
Twitter: @nyuhanna
Who We Are
The open source home and commercial provider of Redis
Open source. The leading in-memory data structure store, supporting any high performance operational or analytic use case.
Redis is a Game Changer
Simplicity (through Data Structures)
Extensibility (through Redis Modules)
Performance
Data structures: Strings, Lists, Sets, Sorted Sets, Hashes, HyperLogLogs, Geospatial Indexes, Bitmaps, Bit fields
Simplicity: Data Structures - Redis’ Building Blocks
Strings, Lists, Sets, Sorted Sets, Hashes, HyperLogLogs, Geospatial Indexes, Bitmaps
• Used by developers like “Lego” blocks
• Enables data to be processed at the database level rather than the application level
• Turns complex functionality into a single command, such as: “Get the e-mail address of the user with the highest score in a game that started on July 24th at 11:00pm PST”
  ZREVRANGE 07242015_2300 0 0
• Enable solving complex problems by creating relations between data structures, using standard or custom (Lua) commands
• The result: cleaner, more elegant code, faster execution time
Extensibility: Modules Extend Redis Infinitely
• Add-ons that use a Redis API to seamlessly add new use cases and data structures
• Modules enjoy Redis’ simplicity, super high performance, infinite scalability and high availability
• Modules can be created by anyone. Certified by Redis Labs.
Full Text Search, Enhanced JSON, Graph Operations, Secondary Indexes, Linear Algebra, SQL Support, Image Processing, N-Dimension Queries, …
Performance: the Most Powerful Database
Highest Throughput at Lowest Latency in High Volume of Writes Scenario
Lowest number of servers needed to deliver 1 Million writes/second
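[Bar chart: number of servers needed to deliver 1 million writes/second. Values legible in the extraction: 300, 50, 50, 20; the labels identifying each system did not survive.]
Benchmarks performed by Avalon Consulting Group
Benchmarks published in the Google blog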
Wide Adoption
Redis Cloud: available since mid-2013, 6,100+ enterprise customers
Redis Labs Enterprise Cluster (RLEC): available since early 2015, 100+ enterprise customers
Why Use Redis in Analytics
Popular Redis Use Cases
• Geo Search: Location-based Applications
• Data Ingestion: High Throughput Buffering
• Social Functionality: Following, Followers, Relations
• Job & Queue: Any Business Application
• Caching: Any Web or Mobile App
• High Speed Transactions: Business Applications
• Time-Series: Time-Based Analysis
• Analytics: Real-time Computations
Example: Redis for Bid Management
The Application Problem
• Many users bidding on items
• Need to instantly show who’s leading, in what order and by how much
• May also need to display analytics, such as how many users are bidding in what range
• Disk-based DBMSs are too slow for real-time, high-scale calculations
Why Redis Rocks This
• Sorted sets automatically keep the list of users and scores updated and in order (ZADD)
• ZRANGE, ZREVRANGE will get your top users
• ZRANK will get any user’s rank instantaneously
• ZCOUNT will return a count of users in a range
• ZRANGEBYSCORE will return all the users in a range by their bids
Redis Sorted Sets
ZADD item:1 10000 id:2 21000 id:1
ZADD item:1 34000 id:3 35000 id:4
ZINCRBY item:1 10000 id:3
ZREVRANGE item:1 0 0
  id:3

Resulting sorted set item:1 (highest bid first):
id:3  44000
id:4  35000
id:1  21000
id:2  10000
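The bid sequence on this slide can be traced without a Redis server. The sketch below is a small pure-Python stand-in for a sorted set, not Redis itself and not the redis-py client. The class name `SortedSet` and its method signatures are assumptions made for illustration; only the command names and the bid values come from the slide.

```python
class SortedSet:
    """Tiny in-memory model of a Redis sorted set (illustration only)."""

    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, member, score):
        # Insert a member or overwrite its score.
        self._scores[member] = score

    def zincrby(self, amount, member):
        # Raise a member's score and return the new value.
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def _ascending(self):
        # Order by score, then by member name for equal scores.
        return sorted(self._scores.items(), key=lambda kv: (kv[1], kv[0]))

    def zrevrange(self, start, stop):
        # Highest score first; stop is inclusive. Negative indexes are not modelled.
        ranked = self._ascending()[::-1]
        return [member for member, _ in ranked[start:stop + 1]]

    def zrevrank(self, member):
        # 0-based position counted from the highest score.
        return self.zrevrange(0, len(self._scores) - 1).index(member)

    def zcount(self, low, high):
        # Number of members with low <= score <= high.
        return sum(1 for s in self._scores.values() if low <= s <= high)

    def zrangebyscore(self, low, high):
        # Members with low <= score <= high, lowest score first.
        return [m for m, s in self._ascending() if low <= s <= high]


# Replay the bids for item:1 from the slide.
bids = SortedSet()
bids.zadd("id:2", 10000)
bids.zadd("id:1", 21000)
bids.zadd("id:3", 34000)
bids.zadd("id:4", 35000)
bids.zincrby(10000, "id:3")       # id:3 raises its bid to 44000

leader = bids.zrevrange(0, 0)     # current highest bidder
ranking = bids.zrevrange(0, 3)    # full leaderboard
mid_range = bids.zcount(20000, 40000)
```

After the increment, `leader` holds `id:3`, which matches the ZREVRANGE result on the slide. Against a real server the same steps would be single commands, and the ordering would be maintained by Redis rather than recomputed on each read as this model does.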
Example: Redis for Recommendations
The Application Problem
• Users, items, likes, dislikes, similarities
• Set comparisons of user likes and user dislikes should help create similarity scores, which can then be stored in a sorted set
• Set comparisons of similar users’ likes/dislikes with items not purchased by the current user should yield suggestions
• High speed and low latency requirements
Why Redis Rocks This
• Redis Sets are unordered collections of strings; use SADD to add objects to each tag
• Set operations are executed in memory at blazing-fast speeds
• SINTER, SINTERSTORE to intersect multiple sets
• SUNIONSTORE to combine (union) multiple sets
• SISMEMBER to determine membership, SMEMBERS to retrieve all values
• Sets and Sorted Sets combined are a great choice for recommendation engines
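The set-comparison idea above can be sketched with Python's built-in sets standing in for Redis Sets. The slides do not say how similarity scores are computed; Jaccard similarity is used here purely as an assumption, dislikes are left out for brevity, and the user names, item ids and the `jaccard` and `suggest` helpers are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest(user, likes, purchased, top_n=1):
    """Items liked by the most similar users that `user` has not bought or liked."""
    # Rank the other users by similarity; in Redis these scores could be
    # kept in a sorted set so the ranking is always up to date.
    neighbours = sorted(
        ((jaccard(likes[user], likes[other]), other)
         for other in likes if other != user),
        reverse=True,
    )[:top_n]

    suggestions = set()
    for score, other in neighbours:
        if score > 0:
            # Set difference plays the role of "items not purchased by current user".
            suggestions |= likes[other] - purchased[user] - likes[user]
    return suggestions


# Made-up sample data for illustration.
likes = {
    "alice": {"item:1", "item:3", "item:14"},
    "bob": {"item:1", "item:3", "item:22"},
    "carol": {"item:24"},
}
purchased = {"alice": {"item:1"}, "bob": set(), "carol": set()}

result = suggest("alice", likes, purchased)
```

Here `bob` is the closest match to `alice` (two shared likes out of four distinct items), so the one item he likes that she has neither liked nor bought becomes the suggestion.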
Redis Sets
SADD item:1 tag:1 tag:22 tag:24
SADD tag:1 item:1 item:3
SADD tag:2 item:22 item:14 item:3
SINTER tag:1 tag:2
  item:3
SUNIONSTORE tag:x tag:1 tag:2
SMEMBERS tag:x
  item:1 item:3 item:22 item:14

Resulting sets:
item:1  {tag:1, tag:22, tag:24}
tag:1   {item:1, item:3}
tag:2   {item:22, item:14, item:3}
tag:x   {item:1, item:3, item:22, item:14}
Redis & Spark
Spark & Redis – Serving Layer & Accelerator
Internal accelerator
Accelerate Spark Time-Series with Redis
Redis sorted sets accelerate time series data processing by 100 times compared to other in-memory K/V stores
Example time series data: Stock prices for 1024 stocks over 32 years
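The slides do not show how the stock data is laid out. One plausible layout, assumed here, is one sorted set per stock with the date as the score, so that a date window becomes a single range query in the style of ZRANGEBYSCORE. The sketch below models that idea in plain Python with `bisect`; the `TimeSeries` class and the sample prices are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import date


class TimeSeries:
    """Illustrative stand-in for one sorted set per stock, scored by date."""

    def __init__(self):
        self._scores = []  # sorted date ordinals (the sorted-set scores)
        self._values = []  # prices, kept aligned with _scores

    def add(self, day, price):
        # Insert in score order, as a sorted set would.
        score = day.toordinal()
        i = bisect_left(self._scores, score)
        self._scores.insert(i, score)
        self._values.insert(i, price)

    def range_by_date(self, start, end):
        # Prices with start <= day <= end, found by binary search
        # rather than by scanning the whole series.
        lo = bisect_left(self._scores, start.toordinal())
        hi = bisect_right(self._scores, end.toordinal())
        return self._values[lo:hi]


# Made-up prices, inserted out of order on purpose.
series = TimeSeries()
series.add(date(2016, 1, 6), 102.0)
series.add(date(2016, 1, 4), 100.0)
series.add(date(2016, 1, 5), 101.5)

window = series.range_by_date(date(2016, 1, 5), date(2016, 1, 6))
```

Because the data is kept ordered by score, a query for a date range only touches the matching slice, which is the property a plain key/value store does not offer on its own.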
Accelerating Spark Time-Series with Redis
Redis is up to 100 times faster than HDFS and over 45 times faster than Tachyon or Spark
More Details About the Redis & Spark Integration
Github link: Spark-Redis Connector Package https://github.com/RedisLabs/spark-redis
How to get started with Spark and Redis: https://redislabs.com/solutions/spark-and-redis
Blog: https://redislabs.com/blog/connecting-spark-and-redis
Cost Effective Analytics
Price/Performance of Memory Technology
Redis on Flash
Flash used as a RAM extender and NOT as persistent storage
How to Achieve Optimal Price/Performance
By dynamically setting RAM/Flash ratio
Behind the scenes…
Single Server Results with Dell & Samsung NVMe
>2M ops/sec, <1 ms latency, >1GB disk bandwidth

Test setup:
• Redis Labs Enterprise Cluster v3.2
• Dell Xeon CPU E5-2670 v3 @ 2.50GHz
• 4x Samsung NVMe PM1725
• Memtier benchmark - open source tool
• 100B object size
• 80% read
• 20% write

Results:
Throughput (ops/sec)       Avg: 2.04M ops/sec                    Max: 2.14M ops/sec
Latency (msec)             Avg: 0.91 msec                        Max: 0.98 msec     % below 1 msec: 100%
Disk bandwidth (MB/sec)    Avg: 313 MB read / 9.4 MB write       Max: 1.71 GB read / 96 MB write
NW bandwidth (Gb/sec)      Avg: 1.45 Gbps (Tx) / 0.97 Gbps (Rx)  Max: 1.6 Gbps (Tx) / 1.2 Gbps (Rx)
Customer Example : Redis on Flash
• Genome dataset: 31TB of raw data
• Optimized data set through encoding and using Redis Hashes
• Resulting data runs high speed analyses with 55GB of RAM and 4.5TB of Flash
• 97% annual savings compared to a pure RAM solution
              Redis on RAM              Redis on Flash
RAM size      5TB                       0.5TB
Flash size    N/A                       4.5TB
Servers       on AWS: 21x r3.8xlarge    on P8: 2x S822LC
1yr costs     $489,333                  $15,677
P8 savings                              97%
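The 97% figure follows directly from the two one-year costs quoted above. The short check below uses only those two numbers; the variable names are illustrative.

```python
ram_only_cost = 489_333  # 1-year cost in USD, Redis on RAM (AWS)
flash_cost = 15_677      # 1-year cost in USD, Redis on Flash (P8)

# Savings as a fraction of the pure-RAM cost.
savings = 1 - flash_cost / ram_only_cost
savings_pct = round(savings * 100, 1)

# How many times more expensive the pure-RAM deployment is.
cost_ratio = round(ram_only_cost / flash_cost, 1)
```

The savings come to about 96.8%, which the slide rounds to 97%, and the pure-RAM deployment costs roughly 31 times as much per year.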
Extending Redis Analytics
What Can Modules Do
• All modules are certified by Redis Labs for full compliance with OSS Redis, Redis Cloud and Redis Labs Enterprise Cluster (RLEC)
Full Text Search, Enhanced JSON, Graph Operations, Secondary Indexes, Linear Algebra, SQL Support, Image Processing, N-Dimension Queries, …
redisearch
The world’s fastest text search engine

Average Latency (msec)
                    RLEC      Elasticsearch    Solr
Full text search    3.15      21.00            24.57
Prefix search       2.40      8.70             10.61
7.8x faster (full text search), 4.1x faster (prefix search)

Ops/sec
                    RLEC      Elasticsearch    Solr
Full text search    20,045    690              621
Prefix search       6,831     3,686            3,133
32x higher (full text search), 85% higher (prefix search)
Redis Module Hub (www.redismodules.com)
Redis Labs proprietary & confidential information
Next Steps
Learn More:
Redis with Spark: https://redislabs.com/solutions/spark-and-redis
Redis on Flash : https://redislabs.com/solutions/redis-for-very-large-datasets
Redis Modules : www.redismodules.com
Questions?
@socialeena