Are You Ready for Big Data Big Analytics?

Presented to the Rich Report, August 30 2013. Listen to the audio podcast at: http://inside-bigdata.com/2013/09/04/slidecast-ready-big-data-analytics/

Revolution Confidential

Are You Ready for Big Data Big Analytics? September, 2013

Bill Jacobs, Director, Product Marketing, Revolution Analytics (@bill_jacobs)

Revolution Analytics (@RevolutionR)

Key Big Data Challenge: The Analytics Talent Pool

The Analytics Talent Pool with R

2 Million R Users

What Language is Most Popular for Data Mining and Data Science?

Survey Question:

“What programming/statistics languages you used for an analytics / data mining / data science work in 2013?”

Results:

R – 61%

Python – 39%

SQL – 37%

How does this compare to 2012?

“Highest growth was for Pig/Hive/Hadoop-based languages, R, and SQL, while Perl, C/C++, and Unix tools declined…”

Source: 2013 KDnuggets survey of 700 voters.

The R Language: What Is It?

A Language Platform…

• A Procedural Language Optimized for Statistics and Data Science

• A Data Visualization Framework

• Provided as Open Source

A Community…

• 2M Statistical Analysis and Machine Learning Users

• Taught in Most University Statistics Programs

• Active User Groups Across the World

An Ecosystem

• CRAN: 4500+ Freely Available Algorithms, Test Data and Evaluations

• Many Applicable to Big Data If Scaled

Revolution Analytics - Overview

We are the only provider of a commercial analytics platform based on the open source R statistical computing language.

Power

Productivity

Enterprise Readiness

• Stable, scalable, multi-platform, world-wide support

• Easier to build and deploy analytic applications

• Professional services enablement

• Distributed, high-performance analytics algorithms

World Wide Support Teams

• Standard and Premium Programs

• Technical Account Managers

• Customer Success Managers

Professional Services

• Architecture planning

• Systems Integration

• Advanced analytic applications

• Full life cycle projects

200+ Customer Stories

• Digital Media & Retail

• Finance & Insurance

• Healthcare & Life Sciences

• Manufacturing & High Tech

• Academic & Gov’t

Revolution R Enterprise

Revolution R Enterprise is the only commercial big data analytics platform that provides Big Data Big Analytics based on R.

Portable Across Enterprise Platforms

High Performance, Scalable Analytics

Easier to Build & Deploy

Additional Technology Challenges Accompanying Big Data Analytics Efforts

Big Data

• New Data Sources

• Data Variety & Velocity

• Fine Grain Control

• Data Movement, Memory Limits

Complex Computation

• Experimentation

• Many Small Models

• Ensemble Models

• Simulation

Enterprise Readiness

• Heterogeneous Landscape

• Write Once, Deploy Anywhere

• Skill Shortage

• Production Support

Production Efficiency

• Shorter Model Shelf Life

• Volume of Models

• Long End-to-End Cycle Time

• Pace of Decision Accelerated

Open Source R Drives Analytical Innovation… but Has Some Limitations for Enterprise Deployment

• Memory Bound: Large Data & Cluster-Based Storage Management. Result: Operate on bigger data sizes.

• Single Threaded: Scalable, multi-threaded, parallel processing. Result: Increased speed of analysis.

• Community Support: Commercial production support and professional services teams. Result: Holistic production support.

• Innovative (5000+ packages, exponential growth): Ability to combine with open source R packages where needed. Result: A key combination of innovation and scale.
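The "memory bound" limitation named on this slide can be illustrated with a minimal sketch. It is written in Python for illustration only and does not show how Revolution R Enterprise is implemented. The helper names `chunks`, `summarize`, and `combine` are hypothetical. The idea is to read data in fixed-size blocks, reduce each block to a small partial result, and merge the partial results, so that the full data set never has to sit in memory at once.

```python
from typing import Iterable, Iterator, List, Tuple


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive blocks of at most `size` values (hypothetical helper)."""
    block: List[float] = []
    for v in values:
        block.append(v)
        if len(block) == size:
            yield block
            block = []
    if block:
        yield block


def summarize(block: List[float]) -> Tuple[int, float, float]:
    """Reduce one block to a partial result: (count, sum, sum of squares)."""
    return len(block), sum(block), sum(x * x for x in block)


def combine(parts: Iterable[Tuple[int, float, float]]) -> Tuple[float, float]:
    """Merge partial results into the overall mean and population variance.

    Note: the sum-of-squares formula is simple but can lose precision on
    very large or badly scaled data; it is used here only for clarity.
    """
    n, s, ss = 0, 0.0, 0.0
    for count, total, total_sq in parts:
        n += count
        s += total
        ss += total_sq
    mean = s / n
    return mean, ss / n - mean * mean


# A generator stands in for a data source too large to load at once.
data = (float(i) for i in range(1, 100_001))
mean, variance = combine(summarize(b) for b in chunks(data, 1_000))
print(mean, variance)
```

Because each call to `summarize` depends only on its own block, the per-block step could in principle be handed to separate threads or machines, which is the general pattern behind the "single threaded" row as well.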

Big Data Speed @ Scale with Revolution R Enterprise (RRE)

Fast Math Libraries

Parallelized Algorithms

In-Database Execution

Multi-Threaded Execution

Multi-Core Processing

In-Hadoop Execution

Memory Management

Parallelized User Code

First, we enhance and accelerate the Open Source R interpreter.

Open Source R Performance: Multi-threaded Math

Computation (4-core laptop)        Open Source R   Revolution R   Speedup
Linear Algebra [1]
  Matrix Multiply                  176 sec         9.3 sec        18x
  Cholesky Factorization           25.5 sec        1.3 sec        19x
  Linear Discriminant Analysis     189 sec         74 sec         3x
General R Benchmarks [2]
  R Benchmarks (Matrix Functions)  22 sec          3.5 sec        5x
  R Benchmarks (Program Control)   5.6 sec         5.4 sec        Not appreciable

[1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
[2] http://r.research.att.com/benchmarks/

Customers report 5-50x performance improvements compared to Open Source R, without changing any code
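For readers who want to check the Speedup column, here is a small sketch in Python (chosen for illustration; it is not part of the original deck). It divides each Open Source R timing by the matching Revolution R timing. The timings are copied from the benchmark table above, and the function name `speedup` is an assumption. Because the slide reports rounded timings and rounded speedups, the recomputed ratios need not match the slide's column exactly.

```python
# Timings in seconds from the benchmark table: (Open Source R, Revolution R).
timings = {
    "Matrix Multiply": (176.0, 9.3),
    "Cholesky Factorization": (25.5, 1.3),
    "Linear Discriminant Analysis": (189.0, 74.0),
    "R Benchmarks (Matrix Functions)": (22.0, 3.5),
    "R Benchmarks (Program Control)": (5.6, 5.4),
}


def speedup(baseline_sec: float, improved_sec: float) -> float:
    """Speedup is the baseline time divided by the improved time."""
    return baseline_sec / improved_sec


speedups = {name: speedup(*pair) for name, pair in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

A ratio close to 1, as in the Program Control row, means the two runs took about the same time, which is what the table labels "Not appreciable".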

Second, we built a platform for hosting R with Big Data on a variety of massively parallel platforms.

Unparalleled Big Data Big Analytics: Scale, Performance & Innovation

1 + 1 = 1000’s

[Figure: Revolution R Enterprise = R Language, Open Source R Analytic Packages, and Performance Enhanced R + Big Data Distributed & Parallel Processing & Analytic Package. Axes: Performance, Value.]

Analytic Personas and their Tools

Analytic Consumer

Business Analyst

Power Analyst

Data Scientist

Information Technologist

Right Tool, Right Problem

Create Custom, On-Demand Analytical Apps

Some Examples:

• On-demand sales forecasting

• Real-time social media sentiment analysis

Leveraging the power of R from Microsoft tools

Predicting Predictive Analytics

• What Are Your Use Cases?

• How Will Your Use Cases Evolve?

• What Platform Will Best Support Each?

• Whose Platform Will Excel Tomorrow?

Portability and Investment Assurance: Write Once – Deploy Anywhere

Servers

Server Clusters

EDWs and Analytical DBMSs

Hadoop (coming soon!)

Write it Once. Deploy it Anywhere.

Workstations

Summary.

R is Hot.

Revolution R Enterprise:

• Scales R to Big Data

• Scales Performance on Big Data Platforms

• Is Commercially Supported

• Is Broadly Deployable

• Allows you to WODA (Write Once, Deploy Anywhere)!

Revolution Analytics Maximizes Results While Minimizing Near-Term and Long-Term Risks

www.revolutionanalytics.com 650.646.9545 Twitter: @RevolutionR

The leading commercial provider of software and support for the popular open source R statistics language.

Next steps?

Thank You.
