
Revolution R Enterprise: 100% R and More (14 Mar 2013)


Page 1: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution Confidential

Revolution R: 100% R and More

Presented by: David Smith (@revodavid), VP Marketing and Community, Revolution Analytics

Page 2: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Poll Question

Which stats package do you use most?

Page 3: Revolution R Enterprise: 100% R and More (14 Mar 2013)

March 13, 2013: Welcome!

Thanks for coming. Slides and replay available (soon) at: http://bit.ly/YbfQo1

David Smith
VP Marketing & Community, Revolution Analytics
Editor, Revolutions blog: http://blog.revolutionanalytics.com
Twitter: @revodavid

Page 4: Revolution R Enterprise: 100% R and More (14 Mar 2013)

In today's webcast:

About Revolution Analytics and R
How Revolution R Enterprise enhances R
Resources for getting more from R
Q&A

Page 5: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution Analytics is the leading commercial provider of software and support for the open-source R statistical computing language.

Revolution R Enterprise is:

Enterprise-ready
Multi-platform
Scalable from desktop to big data
Delivers high-performance analytics
Easier to build and deploy analytic applications

Page 6: Revolution R Enterprise: 100% R and More (14 Mar 2013)

What is R?

Data analysis software
A powerful programming language: development platform designed by and for statisticians
A complete environment: huge library of algorithms for data access, data manipulation, analysis and graphics
An open-source software project: free, open, and active
A vibrant community: thousands of contributors, 2 million users; resources and help in every domain

Download the white paper "R is Hot": bit.ly/r-is-hot

Page 7: Revolution R Enterprise: 100% R and More (14 Mar 2013)

R is exploding in popularity and functionality

Source: http://r4stats.com/popularity; “Why R is a name to know in 2011”, Forbes; number of packages is now 4,250

“A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base — without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…”
Product Marketing Manager, SAS Institute, Inc.

“I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”
Deputy Editor for New Products at Forbes

Page 8: Revolution R Enterprise: 100% R and More (14 Mar 2013)

A Vibrant R User Community

[Figure: two panels titled "Local R User Groups (93)" and "Local R User Groups (102)"]

More: The R Ecosystem, bit.ly/R-ecosystem

Page 9: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution Analytics Scales R to the Enterprise

Revolution R Enterprise

Power: distributed high-performance analytics
Productivity: build and deploy analytics applications easily
Enterprise Readiness: enterprise landscape; full-service customer support, consulting and training

Page 10: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise: High-Performance, Multi-Platform Analytics Platform

ScaleR: high-performance big data analytics
RevoR: performance-enhanced open source R
Open source R packages
ConnectR: high-speed connectors (HDFS, HBase, ODBC, SAS)
PlatformR: parallel distributed computing (IBM/Netezza, IBM/Platform LSF, MS HPC Server, MS Azure Burst)
DevelopR: integrated development environment
DeployR: web services

Page 11: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise:

Open Source
Performance Enhancements
Greater Productivity & Ease of Use
Tackle “Big Data”
IT-Friendly Enterprise Deployment
On-Call Experts

Diagram labels: Performance, Productivity, Big Data Analysis, Enterprise Deployment, Technical Support, Training & Consulting

Page 12: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise: Productivity

Page 13: Revolution R Enterprise: 100% R and More (14 Mar 2013)

The standard R interface

Page 14: Revolution R Enterprise: 100% R and More (14 Mar 2013)

DevelopR Integrated Development Environment

Script with type ahead and code snippets
Solutions window for organizing code and data
Packages installed and loaded
Objects loaded in the R Environment
Object details
Sophisticated debugging with breakpoints, variable values, etc.

http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm

Page 15: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise: Performance

Page 16: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Performance: Multi-threaded Math

| Computation (4-core laptop) | Open Source R | Revolution R | Speedup |
|---|---|---|---|
| Linear Algebra [1] | | | |
| Matrix Multiply | 176 sec | 9.3 sec | 18x |
| Cholesky Factorization | 25.5 sec | 1.3 sec | 19x |
| Linear Discriminant Analysis | 189 sec | 74 sec | 3x |
| General R Benchmarks [2] | | | |
| R Benchmarks (Matrix Functions) | 22 sec | 3.5 sec | 5x |
| R Benchmarks (Program Control) | 5.6 sec | 5.4 sec | Not appreciable |

[1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
[2] http://r.research.att.com/benchmarks/
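The Speedup column in the benchmark table above is a ratio of elapsed times. A minimal sketch of that arithmetic in Python follows; the dictionary and the `speedup` helper are illustrative names, not part of any Revolution Analytics tooling. The timings are the rounded figures shown on the slide, so the computed ratios need not match the slide's own rounded Speedup values exactly.

```python
# Elapsed times in seconds, copied from the benchmark table:
# (Open Source R, Revolution R) on a 4-core laptop.
timings = {
    "Matrix Multiply": (176.0, 9.3),
    "Cholesky Factorization": (25.5, 1.3),
    "Linear Discriminant Analysis": (189.0, 74.0),
    "R Benchmarks (Matrix Functions)": (22.0, 3.5),
    "R Benchmarks (Program Control)": (5.6, 5.4),
}


def speedup(baseline_sec, improved_sec):
    """Return how many times faster the improved run is than the baseline."""
    return baseline_sec / improved_sec


# Hypothetical helper output: one ratio per benchmark row.
speedups = {name: speedup(*pair) for name, pair in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

A ratio close to 1, as in the Program Control row, corresponds to the slide's "Not appreciable" entry: that benchmark exercises interpreter control flow rather than the multi-threaded math routines.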

Page 17: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise: Big Data Analysis with ScaleR

Page 18: Revolution R Enterprise: 100% R and More (14 Mar 2013)

RevoScaleR brings the power of Big Data to R

Distributed Statistical Algorithms: Parallel External Memory Algorithms exploit available compute resources (cores & computers) independent of platform.
Communications Framework: Abstracted communications layer provides portability of code between platforms: server, cluster, or in-database.
Data Source API: Use the high-speed local data mart (XDF), or stream data from SAS, ODBC, HDFS or other remote data sources.
R Language Interface: Familiar, high-productivity programming environment for R users.
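The idea behind a parallel external memory algorithm can be sketched in a few lines: each chunk of data is summarized on its own, and the small summaries are then merged, so the full data set never has to sit in memory at once. The Python sketch below uses a standard pairwise merge formula for the mean and variance. It is an illustration of the general technique only, not ScaleR's implementation, and every name in it (`Moments`, `summarize_chunk`, `merge`, `read_chunks`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    """Running summary of a numeric column: count, mean, and sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of (x - mean)^2 over the values seen so far

    @property
    def variance(self):
        # Sample variance; undefined for fewer than two values.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize_chunk(chunk):
    """Summarize one in-memory chunk; chunks are independent, so this step can run in parallel."""
    n = len(chunk)
    if n == 0:
        return Moments()
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return Moments(n, mean, m2)


def merge(a, b):
    """Combine two summaries into the summary of the concatenated data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


def read_chunks(values, chunk_size):
    """Stand-in for reading fixed-size blocks from disk or a remote source."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
total = Moments()
for chunk in read_chunks(data, chunk_size=3):
    total = merge(total, summarize_chunk(chunk))

print(total.n, total.mean, total.variance)
```

Because `merge` gives the same result for any way of splitting the data into chunks, the per-chunk summaries can be computed on separate cores or machines and combined afterwards.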

Page 19: Revolution R Enterprise: 100% R and More (14 Mar 2013)

ScaleR Addresses Performance and Capacity Limitations of Open Source R

Page 20: Revolution R Enterprise: 100% R and More (14 Mar 2013)

High Performance Big Data Analytics with ScaleR

Statistical Tests

Machine Learning

Simulation

Descriptive Statistics

Data Visualization

R Data Step

Predictive Models

Sampling

Page 21: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise ScaleR: High Performance Big Data Analytics

Data Prep, Distillation & Descriptive Analytics

R Data Step
Data import (Delimited, Fixed, SAS, SPSS, ODBC); Variable creation & transformation; Recode variables; Factor variables; Missing value handling; Sort; Merge; Split; Aggregate by category (means, sums)

Descriptive Statistics
Min / Max; Mean; Median (approx.); Quantiles (approx.); Standard Deviation; Variance; Correlation; Covariance; Sum of Squares (cross product matrix for set variables); Pairwise Cross tabs; Risk Ratio & Odds Ratio; Cross-Tabulation of Data (standard tables & long form); Marginal Summaries of Cross Tabulations

Statistical Tests
Chi Square Test; Kendall Rank Correlation; Fisher’s Exact Test; Student’s t-Test

Sampling
Subsample (observations & variables); Random Sampling

Page 22: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise ScaleR: High Performance Big Data Analytics

Statistical Modeling and Machine Learning

Predictive Models
Sum of Squares (cross product matrix for set variables); Multiple Linear Regression; Generalized Linear Models (GLM): all exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie), standard link functions including cauchit, identity, log, logit, probit, and user-defined distributions & link functions; Covariance & Correlation Matrices; Logistic Regression; Classification & Regression Trees; Predictions/scoring for models; Residuals for all models

Data Visualization
Histogram; Line Plot; Scatter Plot; Lorenz Curve; ROC Curves (actual data and predicted values)

Cluster Analysis
K-Means

Classification
Decision Trees

Simulation
Monte Carlo
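The "Sum of Squares (cross product matrix)" entry listed on this slide hints at why linear regression suits chunked data: the fit depends only on a handful of sums, which can be accumulated one block at a time. The Python sketch below shows this for a single predictor. It illustrates the general technique under that simplifying assumption, is not ScaleR's code, and uses hypothetical function names (`accumulate`, `fit_line`).

```python
def accumulate(chunks):
    """Accumulate the cross-product sums for y = intercept + slope * x over chunks of (x, y) pairs."""
    n = 0
    sx = sy = sxx = sxy = 0.0
    for chunk in chunks:
        for x, y in chunk:
            n += 1
            sx += x
            sy += y
            sxx += x * x
            sxy += x * y
    return n, sx, sy, sxx, sxy


def fit_line(n, sx, sy, sxx, sxy):
    """Solve the 2x2 normal equations built from the accumulated sums."""
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x has no variation; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x, split into blocks of two rows.
rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
chunks = [rows[i:i + 2] for i in range(0, len(rows), 2)]

intercept, slope = fit_line(*accumulate(chunks))
print(intercept, slope)
```

Only five running totals are kept regardless of how many rows are read, which is what makes the approach workable when the data is larger than memory.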

Page 23: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise: Enterprise Deployment

Page 24: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Create custom, on-demand analytics applications

Some examples:

On-demand sales forecasting
Real-time social media sentiment analysis
Leveraging the power of R from Microsoft tools

Page 25: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise DeployR integrates R with applications

Seamless: Bring the power of R to any web-enabled application
Simple: Leverage common APIs including JS, Java, .NET
Scalable: Robustly scale user and compute workloads
Secure: Manage enterprise security with LDAP & SSO

Diagram labels: R / Statistical Modeling Expert; DeployR; Deployment Expert; Data Analysis; Business Intelligence; Mobile Web Apps; Cloud / SaaS

Page 26: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Revolution R Enterprise Architecture

Use a connected MPP server or cluster for:

Data exploration
On-demand R applications
Big-data predictive models
Offline (batch) operations
Code generation for real-time deployment

Page 27: Revolution R Enterprise: 100% R and More (14 Mar 2013)

ConnectR for Hadoop: Stream data from Hadoop to Revolution R Enterprise

Page 28: Revolution R Enterprise: 100% R and More (14 Mar 2013)

On-Call Technical Support
Consulting: Migration | Analytics | Applications | Validation
Training: R | Revolution R | Statistical Topics
Systems Integration: BI | ERP | Databases | Cloud

Page 29: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Poll Question

What interests you most about Revolution R Enterprise?

Page 30: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Why customers choose Revolution R Enterprise

INNOVATION
MULTI-PLATFORM
TIME-to-VALUE
VALUE

Page 31: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Thank You!

Download slides, replay: http://bit.ly/YbfQo1
Resources for getting started with R: http://bit.ly/ZnZGt2
Get Revolution R Enterprise
Contact Sales: http://bit.ly/hey-revo
Free to Academics: www.revolutionanalytics.com/academic
We’re Hiring! www.revolutionanalytics.com/careers

Page 32: Revolution R Enterprise: 100% R and More (14 Mar 2013)

Thank you.

www.revolutionanalytics.com
650.646.9545
Twitter: @RevolutionR

The leading commercial provider of software and support for the popular open source R statistics language.