Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform: Because Sometimes You Just Need Metal

DESCRIPTION

There's an elephant in the room when it comes to Big Data. Apache Hadoop and Spark promise to transform how businesses leverage Big Data, but finding the right mix of flexible deployments, elastic scalability, and performance can be daunting. Introducing Rackspace OnMetal™ for Apache Spark™, an industry first that combines the performance and efficiency of bare metal with the ease and flexibility of cloud. With Rackspace OnMetal for Cloud Big Data Platform you can transform how you run Hadoop and Spark workloads:

• Deploy in minutes, not months
• Spin instances up or down on demand
• Process data in-memory for faster query times
• Get bare metal performance and say goodbye to virtualization taxes

Sign up and learn how Rackspace OnMetal for Cloud Big Data Platform can rapidly move your organization from planning to deploying.

Page 1: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Because Sometimes You Just Need Metal

Page 2: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Meet Your Speakers

www.rackspace.com

Sean Anderson, Manager, Data Services

John Engates, CTO

David Grier, Systems Engineer

Page 3: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Big Data is Here to Stay

• Big Data is now much more than hype: real customers with real use cases are adopting it daily

• A recent survey found that business leaders expected the deployment of Hadoop to result in a 3-year benefit ranging from $5M to $50M+

• Close to 100% of business leaders have already deployed or plan to deploy Apache™ Hadoop®

“Enterprises are showing increasing interest in the value provided by the large-scale data processing that Hadoop and Spark can provide, but can be wary of the upfront cost and complexity of setting up a cluster to prove that value. Managed services such as [OnMetal™ Cloud Big Data Platform] enable enterprises to focus their energies on generating business insights rather than configuring and managing infrastructure.”

- Matt Aslett, Research Director, Data Platforms and Analytics, 451 Research

Page 4: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Hadoop is Hard

• Biggest impediments include:
– Insufficient skills in-house to design and deploy
– Designing and deploying takes too long
– High cost of physical infrastructure

Only 3 in 10 businesses that plan to implement Hadoop have done so

Page 5: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Hadoop is Changing

• Original focus on batch processing
• Streaming and interactive use cases emerging
• Shift from jobs that take hours to jobs that take seconds
• Impala, Spark, and Presto are emerging tools

Page 6: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

What are Companies Doing with Hadoop?

Vertical            Use Case                            Data Type
Financial Services  New Account Risk Screens            Text, Server Logs
Financial Services  Fraud Prevention                    Server Logs
Financial Services  Trading Risk                        Server Logs
Financial Services  Maximize Deposit Spread             Text, Server Logs
Financial Services  Insurance Underwriting              Geographic, Sensor, Text
Financial Services  Accelerate Loan Processing          Text
Telecom             Call Detail Records (CDRs)          Machine, Geographic
Telecom             Infrastructure Investment           Machine, Server Logs
Telecom             Next Product to Buy (NPTB)          Clickstream
Telecom             Real-time Bandwidth Allocation      Server Logs, Text, Sentiment
Telecom             New Product Development             Machine, Geographic
Retail              360 View of the Customer            Clickstream, Text
Retail              Analyze Brand Sentiment             Sentiment
Retail              Localized, Personalized Promotions  Geographic
Retail              Website Optimization                Clickstream
Retail              Optimal Store Layout                Sensor
Manufacturing       Supply Chain and Logistics          Sensor
Manufacturing       Assembly Line Quality Assurance     Sensor
Manufacturing       Proactive Maintenance               Machine
Manufacturing       Crowdsourced Quality Assurance      Sentiment

Page 7: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

What Is the Cost of Lacking a Big Data Strategy?

• Today every company can be a data company

• Successful companies will be data companies

• Under Armour isn’t just a fitness company; they’re a data company

Page 11: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Rackspace Cloud Big Data = Big Data as a Service

• A fully managed Hadoop and Spark hardware and software stack with the elasticity and availability of the Rackspace Managed Cloud

• Save time and money in deploying, maintaining and scaling Big Data workloads

• Start small, spin instances up or down on demand, and scale elastically

Page 12: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

The Trade Off...

Custom Built, Consistent, Available, Performant

Purpose Built, Elastic, Flexible, On-Demand

Page 13: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

OnMetal Lets You Scale Like the Internet Giants

BARE METAL SERVERS

API-driven, Instantly Available, Highly Specialized, No Hypervisor

“Rackspace Cloud, because of its single-tenant OnMetal line, is the only place on Earth where you can enjoy Facebook/Google-style infrastructure rented by the hour.”

- Ev Kontsevoy, Director, Product, Rackspace

Page 14: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

An Industry First for Big Data as a Service

• For the first time, data scientists can get the best of both worlds: bare metal performance with cloud agility, all backed by Fanatical Support®

• What this means:
– Spin projects up or down on demand so that capacity is always perfectly aligned to demand
– When you’re running your projects, you can get screaming fast, predictable performance

Page 15: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Breakthrough Performance

• Rackspace OnMetal Cloud Big Data for Spark is engineered for breakthrough performance, enabling data scientists to iterate interactively with large data sets.

[Bar chart: Terasort and DFS IO run times in seconds (axis 0 to 60), Traditional Cloud Big Data vs. OnMetal Cloud Big Data]

Page 16: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Under the Hood with Rackspace OnMetal

• Differentiators
– CPU, Memory, SSD, Networking, and more
– Optimized for screaming performance

• Three Flavors
– Traditional: JBOD, commodity boxes, low CPU and low RAM
– In Memory: SSD Drives, High Memory, High CPU
– Bare Metal resources a must due to high demands on all the resources

OnMetal I/O

Workload type

• Online transaction processing (OLTP)
• NoSQL databases
• Traditional SQL databases

Features & Specs

• Intel Xeon E5-2680 v2 2.8 GHz
• 2 x 10 cores
• 128GB RAM
• Boot device (32GB SATADOM)
• 2x LSI Nytro WarpDrive BLP4-1600 (1.6TB) for 3.2TB of high I/O storage
• Redundant 10Gbps network connections

Page 17: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Rackspace Cloud Big Data is About More than Hadoop

• Introducing support for Apache Spark™

• Apache Spark combined with Rackspace OnMetal enables enterprises to combine the breadth of structured and unstructured data with the speed of in-memory processing to build streaming, machine learning, and graph-optimized applications that allow businesses to take action at the speed of insight.
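To make the "speed of in-memory processing" point on this slide concrete, here is a minimal sketch in plain Python. It is not Spark's API and is not taken from the presentation. `CountingSource`, `refine`, `iterate_reloading`, and `iterate_in_memory` are hypothetical names invented for illustration. The sketch only shows why an iterative workload benefits when the data set is read once and kept in memory instead of being re-read on every pass.

```python
from typing import List


class CountingSource:
    """Hypothetical stand-in for slow storage that counts how often it is read."""

    def __init__(self, values: List[float]) -> None:
        self._values = list(values)
        self.reads = 0

    def load(self) -> List[float]:
        self.reads += 1  # each call models one full scan of the data set
        return list(self._values)


def refine(estimate: float, data: List[float]) -> float:
    """One iteration: move the estimate halfway toward the data mean."""
    mean = sum(data) / len(data)
    return estimate + 0.5 * (mean - estimate)


def iterate_reloading(source: CountingSource, steps: int) -> float:
    """Re-read the data on every pass, as a chain of separate batch jobs would."""
    estimate = 0.0
    for _ in range(steps):
        estimate = refine(estimate, source.load())
    return estimate


def iterate_in_memory(source: CountingSource, steps: int) -> float:
    """Read the data once, keep it in memory, and reuse it on every pass."""
    data = source.load()
    estimate = 0.0
    for _ in range(steps):
        estimate = refine(estimate, data)
    return estimate
```

With five passes over the same values, the reloading version scans the source five times while the in-memory version scans it once, and both arrive at the same estimate. On a real cluster the saved scans are disk and network reads, which is where the memory and SSD capacity of the hardware would matter.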

Page 18: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Apache Spark

Speed, Ease of Use, Generality, Integrated with Hadoop

Page 19: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

New Use Cases

• Deeper Integration with SQL Workloads

• Streaming Applications

• Machine Learning

• Iterative Processing

• Real Time Graphical Dashboards

Page 20: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Summary

• Big Data is here to stay
• Hadoop is Hard
• Rackspace makes Hadoop easy
• With OnMetal and Spark, Rackspace takes Big Data beyond batch processing
• Become a data company today

Page 24: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Average Build Time: 10 Minutes

Page 27: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Big Data Platform

BARE METAL SERVERS

Page 28: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Rackspace Offerings for the Data Tier

Managed Offerings of Most Popular Big Data, SQL, & NoSQL Databases

Infrastructure for Data

• Cloud IaaS: Get started fast
• Dedicated Hosting: Predictable costs & performance
• OnMetal: Cloud Elasticity & Dedicated Performance

Managed Database Services for Production Apps

• Automatic DBA: Sharding, Backup, & HA
• Entire Stack Optimized on Bare Metal
• Supported 24x7x365 by experts
• More than MongoDB…

DBA Services

• Architecture & Design
• Tuning & Monitoring
• 24x7x365 Support
• Cost Effective

Page 29: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

What’s Next?

www.baremetalbigdata.com

1. Sign up for a free trial

2. Want to know more? Read my blog and check out the articles

Page 30: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

Questions?

Page 31: Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform

THANK YOU

RACKSPACE® | 1 FANATICAL PLACE, CITY OF WINDCREST | SAN ANTONIO, TX 78218

US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM

© RACKSPACE LTD. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM