Hortonworks Eric Baldeschwieler, Co-Founder and CEO September 2011 Overview for Cowen Big Data Day 2011 © Hortonworks Inc. 2011

Hortonworks for Financial Analysts Presentation

DESCRIPTION

Hortonworks presentation from Cowen Big Data Day for financial industry analysts

Page 1: Hortonworks for Financial Analysts Presentation

Hortonworks

Eric Baldeschwieler, Co-Founder and CEO

September 2011

Overview for Cowen Big Data Day 2011

© Hortonworks Inc. 2011

Page 2: Hortonworks for Financial Analysts Presentation

Agenda

• Hortonworks
• Apache Hadoop
• Use cases
• Hadoop in the Enterprise
• Market
• Strategy

Page 3: Hortonworks for Financial Analysts Presentation

About Hortonworks – Basics

• Founded – July 1st, 2011
  − 22 architects & committers from Yahoo!

• Mission – Architect the future of Big Data
  − Revolutionize and commoditize the storage and processing of Big Data via open source

• Vision – Half of the world’s data will be stored in Hadoop within five years

Page 4: Hortonworks for Financial Analysts Presentation

About Hortonworks – Game Plan

• Support the growth of a huge Apache Hadoop ecosystem
  − Invest in ease of use, management, and other enterprise features
  − Define APIs for ISVs, OEMs and others to integrate with Apache Hadoop
  − Continue to invest in advancing the Hadoop core, remain the experts
  − Contribute all of our work to Apache

• Profit by providing training & support to the Hadoop community

Page 5: Hortonworks for Financial Analysts Presentation

Credentials

• Technical: key architects and committers from Yahoo! Hadoop engineering team
  − Delivered every major Apache Hadoop release since 0.1
  − Highest concentration of Apache Hadoop committers
  − Driving innovation across entire Apache Hadoop stack
  − Experience managing world’s largest deployment
  − Access to Yahoo!’s 1,000+ users and 42k+ nodes for testing, QA, etc.

• Business operations: team of highly successful open source veterans
  − Led by Rob Bearden, former COO of SpringSource & JBoss

• Investors: backed by Benchmark Capital and Yahoo!

Page 6: Hortonworks for Financial Analysts Presentation

What is Apache Hadoop?

• Set of open source projects
  − Owned by Apache Software Foundation

• Transforms commodity hardware into a service that:
  − Stores petabytes of data reliably (HDFS)
  − Allows huge distributed computations (MapReduce)

• Key attributes:
  − Redundant and reliable
      Doesn’t stop or lose data even if hardware fails
  − Easy to program
  − Extremely powerful
      Allows the development of big data algorithms & tools
  − Batch processing centric
  − Runs on commodity hardware
      Computers & network
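
As a rough illustration of the MapReduce computation model named on this slide, here is a minimal single-process sketch in plain Python. It is not Hadoop's API and does not distribute anything. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are hypothetical, chosen only to show the map, group-by-key, and reduce steps that a real cluster would spread across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(records: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: collapse each key's list of values into one count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records: Iterable[str]) -> Dict[str, int]:
    """Run the three steps in sequence (illustrative, in-memory only)."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical input records, e.g. short event-log lines.
    logs = ["ad click", "ad view", "mail click"]
    print(word_count(logs))
```

In this sketch the map and reduce steps never share state, which is the property that would let a framework run them in parallel on separate machines and rerun a failed piece without restarting the whole job.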

Page 7: Hortonworks for Financial Analysts Presentation

Typical Hadoop Applications

• advertising optimization
• ad selection
• Website personalization
• machine learning search ranking
• ad inventory prediction
• Mail anti-spam
• user interest prediction
• audience, ad and search pipelines
• advertising data systems
• Content Optimization
• data analytics

Page 8: Hortonworks for Financial Analysts Presentation

Who Builds Hadoop?

Lines of code contributed since Hadoop inception

Page 9: Hortonworks for Financial Analysts Presentation

Who Builds Hadoop?

Lines of code contributed in 2011

Page 10: Hortonworks for Financial Analysts Presentation

A Brief History

Apache Hadoop

• …, early adopters: Scale and productize Hadoop
• Other Internet Companies: Add tools / frameworks, enhance Hadoop
• Wide Enterprise Adoption: Funds further development, enhancements
• Service Providers: Provide training, support, hosting

Timeline labels: 2006 – present, 2008 – present, 2010 – present, Nascent / 2011

Page 11: Hortonworks for Financial Analysts Presentation

HADOOP @ YAHOO!

40K+ Servers

170 PB Storage

5M+ Monthly Jobs

1000+ Active users

© Yahoo 2011

Page 12: Hortonworks for Financial Analysts Presentation

CASE STUDY
YAHOO! HOMEPAGE

Personalized for each visitor

Result: twice the engagement

• +160% clicks vs. one size fits all
• +79% clicks vs. randomly selected
• +43% clicks vs. editor selected

Recommended links, News Interests, Top Searches

Page 13: Hortonworks for Financial Analysts Presentation

CASE STUDY
YAHOO! HOMEPAGE

• Serving Maps
• Users - Interests
• Five Minute Production
• Weekly Categorization models

SCIENCE HADOOP CLUSTER
SERVING SYSTEMS
PRODUCTION HADOOP CLUSTER

USER BEHAVIOR
ENGAGED USERS
CATEGORIZATION MODELS (weekly)
SERVING MAPS (every 5 minutes)

» Identify user interests using Categorization models

» Machine learning to build ever better categorization models

Build customized home pages with latest data (thousands / second)

Page 14: Hortonworks for Financial Analysts Presentation

CASE STUDY
YAHOO! MAIL

Enabling quick response in the spam arms race

• 450M mail boxes
• 5B+ deliveries/day
• Antispam models retrained every few hours on Hadoop

“40% less spam than Hotmail and 55% less spam than Gmail”

SCIENCE

PRODUCTION

Page 15: Hortonworks for Financial Analysts Presentation

Hadoop in the Enterprise

Page 16: Hortonworks for Financial Analysts Presentation

Big Data Platforms

Cost per TB, Adoption

Source:

Size of bubble = cost effectiveness of solution

Page 17: Hortonworks for Financial Analysts Presentation

Traditional Enterprise Architecture
Data Silos + ETL

Traditional Data Warehouses, BI & Analytics
• EDW
• Data Marts
• BI / Analytics

Serving Applications
• Web Serving
• NoSQL
• RDMS
• …

Unstructured Systems
• Serving Logs
• Social Media
• Sensor Data
• Text Systems

Traditional ETL & Message buses

Page 18: Hortonworks for Financial Analysts Presentation

Hadoop Enterprise Architecture
Connecting All of Your Big Data

Traditional Data Warehouses, BI & Analytics
• EDW
• Data Marts
• BI / Analytics

Serving Applications
• Web Serving
• NoSQL
• RDMS
• …

Unstructured Systems
• Serving Logs
• Social Media
• Sensor Data
• Text Systems

Apache Hadoop
• EsTsL (s = Store)
• Custom Analytics

Traditional ETL & Message buses

Page 19: Hortonworks for Financial Analysts Presentation

Hadoop Enterprise Architecture
Connecting All of Your Big Data

Traditional Data Warehouses, BI & Analytics
• EDW
• Data Marts
• BI / Analytics

Serving Applications
• Web Serving
• NoSQL
• RDMS
• …

Unstructured Systems
• Serving Logs
• Social Media
• Sensor Data
• Text Systems

80-90% of data produced today is unstructured

Gartner predicts 800% data growth over next 5 years

Apache Hadoop
• EsTsL (s = Store)
• Custom Analytics

Traditional ETL & Message buses

Page 20: Hortonworks for Financial Analysts Presentation

The Hadoop Market

Page 21: Hortonworks for Financial Analysts Presentation

Market Drivers for Apache Hadoop

• Business drivers
  − Identified high value projects that require use of more data
  − Belief that there is great ROI in mastering big data

• Financial drivers
  − Growing cost of data systems as proportion of IT spend
  − Cost advantage of commodity hardware + open source
      Enables departmental-level big data strategies

• Technical drivers
  − Existing solutions failing under growing requirements
      3Vs - Volume, velocity, variety
  − Proliferation of unstructured data

Significant opportunity for Hadoop in enterprise data architectures

Page 22: Hortonworks for Financial Analysts Presentation

Market Opportunity for Hadoop

• Current
  − Apache Hadoop can become de facto platform for managing unstructured data in the enterprise
  − Enable new breed of applications to be built on top of Apache Hadoop

• Future
  − Hadoop becomes the next generation enterprise data architecture

Page 23: Hortonworks for Financial Analysts Presentation

Market Dynamics

• Technology & knowledge gaps are preventing Apache Hadoop from becoming an enterprise standard
  − Difficult to install and deploy Hadoop projects
  − Lack of technical content to assist
  − Demand for knowledgeable developers far exceeds supply

• Virtually every F500 company is constructing a Hadoop strategy
  − But most are still in POC/experimentation phase with Hadoop

• Top ISV/OEMs working to create Hadoop strategies
  − Driven by customer demand

• Community is becoming increasingly confused by all of the noise
  − Multiple distributions, many vendor announcements
  − Fear of market fragmentation

Page 24: Hortonworks for Financial Analysts Presentation

Conclusion

• There is not a Hadoop market to “win” today
  − Most organizations haven’t moved to full-scale production
  − Lack of mass adoption limiting short-term monetization opportunities
  − Need to drive Apache Hadoop as a unifying standard

• In order to succeed, we need to enable the market
  − Continue investment to overcome technology gaps
  − Enable a vibrant partner ecosystem
  − Expand availability of content and services to address knowledge gaps

How will Hortonworks do that?

Page 25: Hortonworks for Financial Analysts Presentation

Hortonworks Strategy

Page 26: Hortonworks for Financial Analysts Presentation

Hortonworks Strategy #1

Overcome Technology Gaps

• Make Apache Hadoop projects easier to install, manage & use
  − Regular sustaining releases
  − Projects released as binary (RPM, .deb)
  − Open source Management & Monitoring

• Make Apache Hadoop more robust
  − Performance gains
  − High availability
  − Administration & monitoring

All done within Apache Hadoop community

• Develop collaboratively with community
• Complete transparency
• All code contributed back to Apache

Anyone should be able to easily deploy the Hadoop projects from Apache

Page 27: Hortonworks for Financial Analysts Presentation

Hortonworks Strategy #2

Enable a Vibrant Ecosystem

• Unify the community around a strong Apache Hadoop offering

• Make Apache Hadoop easier to integrate & extend
  − Work closely with partners to define and build open APIs
  − Everything contributed back to Apache

• Provide enablement services as necessary to optimize integration

• Hardware Partners
• Cloud & Hosting Platform Partners
• DW, Analytics & BI Partners
• Serving & Unstructured Data Systems Partners
• Integration & Services Partners
• Hadoop Application Partners

Page 28: Hortonworks for Financial Analysts Presentation

Hortonworks Strategy #3

Overcome Knowledge Gaps

• Improve user experience with Apache Hadoop software
  − Binaries, installers, etc.

• Expand Apache Hadoop technical content
  − Core content on Apache.org
      Docs, installation guides, etc.
  − Advanced tools on Hortonworks.com
      Best practices, screencasts, forums, etc.

• Extensive Hadoop training & certification program

• Expert technical support services

Page 29: Hortonworks for Financial Analysts Presentation

Rationale for Hortonworks Strategy

• Strong interest from community (enterprises and ISV/OEMs) in a complete, enterprise-viable, Apache Hadoop platform
  − Strong desire for core to remain unified and strong, avoid UNIX wars II
  − Freemium model seen as a barrier to growth and adoption

• Highly defensible because of Hortonworks leadership in core projects

• Proven experience executing open source business models
  − Rob Bearden & Benchmark

Page 30: Hortonworks for Financial Analysts Presentation

Thank You.
