MongoDB & Hadoop: Providing Business Insights
Thomas Boyd, Senior Solutions Architect, MongoDB

Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


DESCRIPTION

Join us for a webinar on how MongoDB and Hadoop can work together to solve Big Data problems in today's enterprises. We will take an in-depth look at how the two technologies make real business intelligence accessible to end users. After a brief introduction to both technologies, this webinar will dive deep into the MongoDB+Hadoop Connector and how it is applied to enable new business insights.

In this webinar you will learn:
• What information problems are a good fit for MongoDB and Hadoop
• How to integrate the two technologies using the MongoDB+Hadoop Connector
• Programming paradigms for tackling common problems


Page 1: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights

MongoDB & Hadoop: Providing Business Insights

Thomas Boyd, Senior Solutions Architect, MongoDB

Page 2: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


What is MongoDB?

The leading NoSQL database

Document Database

Open-Source

General Purpose

Page 3: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


MongoDB Document Model

[Figure: RDBMS compared with MongoDB. Only the MongoDB example document survives.]

{
  _id: ObjectId("4c4ba5e5e8aabf3"),
  employee_name: "Dunham, Justin",
  department: "Marketing",
  title: "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits: [
    { type: "Health", plan: "PPO Plus" },
    { type: "Dental", plan: "Standard" }
  ]
}
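As a rough illustration of what the document model offers, the sketch below holds the same employee record as a plain Python dict and reads the embedded benefits without any join. The `_id` is kept as the string shown on the slide rather than a driver `ObjectId`, and the helper name `plans_by_type` is hypothetical, introduced only for this example.

```python
# The employee document from the slide as a plain Python dict.
# Assumption: _id is kept as a plain string; a real driver would use an ObjectId type.
employee = {
    "_id": "4c4ba5e5e8aabf3",
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def plans_by_type(doc):
    """Map each benefit type to its plan using only the embedded array."""
    return {b["type"]: b["plan"] for b in doc.get("benefits", [])}


if __name__ == "__main__":
    print(plans_by_type(employee))
```

In a relational schema the benefits would typically live in a separate table keyed by employee; here they travel with the record they describe.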

Page 4: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


What is Hadoop?

“The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.”*

*source: hadoop.apache.org

• Large datasets
• Analytics
• Batch
• Map-Reduce
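The Map-Reduce bullet can be made concrete with a single-process sketch of the map, shuffle and reduce phases. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `count_by_department`) and the sample records are invented for illustration, and a real cluster would run the map and reduce phases in parallel across machines.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (department, 1) pair per input record."""
    for record in records:
        yield record["department"], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_department(records):
    """Run the three phases in sequence on an in-memory list."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the slides.
    sample = [
        {"department": "Marketing"},
        {"department": "Sales"},
        {"department": "Marketing"},
    ]
    print(count_by_department(sample))
```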

Page 5: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Enterprise IT Stack

[Diagram: layers of the enterprise IT stack]
• Applications: CRM, ERP, Collaboration, Mobile, BI
• Data Management
  – Online Data: RDBMS
  – Offline Data: EDW, Hadoop
• Infrastructure: OS & Virtualization, Compute, Storage, Network
• Spanning all layers: Management & Monitoring; Security & Auditing

Page 6: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Consideration: Online vs. Offline

Online
• Real-time
• Low-latency
• High availability

vs.

Offline
• Long-running
• High-latency
• Availability is lower priority

Page 7: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Consideration: Online vs. Offline

Online vs. Offline

Page 8: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Hadoop is good for…

• Risk Modeling
• Churn Analysis
• Recommendation Engine
• Ad Targeting
• Transaction Analysis
• Trade Surveillance
• Network Failure Prediction
• Search Quality
• Data Lake

Page 9: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


MongoDB is good for…

• 360 Degree View of the Customer
• Mobile & Social Apps
• Fraud Detection
• User Data Management
• Content Management & Delivery
• Reference Data
• Product Catalogs
• Machine to Machine Apps
• Data Hub

Page 10: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


MongoDB and Hadoop: Complementary

Hadoop
• “Data Lake”
• In-depth analytics

MongoDB
• Real-time systems
• Light-weight analytical workloads

Page 11: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Use MongoDB+Hadoop Together

E-Commerce
• Products & Inventory
• Real-time recommendations
• Customer profile
• Session management
• Customer clickstream
• Fraud detection

MongoDB Connector for Hadoop

Analysis
• Transaction history
• Clickstream history
• Recommendation model
• Fraud modeling

Page 12: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Example – Fraud Detection

Payments
• Online payments processing

MongoDB Connector for Hadoop

Nightly Analysis
• Fraud modeling

Other components in the diagram:
• Fraud Detection
• Results Cache
• 3rd Party Data Sources

[Diagram labels: two connections are marked "query only"]

Page 13: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Customer example – Global Travel Firm

Travel
• Flights, hotels and cars
• Real-time offers
• User profiles, reviews
• User metadata (previous purchases, clicks, views)

MongoDB Connector for Hadoop

Algorithms
• User segmentation
• Offer recommendation engine
• Ad serving engine
• Bundling engine

Page 14: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Customer example – MetLife

Insurance
• Insurance policies
• Demographic data
• Customer web data
• Call center data
• Real-time churn detection

MongoDB Connector for Hadoop

Churn Analysis
• Customer action analysis
• Churn prediction algorithms

Page 15: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Customer example – Criteo

Ad-Serving
• Catalogs and products
• User profiles
• Clicks
• Views
• Transactions

MongoDB Connector for Hadoop

Algorithms
• User segmentation
• Recommendation engine
• Prediction engine

Page 16: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


What is the MongoDB-Hadoop Connector?

• Java Map-Reduce, Streaming Map-Reduce, Pig, & Hive access to MongoDB
  – MongoDB as input
    • mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat
    • mongo.input.uri=mongodb://my-db:27017/db1.collection1
  – MongoDB as output
    • mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat
    • mongo.output.uri=mongodb://my-db:27017/db1.collection2
  – Using MongoDB backup files
    • mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
    • mapred.output.dir=file:///results.bson
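One way to picture how these settings reach a job is as generic `-D key=value` options on the Hadoop command line. The sketch below only assembles that argument list; it does not launch anything. The helper name `connector_args` is hypothetical, the host and collection names are the ones from the slide, the format classes are written with the connector's `com.mongodb.hadoop` package, and the output collection is set through `mongo.output.uri`.

```python
def connector_args(input_uri, output_uri):
    """Return -D arguments for a job that reads from and writes to MongoDB."""
    props = {
        "mongo.job.input.format": "com.mongodb.hadoop.MongoInputFormat",
        "mongo.input.uri": input_uri,
        "mongo.job.output.format": "com.mongodb.hadoop.MongoOutputFormat",
        "mongo.output.uri": output_uri,
    }
    args = []
    for key, value in props.items():
        # Each property becomes a "-D key=value" pair of arguments.
        args.extend(["-D", f"{key}={value}"])
    return args


if __name__ == "__main__":
    for arg in connector_args(
        "mongodb://my-db:27017/db1.collection1",
        "mongodb://my-db:27017/db1.collection2",
    ):
        print(arg)
```

Keeping the properties in one dict makes it easy to see that the input and output sides use separate URI keys.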

Page 17: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Enhancing MongoDB-Hadoop Connector

• Version 1.1.0, July 2013
  – Pig support
  – Hive support
  – Streaming support
  – Read/Write MongoDB backups
  – Update writes
  – Much more…

• Version 1.2.0, December 2013
  – Apache Hadoop 2.2 support
  – Multiple collections as M-R source
  – Multiple mongos support
  – Custom splitting support
  – Performance improvements

Page 18: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


MongoDB Native Analytics

• Rich query language
• Native secondary indexes
• Geospatial indexes & search
• Text indexes & search
• Aggregation framework
• JavaScript Map-Reduce
• Client-side analytics
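To give a feel for the aggregation framework item, the sketch below defines a small pipeline as plain Python data, of the kind a driver's `aggregate()` call accepts, and pairs it with a pure-Python stand-in so the example runs without a server. The collection shape (employee documents with an embedded `benefits` array) is assumed from the earlier example document, and `emulate_pipeline` is a hypothetical helper that only mimics what this particular pipeline computes.

```python
from collections import Counter

# Pipeline of the kind passed to a driver's aggregate() call.
# Assumption: documents carry a "benefits" array of {type, plan} sub-documents.
pipeline = [
    {"$unwind": "$benefits"},
    {"$group": {"_id": "$benefits.plan", "count": {"$sum": 1}}},
    {"$sort": {"count": -1, "_id": 1}},
]


def emulate_pipeline(docs):
    """Plain-Python stand-in for the pipeline above: count documents per benefit plan."""
    counts = Counter(b["plan"] for d in docs for b in d.get("benefits", []))
    rows = [{"_id": plan, "count": n} for plan, n in counts.items()]
    # Highest count first, ties broken by plan name, mirroring the $sort stage.
    return sorted(rows, key=lambda r: (-r["count"], r["_id"]))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the slides.
    docs = [
        {"benefits": [{"type": "Health", "plan": "PPO Plus"}]},
        {"benefits": [{"type": "Health", "plan": "PPO Plus"},
                      {"type": "Dental", "plan": "Standard"}]},
    ]
    print(emulate_pipeline(docs))
```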

Page 19: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights


Resources

Resource and location:

• White paper: Big Data: Examples and Guidelines for the Enterprise Decision Maker
  http://www.mongodb.com/lp/whitepaper/big-data-nosql
• Recorded Webinar Series: Thrive with Big Data
  http://www.mongodb.com/lp/big-data-series
• Recorded Webinar: What’s New with MongoDB Hadoop Integration
  http://www.mongodb.com/presentations/webinar-whats-new-mongodb-hadoop-integration
• Documentation: MongoDB Connector for Hadoop
  http://docs.mongodb.org/ecosystem/tools/hadoop/
• Trouble Tickets
  http://jira.mongodb.org (project = Hadoop Integration)
• Subscriptions, support, consulting, training
  https://www.mongodb.com/products/how-to-buy

Page 20: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights