Edmunds’ Pomelo: Automobile Dealership Analytics in Real Time using MongoDB
April 3rd, 2012
Greg Rokita, Sharat Nair
Edmunds.com, Inc.

Copyright Edmunds Inc. (the “Company”). All rights reserved. Edmunds®, Edmunds.com®, the Edmunds.com car design, Inside Line℠, CarSpace℠ and AutoObserver® are proprietary trademarks of the Company. This document contains proprietary and/or confidential information of the Company. No part of this document or the information it contains may be used, or disclosed to any person or entity, for any purpose other than advancing the best interests of the Company, and any such disclosure requires the express approval of the Company.

Prepared by Gregory Rokita


Page 1: Prepared by Gregory Rokita


Edmunds’ Pomelo: Automobile Dealership Analytics in Real Time using MongoDB
April 3rd, 2012
Greg Rokita, Sharat Nair
Edmunds.com, Inc.

Page 2: Prepared by Gregory Rokita


Assumptions
o Understanding of MongoDB
o Experience with Java
o Basic understanding of serialization protocols, e.g. Thrift, Protocol Buffers
o Basic understanding of messaging protocols, e.g. JMS

Page 3: Prepared by Gregory Rokita


Agenda
o Edmunds
  o Scale of Big Data operations
  o Use case for Pomelo Application
o System Overview & Design
  o Real time integration with MongoDB
  o Real time data creation for MongoDB
o Implementation
  o MongoDB Consumer
  o MongoDB REST service
o Q&A

Page 4: Prepared by Gregory Rokita


Edmunds.com and Scale
o Premier online resource for automotive information, launched in 1995 as the first automotive information Web site
o 15 million unique visitors
o 210 million page views
o 1 million+ new inventory items per day
o 2 TB of new data every month
o 40 node Hadoop cluster aggregating logs, transactions, calls, referrals, advertising, vehicle, pricing, inventory and other data sets

Page 5: Prepared by Gregory Rokita


Pomelo Application
o Analytics tool for Automotive Dealers and Edmunds’ Dealer Sales
o Performance measurement for Edmunds traffic and its correlation to calls & referrals
o iPad, HTML5, Sencha Touch & Charts

Page 6: Prepared by Gregory Rokita


Page 7: Prepared by Gregory Rokita


Unifying data for MongoDB

Page 8: Prepared by Gregory Rokita


Processing data for MongoDB - Oozie

Page 9: Prepared by Gregory Rokita


Populating MongoDB - Publishing System

Page 10: Prepared by Gregory Rokita


Targeting MongoDB - Producer-Consumer matching

Generic Thrift Producer
o I am: ProdLAXEdmundsGTP
o Send To: Prod, TestLax, EC2EdmundsMongoDB

MongoDB Consumer
o I am: TestEC2EdmundsMongoDB
o Receive From: ProdLAX, EC2EdmundsGTP

[Diagram labels: BrokerDestinationInterceptor, PublishDealerMetrics (shown twice), DealerMetrics Virtual Topic, DealerMetricsQueue]
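One way to read the matching rule on this slide: each endpoint declares an identity ("I am") and the peers it is willing to talk to ("Send To" or "Receive From"), and a message flows only when both sides accept each other. The plain-Java sketch below illustrates that idea only; it is not the BrokerDestinationInterceptor. The `RoutingEndpoint` class, the four-part identity (environment, site, company, application) and the sample values are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a producer or consumer: who it is and which peers it accepts.
final class RoutingEndpoint {
    // Assumed identity layout: [environment, site, company, application]
    private final List<String> identity;
    // Accepted peer values, one set per identity field
    private final List<Set<String>> acceptedPeers;

    RoutingEndpoint(List<String> identity, List<Set<String>> acceptedPeers) {
        if (identity.size() != acceptedPeers.size()) {
            throw new IllegalArgumentException("identity and acceptedPeers must have the same length");
        }
        this.identity = identity;
        this.acceptedPeers = acceptedPeers;
    }

    // True when every field of the peer's identity is in this endpoint's accepted set.
    boolean accepts(RoutingEndpoint peer) {
        if (peer.identity.size() != acceptedPeers.size()) {
            return false;
        }
        for (int i = 0; i < acceptedPeers.size(); i++) {
            if (!acceptedPeers.get(i).contains(peer.identity.get(i))) {
                return false;
            }
        }
        return true;
    }

    // A message is delivered only if the producer targets the consumer
    // and the consumer is willing to receive from the producer.
    static boolean matches(RoutingEndpoint producer, RoutingEndpoint consumer) {
        return producer.accepts(consumer) && consumer.accepts(producer);
    }

    static Set<String> anyOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        RoutingEndpoint producer = new RoutingEndpoint(
                Arrays.asList("Prod", "LAX", "Edmunds", "GTP"),
                Arrays.asList(anyOf("Prod", "Test"), anyOf("LAX", "EC2"), anyOf("Edmunds"), anyOf("MongoDB")));
        RoutingEndpoint consumer = new RoutingEndpoint(
                Arrays.asList("Test", "EC2", "Edmunds", "MongoDB"),
                Arrays.asList(anyOf("Prod"), anyOf("LAX", "EC2"), anyOf("Edmunds"), anyOf("GTP")));
        System.out.println("deliver: " + matches(producer, consumer));
    }
}
```

Requiring agreement from both sides means that neither a misconfigured producer nor a misconfigured consumer can, on its own, route production data into the wrong environment.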

Page 11: Prepared by Gregory Rokita


Integration with MongoDB – layered architecture for transport

o ActiveMQ: message persistence, durability and failover
o Camel: retries and error handling
o Thrift: type safety, versioning and service

Page 12: Prepared by Gregory Rokita

Preparing data for MongoDB - summary

Structured and Unstructured Data (Logs, Calls, Referrals, etc.)

Map-Reduce

Source Specific Thrift Objects in HBase

Map-Reduce

Application Specific Thrift Object in HBase

Generic Thrift Producer

Broker

MongoDB Consumer

MongoDB

Page 13: Prepared by Gregory Rokita


Thrift IDL definition

Page 14: Prepared by Gregory Rokita


Mongo Connection

<bean id="mongo" class="com.edmunds...MongoDBConnectionFactory">
  <property name="address" value="pl1db470.media.edmunds.com:27017,pl1db471.media.edmunds.com:27017"/>
</bean>
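The `address` property carries two `host:port` pairs separated by a comma, presumably a seed list for a replica set. How `MongoDBConnectionFactory` reads the value is not shown on the slide; the sketch below is one plausible way to split such a string in plain Java. `SeedListParser` and `HostPort` are hypothetical names, and falling back to port 27017 when none is given is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns "host1:port1,host2:port2" into host/port pairs.
final class SeedListParser {
    static final int DEFAULT_PORT = 27017; // MongoDB's default port

    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    // Splits on commas, trims whitespace and skips empty entries.
    // IPv6 literals are not handled in this sketch.
    static List<HostPort> parse(String address) {
        List<HostPort> result = new ArrayList<>();
        for (String entry : address.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                result.add(new HostPort(trimmed, DEFAULT_PORT));
            } else {
                int port = Integer.parseInt(trimmed.substring(colon + 1));
                result.add(new HostPort(trimmed.substring(0, colon), port));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String address = "pl1db470.media.edmunds.com:27017,pl1db471.media.edmunds.com:27017";
        for (HostPort seed : parse(address)) {
            System.out.println(seed);
        }
    }
}
```

Listing more than one host lets a client reach the cluster even when one of the listed members is down.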

Page 15: Prepared by Gregory Rokita


Mongo Connection - cont’d

@Autowired
public MongoDbDealerMetricsConsumer(Mongo mongo) {
    collection = mongo.getDB(DB_NAME).getCollection(COLLECTION_NAME);
    // Descending index on the last-active date, so the newest date can be found quickly
    collection.ensureIndex(new BasicDBObject(LAST_ACTIVE_DATE, -1));
}

Page 16: Prepared by Gregory Rokita


Mongo consumer

private void processDealerMetrics(DealerMetrics dealerMetrics) throws TException {
    String cddId = dealerMetrics.getCddDealershipId();
    BasicDBObject query = new BasicDBObject();
    query.put(CDD_ID, cddId);
    DBObject dmObj = (DBObject) JSON.parse(serializeToJson(dealerMetrics));
    /*
     * findAndModify arguments, in order:
     * query     - query to match
     * fields    - fields to be returned
     * sort      - sort to apply before picking first document
     * remove    - if true, document found will be removed
     * update    - update to apply
     * returnNew - if true, the updated document is returned, otherwise the old
     *             document is returned (or it would be lost forever)
     * upsert    - do upsert (insert if document not present)
     */
    collection.findAndModify(query, null, null, false, dmObj, true, true);
}
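Because the update passed to `findAndModify` is a plain document rather than a set of update operators, the call replaces the stored document for a dealership, or inserts it when none exists. The same semantics can be sketched without a database as a map keyed by dealership ID. `InMemoryDealerStore` and the key name `cddDealershipId` are hypothetical (the value of `CDD_ID` is not shown), and `synchronized` here only imitates the atomicity that the server provides for the real call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the dealer metrics collection.
final class InMemoryDealerStore {
    private final Map<String, Map<String, Object>> byDealerId = new HashMap<>();

    // Mirrors findAndModify(query, null, null, false, update, true, true):
    // replace the matching document or insert it, then return the stored version.
    synchronized Map<String, Object> upsert(String cddId, Map<String, Object> replacement) {
        Map<String, Object> stored = new HashMap<>(replacement);
        stored.put("cddDealershipId", cddId); // assumed key field name
        byDealerId.put(cddId, stored);        // fields of any older version are dropped
        return stored;
    }

    synchronized int size() {
        return byDealerId.size();
    }

    public static void main(String[] args) {
        InMemoryDealerStore store = new InMemoryDealerStore();

        Map<String, Object> first = new HashMap<>();
        first.put("calls", 3);
        first.put("referrals", 2);
        store.upsert("dealer-1", first);

        Map<String, Object> second = new HashMap<>();
        second.put("calls", 5);
        Map<String, Object> current = store.upsert("dealer-1", second);

        System.out.println(store.size() + " document(s), current = " + current);
    }
}
```

Replacing the whole document keeps the consumer idempotent: redelivering the same message leaves the collection in the same state.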

Page 17: Prepared by Gregory Rokita


Public interface to Mongo - Dealer

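public List<DBObject> getDocument(String cddId) {
    final BasicDBObject query = new BasicDBObject();
    query.put(CDD_ID, cddId);
    final DBObject object = collection.findOne(query);
    if (object == null) {
        // findOne returns null when nothing matches; return an empty list instead of failing
        return newArrayList();
    }
    // Strip internal fields before exposing the document
    object.removeField(OBJECT_ID);
    object.removeField(LAST_ACTIVE_DATE);
    return newArrayList(object);
}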

Page 18: Prepared by Gregory Rokita


Public interface to Mongo - Active list

public List<DBObject> getActiveList() {
    // Match dealers carrying the newest last-active date, restricted to the listed DMAs
    final BasicDBObject query = new BasicDBObject();
    query.put(LAST_ACTIVE_DATE, getActiveDate());
    query.put(DMA_NAME, getDmaCriteria());
    // Projection: return the dealership ID and name, suppress the object ID
    final BasicDBObject keys = new BasicDBObject();
    keys.put(OBJECT_ID, 0);
    keys.put(CDD_ID, 1);
    keys.put(DEALERSHIP_NAME, 1);
    return collection.find(query, keys).toArray();
}

// Newest last-active date in the collection (first document in descending order)
private Object getActiveDate() {
    return collection.find().sort(getSortCriteria()).next().get(LAST_ACTIVE_DATE);
}

private BasicDBObject getSortCriteria() {
    return new BasicDBObject(LAST_ACTIVE_DATE, -1);
}

private BasicDBObject getDmaCriteria() {
    return new BasicDBObject("$in", DMAS);
}
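`getActiveList` runs two queries: it first reads the newest last-active date in the collection, then returns only the dealers stamped with that date whose DMA is in the allowed set, projected down to a few fields. The in-memory sketch below restates that logic with Java streams. The `Dealer` class, its fields, the `long` date type and the sample values are hypothetical stand-ins for the MongoDB documents.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.OptionalLong;
import java.util.Set;
import java.util.stream.Collectors;

final class ActiveListSketch {
    // Hypothetical stand-in for one dealer metrics document.
    static final class Dealer {
        final String cddId;
        final String name;
        final String dmaName;
        final long lastActiveDate; // assumed numeric; the real field type is not shown

        Dealer(String cddId, String name, String dmaName, long lastActiveDate) {
            this.cddId = cddId;
            this.name = name;
            this.dmaName = dmaName;
            this.lastActiveDate = lastActiveDate;
        }
    }

    static List<String> activeDealerIds(List<Dealer> dealers, Set<String> dmas) {
        // Step 1: newest date, the in-memory equivalent of sort(-1) followed by next()
        OptionalLong newest = dealers.stream().mapToLong(d -> d.lastActiveDate).max();
        if (!newest.isPresent()) {
            return Collections.emptyList();
        }
        long activeDate = newest.getAsLong();
        // Step 2: equality on the date, $in on the DMA, projection to the dealer ID
        return dealers.stream()
                .filter(d -> d.lastActiveDate == activeDate)
                .filter(d -> dmas.contains(d.dmaName))
                .map(d -> d.cddId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Dealer> dealers = Arrays.asList(
                new Dealer("a", "A Motors", "Los Angeles", 2L),
                new Dealer("b", "B Autos", "Chicago", 2L),
                new Dealer("c", "C Cars", "Los Angeles", 1L));
        Set<String> dmas = new HashSet<>(Arrays.asList("Los Angeles"));
        System.out.println(activeDealerIds(dealers, dmas));
    }
}
```

Unlike the original `next()` call, the sketch returns an empty list for an empty collection instead of throwing.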

Page 19: Prepared by Gregory Rokita


REST service

@GET
@Path("{id}")
@Produces(MediaType.APPLICATION_JSON)
public List<DBObject> get(@PathParam("id") String cddId) {
    return dealerMetricsMongoDao.getDocument(cddId);
}

@GET
@Path("list")
@Produces(MediaType.APPLICATION_JSON)
public List<DBObject> getDealerList() {
    return dealerMetricsMongoDao.getActiveList();
}

Page 20: Prepared by Gregory Rokita


Q&A

Greg Rokita

Sharat Nair