Copyright Edmunds.com, Inc. (the “Company”). All rights reserved. Edmunds®, Edmunds.com®, the Edmunds.com car design, Inside Line℠ and AutoObserver® are proprietary trademarks of the Company. This document contains proprietary and/or confidential information of the Company. No part of this document or the information it contains may be used, or disclosed to any person or entity, for any purpose other than advancing the best interests of the Company, and any such disclosure requires the express approval of the Company.
Prepared by Gregory Rokita
Edmunds’ Pomelo: Automobile Dealership Analytics in Real Time using MongoDB
April 3rd, 2012
Greg Rokita, Sharat Nair
Edmunds.com, Inc.
Assumptions
o Understanding of MongoDB
o Experience with Java
o Basic understanding of serialization protocols, e.g. Thrift, Protocol Buffers
o Basic understanding of messaging protocols, e.g. JMS
Agenda
o Edmunds
  o Scale of Big Data operations
  o Use case for Pomelo Application
o System Overview & Design
  o Real time integration with MongoDB
  o Real time data creation for MongoDB
o Implementation
  o MongoDB Consumer
  o MongoDB REST service
o Q&A
Edmunds.com and Scale
o Premier online resource for automotive information, launched in 1995 as the first automotive information Web site
o 15 million unique visitors
o 210 million page views
o 1 million+ new inventory items per day
o 2 TB of new data every month
o 40 node Hadoop cluster aggregating logs, transactions, calls, referrals, advertising, vehicle, pricing, inventory and other data sets
Pomelo Application
o Analytics tool for Automotive Dealers and Edmunds’ Dealer Sales
o Performance measurement for Edmunds traffic and its correlation to calls & referrals
o iPad, HTML5, Sencha Touch & Charts
Unifying data for MongoDB
Processing data for MongoDB - Oozie
Populating MongoDB - Publishing System
Targeting MongoDB - Producer-Consumer matching
[Diagram: a Generic Thrift Producer (GTP) and a MongoDB Consumer matched through the broker]
o Generic Thrift Producer
  o I am: Prod | LAX | Edmunds | GTP
  o Send To: Prod, Test | LAX, EC2 | Edmunds | MongoDB
o MongoDB Consumer
  o I am: Test | EC2 | Edmunds | MongoDB
  o Receive From: Prod | LAX, EC2 | Edmunds | GTP
o Broker
  o DestinationInterceptor
  o PublishDealerMetrics
  o DealerMetrics Virtual Topic
  o DealerMetricsQueue
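The "I am", "Send To" and "Receive From" labels on this slide suggest a two-sided match: a message flows only when the producer's target admits the consumer and the consumer's filter admits the producer. The following is a minimal sketch of that idea in plain Java, not the actual broker interceptor. The `Endpoint` class, its four field names and the `matches` method are assumptions made for illustration; the slide shows only the values.

```java
import java.util.Set;

final class ProducerConsumerMatch {

    // Hypothetical descriptor with the four coordinates read off the slide;
    // the field names are assumptions, not names from the Edmunds code base.
    static final class Endpoint {
        final Set<String> environments;
        final Set<String> dataCenters;
        final Set<String> sites;
        final Set<String> applications;

        Endpoint(Set<String> environments, Set<String> dataCenters,
                 Set<String> sites, Set<String> applications) {
            this.environments = environments;
            this.dataCenters = dataCenters;
            this.sites = sites;
            this.applications = applications;
        }

        // Treats this descriptor as a filter: true when every coordinate
        // of the given identity is among the values this filter allows.
        boolean accepts(Endpoint identity) {
            return environments.containsAll(identity.environments)
                && dataCenters.containsAll(identity.dataCenters)
                && sites.containsAll(identity.sites)
                && applications.containsAll(identity.applications);
        }
    }

    // A message is routed only when both sides agree: the producer's
    // "Send To" admits the consumer and the consumer's "Receive From"
    // admits the producer.
    static boolean matches(Endpoint producerIAm, Endpoint sendTo,
                           Endpoint consumerIAm, Endpoint receiveFrom) {
        return sendTo.accepts(consumerIAm) && receiveFrom.accepts(producerIAm);
    }

    public static void main(String[] args) {
        // Values taken from the slide, normalized to upper case.
        Endpoint producerIAm = new Endpoint(
            Set.of("PROD"), Set.of("LAX"), Set.of("EDMUNDS"), Set.of("GTP"));
        Endpoint sendTo = new Endpoint(
            Set.of("PROD", "TEST"), Set.of("LAX", "EC2"), Set.of("EDMUNDS"), Set.of("MONGODB"));
        Endpoint consumerIAm = new Endpoint(
            Set.of("TEST"), Set.of("EC2"), Set.of("EDMUNDS"), Set.of("MONGODB"));
        Endpoint receiveFrom = new Endpoint(
            Set.of("PROD"), Set.of("LAX", "EC2"), Set.of("EDMUNDS"), Set.of("GTP"));

        System.out.println(matches(producerIAm, sendTo, consumerIAm, receiveFrom)); // true
    }
}
```

With the slide's values the pair matches, so the Test consumer in EC2 would receive DealerMetrics published by the Prod producer in LAX.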
Integration with MongoDB – layered architecture for transport
o ActiveMQ: Message persistence, durability and failover
o Camel: Retries and error handling
o Thrift: Type safety, versioning and service
Preparing data for MongoDB - summary
Data flow, in order:
o Structured and Unstructured Data (Logs, Calls, Referrals, etc.)
o Map-Reduce
o Source Specific Thrift Objects in HBase
o Map-Reduce
o Application Specific Thrift Object in HBase
o Generic Thrift Producer
o Broker
o MongoDB Consumer
o MongoDB
Thrift IDL definition
Mongo Connection
<bean id="mongo" class="com.edmunds...MongoDBConnectionFactory">
    <property name="address" value="pl1db470.media.edmunds.com:27017,pl1db471.media.edmunds.com:27017"/>
</bean>
Mongo Connection - cont’d
@Autowired
public MongoDbDealerMetricsConsumer(Mongo mongo) {
    collection = mongo.getDB(DB_NAME).getCollection(COLLECTION_NAME);
    // Descending index on the last-active date.
    collection.ensureIndex(new BasicDBObject(LAST_ACTIVE_DATE, -1));
}
Mongo consumer
private void processDealerMetrics(DealerMetrics dealerMetrics) throws TException {
    String cddId = dealerMetrics.getCddDealershipId();
    BasicDBObject query = new BasicDBObject();
    query.put(CDD_ID, cddId);
    DBObject dmObj = (DBObject) JSON.parse(serializeToJson(dealerMetrics));
    /*
     * query     - query to match
     * fields    - fields to be returned
     * sort      - sort to apply before picking first document
     * remove    - if true, document found will be removed
     * update    - update to apply
     * returnNew - if true, the updated document is returned, otherwise the
     *             old document is returned (or it would be lost forever)
     * upsert    - do upsert (insert if document not present)
     */
    collection.findAndModify(query, null, null, false, dmObj, true, true);
}
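Because the update passed to `findAndModify` is a whole document rather than a set of `$` operators, the call replaces the matching dealer document, or inserts it when no match exists, and returns the new version. The sketch below imitates that behaviour with an in-memory map so the semantics can be seen without a running MongoDB. The class name, the `upsert` method and the field names `cddId`, `calls` and `referrals` are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

final class UpsertSketch {
    // In-memory stand-in for the collection, keyed by dealership id.
    private final Map<String, Map<String, Object>> documents = new HashMap<>();

    // Mirrors findAndModify(query, null, null, false, update, true, true)
    // when 'update' is a whole document: replace the match, or insert if
    // none exists (upsert), then return the new document (returnNew).
    Map<String, Object> upsert(String cddId, Map<String, Object> replacement) {
        Map<String, Object> stored = new HashMap<>(replacement);
        documents.put(cddId, stored);
        return stored;
    }

    int count() {
        return documents.size();
    }

    public static void main(String[] args) {
        UpsertSketch collection = new UpsertSketch();

        Map<String, Object> first = new HashMap<>();
        first.put("cddId", "123");   // hypothetical field names
        first.put("calls", 4);
        collection.upsert("123", first);   // no match yet: inserted

        Map<String, Object> second = new HashMap<>();
        second.put("cddId", "123");
        second.put("referrals", 9);
        Map<String, Object> result = collection.upsert("123", second); // match: replaced

        System.out.println(collection.count());           // 1
        System.out.println(result.containsKey("calls"));  // false
    }
}
```

The second message replaces the first instead of merging with it, so each published DealerMetrics object has to carry the complete state for its dealership.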
Public interface to Mongo - Dealer
public List<DBObject> getDocument(String cddId) {
    final BasicDBObject query = new BasicDBObject();
    query.put(CDD_ID, cddId);
    final DBObject object = collection.findOne(query);
    // Strip internal fields before returning the document.
    object.removeField(OBJECT_ID);
    object.removeField(LAST_ACTIVE_DATE);
    return newArrayList(object);
}
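One detail worth noting in `getDocument`: `findOne` returns `null` when no document matches, so an unknown dealership id would raise a `NullPointerException` at the first `removeField` call. The sketch below shows one possible guard, using a plain `Map` in place of `DBObject` so that it runs without the driver. Returning an empty list for a miss is an assumption about the desired behaviour, and the value `lastActiveDate` is a hypothetical stand-in for the `LAST_ACTIVE_DATE` constant, whose value the slides do not show.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DealerDocumentSketch {
    static final String OBJECT_ID = "_id";
    // Hypothetical value; the slides show only the constant's name.
    static final String LAST_ACTIVE_DATE = "lastActiveDate";

    // Takes the result of a lookup (null when nothing matched) and returns
    // a list holding a copy without the internal fields, or an empty list.
    static List<Map<String, Object>> toResponse(Map<String, Object> found) {
        if (found == null) {
            return Collections.emptyList();
        }
        Map<String, Object> copy = new HashMap<>(found);
        copy.remove(OBJECT_ID);
        copy.remove(LAST_ACTIVE_DATE);
        List<Map<String, Object>> result = new ArrayList<>();
        result.add(copy);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> found = new HashMap<>();
        found.put(OBJECT_ID, "abc");
        found.put(LAST_ACTIVE_DATE, 20120403L);
        found.put("dealershipName", "Example Motors"); // hypothetical field

        System.out.println(toResponse(found));       // [{dealershipName=Example Motors}]
        System.out.println(toResponse(null).size()); // 0
    }
}
```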
Public interface to Mongo - Active list
public List<DBObject> getActiveList() {
    final BasicDBObject query = new BasicDBObject();
    query.put(LAST_ACTIVE_DATE, getActiveDate());
    query.put(DMA_NAME, getDmaCriteria());
    final BasicDBObject keys = new BasicDBObject();
    keys.put(OBJECT_ID, 0);
    keys.put(CDD_ID, 1);
    keys.put(DEALERSHIP_NAME, 1);
    return collection.find(query, keys).toArray();
}

private Object getActiveDate() {
    return collection.find().sort(getSortCriteria()).next().get(LAST_ACTIVE_DATE);
}

private BasicDBObject getSortCriteria() {
    return new BasicDBObject(LAST_ACTIVE_DATE, -1);
}

private BasicDBObject getDmaCriteria() {
    return new BasicDBObject("$in", DMAS);
}
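The active list is built in two steps: first find the most recent last-active date in the collection, then select the dealers that carry exactly that date and belong to one of the configured DMAs, returning only the id and name. The sketch below replays those two steps over an in-memory list so the logic can be followed without MongoDB. The `Dealer` class, its field names and the sample data are assumptions for illustration. Unlike the slide code, which would fail on an empty collection, the sketch returns an empty result in that case.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

final class ActiveListSketch {

    // Hypothetical in-memory stand-in for a dealer document.
    static final class Dealer {
        final String cddId;
        final String dealershipName;
        final String dmaName;
        final long lastActiveDate;

        Dealer(String cddId, String dealershipName, String dmaName, long lastActiveDate) {
            this.cddId = cddId;
            this.dealershipName = dealershipName;
            this.dmaName = dmaName;
            this.lastActiveDate = lastActiveDate;
        }
    }

    // Returns cddId -> dealershipName for dealers whose last-active date
    // equals the latest date present and whose DMA is in 'dmas'.
    static Map<String, String> activeList(List<Dealer> dealers, Set<String> dmas) {
        Map<String, String> result = new LinkedHashMap<>();
        // Step 1: the equivalent of sort(LAST_ACTIVE_DATE, -1).next().
        OptionalLong latest = dealers.stream().mapToLong(d -> d.lastActiveDate).max();
        if (!latest.isPresent()) {
            return result; // empty collection: nothing is active
        }
        // Step 2: the equivalent of the date-equality plus $in query,
        // projected down to id and name.
        for (Dealer d : dealers) {
            if (d.lastActiveDate == latest.getAsLong() && dmas.contains(d.dmaName)) {
                result.put(d.cddId, d.dealershipName);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Dealer> dealers = List.of(
            new Dealer("A", "Alpha Autos", "LA", 20120402L),
            new Dealer("B", "Beta Motors", "LA", 20120403L),
            new Dealer("C", "Gamma Cars", "NY", 20120403L),
            new Dealer("D", "Delta Drive", "SF", 20120403L));

        System.out.println(activeList(dealers, Set.of("LA", "NY"))); // {B=Beta Motors, C=Gamma Cars}
    }
}
```

Dealer A is excluded because its date is older than the latest one, and dealer D because its DMA is not in the requested set.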
REST service
@GET
@Path("{id}")
@Produces(MediaType.APPLICATION_JSON)
public List<DBObject> get(@PathParam("id") String cddId) {
    return dealerMetricsMongoDao.getDocument(cddId);
}

@GET
@Path("list")
@Produces(MediaType.APPLICATION_JSON)
public List<DBObject> getDealerList() {
    return dealerMetricsMongoDao.getActiveList();
}
Q&A
Greg [email protected]
Sharat [email protected]