Reporting: From MySQL to Hadoop/Hive Manuel Aldana

Classifieds = Kleinanzeigen

Ad = Kleinanzeige

April 2009 - Sept 2009

Launch of Discovery by DLR, CC license: http://flic.kr/p/9kRijr

8M Ads

11M unique visitors (UV) per month

1B page views (PV) per month

Reporting: What?

http://upload.wikimedia.org/wikipedia/commons/0/0b/Relief_Map_of_Germany.svg

Devil / Angel by Alper Cugun, CC license: http://flic.kr/p/5tXxze

Reporting back then...

[Diagram: Batch Jobs, Apache POI, Email; numbered steps 1 to 3]

Problems followed...

Respects of Residents of Adenau by chuckbiscuito, CC license: http://flic.kr/p/eLk2o

New Curbside Recycling in Richland by Colleen Lane, CC license: http://flic.kr/p/7GKY6c

Sucks!!!

Image credits: http://flic.kr/p/7GKY6c, http://www.icons-land.com

Reporting Now

[Diagram: Batch Job with numbered steps 1 to 3; step 3 is a pull]

Hive Integration

Map Reduce HDFS

Log-File (JSON)

SQL Subset (via JDBC), Batch Job

nightly copyFromLocal
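
The nightly copy step could be scripted roughly as follows. This is a minimal sketch, not the setup used in the talk: the local log directory, the HDFS directory layout, the file naming, and the function name `build_copy_command` are all assumptions made for illustration. Only `hadoop fs -copyFromLocal <localsrc> <dst>` itself is the standard Hadoop shell command.

```python
from datetime import date


def build_copy_command(day: date,
                       local_dir: str = "/var/log/app",    # hypothetical local log directory
                       hdfs_dir: str = "/logs/events") -> list:  # hypothetical HDFS directory
    """Build the argv for copying one day's JSON log file into HDFS."""
    stamp = day.isoformat()
    # Assumed naming scheme: one local log file per day.
    local_file = f"{local_dir}/events-{stamp}.json"
    # Assumed layout: one HDFS sub-directory per day (must already exist).
    hdfs_target = f"{hdfs_dir}/dt={stamp}/"
    return ["hadoop", "fs", "-copyFromLocal", local_file, hdfs_target]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without a cluster.
    print(" ".join(build_copy_command(date(2012, 6, 5))))
```

On a machine with a Hadoop client installed, the returned list could be handed to `subprocess.run(cmd, check=True)` from a nightly cron job.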

Model?

pilot log book by buttersweet, CC license: http://flic.kr/p/8Dstb
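
One common way to model such a log is one JSON object per line, one line per event. The sketch below only illustrates that idea; the field names (`event`, `ad_id`, `category`, `bundesland`, `ts`) and both function names are hypothetical and not taken from the slides.

```python
import json


def to_log_line(event: str, ad_id: int, category: str, bundesland: str, ts: str) -> str:
    """Serialize one event as a single-line JSON record (all field names are hypothetical)."""
    record = {"event": event, "ad_id": ad_id, "category": category,
              "bundesland": bundesland, "ts": ts}
    # sort_keys gives a stable field order; compact separators keep the line short.
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


def parse_log_line(line: str) -> dict:
    """Parse one log line back into a dict; raises ValueError on malformed JSON."""
    return json.loads(line)


if __name__ == "__main__":
    print(to_log_line("ad_created", 42, "bikes", "Berlin", "2012-06-03T10:15:00"))
```

Keeping every record on a single line means the file can be split at line boundaries, which suits line-oriented processing.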

[Diagram: Batch Job with numbered steps 1 to 3; step 3 is a pull]

Why still?

Tableau

Cats: Seasonality

Bikes: Seasonality

New Ads by Bundesland (federal state) on last Sunday
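
As a rough illustration of what a report like this computes, here is a plain-Python stand-in for the aggregation. In the setup described in the talk this would presumably run as a Hive query rather than in Python; the record fields (`event`, `bundesland`, `ts`), the event name `ad_created`, and the function names are hypothetical.

```python
import json
from collections import Counter
from datetime import date, timedelta


def last_sunday(today: date) -> date:
    """Return the most recent Sunday strictly before `today`."""
    # date.weekday(): Monday == 0 ... Sunday == 6
    offset = (today.weekday() + 1) % 7 or 7
    return today - timedelta(days=offset)


def new_ads_by_bundesland(lines, day: date) -> Counter:
    """Count 'ad_created' events per Bundesland for one calendar day."""
    counts = Counter()
    prefix = day.isoformat()
    for line in lines:
        rec = json.loads(line)
        # ts is assumed to be an ISO 8601 string such as '2012-06-03T10:15:00'.
        if rec["event"] == "ad_created" and rec["ts"].startswith(prefix):
            counts[rec["bundesland"]] += 1
    return counts
```

Calling `new_ads_by_bundesland(open(path), last_sunday(date.today()))` on a file of JSON lines would give the per-state counts that a chart like the one named above displays.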

Aftermath

Respects of Residents of Adenau by chuckbiscuito, CC license: http://flic.kr/p/eLk2o

New Curbside Recycling in Richland by Colleen Lane, CC license: http://flic.kr/p/7GKY6c

Tug of War by toffehoffe, CC license: http://flic.kr/p/nD2nk

Binoculars Portrait by gerlos, CC license: http://flic.kr/p/5KGg5B

Hadoop Facts

Hadoop v. 0.20.1

Hive v. 0.9.0

22 Nodes

11 Reporting Jobs

1TB Overall

5GB Daily
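
A quick back-of-the-envelope check relates the last two numbers. If the daily volume had been constant (an assumption, the slides do not say so) and decimal units are used (1 TB = 1000 GB, also an assumption), 1 TB at 5 GB per day corresponds to about 200 days of logs. The function name `days_of_data` is hypothetical.

```python
def days_of_data(total_gb: float, daily_gb: float) -> float:
    """Days of retained data, assuming a constant daily volume."""
    if daily_gb <= 0:
        raise ValueError("daily_gb must be positive")
    return total_gb / daily_gb


if __name__ == "__main__":
    # 1 TB overall (taken as 1000 GB) at 5 GB per day
    print(days_of_data(1000, 5))
```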

Lessons learned

http://flic.kr/p/5KGg5B

pilot log book by buttersweet, CC license: http://flic.kr/p/8Dstb

Log-File (JSON)

HDFS

nightly copyFromLocal

Mac vs. PC by skruk, CC license: http://flic.kr/p/8sdk8Z

Your lessons?

Thanks!