The Internet of Everywhere—How IBM The Weather Company Scales


Everywhere Defined

• 26B forecasts/day or 250,000/second
  – vs 3.5B Google queries daily
• 2.2 billion unique locations
• 200k personal weather stations
• 200M active mobile users
• Petabytes of data generated daily

Our Brands

Over 30 Billion Served

Flight Routing

Energy Trading

Insurance

Weather Alerting

Decisions at Scale

101001110100101

101001110100101010100101011001101010101011100000011010110010

Who Are You?

RDBMS?

Who Are You?

?

Social Weather

RDBMS
SELECT count(*) FROM wx_reports GROUP BY time/300000*300000
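The GROUP BY expression in the query above snaps each report time to the start of its 5-minute window (300000 ms), provided the division is integer division. A minimal sketch of that arithmetic in plain Scala follows; the object name, function names, and sample timestamps are illustrative assumptions, not taken from the talk.

```scala
object FiveMinuteBuckets {
  // 300000 ms = 5 minutes, the constant used in the slide's query.
  val BucketMs: Long = 300000L

  // Long division truncates, so for non-negative timestamps
  // (t / BucketMs) * BucketMs is the start of t's 5-minute bucket.
  def bucketStart(timestampMs: Long): Long = (timestampMs / BucketMs) * BucketMs

  // In-memory equivalent of: SELECT bucket, count(*) ... GROUP BY bucket
  def countByBucket(timestampsMs: Seq[Long]): Map[Long, Int] =
    timestampsMs.groupBy(bucketStart).map { case (bucket, ts) => bucket -> ts.size }

  def main(args: Array[String]): Unit = {
    // Hypothetical report times in epoch milliseconds.
    val reports = Seq(0L, 299999L, 300000L, 450000L, 900001L)
    countByBucket(reports).toSeq.sortBy(_._1).foreach { case (bucket, n) =>
      println(s"$bucket -> $n")
    }
  }
}
```

In a database engine where `/` on integer columns is not integer division, the same bucketing would need an explicit floor or an integer-division operator.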

Social Weather

Live
Reporting
ETL

Social Weather

Live
Reporting
Sqoop, M/R

Scaling with Spark

Live
Reporting

Easing the Transition

101001110100101

101001110100101010100101011001101010101011100000011010110010


Easing the Transition

101001110100101

101001110100101010100101011001101010101011100000011010110010

101001110100101010100101011001101010101011100000011010110010

10100,11101,0010101010,01010,1100110101,01010,1110000001,10101,...
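One reading of this slide (an assumption, since the diagram itself did not survive extraction) is that it shows a continuous stream being discretized into small batches, which is how Spark Streaming presents a stream as a sequence of RDDs. The sketch below reproduces the slide's comma-separated chunks from its bit string in plain Scala. The chunk sizes are read off the slide; the object and function names are hypothetical.

```scala
object DiscretizeStream {
  // Cut a continuous stream into consecutive chunks of the given sizes.
  // Anything left over after the last requested chunk is not emitted.
  def chunk(stream: String, sizes: Seq[Int]): Seq[String] = {
    val starts = sizes.scanLeft(0)(_ + _) // offset at which each chunk begins
    starts.zip(sizes).map { case (start, size) => stream.slice(start, start + size) }
  }

  def main(args: Array[String]): Unit = {
    // The 60-bit string shown on the slide.
    val stream = "101001110100101010100101011001101010101011100000011010110010"
    // Lengths of the chunks shown on the slide, in order.
    val sizes = Seq(5, 5, 10, 5, 10, 5, 10, 5)
    println(chunk(stream, sizes).mkString(","))
  }
}
```

In Spark Streaming the batches are cut by time interval rather than by size, so uneven chunk lengths like these would simply reflect how much data arrived in each interval.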


Scaling with Spark

Live
Reporting

Batch Aggregation

val wx_reports = // load data from database

// sc is the SparkContext provided by spark-shell
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

wx_reports.toDF.registerTempTable("wx_reports")

// Spark SQL's "/" is floating-point division, so floor() is needed to snap
// each timestamp (ms) to the start of its 5-minute (300000 ms) bucket
val counts = sqlContext.sql(
  "select count(*) from wx_reports group by floor(timestamp / 300000) * 300000")
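For readers on Spark 2.x or later, the same batch aggregation can be sketched with SparkSession and the DataFrame API instead of the slide's SQLContext. This assumes spark-sql is on the classpath and runs in local mode; the object name, app name, and sample timestamps are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, floor}

object BatchAggregationSketch {
  val BucketMs: Long = 300000L // 5 minutes in milliseconds

  // Count reports per 5-minute bucket, keyed by the bucket's start time.
  def countsByBucket(spark: SparkSession, timestampsMs: Seq[Long]): Map[Long, Long] = {
    import spark.implicits._
    timestampsMs.toDF("timestamp")
      // "/" on columns is floating-point division, so floor() does the truncation.
      .groupBy((floor(col("timestamp") / BucketMs) * BucketMs).as("bucket_start"))
      .count()
      .collect()
      .map(row => row.getLong(0) -> row.getLong(1))
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Local mode is an assumption so the sketch runs without a cluster.
    val spark = SparkSession.builder().master("local[*]").appName("wx-batch-sketch").getOrCreate()
    try {
      // Hypothetical report times in epoch milliseconds.
      val counts = countsByBucket(spark, Seq(0L, 299999L, 300000L, 450000L, 900001L))
      counts.toSeq.sortBy(_._1).foreach { case (bucket, n) => println(s"$bucket -> $n") }
    } finally spark.stop()
  }
}
```

Returning the bucket start alongside each count makes the result usable on its own; the slide's query returns only the counts.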

Streaming Aggregation

import org.apache.spark.sql.SQLContext

val wx_reports = // load from streaming source

wx_reports.foreachRDD { rdd =>
  // Reuse a single SQLContext across micro-batches
  val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
  import sqlContext.implicits._
  rdd.toDF.registerTempTable("wx_reports")
  // Counts the reports in this micro-batch only
  val count = sqlContext.sql("select count(*) from wx_reports")
}
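The two snippets apply essentially the same SQL to a full table and to each micro-batch. The plain-Scala sketch below illustrates why that transition is easy: one aggregation function is run once over all the data, then once per micro-batch with the partial counts merged, and the two results agree. Unlike the slide's streaming snippet, which only counts rows per batch, this sketch keeps the 5-minute bucketing and accumulates totals. All names and sample data are hypothetical.

```scala
object BatchVsMicroBatch {
  val BucketMs: Long = 300000L // 5 minutes in milliseconds

  // The shared aggregation: count reports per 5-minute bucket.
  def countByBucket(timestampsMs: Seq[Long]): Map[Long, Long] =
    timestampsMs.groupBy(t => (t / BucketMs) * BucketMs)
      .map { case (bucket, ts) => bucket -> ts.size.toLong }

  // Fold one micro-batch's counts into the running totals.
  def merge(totals: Map[Long, Long], batch: Map[Long, Long]): Map[Long, Long] =
    batch.foldLeft(totals) { case (acc, (bucket, n)) =>
      acc.updated(bucket, acc.getOrElse(bucket, 0L) + n)
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical report times, and the same data arriving as three micro-batches.
    val all = Seq(0L, 299999L, 300000L, 450000L, 900001L)
    val microBatches = Seq(Seq(0L, 299999L), Seq(300000L), Seq(450000L, 900001L))

    val batchResult = countByBucket(all)
    val streamResult = microBatches.map(countByBucket).foldLeft(Map.empty[Long, Long])(merge)
    println(s"batch and streaming results agree: ${batchResult == streamResult}")
  }
}
```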

Data Science Roles

Data Scientist             Data Engineer
Machine learning expert    Scalable algorithms expert

Data Science Roles

Data Scientist                              Data Engineer
Builds pipelines that work on her laptop    Rewrites her pipelines to scale better

Collaborative Data Science

The Analytics OS

Notebooks
Stream Analytics
Batch Analytics

But…

The Real World (Enterprise Version)

The Real World (Startup Version)

Application
MySQL

Step 1: Pick a Problem to Solve

Step 2: Build a Data Lake

Step 3: Set Up Spark

• Direct download
• Hadoop distribution (Hortonworks, Cloudera, etc.)
• Managed service (Elastic MapReduce, Databricks, BlueMix, etc.)

Step 4: Start Collecting Data

• Options:
  – Sqoop to move RDBMS tables
  – Flume/FluentD to move logs
  – Import from Spark-supported data sources
  – Using Spark Streaming attached to a queue
  – …
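As one illustration of the "import from Spark-supported data sources" option, the sketch below reads a table through Spark's built-in JDBC data source and lands it in the data lake as Parquet. It assumes Spark 2.x or later, a MySQL JDBC driver on the classpath, and a master supplied by spark-submit. The host, database, environment variable names, and output path are hypothetical placeholders; only the table name wx_reports comes from the earlier slides.

```scala
import org.apache.spark.sql.SparkSession

object ImportFromRdbms {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("wx-import-sketch").getOrCreate()

    // All connection details below are hypothetical placeholders.
    val reports = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://db.example.com:3306/weather")
      .option("dbtable", "wx_reports")
      .option("user", sys.env("DB_USER"))
      .option("password", sys.env("DB_PASSWORD"))
      .load()

    // Land the table in the data lake as Parquet; the path is also hypothetical.
    reports.write.mode("overwrite").parquet("/datalake/raw/wx_reports")

    spark.stop()
  }
}
```

Once the data is in Parquet, the batch aggregation shown earlier can run against the lake copy rather than the live database.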

Step 5: Use a Notebook

Final Thoughts

Thank You!

Robbie Strickland
@rs_atl

(we’re hiring!)