Real-Time, Geospatial, Maps Neil Dahlke 29 June 2016


Page 1: Real-Time, Geospatial, Maps by Neil Dahlke

Real-Time, Geospatial, Maps

Neil Dahlke

29 June 2016

Page 2: Real-Time, Geospatial, Maps by Neil Dahlke

Agenda

▪ PowerStream
▪ Supercar
▪ Q&A
▪ Drinks

Page 3: Real-Time, Geospatial, Maps by Neil Dahlke

Renewable Energy

in the News

Page 4: Real-Time, Geospatial, Maps by Neil Dahlke

BBC: http://www.bbc.com/news/science-environment-36420750

Investment in renewables reached $286 billion worldwide in 2015

Page 5: Real-Time, Geospatial, Maps by Neil Dahlke

Germany Just Got Almost All of Its Power From Renewable Energy

May 15, 2016

Bloomberg: http://www.bloomberg.com/news/articles/2016-05-16/germany-just-got-almost-all-of-its-power-from-renewable-energy

Page 6: Real-Time, Geospatial, Maps by Neil Dahlke

Denmark is aiming for 50% renewable energy sources within the next five years

Independent: http://www.independent.co.uk/environment/germany-just-got-almost-all-of-its-power-from-renewable-energy-a7037851.html

42% of electricity produced from wind turbines in 2015

The Guardian: http://www.theguardian.com/environment/2016/jan/18/denmark-broke-world-record-for-wind-power-in-2015

Page 7: Real-Time, Geospatial, Maps by Neil Dahlke

Portugal Runs for Four Days Straight on Renewable Energy Alone

http://www.theguardian.com/environment/2016/may/18/portugal-runs-for-four-days-straight-on-renewable-energy-alone

22% of electricity provided by wind in 2015

Page 8: Real-Time, Geospatial, Maps by Neil Dahlke

MemSQL PowerStream

Predicting the global health of wind turbines

Page 9: Real-Time, Geospatial, Maps by Neil Dahlke

Sensors

Wind Turbine Wind Farm

MemSQL PowerStream

197,000 wind turbines around the world

Page 10: Real-Time, Geospatial, Maps by Neil Dahlke

1 to 2 million data points per second with MemSQL Streamliner

Page 11: Real-Time, Geospatial, Maps by Neil Dahlke

Simulation Details

▪ Data producers (Python programs) push to Kafka
▪ 1M data points per second from 200k turbines
▪ Generated sensor data is based on a predetermined turbine failure model
▪ Transform models individual turbine failures (2 components per turbine) w/ machine learning, determining:
  • How fast is the turbine deteriorating?
  • How bad does the turbine get before being repaired?
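A minimal sketch of what such a data producer could look like, assuming a simple linear-decay failure model. The deck does not show the real producer code, so the function names, the decay and repair parameters, and the JSON record layout are all hypothetical. The `send` callback stands in for a Kafka producer's send call so the sketch runs without a broker.

```python
import json
import random
from typing import Callable, Dict, Iterator


def turbine_readings(turbine_id: int, decay_per_tick: float,
                     repair_threshold: float, ticks: int,
                     seed: int = 0) -> Iterator[Dict]:
    """Yield simulated health readings for one turbine component.

    Assumed model (not from the talk): health starts at 1.0, falls by
    `decay_per_tick` plus a little noise each tick, and resets to 1.0
    (a "repair") once it drops below `repair_threshold`.
    """
    rng = random.Random(seed + turbine_id)
    health = 1.0
    for tick in range(ticks):
        yield {"turbine_id": turbine_id, "tick": tick,
               "health": round(health, 4)}
        health -= decay_per_tick + rng.uniform(0.0, decay_per_tick * 0.1)
        if health < repair_threshold:
            health = 1.0  # simulated repair


def produce(send: Callable[[bytes], None], turbines: int, ticks: int) -> int:
    """Serialize readings as JSON, pass each to `send`, return the count."""
    sent = 0
    for turbine_id in range(turbines):
        for reading in turbine_readings(turbine_id, 0.05, 0.3, ticks):
            send(json.dumps(reading).encode("utf-8"))
            sent += 1
    return sent


if __name__ == "__main__":
    buffer = []
    print(produce(buffer.append, turbines=3, ticks=5), buffer[0])
```

In a real deployment `send` would publish to a Kafka topic instead of appending to a list.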

Page 12: Real-Time, Geospatial, Maps by Neil Dahlke

How does it work?

Page 13: Real-Time, Geospatial, Maps by Neil Dahlke

Demo Architecture and Data Flow

[Diagram: REAL-TIME INPUTS, REAL-TIME APPLICATION]

Page 14: Real-Time, Geospatial, Maps by Neil Dahlke

Demo Architecture and Data Flow

[Diagram: REAL-TIME INPUTS, REAL-TIME APPLICATION]

▪ Simulated sensor data is written to Kafka

Page 15: Real-Time, Geospatial, Maps by Neil Dahlke

Demo Architecture and Data Flow

Extract

[Diagram: REAL-TIME INPUTS, Streamliner, REAL-TIME APPLICATION]

▪ Simulated sensor data is written to Kafka
▪ Streamliner Extractor pulls data from Kafka into Spark

Page 16: Real-Time, Geospatial, Maps by Neil Dahlke

Demo Architecture and Data Flow

Extract, Transform

[Diagram: REAL-TIME INPUTS, Streamliner, REAL-TIME APPLICATION]

▪ Simulated sensor data is written to Kafka
▪ Streamliner Extractor pulls data from Kafka into Spark
▪ Streamliner Transformer then “scores” the failure model (ML algorithm)
  • Failure model is scored through performing a regression on incoming sensor data values

Page 17: Real-Time, Geospatial, Maps by Neil Dahlke

Demo Architecture and Data Flow

Extract, Transform, Load

[Diagram: REAL-TIME INPUTS, Streamliner, REAL-TIME APPLICATION]

▪ Simulated sensor data is written to Kafka
▪ Streamliner Extractor pulls data from Kafka into Spark
▪ Streamliner Transformer then “scores” the failure model (ML algorithm)
  • Failure model is scored through performing a regression on incoming sensor data values
▪ Streamliner Loader inserts the data into MemSQL
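To illustrate the "scoring" step, here is a hedged sketch of a regression over incoming sensor values. The talk does not say which regression is used, so this assumes ordinary least squares on a single health value over time. The helper names and the failure threshold are hypothetical.

```python
from typing import List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ticks_until(threshold: float, xs: List[float],
                ys: List[float]) -> Optional[float]:
    """Estimate ticks after the last reading until health hits `threshold`.

    Returns None when the fitted trend is flat or improving.
    """
    slope, intercept = fit_line(xs, ys)
    if slope >= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - xs[-1])


if __name__ == "__main__":
    ticks = [0, 1, 2, 3]
    health = [1.0, 0.9, 0.8, 0.7]
    print(fit_line(ticks, health), ticks_until(0.3, ticks, health))
```

The slope answers "how fast is the turbine deteriorating?", and the projected crossing time gives one way to raise an alert before a predicted failure.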

Page 18: Real-Time, Geospatial, Maps by Neil Dahlke

Cluster Architecture

Aggregator Nodes

Leaf Nodes

Page 19: Real-Time, Geospatial, Maps by Neil Dahlke

Cluster Architecture

[Diagram: 8 machines, each running Data Producer, Kafka, Spark, MemSQL Aggregator, and MemSQL Leaf; plus ZooKeeper and Spark Master]

Page 20: Real-Time, Geospatial, Maps by Neil Dahlke

Cluster Architecture

Internet-of-Things simulation depicting the health of wind turbines globally.

8 machines (AWS c4.2xlarge instances) at $0.311 per hour per machine; annual cost ~$22,000.
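The annual figure follows from the hourly rate, assuming the cluster runs around the clock for a 365-day year (the slide does not state the assumption explicitly):

```python
MACHINES = 8
HOURLY_RATE_USD = 0.311      # per machine, from the slide
HOURS_PER_YEAR = 24 * 365    # assumes continuous operation


def annual_cost(machines: int, hourly_rate: float,
                hours: int = HOURS_PER_YEAR) -> float:
    """Total yearly cost in USD for `machines` running `hours` each."""
    return machines * hourly_rate * hours


if __name__ == "__main__":
    print(round(annual_cost(MACHINES, HOURLY_RATE_USD), 2))
```

That works out to about $21,795, which the slide rounds to roughly $22,000.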

Page 21: Real-Time, Geospatial, Maps by Neil Dahlke

Visual Layer

▪ MemSQL data is rendered in a web UI
  • Turbine Health (green, yellow, red)
▪ Draw positions of turbines on a MapBox map
  • A geospatial query is sent to MemSQL each time the map view is moved
▪ Alerts based on predicted turbine health
▪ Data points shown on the UI map are all from real-time queries
  • Real-time in this case = 1 second interval
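A sketch of how a map viewport might be turned into a geospatial query. The deck does not show the actual query, so the table name `turbines`, its columns, and the choice of `GEOGRAPHY_INTERSECTS` with a WKT polygon are assumptions. The sketch only builds the SQL and its parameter and does not connect to a database.

```python
from typing import Tuple


def viewport_to_wkt(west: float, south: float,
                    east: float, north: float) -> str:
    """Return a closed WKT polygon (longitude latitude order) for a viewport.

    Does not handle viewports that cross the antimeridian.
    """
    if not (west < east and south < north):
        raise ValueError("expected west < east and south < north")
    corners = [(west, south), (east, south), (east, north),
               (west, north), (west, south)]
    ring = ", ".join(f"{lng} {lat}" for lng, lat in corners)
    return f"POLYGON(({ring}))"


def turbine_query(west: float, south: float,
                  east: float, north: float) -> Tuple[str, Tuple[str]]:
    """Build a parameterized query for turbines inside the viewport.

    Table and column names are hypothetical.
    """
    sql = ("SELECT id, health, location FROM turbines "
           "WHERE GEOGRAPHY_INTERSECTS(location, %s)")
    return sql, (viewport_to_wkt(west, south, east, north),)


if __name__ == "__main__":
    print(turbine_query(-10, 50, 5, 60))
```

A UI would rerun this each time the map view moves, and on a 1 second timer to keep the displayed points current.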

Page 22: Real-Time, Geospatial, Maps by Neil Dahlke

Demo

Page 23: Real-Time, Geospatial, Maps by Neil Dahlke

The On-Demand

Economy

Page 24: Real-Time, Geospatial, Maps by Neil Dahlke

MemSQL Supercar

Real-time asset tracking and analysis

Page 25: Real-Time, Geospatial, Maps by Neil Dahlke

We live in an on-demand economy

Page 26: Real-Time, Geospatial, Maps by Neil Dahlke

Consumers are conditioned to instant services, like Uber, Stripe, and Airbnb

Page 27: Real-Time, Geospatial, Maps by Neil Dahlke

Where does that leave enterprises?

Page 28: Real-Time, Geospatial, Maps by Neil Dahlke

Racing to meet internal and external expectations for speed and personalization

Page 29: Real-Time, Geospatial, Maps by Neil Dahlke

Batch processing is the enterprise enemy

Page 30: Real-Time, Geospatial, Maps by Neil Dahlke

Enterprises must move from overnight to real-time, intra-day operations

Page 31: Real-Time, Geospatial, Maps by Neil Dahlke

Cluster Architecture

▪ One single 16 core machine w/ 64 GB RAM is enough to handle all of the data in real time.
▪ That’s really it

[Diagram: one machine running Data Producer, Kafka, Spark, MemSQL Aggregator, MemSQL Leaf, ZooKeeper, and Spark Master]

Page 32: Real-Time, Geospatial, Maps by Neil Dahlke

Simulation Details

▪ NYC Taxi and Limousine Commission Trip Record Data
  • Downloads available each year for free
▪ Simulation utilizes the dataset from New Year's Eve (one of the busiest days for cabs in NYC)
▪ Drivers are assigned pickups and dropoffs from the real data set
▪ Routes are replayed over time
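A minimal sketch of replaying a trip over time. The deck does not describe how positions between pickup and dropoff are generated, so straight-line interpolation is used here purely as a stand-in. The `Trip` fields and `position_at` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Trip:
    pickup_time: float              # seconds since simulation start
    dropoff_time: float
    pickup: Tuple[float, float]     # (lng, lat)
    dropoff: Tuple[float, float]    # (lng, lat)


def position_at(trip: Trip, now: float) -> Optional[Tuple[float, float]]:
    """Return the driver's interpolated (lng, lat) at `now`.

    Returns None when the trip is not in progress at that time.
    """
    if trip.dropoff_time <= trip.pickup_time:
        raise ValueError("dropoff must be after pickup")
    if now < trip.pickup_time or now > trip.dropoff_time:
        return None
    fraction = (now - trip.pickup_time) / (trip.dropoff_time - trip.pickup_time)
    lng = trip.pickup[0] + fraction * (trip.dropoff[0] - trip.pickup[0])
    lat = trip.pickup[1] + fraction * (trip.dropoff[1] - trip.pickup[1])
    return lng, lat


if __name__ == "__main__":
    trip = Trip(0, 100, (-74.0, 40.7), (-73.9, 40.8))
    print(position_at(trip, 50))
```

Calling `position_at` on a fixed tick for every active trip yields the stream of driver positions that a producer could then write to Kafka.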

Page 33: Real-Time, Geospatial, Maps by Neil Dahlke

Demo Architecture and Data Flow

Extract, Transform, Load

[Diagram: REAL-TIME INPUTS, Streamliner, REAL-TIME APPLICATION]

▪ Simulated driver data is written to Kafka
▪ Streamliner Extractor pulls data from Kafka into Spark
▪ Streamliner Transformer parses the CSV and transforms it to a Spark DataFrame
▪ Streamliner Loader inserts the data into MemSQL
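A sketch of the transform step's CSV parsing. The real transformer produces a Spark DataFrame; this stand-in uses only Python's standard `csv` module and yields plain dictionaries so it runs without Spark. The column layout in `FIELDS` is a guess, since the deck does not list the message format.

```python
import csv
import io
from typing import Dict, Iterator

# Hypothetical column order for one driver update per CSV line.
FIELDS = ("driver_id", "timestamp", "lng", "lat")


def parse_driver_rows(raw: str) -> Iterator[Dict]:
    """Parse CSV text into typed rows, skipping malformed lines."""
    reader = csv.DictReader(io.StringIO(raw), fieldnames=FIELDS)
    for row in reader:
        try:
            yield {
                "driver_id": row["driver_id"],
                "timestamp": int(row["timestamp"]),
                "lng": float(row["lng"]),
                "lat": float(row["lat"]),
            }
        except (TypeError, ValueError):
            # Missing fields arrive as None (TypeError); bad numbers
            # raise ValueError. Either way the line is dropped.
            continue


if __name__ == "__main__":
    batch = "d1,1451606400,-73.98,40.75\nbad,row\n"
    print(list(parse_driver_rows(batch)))
```

Dropping malformed lines keeps one bad message from stalling the batch; a production pipeline might route them to a dead-letter topic instead.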

Page 34: Real-Time, Geospatial, Maps by Neil Dahlke

Demo

Page 35: Real-Time, Geospatial, Maps by Neil Dahlke

Q&A

Page 36: Real-Time, Geospatial, Maps by Neil Dahlke

Resources

▪ PowerStream blog post: http://blog.memsql.com/powerstream-demo/
▪ PowerStream recording: https://youtu.be/DhP324uNZMI?t=589
▪ Supercar blog post: http://blog.memsql.com/real-time-geospatial-intelligence-with-supercar/
▪ Supercar recording: https://www.youtube.com/watch?v=2txICCLUV-Y
▪ Today’s talks will be published soon.

Page 37: Real-Time, Geospatial, Maps by Neil Dahlke

Thank You