
Architecting big data solutions in the cloud


Session Objectives And Takeaways


Lambda Architecture

http://lambda-architecture.net/

1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.

2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) pre-computing the batch views.

3. The serving layer indexes the batch views so that they can be queried in a low-latency, ad-hoc way.

4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.

5. Any incoming query can be answered by merging results from batch views and real-time views.
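
The five steps above can be sketched in a few lines of Python. This is a minimal single-process illustration, not a production design: the page-view events, the `ingest`, `recompute_batch_view`, and `query` helpers, and the use of in-memory counters are all assumptions made for the example. A real deployment would put a distributed store behind each layer.

```python
from collections import Counter

# Hypothetical page-view events; every name below is illustrative only.
master_dataset = []          # batch layer: immutable, append-only raw data
realtime_view = Counter()    # speed layer: counts for recent data only


def ingest(event):
    """Step 1: dispatch every event to both the batch and the speed layer."""
    master_dataset.append(event)           # append only, never updated in place
    realtime_view[event["page"]] += 1


def recompute_batch_view():
    """Steps 2-3: recompute the batch view from the whole master dataset."""
    batch_view = Counter(e["page"] for e in master_dataset)
    realtime_view.clear()    # the new batch view covers everything seen so far
    return batch_view


def query(page, batch_view):
    """Step 5: merge the batch view with the real-time view."""
    return batch_view[page] + realtime_view[page]


ingest({"page": "/home"})
ingest({"page": "/home"})
batch_view = recompute_batch_view()    # batch run covers the first two events
ingest({"page": "/home"})              # arrives after the batch run (step 4)
print(query("/home", batch_view))      # 3
```

The third event is visible to queries only through the real-time view until the next batch run absorbs it, which is the gap the speed layer exists to cover.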


Linux

Windows

What is HDInsight


HDInsight clusters on Azure


What is HBase


Order No    Customer Name   Customer Phone   Company Name   Company Address
12012015    Mostafa         101-232-2345     Microsoft      Redmond, WA

            Customer                         Company
Order No    Name            Phone            Name           Address
12012015    Mostafa         101-232-2345     Microsoft      Redmond, WA
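
The regrouping of the flat order record into the Customer and Company column families can be sketched as follows. This is an illustration only: the `to_column_families` helper is hypothetical, and it assumes that the first word of each column header names the family and that Order No serves as the row key.

```python
# Flat order record, using the column headers and values from the slide.
flat_row = {
    "Order No": "12012015",
    "Customer Name": "Mostafa",
    "Customer Phone": "101-232-2345",
    "Company Name": "Microsoft",
    "Company Address": "Redmond, WA",
}


def to_column_families(row, key_field="Order No"):
    """Group columns by family, assuming 'Family Qualifier' header names."""
    families = {}
    for field, value in row.items():
        if field == key_field:
            continue                      # the key is stored once, as the row key
        family, qualifier = field.split(" ", 1)
        families.setdefault(family, {})[qualifier] = value
    return row[key_field], families


row_key, families = to_column_families(flat_row)
print(row_key)
print(families)
```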


Create

Select

Update

Select

What is HBase
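
The Create, Select, Update, Select sequence can be mimicked with a small in-memory stand-in for a column-family table. `TinyTable` and its `put` and `get` methods are hypothetical names for this sketch and are not the HBase client API. The sketch only assumes that a table is created with a fixed set of column families and that cells are addressed as `family:qualifier`.

```python
class TinyTable:
    """In-memory stand-in for a column-family table (not the HBase API)."""

    def __init__(self, name, families):
        self.name = name
        self.families = set(families)   # column families are fixed at creation
        self.rows = {}                  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        """Insert or overwrite a single cell."""
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        """Return a copy of all cells stored for one row key."""
        return dict(self.rows.get(row_key, {}))


orders = TinyTable("orders", ["Customer", "Company"])       # Create
orders.put("12012015", "Customer:Name", "Mostafa")
orders.put("12012015", "Company:Name", "Microsoft")
before = orders.get("12012015")                             # Select
orders.put("12012015", "Customer:Phone", "101-232-2345")    # Update
after = orders.get("12012015")                              # Select
print(before)
print(after)
```

New columns can be added to an existing row at any time, but only inside a family declared when the table was created. That restriction is the design choice the sketch is meant to highlight.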


data warehouse system

What is Hive


distributed, fault-tolerant, open-source

analytics solutions

templates

What is Apache Storm


Topology

Stream

Tuple

Spout

Bolt: streams, tuples

Apache Storm Components
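
How these components fit together can be sketched with plain Python generators. This is a single-process analogy, not the Storm API: the function names, the sample sentences, and the word-count task are assumptions chosen for illustration, and real Storm topologies run distributed across a cluster.

```python
from collections import Counter


def sentence_spout():
    """Spout: the source of a stream; emits one tuple per sentence."""
    for sentence in ["the cow jumped", "the moon"]:
        yield (sentence,)


def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Bolt: consumes word tuples and emits running (word, count) tuples."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology():
    """Topology: wires spout -> split bolt -> count bolt."""
    return list(count_bolt(split_bolt(sentence_spout())))


results = run_topology()
for item in results:
    print(item)
```

Each stage sees only the tuples emitted by the stage before it, so the wiring in `run_topology` plays the role of the topology graph.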


100x 10x

What is Apache Spark


complexities of ingesting and storing all of your data; batch, streaming, interactive analytics

Azure Data Lake (ADL)



Azure Data Lake Analytics


http://mostafa.rocks