© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
John Yeung, Solutions Architect
31 October 2017
Deep Dive on AWS with Demo
AWS Big Data and Machine Learning Day | Hong Kong
What to expect from the session
• Big Data Challenges
• Architectural Principles
• Design Patterns
• Demo (around 15 mins)
Ever-Increasing Big Data
Volume
Velocity
Variety
Big Data Evolution
Batch Processing
Stream Processing
Machine Learning
Plenty of Tools
Amazon Glacier
S3
DynamoDB
RDS
EMR
Amazon Redshift
Data Pipeline
Amazon Kinesis
Amazon Kinesis Streams app
Lambda
Amazon ML
SQS
ElastiCache
DynamoDB Streams
Amazon Elasticsearch Service
Amazon Kinesis Analytics
Big Data Challenges
Why?
How?
What tools should I use?
Is there a reference architecture?
Architectural Principles
Architecture Principles
#1: Build Decoupled Systems
• Data → Store → Process → Store → Analyze → Answers
#2: Use Right Tool for the Job
• Data structure, latency, throughput, access patterns
#3: Leverage AWS Managed Services
• Scalable/elastic, available, reliable, secure, no/low admin
#4: Use Lambda Architecture Ideas
• Immutable (append-only) log, batch/speed/serving layer
#5: Be Cost-conscious
• Big data ≠ big cost
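Principle #4 can be illustrated with a minimal, in-memory sketch. It is not an AWS implementation: the event shape, the function names, and the checkpoint value are all hypothetical, chosen only to show how an append-only log feeds a batch layer and a speed layer whose results a serving layer merges.

```python
from collections import Counter

# Hypothetical event log: each entry is (event_id, page).
# Append-only: events are added, never updated or deleted.
log = []

def append(event):
    log.append(event)

def batch_view(upto):
    """Batch layer: recompute counts from scratch over log[:upto]."""
    return Counter(page for _, page in log[:upto])

def speed_view(since):
    """Speed layer: counts for events the last batch run has not covered."""
    return Counter(page for _, page in log[since:])

def serve(batch, speed):
    """Serving layer: merge both views to answer a query."""
    return batch + speed

for i, page in enumerate(["/home", "/cart", "/home", "/home", "/cart"]):
    append((i, page))

checkpoint = 3  # assumption: the batch job last ran after 3 events
answer = serve(batch_view(checkpoint), speed_view(checkpoint))
print(dict(answer))
```

Because the log is immutable, the batch view can always be rebuilt from scratch, and the merged answer matches a full recomputation over the whole log.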
Simplify Big Data Processing
COLLECT → STORE → PROCESS/ANALYZE → CONSUME
1. Time to answer (Latency)
2. Throughput
3. Cost
COLLECT
Types of Data (COLLECT)
• Applications (mobile apps, web apps, data centers via AWS Direct Connect): in-memory data and database records → RECORDS (transaction-based)
• Logging and transport (Amazon CloudWatch, AWS CloudTrail, AWS Import/Export Snowball): search documents and log files → DOCUMENTS, FILES (file-based)
• Messaging: messages → MESSAGES (event-based)
• IoT (devices, sensors & IoT platforms, AWS IoT): data streams → STREAMS (event-based)
Store
STORE
Types of Data Stores
• Database: SQL & NoSQL databases
• Search: Search engines
• File store: File systems
• Queue: Message queues
• Stream storage: Pub/sub message queues
• In-memory: Caches
In-memory, Database, Search
[Diagram: the COLLECT sources feed the STORE stage, which groups stores as in-memory, database, search, file (Amazon S3), message (Amazon SQS), and stream (Apache Kafka, Amazon Kinesis Streams, Amazon Kinesis Firehose, Amazon DynamoDB Streams). The stream stores are marked hot.]
[Diagram: the COLLECT and STORE stages with the in-memory, database, and search stores filled in: cache (Amazon ElastiCache), NoSQL (Amazon DynamoDB), SQL (Amazon RDS), and search (Amazon Elasticsearch Service), alongside file (Amazon S3), message (Amazon SQS), and stream (Apache Kafka, Amazon Kinesis Streams, Amazon Kinesis Firehose, Amazon DynamoDB Streams).]
Amazon ElastiCache
• Managed Memcached or Redis service
Amazon DynamoDB
• Managed NoSQL database service
Amazon RDS
• Managed relational database service
Amazon Elasticsearch Service
• Managed Elasticsearch service
Use the Right Tool for the Job
Data Tier
• Search: Amazon Elasticsearch Service
• In-memory: Amazon ElastiCache (Redis, Memcached)
• SQL: Amazon Aurora, Amazon RDS (MySQL, PostgreSQL, Oracle, SQL Server)
• NoSQL: Amazon DynamoDB, Cassandra, HBase, MongoDB
File Storage
[Diagram: the COLLECT and STORE stages with Amazon S3 shown as the file store, alongside the in-memory, database, and search stores, message storage (Amazon SQS), and hot stream storage (Apache Kafka, Amazon Kinesis Streams, Amazon Kinesis Firehose, Amazon DynamoDB Streams).]
Why Is Amazon S3 Good for Big Data
• Natively supported by big data frameworks (Spark, Hive, Presto, etc.)
• Multiple & heterogeneous analysis clusters can use the same data
• Unlimited number of objects and volume of data
• Very high bandwidth – no aggregate throughput limit
• Designed for 99.99% availability – can tolerate zone failure
• Designed for 99.999999999% durability
• No need to pay for data replication
• Native support for versioning
• Tiered-storage (Standard, IA, Amazon Glacier) via life-cycle policies
• Secure – SSL, client/server-side encryption at rest
• Low cost
Message & Stream Storage
Amazon SQS
• Managed message queue service
Apache Kafka
• High throughput distributed streaming platform
Amazon Kinesis Streams
• Managed stream storage + processing
Amazon Kinesis Firehose
• Managed data delivery
Amazon DynamoDB
• Managed NoSQL database
• Tables can be stream-enabled
[Diagram: the COLLECT and STORE stages with message storage (Amazon SQS) and stream storage (Apache Kafka, Amazon Kinesis Streams, Amazon Kinesis Firehose, Amazon DynamoDB Streams) shown alongside the in-memory, database, search, and file stores.]
Why Stream Storage
Decouple producers & consumers
Persistent buffer
Collect multiple streams
Preserve client ordering
Parallel consumption
[Diagram: records numbered 1 to 4 in four colors are written to a DynamoDB stream, Amazon Kinesis stream, or Kafka topic and split across shard 1 / partition 1 and shard 2 / partition 2. Consumer 1 reports count of red = 4 and count of violet = 4. Consumer 2 reports count of blue = 4 and count of green = 4.]
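The ordering and parallel-consumption points can be sketched in a few lines. This is a simplified illustration, not any service's real routing code: Amazon Kinesis Streams does hash the partition key with MD5, but it maps the result onto per-shard hash key ranges, whereas the modulo below is an assumption made for brevity. The record shape and the two-shard setup are also hypothetical.

```python
import hashlib
from collections import Counter, defaultdict

NUM_SHARDS = 2  # assumption: two shards, as in the slide's diagram

def shard_for(partition_key, num_shards=NUM_SHARDS):
    """Map a partition key to a shard using a stable hash (simplified)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards

# Hypothetical records: (color, sequence number); the color is the partition key.
records = [(color, n)
           for n in range(1, 5)
           for color in ("red", "violet", "blue", "green")]

# Producers append to the shard chosen by the key, in arrival order.
shards = defaultdict(list)
for color, n in records:
    shards[shard_for(color)].append((color, n))

# One consumer per shard; each counts only the keys routed to its shard.
counts = {shard: Counter(color for color, _ in recs)
          for shard, recs in shards.items()}

for shard, per_color in sorted(counts.items()):
    print("shard", shard, dict(per_color))
```

Every record with the same key lands on the same shard, so each consumer sees a given color's records in the order they were written, and consumers on different shards can run in parallel without coordinating.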
What Stream Storage should I use?
| | Amazon DynamoDB Streams | Amazon Kinesis Streams | Amazon Kinesis Firehose | Apache Kafka | Amazon SQS |
| AWS managed service | Yes | Yes | Yes | No | Yes |
| Guaranteed ordering | Yes | Yes | Yes | Yes | No |
| Delivery | exactly-once | at-least-once | exactly-once | at-least-once | at-least-once |
| Data retention period | 24 hours | 7 days | N/A | Configurable | 14 days |
| Availability | 3 AZ | 3 AZ | 3 AZ | Configurable | 3 AZ |
| Scale / throughput | No limit / ~ table IOPS | No limit / ~ shards | No limit / automatic | No limit / ~ nodes | No limits / automatic |
| Parallel clients | Yes | Yes | No | Yes | No |
| Stream MapReduce | Yes | Yes | N/A | Yes | N/A |
| Record/object size | 400 KB | 1 MB | Redshift row size | Configurable | 256 KB |
| Cost | Higher (table cost) | Low | Low | Low (+admin) | Low-medium |
Temperature scale: Hot → Warm
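Several options in the Delivery row are at-least-once, which means a consumer may see the same record more than once. One common way to cope is to make the consumer idempotent. The sketch below is illustrative only: the record IDs, the in-memory `seen` set, and the function name are assumptions, and a production consumer would typically persist its deduplication state.

```python
def process_at_least_once(deliveries, handler):
    """Call handler once per record ID, even if a record is delivered again."""
    seen = set()  # assumption: IDs fit in memory; real consumers persist this
    for record_id, payload in deliveries:
        if record_id in seen:
            continue  # duplicate delivery, skip it
        seen.add(record_id)
        handler(payload)

processed = []
# Hypothetical deliveries: record "b" is redelivered after a simulated retry.
deliveries = [("a", 10), ("b", 20), ("b", 20), ("c", 30)]
process_at_least_once(deliveries, processed.append)
print(processed)
```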
Which Data Store Should I Use
Data Structure → Fixed schema, JSON, key-value
Access Patterns → Store data in the format you will access it
Data Characteristics → Hot, Warm, Cold
Cost → Right cost
Data Structure and Access Patterns
| Access Patterns | What to use? |
| Put/Get (key, value) | In-memory, NoSQL |
| Simple relationships → 1:N, M:N | NoSQL |
| Multi-table joins, transaction, SQL | SQL |
| Faceting, search | Search |
| Data Structure | What to use? |
| Fixed schema | SQL, NoSQL |
| Schema-free (JSON) | NoSQL, Search |
| (Key, value) | In-memory, NoSQL |
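The two mappings on this slide can be combined into a small lookup. This is only a sketch of how one might apply them together; the dictionary keys are paraphrased from the slide and the function name is hypothetical.

```python
# Lookup tables transcribed from the slide's two mappings.
BY_ACCESS_PATTERN = {
    "put/get (key, value)": ["In-memory", "NoSQL"],
    "simple relationships (1:N, M:N)": ["NoSQL"],
    "multi-table joins, transactions, SQL": ["SQL"],
    "faceting, search": ["Search"],
}
BY_DATA_STRUCTURE = {
    "fixed schema": ["SQL", "NoSQL"],
    "schema-free (JSON)": ["NoSQL", "Search"],
    "(key, value)": ["In-memory", "NoSQL"],
}

def candidate_stores(access_pattern, data_structure):
    """Return the store types suggested by both mappings."""
    by_structure = BY_DATA_STRUCTURE[data_structure]
    return [s for s in BY_ACCESS_PATTERN[access_pattern] if s in by_structure]

print(candidate_stores("faceting, search", "schema-free (JSON)"))
print(candidate_stores("put/get (key, value)", "(key, value)"))
```

An empty result, for example multi-table joins over schema-free JSON, signals that the access pattern and the data structure pull in different directions and one of them may need to change.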
What is the temperature of your data
Data characteristics: Hot, Warm or Cold
| | Hot | Warm | Cold |
| Volume | MB–GB | GB–TB | PB–EB |
| Item size | B–KB | KB–MB | KB–TB |
| Latency | ms | ms, sec | min, hrs |
| Durability | Low–high | High | Very high |
| Request rate | Very high | High | Low |
| Cost/GB | $$-$ | $-¢¢ | ¢ |
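[Figure: data stores plotted by data temperature (hot, warm, cold) against structure (low to high). Labels shown: In-memory, SQL, NoSQL, Amazon Glacier. From hot to cold, request rate and cost/GB go from high to low, while latency and data volume go from low to high.]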
Which Data Store Should I Use
| | Amazon ElastiCache | Amazon DynamoDB | Amazon RDS/Aurora | Amazon ES | Amazon S3 | Amazon Glacier |
| Average latency | ms | ms | ms, sec | ms, sec | ms, sec, min (~ size) | hrs |
| Typical data stored | GB | GB–TBs (no limit) | GB–TB (64 TB max) | GB–TB | MB–PB (no limit) | GB–PB (no limit) |
| Typical item size | B-KB | KB (400 KB max) | KB (64 KB max) | B-KB (2 GB max) | KB-TB (5 TB max) | GB (40 TB max) |
| Request rate | High – very high | Very high (no limit) | High | High | Low – high (no limit) | Very low |
| Storage cost GB/month | $$ | ¢¢ | ¢¢ | ¢¢ | ¢ | ¢4/10 |
| Durability | Low - moderate | Very high | Very high | High | Very high | Very high |
| Availability | High, 2 AZ | Very high, 3 AZ | Very high, 3 AZ | High, 2 AZ | Very high, 3 AZ | Very high, 3 AZ |
Temperature scale: Hot data → Warm data → Cold data
PROCESS / ANALYZE
Analytics & Frameworks
Interactive
• Takes seconds
• Example: Self-service dashboards
• Amazon Redshift, Amazon Athena, Amazon EMR (Presto, Spark)
Batch
• Takes minutes to hours
• Example: Daily/weekly/monthly reports
• Amazon EMR (MapReduce, Hive, Pig, Spark)
Message
• Takes milliseconds to seconds
• Example: Message processing
• Amazon SQS applications on Amazon EC2
Stream
• Takes milliseconds to seconds
• Example: Fraud alerts, 1 minute metrics
• Amazon EMR (Spark Streaming), Amazon Kinesis Analytics, KCL, Storm, AWS Lambda
[Diagram: the PROCESS / ANALYZE stage grouped as stream (Amazon Kinesis Analytics, KCL apps, AWS Lambda, streaming on Amazon EMR and Amazon EC2), message (Amazon SQS apps on Amazon EC2), interactive (Amazon Redshift, Presto on Amazon EMR, Amazon Athena), batch (Amazon EMR), and ML (Amazon Machine Learning), with fast and slow markers.]
What about ETL
[Diagram: ETL sits between STORE and PROCESS / ANALYZE.]
Data Integration Partners
Reduce the effort to move, cleanse, synchronize, manage, and automate data-related processes.
https://aws.amazon.com/big-data/partner-solutions/
AWS Glue (New)
AWS Glue is a fully managed ETL service that makes it easy to understand your data sources, prepare the data, and move it reliably between data stores.
CONSUME
[Diagram: the COLLECT, STORE, ETL, and PROCESS / ANALYZE stages assembled into one pipeline ahead of CONSUME. COLLECT and STORE show the sources and stores listed earlier, with stores marked hot to warm. PROCESS / ANALYZE shows stream (Amazon Kinesis Analytics, KCL apps, AWS Lambda, streaming on Amazon EMR and Amazon EC2), message (Amazon SQS apps on Amazon EC2), interactive (Amazon Redshift, Presto on Amazon EMR, Amazon Athena), batch (Amazon EMR), and ML (Amazon Machine Learning), with fast and slow markers.]
[Diagram: COLLECT → STORE → ETL → PROCESS / ANALYZE → CONSUME]
CONSUME options:
• Applications & API (apps & services)
• Analysis and visualization (Amazon QuickSight)
• Notebooks
• IDE
Consumers: business users; data scientists, developers
Put them together
[Diagram: the full reference architecture]
COLLECT
• Applications (mobile apps, web apps, data centers via AWS Direct Connect): RECORDS
• Logging and transport (Amazon CloudWatch, AWS CloudTrail, AWS Import/Export Snowball): DOCUMENTS, FILES
• Messaging: MESSAGES
• IoT (devices, sensors & IoT platforms, AWS IoT): STREAMS
STORE (marked hot to warm)
• Cache, NoSQL, SQL, search: Amazon ElastiCache, Amazon DynamoDB, Amazon RDS, Amazon Elasticsearch Service
• File: Amazon S3
• Message: Amazon SQS
• Stream: Apache Kafka, Amazon Kinesis Streams, Amazon Kinesis Firehose, Amazon DynamoDB Streams
ETL
PROCESS / ANALYZE (with fast and slow markers)
• Stream: Amazon Kinesis Analytics, KCL apps, AWS Lambda, streaming on Amazon EMR and Amazon EC2
• Message: Amazon SQS apps on Amazon EC2
• Interactive: Amazon Redshift, Presto on Amazon EMR, Amazon Athena
• Batch: Amazon EMR
• ML: Amazon Machine Learning
CONSUME
• Apps & Services (API)
• Analysis & visualization (Amazon QuickSight)
• Notebooks
• IDE
Design Patterns
Concept #1: Decoupled Data Bus
• Storage decoupled from processing
• Multiple stages
[Diagram: Store → Process → Store → Process]
Concept #2: Multiple Stream Processing
• Parallel processing
[Diagram: Store → Process. Components shown: Amazon Kinesis, AWS Lambda, Amazon Kinesis Connector Library, KCL, Amazon DynamoDB, Amazon S3.]
Concept #3: Multiple Data Stores
• Analysis framework reads from or writes to multiple data stores
[Diagram: Store → Process. Components shown: Amazon Kinesis, AWS Lambda, Amazon Kinesis Connector Library, KCL, Amazon EMR (Spark Streaming, Spark SQL), Amazon S3, Amazon DynamoDB.]
Real-time Analytics Design Pattern
[Diagram: Store → Process]
• Stream sources: Amazon Kinesis, Apache Kafka, DynamoDB Streams
• Processing: KCL, AWS Lambda, Spark Streaming, Apache Storm, Amazon EMR, Amazon ML
• Destinations: Amazon SNS, Amazon ElastiCache (Redis), Amazon DynamoDB, Amazon RDS, Amazon ES
• Output labels: Notifications, Alert, App state, Real-time prediction, KPI
Message / Event Processing Design Pattern
[Diagram: Store → Process]
• Messages / events arrive in Amazon SQS, including an Amazon SQS priority queue
• Processing: Amazon SQS apps in an Auto Scaling group
• Destinations: Amazon SNS (publish to subscribers), Amazon ElastiCache (Redis), Amazon DynamoDB, Amazon RDS, Amazon ES
• Output labels: Publish, App state, KPI
Interactive & Batch Analytics Design Pattern
[Diagram: Store → Process → Consume]
• Store: files in Amazon S3; Amazon Kinesis Firehose and Amazon Kinesis Analytics also appear
• Batch mode: Amazon EMR (Hive, Pig, Spark)
• Interactive mode: Amazon Redshift, Amazon EMR (Presto, Spark), Amazon Athena
• Amazon Machine Learning: batch prediction and real-time prediction
Demonstration
Apply what we’ve just learnt
Real-time Analytics Design Pattern
[Diagram: Apache Web Server (Availability Zone #1) → Amazon Kinesis Firehose → Amazon S3 bucket and Amazon Kinesis Analytics → Amazon Kinesis Firehose → Amazon Elasticsearch → Kibana]
Amazon Elastic Compute Cloud (EC2)
Amazon EC2 provides virtual machines (VMs), known as instances, to run your web application on the platform you choose. It lets you configure and scale your compute capacity easily to meet changing requirements and demand.
In this demo, the instance runs Apache Web Server, which continuously generates web access log records, and Amazon Kinesis Agent, which streams these records to Amazon Kinesis Firehose.
[Icon: Apache Web Server + Amazon Kinesis Agent]
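To make the shape of these records concrete, here is a minimal sketch that turns one Apache access-log line into a JSON record. The slides do not say how the agent is configured, so this is an assumption: the field names (notably `response`) are chosen to line up with the `"response"` column used by the SQL later in the deck, and the sample log line is made up.

```python
import json
import re

# Apache Common Log Format: host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<datetime>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<response>\d{3}) (?P<bytes>\S+)'
)

def log_line_to_json(line):
    """Parse one access-log line into a JSON string, or None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    record = m.groupdict()
    record["response"] = int(record["response"])  # numeric HTTP status code
    return json.dumps(record)

# Hypothetical sample line; the address is from a documentation-only range.
sample = ('203.0.113.7 - - [31/Oct/2017:10:15:32 +0800] '
          '"GET /index.html HTTP/1.1" 200 5120')
print(log_line_to_json(sample))
```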
Amazon Kinesis Firehose
Amazon Kinesis Firehose is a fully managed service for delivering real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, or Amazon Elasticsearch Service (Amazon ES).
In this step, we will create an Amazon Kinesis Firehose delivery stream to save each log entry in Amazon S3 and to provide the log data to the Amazon Kinesis Analytics application.
Example: Real-time Analytics (1)
[Diagram, COLLECT stage: Apache Web Server (Availability Zone #1) sends streaming data to Amazon Kinesis Firehose.]
1. A Linux instance with Amazon Kinesis Agent installed continuously sends log records to Amazon Kinesis Firehose.
Amazon Simple Storage Service (S3)
Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure.
Examples: web access logs, static websites, and data lakes.
Amazon Kinesis Analytics
Amazon Kinesis Analytics enables you to query streaming data or build entire streaming applications using SQL, so that you can gain actionable insights promptly.
It takes care of everything required to run your queries continuously and scales automatically to match the volume and throughput rate of your incoming data.
Example: Real-time Analytics (2)
[Diagram, COLLECT and STORE stages: Apache Web Server (Availability Zone #1) → Amazon Kinesis Firehose → Amazon S3 bucket.]
2a. Amazon Kinesis Firehose writes each log record to Amazon Simple Storage Service (S3) for durable storage.
Example: Real-time Analytics (2)
[Diagram, COLLECT, STORE, and PROCESS / ANALYZE stages: Apache Web Server (Availability Zone #1) → Amazon Kinesis Firehose → Amazon S3 bucket and Amazon Kinesis Analytics.]
2b. Amazon Kinesis Analytics runs a SQL statement against the streaming input data.
SQL Operations Inside Kinesis Analytics
Source Stream → Insert & Select (Pump) → Destination Stream → Amazon Kinesis Firehose (inside Amazon Kinesis Analytics)
-- In-application output stream that receives the per-minute counts.
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    datetime VARCHAR(30),
    status INTEGER,
    statusCount INTEGER);
-- The pump continuously selects from the source stream and inserts into the destination stream.
CREATE OR REPLACE PUMP "STREAM_PUMP" AS
    INSERT INTO "DESTINATION_SQL_STREAM"
    SELECT STREAM
        TIMESTAMP_TO_CHAR('yyyy-MM-dd''T''HH:mm:ss.SSS', LOCALTIMESTAMP) AS datetime,
        "response" AS status,
        COUNT(*) AS statusCount
    FROM "SOURCE_SQL_STREAM_001"
    -- Group by response code within one-minute windows of ROWTIME.
    GROUP BY
        "response",
        FLOOR(("SOURCE_SQL_STREAM_001".ROWTIME - TIMESTAMP '1970-01-01 00:00:00') minute / 1 TO MINUTE);
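For readers less familiar with streaming SQL, the grouping in this query can be imitated offline in plain Python. This is a sketch of the windowing idea only, under stated assumptions: it processes a finished list rather than a live stream, the input rows are made up, and it labels each result with the window start, whereas the query stamps each output row with LOCALTIMESTAMP.

```python
from collections import Counter
from datetime import datetime

def count_status_per_minute(records):
    """Count records per (minute window, response code)."""
    counts = Counter()
    for rowtime, response in records:
        # Floor the arrival time to the minute, like the FLOOR(... TO MINUTE) term.
        window_start = rowtime.replace(second=0, microsecond=0)
        counts[(window_start, response)] += 1
    return counts

# Hypothetical input rows: (arrival time, HTTP response code).
rows = [
    (datetime(2017, 10, 31, 10, 15, 5), 200),
    (datetime(2017, 10, 31, 10, 15, 40), 200),
    (datetime(2017, 10, 31, 10, 15, 59), 404),
    (datetime(2017, 10, 31, 10, 16, 1), 200),
]
result = count_status_per_minute(rows)
for (window_start, status), n in sorted(result.items()):
    print(window_start.isoformat(), status, n)
```

Each distinct pair of minute and response code becomes one output row, which is the aggregated data set the next step forwards every minute.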
Example: Real-time Analytics (3)
[Diagram, COLLECT, STORE, PROCESS / ANALYZE, and STORE stages: Apache Web Server (Availability Zone #1) → Amazon Kinesis Firehose → Amazon S3 bucket and Amazon Kinesis Analytics → second Amazon Kinesis Firehose delivery stream.]
3. Amazon Kinesis Analytics creates an aggregated data set every minute and outputs that data to a second Firehose delivery stream.
Amazon Elasticsearch Service (ES)
Amazon Elasticsearch Service makes it easy to deploy, secure, operate, and scale Elasticsearch for log analytics, full text search, application monitoring, and more. Amazon Elasticsearch Service is a fully managed service that delivers real-time analytics capabilities alongside the availability, scalability, and security that production workloads require.
The service offers built-in integrations with Kibana, Logstash, and other AWS services. It enables you to go from raw data to actionable insights quickly and securely.
Example: Real-time Analytics (4)
[Diagram, COLLECT, STORE, PROCESS / ANALYZE, and STORE stages: Apache Web Server (Availability Zone #1) → Amazon Kinesis Firehose → Amazon S3 bucket and Amazon Kinesis Analytics → Amazon Kinesis Firehose → Amazon Elasticsearch.]
4. This Firehose delivery stream will write the aggregated data to an Amazon ES domain.
Kibana
Kibana lets you visualize your Elasticsearch data. It provides interactive visualizations of various types, including histograms, line graphs, pie charts, and more. It leverages the full aggregation capabilities of Elasticsearch.
Example: Real-time Analytics (5)
[Diagram, COLLECT, STORE, PROCESS / ANALYZE, STORE, and CONSUME stages: Apache Web Server (Availability Zone #1) → Amazon Kinesis Firehose → Amazon S3 bucket and Amazon Kinesis Analytics → Amazon Kinesis Firehose → Amazon Elasticsearch → Kibana.]
5. Finally, use Kibana to visualize the result of your system.
Implementation Steps
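[Diagram: the full demo pipeline across the COLLECT, STORE, PROCESS / ANALYZE, STORE, and CONSUME stages (Apache Web Server in Availability Zone #1, Amazon Kinesis Firehose, Amazon S3 bucket, Amazon Kinesis Analytics, second Amazon Kinesis Firehose, Amazon Elasticsearch, Kibana), with implementation steps numbered 1, 2a, 2b, 3, 4, 5, and 6.]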
Let’s build your own in 60 minutes!
https://aws.amazon.com/getting-started/projects/build-log-analytics-solution/
Thank you!
John Yeung | [email protected]