Cloudlytics: In Depth S3 & CloudFront Log Analysis - Featuring Reports

DESCRIPTION

This presentation covers the following:

- How AWS S3 & CloudFront logs work with respect to content storage and distribution
- The hidden potential of your stored S3 & CloudFront logs, and how to unlock it with Cloudlytics
- Some of our reports built using Cloudlytics

Check the video embedded after the SlideShare for a live recording of our webinar on this topic.

Featuring Our Latest Reports

AGENDA

Introduction To Amazon S3 & CloudFront

Log Processing Using AWS vs. Traditional Ways

Log Processing With Cloudlytics: Big Data Approach

Cloudlytics Use cases

Cloudlytics Reports & Live Demo

What is Amazon Simple Storage Service or S3?

• Amazon Simple Storage Service is storage for the internet

• Stores 2 trillion+ objects

• 1.1 million requests per second at peak

• Each time a request is made to access a file on S3, an entry is created in a log file

• Average size of each log entry: 550 B
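As a rough back-of-the-envelope check on these figures, the sketch below multiplies a request rate by the average entry size to estimate raw log volume. The function name is illustrative, and it assumes every request produces exactly one log entry, which holds only where logging is enabled.

```python
def log_bytes_per_second(requests_per_second, avg_entry_bytes=550):
    """Estimate raw (uncompressed) log volume, assuming one log entry per request."""
    return requests_per_second * avg_entry_bytes


# Figures quoted on this slide: 1.1 million requests/second at peak, 550 B per entry.
peak = log_bytes_per_second(1_100_000)
print(f"{peak / 1e6:.0f} MB of log data per second at peak")  # prints "605 MB ..."
```

The same function can be reused with your own request rate to get a feel for how much log data a single bucket or distribution might produce.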

Image Courtesy: http://threatpost.com/files/2013/03/

What is Amazon CloudFront?

• Amazon CloudFront is a web service for content delivery

• CloudFront decreases latencies for object downloads and streams

• Supports static and dynamic content, including web pages

• Each time a request is made to access a file on CloudFront, an entry is created in a log file

• Average size of each log entry: 650 B

Information Hidden in S3 & CloudFront Logs

• Object details

• Download status

• Download/streaming time

• Number of bytes transferred

• Details about edge locations

• IP address of the requester

• Referrer link

• Time taken to download the object (S3)

• Details about Play, Pause, Stop (streaming content on CloudFront)
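To make the fields listed above concrete, here is a minimal sketch of pulling them out of a CloudFront-style access log. It assumes the tab-separated layout in which a `#Fields:` header line names the columns. The sample lines are invented for illustration and carry only a reduced set of columns; S3 server access logs use a different, space-delimited layout and are not handled here.

```python
def parse_log(lines):
    """Yield one dict per log entry, keyed by the names in the '#Fields:' header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Column names are separated by whitespace after the "#Fields:" prefix.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            # Data rows are tab-separated, in the same order as the header.
            yield dict(zip(fields, line.split("\t")))


# Invented sample with a reduced set of columns (real logs carry more fields).
sample = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location sc-bytes c-ip cs-uri-stem sc-status",
    "2014-01-15\t10:02:11\tLAX1\t2390282\t192.0.2.10\t/downloads/setup.exe\t200",
]
for entry in parse_log(sample):
    print(entry["c-ip"], entry["cs-uri-stem"], entry["sc-bytes"], entry["sc-status"])
```

Reading the column names from the header, rather than hard-coding positions, keeps the sketch tolerant of log files that carry additional fields.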

Uncover the Hidden Information

• Generating logs for Amazon S3 & CloudFront is optional

• Log files are stored in S3 buckets

• CloudFront log files are compressed and stored in .gz format

• A log file is generated every hour, but we have seen varied patterns with multiple files generated every hour

• No ready solution from AWS to process these log files
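Because the log files arrive gzip-compressed and an hour may be split across several files, a first processing step is usually to read them all as one stream. The sketch below does this with the Python standard library only. It assumes the files have already been downloaded from the log bucket to a local directory; the file names in the demo are made up.

```python
import gzip
import tempfile
from pathlib import Path


def read_gz_logs(directory):
    """Yield decoded lines from every .gz file in a directory, in file-name order."""
    for path in sorted(Path(directory).glob("*.gz")):
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n")


# Demo: two hypothetical files covering the same hour are read as one stream.
with tempfile.TemporaryDirectory() as tmp:
    for name, body in [("hour-10.part1.gz", "entry A\n"), ("hour-10.part2.gz", "entry B\n")]:
        with gzip.open(Path(tmp) / name, "wt", encoding="utf-8") as out:
            out.write(body)
    print(list(read_gz_logs(tmp)))
```

Yielding lines lazily means the files never have to be decompressed to disk or held in memory all at once.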

S3 logs & CloudFront logs → Logs stored in S3 → Logs analyzed by Cloudlytics

Image Courtesy: www.fao.org

Traditional Log Processing

• Extract data from the source using an ETL tool

• Transform the data and load it into a data warehouse

• Takes days to process a few GBs of log files using traditional hardware

• Alternatively, use a Hadoop distribution to process logs

• But maintaining a Hadoop cluster is a huge overhead

Log Processing with Cloudlytics

• Cloudlytics: analyze your Amazon S3 & CloudFront logs

• Detailed analysis of your S3 & CloudFront access patterns

• Dynamic graphs to get a 360-degree perspective

• Scalable & reliable service built using Amazon EMR & Redshift

• Pay as you go

• Pay as you go

Log Processing – Big Data Approach

• Cloudlytics extracts log files stored in S3 buckets

• Processes the log files to transform information

• Stores the processed data in a data warehouse

• Graphical and tabular reports generated from data-warehouse

• Graphical and tabular reports generated from data-warehouse
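The extract, transform, store and report flow above can be illustrated at toy scale. The sketch below is not how Cloudlytics is implemented: it swaps in an in-memory SQLite database as a stand-in for the data warehouse, and the table name, column names and sample entries are all invented.

```python
import sqlite3


def load_entries(conn, entries):
    """Transform parsed log entries (string values) and store them as typed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests (uri TEXT, status INTEGER, bytes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO requests VALUES (?, ?, ?)",
        [(e["uri"], int(e["status"]), int(e["bytes"])) for e in entries],
    )


def bytes_per_uri(conn):
    """Tabular report: request count and total bytes per object, largest first."""
    return conn.execute(
        "SELECT uri, COUNT(*), SUM(bytes) FROM requests "
        "GROUP BY uri ORDER BY SUM(bytes) DESC"
    ).fetchall()


# SQLite stands in for the warehouse here; the entries are invented sample data.
conn = sqlite3.connect(":memory:")
load_entries(conn, [
    {"uri": "/a.zip", "status": "200", "bytes": "1000"},
    {"uri": "/b.zip", "status": "404", "bytes": "0"},
    {"uri": "/a.zip", "status": "200", "bytes": "1500"},
])
for row in bytes_per_uri(conn):
    print(row)
```

Once the data sits in a queryable store, each report becomes a single aggregate query, which is the main appeal of the warehouse step.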

Cloudlytics Use Cases

Independent Software Vendors (ISVs)

• ISVs distribute downloadable software to end users across the globe

• ISVs need to ensure that downloads are fast, which helps improve user experience

• ISVs need to track each download for success and failure

• Identify broken links on the website, which helps improve user experience

• Identify the most popular downloads to focus on popular products

• Identify spam attacks to help reduce bandwidth costs
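As an illustration of the checks listed above, the sketch below derives top downloads, broken links and unusually active requesters from already-parsed log entries. The dictionary keys (`uri`, `status`, `ip`), the sample data and the threshold are hypothetical. Treating any 2xx status as a successful download is a simplifying assumption, since range requests can inflate the count.

```python
from collections import Counter


def download_report(entries, top_n=3):
    """Summarize successful downloads and broken links from parsed log entries."""
    succeeded = Counter()
    broken = Counter()
    for entry in entries:
        status = int(entry["status"])
        if 200 <= status < 300:
            succeeded[entry["uri"]] += 1
        elif status == 404:
            broken[entry["uri"]] += 1
    return {
        "top_downloads": succeeded.most_common(top_n),
        "broken_links": sorted(broken),
    }


def heavy_requesters(entries, threshold):
    """Return IPs with more than `threshold` requests (a crude abuse signal)."""
    per_ip = Counter(entry["ip"] for entry in entries)
    return sorted(ip for ip, count in per_ip.items() if count > threshold)


# Invented sample data; the key names are assumptions, not a real log schema.
entries = [
    {"uri": "/setup.exe", "status": "200", "ip": "192.0.2.1"},
    {"uri": "/setup.exe", "status": "200", "ip": "192.0.2.1"},
    {"uri": "/old.zip", "status": "404", "ip": "192.0.2.1"},
    {"uri": "/manual.pdf", "status": "200", "ip": "192.0.2.2"},
]
print(download_report(entries))
print(heavy_requesters(entries, threshold=2))
```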

E-Learning Companies

• E-Learning companies distribute educational content in multiple formats (ebooks, audio, video, etc.)

• Figure out the most popular content

• Figure out end user engagement by querying number of events per request (Play, Pause, Stop)

• Get a breakdown of requests by operating system and device, to tailor content creation to specific platforms
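The engagement and platform breakdowns above might be computed along these lines. This is a sketch under assumptions: the keys `stream` and `event` are hypothetical names for parsed streaming-log fields, the operating-system detection is a deliberately crude substring match, and the user agent is percent-decoded first on the assumption that the log stores it URL-encoded.

```python
from collections import Counter, defaultdict
from urllib.parse import unquote


def events_per_stream(entries):
    """Count Play/Pause/Stop style events for each streamed object."""
    counts = defaultdict(Counter)
    for entry in entries:
        counts[entry["stream"]][entry["event"]] += 1
    return {stream: dict(events) for stream, events in counts.items()}


def os_family(user_agent):
    """Very rough OS classification from a (possibly percent-encoded) user agent."""
    ua = unquote(user_agent).lower()
    # Order matters: Android agents mention Linux, iPhone agents mention Mac OS X.
    for needle, name in (("android", "Android"), ("iphone", "iOS"), ("ipad", "iOS"),
                         ("windows", "Windows"), ("mac os", "macOS"), ("linux", "Linux")):
        if needle in ua:
            return name
    return "Other"


# Invented sample data for illustration only.
sample = [
    {"stream": "lesson1.mp4", "event": "play"},
    {"stream": "lesson1.mp4", "event": "pause"},
    {"stream": "lesson1.mp4", "event": "play"},
    {"stream": "lesson2.mp4", "event": "play"},
    {"stream": "lesson2.mp4", "event": "stop"},
]
print(events_per_stream(sample))
print(os_family("Mozilla/5.0%20(iPhone;%20CPU%20iPhone%20OS%207_0%20like%20Mac%20OS%20X)"))
```

A production report would rely on a maintained user-agent parsing library rather than substring checks; the point here is only the shape of the aggregation.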

Image Courtesy: http://www.elifescience.in/images

Media Organizations

• Large number of media assets available online

• Content does not undergo any changes during its life cycle

• Some content is extremely popular while other content does not get any views

• Identify the most popular content and set the caching mechanism accordingly

• Figure out end user engagement by querying number of events per request (Play, Pause, Stop)

• Identify the edge locations from which the content is downloaded the most, and optimize billing using CloudFront Price Class
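To see which edge locations serve the most content, the bytes transferred can be summed per location and expressed as a share of the total. The sketch below assumes parsed entries that carry the CloudFront field names `x-edge-location` and `sc-bytes`; the location codes and byte counts are invented.

```python
from collections import Counter


def traffic_share_by_edge(entries):
    """Return each edge location's share of total bytes served, largest first."""
    totals = Counter()
    for entry in entries:
        totals[entry["x-edge-location"]] += int(entry["sc-bytes"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {edge: served / grand_total for edge, served in totals.most_common()}


# Invented sample data for illustration only.
sample = [
    {"x-edge-location": "LAX1", "sc-bytes": "2000"},
    {"x-edge-location": "FRA2", "sc-bytes": "1000"},
    {"x-edge-location": "LAX1", "sc-bytes": "1000"},
]
for edge, share in traffic_share_by_edge(sample).items():
    print(f"{edge}: {share:.0%}")
```

If nearly all traffic turns out to come from a few regions, that is the kind of evidence that would inform a price class decision.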

Cloudlytics REPORTS

Which Reports Resonate with Your Business Needs?

Geographic Reports

Browser & OS Statistics

Detailed IP Monitoring

Timeline Charts

The TOP 10

Edge Location Traffic

DEMO

Image Courtesy: SourceKeyit.com

Let’s Look at Cloudlytics in Action

How to get started?

So Where Do You Get Started?

Image courtesy: http://blogs.position2.com/best-of-the-week-august-24-2012

Get Started in 3 Easy Steps

• Configure your log buckets

• Register for free

• Analyze & generate reports

Pricing

• No upfront costs

• Register for free and analyze up to 25 MB of logs per month FREE

• Pay only for the amount of logs you subscribe to

Advantage - Cloudlytics

• Scalable & Reliable

• Developed using Amazon Web Services tools like Amazon EMR & Amazon Redshift

• Developed by BlazeClan Technologies, a leading Consulting Partner with Amazon Web Services

• Pay as you go service with no contracts and no lock-ins

To Sum It Up

• Identify popular downloads & streams

• Get geographical distribution of downloads & streams

• Improve user experience by calculating & optimizing latencies

• Identify edge location traffic and optimize AWS billing

• Identify spam attacks

• Get streaming patterns for video content

Our Global User Reach

80+ Users and Counting!

Coming Soon!

Image Courtesy: http://www.trophies.com/coming-soon/

Upcoming Webinar

Check out our upcoming webinars @ blazeclan.com/webinars