Unconstrained Endpoint Profiling


Googling the Internet

Ionut Trestian, Supranamaya Ranjan, Aleksandar Kuzmanovic, Antonio Nucci

Reviewed by Lee Young Soo

Introduction

Obtaining ‘raw’ packet traces from operational networks can be very hard.

Accurately classifying traffic online at high speeds is an inherently hard problem.

Understanding what people are doing on the Internet requires analyzing operational network traces.

Unconstrained Endpoint Profiling

Introduction of a novel methodology, evaluated in three scenarios:

No operational traces are available

Packet-level traces are available

Sampled flow-level traces are available

Internet access trend analysis for four world regions.

Methodology

Rule Generation

Query Google using a sample ‘seed set’ of random IP addresses from the networks in four world regions.

Keep only the top N keywords that can be meaningfully used for endpoint classification.
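The keyword selection step can be sketched as a simple frequency count. This is a minimal illustration, not the authors' implementation: the function name top_keywords, the stop-word list, and the sample hit texts are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop words; a real rule set would need a much longer list.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "for", "on", "is", "by"}

def top_keywords(hit_texts, n):
    """Return the n most frequent non-stop-word tokens across all hit texts."""
    counts = Counter()
    for text in hit_texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]

# Made-up hit texts standing in for search results of seed IP addresses.
sample_hits = [
    "Proxy list: free proxy servers",
    "Forum post by user on the gaming forum",
    "Open proxy server list",
]
print(top_keywords(sample_hits, 3))
```

The resulting list would still need manual review, since frequent words are not necessarily meaningful for classification.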

Methodology Web Classifier

Rapid URL search

Hit text search

Example URL : www.robtex.com/dns/32.net.ru.html
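A rapid URL search could look roughly like the sketch below, which checks whether any rule keyword occurs in the host or path of a hit URL. The rule table URL_RULES and the function tag_from_url are hypothetical; the tags they produce for the example URLs come from this made-up table, not from the slides.

```python
from urllib.parse import urlparse

# Hypothetical keyword -> tag rules; real rules would come from rule generation.
URL_RULES = {
    "forum": "forum",
    "proxy": "proxy",
    "dns": "DNS server",
    "mail": "mail server",
}

def tag_from_url(url):
    """Return the tags whose keyword occurs in the URL's host or path."""
    if "://" not in url:
        url = "http://" + url  # urlparse only finds the host when a scheme is present
    parsed = urlparse(url)
    text = (parsed.netloc + parsed.path).lower()
    # Plain substring matching: simple, but it can produce false positives.
    return sorted(tag for keyword, tag in URL_RULES.items() if keyword in text)

print(tag_from_url("www.robtex.com/dns/32.net.ru.html"))
print(tag_from_url("inforum.insite.com"))
```

Substring matching is chosen here so that a host such as inforum.insite.com still matches the keyword forum; a stricter token match would miss it.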

Methodology IP tagging

URL based tagging

General hit text based tagging

Hit text based tagging for forums:

Post date and username in the vicinity of the IP address => forum user

Presence of one of the keywords http:\, ftp:\, ppstream:\, mms:\ => http share, ftp share, streaming node
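The two forum heuristics above can be sketched as follows. The protocol keywords and tag names are taken from the slide, but the assignment of both ppstream:\ and mms:\ to "streaming node", the regular expressions, the 80-character window, and the sample hit text are assumptions for illustration only.

```python
import re

# Keyword -> tag mapping; mapping both streaming keywords to one tag is an assumption.
PROTOCOL_TAGS = {
    "http:\\": "http share",
    "ftp:\\": "ftp share",
    "ppstream:\\": "streaming node",
    "mms:\\": "streaming node",
}

# Hypothetical patterns for a post date and a username marker.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
USER_RE = re.compile(r"\b(?:posted by|user(?:name)?)\b[:\s]+\S+", re.IGNORECASE)

def tag_forum_hit(hit_text, ip, window=80):
    """Tag an IP found in forum hit text using only the text near the address."""
    pos = hit_text.find(ip)
    if pos < 0:
        return set()
    nearby = hit_text[max(0, pos - window): pos + len(ip) + window]
    tags = set()
    if DATE_RE.search(nearby) and USER_RE.search(nearby):
        tags.add("forum user")
    lowered = nearby.lower()
    for keyword, tag in PROTOCOL_TAGS.items():
        if keyword in lowered:
            tags.add(tag)
    return tags

# Made-up hit text; only the IP address itself appears in the slides.
hit = r"2008-03-14 posted by alice (61.172.249.13): see ftp:\61.172.249.13\movies"
print(tag_forum_hit(hit, "61.172.249.13"))
```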

Methodology Examples

200.101.18.182 - inforum.insite.com: URL based tagging

61.172.249.13 - ttzai.com: Hit text based tagging for forums

Information comes from: Web logs, proxy logs, forums, malicious lists, server lists, P2P communication

Evaluation

When no traces are available.

When packet-level traces are available.

When sampled traces are available.

When No Traces are Available

Apply the unconstrained endpoint approach to a subset of the IP range belonging to the four ISPs shown in the table above.

Correlation with operational traces.

Correlation with other sources.

The unconstrained endpoint profiling approach can be effectively used to estimate application popularity trends.

When Packet-Level Traces are Available

BLINC

Off-line tool

Cannot classify, particularly at the application level

Variable quality results for different traces

UEP

Superior classification results

Efficiently operates online

When Packet-Level Traces are Available

Collect the most popular 5% of IP addresses and tag them by applying the methodology.

Use this information to classify the traffic flows.
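The two steps above can be sketched as a lookup against a table of tagged endpoints. This is a minimal sketch under stated assumptions: popularity is taken to mean the number of flows an address takes part in, and the names top_fraction, classify_flow, the flows, and the tag table are all hypothetical.

```python
from collections import Counter

def top_fraction(popularity, fraction=0.05):
    """Return the most popular endpoints (at least one), ranked by count."""
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]

def classify_flow(flow, endpoint_tags):
    """Label a (src_ip, dst_ip) flow with the tag of whichever endpoint is known."""
    src, dst = flow
    return endpoint_tags.get(dst) or endpoint_tags.get(src) or "unknown"

# Made-up flows using documentation-only address ranges.
flows = [
    ("10.0.0.1", "192.0.2.10"),
    ("10.0.0.2", "192.0.2.10"),
    ("10.0.0.3", "198.51.100.7"),
]
popularity = Counter(ip for flow in flows for ip in flow)
popular = top_fraction(popularity, 0.05)

# Hypothetical result of running the tagging methodology on the popular endpoints.
endpoint_tags = {"192.0.2.10": "web server"}
for flow in flows:
    print(flow, classify_flow(flow, endpoint_tags))
```

Because classification is a dictionary lookup per flow, a scheme like this is cheap enough to run as flows arrive.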


When Sampled Traces are Available

Due to sampling, an insufficient amount of data remains in the trace, and hence the graphlet approach used by BLINC simply does not work.

Popular endpoints are still present in the trace, despite sampling.

The endpoint approach remains largely unaffected by sampling.
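A back-of-the-envelope calculation shows why popular endpoints survive sampling. Assuming each packet is sampled independently with probability p (an assumption, as the slides do not state the sampling scheme), an endpoint involved in n packets appears in the trace with probability 1 - (1 - p)^n. The 1-in-1000 rate and the packet counts below are illustrative values only.

```python
def survival_probability(packets, sampling_rate):
    """Probability that at least one of `packets` packets is sampled."""
    return 1.0 - (1.0 - sampling_rate) ** packets

# Compare a rarely seen endpoint with increasingly popular ones.
for packets in (10, 1_000, 100_000):
    print(packets, round(survival_probability(packets, 1 / 1000), 4))
```

Under this model an endpoint with only a handful of packets is almost always lost, while a heavily used endpoint is almost certain to remain visible.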

Endpoint Profiling Endpoint Clustering

Clustering has been employed in networking before: the Autoclass algorithm.

A set of tagged IP addresses from each region’s network is the input to the endpoint clustering algorithm.
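To make the input and output of the clustering step concrete, the sketch below groups endpoints whose tag sets are identical. This exact-match grouping is a deliberately simplified stand-in and is not the Autoclass algorithm; the function name group_by_tag_profile, the addresses, and the tags are made up for illustration.

```python
from collections import defaultdict

def group_by_tag_profile(tagged_ips):
    """Group IP addresses with identical tag sets, largest group first."""
    groups = defaultdict(list)
    for ip, tags in tagged_ips.items():
        groups[frozenset(tags)].append(ip)
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)

# Made-up tagged endpoints standing in for one region's network.
tagged = {
    "192.0.2.1": {"browsing"},
    "192.0.2.2": {"browsing", "chat"},
    "192.0.2.3": {"browsing", "chat"},
    "192.0.2.4": {"browsing", "mail"},
}
for profile, ips in group_by_tag_profile(tagged):
    print(sorted(profile), ips)
```

A real clustering algorithm would also merge endpoints with similar but not identical profiles, which this stand-in cannot do.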

Endpoint Profiling

Browsing, alone or combined with chat or mail, seems to be the most common behavior.

Endpoint Profiling Traffic Locality

Conclusion

UEP

Accurately predicts application and protocol usage trends when no network traces are available.

Dramatically outperforms BLINC when packet traces are available.

Retains high classification capabilities when flow-level traces are available.

Profiles endpoints residing in four different world regions:

Network applications and protocols used in these regions.

Characteristics of endpoint classes that share similar access patterns.

Clients’ locality properties.
