Platforms and Techniques for Online Traffic Identification
Cost-TMA Samos Meeting, 22-23 September 2008
COMICS (COMputer for Interaction and CommunicationS) Research Group – DIS, University of Napoli Federico II
Alberto Dainotti
[email protected]@unina.it
COMICS Research Group
University of Napoli “Federico II”
People@COMICS
• COMICS (COMputers for Interaction and CommunicationS)
• Around 30 people in the group
• 2 laboratories:
  • UoN/DIS
    • @ University of Napoli
  • CINI/ITEM
    • a research lab of the Italian University Consortium in Computer Science & Engineering
• Funding mainly from EU and industry, with some money from national and local government
Projects@COMICS
• EU Projects
  • OneLab
  • OneLab2
  • NetQoS
  • Content
  • Cost TMA
  • Intersection
• National
  • Recipe (Robust and Efficient traffic Classification in IP nEtworks)
Research@COMICS
• Research areas:
  • Traffic Measurements and Analysis
  • Network Monitoring
  • QoS in heterogeneous networks
  • Traffic Engineering
  • Wireless Mesh Networks
  • Management and control of network infrastructures
    • SLA, SLS, Policy based management
  • Security, Reliability and Resilience
  • Multimedia services engineering
  • …
Network Monitoring and Measurements
[Diagram labels: Links, Topologies, Applications, Traffic]
• Topology Discovery
  • Intradomain, IP level, Active/Passive, Distributed, …
• Active and passive measurements
  • Available Bandwidth
  • QoS parameters
• Traffic Characterization (novel applications and malware traffic), Traffic Modeling, Traffic Generation, Traffic Classification
• Anomaly and Worm detection
• New trends: Network neutrality, Network forensics
• …
http://www.grid.unina.it/Traffic
TIE: Traffic Identification Engine
• TIE is a community-oriented project for traffic classification
• Public web site and first beta announced today @ Cost-TMA meeting!
  http://tie.comics.unina.it
• Project started in 2007. Collaborations too!
TIE: Traffic Identification Engine
• An open-source software platform working as a multiple classifier system
• Purpose: to allow the community to work with shared tools and data to investigate several aspects of traffic classification
• Offline, Online, historical web reports
• Easy to add: classification techniques, classification features, combination strategies
• Programming API and documentation. Users mailing list.
• Anonymized traces with ground-truth (GT) data
• Code to the data
TIE’s Components
[Diagram: Packet Filter, Session Builder, Feature Extractor, Decision Combiner, Classification Plugins #1 … #n, Output]
• Well defined portions of code allow easy modifications and extensions
• Processing revolves around a sessions table. Each session structure in the table contains
  • Status Information
  • Flags
  • Counters
  • Features
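A minimal sketch of what one entry of such a sessions table might look like, grouping the four kinds of content listed above. All type, field, and flag names here are hypothetical and chosen for illustration; the slides do not show TIE's actual session structure.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; the names are illustrative only. */
#define SESS_CLASSIFIED 0x01u   /* a final decision has been taken */
#define SESS_EXPIRED    0x02u   /* the session timed out */

#define SESS_PS_MAX 4           /* how many packet sizes this sketch keeps */

typedef struct {
    /* Status information */
    double   ts_start;          /* timestamp of the first packet (seconds) */
    double   ts_last;           /* timestamp of the most recent packet */
    uint16_t app_id;            /* decided application id, 0 = unknown */
    uint8_t  confidence;        /* 0..100 */
    /* Flags */
    uint32_t flags;
    /* Counters */
    uint64_t pkts;
    uint64_t bytes;
    /* Features */
    uint16_t ps[SESS_PS_MAX];   /* sizes of the first packets */
    uint8_t  ps_len;            /* number of valid entries in ps[] */
} session_entry;

/* Reset an entry when the first packet of a new session arrives. */
void session_init(session_entry *s, double now) {
    memset(s, 0, sizeof *s);
    s->ts_start = now;
    s->ts_last = now;
}

/* Account for one more packet belonging to the session. */
void session_update(session_entry *s, double now, uint16_t pkt_len) {
    s->ts_last = now;
    s->pkts++;
    s->bytes += pkt_len;
    if (s->ps_len < SESS_PS_MAX)
        s->ps[s->ps_len++] = pkt_len;
}
```

Keeping status, flags, counters, and features in one record means every later stage can work from a single table lookup per packet.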
TIE’s Components: Packet Filter
• Based on the pcap* library
• Input can be either live traffic or a traffic trace
• Can perform packet filtering and validation (e.g. checksums)
*http://www.tcpdump.org
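As an illustration of the kind of validation mentioned above, the following sketch checks the IPv4 header checksum of a captured packet using the standard one's-complement sum (RFC 1071). It is not TIE's code: the function names are hypothetical, and the buffer is assumed to start at the IPv4 header (link-layer header already skipped).

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over big-endian 16-bit words (RFC 1071). */
static uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)                        /* odd trailing byte, zero-padded */
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                   /* fold carries back into 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns 1 if the buffer holds a plausible IPv4 header with a correct
 * checksum, 0 otherwise. caplen is the number of captured bytes. */
int ipv4_header_ok(const uint8_t *hdr, size_t caplen) {
    size_t ihl;
    if (caplen < 20)
        return 0;                       /* too short for a minimal header */
    if ((hdr[0] >> 4) != 4)
        return 0;                       /* not IP version 4 */
    ihl = (size_t)(hdr[0] & 0x0F) * 4;  /* header length in bytes */
    if (ihl < 20 || ihl > caplen)
        return 0;
    /* Summing the header including its checksum field must yield 0. */
    return inet_checksum(hdr, ihl) == 0;
}
```

A filter stage could call `ipv4_header_ok` on each packet and drop those that fail before they reach the session builder.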
TIE’s Components: Session Builder
• Different definitions of sessions are allowed
  • Flows
    • <L4Proto, IPsrc, Portsrc, IPdst, Portdst> + timeout
  • Biflows
    • Same as above but src and dst swappable
    • Some heuristics for TCP can be used
  • Host
    • Under development
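The difference between the flow and biflow definitions can be sketched as two ways of building a lookup key from the same 5-tuple. The code below is an illustrative assumption, not TIE's implementation: the `flow_key` type and the function names are hypothetical, and addresses are plain 32-bit integers for simplicity.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  l4proto;
    uint32_t ip_a, ip_b;        /* endpoint addresses */
    uint16_t port_a, port_b;    /* endpoint ports */
} flow_key;

/* Flow key: direction matters, so (src, dst) is stored as given. */
flow_key make_flow_key(uint8_t proto, uint32_t src, uint16_t sport,
                       uint32_t dst, uint16_t dport) {
    flow_key k;
    memset(&k, 0, sizeof k);
    k.l4proto = proto;
    k.ip_a = src;  k.port_a = sport;
    k.ip_b = dst;  k.port_b = dport;
    return k;
}

/* Biflow key: the endpoints are put in a canonical order, so packets of
 * both directions map to the same key ("src and dst swappable"). */
flow_key make_biflow_key(uint8_t proto, uint32_t src, uint16_t sport,
                         uint32_t dst, uint16_t dport) {
    if (src > dst || (src == dst && sport > dport))
        return make_flow_key(proto, dst, dport, src, sport);
    return make_flow_key(proto, src, sport, dst, dport);
}

/* Field-by-field comparison (avoids depending on struct padding). */
int flow_key_equal(const flow_key *x, const flow_key *y) {
    return x->l4proto == y->l4proto &&
           x->ip_a == y->ip_a && x->ip_b == y->ip_b &&
           x->port_a == y->port_a && x->port_b == y->port_b;
}

/* A session ends when no packet has been seen for `timeout` seconds. */
int session_expired(double last_seen, double now, double timeout) {
    return (now - last_seen) > timeout;
}
```

With the flow key, a request and its reply land in two different sessions; with the biflow key they share one. The TCP heuristics mentioned above are not modelled here.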
TIE’s Components: Feature Extractor
• Features
  • Portions of payload
  • Pkt/byte count
  • PS vector
  • IPT vector
  • …
• Features can be enabled/disabled at compile time
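One common way to make features selectable at compile time is conditional compilation. The sketch below assumes that PS stands for packet (payload) size and IPT for inter-packet time; the macro names, the `features` struct, and the vector length are hypothetical and not taken from TIE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compile-time switches: set one to 0 to compile it out. */
#define FEAT_PKT_COUNT  1
#define FEAT_PS_VECTOR  1
#define FEAT_IPT_VECTOR 1
#define FEAT_VEC_LEN    8       /* how many leading packets are recorded */

typedef struct {
#if FEAT_PKT_COUNT
    uint64_t pkts;
    uint64_t bytes;
#endif
#if FEAT_PS_VECTOR
    uint16_t ps[FEAT_VEC_LEN];  /* payload sizes of the first packets */
    size_t   ps_n;
#endif
#if FEAT_IPT_VECTOR
    double   ipt[FEAT_VEC_LEN]; /* gaps between consecutive packets (s) */
    size_t   ipt_n;
    double   last_ts;
    int      seen_first;
#endif
    int      reserved;          /* keeps the struct non-empty */
} features;

/* Update only the features that were compiled in. */
void features_update(features *f, double ts, uint16_t payload_len) {
#if FEAT_PKT_COUNT
    f->pkts++;
    f->bytes += payload_len;
#endif
#if FEAT_PS_VECTOR
    if (f->ps_n < FEAT_VEC_LEN)
        f->ps[f->ps_n++] = payload_len;
#endif
#if FEAT_IPT_VECTOR
    if (f->seen_first && f->ipt_n < FEAT_VEC_LEN)
        f->ipt[f->ipt_n++] = ts - f->last_ts;
    f->last_ts = ts;
    f->seen_first = 1;
#endif
    (void)f; (void)ts; (void)payload_len;   /* silence unused warnings */
}
```

Disabled features then cost neither memory in the session record nor time per packet, which matters for online operation.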
TIE’s Components: Classification Plugins (1/2)
• Each plugin implements a specific classification technique
• It operates on a session
• It returns a result that includes a confidence value
• “dummy” plugin source available
typedef struct {
    int (* disable)();
    int (* enable)();
    int (* load_signatures)(char *);
    int (* train)(char *);
    class_output* (* classify_session)(void *session);
    int (* dump_statistics)(FILE *);
    bool (* is_session_classifiable)(void *session);
    int (* session_sign)(session *, class_output *);
    char *name;
    u_int32_t *flags;
} classifier;
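To show how a plugin might fill in this interface, here is a hedged sketch of a toy port-based classifier. The `session` and `class_output` definitions, the application ids, and the confidence values are assumptions made for the example (the slides do not show them), `u_int32_t` is spelled `uint32_t`, and parameters are renamed `sess`. Callbacks the toy plugin does not need are left as NULL.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for types not shown on the slides. */
typedef struct { uint16_t dst_port; uint8_t l4proto; } session;
typedef struct { int app_id; int confidence; } class_output;

/* Plugin interface, mirroring the struct on the slide. */
typedef struct {
    int (*disable)(void);
    int (*enable)(void);
    int (*load_signatures)(char *);
    int (*train)(char *);
    class_output *(*classify_session)(void *sess);
    int (*dump_statistics)(FILE *);
    bool (*is_session_classifiable)(void *sess);
    int (*session_sign)(session *, class_output *);
    char *name;
    uint32_t *flags;
} classifier;

static uint32_t port_flags = 0;         /* bit 0: plugin enabled */
static char port_name[] = "port_sketch";
static class_output port_result;        /* static: sketch is not reentrant */

static int port_enable(void)  { port_flags |= 1u;  return 0; }
static int port_disable(void) { port_flags &= ~1u; return 0; }

/* Ports are known from the first packet, so any session qualifies. */
static bool port_is_classifiable(void *sess) {
    (void)sess;
    return true;
}

/* Map a few well-known ports to made-up application ids. */
static class_output *port_classify(void *sess) {
    const session *s = sess;
    port_result.app_id = 0;             /* 0 = unknown */
    port_result.confidence = 0;
    switch (s->dst_port) {
    case 80: port_result.app_id = 1; port_result.confidence = 50; break;
    case 53: port_result.app_id = 2; port_result.confidence = 50; break;
    default: break;
    }
    return &port_result;
}

classifier port_plugin = {
    .disable = port_disable,
    .enable = port_enable,
    .classify_session = port_classify,
    .is_session_classifiable = port_is_classifiable,
    .name = port_name,
    .flags = &port_flags
};
```

A caller would first ask `is_session_classifiable` and only then invoke `classify_session`, reading the application id and confidence from the returned `class_output`.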
TIE’s Components: Classification Plugins (2/2)
Name     Based on                          Status         Contributor
Port     L4 Ports                          Available      UNINA (signatures from CAIDA)
L7       Deep Payload Inspection           Available      UNINA (signatures/code from Linux L7-filter)
NBC      Lightweight Payload Inspection    Under test     UNINA
GMM-PS   Statistical Approach: PS          Under test     UNINA
HMM      Statistical Approach*: PS, IPT    Under test     UNINA
FPT      Statistical Approach**: PS, IPT   Under devel.   UNIBS
Joint    Machine Learning                  Under devel.   UNINA-CAIDA
???      ???                               ???            YOU ?
*A. Dainotti, W. de Donato, A. Pescapè, P. Salvo Rossi, "Classification of Network Traffic via Packet-Level Hidden Markov Models", IEEE GLOBECOM 2008
**M. Crotti, F. Gringoli, P. Pelosato, L. Salgarelli, "A Statistical Approach to IP-level classification of network traffic", IEEE ICC 2006
TIE’s Components: Decision Combiner
• The decision combiner determines the combination strategy
  • Which classifiers are invoked
  • When classifiers are invoked
  • How different results are combined to output a final decision and confidence value
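As one possible example of the "how results are combined" step, the sketch below implements a confidence-weighted vote: each classifier's confidence is added to the score of the application it chose, the highest score wins, and the final confidence is the winner's share of the total. This is an illustrative strategy, not necessarily one that TIE ships; the `verdict` type and `MAX_APPS` bound are hypothetical.

```c
#include <stddef.h>

#define MAX_APPS 16     /* hypothetical upper bound on application ids */

typedef struct { int app_id; int confidence; } verdict;

/* Combine n per-classifier verdicts into one final verdict.
 * app_id 0 means "unknown" and is treated as an abstention. */
verdict combine_weighted_vote(const verdict *v, size_t n) {
    int score[MAX_APPS] = {0};
    int total = 0;
    verdict out = {0, 0};
    size_t i;
    int a;

    for (i = 0; i < n; i++) {
        if (v[i].app_id <= 0 || v[i].app_id >= MAX_APPS)
            continue;                   /* abstention or out of range */
        if (v[i].confidence <= 0)
            continue;
        score[v[i].app_id] += v[i].confidence;
        total += v[i].confidence;
    }
    for (a = 1; a < MAX_APPS; a++)      /* ties go to the lowest id */
        if (score[a] > score[out.app_id])
            out.app_id = a;
    if (total > 0)
        out.confidence = 100 * score[out.app_id] / total;
    return out;
}
```

Other strategies, such as trusting classifiers in a fixed priority order or invoking expensive ones only when cheap ones disagree, would change which and when classifiers run, not just how their outputs are merged.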
TIE’s Components: Output
• There is a single output format; its semantics change depending on session type (flow, biflow) and working mode (offline, realtime, cyclic, …)
• Some Perl scripts help in processing the output (e.g. overall stats, confusion matrix, …)
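The project's post-processing scripts are written in Perl and are not shown in the slides. Purely as an illustration of what a confusion matrix over classifier output and ground-truth labels involves, here is a small C sketch; the class count and function names are hypothetical.

```c
#include <stddef.h>

#define N_CLASSES 3     /* hypothetical number of application classes */

/* m[t][p] counts sessions whose true class is t and predicted class is p. */
void confusion_matrix(const int *truth, const int *pred, size_t n,
                      unsigned m[N_CLASSES][N_CLASSES]) {
    size_t k;
    int i, j;
    for (i = 0; i < N_CLASSES; i++)
        for (j = 0; j < N_CLASSES; j++)
            m[i][j] = 0;
    for (k = 0; k < n; k++) {
        if (truth[k] < 0 || truth[k] >= N_CLASSES) continue;
        if (pred[k]  < 0 || pred[k]  >= N_CLASSES) continue;
        m[truth[k]][pred[k]]++;
    }
}

/* Overall accuracy: fraction of counted sessions on the diagonal. */
double overall_accuracy(unsigned m[N_CLASSES][N_CLASSES]) {
    unsigned correct = 0, total = 0;
    int i, j;
    for (i = 0; i < N_CLASSES; i++)
        for (j = 0; j < N_CLASSES; j++) {
            total += m[i][j];
            if (i == j)
                correct += m[i][j];
        }
    return total ? (double)correct / (double)total : 0.0;
}
```

Off-diagonal cells show which applications a classifier tends to confuse with one another, which a single accuracy figure hides.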
Screenshots
[Five screenshot slides; images not reproduced]
THANKS
For Cost-TMA activities write us @
Antonio Pescapè
http://www.grid.unina.it/Traffic