1
Network-Level Spam Detection
Nick Feamster
Georgia Tech
2
Spam: More than Just a Nuisance
• 95% of all email traffic
– Image and PDF spam (PDF spam ~12%)
• As of August 2007, one in every 87 emails constituted a phishing attack
• Targeted attacks on the rise
– 20k-30k unique phishing attacks per month
Source: CNET (January 2008), APWG
3
Detection
• Keep unwanted traffic from reaching a user’s inbox by distinguishing spam from ham
• Question: What features best differentiate spam from legitimate mail?
– Content-based filtering: What is in the mail?
– IP address of sender: Who is the sender?
– Behavioral features: How is the mail sent?
4
Content-Based Detection: Problems
• Low cost of evasion: Spammers can easily alter features of an email’s content
• Customized emails are easy to generate: Content-based filters need fuzzy hashes over content, etc.
• High cost to filter maintainers: Filters must be continually updated as content-changing techniques become more sophisticated
5
Another Approach: IP Addresses
• Problem: IP addresses are ephemeral
• Every day, 10% of senders are from previously unseen IP addresses
• Possible causes
– Dynamic addressing
– New infections
6
Idea: Network-Based Detection
• Filter email based on how it is sent, in addition to what is sent
• Network-level properties are less malleable
– Hosting or upstream ISP (AS number)
– Membership in a botnet (spammer, hosting infrastructure)
– Network location of sender and receiver
– Set of target recipients
7
Behavioral Blacklisting
• Idea: Blacklist sending behavior (“Behavioral Blacklisting”)
– Identify sending patterns commonly used by spammers
• Intuition: Much more difficult for a spammer to change the technique by which mail is sent than it is to change the content
8
Improving Classification
• Lower overhead
• Faster detection
• Better robustness (i.e., to evasion, dynamism)
• Use additional features and combine for more robust classification
– Temporal: interarrival times, diurnal patterns
– Spatial: sending patterns of groups of senders
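The temporal features named on this slide could be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function names, the use of Unix-style timestamps in seconds, and the choice of mean and standard deviation as summary statistics are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def interarrival_times(timestamps):
    """Return the gaps (in seconds) between consecutive messages from one sender."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def diurnal_histogram(timestamps):
    """Count messages per hour of day (0-23), treating timestamps as UTC seconds."""
    counts = Counter(int(ts // 3600) % 24 for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical send times (seconds since midnight UTC) for a single sender.
send_times = [3600, 3605, 3611, 7200, 7202]
gaps = interarrival_times(send_times)
temporal_features = {
    "mean_gap": mean(gaps),        # assumed summary statistic
    "gap_stddev": pstdev(gaps),    # assumed summary statistic
    "hourly_counts": diurnal_histogram(send_times),
}
```

Short, bursty gaps and a flat hourly histogram are the kind of pattern such features are meant to expose; which statistics actually separate spammers from legitimate senders is an empirical question the slides leave to the evaluation.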
9
SNARE: Automated Sender Reputation
• Goal: Sender reputation from a single packet? (or at least as little information as possible)
– Lower overhead
– Faster classification
– Less malleable
• Key challenge
– What features satisfy these properties and can distinguish spammers from legitimate senders?
10
Sender-Receiver Geodesic Distance
90% of legitimate messages travel 2,200 miles or less
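As a rough illustration of this feature, the distance between sender and receiver can be approximated with the haversine (great-circle) formula once both endpoints have been geolocated. This is a sketch under stated assumptions: the slides do not say how coordinates are obtained or which distance formula is used, so the function names, the spherical-Earth approximation, and the example coordinates are illustrative only. The 2,200-mile threshold is the figure quoted on the slide.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; a spherical approximation


def geodesic_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_typical_range(lat1, lon1, lat2, lon2, threshold_miles=2200):
    """True if the sender-receiver distance is at or below the slide's 2,200-mile figure."""
    return geodesic_miles(lat1, lon1, lat2, lon2) <= threshold_miles


# Hypothetical endpoints: Atlanta to New York, and Atlanta to Beijing.
atlanta = (33.749, -84.388)
new_york = (40.7128, -74.0060)
beijing = (39.9042, 116.4074)
short_hop = within_typical_range(*atlanta, *new_york)
long_hop = within_typical_range(*atlanta, *beijing)
```

A distance above the threshold is not proof of spam; it is one input to be combined with the other features.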
11
Density of Senders in IP Space
For spammers, k nearest senders are much closer in IP space
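One way to turn this observation into a feature is to treat IPv4 addresses as 32-bit integers and average the numeric distance to the k nearest other senders. The sketch below assumes that interpretation of "distance in IP space"; the function names, the default k, and the sample addresses are hypothetical.

```python
import ipaddress


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def mean_knn_distance(ip, other_senders, k=3):
    """Average numeric distance from `ip` to its k nearest other senders."""
    target = ip_to_int(ip)
    distances = sorted(
        abs(target - ip_to_int(other)) for other in other_senders if other != ip
    )
    nearest = distances[:k]
    if not nearest:
        raise ValueError("need at least one other sender")
    return sum(nearest) / len(nearest)


# Hypothetical senders seen in the same time window.
senders = ["10.0.0.1", "10.0.0.2", "10.0.0.9", "192.168.1.1"]
density_feature = mean_knn_distance("10.0.0.5", senders, k=2)
```

A small value means the sender sits in a densely populated stretch of address space, which the slide associates with spammers.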
12
Other Network-Level Features
• Time-of-day at sender
• Upstream AS of sender
• Message size (and variance)
• Number of recipients (and variance)
13
Combining Features
• Put features into the RuleFit classifier
• 10-fold cross validation on one day of query logs from a large spam filtering appliance provider
• Using only network-level features
• Completely automated
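The 10-fold cross-validation procedure mentioned above can be sketched in plain Python. This sketch does not implement RuleFit: the classifier is passed in as a pair of hypothetical `fit` and `predict` callables, and a trivial majority-class model stands in for it so that the example is self-contained.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validate(features, labels, fit, predict, k=10):
    """Return the mean held-out accuracy over k folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Placeholder classifier (NOT RuleFit): always predicts the majority training label.
def fit_majority(train_features, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, feature_vector):
    return model
```

Each sample is held out exactly once, so the reported accuracy never comes from data the model was trained on.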
14
Cluster-Based Features
• Construct a behavioral fingerprint for each sender
• Cluster senders with similar fingerprints
• Filter new senders that map to existing clusters
15
Identifying Invariants
[Figure: A known spammer at IP address 76.17.114.xxx sends spam to domain1.com, domain2.com, and domain3.com, which defines its behavioral fingerprint. After a DHCP reassignment or a new infection, an unknown sender at IP address 24.99.146.xxx sends spam to the same three domains. Clustering on sending behavior shows that the two senders have a similar fingerprint.]
16
Building the Classifier: Clustering
• Feature: Distribution of email sending volumes across recipient domains
• Clustering Approach
– Build initial seed list of bad IP addresses
– For each IP address, compute feature vector: volume per domain per time interval
– Collapse into a single IP x domain matrix
– Compute clusters
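The feature-vector and matrix steps above might look like the following. This is a sketch under assumptions: the slide's matrix definition did not survive extraction, so collapsing the time dimension by summing counts is a guess, and the record layout `(ip, domain, interval, count)` and all names are hypothetical.

```python
from collections import defaultdict


def build_ip_domain_matrix(records):
    """Collapse (ip, domain, interval, count) records into an IP x domain matrix.

    Assumption: the time dimension is collapsed by summing counts per (ip, domain).
    """
    volumes = defaultdict(lambda: defaultdict(int))
    domains = set()
    for ip, domain, _interval, count in records:
        volumes[ip][domain] += count
        domains.add(domain)
    ips = sorted(volumes)
    columns = sorted(domains)
    matrix = [[volumes[ip].get(domain, 0) for domain in columns] for ip in ips]
    return ips, columns, matrix


# Hypothetical sending records for two seed IP addresses.
records = [
    ("192.0.2.1", "domain1.com", 0, 5),
    ("192.0.2.1", "domain1.com", 1, 3),
    ("192.0.2.1", "domain2.com", 0, 2),
    ("198.51.100.7", "domain2.com", 1, 4),
]
ips, columns, matrix = build_ip_domain_matrix(records)
```

Each row of `matrix` is one sender's volume distribution across recipient domains, which is the input the clustering step would consume.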
17
Clustering: Fingerprint
• For each cluster, compute fingerprint vector:
• New IPs will be compared to this “fingerprint”
IP x IP Matrix: Intensity indicates pairwise similarity
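A cluster fingerprint and the comparison of a new IP against it could be sketched as below. The slide's fingerprint formula is missing from this transcript, so everything here is an assumption: the fingerprint is taken to be the average of the cluster members' normalized sending vectors, and cosine similarity is swapped in as the comparison. All function names are hypothetical.

```python
import math


def normalize(row):
    """Scale a volume vector so its entries sum to 1 (all-zero rows are left as is)."""
    total = sum(row)
    return [value / total for value in row] if total else list(row)


def cluster_fingerprint(rows):
    """Assumed fingerprint: element-wise mean of the members' normalized vectors."""
    normalized = [normalize(row) for row in rows]
    return [sum(column) / len(normalized) for column in zip(*normalized)]


def similarity(vector, fingerprint):
    """Cosine similarity between a sender's vector and a fingerprint (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(vector, fingerprint))
    norm = (math.sqrt(sum(a * a for a in vector))
            * math.sqrt(sum(b * b for b in fingerprint)))
    return dot / norm if norm else 0.0


# Hypothetical cluster of two senders over three recipient domains.
fingerprint = cluster_fingerprint([[8, 2, 0], [4, 1, 0]])
matching_score = similarity([16, 4, 0], fingerprint)   # same proportions as the cluster
unrelated_score = similarity([0, 0, 5], fingerprint)   # sends only to an unseen domain
```

A new sender whose score against some cluster exceeds a chosen threshold would be treated like that cluster's members; how the threshold is set is not specified in the slides.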
18
Evaluation
• Emulate the performance of a system that could observe sending patterns across many domains
– Build clusters/train on given time interval
• Evaluate classification
– Relative to labeled logs
– Relative to IP addresses that were eventually listed
19
Early Detection Results
• Compare SpamTracker scores on “accepted” mail to the Spamhaus database
– About 15% of accepted mail was later determined to be spam
– Can SpamTracker catch this?
• Of 620 emails that were accepted, but sent from IPs that were blacklisted within one month
– 65 emails had a score larger than 5 (85th percentile)
20
Small Samples Work Well
Relatively small samples can achieve low false positive rates
21
Extensions to Phishing
• Goal: Detect phishing attacks based on behavioral properties of hosting site (vs. static properties of URL)
• Features
– URL regular expressions
– Registration time of domain
– Uptime of hosting site
– DNS TTL and redirections
• Next time: Discussion of phishing detection/integration
22
Integration with SMITE
• Sensors
– Extract network features from traffic
– IP addresses
– Combine with auxiliary data (routing, time, etc.)
• Algorithms
– Clustering algorithm to identify behavioral fingerprints
– Learning algorithm to classify based on multiple features
• Correlation
– Clusters formed by aggregating sending behavior observed across multiple sensors
– Various features also require input from data collected across collections of IP addresses
23
Summary
• Spam increasing, spammers becoming agile
– Content filters are falling behind
– IP-based blacklists are evadable
• Up to 30% of spam not listed in common blacklists at receipt. ~20% remains unlisted after a month
• Complementary approach: behavioral blacklisting based on network-level features
– Blacklist based on how messages are sent
– SNARE: Automated sender reputation
• ~90% accuracy of existing blacklists with lightweight features
– Cluster-based features to improve accuracy/reduce need for labelled data
24
26
Improvements
• Accuracy
– Synthesizing multiple classifiers
– Incorporating user feedback
– Learning algorithms with bounded false positives
• Performance
– Caching/Sharing
– Streaming
• Security
– Learning in adversarial environments
27
Sampling: Training Time
28
Dynamism: Accuracy over Time