Collaborative Center for Internet Epidemiology and Defenses (CCIED)
Stefan Savage
Department of Computer Science & Engineering
University of California, San Diego
Context: threat transformation
• Traditional threats
  – Attacker manually targets high-value system/resource
  – Defender increases cost to compromise high-value systems
  – Biggest threat: insider attacker
• Modern threats
  – Attacker uses automation to attack many resources at once (filter later)
  – Defender must defend all systems at once
  – Biggest threat: software bugs and naïve users
Technical enablers
• Wide-open communications architecture
  – IP model: anyone can send anything to anyone
  – Federated management, minimal authentication
• Vulnerable computing platforms
  – One software bug -> millions of compromised hosts
  – Naïve users -> don’t even need software bugs
• Lack of meaningful deterrence
  – Little forensic attribution/audit capability
  – Inefficient investigatory mechanisms/prosecutorial incentives
Bigger problem: Economic Drivers
• In the last six years, emergence of profit-making malware
  – Anti-spam efforts force spammers to launder e-mail through compromised machines (starts with MyDoom.A, SoBig)
  – “Virtuous” economic cycle transforms nature of threat
• Commoditization of compromised hosts
  – Fluid third-party exchange market (millions of hosts)
    • Raw bots (range from pennies to dollars)
    • Value-added tier: SPAM proxying (more expensive)
• Innovation in both host substrate and its uses
  – Sophisticated infection and command/control networks: platform
  – SPAM, piracy, phishing, identity theft, DDoS are all applications
DDoS for sale
• Emergence of economic engine for Internet crime
  – SPAM, phishing, spyware, etc.
• Fluid third-party markets for illicit digital goods/services
  – Bots ~$0.5/host, special orders, value-added tiers
  – Cards, malware, exploits, DDoS, cashout, etc.
Botnet Spammer Rental Rates
September 2004 postings to SpecialHam.com, Spamforum.biz

>20-30k always online SOCKs4, url is de-duped and updated
>every 10 minutes. 900/weekly, Samples will be sent on
>request. Monthly payments arranged at discount prices.
• 3.6 cents per bot week

>$350.00/weekly - $1,000/monthly (USD)
>Type of service: Exclusive (One slot only)
>Always Online: 5,000 - 6,000
>Updated every: 10 minutes
• 6 cents per bot week

>$220.00/weekly - $800.00/monthly (USD)
>Type of service: Shared (4 slots)
>Always Online: 9,000 - 10,000
>Updated every: 5 minutes
• 2.5 cents per bot week
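The per-bot prices quoted on this slide follow from dividing each weekly rate by the advertised number of always-online proxies. A minimal sketch of that arithmetic, assuming the midpoint of each advertised range is a fair estimate of fleet size (the function name is illustrative and not taken from the postings):

```python
def cents_per_bot_week(weekly_usd, min_online, max_online):
    """Weekly rental price per bot, in US cents.

    Assumption: the midpoint of the advertised always-online range
    stands in for the actual number of bots.
    """
    midpoint = (min_online + max_online) / 2
    return 100 * weekly_usd / midpoint


# (weekly price in USD, advertised always-online range) from the postings
offers = [(900, 20_000, 30_000), (350, 5_000, 6_000), (220, 9_000, 10_000)]
for price, low, high in offers:
    print(f"${price}/week for {low}-{high} bots: "
          f"{cents_per_bot_week(price, low, high):.1f} cents per bot-week")
```

The first offer works out to 3.6 cents per bot-week; the other two land near the quoted 6 and 2.5 cents, with the exact figure depending on which point in the advertised range is used.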
Bot Payloads
Structural asymmetries
• Defenders reactive, attackers proactive
  – Defenses public, attacker develops/tests in private
  – Arms race where best case for defender is to “catch up”
• New defenses expensive, new attacks cheap
  – Defenses tied to sunk costs/business model; attacker agile and not tied to particular technology
• Defenses hard to measure, attacks easy to measure
  – Few security metrics (no “evidence-based” security); attackers monetize directly, which drives attack quality
• Minimal deterrent effect
CCIED
• Collaborative Center for Internet Epidemiology and Defenses (“Seaside”)
  – Joint UCSD/ICSI project, 1 of 4 National CyberTrust Centers
  – Focused on threats posed by large-scale host compromise
    • Worms, viruses, botnets, DDoS, spam, spyware, etc.
• Three key areas of work
  – Internet epidemiology: measuring/understanding attacks
  – Automated defenses: blocking/stopping attacks
  – Economic drivers: why attacks are happening
• See: http://www.ccied.org
Detecting Outbreaks
• Both defense and deterrence are predicated on getting good intelligence
  – Need to detect, characterize and analyze new malware threats
  – Need to do it quickly across a very large number of events
• Classes of monitors
  – Network-based
  – Endpoint-based
• Monitoring environments
  – In-situ: real activity as it happens
    • Network/host IDS
  – Ex-situ: “canary in the coal mine”
    • HoneyNets/Honeypots
Network Telescopes
• Idea: Unsolicited packets are evidence of global phenomena
  – Backscatter: response packets sent by victims provide insight into global prevalence of DoS attacks (and who is getting attacked)
  – Scans: request packets can indicate an infection attempt from a worm (and who is currently infected, growth rate, etc.)
• Very scalable: CCIED Telescope monitors 17M+ IP addrs
  – (> 1% of all routable addresses of the Internet)
Moore et al, Inferring Internet Denial-of-Service Activity, USENIX Security, 2001.
Backscatter analysis
• Monitor block of n IP addresses
• Expected # of backscatter packets given an attack of m packets:
  $$E(X) = \frac{nm}{2^{32}}$$
• Extrapolated attack rate R' is a function of measured backscatter rate R:
  $$R' \ge R \, \frac{2^{32}}{n}$$
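The two backscatter relations on this slide reduce to scaling by the fraction of the IPv4 address space that the telescope monitors. A minimal sketch, assuming the attacker spoofs source addresses uniformly at random and the victim answers every attack packet (function names and the example telescope size are illustrative):

```python
ADDRESS_SPACE = 2 ** 32  # size of the IPv4 address space


def expected_backscatter(n_monitored, m_attack_packets):
    """n * m / 2^32: backscatter packets expected at the telescope."""
    return n_monitored * m_attack_packets / ADDRESS_SPACE


def extrapolated_attack_rate(n_monitored, measured_rate):
    """R * 2^32 / n: attack rate implied by measured backscatter rate R."""
    return measured_rate * ADDRESS_SPACE / n_monitored


n = 2 ** 24  # hypothetical telescope covering a /8 (about 16.7M addresses)
print(expected_backscatter(n, 1_000_000))  # 1/256 of the attack's packets
print(extrapolated_attack_rate(n, 10.0))   # packets/sec aimed at the victim
```

With a /8 telescope the scaling factor is 256 in either direction, so even a modest backscatter rate corresponds to a substantial attack.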
Attacks over time
Example: Periodic attack (1hr per 24hrs)
Measuring worm growth
CodeRed infects 360,000 hosts in 14 hours in 2001
Moore et al, Code Red: a case study on the spread and victims of an Internet worm, ACM IMW, 2002
Code Red was slow
• Slammer worm released January 2003
  – First ~1 min behaves like classic scanning worm (doubles in 8.5 secs)
  – >1 min worm saturates access bandwidth
    • Some hosts issue > 20,000 scans/sec
    • Self-interfering
  – Peaks at ~3 min
    • >55 million IP scans/sec
  – 90% of Internet scanned in <10 mins
Moore et al, The Spread of the Sapphire/Slammer Worm, IEEE Security & Privacy, 1(4), 2003
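To get a feel for what an 8.5-second doubling time means, the early phase can be approximated as pure exponential growth. A minimal sketch, assuming the doubling time stays constant; per the slide this only holds for roughly the first minute, before the worm saturates access bandwidth (the function and its defaults are illustrative):

```python
def infected_after(seconds, initial_infected=1, doubling_time=8.5):
    """Infected hosts during the early exponential phase of a scanning worm."""
    return initial_infected * 2 ** (seconds / doubling_time)


for t in (0, 30, 60):
    print(f"t={t:>2}s: about {infected_after(t):,.0f}x the initial population")
```

At a constant 8.5-second doubling time the infected population grows by more than two orders of magnitude within the first minute.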
Scalability/Fidelity Tradeoff in Detection
Most scalable
• Network telescopes (passive)
• Telescopes + responders (iSink, honeyd, Internet Motion Sensor)
• VM-based honeynet (e.g., Collapsar)
• Live honeypot
Highest fidelity
Potemkin Honeyfarm
• Provide the illusion of millions of honeypots
  – But use a much smaller set of physical resources
  – 1 million IP addresses on 10s of physical hosts
• Gateway multiplexes traffic onto multiple virtual machines (VMs)
• VMM multiplexes multiple VMs on physical servers
Vrable et al., Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm, SOSP 2005.
Was largest high-fidelity honeyfarm on planet
Potemkin Operation
• Packet received by gateway
• Dispatched to honeyfarm server
• VM instantiated
  – Adopts destination IP address
  – Creation must be fast enough to maintain illusion (creation via copy)
• Many VMs will be created
  – Must be resource efficient (copy-on-write representation)
  – Can support 100s of simultaneous VMs per server
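The dispatch flow on this slide can be sketched as a gateway that binds each destination address to a VM on first contact. This is a toy model under stated assumptions, not the Potemkin implementation: `HoneyfarmServer`, `Gateway`, the least-loaded placement policy, and the dictionary standing in for a copy-on-write clone are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HoneyfarmServer:
    """Hypothetical stand-in for one physical server running a VMM."""
    name: str
    max_vms: int = 100
    vms: dict = field(default_factory=dict)  # destination IP -> VM record

    def clone_vm(self, dst_ip, reference_image):
        # A real VMM would share the reference image's memory pages
        # copy-on-write; here we only record the base image and leave
        # room for the pages this VM has modified.
        vm = {"ip": dst_ip, "base": reference_image, "dirty_pages": {}}
        self.vms[dst_ip] = vm
        return vm


class Gateway:
    """Routes each destination IP to a VM, creating the VM on first contact."""

    def __init__(self, servers, reference_image="reference-image"):
        self.servers = servers
        self.reference_image = reference_image
        self.ip_to_server = {}

    def dispatch(self, dst_ip):
        server = self.ip_to_server.get(dst_ip)
        if server is not None:
            return server.vms[dst_ip]  # a VM already impersonates this address
        # Assumed policy: pick the least-loaded server that still has capacity.
        candidates = [s for s in self.servers if len(s.vms) < s.max_vms]
        if not candidates:
            raise RuntimeError("honeyfarm is out of VM capacity")
        server = min(candidates, key=lambda s: len(s.vms))
        self.ip_to_server[dst_ip] = server
        return server.clone_vm(dst_ip, self.reference_image)


gateway = Gateway([HoneyfarmServer("server-a"), HoneyfarmServer("server-b")])
vm = gateway.dispatch("132.239.13.24")
assert gateway.dispatch("132.239.13.24") is vm  # same address, same VM
```

Because VMs exist only for addresses that have actually received traffic, a small number of servers can stand behind a much larger address range.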
Outbreak Defense
• Modern worms can infect >1M hosts/sec
• Need to detect and block new outbreaks << 1 sec [Moore et al, Infocom03]
PACKET HEADER
SRC: 11.12.13.14.3920 DST: 132.239.13.24.5000 PROT: TCP
PACKET PAYLOAD (CONTENT)
00F0 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
0100 90 90 90 90 90 90 90 90 90 90 90 90 4D 3F E3 77 ............M?.w
0110 90 90 90 90 FF 63 64 90 90 90 90 90 90 90 90 90 .....cd.........
0120 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
0130 90 90 90 90 90 90 90 90 EB 10 5A 4A 33 C9 66 B9 ..........ZJ3.f.
0140 66 01 80 34 0A 99 E2 FA EB 05 E8 EB FF FF FF 70 f..4...........p
. . .
Earlybird: Line-rate network inference of worm signatures [Singh et al, OSDI04]
Key issue: how to learn popular strings with high in- and out-degree (seen from many sources and sent to many destinations), without maintaining per-string state
Precise signature identification < 1ms
Singh et al., Automated Worm Fingerprinting, OSDI 2004.
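The detection criterion (a string that is both popular and widely dispersed across sources and destinations) can be illustrated with a deliberately naive sketch. Unlike the line-rate system described on this slide, the version below keeps exact per-substring state, so it shows the test being applied but not the memory-efficient data structures that make it practical. The class name, substring length, and thresholds are all hypothetical.

```python
from collections import defaultdict


class NaiveWormDetector:
    """Flags substrings that repeat often AND spread across many hosts.

    Keeps exact per-substring state, which a line-rate system cannot afford.
    """

    def __init__(self, substring_len=8, prevalence=3, dispersion=3):
        self.substring_len = substring_len
        self.prevalence = prevalence    # min. packets containing the substring
        self.dispersion = dispersion    # min. distinct sources and destinations
        self.counts = defaultdict(int)
        self.sources = defaultdict(set)
        self.destinations = defaultdict(set)

    def observe(self, src, dst, payload):
        """Record one packet; return substrings that now look worm-like."""
        k = self.substring_len
        suspicious = set()
        # Each distinct substring is counted once per packet.
        for chunk in {payload[i:i + k] for i in range(len(payload) - k + 1)}:
            self.counts[chunk] += 1
            self.sources[chunk].add(src)
            self.destinations[chunk].add(dst)
            if (self.counts[chunk] >= self.prevalence
                    and len(self.sources[chunk]) >= self.dispersion
                    and len(self.destinations[chunk]) >= self.dispersion):
                suspicious.add(chunk)
        return suspicious
```

Requiring dispersion as well as prevalence is what separates a spreading worm from a single busy client repeating the same content to one server.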
Today
• We are increasingly focused on better mapping the economics of on-line crime
  – Botnet infiltration
  – Spam conversion
  – Buying and selling of stolen credit cards, bank accounts, botnets, etc.
• The hope is to find economic bottlenecks and focus defenses there
Spam
• The oldest e-crime profit generator
• > 100B spam e-mails sent/day (Ironport)
• Wide range of campaigns
  – Scams: pharma, software, rolex, jobs, porn, ...
  – Phishing: banks (e.g. BoA), e-commerce, etc.
  – Web exploits, XSS & social engineering
• Key question: what is ROI?
  – Costs can be estimated, but we don’t know sales conversion rate
How pharma spam works
Courtesy Stuart Brown, modernlifisrubbish.co.uk
Key opportunity
• Spam is increasingly sent by botnets
• Botnets are increasingly self-organizing
• Can infiltrate botnet C&C network
  – Observe who is getting spammed
  – Observe what spam is being sent
  – Observe which addresses get delivered to
  – Change templates in transit
Kanich, Kreibich, Levchenko, Enright, Paxson, Voelker and Savage, Spamalytics: an Empirical Analysis of Spam Marketing Conversion, ACM CCS 2008
http://canadianpharma.com
http://ucsdpharma.com
Spam pipeline
          Sent      MTA            Inbox   Visits            Conversions
Pharma    347.5M    82.7M (24%)    ---     10,522 (0.003%)   28 (0.000008%)
E-card    83.6M     21.1M (25%)    ---     3,827 (0.005%)    316 (0.00037%)
E-card    40.1M     10.1M (25%)    ---     2,721 (0.007%)    225 (0.00056%)

Pharma: 12 M spam emails for one “purchase”
E-card: 1 in 10 visitors execute the binary
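The two headline ratios can be checked directly from the pipeline counts. A minimal sketch of that arithmetic; the pairing of counts into campaigns is inferred from the numbers on the slide, and the row labels "e-card A" and "e-card B" are illustrative:

```python
# (campaign label, emails sent, site visits, conversions) from the slide;
# the labels for the two e-card rows are illustrative.
campaigns = [
    ("pharma", 347_500_000, 10_522, 28),
    ("e-card A", 83_600_000, 3_827, 316),
    ("e-card B", 40_100_000, 2_721, 225),
]


def funnel(sent, visits, conversions):
    """Return (emails per conversion, visitors per conversion)."""
    return sent / conversions, visits / conversions


for label, sent, visits, conversions in campaigns:
    per_sale, per_visit = funnel(sent, visits, conversions)
    print(f"{label}: {per_sale:,.0f} emails and "
          f"{per_visit:.0f} visits per conversion")
```

The pharma row comes out at roughly 12.4 million emails per purchase, and both e-card rows at roughly one conversion per 12 visitors, in line with the rounded figures on the slide.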
Questions?
Collaborative Center for Internet Epidemiology and Defenses
http://ccied.org
What’s next: Value-chain characterization
• Value-chain characterization
  – Empirical map establishing links between criminal groups and enablers
    • Affiliate programs, botnets, fast flux networks, registrars, payment processors, SEO/traffic partners, fulfillment/manufacturing
    • Data mining across huge data feeds we’ve built or established relationships for
  – Social network among criminal groups
    • Semantic Web mining
New: Fulfillment measurements
• About to start purchasing wide range of spam-advertised products
  – Watches
  – Pharma
  – Traffic
• Cluster purchases based on
  – Merchant and processor
  – Packaging (postmark, forensic analysis of paper)
  – Artifacts of manufacturing process (e.g., FT-NIR on drugs)
New: Bot-based spam filter generation
• Observations
  – Modest number of bots send most spam
  – Virtually all bots use templates with simple rules to describe polymorphism
  – Templates+dictionaries ≈ regex describing spam to be generated
  – If we can extract or infer these from the botnets, we have a perfect filter for all the spam generated by the botnet
  – Very specific filters, extremely low FP risk
http://www.marshal.com/trace/spam_statistics.asp
[Figure: example spam template, annotated "random letters and numbers" and "phrases from a dictionary"]
Early results (last week)
• 0 FP with 50 examples
• 0 FN on Storm with 500 examples
• Still tuning for other botnets
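The observation that templates plus dictionaries amount to a regex can be made concrete with a toy template language. Everything below is a hypothetical illustration: the `{rand:N}` and `{dict:NAME}` macro syntax, the example template, and the dictionary contents are invented for this sketch and do not reflect any real botnet's format.

```python
import random
import re
import string

# Hypothetical template: {rand:N} is N random letters/digits,
# {dict:NAME} is a phrase drawn from a named dictionary.
TEMPLATE = "Subject: {dict:greeting} {rand:6}"
DICTIONARIES = {"greeting": ["Great deals", "Best prices", "Save today"]}
MACRO = re.compile(r"\{(rand|dict):(\w+)\}")


def template_to_regex(template, dictionaries):
    """Compile a template plus its dictionaries into an anchored regex."""
    parts, pos = [], 0
    for m in MACRO.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))  # literal text
        kind, arg = m.groups()
        if kind == "rand":
            parts.append(r"[A-Za-z0-9]{%d}" % int(arg))
        else:
            phrases = "|".join(map(re.escape, dictionaries[arg]))
            parts.append("(?:" + phrases + ")")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def instantiate(template, dictionaries, rng=random):
    """Generate one message the way a bot might fill in the template."""
    def fill(m):
        kind, arg = m.groups()
        if kind == "rand":
            alphabet = string.ascii_letters + string.digits
            return "".join(rng.choice(alphabet) for _ in range(int(arg)))
        return rng.choice(dictionaries[arg])
    return MACRO.sub(fill, template)


spam_filter = template_to_regex(TEMPLATE, DICTIONARIES)
print(bool(spam_filter.match(instantiate(TEMPLATE, DICTIONARIES))))
```

Because the regex is derived from the same template that generates the spam, it matches every instantiation while staying narrow enough that unrelated mail is unlikely to match.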