22
Analyzing Tor’s Anonymity with Machine Learning Sanjit Bhat, David Lu May 21 st , 2017 Mentor: Albert Kwon

Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Analyzing Tor’s Anonymity with Machine Learning

Sanjit Bhat, David LuMay 21st, 2017Mentor: Albert Kwon

Page 2: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Acknowledgements

Thank you to Albert Kwon for mentoring us

Thank you to Prof. Srini Devadas for PRIMES-CS

Thank you to Dr. Slava Gerovitch and the PRIMES program

Thank you to our parents for supporting us and for listening to us incessantly talk about our project for the past four months!

2

Page 3: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Motivation and Background

3

Page 4: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Anonymity Matters

● Whistleblowers

● Governmental suppression of political opinion

● Censorship circumvention

http://blog.transparency.org/2016/06/20/new-whistleblower-protection-law-in-france-not-yet-fit-for-purpose/

http://facecrooks.com/Internet-Safety-Privacy/To-be-anonymous-or-not-to-be-should-you-use-your-real-name-on-the-Internet.html/

http://www.dmnews.com/social-media/what-if-people-want-their-internet-anonymity-back/article/338654/

4

Page 5: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

The Internet Provides No Guarantee of Anonymity

Sender (Alice)

Receiver (Bob)Adversary

5

From: AliceTo: Bob

From: AliceTo: Bob

Page 6: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

A Supposed Fix - Tor: The Onion Router

● Alice connects to the Tor network

6

Page 7: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

A Supposed Fix - Tor: The Onion Router

● Alice obtains a list of Tor nodes from the Tor network

7

Page 8: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

A Supposed Fix - Tor: The Onion Router

● Alice chooses 3 Tor nodes to make a connection to Bob ● No Tor nodes know the identities of both Bob and Alice

8

Page 9: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Tor Has Key Vulnerabilities Exploited by Website Fingerprinting Attacks● Adversary only needs 1 link in the chain

Receiver

Sender

Tor Network

Adversary

9

Page 10: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Adversaries Learn the Following Features from Packet SequencesBasic Features

● Total transmission time● Total number of packets● Number of incoming packets● Number of outgoing packets

More Abstract Features

● Number of packets before outgoing packets

● Number of packets between outgoing packets

● Concentrations of outgoing packets

● Bursts● Initial packet directions

T. Wang et al.: “Effective Attacks and Provable Defenses for Website Fingerprinting” 10

Page 11: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Our Research Focus - Applications of Website Fingerprinting to Unlocking Tor’s Anonymity

Question 1: Are website fingerprinting attacks still viable?

11

Question 2: Are certain features more important for the attacker?

Question 3: Can we classify websites based on their content type?

Page 12: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Experiments and Results

12

Page 13: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Methodology● Our Dataset

○ Top 10,000 Sites from StuffGate○ 50 trials per site using the Tor network○ Mimic actions of actual user as closely as possible

● Evaluation○ Multiple generic Weka machine learning algorithms

13

Page 14: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Question 1: Are website fingerprinting attacks still viable?

14

Page 15: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Q. 1Closed World - A Necessary Benchmark

15

K-Nearest Neighbors Classifier

100 websites: 86%

2000 websites: 75.6%

Page 16: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Open World - A More Realistic ScenarioQ. 1

Random Forest Classifier100 sensitive websites

5000 insensitive websites

True Positive Rate: 86%False Positive Rate: 5.5%

True Positive Rate: 71%False Positive Rate: 0.76%

16

Page 17: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Question 2: Are certain features more important for the attacker?

17

Page 18: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Packet Ordering - The Most Impactful FeatureQ. 2

18

Page 19: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Question 3: Can we classify websites based on their content type?

19

Page 20: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Websites Can Be Classified into CategoriesK-Nearest Neighbors Classifier - Closed World Test

Q. 3

20

Page 21: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Conclusion

● Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity

● Among the features available to an adversary observing the Tor network, the number of incoming packets between successive outgoing packets is the most important feature

● Adversaries can successfully classify, by content type, the websites accessed by users on the Tor network

21

Page 22: Analyzing Tor’s Anonymity with Mentor: Albert Kwon Machine ... · Website fingerprinting attacks are still viable and can be used to circumvent Tor’s anonymity Among the features

Future Work

● Attacks○ More powerful/tailored machine learning algorithms

■ Scikit-learn○ Deep-learning

■ Capture abstract features not humanly visible● Defenses

○ Hide packet ordering○ Unsupervised learning

■ Cluster sites together to find categories not humanly identifiable

22