BY IBRAHIM MOSAAD SUPERVISED BY OSAMA KAMAL VISUAL FINGERPRINTING FOR MALICIOUS DOMAINS

Visual fingerprinting for malicious websites

Page 1: Visual fingerprinting for malicious websites

BY IBRAHIM MOSAAD

SUPERVISED BY OSAMA KAMAL

VISUAL FINGERPRINTING FOR MALICIOUS DOMAINS

Page 2: Visual fingerprinting for malicious websites

OUTLINE

• Introduction

• Statistics of malicious Domains/URLs

• Goal

• How

• Conceptually

• Theoretically

• Practically

• Testing And Results

• Challenges

• Future Works

Page 3: Visual fingerprinting for malicious websites

INTRODUCTION

• Statistics

• In 2014, Kaspersky Lab’s web antivirus detected 123,054,503 unique malicious objects: scripts, exploits, executable files, etc.

Page 4: Visual fingerprinting for malicious websites

INTRODUCTION

• Exploit kits

• How common are exploit kits?

• 6000 infections/0.2 hour

• 2B visitors/month

• 2/3 of all malware is delivered by exploit kits

Page 5: Visual fingerprinting for malicious websites

GOAL

“Create an automated system to differentiate between benign and malicious websites”

Page 6: Visual fingerprinting for malicious websites

HOW - CONCEPTUALLY

• How do malicious websites behave?

• Lack of a good training set

• How do benign websites behave?

• Testing the top 250 websites from different categories in Alexa

• Scoring system

Page 7: Visual fingerprinting for malicious websites

HOW – THEORETICALLY

• Browsing websites using real/emulated system

• Store/visualize the collected data

• Score it

Page 8: Visual fingerprinting for malicious websites

HOW - PRACTICALLY

• Browsing websites using honeyclients

• Low-interaction

• Thug

• HoneySpider Network 2.0

• High-interaction

• Capture-HPC

• HoneyClient

Page 9: Visual fingerprinting for malicious websites

HOW - PRACTICALLY

• HSN

• Modular framework – extendable

• Wappalyzer module (Developed)

• Peepdf Module (Developed)

• Cuckoo sandbox module (Updated)

• Yara module (Updated)

Page 10: Visual fingerprinting for malicious websites

HOW - PRACTICALLY

• Storing collected data

• Graph database: Neo4j

• Graph database driver for HSN, written with py2neo

• Scoring system

• Mix of first- and second-degree functions
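
The slides do not give the scoring formula itself. As a minimal sketch of what a mix of first- and second-degree functions could look like, the Python below sums a linear term for some features and a quadratic term for others. The feature names, the weights, and the choice of which feature gets which degree are all hypothetical and are not the author's values.

```python
# Hypothetical weights (assumptions for illustration, not from the slides).
LINEAR_WEIGHTS = {"resources": 0.25, "levels": 0.5}        # first-degree terms
QUADRATIC_WEIGHTS = {"redirections": 1.0, "iframes": 0.5}  # second-degree terms


def score(features):
    """Return a suspicion score from a dict of per-site feature counts.

    Missing features count as 0, so an empty dict scores 0.
    """
    linear = sum(w * features.get(name, 0) for name, w in LINEAR_WEIGHTS.items())
    quadratic = sum(
        w * features.get(name, 0) ** 2 for name, w in QUADRATIC_WEIGHTS.items()
    )
    return linear + quadratic


if __name__ == "__main__":
    site = {"resources": 40, "levels": 2, "redirections": 3, "iframes": 2}
    print(score(site))  # 0.25*40 + 0.5*2 + 1.0*3**2 + 0.5*2**2 = 22.0
```

With a quadratic term, a site with many redirections or iframes is penalised much more heavily than one with a few, while the linear terms grow steadily with page size.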

Page 11: Visual fingerprinting for malicious websites

FIRST RUN - TRAINING

• Number of websites: 1500

Page 12: Visual fingerprinting for malicious websites

MOZILLA.ORG

Page 13: Visual fingerprinting for malicious websites

AVG.COM

Page 14: Visual fingerprinting for malicious websites

ORACLE.COM

Page 15: Visual fingerprinting for malicious websites

APPLE.COM

Page 16: Visual fingerprinting for malicious websites

FIRST RUN

• Feature extraction

• Number of levels

• Number of resources

• Number of redirections

• Number of iframes

• Website topology
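
As a rough illustration of how these counts might be derived, the sketch below treats a crawled site as a tree of fetched URLs and counts levels, resources, redirections, and iframes in one traversal. The `Node` class and its `kind` values are assumptions made for this example; they are not HSN's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One fetched URL in the crawl tree (hypothetical structure)."""
    url: str
    kind: str = "resource"  # assumed values: "resource", "redirect", "iframe"
    children: list = field(default_factory=list)


def extract_features(root):
    """Count levels, resources, redirections and iframes in a crawl tree."""
    features = {"levels": 0, "resources": 0, "redirections": 0, "iframes": 0}
    stack = [(root, 1)]  # the root page is level 1
    while stack:
        node, depth = stack.pop()
        features["levels"] = max(features["levels"], depth)
        features["resources"] += 1  # every fetched URL, including the root
        if node.kind == "redirect":
            features["redirections"] += 1
        elif node.kind == "iframe":
            features["iframes"] += 1
        stack.extend((child, depth + 1) for child in node.children)
    return features


if __name__ == "__main__":
    tree = Node("http://example.com/", children=[
        Node("http://example.com/a.js"),
        Node("http://example.com/r", kind="redirect", children=[
            Node("http://example.net/frame", kind="iframe"),
        ]),
    ])
    print(extract_features(tree))
```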

Page 17: Visual fingerprinting for malicious websites

BABYLON.COM

Page 18: Visual fingerprinting for malicious websites

SECOND RUN – REAL CASE

• Top domains that looked malicious

• http://dictionary.reverso.net

• http://n4hr.com

• http://s02.arab.sh

• http://dc11.arabsh.com

Page 19: Visual fingerprinting for malicious websites

CHALLENGES

• HSN

• Lack of good documentation

• Last version was released in 2013

• Code written in 3 languages: C, Python, and Java

• Lack of community support

Page 20: Visual fingerprinting for malicious websites

CHALLENGES

Page 21: Visual fingerprinting for malicious websites

CHALLENGES

• Graph database (py2neo)

• Insertion

• Library is still immature

• REST API can’t handle it

• 7000 URLs * 30 * 2 = 420,000 ≈ 0.5M nodes

• Store the queries in one request?!

• Huge POST request

• Querying

• 7000 URLs => 7000 * 20 = 140K
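
The node estimate on this slide, and one common way to avoid a single huge POST request, can be sketched as follows. The slides do not say what the factors 30 and 2 represent or how the insertion problem was resolved, so the parameter names, the batch size of 1,000, and the `Resource` label are assumptions. The Cypher text (written with the `$rows` parameter syntax of newer Neo4j versions) is only built here, not sent to a server.

```python
def estimate_nodes(urls=7000, per_url=30, factor=2):
    """Reproduce the slide's arithmetic: 7000 * 30 * 2."""
    return urls * per_url * factor


def batches(rows, size=1000):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Parameterised Cypher that creates one node per row of a batch.
# The label `Resource` is hypothetical.
BATCH_QUERY = "UNWIND $rows AS row CREATE (n:Resource) SET n = row"


if __name__ == "__main__":
    print(estimate_nodes())  # 420000
    rows = [{"url": f"http://example.com/{i}"} for i in range(2500)]
    for chunk in batches(rows):
        # In a real run, each chunk would be sent with BATCH_QUERY as its
        # `rows` parameter; here we only show the batch sizes.
        print(len(chunk))
```

Sending one bounded batch per request keeps each payload small, at the cost of more round trips.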

Page 22: Visual fingerprinting for malicious websites

FUTURE WORKS

• HSN

• Enhance the web-client module

• Enhance SWF emulation module

• Scoring system

• Machine learning

• Graph database

• Adopt Giraph database rather than neo4j

• Monitoring governmental websites

Page 23: Visual fingerprinting for malicious websites

BIGGER PICTURE

Page 24: Visual fingerprinting for malicious websites

Questions?