Data Mining for Security

Page 1: data mining for security application

Data Mining for Security Applications

Page 2: data mining for security application

• Overview of Data Mining
• Security Threats
• Data Mining for Cyber security applications
– Intrusion Detection
– Data Mining for Firewall Policy Management
– Data Mining for Worm Detection
• Data Mining for Counter-terrorism
• Surveillance
• Advantages
• Conclusion

Page 3: data mining for security application

Data Mining - Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases [Han and Kamber 2005].

Data mining is used to sort through the tremendous amounts of data stored by automated data collection tools.

Extracts rules, regularities, patterns, and constraints from databases.

Page 4: data mining for security application

[Figure: Threat Types. Labels: Natural Disasters; Human Errors; Non-Information related threats; Information Related threats; Biological, Chemical, Nuclear Threats; Critical Infrastructure Threats]

Page 5: data mining for security application

Data mining is being applied to problems such as intrusion detection and auditing. For example,

Anomaly detection techniques could be used to detect unusual patterns and behaviors.

Link analysis may be used to trace self-propagating malicious code to its authors.

Classification may be used to group various cyber attacks and then use the profiles to detect an attack when it occurs.

Prediction may be used to anticipate potential future attacks, based in part on information learned about terrorists through email and phone conversations.

Page 6: data mining for security application

An intrusion can be defined as “any set of actions that attempt to compromise the integrity, confidentiality, or availability of a resource”.

Attacks are:
• Host-based attacks
• Network-based attacks

Intrusion detection systems are split into two groups:
• Anomaly detection systems
• Misuse detection systems

Page 7: data mining for security application

Data mining can help automate the process of investigating intrusion detection alarms.

Data mining on historical audit data and intrusion detection alarms can reduce future false alarms.

Page 8: data mining for security application

Anomaly detection
• Build models of normal data
• Detect any deviation from normal data
• Flag deviation as suspect
• Identify new types of intrusions as deviation from normal behavior

Misuse detection
• Label all instances in the data set (“normal” or “intrusion”)
• Run learning algorithms over the labeled data to generate classification rules
• Automatically retrain intrusion detection models on different input data
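The anomaly detection steps listed above (model normal data, detect deviation, flag it as suspect) can be sketched in a few lines. This is a minimal illustration, not a production detector: the z-score model, the threshold of 3.0, and the `normal_logins_per_hour` data are all assumptions introduced here.

```python
from statistics import mean, stdev

def build_normal_model(values):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(values), stdev(values)

def is_suspect(value, model, threshold=3.0):
    """Flag a value whose z-score against the normal model exceeds the threshold."""
    mu, sigma = model
    if sigma == 0:
        # Degenerate model: any change from the constant value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical training data: logins per hour observed during normal operation.
normal_logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]
model = build_normal_model(normal_logins_per_hour)

# New observations; only the last one deviates strongly from the model.
observed = [5, 6, 40]
flags = [is_suspect(v, model) for v in observed]
print(flags)
```

Because the model is built only from normal data, a previously unseen kind of intrusion can still be flagged, at the cost of false alarms whenever legitimate behavior changes.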

Page 9: data mining for security application

Misuse detection

•Classification Model

Bayesian classifier

Decision tree

Association rule

Support vector machine

Learning from rare class

Page 10: data mining for security application

Anomaly detection

•Anomaly Detection Model

Association rule

Neural network

Unsupervised SVM

Outlier detection

Page 11: data mining for security application

Analysis of Firewall Policy Rules Using Data Mining Techniques

•Firewall is the de facto core technology of today’s network security

•First line of defense against external network attacks and threats

•Firewall controls or governs network access by allowing or denying the incoming or outgoing network traffic according to firewall policy rules.

•Manual definition of rules often results in anomalies in the policy

•Detecting and resolving these anomalies manually is a tedious and error-prone task

Page 12: data mining for security application

Anomaly detection:
• Theoretical framework for the resolution of anomalies
• A new algorithm will simultaneously detect and resolve any anomaly that is present in the policy rules

Traffic Mining:
• Mine the traffic and detect anomalies

Page 13: data mining for security application

To bridge the gap between what is written in the firewall policy rules and what is observed in the network, analyze the traffic and the packet logs.

Network traffic trends may show that some rules are outdated or have not been used recently.

[Figure: traffic-mining workflow. Boxes: Firewall Log File; Mining Log File Using Frequency; Filtering; Rule Generalization; Generic Rules; Identify Decaying & Dominant Rules; Edit Firewall Rules; Firewall Policy Rule]
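The idea of mining the firewall log by frequency to find decaying and dominant rules can be sketched as follows. The log format (a dict with a `rule` key), the rule names, and the 50% share used to call a rule dominant are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter

def rule_usage(log_entries, rule_ids):
    """Count how many logged packets matched each policy rule."""
    hits = Counter(entry["rule"] for entry in log_entries)
    return {rule: hits.get(rule, 0) for rule in rule_ids}

def split_rules(usage, dominant_share=0.5):
    """Return (dominant, decaying) rules based on hit frequency.

    Dominant: rules that account for at least `dominant_share` of all hits.
    Decaying: rules that never matched in the observed log.
    """
    total = sum(usage.values())
    dominant = [r for r, n in usage.items() if total and n / total >= dominant_share]
    decaying = [r for r, n in usage.items() if n == 0]
    return dominant, decaying

# Hypothetical log entries and policy rules.
log = [
    {"rule": "allow-http"},
    {"rule": "allow-http"},
    {"rule": "allow-ssh"},
    {"rule": "allow-http"},
]
rules = ["allow-http", "allow-ssh", "allow-telnet"]

usage = rule_usage(log, rules)
dominant, decaying = split_rules(usage)
print(usage, dominant, decaying)
```

A rule that never fires in recent traffic is a candidate for review or removal, while a dominant rule is a candidate for moving earlier in the policy.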

Page 14: data mining for security application

What are worms?
• Self-replicating program
• Exploits software vulnerability on a victim
• Remotely infects other victims

Goals of worm detection
• Real-time detection

Issues
• Substantial volume of identical traffic, random probing

Methods for worm detection
• Count number of sources/destinations
• Count number of failed connection attempts

Worm types
• Email worms, instant messaging worms, Internet worms, IRC worms, file-sharing network worms
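The two counting methods named above (number of destinations and number of failed connection attempts) could look like the sketch below. The connection tuple layout, the addresses, and both thresholds are hypothetical values chosen for the example.

```python
from collections import defaultdict

def suspicious_sources(connections, max_failed=3, max_destinations=3):
    """Return sources that exceed either counting threshold.

    connections: iterable of (source, destination, succeeded) tuples.
    """
    failed = defaultdict(int)
    destinations = defaultdict(set)
    for src, dst, succeeded in connections:
        destinations[src].add(dst)
        if not succeeded:
            failed[src] += 1
    return sorted(
        src for src in destinations
        if failed[src] > max_failed or len(destinations[src]) > max_destinations
    )

# Hypothetical traffic: one host probes five addresses and fails every time,
# another host talks to a single server successfully.
connections = [("10.0.0.5", f"10.0.1.{i}", False) for i in range(5)]
connections += [("10.0.0.9", "10.0.1.1", True)] * 2

flagged = suspicious_sources(connections)
print(flagged)
```

Random probing by a worm tends to produce many distinct destinations and many failures from one source, which is why simple counters are a common first signal.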

Page 15: data mining for security application

[Figure: email worm detection model. Labels: Outgoing Emails; Training data; Test data; Feature extraction; Machine Learning; Classifier; The Model; Clean or Infected?]

Task: given some training instances of both “normal” and “viral” emails, induce a hypothesis to detect “viral” emails.
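One way to induce such a hypothesis is a naive Bayes classifier, one of the classification models this presentation lists for misuse detection. The sketch below is a small standard-library version with add-one smoothing; the feature names and the four training emails are invented for illustration and are not from the source.

```python
import math
from collections import Counter

def train(examples):
    """Build a naive Bayes model from (features, label) pairs."""
    label_counts = Counter(label for _, label in examples)
    feature_counts = {label: Counter() for label in label_counts}
    vocabulary = set()
    for features, label in examples:
        feature_counts[label].update(features)
        vocabulary.update(features)
    return label_counts, feature_counts, vocabulary

def classify(features, model):
    """Return the label with the highest log-probability (add-one smoothing)."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[label].values()) + len(vocabulary)
        for feature in features:
            if feature in vocabulary:  # ignore features never seen in training
                score += math.log((feature_counts[label][feature] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical features extracted from outgoing emails.
training = [
    (["exe_attachment", "many_recipients"], "viral"),
    (["exe_attachment", "short_body"], "viral"),
    (["pdf_attachment", "single_recipient"], "normal"),
    (["no_attachment", "single_recipient"], "normal"),
]
model = train(training)

print(classify(["exe_attachment", "many_recipients"], model))
print(classify(["no_attachment", "single_recipient"], model))
```

In practice the feature extraction step matters more than the choice of classifier, and viral emails are usually the rare class, so real training sets are far less balanced than this toy one.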

Page 16: data mining for security application

Data Mining for Counter-terrorism

• Data mining for non-real-time threats: gather data, build terrorist profiles, mine data, prune results

• Data mining for real-time threats: gather data in real time, build real-time models, mine data, report results

Page 17: data mining for security application

• Gather data from multiple sources
– Information on terrorist attacks: who, what, where, when, how
– Personal and business data: place of birth, ethnic origin, religion, education, work history, finances, criminal record, relatives, friends and associates, travel history, . . .
– Unstructured data: newspaper articles, video clips, speeches, emails, phone records, . . .
• Integrate the data, build warehouses and federations
• Develop profiles of terrorists, activities/threats
• Mine the data to extract patterns of potential terrorists and predict future activities and targets
• Find the “needle in the haystack” - suspicious needles?
• Data integrity is important

Page 18: data mining for security application

[Figure: non-real-time data mining workflow. Boxes: Data sources with information about terrorists and terrorist activities; Integrate data sources; Clean/modify data sources; Build profiles of terrorists and activities; Mine the data; Examine results / prune results; Report final results]

Page 19: data mining for security application

Nature of data
– Data arriving from sensors and other devices
– Continuous data streams
– Breaking news, video releases, satellite images
– Some critical data may also reside in caches

Rapidly sift through the data and set aside data that is not needed immediately for later use and analysis (non-real-time data mining)

Data mining techniques need to meet timing constraints

Quality of service (QoS) tradeoffs among timeliness, precision and accuracy

Presentation of results, visualization, real-time alerts and triggers

Page 20: data mining for security application

[Figure: real-time data mining workflow. Boxes: Data sources with information about terrorists and terrorist activities; Integrate data sources in real-time; Rapidly sift through data and discard irrelevant data; Build real-time models; Mine the data; Examine results in real-time; Report final results]

Page 21: data mining for security application

Data Mining Outcomes and Techniques

• Association: John and James often seen together after an attack

• Link Analysis: Follow chain from A to B to C to D

• Clustering: Divide population; people from country X of a certain religion; people from country Y interested in airplanes

• Classification: Build profiles of terrorists and classify terrorists

• Anomaly Detection: John registers at flight school, but does not care about takeoff or landing
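The link analysis outcome, following a chain from A to B to C to D, amounts to a path search over a graph of observed associations. A minimal breadth-first sketch is shown here; the link list and node names are placeholders, and treating links as undirected is an assumption.

```python
from collections import deque

def find_chain(links, start, goal):
    """Return the shortest chain of linked entities from start to goal, or None."""
    graph = {}
    for a, b in links:
        # Treat each observed link as undirected.
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Placeholder links between entities.
links = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
chain = find_chain(links, "A", "D")
print(chain)
```

Breadth-first search returns a shortest chain, which keeps the result easy for an analyst to inspect; it says nothing about how reliable each individual link is.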

Page 22: data mining for security application

Huge amounts of surveillance and video data available in the security domain

Analysis is being done off-line usually using “Human Eyes”

Need for tools to aid human analysts (pointing out areas in video where unusual activity occurs)

Page 23: data mining for security application

Event Representation
• Estimate distribution of pixel intensity change

Event Comparison
• Contrast the event representation of different video sequences to determine if they contain similar semantic event content.

Event Detection
• Using manually labeled training video sequences to classify unlabeled video sequences
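As a rough illustration of event representation and comparison, the sketch below estimates the distribution of pixel intensity change between two frames as a normalized histogram, then contrasts two such histograms with an L1 distance. Frames are plain nested lists of grayscale values; the bin count, the distance measure, and the sample frames are assumptions, not the method used in the source.

```python
def intensity_change_histogram(frame_a, frame_b, bins=4, max_value=255):
    """Normalized histogram of absolute per-pixel intensity changes."""
    counts = [0] * bins
    total = 0
    bin_width = (max_value + 1) / bins
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            index = min(int(abs(a - b) / bin_width), bins - 1)
            counts[index] += 1
            total += 1
    return [c / total for c in counts] if total else counts

def histogram_distance(h1, h2):
    """L1 distance between two event representations (0 means identical)."""
    return sum(abs(x - y) for x, y in zip(h1, h2))

# Hypothetical 2x2 grayscale frames.
still_before = [[10, 10], [10, 10]]
still_after = [[10, 10], [10, 10]]
moving_after = [[10, 12], [10, 200]]

still_event = intensity_change_histogram(still_before, still_after)
moving_event = intensity_change_histogram(still_before, moving_after)
print(still_event, moving_event, histogram_distance(still_event, moving_event))
```

A large distance from the representation of "nothing happening" is one simple way to point an analyst at the parts of a video where unusual activity may occur.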

Page 24: data mining for security application

Law enforcement: Data mining can help law enforcement identify and apprehend criminal suspects by examining trends in location, crime type, habits, and other patterns of behavior.

Researchers: Data mining can speed up researchers' data analysis, leaving them more time to work on other projects.

Page 25: data mining for security application

This presentation reviewed the various data mining techniques that have been proposed to enhance the security of different applications.

It also covered the ways in which data mining aids intrusion detection, firewall policy management, worm detection, and counter-terrorism, and how the various techniques have been applied and evaluated.


Page 27: data mining for security application

Thank you