SESSION ID: AIR-T08
#RSAC
AI In Cyber Security: A Balancing Force or Disruptor?
Vijay Dheap
@dheap
Topics
A quick primer on Artificial Intelligence (AI)
Motivations and applications of AI in cyber security
AI adoption by malicious actors
Challenges and risks with AI for Cyber Security
Conclusion: Beginning your AI journey
What is AI?
Definition of artificial intelligence. 1: a branch of computer science dealing with the simulation of intelligent behavior in computers. 2: the capability of a machine to imitate intelligent human behavior.
- Merriam Webster Dictionary
AI
Vision
Logical Analysis
Natural Language Processing
Learned Behaviors
Autonomous Movement
…
Speech
Machine Learning: Building block of AI
A subfield of computer science that enables computers to learn without being explicitly programmed - Arthur Samuel in 1959
Supervised
Inferring a general rule or mathematical function from labeled training data to be applied to other data
Primary Use Cases
• Regression Analysis
o Deriving correlation relationships between variables and estimating the strength of those relationships
o Prediction & Forecasting
o Example: Vulnerability prioritization
• Classification
o Produces a model from a training set that can assign unseen inputs into different categories
o Example: Sentiment detection
Unsupervised
Detecting the presence of patterns or models from unlabeled data
Primary Use Cases
• Clustering
o Data is divided into different groups based on one or more attributes
o Example: Peer group determination
• Dimensionality Reduction
o Process of reducing the number of random variables under consideration
o Example: Feature Selection and extraction of false positives
Reinforcement
Refining behavior based on external assessment and feedback
Primary Use Cases
• Trade-off Analysis
o Balancing short term or long term reward
o Selection of rewards based on context
o Example: Risk assessment
• Optimization
o Maximizing reward function through simulations or observations
o Example: Security configuration optimization
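To make the unsupervised clustering use case concrete, here is a minimal sketch of peer group determination with a one-dimensional k-means. The feature (average logins per day), the sample values and the starting centroids are all hypothetical; the slides do not prescribe an algorithm or data set.

```python
from statistics import mean

def kmeans_1d(values, centroids, iterations=10):
    """Minimal 1-D k-means: returns (centroids, labels)."""
    centroids = list(centroids)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return centroids, labels

# Hypothetical feature: average logins per day for eight users.
logins_per_day = [2, 3, 2, 4, 40, 42, 38, 41]
centroids, peer_groups = kmeans_1d(logins_per_day, centroids=[0.0, 50.0])
```

Users that share a label form a peer group. A real deployment would likely use several attributes per user and a library implementation rather than this hand-rolled loop.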
Traditional Data Analytics vs. Machine Learning
Classical statistics aims to formalize relationships between variables in the form of mathematical formulas.
Machine learning can offer greater precision in its results because it avoids the simplifying assumptions typically built into manually created statistical models
The quality of the underlying data influences both traditional data analytics and machine learning since both are operating on the data syntactically
Deep Learning: Evolution of Machine Learning
Application to learning tasks of artificial neural networks (ANNs) that contain more than one hidden layer. The "deep" architectures can vary considerably, with each implementation being optimized for different tasks or goals
Reasons for Deep Learning
1. Pattern Complexity: the number of patterns to recognize
2. Pattern Reuse: Learn intricate patterns by building on the work of the previous layer
Example: Anomaly Detection
Cognitive Reasoning
Diagram labels: Unknown Event; Potentially incomplete context; Identification Models 1, 2 and 3; Inference: Hypothesis Generation; Domain Knowledge; Evidence Gathering and Semantic Processing; Context enrichment; Probabilistic Conclusion; Continuous learning
Example: Context enrichment of a security incident, quantification of impact and cataloging evidence of the kill chain
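One way to read the step from hypothesis generation through evidence gathering to a probabilistic conclusion is as repeated application of Bayes' rule. The sketch below assumes that reading; the hypotheses, priors and likelihoods are invented for illustration and are not from the talk.

```python
def update_beliefs(priors, likelihoods):
    """Bayes' rule over competing hypotheses.

    priors:      {hypothesis: P(h)}
    likelihoods: {hypothesis: P(evidence | h)}
    Returns      {hypothesis: P(h | evidence)}.
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}

# Hypothetical hypotheses for an unknown event, with illustrative numbers.
beliefs = {"malware": 0.2, "misconfiguration": 0.5, "insider": 0.3}
# Evidence 1 (assumed): outbound traffic to a newly registered domain.
beliefs = update_beliefs(
    beliefs, {"malware": 0.8, "misconfiguration": 0.1, "insider": 0.3}
)
# Evidence 2 (assumed): the activity occurred outside business hours.
beliefs = update_beliefs(
    beliefs, {"malware": 0.6, "misconfiguration": 0.3, "insider": 0.7}
)
conclusion = max(beliefs, key=beliefs.get)
```

Each new piece of evidence reweights the hypotheses, so the conclusion stays probabilistic and can be revised as context is enriched.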
The current cyber landscape favors malicious actors
Global information solutions company, Equifax, has reported a major cybersecurity incident affecting 143 million consumers in the US.
Anthem: Hacked Database Included 78.8 Million People
SEC reveals it was hacked, information may have been used for illegal stock trades
“Big Four” accounting firm Deloitte was likely breached in October or November 2016, but the breach wasn’t discovered by the firm until March 2017
The Cost of Cyber Security Operations Continues to Increase without Mitigating Risk
Cyber Security Imperatives to achieve a more favorable equilibrium
Scale Security Ops: With an increasing volume and sophistication of attacks, organizations need a force multiplier for their security teams
Assist Decisions: Given a broader threat surface area, security teams need assistance in effective and accurate data-driven decision making
Improve Responsiveness: Due to increasing risk of compromise, institutions and individuals will demand faster response to breaches
Be Proactive: The goal is to instrument proactive security controls to minimize exposure to emerging threats
The Appeal of AI for Cyber Security
Automate Operational Tasks
Present Complete Security Context
Mitigate Human Biases
Recall Relevant Knowledge
Develop Predictive Capabilities
Derive Actionable Intelligence
Provide Expert Advice
Propose Best Practices
Assess Risk
Remove Noise
Revisiting the Cyber Security Lifecycle
Detection of Threats & Risks
Key Requirements:
• Monitoring for anomalous activity
• Identification of current or potential threats
Activities of Security Professionals:
• System Integration & Configuration
• Threat hunting
Traditional Operations:
• Instrumented and automated but requires deployment integration
• Pattern/signature or rule based detection
• Generates significant volume of alerts
Investigation & Qualification of Security Alerts
Key Requirements:
• Validation of escalated incidents as security concerns
• Risk assessment of detected security threats: scope and magnitude
Activities of Security Professionals:
• Security Analysis
• Threat Hunting
Traditional Operations:
• Highly manual investigative process
• Inconsistent methodologies
• Prone to human biases and plagued by lack of skills
Incident Response & Governance
Key Requirements:
• Formulating responses to specific risks or incidents
• Tracking incidents from detection to investigation to resolution
Activities of Security Professionals:
• Incident Response
• System Administration / Program Management
Traditional Operations:
• Time intensive manual security assessments
• Heavily reliant on experiential knowledge of individuals
• Delays in instituting responses
Applied AI: Detection of Threats & Risks
Long history of automated data analysis, and initial focus of most AI initiatives in cyber security
[Chart: capabilities and techniques plotted over time (Late 1990’s and early 2000’s, Mid to Late 2000’s, Early to mid 2010’s, Present, Near Future) against increasing AI sophistication and maturity]
Capabilities:
• Incident Classification: recognizing type and nature of threat
• Anomaly Detection: behavior analysis of users, network, assets, data, applications
• Information Synthesis: profiling actors and activities by constructing dynamic models
• Proactive Prioritization: anticipating outcomes based on current events and historical knowledge
• Cognitive threat hunting
Techniques:
• Naive Bayes classifiers for Spam filtering
• Holt Winters algorithm for Network Anomaly Detection
• Clustering for baselining behaviors for anomaly detection
• Peer group analysis for insider threat detection
• Optimizing attack detection through observational reinforcement and deep nets
• Malware detection using various techniques: clustering, classification decision trees and deep nets
• Detection of malware-generated domains with Recurrent Neural Models
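As an illustration of the earliest technique on this slide, Naive Bayes classifiers for spam filtering, here is a minimal multinomial Naive Bayes with add-one smoothing. The four training messages and the `spam`/`ham` labels are a made-up toy set; production filters train on far larger corpora and richer features.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label). Returns (word counts per label, label counts)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-posterior, using add-one smoothing."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_msgs = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        # Log prior, then one log-likelihood term per word in the message.
        score = math.log(label_counts[label] / total_msgs)
        for word in text.lower().split():
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

# Tiny hypothetical training set, for illustration only.
training = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the quarterly report", "ham"),
]
model = train(training)
verdict = classify("claim your free prize", *model)
```

The classifier only counts words, which is why it is cheap to train and also why it is easy to mislead with padded text.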
Applied AI: Alert Investigation and Qualification
Key investment segment to address skills shortage in security analysis to improve security outcomes
[Chart: capabilities and techniques plotted over time (Late 1990’s and early 2000’s, Mid to Late 2000’s, Early to mid 2010’s, Present, Near Future) against increasing AI sophistication and maturity]
Capabilities:
• Dimensionality Reduction: understanding the key traits that influence risk
• Attack Tactics Resolution: dissecting the process and goals of an attack
• Threat Research: automated incident specific information gathering
• Enriched Incident Context: identifying affected entities and relationships among them
Techniques:
• False positive and noise reduction using Principal Component Analysis and deep nets
• Natural Language Processing (NLP) for semantic analysis powered by deep nets
• Deducing behaviors and motivations using cognitive and time series analysis
• Clustering and classification of indicators by threat type
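Principal Component Analysis, named on this slide for noise reduction, finds the directions along which alert features vary most. The sketch below computes only the first principal component using power iteration in plain Python. The two features and their values are hypothetical, and real workloads would normally rely on a numerical library.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (unit vector of greatest variance, feature means) via power iteration."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Starting vector; power iteration assumes it is not orthogonal
    # to the dominant direction.
    vec = [1.0] * dims
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return vec, means

def project(row, component, means):
    """Coordinate of one row along the principal component."""
    return sum((row[j] - means[j]) * component[j] for j in range(len(row)))

# Hypothetical alert features: (failed logins, outbound MB), perfectly correlated.
alerts = [(1, 2), (2, 4), (3, 6), (4, 8)]
component, means = first_principal_component(alerts)
reduced = [project(a, component, means) for a in alerts]
```

Here two correlated features collapse into a single coordinate per alert, which is the sense in which PCA highlights the key traits that influence risk.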
Applied AI: Incident Response
Significant potential to scale security operations and improve overall response times.
[Chart: capabilities and techniques plotted over time (Early to mid 2010’s, Present, Near Future) against increasing AI sophistication and maturity]
Capabilities:
• Response Simulation: projecting the security and business repercussions of a response plan
• Impact Assessment: identifying all affected entities and quantifying the scope of risk
• Orchestration: automation of incident response plan
• Recommended Actions: suggested response plan based on historical outcomes
Techniques:
• Decision matrices and observational learning to mimic manual cognitive processes
• Cognitive analysis for best practice selection
• Risk classification of security incidents using regression analysis and deep nets
• Trade-off analysis for process disruption and security risk calculation
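A decision matrix that trades security risk against process disruption can be sketched as a weighted sum over criteria. The response options, criterion scores and weights below are invented for illustration; a real system would calibrate them from historical outcomes.

```python
def rank_responses(options, weights):
    """Score each response option as a weighted sum of its criteria.

    options: {name: {criterion: score in [0, 1]}}
    weights: {criterion: weight}; a negative weight penalizes the criterion.
    Returns a list of (name, score), best option first.
    """
    scored = {
        name: sum(weights[c] * value for c, value in criteria.items())
        for name, criteria in options.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical weights: reward risk reduction, penalize process disruption.
weights = {"risk_reduction": 1.0, "process_disruption": -0.6}
# Hypothetical response options for a compromised workstation.
options = {
    "isolate_host": {"risk_reduction": 0.9, "process_disruption": 0.5},
    "reset_credentials": {"risk_reduction": 0.6, "process_disruption": 0.2},
    "monitor_only": {"risk_reduction": 0.2, "process_disruption": 0.0},
}
ranking = rank_responses(options, weights)
recommended = ranking[0][0]
```

Changing the disruption weight shifts the recommendation, which makes the trade-off explicit and reviewable by an analyst.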
A Case Study: Network Analysis powered by AI
Data: Network Traffic; Phishing Kits
Machine Learning Models: Malicious Traffic Classification; Principal components of malicious domains and sites
Security Value: Detection and validation of suspicious traffic, threat actors, malicious domains and susceptible internal systems and users
Malicious Actors and AI
Increasing Success & Falling Costs: Current tools and tactics are already delivering greater success while reducing costs, so any new investment in AI must promise higher returns
High Value Advanced Targets: Target organizations or specific outcomes that were previously deemed too risky of exposure could now potentially become feasible with AI
Short time window: When highly prevalent vulnerabilities are announced, some organizations may not respond quickly. AI could allow malicious actors to capitalize on that window of opportunity
Individualized Large-scale Compromise: Today centralized targets are prized targets, but while they have high yield they become public. AI could allow for large scale decentralized compromises that are hidden
AI Enhanced Kill Chain
Surveillance & Research
• Understanding security controls:
o Standard practices
o Specific Target
• Monitoring processes and activities
o Institutional practices
o Specific users
• Learn about IT infrastructures and solutions to reveal vulnerabilities
Breach
• Natural Context-aware messaging
o Email, text, tweets etc
• Adaptive tools
o Environment-aware behavior modification
o Evolving malware
• Reputation Spoofing
Exploit
• Diversionary or Evasive Tactics to confuse security controls
o Generate noise
• Dynamic Tactics
o Embedded data transfers
o Entity baseline modification
o False security event generation
Examples of AI Enhanced Kill Chain
Surveillance & Research
• Blackbox probing of security controls to gather results and possibly confidence (ex. through vendor product testing)
• Intelligent NLP powered crawlers to monitor social media and forum activities of individuals/aliases
• Automated analysis of security bulletins, policies and controls powered by NLP
Breach
• NLP powered social engineering bots
• Organization specific topic selection
• Simulating speech or writing style of colleagues
• Intelligent Malware
• Countermeasure aware polymorphism
• Information altering to mislead classifiers
• Generative adversarial network (GAN) to bypass machine learning based detection
Exploit
• Time synchronized DDOS smokescreen for data exfiltration
• Injection of benign activity similar to future malicious behavior to confuse anomaly detection
• Steganography to bundle sensitive data into copies of benign transmissions
Uniqueness of Applied AI in Cyber Operations
Active Adversary: Assume every action taken will be witnessed and an equal or greater effort will be invested to counter it
Data Availability: AI requires high volumes of high quality data to learn. Data silos and varying formats can affect training
Time Value Tradeoff: Given the dynamic cyber landscape, use cases need to stand the test of time and context or else can negate value
Cautionary Tales
• Purely syntactic analysis can lead to overfitting to specific attributes or features of the data
• Lack of understanding of how an algorithm arrives at its answer can hide flaws in its design or data selection
Today it’s still easy to throw off recognition systems by strategically inserting non-relevant data. If attackers know how information is classified, they can trick the learned behavior into faulty interpretation
Altered street signs confuse driverless cars
#RSAC
Building AI powered Security Controls
26
Use Case Definition
Gathering Training
Data
Features Selection
Machine Learning Model
Selection
Tuning Action Results
Focus the problem statement to improve odds of success and minimize time to value
Curate data to minimize possibility of poisoning the learned model and guarantee required richness
Employ security domain knowledge and context to validate the analytical process
Develop data science and machine learning expertise to match algorithms to problem
Refine the model to improve robustness and iteratively broaden the scope of solution to expand value
Emphasize consumability of implementation to operationalize the solution
Embracing AI: An Example
Steps: Use Case Definition, Gathering Training Data, Features Selection, Machine Learning Model Selection, Tuning, Action Results
Goal: Insider threat - identify risky identities and prevent exfiltration of data
Security Data sources: SIEM, endpoint logs, proxy logs, access control logs, application logs, network traffic, email content
Data Preparation: normalization, filtering, annotation etc.
Attributes of risky identity: negative sentiment, inconsistent activity, access to critical or sensitive data
Build:
- Sentiment analytics of email content or social media activity
- Peer group analysis to highlight behavioral anomalies
Buy: validate use case coverage and accuracy
Train: Machine learning models tested on internal data sources to minimize false positives
Operational Focus: Visualization for dashboards and notifications and data export into existing security tools (SIEMs, Firewalls, Device Management etc.)
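Peer group analysis for behavioral anomalies can be sketched as a leave-one-out z-score: each identity is compared against the rest of its peer group. The user names, download volumes and the threshold of 2.0 are hypothetical; the talk does not specify a scoring method.

```python
from statistics import mean, stdev

def peer_outliers(activity, threshold=2.0):
    """Flag users whose activity is far from their peer group's mean.

    activity: {user: numeric measure}, all users belonging to one peer group.
    Each user is compared against the other members (leave-one-out), so a
    single extreme value does not inflate its own baseline.
    """
    flagged = {}
    for user, value in activity.items():
        peers = [v for u, v in activity.items() if u != user]
        if len(peers) < 2:
            continue  # not enough peers to estimate a spread
        spread = stdev(peers)
        if spread == 0:
            continue  # simplification: skip groups with identical peers
        z = (value - mean(peers)) / spread
        if abs(z) >= threshold:
            flagged[user] = round(z, 2)
    return flagged

# Hypothetical daily download volumes (MB) for one peer group.
downloads_mb = {"ana": 110, "ben": 95, "chi": 105, "dev": 90, "eli": 900}
flagged = peer_outliers(downloads_mb)
```

In practice this score would be one attribute of a risky identity among several, combined with signals such as access to sensitive data before any alert is raised.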
Summary & Conclusion
Collectively, AI security teams merely have to achieve a Nash equilibrium state such that their detection and response matches new threats posed by attackers
• For malicious actors such an equilibrium will not provide the necessary returns to justify the investment
It is important for organizations to invest in data science competencies to build solutions for unique security requirements and assess vendor solutions for more generalized or advanced use cases
• Several tools and APIs now make machine learning and AI accessible to a broader range of developers
AI facilitates collective defense: rapid sharing of insights within the security community. However, without a collaboration-based approach AI will not discourage malicious actors because the total cost of security for each organization may outweigh the perceived business risk
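The Nash equilibrium argument can be illustrated with a toy two-player game between defenders and attackers. The strategies and payoff numbers below are entirely hypothetical, chosen only so that an attacker's AI investment pays off against baseline defenses but not against AI-assisted ones.

```python
def pure_nash_equilibria(payoffs):
    """Find pure-strategy Nash equilibria of a two-player game.

    payoffs: {(defender_move, attacker_move): (defender_payoff, attacker_payoff)}
    """
    defender_moves = sorted({d for d, _ in payoffs})
    attacker_moves = sorted({a for _, a in payoffs})
    equilibria = []
    for d in defender_moves:
        for a in attacker_moves:
            d_pay, a_pay = payoffs[(d, a)]
            # Equilibrium: neither side gains by changing only its own move.
            defender_ok = all(payoffs[(alt, a)][0] <= d_pay for alt in defender_moves)
            attacker_ok = all(payoffs[(d, alt)][1] <= a_pay for alt in attacker_moves)
            if defender_ok and attacker_ok:
                equilibria.append((d, a))
    return equilibria

# Hypothetical payoffs (defender, attacker); higher is better for that player.
payoffs = {
    ("baseline", "current_tools"): (-5, 5),
    ("baseline", "ai_enhanced"): (-8, 6),
    ("ai_assisted", "current_tools"): (-3, 1),
    ("ai_assisted", "ai_enhanced"): (-4, 0),
}
equilibria = pure_nash_equilibria(payoffs)
```

With these assumed numbers the only equilibrium pairs AI-assisted defense with attackers staying on current tools, which mirrors the claim that the attacker's AI investment is not justified once defenders match new threats.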