Slide 1/13 Countering Evolving Threats in Distributed Applications: Scientific Principles Saurabh Bagchi The Center for Education and Research in Information

Slide 1/13

Countering Evolving Threats in Distributed Applications: Scientific Principles

Saurabh Bagchi
The Center for Education and Research in Information Assurance and Security (CERIAS)
School of Electrical and Computer Engineering
Purdue University

Joint work with: Gaspar Howard, Chris Gutierrez, Jeff Avery, Alan Qi (Purdue); Guy Lebanon (Amazon); Donald Steiner (Northrop Grumman)

Work Supported By: Northrop Grumman, NSF

Slide 2/13

What is Special about Distributed System Security?

• Most of our critical infrastructure is built out of careful orchestration of multiple distributed services
– Banking, Military mission planning, Power grid, …

• Distributed infrastructure means
– Many machines, possibly under different admin domains
– Many users, external and internal
– Dynamic environment where software gets upgraded, new users are added, new machines are added

• Attack surface is large and changing
– All of the above dynamic factors cause this
– Attack may originate from outside or inside

Slide 3/13

Three Big Trends in Threats Against Distributed Systems

1. Attack at the point of least resistance
– Find a vulnerable outward-facing service, OR
– Initiate an insider attack

2. Exploit zero-day vulnerabilities in any constituent service
– Thriving black market in zero-day vulnerabilities
– Tweak existing attack vectors to bypass rigid defense systems

3. Set up a covert channel for leaking sensitive information
– Relevant for systems with highly sensitive but low volume data
– Timing channels, storage channels

Slide 4/13

Current Approaches against These Three Threat Vectors

1. Attack at the point of least resistance
– Create an ever more rigid perimeter
– Improve the IDS alerting mechanisms, build alert correlation

2. Exploit zero-day vulnerabilities in any constituent service
– Hope white hats (vendors, open source devs) find these before the black hats
– Some impactful work in detecting metamorphic malware

3. Set up a covert channel for leaking sensitive information
– Only ad-hoc techniques leading to an arms race
– Timing channels: perturb timing of actions indiscriminately
– Storage channels: “null out” values of all unused storage elements

Slide 5/13

Desired Characteristics of Solutions

• Clean slate design approach
– Build individual services following secure design principles
– Includes randomization, use of type safe programming languages, static vulnerability checking, dynamic taint analysis

OR

• Bolt security on
– Embed secure layer on constituent services, not relying only on an impenetrable perimeter
– Use the power of big data – lots of users, lots of machines, lots of workloads
– Learn from mistakes, i.e., the attacks that succeed – allow expert security admins to provide input to automated system

Slide 6/13

A Glimpse into Our Solution Approaches

Slide 7/13

Distributed Inferencing from Individual Sensor Information

[Figure: diagram with elements labeled D1 through D6]

Slide 8/13

Automatic Generation and Update of IDS Signatures: SQLi

• First for SQL injection attacks


1. Crawls multiple public cybersecurity portals to collect attack samples

2. Extracts a rich set of features from the attack samples

3. Applies a clustering technique to the samples, giving the distinctive features for each cluster

4. A generalized signature is created for each cluster, using logistic regression modeling
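Steps 2 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the four lexical features, the tiny hand-written sample lists, and the helper names (`extract_features`, `train_logistic`, `signature_score`) are all assumptions, and the clustering step is skipped by treating the attack samples as a single cluster.

```python
import math
import re

# Hypothetical lexical features; the slides do not list the real feature set.
FEATURE_PATTERNS = [
    re.compile(r"union\s+(?:all\s+)?select", re.I),       # UNION-based injection
    re.compile(r"\bor\b\s+'?\w+'?\s*=\s*'?\w+'?", re.I),  # tautology such as OR 1=1
    re.compile(r"--|#|/\*"),                              # SQL comment tokens
    re.compile(r"['\"]"),                                 # quote characters
]


def extract_features(sample):
    """Count how often each pattern occurs in the sample."""
    return [float(len(p.findall(sample))) for p in FEATURE_PATTERNS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit logistic regression with plain batch gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def signature_score(sample, weights, bias):
    """Probability-like score that the sample matches the learned signature."""
    x = extract_features(sample)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Toy samples standing in for crawled attack data and benign inputs (assumed).
ATTACKS = ["' OR 1=1 --", "1 UNION SELECT username, password FROM users", "admin'/*"]
BENIGN = ["alice", "O'Brien", "red or blue"]

ROWS = [extract_features(s) for s in ATTACKS + BENIGN]
LABELS = [1.0] * len(ATTACKS) + [0.0] * len(BENIGN)
WEIGHTS, BIAS = train_logistic(ROWS, LABELS)

print(signature_score("1 union all select null", WEIGHTS, BIAS))
print(signature_score("bob", WEIGHTS, BIAS))
```

The learned weights play the role of the "generalized signature": an input is flagged when its score exceeds a chosen threshold (0.5 is a natural default, though the slides do not state one).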

Slide 9/13

Automatic Generation and Update of Signatures: Phishing

• Next for phishing attacks

• Phishing specific features are created
– Word features determined using word frequency counting
– Based on common phishing features, e.g., # links, # image tags
– Sentiment analysis for determining words conveying sense of change and urgency that attackers attempt to portray to the user

• Parsing phishing emails (corpus from Purdue’s IT organization) input as mbox files
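A rough sketch of the feature extraction described above, using only the Python standard library. The urgency word list, the regular expressions, and the function names are illustrative assumptions; in particular, a fixed word list stands in for the sentiment analysis step, whose details the slides do not give.

```python
import email
import re
from collections import Counter

# Hypothetical urgency vocabulary; a stand-in for the sentiment analysis step.
URGENCY_WORDS = {"update", "urgent", "immediately", "verify", "suspended"}

LINK_RE = re.compile(r"<a\s[^>]*href", re.I)
IMAGE_RE = re.compile(r"<img\b", re.I)
TAG_RE = re.compile(r"<[^>]+>")


def message_text(msg):
    """Concatenate the decoded text parts of an email message."""
    parts = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts)


def phishing_features(msg):
    """Return link count, image tag count, word frequencies, and urgency hits."""
    text = message_text(msg)
    words = re.findall(r"[a-z']+", TAG_RE.sub(" ", text).lower())
    counts = Counter(words)
    return {
        "num_links": len(LINK_RE.findall(text)),
        "num_images": len(IMAGE_RE.findall(text)),
        "word_counts": counts,
        "urgency_hits": sum(counts[w] for w in URGENCY_WORDS),
    }


# Made-up message for illustration only.
RAW = (
    "From: support@example.com\n"
    "Subject: Account notice\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Dear customer, update your bank account immediately.</p>"
    '<a href="http://example.com/login">Update</a><img src="logo.png">'
)

features = phishing_features(email.message_from_string(RAW))
print(features["num_links"], features["num_images"], features["urgency_hits"])
```

For an mbox corpus, the same function can be applied to each message yielded by iterating over `mailbox.mbox(path)` from the standard library, where `path` is the location of the mbox file.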

Slide 10/13

Phishing: Preliminary Results

This cluster includes features such as: "below, need, dear, update, customer, account, bank"

• Each cluster forms a general story about the emails contained within it, from which the basis of the attack can be deduced
– For example, for cluster 4, the attack is trying to get the user to update information for their banking account.

• It is much easier to train users on the attack signature of each cluster than on the mass of individual emails

Slide 11/13

Covert Timing Channels

• Designed a covert network timing channel imitating long range dependent (LRD) legitimate traffic
– Can be hidden in Web traffic, the most observed traffic on the Internet today
– Statistically indistinguishable from real traffic
– Evades the best available detection methods.

• Data Rate: 2 – 6 bits/second
• Decoding Error: 3% – 6%

• Solution approach
– Look for autocorrelation function values
– Look for Hurst value that characterizes LRD traffic
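The two statistics named in the solution approach can be computed as in the sketch below. It assumes the input is a series of packet inter-arrival times and uses the aggregated variance method to estimate the Hurst value; the slides do not say which estimator was used, so that choice, the block sizes, and the synthetic data are all assumptions. The random walk is only a crude stand-in for a strongly persistent series, not a model of real LRD traffic.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


def hurst_aggregated_variance(series, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate the Hurst value from how the variance of block means scales.

    For block size m the variance of block means behaves like m**(2H - 2),
    so H = 1 + slope / 2, where slope is fitted on a log-log scale.
    """
    points = []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        grand = sum(means) / n_blocks
        var = sum((v - grand) ** 2 for v in means) / (n_blocks - 1)
        points.append((math.log(m), math.log(var)))
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return 1.0 + slope / 2.0


# Synthetic data (assumed): independent gaps versus a strongly persistent series.
rng = random.Random(0)
iid_gaps = [rng.expovariate(1.0) for _ in range(4096)]
walk = []
level = 0.0
for gap in iid_gaps:
    level += gap - 1.0  # centered cumulative sum, i.e. a random walk
    walk.append(level)

for name, data in (("independent", iid_gaps), ("persistent", walk)):
    print(name, autocorrelation(data, 1), hurst_aggregated_variance(data))
```

Independent gaps should give a lag-1 autocorrelation near 0 and a Hurst estimate near 0.5, while persistent series push both values toward 1. A detector along these lines would compare the values observed on a flow against those expected for legitimate LRD traffic.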

Slide 12/13

Take Aways

• Distributed applications need to be protected

• Three emerging trends
1. Attack at the point of least resistance
2. Exploit zero-day vulnerabilities in any constituent service
3. Set up a covert channel for leaking sensitive information

• Lessons in solving these trends
– If clean slate design is possible for some services, use a comprehensive set of secure design principles: randomization, use of type safe programming languages, static vulnerability checking, dynamic taint analysis
– If security needs to be bolted on, look at internal security, not just perimeter security
– Big data advances can enable learning from large volumes of existing data to extrapolate to new attack types

Slide 13/13

Presentation available at: Dependable Computing Systems Lab (DCSL) web site
engineering.purdue.edu/dcsl
