Local Password validation using Self-Organizing Maps ESORICS 2014 Diogo Monica and Carlos Ribeiro (diogo.monica, carlos.ribeiro)@tecnico.ulisboa.pt

ESORICS 2014: Local Password validation using Self-Organizing Maps


DESCRIPTION

The commonly used heuristics to promote password strength (e.g. minimum length, forced use of alphanumeric characters, etc.) have been found considerably ineffective and, what is worse, often counterproductive. When coupled with the predominance of dictionary-based attacks and leaks of large password data sets, this situation has led, in recent years, to the idea that the most useful criterion on which to classify the strength of a candidate password is the frequency with which it has appeared in the past. Maintaining an updated and representative record of past password choices does, however, require the processing and storage of high volumes of data, making the schemes thus far proposed centralized. Unfortunately, requiring that users submit their chosen candidate passwords to a central engine for validation may have security implications and does not allow offline password generation. Another major limitation of the currently proposed systems is the lack of generalisation capability: a password similar to a common password is usually considered safe. In this article, we propose an algorithm which addresses both limitations. It is designed for local operation, avoiding the need to disclose candidate passwords, and is focused on generalisation, recognizing as dangerous not only frequently occurring passwords, but also candidates similar to them. An implementation of this algorithm is released in the form of a Google Chrome browser extension.


Page 1: ESORICS 2014: Local Password validation using Self-Organizing Maps

Local Password validation using Self-Organizing Maps

ESORICS 2014

Diogo Monica and Carlos Ribeiro (diogo.monica, carlos.ribeiro)@tecnico.ulisboa.pt

Page 2: ESORICS 2014: Local Password validation using Self-Organizing Maps

Talk Outline

‣ Background and Motivation

‣ Our approach

‣ Performance

‣ Conclusions

Page 3: ESORICS 2014: Local Password validation using Self-Organizing Maps

Background and Motivation

Page 4: ESORICS 2014: Local Password validation using Self-Organizing Maps

Background

Passwords are still the #1 way of doing authentication

‣ Leaks have shown that:

‣ Users are bad at choosing passwords

‣ There is prevalent password re-use across services

Page 5: ESORICS 2014: Local Password validation using Self-Organizing Maps

Background

Password validation heuristics are not working

‣ Leaks have shown that these heuristics:

‣ Promote weak variations that computers are good at guessing

‣ Give positive reinforcement of bad decisions (password strength meters)

Page 6: ESORICS 2014: Local Password validation using Self-Organizing Maps

Background

Password storage is evolving

‣ The prevalence of bcrypt, scrypt and variations has made brute-force hard

‣ Time-memory trade off (TMTO) resistance

‣ GPU unfriendliness (rapid random reads)

‣ Targeted (dictionary) attacks becoming the norm

‣ Some of them based on the knowledge of what the most common passwords are

Page 7: ESORICS 2014: Local Password validation using Self-Organizing Maps

Background

‣ Password managers are still not prevalent

‣ There will always be passwords that people need to choose and memorize (laptop, offline access, password manager’s password, etc.)

Page 8: ESORICS 2014: Local Password validation using Self-Organizing Maps

Motivation

‣ Give better password strength feedback

‣ Remove unnecessary hoops to jump through (strict rules)

‣ Promote the use of passwords that are easy to remember but hard to guess

Page 9: ESORICS 2014: Local Password validation using Self-Organizing Maps

Password Frequency

Page 10: ESORICS 2014: Local Password validation using Self-Organizing Maps

Core idea

‣ Use the frequency of past password appearance to hamper the effectiveness of statistical dictionary attacks

‣ Some organizations already use basic versions of this method

‣ Twitter prohibits the use of the 370 most common passwords

http://techcrunch.com/2009/12/27/twitter-banned-passwords/

Page 11: ESORICS 2014: Local Password validation using Self-Organizing Maps

Why is it hard?

‣ Lack of access to representative data-sets

‣ Computationally expensive

‣ Offline access (local validation)

‣ Efficient client distribution (updates)

‣ Compression

‣ Potential leak of candidate passwords

Page 12: ESORICS 2014: Local Password validation using Self-Organizing Maps

Dealing with variations

[Figure: entries of a Frequent Passwords List (password, 1111111, aaaaa, qwerty), surrounded by variations of each]

‣ p@ssw0rd, password1, passw0rd, p4ssw0rd, ...

‣ 2222222, 3333333, 4444444, 5555555, ...

‣ asdfg, zxcvb, qw3rty, wertyu, ...

‣ bbbbb, ccccc, sssss, ddddd, ...

Page 13: ESORICS 2014: Local Password validation using Self-Organizing Maps

Our goal

Design a popularity based classification scheme that:

‣ Resists common password variations (generalization capability)

‣ Allows for offline operation (no centralized authority)

‣ Is easily distributable to end users’ systems

‣ Total size must be small

‣ Should not compromise password security

‣ Testing candidate passwords should be easy and inexpensive

‣ Time, CPU, memory

Page 14: ESORICS 2014: Local Password validation using Self-Organizing Maps

Our approach

Page 15: ESORICS 2014: Local Password validation using Self-Organizing Maps

Our approach

Server-side:

• Compression

• Generalization

• Hashing

Client-side:

• Classification

[Diagram: on the server side, a password list (password, 1111111, aaaa, 123456, ...) goes through compression, generalization and hashing to produce a classification database; the user side downloads that database and uses it to classify candidate passwords]

Page 16: ESORICS 2014: Local Password validation using Self-Organizing Maps

Compression

Page 17: ESORICS 2014: Local Password validation using Self-Organizing Maps

Requirements

‣ Compress database to allow distribution (e.g. mobile phones)

‣ Compression should not destroy topological proximity of the passwords

Page 18: ESORICS 2014: Local Password validation using Self-Organizing Maps

Self-Organizing Maps

‣ Clustering tool

‣ Unsupervised neural network

‣ Reflects in the output space topological proximity relations in the input space

‣ It provides us with an easy way of doing password “generalization”

Page 19: ESORICS 2014: Local Password validation using Self-Organizing Maps

Self-Organizing Maps

Training process

High-level Process:

• Determine which node has the model closest to the input password (the best matching unit, BMU)

• Find the set of all nodes in the lattice neighborhood of the BMU

• Update the models of all nodes in the lattice neighborhood of the BMU, to make them approximate the input password.

Output:

The resulting map is a summary replica of the input space, with a much lower number of elements but maintaining its topological relations.

Concrete details of the training procedure can be found in the paper
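The three training steps above can be sketched as follows. This is a minimal illustration and not the procedure from the paper: the fixed-length encoding (`encode`), the squared Euclidean distance, the square lattice neighborhood of radius `radius`, the constant learning rate `alpha` and the helper `make_som` are all assumptions made for the example.

```python
import random

MAX_LEN = 8  # assumed fixed model length; shorter passwords are zero-padded


def encode(password):
    """Map a password to a fixed-length vector of character codes (assumed encoding)."""
    codes = [float(ord(c)) for c in password[:MAX_LEN]]
    return codes + [0.0] * (MAX_LEN - len(codes))


def distance(a, b):
    """Squared Euclidean distance, standing in for the talk's similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_bmu(som, vec):
    """Return the (row, col) of the node whose model is closest to vec."""
    cells = [(r, c) for r in range(len(som)) for c in range(len(som[0]))]
    return min(cells, key=lambda rc: distance(som[rc[0]][rc[1]], vec))


def train_step(som, password, alpha=0.5, radius=1):
    """One step: find the BMU, then pull its lattice neighbors toward the input."""
    vec = encode(password)
    br, bc = find_bmu(som, vec)
    for r in range(max(0, br - radius), min(len(som), br + radius + 1)):
        for c in range(max(0, bc - radius), min(len(som[0]), bc + radius + 1)):
            model = som[r][c]
            for i in range(MAX_LEN):
                model[i] += alpha * (vec[i] - model[i])
    return br, bc


def make_som(rows, cols, seed=0):
    """Hypothetical initializer: random models in the printable ASCII range."""
    rng = random.Random(seed)
    return [[[rng.uniform(32, 126) for _ in range(MAX_LEN)] for _ in range(cols)]
            for _ in range(rows)]


som = make_som(4, 4)
for pw in ["password", "passw0rd", "qwerty", "123456"]:
    train_step(som, pw)
```

A real training run would typically shrink `alpha` and `radius` over time; they are kept constant here to keep the sketch short.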

Page 20: ESORICS 2014: Local Password validation using Self-Organizing Maps

Self-Organizing Maps

Compression Ratio

‣ The compression ratio is [formula missing]

‣ [symbol missing] is the number of input passwords

‣ [symbol missing] is the number of nodes in the SOM.

‣ Important to note that [expression missing] for any chosen compression ratio.

pmiss is the probability of wrongly classifying a password whose occurrence is higher than the threshold as safe

Page 21: ESORICS 2014: Local Password validation using Self-Organizing Maps

Self-Organizing Maps

Similarity Measure

‣ Defines the topological characteristics of the input space to be preserved in the output space

[Equation: the formula itself is missing; its two labelled terms are a Hamming distance and a Euclidean distance between ASCII codes]

Beta pulls the overall measure of dissimilarity towards a simple human-related overlap distance
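A minimal sketch of a dissimilarity that mixes the two distances named on this slide, a Hamming (overlap) distance and a Euclidean distance between ASCII codes. The way they are combined here, a weighted sum controlled by `beta`, and the NUL padding of the shorter string are assumptions for illustration; the actual formula is in the paper.

```python
import math


def pad(a, b):
    """Pad the shorter string with NUL characters so both have equal length (assumption)."""
    n = max(len(a), len(b))
    return a.ljust(n, "\0"), b.ljust(n, "\0")


def hamming(a, b):
    """Number of positions at which the two padded strings differ."""
    a, b = pad(a, b)
    return sum(x != y for x, y in zip(a, b))


def ascii_euclidean(a, b):
    """Euclidean distance between the character-code vectors of the padded strings."""
    a, b = pad(a, b)
    return math.sqrt(sum((ord(x) - ord(y)) ** 2 for x, y in zip(a, b)))


def dissimilarity(a, b, beta=0.5):
    """Hypothetical blend: beta = 1 gives pure Hamming, beta = 0 pure Euclidean."""
    return beta * hamming(a, b) + (1.0 - beta) * ascii_euclidean(a, b)


print(dissimilarity("password", "passw0rd", beta=1.0))
print(dissimilarity("password", "passw0rd", beta=0.0))
```

With `beta` close to 1, the substitution of `o` by `0` counts as a single differing position regardless of how far apart the two character codes are, which is the kind of human-related overlap the slide refers to.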

Page 22: ESORICS 2014: Local Password validation using Self-Organizing Maps

Classification

‣ Choose a popularity threshold

‣ Determine the BMU of the candidate password

‣ If the popularity level of the BMU is above the threshold the password is rejected.
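The three classification steps above can be sketched as follows. The flat list of `(model, popularity)` pairs, the squared Euclidean distance and the toy values are assumptions made for the example, not the representation used by the authors.

```python
def distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(candidate_vec, nodes, threshold):
    """Reject the candidate if the popularity of its BMU is above the threshold.

    nodes is a list of (model_vector, popularity) pairs; this flat layout is an
    assumption made for the example.
    """
    # The BMU is the node whose model is closest to the candidate.
    _, bmu_popularity = min(nodes, key=lambda n: distance(n[0], candidate_vec))
    return "rejected" if bmu_popularity > threshold else "accepted"


# Toy map with two nodes: one very popular region, one rarely used region.
toy_nodes = [([0.0, 0.0], 900), ([10.0, 10.0], 3)]
print(classify([1.0, 0.0], toy_nodes, threshold=100))
print(classify([9.0, 9.0], toy_nodes, threshold=100))
```

Because only the nearest node is consulted, a candidate that is merely close to a popular password is rejected along with the popular password itself.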

Page 23: ESORICS 2014: Local Password validation using Self-Organizing Maps

Generalization

Page 24: ESORICS 2014: Local Password validation using Self-Organizing Maps

Generalization

‣ At this point we have a map that can be used for password classification

‣ Similar passwords are adjacent to each other

‣ By imposing popularity leakage from local maxima to neighboring nodes, we increase the generalization capability of the network.

Page 25: ESORICS 2014: Local Password validation using Self-Organizing Maps

Generalization

Non-linear low pass filtering of the popularity levels

[Figure: a smoothing kernel applied over the lattice to the popularity label of each node (x,y); the formulas that followed "We apply:" and "With:" are missing]
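One way to picture popularity leakage from local maxima to neighboring nodes is the sketch below. The rule used here (each node keeps the larger of its own label and a decayed copy of its largest neighbor) and the `decay` value are assumptions chosen for illustration; they are not the smoothing kernel from the slide, whose formula is not reproduced here.

```python
def leak_popularity(labels, decay=0.5):
    """Spread popularity from each node to its 8 lattice neighbors.

    Each node keeps the maximum of its own label and decay times the largest
    neighboring label, so peaks are preserved and their surroundings are raised.
    Both the max rule and the decay value are assumptions for illustration.
    """
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                        out[r][c] = max(out[r][c], decay * labels[nr][nc])
    return out


grid = [[0, 0, 0],
        [0, 100, 0],
        [0, 0, 0]]
for row in leak_popularity(grid):
    print(row)
```

The filter is non-linear because of the `max`: a popular node is never lowered by unpopular neighbors, while nodes next to it inherit part of its popularity.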

Page 26: ESORICS 2014: Local Password validation using Self-Organizing Maps

Hashing

Page 27: ESORICS 2014: Local Password validation using Self-Organizing Maps

Hashing

‣ The models in the SOM are a compressed summary of the training passwords

‣ Security problem

‣ We will need to find a hash that ensures non-invertibility of models

‣ Can’t destroy the topological proximity

Page 28: ESORICS 2014: Local Password validation using Self-Organizing Maps

Hashing

‣ Locality Preserving Hashes

‣ Deterministically invertible

‣ Cryptographic Hashes

‣ Destroy topological proximity by definition

Page 29: ESORICS 2014: Local Password validation using Self-Organizing Maps

Hashing

Discrete Fourier Transform

‣ A linear vector projection that we understand

‣ We can remove the phase information, thus avoiding invertibility

[Diagram: Password → Spectrum (complex) → discard phase → Spectrum (magnitude) / Autocorrelation]

‣ The power spectrum of a password is always closer to itself than to the power spectrum of a different password
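A minimal sketch of the idea of hashing with a DFT and discarding the phase. The encoding of a password as raw character codes and the naive O(n^2) transform are assumptions made to keep the example self-contained; the talk does not specify either.

```python
import cmath


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
            for k in range(n)]


def magnitude_spectrum(password):
    """DFT of the character codes with the phase discarded."""
    return [abs(x) for x in dft([ord(c) for c in password])]


def spectrum_distance(a, b):
    """Euclidean distance between two magnitude spectra of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


same = spectrum_distance(magnitude_spectrum("abcd"), magnitude_spectrum("bcda"))
diff = spectrum_distance(magnitude_spectrum("abcd"), magnitude_spectrum("abzd"))
print(same, diff)
```

A circular shift of the input changes only the phase of its DFT, so `abcd` and `bcda` share the same magnitude spectrum. This illustrates why the phase-free representation cannot be inverted to a unique password.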

Page 30: ESORICS 2014: Local Password validation using Self-Organizing Maps

Performance

Page 31: ESORICS 2014: Local Password validation using Self-Organizing Maps

CM-Sketch

‣ Oracle to identify undesirably popular passwords

‣ Uses a count-min sketch

‣ No false negatives (pmiss = 0)

‣ Centralized operation

‣ No generalization features

http://research.microsoft.com/pubs/132859/popularityiseverything.pdf
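For readers unfamiliar with the data structure behind this baseline, the following is a minimal count-min sketch. The class name, the SHA-256 based row hashes and the sizes are assumptions for illustration and are not taken from the referenced work.

```python
import hashlib


class CountMinSketch:
    """depth rows of width counters; estimates never fall below the true count."""

    def __init__(self, depth, width):
        self.depth, self.width = depth, width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Row-specific hash derived from SHA-256; this choice is an assumption.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        """Increment one counter per row for this item."""
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        """Smallest counter across rows: an upper bound on the true count."""
        return min(self.table[row][self._index(row, item)] for row in range(self.depth))


cms = CountMinSketch(depth=3, width=1000)
for pw in ["password"] * 5 + ["qwerty"] * 2:
    cms.add(pw)
print(cms.estimate("password"), cms.estimate("qwerty"))
```

Collisions can only inflate a counter, so a popular password is never reported as less frequent than it really is. That is the reason the structure yields false positives but no false negatives, and also why it has no notion of similarity between passwords.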

Page 32: ESORICS 2014: Local Password validation using Self-Organizing Maps

Performance

Compression rate and statistical performance

[Figure: "SOM vs CM Sketch". x-axis: popularity ranking order of chosen threshold (0 to 10 x 10^4); y-axis: pFA (0 to 1). Series: SOM (140,200), CMSketch(3,28000), CMSketch(3,56000), CMSketch(3,84000), CMSketch(3,112000). Size annotations on the plot: ~834Kb and ~672Kb]

‣ CM-sketch false positives are pure statistical classification errors

‣ SOM false positives may result from the desirable generalization properties

Page 33: ESORICS 2014: Local Password validation using Self-Organizing Maps

Performance

Generalization capability

‣ Is the SOM generalizing in a useful way?

‣ Is the SOM generalizing too much?

John the Ripper mutations testing

Passwords            Flagged as Dangerous
500 most probable    100%
All mutations        84%
Random passwords     11.3%

Testing different language dictionaries

[Figure: bar chart. x-axis: Language (German, Dane, Dutch, English, Spanish, Italian, Latin); y-axis: Percentage (0 to 0.4)]

Page 34: ESORICS 2014: Local Password validation using Self-Organizing Maps

Implementation

Page 35: ESORICS 2014: Local Password validation using Self-Organizing Maps

Conclusions

‣ Presented a scheme for password validation

‣ Envisaged for local, decentralized operation

‣ Possesses generalization capabilities

‣ No claims of optimality were made, but it was shown that our approach is feasible

‣ The solution was implemented and tested empirically

Page 36: ESORICS 2014: Local Password validation using Self-Organizing Maps

Thank you Diogo Monica (@diogomonica)

Page 37: ESORICS 2014: Local Password validation using Self-Organizing Maps

Why didn’t you do the DFT at the beginning?

‣ Our inability to come up with a similarity measure that has human meaning when working with hashes

‣ Having the hashes at the beginning would make the training computationally more expensive

‣ By doing them at the end, the hashing function can be improved without changing anything in the SOM training

‣ We verified via Monte Carlo simulations that there is practically no loss in terms of topological proximity when doing the hashes at the end, making this a non-issue

Page 38: ESORICS 2014: Local Password validation using Self-Organizing Maps

Why did you choose this similarity measure?

‣ We are able to easily calculate distances between models and input passwords

‣ We have the ability to do fractional approximation

‣ Our similarity measure captures a human “closeness” criterion

Page 39: ESORICS 2014: Local Password validation using Self-Organizing Maps

Why did you choose DFTs?

‣ They are very fast to compute

‣ They are informationally non-invertible (if we discard the phase component)

‣ They maintained the topological proximity of our SOM

‣ Something that we can reason about because we understand what it means

‣ The only hashing mechanism that we found that works