63
PETs, POTs, and Pitfalls Rethinking the Protection of Users against Machine Learning Carmela Troncoso @carmelatroncoso Security and Privacy Engineering Lab

PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

PETs, POTs, and Pitfalls

Rethinking the Protection of Users against Machine Learning

Carmela Troncoso@carmelatroncoso

Security and Privacy Engineering Lab

Page 2: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

The machine learning revolution

2

Page 3: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

The machine learning tsunami

Social JusticePrivacy

3

Page 4: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Introducing the idea(see notes for details)

The ML tsunami on privacy

Privacy

Page 5: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Attacks are not new… but the adversary is

Attacks on privacy

Privacy Enhancing Technologies

PETs

Page 6: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Attacks on privacy

Attacks are not new… but the adversary is

?PETs??

Page 7: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Machine Learning

for privacy 7

Page 8: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

The goal is not to understand, it is to beat!

Page 9: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

The goal is not to understand, it is to beat!

Page 10: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Adversarial examples are only adversarial when you are the owner of the algorithm!

Page 11: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Adversarial examples are only adversarial when you are the owner of the algorithm!

PETs!! ML models

Page 12: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Wait! Why do we need adversarial examples if we have privacy-preserving ML!!

Jason Mancuso, Ben DeCoste and Gavin Uhma.https://medium.com/dropoutlabs/privacy-preserving-machine-learning-2018-a-year-in-review-b6345a95ae0f

Page 13: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Machine learning as a privacy adversary

Data

Service

Avoid that learns about data

Actively (maybe not willingly) provide data. Solutions like Differential privacy and Encryption are suitable

13

ML Privacy-oriented Literature

Page 14: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Machine learning as a privacy adversary

Data

Service

Avoid that learns about data

Actively (maybe not willingly) provide data. Solutions like Differential privacy and Encryption are suitable No active sharing!

Cannot count on

14

ML Privacy-oriented Literature

In this talk

Page 15: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Adversarial examples as privacy defenses

Data Inferences

15

Use ML adversarial example techniques to transform data!

Page 16: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Adversarial examples as privacy defenses

Data Inferences

16

Can this solve all privacy problems?

Use ML adversarial example techniques to transform data!

Page 17: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Can this solve all privacy problems?Protect web

searches from inferences

Protect traffic

patterns

???

???

???

Protect tweets from inferences

Page 18: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Can this solve all privacy problems?Protect web

searches from inferences

Protect traffic

patterns

???

???

???

Protect tweets from inferences

In privacy problems adversarial examples belong to a DISCRETE and CONSTRAINED domain

FEASIBILITY COSTIBILITY me the enemy?

Page 19: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Nobody has thought of this?

19

Usenix Security Symposium - 2018

Modify social network attributes to avoid inferences

Use adversarial examples (evasion attacks) to keep utility

Use a version of Jacobian-based Saliency Map Attack (JSMA)“aware of policies” = only do feasible transformations

Page 20: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Nobody has thought of this?

20

Usenix Security Symposium - 2018

Modify social network attributes to avoid inferences

Use adversarial examples (evasion attacks) to keep utility

Use a version of Jacobian-based Saliency Map Attack (JSMA)“aware of policies” = only do feasible transformations

PoPETS - 2019

Modify Twitter line to avoid inferences

Add, remove, replace tweets

Greedy search by importance for classifier

Page 21: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Nobody has thought of this?

21

Non-privacy constrained applicationsText:Goal: change classification (positive to negative sentiment,

change inferred topic for a post)

Malware: Goal: change classification (from malicious to benign)

Page 22: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Nobody has thought of this?

22

Non-privacy constrained applicationsText:Goal: change classification (positive to negative sentiment,

change inferred topic for a post)

Malware: Goal: change classification (from malicious to benign)

Repeated patterns:

- Model transformation- Find new search algorithm

e.g., Hill climbing, beam search- Evaluate & compare performance

But NO systematic design method

Page 23: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Our proposal: Evasion as a graph

23

Tweets from an account

ML

Protecting users from demographic inferences

Goal change Twitter line classification regarding age

TransformationsUse synonyms Introduce typosChange punctuation

CostKeep the meaning!

Page 24: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

I love Justin Bieber!

24

Tweets from an account

MLOur proposal: Evasion as a graphCost: keep meaning

Page 25: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

I love Justin Bieber!

I love Justin Bieber.

25

I like Justin Bieber I loath Justin Bieber

Tweets from an account

ML

1 2 20

Cost: keep meaningOur proposal: Evasion as a graph

Page 26: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

I love Justin Bieber!

I love Justin Bieber.

26

I like Justin Bieber I loath Justin Bieber

Tweets from an account

I love Justin Trudeau. I love Justin Timberlake.

ML

1 2 20

20 5

cost = 1 + 20 = 21 cost = 1 + 5 = 6

Cost: keep meaningOur proposal: Evasion as a graph

Page 27: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

I love Justin Bieber!

I love Justin Bieber.

27

I like Justin Bieber I loath Justin Bieber

Tweets from an account

I love Justin Trudeau. I love Justin Timberlake.

ML

1 2 20

20 5

cost = 1 + 20 = 21 cost = 1 + 5 = 6

Cost: keep meaning

In privacy problems examples belong to a DISCRETE and CONSTRAINED domain

FEASIBILITY COST me the enemy?

Our proposal: Evasion as a graph

Page 28: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

The graph approach comes with advantages

Enables the use of graph theory toEFFICIENTLY find adversarial examples (A*, beam search, hill climbing, etc)

CAPTURES most attacks in the literature! (comparison base)

Finds provable MINIMAL COST adversarial examples (A*) if

- The discrete domain is a subset of Rm

For example, categorical one-hot encoded features: [0 1 0 0]

- Cost of each single transformation is Lp

For example, L∞([0 1 0 0], [1 0 0 0]) = 1

- We can compute pointwise robustness for the target classifier over Rm30

Page 29: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Finding minimal cost adv. examples: the concept

Confidence of the example

31

Page 30: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Adversarial examples for privacy

Provide privacy in settings where the ML model is adversarial and not cooperative

Privacy is CONSTRAINED , a graphical approach can be used toEFFICIENTLY find FEASIBLE adversarial examples find MINIMAL COST adversarial examples

Even if they cannot be deployed in practice, this approach provides a BASELINE to compare defenses’ efficiency

33

Page 31: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Bonus: applicable to security problems!

MINIMAL COST adversarial examples can become security metrics!

Cost can be associated with RISKCannot stop attacks, but can we ensure they are expensive?

Constrained domains securityContinuous-domains approaches can be very conservative!

34

Page 32: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Only privacy is at stake?

35

Privacy breaches

Page 33: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Only privacy is at stake?

36

Privacy breaches

Data used to optimize …

Page 34: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Only privacy is at stake?

37

Privacy breaches

Data used to optimize …

Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment

Advertisement(e.g., Facebook ads) Routing

(e.g., Waze)

Credit scoring(e.g., FICO)

Page 35: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Introducing the idea(see notes for details)

The ML tsunami on Social Justice

Social Justice

Page 36: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Introducing the idea(see notes for details)

The ML tsunami on Social Justice

Social Justice

Page 37: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

40

Optimization Systems

Optimization systeminteract withBenefit

Algorithm Algorithm…

Page 38: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

41

Optimization Systems

Optimization systeminteract with affectAlgorithm Algorithm…

Page 39: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

42

Optimization Systems

Optimization systeminteract with affectAlgorithm Algorithm…

Page 40: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

43

Optimization Systems

interact with affectOptimization system

Algorithm Algorithm…

Page 41: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

44

Optimization Systems

interact with affectOptimization system

Algorithm Algorithm…

Page 42: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

45

Optimization Systems

interact with affectOptimization system

Algorithm Algorithm…

Page 43: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

46

Optimization Systems

interact with affectOptimization system

Algorithm Algorithm…

Page 44: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

47

Optimization Systems

interact with affectOptimization system

Algorithm Algorithm…

non-users usersnon-usersusers

Page 45: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

48

Optimization Systems

interact with affectOptimization system

Algorithm Algorithm…

non-users usersnon-usersusers

How do we avoid negative effects caused by the Optimization System?(direct and externalities)

Page 46: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

We have fairness research!!“We’re creating algorithms that cause harms,

so we need to fix the algorithms”

Page 47: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

We have fairness research!!

https://www.esat.kuleuven.be/cosic/questioning-the-assumptions-behind-fairness-solutions/A Mulching Proposal: Analysing and Improving an Algorithmic System for Turning the Elderly into High-Nutrient Slurry. Os Keyes, Jevan Hutson, and Meredith Durbin

“We’re creating algorithms that cause harms, so we need to fix the algorithms”

Limited to algorithmic bias within a system

Decontextualized from the system’s goal Ignores other harms

Assumes ML owners have the incentives and the means

Page 48: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Wait! But we have fairness research!!

https://www.esat.kuleuven.be/cosic/questioning-the-assumptions-behind-fairness-solutions/A Mulching Proposal: Analysing and Improving an Algorithmic System for Turning the Elderly into High-Nutrient Slurry. Os Keyes, Jevan Hutson, and Meredith Durbin

“We’re creating algorithms that cause harms, so we need to fix the algorithms”

Limited to algorithmic bias within a system

Assumes ML owners have the incentives and the means

Fairness vs. Optimization Systems harms

Decontextualized from the system’s goal Ignores other harms

Page 49: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Technologies aimed at mitigating externalities of optimization system’s

Protective Optimization Technologies (POTs)

interact with affectOptimization system

Algorithm Algorithm…

non-users usersnon-usersusers

52

Page 50: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Credit scoring

Biased training data Underlying algorithms can:- discriminate applicants on protected attributes like gender or ethnicity - cause feedback loops for populations disadvantaged by the financial system

Credit bureaus have little incentive to change Fairness techniques are incipient and hard to deploy

53

potential risk posed by lending money to consumers and to mitigate losses due to bad debt

Page 51: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

POTs for Credit scoring- Enable users to help others get loans

Bureau

54

Take loans & repay

Poisoning

Page 52: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

- Enable users to help others get loans- Enable discriminated users to get loans

Bureau

Take loans & repay

55

Adversarial examples

Poisoning

POTs for Credit scoring

Page 53: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

- Enable users to help others get loans- Enable discriminated users to get loans

Bureau

Take loans & repay

56

Adversarial examples

Poisoning

POTs for Credit scoring

DISCRETE AND CONSTRAINED!

Page 54: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Adversarial machine learning for social justice

There is a need to protect individuals beyond preserving their privacy

Protective Optimization Technologies can be deployed to help individuals and groups to counter externalities

POTs are also CONSTRAINED so the graphical approach can also be used as technique to EFFICIENTLY find MINIMAL COST adversarial examples

57

Page 55: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

A challenge ahead

Page 56: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Disparate vulnerability

• Machine learning models inherit biases in the training

• Two Key implications

• ML-based attacks are unfair (like any ML-based model…)

59Your Installed Apps Reveal Your Gender and More! Suranga Seneviratne, Aruna Seneviratne, Prasant Mohapatra, and Anirban Mahanti.

Page 57: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Disparate vulnerability

• Machine learning models inherit biases in the training

• Two Key implications

• ML-based attacks are unfair

• Attacks on ML-models are unfair!

60Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning. Mohammad Yaghini, Bogdan Kulynych, Carmela Troncoso

Page 58: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Disparate vulnerability

• Is increased when defending ML models from other shortcomings

61

Page 59: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Disparate vulnerability

• Is increased when defending ML models from other shortcomings

62

Page 60: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Disparate vulnerability

• And blanket defenses have disparate impact on utility!

63

Page 61: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Universal design for protection technologies

We need to take into account attack’s fairness when designing protections

• Is it possible to have secure accurate models with fair privacy?• Security vs. privacy trade-off? • More importantly: fair privacy at the cost of privacy?

• Are adversarial learning-based defenses immune to this issue?• If so, should they be our only way forward?

• Should fairness be a bullet in privacy by design beyond ML?

64

Page 62: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

Takeaways

● Adversarial machine learning is hard to defend from: a great opportunity!

Adversarial machine learning as protective technologies

for privacy (PETs) and social justice (POTs)

● New graphical framework to approach the search of adversarial examples

… we can use of graph theory to improve efficiency and provide guarantees

● The fairness problems of machine learning will become a hurdle for protection!65

Page 63: PETs, POTs, and Pitfalls - USENIX...Prevalent use of optimization algorithms to extract maximum economic value from the manipulation of people's activities and their environment. Advertisement

https://github.com/spring-epfl/

https://spring.epfl.ch/en

http://carmelatroncoso.com/

Bogdan Kulynych

Mohammad Yaghini

Seda Guerses

Rebekah Overdorf

Ero Balsa

Jamie Hayes

Nikita Samarin