Societal consequences, CS 446 (University of Illinois)
Source: mjt.cs.illinois.edu/courses/ml-s19/files/slides-societal...

Page 1

Societal consequences

CS 446

Page 2

A few vignettes

Page 3

A few vignettes

Machine learning is entering human-facing technologies more and more.

- “Big data” and large-scale machine learning allow for mass surveillance (and mass discrimination).

- The choices suggested by machine learning systems are often inscrutable and unreliable; the systems are hard to debug.

- Machine learning systems have obscure failure modes (adversarial examples!) which well-informed attackers can trigger.

How can anyone argue a machine learning system actually implements some policy? How can one argue it doesn’t covertly implement some other policy?

Page 5

YouTube’s “Content ID”

(From Wikipedia.)

YouTube can label videos as violating copyright and take them down. This can remove a creator’s livelihood (demonetization).

Page 6

China’s “social credit system”

(From Wikipedia.)

Who oversees this system? Which political agendas does it actually enforce?

Page 7

Other examples

- What if everyone starts using deep-learning based websites to find dates?

- What if someone tampers with the algorithms?

- What if all cars are self-driving, with deep-learning based vision systems?

- What if someone tampers with the weight vectors so that, in rare situations, the car drives itself into oncoming traffic? Given the existence of adversarial examples, are such modifications detectable?

Page 8

The power of nation-states: Juniper routers (part 1)

(From Wikipedia.)

Page 9

The power of nation-states: Juniper routers (part 2)

(From Wikipedia.)

Once nation-states start tampering with machine learning systems, it’ll get even scarier.

Page 10

Facial recognition and other biometrics?


(From ZDNet.)

Page 11

Facial recognition

(From NYTimes.)

Page 13

Lecture overview

- Today we’ll discuss two fields of study related to societal consequences: fairness and privacy. Here are some references:

- Upcoming fairness textbook: “Fairness and machine learning”; Solon Barocas, Moritz Hardt, Arvind Narayanan; https://fairmlbook.org/.

- Privacy tutorial: “Differentially Private Machine Learning: Theory, Algorithms, and Applications”; Kamalika Chaudhuri, Anand D. Sarwate; https://www.ece.rutgers.edu/~asarwate/nips2017/.

- These fields are still young, and most deployed ML is not adjusted to take them into account.

Page 14

Privacy

Page 15

Privacy

Suppose each example in a data set D = (z_i)_{i=1}^n corresponds to an individual from the population.

- Can the output of a learning algorithm, run on data set D, reveal information about an individual?

- Example: average salary computation. (Recall, the average is the minimizer of the squared-loss risk . . . ) Suppose an attacker knows the salary of all but one person in a company. Claim: by learning the average salary, the attacker knows everyone’s salary, since

    average salary = (1/n) · (your salary + sum of everyone else’s salary).

- Privacy was compromised by the release of the average salary:
  - Before the release of the average salary, the attacker did not know your salary.
  - After the release of the average salary, the attacker knows your salary.
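To make the attack concrete, here is a minimal sketch in Python; the salaries, the released average, and the company size are all made-up numbers, not from the slides:

```python
# Recovering the one unknown salary from a released average.
# The attacker knows n, the released average, and n-1 of the salaries.
known_salaries = [52_000, 61_000, 58_500, 70_000]  # everyone else (illustrative)
released_average = 60_500.0                        # published statistic
n = len(known_salaries) + 1                        # company size

# average = (your_salary + sum(known_salaries)) / n, so solve for your_salary:
your_salary = released_average * n - sum(known_salaries)
print(your_salary)  # 61000.0 -- the one "private" salary is fully recovered
```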

Page 16

Another example

Privacy compromised by release of genome-wide association study (GWAS) results (Wang, Li, Wang, Tang, & Zhou, 2009).

(Figure: published correlation statistics from a GWAS, comparing a cancer group and a healthy group.)

- “individuals can be identified from even a relatively small set of statistics, as those routinely published in GWAS papers”

- Based on your genes, one can learn whether or not you participated in the study.

- One can also learn whether you were in the “disease” group or the “healthy” group.

Page 17

Why this is different from ordinary inference

Scenario: An attacker knows your genetic information, but does not know if you have a certain disease.

1. Suppose the GWAS results allow the attacker to determine that you were in the “disease” group in the study. Is privacy compromised by the release of the GWAS results?

2. Suppose the GWAS results suggest people (like you) with a certain genetic marker are likely to have the disease. Is privacy compromised by the release of the GWAS results?

(In case 1 the release reveals something specific to your participation; in case 2 it supports a statistical inference that could be drawn about you whether or not your data was used.)

Page 18

What we can hope to achieve: differential privacy

What we would like: given the output of the learning algorithm, an attacker should not be able to distinguish between:

- Case 1: the training data contains your data (x_i, y_i).

- Case 2: the training data does not contain your data (x_i, y_i).

Then:

- Whether or not you are included in the training data does not change what the attacker can learn about you.

- Any secrets you have (that couldn’t have been learned otherwise) are not compromised by the learning algorithm.

Learning algorithms with this property are said to guarantee differential privacy (Dwork, McSherry, Nissim, & Smith, 2006).

Page 19

Differential privacy

Formal definition: A randomized algorithm A guarantees ε-differential privacy if, for all data sets D and D′ that differ in just one individual’s data point,

  P(A(D) = t) ∈ (1 ± ε) · P(A(D′) = t)  for all possible outputs t.

Page 20

Example: differentially private average of salaries

Average salary:

- Goal: approximately compute the average salary.

- The sample average is not differentially private: as before, it can be used to recover individual salaries.

- Our algorithm: given a data set D = (s_i)_{i=1}^n of salaries, output

    A(D) = (1/n) · Σ_{i=1}^n s_i + Z,

  where Z is Laplace noise with density

    p_Z(z) = (1/(2σ)) · exp(−|z|/σ),  σ ≈ (max salary)/(εn).

  (“max salary” is assumed to be publicly known.)

(Plot: the Laplace density exp(−|x|)/2 on [−3, 3].)
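A minimal sketch of this mechanism in Python; the salary distribution, ε, and n are illustrative choices, not values from the slides:

```python
import numpy as np

def private_average(salaries, max_salary, epsilon, rng):
    """Release the average salary with Laplace noise of scale
    sigma = max_salary / (epsilon * n), as defined above."""
    n = len(salaries)
    sigma = max_salary / (epsilon * n)      # max_salary is assumed public
    return float(np.mean(salaries)) + rng.laplace(loc=0.0, scale=sigma)

rng = np.random.default_rng(0)
salaries = rng.uniform(30_000, 200_000, size=1_000)  # illustrative data
print(private_average(salaries, max_salary=200_000.0, epsilon=0.5, rng=rng))
```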

Page 23

Why this guarantees differential privacy

Consider data sets of salaries D and D′ that differ only in one entry.

- Our algorithm: A(D) = avg(D) + Z, where

    Z ∼ p_Z(z) = (1/(2σ)) · exp(−|z|/σ),  σ ≈ (max salary)/(εn).

- We need to show that the output densities of A on input D and on input D′ are within 1 ± ε of each other.

- Since D and D′ differ in one entry, say s_i versus s′_i:

    A(D′) = avg(D) + (s′_i − s_i)/n + Z.

- Ratio of probability densities:

    p_{A(D)}(t) / p_{A(D′)}(t)
      = p_Z(t − avg(D)) / p_Z(t − avg(D) − (s′_i − s_i)/n)
      = exp( −(1/σ) · ( |t − avg(D)| − |t − avg(D) − (s′_i − s_i)/n| ) )
      ∈ exp( ±(1/σ) · (max salary)/n )
      ≈ 1 ± ε.
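A quick numerical check of the density-ratio bound; the concrete numbers (n, ε, avg(D)) are illustrative assumptions:

```python
import numpy as np

def laplace_pdf(z, sigma):
    return np.exp(-np.abs(z) / sigma) / (2 * sigma)

n, max_salary, epsilon = 1_000, 200_000.0, 0.5
sigma = max_salary / (epsilon * n)
avg_D = 75_000.0                    # avg(D), illustrative
shift = max_salary / n              # worst case of |s'_i - s_i| / n

t = np.linspace(avg_D - 10 * sigma, avg_D + 10 * sigma, 100_001)
ratio = laplace_pdf(t - avg_D, sigma) / laplace_pdf(t - avg_D - shift, sigma)

# The ratio never exceeds exp(shift / sigma) = exp(epsilon) ~ 1 + epsilon.
print(ratio.max(), np.exp(epsilon))
```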

Page 24

How good is this?

For statistical analysis: often the data set itself is just a random sample, and we care more about the broader population.

- The average on the sample differs from the population mean by Θ(n^{−1/2}) anyway!

- The expected magnitude of the noise Z is σ ≈ (max salary)/(εn).

Overall:

  |A(D) − population mean| ≤ O( (salary stddev)/√n + (max salary)/(εn) ).

Accuracy improves with n, even under the constraint of ε-differential privacy.

- In particular, when the max salary is comparable to the salary stddev, ε ≈ 1/√n is almost “free”.
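A small simulation of this error decomposition; the population (uniform salaries) and parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
max_salary, epsilon = 200_000.0, 0.5
pop_mean = 80_000.0   # mean of Uniform(0, 160_000)

for n in (100, 1_000, 10_000):
    errors = []
    for _ in range(500):
        salaries = rng.uniform(0, 160_000, size=n)
        sigma = max_salary / (epsilon * n)
        release = salaries.mean() + rng.laplace(scale=sigma)
        errors.append(abs(release - pop_mean))
    # Average error shrinks roughly like 1/sqrt(n): the sampling error
    # dominates the O(max_salary / (epsilon * n)) privacy noise.
    print(n, round(float(np.mean(errors)), 1))
```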

Page 25

Recap: privacy

- Privacy can be compromised when sensitive data is used in data analysis / machine learning.

- Differential privacy is one formal definition of privacy that can be rigorously guaranteed and reasoned about.

- There are many techniques for creating “differentially private” versions of statistical methods (including machine learning algorithms).

Page 26

Fairness

Page 27

Fairness

What is fairness?

- Equal treatment of people.

- Equal treatment of groups of people.

- . . .

What is unfair?

- Unequal treatment (of people or groups of people) based on attributes that society deems inappropriate (e.g., race, sex, age, religion).

- . . .

Highly domain- and application-specific.

Why is this of concern with machine learning?

- Systems built using machine learning are increasingly used to make decisions that affect people’s lives.

- More opportunity to misuse and confuse.

Page 28

Example: predictive policing

Predictive policing: data-driven systems for allocating police resources based on predictions of where they are needed.

A Financial Times article from August 22, 2014, by Gillian Tett:

  After all, as the former CPD computer experts point out, the algorithms in themselves are neutral. “This program had absolutely nothing to do with race ... but multi-variable equations,” argues Goldstein. Meanwhile, the potential benefits of predictive policing are profound.

Page 29

Unfairness with machine learning

Many reasons systems using machine learning can be unfair:

- People deliberately seeking to be unfair.

- Disparity in the amounts of available training data for different individuals/groups.

- Differences in the usefulness of available features for different individuals/groups.

- Differences in the relevance of the prediction problem for different individuals/groups.

- . . .

How we might address these problems:

1. Rigorously define and quantify (un)fairness.

2. Identify sources of confusion regarding fairness.

3. . . .

Page 30

Quantifying (un)fairness: statistical parity

Setup for a classification problem:

- X: features of an individual.

- A: protected attribute (e.g., race, sex, age, religion). For simplicity, assume A takes on only two values, 0 and 1.

- Y: output variable to predict (e.g., will repay loan). For simplicity, assume Y takes on only two values, 0 and 1.

- Ŷ: prediction of the output variable (as a function of X and A).

Statistical parity is satisfied if

  P(Ŷ = 1 | A = 0) = P(Ŷ = 1 | A = 1)

(i.e., Ŷ is independent of A). A particular relaxation is the 4/5th rule:

  P(Ŷ = 1 | A = 0) ≥ (4/5) · P(Ŷ = 1 | A = 1).
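As a sketch, statistical parity and the 4/5th rule can be checked directly from arrays of predictions and group labels; the data here is made up for illustration:

```python
import numpy as np

def selection_rates(y_hat, a):
    """Empirical P(Yhat = 1 | A = a) for a in {0, 1}."""
    y_hat, a = np.asarray(y_hat), np.asarray(a)
    return y_hat[a == 0].mean(), y_hat[a == 1].mean()

y_hat = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])  # predictions (illustrative)
a     = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # protected attribute

r0, r1 = selection_rates(y_hat, a)
print(r0, r1)                             # 0.6 vs 0.4: parity fails
print(min(r0, r1) >= 0.8 * max(r0, r1))   # 4/5th rule check: False here
```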

Page 31

Proxies for protected attribute

One cannot simply avoid using A explicitly, because other features could be correlated with it.

Example: the WSJ observed that the online price for an item depends on how far you live from a brick-and-mortar Staples store.

- The pricing system doesn’t explicitly look at your income.

- But where you live is probably correlated with your income.

- Result: higher prices are charged to lower-income people.

Page 32

Problem with statistical parity

The following example satisfies statistical parity:

                 A = 0             A = 1
             Y = 0   Y = 1     Y = 0   Y = 1
  Ŷ = 0       1/2      0        1/4     1/4
  Ŷ = 1        0      1/2       1/4     1/4

Suppose Y = 1{will repay loan if granted one}.

- For group A = 0: loans are correctly given to exactly the people who will repay them.

- For group A = 1: loans are given out at random. (This sets people up for failure!)

Page 33

Equalized odds

Equalized odds is satisfied if

  P(Ŷ = 1 | Y = y, A = 0) = P(Ŷ = 1 | Y = y, A = 1),  y ∈ {0, 1}

(i.e., Ŷ is conditionally independent of A given Y).

- The previous example fails to satisfy equalized odds (verified mechanically in the sketch below):

                 A = 0             A = 1
             Y = 0   Y = 1     Y = 0   Y = 1
  Ŷ = 0       1/2      0        1/4     1/4
  Ŷ = 1        0      1/2       1/4     1/4

  For Y = 0:  0/(0 + 1/2) ≠ (1/4)/(1/4 + 1/4);
  for Y = 1:  (1/2)/(0 + 1/2) ≠ (1/4)/(1/4 + 1/4).

- It is possible to post-process a predictor to make it satisfy equalized odds (w.r.t. the empirical distribution), provided the predictor can look at A (Hardt, Price, & Srebro, 2016).

- Problem: it may be illegal to look at A, even if the purpose is to ensure a fairness property.
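The mechanical check referenced above, using the joint probabilities from the slide’s table:

```python
# Joint probabilities P(Yhat = yh, Y = y | A = a), taken from the table.
# Keys: a -> {(yhat, y): probability}.
joint = {
    0: {(0, 0): 1/2, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1/2},
    1: {(0, 0): 1/4, (0, 1): 1/4, (1, 0): 1/4, (1, 1): 1/4},
}

def p_yhat1_given_y(a, y):
    """P(Yhat = 1 | Y = y, A = a)."""
    return joint[a][(1, y)] / (joint[a][(0, y)] + joint[a][(1, y)])

for y in (0, 1):
    print(y, p_yhat1_given_y(0, y), p_yhat1_given_y(1, y))
# y = 0: 0.0 vs 0.5;  y = 1: 1.0 vs 0.5  -> equalized odds fails
```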

Page 34

Within-group (weak) calibration

Within-group (weak) calibration is satisfied if

  P(Ŷ = 1 | A = a) = P(Y = 1 | A = a),  a ∈ {0, 1}.

Note: suppose Ŷ is a randomized classifier derived from a conditional probability estimator p̂, i.e.,

  Ŷ = 1 with probability p̂,  Ŷ = 0 with probability 1 − p̂.

And further, suppose p̂ satisfies

  P(Y = 1 | p̂ = p, A = 0) = P(Y = 1 | p̂ = p, A = 1) = p,  p ∈ [0, 1]

(i.e., p̂ is a calibrated conditional probability estimator). Then Ŷ satisfies within-group (weak) calibration.
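A small simulation of this note, under the reading of weak calibration given above (P(Ŷ = 1 | A = a) = P(Y = 1 | A = a)); the data-generating process is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
a = rng.integers(0, 2, size=n)                      # protected attribute
p_hat = rng.uniform(0, 1, size=n)                   # estimator's scores
y = (rng.uniform(size=n) < p_hat).astype(int)       # p_hat calibrated by construction
y_hat = (rng.uniform(size=n) < p_hat).astype(int)   # randomized classifier from p_hat

for grp in (0, 1):
    m = a == grp
    # P(Y = 1 | A = a) and P(Yhat = 1 | A = a) nearly coincide:
    print(grp, round(y[m].mean(), 3), round(y_hat[m].mean(), 3))
```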

Page 35

Inherent trade-offs

Theorem (Kleinberg, Mullainathan, & Raghavan, 2016): it is impossible to satisfy both within-group (weak) calibration and equalized odds, unless

- P(Y = 1 | A = 0) = P(Y = 1 | A = 1) (the groups have equal base rates), or

- Ŷ is a perfect predictor of Y.

Note: a similar dichotomy holds for scoring functions / conditional probability estimators, with slightly different definitions of within-group calibration and equalized odds.

Page 36

COMPAS

COMPAS “risk assessment”: predict whether criminal defendants will commit (another) crime.

- A ProPublica study argues that COMPAS is unfair: e.g., the FPR for A = 0 is 0.45, while the FPR for A = 1 is 0.23.

- The study also shows that COMPAS satisfies approximate within-group (weak) calibration:

    P(Y = 1 | Ŷ = 1, A = 0) = 0.64,  P(Y = 1 | Ŷ = 1, A = 1) = 0.59.

  (This was part of a different argument that COMPAS is fair.)

- The inherent trade-off theorem explains why different definitions of fairness may necessarily lead to different conclusions.

Page 37

Ways to confuse fairness: Nature of the training data

Sometimes the available training data is not suitable for the task.

E.g., a system to decide loan applications, where the goal is to predict whether an applicant will repay a loan if granted one.

Is the following training data appropriate?

1. x = features of a past loan applicant,
   y = 1{loan application was approved}.

2. x = features of a past loan applicant who received a loan,
   y = 1{loan was repaid}.

3. x = features of a past loan applicant,
   y = 1{loan was repaid}.

Page 38

Recap: fairness

- Automated decision-making based on machine learning should be scrutinized for appropriate fairness considerations (like any other decision-making system).

- One must think hard about the appropriateness of data for a given application.
