
Detecting Algorithmic Bias (keynote at DIR 2016)


Page 1: Detecting Algorithmic Bias (keynote at DIR 2016)

Detecting Algorithmic Discrimination

15th Dutch-Belgian Information Retrieval Workshop, November 2016

Carlos Castillo @chatox

Based on joint KDD 2016 tutorial with Sara Hajian and Francesco Bonchi

1

Page 2: Detecting Algorithmic Bias (keynote at DIR 2016)

Part I

Part I: Algorithmic Discrimination Concepts
Part II: Algorithmic Discrimination Discovery

2

Page 3: Detecting Algorithmic Bias (keynote at DIR 2016)

Part I: algorithmic discrimination concepts

Motivation and examples of algorithmic bias

Sources of algorithmic bias

Legal definitions and principles of discrimination

Measures of discrimination

Discrimination and privacy

3

Page 4: Detecting Algorithmic Bias (keynote at DIR 2016)

A dangerous reasoning

To discriminate is to treat someone differently

(Unfair) discrimination is based on group membership, not individual merit

People's decisions include objective and subjective elements

Hence, they can discriminate

Algorithmic inputs include only objective elements

Hence, they cannot discriminate?

4

Page 5: Detecting Algorithmic Bias (keynote at DIR 2016)

Judiciary use of COMPAS scores

COMPAS (Correctional Offender Management Profiling for Alternative Sanctions): a 137-question questionnaire and predictive model for "risk of recidivism"

Prediction accuracy of recidivism for blacks and whites is about 60%, but ...

● Blacks who did not reoffend were classified as high risk twice as often as whites who did not reoffend

● Whites who did reoffend were classified as low risk twice as often as blacks who did reoffend

5
ProPublica, May 2016. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
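The two findings above are statements about group-conditional error rates. As an illustration only (this is not ProPublica's analysis code, and the toy data below are hypothetical, not the COMPAS dataset), a minimal sketch of computing false positive and false negative rates per group:

```python
# Sketch: per-group error rates from (group, high_risk, reoffended) records.
# Toy data only; illustrates the kind of disparity described above.
from collections import defaultdict

def error_rates_by_group(records):
    """records: dicts with keys 'group', 'high_risk' (bool), 'reoffended' (bool)."""
    c = defaultdict(lambda: {"fp": 0, "neg": 0, "fn": 0, "pos": 0})
    for r in records:
        g = c[r["group"]]
        if r["reoffended"]:
            g["pos"] += 1
            g["fn"] += not r["high_risk"]   # reoffended, but labeled low risk
        else:
            g["neg"] += 1
            g["fp"] += r["high_risk"]       # did not reoffend, but labeled high risk
    return {grp: {"FPR": v["fp"] / v["neg"], "FNR": v["fn"] / v["pos"]} for grp, v in c.items()}

def rec(group, high_risk, reoffended, n):
    return [{"group": group, "high_risk": high_risk, "reoffended": reoffended}] * n

# Toy counts shaped so that FPR(black) = 2 x FPR(white) and FNR(white) = 2 x FNR(black)
data = (rec("black", True, False, 2) + rec("black", False, False, 2) +
        rec("black", True, True, 3)  + rec("black", False, True, 1) +
        rec("white", True, False, 1) + rec("white", False, False, 3) +
        rec("white", True, True, 2)  + rec("white", False, True, 2))
print(error_rates_by_group(data))
# {'black': {'FPR': 0.5, 'FNR': 0.25}, 'white': {'FPR': 0.25, 'FNR': 0.5}}
```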

Page 6: Detecting Algorithmic Bias (keynote at DIR 2016)

On the web: race and gender stereotypes reinforced

● Results for "CEO" in Google Images: 11% female, vs. 27% female CEOs in the US

○ Also in Google Images, "doctors" are mostly male, "nurses" are mostly female

● Google search results for professional vs. unprofessional hairstyles for work

Image results:"Unprofessionalhair for work"

Image results:"Professionalhair for work"

6
M. Kay, C. Matuszek, S. Munson (2015): Unequal Representation and Gender Stereotypes in Image Search Results for Occupations. CHI'15.

Page 7: Detecting Algorithmic Bias (keynote at DIR 2016)

The dark side of Google Ads

AdFisher: tool to automate the creation of behavioral and demographic profiles.

Used to demonstrate that setting gender = female results in fewer ads for high-paying jobs.

A. Datta, M. C. Tschantz, and A. Datta (2015). Automated experiments on ad privacy settings. Proceedings on Privacy Enhancing Technologies, 2015(1):92–112.

7

Page 8: Detecting Algorithmic Bias (keynote at DIR 2016)

Self-perpetuating algorithmic biases

Ranking algorithm for finding candidates puts Maria at the bottom of the list

Hence, Maria gets fewer job offers

Hence, Maria gets less job experience

Hence, Maria becomes less employable

The same spiral happens in credit scoring, and in stop-and-frisk of minorities, which increases incarceration rates

8

Page 9: Detecting Algorithmic Bias (keynote at DIR 2016)

To make things worse ...

Algorithms are "black boxes" protected by:

Industrial secrecy

Legal protections

Intentional obfuscation

Discrimination becomes invisible

Mitigation becomes impossible

9
F. Pasquale (2015): The Black Box Society. Harvard University Press.

Page 10: Detecting Algorithmic Bias (keynote at DIR 2016)

Weapons of Math Destruction (WMDs)"[We] treat mathematical models as a neutral and inevitable force, like the weather or the tides, we abdicate our responsibility" - Cathy O'Neil

Three main ingredients of a "WMD":

● Opacity
● Scale
● Damage

10
C. O'Neil (2016): Weapons of Math Destruction. Crown Random House.

Page 11: Detecting Algorithmic Bias (keynote at DIR 2016)

Part I: algorithmic discrimination concepts

Motivation and examples of algorithmic bias

Sources of algorithmic bias

Legal definitions and principles of discrimination

Measures of discrimination

Discrimination and privacy

11

Page 12: Detecting Algorithmic Bias (keynote at DIR 2016)

Two areas of concern: data and algorithms

Data inputs:

● Poorly selected (e.g., observe only car trips, not bicycle trips)
● Incomplete, incorrect, or outdated
● Selected with bias (e.g., smartphone users)
● Perpetuating and promoting historical biases (e.g., hiring people that "fit the culture")

Algorithmic processing:

● Poorly designed matching systems
● Personalization and recommendation services that narrow instead of expand user options
● Decision-making systems that assume correlation implies causation
● Algorithms that do not compensate for datasets that disproportionately represent certain populations
● Output models that are hard to understand or explain, hindering detection and mitigation of bias

12
Executive Office of the US President (May 2016): "Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights"

Page 13: Detecting Algorithmic Bias (keynote at DIR 2016)

Detecting algorithmic discrimination

Motivation and examples of algorithmic bias

Sources of algorithmic bias

Legal definitions and principles of discrimination

Measures of discrimination

Discrimination and privacy

13

Page 14: Detecting Algorithmic Bias (keynote at DIR 2016)

Legal concepts

Anti-discrimination legislation typically seeks equal access to employment, working conditions, education, social protection, goods, and services

Anti-discrimination legislation is very diverse and includes many legal concepts:

Genuine occupational requirement (male actor to portray male character)

Disparate impact and disparate treatment

Burden of proof and situation testing

Group under-representation principle

14

Page 15: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination: treatment vs. impact

Modern legal frameworks offer various levels of protection against discrimination based on membership in a particular class: gender, age, ethnicity, nationality, disability, religious beliefs, and/or sexual orientation

Disparate treatment:

Treatment depends on class membership

Disparate impact:

Outcome depends on class membership
Even if (apparently?) people are treated the same way

15

Page 16: Detecting Algorithmic Bias (keynote at DIR 2016)

Proving discrimination is hard

The burden of proof can be shared in discrimination cases:

the accuser must produce evidence of the consequences,

the defendant must produce evidence that the process was fair

(This is the criterion of the European Court of Justice)

16
Migration Policy Group and Swedish Centre for Equal Rights (2009): Proving discrimination cases - the role of situational testing.

Page 17: Detecting Algorithmic Bias (keynote at DIR 2016)

Part I: algorithmic discrimination concepts

Motivation and examples of algorithmic bias

Sources of algorithmic bias

Legal definitions and principles of discrimination

Measures of discrimination

Discrimination and privacy

17

Page 18: Detecting Algorithmic Bias (keynote at DIR 2016)

Principles for quantifying discrimination

Two basic frameworks for measuring discrimination:

Discrimination at the individual level: consistency or individual fairness

Discrimination at the group level: statistical parity

18
I. Žliobaitė (2015): A survey on measuring indirect discrimination in machine learning. arXiv pre-print.

Page 19: Detecting Algorithmic Bias (keynote at DIR 2016)

Individual fairness is about consistency

Consistency score:

C = 1 - (1 / (N·k)) ∑i ∑yj∊knn(yi) |yi - yj|

where knn(yi) = the k nearest neighbors of yi, and N = the number of individuals (a computational sketch follows below)

A consistent or individually fair algorithm is one in which similar people experience similar outcomes … but note that perhaps they are all treated equally badly

19
Richard S. Zemel, Yu Wu, Kevin Swersky, Toniann Pitassi, and Cynthia Dwork. 2013. Learning Fair Representations. In Proc. of the 30th Int. Conf. on Machine Learning. 325–333.
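A minimal sketch of how the consistency score above can be computed, assuming numeric feature vectors, outcomes y in [0, 1], and neighbors found in feature space with scikit-learn (the library choice and the toy data are assumptions, not part of the original):

```python
# Sketch: consistency C = 1 - (1/(N*k)) * sum_i sum_{j in knn(i)} |y_i - y_j|
import numpy as np
from sklearn.neighbors import NearestNeighbors

def consistency(X, y, k=5):
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)   # +1: each point is its own nearest neighbor
    _, idx = nn.kneighbors(X)
    neighbor_idx = idx[:, 1:]                         # drop self
    diffs = np.abs(y[:, None] - y[neighbor_idx])      # |y_i - y_j| over the k neighbors of each i
    return 1.0 - diffs.mean()                         # mean over N*k pairs = normalized double sum

# Toy usage: identical outcomes within each neighborhood -> perfectly consistent, C = 1.0
X = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
y = [1, 1, 1, 0, 0, 0]
print(consistency(X, y, k=2))
```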

Page 20: Detecting Algorithmic Bias (keynote at DIR 2016)

Group fairness is about statistical parity

Example:

"Protected group" ~ "people with disabilities"

"Benefit granted" ~ "getting a scholarship"

20

Intuitively, if a/n1, the fraction of people with disabilities who do not get a scholarship, is much larger than c/n2, the fraction of people without disabilities who do not get a scholarship, then people with disabilities could claim they are being discriminated against.

D. Pedreschi, S. Ruggieri, F. Turini: A Study of Top-K Measures for Discrimination Discovery. SAC 2012.
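For concreteness, a small sketch of the quantities in the intuition above, using the usual 2×2 contingency table notation; the counts a, b, c, d below are hypothetical:

```python
# Sketch of the 2x2 contingency table behind the intuition (hypothetical counts):
#                       benefit denied   benefit granted   total
#   protected group          a                 b            n1
#   unprotected group        c                 d            n2
a, b = 30, 70        # protected (e.g., people with disabilities): denied / granted
c, d = 10, 90        # unprotected: denied / granted
n1, n2 = a + b, c + d

p1 = a / n1          # fraction of the protected group denied the benefit
p2 = c / n2          # fraction of the unprotected group denied the benefit
print(p1, p2)        # 0.3 vs 0.1: the protected group is denied three times as often
```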

Page 21: Detecting Algorithmic Bias (keynote at DIR 2016)

Simple discrimination measures

These measures compare the protected group against the unprotected group (see the sketch below):

● Risk difference = RD = p1 - p2

● Risk ratio or relative risk = RR = p1 / p2

● Relative chance = RC = (1-p1) / (1-p2)
● Odds ratio = RR/RC

21
D. Pedreschi, S. Ruggieri, F. Turini: A Study of Top-K Measures for Discrimination Discovery. SAC 2012.
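A minimal sketch of the four measures, reusing the denial rates p1 (protected group) and p2 (unprotected group) from the contingency-table sketch above; the numeric values are hypothetical:

```python
# Sketch: simple discrimination measures from the two denial rates p1 (protected)
# and p2 (unprotected). Hypothetical values; in practice they come from the table.
p1, p2 = 0.3, 0.1

RD = p1 - p2                  # risk difference
RR = p1 / p2                  # risk ratio (relative risk)
RC = (1 - p1) / (1 - p2)      # relative chance (ratio of "benefit granted" rates)
OR = RR / RC                  # odds ratio = (p1/(1-p1)) / (p2/(1-p2))

print(RD, RR, RC, OR)         # ≈ 0.2, 3.0, 0.78, 3.86
```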

Page 22: Detecting Algorithmic Bias (keynote at DIR 2016)

Simple discrimination measures

These measures compare the protected group against the unprotected group:

● Risk difference = RD = p1 - p2

● Risk ratio or relative risk = RR = p1 / p2

● Relative chance = RC = (1-p1) / (1-p2)
● Odds ratio = RR/RC

22
D. Pedreschi, S. Ruggieri, F. Turini: A Study of Top-K Measures for Discrimination Discovery. SAC 2012.

Mentioned in UK law

Mentioned by EU Court of Justice

US courts focus on selection rates: (1-p1) and (1-p2)

There are many other measures of discrimination!

Page 23: Detecting Algorithmic Bias (keynote at DIR 2016)

Part I: algorithmic discrimination concepts

Motivation and examples of algorithmic bias

Sources of algorithmic bias

Legal definitions and principles of discrimination

Measures of discrimination

Discrimination and privacy

23

Page 24: Detecting Algorithmic Bias (keynote at DIR 2016)

A connection between privacy and discrimination

Determining whether people having attribute X were discriminated against is like inferring attribute X from a database in which:

the attribute X was removed

a new attribute (the decision), which is based on X, was added

This is similar to trying to reconstruct a column from a privacy-scrubbed dataset

24

Page 25: Detecting Algorithmic Bias (keynote at DIR 2016)

Part II

Part I: Algorithmic Discrimination Concepts
Part II: Algorithmic Discrimination Discovery

25

Page 26: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

26

Page 27: Detecting Algorithmic Bias (keynote at DIR 2016)

The discrimination discovery task at a glance

Given a large database of historical decision records,

find discriminatory situations and practices.

27
S. Ruggieri, D. Pedreschi and F. Turini (2010). DCUBE: Discrimination discovery in databases. In SIGMOD, pp. 1127-1130.

Page 28: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

28

Page 29: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

29

Classification rule mining: Group discrimination
k-NN classification: Individual discrimination
Bayesian networks: Individual discrimination
Probabilistic causation: Individual/Group discrimination
Privacy attack strategies: Group discrimination
Predictability approach: Group discrimination

Page 30: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

30

Classification rule mining: Group discrimination
k-NN classification: Individual discrimination
Bayesian networks: Individual discrimination
Probabilistic causation: Individual/Group discrimination
Privacy attack strategies: Group discrimination
Predictability approach: Group discrimination

See KDD 2016 tutorial by S. Hajian, F. Bonchi, C. Castillo

Page 31: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

31

Classification rule mining
k-NN classification

D. Pedreschi, S. Ruggieri and F. Turini (2008). Discrimination-aware data mining. In KDD'08.
D. Pedreschi, S. Ruggieri, and F. Turini (2009). Measuring discrimination in socially-sensitive decision records. In SDM'09.
S. Ruggieri, D. Pedreschi, and F. Turini (2010). Data mining for discrimination discovery. In TKDD 4(2).

Page 32: Detecting Algorithmic Bias (keynote at DIR 2016)

Defining potentially discriminated (PD) groups

A subset of attribute values is perceived as potentially discriminatory based on background knowledge. Potentially discriminated groups are people with those attribute values.

Examples:

Female gender

Ethnic minority (racism) or minority language

Specific age range (ageism)

Specific sexual orientation (homophobia)

32

Page 33: Detecting Algorithmic Bias (keynote at DIR 2016)

Direct discrimination

Direct discrimination implies rules or procedures that impose ‘disproportionate burdens’ on minorities

A PD rule is any classification rule of the form:

A, B → C

where A is a PD group (B is called a "context")

33

Example:gender="female", saving_status="no known savings"

→ credit=no
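To connect this to a measure: in the classification-rule-mining approach of Pedreschi et al. (2008), a PD rule like the one above is scored by comparing its confidence to the confidence of the same rule without the PD attribute (the "extended lift", elift). A minimal sketch on a hypothetical pandas DataFrame (data and column names are illustrative):

```python
# Sketch: confidence and extended lift (elift) of a PD rule A,B -> C, here
#   A: gender == "female", B: savings == "no known savings", C: credit == "no".
# elift compares the rule's confidence with that of B -> C alone.
import pandas as pd

df = pd.DataFrame({
    "gender":  ["female", "female", "female", "male", "male", "male"],
    "savings": ["no known savings"] * 6,
    "credit":  ["no", "no", "yes", "yes", "yes", "no"],
})

A = df["gender"] == "female"                     # PD group
B = df["savings"] == "no known savings"          # context
C = df["credit"] == "no"                         # negative decision

conf_AB_C = (A & B & C).sum() / (A & B).sum()    # conf(A,B -> C)
conf_B_C  = (B & C).sum() / B.sum()              # conf(B -> C)
elift = conf_AB_C / conf_B_C
print(conf_AB_C, conf_B_C, elift)                # ≈ 0.67, 0.5, 1.33
```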

Page 34: Detecting Algorithmic Bias (keynote at DIR 2016)

Indirect discrimination

Indirect discrimination implies rules or procedures that impose ‘disproportionate burdens’ on minorities, though not explicitly using discriminatory attributes

Potentially non-discriminatory (PND) rules may unveil discrimination, and are of the form:

D, B → C where D is a PND group

34

Example:neighborhood="10451", city="NYC"

→ credit=no

Page 35: Detecting Algorithmic Bias (keynote at DIR 2016)

Indirect discrimination example

Suppose we know that with high confidence:
(a) neighborhood=10451, city=NYC → benefit=deny

But we also know that with high confidence:
(b) neighborhood=10451, city=NYC → race=black

Hence:
(c) race=black, neighborhood=10451, city=NYC → benefit=deny

Rule (b) is background knowledge that allows us to infer (c), which shows that rule (a) is indirectly discriminating against blacks

35

Page 36: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

36

B. T. Luong, S. Ruggieri, and F. Turini (2011). k-NN as an implementation of situation testing for discrimination discovery and prevention. KDD'11

Classification rule mining
k-NN classification

Page 37: Detecting Algorithmic Bias (keynote at DIR 2016)

Situation testing

● Legal approach for creating controlled experiments

● Matched pairs undergo the same situation, e.g. apply for a job

○ Same characteristics apart from the discrimination ground

37

Page 38: Detecting Algorithmic Bias (keynote at DIR 2016)

k-NN as situation testing (algorithm)

For r ∈ P(R), look at its k closest neighbors:

● ... in the protected set
○ define p1 = proportion with the same decision as r

● ... in the unprotected set
○ define p2 = proportion with the same decision as r

● measure the degree of discrimination of the decision for r
○ define diff(r) = p1 - p2 (think of it as expressed in percentage points of difference)

38

Worked example (from the slide's illustration): among knnP(r,k), p1 = 0.75; among knnU(r,k), p2 = 0.25; hence diff(r) = p1 - p2 = 0.50 (see the sketch below)
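A minimal sketch of the computation just described, reproducing the worked example's numbers; the use of scikit-learn, the toy data, and all variable names are illustrative assumptions, not part of the original method description:

```python
# Sketch: k-NN situation testing. For a protected record r, compare the fraction
# of its k nearest *protected* neighbors with the same decision (p1) against the
# same fraction among its k nearest *unprotected* neighbors (p2).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_situation_diff(r_x, r_decision, X_prot, y_prot, X_unprot, y_unprot, k=4):
    """diff(r) = p1 - p2, in [-1, 1]; large positive values suggest discrimination."""
    p = []
    for X, y in ((X_prot, y_prot), (X_unprot, y_unprot)):
        nn = NearestNeighbors(n_neighbors=min(k, len(X))).fit(X)
        _, idx = nn.kneighbors([r_x])
        p.append(np.mean(np.asarray(y)[idx[0]] == r_decision))
    return p[0] - p[1]

# Toy usage: protected neighbors mostly denied, unprotected neighbors mostly granted
X_prot, y_prot = [[1.0], [1.1], [0.9], [1.2]], ["deny", "deny", "deny", "grant"]
X_unprot, y_unprot = [[1.0], [1.1], [0.9], [1.2]], ["grant", "grant", "grant", "deny"]
print(knn_situation_diff([1.0], "deny", X_prot, y_prot, X_unprot, y_unprot, k=4))  # 0.5
```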

Page 39: Detecting Algorithmic Bias (keynote at DIR 2016)

k-NN as situation testing (results)

● If decision=deny-benefit and diff(r) ≥ t, then we found discrimination around r

39

Page 40: Detecting Algorithmic Bias (keynote at DIR 2016)

Discrimination discovery

Definition

Data mining approaches

Discriminatory rankings

40

Page 41: Detecting Algorithmic Bias (keynote at DIR 2016)

How to generate synthetic discriminatory rankings?

How to generate a ranking that is "unfair" towards a protected group?

Let P be the protected group, N be the non-protected group

● Rank each group P, N independently
● While both are non-empty:

○ With probability f pick top from protected, (1-f) top from non-protected

● Fill in the bottom positions with the remaining group (see the sketch below)

41
Ke Yang, Julia Stoyanovich: Measuring Fairness in Ranked Outputs. FATML 2016.
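A minimal sketch of this generation procedure; the items and group labels are hypothetical, and each group is assumed to be already ranked internally (best candidate first):

```python
# Sketch: generate a ranking that is "unfair" towards the protected group.
# P and N are lists already ranked within each group; at each step the next
# item comes from P with probability f, otherwise from N.
import random

def synthetic_ranking(P, N, f, seed=None):
    rng = random.Random(seed)
    P, N = list(P), list(N)          # work on copies
    ranking = []
    while P and N:
        source = P if rng.random() < f else N
        ranking.append(source.pop(0))
    ranking.extend(P or N)           # fill the bottom with whichever group remains
    return ranking

# Usage: f = 0.0 pushes all protected items to the bottom; f = 0.5 interleaves fairly
P = ["p1", "p2", "p3"]
N = ["n1", "n2", "n3"]
print(synthetic_ranking(P, N, f=0.0, seed=1))   # ['n1', 'n2', 'n3', 'p1', 'p2', 'p3']
print(synthetic_ranking(P, N, f=0.5, seed=1))
```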

Page 42: Detecting Algorithmic Bias (keynote at DIR 2016)

Synthetic ranking generation example

42
Ke Yang, Julia Stoyanovich: Measuring Fairness in Ranked Outputs. FATML 2016.

[Figure: example rankings generated with f = 0.0, f = 0.3, and f = 0.5]

Page 43: Detecting Algorithmic Bias (keynote at DIR 2016)

Measuring discrimination using NDCG framework

Use logarithmic discounts per position:

discount(i) = 1 / log2(i)

Compute set-based differences at different positions

fairness = ∑i=10,20,... discount(i) · diff(P, N, TopResult1...TopResulti)

diff(P, N, Results) evaluates to what extent P and N are represented in the results, e.g., as proportions (a sketch follows below)

43
Ke Yang, Julia Stoyanovich: Measuring Fairness in Ranked Outputs. FATML 2016.

i discount(i)

5 0.43

10 0.30

20 0.23

30 0.20

See Yang and Stoyanovich 2016 for details
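A minimal sketch of a discounted, set-based measure in the spirit of the formulas above (the exact normalization used by Yang and Stoyanovich is omitted here); diff is instantiated as the absolute difference between the protected group's share in the top-i results and its overall share, and all data are hypothetical:

```python
# Sketch of a discounted set-based fairness measure in the spirit of rND
# (Yang & Stoyanovich 2016); normalization is omitted for simplicity.
import math

def discounted_set_difference(ranking, protected, cutoffs=(10, 20, 30)):
    overall_share = sum(x in protected for x in ranking) / len(ranking)
    total = 0.0
    for i in cutoffs:
        if i > len(ranking):
            break
        top_share = sum(x in protected for x in ranking[:i]) / i   # protected share in top-i
        total += (1.0 / math.log2(i)) * abs(top_share - overall_share)
    return total   # 0 means proportional representation at every cutoff

# Usage with a hypothetical ranking of 30 items where the protected half sits at the bottom
ranking = [f"n{j}" for j in range(15)] + [f"p{j}" for j in range(15)]
protected = {f"p{j}" for j in range(15)}
print(discounted_set_difference(ranking, protected))
```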

Page 44: Detecting Algorithmic Bias (keynote at DIR 2016)

Fair rankings

Given a ranking policy and a fairness policy (possibly an affirmative action one), produce a ranking that:

● Ensures group fairness
○ E.g.: sum of discounted utility for each group, ...

● Minimizes individual unfairness
○ E.g.: average distance from color-blind ranking, people ranked below lower scoring people, …

Work in progress!

44
Work in progress with Meike Zehlike, Sara Hajian, Francesco Bonchi

Page 45: Detecting Algorithmic Bias (keynote at DIR 2016)

Concluding remarks

45

Page 46: Detecting Algorithmic Bias (keynote at DIR 2016)

Concluding remarks

1. Discrimination legislation attempts to create a balance
2. Algorithms can discriminate
3. Methods to avoid discrimination are not "color blind"
4. Group fairness (statistical parity) vs. individual fairness

46

Page 47: Detecting Algorithmic Bias (keynote at DIR 2016)

Additional resources

● Presentations/keynotes/book

○ Sara Hajian, Francesco Bonchi, and Carlos Castillo: Algorithmic Bias Tutorial at KDD 2016

○ Workshop on Fairness, Accountability, and Transparency on the Web at WWW 2017

○ Suresh Venkatasubramanian: Keynote at ICWSM 2016

○ Ricardo Baeza-Yates: Keynote at WebSci 2016

○ Toon Calders: Keynote at EGC 2016

○ Discrimination and Privacy in the Information Society by Custers et al. 2013

● Groups/workshops/communities
○ Fairness, Accountability, and Transparency in Machine Learning (FATML) workshop and resources

○ Data Transparency Lab - http://dtlconferences.org/

47