What algorithms are, why they might need governing, and how we might do it.

Algorithm Workshop, University of Strathclyde, 15 Feb 2017

Michael Veale @mikarv | [email protected]

Department of Science, Technology, Engineering & Public Policy (STEaPP)

University College London



Page 2

1. What are algorithms?

Page 3

Algorithms: A broad view from computer science

Well-defined (not vague) instructions that logically frame a problem and specify an approach to tackle it

Alan says: “Everything computers can do can be represented as a step-by-step process”

Sources: doi: 10.1145/359131.359136
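To make "well-defined instructions" concrete, here is a minimal Python sketch of a classic step-by-step procedure, Euclid's algorithm for the greatest common divisor. The choice of this algorithm is an illustration added here and does not come from the slides.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor via Euclid's algorithm."""
    a, b = abs(a), abs(b)
    # One unambiguous step, repeated until the remainder is zero.
    while b != 0:
        a, b = b, a % b
    return a


if __name__ == "__main__":
    print(gcd(48, 18))  # 6
```

Each step is fully specified, so two people (or machines) following it on the same inputs should always reach the same answer.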

Page 4

Accounting algorithms & algorithmic controversies are old news

~3500 BC and ~1500 AD

Sources: doi: 10.2307/2490013; doi: 10.1007/978-3-642-18192-4

Page 5

These algorithms can be long and complicated, but structured

Source: http://www.breezetree.com/articles/nassi-shneiderman-diagram.htm

Nassi-Shneiderman diagrams

Page 6

What’s new(er)? Machine learning algorithms

Machine learning identifies and utilises patterns in data.

We say a machine ‘learns’ to perform a task when a measurement of its performance increases with new data.

Machine learning algorithms and models are step-by-step in a Turing sense, but we are more interested in their complex, emergent properties.

IBM Watson says: “Several modules of my programme are not specified explicitly by humans, but induced from data”

Sources: Mitchell, Tom M. "Machine learning" McGraw Hill (definition)
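The definition of 'learning' above can be sketched as a learning curve: train the same simple model on more and more data and measure its accuracy on held-out data. Everything in this sketch is an assumption made for illustration (synthetic Gaussian data, a one-dimensional threshold rule, the function names); accuracy usually rises with more data here, but it is not guaranteed to rise at every step.

```python
import random


def make_data(n_per_class, rng):
    """Synthetic labelled points: class 0 ~ N(0, 1), class 1 ~ N(3, 1)."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            data.append((rng.gauss(3.0 * label, 1.0), label))
    return data


def fit_threshold(train):
    """'Learn' a cut-off: the midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2


def accuracy(threshold, test):
    """Share of test points where 'x > threshold' matches the label."""
    correct = sum((x > threshold) == (y == 1) for x, y in test)
    return correct / len(test)


def learning_curve(sizes, seed=0):
    """Held-out accuracy for each training size (examples per class)."""
    rng = random.Random(seed)
    test = make_data(1000, rng)
    return {n: accuracy(fit_threshold(make_data(n, rng)), test) for n in sizes}


if __name__ == "__main__":
    for n, acc in learning_curve([1, 10, 1000]).items():
        print(f"{n:>5} examples per class -> accuracy {acc:.3f}")
```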

Page 7

More complex algorithms

What is machine learning? Identifying and using patterns in data. Several types:

• Supervised learning (most low-hanging fruit): give me labelled data, I learn to predict labels when they’re missing.
• Unsupervised learning (useful, but less deployed): give me unlabelled data, I detect clusters and structure.

Labelled data:

Variables                    Label
£££    age    edu.           repays loan?
28k    27     Degree         Y
22k    22     High Sch.      N
27k    34     Degree         N

Learning algorithm: neural network / decision tree / support vector machine

Predicted labels:

Variables                    Label
£££    age    edu.           repays loan?
31k    22     Degree         Y (60% chance)
40k    29     High Sch.      Y (80% chance)
22k    31     Degree         N (55% chance)
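The supervised-learning workflow in the tables above can be sketched in a few lines of Python. This is only an illustration: it uses a plain logistic regression written from scratch (the slide lists neural networks, decision trees and support vector machines instead), the encoding of "edu." as a 0/1 degree flag is an assumption, and three training rows are far too few for a meaningful model. The probabilities it prints are not expected to match the percentages on the slide.

```python
import math

# Training rows copied from the slide: (income in £k, age, has degree, repaid).
TRAIN = [(28, 27, 1, 1), (22, 22, 0, 0), (27, 34, 1, 0)]
# Unlabelled rows from the slide's second table (labels to be predicted).
NEW = [(31, 22, 1), (40, 29, 0), (22, 31, 1)]


def standardise(rows, stats=None):
    """Scale columns to zero mean, unit variance, reusing training stats if given."""
    if stats is None:
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
            stats.append((mean, std))
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Estimated probability that the label is 1 (loan repaid)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit weights by batch gradient descent on the mean logistic loss."""
    w, b, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = predict_proba(w, b, x) - y
            grad_b += err / n
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


x_train, stats = standardise([row[:3] for row in TRAIN])
y_train = [row[3] for row in TRAIN]
w, b = train_logistic(x_train, y_train)
x_new, _ = standardise(NEW, stats)
probabilities = [predict_proba(w, b, x) for x in x_new]

if __name__ == "__main__":
    for row, p in zip(NEW, probabilities):
        print(row, "Y" if p >= 0.5 else "N", f"(estimated P(repays) = {p:.2f})")
```

The model here is 'induced from data': nobody writes the decision rule by hand, only the procedure that finds the weights.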

Page 8

Complicated vs Complex algorithms

Complicated algorithms
- Structure is opaque by its extent
- Linearity
- Deterministic outcomes
- Reductive characteristics: structures determine logic
- Logic first
- Examples: if–else-while-foreach flowcharts, scorecards

Complex algorithms
- Structure inherently opaque
- Non-linearities
- Probabilistic outcomes
- Emergent characteristics: structure and logics co-constitute each other
- Data first
- Examples: machine learning models, neural networks, random forests, evolutionary algorithms, agent-based models

Source: Author
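For contrast with learned models, a 'complicated' algorithm in the sense above might look like the following Python scorecard. The thresholds and point values are hypothetical, chosen only to show the logic-first, deterministic style.

```python
def scorecard(income_k: float, age: int, has_degree: bool) -> str:
    """Hand-written rules: the logic comes first, not the data."""
    points = 0
    if income_k >= 25:  # hypothetical threshold
        points += 2
    if age >= 25:  # hypothetical threshold
        points += 1
    if has_degree:
        points += 1
    # Every path through the rules can be read off and audited line by line.
    return "approve" if points >= 3 else "refer"


if __name__ == "__main__":
    print(scorecard(28, 27, True))   # approve
    print(scorecard(22, 22, False))  # refer
```

A real scorecard may run to hundreds of such rules, which makes it opaque by extent, but each individual rule stays readable.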

Page 9

Let’s run a neural network, now (hopefully)

http://playground.tensorflow.org/

Page 10

What can machine learning do fairly comfortably?

Modelling with reasonable amounts of well-ordered data
• Model certain societal phenomena
  • Crime, tax fraud, etc.
• Predict characteristics, e.g. gender, income, etc., from online behaviour
• Detect basic emotions from physiological data
• Detect anomalies
• Target advertising

Source: HunchLab, Azavea

Page 11

Recent machine learning advances

Mostly in domains where we’ve now got so much more data
• Voice recognition
• Image recognition (one or multiple objects in a scene)
• Playing games (learning from only pixels and scores)
• Understanding how meanings of words relate to each other
• Translation between languages
• Driving cars

Source: Google, https://www.tensorflow.org

Page 12

Where does machine learning struggle?

Struggles
• ‘Idiot savants’: Multi-tasking and transferring learning
• Dealing with messy data
• Identifying context in images
• Summarising text
• Generalising from small amounts of data

Future directions
• More unsupervised learning
  • If this can’t be done, we restrict problems to only the situations where we can get lots of data

Source: xkcd, “Tasks”

Page 13

Algorithms affect and are affected by people

Algorithms are actors that ‘do’ things in the world

We shape technologies and technologies shape us.

Technologies aren’t neutral, but political and distributive.

We should scrutinise the world views of ‘innovators’.

Bruno says: “When machines appear to settle matters of fact, we start to look more at inputs and outputs than their internals.”

Source: Latour, Bruno (1999). Pandora’s hope: Essays on the reality of science studies. Cambridge, MA: Harvard University Press.

Page 14

Where are the system boundaries of ‘algorithm’?

Diagram labels:
• models
• organisations: managers and maintainers, data collectors/cleaners, decision support users, decision-subjects
• designers, data sources
• the wider world: political, economic, legal, environmental systems

Disciplines: ethics, public policy, law, economics, organisational sciences, science and technology studies, business, sociology, operations research, human–computer interaction, requirements engineering, computer science, philosophy of science, statistics, mathematics

Health warning: this is indicative! In general, disciplines resist easy linear ordering. Many important ones are also missing.

Page 15

Some reasons to govern algorithms

Page 16

Fairness and discrimination

• Discrimination over protected characteristics
  • Actually including race, gender, etc. in a model.
  • Indirectly including them through other variables, intentionally or not.
• Unfairness over non-protected characteristics
  • Decisions on actions that seem arbitrary: being a tall redhead, having curiously searched for certain keywords.
• Unfairness from entrenching inequality
  • Deciding people or areas will be bad tomorrow because they have been bad in the past
• Unfairness from algorithmic memory and resolution
  • Panoptical society/‘perfect discrimination’
  • No ‘clean slates’

Page 17

Transparency (accountability as virtue)

• Can’t get information about the algorithm
  • IP or lack of open data/standards
• Aren’t equipped to understand the algorithm
  • Skill mismatch, or poorly commented code
• Algorithm too complex to understand conventionally
  • Machine learning system without human interpretation

Photo: Author

Page 18

Accountability (as mechanism, or forum)

a relationship between an actor and a forum, in which the actor has an obligation to explain and to justify his or her conduct, the forum can pose questions and pass judgement, and the actor may face consequences (Bovens, 2007)

• No idea what people actually do with the algorithm
  • Part of an opaque decision-support/making system
• No idea if there even is an algorithm
  • Unclear process, or silent negative provision
• No due process or easy redress
  • No non-automated institutions that work at sufficient speed/scale
• Issues compounded by technical/institutional opacity

Source: doi: 10.1111/j.1468-0386.2007.00378.x; Photo: flickr: blondavenger, CC BY-NC-ND

Page 19

Reliability, resilience, robustness

• Does this algorithm even work?
  • Does performance justify fairness/accountability issues?
• Is the algorithm robust to change?
  • Will it stop working tomorrow, or on Wednesdays?
• Does the algorithm create problematic feedback loops?
  • Are future data collected a function of decisions that were made previously?
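The feedback-loop question can be made concrete with a small Python simulation. The scenario is hypothetical (two areas with identical true incident rates, and a greedy rule that sends the only patrol wherever most incidents have been recorded), and it uses expected values rather than random draws to stay deterministic.

```python
def simulate_patrols(true_rate, recorded, rounds=20):
    """Greedy allocation: patrol the area with the most recorded incidents.

    Incidents are only recorded where the patrol goes, so each round's
    new data are a function of the previous round's decision.
    """
    recorded = list(recorded)
    for _ in range(rounds):
        target = recorded.index(max(recorded))  # decision based on past data
        recorded[target] += true_rate[target]   # new data produced by that decision
    return recorded


# Both areas have the same underlying rate; area 0 starts one record ahead.
history = simulate_patrols(true_rate=[10, 10], recorded=[11, 10])

if __name__ == "__main__":
    print(history)  # [211, 10]
```

A one-record head start turns into a large recorded gap even though the two areas are identical, because the data never get a chance to contradict the earlier decisions.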

Photo: Italian Job (film), fair use

Page 20

Security and gaming

• Can algorithms be gamed by malicious adversaries?
  • Risk of manipulation: make biased, or make useless
  • Risk of private data release
  • Risk of interaction with other cyberphysical systems

Source: doi: 10.1145/2976749.2978392

Page 21

Mythos of neutrality and objectivity

Page 22

Algorithmic races and anti-competitive behaviour

• Algorithms power business models but are powered themselves by data
• Those that hold the most data can keep others out of the market

Photo: NASA, public domain

Page 23

As Brixton Station sagely asks us…

Source: Photo, author. Art: Giles Round. See http://art.tfl.gov.uk/projects/design-work-leisure/

Page 24

Technical governance: ‘FATML’ tools

Source: www.fatml.org

Page 25

Technical governance: Removing discrimination

For more, see Kamiran, F. et al. (2012). Techniques for Discrimination-Free Predictive Models. doi: 10.1007/978-3-642-30487-3_12

What kind of discrimination?
• direct (use of protected characteristics)
• indirect (use of correlated characteristics)
• both (mix)

Discrimination removal strategies

Stages: data, learning algorithm, model, output
• preprocessing (massage the data)
• inprocessing (change the learning logic)
• postprocessing (alter the learned model)
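As one illustration of the preprocessing strategy, the Python sketch below applies a technique usually called reweighing in the fairness literature: each training row gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent to a learner that respects the weights. Note that this is a different preprocessing technique from the 'massaging' named on the slide, and the toy data and function names are hypothetical.

```python
from collections import Counter


def reweigh(rows):
    """rows: list of (group, label) pairs. Returns one weight per row."""
    n = len(rows)
    group_counts = Counter(g for g, _ in rows)
    label_counts = Counter(y for _, y in rows)
    cell_counts = Counter(rows)
    # weight = P(group) * P(label) / P(group, label)
    return [
        (group_counts[g] / n) * (label_counts[y] / n) / (cell_counts[(g, y)] / n)
        for g, y in rows
    ]


def weighted_positive_rate(rows, weights, group):
    """Weighted share of positive labels within one group."""
    total = sum(w for (g, _), w in zip(rows, weights) if g == group)
    positive = sum(w for (g, y), w in zip(rows, weights) if g == group and y == 1)
    return positive / total


# Hypothetical toy data: group "a" has 6/8 positive labels, group "b" 2/8.
ROWS = [("a", 1)] * 6 + [("a", 0)] * 2 + [("b", 1)] * 2 + [("b", 0)] * 6
WEIGHTS = reweigh(ROWS)

if __name__ == "__main__":
    for group in ("a", "b"):
        print(group, round(weighted_positive_rate(ROWS, WEIGHTS, group), 3))
```

After reweighing, both groups have the same weighted positive rate, so the imbalance in the raw labels no longer rewards a model for leaning on group membership. The sketch assumes every group and label combination appears at least once.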

Page 26

Technical governance: Removing discrimination, but…

Source: arXiv:1606.06121v1

Page 27

Adding transparency

Sources: Tickle, Alan B., et al. "The truth will come to light: Directions and challenges in extracting the knowledge embedded within trained artificial neural networks." IEEE Transactions on Neural Networks 9.6 (1998): 1057-1068; Andrews, Robert, Joachim Diederich, and Alan B. Tickle. "Survey and critique of techniques for extracting rules from trained artificial neural networks." Knowledge-based systems 8.6 (1995): 373-389.

• Decompositional: make/use a more interpretable model
  • regression
  • decision trees
• Pedagogical/model-agnostic: wrap an uninterpretable model with a simpler one to estimate its logics
  • LIME (next slide)
  • rule extraction

?
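The pedagogical idea of wrapping an uninterpretable model with a simpler one can be sketched as follows in Python. This is not LIME: it fits a single global rule of the form "feature > threshold" to the outputs of a stand-in black box, and reports how often the rule agrees with it (its fidelity). The black-box function, the input grid and all names are hypothetical.

```python
import math


def black_box(income, age):
    """Stand-in for an uninterpretable model; we only query its outputs."""
    return 1 if income - 30 + 3 * math.sin(age) > 0 else 0


def fit_stump(points, labels):
    """Find the rule 'feature > threshold' that best mimics the labels."""
    best = (0.0, None, None)  # (fidelity, feature index, threshold)
    for feature in range(len(points[0])):
        for threshold in sorted({p[feature] for p in points}):
            agree = sum(
                (p[feature] > threshold) == (y == 1)
                for p, y in zip(points, labels)
            )
            fidelity = agree / len(points)
            if fidelity > best[0]:
                best = (fidelity, feature, threshold)
    return best


# Query the black box on a grid of (income in £k, age) inputs.
POINTS = [(income, age) for income in range(10, 50, 2) for age in range(20, 60, 5)]
LABELS = [black_box(*p) for p in POINTS]
FIDELITY, FEATURE, THRESHOLD = fit_stump(POINTS, LABELS)

if __name__ == "__main__":
    name = ("income", "age")[FEATURE]
    print(f"surrogate rule: {name} > {THRESHOLD} (fidelity {FIDELITY:.2f})")
```

The surrogate is readable but only approximate: where its fidelity falls short of 1, the simple rule and the black box disagree, and the explanation should not be trusted for those cases.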

Page 28

Adding transparency: emerging methods

Ribeiro, M.T. et al. (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arxiv:1602.04938

Page 29

Governance approaches: food for thought

Could we…

- Start a 3rd party standards and certification scheme, with regular independent audits?

- Foster ‘chartered’ data scientists with better ethical training and professional codes?

- Better enforce the laws we have with a statutory algorithmic investigatory body/supercomplaint watchdog?

- Do something completely different?

🤖

Page 30

> fin # tweet me @mikarv > |