Brian Moran Wherein our intrepid explorer provides an account of many curious encounters with data... 28 September 2016

Adventures in data science


Page 1: Adventures in data science

Brian Moran

Wherein our intrepid explorer provides an account of many curious encounters with data...

28 September 2016

Page 2: Adventures in data science

Adapted from http://www.anlytcs.com/2014/01/data-science-venn-diagram-v20.html

Data Science

Mathematics and Statistics

Research Methods

UNICORN

Machine Learning

Subject Matter Expertise

Programming Skills

Computer Science

What Does it Involve?

Page 3: Adventures in data science

Business Understanding

Data Understanding

Data Preparation

Modelling

Evaluation

Deployment

What Does it Do?

The CRISP-DM data mining process

Page 4: Adventures in data science

Check email

Naive Bayes

Internet search

PageRank

Watch Netflix

Boltzmann machine

Buy lunch

Artificial neural net

Use Sat-Nav

Dijkstra’s algorithm

Apply for a loan

Decision trees

Shop on Amazon

Matrix factorization

Get a letter

k-means clustering

Diagrams (but not text) from: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/

Data Science Applications are Everywhere

Page 5: Adventures in data science
Page 6: Adventures in data science
Page 7: Adventures in data science

Welcome back Suzy

Page 8: Adventures in data science

Welcome back Jon

Page 9: Adventures in data science

Recommend money advice

Alert

Close

Likelihood: 82%

Page 10: Adventures in data science

Recommend alternative

Alert

Close

Chance of broken appointment: 90%

Better alternative:

2:00pm - 5:00pm

Page 11: Adventures in data science

Recommend incentives

Advice

Close

Tenancy value:

Negative Low High

Page 12: Adventures in data science

One click repair adactus

Page 13: Adventures in data science

1. Repair request → Delay → Staff check → Staff input → Repair ordered

2. Repair request → Repair ordered

3. Repair request → Machine Learning → Problem? N: Repair ordered. Y: Staff check → Staff input → Repair ordered

Options for Back-office Repairs Process

Page 14: Adventures in data science

Proof of concept 1: Natural language processing with Naive Bayes…

Page 15: Adventures in data science

REPAIR REQUEST

TRADE

Check web form

Naive Bayes

Page 16: Adventures in data science

“Our bathroom is leaking through the ceiling onto the stairs and the ceiling is wet through along with the walls where the taps are mounted. And puddle on the stairs”

PLUMBER

Check web form

Naive Bayes

Page 17: Adventures in data science

Check web form

Naive Bayes

bathroom leaking ceiling stairs ceiling wet walls taps stairs

PLUMBER
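The step from a free-text repair request to a trade can be sketched with a small multinomial Naive Bayes classifier. This is a minimal illustration only, not the model behind the proof of concept: the training sentences, the trade labels other than PLUMBER, and the function names `tokenize`, `train` and `classify` are all assumptions made up for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training data, invented for illustration (not from the talk).
TRAINING = [
    ("tap leaking in bathroom and ceiling wet", "PLUMBER"),
    ("toilet blocked water on floor", "PLUMBER"),
    ("radiator leaking onto stairs", "PLUMBER"),
    ("light switch sparking in hall", "ELECTRICIAN"),
    ("socket not working in kitchen", "ELECTRICIAN"),
    ("front door lock broken", "JOINER"),
    ("window frame rotten and door sticking", "JOINER"),
]


def tokenize(text):
    """Lower-case the text and split it into a bag of words."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count documents per trade and word occurrences per trade."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the trade with the highest log-posterior for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TRAINING)
REQUEST = (
    "Our bathroom is leaking through the ceiling onto the stairs and the "
    "ceiling is wet through along with the walls where the taps are mounted. "
    "And puddle on the stairs"
)
PREDICTED_TRADE = classify(REQUEST, MODEL)
```

With this toy training set the request quoted on the slide is assigned to PLUMBER, because words such as "leaking", "ceiling" and "wet" only occur in the plumbing examples. A real deployment would need far more labelled requests than the seven shown here.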

Page 18: Adventures in data science
Page 19: Adventures in data science
Page 20: Adventures in data science

[Chart: net cost / benefit of repairs self-service (-£300,000 to £300,000) against channel shift (10% to 70%), with one series each for 10%, 50% and 90% accuracy]

‘Friction Free’ Self Service: Avoiding Pyrrhic Victories

Page 21: Adventures in data science

Proof of concept 2: Clustering with K-means…

Page 22: Adventures in data science

CUSTOMISE SERVICES & INFORMATION

Assign to cluster

K-means clustering
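The "assign to cluster" step can be sketched with a plain implementation of k-means (Lloyd's algorithm). This is an illustration under stated assumptions: the two features per tenant (age and contacts per year), the sample values, and the names `k_means`, `squared_distance` and `mean_point` are hypothetical and not taken from the proof of concept.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=100, seed=0):
    """Return (labels, centroids) after running Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    return labels, centroids


# Hypothetical tenants as (age, contacts per year); invented for illustration.
POINTS = [(25, 12), (27, 10), (30, 11), (68, 2), (72, 1), (75, 3)]
LABELS, CENTROIDS = k_means(POINTS, k=2)
```

Once each tenant has a cluster label, services and information can be customised per cluster rather than per individual. In practice the features would need scaling so that no single one dominates the distance.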

Page 23: Adventures in data science
Page 24: Adventures in data science
Page 25: Adventures in data science
Page 26: Adventures in data science
Page 27: Adventures in data science
Page 28: Adventures in data science

Blocked means of escape

Alert

Close

Blocked for one day

56 King Street, Leigh, WN7 4LJ

Page 29: Adventures in data science

Unusual transaction

Alert

Close

Page 30: Adventures in data science

Audit file ready

Advice

Close

Values beginning with

Highlighted for attention

1166

Page 31: Adventures in data science

Proof of concept 3: Identifying unnatural numbers with Benford’s Law…

Page 32: Adventures in data science
Page 33: Adventures in data science

Find outliers

Benford’s Law

1 2 3 4 5 6 7 8 9
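Benford's Law says that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with 1 and under 5% start with 9. A minimal sketch of the outlier check follows; the tolerance of 0.05, the sample data and the function names are assumptions chosen for illustration, not the method used in the proof of concept.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First non-zero digit of a number, ignoring sign and decimal point."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


def leading_digit_shares(values):
    """Observed share of each leading digit among the non-zero values."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


def flag_digits(values, tolerance=0.05):
    """Digits whose observed share differs from Benford's by more than tolerance."""
    shares = leading_digit_shares(values)
    return [
        d for d in range(1, 10)
        if abs(shares[d] - benford_expected(d)) > tolerance
    ]


# Powers of two are a standard example of data that follows Benford's Law.
POWERS = [2 ** n for n in range(1, 201)]
# Hypothetical suspicious ledger: every amount starts with 6.
SUSPICIOUS = [600 + i for i in range(100)]

NATURAL_FLAGS = flag_digits(POWERS)
SUSPICIOUS_FLAGS = flag_digits(SUSPICIOUS)
```

A fixed tolerance is the simplest possible test. A chi-squared test over all nine digits would be a more defensible choice for a real audit file.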

Page 34: Adventures in data science
Page 35: Adventures in data science
Page 36: Adventures in data science

[Illustration: height chart (3’0” to 6’0”) with four speech bubbles, each reading “I’m going to beat Benford’s Law”]

Page 37: Adventures in data science

[Illustration: the same height chart (3’0” to 6’0”) with three captions reading “BUSTED!” and one reading “CRIMINAL GENIUS!”]

Page 38: Adventures in data science
Page 39: Adventures in data science
Page 40: Adventures in data science
Page 41: Adventures in data science

Payment cycle broken

Alert

Close

Page 42: Adventures in data science

Proof of concept 4: Forecasting arrears with Holt-Winters…

Page 43: Adventures in data science

Forecast time series data

Holt-Winters Triple Exponential Smoothing
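Holt-Winters triple exponential smoothing tracks three components of a time series: a level, a trend and a repeating seasonal pattern. The sketch below uses the additive form with a simple initialisation from the first two seasons. The quarterly arrears figures, the smoothing parameters and the function name `holt_winters_additive` are hypothetical, chosen only to show the mechanics.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    alpha, beta and gamma are the smoothing weights (between 0 and 1) for
    the level, trend and seasonal components respectively.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialisation from the first two seasons.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonals = [series[i] - level for i in range(m)]

    for t in range(m, len(series)):
        value = series[t]
        previous_level = level
        # Level: deseasonalised observation blended with the prior projection.
        level = alpha * (value - seasonals[t % m]) + (1 - alpha) * (level + trend)
        # Trend: latest change in level blended with the prior trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Seasonal: latest deviation from level blended with the prior value.
        seasonals[t % m] = gamma * (value - level) + (1 - gamma) * seasonals[t % m]

    n = len(series)
    return [
        level + h * trend + seasonals[(n + h - 1) % m]
        for h in range(1, horizon + 1)
    ]


# Hypothetical quarterly arrears (in thousands of pounds) over three years.
ARREARS = [120, 135, 150, 128, 126, 142, 158, 133, 131, 148, 165, 139]
FORECAST = holt_winters_additive(
    ARREARS, season_length=4, alpha=0.5, beta=0.3, gamma=0.4, horizon=4
)
```

The smoothing weights here are picked by hand. In practice they are usually fitted by minimising the forecast error on historical data.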

Page 44: Adventures in data science
Page 45: Adventures in data science
Page 46: Adventures in data science
Page 47: Adventures in data science
Page 48: Adventures in data science
Page 49: Adventures in data science

Staffing forecast ready

Advice

Close

Page 50: Adventures in data science

Recommend replacement

Advice

Close

Likelihood: 97%

Boiler beyond economic repair

Page 51: Adventures in data science

Optimal time to sell property

Alert

Close

Page 52: Adventures in data science

Housing Association Costs

Page 53: Adventures in data science

Proof of concept 5: Tenancy survival with Kaplan-Meier…

Page 54: Adventures in data science

Survival Analysis

Kaplan-Meier Estimator

S(t) = 1 − F(t) = Prob(T > t)
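The Kaplan-Meier estimator builds S(t) step by step: at each time when a tenancy ends, the running survival probability is multiplied by the fraction of tenancies still at risk that did not end. Current tenants count as censored observations, since their tenancy length is only known to be at least the time observed so far. The sketch below is a minimal illustration; the tenancy data and the function name `kaplan_meier` are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, ended):
    """Return [(t, S(t))] for each time t at which a tenancy ended.

    durations: observed tenancy lengths (for example, in years).
    ended: True if the tenancy has ended, False if the tenant is still
           in place (a censored observation).
    """
    ends_at = Counter(t for t, e in zip(durations, ended) if e)
    curve = []
    survival = 1.0
    for t in sorted(ends_at):
        # Everyone whose observed length reaches t is still at risk at t.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1 - ends_at[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical tenancies, invented for illustration (lengths in years).
DURATIONS = [1, 2, 2, 3, 5, 5, 8, 10]
ENDED = [True, True, False, True, True, False, False, True]
CURVE = kaplan_meier(DURATIONS, ENDED)
```

For this toy data the estimate falls from 0.875 after year 1 to 0.45 after year 5. Dropping the censored tenants instead of treating them as still at risk would understate how long tenancies last.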

Page 55: Adventures in data science
Page 56: Adventures in data science

Histogram of Tenancy Length (Current Tenants)

Page 57: Adventures in data science
Page 58: Adventures in data science
Page 59: Adventures in data science

A Simple Model of Tenancy Lengths

Tenancy Length (years)

Page 60: Adventures in data science

Tenancy Length (years)

A Simple Model of Tenancy Lengths

Page 61: Adventures in data science

Tenancy Length (years)

Unlikely to need component replacements

Likely to need component replacements

Even Simple Models can Lead to Interesting Thoughts

Page 62: Adventures in data science
Page 63: Adventures in data science
Page 64: Adventures in data science

RECEPTION

CONTACT CENTRE

Getting Technology isn't the Same as Really Getting it

Page 65: Adventures in data science
Page 66: Adventures in data science

Digital front of office + Digital back of office

Really Getting Channel Shift Choice

Page 67: Adventures in data science
Page 68: Adventures in data science

Brian Moran

brian.moran@adactushousing.co.uk