Transcript
Page 1: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process Mining: Data Science in ActionProcess Mining as a Tool to Find Out What People and Organizations Really Do

prof.dr.ir. Wil van der Aalstwww.processmining.org

Page 2: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 2

How many students will take my class

next year?

What is the

real curriculum?

When, where and why do

students drop out?

When, where and why do students

deviate?

Will I graduate or drop out?

When will I graduate?

Should I take this course if I want to graduate in two

years from now?

Page 4: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

The data scientist …

PAGE 4

Page 5: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Data Science Center Eindhoven (DSC/e)

PAGE 5

28 research groups are

involved

Page 6: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Data Science Center Eindhoven (DSC/e)

PAGE 67 research programs

Page 7: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process Mining

Page 8: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

www.olifantenpaadjes.nl

Process discovery

Page 9: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

abcdacbdabcdabcefcbdacbefcbefbcdetc

Process Discovery

Page 10: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

10

Conformance checking

Page 11: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

acbdabdacbefbcdacbefebcdddfeabb

ConformanceChecking

Page 12: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 12

Let's play

Page 13: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 13

Play-Out

Page 14: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play Out: A possible scenario

PAGE 14

a b d e gCase Activity Timestamp Resource432 register travel request (a) 18-3-2014:9.15 John432 get support from local manager (b) 18-3-2014:9.25 Mary432 check budget by finance (d) 19-3-2014:8.55 John432 decide (e) 19-3-2014:9.36 Sue432 accept request (g) 19-3-2014:9.48 Mary

Page 15: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play Out: Another scenario

PAGE 15

a cd e hf b d e

Page 16: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play Out: Process model allows for many more scenarios

PAGE 16

abdeg

abcefbdehadceg adbeh

acbefbdehacdefcdefbdeh

adcefcdefbdefbdegabdegadcehadbeh

acbefbdegacdefcdefbdehadcefcdefbdefbdeg

abdegadceh

adbeh

acbefbdegacdefcdefbdeh

adcefcdefbdefbdeg

abdeg

Page 17: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 17

Play-In

Page 18: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play In:Simple process allowing for 4 traces

PAGE 18

abdeg adbeg

abdehadbeh

abdegadbeg

abdehadbehabdehabdeh

abdehadbeh adbeh

Page 19: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play In: Process allowing for more traces

PAGE 19

abdeg

abcefbdehadcegadbeh

acbefbdehacdefcdefbdehadcefcdefbdefbdeg abdeg

adcehadbehacbefbdeg

acdefcdefbdeh

adcefcdefbdefbdeg

abdegadceh

adbehacbefbdeg

acdefcdefbdeh

adcefcdefbdefbdeg

abdeg

Page 20: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

No modeling needed!

PAGE 20

No modeling needed!

Page 21: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example Process Discovery(Dutch housing agency, 208 cases, 5987 events)

PAGE 21

Page 22: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example process discovery for hospital(627 gynecological oncology patients, 24331 events)

PAGE 22

Page 23: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process discovery algorithms (small selection)

PAGE 23

α algorithm

α++ algorithm

α# algorithm

language-based regions

state-based regionsgenetic mining

heuristic mining

hidden Markov models

neural networks

automata-based learning

stochastic task graphs

conformal process graph

mining block structures

multi-phase miningpartial-order based mining

fuzzy mining

LTL mining

ILP mining

distributed genetic mining

ETM genetic algorithmInductive Miner (infrequent)

Page 24: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Language based regions

PAGE 24

Region R = (X,Y,c) corresponding to place pR: X = {a1,a2,c1} = transitions producing a token for pR, Y = {b1,b2,c1} = transitions consuming a token from pR, and c is the initial marking of pR.

Page 25: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Basic idea: enough tokens should be present when consuming

PAGE 25

A place is feasible if it can be added without disabling any of thetraces in the event log.

Page 26: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 26

Replay

Page 27: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

PAGE 27

a c d e g

Page 28: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

PAGE 28

a c

check budget (d) is missing!

e g?

Page 29: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Alignments: Relating reality and model

PAGE 29

a c » e ga c d e g

check budget (d) did not happen but should have according to the model

Page 30: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

PAGE 30

a c d e gh

reject request (h) is impossible

?

Page 31: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Alignments: Relating reality and model

PAGE 31

a c h d e ga c » d e g

reject request (h) happened but could not happen according to the model

Page 32: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Any trace in reality can be related to a path in the model

PAGE 32

a c e f d d b c e h

Page 33: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Any trace in reality can be related to a path in the model

PAGE 33

a c » e f d d b c e ha c d e f d » b » e h

check is missing

one check too many

cannot both be done

optimization problem using a cost fu

nction

Page 34: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 34

process model

event log

synchronous move

move on model only

move on log only

Page 35: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Conformance Checking (WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988)

PAGE 35

Page 36: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay with timestamps

PAGE 36

a9.15 c9.20 d9.35 e10.15 g11.30

9.159.20

9.35

10.15

11.30

5 55

20 40

75

Page 37: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay with timestamps for many traces

PAGE 37

frequencies of activities

frequencies of paths

durations of activities

waiting times and other delays between

activities

Page 38: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 38

• conformance checking to diagnose deviations• squeezing reality into the model to do model-based

analysis

Alignments are essential!

Page 39: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example: BPI Challenge 2012(Dutch financial institute, doi:10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f)

PAGE 39Work of Arya Adriansyah (Replay project)

Page 40: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 40

Auditor's toolbox

Page 41: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 41

Business analyst's toolbox

Page 42: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Software

Page 43: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

600+ plug-ins available covering the whole process mining spectrum

Page 44: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE
Page 45: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Disco

PAGE 45

Page 46: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Overview: Role of process models

PAGE 46

Play-Out

Play-In

Replay

Page 47: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

decomposed/distributed process mining

Page 48: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

data Big data

Page 49: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

http://www.multivu.com/assets/58095/photos/Data-is-the-new-oil-infographic-Nigel-Holmes-2012-from-The-Human-Face-of-Big-Data-original.jpg

Page 50: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

What if?

PAGE 50

there are more than 1000 different

activities?

there are more than 1.000.000

cases?

there are more than 100.000.000

events?

Page 51: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Decompose event log! vertical or horizontal

PAGE 51

sets of cases

sets of activities

Page 52: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Vertical distribution: Split cases

PAGE 52

sets of cases

Page 53: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Horizontal distribution

PAGE 53

sets of activities

{a,b,e,f,g}

{b,c,d,e}{a,b,e,f,g}

{b,c,d,e}

Page 54: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Horizontal distribution: The key idea

PAGE 54

projected on {a,b,e,f,g}

projected on {b,c,d,e}

Page 55: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Two foundational ways of spitting event data: horizontal or vertical

PAGE 55

Page 56: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE
Page 57: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Decomposing Conformance Checking

PAGE 57See "divide and conquer" framework by Eric Verbeek.

Page 58: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example of a valid decomposition

PAGE 58

Log can be split in the same way!

Page 59: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example of an alignment for observed trace a,b,c,d,e,c,d,g,f

PAGE 59

Etc.

a,b,c,d,e,c,d,g,f

Page 60: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Conformance checking can be decomposed !!!

• General result for any valid decomposition: Any event log or trace is perfectly fitting the overall model if and only if it is also fitting all the individual fragments

PAGE 60

for a

ny Petri

net

(not j

ust WF-n

ets)

for a

ny valid

decompsiti

on

Wil van der Aalst, Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, Volume 31, Issue 4, pp 471-507, 2013

also for d

ata

Petri n

ets

Page 61: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example (work with Jorge Munoz-Gama and Josep Carmona)

PAGE 61

prFm6

Page 62: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Decomposing Process Discovery

PAGE 62See "divide and conquer" framework by Eric Verbeek.

Page 63: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

conclusion

Page 64: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Don't just check the temperature …

PAGE 64

Page 65: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

X-ray your processes!

PAGE 65

Page 66: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

PAGE 66

www.processmining.org

www.win.tue.nl/ieeetfpm/

Page 67: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process Mining: Data Science in Actionhttps://www.coursera.org/course/procmin

First Massive Open Online Course (MOOC) on Process Mining


Recommended