Process Mining: Data Science in Action Process Mining as a Tool to Find Out What People and Organizations Really Do prof.dr.ir. Wil van der Aalst www.processmining.org

Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

DESCRIPTION

Data science is the profession of the future, because organizations that are unable to use (big) data in a smart way will not survive. It is not sufficient to focus on data storage and data analysis. The data scientist also needs to relate data to process analysis. Process mining bridges the gap between traditional model-based process analysis (e.g., simulation and other business process management techniques) and data-centric analysis techniques, such as machine learning and data mining. Process mining seeks to find a connection between event data (i.e., observed behavior) and process models (hand-made or discovered automatically). This technology has become available only recently, but it can be applied to any type of operational process (organizations and systems). Example applications include: analyzing treatment processes in hospitals, improving customer service processes in multinational companies, understanding the browsing behavior of customers on a booking site, analyzing failures of a baggage handling system, or improving the user interface of an X-ray machine. What all of these applications have in common is the need to relate dynamic behavior to process models. Not only does process mining provide a bridge between data mining and business process management, but it also helps to address the classical divide between "business" and "IT". Evidence-based business process management based on process mining helps to create a common ground for business process improvement and information systems development.

Page 1: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process Mining: Data Science in Action
Process Mining as a Tool to Find Out What People and Organizations Really Do

prof.dr.ir. Wil van der Aalst
www.processmining.org

Page 2: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

How many students will take my class next year?

What is the real curriculum?

When, where and why do students drop out?

When, where and why do students deviate?

Will I graduate or drop out?

When will I graduate?

Should I take this course if I want to graduate in two years from now?

Page 4: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

The data scientist …

Page 5: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Data Science Center Eindhoven (DSC/e)

28 research groups are involved

Page 6: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Data Science Center Eindhoven (DSC/e)

7 research programs

Page 7: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process Mining

Page 8: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

www.olifantenpaadjes.nl

Process discovery

Page 9: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

abcd acbd abcd abcefcbd acbefcbefbcd etc.

Process Discovery
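
As a rough illustration of the first step of many discovery techniques, the sketch below counts which activity directly follows which. It assumes the fused string on the slide splits into the traces abcd, acbd, abcd, abcefcbd and acbefcbefbcd (every trace starting with a); the function name `directly_follows` is made up for this example.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity x is directly followed by activity y."""
    counts = Counter()
    for trace in traces:
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] += 1
    return counts

# Assumed split of the slide's event data, one character per activity.
log = ["abcd", "acbd", "abcd", "abcefcbd", "acbefcbefbcd"]
dfg = directly_follows(log)

for (x, y), n in sorted(dfg.items()):
    print(f"{x} -> {y}: {n}")
```

The resulting counts form a directly-follows graph; a discovery algorithm would still have to turn it into a process model with choices, concurrency and loops.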

Page 10: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Conformance checking

Page 11: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

acbdabdacbefbcdacbefebcdddfeabb

Conformance Checking

Page 12: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Let's play

Page 13: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play-Out

Page 14: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play Out: A possible scenario

a b d e g

Case  Activity                            Timestamp       Resource
432   register travel request (a)         18-3-2014:9.15  John
432   get support from local manager (b)  18-3-2014:9.25  Mary
432   check budget by finance (d)         19-3-2014:8.55  John
432   decide (e)                          19-3-2014:9.36  Sue
432   accept request (g)                  19-3-2014:9.48  Mary

Page 15: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play Out: Another scenario

a c d e f b d e h

Page 16: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play Out: Process model allows for many more scenarios

abdeg

abcefbdeh adceg adbeh

acbefbdeh acdefcdefbdeh

adcefcdefbdefbdeg abdeg adceh adbeh

acbefbdeg acdefcdefbdeh adcefcdefbdefbdeg

abdeg adceh

adbeh

acbefbdeg acdefcdefbdeh

adcefcdefbdefbdeg

abdeg
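
Play-out can be sketched as a small trace generator. The model below is an assumption made for illustration and is not read off the slide: after a, one of b or c runs concurrently with d, then e follows, and after e the case either loops back via f or ends with g or h. The loop probability of 0.3 and the `max_loops` bound are arbitrary.

```python
import random

def play_out(rng, max_loops=3):
    """Generate one trace from a hypothetical model: a, then (b or c) and d
    in either order, then e; after e either f (redo the checks) or g/h (stop)."""
    trace = ["a"]
    loops = 0
    while True:
        checks = [rng.choice("bc"), "d"]
        rng.shuffle(checks)              # the two checks are concurrent
        trace += checks + ["e"]
        if loops < max_loops and rng.random() < 0.3:
            trace.append("f")            # loop back and check again
            loops += 1
        else:
            trace.append(rng.choice("gh"))   # final decision ends the case
            return "".join(trace)

rng = random.Random(42)
log = [play_out(rng) for _ in range(10)]
print(log)
```

Each call yields one possible scenario; repeating it produces an event log from the model, which is the direction opposite to discovery.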

Page 17: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play-In

Page 18: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play In: Simple process allowing for 4 traces

abdeg adbeg

abdeh adbeh

abdeg adbeg

abdeh adbeh abdeh abdeh

abdeh adbeh adbeh
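
One way to see how a model can be inferred from these traces is to derive the ordering relations between activities, in the style of the footprint used by the α algorithm. The sketch below is a simplified illustration, not a full discovery algorithm; the function name `footprint` and the relation symbols are choices made for this example.

```python
from itertools import product

def footprint(traces):
    """Classify each ordered pair of activities by the directly-follows relation."""
    follows = {(x, y) for t in traces for x, y in zip(t, t[1:])}
    activities = sorted({x for t in traces for x in t})
    relations = {}
    for x, y in product(activities, repeat=2):
        if (x, y) in follows and (y, x) in follows:
            relations[(x, y)] = "||"   # both orders observed: concurrent
        elif (x, y) in follows:
            relations[(x, y)] = "->"   # only x before y: causal
        elif (y, x) in follows:
            relations[(x, y)] = "<-"   # only y before x
        else:
            relations[(x, y)] = "#"    # never directly adjacent
    return relations

log = ["abdeg", "adbeg", "abdeh", "adbeh"]
fp = footprint(log)
print(fp[("b", "d")], fp[("a", "b")], fp[("g", "h")])  # prints: || -> #
```

Here b and d come out as concurrent, a precedes both, and g and h exclude each other, which is the structure of the simple process with four traces.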

Page 19: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Play In: Process allowing for more traces

abdeg

abcefbdeh adceg adbeh

acbefbdeh acdefcdefbdeh adcefcdefbdefbdeg abdeg

adceh adbeh acbefbdeg

acdefcdefbdeh

adcefcdefbdefbdeg

abdeg adceh

adbeh acbefbdeg

acdefcdefbdeh

adcefcdefbdefbdeg

abdeg

Page 20: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

No modeling needed!

Page 21: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example Process Discovery (Dutch housing agency, 208 cases, 5987 events)

Page 22: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example process discovery for hospital (627 gynecological oncology patients, 24331 events)

Page 23: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process discovery algorithms (small selection)

α algorithm

α++ algorithm

α# algorithm

language-based regions

state-based regions

genetic mining

heuristic mining

hidden Markov models

neural networks

automata-based learning

stochastic task graphs

conformal process graph

mining block structures

multi-phase mining

partial-order based mining

fuzzy mining

LTL mining

ILP mining

distributed genetic mining

ETM genetic algorithm

Inductive Miner (infrequent)

Page 24: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Language based regions

Region R = (X,Y,c) corresponding to place pR: X = {a1,a2,c1} = transitions producing a token for pR, Y = {b1,b2,c1} = transitions consuming a token from pR, and c is the initial marking of pR.

Page 25: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Basic idea: enough tokens should be present when consuming

A place is feasible if it can be added without disabling any of the traces in the event log.
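
The feasibility condition can be sketched as a token count per trace: starting from the initial marking c, every transition in Y must find a token before it consumes one, and every transition in X adds one afterwards. The code below is a minimal sketch under that reading; the function name `is_feasible` and the two-trace example log are assumptions for illustration.

```python
def is_feasible(region, log):
    """Check whether the place for region (X, Y, c) never runs out of tokens
    while replaying every trace of the log."""
    produce, consume, initial = region
    for trace in log:
        tokens = initial
        for activity in trace:
            if activity in consume:
                tokens -= 1
                if tokens < 0:       # would have to consume a missing token
                    return False
            if activity in produce:
                tokens += 1
    return True

# Hypothetical log with two traces.
log = [["a", "b", "d", "e", "g"], ["a", "d", "b", "e", "h"]]
print(is_feasible(({"a"}, {"b"}, 0), log))  # True: a always produces before b consumes
print(is_feasible(({"e"}, {"d"}, 0), log))  # False: d occurs before e ever fires
```

Language-based region techniques search for such feasible places, typically by solving a system of inequalities rather than by testing candidates one by one.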

Page 26: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

Page 27: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

a c d e g

Page 28: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

a c

check budget (d) is missing!

e g?
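
Replay can be illustrated with a simple token game. The Petri net below is a hypothetical encoding chosen to match the letters used on these slides (a enables b or c alongside d, e needs both checks, then f, g or h); the place names and the `replay` function are assumptions, and the sketch stops at the first problem instead of repairing it.

```python
# Hypothetical Petri net: transition -> (input places, output places).
NET = {
    "a": (["start"], ["p1", "p2"]),
    "b": (["p1"], ["p3"]),
    "c": (["p1"], ["p3"]),
    "d": (["p2"], ["p4"]),
    "e": (["p3", "p4"], ["p5"]),
    "f": (["p5"], ["p1", "p2"]),
    "g": (["p5"], ["end"]),
    "h": (["p5"], ["end"]),
}

def replay(trace, net=NET):
    """Play the token game; return the first activity that is not enabled, or None."""
    marking = {"start": 1}
    for activity in trace:
        inputs, outputs = net[activity]
        if any(marking.get(p, 0) < 1 for p in inputs):
            return activity
        for p in inputs:
            marking[p] -= 1
        for p in outputs:
            marking[p] = marking.get(p, 0) + 1
    return None

print(replay("acdeg"))   # None: the trace fits
print(replay("aceg"))    # e: blocked because d never fired
print(replay("acdegh"))  # h: blocked because g already ended the case
```

This sketch only reports where replay gets stuck and does not check that the case ends in the final marking.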

Page 29: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Alignments: Relating reality and model

event log: a c » e g
model:     a c d e g

check budget (d) did not happen but should have according to the model

Page 30: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay

a c d e g h

reject request (h) is impossible

Page 31: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Alignments: Relating reality and model

event log: a c h d e g
model:     a c » d e g

reject request (h) happened but could not happen according to the model

Page 32: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Any trace in reality can be related to a path in the model

a c e f d d b c e h

Page 33: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Any trace in reality can be related to a path in the model

event log: a c » e f d d b c e h
model:     a c d e f d » b » e h

check is missing

one check too many

cannot both be done

optimization problem using a cost function
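
The optimization can be sketched with dynamic programming, under a strong simplification: the observed trace is aligned against one given run of the model instead of searching all runs of a Petri net, as real alignment techniques do. The unit costs (0 for a synchronous move, 1 for a move on log only or on model only) and the function name `align` are assumptions for this example.

```python
SKIP = "»"

def align(observed, model_run):
    """Cheapest alignment of an observed trace with one run of the model.
    Synchronous moves cost 0; moves on log only or on model only cost 1."""
    n, m = len(observed), len(model_run)
    # cost[i][j] = cheapest way to align observed[i:] with model_run[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n:
                options.append(1 + cost[i + 1][j])   # move on log only
            if j < m:
                options.append(1 + cost[i][j + 1])   # move on model only
            if i < n and j < m and observed[i] == model_run[j]:
                options.append(cost[i + 1][j + 1])   # synchronous move
            cost[i][j] = min(options)
    # Walk forward through the table to recover one optimal alignment.
    moves, i, j = [], 0, 0
    while i < n or j < m:
        same = i < n and j < m and observed[i] == model_run[j]
        if same and cost[i][j] == cost[i + 1][j + 1]:
            moves.append((observed[i], model_run[j]))
            i, j = i + 1, j + 1
        elif j < m and cost[i][j] == 1 + cost[i][j + 1]:
            moves.append((SKIP, model_run[j]))
            j += 1
        else:
            moves.append((observed[i], SKIP))
            i += 1
    return cost[0][0], moves

total, moves = align("acefddbceh", "acdefdbeh")
print(total)
print(" ".join(log_move for log_move, _ in moves))
print(" ".join(model_move for _, model_move in moves))
```

For the trace on this slide the cost is 3, one per deviation; several alignments can share that cost, so the printed rows may place the skips differently from the slide.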

Page 34: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

process model

event log

synchronous move

move on model only

move on log only

Page 35: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Conformance Checking (WOZ objections Dutch municipality, 745 objections, 9583 events, f = 0.988)

Page 36: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay with timestamps

a (9.15), c (9.20), d (9.35), e (10.15), g (11.30)

Time between activities in minutes: a to c 5, a to d 20, c to e 55, d to e 40, e to g 75
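
The numbers on this slide can be reproduced by subtracting timestamps along the hand-overs in the model. The sketch below assumes that a hands over to both c and d, which join again in e before g; the `handovers` list and the helper `to_minutes` are assumptions for illustration.

```python
def to_minutes(clock):
    """Convert the slide's 'h.mm' notation into minutes since midnight."""
    hours, minutes = clock.split(".")
    return int(hours) * 60 + int(minutes)

# Observed trace with timestamps, as on the slide.
events = {"a": "9.15", "c": "9.20", "d": "9.35", "e": "10.15", "g": "11.30"}

# Assumed hand-overs between activities; each pair stands for one place in the model.
handovers = [("a", "c"), ("a", "d"), ("c", "e"), ("d", "e"), ("e", "g")]

waits = {
    (source, target): to_minutes(events[target]) - to_minutes(events[source])
    for source, target in handovers
}
for (source, target), minutes in waits.items():
    print(f"{source} -> {target}: {minutes} min")   # 5, 20, 55, 40 and 75 minutes
```

Repeating this for many traces gives the frequency and delay statistics that can be projected onto the model.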

Page 37: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Replay with timestamps for many traces

frequencies of activities

frequencies of paths

durations of activities

waiting times and other delays between activities

Page 38: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Alignments are essential!

• conformance checking to diagnose deviations
• squeezing reality into the model to do model-based analysis

Page 39: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example: BPI Challenge 2012 (Dutch financial institute, doi:10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f)

Work of Arya Adriansyah (Replay project)

Page 40: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Auditor's toolbox

Page 41: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Business analyst's toolbox

Page 42: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Software

Page 43: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

600+ plug-ins available covering the whole process mining spectrum

Page 44: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE
Page 45: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Disco

Page 46: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Overview: Role of process models

Play-Out

Play-In

Replay

Page 47: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

decomposed/distributed process mining

Page 48: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

data Big data

Page 49: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

http://www.multivu.com/assets/58095/photos/Data-is-the-new-oil-infographic-Nigel-Holmes-2012-from-The-Human-Face-of-Big-Data-original.jpg

Page 50: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

What if?

there are more than 1,000 different activities?

there are more than 1,000,000 cases?

there are more than 100,000,000 events?

Page 51: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Decompose event log! vertical or horizontal

sets of cases

sets of activities

Page 52: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Vertical distribution: Split cases

sets of cases
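
A vertical split can be sketched as assigning whole cases to sublogs, so that every case stays complete inside exactly one part. The round-robin assignment, the `(case, activity)` event format and the function name `split_cases` are assumptions made for this example.

```python
from collections import defaultdict

def split_cases(events, parts):
    """Vertical decomposition: assign whole cases to sublogs, so no case is split."""
    sublogs = [defaultdict(list) for _ in range(parts)]
    case_to_part = {}
    for case, activity in events:
        # A case goes to the part chosen when it was first seen (round-robin).
        part = case_to_part.setdefault(case, len(case_to_part) % parts)
        sublogs[part][case].append(activity)
    return [dict(sublog) for sublog in sublogs]

# Hypothetical event stream: (case id, activity), interleaved across cases.
events = [(1, "a"), (2, "a"), (1, "b"), (3, "a"), (2, "d"), (1, "d"), (3, "c")]
for sublog in split_cases(events, 2):
    print(sublog)
```

Each sublog can then be analyzed on a separate machine, since no trace is cut in the process.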

Page 53: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Horizontal distribution

sets of activities

{a,b,e,f,g}

{b,c,d,e}

Page 54: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Horizontal distribution: The key idea

projected on {a,b,e,f,g}

projected on {b,c,d,e}
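
Projection keeps, for every trace, only the events whose activity belongs to a given set. The sketch below uses the two activity sets from the slide; the example trace and the function name `project` are assumptions for illustration.

```python
def project(trace, activities):
    """Keep only the events whose activity belongs to the given set."""
    return "".join(a for a in trace if a in activities)

trace = "abcdeg"  # hypothetical trace, one character per activity
print(project(trace, {"a", "b", "e", "f", "g"}))  # abeg
print(project(trace, {"b", "c", "d", "e"}))       # bcde
```

The two sets overlap in b and e, so the projected sublogs still share the activities on the border between the fragments.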

Page 55: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Two foundational ways of splitting event data: horizontal or vertical

Page 56: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE
Page 57: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Decomposing Conformance Checking

See "divide and conquer" framework by Eric Verbeek.

Page 58: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example of a valid decomposition

Log can be split in the same way!

Page 59: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example of an alignment for observed trace a,b,c,d,e,c,d,g,f

Etc.

a,b,c,d,e,c,d,g,f

Page 60: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Conformance checking can be decomposed!!!

• General result for any valid decomposition: any event log or trace perfectly fits the overall model if and only if it also fits all the individual fragments

for any Petri net (not just WF-nets)

for any valid decomposition

also for data Petri nets

Wil van der Aalst, Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, Volume 31, Issue 4, pp 471-507, 2013

Page 61: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Example (work with Jorge Munoz-Gama and Josep Carmona)

prFm6

Page 62: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Decomposing Process Discovery

See "divide and conquer" framework by Eric Verbeek.

Page 63: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

conclusion

Page 64: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Don't just check the temperature …

Page 65: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

X-ray your processes!

Page 66: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

www.processmining.org

www.win.tue.nl/ieeetfpm/

Page 67: Process Mining: Data Science in Action - Wil van der Aalst, TU/e, DSC/e, HSE

Process Mining: Data Science in Action
https://www.coursera.org/course/procmin

First Massive Open Online Course (MOOC) on Process Mining