19
©Wil van der Aalst & TU/e (use only with permission & acknowledgements) Event Logs What kind of data does process mining require? prof.dr.ir. Wil van der Aalst www.processmining.org

Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Event Logs What kind of data does process mining require?

prof.dr.ir. Wil van der Aalst www.processmining.org

Page 2: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Getting the "right" data …

software system

(process)model

eventlogs

modelsanalyzes

discovery

records events, e.g., messages,

transactions, etc.

specifies configures implements

analyzes

supports/controls

enhancement

conformance

“world”

people machines

organizationscomponents

businessprocesses

Page 3: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Event log

• We assume the existence of an event log where each event refers to a case, an activity, and a point in time.

• An event log can be seen as a collection of cases.

• A case can be seen as a trace/sequence of events.

Page 4: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

event data can be found everywhere!

Page 5: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Event data may come from …

• a database system (e.g., patient data in a hospital), • a comma-separated values (CSV) file or spreadsheet, • a transaction log (e.g., a trading system), • a business suite/ERP system (SAP, Oracle, etc.), • a message log (e.g., from IBM middleware), • an open API providing data from websites or social

media, …

Page 6: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

An example log

student name course name exam date mark Peter Jones Business Information systems 16-1-2014 8 Sandy Scott Business Information systems 16-1-2014 5

Bridget White Business Information systems 16-1-2014 9 John Anderson Business Information systems 16-1-2014 8

Sandy Scott BPM Systems 17-1-2014 7 Bridget White BPM Systems 17-1-2014 8 Sandy Scott Process Mining 20-1-2014 5

Bridget White Process Mining 20-1-2014 9 John Anderson Process Mining 20-1-2014 8

… … … …

case id activity name timestamp other data

Page 7: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Another event log: order handling order

number activity timestamp user product quantity

9901 register order [email protected] Sara Jones iPhone5S 1 9902 register order [email protected] Sara Jones iPhone5S 2 9903 register order [email protected] Sara Jones iPhone4S 1 9901 check stock [email protected] Pete Scott iPhone5S 1 9901 ship order [email protected] Sue Fox iPhone5S 1 9903 check stock [email protected] Pete Scott iPhone4S 1 9901 handle payment [email protected] Carol Hope iPhone5S 1 9902 check stock [email protected] Pete Scott iPhone5S 2 9902 cancel order [email protected] Carol Hope iPhone5S 2

… … … … … …

case id activity name timestamp other data resource

Page 8: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Another event log: patient treatment patient activity timestamp doctor age cost 5781 make X-ray [email protected] Dr. Jones 45 70.00 5541 blood test [email protected] Dr. Scott 61 40.00 5833 blood test [email protected] Dr. Scott 24 40.00 5781 blood test [email protected] Dr. Scott 45 40.00 5781 CT scan [email protected] Dr. Fox 45 1200.00 5833 surgery [email protected] Dr. Scott 24 2300.00 5781 handle payment [email protected] Carol Hope 45 0.00 5541 radiation therapy [email protected] Dr. Jones 61 140.00 5541 radiation therapy [email protected] Dr. Jones 61 140.00

… … … … … …

case id activity name timestamp other data resource

Page 9: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Minimal requirements in terms of a CSV file

• Each row corresponds to an event. • There are at least three columns:

• Case id (patient id, order number, claim number, …) • Activity name (approve, reject, request, send, …) • Timestamp (2015-08-18T06:36:40, …)

• There may be many other (optional) columns: resource, transaction type, age, costs, etc.

order number activity timestamp user product quantity

9901 register order [email protected] Sara Jones iPhone5S 1 9902 register order [email protected] Sara Jones iPhone5S 2

Page 10: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Extensions • Transactional information on activity instances:

An event can represent a start, complete, suspend, resume, abort, etc.

• Case versus event attributes: − case attributes do not change, e.g., the birth date or

gender of a patient, − event attributes are related to a particular step in the

process.

schedule assign reassignsuspend

resumestart

autoskipmanualskip complete

abort_case

abort_activity

withdraw

successful termination

unsuccessful termination

Page 11: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

XES (eXtensible Event Stream)

• Adopted by the IEEE Task Force on Process Mining. • The format is supported by tools such as ProM and

Disco (used in this course). • Predecessors: MXML and SA-MXML. • Conversion from other formats (CSV) is easy if the

right data are available. • XML syntax and OpenXES library available. • See www.xes-standard.org.

Page 12: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Simplistic view on event data order

number activity timestamp user product quantity

9901 register order [email protected] Sara Jones iPhone5S 1 9902 register order [email protected] Sara Jones iPhone5S 2 9903 register order [email protected] Sara Jones iPhone4S 1 9901 check stock [email protected] Pete Scott iPhone5S 1 9901 ship order [email protected] Sue Fox iPhone5S 1 9903 check stock [email protected] Pete Scott iPhone4S 1 9901 handle payment [email protected] Carol Hope iPhone5S 1 9902 check stock [email protected] Pete Scott iPhone5S 2 9902 cancel order [email protected] Carol Hope iPhone5S 2

… … … … … …

case id activity name timestamp other data resource

Page 13: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

A more refined view

...

process case

activity activity instance

event

attribute

timestamp

resource*

1

*1

*

1

*1

1

**1

1

*a

b

c

d

e

f g

h

i

event level

costs

trans- action

model level

instance level

j

k

Page 14: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Transactional model for activities

schedule assign reassignsuspend

resumestart

autoskipmanualskip complete

abort_case

abort_activity

withdraw

successful termination

unsuccessful termination

Page 15: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Atomic activities

schedule assign reassignsuspend

resumestart

autoskipmanualskip complete

abort_case

abort_activity

withdraw

successful termination

unsuccessful termination

Page 16: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Activities that have a duration

schedule assign reassignsuspend

resume

start

autoskipmanualskip complete

abort_case

abort_activity

withdraw

successful termination

unsuccessful termination

Page 17: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Converting Event Data to XES • Easy with tools like ProM and Disco!

Import CSV file, then apply “Convert CSV to XES” plug-in.

Page 18: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Learn more about event logs

• Extensive introduction: Chapter 4 of Process Mining: Data Science in Action by W.M.P. van der Aalst, Springer Verlag, 2016 (ISBN 978-3-662-49850-7)

• XES standard: www.xes-standard.org • More advanced: W.M.P. van der Aalst. Extracting Event

Data from Databases to Unleash Process Mining. In BPM - Driving Innovation in a Digital World, Springer-Verlag, Berlin, 2015. http://dx.doi.org/10.1007/978-3-319-14430-6_8

Page 19: Event Logs - Process Mining · Event log • We assume the existence of an event log where each event refers to a case, an activity, and a point in time. • An event log can be seen

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Example logs files (XES)

• http://data.3tu.nl/repository/collection:event_logs_real • http://data.3tu.nl/repository/collection:event_logs_synthetic • http://www.win.tue.nl/ieeetfpm/doku.php?id=shared:process_mi

ning_logs • http://www.processmining.org/logs/start

Event data are everywhere! What is your excuse?