Process-oriented System Analysis Process Mining


Page 1: Process-oriented System Analysis

Process-oriented System Analysis: Process Mining

Page 2: Process-oriented System Analysis

BPM Lifecycle

Page 3: Process-oriented System Analysis

Motivation

Up until now:
  • Designed or pre-defined models
  • Assumption that they are appropriate

Process Mining
  • Consideration of information from the execution of processes
  • This information is captured in log data

Logs
  • Sequence of log entries that capture events in a company that relate to processes

Page 4: Process-oriented System Analysis

Log entries

Examples of log entries
  • Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57
  • Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24
  • Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18
  • Function ContactCustomer(c1987, PromoMailing) completed on 12.11.2010 at 9:24:10
  • Function StoreCustomerData(„Miller“, c1988, „Osnabrück“) completed on 12.11.2010 at 9:26:08
  • Check Invoice for Invoice No. 4568 completed on 12.11.2010 at 9:26:38
  • Function ContactCustomer(c1988, PromoMailing) completed on 12.11.2010 at 9:27:32

Page 5: Process-oriented System Analysis

Logs bear valuable information

Logs bear valuable information to answer questions like:
  • When and how many process instances have been executed?
  • Are there recurring patterns in the execution of activities?
  • Can process models be derived from the data?
  • Which paths of execution are used how often in the process models?
  • Are there paths which are never taken?

Page 6: Process-oriented System Analysis

Process Discovery

Process Discovery is a technique for deriving a process model from log data

Input: execution logs as ordered lists of activities with time stamp and case id

Output: process model which could have generated the execution logs

The case id is often not directly covered in the data, and needs to be generated in pre-processing

Page 7: Process-oriented System Analysis

Process Conformance

Process Conformance is a technique to analyze the relationship between log data and process models

Input: logs and process model

Output: information on the relationship, e.g. fitness

Page 8: Process-oriented System Analysis

Overview

Page 9: Process-oriented System Analysis

Execution Logs

Assumption
  • The execution log defines a complete order of events, which can all be related to process activities
  • All events in the execution log relate to process instances of the considered process

Hint
  • Log entries often refer to different process models
  • This warrants filtering activities

Abstraction
  • Techniques often work on an abstraction of logs
  • Focus on case id and activities

Page 10: Process-oriented System Analysis

Execution Log Format

Log format
  • (caseID, activity)

Example
  • Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57
  • Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24
  • Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18

Resulting log
  • (4567, Check Invoice), (c1987, StoreCustomerData), (4567, Send Invoice), etc.
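The abstraction from raw entries to (caseID, activity) pairs could be sketched as follows. This is a minimal illustration, not the slides' method: the regular expressions, the function name to_pair, and the rule that the invoice number or the customer id (c followed by digits) serves as case id are assumptions that merely reproduce the example above.

```python
import re

# Hypothetical patterns for the two entry shapes shown on the slide.
INVOICE = re.compile(r"^(?P<activity>.+?) for Invoice No\. (?P<case>\d+) completed on")
FUNCTION = re.compile(r"^Function (?P<activity>\w+)\((?P<args>.*?)\) completed on")
CUSTOMER_ID = re.compile(r"\bc\d+\b")  # assumed shape of a customer id, e.g. c1987


def to_pair(entry):
    """Map one raw log entry to (caseID, activity), or None if it is not recognized."""
    m = INVOICE.match(entry)
    if m:
        return (m.group("case"), m.group("activity"))
    m = FUNCTION.match(entry)
    if m:
        cid = CUSTOMER_ID.search(m.group("args"))
        if cid:
            return (cid.group(0), m.group("activity"))
    return None


entries = [
    "Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57",
    "Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24",
    "Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18",
]
pairs = [to_pair(e) for e in entries]
print(pairs)
```

Entries that match neither pattern yield None, which is one simple way to filter out events that belong to other processes.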

Page 11: Process-oriented System Analysis

Execution Log

Further abstraction
  • A's and B's
  • (case id, task id)

Additional information
  • Event type, time, resource, data
  • Not considered here

Assumption
  • Activity execution is captured by one event
  • No intermediate activities

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

Page 12: Process-oriented System Analysis

The Alpha Algorithm

Page 13: Process-oriented System Analysis

Process Discovery Algorithms

Simplest algorithm: the α-Algorithm
  • Relatively simple; some properties can be proven
  • Affected by noise, therefore not the first choice in practice
  • Noise refers to incomplete or erroneous logs

Furthermore, the α+ and α++ algorithms
  • α+ and α++ are extensions of the α-Algorithm for recognizing more fine-grained structure in the process model
  • Also affected by noise

Finally, techniques for dealing with noise

Page 14: Process-oriented System Analysis

Definitions

Let T be a set of activities (tasks) and T* the set of all sequences of arbitrary length over T. Then:
  • σ ∈ T* is called an execution sequence if all activities in σ belong to the same process instance
  • W ⊆ T* is called an execution log (workflow log)

Assumptions
  • In each process model, each activity appears at most once
  • Each direct neighbor relation between activities is represented at least once

Page 15: Process-oriented System Analysis

Execution Logs

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

Page 16: Process-oriented System Analysis

Execution Logs

Execution sequences:
  • Case 1: ABCD
  • Case 2: ACBD
  • Case 3: ABCD
  • Case 4: ACBD
  • Case 5: EF

Resulting workflow log: W = {ABCD, ACBD, EF}
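The step from the interleaved log to the workflow log W can be sketched as below. The function name traces_by_case and the tuple encoding of events are illustrative assumptions; the event list is the example log in its original order.

```python
from collections import defaultdict

# The example log as (case id, task id) pairs, in the order of occurrence.
events = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]


def traces_by_case(events):
    """Group events by case id, preserving their order within each case."""
    traces = defaultdict(str)
    for case, task in events:
        traces[case] += task
    return dict(traces)


traces = traces_by_case(events)
W = set(traces.values())  # duplicates collapse: W is a set of execution sequences
print(traces)
print(W)
```

Because W is a set, the two cases with sequence ABCD and the two with ACBD each contribute only one element.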

Page 17: Process-oriented System Analysis

Order relations

Log-based order relations for pairs of activities a, b ∈ T in a workflow log W:
  • Direct successor: a >w b, i.e. in some execution sequence b directly follows a
  • Causality: a →w b, i.e. a >w b and not b >w a
  • Concurrency: a ║w b, i.e. a >w b and b >w a
  • Exclusiveness: a #w b, i.e. not a >w b and not b >w a (activity pairs which never directly succeed each other)
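The four relations can be computed directly from their definitions. The following is a minimal sketch; the function name order_relations and the encoding of execution sequences as strings of single-letter tasks are assumptions made for illustration.

```python
from itertools import product


def order_relations(log):
    """Derive the four log-based relations from a set of execution sequences."""
    tasks = sorted({t for trace in log for t in trace})
    # a >w b: b directly follows a in at least one sequence
    direct = {(s[i], s[i + 1]) for s in log for i in range(len(s) - 1)}
    # a ->w b: a >w b and not b >w a
    causal = {(a, b) for (a, b) in direct if (b, a) not in direct}
    # a ||w b: a >w b and b >w a
    parallel = {(a, b) for (a, b) in direct if (b, a) in direct}
    # a #w b: neither a >w b nor b >w a
    exclusive = {(a, b) for a, b in product(tasks, repeat=2)
                 if (a, b) not in direct and (b, a) not in direct}
    return direct, causal, parallel, exclusive


direct, causal, parallel, exclusive = order_relations({"ABCD", "ACBD", "EF"})
print(sorted(causal))
print(sorted(parallel))
```

Note that under this definition every task in the example is exclusive with itself, since no task directly follows itself in W.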

Page 18: Process-oriented System Analysis

Execution log analysis

W = {ABCD, ACBD, EF}
  • Direct successor
  • Causality
  • Concurrency

Page 19: Process-oriented System Analysis

Execution log analysis

W = {ABCD, ACBD, EF}

1) Direct successor: A > B, A > C, B > C, B > D, C > B, C > D, E > F
2) Causality: A → B, A → C, B → D, C → D, E → F
3) Concurrency: B || C, C || B

Page 20: Process-oriented System Analysis

α-Algorithm

The idea is to use the order relations to derive a workflow net that is compliant with these relations

More precisely, each order relation results in a Petri net fragment that enforces the respective relationship

Page 21: Process-oriented System Analysis

α-Algorithm

Idea (a)

a → b

Page 22: Process-oriented System Analysis

α-Algorithm

Idea (b)

a → b, a → c and b # c

Page 23: Process-oriented System Analysis

α-Algorithm

Idea (c)

b → d, c → d and b # c

Page 24: Process-oriented System Analysis

α-Algorithm

Idea (d)

a → b, a → c and b || c

Page 25: Process-oriented System Analysis

α-Algorithm

Idea (e)

b → d, c → d and b || c

Page 26: Process-oriented System Analysis

The Alpha-Algorithm (simplified)

1. Identify the set of all tasks in the log as TL.
2. Identify the set of all tasks that have been observed as the first task in some case as TI.
3. Identify the set of all tasks that have been observed as the last task in some case as TO.
4. Identify the set of all connections to be potentially represented in the process model as a set XL. Add the following elements to XL:
   a. Pattern (a): all pairs for which a→b holds.
   b. Pattern (b): all triples for which a→(b#c) holds.
   c. Pattern (c): all triples for which (b#c)→d holds.

Note that triples for which Pattern (d) a→(b||c) or Pattern (e) (b||c)→d holds are not included in XL.

Page 27: Process-oriented System Analysis

The Alpha-Algorithm (cont.)

5. Construct the set YL as a subset of XL by:
   a. Eliminating a→b and a→c if there exists some a→(b#c).
   b. Eliminating b→d and c→d if there exists some (b#c)→d.
6. Connect start and end events in the following way:
   a. If there are multiple tasks in the set TI of first tasks, then draw a start event leading to an XOR-split, which connects to every task in TI. Otherwise, directly connect the start event with the only first task.
   b. For each task in the set TO of last tasks, add an end event and draw an arc from the task to the end event.

Page 28: Process-oriented System Analysis

The Alpha-Algorithm (cont.)

7. Construct the flow arcs in the following way:
   a. Pattern (a): For each a→b in YL, draw an arc from a to b.
   b. Pattern (b): For each a→(b#c) in YL, draw an arc from a to an XOR-split, and from there to b and c.
   c. Pattern (c): For each (b#c)→d in YL, draw an arc from b and c to an XOR-join, and from there to d.
   d. Patterns (d) and (e): If a task in the process model constructed so far has multiple incoming or multiple outgoing arcs, bundle these arcs with an AND-join or AND-split, respectively.
8. Return the newly constructed process model.
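Steps 1 to 5 of the simplified algorithm can be sketched in code as follows; drawing the model (steps 6 to 8) is left out. The function name alpha_sets, the tuple encodings (a, b) for pairs, (a, b, c) for a→(b#c) and (b, c, d) for (b#c)→d, and the restriction to triples with b different from c are assumptions of this sketch, not part of the slides.

```python
from itertools import combinations


def alpha_sets(log):
    """Steps 1-5: return TL, TI, TO and the three parts of YL."""
    TL = {t for s in log for t in s}   # step 1: all tasks
    TI = {s[0] for s in log}           # step 2: first tasks
    TO = {s[-1] for s in log}          # step 3: last tasks
    direct = {(s[i], s[i + 1]) for s in log for i in range(len(s) - 1)}
    causal = {(a, b) for (a, b) in direct if (b, a) not in direct}

    def exclusive(x, y):
        return (x, y) not in direct and (y, x) not in direct

    # Step 4: candidate connections XL, split by pattern
    pairs = set(causal)                                       # pattern (a)
    splits = {(a, b, c) for a in TL
              for b, c in combinations(sorted(TL), 2)
              if (a, b) in causal and (a, c) in causal and exclusive(b, c)}
    joins = {(b, c, d) for d in TL
             for b, c in combinations(sorted(TL), 2)
             if (b, d) in causal and (c, d) in causal and exclusive(b, c)}

    # Step 5: drop pairs that are already covered by a triple
    for a, b, c in splits:
        pairs -= {(a, b), (a, c)}
    for b, c, d in joins:
        pairs -= {(b, d), (c, d)}
    return TL, TI, TO, pairs, splits, joins


TL, TI, TO, pairs, splits, joins = alpha_sets({"ABCD", "ACBD", "EF"})
print(sorted(TI), sorted(TO), sorted(pairs), splits, joins)
```

For the example log no XOR patterns are found, because B and C are concurrent rather than exclusive. The two outgoing arcs of A and the two incoming arcs of D would then be bundled by an AND-split and an AND-join in step 7d.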

Page 29: Process-oriented System Analysis

α-Algorithm Example

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

Page 30: Process-oriented System Analysis

α-Algorithm Example

α(W): [Figure: process model derived by the α-Algorithm from the example log]

Page 31: Process-oriented System Analysis

Log Completeness

Level of completeness required for a log
  • Assume the execution sequence EF is missing from the log
  • Then the correct process model cannot be derived

Basic assumption: each execution sequence must be part of the log
  • Consequence: the complete behaviour is visible
  • Problem: the number of required instances grows dramatically
  • Example: 10 activities are executed in parallel
    Number of potential execution sequences: 10! = 3,628,800

Page 32: Process-oriented System Analysis

Log Completeness

Result
  • For the α-Algorithm it is sufficient to have completeness in terms of the successor relationship (>w)

Reason
  • All other relations are derived from the direct successor relation

Interpretation
  • Each time two activities may directly succeed each other, this must be visible in at least one execution sequence

Hint
  • In the case of highly concurrent process models, this reduces the number of required execution sequences dramatically
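The size of the reduction can be illustrated for the example of 10 parallel activities. This is a rough sketch under the assumption that every execution sequence consists of exactly these 10 activities; the variable names are illustrative, and min_traces is only a lower bound on the number of sequences needed, not a claim that so few always suffice.

```python
import math

n = 10  # activities executed in parallel, as in the example

# Full completeness: every possible interleaving must appear in the log.
all_sequences = math.factorial(n)

# Completeness with respect to >w: every ordered pair of distinct
# activities must appear as direct successors at least once.
successor_pairs = n * (n - 1)

# Each sequence of n activities shows n - 1 direct-successor pairs,
# so at least this many sequences are needed to show all pairs.
min_traces = math.ceil(successor_pairs / (n - 1))

print(all_sequences, successor_pairs, min_traces)
```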

Page 33: Process-oriented System Analysis

Summary

• Execution Logs
• Process Mining using the Alpha-Algorithm