Wil van der Aalst, Professor of Information Systems, TU/e

A new profession is emerging, just like computer science in the early 1980s!

Industry and Society need Data Scientists!

EIT-Digital Data Science Major

5 universities involved

3 entry universities

5 exit universities (specializations)

5 specializations

Distributed Systems & Data Mining for Really Big Data at KTH

Multimedia & Web Science for Big Data at UNS

Design, Implementation, and Usage of Data Science Instruments at TUB

Process Mining in High Tech Systems, Healthcare, Visual Analytics, or Big Software at TUE

Internet of Things (IoT) at UPM

Statistics on the first intake

• Total number of students: 46

• EU students: 54%

• Age: 20-29

• Female students: 17%

DSC/e

Data Science Center Eindhoven

http://www.tue.nl/dsce/

DSC/e: Competences and Research Programs

28 groups and 420+ people involved

Context: Why are we using data science, does it have the intended effect, and will people accept it?

Analysis: How to turn data into real value (models, answers/decisions, and visualizations/insights)?

Enabling technologies: How to get the data and deal with computational/infrastructural challenges (big data and hard questions)?

Probability and Statistics

Stochastic Networks

Data Mining

Process Mining

Visualization

Large-Scale Distributed Systems

Data-Intensive Algorithms

Data-Driven Operations Management

Data-Driven Innovation and Business

Human and Social Analytics

Privacy, Security, Ethics, and Governance

Internet of Things

[RP1] Process Analytics: Improving Service While Cutting Costs

[RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior

[RP3] Smart Maintenance & Diagnostics: Safeguarding Availability

[RP4] Quantified Self: Improving Performance and Well-Being

[RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science

[RP6] Smart Cities: Ensuring Safety and Convenience for Citizens

[RP7] Smart Grids: Data Intensive Infrastructures

Data Science Flagship (Philips & DSC/e)

4 strategic topics

• Data Driven Value Propositions

• Healthcare Smart Maintenance

• Optimizing Healthcare Workflows

• Continuous Personal Health

4 TU/e departments

16 PhD students

30 Data science specialists

Many more organizations

• BrandLoyalty

• Vanderlande Industries

• ASML

• SynerScope

• Magnaview

• Fluxicon

• Adversitement

• Rabobank

• ING

• SAP

• IBM

• PwC

• AMC

• …

Process Mining

Example: Process Mining as the Bridge Between Data Science and Process Science

Process Mining: Spreadsheet for behavior

• Input: events (“things that have happened”)

• Mandatory per event:

  • case identifier

  • activity name

  • timestamp/date

• Optional:

  • resource

  • transaction type

  • costs

  • …

[Figure: spreadsheet with columns case identifier, activity name, timestamp, and resource; each row = one event]
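
The event structure described above can be sketched as a small Python record type. This is only an illustration: the class name `Event`, the field names, and the sample rows are assumptions, not something taken from the slides or from a particular tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    # Mandatory attributes per event
    case_id: str
    activity: str
    timestamp: datetime
    # Optional attributes
    resource: Optional[str] = None
    transaction_type: Optional[str] = None
    cost: Optional[float] = None


# Hypothetical sample rows; each row is one event.
log = [
    Event("case-1", "register request", datetime(2015, 1, 5, 9, 0), resource="Ann"),
    Event("case-1", "final inspection", datetime(2015, 1, 7, 14, 30), resource="Bob", cost=120.0),
    Event("case-2", "register request", datetime(2015, 1, 6, 10, 15)),
]


def traces(events):
    """Group events by case and list each case's activities in timestamp order."""
    by_case = {}
    for e in sorted(events, key=lambda e: e.timestamp):
        by_case.setdefault(e.case_id, []).append(e.activity)
    return by_case
```

Grouping by case identifier and sorting by timestamp turns the flat table into one activity sequence (trace) per case, which is the form most process mining techniques start from.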

208 cases

5987 events

74 activities
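
Summary figures of this kind (cases, events, activities) can be computed directly from the event table. The sketch below assumes events are `(case, activity, timestamp)` tuples; the function name `summarize` and the toy log are hypothetical and unrelated to the data set on the slide.

```python
def summarize(events):
    """Count distinct cases, total events, and distinct activities in an event log."""
    cases = {case for case, _activity, _ts in events}
    activities = {activity for _case, activity, _ts in events}
    return {"cases": len(cases), "events": len(events), "activities": len(activities)}


# Hypothetical toy log: (case identifier, activity name, timestamp)
toy_log = [
    ("c1", "register", "2015-01-05T09:00"),
    ("c1", "check", "2015-01-06T11:00"),
    ("c2", "register", "2015-01-05T10:00"),
    ("c2", "check", "2015-01-07T08:30"),
    ("c2", "archive", "2015-01-08T16:00"),
]

print(summarize(toy_log))  # {'cases': 2, 'events': 5, 'activities': 3}
```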

Batching for activities “opstellen eindnota” (drawing up the final invoice) and “archiveren” (archiving)

Loesje van der Aalst

desire line

Process Discovery

NO modeling needed!

[Figure: event data turned into a process model, shown at three levels of detail: 74, 11, and 3 activities]
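
To illustrate how a model can be derived from event data without manual modeling, here is a minimal sketch that builds a directly-follows graph and then keeps only the frequent arcs. This is one simple discovery technique chosen for illustration, not necessarily the algorithm used for the slides; the function names, the threshold, and the toy log are assumptions.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often activity a is directly followed by activity b within a case."""
    traces = defaultdict(list)
    for case, activity, _ts in sorted(events, key=lambda e: e[2]):
        traces[case].append(activity)
    dfg = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            dfg[(a, b)] += 1
    return dfg


def keep_frequent(dfg, min_count):
    """Simplify the graph by dropping arcs seen fewer than min_count times."""
    return {arc: n for arc, n in dfg.items() if n >= min_count}


# Hypothetical toy log: (case identifier, activity name, ordinal timestamp)
toy_log = [
    ("c1", "a", 1), ("c1", "b", 2), ("c1", "c", 3),
    ("c2", "a", 1), ("c2", "c", 2),
    ("c3", "a", 1), ("c3", "b", 2), ("c3", "c", 3),
]

dfg = directly_follows(toy_log)
simple = keep_frequent(dfg, min_count=2)
```

Raising `min_count` yields a smaller, more abstract graph, which is one plausible way to move between detailed and simplified views of the same event data.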

Conformance Checking

[Figure labels: desire line; very safe system]

[Figure: event data compared with a process model (discovered or hand-made)]

Fitness of 93.5%

Final inspection is skipped 40 times

• Move on model: something should have happened, but did not

• Move on log: something happened that should not have happened
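
The two kinds of deviation can be made concrete with a small alignment sketch. It assumes, as a strong simplification, that the model allows exactly one activity sequence; real process models usually allow many, and dedicated alignment algorithms handle that. The function name `align` and the activity names are hypothetical.

```python
def align(trace, model):
    """Align an observed trace with a single reference sequence.

    Returns a list of (move type, activity) pairs using as few
    non-synchronous moves as possible (dynamic programming).
    """
    n, m = len(trace), len(model)
    # cost[i][j] = minimal number of deviations aligning trace[:i] with model[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + 1
            if trace[i - 1] == model[j - 1]:
                best = min(best, cost[i - 1][j - 1])
            cost[i][j] = best
    # Walk back from the end to recover the moves.
    moves, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and trace[i - 1] == model[j - 1] and cost[i][j] == cost[i - 1][j - 1]:
            moves.append(("sync", trace[i - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            moves.append(("move on log", trace[i - 1]))  # happened, but should not have
            i -= 1
        else:
            moves.append(("move on model", model[j - 1]))  # should have happened, but did not
            j -= 1
    moves.reverse()
    return moves


reference = ["register", "check", "final inspection", "archive"]
skipped = align(["register", "check", "archive"], reference)
extra = align(["register", "check", "call customer", "final inspection", "archive"], reference)
```

Here `skipped` contains a move on model for "final inspection" (the step was skipped), and `extra` contains a move on log for "call customer" (a step the reference sequence does not allow).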

Performance Analysis

Average flow time is 1.92 months

bottleneck

NO modeling needed!
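
An average flow time can be derived from timestamps alone. The sketch below assumes that the flow time of a case is the time between its first and its last event; the function name `average_flow_time` and the toy data are hypothetical.

```python
from datetime import datetime, timedelta


def average_flow_time(events):
    """Average time between the first and last event of each case."""
    first, last = {}, {}
    for case, _activity, ts in events:
        first[case] = min(first.get(case, ts), ts)
        last[case] = max(last.get(case, ts), ts)
    durations = [last[case] - first[case] for case in first]
    return sum(durations, timedelta()) / len(durations)


# Hypothetical toy log: (case identifier, activity name, timestamp)
toy_log = [
    ("c1", "register", datetime(2015, 1, 1)),
    ("c1", "archive", datetime(2015, 1, 11)),
    ("c2", "register", datetime(2015, 1, 2)),
    ("c2", "archive", datetime(2015, 1, 6)),
]

print(average_flow_time(toy_log))  # 7 days, 0:00:00
```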

Waiting time of 15.74 days

NO modeling needed!
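
A waiting time between two activities can be estimated in a similar way. This sketch assumes one timestamp per event, so the measured gap between two consecutive events mixes waiting and service time; logs that record separate start and complete events allow a sharper split. The function name and the activity names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def average_waiting_time(events, source, target):
    """Average gap between `source` and a directly following `target` in the same case."""
    traces = defaultdict(list)
    for case, activity, ts in sorted(events, key=lambda e: e[2]):
        traces[case].append((activity, ts))
    gaps = []
    for steps in traces.values():
        for (a, t1), (b, t2) in zip(steps, steps[1:]):
            if a == source and b == target:
                gaps.append(t2 - t1)
    # None signals that the pair never occurs consecutively.
    return sum(gaps, timedelta()) / len(gaps) if gaps else None


# Hypothetical toy log: (case identifier, activity name, timestamp)
toy_log = [
    ("c1", "check", datetime(2015, 1, 1)),
    ("c1", "final inspection", datetime(2015, 1, 21)),
    ("c2", "check", datetime(2015, 1, 3)),
    ("c2", "final inspection", datetime(2015, 1, 13)),
]

wait = average_waiting_time(toy_log, "check", "final inspection")
```

Comparing such averages across all pairs of consecutive activities is one simple way to spot where cases spend most of their time.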

Animating Reality

NO modeling needed!

Real cases

16 cases are queueing

[Figure labels: deviations, time, costs; questions: What? Where? Why?]

Conclusion

• Need for Data Scientists!

• Wonderful Data Science Master Program with 3 entry points and 5 specializations

  • Ask Farideh Heidari (f.heidari@tue.nl) for details!

• Zoomed in on the Data Science ecosystem in Eindhoven: Data Science Center Eindhoven (DSC/e)

• Zoomed in on a particular Data Science topic: Process Mining (linking processes and data)

masterschool.eitdigital.eu

More information?

http://www.masterschool.eitdigital.eu/programmes/dsc/

https://www.coursera.org/course/procmin/

http://www.processmining.org/

http://www.tue.nl/dsce/

http://vdaalst.com/