LIP6, Université de Paris 1 - CRI. Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis. Sébastien Heymann, Bénédicte Le Grand. Emails: [email protected], [email protected]. May 30, 2013

Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis


DESCRIPTION

Talk IEEE RCIS 2013.


Page 1: Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis

LIP6, Université de Paris 1 - CRI

Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis

Sébastien Heymann, Bénédicte Le Grand

Emails: [email protected], [email protected]

May 30, 2013


Monitoring user-system interactions

What type of user-system interactions?

• user-invoked services in information systems

• social networks

• ...

What kind of monitoring?

• discovery

• conformance

• model improvement

Our ultimate goal: automatic and real-time anomaly detection.

Sebastien Heymann, Benedicte Le Grand — Monitoring User-System Interactions — May 30, 2013

2/28


Studied social network


GitHub interaction: code commit


GitHub interaction: bug report


Collected Dataset

[Figure: users linked to repositories through their interactions.]

Interaction examples

• commit code / merge repositories.

• open / close bug reports.

• comment on bug reports.

• edit the repository wiki.

"who contributes to which source code repository"

• 336,000 users and repositories monitored over 4 months.

• 2.2 million interactions recorded sequentially with timestamps.


Log trace sample

User, user, repository, event, timestamp

lukearmstrong, fuel, core, IssuesEvent, 1341420003
Try-Git, clarkeash, try git, CreateEvent, 1341420006
uGoMobi, jquery, jquery-mobile, IssuesEvent, 1341420009
jexp, neo4j, java-rest-binding, IssueCommentEvent, 1341420011
HosipLan, nette, nette, PullRequestEvent, 1341420152
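A minimal Python sketch of how such a trace could be read into timestamped links. It assumes one record per line, that the second column is the owner of the repository, and that a repository can be identified as "owner/repository"; the `Interaction` type and `parse_trace` function are hypothetical names, not part of the original tooling.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    user: str        # user who triggered the event
    repository: str  # assumed identifier: "owner/repository"
    event: str       # event type as it appears in the trace
    timestamp: int   # Unix time in seconds


def parse_trace(text: str) -> List[Interaction]:
    """Parse comma-separated records and return them ordered by timestamp."""
    interactions = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 5:
            continue  # skip malformed records
        actor, owner, repo, event, timestamp = fields
        interactions.append(
            Interaction(actor, f"{owner}/{repo}", event, int(timestamp))
        )
    return sorted(interactions, key=lambda item: item.timestamp)


SAMPLE = """\
lukearmstrong, fuel, core, IssuesEvent, 1341420003
Try-Git, clarkeash, try git, CreateEvent, 1341420006
uGoMobi, jquery, jquery-mobile, IssuesEvent, 1341420009
jexp, neo4j, java-rest-binding, IssueCommentEvent, 1341420011
HosipLan, nette, nette, PullRequestEvent, 1341420152
"""

trace = parse_trace(SAMPLE)
```

Each parsed record gives one (user, repository) link with its timestamp, which is the input needed for the graph-based analysis.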


Bipartite graph

[Figure: bipartite graph linking users (top) to repositories (bottom).]

⊤: users

⊥: repositories
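As an illustration, a bipartite graph of this kind can be stored as two adjacency maps, one per node set. This is only a sketch under that assumption; `build_bipartite` and the example names are hypothetical.

```python
from collections import defaultdict


def build_bipartite(links):
    """Build both sides of a bipartite graph from (user, repository) pairs.

    Returns (top, bottom): top maps each user to its repositories,
    bottom maps each repository to its users.
    """
    top = defaultdict(set)
    bottom = defaultdict(set)
    for user, repository in links:
        top[user].add(repository)
        bottom[repository].add(user)
    return dict(top), dict(bottom)


example_links = [
    ("alice", "repo1"),
    ("bob", "repo1"),
    ("alice", "repo2"),
]
top, bottom = build_bipartite(example_links)
```

Keeping the two node sets in separate maps avoids any clash when a user and a repository happen to share the same name.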


Links appear over time

[Animation: links between users and repositories appear one at a time.]

• Detection of statistically abnormal link dynamics?

• Model of link dynamics?

• Link prediction?


Methodology

1. Order links by timestamp.

2. Define a sliding window of width w (time unit?).

3. Extract the bipartite graph from each window at interval i.

4. Compute an appropriate property on each graph.

5. Analyze the time series.
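The five steps can be sketched in Python as follows. This is an illustration only: it assumes links are (timestamp, user, repository) triples with timestamps in seconds, uses the number of nodes as the property, and the function name `node_count_series` is hypothetical.

```python
from bisect import bisect_left


def node_count_series(links, width, step):
    """Number of nodes in each time window [start, start + width).

    links: iterable of (timestamp, user, repository) triples.
    width, step: window width w and interval i, in the timestamp unit.
    """
    links = sorted(links)                      # step 1: order by timestamp
    times = [timestamp for timestamp, _, _ in links]
    series = []
    if not links:
        return series
    start = times[0]
    while start <= times[-1]:                  # step 2: slide the window
        lo = bisect_left(times, start)
        hi = bisect_left(times, start + width)
        window = links[lo:hi]                  # step 3: links of this window
        users = {user for _, user, _ in window}
        repositories = {repo for _, _, repo in window}
        series.append((start, len(users) + len(repositories)))  # step 4
        start += step
    return series                              # step 5: analyze this series


example = [(0, "a", "x"), (1, "b", "x"), (5, "a", "y"), (6, "c", "z")]
result = node_count_series(example, width=2, step=2)
```

With Unix timestamps, a one-hour window moved every five minutes would correspond to `width=3600` and `step=300`.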


Example

[Figure: number of nodes over time (11 March to 18 July), showing a weekly pattern; a zoom on 15 April to 22 April shows a day-night pattern.]

w = 1 hour, i = 5 minutes.

Question: don't temporal patterns hide information?


Notions of time

Extrinsic time (real time)

Time measured in units such as seconds.

Good at revealing exogenous phenomena, e.g. day-night patterns.

Intrinsic time (related to graph dynamics)

Time measured in units such as the transition between two states of the graph.

Better at revealing endogenous phenomena independently from the graph dynamics?
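One way to read intrinsic time is to let each new link advance the clock by one unit, so that windows hold a fixed number of links instead of a fixed duration. The Python sketch below assumes that reading, with links as (timestamp, user, repository) triples and the number of nodes as the property; `intrinsic_node_count_series` is a hypothetical name.

```python
def intrinsic_node_count_series(links, width, step):
    """Number of nodes in windows of `width` consecutive links.

    links: iterable of (timestamp, user, repository) triples.
    width, step: window width and interval, both counted in links.
    Returns (number of links seen at the end of the window, node count) pairs.
    """
    ordered = sorted(links, key=lambda link: link[0])
    series = []
    for start in range(0, len(ordered) - width + 1, step):
        window = ordered[start:start + width]
        users = {user for _, user, _ in window}
        repositories = {repo for _, _, repo in window}
        series.append((start + width, len(users) + len(repositories)))
    return series


example = [(0, "a", "x"), (1, "b", "x"), (5, "a", "y"), (6, "c", "z")]
result = intrinsic_node_count_series(example, width=2, step=1)
```

Because every window contains the same number of links, a quiet night and a busy afternoon contribute equally, and the timestamps only serve to order the links.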


Window width: high resolution

[Figure: number of nodes as a function of intrinsic time (number of links, up to about 2,000,000).]

w = 1000 links, i = 100 links.

:) Additional observation


Window width: lower resolution

[Figure: number of nodes as a function of intrinsic time (number of links, up to about 2,000,000).]

w = 50,000 links, i = 1000 links.

:) No need for high resolution


Event validation

Visualization of the sub-graph: connected nodes are closer, disconnected nodes are more distant.

In the sub-graph of 8,370 nodes and 10,000 links at the time of the event, one node has a high number of links:

Try-Git interacts with 4,127 users (out of 5,000).


http://try.github.io


Towards automatic anomaly detection

Need for more elaborate properties, like:

Internal links

Their removal does not change the projection of the graph for a given set of nodes, either ⊤ or ⊥.

[Figure: a bipartite graph G and the graph G' = G - (red link); their ⊤-projections are identical, G'_⊤ = G_⊤.]
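The definition can be checked directly on a small graph: remove each link in turn and compare the ⊤-projection before and after. The Python sketch below does exactly that by brute force. It assumes the projection links two users when they share at least one repository, and that the ratio is taken over all links of the graph; the function names are hypothetical, and a faster method would be needed for large windows.

```python
from itertools import combinations


def top_projection(links):
    """Pairs of users sharing at least one repository (edges of the projection).

    The node set of the projection is the fixed set of users, so only
    the edges need to be compared.
    """
    users_by_repo = {}
    for user, repo in links:
        users_by_repo.setdefault(repo, set()).add(user)
    pairs = set()
    for users in users_by_repo.values():
        pairs.update(combinations(sorted(users), 2))
    return pairs


def top_internal_links(links):
    """Links whose removal leaves the top projection unchanged."""
    links = set(links)
    reference = top_projection(links)
    return {link for link in links
            if top_projection(links - {link}) == reference}


def top_internal_ratio(links):
    """Fraction of links that are top-internal (0.0 for an empty graph)."""
    links = set(links)
    return len(top_internal_links(links)) / len(links) if links else 0.0


example = {("u1", "r1"), ("u2", "r1"),
           ("u1", "r2"), ("u2", "r2"), ("u3", "r2")}
internal = top_internal_links(example)
ratio = top_internal_ratio(example)
```

In this example u1 and u2 already meet in r2, so both links to r1 are redundant for the projection, while every link to r2 supports at least one pair that exists nowhere else.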


Results

[Figure: ratio of ⊤-internal links (between 0.5 and 1.0) as a function of intrinsic time (number of links, 0 to 2,300,000). Each point is colored by class: not outlier, potential outlier, outlier, unknown. Detected events are labeled A to K.]

w = 10,000 links, i = 1000 links.

Color = outlier class using the automatic Outskewer method*.

* S. Heymann, M. Latapy and C. Magnien. Outskewer: Using Skewness to Spot Outliers in Samples and Time Series, IEEE ASONAM 2012


Conclusion

Contributions

• Graph-based methodology to monitor user-system interactions

• Intrinsic time unit avoids the impact of exogenous patterns

• Smaller windows not necessarily optimal

• Checked relevance of detected events

Applicable in other contexts

• Client-server architectures

• Processes-messages graphs

• File-provider graphs

• User-invoked services in information systems


Future work

• Which property for anomaly detection?

• Models of interaction dynamics

• Link prediction


Questions?

Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis

<[email protected]>


Thank You!

Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis

<[email protected]>


Backup Slides


Statistically significant anomalies

General definition

Values which deviate remarkably from the remainder of values (Grubbs, 1969)

Outskewer method*:

Our definition

Extremal value which skews a distribution of values.

* Heymann, Latapy and Magnien. Outskewer: Using Skewness to Spot Outliers in Samples and Time Series, IEEE ASONAM 2012


Skewness coefficient

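$$\gamma = \frac{n}{(n-1)(n-2)} \sum_{x \in X} \left( \frac{x - \mathrm{mean}}{\mathrm{standard\ deviation}} \right)^{3}$$

[Figure: two density curves, one with γ < 0 and one with γ > 0.]

Example of skewed distributions.

It is sensitive to extremal values (min/max) far from the mean!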
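A short Python sketch of this coefficient, to make the sensitivity to extremal values concrete. It assumes the formula n / ((n-1)(n-2)) times the sum of cubed standardized deviations, and that the standard deviation is the sample one (n - 1 denominator), which the slide does not specify; the function name `skewness` is hypothetical.

```python
from statistics import mean, stdev


def skewness(values):
    """Sample skewness coefficient of a list of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("at least 3 values are required")
    center = mean(values)
    spread = stdev(values)  # assumption: sample standard deviation
    if spread == 0:
        raise ValueError("skewness is undefined for constant values")
    total = sum(((x - center) / spread) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total


symmetric = skewness([1, 2, 3, 4, 5])
right_tail = skewness([1, 1, 1, 1, 10])
left_tail = skewness([-10, 1, 1, 1, 1])
```

A single large value pulls the coefficient above zero and a single small value pulls it below zero, while a symmetric sample stays at zero.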


Automatic anomaly detection

Outskewer classifies each value as:

• not outlier

• potential outlier

• outlier

or 'unknown' for heterogeneous distributions of values.

[Figure: example time series of Δpopulation per year (1900 to 2000), each point colored by status.]


Event detection in time series

On a sliding window of size w, each value of X is classified w times. The final class of a value is the one that appears the most.

[Figure: sliding window moving along the time axis.]
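The voting scheme can be sketched in Python as below. The per-window classifier is passed in as a function; `toy_classifier` is a hypothetical stand-in based on the median absolute deviation and is not the Outskewer method, whose details are not given on these slides. Values near the ends of the series fall in fewer than w windows.

```python
from collections import Counter
from statistics import median


def classify_series(values, width, classify_window):
    """Give each value the class it receives most often across windows.

    classify_window takes a list of values and returns one label per value.
    Ties are resolved in favor of the label seen first.
    """
    votes = [Counter() for _ in values]
    for start in range(len(values) - width + 1):
        labels = classify_window(values[start:start + width])
        for offset, label in enumerate(labels):
            votes[start + offset][label] += 1
    return [vote.most_common(1)[0][0] if vote else "unknown"
            for vote in votes]


def toy_classifier(window):
    """Hypothetical stand-in (NOT Outskewer): flag values far from the median."""
    center = median(window)
    spread = median(abs(x - center) for x in window)
    return ["outlier" if spread > 0 and abs(x - center) > 5 * spread
            else "not outlier"
            for x in window]


series = [10, 11, 10, 12, 11, 90, 10, 11, 12, 10]
classes = classify_series(series, width=5, classify_window=toy_classifier)
```

Because each window only needs the last w values, the same loop can run as new values arrive.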


Why Outskewer?

• makes no strong assumption about the data

• 1 parameter: the time window width

• ignores regime changes (shifts in normality)

• can be implemented on-line.
