Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets

Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.

Presented by: Xiaoxi Du

Department of Computer Science, Kent State University

Do we yet fully understand financial market risks?

To describe frequent behaviors of individual companies

To describe the relationships between stock market change over time and stock return

Example: Trajectories on a Financial Grid

[Figure: company trajectories on a 10×10 financial grid (axes: P/B and SIZE). Companies shown: SJM (Smucker J M Co), GT (Goodyear Tire & Rubber Co), SBUX (Starbucks Corp), PU (Pullman Inc), WEC (Wisconsin Energy Corp).]

Financial Grid

SIZE: market capitalization = (share price × number of shares)

P/B: price-to-book ratio = (current price per share / book value per share)

Company Trajectory

Compact Trajectory
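
The two grid coordinates can be sketched in a few lines. This is a minimal illustration, not the authors' code: the SIZE and P/B formulas come from the slide, but the binning rule (equal-count quantile buckets over all companies in a given year) and the field names `price`, `bvps`, and `shares` are assumptions made for the example.

```python
from bisect import bisect_right


def size(share_price, shares_outstanding):
    """SIZE = share price x number of shares (market capitalization)."""
    return share_price * shares_outstanding


def price_to_book(share_price, book_value_per_share):
    """P/B = current price per share / book value per share."""
    return share_price / book_value_per_share


def grid_index(value, all_values, g=10):
    """Assumed binning: 1-based quantile bucket of `value` among `all_values`."""
    ranked = sorted(all_values)
    rank = bisect_right(ranked, value)           # how many values are <= value
    bucket = -(-rank * g // len(ranked))         # ceil(rank * g / n)
    return min(g, max(1, bucket))


def cell(company, universe, g=10):
    """Grid cell (P/B index, SIZE index) of one company within one year's universe."""
    pb_all = [price_to_book(c["price"], c["bvps"]) for c in universe]
    size_all = [size(c["price"], c["shares"]) for c in universe]
    pb = price_to_book(company["price"], company["bvps"])
    sz = size(company["price"], company["shares"])
    return grid_index(pb, pb_all, g), grid_index(sz, size_all, g)


# Hypothetical one-year universe of four companies.
universe = [
    {"price": 10.0, "bvps": 5.0, "shares": 100},
    {"price": 20.0, "bvps": 20.0, "shares": 40},
    {"price": 30.0, "bvps": 10.0, "shares": 200},
    {"price": 8.0, "bvps": 2.0, "shares": 500},
]
```

Calling `cell` once per year for the same company gives the sequence of grid cells that makes up that company's trajectory.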

[Figure: 10×10 grid with two trajectories, T1 and T2.]

Spatial and Temporal Constraint

[Figure axes: P/B, SIZE]

Spatial Constraint (ε): sub-trajectories are guaranteed to follow a bounded path.

Temporal Constraint (U): an upper-bound time constraint (short-term).

Migration Motif

A migration motif (pattern) corresponds to a collection of sub-trajectories that follow a similar path.

Properties:

- Pair-wise similarity: distance ≤ ε
- Maximal: adding any other sub-trajectory violates pair-wise similarity
- Frequent: the sub-trajectories come from at least θ different trajectories
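
The three properties can be checked directly on a candidate group. The sketch below is illustrative only: the slides do not define the distance function, so it assumes the largest per-step Chebyshev gap between aligned grid cells, and it represents each sub-trajectory as a hypothetical `(company, path)` pair.

```python
from itertools import combinations


def distance(p, q):
    """Assumed distance: largest per-step Chebyshev gap between aligned cells."""
    return max(max(abs(a[0] - b[0]), abs(a[1] - b[1])) for a, b in zip(p, q))


def pairwise_similar(paths, eps):
    """Every pair of paths is within eps of each other."""
    return all(distance(p, q) <= eps for p, q in combinations(paths, 2))


def is_maximal(members, candidates, eps):
    """No candidate outside the group can join without breaking similarity."""
    paths = [path for _, path in members]
    return not any(
        c not in members and pairwise_similar(paths + [c[1]], eps)
        for c in candidates
    )


def is_frequent(members, theta):
    """The group draws on at least theta different companies (trajectories)."""
    return len({company for company, _ in members}) >= theta


def is_motif(members, candidates, eps, theta):
    paths = [path for _, path in members]
    return (
        pairwise_similar(paths, eps)
        and is_maximal(members, candidates, eps)
        and is_frequent(members, theta)
    )


# Hypothetical 2-length sub-trajectories, one per company.
a = ("A", ((1, 1), (2, 1)))
b = ("B", ((1, 1), (2, 2)))
c = ("C", ((2, 2), (3, 2)))
d = ("D", ((5, 5), (6, 6)))
everything = [a, b, c, d]
```

With ε = 1 and θ = 3, the group `[a, b, c]` passes all three checks, while `[a, b]` fails maximality because `c` could still be added.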

Algorithm

Goal: to extract migration motifs efficiently

[Flowchart: Trajectories (company) → 2-length sub-trajectories → similarity graph → frequent 2-length migration motifs → frequent k-length migration motifs. Other labels in the chart: Apriori property; compact trajectory; pattern representation; graph-theoretical maximal clique.]
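
The first stages of the flowchart can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: a 2-length sub-trajectory is taken to be two consecutive grid cells, the distance is an assumed per-step Chebyshev gap, maximal cliques are enumerated with plain Bron-Kerbosch, and the temporal constraint U and the Apriori extension to k-length motifs are left out.

```python
def sub_trajectories(trajectories, k=2):
    """All windows of k consecutive cells, tagged with company and start index."""
    return [
        (company, start, tuple(cells[start:start + k]))
        for company, cells in trajectories.items()
        for start in range(len(cells) - k + 1)
    ]


def distance(p, q):
    """Assumed distance: largest per-step Chebyshev gap between aligned cells."""
    return max(max(abs(a[0] - b[0]), abs(a[1] - b[1])) for a, b in zip(p, q))


def similarity_graph(subs, eps):
    """Connect two sub-trajectories when their distance is at most eps."""
    adj = {i: set() for i in range(len(subs))}
    for i in range(len(subs)):
        for j in range(i + 1, len(subs)):
            if distance(subs[i][2], subs[j][2]) <= eps:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch enumeration (no pivoting) of all maximal cliques."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def frequent_motifs(trajectories, eps, theta, k=2):
    """Maximal cliques whose members come from at least theta companies."""
    subs = sub_trajectories(trajectories, k)
    motifs = []
    for clique in maximal_cliques(similarity_graph(subs, eps)):
        if len({subs[i][0] for i in clique}) >= theta:
            motifs.append(sorted(subs[i] for i in clique))
    return motifs


# Hypothetical trajectories: company -> yearly (P/B, SIZE) cells.
trajectories = {
    "A": [(1, 1), (2, 1), (3, 1)],
    "B": [(1, 1), (2, 1), (3, 2)],
    "C": [(1, 2), (2, 2), (9, 9)],
}
```

A maximal clique in the similarity graph is exactly a group that is pair-wise similar and cannot be extended, so the clique step covers the first two motif properties and the θ filter covers the third.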

Characteristics of the Datasets

Data Source: The Center for Research in Security Prices (CRSP) and Compustat databases

Time Period: 1964 to 2007

Parameters:

- Temporal Constraint: U = {3, 4, 5}
- Spatial Constraint: ε = {0, 1, 2}
- Minimum Support Level: θ = {10, 15, 20}
- Grid Dimensions: g = {10×10, 20×20, 50×50, 100×100}

Stock Exchanges and Description:

- NYSE: 1717 (relatively large)
- NASDAQ: 2675 (smaller)
- AMEX: 825 (mostly smaller)
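
The parameter values above define a small configuration grid. As a minimal sketch, the snippet below enumerates every combination and labels each one in the `10g/U3/ε1/θ10` style used on the result slides. The slides do not say whether every combination was actually run, so the full sweep is an assumption, and the names are hypothetical.

```python
from itertools import product

GRID_VALUES = (10, 20, 50, 100)   # g: grid is g x g
U_VALUES = (3, 4, 5)              # temporal constraint
EPS_VALUES = (0, 1, 2)            # spatial constraint
THETA_VALUES = (10, 15, 20)       # minimum support level


def label(g, u, eps, theta):
    """Setting label in the style of the figure captions, e.g. 10g/U3/ε1/θ10."""
    return f"{g}g/U{u}/ε{eps}/θ{theta}"


settings = [
    label(g, u, eps, theta)
    for g, u, eps, theta in product(GRID_VALUES, U_VALUES, EPS_VALUES, THETA_VALUES)
]
```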

Motif Sensitivity to Parameters

[Figure: NYSE motifs on the P/B × SIZE grid, two panels.
Panel "NYSE Motifs: (10g/U3/ε1/θ10)": M6-1, M5-59, M5-37, M5-58, M5-45, M5-25.
Panel "NYSE Motifs: (20g/U3/ε1/θ10)": M5-6, M5-13, M5-10, M5-16, M5-2, M3-42, M4-186, M3-433, M4-184, M4-101, M3-115.]

Result: NYSE

Motif Sensitivity to Parameters

[Figure: "NASDAQ Motifs: (50g/U3/ε1/θ10)" on the 50×50 P/B × SIZE grid: M6-2, M5-17, M5-3, M3-170, M3-304, M3-172, M3-22, M4-50.]

Result: NASDAQ

Statistical Significance of Motifs

The randomized data contains many 2-length motifs (M2). However, random motifs longer than 2 are quite rare.

Risk factor migration in the stock market is not random and should not be neglected.

Oscillation Motif Patterns

[Figure: "NYSE Motifs: (10g/U3/ε1/θ10)" on the 10×10 P/B × SIZE grid: M6-1, M5-59, M5-37, M5-58, M5-45, M5-25.]

Value oscillation (horizontal)

Size oscillation (vertical)
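
One way to tell the two oscillation types apart is sketched below. The slides give no formal definition, so this assumes that a path oscillates along an axis when it reverses direction at least once on that axis while the other coordinate stays fixed. Cells are written as `(P/B index, SIZE index)`, with P/B on the horizontal axis.

```python
def reversals(values):
    """Count direction changes, ignoring steps where the value does not move."""
    steps = [b - a for a, b in zip(values, values[1:]) if b != a]
    return sum(1 for s, t in zip(steps, steps[1:]) if (s > 0) != (t > 0))


def classify(path):
    """Label a motif path under the assumed oscillation definition."""
    pb = [c[0] for c in path]
    sz = [c[1] for c in path]
    if reversals(pb) >= 1 and len(set(sz)) == 1:
        return "value oscillation"   # horizontal: P/B moves back and forth
    if reversals(sz) >= 1 and len(set(pb)) == 1:
        return "size oscillation"    # vertical: SIZE moves back and forth
    return "other"
```

A looser variant could allow small drift in the fixed coordinate. The strict version is used here only to keep the example short.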

Distribution of Motifs

[Figure: motif distribution on the P/B × SIZE grid, two panels.
Panel "NYSE Motifs: (10g/U3/ε1/θ10)": M6-1, M5-59, M5-37, M5-58, M5-45, M5-25.
Panel "NASDAQ Motifs: (50g/U3/ε1/θ10)": M6-2, M5-17, M5-3, M3-170, M3-304, M3-172, M3-22, M4-50.]

Motif Timing

[Figure: bar chart of Average Starting Year (y-axis, 0 to 25) for Motifs by Length (2 to 6) and by Market (x-axis). Series: NYSE 10×10, NASDAQ 50×50, AMEX 50×50.]

- Average Starting Time
  - the point at which a company's migration pattern is first captured by a motif
  - maturity

- Average Staying Time
  - long term vs. short term

- Loser and Winner Portfolios

Motif Company Time Span

- To list membership information for typical motifs.

- To provide each company's ticker and time span.

- M5-45: time spans are highly concentrated for the value oscillation path

- M6-1: significant jumps

- M4-50: no clear clustering of starting years for the vertical oscillation path

Conclusion

We introduce two new algorithms to discover migration motifs in the financial grid

Our work is the first attempt to find multi-year migration patterns in financial datasets

We are the first to find long oscillation patterns in P/B value