Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series. Sergey Kirshner, UC Irvine; Padhraic Smyth, UC Irvine; Andrew Robertson, IRI. July 10, 2004.

Page 1: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Sergey Kirshner, UC Irvine

Padhraic Smyth, UC Irvine

Andrew Robertson, IRI

July 10, 2004

Page 2: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

UAI-2004 © Sergey Kirshner, UC Irvine

Overview

• Data and its modeling aspects

• Model description
  – General approach
    • Hidden Markov models
  – Capturing data properties
    • Chow-Liu trees
    • Conditional Chow-Liu trees

• Inference and Learning

• Experimental Results

• Summary and Future Extensions

Page 3: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Snapshot of the Data

[Figure: data matrix with rows indexed by series 1, 2, …, N and columns indexed by time points 1, 2, …, T]

Page 4: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Data Aspects

• Correlation
  – Spatial dependence

• Temporal structure
  – First order dependence

• Variability of individual series
  – Interannual variability

Page 5: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Modeling Precipitation Occurrence

Southwestern Australia, 1978-92

Western US, 1952-90

Page 6: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

A Bit of Notation

• Vector time series $R$
  – $R_{1:T} = R_1, \ldots, R_T$

• Vector observation of $R$ at time $t$
  – $R_t = (A_t, B_t, \ldots, Z_t)$

[Figure: the vectors $R_1, R_2, \ldots, R_T$, each grouping its components $A_t, B_t, C_t, \ldots, Z_t$]

Page 7: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Weather Generator

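[Figure: graphical model over $R_1, R_2, \ldots, R_T$, split into a separate chain for each component $A_t, B_t, C_t, \ldots, Z_t$]

$$P(R_{1:T}) = P(R_1) \prod_{t=2}^{T} P(R_t \mid R_{t-1}) = \prod_{c \in \{A, \ldots, Z\}} \left[ P(c_1) \prod_{t=2}^{T} P(c_t \mid c_{t-1}) \right]$$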

• Does not take correlation into account
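
To make the factorization concrete, here is a minimal sketch of the log-likelihood of one sequence under independent first-order Markov chains, one per component. The function name, the array layout, and the toy parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def chain_log_likelihood(R, init, trans):
    """log P(R_{1:T}) under independent first-order Markov chains.

    R     : (T, M) int array, R[t, c] is the value of series c at time t
    init  : (M, V) array, init[c, v] = P(c_1 = v)
    trans : (M, V, V) array, trans[c, u, v] = P(c_t = v | c_{t-1} = u)
    (array layout is an assumption made for this sketch)
    """
    T, M = R.shape
    cols = np.arange(M)
    # First time point: product over components of P(c_1).
    ll = np.log(init[cols, R[0]]).sum()
    # Later time points: product over components of P(c_t | c_{t-1}).
    for t in range(1, T):
        ll += np.log(trans[cols, R[t - 1], R[t]]).sum()
    return ll

# Toy example with M = 2 binary series (values are made up).
init = np.array([[0.5, 0.5], [0.5, 0.5]])
trans = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.9, 0.1], [0.2, 0.8]]])
R = np.array([[0, 1], [0, 1], [1, 1]])
ll = chain_log_likelihood(R, init, trans)
```

Because the sum runs over components independently, no term ever couples two different series, which is the limitation the bullet above points out.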

Page 8: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Hidden Markov Model

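[Figure: graphical model with hidden states $S_1, S_2, \ldots, S_t, \ldots, S_{T-1}, S_T$ and observations $R_1, R_2, \ldots, R_t, \ldots, R_{T-1}, R_T$]

$$P(R_{1:T}, S_{1:T}) = \left[ P(S_1) \prod_{t=2}^{T} P(S_t \mid S_{t-1}) \right] \left[ \prod_{t=1}^{T} P(R_t \mid S_t) \right]$$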

Page 9: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

HMM-Conditional-Independence

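[Figure: HMM graphical model over $S_1, \ldots, S_T$ and $R_1, \ldots, R_T$; given $S_t$, the components $A_t, B_t, C_t, \ldots, Z_t$ of $R_t$ are independent]

$$P(R_t \mid S_t) = P(A_t, \ldots, Z_t \mid S_t) = \prod_{c \in \{A, \ldots, Z\}} P(c_t \mid S_t)$$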
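
As a rough illustration of the conditional-independence emission, the sketch below evaluates $P(R_t \mid S_t = k)$ for every state $k$ as a product of per-component probabilities. The name `ci_emission`, the parameter layout `theta[k, c, v]`, and the numbers are assumptions made for the example.

```python
import numpy as np

def ci_emission(r_t, theta):
    """Return the vector of P(R_t = r_t | S_t = k) for k = 0..K-1.

    r_t   : (M,) int array, observed value of each component at time t
    theta : (K, M, V) array, theta[k, c, v] = P(c_t = v | S_t = k)
    (hypothetical layout chosen for this sketch)
    """
    M = r_t.shape[0]
    # Pick P(c_t = r_t[c] | S_t = k) for each state k and component c,
    # then multiply across components.
    return theta[:, np.arange(M), r_t].prod(axis=1)

# Toy example: K = 2 states, M = 2 binary components (made-up values).
theta = np.array([[[0.9, 0.1], [0.8, 0.2]],
                  [[0.3, 0.7], [0.4, 0.6]]])
probs = ci_emission(np.array([1, 1]), theta)
```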

Page 10: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

HMM-CI: Is It Sufficient?

• Simple yet effective

• Requires large number of values for St

• Emissions can be made to capture more spatial dependencies

Page 11: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Chow-Liu Trees

• Approximation of a joint distribution with a tree-structured distribution [Chow and Liu 68]

Page 12: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Illustration of CL-Tree Learning

Pair   Joint distribution          MI
AB     (0.56, 0.11, 0.02, 0.31)    0.3126
AC     (0.51, 0.17, 0.17, 0.15)    0.0229
AD     (0.53, 0.15, 0.19, 0.13)    0.0172
BC     (0.44, 0.14, 0.23, 0.19)    0.0230
BD     (0.46, 0.12, 0.26, 0.16)    0.0183
CD     (0.64, 0.04, 0.08, 0.24)    0.2603

[Figure: graph on nodes A, B, C, D, shown at successive steps of tree construction]
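
The tree-building step can be reproduced from the MI values listed on this slide. The sketch below uses Kruskal's algorithm with a small union-find; the choice of Kruskal and the helper name `maximum_spanning_tree` are assumptions for illustration, since the slides only call for a maximum spanning tree.

```python
def maximum_spanning_tree(nodes, weights):
    """Kruskal's algorithm; weights maps (u, v) -> mutual information."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component root (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    # Visit edges from highest to lowest MI, keeping those that join
    # two different components.
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree

# MI values for the six pairs, copied from the slide.
mi = {("A", "B"): 0.3126, ("A", "C"): 0.0229, ("A", "D"): 0.0172,
      ("B", "C"): 0.0230, ("B", "D"): 0.0183, ("C", "D"): 0.2603}
tree = maximum_spanning_tree("ABCD", mi)
print(tree)  # [('A', 'B'), ('C', 'D'), ('B', 'C')]
```

With these weights the two strong edges AB and CD are picked first, and BC (0.0230) narrowly beats AC (0.0229) as the edge joining the two halves.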

Page 13: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Chow-Liu Trees

• Approximation of a joint distribution with a tree-structured distribution [Chow and Liu 68]

• Learning the structure and the probabilities
  – Compute individual and pairwise marginal distributions for all pairs of variables
  – Compute mutual information (MI) for each pair of variables:
    $$\mathrm{MI}(X, Y) = \sum_{X, Y} P(X, Y) \log \frac{P(X, Y)}{P(X)\,P(Y)}$$
  – Build a maximum spanning tree for a complete graph with variables as nodes and MIs as weights

• Properties
  – Efficient: $O(\#\text{samples} \times (\#\text{variables})^2 \times (\#\text{values per variable})^2)$
  – Optimal: the maximum-likelihood tree-structured distribution
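
The first two learning steps (pairwise marginals, then MI) can be sketched as follows. The helper names and the use of natural logarithms are assumptions; the slides do not state the log base.

```python
import numpy as np

def empirical_joint(x, y, V):
    """Empirical joint distribution of two discrete series with V values."""
    counts = np.zeros((V, V))
    np.add.at(counts, (x, y), 1)  # counts[x[i], y[i]] += 1 for every sample
    return counts / counts.sum()

def mutual_information(joint):
    """MI(X, Y) = sum_{x,y} P(x,y) log(P(x,y) / (P(x) P(y))), in nats."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)  # (V, 1) marginal of X
    py = joint.sum(axis=0, keepdims=True)  # (1, V) marginal of Y
    mask = joint > 0                       # treat 0 log 0 as 0
    return float((joint[mask] * np.log(joint[mask] / (px @ py)[mask])).sum())

# Two identical binary series carry log(2) nats of mutual information;
# independent uniform series carry none.
x = np.array([0, 0, 1, 1])
mi_same = mutual_information(empirical_joint(x, x, V=2))
mi_indep = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```

Computing all pairwise joints costs one pass over the samples per pair of variables, which is where the quadratic dependence on the number of variables in the complexity bound comes from.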

Page 14: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

HMM-Chow-Liu

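[Figure: HMM graphical model over $S_1, \ldots, S_T$ and $R_1, \ldots, R_T$; the emission $P(R_t \mid S_t)$ is a separate Chow-Liu tree over $A_t, B_t, C_t, D_t$ for each state: $T_1(R_t)$ for $S_t = 1$, $T_2(R_t)$ for $S_t = 2$, $T_3(R_t)$ for $S_t = 3$]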

Page 15: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Improving on Chow-Liu Trees

• Tree edges with low MI add little to the approximation.

• Observations from the previous time point can be more relevant than from the current one.

• Idea: build a Chow-Liu tree that may include variables from both the current and the previous time point.

Page 16: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Conditional Chow-Liu Forests

• Extension of Chow-Liu trees to conditional distributions
  – Approximation of a conditional multivariate distribution with a tree-structured distribution
  – Uses MI to build maximum spanning trees (forest)
    • Variables of two consecutive time points as nodes
    • All nodes corresponding to the earlier time point considered connected before the tree construction
  – Same asymptotic complexity as Chow-Liu trees
    • $O(\#\text{samples} \times (\#\text{variables})^2 \times (\#\text{values per variable})^2)$
  – Optimal

Page 17: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Example of CCL-Forest Learning

Pair   Joint distribution          MI
AB     (0.56, 0.11, 0.02, 0.31)    0.3126
AC     (0.51, 0.17, 0.17, 0.15)    0.0229
BC     (0.44, 0.14, 0.23, 0.19)    0.0230
A’A    (0.57, 0.11, 0.11, 0.21)    0.1207
A’B    (0.51, 0.17, 0.07, 0.25)    0.1253
A’C    (0.54, 0.14, 0.14, 0.18)    0.0623
B’A    (0.52, 0.07, 0.16, 0.25)    0.1392
B’B    (0.48, 0.10, 0.11, 0.31)    0.1700
B’C    (0.47, 0.11, 0.21, 0.21)    0.0559
C’A    (0.48, 0.20, 0.20, 0.12)    0.0033
C’B    (0.41, 0.26, 0.17, 0.16)    0.0030
C’C    (0.53, 0.14, 0.14, 0.19)    0.0625

[Figure: graph on earlier-time nodes A’, B’, C’ and current-time nodes A, B, C, shown at successive steps of forest construction]
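
The forest construction can be sketched with the MI values from this slide. As on the previous slide, the earlier-time nodes are treated as already connected before any edge is added. Kruskal's algorithm and the function name `conditional_chow_liu_forest` are illustrative assumptions rather than the authors' implementation.

```python
def conditional_chow_liu_forest(current, previous, weights):
    """Greedy maximum-weight edge selection with `previous` pre-connected."""
    parent = {n: n for n in list(current) + list(previous)}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Merge all earlier-time nodes into one component up front, so no
    # selected edge can form a cycle through them.
    for n in previous[1:]:
        parent[find(n)] = find(previous[0])

    forest = []
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest

# MI values copied from the slide; primes mark the earlier time point.
mi = {("A", "B"): 0.3126, ("A", "C"): 0.0229, ("B", "C"): 0.0230,
      ("A'", "A"): 0.1207, ("A'", "B"): 0.1253, ("A'", "C"): 0.0623,
      ("B'", "A"): 0.1392, ("B'", "B"): 0.1700, ("B'", "C"): 0.0559,
      ("C'", "A"): 0.0033, ("C'", "B"): 0.0030, ("C'", "C"): 0.0625}
forest = conditional_chow_liu_forest(["A", "B", "C"], ["A'", "B'", "C'"], mi)
print(forest)  # [('A', 'B'), ("B'", 'B'), ("C'", 'C')]
```

Each current-time variable ends up with exactly one selected edge, here one within the current time point (AB) and two reaching back to the previous one (B’B and C’C).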

Page 18: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

AR-HMM

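$$P(R_{1:T}, S_{1:T}) = \left[ P(S_1) \prod_{t=2}^{T} P(S_t \mid S_{t-1}) \right] \left[ P(R_1 \mid S_1) \prod_{t=2}^{T} P(R_t \mid S_t, R_{t-1}) \right]$$

[Figure: graphical model with hidden states $S_1, S_2, \ldots, S_{t-1}, S_t, \ldots, S_T$ and observations $R_1, R_2, \ldots, R_{t-1}, R_t, \ldots, R_T$, where each $R_t$ depends on both $S_t$ and $R_{t-1}$]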

Page 19: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

HMM-Conditional-Chow-Liu

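[Figure: AR-HMM graphical model over $S_1, S_2, \ldots, S_{t-1}, S_t, \ldots, S_T$ and $R_1, R_2, \ldots, R_{t-1}, R_t, \ldots, R_T$; the emission of $R_t$ given $S_t$ and $R_{t-1}$ is a separate conditional Chow-Liu forest over $A_{t-1}, B_{t-1}, C_{t-1}, D_{t-1}$ and $A_t, B_t, C_t, D_t$ for each state $S_t = 1$, $S_t = 2$, $S_t = 3$]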

Page 20: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Inference and Learning for HMM-CL and HMM-CCL

• Inference (calculating $P(S \mid R, \Theta)$)
  – Recursively calculate $P(R_{1:t}, S_t \mid \Theta)$ and $P(R_{t+1:T} \mid S_t, \Theta)$ (Forward-Backward)

• Learning (Baum-Welch or EM)
  – E-step: calculate $P(S \mid R, \Theta)$
    • Forward-Backward
    • Calculate $P(S_t \mid R, \Theta)$ and $P(S_t, S_{t+1} \mid R, \Theta)$
  – M-step:
    • Maximize $E_{P(S \mid R, \Theta)}\left[\log P(S, R \mid \Theta')\right]$
    • Similar to mixtures of Chow-Liu trees
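
The E-step quantities can be computed with a scaled forward-backward pass. This is a generic sketch for a single sequence: the matrix `B` of per-state emission likelihoods is assumed to have been evaluated beforehand (for HMM-CL from the state's Chow-Liu tree, for HMM-CCL from its conditional forest given $R_{t-1}$), and all names and toy numbers are illustrative.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Scaled forward-backward recursions for one sequence.

    pi : (K,)   initial state probabilities P(S_1)
    A  : (K, K) transitions, A[i, j] = P(S_t = j | S_{t-1} = i)
    B  : (T, K) emission likelihoods of the observed R_t under each state
    Returns gamma[t, k] = P(S_t = k | R),
            xi[t, i, j] = P(S_t = i, S_{t+1} = j | R),
            and log P(R).
    """
    T, K = B.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    scale = np.zeros(T)

    # Forward pass, normalizing at each step to avoid underflow.
    alpha[0] = pi * B[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass, reusing the forward scaling constants.
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1]) / scale[t + 1]

    gamma = alpha * beta
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[1:] * beta[1:])[:, None, :] / scale[1:, None, None])
    return gamma, xi, np.log(scale).sum()

# Toy example: K = 2 states, T = 3 time points (made-up numbers).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.1], [0.4, 0.3], [0.2, 0.9]])
gamma, xi, loglik = forward_backward(pi, A, B)
```

In the M-step, `gamma` supplies the per-state weights for re-estimating each state's tree, and `xi` supplies the expected transition counts.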

Page 21: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Chain Chow-Liu Forest (CCLF)

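[Figure: chain over $R_1, R_2, \ldots, R_{t-1}, R_t, \ldots, R_T$ with no hidden states; the link from $R_{t-1}$ to $R_t$ is expanded into a tree structure over the components $A_t, B_t, C_t, D_t$]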

Page 22: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Complexity Analysis

Criterion              HMM-CI              HMM-CL                        HMM-CCL
# params               $K^2 + MK(V-1)$     $K^2 + K(M-1)(V^2-1)$         $K^2 + KM(V^2-1)$
Time (per iteration)   $O(NTK(K+M))$       $O(NTK(K+M^2V^2))$            $O(NTK(K+M^2V^2))$
Space                  $O(NTK(K+M))$       $O(NTK(K+M)+KM^2V^2)$         $O(NTK(K+M)+KM^2V^2)$

N – number of sequences
T – length of each sequence
K – number of hidden states
M – dimensionality of each vector
V – number of possible values for each vector component
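
To get a feel for the parameter-count row, the sketch below evaluates the three formulas. The function name is hypothetical, and the example values (K = 5 states, M = 30 stations, V = 2 assuming binary wet/dry observations) are chosen only for illustration.

```python
def n_params(model, K, M, V):
    """Parameter counts from the '# params' row of the complexity table."""
    if model == "HMM-CI":
        return K**2 + M * K * (V - 1)
    if model == "HMM-CL":
        return K**2 + K * (M - 1) * (V**2 - 1)
    if model == "HMM-CCL":
        return K**2 + K * M * (V**2 - 1)
    raise ValueError(f"unknown model: {model}")

# Illustrative setting: K = 5 (hypothetical), M = 30, V = 2 (assumed binary).
counts = {m: n_params(m, K=5, M=30, V=2)
          for m in ("HMM-CI", "HMM-CL", "HMM-CCL")}
print(counts)  # {'HMM-CI': 175, 'HMM-CL': 460, 'HMM-CCL': 475}
```

In this setting the tree-based emissions cost under three times as many parameters as HMM-CI, and moving from trees to conditional forests adds only one extra edge per state.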

Page 23: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Experimental Setup

• Data
  – Australia
    • 15 seasons, 184 days each, 30 stations
  – Western U.S.
    • 39 seasons, 90 days each, 8 stations

• Measuring predictive performance
  – Choose K (number of states)
  – Leave-one-out cross-validation
  – Log-likelihood
  – Error for prediction of a single entry given the rest

Page 24: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Australia (log-likelihood)

Page 25: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Australia (predictive error)

Page 26: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Deeper Look at Weather States

Page 27: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Western U.S. (log-likelihood)

Page 28: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Western U.S. (predictive error)

Page 29: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Summary

• Efficient approximation for finite-valued conditional distributions
  – Conditional Chow-Liu forests

• New models for spatio-temporal finite-valued data
  – HMM with Chow-Liu trees
  – HMM with conditional Chow-Liu forests
  – Chain Chow-Liu forests

• Applied to precipitation modeling

Page 30: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Future Work

• Extension to real-valued data

• Priors on tree structure and parameters [Jaakkola and Meila 00]
  – Locations of the stations

• Interannual variability
  – Atmospheric variables as inputs to non-homogeneous HMM [Robertson et al 04]

• Other approximations for finite-valued multivariate data
  – Maximum Entropy
  – Multivariate probit models (binary)

Page 31: Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series

Acknowledgements

• DOE (DE-FG02-02ER63413)

• NSF (SCI-0225642)

• Dr. Stephen Charles of CSIRO, Australia

• Datalab @ UCI (http://www.datalab.uci.edu)