  • Visualisation of big time series data

    Rob J Hyndman

    with Earo Wang, Nikolay Laptev, Yanfei Kang, Kate Smith-Miles

  • Outline

    1 The problem

    2 Australian tourism demand

    3 M3 competition data

    4 Yahoo web traffic

    5 What next?


  • Spectacle sales

    Monthly sales data from 2000 to 2014

    Provided by a large spectacle manufacturer

    Split by brand (26), gender (3), price range (6), materials (4), and stores (600)

    About a million disaggregated series
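    The "about a million" figure is consistent with multiplying the category counts quoted on the slide. A quick check, assuming every combination of the five attributes forms its own series (the dictionary keys are illustrative names):

```python
from math import prod

# Category counts quoted on the slide.
levels = {"brand": 26, "gender": 3, "price_range": 6, "materials": 4, "stores": 600}

# Fully disaggregated series: one per combination of all five attributes.
n_series = prod(levels.values())
print(n_series)  # 1123200
```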

  • Fulcher collection

    www.comp-engine.org/timeseries

    38,190 time series from many sources

    Over 20,000 real series from meteorology, medicine, audio, astrophysics, finance, etc.

    Over 10,000 simulated series from various chaotic and stochastic models.

  • FRED: research.stlouisfed.org/fred2/

  • Quandl: www.quandl.com

  • How to plot lots of time series?

    [Figure: time series plotted against Time, with both axes running from 0.0 to 1.0]

  • Key idea

    John W Tukey

    Cognostics: computer-produced diagnostics (Tukey and Tukey, 1985).

    Examples for time series:

    lag correlation

    size and direction of trend

    strength of seasonality

    timing of peak seasonality

    spectral entropy

    Called features or characteristics in the machine learning literature.
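    As a minimal sketch of the idea, each series can be reduced to a few numbers that are then plotted in place of the raw data. The two helper functions below are hypothetical names written for illustration (NumPy assumed); they compute a lag-1 autocorrelation and a least-squares trend slope for a toy collection.

```python
import numpy as np

def lag1_autocorrelation(y):
    """Sample autocorrelation at lag 1."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))

def trend_slope(y):
    """Slope of a least-squares straight line fitted against time 0..n-1."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    return float(np.polyfit(t, y, 1)[0])

# Toy collection: one upward trend plus noise, one pure noise series.
rng = np.random.default_rng(0)
collection = {
    "trending": 0.5 * np.arange(100) + rng.normal(size=100),
    "noise": rng.normal(size=100),
}

# One small feature vector per series: (lag-1 autocorrelation, trend slope).
features = {
    name: (lag1_autocorrelation(y), trend_slope(y))
    for name, y in collection.items()
}
```

    With thousands of series, the `features` table rather than the series themselves becomes the object to plot.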

  • Australian tourism demand

    Quarterly data on visitor nights from 1998:Q1 to 2013:Q4

    From: National Visitor Survey, based on annual interviews of 120,000 Australians aged 15+, collected by Tourism Research Australia.

    Split by 7 states, 27 zones and 76 regions (a geographical hierarchy)

    Also split by purpose of travel:

    Holiday

    Visiting friends and relatives (VFR)

    Business

    Other

    304 disaggregated series

  • Domestic tourism demand: Victoria

    [Figure: grid of time series panels for Victoria, one panel per region and purpose of travel, labelled by region code plus purpose (Hol, Vis, Bus, Oth), from BAAHol to BEGOth]

  • An STL decomposition

    Tourism demand for holidays in Peninsula

    $Y_t = S_t + T_t + R_t$, where $S_t$ is periodic with mean 0

    [Figure: STL decomposition plotted against time, with panels for data, seasonal, trend and remainder]
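    The additive identity $Y_t = S_t + T_t + R_t$ can be illustrated with a much simpler method than STL. The sketch below swaps in a classical moving-average decomposition (not the loess-based STL used in the talk), assuming NumPy; the function name and the toy quarterly series are illustrative.

```python
import numpy as np

def additive_decompose(y, period):
    """Split y into seasonal, trend and remainder with Y_t = S_t + T_t + R_t.

    Classical decomposition (centred moving average), not STL itself.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Centred moving average of length `period` (2 x m MA when period is even).
    if period % 2 == 0:
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    half = len(w) // 2
    trend = np.full(n, np.nan)          # undefined at the two ends
    trend[half:n - half] = np.convolve(y, w, mode="valid")
    # Seasonal indices: average detrended value per season, centred to mean 0.
    detrended = y - trend
    idx = np.arange(n) % period
    seasonal_means = np.array([np.nanmean(detrended[idx == k]) for k in range(period)])
    seasonal_means -= seasonal_means.mean()
    seasonal = seasonal_means[idx]
    remainder = y - seasonal - trend
    return seasonal, trend, remainder

# Toy quarterly series: linear trend plus a fixed quarterly pattern.
t = np.arange(40)
pattern = np.array([1.0, -0.5, -1.0, 0.5])
y = 5.0 + 0.1 * t + pattern[t % 4]
S, T, R = additive_decompose(y, period=4)
```

    On this noise-free toy series the seasonal component recovers `pattern` and the remainder is zero wherever the trend is defined.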

  • Seasonal stacked bar chart

    Place positive values above the origin and negative values below the origin

    Map the bar length to the magnitude

    Encode quarters by colours

    [Figure: stacked bar chart of the seasonal component by region (BAA to BEG) for Holiday travel, coloured by quarter (Q1 to Q4)]
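    The stacking rule can be sketched as a small layout computation: positive segments accumulate upward from zero and negative segments accumulate downward. The function name and the seasonal values below are made up for illustration (NumPy assumed).

```python
import numpy as np

def diverging_stack_bottoms(values):
    """Return the bottom coordinate of each bar segment.

    values: array of shape (n_regions, n_quarters) of seasonal components.
    Positive segments are stacked upward from 0, negative ones downward.
    """
    values = np.asarray(values, dtype=float)
    pos = np.where(values > 0, values, 0.0)
    neg = np.where(values < 0, values, 0.0)
    # Cumulative sum of earlier segments of the same sign gives each bottom.
    pos_bottom = np.cumsum(pos, axis=1) - pos
    neg_bottom = np.cumsum(neg, axis=1) - neg
    return np.where(values >= 0, pos_bottom, neg_bottom)

# Made-up seasonal components (rows: two regions; columns: Q1..Q4).
seasonal = np.array([[0.6, -0.2, -0.5, 0.1],
                     [0.3, 0.2, -0.4, -0.1]])
bottoms = diverging_stack_bottoms(seasonal)
```

    Each quarter can then be drawn as one bar layer whose height is the column of `seasonal` and whose base is the matching column of `bottoms`, with one colour per quarter.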

  • Seasonal stacked bar chart: VIC

    [Figure: seasonal stacked bar charts for Victoria, one panel per purpose (Holiday, VFR, Business, Other), showing the seasonal component by region (BAA to BEG), coloured by quarter (Q1 to Q4)]

  • Trend analysis

    Linearity: the long-term direction and strength of trend.

    Curvature: the changing direction of trend.

    Estimate by regression:

    $$T_t = \beta_0 + \beta_1 \phi_1(t) + \beta_2 \phi_2(t) + e_t$$

    where $\phi_k(t)$ is a $k$th-degree orthogonal polynomial in time $t$.

    This separates the linearity ($\beta_1$) and curvature ($\beta_2$).
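    A minimal sketch of this regression, assuming NumPy: the orthogonal polynomials are built here by a QR factorisation of the matrix with columns $1, t, t^2$, which is one common construction and not necessarily the one used for the talk. The function name is illustrative.

```python
import numpy as np

def trend_linearity_curvature(trend):
    """Regress a trend on orthogonal polynomials of degree 0, 1, 2 in time.

    Returns (linearity, curvature): the coefficients on the degree-1 and
    degree-2 orthonormal polynomials.
    """
    trend = np.asarray(trend, dtype=float)
    t = np.arange(len(trend), dtype=float)
    # QR of the Vandermonde matrix [1, t, t^2] orthonormalises its columns.
    V = np.vander(t, 3, increasing=True)
    Q, R = np.linalg.qr(V)
    # Fix signs so each polynomial has a positive leading coefficient.
    Q = Q * np.sign(np.diag(R))
    beta = Q.T @ trend  # least squares is a projection for orthonormal columns
    return float(beta[1]), float(beta[2])

t = np.arange(20)
lin_line, cur_line = trend_linearity_curvature(2.0 + 0.3 * t)    # straight upward line
lin_bowl, cur_bowl = trend_linearity_curvature((t - 9.5) ** 2)   # symmetric U shape
```

    The straight line has positive linearity and essentially zero curvature, while the symmetric U shape has essentially zero linearity and positive curvature.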

  • Trend analysis

    [Figure: trend linearity by region (BAA to BEG), one panel per purpose (Holiday, VFR, Business, Other), with a legend for Direction]

  • Corrgram of remainder

    Compute the correlations among the remainder components

    Render both the sign and magnitude using a colour mapping of two hues

    Order variables according to the first principal component of the correlations.

    [Figures: corrgram of the remainder components for the series labelled BAA to BEG by purpose (Hol, Vis, Bus, Oth), followed by a corrgram restricted to the Hol series]
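    The ordering step for the corrgram can be sketched as follows, assuming NumPy and taking "first principal component" to mean the leading eigenvector of the correlation matrix. The function name and toy data are illustrative.

```python
import numpy as np

def corrgram_order(remainders):
    """Correlation matrix of the columns, reordered by the first principal component.

    remainders: array of shape (n_time, n_series), one remainder series per column.
    Returns (order, reordered_correlations).
    """
    X = np.asarray(remainders, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    # eigh returns eigenvalues in ascending order; the last eigenvector
    # is the first principal component of the correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(corr)
    pc1 = eigvecs[:, -1]
    order = np.argsort(pc1)
    return order, corr[np.ix_(order, order)]

rng = np.random.default_rng(1)
z = rng.normal(size=60)
# Columns 0 and 2 move with z; columns 1 and 3 move against it.
X = np.column_stack([s * z + 0.1 * rng.normal(size=60) for s in (1, -1, 1, -1)])
order, corr_sorted = corrgram_order(X)
```

    After reordering, positively correlated series sit next to each other, so the two-hue colour mapping shows them as contiguous blocks.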

  • Corrgram of remainder: TAS

    [Figure: corrgram of the remainder components for the Tasmanian series, regions FAA, FBA, FBB, FCA and FCB by purpose (Hol, Vis, Bus, Oth)]

  • M3 forecasting competition

    "The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series."

    Makridakis & Hibon, IJF 2000

    3003 series

    All data from business, demography, finance and economics.

    Series length between 14 and 126.

    Either non-seasonal, monthly or quarterly.

    All time series positive.

  • Candidate features

    STL decomposition: $Y_t = S_t + T_t + R_t$

    Seasonal period

    Strength of seasonality: $1 - \dfrac{\mathrm{Var}(R_t)}{\mathrm{Var}(Y_t - T_t)}$

    Strength of trend: $1 - \dfrac{\mathrm{Var}(R_t)}{\mathrm{Var}(Y_t - S_t)}$

    Spectral entropy: $H = -\int_{-\pi}^{\pi} f_y(\lambda) \log f_y(\lambda)\, d\lambda$, where $f_y(\lambda)$ is the spectral density of $Y_t$. Low values of $H$ suggest a time series that is easier to forecast (more signal).

    Autocorrelations: $r_1, r_2, r_3, \dots$

    Optimal Box-Cox transformation parameter $\lambda$
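    A rough sketch of three of these features, assuming NumPy. The strength measures follow the variance ratios on the slide. The spectral entropy is a discrete approximation that treats the raw periodogram as a probability mass and rescales the result to $[0, 1]$; both choices are assumptions made for this example. The components here are known by construction, not estimated by STL, and all names are illustrative.

```python
import numpy as np

def seasonal_strength(Y, T, R):
    """1 - Var(R_t) / Var(Y_t - T_t)."""
    return float(1.0 - np.var(R) / np.var(Y - T))

def trend_strength(Y, S, R):
    """1 - Var(R_t) / Var(Y_t - S_t)."""
    return float(1.0 - np.var(R) / np.var(Y - S))

def spectral_entropy(y):
    """Shannon entropy of the normalised periodogram, scaled to [0, 1]."""
    y = np.asarray(y, dtype=float)
    power = np.abs(np.fft.rfft(y - y.mean())) ** 2
    power = power[1:]              # drop the zero frequency
    p = power / power.sum()        # treat the periodogram as a probability mass
    p = p[p > 0]                   # 0 * log(0) is taken as 0
    return float(-(p * np.log(p)).sum() / np.log(len(power)))

rng = np.random.default_rng(0)
t = np.arange(120)
S = np.sin(2 * np.pi * t / 12)          # seasonal component, period 12
T = 0.05 * t                            # trend component
R = 0.1 * rng.normal(size=t.size)       # remainder
Y = S + T + R

season_str = seasonal_strength(Y, T, R)
trend_str = trend_strength(Y, S, R)
entropy_sine = spectral_entropy(S)                          # concentrated spectrum
entropy_noise = spectral_entropy(rng.normal(size=t.size))   # flat spectrum
```

    The pure sine wave has entropy near 0 and the white noise series has entropy near 1, matching the reading that low values indicate more signal.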

  • Candidate features

    [Figures: three example M3 series for each feature in turn: Seasonality, Trend, ACF1, Spectral entropy and Box Cox]

    [Figure: scatterplot matrix of the features SpecEntr, Trend, Season, Freq, ACF and Lambda]

  • Dimension reduction for time series

    [Figure: pipeline from the time series, through feature calculation (scatterplot matrix of SpecEntr, Trend, Season, Freq, ACF and Lambda), to a principal component decomposition (scatterplot of PC2 against PC1)]
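    The second step of the pipeline can be sketched with a singular value decomposition, assuming NumPy. Standardising the features before the decomposition is an assumption of this example, and the function name and random feature table are illustrative.

```python
import numpy as np

def first_two_pcs(features):
    """Project standardised features onto their first two principal components.

    features: array of shape (n_series, n_features).
    Returns (scores, explained), where scores has shape (n_series, 2) and
    explained is the share of total variance captured by each of the two PCs.
    """
    X = np.asarray(features, dtype=float)
    # Standardise: the features are measured on very different scales.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:2].T
    explained = s[:2] ** 2 / np.sum(s ** 2)
    return scores, explained

# Toy feature table: 200 series, 6 correlated features.
rng = np.random.default_rng(2)
base = rng.normal(size=(200, 2))
table = base @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(200, 6))
scores, explained = first_two_pcs(table)
```

    Plotting `scores[:, 1]` against `scores[:, 0]` gives one point per series, and `explained.sum()` is the share of variation that the two-dimensional picture retains.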

  • Feature space of M3 data

    First two PCs explain 68% of variation.

    [Figures: scatterplot of PC2 against PC1 for the M3 series, repeated with points coloured by each feature in turn: Freq, Season, Trend, ACF, SpecEntr and Lambda]

  • Predictability

    Three general forecasting methods:

    Theta method: best overall in the 2000 M3 competition

    ETS: exponential smoothing state space models

    STL-AR: AR model applied to the seasonally adjusted series from STL, with the seasonal component forecast using the seasonal naive method.

    Compute minimum MASE from all three methods
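    A minimal sketch of the final step, assuming NumPy and the usual definition of MASE (out-of-sample mean absolute error scaled by the in-sample mean absolute error of a naive forecast). The forecasts below are placeholder numbers standing in for the output of the three methods, not real results.

```python
import numpy as np

def mase(train, test, forecast, m=1):
    """Mean absolute scaled error.

    Out-of-sample absolute errors are scaled by the in-sample mean absolute
    error of the naive forecast at lag m (m=1 for non-seasonal data).
    """
    train, test, forecast = (np.asarray(a, dtype=float) for a in (train, test, forecast))
    scale = np.mean(np.abs(train[m:] - train[:-m]))
    return float(np.mean(np.abs(test - forecast)) / scale)

train = [10, 12, 11, 13, 12, 14]
test = [13, 15]
# Placeholder forecasts standing in for the three methods' output.
forecasts = {"Theta": [13.5, 14.0], "ETS": [14.0, 14.0], "STL-AR": [12.0, 13.0]}

scores = {name: mase(train, test, f) for name, f in forecasts.items()}
best = min(scores, key=scores.get)   # method with the smallest MASE
min_mase = scores[best]              # the per-series predictability measure
```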

  • Predictability

    Visualisation of big time series data M3 competition data 30

    Theta

    1975 1980 1985 1990

    2000

    4000

    6000

    8000

    1000

    0

  • Predictability

    Visualisation of big time series data M3 competition data 30

    ETS

    1975 1980 1985 1990

    2000

    4000

    6000

    8000

    1000

    0

  • Predictability

    Visualisation of big time series data M3 competition data 30

    AR

    1975 1980 1985 1990

    2000

    4000

    6000

    8000

    1000

    0

  • Predictability

    Visualisation of big time series data M3 competition data 31

    Theta

    1980 1982 1984 1986 1988 1990 1992

    3000

    4000

    5000

    6000

  • Predictability

    Visualisation of big time series data M3 competition data 31

    ETS

    1980 1982 1984 1986 1988 1990 1992

    3000

    4000

    5000

    6000

  • Predictability

    Visualisation of big time series data M3 competition data 31

    STLAR

    1980 1982 1984 1986 1988 1990 1992

    3000

    4000

    5000

    6000

  • Predictability

    [Figure: an example series plotted over 1984 to 1994 (vertical axis 6000 to 8000), shown in successive panels labelled Theta, ETS and STLAR.]

  • Predictability

    [Figure: series plotted in the space of the first two principal components (PC1 against PC2) in three panels labelled Low, Middle and High; successive builds of the slide are captioned "Low MASE values", "Medium MASE values" and "High MASE values".]

  • Predictability

    [Figure: PC1 against PC2 for yearly, quarterly and monthly data, with points labelled by best method (legend "Best": Ets, NoDiff, Stlmar, Theta). For each frequency, the panel headed "Actual" is shown beside the panel headed "SVM prediction".]

  • Generating new time series

    We can use the feature space to:

    Generate new time series with similar features to existing series

    Generate new time series where there are holes in the feature space.

    Let $\{PC_1, PC_2, \dots, PC_n\}$ be a population of time series of specified length and period.

    Genetic algorithm uses a process of selection, crossover and mutation to evolve the population towards a target point $T_i$.

    Optimize: $\text{Fitness}(PC_j) = -\sqrt{(|PC_j - T_i|^2)}$.

    Initial population random with some series in neighbourhood of $T_i$.
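The selection, crossover and mutation loop can be sketched as below. This is a toy illustration under several assumptions, not the authors' implementation: the two features (lag-1 autocorrelation and correlation with the time index) are a hypothetical stand-in for the PCA projection, fitness is taken as the negative Euclidean distance to the target so that larger is better, and the initial population is purely random rather than seeded near the target.

```python
import math
import random


def acf1(y):
    """Lag-1 autocorrelation."""
    m = sum(y) / len(y)
    den = sum((v - m) ** 2 for v in y)
    if den == 0:
        return 0.0
    return sum((y[t] - m) * (y[t - 1] - m) for t in range(1, len(y))) / den


def trend_corr(y):
    """Correlation between the series and the time index."""
    n = len(y)
    t_mean, y_mean = (n - 1) / 2, sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    syy = sum((v - y_mean) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0


def features(y):
    """Hypothetical 2-d feature map standing in for (PC1, PC2)."""
    return (acf1(y), trend_corr(y))


def fitness(y, target):
    """Negative distance from the series' feature point to the target."""
    return -math.dist(features(y), target)


def evolve(target, length=30, pop_size=40, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.gauss(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda y: fitness(y, target), reverse=True)
        history.append(fitness(pop[0], target))
        parents = pop[: pop_size // 2]            # selection: keep fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)        # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: perturb each point with probability 0.1.
            child = [v + rng.gauss(0, 0.3) if rng.random() < 0.1 else v
                     for v in child]
            children.append(child)
        pop = parents + children
    pop.sort(key=lambda y: fitness(y, target), reverse=True)
    history.append(fitness(pop[0], target))
    return pop[0], history


best, history = evolve(target=(0.5, 0.8))
print(features(best), history[0], history[-1])
```

Because the fitter half of the population is carried over unchanged, the best fitness never decreases from one generation to the next.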


  • Evolving new time series

    [Figure: points A, B and C marked in the feature space (PC1 against PC2).]

    [Figure: time plots in pairs: Target A and Evolved A, Target B and Evolved B, Target C and Evolved C.]

  • Evolving new time series

    [Figure: points D, E and F marked in the feature space (PC1 against PC2).]

    [Figure: time plots of the series Evolved D, Evolved E and Evolved F.]

  • Evolving new time series

    [Figure: four panels in the feature space (PC1 against PC2): Targets, Evolved yearly data, Evolved quarterly data, Evolved monthly data.]

  • Questions raised

    Can SVM be used to create a forecast selection routine to give better forecasts?

    How much do M3 conclusions depend on the particular set of time series involved?

    Has the M3 data set biased forecast method development?

    What other features should we consider? What difference does it make?

    Is PCA the right approach? Perhaps we should use multidimensional scaling? Or something else?

    Should we use more than 2 PC dimensions?


  • Outline

    1 The problem

    2 Australian tourism demand

    3 M3 competition data

    4 Yahoo web traffic

    5 What next?


  • Yahoo web-traffic

    Tens of thousands of time series collected at one-hour intervals over one month.

    Consisting of several server metrics (e.g. CPU usage and paging views) from many server farms globally.

    Aim: find unusual (anomalous) time series.

  • Yahoo web-traffic

    [Figure: example series plotted as value against date (2014-11-09 to 2014-12-12) in three groups of five panels: busy233, busy271, busy50, busy200, busy369; memory460, memory429, memory147, memory413, memory484; paging53, paging467, paging371, paging337, paging367.]

  • Feature space

    ACF1: first-order autocorrelation = $\mathrm{Corr}(Y_t, Y_{t-1})$

    Strength of trend and seasonality based on STL

    Trend linearity and curvature

    Size of seasonal peak and trough

    Spectral entropy

    Lumpiness: variance of block variances (block size 24).

    Spikiness: variance of leave-one-out variances of STL remainders.

    Level shift: maximum difference in trimmed means of consecutive moving windows of size 24.

    Variance change: maximum difference in variances of consecutive moving windows of size 24.

    Flat spots: discretize sample space into 10 equal-sized intervals. Find max run length in any interval.

    Number of crossing points of mean line.

    Kullback-Leibler score: maximum of $D_{KL}(P \| Q) = \int P(x) \ln \frac{P(x)}{Q(x)} \, dx$, where $P$ and $Q$ are estimated by kernel density estimators applied to consecutive windows of size 48.

    Change index: time of maximum KL score
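Several of these features need nothing beyond basic arithmetic, and can be sketched as below. This is a rough illustration with hypothetical function names, not the authors' code. It assumes a 10% trimmed mean, sample variances, equal-width intervals for the flat spots, and "consecutive moving windows" read as rolling windows compared at a lag of one window width. The STL-based features, spectral entropy and the KL score are left out.

```python
from statistics import fmean, variance


def acf1(y):
    """First-order autocorrelation, Corr(Y_t, Y_{t-1})."""
    m = fmean(y)
    den = sum((v - m) ** 2 for v in y)
    return sum((y[t] - m) * (y[t - 1] - m) for t in range(1, len(y))) / den


def lumpiness(y, width=24):
    """Variance of the variances of non-overlapping blocks of `width` points."""
    blocks = [y[i:i + width] for i in range(0, len(y) - width + 1, width)]
    return variance([variance(b) for b in blocks])


def trimmed_mean(x, trim=0.1):
    """Mean after dropping a fraction `trim` of points from each tail."""
    x = sorted(x)
    k = int(len(x) * trim)
    return fmean(x[k:len(x) - k])


def max_shift(y, stat, width=24):
    """Largest absolute change in `stat` between adjacent rolling windows."""
    rolled = [stat(y[i:i + width]) for i in range(len(y) - width + 1)]
    return max(abs(rolled[i + width] - rolled[i])
               for i in range(len(rolled) - width))


def level_shift(y, width=24):
    return max_shift(y, trimmed_mean, width)


def variance_change(y, width=24):
    return max_shift(y, variance, width)


def flat_spots(y, bins=10):
    """Longest run of consecutive points falling in the same interval."""
    lo, hi = min(y), max(y)
    if hi == lo:
        return len(y)
    codes = [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in y]
    best = run = 1
    for a, b in zip(codes, codes[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best


def crossing_points(y):
    """Number of times the series crosses its mean line."""
    m = fmean(y)
    above = [v > m for v in y]
    return sum(a != b for a, b in zip(above, above[1:]))


# Toy hourly series: four days with a step change after two days.
y = [0.0] * 48 + [10.0] * 48
summary = {
    "acf1": acf1(y),
    "lumpiness": lumpiness(y),
    "level_shift": level_shift(y),
    "variance_change": variance_change(y),
    "flat_spots": flat_spots(y),
    "crossing_points": crossing_points(y),
}
print(summary)
```

On the step series the level shift picks up the full jump of 10, while lumpiness is zero because every 24-hour block is constant.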


  • Principal component analysis

    [Figure: biplot of the features (ACF1, lumpiness, entropy, lshift, vchange, cpoints, fspots, trend, linearity, curvature, spikiness, season, peak, trough, klscore, change.idx) on standardized PC1 (28.7% explained var.) against standardized PC2 (17.3% explained var.).]

  • What is anomalous

    [Figure: the PCA biplot of the features, standardized PC1 (28.7% explained var.) against standardized PC2 (17.3% explained var.).]

    We need a measure of the anomalousness of a time series.

    1 Rank points based on their local density.

    2 Rank points based on whether they are within α-convex hulls of different radius.


  • Bivariate kernel density

    $$\hat{f}(x; H) = \frac{1}{n} \sum_{i=1}^{n} K_H(x - X_i)$$

    $X_i$: a bivariate random sample $\{X_1, X_2, \dots, X_n\}$

    $K_H(x)$ is the standard normal kernel function

    $H$ estimated by minimizing the sum of AMISE

    Rank points based on $\hat{f}$ values in 2d PCA space.
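The density ranking can be sketched in a few lines. This is an illustration under stated assumptions, not the method used for the slides: the bandwidth matrix is taken to be diagonal and set by a normal-reference rule of thumb rather than by minimizing AMISE, and the function names and the toy points are made up.

```python
import math


def bandwidths(points):
    """Assumed rule of thumb: h_j = sd_j * n^(-1/6) for each of the two axes."""
    n = len(points)
    result = []
    for axis in range(2):
        values = [p[axis] for p in points]
        mean = sum(values) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
        result.append(sd * n ** (-1.0 / 6.0) or 1.0)
    return result


def density_at(x, points, h):
    """Bivariate Gaussian kernel density estimate with H = diag(h1^2, h2^2)."""
    h1, h2 = h
    norm = 1.0 / (len(points) * 2.0 * math.pi * h1 * h2)
    return norm * sum(
        math.exp(-0.5 * (((x[0] - p[0]) / h1) ** 2 + ((x[1] - p[1]) / h2) ** 2))
        for p in points
    )


def rank_by_density(points):
    """Indices of the points, ordered from lowest to highest estimated density."""
    h = bandwidths(points)
    density = [density_at(p, points, h) for p in points]
    return sorted(range(len(points)), key=lambda i: density[i])


# Toy data: a 5 x 5 grid of points plus one point far from the rest (index 25).
points = [(0.25 * i, 0.25 * j) for i in range(5) for j in range(5)] + [(8.0, 8.0)]
order = rank_by_density(points)
print(order[:5])
```

Points at the front of `order` sit in the sparsest regions of the plane, so they are the candidates for the most anomalous series.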

  • Bivariate density ranking

    [Figure: scatterplot of the points in 2d PCA space (x-axis: pc1, y-axis: pc2), with the five top-ranked points labelled 1 to 5.]

  • Bivariate density ranking

    [Figure: time series plots of the five top-ranked series (S7793, S8494, S10464, S7833, S1715). x-axis: date (2015-02-28 to 2015-04-01); y-axis: value.]

  • α-convex hulls

    The space generated by point pairs that can be touched by an empty disc of radius α.

    α = ∞ gives a convex hull.

    Points can become isolated when α is small.

    We rank points based on the value of α when they become isolated.
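One simplified way to compute such a ranking is sketched below. It assumes the α-complex reading of the construction, which the slides do not spell out: under that reading a point is joined to its nearest neighbour exactly when α is at least half the distance between them, so a point becomes isolated once α drops below half its nearest-neighbour distance. The function names are hypothetical, and this is not the implementation in the anomalous package.

```python
import math


def isolation_alpha(points):
    """Critical radius per point: half the distance to its nearest neighbour."""
    alphas = []
    for i, p in enumerate(points):
        nearest = min(
            math.hypot(p[0] - q[0], p[1] - q[1])
            for j, q in enumerate(points)
            if j != i
        )
        alphas.append(nearest / 2.0)
    return alphas


def rank_by_isolation(points):
    """Indices of the points, ordered from the first to become isolated to the last."""
    alphas = isolation_alpha(points)
    return sorted(range(len(points)), key=lambda i: alphas[i], reverse=True)


# Toy data: a 5 x 5 grid of points plus one point far from the rest (index 25).
points = [(0.25 * i, 0.25 * j) for i in range(5) for j in range(5)] + [(8.0, 8.0)]
order = rank_by_isolation(points)
print(order[0], isolation_alpha(points)[order[0]])
```

A limitation of this shortcut is that two anomalous points lying close to each other keep each other connected, so neither is ranked highly.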

  • α-convex hull

  • α-convex hull ranking

    [Figure: scatterplot of the points, with the five top-ranked points labelled 1 to 5.]

  • α-convex hull ranking

    [Figure: time series plots of the five top-ranked series (S10464, S7793, S8494, S1715, S7826). x-axis: date (2015-02-28 to 2015-04-01); y-axis: value.]

  • HDR versus α-convex hull

    [Figure: two scatterplot panels, "HDR boxplot" (x-axis: pc1, y-axis: pc2) and "α-convex hull", each with its five top-ranked points labelled 1 to 5.]

  • Top 5 anomalous time series

    [Figure: two panels of time series plots. x-axis: date (2015-02-28 to 2015-04-01); y-axis: value.]

    HDR: S7793, S8494, S10464, S7833, S1715

    α-convex hull: S10464, S7793, S8494, S1715, S7826

  • Outline

    1 The problem

    2 Australian tourism demand

    3 M3 competition data

    4 Yahoo web traffic

    5 What next?

  • What next?

    Develop a more comprehensive set of features that are reliable measures and fast to compute, e.g., for finance data.

    Consider other dimension reduction methods and more than 2 dimensions.

    Develop dynamic and interactive visualization tools.

    Make methods available in an R package.

    Some of the methods are already available in the anomalous package for R on GitHub.

    Papers: robjhyndman.com

    Code: github.com/robjhyndman

    Email: [email protected]
