The Keogh Lab: Data Mining and Structure Retrieval. Presented by Abdullah Mueen


Page 1: The Keogh  Lab


The Keogh Lab

Data Mining and Structure Retrieval

Presented by Abdullah Mueen

Page 2: The Keogh  Lab

Overview of our work
• Our Goal: Extract information from raw, noisy, massive, unstructured data.
• We develop algorithms for
  – Classification
  – Clustering
  – Rule finding
  – Motif discovery
  – Discord discovery
  – Shapelet discovery
  – Linkage discovery
• We work closely with the domain experts.
  – For collecting new data.
  – To verify our results.

Page 3: The Keogh  Lab

Case 1: Motif Discovery
Beet Leafhopper (Circulifer tenellus)

[Figure labels: plant membrane, stylet, voltage source, input resistor, V, conductive glue, to insect, voltage reading, to soil near plant. Plot axis ticks: 0, 50, 100, 150, 200 and 0, 10, 20.]

Exact Discovery of Time Series Motifs. Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover. SDM 2009.

MK motif discovery
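
To make the motif discovery task concrete, here is a minimal brute-force sketch in Python. It is not the MK algorithm from the cited paper, only a quadratic baseline that returns the same kind of answer: the closest pair of non-overlapping subsequences. The function names, the z-normalization step, and the toy series are assumptions made for illustration.

```python
import math

def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in seq]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences, or (inf, -1, -1) if no such pair exists."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(subs)):
        # Starting j at i + m skips overlapping (trivial) matches.
        for j in range(i + m, len(subs)):
            d = euclidean(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy series (made up): the pattern 1, 3, 2, 5 occurs at positions 2 and 10.
example = [4, 7, 1, 3, 2, 5, 9, 0, 6, 8, 1, 3, 2, 5, 7, 4]
print(brute_force_motif(example, 4))  # (0.0, 2, 10)
```

The double loop compares every pair of subsequences, so the cost grows quadratically with the series length; an exact algorithm is only practical on large data if it can prune most of these comparisons.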

Page 4: The Keogh  Lab

Case 2: Shapelet Discovery

[Figure labels: false nettles, stinging nettles, Shapelet.]

Time Series Shapelets: A New Primitive for Data Mining. Lexiang Ye and Eamonn Keogh. SIGKDD 2009.
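
The shapelet idea can be sketched as follows: a candidate subsequence is scored by how well its distance to each series separates the classes. The Python below is an illustrative sketch under assumptions, not the authors' implementation: it uses plain Euclidean distance without normalization, scores a single given candidate rather than searching all candidates, and the names `subsequence_distance` and `best_split` and the toy data are hypothetical.

```python
import math
from collections import Counter

def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + k] - shapelet[k]) ** 2 for k in range(m)))
        for i in range(len(series) - m + 1)
    )

def entropy(labels):
    """Shannon entropy (bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(dataset, shapelet):
    """dataset: list of (series, label). Return (information_gain, threshold)
    for the best distance threshold, or (0.0, None) if nothing separates."""
    scored = sorted((subsequence_distance(s, shapelet), y) for s, y in dataset)
    labels = [y for _, y in scored]
    base = entropy(labels)
    best = (0.0, None)
    for k in range(1, len(scored)):
        if scored[k - 1][0] == scored[k][0]:
            continue  # no threshold can separate equal distances
        left, right = labels[:k], labels[k:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best[0]:
            best = (gain, (scored[k - 1][0] + scored[k][0]) / 2)
    return best

# Toy data (made up): class "A" contains a spike, class "B" does not.
toy = [([0, 0, 5, 0, 0], "A"), ([0, 5, 0, 0, 0], "A"),
       ([0, 0, 0, 0, 0], "B"), ([1, 1, 1, 1, 1], "B")]
print(best_split(toy, [0, 5, 0]))
```

A full shapelet search would repeat this scoring for every candidate subsequence of every training series and keep the candidate with the highest gain.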

Page 5: The Keogh  Lab


Case 3: Linkage Discovery

CK-1 Distance Measure

[Figure: two example pairs with CK-1 distances 0.6291 and 0.9033; a Single Linkage Dendrogram with a CK-1 Distance axis (ticks 0.6 to 0.9); labels: Print House 1, Print House 2.]

A Compression Based Distance Measure for Texture. Bilson Campana and Eamonn Keogh. SDM 2010.

[Figure labels: a hand-press book, character matrix, text, ornaments.]
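
The pipeline on this slide, a compression-based distance followed by single linkage clustering, can be sketched in Python. Two assumptions are worth stating loudly. First, the ratio below follows my understanding of the CK-1 definition in the cited paper, (C(x|y) + C(y|x)) / (C(x|x) + C(y|y)) - 1, where that paper measures C with a video codec on image pairs; here `zlib` on byte strings is swapped in as a stand-in, so the numbers are not comparable to those on the slide. Second, `ck1_like` and `single_linkage` are hypothetical names, and the clustering is a naive version written for clarity.

```python
import zlib

def c(first: bytes, second: bytes) -> int:
    """Compressed size of `first` followed by `second`
    (zlib is an assumed stand-in for the codec used in the paper)."""
    return len(zlib.compress(first + second, 9))

def ck1_like(x: bytes, y: bytes) -> float:
    """(C(x|y) + C(y|x)) / (C(x|x) + C(y|y)) - 1; equals 0.0 when x == y."""
    return (c(x, y) + c(y, x)) / (c(x, x) + c(y, y)) - 1

def single_linkage(items, dist):
    """Naive agglomerative clustering.
    Returns the merges as (members_a, members_b, height), in merge order."""
    n = len(items)
    d = {(i, j): dist(items[i], items[j]) for i in range(n) for j in range(i + 1, n)}
    clusters = [[i] for i in range(n)]
    merges = []
    while len(clusters) > 1:
        # Single linkage: the distance between two clusters is the
        # smallest distance between any pair of their members.
        h, a, b = min(
            (min(d[min(i, j), max(i, j)] for i in ca for j in cb), ia, ib)
            for ia, ca in enumerate(clusters)
            for ib, cb in enumerate(clusters)
            if ia < ib
        )
        merges.append((sorted(clusters[a]), sorted(clusters[b]), h))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Toy byte strings (made up) standing in for scanned pages.
pages = [b"abab" * 100, b"abba" * 100, bytes(range(256)) * 2, bytes(range(255, -1, -1)) * 2]
for left, right, height in single_linkage(pages, ck1_like):
    print(left, right, round(height, 4))
```

The list of merges with their heights is the information a dendrogram draws; a plotting library would be needed to render it.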

Page 6: The Keogh  Lab

Lab Members

Dr. Eamonn Keogh
Dr. Gustavo Batista
Abdullah Mueen
Qiang Zhu
Bilson Campana
Thanawin Art R.
Bing Hu
Yuan Hao
Jesin Zakaria

Page 7: The Keogh  Lab

Motif in Online Data
• Maintain the motif in streaming data without introducing latency.
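
To show what maintaining a motif over a stream involves, here is a naive Python sketch. It is an assumption-laden baseline, not the lab's method: it keeps a fixed-size sliding window, uses plain Euclidean distance without normalization, and falls back to a full rescan whenever a motif member leaves the window, which is exactly the kind of latency the slide aims to avoid. The class name `StreamingMotif` and its parameters are hypothetical.

```python
import math

class StreamingMotif:
    """Tracks the closest pair of non-overlapping length-m subsequences
    among the most recent `window` points of a stream."""

    def __init__(self, m, window):
        self.m, self.window = m, window
        self.buf = []     # retained points, oldest first
        self.seen = 0     # total number of points pushed so far
        self.best = None  # (distance, start_i, start_j), absolute stream positions

    def _dist(self, i, j, oldest):
        a, b = i - oldest, j - oldest  # convert stream positions to buffer offsets
        return math.sqrt(sum((self.buf[a + k] - self.buf[b + k]) ** 2
                             for k in range(self.m)))

    def push(self, value):
        self.buf.append(value)
        self.seen += 1
        if len(self.buf) > self.window:
            self.buf.pop(0)
        oldest = self.seen - len(self.buf)  # stream position of buf[0]
        newest = self.seen - self.m         # start of the subsequence just completed
        if self.best is not None and self.best[1] < oldest:
            # A motif member left the window: rescan every pair (slow path).
            self.best = None
            starts = range(oldest, newest + 1)
        else:
            # Otherwise only pairs involving the newest subsequence are new.
            starts = [newest]
        for j in starts:
            for i in range(oldest, j - self.m + 1):
                d = self._dist(i, j, oldest)
                if self.best is None or d < self.best[0]:
                    self.best = (d, i, j)
        return self.best

# Toy stream (made up): the pattern 1, 3, 2, 5 occurs at positions 2 and 10.
tracker = StreamingMotif(m=4, window=16)
for v in [4, 7, 1, 3, 2, 5, 9, 0, 6, 8, 1, 3, 2, 5, 7, 4]:
    current = tracker.push(v)
print(current)  # (0.0, 2, 10)
```

In the common case each arriving point costs one pass over the window; the rescan branch is the expensive step that a latency-free method would need to replace.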

Page 8: The Keogh  Lab

Motion Motif
• Find repeated motion in motion capture data, which is a 32-dimensional time series.