
Kohonen Mapping and Text Semantics

Xia Lin

College of Information Science and Technology

Drexel University

Ten years ago

Lin, X., Soergel, D., & Marchionini, G. A self-organizing semantic map for information retrieval. SIGIR’91, pp. 262-269.

Applied Kohonen’s mapping to a small document set:
– 140 documents
– 25 indexing words

A Semantic Map for 140 AI Documents

[Figure: two-dimensional semantic map of the 140 AI documents. Area labels: citation, database, retrieval, intelligent, library, search, online, application, machine, learning, knowledge, expert, systems, natural, process, language, research, network, others. The numeric cell values are not recoverable from the extracted text.]

Features of the Semantic Map

Reveal frequencies and distribution of underlying data.
Preserve metric relationships “as faithfully as possible” while mapping from high-dimensional data to a two-dimensional display.
Display co-occurrence structures through its neighborhood structures.

Why apply the Self-Organizing Map (SOM) to textual information?

The power of abstraction
The feature of self-organization
The format of output: rich information for display

Information Abstraction

SOM utilizes the statistical information of text in a unique way
– Both individual data and their inter-relationships are represented.
– Learning takes place gradually
  To tolerate uncertainty/fuzziness in the input data
– It represents a large amount of data economically
  Similar to the way the brain processes/stores information?
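
The gradual learning mentioned above is commonly written as Kohonen's incremental update rule. The notation below follows the standard SOM literature rather than these slides: x(t) is the input at step t, w_i is the weight vector of node i, c is the winning (best-matching) node, α(t) is a learning rate that decreases over time, and h_{c,i}(t) is a neighborhood function that shrinks over time.

```latex
c = \arg\min_i \lVert x(t) - w_i(t) \rVert, \qquad
w_i(t+1) = w_i(t) + \alpha(t)\, h_{c,i}(t)\, \bigl[ x(t) - w_i(t) \bigr]
```

Because each step moves the weights only a fraction of the way toward the input, a single noisy or ambiguous input cannot dominate the map, which is one way to read the tolerance of uncertainty noted above.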

Information Organization

SOM uses the input data to turn a random network into an organized network.
– Each piece of information will find its own identity (the best place) on the map.
– All the related information should be organized together.
– A compromise between, or enforcement of, both “individual responsibilities” and “social responsibilities.”
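
The process above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation behind the slides: the function names `train_som` and `best_node`, the linear decay schedules, the Gaussian neighborhood, and the four toy document vectors are all assumptions chosen for brevity.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    dim = data.shape[1]
    weights = rng.random((rows, cols, dim))  # the initial "random network"
    # Grid coordinates of every node, used to measure distance on the map.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            # "Individual responsibility": each input finds its best place.
            dists = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # "Social responsibility": neighbors of the winner also move.
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def best_node(weights, x):
    """Return the (row, col) of the node whose weights are closest to x."""
    dists = np.linalg.norm(weights - np.asarray(x, dtype=float), axis=2)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dists), dists.shape))

# Toy data (made up): two pairs of documents over four index terms.
docs = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
som = train_som(docs, rows=3, cols=3)
print([best_node(som, d) for d in docs])
```

Documents with the same terms land on the same node, while the two groups settle in different parts of the grid; which nodes they occupy depends on the random seed.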

Information Visualization

SOM’s output is an associative network that can be used to implement various interactive functions of the interface
– A good overview of underlying data
– A variety of topologic structures
  Sizes of groups, distances, weights of vectors, patterns of inputs, etc.
– A space of both documents and terms
  Effective use of all the space of the two-dimensional area.
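
One way to obtain a shared space of documents and terms, assumed here for illustration, is to place each document at its best-matching node and each term at the best-matching node of a unit vector containing only that term. The sketch below also counts documents per node, which gives the group sizes mentioned above. The helper names, the hand-written 1x2 weight array (a stand-in for a trained map), and the three documents are hypothetical.

```python
import numpy as np
from collections import Counter

def best_node(weights, x):
    """Return the (row, col) of the node whose weights are closest to x."""
    d = np.linalg.norm(weights - np.asarray(x, dtype=float), axis=2)
    return tuple(int(i) for i in np.unravel_index(np.argmin(d), d.shape))

def place_documents_and_terms(weights, doc_vectors, terms):
    """Map documents and terms onto the same grid and count documents per node."""
    doc_nodes = [best_node(weights, v) for v in doc_vectors]
    unit = np.eye(len(terms))  # one unit vector per term
    term_nodes = {t: best_node(weights, unit[i]) for i, t in enumerate(terms)}
    hits = Counter(doc_nodes)  # group sizes: number of documents per node
    return doc_nodes, term_nodes, hits

# Hypothetical stand-in for a trained map: 1 row, 2 columns, 3 terms.
terms = ["retrieval", "database", "network"]
weights = np.array([[[0.9, 0.8, 0.0], [0.1, 0.0, 0.9]]])
docs = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1]])

doc_nodes, term_nodes, hits = place_documents_and_terms(weights, docs, terms)
print(doc_nodes)   # node of each document
print(term_nodes)  # node that labels each term
print(dict(hits))  # documents per node
```

In an interface, `term_nodes` could supply the area labels and `hits` the number shown in each cell.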

How much semantics is represented in Kohonen’s map?

It’s an open question.

Understanding can be gained through comparisons and applications.

[Figure: the same 16 animals (dove, hen, duck, goose, owl, hawk, eagle, fox, dog, wolf, cat, tiger, lion, horse, zebra, cow) arranged by three methods: (a) Hierarchical cluster, (b) Principal component analysis, (c) Kohonen's feature map. The positions of the items are not recoverable from the extracted text.]

[Figure: terms (human, interface, computer, user, system, response, time, EPS, survey, tree, graph, minors) and documents (C1 to C5, M1 to M4) shown in two panels: (a) Display of the Latent semantic indexing result, (d) Document and term map by the feature map. The positions of the items are not recoverable from the extracted text.]
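
For comparison with the feature map, a latent semantic indexing display is typically obtained from a truncated singular value decomposition of the term-document matrix. The sketch below is an illustration only: the function name `lsi_coordinates`, the scaling of both coordinate sets by the singular values, and the small count matrix are assumptions, and the matrix is made up rather than taken from the figure.

```python
import numpy as np

def lsi_coordinates(term_doc, k=2):
    """Return (term_coords, doc_coords) in the top-k latent dimensions."""
    u, s, vt = np.linalg.svd(np.asarray(term_doc, dtype=float), full_matrices=False)
    term_coords = u[:, :k] * s[:k]    # one row per term
    doc_coords = vt[:k, :].T * s[:k]  # one row per document
    return term_coords, doc_coords

# Made-up counts: rows are terms, columns are five hypothetical documents.
terms = ["human", "interface", "graph", "tree"]
term_doc = np.array([
    [1, 1, 1, 0, 0],  # human
    [1, 1, 1, 0, 0],  # interface
    [0, 0, 0, 1, 1],  # graph
    [0, 0, 0, 1, 1],  # tree
])

term_coords, doc_coords = lsi_coordinates(term_doc, k=2)
for name, xy in zip(terms, term_coords):
    print(name, np.round(xy, 3))
```

Terms that occur in the same documents receive the same two-dimensional coordinates, while terms from unrelated documents fall along orthogonal directions. Unlike the feature map, the result is a continuous scatter plot rather than a grid of nodes.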

Visual SiteMap
