Kohonen Mapping and Text Semantics
Xia Lin
College of Information Science and Technology
Drexel University
Ten years ago
Lin, X., Soergel, D., & Marchionini, G. A self-organizing semantic map for information retrieval. SIGIR’89, pp. 262-269.
Applied Kohonen’s mapping to a small document set:
– 140 documents
– 25 indexing words
A Semantic Map for 140 AI Documents
[Figure: semantic map of the 140 AI documents. Labels appearing on the map: citation, database, retrieval, intelligent, library, search, online, application, machine, learning, knowledge, expert, systems, natural, process, language, research, network, others. The numeric values printed in the map cells are not recoverable in this transcript.]
Features of the Semantic Map
Reveal frequencies and distribution of underlying data.
Preserve metric relationships “as faithfully as possible” while mapping from high-dimensional data to a two-dimensional display.
Display co-occurrence structures through its neighborhood structures.
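One way to read "reveal frequencies and distribution" is to count how many documents land on each node of an already-trained map. The sketch below assumes that reading; the function names, the tiny 1x3 map, and the four two-word document vectors are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            # Squared Euclidean distance between node weights and the input.
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def hit_counts(weights, documents):
    """Count how many documents map to each node of the grid."""
    return Counter(best_matching_unit(weights, d) for d in documents)

# Hypothetical 1x3 map over two indexing words (illustrative values only).
weights = [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]]
# Hypothetical binary document vectors: 1 means the word indexes the document.
documents = [[1, 0], [1, 0], [0, 1], [1, 1]]

counts = hit_counts(weights, documents)
for node, n in sorted(counts.items()):
    print(node, n)
```

Nodes with high counts mark dense regions of the collection, and documents that share indexing words tend to fall on the same or neighboring nodes.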
Why do you prefer applying the Self-organizing Map (SOM) to textual information?
The power of abstraction
The feature of self-organization
The format of output: rich information for display
Information Abstraction
SOM utilizes statistical information of text in a unique way
– Both individual data and their interrelationships are represented.
– Learning takes place gradually
    To tolerate uncertainty/fuzziness in the input data
– It represents large amounts of data economically
    Similar to the way the brain processes/stores information?
Information Organization
SOM uses the input data to turn a random network into an organized network.
– Each piece of information will find its own identity (the best place) on the map.
– All the related information should be organized together.
– A compromise or enforcement of both “individual responsibilities” and “social responsibilities.”
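The move from a random network to an organized one can be sketched with the standard Kohonen update: each input finds its best-matching node (its "own place"), and that node and its grid neighbors are pulled toward the input (the "social" part). This is a minimal sketch, not the implementation behind these slides; the grid size, learning-rate and radius schedules, Gaussian neighborhood, and the sample document vectors are all assumptions.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; schedules here are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Start from a random network.
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                    # learning rate decays
            radius = max(radius0 * (1.0 - frac), 0.5)  # neighborhood shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood measured on the grid, not in input space.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights

# Hypothetical binary document vectors over four indexing words.
docs = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
som = train_som(docs, rows=2, cols=3)
for d in docs:
    print(d, "->", best_matching_unit(som, d))
```

Because the learning rate and neighborhood radius shrink gradually, early inputs shape the map broadly and later inputs only fine-tune it, which is one way the map tolerates noisy input.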
Information Visualization
SOM’s output is an associative network that can be used to implement various interactive functions of the interface
– A good overview of underlying data
– A variety of topologic structures
    Sizes of groups, distances, weights of vectors, patterns of inputs, etc.
– A space of both documents and terms
    Effective use of all the space of the two-dimensional area.
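To place terms in the same space as documents, one simple rule is to label each node with the indexing word that has the largest component in that node's weight vector; runs of neighboring nodes with the same label then form term areas whose sizes can be compared. The rule, the function name, and the example values below are assumptions for illustration, not necessarily the labeling used for the maps in these slides.

```python
def label_nodes(weights, terms):
    """Label each node with the term whose weight component is largest."""
    return [
        [terms[max(range(len(w)), key=w.__getitem__)] for w in row]
        for row in weights
    ]

# Hypothetical two-word vocabulary and a hand-made 1x3 map (illustrative only).
terms = ["retrieval", "network"]
weights = [[[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]]

labels = label_nodes(weights, terms)
for row in labels:
    print(row)
```

Here the first two nodes share a label, so "retrieval" would occupy a larger area of this toy map than "network".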
How much semantics are represented in Kohonen’s map?
It’s an open question.
Understanding can be gained through comparisons and applications.
[Figure: sixteen animals (dove, hen, duck, goose, owl, hawk, eagle, cat, fox, wolf, tiger, lion, cow, dog, horse, zebra) arranged by three methods: (a) Hierarchical cluster, (b) Principal component analysis, (c) Kohonen's feature map]
[Figure: terms (graph, tree, minors, survey, time, response, user, computer, interface, human, EPS, system) and documents (C1 to C5, M1 to M4) shown in two displays: (a) Display of the latent semantic indexing result, (d) Document and term map by the feature map]
Visual SiteMap