Language Technology Enhanced Learning

DESCRIPTION

Fridolin Wild, Gaston Burek, Adriana Berlanga

Transcript

  • 1. Language Technology Enhanced Learning. Fridolin Wild, The Open University, UK; Gaston Burek, University of Tübingen; Adriana Berlanga, Open University, NL

2. Workshop Outline

  • 1 | Deep Introduction: Latent Semantic Analysis (LSA)
  • 2 | Quick Introduction: Working with R
  • 3 | Experiment: Simple Content-Based Feedback
  • 4 | Experiment: Topic Proxy

3. Latent Semantic Analysis (LSA)

4. Latent Semantic Analysis

  • Assumption: language utterances do have a semantic structure
  • However, this structure is obscured by word usage (noise, synonymy, polysemy, …)
  • Proposed LSA solution:
    • map the doc-term matrix
    • to conceptual indices
    • derived statistically (truncated SVD)
    • and make similarity comparisons using, e.g., angles
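The doc-term matrix this mapping starts from can be sketched in a few lines of pure Python. A minimal illustration with an invented three-document corpus (the document names and texts are my own, loosely echoing the Deerwester-style example; only terms occurring in more than one document are kept, as the slides suggest):

```python
from collections import Counter

# Toy corpus (invented snippets, not real data)
docs = {
    "d1": "human machine interface for computer applications",
    "d2": "a survey of user opinion of computer system response time",
    "d3": "graph minors a survey",
}
counts = {d: Counter(text.split()) for d, text in docs.items()}

# vocabulary = ordered set of features; keep only terms occurring in
# more than one document, strip the rest
df = Counter(t for c in counts.values() for t in c)
vocab = sorted(t for t, n in df.items() if n > 1)

# doc-term matrix M: one row per term, one column per document
M = [[counts[d][t] for d in docs] for t in vocab]
print(vocab)  # ['a', 'computer', 'survey']
```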

5. Input (e.g., documents)

  • { M } = Deerwester, Dumais, Furnas, Landauer, and Harshman (1990): Indexing by Latent Semantic Analysis. In: Journal of the American Society for Information Science, 41(6):391-407
  • Only the red terms appear in more than one document, so strip the rest.
  • term = feature; vocabulary = ordered set of features
  • TEXTMATRIX

6. Singular Value Decomposition

  • M = T S Dᵀ

7. Truncated SVD

  • we will get a different matrix (different values, but still of the same format as M): the latent-semantic space

8. Reconstructed, Reduced Matrix

  • m4: Graph minors: A survey

9. Similarity in a Latent-Semantic Space (Landauer, 2007)

  • [Figure: a query vector and two target vectors plotted in two dimensions; similarity measured by the angles between them]

10. doc2doc - similarities

    • Unreduced = pure vector space model
      • based on M = T S Dᵀ
      • Pearson correlation over document vectors
    • Reduced
      • based on M₂ = T S₂ Dᵀ
      • Pearson correlation over document vectors
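The unreduced-versus-reduced contrast can be made concrete with a hand-built toy factorisation (assumed 2×2 values, not a real corpus; cosine similarity stands in here for the angle-based comparison, while the slides use Pearson correlation):

```python
from math import sqrt

# Hand-built toy SVD factors (assumed values): 2 terms x 2 documents
c = 1 / sqrt(2)
T = [[c, -c], [c, c]]   # term loadings (terms x factors)
S = [3.0, 1.0]          # singular values
D = [[c, -c], [c, c]]   # document loadings (documents x factors)

def reconstruct(k):
    """M_k = T S_k D^T using only the first k factors."""
    return [[sum(T[t][f] * S[f] * D[d][f] for f in range(k))
             for d in range(len(D))] for t in range(len(T))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

col = lambda m, d: [row[d] for row in m]
M = reconstruct(2)    # unreduced: the original matrix, approx. [[2, 1], [1, 2]]
M2 = reconstruct(1)   # reduced: rank-1 approximation, all entries 1.5
print(round(cosine(col(M, 0), col(M, 1)), 3))    # 0.8 in the unreduced space
print(round(cosine(col(M2, 0), col(M2, 1)), 3))  # 1.0 in the reduced space
```

The truncation pulls the two (already related) document vectors together, which is exactly the effect the latent-semantic space is meant to have.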

11. (Landauer, 2007)

12. Configurations

  • 4 x 12 x 7 x 2 x 3 = 2016 combinations

13. Updating: Folding-In

  • SVD factor stability
    • Different texts → different factors
    • Challenge: avoid unwanted factor changes (e.g., caused by bad essays)
    • Solution: folding-in instead of recalculating
  • SVD is computationally expensive
    • 14 seconds (300-document textbase)
    • 10 minutes (3,500-document textbase)
    • and rising!
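Folding-in projects a new document's raw term vector q into the existing latent space without recomputing the SVD, via the standard projection d̂ = S⁻¹ Tᵀ q. A pure-Python toy with hand-built factors (assumed values, not a real corpus):

```python
from math import sqrt

# Hand-built toy factors (assumed values): 2 terms x 2 factors
c = 1 / sqrt(2)
T = [[c, -c], [c, c]]   # term loadings
S = [3.0, 1.0]          # singular values

def fold_in(q):
    # T^T q, then divide each factor coordinate by its singular value
    return [sum(T[t][k] * q[t] for t in range(len(q))) / S[k]
            for k in range(len(S))]

d_new = fold_in([1, 1])   # a new document containing both terms once
print(d_new)              # first factor sqrt(2)/3, second factor 0
```

The existing factors stay untouched, which is precisely why folding-in avoids the unwanted factor changes mentioned above.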

14. The Statistical Language and Environment R

16. Help

  • > ?'+'
  • > ?kmeans
  • > help.search("correlation")
  • http://www.r-project.org
    • => site search
    • => documentation
  • Mailing list: r-help
  • Task View NLP: http://cran.r-project.org -> Task Views -> NLP

17. Installation & Configuration

  • install.packages("lsa", repos="http://cran.r-project.org")
  • install.packages("tm", repos="http://cran.r-project.org")
  • install.packages("network", repos="http://cran.r-project.org")
  • library(lsa)
  • setwd("d:/denkhalde/workshop")
  • dir()
  • ls()
  • quit()

18. The lsa Package

  • Available via CRAN, e.g.: http://cran.at.r-project.org/src/contrib/Descriptions/lsa.html
  • Higher-level abstraction to ease use
    • Five core methods:
      • textmatrix() / query()
      • lsa()
      • fold_in()
      • as.textmatrix()
    • Supporting methods for term weighting, dimensionality calculation, correlation measurement, triple binding

19. Core Processing Workflow

  • tm = textmatrix("dir/")
  • tm = lw_logtf(tm) * gw_idf(tm)
  • space = lsa(tm, dims=dimcalc_share())
  • tm3 = fold_in(tm, space)
  • as.textmatrix(space)

20. A Simple Evaluation of Students' Writings: Feedback

21. Evaluating Student Writings

  • External validation? Compare to human judgements! (Landauer, 2007)

22. How to do it...

  • library("lsa") # load package
  • # load training texts
  • trm = textmatrix("trainingtexts/")
  • trm = lw_bintf(trm) * gw_idf(trm) # weighting
  • space = lsa(trm) # create an LSA space
  • # fold-in essays to be tested (including gold standard text)
  • tem = textmatrix("testessays/", vocabulary=rownames(trm))
  • tem = lw_bintf(tem) * gw_idf(trm) # weighting
  • tem_red = fold_in(tem, space)
  • # score an essay by comparing with
  • # gold standard text (very simple method!)
  • cor(tem_red[,"goldstandard.txt"], tem_red[,"E1.txt"])
  • => 0.7

23. Evaluating Effectiveness

  • Compare machine scores with human scores
  • Human-to-human correlation
    • Usually around .6
    • Increased by familiarity between assessors, tighter assessment schemes, …
    • Scores vary even more strongly with decreasing subject familiarity (.8 at high familiarity, worst test -.07)
  • Test collection: 43 German essays, scored from 0 to 5 points (ratio scaled), average length: 56.4 words
  • Training collection: 3 golden essays, plus 302 documents from a marketing glossary, average length: 56.1 words

24. (Positive) Evaluation Results

  • LSA machine scores:
    • Spearman's rank correlation rho
    • data: humanscores[names(machinescores), ] and machinescores
    • S = 914.5772, p-value = 0.0001049
    • alternative hypothesis: true rho is not equal to 0
    • sample estimates: rho = 0.687324
  • Pure vector space model:
    • Spearman's rank correlation rho
    • data: humanscores[names(machinescores), ] and machinescores
    • S = 1616.007, p-value = 0.02188
    • alternative hypothesis: true rho is not equal to 0
    • sample estimates: rho = 0.4475188

25. Concept-Focused Evaluation (using http://eczemablog.blogspot.com/feeds/posts/default?alt=rss)

26. Visualising Lexical Semantics: Topic Proxy

27. Network Visualisation

    • Term-to-term distance matrix

  • distance matrix = graph:

          t1     t2     t3    t4
    t1    1
    t2   -0.2    1
    t3    0.5    0.7    1
    t4    0.05  -0.5    0.3   1

28. Classical Landauer Example

  • tl = landauerSpace$tk %*% diag(landauerSpace$sk)
  • dl = landauerSpace$dk %*% diag(landauerSpace$sk)
  • dtl = rbind(tl, dl)
  • s = cosine(t(dtl))
  • s[which(s
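One common way to turn such a term-to-term similarity matrix into a graph is thresholding: keep an edge only where the similarity exceeds a cutoff. A pure-Python sketch using the toy values from the matrix above (the 0.3 cutoff and the edge-list representation are my own choices):

```python
# Term-to-term similarities (upper triangle of the toy matrix)
sim = {
    ("t1", "t2"): -0.2, ("t1", "t3"): 0.5, ("t1", "t4"): 0.05,
    ("t2", "t3"): 0.7, ("t2", "t4"): -0.5, ("t3", "t4"): 0.3,
}
threshold = 0.3  # assumed cutoff; tune per corpus
edges = sorted(pair for pair, s in sim.items() if s > threshold)
print(edges)  # [('t1', 't3'), ('t2', 't3')]
```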