David E. Losada Fabio Crestani A Test Collection for Research on Depression and Language Use CLEF 2016, Évora (Portugal)

A Test Collection for Research on Depression and Language Use


Page 1: A Test Collection for Research on Depression and Language Use

David E. Losada

Fabio Crestani

A Test Collection for Research on Depression and Language Use

CLEF 2016, Évora (Portugal)

Page 2: A Test Collection for Research on Depression and Language Use

350 million people suffer from depression

Page 3: A Test Collection for Research on Depression and Language Use

early intervention is fundamental

Page 4: A Test Collection for Research on Depression and Language Use

human expert + technology

Page 5: A Test Collection for Research on Depression and Language Use

current technology

doesn't support early alerts

reactive

works with very explicit signals

Page 6: A Test Collection for Research on Depression and Language Use


too often, too late!

Page 7: A Test Collection for Research on Depression and Language Use

instigate research on the onset of depression

proactive technologies

track temporal evolution

early alerts

Page 8: A Test Collection for Research on Depression and Language Use

Text analytics

natural language can be indicative of personality, social status, emotions, mental health, disorders, ...

Page 9: A Test Collection for Research on Depression and Language Use

linguistic markers

use of personal pronouns

statistical properties of text

topic models

psychometrics

content vs style

social words

verb tense

positive/negative emotions

psychological processes

cognitive processes

Page 10: A Test Collection for Research on Depression and Language Use

Lack of data on depression & language

few collections available

focus on 2-class categorisation

no temporal dimension, no early risk analysis

Page 11: A Test Collection for Research on Depression and Language Use

little context about the tweet writer

difficult to assess whether a mention of depression is genuine

no way to extract a long history of tweets (e.g. several years)

Page 12: A Test Collection for Research on Depression and Language Use


Page 13: A Test Collection for Research on Depression and Language Use

A Thin Line

Page 14: A Test Collection for Research on Depression and Language Use

A Thin Line

no way to extract any history

short messages, little context

Page 15: A Test Collection for Research on Depression and Language Use


Page 16: A Test Collection for Research on Depression and Language Use
Page 17: A Test Collection for Research on Depression and Language Use

large history for each redditor (several years)

many subreddits (communities) about different medical conditions (e.g. depression or anorexia)

long messages

terms & conditions allow use for research purposes

Page 18: A Test Collection for Research on Depression and Language Use


Page 19: A Test Collection for Research on Depression and Language Use

depression group vs control group

Page 20: A Test Collection for Research on Depression and Language Use

depression group vs control group

“I am depressed” “I think I have depression”

Adopted extraction method from Coppersmith et al. 2014:

pattern matching search

search for explicit mentions of diagnosis (e.g. “I was diagnosed with depression”)

manual inspection of the results
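
The pattern matching step above could be sketched as follows. This is a minimal illustration only: the regular expression and the helper name `candidate_posts` are assumptions, since the slides do not give the exact patterns, and every match would still go through manual inspection.

```python
import re

# Hypothetical pattern: the slides only mention a "pattern matching search"
# for explicit mentions of a diagnosis; this exact expression is an assumption.
DIAGNOSIS_PATTERN = re.compile(
    r"\bI\s+(?:was|have\s+been|got|am)\s+diagnosed\s+with\s+(?:\w+\s+){0,3}?depression\b",
    re.IGNORECASE,
)


def candidate_posts(posts):
    """Return the posts that explicitly mention a depression diagnosis.

    Matches are only candidates; they still need manual inspection.
    """
    return [p for p in posts if DIAGNOSIS_PATTERN.search(p)]


posts = [
    "I was diagnosed with depression two years ago.",
    "My wife has depression",
    "I am a student interested in depression",
    "I am depressed",
]
# Only the first post contains an explicit first-person diagnosis.
print(candidate_posts(posts))
```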

Page 21: A Test Collection for Research on Depression and Language Use

depression group vs control group

large set of random redditors from a wide range of subreddits (news, media, ...)

also included some false positives from the depression subreddit (e.g. “My wife has depression”, “I am a student interested in depression”)

Page 22: A Test Collection for Research on Depression and Language Use

retrieved the full history from any subreddit: his/her posts + his/her comments on other posts

often several years of text

removed the post/comment with the explicit mention of the diagnosis (depression group)

redditor profile

Page 23: A Test Collection for Research on Depression and Language Use

pre- & post-diagnosis text

organised the writings in chronological order

XML archives

redditor profile
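
A minimal sketch of how one redditor's writings might be sorted chronologically and stored as XML. The element names (`subject`, `id`, `writing`, `date`, `text`) and the function `build_profile_xml` are hypothetical; the slides do not describe the actual schema of the archives.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def build_profile_xml(subject_id, writings):
    """Serialise one subject's writings, oldest first.

    `writings` is a list of (datetime, text) pairs. The element names
    used here are hypothetical, not the collection's actual schema.
    """
    root = ET.Element("subject")
    ET.SubElement(root, "id").text = subject_id
    for when, text in sorted(writings, key=lambda w: w[0]):
        node = ET.SubElement(root, "writing")
        ET.SubElement(node, "date").text = when.isoformat()
        ET.SubElement(node, "text").text = text
    return ET.tostring(root, encoding="unicode")


xml_doc = build_profile_xml(
    "subject_0001",
    [
        (datetime(2013, 3, 1), "third writing"),
        (datetime(2013, 2, 13), "first writing"),
        (datetime(2013, 2, 15), "second writing"),
    ],
)
print(xml_doc)
```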

Page 24: A Test Collection for Research on Depression and Language Use

collection: main statistics

Page 25: A Test Collection for Research on Depression and Language Use

early prediction task

detect early traces of depression

for each subject, sequentially process pieces of evidence...

Page 26: A Test Collection for Research on Depression and Language Use


Page 27: A Test Collection for Research on Depression and Language Use


Page 28: A Test Collection for Research on Depression and Language Use


Page 29: A Test Collection for Research on Depression and Language Use


Page 30: A Test Collection for Research on Depression and Language Use


Page 31: A Test Collection for Research on Depression and Language Use


Page 32: A Test Collection for Research on Depression and Language Use

early prediction task

John Doe's writings (posts or comments), in chronological order: 2/13/13, 2/15/13, 3/1/13, ..., 14/9/16

tradeoff: early decision vs more informed decision

when should I fire an alarm?

Page 33: A Test Collection for Research on Depression and Language Use

early prediction task: performance metric

After seeing k texts, a system makes a binary decision d about John Doe:

d=1 => possible risk of depression
d=0 => non-risk case

Page 34: A Test Collection for Research on Depression and Language Use

early prediction task: performance metric

John Doe's writings (posts or comments): 2/13/13 (1), 2/15/13 (2), ..., 3/10/14 (k) → decision (d)

Page 35: A Test Collection for Research on Depression and Language Use

early prediction task: performance metric

Early Risk Detection Error:

$$
ERDE_o(d,k) =
\begin{cases}
c_{fp} & \text{(false positive)} \\
c_{fn} & \text{(false negative)} \\
c_{tp} \cdot lc_o(k) & \text{(true positive)} \\
0 & \text{(true negative)}
\end{cases}
$$

Page 36: A Test Collection for Research on Depression and Language Use

Early Risk Detection Error:

Usually, $c_{fn} \gg c_{fp}$

$c_{fn} \leftarrow 1$, $c_{fp} \leftarrow$ expected proportion of positive cases (e.g. 0.01)

Page 37: A Test Collection for Research on Depression and Language Use

True Positive cost: $c_{tp} \cdot lc_o(k)$

$c_{tp} \leftarrow c_{fn}$ (late detection ≈ no detection)

Latency cost function $lc_o(k)$
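
The error measure can be sketched in a few lines of Python. The slide only names the latency cost function, so this sketch assumes the sigmoid form lc_o(k) = 1 - 1/(1 + e^(k-o)), where o marks the point at which the cost of a late alarm grows quickly. The function names and default costs are illustrative (c_fn = 1, c_fp = 0.01 and c_tp = c_fn follow the settings given above).

```python
import math


def latency_cost(k, o):
    """Assumed sigmoid latency cost: lc_o(k) = 1 - 1 / (1 + exp(k - o)).

    Written in a numerically stable form so large |k - o| does not overflow.
    """
    x = k - o
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def erde(decision, truth, k, o, c_fp=0.01, c_fn=1.0, c_tp=1.0):
    """Early Risk Detection Error for one subject.

    decision, truth: 1 = risk of depression, 0 = non-risk case.
    k: number of writings seen before the decision was made.
    """
    if decision == 1 and truth == 0:
        return c_fp                          # false positive
    if decision == 0 and truth == 1:
        return c_fn                          # false negative
    if decision == 1 and truth == 1:
        return latency_cost(k, o) * c_tp     # true positive, penalised by delay
    return 0.0                               # true negative


# A correct alarm raised early costs little; the same alarm raised late
# costs almost as much as missing the case altogether.
print(erde(1, 1, k=2, o=5), erde(1, 1, k=50, o=5))
```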

Page 38: A Test Collection for Research on Depression and Language Use

experiments

Page 39: A Test Collection for Research on Depression and Language Use

Training: 403 (control group), 83 (depression group)
Test: 352 (control group), 54 (depression group)

Page 40: A Test Collection for Research on Depression and Language Use

Training

John Doe's writings: 2/13/13, 2/15/13, 3/1/13, ... 12/9/16
Jane Doe's writings: 3/23/13, 3/25/13, 1/3/14, ... 2/19/15

single doc representations

feature-based representations (tfidf weights), e.g. "1:0.4 2:0.5 ... +1" and "1:0.3 3:0.7 ... -1"

logistic regression (L1 regularisation)

depression language classifier
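
A small sketch of the representation step, assuming one concatenated document per subject and a plain tf-idf weighting (raw term frequency times log(N/df)). The exact tf-idf variant is not stated on the slide, and `tfidf_lines` is a hypothetical helper. The output mirrors the "index:weight ... label" lines shown above; training the L1-regularised logistic regression is not shown.

```python
import math
from collections import Counter


def tfidf_lines(docs, labels):
    """Turn one document per subject into 'idx:weight ... label' lines.

    Weighting is tf * log(N / df); this variant is an assumption.
    Terms that occur in every document get weight 0 and are omitted.
    """
    tokenised = [doc.lower().split() for doc in docs]
    terms = sorted({t for tokens in tokenised for t in tokens})
    vocab = {t: i + 1 for i, t in enumerate(terms)}   # 1-based feature ids
    df = Counter(t for tokens in tokenised for t in set(tokens))
    n = len(docs)
    lines = []
    for tokens, label in zip(tokenised, labels):
        tf = Counter(tokens)
        weights = {vocab[t]: c * math.log(n / df[t]) for t, c in tf.items()}
        feats = " ".join(f"{i}:{w:.3f}" for i, w in sorted(weights.items()) if w > 0)
        lines.append(f"{feats} {label:+d}")
    return lines


docs = ["i feel sad and tired", "i watched the news today"]
for line in tfidf_lines(docs, [+1, -1]):
    print(line)
```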

Page 41: A Test Collection for Research on Depression and Language Use

Test

random (after 1st message)

[timeline of writings: 2/13/13, 2/15/13, 3/1/13, ... 14/9/16]

decision: rand({0,1})

Page 42: A Test Collection for Research on Depression and Language Use

Test

minority class (after 1st message)

[timeline of writings: 2/13/13, 2/15/13, 3/1/13, ... 14/9/16]

decision: 1 (risk case)

Page 43: A Test Collection for Research on Depression and Language Use

Test

first n

writings 1, 2, ..., n (2/13/13, 2/15/13, ... 3/1/13) → depression language classifier → decision
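
The three fixed strategies (random, minority class, first n) can be sketched as below. Each function returns the decision together with the number of writings seen, k. The callable `classify` is a hypothetical stand-in for the depression language classifier, and classifying the concatenation of the first n writings is an assumption about how the classifier is applied.

```python
import random


def random_baseline(writings, rng=random):
    """Emit a random decision right after the first writing (k = 1)."""
    return rng.choice([0, 1]), 1


def minority_baseline(writings):
    """Always flag the subject as a risk case after the first writing."""
    return 1, 1


def first_n_baseline(writings, classify, n):
    """Classify the concatenation of the first n writings; decide at k = n.

    `classify` is a hypothetical callable: text -> 0 or 1.
    """
    seen = writings[:n]
    return classify(" ".join(seen)), len(seen)


# Toy stand-in for the trained classifier, for illustration only.
toy_classify = lambda text: int("hopeless" in text)
writings = ["nice weather", "feeling hopeless lately", "still hopeless"]
print(minority_baseline(writings))
print(first_n_baseline(writings, toy_classify, n=2))
```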

Page 44: A Test Collection for Research on Depression and Language Use

Test

dynamic

writings 1, 2, ..., n (2/13/13, 2/15/13, ... 3/1/13) → depression language classifier → confident about risk?

yes: we finish and predict 1 (risk case)

Page 45: A Test Collection for Research on Depression and Language Use

Test

dynamic

writings 1, 2, ..., n (2/13/13, 2/15/13, ... 3/1/13) → depression language classifier → confident about risk?

no: we wait and see more evidence...
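
The dynamic strategy can be sketched as a loop over the subject's writings. Here `risk_probability` is a hypothetical callable standing in for the classifier's confidence, the threshold value is illustrative, and predicting 0 once the writings run out is an assumption, since the slides do not say what happens in that case.

```python
def dynamic_strategy(writings, risk_probability, threshold=0.9):
    """Return (decision, k), stopping at the first k with enough confidence.

    `risk_probability` is a hypothetical callable: text -> value in [0, 1].
    """
    seen = []
    for k, text in enumerate(writings, start=1):
        seen.append(text)
        if risk_probability(" ".join(seen)) >= threshold:
            return 1, k  # confident about risk: finish and predict 1
        # not confident yet: wait and see more evidence
    # Assumption: if confidence never reaches the threshold, predict non-risk.
    return 0, len(writings)


# Toy confidence function, for illustration only.
toy_probability = lambda text: min(1.0, text.count("hopeless") / 2)
writings = ["nice weather", "feeling hopeless lately", "still hopeless"]
print(dynamic_strategy(writings, toy_probability))
```

A lower threshold fires the alarm earlier at the risk of more false positives, which is the tradeoff the ERDE measure is meant to capture.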

Page 46: A Test Collection for Research on Depression and Language Use


Page 47: A Test Collection for Research on Depression and Language Use


Page 48: A Test Collection for Research on Depression and Language Use

results

random/minority: poor F1 & ERDE
first n: good F1 but slow at detecting risk cases
dynamic: best balance between correctness & time

Page 49: A Test Collection for Research on Depression and Language Use

new collection on

depression & language

early risk detection algorithms

(preliminary baselines)

methodology for benchmark construction

temporal dimension

conclusions

Page 50: A Test Collection for Research on Depression and Language Use

David E. Losada

Fabio Crestani

A Test Collection for Research on Depression and Language Use

This research was funded by the Swiss National Science Foundation (project “Early risk prediction on the Internet: an evaluation corpus”, 2015)

We also thank the “Ministerio de Economía y Competitividad” of the Government of Spain & FEDER Funds (ref. TIN2015-64282-R)

Page 51: A Test Collection for Research on Depression and Language Use

Acknowledgements:

Ehnero. picture pg 1. CC BY NC 2.0.
Gerald Gabernig. picture pg 2. CC BY 2.0.
ankxt. picture pg 3. CC BY 2.0.
NEC Corporation of America. picture pg 4. CC BY 2.0.
Jordi Borràs i Vivó. picture pgs 5-6. CC BY NC ND 2.0.
Helen Harrop. picture pg 7. CC BY SA 2.0.
Nilufer Gadgieva. picture pg 8. CC BY NC 2.0.
Alix May. picture pg 9. CC BY NC 2.0.
Justin Lincoln. picture pg 10. CC BY SA 2.0.
Grace McDunnough. picture pgs 11-18 (top). CC BY NC ND 2.0.
Andy Kennelly. picture pgs 19-21. CC BY NC 2.0.
Joel Olives. picture pgs 22-23 (left). CC BY 2.0.
Tim Morgan. picture pg 23 (right). CC BY 2.0.
Conor Lawless. picture pg 24. CC BY 2.0.
Oscar Rethwill. picture pgs 25-32. CC BY 2.0.
Emily. picture pgs 33-37. CC BY NC 2.0.
Tiberiu Ana. picture pg 38. CC BY 2.0.
woodleywonderworks. picture pg 39 (left), 40 (left). CC BY 2.0.
Niko Kaiser. picture pg 39 (right), 41-47. CC BY 2.0.
John Sheets. picture pg 48. CC BY NC 2.0.
Anders Sandberg. picture pg 49. CC BY NC 2.0.
See-ming Lee. picture pg 51. CC BY NC 2.0.