[Figure: consistency with the NCBI gold standard (Development Corpus), scale 0–70, for four annotation sources: mturk experiment 1 (minimum 3 votes per annotation); mturk experiment 2 (minimum 3 votes per annotation); NCBO annotator (Human Disease Ontology); conditional random field trained on the AZ corpus (only "all" reported).]
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Benjamin M Good, Max Nanis, Andrew I Su
The Scripps Research Institute, La Jolla, California, USA

ABSTRACT
Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses, such as gene set enrichment evaluations, that would otherwise be impossible. As such, there is a long and fruitful history of BioNLP projects that apply natural language processing to address this challenge. However, the state of the art in BioNLP still leaves much room for improvement in precision, recall and the complexity of knowledge structures that can be extracted automatically. Expert curators remain vital to the process of knowledge extraction but are in short supply.
Recent studies have shown that workers on microtasking platforms such as Amazon’s Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. In addition, several recent volunteer-based citizen science projects have demonstrated the public’s strong desire and ability to participate in the scientific process even without any financial incentives. Based on these observations, the Mark2Cure initiative is developing a Web interface for engaging large groups of people in the process of manual literature annotation. The system will support both microtask workers and volunteers. These workers will be directed by scientific leaders from the community to help accomplish ‘quests’ associated with specific knowledge extraction problems. In particular, we are working with patient advocacy groups such as the Chordoma Foundation to identify motivated volunteers and to develop focused knowledge extraction challenges. We are currently evaluating the first prototype of the annotation interface using the AMT platform.

FUNDING
We acknowledge support from the National Institute of General Medical Sciences (GM089820 and GM083924).

CONTACT
Benjamin Good: [email protected]; Andrew Su: [email protected]

REFERENCES
1. Zhai, Haijun, et al. "Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing." Journal of Medical Internet Research 15.4 (2013).
2. Doğan, Rezarta Islamaj, and Zhiyong Lu. "An improved corpus of disease mentions in PubMed citations." Proceedings of the 2012 Workshop on Biomedical Natural Language Processing. Association for Computational Linguistics, 2012.
Challenge
Next Steps
RESULTS: Comparison to concept recognition tools
Proof of Concept: Experiment with AMT (work in progress)
Consistency(A, B) = 2 × 100 × (N shared annotations) / (N(A) + N(B))
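The score is a Dice-style overlap scaled to 0–100. A minimal sketch in Python, assuming each annotation is represented as a (start, end) character-offset span (the representation and example data are hypothetical):

```python
def consistency(a, b):
    """Consistency(A, B) = 2 * 100 * |A ∩ B| / (|A| + |B|),
    a Dice-style overlap between two annotation sets, scaled to 0-100."""
    a, b = set(a), set(b)
    if not a and not b:
        return 100.0  # two empty annotation sets agree trivially
    return 2 * 100 * len(a & b) / (len(a) + len(b))

# Hypothetical disease-mention spans as (start, end) offsets in one abstract.
gold = {(10, 28), (55, 62), (90, 104)}
crowd = {(10, 28), (55, 62), (120, 131)}
print(consistency(gold, crowd))  # 2 shared mentions out of 3 + 3 -> ~66.7
```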
Can non-experts annotate disease occurrences in text better than machines?
To what degree can we reproduce the NCBI disease corpus [2]?
Objectives for Annotators
• Highlight all diseases and disease abbreviations: “...are associated with Huntington disease ( HD )... HD patients received...”; “The Wiskott-Aldrich syndrome ( WAS ) ...”
• Highlight the longest span of text specific to a disease: “...contains the insulin-dependent diabetes mellitus locus...” and not just ‘diabetes’; “...was initially detected in four of 33 colorectal cancer families...”
• Highlight disease conjunctions as single, long spans: “...the life expectancy of Duchenne and Becker muscular dystrophy patients...”; “...a significant fraction of familial breast and ovarian cancer , but undergoes...”
• Highlight symptoms, the physical results of having a disease: “XFE progeroid syndrome can cause dwarfism, cachexia, and microcephaly. Patients often display learning disabilities, hearing loss, and visual impairment.”
• Highlight all occurrences of disease terms: “Women who carry a mutation in the BRCA1 gene have an 80 % risk of breast cancer by the age of 70. Individuals who have rare alleles of the VNTR also have an increased risk of breast cancer ( 2-4 )”.
• 6900 disease mentions in 793 PubMed abstracts
• developed by a team of 12 annotators
• covers all sentences in a PubMed abstract
• disease mentions are categorized into Specific Disease, Disease Class, Composite Mention and Modifier categories
Goal: structure all knowledge published as text on the same day it appears in PubMed, with expert-human-level precision and recall
[Figure: number of articles added to PubMed (y-axis 0–1,000,000).]
Approach: Citizen Science
Idea: People are very effective processors of text, even in areas where they aren’t experts [1]. Numerous experiments have shown the public’s desire to contribute to science. Let’s give them an opportunity to help annotate the biomedical literature.
Use the AMT to test the concept before attempting to motivate a citizen science movement
Testing on the 100-abstract “development set”: 5 workers per abstract, $0.06 per completed abstract
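As a quick sanity check of the budget, assuming each of the 5 workers is paid $0.06 for every abstract they complete:

```python
# Cost of one AMT run: 100 abstracts, 5 workers per abstract,
# $0.06 paid per completed abstract.
n_abstracts = 100
workers_per_abstract = 5
pay_per_completion = 0.06
total_cost = n_abstracts * workers_per_abstract * pay_per_completion
print(f"${total_cost:.2f}")  # $30.00 per experiment
```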
RESULTS: two experiments
AMT workers performed better than a conditional random field trained on the AZ corpus.
Examples
• Continued refinement of the annotation interface with AMT
• Experiment to compare AMT results versus volunteers
• Collaborations with disease groups such as the Chordoma Foundation to prime the flow of citizen scientist annotators
We are hiring! Postdocs and programmers interested in crowdsourcing and bioinformatics: contact [email protected]
[Figure: precision, recall, and F-score (0–1) as a function of the number of votes per annotation (1–5).]
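The vote threshold trades precision against recall: keeping only spans that at least k of the workers marked tends to raise precision and lower recall. A minimal sketch of this aggregation and scoring (the span representation and example data are hypothetical):

```python
from collections import Counter

def score_at_threshold(worker_spans, gold, k):
    """Keep spans annotated by at least k workers, then compute
    precision, recall, and F1 against the gold-standard spans.
    worker_spans: one set of (start, end) spans per worker."""
    votes = Counter(s for spans in worker_spans for s in set(spans))
    kept = {s for s, v in votes.items() if v >= k}
    tp = len(kept & gold)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical spans from 5 workers on one abstract.
workers = [{(0, 5), (10, 20)}, {(0, 5)}, {(0, 5), (30, 40)},
           {(10, 20), (0, 5)}, {(0, 5), (10, 20)}]
gold = {(0, 5), (10, 20)}
print(score_at_threshold(workers, gold, 3))  # (1.0, 1.0, 1.0) on this toy data
print(score_at_threshold(workers, gold, 5))  # recall drops as k gets strict
```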
Experiment 1
Costs
• one week each ($30)
• one month of Turk-specific developer time...
Exp. 2 changes
• Worker instructions: expanded with more examples
• Minor interface changes (selecting one term automatically selects all other occurrences)
Nearly identical results
Exp. 1 results