
Lecture 24: Relation Extraction

Kai-Wei Chang, CS @ University of Virginia

kw@kwchang.net

Course webpage: http://kwchang.net/teaching/NLP16

CS6501-NLP

Goal

- Acquire structured knowledge from text


Information extraction

- Entity recognition
  - Identify named entities: people, organizations, locations, times, dates, etc.
  - Or genes, proteins, diseases, etc.
- Relation extraction
  - Located in, employed by, married to


Example


Why relation extraction?

- Create structured knowledge bases
- Augment structured knowledge bases
- Support question answering
- The first step for event extraction and storyline extraction
- …


Relation types (closed domain)

- 17 relations from Automated Content Extraction (ACE)


Credit: Dan Jurafsky

Relation types (closed domain)

- UMLS: Unified Medical Language System
  - 134 entity types, 54 relations


Relation types (open domain)

- Freebase: thousands of relations, millions of entities


Wikipedia Infobox


|undergrad=15,669<ref name=facts/>
|postgrad=6,316<ref name=facts/>
|city=[[Charlottesville, Virginia|Charlottesville]]
|state=[[Virginia]]
|country=U.S.
|campus=[[Charlottesville, Virginia metropolitan area|Small city]]<br/>{{convert|1682|acre|km2}}<br />[[World Heritage Site]]
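Infobox markup like the fragment above is already close to a set of (entity, attribute, value) facts. The following is a minimal sketch of that reading, not a real wikitext parser. It assumes that fields start with `|key=` and that the caller supplies the subject entity (the page title, which the fragment itself does not contain). The function name `parse_infobox` is hypothetical.

```python
import re

def parse_infobox(subject, wikitext):
    """Turn '|key=value' infobox fields into (subject, key, value) triples."""
    triples = []
    # Split on '|' only where it starts a 'key=' field, so the '|' inside
    # [[target|label]] links and {{templates}} is left alone.
    for field in re.split(r"\|(?=\w+=)", wikitext):
        if "=" not in field:
            continue
        key, value = field.split("=", 1)
        value = re.sub(r"<ref[^>]*/>", "", value)   # drop self-closing <ref .../> tags
        value = re.split(r"<br\s*/?>", value)[0]    # keep only the text before the first <br/>
        # Replace [[target|label]] with label, and [[target]] with target.
        value = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", value)
        triples.append((subject, key.strip(), value.strip()))
    return triples

# The subject name is supplied by hand here; it is an assumption of this example.
facts = parse_infobox(
    "University of Virginia",
    "|undergrad=15,669<ref name=facts/>|city=[[Charlottesville, Virginia|Charlottesville]]"
    "|state=[[Virginia]]|country=U.S.",
)
```

Each triple can be read as a relation instance, for example `("University of Virginia", "city", "Charlottesville")`.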

How to build relation extractors (closed domain)

- Hand-written patterns
- Supervised machine learning
  - Take each sentence as input
  - Identify named entities (mentions)
  - Perform multi-class classification
  - Plus constraints or features to model correlations
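As a rough illustration of the hand-written pattern approach, the sketch below maps a few surface templates to relation labels. The patterns, the label names, and the crude "capitalized token sequence" stand-in for entity mentions are all assumptions made for this example. A real system would run a named entity recognizer first.

```python
import re

# Stand-in for an entity mention: one or more capitalized tokens (an assumption).
ENTITY = r"([A-Z]\w*(?: [A-Z]\w*)*)"

# Hypothetical hand-written patterns, each paired with a relation label.
PATTERNS = [
    (re.compile(ENTITY + r" is located in " + ENTITY), "located_in"),
    (re.compile(ENTITY + r" (?:works for|is employed by) " + ENTITY), "employed_by"),
    (re.compile(ENTITY + r" is married to " + ENTITY), "married_to"),
]

def extract_with_patterns(sentence):
    """Return (arg1, relation, arg2) for every pattern match in the sentence."""
    results = []
    for regex, relation in PATTERNS:
        for match in regex.finditer(sentence):
            results.append((match.group(1), relation, match.group(2)))
    return results

triples = extract_with_patterns("Charlottesville is located in Virginia.")
```

Patterns like these tend to be precise but cover few phrasings, which is one motivation for the learned approaches.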


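The supervised recipe (sentence in, mentions identified, multi-class classification over the mention pair) can be sketched with a small multi-class perceptron. This is only one possible classifier, chosen here because it fits in a few lines. The feature set, the toy training sentences, and the names `features` and `Perceptron` are assumptions of this example. The mention spans are given by hand instead of coming from an entity recognizer.

```python
from collections import defaultdict

def features(tokens, m1, m2):
    """Features for a candidate mention pair; m1 and m2 are (start, end, type) spans."""
    feats = ["types=%s-%s" % (m1[2], m2[2])]
    for tok in tokens[m1[1]:m2[0]]:          # words between the two mentions
        feats.append("between=" + tok.lower())
    return feats

class Perceptron:
    """Multi-class perceptron with one weight vector per relation label."""
    def __init__(self, labels):
        self.labels = list(labels)
        self.w = {y: defaultdict(float) for y in self.labels}

    def predict(self, feats):
        return max(self.labels, key=lambda y: sum(self.w[y][f] for f in feats))

    def train(self, data, epochs=5):
        for _ in range(epochs):
            for feats, gold in data:
                guess = self.predict(feats)
                if guess != gold:            # update weights only on mistakes
                    for f in feats:
                        self.w[gold][f] += 1.0
                        self.w[guess][f] -= 1.0

# Toy training data, made up for illustration: tokens, two mention spans, gold label.
train = [
    (["Charlottesville", "is", "located", "in", "Virginia"], (0, 1, "LOC"), (4, 5, "LOC"), "located_in"),
    (["Chang", "is", "employed", "by", "UVA"], (0, 1, "PER"), (4, 5, "ORG"), "employed_by"),
]
model = Perceptron(["located_in", "employed_by"])
model.train([(features(t, a, b), y) for t, a, b, y in train])
prediction = model.predict(
    features(["Richmond", "lies", "in", "Virginia"], (0, 1, "LOC"), (3, 4, "LOC"))
)
```

The entity-type feature is what lets the unseen phrasing "lies in" still be classified as `located_in` in this toy setting.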

How to build relation extractors (open domain)

- Bootstrap learning [Brin 98, …]
  - Use seed instances to extract a set of relational patterns
- Unsupervised learning
  - Cluster sentences based on relational patterns
- Distant supervision
  - Distant supervision for relation extraction without labeled data [Mintz 09+]
- Combine the above approaches
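The bootstrapping idea (seed instances yield patterns, which in turn yield new instances) might look roughly like the loop below. This is a simplified sketch and not the algorithm of [Brin 98]. The toy corpus, the use of the raw text between two entities as a "pattern", and the capitalized-token entity matcher are assumptions of this example.

```python
import re

ENTITY = r"([A-Z]\w*(?: [A-Z]\w*)*)"   # crude stand-in for an entity mention

def bootstrap(corpus, seeds, rounds=2):
    """Alternate between inducing patterns from known pairs and harvesting new pairs."""
    pairs, patterns = set(seeds), set()
    for _ in range(rounds):
        # 1. The text found between a known pair becomes a pattern.
        for x, y in pairs:
            for sent in corpus:
                m = re.search(re.escape(x) + r"(.+?)" + re.escape(y), sent)
                if m:
                    patterns.add(m.group(1))
        # 2. Two entity-like spans joined by a known pattern become a new pair.
        for pat in patterns:
            regex = re.compile(ENTITY + re.escape(pat) + ENTITY)
            for sent in corpus:
                for m in regex.finditer(sent):
                    pairs.add((m.group(1), m.group(2)))
    return pairs, patterns

# Toy corpus written for this example.
corpus = [
    "Charlottesville is located in Virginia.",
    "Richmond is located in Virginia.",
    "Richmond, the capital of Virginia, is on the James River.",
    "Austin, the capital of Texas, is growing.",
]
pairs, patterns = bootstrap(corpus, {("Charlottesville", "Virginia")})
```

Starting from one seed, the second round picks up the pattern ", the capital of " and with it the pair (Austin, Texas). The same mechanism can also pull in patterns that express a different relation, so practical systems usually score or filter patterns between rounds.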

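Distant supervision replaces hand-labeled sentences with an existing knowledge base: a sentence that mentions both entities of a known fact is treated as a (possibly noisy) training example for that fact's relation. The sketch below shows only this labeling step, under simplifying assumptions: entities are matched by plain substring search, and the knowledge base and sentences are toy data invented for the example. The function name `distant_label` is hypothetical.

```python
def distant_label(kb, sentences):
    """Label each sentence with every KB relation whose two entities it mentions."""
    examples = []
    for sent in sentences:
        for e1, relation, e2 in kb:
            if e1 in sent and e2 in sent:   # naive substring match (an assumption)
                examples.append((sent, e1, e2, relation))
    return examples

# Toy knowledge base and sentences, made up for illustration.
kb = [("Charlottesville", "located_in", "Virginia")]
sentences = [
    "Charlottesville is a city in Virginia.",
    "Charlottesville and Virginia both appear in this sentence.",
    "Richmond is in Virginia.",
]
examples = distant_label(kb, sentences)
```

The second sentence is labeled `located_in` even though it does not express that relation, which illustrates the label noise this kind of supervision has to tolerate. The resulting examples would then feed a supervised classifier.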

- A follow-up approach: Relation Extraction with Matrix Factorization and Universal Schemas [Riedel 13+]
