Generating Illustrative Snippets for Open Data on the Web
Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu
Websoft Research Group
National Key Laboratory for Novel Software Technology
Nanjing University, China
The Web is in the era of open data.
Dataset search engines have emerged.
Metadata about a dataset is served,
and only metadata is served.
We propose to also serve an illustrative snippet,
Dataset: A set of entity-property-value triples
Snippet: A size-limited subset of triples
Snippet generation
and to serve a high-quality snippet.
• Coverage: To cover the most important entity types and properties.
• Familiarity: To contain entities familiar to average users.
• Cohesion: To describe a set of related entities.
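The definitions and criteria above can be made concrete with a small sketch. It assumes a toy dataset and two simplified measures that are illustrative stand-ins, not the paper's definitions: `property_coverage` (share of dataset triples whose property appears in the snippet) and `is_cohesive` (whether the snippet's triples form one connected graph). Familiarity is left out because the slides do not say how it is measured.

```python
from collections import Counter

# A dataset is a set of (entity, property, value) triples (toy data, assumed).
dataset = {
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("acme", "type", "Company"),
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("alice", "name", "Alice"),
}

def property_coverage(snippet, dataset):
    """Share of dataset triples whose property also appears in the snippet."""
    freq = Counter(p for _, p, _ in dataset)
    covered = {p for _, p, _ in snippet}
    return sum(freq[p] for p in covered) / sum(freq.values())

def is_cohesive(snippet):
    """True if the triples, read as undirected edges, form one connected graph."""
    adjacency = {}
    for s, _, o in snippet:
        adjacency.setdefault(s, set()).add(o)
        adjacency.setdefault(o, set()).add(s)
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(adjacency)

# A snippet is a size-limited subset of the dataset's triples.
snippet = {
    ("alice", "type", "Person"),
    ("alice", "worksFor", "acme"),
    ("acme", "type", "Company"),
}
print(property_coverage(snippet, dataset), is_cohesive(snippet))
```

Treating every value, including literals, as a graph node is a simplification chosen to keep the sketch short.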
To this end, we formulate and solve a new combinatorial optimization problem:
• Maximum-weight-and-coverage connected graph problem (MwcCG)
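To give a feel for this kind of problem, here is a simple greedy heuristic. It is not the approximation algorithm from the paper, and the objective is an assumption made for illustration: each triple has a hypothetical `weight`, each property grants a hypothetical one-off `prop_gain` bonus the first time it is covered, and every triple added after the first must share a node with the triples already chosen, so the result stays connected.

```python
def greedy_connected_snippet(triples, weight, prop_gain, k):
    """Greedily pick up to k triples whose graph stays connected."""
    remaining = set(triples)
    chosen, nodes, covered = [], set(), set()

    def gain(t):
        # Triple weight, plus a bonus the first time its property is covered.
        bonus = prop_gain[t[1]] if t[1] not in covered else 0.0
        return weight[t] + bonus

    while remaining and len(chosen) < k:
        # The first pick is unconstrained; later picks must touch a chosen node.
        candidates = [t for t in remaining
                      if not chosen or t[0] in nodes or t[2] in nodes]
        if not candidates:
            break
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        chosen.append(best)
        remaining.remove(best)
        nodes.update((best[0], best[2]))
        covered.add(best[1])
    return chosen

# Toy input; all weights and bonuses are made up for the example.
triples = [
    ("alice", "worksFor", "acme"),
    ("alice", "type", "Person"),
    ("acme", "type", "Company"),
    ("carol", "name", "Carol"),
]
weight = {
    ("alice", "worksFor", "acme"): 1.0,
    ("alice", "type", "Person"): 0.5,
    ("acme", "type", "Company"): 0.6,
    ("carol", "name", "Carol"): 0.9,
}
prop_gain = {"worksFor": 2.0, "type": 3.0, "name": 1.0}

result = greedy_connected_snippet(triples, weight, prop_gain, k=3)
print(result)
```

In this toy run the `carol` triple is never selected, even though its weight is high, because it shares no node with the chosen triples. A greedy pass like this offers no optimality guarantee.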
Quality of snippet: Coverage, Familiarity, Cohesion
Experiment results
Baseline: PageRank-based snippet (Rietveld et al., ISWC’14)
Our snippet
Summary
• Motivation
  • To help people quickly know the contents of a large dataset
• Our contribution
  • We propose to automatically extract an optimal illustrative snippet pursuing coverage, familiarity, and cohesion.
  • We formulate a new combinatorial optimization problem: to maximize coverage & weights, constrained by graph connectivity.
  • We solve the problem using an approximation algorithm.
• Paper
  • Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu. Generating Illustrative Snippets for Open Data on the Web. In Proc. WSDM ’17.