Tiziano Piccardi, Michele Catasta, Leila Zia, Robert West

Poor structure | Limited content | No references | Not clear | Not integrated

Help to structure the content

We use the content of the article to generate recommendations

We use the category network to generate recommendations

Latent Dirichlet Allocation (LDA): 200 topics

Article topics: Topic 1 (0.1), Topic 2 (0.05), Topic 3 (0.7), ...

Final recommendation: sum of the per-topic recommendations, weighted by the article's topic probabilities

Topic 1 (0.1) + Topic 2 (0.05) + Topic 3 (0.7) + ...
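
The weighted sum above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the topic weights are the ones shown on the slide, while the function name `recommend_from_topics` and the per-topic section scores are made up for the example.

```python
from collections import defaultdict

def recommend_from_topics(article_topics, topic_section_scores, k=10):
    """Rank sections by the topic-weighted sum of per-topic section scores."""
    totals = defaultdict(float)
    for topic, weight in article_topics.items():
        for section, score in topic_section_scores.get(topic, {}).items():
            totals[section] += weight * score
    # Sort by descending score; ties are broken alphabetically.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]

# Topic weights from the slide.
article_topics = {"topic_1": 0.1, "topic_2": 0.05, "topic_3": 0.7}

# Hypothetical per-topic section scores (assumption, for illustration only).
topic_section_scores = {
    "topic_1": {"History": 0.8, "Geography": 0.5},
    "topic_2": {"Discography": 0.9},
    "topic_3": {"Plot": 0.9, "Cast": 0.7, "History": 0.1},
}

top = recommend_from_topics(article_topics, topic_section_scores, k=3)
for section, score in top:
    print(section, round(score, 3))
```

Sections tied to the dominant topic (Topic 3 here) end up at the top of the list, while a section shared by several topics accumulates score from each of them.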

Collaborative Filtering

Based on matrix factorization with Alternating Least Squares

One row per article and one column per section

1 if the section S appears in the article A
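
A minimal sketch of how the article-by-section matrix could be assembled before factorization. The article names and the helper `build_binary_matrix` are hypothetical, and the Alternating Least Squares step itself is not shown.

```python
def build_binary_matrix(article_sections):
    """Return (articles, sections, matrix) with matrix[i][j] = 1
    if section j appears in article i, else 0."""
    articles = sorted(article_sections)
    sections = sorted({s for secs in article_sections.values() for s in secs})
    column = {s: j for j, s in enumerate(sections)}
    matrix = [[0] * len(sections) for _ in articles]
    for i, article in enumerate(articles):
        for section in article_sections[article]:
            matrix[i][column[section]] = 1
    return articles, sections, matrix

# Hypothetical input: article title -> section headings it already has.
article_sections = {
    "Titanic": ["Plot", "Cast"],
    "Inception": ["Plot", "Cast", "Production"],
    "New stub": [],
}

articles, sections, matrix = build_binary_matrix(article_sections)
for article, row in zip(articles, matrix):
    print(article, row)
```

An article with no sections yet produces an all-zero row, so the factorization has no signal to learn from for that article.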

Limitation:

The article-based approach cannot generate recommendations for new articles!

Intuition:

Articles in the same category share similar sections

We can use the categories to generate templates for the editors

Category:American epic ➜ {Plot, Cast, Production}

Taxonomic assumption

Categories are organised in a hierarchical structure

Frequent sections in the children may be relevant for the parent

Wait, it’s not so easy…

Government ➜ Public administration ➜ Public economics ➜ Economic policy ➜ Government

Peter Eades, Xuemin Lin, W.F. Smyth

Removed 4k edges out of ~4M
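
The slide credits Eades, Lin and Smyth, presumably for their greedy heuristic for the feedback arc set problem. The sketch below follows that greedy idea (peel off sinks and sources, otherwise pick the node with the largest out-degree minus in-degree) and returns the edges that point backwards in the resulting order. It is an illustration under that assumption, not necessarily the exact procedure applied to the category network; the function name is hypothetical.

```python
def greedy_feedback_arcs(nodes, edges):
    """Return a set of edges whose removal leaves the graph acyclic."""
    remaining = set(nodes)
    succ = {v: set() for v in nodes}
    pred = {v: set() for v in nodes}
    for u, v in edges:
        if u != v:
            succ[u].add(v)
            pred[v].add(u)

    def remove(v):
        remaining.discard(v)
        for w in succ[v]:
            pred[w].discard(v)
        for w in pred[v]:
            succ[w].discard(v)

    head, tail = [], []
    while remaining:
        changed = True
        while changed:
            changed = False
            for v in sorted(remaining):
                if not succ[v]:        # sink: goes to the front of the tail
                    tail.insert(0, v)
                    remove(v)
                    changed = True
                elif not pred[v]:      # source: goes to the end of the head
                    head.append(v)
                    remove(v)
                    changed = True
        if remaining:
            # No sink or source left: take the node with max (out - in) degree.
            v = max(sorted(remaining), key=lambda x: len(succ[x]) - len(pred[x]))
            head.append(v)
            remove(v)

    position = {v: i for i, v in enumerate(head + tail)}
    # Edges pointing backwards (or self-loops) in the order are removed.
    return [(u, v) for u, v in edges if position[u] >= position[v]]

# The category cycle shown on the slide.
nodes = ["Government", "Public administration", "Public economics", "Economic policy"]
edges = [
    ("Government", "Public administration"),
    ("Public administration", "Public economics"),
    ("Public economics", "Economic policy"),
    ("Economic policy", "Government"),
]
removed = greedy_feedback_arcs(nodes, edges)
print(removed)
```

For a single cycle like this one, dropping one edge is enough to turn the structure into a hierarchy.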

Categories with heterogeneous articles must be removed

We assigned 55 top-level types to the articles

The distribution of the article types in a category C is used to select the categories to keep in the network
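
The slides do not say here which statistic is computed on the type distribution. As one assumed possibility, the sketch below uses the Gini coefficient of the per-type article counts: a category whose articles concentrate on one type scores high and is kept, while an evenly mixed category scores low and is dropped. The threshold value and function names are hypothetical.

```python
def gini(counts):
    """Gini coefficient of non-negative counts.
    0 means perfectly even; values near 1 mean concentrated on few types."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def keep_category(type_counts, threshold=0.5):
    # threshold is a hypothetical cut-off, not a value from the slides.
    return gini(type_counts) >= threshold

# Article counts per type (4 types here for brevity; the slides use 55).
homogeneous = [0, 0, 0, 20]    # all articles share one type
heterogeneous = [5, 5, 5, 5]   # articles spread evenly over types

print(gini(homogeneous), keep_category(homogeneous))
print(gini(heterogeneous), keep_category(heterogeneous))
```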

P(S1 | CAT1) = 2/7

Example:

American_male_film_actors
Filmography: 0.59 | Career: 0.47 | Personal life: 0.38

Filmography appears in 59% of the articles in the category “American_male_film_actors”

Probability P(S | C) of observing section S in category C
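
A minimal sketch of how P(S | C) can be estimated from counts: the number of articles in category C that contain section S, divided by the number of articles in C. The data layout and names below are assumptions for illustration; the toy data reproduces the P(S1 | CAT1) = 2/7 case.

```python
from collections import Counter

def section_probabilities(category_articles, article_sections):
    """Estimate P(S | C) as the fraction of articles in C containing S."""
    probs = {}
    for category, articles in category_articles.items():
        counts = Counter()
        for article in articles:
            # Count each section at most once per article.
            counts.update(set(article_sections.get(article, ())))
        n = len(articles)
        probs[category] = {s: c / n for s, c in counts.items()} if n else {}
    return probs

# Hypothetical toy data: 7 articles in CAT1, 2 of which contain S1.
category_articles = {"CAT1": ["A1", "A2", "A3", "A4", "A5", "A6", "A7"]}
article_sections = {
    "A1": ["S1", "S2"],
    "A2": ["S1"],
    "A3": ["S2"],
    "A4": ["S3"],
    "A5": [],
    "A6": ["S2", "S3"],
    "A7": ["S2"],
}

p = section_probabilities(category_articles, article_sections)
print(p["CAT1"]["S1"])  # 2 of 7 articles contain S1
```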

Category–Section counts

Collaborative Filtering

Based on matrix factorization with Alternating Least Squares

One row per category and one column per section

Ratings defined as P(S1 | CAT1)

American_male_film_actors
Filmography: 0.59 | Career: 0.47 | Personal life: 0.38 | ...

American_film_producers
Filmography: 0.39 | Career: 0.34 | Personal life: 0.26 | ...

Living_people
Career: 0.25 | Personal life: 0.17 | Biography: 0.13 | ...

Cj ∈ Categories_of(Leonardo DiCaprio)

Leonardo DiCaprio
Career: 9.98 | Filmography: 9.51 | Personal life: 9.48 | Early life: 8.40 | Awards: 2.45

Merging phase

Learning2Rank: weighted sum

Categories_of(Leonardo DiCaprio) = {American_male_film_actors, American_film_producers, Living_people}
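
The merging phase can be sketched as a weighted sum of per-category section scores over all categories of the article. The per-category values below are the P(S | C) figures shown on the slides; the weights are placeholders set to 1.0 (in the talk they are learned with learning-to-rank), so the resulting scores will not match the values shown for Leonardo DiCaprio.

```python
from collections import defaultdict

def merge_recommendations(categories, category_scores, weights, k=10):
    """Weighted sum of section scores across an article's categories."""
    merged = defaultdict(float)
    for category in categories:
        w = weights.get(category, 1.0)
        for section, score in category_scores.get(category, {}).items():
            merged[section] += w * score
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

category_scores = {
    "American_male_film_actors": {"Filmography": 0.59, "Career": 0.47, "Personal life": 0.38},
    "American_film_producers": {"Filmography": 0.39, "Career": 0.34, "Personal life": 0.26},
    "Living_people": {"Career": 0.25, "Personal life": 0.17, "Biography": 0.13},
}

# Placeholder weights (assumption); the real ones are learned.
weights = {c: 1.0 for c in category_scores}

ranking = merge_recommendations(list(category_scores), category_scores, weights)
for section, score in ranking:
    print(section, round(score, 2))
```

Sections that are frequent in several of the article's categories accumulate score from each, which is why Career and Filmography rise above sections found in only one category.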

Evaluation

English Wikipedia - September 2017

5.5M articles

300K sections

Collaborative filtering

Precision < 0.2% | Recall < 1.5%

Cold start problem: on average 3.4 sections per article

Topic modeling

Precision@10 = 6% (upper bound 28%)

Recall@10 = 26% (upper bound 98%)
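
A minimal sketch of precision@k and recall@k for a single article, assuming the article's existing sections serve as ground truth. It also shows why the precision upper bound is well below 100%: an article with only 3 sections can contribute at most 3 hits among 10 recommendations. The section lists are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k=10):
    """Precision@k and recall@k of a ranked list against a ground-truth set."""
    top_k = recommended[:k]
    hits = sum(1 for section in top_k if section in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ranked recommendations and ground truth for one article.
recommended = [
    "Career", "Filmography", "Early life", "Awards", "Personal life",
    "References", "External links", "Biography", "Legacy", "Works",
]
relevant = {"Career", "Filmography", "Personal life"}

precision, recall = precision_recall_at_k(recommended, relevant, k=10)
print(precision, recall)
```

Even a perfect recommender scores only 30% precision@10 on this article, because the ground truth contains just three sections.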

Collaborative filtering

Automatic evaluation has limitations!

The testing set contains articles with the problem we want to solve

few sections | inconsistent | different syntax

Human evaluation

Wikipedia editors | Crowd-workers

Wikipedia editors: Precision@10 = 72%

Crowd-workers: Precision@10 = 81%

● Introduced the section recommendation problem

● Explored several methods using

○ features derived from the raw input article

○ Wikipedia’s category network

● Learned that the category network is key in offering useful recommendations

● Developed a methodology to prune the category network

https://github.com/epfl-dlab/structuring-wikipedia-articles

https://meta.wikimedia.org/wiki/Recommendation_API

tiziano.piccardi@epfl.ch

@tizianopiccardi

