Social People-Tagging vs. Social Bookmark-Tagging

DESCRIPTION

EKAW 2010 paper by Peyman Nasirifard, Sheila Kinsella, Krystian Samp, and Stefan Decker

Text of Social People-Tagging vs. Social Bookmark-Tagging

1. Social People-Tagging vs. Social Bookmark-Tagging
  Peyman Nasirifard, Sheila Kinsella, Krystian Samp, Stefan Decker
  Digital Enterprise Research Institute, www.deri.ie
  Copyright 2009 Digital Enterprise Research Institute. All rights reserved.

2. Bookmark-tagging and People-tagging
  [Illustration. Example tags shown: todo, nlp, friendly, music, research, technician]

3. Motivation
  • Understand better how people tag each other
  • A starting point for tag recommendation in frameworks based on people-tagging
    • Access control mechanisms
    • Information filtering mechanisms
  • We are especially interested in the subjectivity of tags

4. Main questions
  • How do tags differ for resources of different categories (person, event, country and city)?
  • How do tags for Wikipedia pages about persons differ from tags for friends?
  • How do tags differ with the age and gender of the taggee?

5. Data collection
  1. Bookmark tags
  • Wikipedia articles: Person, Event, Country, City

6. Data collection
  2. People tags
  • http://blog.* network of blog sites
  • .ca, .co.uk, .de, .fr
  • Google Translate to convert non-English to English

7. Dataset
  Source       Category   # Items    # Tags   # Unique
  Wikipedia    Person       4,031    75,548     14,346
  Wikipedia    Event        1,427     8,924      2,582
  Wikipedia    Country        638    13,002      3,200
  Wikipedia    City         1,137     4,703      1,907
  Blog sites   Friend       2,927    17,126     10,913

8. Top tags: Wikipedia articles
  Person       Event       Country     City
  wikipedia    history     wikipedia   travel
  people       war         history     wikipedia
  philosophy   wikipedia   travel      italy
  history      ww2         geography   germany
  wiki         politics    africa      history
  music        wiki        culture     london
  politics     military    wiki        uk
  art          battle      reference   wiki
  books        wwii        europe      places
  literature   iraq        country     england

9. Top tags: blog sites
  Columns: .de, .fr, .ca & .co.uk
  Tags, in row order across the three columns: music junkie art funny nice politics music live music life funny kind kk friend dear adorable funky intelligent love friendly pretty nice lovely sexy drawing cool love friendship sexy honest trustworthy love

10. Distribution of tags
  [Chart]

11. Subjectivity of tags
  • Top 100 tags for each category
  • 25 annotators each categorised 100 tags
    • Objective, e.g. london
    • Subjective, e.g. jealous
    • Uncategorised, e.g. abcxyz
  • Average inter-annotator agreement: 86%

12. [Chart: share of subjective, objective and uncategorised tags for Friend, Person, Country, City and Event]

13. Randomly selected tags
  • The previous slides looked at top tags, but what about long-tail tags?
  • We also asked annotators to categorise 100 randomly chosen tags from each group
  • Much higher rate of uncategorised tags (~3x)
  • Lower inter-annotator agreement (76%)
  • The meaning is less clear than for the top tags, so these tags are probably less useful for applications like information filtering

14. Linguistic categories
  • Automatic classification (WordNet)
  • Noun / verb / adjective / adverb / uncategorised

15. [Chart: share of Adjective, Adverb, Verb, Noun and Uncategorised tags]

16. Age and gender of taggees
  • Generated sets of tags corresponding to age brackets and genders
  • Removed tags that refer to a specific gender
  • Asked 10 participants whether they could predict age and gender
  • Results:
    • Differences between genders were not perceptible
    • Differences between younger and older taggees were perceptible (and tags for younger taggees were more subjective)

17. Conclusions
  • Subjectivity: articles of different categories are tagged similarly, but friends are assigned subjective tags more frequently
  • Consequence: frameworks built on person-tags will need to handle more potentially unreliable tags
    • Controlled vocabularies?
  • Future work: Twitter Lists as person annotations for information filtering
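
The slides report average inter-annotator agreement (86% for top tags, 76% for randomly selected tags) but do not say how it was computed. One common choice is the mean pairwise percent agreement, sketched below in Python. The function name and the toy labels are hypothetical and are not taken from the study.

```python
from itertools import combinations


def pairwise_agreement(labels_by_annotator):
    """Mean fraction of tags on which each pair of annotators chose the same label.

    Assumption: every annotator labelled the same tags in the same order.
    This is one possible agreement measure, not necessarily the authors'.
    """
    annotators = list(labels_by_annotator)
    if len(annotators) < 2:
        raise ValueError("need at least two annotators")

    scores = []
    for first, second in combinations(annotators, 2):
        labels_a = labels_by_annotator[first]
        labels_b = labels_by_annotator[second]
        if len(labels_a) != len(labels_b):
            raise ValueError("annotators must label the same set of tags")
        matches = sum(a == b for a, b in zip(labels_a, labels_b))
        scores.append(matches / len(labels_a))
    return sum(scores) / len(scores)


# Hypothetical toy data: three annotators, four tags each.
toy_labels = {
    "annotator_1": ["objective", "subjective", "uncategorised", "subjective"],
    "annotator_2": ["objective", "subjective", "subjective", "subjective"],
    "annotator_3": ["objective", "objective", "uncategorised", "subjective"],
}

agreement = pairwise_agreement(toy_labels)
print(f"average pairwise agreement: {agreement:.2f}")
```

Percent agreement does not correct for chance; a chance-corrected statistic such as Fleiss' kappa would be an alternative if the raw annotations were available.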
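
The dataset table (slide 7) also allows a quick check of how varied the tags are in each category. The Python sketch below derives two ratios from the reported counts: tags per item, and the share of unique tags among all tags. It assumes that "# Tags" counts every tag assignment and "# Unique" counts distinct tag strings; the slides do not define the columns, and these ratios are not reported by the authors.

```python
# Counts copied from the dataset slide: (items, tags, unique tags).
DATASET = {
    "Person": (4031, 75548, 14346),
    "Event": (1427, 8924, 2582),
    "Country": (638, 13002, 3200),
    "City": (1137, 4703, 1907),
    "Friend": (2927, 17126, 10913),
}


def summarise(items, tags, unique):
    """Return tags per item and the share of unique tags among all tags."""
    return {
        "tags_per_item": tags / items,
        "unique_ratio": unique / tags,
    }


summary = {category: summarise(*counts) for category, counts in DATASET.items()}

for category, stats in summary.items():
    print(
        f"{category:8s} tags/item={stats['tags_per_item']:6.2f} "
        f"unique/tags={stats['unique_ratio']:.2f}"
    )
```

Under these assumptions, roughly 64% of friend tags are unique, against about 19% to 41% for the Wikipedia categories, which fits the long-tail and subjectivity observations in the slides.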