Analysis of Social Tagging and Book Cataloging: A Case Study. Yi-Chen Chen 陳怡蓁, Dept. of Library & Information Science, National Taiwan University. HKLA 50th Anniversary Conference, Hong Kong, 5 November 2008


Page 1: Analysis of Social Tagging and Book Cataloging: A Case Study

Analysis of Social Tagging and Book Cataloging: A Case Study

Yi-Chen Chen 陳怡蓁

Dept. of Library & Information Science

National Taiwan University

HKLA 50th Anniversary Conference Hong Kong, 5 November 2008

Page 2: Analysis of Social Tagging and Book Cataloging: A Case Study

Outline

•Introduction
▫Background + Related Work

•Research Questions

•Data and Methodology

•Results

•Conclusions and Future Directions

Page 3: Analysis of Social Tagging and Book Cataloging: A Case Study

Background

•social tagging has grown in popularity on web-based services

•it is quite different from controlled-vocabulary indexing and authority-based cataloging (Mathes, 2004; Guy & Tonkin, 2006)

•the emergence of social tagging has begun to challenge traditional ways of organizing information

Page 4: Analysis of Social Tagging and Book Cataloging: A Case Study

Related Works…

•the difference between social tagging and traditional cataloging/indexing has been noted (Tennis, 2006), but very few studies have been conducted to verify it

•little research has examined how social tagging is applied to books
▫since books are already catalogued with library subject headings

Page 5: Analysis of Social Tagging and Book Cataloging: A Case Study

In this study…

Page 6: Analysis of Social Tagging and Book Cataloging: A Case Study

Objective & purposes

•discover the properties and functions of social tags attached to books

•compare user-created tags with authoritative subject headings

Page 7: Analysis of Social Tagging and Book Cataloging: A Case Study

Case study…

•A case study of LibraryThing’s tagging system
▫“an online service to help people catalog their books easily”
▫allows its members to add tags for their personal book collections

Page 8: Analysis of Social Tagging and Book Cataloging: A Case Study
Page 9: Analysis of Social Tagging and Book Cataloging: A Case Study

Research Questions

•How can tags be organized or classified into different function types?

•What kinds of function tags are most often used?

•How are tagging terms similar to or different from library subject headings?

Page 10: Analysis of Social Tagging and Book Cataloging: A Case Study

Two parts of our study

•Part 1: investigate the functions of tags and derive a classification based on the types of functions; explore which tag classes are more popular among users

•Part 2: compare social tags with the LCSH assigned to the same works and examine their overlaps and variations

Page 11: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Analysis of classifying tags and usage frequency

Page 12: Analysis of Social Tagging and Book Cataloging: A Case Study

Data collection

•LibraryThing (http://www.librarything.com) was chosen as our platform for data collection and analysis

•The data we used for this study was gathered from LibraryThing in July 2008.

Page 13: Analysis of Social Tagging and Book Cataloging: A Case Study

Data collection

•sample of books including both fiction and non-fiction works
▫randomly selected from the “Most often tagged fiction” and “Most often tagged non-fiction” book lists in LibraryThing
▫two criteria: (1) English books; (2) the corresponding catalog records should include LCSH

Page 14: Analysis of Social Tagging and Book Cataloging: A Case Study

Data collection

•total number of works = 50

•25 fiction + 25 non-fiction

Page 15: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Methodology

•extract the tagging data (tag cloud and tag frequency) from these 50 works

•only use the main page tag cloud for analysis
▫the tag cloud that appears on the main page for each work includes only the top-frequency tags

Page 16: Analysis of Social Tagging and Book Cataloging: A Case Study
Page 17: Analysis of Social Tagging and Book Cataloging: A Case Study

the tag frequency data was gathered from the tag cloud of each work; it indicates how many times each tag was used for that work

Page 18: Analysis of Social Tagging and Book Cataloging: A Case Study

•In total, there are 2,249 tags associated with the selected works, about 45 tags per work on average

Number of tags

              Total   Average (per work)
Fiction       1142    45.68
Non-fiction   1107    44.28
All works     2249    44.98
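
The per-work averages above follow directly from the totals and the 25 + 25 split of the sample. A minimal Python check, with variable names chosen here purely for illustration:

```python
# Totals as reported on the slide; the sample has 25 fiction + 25 non-fiction works.
tag_totals = {"Fiction": 1142, "Non-fiction": 1107}
works_per_group = 25

# Average number of tags per work within each group.
averages = {group: total / works_per_group for group, total in tag_totals.items()}

# Overall total and average across all 50 works.
all_total = sum(tag_totals.values())
all_average = all_total / (works_per_group * len(tag_totals))

for group, avg in averages.items():
    print(f"{group}: {avg:.2f} tags per work")
print(f"All works: {all_total} tags, {all_average:.2f} per work")
```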

Page 19: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Results(Ⅰ)

•For these 2,249 tags, we analyzed their function types and classified them.

•A classification framework for tags was created

Page 20: Analysis of Social Tagging and Book Cataloging: A Case Study

Class: Subclasses

1. Bibliographic description: Genre/Form, Author, Country of origin, Language, Edition, Variant formats, Audience

2. Subject-related: Character/People, Timeframe, Setting/Place, Topic, Subject area

3. Personal reference: Ownership, Reading Progress, Time, Task, Location

4. Opinion

5. Awards/Top list

6. Community

Page 21: Analysis of Social Tagging and Book Cataloging: A Case Study

1. Bibliographic description

describe physical attributes of the work and can give factual information about the book

•Genre/Form (e.g. “science fiction”)

•Author

•Country of origin

•Edition (e.g. “first edition”)

•Variant formats (e.g. “film”)

•Audience (e.g. “kids”)

Page 22: Analysis of Social Tagging and Book Cataloging: A Case Study

2. Subject-related

tags intended to reflect what a work is about; they deal with the content of the resource

•abstract and concrete concepts, things or objects, subject areas, character names, settings or place, timeframe of the story, themes of the document, topics and the like

Page 23: Analysis of Social Tagging and Book Cataloging: A Case Study

3. Personal reference

act as reminders to oneself, based on the tagger's personal context

•Ownership (e.g. “borrowed”)

•Reading Progress (e.g. “unread”, “tbr”)

•Time (e.g. “2007”)

•Task (e.g. “@work”, “textbook”)

•Location (e.g. “bookshelf”)

Page 24: Analysis of Social Tagging and Book Cataloging: A Case Study

4. Opinion

•users can subjectively express their feelings and opinions about the resource with tags

•reveal the reader’s value judgments and emotional reaction to a particular book (e.g. “favorite”, “interesting”)

Page 25: Analysis of Social Tagging and Book Cataloging: A Case Study

5. Awards/Top list

•a specific award or prize name (e.g. “Pulitzer prize”, “Nobel prize”)

•the top book list (e.g. “1001books”)

Page 26: Analysis of Social Tagging and Book Cataloging: A Case Study

6. Community

•users apply such tags to books that they wish to share or discuss with others

•convey the community meaning of the resource (e.g. “book club”)
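
The six classes described above could be held in a simple lookup structure. The sketch below is only an illustration: the slides do not say how tags were assigned to classes, so `FRAMEWORK`, `EXAMPLE_TAGS`, and `classify` are hypothetical names, and the only tags included are the examples quoted on the slides.

```python
# Classes and subclasses as listed in the classification framework.
FRAMEWORK = {
    "Bibliographic description": ["Genre/Form", "Author", "Country of origin",
                                  "Language", "Edition", "Variant formats", "Audience"],
    "Subject-related": ["Character/People", "Timeframe", "Setting/Place",
                        "Topic", "Subject area"],
    "Personal reference": ["Ownership", "Reading Progress", "Time", "Task", "Location"],
    "Opinion": [],
    "Awards/Top list": [],
    "Community": [],
}

# Example tags quoted on the slides, mapped to (class, subclass or None).
EXAMPLE_TAGS = {
    "science fiction": ("Bibliographic description", "Genre/Form"),
    "first edition": ("Bibliographic description", "Edition"),
    "film": ("Bibliographic description", "Variant formats"),
    "kids": ("Bibliographic description", "Audience"),
    "borrowed": ("Personal reference", "Ownership"),
    "unread": ("Personal reference", "Reading Progress"),
    "tbr": ("Personal reference", "Reading Progress"),
    "2007": ("Personal reference", "Time"),
    "@work": ("Personal reference", "Task"),
    "textbook": ("Personal reference", "Task"),
    "bookshelf": ("Personal reference", "Location"),
    "favorite": ("Opinion", None),
    "interesting": ("Opinion", None),
    "pulitzer prize": ("Awards/Top list", None),
    "nobel prize": ("Awards/Top list", None),
    "1001books": ("Awards/Top list", None),
    "book club": ("Community", None),
}

def classify(tag):
    """Return (class, subclass) for a known example tag, or None if unknown."""
    return EXAMPLE_TAGS.get(tag.strip().lower())

print(classify("TBR"))
print(classify("Pulitzer prize"))
```

A real classifier would need far more than a fixed table, since subject-related tags in particular vary widely from work to work.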

Page 27: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Results

•Distribution of number of tags

•Distribution of tag frequency

Page 29: Analysis of Social Tagging and Book Cataloging: A Case Study

Number of tags (all works)

Page 30: Analysis of Social Tagging and Book Cataloging: A Case Study

Bibliographic description

Genre/Form

Page 31: Analysis of Social Tagging and Book Cataloging: A Case Study

Personal reference

Reading Progress

Page 32: Analysis of Social Tagging and Book Cataloging: A Case Study

Number of tags (fiction)

Page 33: Analysis of Social Tagging and Book Cataloging: A Case Study

Number of tags (non-fiction)

Page 35: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Results

•Distribution of number of tags

•Distribution of tag frequency
▫investigate if certain tag classes are used more frequently than others

Page 36: Analysis of Social Tagging and Book Cataloging: A Case Study

Tag frequency (all works)

Page 37: Analysis of Social Tagging and Book Cataloging: A Case Study

Tag frequency (fiction)

Page 38: Analysis of Social Tagging and Book Cataloging: A Case Study

Tag frequency (non-fiction)

Page 39: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Results & Findings

•users are more likely to distinguish fiction works by their genre or form, and non-fiction works by their subject

Page 40: Analysis of Social Tagging and Book Cataloging: A Case Study

Number of tags vs. tag frequency?

Subject-related vs. Bibliographic description

Although Subject-related has the largest number of tags across all the works, its tag usage frequency is not as high as that of Bibliographic description.

Page 41: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 1: Results & Findings

•number of tags vs. tag usage frequency?

•subject matter can be divergent and expressed in a variety of words, so tag usage frequency for Subject-related is lower

•descriptions of bibliographic data, especially genre/form, often share common usage, resulting in clear convergence on the tagging terms
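
The distinction between the two measures can be made concrete with a small sketch. "Number of tags" counts distinct tags in a class, while "tag frequency" sums how many times those tags were used. The records below are invented for illustration and are not data from the study.

```python
from collections import defaultdict

# Hypothetical (tag, class, times used) records for one work; NOT study data.
records = [
    ("fiction", "Bibliographic description", 120),
    ("novel", "Bibliographic description", 80),
    ("japan", "Subject-related", 30),
    ("geisha", "Subject-related", 25),
    ("kyoto", "Subject-related", 10),
    ("world war ii", "Subject-related", 5),
]

number_of_tags = defaultdict(int)  # distinct tags per class
tag_frequency = defaultdict(int)   # total uses per class

for tag, tag_class, uses in records:
    number_of_tags[tag_class] += 1
    tag_frequency[tag_class] += uses

for tag_class in number_of_tags:
    print(tag_class, number_of_tags[tag_class], tag_frequency[tag_class])
```

In this made-up case Subject-related has more distinct tags (4 vs. 2) but far fewer total uses (70 vs. 200), which is the same pattern the slide describes: divergent wording spreads usage thinly, while convergent genre terms accumulate it.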

Page 42: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 2: Comparison of social tags and LCSH terms

Page 43: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 2: Methodology

•Dataset
▫the Bibliographic description tags and Subject-related tags (from Part 1)
▫the subject headings: the LCSH terms assigned to each selected work in the Library of Congress Online Catalog (http://catalog.loc.gov/webvoy.htm)

Page 44: Analysis of Social Tagging and Book Cataloging: A Case Study
Page 45: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 2: Methodology

•a subject heading string may combine a main heading with dash-subdivisions in complicated ways (e.g. Japan --History --20th century --Fiction)

•we separated each subject heading string into individual concept terms and excluded duplicate terms (e.g. Japan. History. 20th century. Fiction.)
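
The splitting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the study: the function names are hypothetical, the second heading in the usage example is invented, and treating duplicates case-insensitively is an assumption.

```python
import re

def split_heading(heading):
    """Split one LCSH string on its dash-subdivisions ("--")."""
    return [part.strip() for part in re.split(r"\s*--\s*", heading) if part.strip()]

def concept_terms(headings):
    """Collect concept terms across headings, dropping duplicates but keeping order."""
    seen = set()
    terms = []
    for heading in headings:
        for term in split_heading(heading):
            key = term.lower()  # assumption: duplicates are matched ignoring case
            if key not in seen:
                seen.add(key)
                terms.append(term)
    return terms

# First heading is the slide's example; the second is hypothetical.
headings = ["Japan --History --20th century --Fiction", "Geishas --Fiction"]
print(concept_terms(headings))
# ['Japan', 'History', '20th century', 'Fiction', 'Geishas']
```

The repeated "Fiction" subdivision is kept only once, which is why the number of concept terms per work can be smaller than the number of subdivisions across its headings.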

Page 46: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 2: Preliminary results

•1,759 tags (Bibliographic description and Subject-related tags)

•35.2 tags per work

•313 LCSH terms

•6.3 LCSH terms per work

Page 47: Analysis of Social Tagging and Book Cataloging: A Case Study

Rules of comparison

(1) tags and LCSH terms associated with the same work are compared term by term.

(2) an overlap is an exact or almost exact match in spelling; plural/singular forms and case variations count as matches.

(3) abbreviations and acronyms are considered the same as the full form of the term.

(4) prepositions, punctuation marks, and symbols are ignored.
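
One way to read the four rules above is as a normalization step applied to both sides before an exact comparison. The sketch below rests on assumptions that are not in the slides: the preposition list, the abbreviation table, the naive plural rule, and the sample tags and headings are all hypothetical, and the study may well have matched terms by hand.

```python
import re

# Assumptions for illustration only: a short preposition list, a hypothetical
# abbreviation table, and a naive plural rule (strip one trailing "s").
PREPOSITIONS = {"of", "in", "on", "for", "to", "by", "with", "from", "at"}
ABBREVIATIONS = {"wwii": "world war ii", "sf": "science fiction"}

def singular(word):
    """Naively reduce a plural to its singular form."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def normalize(term):
    term = term.lower()                    # rule 2: ignore case variations
    term = ABBREVIATIONS.get(term, term)   # rule 3: expand abbreviations/acronyms
    term = re.sub(r"[^\w\s]", " ", term)   # rule 4: ignore punctuation and symbols
    words = [w for w in term.split() if w not in PREPOSITIONS]  # rule 4: prepositions
    return " ".join(singular(w) for w in words)  # rule 2: plural/singular

def overlap(tags, lcsh_terms):
    """Return the tags matching any LCSH term of the same work (rule 1)."""
    targets = {normalize(t) for t in lcsh_terms}
    return [tag for tag in tags if normalize(tag) in targets]

# Hypothetical tags and LCSH concept terms for a single work.
tags = ["geisha", "Japan", "historical fiction", "WWII"]
lcsh = ["Geishas", "Japan", "Fiction", "World War II"]
print(overlap(tags, lcsh))  # ['geisha', 'Japan', 'WWII']
```

Under this reading "historical fiction" does not match "Fiction", because the rules ask for an exact or almost exact match of the whole term rather than a partial one.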

Page 48: Analysis of Social Tagging and Book Cataloging: A Case Study

Overlaps between tags and LCSH

Overlapped tags

tags not covered in SH (Non-overlap)

Page 49: Analysis of Social Tagging and Book Cataloging: A Case Study

All works: 10.8% (overlap)

Page 50: Analysis of Social Tagging and Book Cataloging: A Case Study

Fiction: 10.2% (overlap)

Page 51: Analysis of Social Tagging and Book Cataloging: A Case Study

Non-fiction: 11.5% (overlap)

Page 52: Analysis of Social Tagging and Book Cataloging: A Case Study

Tags vs. LCSH terms

•how do the remaining ~90% of tags, which do not overlap, differ from the LCSH terms?
▫they are not reflected in library subject headings

Page 53: Analysis of Social Tagging and Book Cataloging: A Case Study

Compared with LCSH, tags …

•give more genre/form information

•describe more character names in the content of books

•use simpler and more informal forms for personal names, geographic names, and timeframes

Examples:

Tag: “classic fiction”, “Thriller”, “historical fiction”
LCSH: “Fiction”

Tag: “da vinci”
LCSH: “Leonardo, da Vinci, 1452-1519”

Tag: “Delft”
LCSH: “Delft (Netherlands)”

Tag: “1920s”
LCSH: “Alfonso XIII, 1886-1931”

Page 54: Analysis of Social Tagging and Book Cataloging: A Case Study

the syntax of LCSH…

•multi-concept phrases (e.g. “Good and evil”)

•qualifiers added to subject headings to distinguish between homographs or to avoid ambiguity

•inverted headings (e.g. “Cookery, French”)

These forms are rarely used in social tagging

Page 55: Analysis of Social Tagging and Book Cataloging: A Case Study

Non-overlapped tags (three types)

(1) terms with identical meanings, but different words or different grammatical forms

(2) variations of broader terms, narrower terms, and related terms

(3) terms expressing extra concepts

different ways of representing the concepts and the semantic relationships among terms

the different interpretation in subject analysis between users and library catalogers

Page 56: Analysis of Social Tagging and Book Cataloging: A Case Study

Part 2: Findings

•Compared to the subject headings, these non-overlapped tags appear to be more exhaustive of the topics covered in a resource

• the non-overlapped tags, especially those terms with extra concepts, describe more themes or topics covered by the content of a book

Page 57: Analysis of Social Tagging and Book Cataloging: A Case Study

Conclusions

Page 58: Analysis of Social Tagging and Book Cataloging: A Case Study

Reorganizing and classifying tags

•our classification framework is intended to unveil the functions of tags applied to books

•understand how a book/item is described and identified by users in bibliographic records

•reorganize tags into classes to improve the user experience of searching and browsing

Page 59: Analysis of Social Tagging and Book Cataloging: A Case Study

Subject cataloging and social tags

•the degree of overlap between tags and LCSH is relatively low

•tags provide a richer description of the book’s subject matter

•higher exhaustivity

•tags can supplement existing controlled vocabularies in subject description

Page 60: Analysis of Social Tagging and Book Cataloging: A Case Study

Future Directions

•the classification framework for tags needs further evaluation to establish its usefulness and applicability to book cataloging

•the rules of comparison between tags and LCSH still need to be discussed

•study the overlapped and non-overlapped subject terms more comprehensively

•semantic issue…

Page 61: Analysis of Social Tagging and Book Cataloging: A Case Study

Thank you!

Comments & Suggestions

Page 62: Analysis of Social Tagging and Book Cataloging: A Case Study
Page 63: Analysis of Social Tagging and Book Cataloging: A Case Study

Acknowledgments

•The author would like to thank Dr. Muh-Chyun Tang and Dr. Kuang-hua Chen for their helpful comments.

Page 64: Analysis of Social Tagging and Book Cataloging: A Case Study

References

• Begelman, G., Keller, P. and Smadja, F. (2006). Automated Tag Clustering: Improving search and exploration in the tag space. Paper presented at the WWW2006 Collaborative Tagging Workshop. Available online: http://www.pui.ch/phred/automated_tag_clustering/

• Brooks, C. H. & Montanez, N. (2006, May). Improved annotation of the blogosphere via autotagging and hierarchical clustering. In Proceedings of the 15th International Conference on World Wide Web (pp. 625-632). New York: ACM Press.

• Furner, J. (2007). User tagging of library resources: Toward a framework for system evaluation. Paper presented at the 157 Classification and Indexing. Available online: http://www.ifla.org/iv/ifla73/index.htm.

• Golder, S. A., & Huberman, B. A. (2006). Usage patterns of collaborative tagging systems. Journal of Information Science, 32(2), 198-208.

• Guy, M., & Tonkin, E. (2006). “Folksonomies: Tidying up Tags?” D-Lib Magazine, 12(1).

• Heckner, M., Mühlbacher, S., and Wolff, C. (2008). Tagging tagging. Analysing user keywords in scientific bibliography management systems. Journal of Digital Information, 9(27).

• Kipp, M. E. & Campbell, D. G. (2006, November). Patterns and Inconsistencies in Collaborative Tagging Practices: An Examination of Tagging Practices. Paper presented at Proceedings of the Annual General Meeting of the American Society for Information Science and Technology, Austin, TX.

Page 65: Analysis of Social Tagging and Book Cataloging: A Case Study

• Kipp, M. E. (2006). Complementary or Discrete Contexts in Online Indexing: A Comparison of User, Creator, and Intermediary Keywords. Canadian Journal of Information and Library Science, 30(3). Retrieved June 18, 2008, from: http://dlist.sir.arizona.edu/1533/

• Kipp, M. E. (2007). @toread and Cool: Tagging for Time, Task and Emotion. In Proceedings 8th Information Architecture Summit, Las Vegas, Nevada, USA.

• Macgregor, G. & McCulloch, E. (2006). Collaborative tagging as a knowledge organisation and resource discovery tool. Library Review, 55(5), 291–300.

• Mai, J.-E. (2005). Analysis in indexing: Document and domain centered approaches. Information Processing & Management, 41, 599-611.

• Marlow, C., Naaman, M., Boyd, D., and Davis, M. (2006). HT06, tagging paper, taxonomy, Flickr, academic article, to read. In: U.K. Wiil et al. (eds), Proceedings of the 17th Conference on Hypertext and Hypermedia (pp. 31–39). ACM, New York.

• Mathes, A. (2004). Folksonomies - Cooperative Classification and Communication Through Shared Metadata.

• Olson, H. A., & Boll, J. J. (2001). Subject Analysis in Online Catalogs. Englewood, Colo: Libraries Unlimited.

• Quintarelli, E., Resmini, A., and Rosati L. (2007). FaceTag: integrating bottom-up and top-down classification in a social tagging system. Paper presented at International IA Summit 2007, Las Vegas, Nevada, United States. Available online: http://www.facetag.org/download/facetag-20070325.pdf

Page 66: Analysis of Social Tagging and Book Cataloging: A Case Study

• Sauperl, A. (2004). Catalogers’ common ground and shared knowledge. Journal of the American Society for Information Science and Technology, 55, 55-63.

• Sen, S., Lam, S. K., Rashid, A. M., Cosley, D., Frankowski, D., Osterhouse, J., Harper, F. M., and Riedl, J. (2006). Tagging, communities, vocabulary, evolution. In Proceedings of CSCW 2006, Banff, Alberta, Canada.

• Shirky, Clay (2005). Ontology is Overrated: Categories, Links, and Tags. Retrieved 16 November 2007, from: http://shirky.com/writings/ontology_overrated.html

• Sinclair, J. & Cardew-Hall, M. (2008). The folksonomy tag cloud: when is it useful? Journal of Information Science, 34 (1), 15–29.

• Spiteri, L. F. (2007). Structure and form of folksonomy tags: The road to the public library catalogue. Webology, 4(2). Retrieved June 18, 2008, from: http://www.webology.ir/2007/v4n2/a41.html

• Tennis, J. (2006). Social tagging and the next steps for indexing. Paper presented at the 17th Annual SIG/CR Classification Research Workshop, Austin, TX.

• Voß, J. (2007). Tagging, Folksonomy & Co - Renaissance of Manual Indexing? Retrieved December 19, 2007, from: http://arxiv.org/abs/cs/0701072v2

• Weinberger, D. (2005). Tagging and Why It Matters. Retrieved November 7, 2007, from: http://cyber.law.harvard.edu/home/uploads/507/07-WhyTaggingMatters.