Authority control, new library standards, and the Semantic Web
Gordon Dunsire
Presented to the Authority Control Interest Group (ACIG) meeting, ALA Annual, New Orleans, 26 June 2011
Overview
RDA, FRBR, FRAD and authority control
Extending authority control concepts to data linking
Linked data and the Semantic Web
RDA implementation scenario 1: Relational/object-oriented database structure
FRBR
FRAD
Title: Cataloguing is fun!
Author: Mary MacDonald
Content type: text
Media type: microform
LCSH: Cataloging
Bibliographic record: 12345
Name authority record: 8765
Heading: MacDonald, Mary
Place of birth: Edinburgh
LCSH authority record: 5432
Heading: Cataloging
See also: Books
RDA content type record: 1234
Term: text
Definition: Content expressed through a form of notation for language intended to be perceived visually.
ISBD media type record: 5432
Term: microform
Definition: Media used to store reduced-size images, not readable to the human eye, and designed for use with a device such as a microfilm or microfiche reader.
Record identifiers shown in the diagram: 8765, 5432, 1234, 5432, 9876, 65443
Bibliographic record: 12345
Title: Cataloguing is fun!
Author: 8765
Content type: 1234
Media type: 5432
LCSH: 5432
Name authority record: 8765
Heading: MacDonald, Mary
Place of birth: 9876
12345 Author 8765
8765 Place of birth 9876
8765 Heading “MacDonald, Mary”
9876 Name “Edinburgh”
9876 Country 4567
Stop! Ambiguous: link not safe.
Identifier: ok to link.
8765 Heading “MacDonald, Mary”
12345 Author 8765 Place of birth 9876 Name “Edinburgh”
12345 Author 8765 Place of birth 9876 Country 4567
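The chains above can be sketched in a few lines of Python. This is only an illustration, not something shown in the slides: the statements are held as plain (subject, predicate, object) tuples, record numbers are treated as strings, and the `follow` helper is a hypothetical name introduced here.

```python
# Statements from the slides, stored as (subject, predicate, object) tuples.
# Record numbers are kept as strings; quoted labels are plain literals.
TRIPLES = [
    ("12345", "Author", "8765"),
    ("8765", "Heading", "MacDonald, Mary"),
    ("8765", "Place of birth", "9876"),
    ("9876", "Name", "Edinburgh"),
    ("9876", "Country", "4567"),
]


def follow(start, path, triples=TRIPLES):
    """Follow a chain of predicates from a starting identifier.

    Returns the final object, or None if any link in the chain is missing.
    """
    current = start
    for predicate in path:
        matches = [o for s, p, o in triples if s == current and p == predicate]
        if not matches:
            return None
        current = matches[0]  # assumes at most one value per predicate
    return current


# Bibliographic record -> author -> place of birth -> name of the place.
print(follow("12345", ["Author", "Place of birth", "Name"]))
```

Each hop uses an identifier as the join point, which is why the chain can be followed without guessing which “MacDonald, Mary” is meant.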
Linked data is not a new idea!
It extends concepts of authority control
“Preferred” labels
Create/maintain once; link many times
Re-use of metadata
More than one “attribute” associated with a “heading”
E.g. Place of birth of person with name heading
Concepts can be applied to authority records
As well as bibliographic description records
Full extension leads to “record” dis-aggregation
All “records” in bibliographic control systems
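As a rough illustration of “record” dis-aggregation, the sketch below splits one flat record into one statement per field. The `disaggregate` function is a hypothetical helper, and the field-to-identifier mapping is inferred from the record numbers in the earlier slides (8765 for the name authority record, 1234 for the content type record, 5432 for the media type and LCSH records).

```python
def disaggregate(record_id, fields):
    """Split one flat record into one (subject, predicate, object) statement per field."""
    return [(record_id, name, value) for name, value in fields.items()]


# Bibliographic record 12345, with values replaced by record identifiers
# where the slides give one. The mapping is an inference, not stated outright.
bib_record = {
    "Title": "Cataloguing is fun!",
    "Author": "8765",
    "Content type": "1234",
    "Media type": "5432",
    "LCSH": "5432",
}

statements = disaggregate("12345", bib_record)
for statement in statements:
    print(statement)
```

Once the record is a set of independent statements, each one can be created, maintained, and linked to on its own.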
Linked data and RDF
Resource Description Framework (RDF)
Designed for machine-processing of metadata at global scale (Semantic Web)
24/7/365
Trillions of operations per second
Everything must be dis-ambiguated
Machines are dumb
Simplicity helps!
Machine-readable identifiers
RDF triple
Metadata expressed as “atomic” statements
A simple, single, irreducible statement
The title of this book is “Cataloguing is fun!”
Constructed in 3 parts
“Triple”
The title of this book is “Cataloguing is fun!”
Subject of the statement = Subject: This book
Nature of the statement = Predicate: has title
Value of the statement = Object: “Cataloguing is fun!”
This book – has title – “Cataloguing is fun!”
subject – predicate – object
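The three-part structure can be written down directly as a small data type. This is a minimal sketch using human-readable labels only; the `Triple` class is a name introduced here for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic metadata statement in three parts."""
    subject: str
    predicate: str
    object: str


statement = Triple(
    subject="This book",
    predicate="has title",
    object="Cataloguing is fun!",
)

# Each part can be read back by name or unpacked by position.
subject, predicate, obj = statement
print(f"{subject} | {predicate} | {obj}")
```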
Identifiers
Need unambiguous way of identifying each part of the triple for efficient machine-processing
Human labels (“This book”, “has title”) no good
Same thing, different labels; different things, same label
Exploit the utility of the URL
Machine-readable, regular syntax, unambiguous
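A small sketch of why labels are unsafe as link targets. Only identifier 8765 and the heading “MacDonald, Mary” come from the slides; the second person (8766) and the variant heading are hypothetical, added to show both kinds of ambiguity. The function names are also assumptions.

```python
from collections import defaultdict

# (identifier, label) pairs.
HEADINGS = [
    ("8765", "MacDonald, Mary"),
    ("8766", "MacDonald, Mary"),  # hypothetical: a different person, same label
    ("8765", "Macdonald, M."),    # hypothetical: same person, different label
]


def ids_by_label(pairs):
    """Index identifiers by the label attached to them."""
    index = defaultdict(set)
    for identifier, label in pairs:
        index[label].add(identifier)
    return index


def safe_to_link(label, pairs=HEADINGS):
    """A label is safe to link on only if it points to exactly one identifier."""
    return len(ids_by_label(pairs)[label]) == 1


print(safe_to_link("MacDonald, Mary"))  # two identifiers share this label
print(safe_to_link("Macdonald, M."))    # only one identifier carries this label
```

Linking on the identifier itself avoids the check altogether, which is the point of the slide.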
Uniform Resource Identifier (URI)
Uniform Resource Identifier
Can be any unique combination of numbers and letters
No intrinsic meaning; it’s just an identifier
Can look like a URL
http://iflastandards.info/ns/isbd/elements/P1001
But does not lead to a Web page (in principle ...)
RDF requires the subject and predicate of a triple to be URIs
Object can be a URI, or a literal string (“Cataloguing is fun!”)
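The rule about which parts must be URIs can be sketched as a simple check. This is a simplification under stated assumptions: `is_uri` only accepts URL-like strings with a scheme and a host, which is narrower than what counts as a URI, and the `example.org` URIs are hypothetical. The ISBD URI is the one quoted on the slide and is used only to exercise the check.

```python
from urllib.parse import urlparse


def is_uri(value):
    """Rough check: treat a string with a scheme and a network location as a URI."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


def is_valid_triple(subject, predicate, obj):
    """Subject and predicate must be URIs; the object may be a URI or a literal string."""
    return is_uri(subject) and is_uri(predicate) and isinstance(obj, str)


# Hypothetical URIs, invented for this illustration.
BOOK = "http://example.org/book/12345"
HAS_TITLE = "http://example.org/terms/hasTitle"

print(is_uri("http://iflastandards.info/ns/isbd/elements/P1001"))
print(is_valid_triple(BOOK, HAS_TITLE, "Cataloguing is fun!"))
print(is_valid_triple("This book", "has title", "Cataloguing is fun!"))
```

The last call fails the check because human labels stand in for the subject and predicate.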
RDF properties
Predicates are called properties in RDF
“Verbal” part of the metadata statement
E.g. “A has author B”, “B has heading ...”
Properties link specific instances of two things
A = a specific book, B = a specific person
... = a specific label, character string, annotation => a “literal”
Properties are the links in linked data, the pathways through the Semantic Web to human-readable metadata
Labels, global identifiers, linked data
Headings can be managed in the same way as other controlled vocabularies
They are all RDF labels
Global identifiers (URIs) and RDF allow distributed authority control
But without need to copy and maintain in local systems
Different labels for the same thing can be linked, and a chain can link a label to a resource
It’s all linked data ...
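A minimal sketch of a chain from a label to a resource. All URIs here are hypothetical, as is the variant label “Cataloguing”; only the heading “Cataloging” appears in the slides. Two labels are attached to one identifier, and a book is linked to that identifier rather than to either label.

```python
# Hypothetical URIs, invented for this illustration.
CONCEPT = "http://example.org/subject/5432"
BOOK = "http://example.org/book/12345"
LABEL = "http://example.org/terms/label"
SUBJECT = "http://example.org/terms/subject"

TRIPLES = [
    (CONCEPT, LABEL, "Cataloging"),
    (CONCEPT, LABEL, "Cataloguing"),  # assumed variant label for the same concept
    (BOOK, SUBJECT, CONCEPT),
]


def resources_for_label(label, triples=TRIPLES):
    """Chain: label -> things carrying that label -> resources linked to those things."""
    things = {s for s, p, o in triples if p == LABEL and o == label}
    return sorted(s for s, p, o in triples if p == SUBJECT and o in things)


print(resources_for_label("Cataloging"))
print(resources_for_label("Cataloguing"))
```

Both labels lead to the same book because the link is made on the identifier, so neither label has to be copied into the bibliographic statement.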