Libraries, Standards, and the Web Where to Aim Our Eyes and Attention


Slides prepared for a guest appearance at Jane Greenberg's metadata class at the University of North Carolina, Chapel Hill. Delivered Monday, Dec. 6, 2010.


Page 1: UNC visit

Libraries, Standards, and the Web

Where to Aim Our Eyes and Attention

Page 2: UNC visit

What We’ll Talk About

Part 1: What’s up with Library Standards?
▪ Why is there so much confusion?
▪ How will we move forward?
▪ Where’s the leadership? The participation?

Part 2: The RDA Vocabs, a case study
▪ What can we learn from this?
▪ Can we replicate this model? How?
▪ Stepping up to the plate …

UNC Visit, 12/6/10 2

Page 3: UNC visit

Describing Library Standards

Library standards can be categorized to help us understand how they fit together

In general, functional categories are the most useful

Sometimes what the standards-makers assert isn’t the whole story …


Page 4: UNC visit

What kind of standards?

Data Structure Standards (element sets; schemes, schemas, or schemata) (examples: Dublin Core, MODS, CDWA, VRA, IEEE-LOM, RDA Vocabularies)

Data Content Standards (cataloging rules, input standards, best practice guides) (examples: AACR2, CCO, CDPDCMBP, RDA Text)

Data Value Standards (controlled vocabularies, encoding schemes) (examples: LCSH, AAT, TGN, LCTGM, ULAN, W3CDTF, DCMIType)

Data Format / Technical Interchange Standards (encoding standards for machine processing and interchange) (examples: XML, SGML, MARC)

Data Presentation Standards (examples: ISBD punctuation, CSS and/or XSLT for display, OPAC display settings) 


Page 5: UNC visit

Questions to Ponder

How can MODS (a derivative of MARC) be a Data Structure Standard if MARC isn’t?

Should MARC be categorized along with SGML (largely used for text markup)?
▪ MARC’s data model started out as the catalog card

What are the most important criteria for this categorization?

Page 6: UNC visit

Tectonic Shifts Ahead …

LITA surveyed its membership about their thinking about standards

LITA has a task force (TF) working on some recommendations aimed at increasing participation by librarians in standards activities
▪ NISO/ISO has traditionally been LITA’s focus in the standards world; that is likely to broaden
▪ ISO increasingly seen as problematic because of their business model
▪ Many web standards coming out of W3C

Page 7: UNC visit

Librarians & Standards

• The LITA survey confirms that lack of participation by librarians has to do with:
▪ institutional support (time, money)
▪ individual feelings of competence

• The changes in the environment challenge us all to re-think how we relate to standards
▪ … as well as who we think should be in charge of building them

Page 8: UNC visit

RDA Vocabularies

The real story

Where is this taking us?

Join the party!


Page 9: UNC visit

A Brief History

It all started in London, the last day of April 2007 …


Page 10: UNC visit

What Was Accomplished

The participants agreed that DCMI and the Joint Steering Committee for the Development of RDA should work together to:
▪ Develop an RDA Element Vocabulary
▪ Expose RDA Value Vocabularies
▪ Develop an RDA Application Profile, based on FRBR and FRAD

The first two are largely complete; the third is started

Page 11: UNC visit

The General Strategy

We used the Semantic Web as our “mental model”

Wanted to create a “bridge” between XML and RDF to support innovation in the library community as a whole, not just those at the cutting edge or the trailing edge

We registered the FRBR entities as classes in a ‘FRBR in RDA’ vocabulary, to enable specific relationships between RDA properties and FRBR

IFLA has followed suit using the Open Metadata Registry to add the ‘official’ FRBR entities, FRAD, and ISBD

This provides exciting opportunities to relate all the vocabularies together


Page 12: UNC visit

Structure: Rationale & Decisions

Property and value vocabularies registered on the Open Metadata Registry (formerly the NSDL Registry): http://metadataregistry.org/rdabrowse.htm

Used RDF Schema (RDFS), Simple Knowledge Organisation System (SKOS) and Web Ontology Language (OWL)

Registry provides human and machine usable interfaces

All vocabularies have change history and versioning capabilities

Page 13: UNC visit

Our Methodology

Started with the Entity Relationship Diagrams produced by ALA Publishing

The latest iterations are available on the RDA Toolkit Site (http://www.rdatoolkit.org/background)

ERDs are organized in three groups: core, enhanced, special

These were developed and iterated with no change management strategy, so each new iteration had to be checked carefully to spot changes

ERDs built with a very XML view of the world

Page 14: UNC visit

The Basic ‘WHY’?

We think of the ‘generalized’ RDA properties as the real RDA vocabulary

The ‘bounded’ properties should be seen as the first pass at an Application Profile

Extensions can be built more usefully from the generalized properties

Mapping will be cleaner using the generalized properties (since most properties mapped to or mapped from will not be based on FRBR)

Generalized properties much more acceptable to non-library implementers (not often using FRBR)

Page 15: UNC visit

Taking a Look …

FRBR in RDA Vocabulary declared as classes

RDA Properties declared as a ‘generalized’ vocabulary, with no explicit relationship to FRBR entities

Subproperties for the generalized elements may be explicitly related to FRBR entities (using ‘domain’)

Label/Name includes (Work) or other class to provide unique name (unless the entity name already appears in the name of the property)
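As a rough illustration of the pattern described on this slide, the declarations can be written as plain subject/predicate/object triples. This is a minimal sketch and not the registered vocabulary itself: the prefixes and the generalized name `rda:bookFormat` are assumptions made for the example, and only `bookFormatManifestation` is taken from a URL shown elsewhere in these slides.

```python
# Minimal sketch: a vocabulary held as (subject, predicate, object) triples.
# Prefixes and the name "rda:bookFormat" are illustrative assumptions.
vocabulary = {
    # FRBR entity declared as a class
    ("frbr:Manifestation", "rdf:type", "rdfs:Class"),
    # Generalized property: no domain, so no tie to any FRBR entity
    ("rda:bookFormat", "rdf:type", "rdf:Property"),
    # Bounded subproperty: related to one FRBR entity through 'domain'
    ("rda:bookFormatManifestation", "rdf:type", "rdf:Property"),
    ("rda:bookFormatManifestation", "rdfs:subPropertyOf", "rda:bookFormat"),
    ("rda:bookFormatManifestation", "rdfs:domain", "frbr:Manifestation"),
}


def objects(subject, predicate, triples):
    """Return every object asserted for a subject/predicate pair, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def domain_of(prop, triples):
    """Return the declared domain of a property, or None if it has none."""
    found = objects(prop, "rdfs:domain", triples)
    return found[0] if found else None


print(domain_of("rda:bookFormat", vocabulary))               # None
print(domain_of("rda:bookFormatManifestation", vocabulary))  # frbr:Manifestation
```

The generalized property carries no domain at all, which is what lets implementers outside the FRBR model use it without inheriting FRBR commitments.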

Page 16: UNC visit

The Simple Case: One Property, One FRBR Entity

[Diagram. Boxes: Property (generalized, no FRBR relationship); Subproperty (with relationship to one FRBR entity); FRBR Entity. Region labels: Semantic Web; Library Applications.]

Page 17: UNC visit

The Simple Case: One Property, One FRBR Entity

[Diagram. Boxes: Book format; Book format (Manifestation); Manifestation. Region labels: Semantic Web; Library Applications.]

Page 18: UNC visit


http://RDVocab.info/Elements/bookFormatManifestation

Page 19: UNC visit

The Not-So-Simple Case: One Property, More than One FRBR Entity

[Diagram. Boxes: Property (generalized, no FRBR relationship); two Subproperty boxes (each with relationship to one FRBR entity); two FRBR Entity boxes. Region labels: Semantic Web; Library Applications.]

Page 20: UNC visit

The Not-So-Simple Case: One Property, More than One FRBR Entity

[Diagram. Boxes: Extent; Extent (Item); Extent (Manifestation); FRBR Item; FRBR Manifestation. Region labels: Semantic Web; Library Applications.]
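One way to picture the Extent case is as a two-way lookup between FRBR entities and bounded properties. The sketch below is only an illustration: the property names `extent`, `extentItem` and `extentManifestation` are hypothetical stand-ins for the registered properties.

```python
# Hypothetical names for the generalized and bounded Extent properties.
GENERALIZED_EXTENT = "extent"
BOUNDED_EXTENT = {
    "Item": "extentItem",
    "Manifestation": "extentManifestation",
}


def property_for(entity):
    """Pick the bounded Extent property for a FRBR entity.

    Falls back to the generalized property when the describer is not
    working with a FRBR entity that has a bounded version.
    """
    return BOUNDED_EXTENT.get(entity, GENERALIZED_EXTENT)


def entity_for(prop):
    """Return the FRBR entity implied by a bounded property, else None."""
    for entity, bounded in BOUNDED_EXTENT.items():
        if bounded == prop:
            return entity
    return None


print(property_for("Item"))               # extentItem
print(property_for(None))                 # extent
print(entity_for("extentManifestation"))  # Manifestation
```

A library application would typically work from the entity to the bounded property, while a consumer reading the data can go the other way and learn which entity a statement is about.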


Page 22: UNC visit

Roles: Attributes or Properties?

In 2005, the DC Usage Board worked with LC to build a formal representation of the MARC Relators so that these terms could be used with DC

This work provided a template for the registration of the role terms in RDA (in Appendix I) and, by extension, the other RDA relationships

Role and relationship properties are registered at the same level as elements, rather than as attributes (as MARC does with relators, and RDA does in its XML)

Page 23: UNC visit

The Roles Case: Properties, Subproperties and FRBR Entities

[Diagram. Boxes: “Super” Property; Subproperty (with relationship to one FRBR entity); Subproperty (generalized, no FRBR relationship); FRBR Entity. Region labels: Semantic Web; Library Applications; Mapping, Etc.]

Page 24: UNC visit

The Roles Case: Properties, Subproperties and FRBR Entities

[Diagram. Boxes: RDA:Creator; RDArole:Composer (Work); RDArole:Composer; Work. Region labels: Semantic Web; Library Applications; Mapping, Etc.]
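The roles case adds a second level to the hierarchy. Assuming the bounded role is a subproperty of the generalized role, which in turn sits under the “super” property (the diagram's arrow directions are not recoverable here, so this ordering is an assumption), the chain can be walked upward like this. The identifiers are hypothetical short forms of the labels on the slide.

```python
# Assumed hierarchy, child -> parent. Identifiers are illustrative only.
SUBPROPERTY_OF = {
    "rdarole:composerWork": "rdarole:composer",
    "rdarole:composer": "rda:creator",
}


def ancestors(prop):
    """Return the chain of broader properties above prop, nearest first."""
    chain = []
    while prop in SUBPROPERTY_OF:
        prop = SUBPROPERTY_OF[prop]
        chain.append(prop)
    return chain


print(ancestors("rdarole:composerWork"))
# ['rdarole:composer', 'rda:creator']
```

Under this reading, a statement made with the most specific role can still be understood by anything that knows only the broader property, which is what makes mapping to other vocabularies practical.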

Page 25: UNC visit

Aggregated Statements

RDA sets up Publication, Distribution, Manufacture and Production statements very much the way they have been done since catalog card days:
▪ Assumed aggregations of Place, Name and Date are obvious leftovers from catalog cards, and are not necessary to enable indexing or display of those elements together if libraries want to do that

We viewed those aggregations as ‘Syntax Encoding Schemes’ and built in ways to accommodate them within the bounded properties
▪ Those using the generalized properties (outside libraries, usually) need not be constrained by these traditional aggregations of properties
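To make the ‘Syntax Encoding Scheme’ idea concrete, here is a small sketch in which Place, Name and Date stay separate values and the aggregated statement is just one string pattern applied on demand. The ISBD-style pattern `Place : Name, Date` is an assumption chosen for the example, not a pattern taken from the registered vocabularies.

```python
def publication_statement(place, name, date):
    """Aggregate three separate values into one ISBD-style string."""
    return f"{place} : {name}, {date}"


def split_publication_statement(text):
    """Recover the separate values from a string in the same pattern."""
    place, rest = text.split(" : ", 1)
    name, date = rest.rsplit(", ", 1)
    return {"place": place, "name": name, "date": date}


statement = publication_statement("New York", "Delacorte Press", "1987")
print(statement)  # New York : Delacorte Press, 1987
print(split_publication_statement(statement))
# {'place': 'New York', 'name': 'Delacorte Press', 'date': '1987'}
```

Because the aggregation is only a formatting rule, data held as separate properties loses nothing: the combined display can be produced whenever a library wants it.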

Page 26: UNC visit

What Does This Structure Buy Us?

Release from the tyranny of records

Potential for use with a variety of encodings

Opportunity to re-think how we build and share data

Potential for sharing data beyond the library silo

A challenge to our old notions of what library data can do and should be doing

Page 27: UNC visit

Statements on the Floor?

ID=23456 hasAuthor “Kurt Vonnegut”
ID=23456 hasPreferredTitle “Bluebeard, a novel”
ID=23456 isFormOfWork “Novel”
ID=23456 hasOriginalLanguage “English”
ID=23456 hasPlaceOfPublication “New York”
ID=23456 hasStatementOfEdition “1st trade edition”
ID=23456 hasLanguageOfExpression “English”
ID=23456 hasPublisher “Delacorte Press”
ID=23456 hasPublicationDate “1987”

Page 28: UNC visit

Is This Really Chaos?

ID=23456 hasAuthor “Kurt Vonnegut”
ID=23456 hasPreferredTitle “Bluebeard, a novel”
ID=23456 isFormOfWork “Novel”
ID=23456 hasOriginalLanguage “English”
ID=23456 hasPlaceOfPublication “New York”
ID=23456 hasStatementOfEdition “1st trade edition”
ID=23456 hasLanguageOfExpression “English”
ID=23456 hasPublisher “Delacorte Press”
ID=23456 hasPublicationDate “1987”

Page 29: UNC visit

Or Just an Aggregation in the Making?

ID=23456 hasAuthor “Kurt Vonnegut”
ID=23456 hasPreferredTitle “Bluebeard, a novel”
ID=23456 isFormOfWork “Novel”
ID=23456 hasOriginalLanguage “English”
ID=23456 hasPlaceOfPublication “New York”
ID=23456 hasStatementOfEdition “1st trade edition”
ID=23456 hasLanguageOfExpression “English”
ID=23456 hasPublisher “Delacorte Press”
ID=23456 hasPublicationDate “1987”

Page 33: UNC visit

ID=23456 hasAuthor “Kurt Vonnegut”
ID=23456 hasPreferredTitle “Bluebeard, a novel”
ID=23456 isFormOfWork “Novel”
ID=23456 hasOriginalLanguage “English”
ID=23456 hasPlaceOfPublication “New York”
ID=23456 hasStatementOfEdition “1st trade edition”
ID=23456 hasLanguageOfExpression “English”
ID=23456 hasPublisher “Delacorte Press”
ID=23456 hasPublicationDate “1987”

Work

Page 36: UNC visit

ID=23456 hasPlaceOfPublication “New York”
ID=23456 hasStatementOfEdition “1st trade edition”
ID=23456 hasLanguageOfExpression “English”
ID=23456 hasPublisher “Delacorte Press”
ID=23456 hasPublicationDate “1987”

Expression

Page 40: UNC visit

ID=23456 hasPlaceOfPublication “New York”
ID=23456 hasPublisher “Delacorte Press”
ID=23456 hasPublicationDate “1987”

Manifestation
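The sequence of statement slides can be read as loose statements being sorted into FRBR entities. The sketch below follows that reading. The assignment of each property to Work, Expression or Manifestation is inferred from the order in which statements drop out of the slides, so treat it as an assumption and not as RDA's own allocation.

```python
from collections import defaultdict

# Assumed allocation, inferred from the slide sequence.
PROPERTY_TO_ENTITY = {
    "hasAuthor": "Work",
    "hasPreferredTitle": "Work",
    "isFormOfWork": "Work",
    "hasOriginalLanguage": "Work",
    "hasStatementOfEdition": "Expression",
    "hasLanguageOfExpression": "Expression",
    "hasPlaceOfPublication": "Manifestation",
    "hasPublisher": "Manifestation",
    "hasPublicationDate": "Manifestation",
}

# The nine statements from the slides as (subject, property, value).
statements = [
    ("23456", "hasAuthor", "Kurt Vonnegut"),
    ("23456", "hasPreferredTitle", "Bluebeard, a novel"),
    ("23456", "isFormOfWork", "Novel"),
    ("23456", "hasOriginalLanguage", "English"),
    ("23456", "hasPlaceOfPublication", "New York"),
    ("23456", "hasStatementOfEdition", "1st trade edition"),
    ("23456", "hasLanguageOfExpression", "English"),
    ("23456", "hasPublisher", "Delacorte Press"),
    ("23456", "hasPublicationDate", "1987"),
]


def aggregate(triples):
    """Group statements by the FRBR entity their property is assigned to."""
    groups = defaultdict(dict)
    for _subject, prop, value in triples:
        groups[PROPERTY_TO_ENTITY[prop]][prop] = value
    return dict(groups)


for entity, props in aggregate(statements).items():
    print(entity, props)
```

Nothing about the individual statements changes in this process: the "record" is an aggregation built from them, and a different community could aggregate the same statements differently.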

Page 41: UNC visit

Extension

The inclusion of generalized properties provides a path for extension of RDA into specialized library communities and non-library communities
▪ They may have a different notion of how FRBR ‘aggregates’; for example, a colorized version of a film may be viewed as a separate work
▪ They may not wish to use FRBR at all
▪ They may have additional properties to include, that have a relationship to the RDA properties

Page 42: UNC visit

[Diagram. Properties: RDA:adaptedAs; RDA:adaptedAsARadioScript. Connected by one arrow labeled “hasSubproperty”.]

Page 43: UNC visit

[Diagram. Properties: RDA:adaptedAs; RDA:adaptedAsARadioScript; KidLit:adaptedAsAPictureBook. Connected by two arrows labeled “hasSubproperty”.]

Page 44: UNC visit

[Diagram. Properties: RDA:adaptedAs; RDA:adaptedAsARadioScript; KidLit:adaptedAsAPictureBook; KidLit:adaptedAsAChapterBook. Connected by three arrows labeled “hasSubproperty”.]
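A short sketch of why this kind of extension is useful: if a community vocabulary (the slides' KidLit example) declares its properties as subproperties of an RDA property, a search on the RDA property can also pick up the community's statements. The sketch assumes all three specific properties sit directly under `RDA:adaptedAs`, and the resource identifiers in the sample data are invented for illustration.

```python
# Assumed hierarchy, child -> parent, following the property names on the slide.
SUBPROPERTY_OF = {
    "RDA:adaptedAsARadioScript": "RDA:adaptedAs",
    "KidLit:adaptedAsAPictureBook": "RDA:adaptedAs",
    "KidLit:adaptedAsAChapterBook": "RDA:adaptedAs",
}


def is_a(prop, target):
    """True if prop is target or sits anywhere below it in the hierarchy."""
    while prop is not None:
        if prop == target:
            return True
        prop = SUBPROPERTY_OF.get(prop)
    return False


def find(statements, target):
    """Return statements whose property is target or one of its subproperties."""
    return [s for s in statements if is_a(s[1], target)]


# Invented sample data: (subject, property, object).
data = [
    ("work:1", "RDA:adaptedAsARadioScript", "work:2"),
    ("work:1", "KidLit:adaptedAsAPictureBook", "work:3"),
    ("work:1", "RDA:hasAuthor", "person:9"),
]

print(find(data, "RDA:adaptedAs"))
# [('work:1', 'RDA:adaptedAsARadioScript', 'work:2'), ('work:1', 'KidLit:adaptedAsAPictureBook', 'work:3')]
```

The community keeps its finer distinctions while its data remains usable by anyone who knows only the broader RDA property.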

Page 45: UNC visit

Completing the Vocabularies

Completing the hierarchies, both generalized and FRBR-bounded
▪ Elements and Relationships need to have bounded hierarchies built (generalized hierarchies complete)
▪ Roles need generalized properties created

JSC review incomplete, for both properties and vocabularies

Status designations need to be updated from ‘New-proposed’ to ‘Published’

Page 46: UNC visit

Current Registered Relationships

[Diagram. Properties: Extent; Extent (I); Extent (M); Extent of Text; Extent of Still Image; Extent of Text (I); Extent of Still Image (I); Extent of Text (M); Extent of Still Image (M). Connected by five arrows labeled “Subproperty”.]

Page 47: UNC visit

Added Registered Relationships

[Diagram. Properties: Extent; Extent (I); Extent (M); Extent of Text; Extent of Still Image; Extent of Text (I); Extent of Still Image (I); Extent of Text (M); Extent of Still Image (M). Connected by seven arrows labeled “Subproperty”.]

Page 48: UNC visit

Remaining Issues

How do these relate to the RDA guidance text?

Who will maintain these? How will they be kept in sync with the text?

Will the governance model for these be the same as the text? Who decides?

How can we use these vocabularies effectively?

What else do we need to identify?

How should we continue this work?

Page 49: UNC visit

What We’ve Learned

We wrote about the decisions we made for RDA in D-Lib Magazine: http://dlib.org/dlib/january10/hillmann/01hillmann.html

Need to continue to disclose what we’ve learned and work on building best practices documentation in this environment

Page 50: UNC visit

Issues in Limbo: Identification

Traditional identifiers, like ISBN and ISSN, may not be suitable for a different environment
▪ An ISBN is a publisher’s identifier for a product, and may not be precise enough for a manifestation (publishers sometimes reuse ISBNs for different versions)
▪ OCLC numbers and LCCNs, because they’re optimized for a MARC environment, may not be what we need going forward (though useful for a transition?)

If we move from a centralized to a decentralized sharing environment, what kind of identifier system will we need?

Page 51: UNC visit

RDA Identification Issues

Resource identifiers

Record level identifiers
▪ What is a record in this context?
▪ Is it a description of an entity (work, manifestation, person, event)?
▪ Is it an aggregation of all these that is designed to be shared?
▪ Is it both? And more?

Identifiers between resources
▪ What about the ones we already use?

Page 52: UNC visit

IDENTIFICATION


Page 53: UNC visit

The W3C Library Linked Data Effort

http://www.w3.org/2005/Incubator/lld/

“The group will explore how existing building blocks of librarianship, such as metadata models, metadata schemas, standards and protocols for building interoperability and library systems and networked environments, encourage libraries to bring their content, and generally re-orient their approaches to data interoperability towards the Web, also reaching to other communities. It will also envision these communities as a potential major provider of authoritative datasets (persons, topics...) for the Linked Data Web.”

Page 54: UNC visit

Exhortations

Engage: be part of the change we need!

Participate: find the area you’re most excited about, and find a way to get involved
▪ Mailing lists, blog comments, conferences, etc., are all available

Learn: take responsibility for your own education
▪ Local study groups are a great option when institutions aren’t providing training

Page 55: UNC visit

Thank you! Questions?

Contact info: [email protected]

Metadata Matters: http://managemetadata.com/blog
