
Ontology based metadata schema for digital library projects in China


Ontology-based Metadata Schema for Digital Library Projects in China

Maosheng Lai, Xiudan Yang

Department of Information Management, Peking University

April 28, 2004


1 Metadata Application in Chinese Information Resources

Metadata for digital library projects

Metadata Profile for Sharing Information for Sustainable Development of China (MPSISDC)

CELTS-42 CD1.6 (Metadata Standard for Chinese Educational Information)

MICI-DC (Digital Museum Project in Taiwan)


Metadata Schemes for Digital Library Projects

Three major types:

National projects

Business operations

Digital projects in university libraries


Metadata Application Profiles

General Format for Digitalized Chinese Full-text, by Zhongshan Library (the public library of Guangdong Province, partly responsible for CPDLP)

Metadata Profiles of CPDLP, by Shanghai Library

Chinese Metadata Specifications, by the National Library of China


General Format for Digitalized Chinese Full-text (GFDCF)

Includes format structure, element definitions and content rules

Based on Dublin Core 1.1

Adds the element “Record” as the first element

Defines an HTML tag and a cataloging “label” for each element

Records are issued on the Web


Cataloging follows the Chinese Books Cataloging Rules

A full-text retrieval system is developed

Object-oriented technology is used to catalog text, pictures, audio-video, computer programs and other types of resources

Guided by the International Standard Bibliographic Description (ISBD), with some revisions for indexing and digitization

Catalog elements are mapped to elements in other metadata schemes


Figure 1: Metadata elements in GFDCF

Record, Title, Creator, Subject (Classification or Keyword), Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
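The GFDCF element set in Figure 1 can be sketched as a small record renderer that emits one HTML tag per element. The slides do not show the actual GFDCF tag or label syntax, so the `DC.`-prefixed `<meta>` names (a common Dublin Core convention in HTML), the `GFDCF.record` name, and the sample values below are all assumptions made for illustration.

```python
from html import escape

# The 16 GFDCF elements from Figure 1: "Record" plus the 15 Dublin Core 1.1 elements.
GFDCF_ELEMENTS = [
    "Record", "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier", "Source",
    "Language", "Relation", "Coverage", "Rights",
]


def to_meta_tags(record):
    """Render a record (dict: element -> value) as HTML <meta> tags.

    Assumption: the tag names are hypothetical; the real GFDCF tag and
    label definitions are not given in the slides.
    """
    unknown = set(record) - set(GFDCF_ELEMENTS)
    if unknown:
        raise ValueError(f"not GFDCF elements: {sorted(unknown)}")
    lines = []
    for element in GFDCF_ELEMENTS:  # keep the Figure 1 order, Record first
        if element in record:
            name = "GFDCF.record" if element == "Record" else f"DC.{element.lower()}"
            content = escape(record[element], quote=True)
            lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


# Hypothetical sample values, for illustration only.
sample = {"Title": "Song Shu", "Language": "zh", "Record": "rec-0001"}
print(to_meta_tags(sample))
```

Iterating over the fixed element list rather than over the record keeps "Record" first regardless of the order in which the values were supplied.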


CPDLP Metadata Profiles

Based on Dublin Core/RDF

Divided into two layers: Dublin Core (low), MARC or TEI Header (high)

Includes semantics, content rules, syntax structure and specifications of element qualifiers

Open to multiple metadata schemes so that different types of resources can be described; for example, CNMARC is mapped to Dublin Core and encapsulated in XML/RDF
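The last point, mapping CNMARC to Dublin Core and wrapping the result in XML/RDF, can be sketched as follows. This is a minimal illustration, not the CPDLP mapping: the crosswalk table assumes CNMARC field tags that follow UNIMARC usage (200 $a title, 700 $a primary author, 210 $c/$d publisher and date, 101 $a language, 606 $a subject), and the sample record and the `urn:example:` identifier are invented.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Illustrative crosswalk (assumption): (field tag, subfield code) -> DC element.
CROSSWALK = {
    ("200", "a"): "title",
    ("700", "a"): "creator",
    ("210", "c"): "publisher",
    ("210", "d"): "date",
    ("101", "a"): "language",
    ("606", "a"): "subject",
}


def marc_to_dc(fields):
    """Convert (tag, subfield, value) triples to (dc_element, value) pairs.

    Fields without an entry in the crosswalk are dropped.
    """
    return [(CROSSWALK[(tag, sub)], value)
            for tag, sub, value in fields if (tag, sub) in CROSSWALK]


def to_rdf_xml(about, dc_pairs):
    """Wrap DC pairs in an rdf:Description element and serialize it."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": about})
    for element, value in dc_pairs:
        ET.SubElement(desc, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record; 215 (physical description) has no mapping here.
fields = [("200", "a", "Song Shu"), ("700", "a", "Shen Yue"),
          ("101", "a", "chi"), ("215", "a", "100 v.")]
print(to_rdf_xml("urn:example:song-shu", marc_to_dc(fields)))
```

Keeping the crosswalk as data rather than code reflects the "open to multiple metadata schemes" idea: another source scheme would only need its own table.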


Chinese Metadata Specifications (CMS)

Issued in March 2002

Adopts the Open Archival Information System (OAIS) as its reference model

Element sets are selected from mature metadata work by LC, NLA, Cedars, DC, NEDLIB, etc.

CMS keeps a mapping relation with DC within a consistent metadata framework

The core element set is mandatory and is used frequently throughout the specification

Contains most elements of Dublin Core and adds several elements, such as digital rights, preservation and usage of digital resources


Provides a basic extended element set (optional)

CMS qualifiers identify the encoding scheme and refine the meaning of elements, similar to DC qualifiers

Requires application profiles to be built according to actual needs and the specific resources

The framework structure comprises a core element set, an extended element set, semantics and content rules, an XML DTD and an RDF Schema


Figure 2: Chinese core metadata framework structure

Available at: http://www.cdi.cn/download/dmds.pdf

Node labels in the figure: Information Package; Preservation description information; Content information; Reference information; Context information; Provenance information; Inherent information; Description information; Resource description; Rights management; Management history; Original history; Identifiers; Structure information; Format description; Rights information; Processing history; Preservation history; Original technical environment


CMS Core Element Sets

Includes seven sets:

Resource Description

Relative Information Objects

Rights Management

Original History

Management History

Inherent Information

Abstract Format Description


CMS Elements

Resource Description: Title, Subject, Edition, Abstract, Content type, Language, Coverage, Creator, Contributor, Date of Creation, Publisher, Copyrights holder, Identifier

Relative Information Objects: Related Objects

Rights Management: Digital Publisher Name, Digital Publisher Date, Digital Publisher Place, Rights Warning, Actors, Actions

Original History: Original Technical Environments

Management History: Ingest Process History, Administration History

Inherent Information: Authentication Indicator

Abstract Format Description: UAF-Description
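The seven core element sets and their elements can be held in a simple lookup table, which makes it easy to check a record for completeness. The slides state only that the core element set is mandatory; treating every listed element as individually required, as the sketch below does, is an assumption, and the function name and sample record are hypothetical.

```python
# CMS core element sets and their elements, as listed on the slide.
CMS_CORE_SETS = {
    "Resource Description": [
        "Title", "Subject", "Edition", "Abstract", "Content type", "Language",
        "Coverage", "Creator", "Contributor", "Date of Creation", "Publisher",
        "Copyrights holder", "Identifier",
    ],
    "Relative Information Objects": ["Related Objects"],
    "Rights Management": [
        "Digital Publisher Name", "Digital Publisher Date",
        "Digital Publisher Place", "Rights Warning", "Actors", "Actions",
    ],
    "Original History": ["Original Technical Environments"],
    "Management History": ["Ingest Process History", "Administration History"],
    "Inherent Information": ["Authentication Indicator"],
    "Abstract Format Description": ["UAF-Description"],
}


def missing_core_elements(record):
    """Return {set name: [elements absent or empty in the record]}.

    Assumption: every core element is treated as required.
    """
    missing = {}
    for set_name, elements in CMS_CORE_SETS.items():
        absent = [e for e in elements if not record.get(e)]
        if absent:
            missing[set_name] = absent
    return missing


# Hypothetical, deliberately incomplete record.
record = {"Title": "Song Shu", "Language": "chi"}
for set_name, absent in missing_core_elements(record).items():
    print(f"{set_name}: {len(absent)} element(s) missing")
```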


Metadata Schemes in Business Companies

21dmedia

Embedded in the warehouse platform

Elements are selected and defined for the database itself and are not based on any existing scheme

The resource types recorded are print materials, such as books, journals and newspapers

Builds a mapping structure to CNMARC


Unihan

Provides a tool, E-Cataloger, which can segment, identify and convert Chinese words

The core element set is based on Dublin Core

Elements are represented in XML

Records are mapped to CNMARC and embedded in digital objects

Employs Unicode CJK to handle multi-language issues; one product is a Japanese-Korean catalog created with E-Cataloger


Metadata Application Profiles in University Libraries

Peking University Rare Book Digital Library (RBDL)


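RBDL Metadata Elements

Element types: Core Elements (12), Local Core Elements (2), Unique Elements (1)

RBDL element | Element type | DC element
资源形式 (Resource format) | Core | Format
题名 (Title) | Core | Title
主要责任者 (Primary responsible party) | Core | Creator
其他责任者 (Other responsible parties) | Core | Contributor
出版项 (Publication statement) | Core | Date, Publisher
版本 (Edition) | Local core | (none listed)
外观形态 (Physical Description) | Local core | (none listed)
附注说明 (Notes) | Core | Description
收藏历史 (Collection History) | Unique | (none listed)
相关文献 (Related documents) | Core | Relation
主题词 (Subject terms) | Core | Subject and Keywords
语种 (Language) | Core | Language
时空范围 (Temporal and spatial coverage) | Core | Coverage
古籍标识 (Rare book identifier) | Core | Resource Identifier
馆藏信息 (Holdings information) | Core | Rights Management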
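The RBDL element table above can be sketched as typed data, which makes the three element types and the DC correspondence explicit. The English element names are translations of the Chinese labels rather than official names, and the `dc_view` helper is hypothetical. Copying the whole publication statement into both Date and Publisher is a simplification.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RbdlElement:
    name: str                 # English gloss of the RBDL element name (translation)
    tier: str                 # "core", "local core" or "unique"
    dc: Tuple[str, ...] = ()  # corresponding DC elements; empty if none is listed


RBDL_ELEMENTS = [
    RbdlElement("Resource format", "core", ("Format",)),
    RbdlElement("Title", "core", ("Title",)),
    RbdlElement("Primary responsible party", "core", ("Creator",)),
    RbdlElement("Other responsible parties", "core", ("Contributor",)),
    RbdlElement("Publication statement", "core", ("Date", "Publisher")),
    RbdlElement("Edition", "local core"),
    RbdlElement("Physical description", "local core"),
    RbdlElement("Notes", "core", ("Description",)),
    RbdlElement("Collection history", "unique"),
    RbdlElement("Related documents", "core", ("Relation",)),
    RbdlElement("Subject terms", "core", ("Subject and Keywords",)),
    RbdlElement("Language", "core", ("Language",)),
    RbdlElement("Temporal and spatial coverage", "core", ("Coverage",)),
    RbdlElement("Rare book identifier", "core", ("Resource Identifier",)),
    RbdlElement("Holdings information", "core", ("Rights Management",)),
]


def dc_view(record):
    """Project an RBDL record (dict: element name -> value) onto DC elements.

    Elements with no DC correspondence are dropped; an element that maps to
    several DC elements contributes its value to each of them.
    """
    by_name = {e.name: e for e in RBDL_ELEMENTS}
    out = {}
    for name, value in record.items():
        for dc_name in by_name[name].dc:
            out.setdefault(dc_name, []).append(value)
    return out


print(Counter(e.tier for e in RBDL_ELEMENTS))
print(dc_view({"Title": "Song Shu", "Edition": "Ming edition"}))
```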


RBDL metadata schema

RBDL metadata schema has three types of metadata: descriptive metadata, administrative metadata, and GIS metadata.

The element set is divided into three parts: core elements (a generalized component for all kinds of objects, in accordance with DC), local core elements (a common part for local collections) and unique elements (designed for specific types of objects).

The most important function of the metadata profile in RBDL is to provide a metadata standard framework, which can guide the design of metadata schemas for specific areas.


Chinese metadata standard framework in RBDL (PKUL)


2 Problem Analyses

☆ Lack of unified semantics and content rules for Chinese information resources

☆ Different core elements

☆ Lack of mapping and interoperability

☆ Diversification of Chinese metadata systems

☆ Different semantics and records

☆ Different thesauri


3 Ontology-based metadata schema for Digital Library Projects in China

Ontology of Chinese information resources

Ontology of bibliographic relations

Ontology-based digital library metadata schema


Ontology of Chinese information resources


Ontology of bibliographic relations

FRBR entities:

Work

Expression

Manifestation

Item


Work and Expression: intellectual/artistic content

Manifestation and Item: physical recording of content

A Work is realized through an Expression

An Expression is embodied in a Manifestation

A Manifestation is exemplified by an Item


“Book”: Song Shu (item)

“Book”: a “publication” at a bookstore (manifestation)


“Book”: who translated it? (expression)

“Book”: who wrote it? (work)


FRBR Entity Levels

Work: The Novel; The Movie

Expression: Orig. Text, Transl., Critical Edition; Orig. Version

Manifestation: Paper, PDF, HTML

Item: Copy 1, Copy 2
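The four FRBR entity levels can be sketched as nested data classes, using the labels from this slide as sample data. The slide text does not say which expression the manifestations belong to or which manifestation the copies belong to, so the attachment chosen below is an assumption, and the class and function names are illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    label: str


@dataclass
class Manifestation:
    form: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


def paths(work):
    """Yield one 'Work > Expression > Manifestation > Item' string per item."""
    for expr in work.expressions:
        for man in expr.manifestations:
            for item in man.items:
                yield " > ".join([work.title, expr.label, man.form, item.label])


# Assumed arrangement: the manifestations embody the original text,
# and the two copies exemplify the paper manifestation.
novel = Work("The Novel", [
    Expression("Orig. Text", [
        Manifestation("Paper", [Item("Copy 1"), Item("Copy 2")]),
        Manifestation("PDF"),
        Manifestation("HTML"),
    ]),
    Expression("Transl."),
    Expression("Critical Edition"),
])

for path in paths(novel):
    print(path)
```

Each level holds only its direct children, so a question such as "who wrote it?" is answered at the work level while "which copy is this?" is answered at the item level.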


Possible FRBR applications

Authority: Work/Expression (Uniform Title), Person, Concept, Series (work/expression; Uniform Title)

Bibliographic: Manifestation

Holding: Item


Work

Expression

Manifestation

Item


CONCEPTION

Different from FRBR’s Work, it may be a concept, plan or design for a work, or the content abstracted from any particular format.

A conception has a uniform-title, a uniform-name (for an author), other-specific-characteristic, description, keyword(s), topic(s), date, audience, conceptual-level, etc.


EXPRESSION

A conception with specified content

An expression also has a title, other-specific-characteristic, date, language, summary, context, critical-response, roles for various rights-owners (e.g. author), use-restrictions, and size.


MANIFESTATION

The physical embodiment of an expression of a conception

As the physical form, a manifestation includes manuscripts, books, periodicals, maps, art works, paintings, posters, sound recordings, films, video recordings, CD-ROMs, DVDs, multimedia games, digitalized versions in PDF or HTML, web sites, and so on.


Manifestations also have a title (inherited from the expression, but may be a variant), a name for an author, a unique identifier, edition/issue, place-of-publication, serials, provider(s), roles for various rights-owners (e.g. publisher), terms-of-availability, contact, coverage, and update-frequency.


DIGITALIZATION

A manifestation encoded in a digital-format in digital libraries

A digitalization has a unique identifier, date, and provenance.


INSTANCE

A particular copy of a digitalization

The entity defined as an instance is a concrete entity.

An instance has a unique identifier, date, address, access-mechanism, access-restriction, exhibition-history, condition, and treatment-history.

Instances can be grouped into collections, which are the main objects of description and management in today’s metadata schemes.
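The five entity levels described above (conception, expression, manifestation, digitalization, instance) each carry their own attribute list, and several attributes recur at more than one level. A minimal sketch of that structure is a table of attributes per level plus a helper that reports which levels define each attribute. The attribute spellings are normalized from the slides ("roles for various rights-owners" becomes `rights-owner`, "a name for an author" becomes `name`), and the helper name is hypothetical.

```python
# Attributes per entity level, normalized from the slide text (assumption:
# singular, hyphenated spellings; the slides mark some lists as open-ended).
ENTITY_ATTRIBUTES = {
    "Conception": [
        "uniform-title", "uniform-name", "other-specific-characteristic",
        "description", "keyword", "topic", "date", "audience", "conceptual-level",
    ],
    "Expression": [
        "title", "other-specific-characteristic", "date", "language", "summary",
        "context", "critical-response", "rights-owner", "use-restrictions", "size",
    ],
    "Manifestation": [
        "title", "name", "unique-identifier", "edition/issue",
        "place-of-publication", "serials", "provider", "rights-owner",
        "terms-of-availability", "contact", "coverage", "update-frequency",
    ],
    "Digitalization": ["unique-identifier", "date", "provenance"],
    "Instance": [
        "unique-identifier", "date", "address", "access-mechanism",
        "access-restriction", "exhibition-history", "condition",
        "treatment-history",
    ],
}


def element_sources(levels):
    """Map each attribute to the ordered list of entity levels that define it."""
    sources = {}
    for level, attributes in levels.items():
        for attribute in attributes:
            sources.setdefault(attribute, []).append(level)
    return sources


sources = element_sources(ENTITY_ATTRIBUTES)
for attribute in ("title", "date", "unique-identifier"):
    print(attribute, "->", ", ".join(sources[attribute]))
```

Recording the level next to each attribute keeps, for example, the date of a conception distinct from the date of a digitalization when the attributes are later flattened into a single metadata element set.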


Ontology-based Chinese digital library metadata schema

Diagram components: Ontology of information resource; Ontology of bibliographic relations; MARC; Metadata schema; Classifications; Thesaurus; metadata


Name, Description, Keyword, Provider, Rights-owner, Contact, Topic, Audience, Conceptual-level, Language, Coverage, Update-frequency, Access-mechanism(s), …


Thank You!