
Information Management & Entrepreneurship

DIG 3563 – Lecture 10

Metadata Systems

J. Michael Moshell

University of Central Florida

Original image* by Moshell et al.

Imagery is from Wikimedia except where marked with *.

-2 -

3 ½* Systems for Metadata

• Dublin Core Metadata Initiative (DCMI)

• XML Namespaces

• Resource Description Framework (RDF)

• Adobe eXtensible Metadata Platform (XMP)

In (approximately) increasing order of complexity.

*Why 3 ½? That's a good exam question!

-3 -

Ohio College Library Center, later renamed Online Computer Library Center (OCLC) @

Dublin, Ohio

• Dublin Core – a SIMPLE metadata standard

For web search and retrieval

http://dublincore.org

The thing-described is the resource.

The metadata has 15 Elements in three groups:

- Content

- Intellectual Property

- Instantiation

www.dublincore.org

-4 -

Content Elements

• Title

• Subject (what the resource is about)

• Description (text)

• Type (book, movie, website, etc.; see the controlled vocabulary)

• Source (if the resource is derived from something else)

• Relation (part of something else?)

• Coverage (spatial or temporal topic; jurisdiction)

www.dublincore.org

-5 -

What's a controlled vocabulary?

• An “official list” of the values that are allowed for an Element

• Example: Type of document, from the authoritative list at:

http://dublincore.org/documents/dcmi-type-vocabulary/

• Collection

• Dataset

• Event

• Image

• InteractiveResource

• MovingImage

• PhysicalObject

• Service

• Software

• Sound

• StillImage

• Text

www.dublincore.org
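A controlled vocabulary is easy to enforce in software. The sketch below is only an illustration: the helper name `is_valid_type` is made up here, and the set simply copies the twelve Type terms listed on this slide.

```python
# The twelve DCMI Type terms listed on this slide.
DCMI_TYPES = frozenset({
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
})


def is_valid_type(value: str) -> bool:
    """Return True only if value is exactly one of the allowed terms."""
    return value in DCMI_TYPES


print(is_valid_type("MovingImage"))  # a term from the official list
print(is_valid_type("movie"))        # free text, not in the list
```

A free-text value such as "movie" is rejected even though a person would understand it. That strictness is what makes automated search reliable.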

-6 -

Intellectual Property

• Creator (sometimes author; sometimes editor)

• Publisher

• Contributor (e.g. an edited volume has many contributors.)

• Rights (if several separate rights exist, list them all.)

www.dublincore.org

-7 -

Instantiation

• Date (use a recommended encoding scheme)

• Format (recommended: use the MIME controlled vocabulary)

• Identifier (e.g. a URL, or a library catalog number)

• Language (from a controlled vocabulary, e.g. ISO 639-2):

http://www.loc.gov/standards/iso639-2/php/code_list.php

www.dublincore.org
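The Instantiation elements are the ones most easily filled in by a program. The following is a rough sketch, not a DCMI tool: the file name and the date are hypothetical, the date is written in the ISO 8601 style (YYYY-MM-DD) on the assumption that this is the recommended encoding, and the language table is a three-entry excerpt of ISO 639-2.

```python
import mimetypes
from datetime import date

# Date: ISO 8601 style (YYYY-MM-DD); the date itself is a made-up example.
published = date(2012, 3, 15).isoformat()

# Format: look up the MIME type from a (hypothetical) file name.
mime_type, _encoding = mimetypes.guess_type("company_demo.pdf")

# Language: a tiny excerpt of ISO 639-2 codes, for illustration only.
ISO_639_2 = {"eng": "English", "fre": "French", "lat": "Latin"}

print(published)
print(mime_type)
print(ISO_639_2["lat"])
```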

-8 -

An Example

The Book of Kells

1) Decide what that document is. (It's not your whole site!) (Why?)

DOCUMENT: One Book, at Trinity College Dublin

2) Produce its DCMI Description – 15 elements

-9 -

The Example: Book of Kells

Content

• Title: Book of Kells (Volume 3)

• Subject: Four Gospels

• Description: Latin religious text

• Type: PhysicalObject

• Source: Abbey of Kells

• Relation: Volume 3 of 4-Volume Set

• Coverage: 0-33 A.D. (Traditional)

-10 -

The Example: Book of Kells

IP

• Creator: Celtic Monks (at Iona?)

• Publisher: Catholic Church of Ireland

• Contributor: none

• Rights: Roman Catholic Church

-11 -

The Example: Book of Kells

Instantiation

• Date: ca. 800 A.D.

• Format (MIME): n/a (not an electronic document)

• Identifier: DCL-014a.BOK

• Language: lat (Latin)
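The Book of Kells description can also be written down in machine-readable form. This is a minimal sketch using Python's standard library. The wrapper element `record` is hypothetical, and the values are those given on the three example slides. The namespace URI is the standard Dublin Core one.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual "dc:" prefix

# The 15 elements, using the values from the Book of Kells slides.
book_of_kells = {
    "title": "Book of Kells (Volume 3)",
    "subject": "Four Gospels",
    "description": "Latin religious text",
    "type": "PhysicalObject",
    "source": "Abbey of Kells",
    "relation": "Volume 3 of 4-Volume Set",
    "coverage": "0-33 A.D. (Traditional)",
    "creator": "Celtic Monks (at Iona?)",
    "publisher": "Catholic Church of Ireland",
    "contributor": "none",
    "rights": "Roman Catholic Church",
    "date": "ca. 800 A.D.",
    "format": "n/a (not an electronic document)",
    "identifier": "DCL-014a.BOK",
    "language": "lat",
}

record = ET.Element("record")  # hypothetical wrapper element
for element, value in book_of_kells.items():
    child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
    child.text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```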

-12 -

More examples from DCMI

http://dublincore.org/documents/2001/04/12/usageguide/generic.shtml#contributor

-13 -

An Exercise (work in pairs, please)

Your company (whatever you're building your website about)

has created a document.

1) Decide what that document is. (It's not your whole site!) (Why?)

(If you can’t think of one, see slide 14 for an idea.)

2) Produce its DCMI Description – 15 elements – and be ready to show it to the class on a PowerPoint

You have 10 minutes.

-14 -

A document that your business might have

A video demo of your company’s services, which you intend to post on YouTube. It contains copyrighted images that you have licensed from Getty Images as well as footage you shot yourself.

-15 -

All Elements on One Page:

Content

• Title

• Subject

• Description

• Type (DCMI controlled vocabulary)

• Source

• Relation

• Coverage

IP

• Creator

• Publisher

• Contributor

• Rights

Instantiation

• Date

• Format (MIME Type)

• Identifier

• Language

("Type" is on Slide 5 of this PPT)

-16 -

DCMI: Summary

• Easy to use, and therefore widely used

Particularly by librarians and archivists

• Original DCMI was difficult to automate for searching:

Users had too much freedom of formatting.

A more formal structure was needed.

So DCMI Levels 2, 3, 4 were created with the use of the Resource Description Framework (RDF)

-17 -

DCMI leads to

A whole world of data and metadata

http://linkeddata.org/

More discussion after we understand Namespaces and RDF

-18 -

XML Namespaces

• If you make up some XML names and want others to be able to use them IN THE SAME WAY, you create an XML Namespace.

It is identified by a URI (Uniform Resource Identifier), like

xmlns="http://www.w3.org/1999/xhtml"

(That Namespace happens to be the one that defines XHTML.)

-19 -

XML Namespaces

• A URI is not necessarily a URL (but may be …).

It is common practice to place a formal description of some kind at the URL corresponding to the URI.

The description might be in Dublin Core (dc), in RDF, or other.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>
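To see what a namespace prefix really stands for, it helps to let a parser read the snippet. The sketch below uses Python's standard `xml.etree.ElementTree`. The variable names are arbitrary, and the XML is the Tony Benn example with both namespace URIs quoted.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""

# Map the prefixes used in our queries to their namespace URIs.
ns = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(RDF_XML)
description = root.find("rdf:Description", ns)
title = description.find("dc:title", ns)

# The parser replaces the prefix with the full URI: {namespace-URI}localname
print(title.tag)
print(title.text)
```

The prefix `dc` is only shorthand. What identifies the element is the URI, so two documents can use different prefixes and still mean the same thing.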

-20 -

RDF: Resource Description Framework

The 'Latin' of Namespaces (the most common system for making them)

or

A language a Namespace can use to define the semantics of some XML tags.

-23 -

Key RDF idea: Make statements about resources in the form of subject-predicate-object expressions (“triples”)

• Subject: the resource we’re talking about

• Predicate: traits or features of the resource

• Object: values of those traits

Example: “The sky has the color blue.”

(subject: the sky; predicate: has the color; object: blue)
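A triple is a small enough idea to model directly. The following is an illustrative sketch in plain Python, not RDF syntax: the class `Triple` and the helper `objects_of` are invented names, and the second statement about grass is a made-up extra.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource we're talking about
    predicate: str  # a trait or feature of the resource
    obj: str        # the value of that trait


statements: List[Triple] = [
    Triple("sky", "has color", "blue"),
    Triple("grass", "has color", "green"),  # hypothetical second statement
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Answer the question: what value does this trait have for this resource?"""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]


print(objects_of("sky", "has color"))
```

Because every statement has the same three-part shape, a program can answer questions by simple matching.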

-27 -

RDF Analogy (from Moshell)

This is NOT a real RDF document

<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="Harvard Standard American English"
    xmlns:Fr="Academie Francaise">
  <rdf:Description rdf:vendor="http://www.perethomas.fr">
    <Fr:artiste>Jacques Tatou</Fr:artiste>
    <Fr:pays>France</Fr:pays>
    <Fr:filme>Mon Oncle</Fr:filme>
    <Fr:prix Fr:monnaie="eur">33.90</Fr:prix>
    <Fr:an>1985</Fr:an>
  </rdf:Description>
</rdf:RDF>

-29 -

RDF Example from W3schools

<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>

1. Identify the namespaces and URIs.

2. RDF provides the <rdf:Description ...> element.

It has an attribute called 'about'.

It has elements described by some other namespace.
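The two observations above can be checked by turning the document into triples. This is a simplified sketch using only Python's standard library. It handles just the flat pattern shown here (one `rdf:Description` with literal-valued child elements), not RDF/XML in general, and the function name `rdf_to_triples` is invented.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

CD_RDF = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>"""


def rdf_to_triples(xml_text):
    """Return (subject, predicate, object) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")  # the 'about' attribute
        for prop in description:
            # ElementTree names tags as {namespace}local; join them into one URI.
            predicate = prop.tag.lstrip("{").replace("}", "", 1)
            triples.append((subject, predicate, prop.text))
    return triples


for triple in rdf_to_triples(CD_RDF):
    print(triple)
```

Each child element becomes one statement about the CD: the `about` attribute supplies the subject, the namespaced tag supplies the predicate, and the element text supplies the object.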

-31 -

RDF Example from W3schools

http://www.w3schools.com/rdf/rdf_example.asp

And follow the link to Validate it, down the page.

Remember: XML must be "well-formed" and sometimes "validated" against a Schema

-32 -

RDF: Take-away understanding

Resource Description Framework (RDF) is a system for formalizing the semantics of XML documents.

What does a particular TAG in a particular XML document MEAN? The xmlns tells you, usually using RDF syntax.

A namespace is (most often) a collection of RDF elements.

-33 -

What’s the take-away knowledge?

RDF is part of the Semantic Web – a meaning-system

It is based on the idea of shared “vocabularies” called Namespaces

Its intention is to support Artificial Intelligence so that the Web can be used to answer questions.

-34 -

Properties of Resources

• Adobe products contain metadata tools to create XMP

• Resources:

- files such as JPEG or PDF documents

- a meaningful portion of a file (e.g. an image within a PDF)

- but not fine-grained details (e.g. a single word)

www.adobe.com

Resource: Moby Dick

Property: Author, with Value: “Herman Melville”

Property: Date Written, with Value: “1851”

-38 -

So … what is XMP, and why?

• It’s a special system for metadata about media

• It is expressed in terms of another system called RDF.

Oh no .. Systems within systems within systems.

www.russiandolls.co.uk

-39 -

So … what is XMP, and why?

• It’s a special system for metadata about media

• It is expressed in terms of another system called RDF.

XMP uses XML Namespaces, which use RDF (Resource Description Framework), which uses XML

So … deal with it!

-40 -

XMP is a collection of Schemas

• A Schema has:

- an XML namespace and URI to identify it

- a list of properties

For each property:

- the value type (from a controlled vocabulary), e.g. Boolean, Font, MIMEType, Real …

- whether it is internal or external

Internal: managed by the application

External: managed by the user
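The parts of a Schema listed above map naturally onto a small data structure. The sketch below is purely illustrative: the namespace URI, the prefix and all three properties are hypothetical and are not taken from any real XMP schema. Only the value type names come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Property:
    value_type: str  # e.g. "Boolean", "MIMEType", "Real"
    internal: bool   # True: managed by the application; False: by the user


@dataclass(frozen=True)
class Schema:
    namespace_uri: str
    prefix: str
    properties: Dict[str, Property]

    def user_editable(self) -> List[str]:
        """Names of the external properties, i.e. those managed by the user."""
        return sorted(name for name, prop in self.properties.items()
                      if not prop.internal)


# A made-up schema, for illustration only.
demo = Schema(
    namespace_uri="http://example.com/ns/demo/1.0/",
    prefix="demo",
    properties={
        "approved": Property("Boolean", internal=False),
        "fileFormat": Property("MIMEType", internal=True),
        "scale": Property("Real", internal=False),
    },
)

print(demo.user_editable())
```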

-41 -

XMP includes these Schemas:

• Dublin Core Schema

• XMP Basic Schema

• XMP Rights Management Schema

• XMP Media Management Schema

• XMP Basic Job Ticket Schema

• XMP Paged-Text Schema

• XMP Dynamic Media Schema

• Adobe PDF Schema

• Photoshop Schema

• Camera Raw Schema … and a few more.

-42 -

XMP example

• Use Photoshop to open an example image

• Look at File > File Info > Raw Data

• Understand what is found there

(Good exam questions lurk therein!)
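The same raw data can often be read without Photoshop, because an XMP packet is usually stored as plain XML text inside the file. This is a rough sketch under stated assumptions: the packet is uncompressed, UTF-8 encoded and wrapped in an `x:xmpmeta` element. The function name `extract_xmp` and the fake file contents are invented for the demonstration.

```python
import re
from typing import Optional

# Look for the XML wrapper element that surrounds an XMP packet.
XMP_PATTERN = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp(data: bytes) -> Optional[str]:
    """Return the first XMP packet found in data, or None if there is none."""
    match = XMP_PATTERN.search(data)
    if match is None:
        return None
    return match.group(0).decode("utf-8", errors="replace")


# A synthetic stand-in for an image file: binary bytes around an XMP packet.
packet = (b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
          b"<!-- RDF metadata would go here -->"
          b"</x:xmpmeta>")
fake_file = b"\xff\xd8" + b"binary image data" + packet + b"more data" + b"\xff\xd9"

print(extract_xmp(fake_file))
print(extract_xmp(b"a file with no metadata"))
```

To try it on a real image, read the file with `open(path, "rb").read()` and pass the bytes to `extract_xmp`. The result should resemble what Photoshop shows under Raw Data.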

-43 -

And ... what happens on Thursday?

• YOU must be ready to DEMONSTRATE the CMS module that you are studying.

What is its PURPOSE?

What are its CONTROLS? (settable by the Administrator)

What are its INPUTS? (accessible by ordinary users?)

What are its OUTPUTS? (what does it display to the public?)

What are its STRONG POINTS?

Where does it FRUSTRATE you? Weak points, missing features?

commentsyard.com