1
Information Management & Entrepreneurship
DIG 3563 – Lecture 10
Metadata Systems
J. Michael Moshell
University of Central Florida
Original image* by Moshell et al.
Imagery is from Wikimedia except where marked with *.
-2 -
3 ½* Systems for Metadata
• Dublin Core Metadata Initiative (DCMI)
• XML Namespaces
• Resource Description Framework (RDF)
• Adobe eXtensible Metadata Platform (XMP)
In (approximately) increasing order of complexity.
*why 3 ½? That's a good exam question!
-3 -
Ohio College Library Center, later Online Computer Library Center (OCLC), at
Dublin, Ohio
• Dublin Core – a SIMPLE metadata standard
For web search and retrieval
http://dublincore.org
The thing-described is the resource.
The metadata has 15 Elements in three groups:
- Content
- Intellectual Property
- Instantiation
www.dublincore.org
-4 -
Content Elements
• Title
• Subject (What’s the resource about)
• Description (text)
• Type (book, movie, website, etc.) see controlled vocabulary
• Source (if the resource is derived from something else)
• Relation (part of something else?)
• Coverage (spatial or temporal topic; jurisdiction)
-5 -
What's a controlled vocabulary?
• An “official list” of the values that are allowed for an Element
• Example: Type of document, from the authoritative list at:
http://dublincore.org/documents/dcmi-type-vocabulary/
• Collection
• Dataset
• Event
• Image
• InteractiveResource
• MovingImage
• PhysicalObject
• Service
• Software
• Sound
• StillImage
• Text
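A controlled vocabulary is easy to enforce in software. The following is a minimal sketch, assuming the twelve Type terms listed on this slide; the names `DCMI_TYPES` and `is_valid_type` are made up for illustration.

```python
# The twelve terms of the DCMI Type Vocabulary, as listed on this slide.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def is_valid_type(value: str) -> bool:
    """Return True only if value is exactly one of the controlled terms."""
    return value in DCMI_TYPES


print(is_valid_type("MovingImage"))  # True
print(is_valid_type("movie"))        # False: not on the official list
```

The point of the check is that free-text values such as "movie" are rejected, so every record uses the same spelling for the same idea.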
-6 -
Intellectual Property
• Creator (sometimes author; sometimes editor)
• Publisher
• Contributor (e.g. an edited volume has many contributors.)
• Rights (if several separate rights exist, list them all.)
-7 -
Instantiation
• Date (use a recommended encoding scheme)
• Format (recommended: use the MIME controlled vocabulary)
• Identifier (e.g. a URL, or a library catalog number)
• Language (from a controlled vocabulary, e.g. ISO 639-2)
  http://www.loc.gov/standards/iso639-2/php/code_list.php
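The Instantiation elements lend themselves to machine-generated values. Here is a hedged sketch using only the Python standard library; it assumes an ISO 8601 style date (YYYY-MM-DD) as the encoding scheme, and the file name and date are invented for illustration.

```python
import mimetypes
from datetime import date

# Date: an ISO 8601 style string (YYYY-MM-DD); the date itself is made up.
dc_date = date(2012, 3, 15).isoformat()

# Format: look up the MIME type for a hypothetical file name.
dc_format, _encoding = mimetypes.guess_type("company-brochure.pdf")

# Language: a three-letter ISO 639-2 code, e.g. "lat" for Latin.
dc_language = "lat"

print(dc_date, dc_format, dc_language)
```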
-8 -
An Example
The Book of Kells
1) Decide what that document is. (It's not your whole site! Why?)
DOCUMENT: One Book, at Trinity College Dublin
2) Produce its DCMI Description – 15 elements
-9 -
The Example: Book of Kells
Content
• Title: Book of Kells (Volume 3)
• Subject: Four Gospels
• Description: Latin religious text
• Type: PhysicalObject
• Source: Abbey of Kells
• Relation: Volume 3 of 4-Volume Set
• Coverage: 0-33 A.D. (Traditional)
-10 -
The Example: Book of Kells
IP
• Creator: Celtic Monks (at Iona?)
• Publisher: Catholic Church of Ireland
• Contributor: none
• Rights: Roman Catholic Church
-11 -
The Example: Book of Kells
Instantiation
• Date: ca. 800 A.D.
• Format (MIME): n/a (not an electronic document)
• Identifier: DCL-014a.BOK
• Language: lat (Latin)
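The Book of Kells description can also be held as a simple data structure and checked for completeness. This is only a sketch: the dictionary layout and the helper `missing_elements` are assumptions for illustration, and the values are copied from the example slides.

```python
# The 15 Dublin Core elements in the three groups used in this lecture.
ELEMENT_GROUPS = {
    "Content": ["Title", "Subject", "Description", "Type",
                "Source", "Relation", "Coverage"],
    "Intellectual Property": ["Creator", "Publisher", "Contributor", "Rights"],
    "Instantiation": ["Date", "Format", "Identifier", "Language"],
}

book_of_kells = {
    "Title": "Book of Kells (Volume 3)",
    "Subject": "Four Gospels",
    "Description": "Latin religious text",
    "Type": "PhysicalObject",
    "Source": "Abbey of Kells",
    "Relation": "Volume 3 of 4-Volume Set",
    "Coverage": "0-33 A.D. (Traditional)",
    "Creator": "Celtic Monks (at Iona?)",
    "Publisher": "Catholic Church of Ireland",
    "Contributor": "none",
    "Rights": "Roman Catholic Church",
    "Date": "ca. 800 A.D.",
    "Format": "n/a (not an electronic document)",
    "Identifier": "DCL-014a.BOK",
    "Language": "lat",
}


def missing_elements(record: dict) -> list:
    """List every Dublin Core element that the record does not supply."""
    all_elements = [name for group in ELEMENT_GROUPS.values() for name in group]
    return [name for name in all_elements if name not in record]


print(missing_elements(book_of_kells))  # [] means all 15 elements are present
```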
-12 -
More examples from DCMI
http://dublincore.org/documents/2001/04/12/usageguide/generic.shtml#contributor
-13 -
An Exercise (work in pairs, please)
Your company (whatever you're building your website about)
has created a document.
1) Decide what that document is. (It's not your whole site! Why?)
(If you can’t think of one, see slide 14 for an idea.)
2) Produce its DCMI Description (15 elements) and be ready to show it to the class in PowerPoint.
You have 10 minutes.
-14 -
A document that your business might have
A video demo of your company’s services, which you
intend to post on YouTube. It contains copyrighted
images that you have licensed from Getty Images
as well as footage you shot yourself.
-15 -
All Elements on One Page:
Content
• Title
• Subject
• Description
• Type (DCMI controlled vocabulary; "Type" is on Slide 5 of this PPT)
• Source
• Relation
• Coverage
IP
• Creator
• Publisher
• Contributor
• Rights
Instantiation
• Date
• Format (MIME Type)
• Identifier
• Language
-16 -
DCMI: Summary
• Easy to use, and therefore widely used
Particularly by librarians and archivists
• Original DCMI was difficult to automate for searching:
Users had too much freedom of formatting.
A more formal structure was needed.
So DCMI Levels 2, 3, 4 were created
with the use of the Resource Description
Framework (RDF)
-17 -
DCMI leads to
A whole world of data and metadata
http://linkeddata.org/
More discussion after we understand Namespaces and RDF.
-18 -
XML Namespaces
• If you make up some XML names and want others to be
able to use them IN THE SAME WAY,
you create an XML Namespace.
It is identified by a URI: Uniform Resource Identifier, like
xmlns="http://www.w3.org/1999/xhtml"
(That Namespace happens to be the one that defines XHTML.)
-19 -
XML Namespaces
• A URI is not necessarily a URL (but may be …)
It is common practice to place a formal description of some kind at the URL corresponding to the URI.
The description might be in Dublin Core (dc), in RDF, or other.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
<dc:title>Tony Benn</dc:title>
<dc:publisher>Wikipedia</dc:publisher>
</rdf:Description>
</rdf:RDF>
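To see what the prefixes do, the fragment can be read with Python's standard `xml.etree.ElementTree` module. This is a sketch, not part of the original lecture; note that the `xmlns:rdf` value is quoted here, as XML requires.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""

# Map the short prefixes to the namespace URIs they stand for.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(RDF_XML)
title = root.find("rdf:Description/dc:title", NS)

# The parser replaces the prefix with the full namespace URI in braces.
print(title.tag)   # {http://purl.org/dc/elements/1.1/}title
print(title.text)  # Tony Benn
```

The prefix `dc` is only shorthand; the URI is what actually identifies the vocabulary, so two documents may pick different prefixes for the same namespace.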
-20 -
RDF:
Resource Description Framework
The 'Latin' of Namespaces:
(most common system for making them)
or
A language a Namespace can use to
define the semantics of some XML tags.
-23 -
Key RDF idea: Make statements about resources
in the form of subject-predicate-object expressions (“triples”)
• Subject: the resource we’re talking about
• Predicate: traits or features of the resource
• Object: values of those traits
“The sky has the color blue.”
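The sky statement can be written down as a literal triple. The sketch below is only an illustration of the subject-predicate-object shape; the `Triple` class, the predicate names and the second statement are invented, and real RDF would use URIs rather than plain words.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource we are talking about
    predicate: str  # a trait or feature of the resource
    object: str     # the value of that trait


statements = [
    Triple("sky", "hasColor", "blue"),
    Triple("grass", "hasColor", "green"),  # made-up second statement
]


def values_of(subject: str, predicate: str) -> list:
    """Answer a question by matching the subject and predicate of each triple."""
    return [t.object for t in statements
            if t.subject == subject and t.predicate == predicate]


print(values_of("sky", "hasColor"))  # ['blue']
```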
-27 -
RDF Analogy (from Moshell)
This is NOT a real RDF document
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="Harvard Standard American English"
  xmlns:Fr="Academie Francaise">
  <rdf:Description rdf:vendor="http://www.perethomas.fr">
    <Fr:artiste>Jacques Tatou</Fr:artiste>
    <Fr:pays>France</Fr:pays>
    <Fr:filme>Mon Oncle</Fr:filme>
    <Fr:prix Fr:monnaie="eur">33.90</Fr:prix>
    <Fr:an>1985</Fr:an>
  </rdf:Description>
</rdf:RDF>
-29 -
RDF Example from W3schools
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
<cd:artist>Bob Dylan</cd:artist>
<cd:country>USA</cd:country>
<cd:company>Columbia</cd:company>
<cd:price>10.90</cd:price>
<cd:year>1985</cd:year>
</rdf:Description>
</rdf:RDF>
1. Identify the namespaces and URIs.
2. RDF provides the <rdf:Description ..> element.
   It has an attribute called 'about'.
   It has elements described by some other namespace.
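Each child element of `rdf:Description` amounts to one triple about the resource named in `rdf:about`. The following sketch, using the standard `xml.etree.ElementTree` module, pulls those triples out of the W3Schools document; the function name `description_to_triples` is an assumption for illustration, and a real project would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>"""


def description_to_triples(xml_text: str) -> list:
    """Turn each property of each rdf:Description into a (s, p, o) tuple."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")  # the resource described
        for prop in desc:                         # one child per property
            triples.append((subject, prop.tag, prop.text))
    return triples


for triple in description_to_triples(DOC):
    print(triple)
```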
-31 -
RDF Example from W3schools
http://www.w3schools.com/rdf/rdf_example.asp
And follow the link to Validate it, down the page.
Remember: XML must be "well-formed"
and sometimes "validated" against a Schema
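Well-formedness can be checked by simply trying to parse the text. Here is a minimal sketch with the Python standard library; `is_well_formed` is a made-up helper name. It does not validate against a Schema, since `xml.etree.ElementTree` has no validation support and that step would need an additional tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<cd><artist>Bob Dylan</artist></cd>"))  # True
print(is_well_formed("<cd><artist>Bob Dylan</cd>"))           # False: <artist> is never closed
```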
-32 -
RDF: Take-away understanding
Resource Description Framework (RDF) is a system for
Formalizing Semantics of XML documents.
What does a particular TAG in a particular XML Document
MEAN? The xmlns tells you, usually using RDF syntax.
A namespace is (most often) a collection of RDF elements.
-33 -
What’s the take-away knowledge?
RDF is part of the Semantic Web – a meaning-system
It is based on the idea of shared “vocabularies” called Namespaces
Its intention is to support
Artificial Intelligence
so that the Web can be used to answer questions.
-34 -
Properties of Resources
• Adobe products contain metadata tools to create XMP
• Resources:
- files such as JPEG or PDF documents
- a meaningful portion of a file (e.g. an image within a PDF)
- cannot be used with fine-grained detail (e.g. a word)
www.adobe.com
Example:
• Resource: Moby Dick
• Property: Author, with Value “Herman Melville”
• Property: Date Written, with Value “1851”
-38 -
So … what is XMP, and why?
• It’s a special system for metadata about media
• It is expressed in terms of another system called RDF.
Oh no .. Systems within systems within systems.
www.russiandolls.co.uk
-39 -
So … what is XMP, and why?
• It’s a special system for metadata about media
• It is expressed in terms of another system called RDF.
XMP uses
XML Namespaces, which use
RDF (Resource Description Framework), which uses XML
So … deal with it!
-40 -
XMP is a collection of Schemas
• A Schema has:
- An XML namespace and URI to identify it
- a list of properties.
  For each property:
  - the value type (from a controlled vocabulary), e.g. Boolean, Font, MIMEType, Real …
  - whether it is internal or external
    Internal: managed by the application
    External: managed by the user
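The parts of a Schema listed above map naturally onto a small data structure. The sketch below is purely illustrative: the class names, the namespace URI, the prefix and both properties are hypothetical and do not come from any real XMP schema.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    name: str
    value_type: str  # e.g. "Boolean", "Font", "MIMEType", "Real"
    internal: bool   # True: managed by the application; False: by the user


@dataclass
class Schema:
    namespace_uri: str  # the URI that identifies the schema
    prefix: str         # the short name used in front of property names
    properties: list = field(default_factory=list)

    def external_properties(self) -> list:
        """Names of the properties a user is expected to manage."""
        return [p.name for p in self.properties if not p.internal]


# A hypothetical schema, invented for this example only.
demo = Schema(
    namespace_uri="http://example.com/ns/demo/1.0/",
    prefix="demo",
    properties=[
        Property("LastSaved", "Date", internal=True),
        Property("ReviewedBy", "Text", internal=False),
    ],
)

print(demo.external_properties())  # ['ReviewedBy']
```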
-41 -
XMP includes these Schemas:
• Dublin Core Schema
• XMP Basic Schema
• XMP Rights Management Schema
• XMP Media Management Schema
• XMP Basic Job Ticket Schema
• XMP Paged-Text Schema
• XMP Dynamic Media Schema
• Adobe PDF Schema
• Photoshop Schema
• Camera Raw Schema … and a few more.
-42 -
XMP example
• Use Photoshop to open an example image
• Look at File > File Info > Raw Data
• Understand what is found there
(Good exam questions lurk therein!)
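The raw data shown by Photoshop is an XML packet stored inside the file, so it can often be located without Adobe software. This is a rough sketch under stated assumptions: it assumes the packet is stored as uncompressed UTF-8 text wrapped in an `<x:xmpmeta>` element, which is common but not guaranteed for every file format. The function name and the synthetic test bytes are invented for illustration.

```python
def extract_xmp_packet(data: bytes):
    """Return the first <x:xmpmeta> ... </x:xmpmeta> span as text, or None."""
    open_tag, close_tag = b"<x:xmpmeta", b"</x:xmpmeta>"
    start = data.find(open_tag)
    if start == -1:
        return None
    end = data.find(close_tag, start)
    if end == -1:
        return None
    return data[start:end + len(close_tag)].decode("utf-8", errors="replace")


# Synthetic stand-in for an image file: a few binary bytes around a tiny packet.
fake_file = (
    b"\xff\xd8\xff\xe1"
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:RDF/></x:xmpmeta>'
    b"\xff\xd9"
)

packet = extract_xmp_packet(fake_file)
print(packet)
```

With a real image, the bytes would come from reading the file in binary mode, and the returned text is the same RDF/XML that the Raw Data panel displays.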
-43 -
And ... what happens on Thursday?
• YOU must be ready to DEMONSTRATE
the CMS module that you are studying.
What is its PURPOSE?
What are its CONTROLS? (settable by Administrator)
What are its INPUTS? (accessible by ordinary users?)
What are its OUTPUTS? (What does it display to the public?)
What are its STRONG POINTS?
Where does it Frustrate You? Weak Points, Missing features?
commentsyard.com