CONTROLLED VOCABULARY WORKSHOP MARCH 26-27, 2011
OBJECTIVES
Finalize VOCAB “Terms of Reference”
Define use cases for the keyword database and its development
Develop procedures for capturing and managing keyword taxonomies
DONE: Identify suitable existing database structures or software for managing the controlled vocabulary and adopt or modify them to meet the use cases
AGENDA
Saturday March 26
7-8 AM Breakfast at the SEV
8-8:30 AM Welcome, Review of Agenda, progress report
8:30-9 AM Set up Use Case Working Groups
9-11 AM Use Case Working Groups
11-Noon Report back from working groups, VTC with other members for input
Noon-1:30 PM Lunch
1:30-2:30 PM Review of Controlled Vocabulary Terms of Reference – List Management
2:30-3:30 PM Work on using TemaTres software, including web services – work on taxonomies
3:30-5:00 PM Work on some draft taxonomies
6 PM Depart for Dinner in Socorro
AGENDA
Sunday March 27
7-8 AM Breakfast
8-9 AM Planning for workshop at SC with domain researchers
9-11 AM Work on Use Cases with focus on implementation steps
11-Noon Report on use cases, VTC with other members for input
Noon-1:30 PM Lunch
1:30-3 PM Write up use case scenarios, work on draft taxonomies
3-3:30 PM Wrap-up
4 PM Depart for ABQ hotels and airport
STATUS
TemaTres web-based thesaurus tool installed
Taxonomies implemented
  Habitats/Ecosystems
  Substances
  Processes
  Organisms
Terms classified (things, materials, activities/processes, properties, etc.)
A few terms recommended for removal
STATUS
416 terms are part of the polytaxonomy
  Includes some new higher-level terms
264 terms remain to be linked
Synonyms are listed, but not yet added
A production server has been established for the controlled vocabulary
  Ability to create instances for individual sites
Eda has worked a lot on import/export issues
USE CASE WORKING GROUPS
Straw Man List
  Vocabulary use for searching and browsing – Eda, Don, Corrina
  Putting the vocabulary into LTER documents – Kristin, Margaret, John
  List Management – decision processes
Focus first on WHO DOES WHAT (not how)
  May be a diagram/flow chart showing actors, actions and results
Once the first step is accomplished then consider:
  How it might be accomplished technically
  What resources would be required
  Who should be responsible for the implementation
WORKING GROUP NOTES
PUTTING WORDS IN DOCUMENTS
JP’s use case
Draft EML document
Use Duane’s HIVE tool to suggest probable words
Check off the ones you want; the tool returns one of:
  EML snippet to screen, for cut and paste into doc
  Or revised EML document with keywords added
  Or XML document with keywords (in keywordSet node, including thesaurus) to be used with web service client (allowing additions to relational databases etc.)
KRISTIN’S USE CASE
Populate Drupal web site with polytaxonomy
Within Drupal Metadata Editor - Browse – drop down list of levels, or search to find terms
Select term you want and it is automatically added to backend database that is used by the module that creates EML
MARGARET’S USE CASE
Browse or search keywords and check off desired terms
As things are checked off, generates internal list that is archived at a particular URL
Web service provides XML snippet that can go into EML
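A minimal sketch of the snippet such a web service might return, assuming the checked-off terms arrive as a plain list. The function name, the example terms, and the thesaurus label are illustrative assumptions, not part of any existing LTER service; only the keywordSet, keyword, and keywordThesaurus element names come from EML.

```python
import xml.etree.ElementTree as ET


def keyword_set_snippet(terms, thesaurus="LTER Controlled Vocabulary"):
    """Build an EML-style keywordSet element from checked-off terms.

    The default thesaurus label is an assumption for illustration.
    """
    keyword_set = ET.Element("keywordSet")
    for term in terms:
        ET.SubElement(keyword_set, "keyword").text = term
    ET.SubElement(keyword_set, "keywordThesaurus").text = thesaurus
    return ET.tostring(keyword_set, encoding="unicode")


# Hypothetical terms a user might have checked off while browsing.
snippet = keyword_set_snippet(["primary production", "soil moisture"])
print(snippet)
```

The returned string can be pasted into the dataset section of an EML document or handed to a web service client.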
USING FOR NON-DATASETS
E.g., publications, projects etc.
May not have EML representations
Browse or search to locate potential terms
Return:
  Simple list for inclusion (cut and paste) into publications etc.
  EML snippet as part of an XML document for use with a web service client to interface with desired systems
Note: this could also use HIVE search tool instead of raw browse
BEST PRACTICES
Need a best practices guide that addresses use of the controlled vocabulary
Goal – assure that LTER data is discoverable
Examples:
  Use the most specific terms you can
  Specify how many or what categories of terms should be included where applicable - examples
  Specifying a desirable number of terms
    E.g., at least one term from at least X of the LTER taxonomies
    Should have at least one core area
RATING DOCUMENTS
Run document through congruency checker
It says how many keywords and taxonomies are represented in an EML document
Allows checking for conformance with best practices
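The congruency check described above could be sketched as follows. This is an illustration under stated assumptions, not an existing tool: the TERM_TAXONOMY lookup is a hypothetical stand-in for a query against the vocabulary database, and the min_taxonomies threshold plays the role of the "at least X taxonomies" best practice.

```python
import xml.etree.ElementTree as ET

# Hypothetical term-to-taxonomy lookup; a real checker would query
# the controlled vocabulary database instead.
TERM_TAXONOMY = {
    "forests": "Habitats/Ecosystems",
    "nitrogen": "Substances",
    "decomposition": "Processes",
}


def check_congruency(eml_text, min_taxonomies=2):
    """Count keywords and the taxonomies they represent in an EML document."""
    root = ET.fromstring(eml_text)
    keywords = [k.text.strip().lower() for k in root.iter("keyword") if k.text]
    recognized = [k for k in keywords if k in TERM_TAXONOMY]
    taxonomies = {TERM_TAXONOMY[k] for k in recognized}
    return {
        "keywords": len(keywords),
        "recognized": len(recognized),
        "taxonomies": sorted(taxonomies),
        "meets_minimum": len(taxonomies) >= min_taxonomies,
    }


# Made-up document fragment with one keyword outside the vocabulary.
sample = """<dataset>
  <keywordSet>
    <keyword>Forests</keyword>
    <keyword>nitrogen</keyword>
    <keyword>my plot 7</keyword>
  </keywordSet>
</dataset>"""
report = check_congruency(sample)
print(report)
```

Keywords are matched case-insensitively here; whether a real checker should also resolve non-preferred synonyms is left open.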
WORKING GROUP - MANAGING VOCABULARY
Principles
Want to hit “sweet spot” for number of keywords
  Enough to make reasonable search and browsing possible
  Not so specific that only data from a particular site or dataset would be returned from a search
    Could be words used widely at a single site
    Want to avoid words that are too esoteric
The list should be modified periodically to capture additional words as they become widely used in the network
Each site should be able to propose new preferred terms, in suitable forms that are widely used in datasets from the site. A proposal should include justification, including information on related terms used at other LTER sites and where the term might be placed into the taxonomies
Sites can also propose non-preferred terms linked to existing preferred terms
Sites should be able to maintain independent, site-specific controlled vocabularies
CRITERIA FOR ACCEPTING OR REJECTING PROPOSED PREFERRED TERMS
The proposed terms should be suitable for inclusion (e.g., not locations or specific taxonomic identifiers)
Proposed terms should not be redundant with existing term(s) already in the vocabulary
Terms and their proposed places in taxonomies should conform to NISO Z39.19-2005 and successor documents (e.g., sections 6.5.1, 8.3)
CRITERIA FOR ACCEPTANCE OF PROPOSED NON-PREFERRED TERMS
The proposed terms should be suitable for inclusion (e.g., not locations or specific taxonomic identifiers)
The proposed terms must be sufficiently close synonyms to the preferred term to which they will be linked
CRITERIA FOR REMOVING OR ALTERING PREFERRED TERMS
Terms will never be altered, but they can be demoted to non-preferred status
Terms can only be removed if they are not currently in use by datasets
Removals or alterations of terms are expected to be rare
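The two rules above (demote rather than alter; remove only when no dataset uses the term) can be sketched as a small data structure. The class, its field names, and the example terms are hypothetical; the sketch only illustrates the rules and does not describe how TemaTres stores terms.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """Hypothetical in-memory model of the controlled vocabulary."""
    preferred: set = field(default_factory=set)
    non_preferred: dict = field(default_factory=dict)  # synonym -> preferred term
    in_use: set = field(default_factory=set)  # terms referenced by datasets

    def demote(self, term, replacement):
        """Demote a preferred term to a non-preferred synonym of replacement."""
        if term == replacement:
            raise ValueError("a term cannot replace itself")
        if term not in self.preferred or replacement not in self.preferred:
            raise ValueError("both terms must currently be preferred")
        self.preferred.remove(term)
        self.non_preferred[term] = replacement

    def remove(self, term):
        """Remove a term entirely, allowed only if no dataset uses it."""
        if term in self.in_use:
            raise ValueError(f"'{term}' is still in use by datasets")
        self.preferred.discard(term)
        self.non_preferred.pop(term, None)


# Made-up example: "discharge" is demoted in favor of "streamflow".
vocab = Vocabulary(preferred={"streamflow", "discharge"}, in_use={"discharge"})
vocab.demote("discharge", "streamflow")
try:
    vocab.remove("discharge")
    removal_blocked = False
except ValueError:
    removal_blocked = True
print(vocab.non_preferred, removal_blocked)
```

Because the demoted term stays in the vocabulary as a synonym, datasets that already use it continue to resolve to a preferred term.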
CHANGING LOCATION OF TERMS IN TAXONOMIES OR THESAURI
These have large subjective elements. Other resources should be frequently consulted when making changes
Sites or individuals can propose and justify changes that will be evaluated relative to NISO Z39.19
PROCESS
VOCAB committee may do research to identify terms that should be added based on use in site-specific vocabularies, use in datasets and other sources of information.
VOCAB committee receives and evaluates proposed changes
  Based on criteria, make changes to development version of the controlled vocabulary database
  The Controlled Vocabulary committee (VOCAB) may make immediate changes in the current official version to correct gross errors
New versions will be issued by VOCAB from time to time, and a request for endorsement will be forwarded to IMEXEC
SCIENCE COUNCIL WORKSHOP
Objective
  Engage SC members – sell on idea – develop some advocates
Process followed: Objectives - Rules for taxonomies
Get guidance on specific issues
  The Controlled vocabulary
    Need for related terms?
    Are there things missing? – core areas?
    Are there things that should be removed?
    Are there things that are out of place?
    Specific areas of concern
  Use Cases
    Feedback on proposed uses
    Priorities for getting implemented
Tasks before workshop
  Add definitions for all words to the taxonomy
    Prioritize ones that are difficult
  Get way to display entire vocab.
  Improve diagram for content
  Send SC members link to TemaTres – have them do test searches
AGENDA
Introduction – 1 hour
  Around the room introductions
  Why we need controlled vocabulary
  Steps taken so far
  Background – procedures for creating controlled vocabularies
  Meeting objectives
How to use Controlled Vocabulary – 1 hour
  Questions for SC members
    What are your experiences with finding LTER data?
    What would most help you find data in the future?
  Discussion of data discovery use cases
SC AGENDA
Tour of Controlled Vocabulary – 1 hour
  General Introduction
  Breakout groups (pair SC member with IM) to look at areas of specific interest
Feedback to entire group on things in the controlled vocabulary that need improvement – 1 hour
Discussion of specific issues
  Core areas as top level hierarchy now integrated elsewhere
  Management of the vocabulary – role of researchers
Discussion of next steps
  How do we engage larger LTER community?
  How much, and what sort of engagement is needed?