CONTROLLED VOCABULARY WORKSHOP MARCH 26-27, 2011


OBJECTIVES

Finalize VOCAB “Terms of Reference”
Define use cases for the keyword database and its development
Develop procedures for capturing and managing keyword taxonomies
DONE: Identify suitable existing database structures or software for managing the controlled vocabulary and adopt or modify them to meet the use cases


AGENDA

Saturday March 26

7-8 AM  Breakfast at the SEV
8-8:30 AM  Welcome, review of agenda, progress report
8:30-9 AM  Set up Use Case Working Groups
9-11 AM  Use Case Working Groups
11-Noon  Report back from working groups, VTC with other members for input
Noon-1:30 PM  Lunch
1:30-2:30 PM  Review of Controlled Vocabulary Terms of Reference – List Management
2:30-3:30 PM  Work on using TemaTres software, including web services – work on taxonomies
3:30-5:00 PM  Work on some draft taxonomies
6 PM  Depart for dinner in Socorro


AGENDA

Sunday March 27

7-8 AM  Breakfast
8-9 AM  Planning for workshop at SC with domain researchers
9-11 AM  Work on use cases with focus on implementation steps
11-Noon  Report on use cases, VTC with other members for input
Noon-1:30 PM  Lunch
1:30-3 PM  Write up use case scenarios, work on draft taxonomies
3-3:30 PM  Wrap-up
4 PM  Depart for ABQ hotels and airport


STATUS

TemaTres web-based thesaurus tool installed
Taxonomies implemented
  Habitats/Ecosystems
  Substances
  Processes
  Organisms
Terms classified (things, materials, activities/processes, properties, etc.)
A few terms recommended for removal


STATUS

416 terms are part of the polytaxonomy
  Includes some new higher-level terms
264 terms remain to be linked
Synonyms are listed, but not yet added
A production server has been established for the controlled vocabulary
  Ability to create instances for individual sites
Eda has worked a lot on import/export issues


USE CASE WORKING GROUPS

Straw Man List
  Vocabulary use for searching and browsing – Eda, Don, Corrina
  Putting the vocabulary into LTER documents – Kristin, Margaret, John
  List Management – decision processes
Focus first on WHO DOES WHAT (not how)
  May be a diagram/flow chart showing actors, actions and results
Once the first step is accomplished then consider:
  How it might be accomplished technically
  What resources would be required
  Who should be responsible for the implementation


WORKING GROUP NOTES


PUTTING WORDS IN DOCUMENTS

JP’s use case
  Draft EML document
  Use Duane’s HIVE tool to suggest probable words
  Check off the ones you want; the tool returns one of:
    EML snippet to screen, for cut and paste into the document
    Revised EML document with keywords added
    XML document with keywords (in keywordSet node, including thesaurus) to be used with a web service client (allowing additions to relational databases etc.)
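The "EML snippet" output can be pictured with a minimal sketch. It assumes the checked-off terms arrive as a plain list of strings; the function name, the sample terms, and the thesaurus label are hypothetical and are not taken from the HIVE tool or any workshop software. Only the keywordSet, keyword, and keywordThesaurus element names come from EML.

```python
import xml.etree.ElementTree as ET

def build_keyword_set(terms, thesaurus_name):
    """Return an EML-style <keywordSet> fragment as a string.

    `terms` is the list of keywords the user checked off (hypothetical input).
    """
    keyword_set = ET.Element("keywordSet")
    for term in terms:
        ET.SubElement(keyword_set, "keyword").text = term
    # The thesaurus element records which vocabulary the keywords came from.
    ET.SubElement(keyword_set, "keywordThesaurus").text = thesaurus_name
    return ET.tostring(keyword_set, encoding="unicode")

# Sample terms and thesaurus label are placeholders for illustration only.
snippet = build_keyword_set(
    ["primary production", "soil moisture"],
    "LTER Controlled Vocabulary",
)
print(snippet)
```

The printed fragment is what a user would paste into the draft EML document; the "revised EML document" variant would insert the same element into the parsed document instead of printing it.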


KRISTIN’S USE CASE

Populate Drupal web site with polytaxonomy

Within Drupal Metadata Editor - Browse – drop down list of levels, or search to find terms

Select term you want and it is automatically added to backend database that is used by the module that creates EML


MARGARET’S USE CASE

Browse or search keywords and check off desired terms

As things are checked off, generates internal list that is archived at a particular URL

Web service provides XML snippet that can go into EML


USING FOR NON-DATASETS

E.g., publications, projects etc.
May not have EML representations
Browse or search to locate potential terms
Return
  Simple list for inclusion (cut and paste) into publications etc.
  EML snippet as part of an XML document for use with a web service client to interface with desired systems
Note: this could also use HIVE search tool instead of raw browse


BEST PRACTICES

Need a best practices guide that addresses use of the controlled vocabulary

Goal – assure that LTER data is discoverable
Examples:
  Use the most specific terms you can
  Specify how many or what categories of terms should be included where applicable - examples
    Specifying a desirable number of terms
    E.g., at least one term from at least X of the LTER taxonomies
    Should have at least one core area


RATING DOCUMENTS

Run document through congruency checker

It reports how many keywords and taxonomies are represented in an EML document

Allows checking for conformance with best practices
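A rough sketch of what such a checker might compute, under stated assumptions: the term-to-taxonomy lookup below is a hypothetical stand-in for a query against the vocabulary database, the sample document is invented, and `min_taxonomies` is an illustrative threshold for the "at least X taxonomies" style of best practice. It is not the actual congruency checker.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup; a real checker would query the controlled vocabulary.
TERM_TAXONOMY = {
    "forests": "Habitats/Ecosystems",
    "nitrogen": "Substances",
    "decomposition": "Processes",
}

def rate_document(eml_text, min_taxonomies=2):
    """Count keywords and taxonomies represented in an EML-like document."""
    root = ET.fromstring(eml_text)
    keywords = [k.text.strip().lower() for k in root.iter("keyword") if k.text]
    matched = [k for k in keywords if k in TERM_TAXONOMY]
    taxonomies = {TERM_TAXONOMY[k] for k in matched}
    return {
        "keywords": len(keywords),
        "vocabulary_keywords": len(matched),
        "taxonomies": len(taxonomies),
        # Conformance here means only "enough taxonomies are represented".
        "conforms": len(taxonomies) >= min_taxonomies,
    }

sample = """<dataset>
  <keywordSet>
    <keyword>forests</keyword>
    <keyword>nitrogen</keyword>
    <keyword>my local plot</keyword>
  </keywordSet>
</dataset>"""

report = rate_document(sample)
print(report)
```

In this sample two of the three keywords come from the vocabulary and they span two taxonomies, so the document passes the illustrative threshold; the site-specific keyword is counted but does not contribute.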


WORKING GROUP - MANAGING VOCABULARY

Principles: want to hit “sweet spot” for number of keywords
  Enough to make reasonable search and browsing possible
  Not so specific that only data from a particular site or dataset would be returned from a search
  Could be words used widely at a single site
  Want to avoid words that are too esoteric

The list should be modified periodically to capture additional words as they become widely used in the network

Each site should be able to propose new preferred terms, in suitable forms that are widely used in datasets from the site. A proposal should include justification, including information on related terms used at other LTER sites and where the term might be placed into the taxonomies

Sites can also propose non-preferred terms linked to existing preferred terms

Sites should be able to maintain independent, site-specific controlled vocabularies


CRITERIA FOR ACCEPTING OR REJECTING PROPOSED PREFERRED TERMS

The proposed terms should be suitable for inclusion (e.g., not locations or specific taxonomic identifiers)

Proposed terms should not be redundant with existing term(s) already in the vocabulary

Terms and their proposed places in taxonomies should conform in form with NISO Z39.19-2005 and successor documents (e.g., sections 6.5.1, 8.3)
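Of these criteria, only the redundancy check lends itself to automation. A minimal sketch, assuming the vocabulary is available as two lists of labels (the function names and sample terms are hypothetical): it catches only exact matches after case and whitespace normalization, so near-synonyms, suitability, and Z39.19 conformance would still need review by a person.

```python
def normalize(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())

def is_redundant(proposed, preferred_terms, non_preferred_terms):
    """True if the proposed term already exists as a preferred or non-preferred term."""
    existing = {normalize(t) for t in preferred_terms}
    existing |= {normalize(t) for t in non_preferred_terms}
    return normalize(proposed) in existing

# Placeholder vocabulary for illustration only.
preferred = ["soil moisture", "primary production"]
non_preferred = ["soil water content"]

print(is_redundant("Soil  Moisture", preferred, non_preferred))
print(is_redundant("permafrost", preferred, non_preferred))
```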


CRITERIA FOR ACCEPTANCE OF PROPOSED NON-PREFERRED TERMS

The proposed terms should be suitable for inclusion (e.g., not locations or specific taxonomic identifiers)

The proposed terms must be sufficiently close synonyms to the preferred term to which they will be linked


CRITERIA FOR REMOVING OR ALTERING PREFERRED TERMS

Terms will never be altered, but they can be demoted to non-preferred status

Terms can only be removed if they are not currently in use by datasets

Removals or alterations of terms are expected to be rare
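These rules can be sketched as a small decision function. The `Term` record, its `datasets_using` count, and the `retire` helper are assumptions made for illustration; in particular, falling back to demotion when a term is still in use is one possible reading of the rules, not a stated procedure.

```python
from dataclasses import dataclass

@dataclass
class Term:
    label: str
    preferred: bool = True
    datasets_using: int = 0  # hypothetical count of datasets citing the term

def can_remove(term):
    """A term may be removed only if no dataset currently uses it."""
    return term.datasets_using == 0

def retire(term, vocabulary):
    """Remove an unused term; otherwise demote it to non-preferred status.

    The label itself is never altered.
    """
    if can_remove(term):
        vocabulary.remove(term)
        return "removed"
    term.preferred = False
    return "demoted"

# Placeholder terms for illustration only.
unused = Term("obsolete term")
in_use = Term("widely used term", datasets_using=12)
vocabulary = [unused, in_use]

print(retire(unused, vocabulary))
print(retire(in_use, vocabulary))
```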


CHANGING LOCATION OF TERMS IN TAXONOMIES OR THESAURI

These have large subjective elements. Other resources should be frequently consulted when making changes

Sites or individuals can propose and justify changes that will be evaluated relative to NISO Z39.19


PROCESS

VOCAB committee may do research to identify terms that should be added based on use in site-specific vocabularies, use in datasets and other sources of information.

VOCAB committee receives and evaluates proposed changes
  Based on the criteria, makes changes to the development version of the controlled vocabulary database
  The Controlled Vocabulary committee (VOCAB) may make immediate changes in the current official version to correct gross errors

New versions will be issued by VOCAB from time-to-time, and a request for endorsement will be forwarded to IMEXEC


SCIENCE COUNCIL WORKSHOP

Objective
  Engage SC members – sell on idea – develop some advocates
  Process followed: Objectives - Rules for taxonomies
Get guidance on specific issues
  The controlled vocabulary
    Need for related terms?
    Are there things missing? – core areas?
    Are there things that should be removed?
    Are there things that are out of place?
    Specific areas of concern
  Use Cases
    Feedback on proposed uses
    Priorities for getting implemented
Tasks before workshop
  Add definitions for all words to the taxonomy
    Prioritize ones that are difficult
  Get way to display entire vocab.
  Improve diagram for content
  Send SC members link to TemaTres – have them do test searches


AGENDA

Introduction – 1 hour
  Around the room introductions
  Why we need controlled vocabulary
  Steps taken so far
  Background – procedures for creating controlled vocabularies
  Meeting objectives
How to use Controlled Vocabulary – 1 hour
  Questions for SC members
    What are your experiences with finding LTER data?
    What would most help you find data in the future?
  Discussion of data discovery use cases


SC AGENDA

Tour of Controlled Vocabulary – 1 hour
  General introduction
  Breakout groups (pair SC member with IM) to look at areas of specific interest
Feedback to entire group on things in the controlled vocabulary that need improvement – 1 hour
Discussion of specific issues
  Core areas as top level hierarchy now integrated elsewhere
  Management of the vocabulary – role of researchers
Discussion of next steps
  How do we engage larger LTER community?
  How much, and what sort of engagement is needed?