Controlled Vocabulary Working Group - 2013 PRESENTED BY JOHN PORTER


Page 1: Controlled Vocabulary Working Group - 2013

Controlled Vocabulary Working Group - 2013

PRESENTED BY JOHN PORTER

Page 2: Controlled Vocabulary Working Group - 2013

Goal

Make it easy for researchers to find the data they need from LTER repositories by:

  • Enhancing searches through the use of a thesaurus that provides synonyms, narrower terms and related terms
  • Creating a browseable structure for locating datasets
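As an illustration of how a thesaurus can enhance a search, the sketch below expands a user's query with synonyms and narrower terms before it is run against a catalog. It is not the working group's implementation: the `Term` structure, the `expand_query` function and the sample entries (apart from "aboveground biomass", which appears in the Best Practices table) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term in a toy thesaurus (illustrative structure only)."""
    preferred: str
    synonyms: set[str] = field(default_factory=set)  # non-preferred "use for" terms
    narrower: set[str] = field(default_factory=set)  # more specific preferred terms
    related: set[str] = field(default_factory=set)   # associated preferred terms


def expand_query(query: str, thesaurus: dict[str, Term],
                 include_related: bool = False) -> set[str]:
    """Return every string worth searching for, given one query term."""
    query = query.strip().lower()

    # If the user typed a synonym, map it back to its preferred term.
    preferred = query
    if query not in thesaurus:
        for term in thesaurus.values():
            if query in term.synonyms:
                preferred = term.preferred
                break
    if preferred not in thesaurus:
        return {query}  # unknown term: search for it exactly as typed

    # Walk down the narrower-term hierarchy, collecting synonyms on the way.
    expanded = {query}
    stack, seen = [preferred], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        expanded.add(name)
        term = thesaurus.get(name)
        if term is not None:
            expanded |= term.synonyms
            stack.extend(term.narrower)

    if include_related:
        expanded |= thesaurus[preferred].related
    return expanded


# Hypothetical sample entries, for demonstration only.
THESAURUS = {
    "biomass": Term("biomass", synonyms={"standing crop"},
                    narrower={"aboveground biomass"},
                    related={"primary production"}),
    "aboveground biomass": Term("aboveground biomass",
                                synonyms={"above-ground biomass"}),
}
```

Related terms are left out by default in this sketch because they broaden a search rather than restate it; a real search interface might offer them as suggestions instead.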

Page 3: Controlled Vocabulary Working Group - 2013

2013 Goals

Enhance term list to incorporate:

  • New terms suggested by sites
  • Frequently searched terms
  • Frequently used terms
  • Terms related to human activities (social science)
  • More synonyms for existing terms that are found in LTER Metadata

Needed:

  • Establish clear criteria for evaluating candidate terms
  • Best Practices

Page 4: Controlled Vocabulary Working Group - 2013

Goals

Add definitions for terms in the Controlled Vocabulary

Create plans for dealing with taxonomic names and places that are currently not part of the existing Controlled Vocabulary

Page 5: Controlled Vocabulary Working Group - 2013

Workshop – May 2013

Pre-Workshop

  • Queried LTER Sites for new candidate terms - Melendez, Henshaw, Vanderbilt
  • Queried existing documents for words not currently in the Controlled Vocabulary - Gastil-Buhl
  • Queried logs for search terms used by Metacat users - Costa
  • Updated Tematres software to the latest version - Porter
  • Identified online sources for definitions - O’Brien, Vanderbilt
  • Investigated taxonomic web services and gazetteers - Gries

Note: the group favors using Taxonomic and Geographic Coverage elements rather than keywords for taxonomic names and places

Page 6: Controlled Vocabulary Working Group - 2013

Workshop Participants 2013

LTER Information Managers: Margaret O’Brien, Kristen Vanderbilt, Donald Henshaw and John Porter

Professional Librarians from UVA: Sherry Lake and Ivey Glendon

  • Added a lot to our discussions
  • “about” vs. “contains” taxonomies
      • Our focus is describing what datasets contain
      • “about” is much harder to define for data

Page 7: Controlled Vocabulary Working Group - 2013

Workshop Results 2013

New Terms

  • ~230 terms were suggested by 4 sites
  • ~75 terms were accepted and added to LTER Vocabulary
  • Reason for rejection was given for each term not added
  • ~25 additional terms were added based on use at 3 or more LTER Sites or 2 or more sites with > 10 datasets
  • Several suggested terms were added as non-preferred (UF) terms

Definitions

  • 309 new definitions added
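The usage criterion for the ~25 additional terms can be written as a small check. This is a sketch under one stated assumption: the slide does not say whether "> 10 datasets" is counted per site or in total, so the function below counts the total across sites. The function name and the site codes in the usage example are hypothetical.

```python
def meets_usage_threshold(datasets_per_site: dict[str, int]) -> bool:
    """Check the rule: used at 3+ sites, or at 2+ sites with > 10 datasets.

    ASSUMPTION: "> 10 datasets" is read as the total number of datasets
    across all sites that use the term, not a per-site minimum.
    """
    counts = [n for n in datasets_per_site.values() if n > 0]
    sites_using = len(counts)
    total_datasets = sum(counts)
    return sites_using >= 3 or (sites_using >= 2 and total_datasets > 10)
```

For example, `meets_usage_threshold({"SITE_A": 6, "SITE_B": 5})` is true under this reading (2 sites, 11 datasets), while a term used in 40 datasets at a single site does not qualify.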

Page 8: Controlled Vocabulary Working Group - 2013

Controlled Vocabulary Status

710 total preferred terms

200 synonyms (“use for” terms)

363 total definitions

Page 9: Controlled Vocabulary Working Group - 2013

Important Workshop Activities - 2013

Developed improved Best Practices for identifying additional terms for inclusion (http://im.lternet.edu/VocabBestPractices)

  • Including a table that lays out grounds for rejecting particular words

Page 10: Controlled Vocabulary Working Group - 2013

| What | Rationale | Do’s | Problem Abbreviation |
|---|---|---|---|
| Keywords should be applied to a number of datasets across the LTER Network. | Data discovery is the goal, so keywords that find data are most useful. | Propose keywords that are used at several other sites and in numerous datasets. | NR - not repeated in multiple datasets |
| Keywords should be used at more than one site. | A goal is to enable cross-site searching. | Propose keywords that are used at several other sites. | A - absent from other sites |
| Avoid proposing stand-alone adjectives. | Stand-alone adjectives imply an “of what” question. For example, “aboveground” raises the question “aboveground what?” | Propose nouns or possibly verbs, but not stand-alone adjectives. Preferred terms can include an adjective with an object (e.g., aboveground biomass). | ADJ - stand-alone adjective |
| Be specific. | Vague or ill-defined terms are hard to assign consistently. | Use specific, unambiguous and well-defined terms. | V - vague |
| Avoid duplicating concepts already in the Controlled Vocabulary. | Duplicative keywords lead to inconsistent keyword assignments. | Avoid duplication of nearly equivalent terms. | AWE - adequate alternative word exists |
| Keywords should be well-defined. | Without definition and context, some technical terms may be difficult to assess or place. | Provide good definitions. | NC - needs clarification or better definition |
| Proposed synonyms should have exact correspondence to the preferred term. | Synonyms should not refer to different concepts than the associated preferred term. | Select synonyms that are exact matches for the concept described by the preferred term. | NS - not a synonym |
| Keywords should be terms that users frequently search on. | Keywords that are not searched for by users are not particularly useful. | Propose keywords that are frequently used in searches. | NU - not used for search |
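When reviewing candidate terms, the problem abbreviations from the table can be kept in a lookup so that each rejected term carries a readable reason. The sketch below is illustrative only: the codes and descriptions come from the table, while the `Rejection` enum and `describe_rejection` helper are hypothetical names.

```python
from enum import Enum


class Rejection(Enum):
    """Problem abbreviations from the Best Practices table."""
    NR = "not repeated in multiple datasets"
    A = "absent from other sites"
    ADJ = "stand-alone adjective"
    V = "vague"
    AWE = "adequate alternative word exists"
    NC = "needs clarification or better definition"
    NS = "not a synonym"
    NU = "not used for search"


def describe_rejection(codes: str) -> list[str]:
    """Expand a comma-separated code string such as 'ADJ, V'.

    Raises KeyError for a code that is not in the table.
    """
    reasons = []
    for raw in codes.split(","):
        code = Rejection[raw.strip().upper()]
        reasons.append(f"{code.name} - {code.value}")
    return reasons
```

A candidate term can fail on more than one ground, which is why the helper accepts several codes at once.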

Page 11: Controlled Vocabulary Working Group - 2013

Vision

Refining the “Vision” for how the controlled vocabulary can be used to make PASTA and other NIS elements more effective

  • And link to other efforts such as DataOne, LODE and EnvThes

Optional workshop yesterday – tasks identified:

  • Identify systems and software tools that effectively exploit controlled vocabularies for searching/browsing and ranking
  • Metrics tools: help identify specific datasets that could benefit from additional keywords
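One possible shape for such a metrics tool is sketched below: it counts how many of each dataset's keywords are in the controlled vocabulary and reports the datasets that fall below a minimum. This is an assumption about what the tool could do, not a description of an existing one. The function name, the `min_matches` parameter and the dataset identifiers in the usage example are hypothetical.

```python
def datasets_needing_keywords(dataset_keywords: dict[str, list[str]],
                              vocabulary: set[str],
                              min_matches: int = 1) -> list[tuple[str, int]]:
    """List (dataset_id, match_count) for datasets with too few vocabulary terms.

    Matching is case-insensitive and exact; synonyms are not resolved here.
    """
    vocab = {term.lower() for term in vocabulary}
    report = []
    for dataset_id, keywords in dataset_keywords.items():
        normalized = {k.strip().lower() for k in keywords}
        matches = len(normalized & vocab)
        if matches < min_matches:
            report.append((dataset_id, matches))
    # Fewest matches first, then by identifier for a stable order.
    return sorted(report, key=lambda item: (item[1], item[0]))
```

For example, with a vocabulary of `{"biomass", "temperature"}` and `min_matches=2`, a dataset tagged only with "Biomass" and a local term would be reported with one match.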

Page 12: Controlled Vocabulary Working Group - 2013

Help us out!

During discussions today and tomorrow, think about how the Controlled Vocabulary can be leveraged

Incorporate terms from the Controlled Vocabulary into your site EML documents

ASK us if you need help! We have tools
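To see which keywords in a site's EML document already come from the Controlled Vocabulary, something like the following sketch could be used. It is not one of the working group's tools. It assumes keywords sit in `<keyword>` elements inside `<keywordSet>`, as in EML, and the sample document is a trimmed, illustrative fragment rather than a complete valid EML file. `split_keywords` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative EML fragment (not a complete valid document).
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
         packageId="example.1.1" system="example">
  <dataset>
    <title>Example dataset</title>
    <keywordSet>
      <keyword>aboveground biomass</keyword>
      <keyword>my local term</keyword>
      <keywordThesaurus>LTER Controlled Vocabulary</keywordThesaurus>
    </keywordSet>
  </dataset>
</eml:eml>
"""


def split_keywords(eml_text: str, vocabulary: set[str]) -> tuple[list[str], list[str]]:
    """Return (keywords in the vocabulary, keywords not in it) for one EML document."""
    root = ET.fromstring(eml_text)
    vocab = {term.lower() for term in vocabulary}
    in_vocab, not_in_vocab = [], []
    # <keyword> elements carry no namespace prefix in this fragment.
    for element in root.iter("keyword"):
        keyword = (element.text or "").strip()
        if not keyword:
            continue
        if keyword.lower() in vocab:
            in_vocab.append(keyword)
        else:
            not_in_vocab.append(keyword)
    return in_vocab, not_in_vocab
```

Keywords in the second list are candidates either for replacement with an existing preferred term or for proposal as new terms.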