Inline Tagging GENA SAN NICOLAS EDITOR/TAXONOMIST

Inline Tagging and Dictionary Connection

DESCRIPTION

Gena M. San Nicolas, a taxonomist and biology subject-matter expert (SME) at Access Innovations, Inc., shows how Data Harmony's machine-aided indexing (M.A.I.) module produces tagged subject terms within bodies of text for XML and other repositories. This aids in search and leverages subject metadata, resulting in added value to data collections.

Page 1: Inline Tagging and Dictionary Connection

Inline Tagging

GENA SAN NICOLAS

EDITOR/TAXONOMIST

Page 2: Inline Tagging and Dictionary Connection

Introduction

What’s the big deal about Data Harmony, anyway?

My background: biology

Searching through science databases was tedious and laborious

Frequently, the only way to tell if an article was what you wanted was to actually read the whole thing

Costly if your institution didn’t have access rights to that particular publication.

Page 3: Inline Tagging and Dictionary Connection

Data Harmony allows the user to “browse the book”

Rulebase allows editors to assign context to full-text and disambiguate terms

Indexing terms are XML-tagged by Data Harmony in the document

Rulebase is auto-generated but is easily edited
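
To make the idea of context-based disambiguation concrete, here is a minimal sketch of a toy rule base. It is not Data Harmony's rule syntax or engine; the `Rule` class, the `index_terms` function, and the sample terms are all hypothetical, and the sketch only handles single-word triggers.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: assign `term` when `trigger` appears with context."""
    trigger: str          # single lowercase word that fires the rule
    context: tuple        # at least one of these words must also appear
    term: str             # thesaurus term to assign


def index_terms(text, rules):
    """Return the thesaurus terms whose rules match the text, in rule order."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    terms = []
    for rule in rules:
        if rule.trigger not in words:
            continue
        # An empty context means the trigger alone is enough.
        if rule.context and not (words & set(rule.context)):
            continue
        if rule.term not in terms:
            terms.append(rule.term)
    return terms


rules = [
    Rule("culture", ("cell", "agar", "medium"), "Cell culture"),
    Rule("culture", ("society", "language"), "Cultural anthropology"),
]

print(index_terms("The culture was grown on agar plates.", rules))
# ['Cell culture']
```

The same trigger word, "culture", maps to different terms depending on the words around it, which is the kind of editorial judgment a rule captures.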

Page 4: Inline Tagging and Dictionary Connection

“Easily edited”—easy for an experienced editor

Test MAI

Look at indexing results

Compare rule to trigger words in full-text test

Tweak rule as necessary
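
The test-and-tweak loop above can be pictured as a comparison between the terms an editor expects for a test document and the terms the indexer actually returned. The sketch below is an illustration only; `compare_indexing` is a hypothetical helper, not part of M.A.I.

```python
def compare_indexing(expected_terms, actual_terms):
    """Report which expected terms were missed and which terms were unexpected."""
    expected, actual = set(expected_terms), set(actual_terms)
    return {
        "missed": sorted(expected - actual),       # a rule may be too strict
        "unexpected": sorted(actual - expected),   # a rule may be too loose
    }


report = compare_indexing(
    expected_terms=["Cell culture", "Stem cells"],
    actual_terms=["Cell culture", "Cultural anthropology"],
)
print(report)
```

A missed term suggests checking the rule's trigger words against the test text; an unexpected term suggests adding context conditions to the rule.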

Page 5: Inline Tagging and Dictionary Connection

We’ve made it easy for you!

Page 6: Inline Tagging and Dictionary Connection

But wait!!! There’s more!!!

Page 7: Inline Tagging and Dictionary Connection

With Inline Tagging, we make it EVEN EASIER for you!!!

What is Inline Tagging?

From the DH-Inline tagging documentation: “Access Innovations’ Inline Tagging function finds and labels thesaurus concepts (identified by rules stored within the thesaurus rule base) within the full text of an article (in XML or PDF format) by applying XML wrappers, or “tags”. The process of adding XML tags within content is called “inline tagging.” Thanks to the XML format, metadata can be included within content files, not just in a set-aside area at the beginning or end of the file, but woven into the very text.”

This lets users truly “browse the book” according to their content management needs.
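
As a rough illustration of what "applying XML wrappers" means, the sketch below wraps trigger phrases in a piece of text with an XML element. It assumes a simple phrase-to-term lookup rather than a real rule base, and the `<concept>` element and `term` attribute are made-up names, not the tags Data Harmony emits.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def tag_inline(text, concepts):
    """Wrap each trigger phrase found in `text` in a hypothetical <concept> tag.

    `concepts` maps a trigger phrase to the thesaurus term it stands for.
    """
    if not concepts:
        return escape(text)
    # Try longer phrases first so "stem cell" wins over "cell".
    phrases = sorted(concepts, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    lookup = {phrase.lower(): term for phrase, term in concepts.items()}

    pieces, last = [], 0
    for match in pattern.finditer(text):
        pieces.append(escape(text[last:match.start()]))   # untagged text
        term = lookup[match.group(1).lower()]
        pieces.append(
            f"<concept term={quoteattr(term)}>{escape(match.group(1))}</concept>"
        )
        last = match.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces)


concepts = {"stem cell": "Stem cells", "cell": "Cells"}
print(tag_inline("Stem cell research", concepts))
# <concept term="Stem cells">Stem cell</concept> research
```

The subject metadata ends up inside the running text itself, so a downstream system can link, highlight, or index each tagged phrase where it occurs.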

Page 8: Inline Tagging and Dictionary Connection

We take this MAIstro output:

Page 9: Inline Tagging and Dictionary Connection

…and turn it into this:

Page 10: Inline Tagging and Dictionary Connection

HTML output format is completely customizable

Page 11: Inline Tagging and Dictionary Connection

MAIstro Inline Tagging Web Service

To facilitate integration of Data Harmony's MAIstro suite with a publishing pipeline or other workflow, a simple web service can be installed that performs automatic indexing. This web service is an abstraction of the Java APIs that MAIstro uses.

The web service has two functions:

TestSettings: For configuration and debugging

GetTerms: Calls MAIstro's GetTerms API and returns a formatted document with the subject terms tagged inline with XML tags
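
Once a pipeline has an inline-tagged document in hand, collecting the tagged terms is straightforward. The sketch below does not call the web service and does not reflect its actual request or response format; it assumes a well-formed XML document that uses a hypothetical `<concept term="...">` wrapper, and `collect_terms` is an illustrative helper.

```python
import xml.etree.ElementTree as ET


def collect_terms(tagged_xml):
    """Return the distinct `term` values of <concept> elements, in document order."""
    root = ET.fromstring(tagged_xml)
    terms = []
    for element in root.iter("concept"):   # hypothetical tag name
        term = element.get("term")
        if term and term not in terms:
            terms.append(term)
    return terms


tagged = (
    '<p><concept term="Stem cells">Stem cell</concept> lines and '
    '<concept term="Stem cells">stem cells</concept> in '
    '<concept term="Cell culture">culture</concept>.</p>'
)
print(collect_terms(tagged))
# ['Stem cells', 'Cell culture']
```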

Page 12: Inline Tagging and Dictionary Connection