Data Harmony Version 3.9 Features Update

Marjorie M.K. Hlava, President and founder of Access Innovations, Inc., unveils the newest version and module updates of the Data Harmony indexing software suite.

Marjorie M.K. Hlava
mhlava@accessinn.com
Access Innovations, Inc.

www.accessinn.com

Leveraging your content semantically

10th Annual Data Harmony User Group Meeting

DH Technical Support Team

Development programming team: Lamine Idjeraoui **, Allexander Lyons, Daniel Vasicek, Scott Roberts, Doug Vendcat

Customer support: Mary Garcia **, Jack Bruce, Gabe Carr, Samantha Lewis

Documentation: Jack Bruce **, Kirk Sanders, Gena San Nicolas, Barbara Gilles

Systems: Tom Peterson **, SWCP

DH Customer Support Team

Sales and Licensing: Marjorie Hlava, Janice McIntyre, Bill Richardson, Jay Ven Eman **, Leland Yates

Blog and Web team: Barbara Gilles, Melody Smith **, Timothy Soholt **

Marketing: Heather Kotula **, Ashley Beard

Editorial Team Taxonomy and Rule Building

Gabe Carr, Jack Bruce, Kathy Brown, Barbara Gilles, Bob Kasenchak **, Samantha Lewis, Kirk Sanders, Tim Soholt, Gena San Nicolas, Alice Redmond-Neal, Eric Ziecker

Access Integrity

Kathy Brown, Jerry Jorgeson, John Kuranz **, Leland Yates, Access Rule Building Team, Access Programming Team

Who’s Who?

Introduce yourself
Relationship to Data Harmony
Where do you use Data Harmony
Project Name(s)

Access Innovations: What do we do?

Four Divisions
Database Services
Data Harmony
  NewsIndexer
National Information Center for Educational Media (NICEM)
  MediaSleuth
Access Integrity
  Medical Claims Compliance
  Integracoder

Database Services

Database Design
  Consulting
  DTD / Metadata Schemas
  Workflow
  Scheduling

Editorial Services
  Metadata capture and creation
  Tagging – XML, SGML
  Abstracting
  Indexing
  Author disambiguation

Database Services - 2

Taxonomy Construction
  Thesaurus
  Vocabulary
  Ontology
  Data Linking (linked data)
  Authority Files – pick lists
  Rule Bases

Semantic Enrichment
  Data Format Conversion
  Database Applications
  Retrospective metadata tagging
  Author disambiguation

Database Services - 3

Applications development
Search – Lucene and Solr
Search Harmony interface
Web services layer
Link to user experience or user interface
Web calls
API setup and linking
www.accessinn.com

Data Harmony

Built for our use starting in 1987
Visual Basic, C++, Java
Aid to the editorial and indexing processes
Alleviate the clerical aspects
Speed the tagging process
Guarantee accuracy, consistency, and depth of indexing

Data Harmony Suite – Main Modules

M.A.I.
Thesaurus Master
XIS (XML Intranet System)
Administrative configuration module
“The Data Harmony Suite”

Tech stuff
Downloadable
Documentation revised 2014
APIs for client server versions
Internet accessible
Cloud and SaaS
Full multilingual display
Unicode - Accepts ASCII data
Entification tables converted
Drivers for display and print for most languages

Data Harmony

Java
Platform independent
Applet modules
Web services
APIs
XML
TCP/IP
JSON and SSL on Web Start
GlassFish for extension support
www.dataharmony.com

Full multilingual display

Data Harmony

Machine Aided Indexing (M.A.I.)
  Semantic, syntactic, morphological, etc. layer
  Rule Builder for users
  Concept Extractor for text
  Statistics for Machine Learning
  Use in automatic, batch, or assisted mode

Thesaurus Master
  For creating taxonomies, thesauri, ontologies, and authority files

MAIstro
  Thesaurus Master and M.A.I. combined

Data Harmony Extensions

Inline Tagging
Metadata Extractor
MAIChem
Search Harmony
SharePoint integration
Recommender

New

DH Author Submission System
Author / Name Disambiguation
MAIBatch GUI
Semantic Fingerprinting
Web Start
Sneak Peek at “Ontology Master”

Retiring

Automatic Summarizer
WebThes
ThesViewer

TaxoDiary

Daily blog
Weekly feature
3 + items per day
Big archive
Launched in June 2010

DH Bulletin Board Exchange
http://dhd.accessinn.com

Data Harmony Forum

Discussion threads
Solutions to reported problems
Access to the newest documentation
Announcements of features
Bug reports
Enhancement requests

Data Harmony Partners

EJ Press
MarkLogic
Really Strategies (RSuite)
Yuxi Xquire
Publishing Technology
More…

Some DH Connectors & Exports…

ACD/Labs’

Lucene (org. & Solr)

Perfect Search

Oracle/Stellent Universal Content Management

Jive Software’s Clearspace

EJ Press

Publishing Technology

OpenOffice

Mark Logic’s MarkLogic Server

Microsoft’s SharePoint

NorthPlains

Temis

Synaptica

and more…

Other DH offerings

Off-the-shelf taxonomy
  Term records
  Browseable list
  Rule bases

Consulting
  Information architecture
  DTD and schema creation
  Search implementation

Knowledge Domains in over 40 subject areas.
• Agriculture
• Applied Technologies
• Business (popular)
• Business and Finance
• Communications
• Computer and Information Science (popular)
• Computer Science
• Consumer and Homemaking Education
• Corporate Names
• Counseling and Guidance
• Economics
• Education
• Engineering
• Environment
• Geography (subject)
• Geographical Place Names
• Health and Safety
• History
• Language Arts
• Languages
• Literature and Drama
• Mathematics
• News
• Occupations
• Organizational Names
• Personal Names
• Physical Education and Recreation
• Political Science
• Psychology
• Religion and Philosophy
• Science (popular)
• Science, Technology, and Medicine (STM)
• Society
• Sports
• Technology
• Visual and Performing Arts
• US Industrial Codes (NAICS)
• US Zip Codes and Places

Go to TaxoBank for more!

NewsIndexer

Automatic indexing of newspapers
8 topical areas
Maps to IPTC, NAICS, ICB, and GICS codes
Popular, automatic, and fast
Remote submission / ASP
13 levels
Filter to 3
License and augment
www.newsindexer.com

National Information Center for Educational Media - NICEM

667,000 records for non-print educational media
23,000 producers and distributors
Based on school curriculum needs
Online and CD-ROMs
MARC cataloging
Thesaurus
Print
www.nicem.com

MediaSleuth

Online ordering of media from NICEM
Search Harmony implementation
Full e-commerce platform for ordering
Educational and popular materials
www.mediasleuth.com

Access Integrity, Inc. (AI2)
Medical Claims Compliance
Automatic ICD-9 suggestions
CPT rule base
HCPCS rule base
ICD-9 V 3 Hospitals
ICD-10
Accurate, deep, consistent coding
Making medical billing efficient

Corporate Information

Closely held
Financed by
  Sweat and Persistence
  Good Cash Flow and Management
Since 1978 - 35 years in business
Marjorie M.K. Hlava
Jay Ven Eman
Joanna Ginter

www.accessinn.com

Woman Owned Small Business

UPDATE

Data Harmony Users Group Meeting

February 10-14, 2014

The 15 modules + extensions: What’s new

Admin Module
Author Submission System
Author / Name Disambiguation
Inline Tagging
Metadata Extractor
M.A.I.
MAIBatch GUI
MAIChem
Ontology Master
Thesaurus Master
Search Harmony
SharePoint
Recommender
Web Start
XIS

Data Harmony 2013 Stack

[Diagram labels: Rule Base; TermKeyRecord; Concept Extractor; Statistics Module; M.A.I.; Taxonomy; Authority files; All terms; Alphabetic; Permuted view; XML (Extensible Markup Language) - Unicode; Java Virtual Machine; TCP/IP (Transmission Control Protocol / Internet Protocol); Thesaurus Master; Native XML Content Creation Repository; OWL, Zthes, SKOS, XML, MARC, etc.; Administrative modules; DH Extensions; XIS; Search Harmony; NavTree; Auto Completion; Narrow Search - NT; Expanded Search - RT; Auto Sum; Metadata Extractor; MAIChem]

Data Harmony 2014 Stack

[Diagram labels: Rule Base; TermKeyRecord; Concept Extractor; Statistics Module; Taxonomy; Authority files; All terms; Alphabetic; Permuted view; XML (Extensible Markup Language) - Unicode; Java Virtual Machine; TCP/IP (Transmission Control Protocol / Internet Protocol); Thesaurus Master; Native XML Content Creation Repository; OWL, Zthes, SKOS, XML, MARC, etc.; Administrative modules; Web Start, APIs, Web services and connectors; XIS; Search Harmony; NavTree; Auto Completion; Narrow Search - NT; Expanded Search - RT; Metadata Extractor; MAIChem; Inline Tagging; Author Disambiguation; Recommender; M.A.I.; Automatic Summarizer; Author Submission System; SharePoint Connector; Ontology Master; MAIBatch]

Admin Module

Configuration of Thesaurus Master, M.A.I., MAIstro
Separate Admin Module for XIS
MAIBatch added to MAIstro Admin Module

The author pastes the data into the document template, attaching images, graphs, etc. as necessary:

Copyright © 2013 Access Innovations, Inc.

Author Submission Module

The author fills in the document template with the data, attaching images and graphs as necessary.

An API calls Data Harmony and generates a list of indexing terms based on the content.

Authors review the indexing and may change it.

Content is stored in a data repository as HTML, XML, etc.

DH Author Submission System

Leveraging Records Management with Documentum, Author Submission, and MAIstro
Marjorie M.K. Hlava and Leland Yates, Access Innovations, Inc.

Admin Module

DH Author Submission System

Configure any field
Index on any field
XML or XHTML
Link to the CMS

Author Submission System Configuration Module

Author Disambiguation

Build a file of authors
  Name: first, second, surname
  DOIs published
  Publication rank (first author, etc.)
  Keywords for those DOIs
  Affiliation(s)
  Location(s): city, state, country, etc.
  Co-authors (inferred by DOI)
  Etc.

Affiliation Disambiguation

Build a file of affiliations
  Name
  Lab, institute, etc. name
  DOI
  Location
  Full address
  Keywords
  Etc.

Author Disambiguation

Link the two databases
Build a web service to accept files
Auto-disambiguate incoming files
Review new or non-match to ensure accuracy

Leveraging Semantic Fingerprinting for Building Author Networks
Bob Kasenchak, Wednesday @ 9:30 AM
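The matching step (auto-disambiguate incoming files, then review new or non-matching records) can be sketched in Python. This is only an illustration under stated assumptions: the `AuthorRecord` fields follow the author file described in the slides, while the scoring rule (equal-weight Jaccard overlap of affiliations, keywords, and co-authors), the `threshold` value, and all function names are hypothetical and not the Data Harmony implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorRecord:
    # Fields mirror the "file of authors" slide; the layout itself is an assumption.
    author_id: str
    surname: str
    first: str
    affiliations: set = field(default_factory=set)
    keywords: set = field(default_factory=set)
    coauthors: set = field(default_factory=set)


def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint or both empty) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def match_score(incoming, known):
    """Score 0.0 unless surname and first initial agree; else average set overlap."""
    if incoming.surname.lower() != known.surname.lower():
        return 0.0
    if incoming.first[:1].lower() != known.first[:1].lower():
        return 0.0
    return (
        jaccard(incoming.affiliations, known.affiliations)
        + jaccard(incoming.keywords, known.keywords)
        + jaccard(incoming.coauthors, known.coauthors)
    ) / 3


def disambiguate(incoming, author_file, threshold=0.3):
    """Return the best-matching author_id, or None to route the record to review."""
    best = max(author_file, key=lambda k: match_score(incoming, k), default=None)
    if best is not None and match_score(incoming, best) >= threshold:
        return best.author_id
    return None  # new or non-match: a human editor reviews it


author_file = [
    AuthorRecord("A1", "Smith", "Jane", {"Univ A"}, {"optics", "lasers"}, {"Lee"}),
    AuthorRecord("A2", "Smith", "John", {"Univ B"}, {"taxonomy"}, {"Kim"}),
]
incoming = AuthorRecord("", "Smith", "J.", {"Univ A"}, {"optics"}, {"Lee", "Park"})
unknown = AuthorRecord("", "Jones", "A.")

matched_id = disambiguate(incoming, author_file)
unmatched_id = disambiguate(unknown, author_file)
print(matched_id, unmatched_id)
```

Returning `None` rather than forcing a match keeps the human review step from the slide in the loop for new or ambiguous names.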

Inline Tagging

Full text tagging
Send search query directly to the place in the document where the concept is mentioned.
Flexible in XML and HTML views

Inline Tagging and Dictionary Connection
Gena San Nicolas, Wednesday @ 2:15

Inline tagging Web service

Use M.A.I. to put terms in context for high-precision indexing

Inline Tagging

Shows the exact point where the concept is mentioned

Mouse over to view the term record

Statistical summary, showing the number of times each term is mentioned in the article
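A statistical summary of this kind can be approximated by counting term mentions. The Python sketch below assumes plain-text input and literal, case-insensitive phrase matching; `count_term_mentions` and the sample article are illustrative names and data, and M.A.I. itself matches through its rule base rather than by literal strings.

```python
import re
from collections import Counter


def count_term_mentions(text, terms):
    """Count case-insensitive, whole-phrase mentions of each term in text."""
    counts = Counter()
    for term in terms:
        # \b anchors keep a term from matching inside a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


article = (
    "Indexing with a taxonomy improves search. "
    "A taxonomy also supports browsing, and indexing keeps it consistent."
)
summary = count_term_mentions(article, ["taxonomy", "indexing", "thesaurus"])
for term, n in summary.most_common():
    print(f"{term}: {n}")
```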

XML View for Inline Tagging

Metadata Extractor

Automatic creation from PDF digital layer
Position training needed
Dublin Core metadata
Bibliographic citation created
Automatic summarization added
Uses M.A.I. on full text
Can be linked to Author Disambiguation

Input file

Source file PDF digital layer

Metadata Extractor Full Record Display

Output in XML

Or use with HTML Pages

<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <copyright>© 2007 Access Innovations, Inc.</copyright>
  <address>
    <street>131 Adams NE</street>
    <city>Albuquerque</city>
    <state>New Mexico</state>
  </address>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>
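A downstream system can read a record like this with any XML parser. The Python sketch below uses the standard-library `xml.etree.ElementTree` on an abridged copy of the record; the element names come from the sample output, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the Metadata Extractor sample record.
record = """<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>"""

root = ET.fromstring(record)
title = root.findtext("title")
doc_type = root.findtext("document-type")
# Collect every <term> under <subject-terms>, in document order.
terms = [t.text for t in root.findall("./subject-terms/term")]

print(title)
print(doc_type)
print(terms)
```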

M.A.I.

M.A.I. is used to describe or categorize items by matching text to controlled vocabulary terms
Rule Builder
Concept Extractor
Statistics Collector
Test MAI

M.A.I. 2014

Find in Test MAI
Export Fields function
Expanded warning and information labels
Expanded print functions
Rule error details
Emphasis tags
MAIBatch GUI

Find Function In Test MAI

Export with fields selection

Expanded warning and information labels

Delete term warning

Term warnings

Term with multiple Broader Terms warning Remove relationship warning message

Move term functions

Move a single term

Expanded print functions

Test the syntax of a rule

View information about a thesaurus term

MAIBatch GUI

MAIBatch input format

PDF
XML, nXML
Web content (HTML, HTM)
Plain text (TXT), rich text (RTF)
MS Word documents (DOC, DOCX)

Full window with suggested AND used terms

Select all or just some files to process

MAIBatch XML

Add custom tags
Click on “XML tags” in the Settings menu.

MAIBatch - Adding files
Viewing results

Upload File/Directory

Row of asterisks separates each document

file path of a document

suggested thesaurus terms

Log Statistics
From source data to compare accuracy
By human editors assigning values: HIT, MISS, NOISE

From source file data

<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>
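Comparing suggested terms against the used terms in the source file can be sketched in Python as follows. The definitions here are an assumption, not taken from the slides: a HIT is a term both suggested and used, a MISS is a used term that was not suggested, and NOISE is a suggested term that was not used. The function names and the sample suggestions are illustrative.

```python
import xml.etree.ElementTree as ET


def used_terms(xml_fragment):
    """Extract the set of terms from a <USEDTERMS> fragment."""
    root = ET.fromstring(xml_fragment)
    return {t.text.strip() for t in root.iter("TERM") if t.text}


def score(suggested, used):
    """Classify terms as HIT, MISS, or NOISE (assumed definitions, see lead-in)."""
    suggested, used = set(suggested), set(used)
    return {
        "HIT": sorted(suggested & used),    # suggested and used
        "MISS": sorted(used - suggested),   # used but not suggested
        "NOISE": sorted(suggested - used),  # suggested but not used
    }


source = "<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>"
result = score(["Term 2", "Term 3"], used_terms(source))
print(result)
```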

M.A.I. Statistics Module

Exporting MAIBatch results
Save as .txt file through export menu
Save to Log Spreadsheet .xls

MAIChem

Dictionaries
  Full terms
  Beginners
  Enders
M.A.I. Concept Extractor
Links to graphical displays

Ontology Master

Sneak Peek
Built on Thesaurus Master
Full OWL and SKOS exports
Full directional relationships
Same extensive functionality
Bob Kasenchak – Wednesday @ 1:15 PM

Recommender

More Like This - Recommender

Search Harmony

Built to leverage semantically enriched text

Uses the thesaurus sections
BT-NT relationships for taxonomy tree
Type ahead from tab, permuted index
Related terms
Narrower terms

Copyright © 2005 - Access Innovations, Inc.

Taxonomy view

Thesaurus Term Record view

Search Presentation Layer

Automatic completion and type ahead from thesaurus
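Type-ahead over a permuted index can be sketched with a sorted list and a binary search. This Python example assumes the permuted index holds one entry per word position of each term, so a term is reachable from any of its words; `build_index`, `complete`, and the sample vocabulary are illustrative and do not describe how Search Harmony is built.

```python
from bisect import bisect_left


def build_index(terms):
    """Return sorted (key, term) pairs, one per word position in each term."""
    entries = []
    for term in terms:
        words = term.lower().split()
        for i in range(len(words)):
            # Permuted entry: the key starts at word i of the term.
            entries.append((" ".join(words[i:]), term))
    return sorted(entries)


def complete(index, prefix, limit=5):
    """Return up to `limit` distinct terms with a key starting with prefix."""
    prefix = prefix.lower()
    start = bisect_left(index, (prefix,))  # first entry with key >= prefix
    matches = []
    for key, term in index[start:]:
        if not key.startswith(prefix):
            break
        if term not in matches:
            matches.append(term)
        if len(matches) == limit:
            break
    return matches


vocabulary = ["Taxonomy construction", "Thesaurus", "Machine aided indexing", "Indexing"]
index = build_index(vocabulary)
print(complete(index, "ind"))
print(complete(index, "t"))
```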

Search Presentation Layer

Related

Narrower

Search Presentation Layer

The hierarchical view of the thesaurus is also a browseable view of the content.

The numbers include the number of hits
1. For the term
2. For the branch
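The two numbers can be derived from per-term hit counts and the BT-NT hierarchy. The Python sketch below assumes each term has a single broader term and that branch totals are sums of per-term hits, so a document indexed with two terms in the same branch is counted twice; the data and the `branch_hits` name are illustrative.

```python
def branch_hits(term, narrower, hits):
    """Hits for a term plus all of its narrower terms, recursively."""
    total = hits.get(term, 0)
    for child in narrower.get(term, []):
        total += branch_hits(child, narrower, hits)
    return total


# Hypothetical hierarchy: each key lists its narrower terms (NT).
narrower = {"Science": ["Physics", "Chemistry"], "Physics": ["Optics"]}
# Hypothetical number of hits indexed directly with each term.
hits = {"Science": 4, "Physics": 10, "Chemistry": 7, "Optics": 3}

for term in ["Science", "Physics", "Optics"]:
    print(term, hits[term], branch_hits(term, narrower, hits))
```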

Semantic Fingerprinting

People / Authors
Articles
Medical records
Organizations and affiliations
Point ads to users
Related to author disambiguation

[Diagram labels: Thesaurus Master; Machine Aided Indexer (M.A.I.™); Repository; Search Presentation: Browse by Subject, Auto-completion, Broader Terms, Narrower Terms, Related Terms; 90% accuracy; Client Taxonomy; Inline Tagging; Metadata and Entity Extractor; Automatic Summarization; Search Software; Client Data; Full Text; HTML, PDF, Data Feeds, etc.; Client taxonomy]

Fully integrated SharePoint

[Data Harmony fully integrated with MOSS.]

Select term store management located under Site Administration.
Edit term sets to accurately reflect your document libraries and content types. Term sets can be individual taxonomies or flat controlled vocabulary lists.

Thesaurus Master - 2014

Built for vocabulary control
  Taxonomy
  Thesaurus
  Entities
Full standards compliance
  ISO 25964 Parts 1 and 2
  NISO Z39.19 – 2010

Emphasis Is Available for Preferred Terms

bold, italics, or underline Term with emphasized words

Term with enriched words

Change Term dialog with enhancement buttons

XML Emphasis Export

Full Path Export

Data Harmony Custom Features as Implemented for Triumph Learning

Kirk Sanders, Wednesday @ 11:00

Emphasis
Full path export

Thesaurus Master 2014

Emphasis tags – more Wednesday @ 11:00
Data Harmony Custom Features as Implemented for Triumph Learning
Kirk Sanders, Access Innovations, Inc.

Pattern analysis: Domain associations

Pattern analysis: Component gaps

Web Start

Replacing WebThes and ThesViewer
Allows auto-start from the browser
Full featured
Password access control
Everything from view only to full access

XIS

A XIS project consists of the following:
Folders that XIS uses. These are the “project folders.”
A schema (configuration file) called projects.MyProject.xml.
A XIS DTD, called “projects.dtd.”

XIS links to Thesaurus Master and M.A.I.

XIS and Lucene

Search within a search (recursive search)

New Lucene search

Using Lucene for Search within XIS
Allexander Lyons, Wednesday @ 11:45

DHUG 2015

Albuquerque
February 16 – 20
Call for papers is now open
Ideas for what to do better and differently VERY welcome

We Apply Imagination
Keep the System Flexible

Make the Applications Fun

Thank you!

Marjorie M.K. Hlava, President,

Access Innovations

505-998-0800

mhlava@accessinn.com
