Overview of the Analytical Information Markup Language Stuart J. Chalk, Department of Chemistry, University of North Florida [email protected] ACS Meeting Denver 2015

Overview of the Analytical Information Markup Language (AnIML)


Data Formats

Goals for Data Handling

Introduction to AnIML

Sections of an AnIML file

AnIML Schemas and Files

AnIML Technique Definitions

Publishing Instrument Data

Referencing Data Elements

Calculations on Data

Future Developments

Conclusion

Overview

Native Data Formats

Proprietary formats

"Metadata" separated from result data

Metadata and data in multiple files

Metadata not available electronically

No way to link metadata with result data

Interchange Data Formats

Available for only a few techniques

ANDI — GC, LC, MS

JCAMP-DX — IR/FTIR, NMR, UV/Vis, IMS

Fixed order, fixed syntax, immutable formats

Content limitations

Inconsistent implementations

Data Formats

Extensible - Easy to add new elements without breaking existing applications

Flexible - Useful for diverse needs: Interchange, Interconversion, Archiving...

Useable & Maintainable - Easy to create, use, adapt, maintain... Readily available tools

Acceptable - Use standard mechanisms accepted by mainstream computing

Human readable - eXtensible Markup Language

Goals for Data Handling

Extensible Markup Language (XML) specification

Development under ASTM E13.15 ‘AnIML Task Group’

Data standard to:

“Develop an analytical data standard that can be used to store data from any analytical instrument”

Introduction to AnIML

http://animl.sourceforge.net

JCAMP-DX http://www.jcamp-dx.org/

ANDI (netCDF)

ThermoML (NIST)

SpectroML Nguyen, A. D. T., Arslan, A., Travis, J., Smith, M., Schafer, R., & Kramer, G. W. (2004) ‘Molecular Spectrometry Data Interchange Applications for NIST's SpectroML’, JALA 9 (6), 346-354. doi:10.1016/j.jala.2004.09.001

Generalized Analytical Markup Language (GAML) http://www.gaml.org/

First official meeting March 23, 2003 @ ASTM

Brief History of AnIML

Broad scope

Different types of data

Size of data sets

Everyone calls a ‘widget’ something different

Need for metadata dictionaries

One size does not fit all

Getting broad community involvement

Domain experts

User communities

What format?

Challenges for AnIML

AnIML XML elements are ‘pigeon holes’ for metadata

Minimal ‘required’ information

If it’s not required you don’t have to include the element

Extensible

Store raw data, not processed data (except for FT techniques)

Support for legacy data

Record of changes

Validatable

Signable (digital sense)

AnIML Design Philosophy

AnIML Schemas and Files

Sections of an AnIML File

AnIML Technique Definitions

AnIML - Sample

AnIML - Sample

AnIML - Experiment

AnIML - Result

Access

Reference

Search

Visualize

Export

Manipulate

Process

Contextualize

Leverage XML tools/formats

AnIML in an ELN

AnIML Viewer -> Jmol/JSpecView (http://jmol.sourceforge.net)

Viewing Instrument Data

Conversion of AnIML data to SVG using XSLT
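As a rough illustration of what such a conversion produces, the sketch below builds an SVG polyline from (x, y) pairs. It is written in Python rather than XSLT, so it stands in for the transform instead of reproducing it; the function name `points_to_svg`, the canvas size, and the sample values are assumptions made for this example.

```python
def points_to_svg(points, width=400, height=200):
    """Render (x, y) pairs as an SVG polyline scaled to the canvas."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1.0 so constant data does not divide by zero.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    coords = []
    for x, y in points:
        px = (x - x_min) / x_span * width
        # The SVG y axis points down, so flip it to plot larger values higher.
        py = height - (y - y_min) / y_span * height
        coords.append(f"{px:.1f},{py:.1f}")
    joined = " ".join(coords)

    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{joined}"/>'
        "</svg>"
    )


# Hypothetical wavelength/absorbance pairs, for illustration only.
svg = points_to_svg([(400, 0.0), (410, 1.0), (420, 0.5)])
print(svg)
```

An XSLT stylesheet would emit the same kind of `polyline` element, with the scaling done in XPath arithmetic instead of Python.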

Publishing Instrument Data

Expose an AnIML file at a URL

Optional: Define a DOI for that URL

Use XPath to reference a specific data point in an AnIML file

//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]

Encode the XPath expression so it can be part of the URL
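The encoding step can be sketched in Python. This is a minimal illustration that assumes the XPath travels as a query parameter; the base URL `https://example.org/data/run42.animl` and the parameter name `xpath` are hypothetical and are not defined by AnIML or by the slides.

```python
from urllib.parse import quote, unquote

# Hypothetical location of a published AnIML file (assumption).
BASE_URL = "https://example.org/data/run42.animl"

# XPath expression taken from the slide.
XPATH = "//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]"


def build_reference(base_url: str, xpath: str) -> str:
    """Percent-encode the XPath and attach it as an 'xpath' query parameter."""
    # safe="" also encodes "/", so the expression cannot be mistaken
    # for part of the URL path.
    return f"{base_url}?xpath={quote(xpath, safe='')}"


def extract_xpath(reference: str) -> str:
    """Recover the original XPath from a reference built above."""
    return unquote(reference.split("?xpath=", 1)[1])


reference = build_reference(BASE_URL, XPATH)
print(reference)
```

A resolver on the receiving end would decode the parameter, load the file, and evaluate the expression to return the single referenced value.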

Referencing Instrument Data

Calculations with Instrument Data

Extract data from files using XPath

XML data to JSON conversion using XSLT*

Browser based JavaScript functions to

Smooth: moving window, Savitzky-Golay

Integrate: summation

Conversion: Absorbance <-> %T

Linear regression
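The calculations listed above are simple enough to sketch. The versions below are in Python, not the browser JavaScript the slide refers to, and all function names are assumptions made for this example. Savitzky-Golay smoothing is left out to keep the sketch short.

```python
import math


def moving_average(y, window=3):
    """Smooth with a centered moving window; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def integrate_by_summation(y, dx=1.0):
    """Approximate the area as the sum of the ordinates times a constant step."""
    return sum(y) * dx


def absorbance_to_percent_t(a):
    """%T = 100 * 10^(-A)."""
    return 100.0 * 10.0 ** (-a)


def percent_t_to_absorbance(pt):
    """A = 2 - log10(%T)."""
    return 2.0 - math.log10(pt)


def linear_regression(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx
```

Each function takes plain lists of numbers, so it can be applied directly to values pulled out of a file once they have been converted to floats.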

*http://www.bramstein.com/projects/xsltjson/
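The extraction and JSON steps can be sketched with the Python standard library. The XML fragment below is simplified and hypothetical: its outer element names follow the XPath shown on the referencing slide, but the `Data` and `Point` elements are invented for this example and it is not a schema-valid AnIML document. The JSON is built with `json.dumps` rather than with the XSLT stylesheet the slide cites.

```python
import json
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; NOT a schema-valid AnIML file.
SAMPLE = """
<AnIML>
  <ExperimentStepSet>
    <ExperimentStep name="UV/Vis scan">
      <Method>
        <Author><Name>A. Analyst</Name></Author>
      </Method>
      <Data>
        <Point x="400" y="0.12"/>
        <Point x="410" y="0.15"/>
        <Point x="420" y="0.11"/>
      </Data>
    </ExperimentStep>
  </ExperimentStepSet>
</AnIML>
"""

root = ET.fromstring(SAMPLE)

# ElementTree implements a subset of XPath. It rejects a leading "//" on an
# element, so the relative form ".//" is used here instead.
author = root.findtext(
    ".//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]"
)

# Collect the hypothetical data points as numbers.
points = [
    {"x": float(p.get("x")), "y": float(p.get("y"))}
    for p in root.findall("./ExperimentStepSet[1]/ExperimentStep[1]/Data/Point")
]

as_json = json.dumps({"author": author, "points": points})
print(as_json)
```

The resulting JSON string is the kind of structure a browser-side script could consume for smoothing, integration, or regression.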

AnIML 1.0 Deliverables

Core Schema - Fundamental framework for AnIML documents

Technique Schema - Fundamental framework for technique definition and extension documents

AnIML Technique Definition Documents (ATDD) - Rules for content of specific technique file

AnIML Naming and Design Rules - Specifies rules about data element structure for interoperability

Standard Practice for AnIML Files - Describes how the specification is supposed to work

How to Create a Technique Definition Document - Guidelines for creating new technique definition documents

Other documents

Draft Requirements Specification for AnIML Version 1.0

Requirements and Goals of the Analytical Information Markup Language

AnIML Specification

http://animl.sourceforge.net

Documentation

Core specification

Technique and extension specification

Naming and design rules

Annotated technique definitions (UV/Vis, IR, 1D NMR, MS, Chromatography)

Balloting through ASTM (end of 2015)

Vendor, User, Developer extensions

Semantic extension

Ontological reference to AnIML metadata items

Future Developments

Conclusion

AnIML is a great solution for storing instrument data

Human readable (plain text - UTF-8)

Platform neutral

Archivable

Validatable

Being XML based leverages the extensive XML ecosystem of tools that are mostly free

Software designers are familiar with dealing with XML due to its well defined and stable architecture

[email protected]

Phone: 904-620-1938

Skype: stuartchalk

LinkedIn/SlideShare: https://www.linkedin.com/in/stuchalk

ORCID: http://orcid.org/0000-0002-0703-7776

ResearcherID: http://www.researcherid.com/rid/D-8577-2013

Questions?