OASIS Electronic Trial Master File Standard Technical Committee Metadata Component Content Model Component February 17, 2014 9:00 – 10:00 AM PST


Page 1: February 17, 2014 9:00 – 10:00 AM PST

OASIS Electronic Trial Master File Standard Technical Committee

Metadata Component Content Model Component

February 17, 2014

9:00 – 10:00 AM PST

Page 2: February 17, 2014 9:00 – 10:00 AM PST

Agenda

Topic Presenter

9:00-9:05 Call to Order & Roll Call Zack Schmidt

9:05-9:10 Approval of Minutes https://www.oasis-open.org/committees/documents.php?wg_abbrev=etmf


9:10-9:15 OASIS policy review: member/observer roles Chet Ensign


9:15-9:25 Tech pres – Metadata, Content Model Components; Esig/Dsig Intro Z. Schmidt

9:30-9:50 Tech Discussion – Content Classification Layer All

9:50-9:55 Outreach Committee / New Business Jennifer Alpert / All

9:55-10:00 Next meeting agenda / Date Z. Schmidt

Page 3: February 17, 2014 9:00 – 10:00 AM PST

Name Company Voting Status Present?

Jennifer Alpert Palchak CareLex Member/Voter Y

Aliaa Badr CareLex Member/Voter Y

Oleksiy (Alex) Palinkash CareLex Member/Voter Y

Troy Jacobson Forte Research Member/Voter Y

Mead Walker HL7 Member/HL7 Liaison Y

Lou Chappuie Individual Member/Voter Y

Lisa Mulcahy Individual Member/ N

Sharon Elcombe Mayo Clinic Member/ (2nd mtg ) Y

Robert Gehrke Mayo Clinic Member/(np last mtg) N

Tom Johnson Mayo Clinic Member/(1st mtg ) N

Rich Lustig Oracle Member/Voter Y

Michael Agard Paragon Solutions Member/Voter Y

Christopher McSpiritt Paragon Solutions Member/Voter N

Jamie O’Keefe Paragon Solutions Member/(np last 2 mtgs) N

Fran Ross Paragon Solutions Member/Voter Y

Eldin Rammell Rammell Consulting Member/(1st mtg ) Y

Peter Alterman SAFE-BioPharma Member/Voter (Leave till 3-3) N

Catherine Schmidt SterlingBio Member/Voter Y

Zack Schmidt SureClinical Member/Voter Y

Trish Whetzel SureClinical Member/Voter N

Peter Junge Beijing Sursen Observer N

Steve Scribner EMC Observer Y

Laura Hilty Forte Research Observer N

Tony O’Hare Forte Research Observer N

Chet Ensign OASIS Staff Y

Roll Call

Page 4: February 17, 2014 9:00 – 10:00 AM PST

Meeting Etiquette

• Announce your name prior to making comments or suggestions

• Keep your phone on mute when not speaking (#6)

• Do not put your phone on hold

– Hang up and dial in again when finished with your other call

– Hold = Elevator Music = very frustrated speakers and participants

• Meetings will be recorded and posted

– Another reason to keep your phone on mute when not speaking!

• Use the join.me “Chat” feature for questions / comments / votes

• We will follow Robert’s Rules of Order

NOTE: This meeting is being recorded and minutes will be posted on the TC page after the meeting

From eTMF Std TC to Participants: Hi everyone: remember to keep your phone on mute


Page 5: February 17, 2014 9:00 – 10:00 AM PST

Content Classification Layer

– Metadata component Recap

• Address comments regarding:

– Document Versioning, Country, Sponsor

– Content Model component Recap / RDF/XML

• Address comments regarding Content Model versioning

– Summarize Content Classification Layer

– Discussion

Tech Presentation

Page 6: February 17, 2014 9:00 – 10:00 AM PST

Metadata Component:

• Metadata (‘Tags’)

– Characterizes content

– Allows users to precisely search for information, create reports, share data online

– Use of standards-based terms is critical for interoperability between systems

Metadata Component - Recap

Page 7: February 17, 2014 9:00 – 10:00 AM PST

Metadata Component Example

– Each Content Type contains metadata that describes it:

Metadata Component - Recap

Metadata Tagging:

Page 8: February 17, 2014 9:00 – 10:00 AM PST

Term Sourcing Concepts:

• Terms adopted by standards bodies should be used first in eTMF model

Primary Term Sources for eTMF Metadata:

– Internet Standards Dev Orgs: W3C, IETF, ISO, etc.

» Required for interoperability of machine code

– NIH NCI Thesaurus: Term database for FDA, CDISC, HL7, other orgs

» Required for interoperability of clinical / health sciences data

Secondary, Tertiary Term Sources for eTMF Metadata:

• Medical & Published Standards metadata: DICOM (medical imaging); Dublin Core

• Industry sources – widely used terms in enterprise content mgmt software, TMF RM

Metadata – Term Sources - Recap

*Spec, Table 6, p21

Page 9: February 17, 2014 9:00 – 10:00 AM PST

• Based on comments re: Doc Version support, a new metadata term is proposed:

Document Version (applies to eTMF Document or Content Item)

• Based on NCI/CDISC/FDA/HL7/BRIDG term definitions:

– Per NCI/NIH/BRIDG: a ‘Representation of a particular edition or snapshot of a document as it exists at a particular point in time.’

– NCI Code C93484, NCI Code C93816

– Follows industry standard ‘Major.Minor’ numbering:

» Major = 1.0, Minor = 1.1

• Document Version management is an application-specific / implementation-specific task

Core Metadata – Document Version Numbering

Page 10: February 17, 2014 9:00 – 10:00 AM PST

Core Metadata – Document Version Numbering Policy

Document Version number text formatting

Major Version.Minor Version

• Version numbering text consists of integer values separated by a period, without leading zeros.

1.0

Major version – Changes to document/content items.

Minor version – Changes to any metadata for the document/content item.

Version Numbering Policies (based on NCI/CDISC/FDA/BRIDG def: C93816)
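The formatting rule on this slide (two integers separated by a period, no leading zeros) can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `parse_document_version` is hypothetical, and the requirement that the major number be at least 1 is an assumption taken from the detailed numbering policy on the last slide of this deck.

```python
import re

# Major.Minor: integers separated by a period, without leading zeros.
# Assumption: major starts at 1 (per the detailed policy slide); minor may be 0.
VERSION_PATTERN = re.compile(r"([1-9][0-9]*)\.(0|[1-9][0-9]*)")


def parse_document_version(text):
    """Return (major, minor) for a well-formed version string, else raise ValueError."""
    match = VERSION_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Major.Minor version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_valid_document_version(text):
    """Convenience wrapper that reports validity instead of raising."""
    try:
        parse_document_version(text)
    except ValueError:
        return False
    return True
```

With this sketch, `1.0` and `2.13` are accepted, while `01.0` (leading zero) and `1. 0` (embedded space) are rejected.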

Page 11: February 17, 2014 9:00 – 10:00 AM PST

Core Metadata – Document Version Numbering

Version Created By Modified By Date Description

3.0 DBROWN 2/16/2014 8:30PM Document modification

2.0 JLENO 2/16/2014 7:30PM Document modification

1.1 RJONES 2/15/2014 5:30PM Metadata only modification

1.0 SSMITH 2/14/2014 4:30PM Original Item

Major Version

• Content Item change

• New Content Item

• Any change to a doc/content item is a major change

Minor Version

• Metadata change for a content item

• Any change to a doc/content item’s metadata values or attributes represents a minor change

Implementation Example – Version History for Doc/Content Item*

*Example only. Application-dependent.

Page 12: February 17, 2014 9:00 – 10:00 AM PST

Core Metadata Terms Created By

From last meeting – Created By is published by NCI and has the following definition.

Aliaa investigated CDISC BRIDG and did not find any conflict in CDISC BRIDG with the use of Created By.

*For additional info, see Spec, Appendix 8

Page 13: February 17, 2014 9:00 – 10:00 AM PST

Core Metadata Terms

Term / Definition / Source

File Properties

* Created: The date and time at which the resource is created. For a digital file, this need not match a file-system creation time. For a freshly created resource, it should be close to that time. Later file transfer, copying, etc., may make the file-system time arbitrarily different. Source: NIH/NCI

* Modified: The date and time the resource was last modified. Source: NIH/NCI

* Content Identifier: The unique identifier for a content item, such as a document, image, or other media in a specified context. (Document name.) Source: NIH/NCI

* URI: The unique uniform resource identifier or path (URI) for a content item such as a document, image, or other media in a specified context. Source: NIH/NCI

* Format: Content Item File Format, e.g., PDF, JPG, GIF, XLS, DOC, DOCX, XLSX, PPT, PPTX. It uses a filename extension as the format value. Source: NIH/NCI

* Document Version: Per NCI/NIH/BRIDG, a Document Version is a ‘Representation of a particular edition or snapshot of a document as it exists at a particular point in time.’ The term document version applies to documents as well as to content items. Synonym: Content Item Version (document or any other electronic file in eTMF)

Basic Audit Trail

* Created By: Indicates the username of the person who brought the item into existence. Source: NIH/NCI

* Modified By: Indicates the username of the person who changed an item. Source: NIH/NCI

* Content Type Name: The name of the Content Type such as 'CV.' A Content Type is a reusable collection of metadata, workflow, behavior, and other settings for a category of items in electronic content material. Source: NIH/NCI

Note: Core metadata terms should be included for each content item. Required terms (must have data values) are marked *.

*For additional info, see Spec, Appendix 8

Legend: Proposed / Adopted / New Core MD Term
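Because every starred core term must carry a data value, an application could check a content item before exchanging it. The sketch below is one possible approach, not something the specification prescribes: the dictionary representation, the function name `missing_core_terms`, and the sample values are all hypothetical.

```python
# Required core metadata terms, as starred on this slide.
REQUIRED_CORE_TERMS = (
    "Created",
    "Modified",
    "Content Identifier",
    "URI",
    "Format",
    "Document Version",
    "Created By",
    "Modified By",
    "Content Type Name",
)


def missing_core_terms(metadata):
    """Return the required core terms that are absent or have an empty value."""
    missing = []
    for term in REQUIRED_CORE_TERMS:
        value = metadata.get(term)
        if value is None or not str(value).strip():
            missing.append(term)
    return missing


# Hypothetical content item; every value here is illustrative only.
sample_item = {
    "Created": "2014-02-14T16:30:00",
    "Modified": "2014-02-15T17:30:00",
    "Content Identifier": "investigator-cv",
    "URI": "http://example.org/etmf/docs/investigator-cv.pdf",
    "Format": "PDF",
    "Document Version": "1.1",
    "Created By": "SSMITH",
    "Modified By": "",
    "Content Type Name": "CV",
}

print(missing_core_terms(sample_item))
```

Here the sample item is flagged only for `Modified By`, whose value is empty.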

Page 14: February 17, 2014 9:00 – 10:00 AM PST

Core Metadata Terms, Continued

*For additional info, see Spec, Appendix 8

Term / Definition / Source

Business Process Metadata (includes Digital Signatures)

Date: Date of task or event, or date in the context of document or Content Type. Date can be different from date created. Source: NIH/NCI

Process: A sequence or flow of activities in an organization with the objective of carrying out work. Source: BPMN V2.0 Spec (4). Tasks are atomic activities. They are included within a Process. Source: NIH/NCI

Task: A single activity that has occurred within a business process. Generally, an end-user, an application, or both will perform the Task. Concept derives from BPMN V2.0. Example task values are: Submitted, Approved, Reviewed, Signed, etc., indicating that a task has been completed. Each task is date stamped and captured in a single record of the business process metadata history log.

Source: Where the content item is from or its origin. Example values: Import, Scan, Fax, email, system, and other. Source: NIH/NCI

Person: The full name of the person who performed the workflow action (e.g., approved or submitted a document) or the person to whom this document is linked. Source: NIH/NCI

Person Role: The role of the person who is responsible for or linked to a content item, such as Principal Investigator, Sub-Investigator, Study Coordinator, Sponsor Project Manager, CRO Project Manager, or Data Manager.

Subject Identifier: A unique sequence of characters used to identify, name, or characterize the study subject individual in a clinical trial study. Source: NIH/NCI

*Organization: The full name of the Organization linked to the resource. Source: NIH/NCI

Organization Role: Denotes the role of the organization that is responsible for or linked to the Content Item. Values include Sponsor, Site, CRO, and Vendor. Source: NIH/NCI

Username: The account name used by a person to access a computer system (used for system generated tasks). Source: NIH/NCI

Digital Signature: Extra data embedded in a document or metadata linked to a document. It identifies and authenticates the signer of a document using public-key encryption. May be a URI or path to digital signature resource or certificate.

Digital Signature Status: Specifies whether a document or content item has been digitally signed. If no signature is required, status = null. Values: Signed, Not Signed, Null. Source: NIH/NCI

Page 15: February 17, 2014 9:00 – 10:00 AM PST

eTMF Domain Metadata Terms

*For additional info, see Spec, Appendix 8

Term / Definition / Source

eTMF Domain Metadata

*Study ID: Organization specific value, assigned and defined by the study sponsor. A sequence of characters used to identify, name, or characterize the study.

Country: Name of country using ISO 3166-1 alpha-3 country codes. Example: USA. Source: NIH/NCI

*Clinical Study Sponsor: ‘An entity that is responsible for the initiation, management, and/or financing of a clinical study; organization that initiates a study and that specifies the Study ID’ (check on NCI definition C48355)

Site ID: A unique symbol that establishes the identity of the study site. Source: NIH/NCI

Credential: Professional credential of the Person linked to a content item / document for the study. Examples: MD, RN, PhD, MS, MA, BA, MBA

Visit Number: The numerical identifier of the visit. Source: NIH/NCI

Note: Study ID, Country and Clinical Study Sponsor metadata terms should be included for each content item in the eTMF Domain. Required terms are marked *.

All other terms are assigned to content types based on the published domain content model. For example, ‘Site ID’ is assigned to content types within the ‘Site Management’ category. See the published eTMF content model for details. All other terms are optional. Additional eTMF Domain Metadata terms may be added as needed in ‘Phase 2’ of the eTMF TC project.

Legend: Proposed / Adopted / New Required eTMF MD Term

Study ID is an organization specific term. If there are alternate Study ID numbers assigned by clinicaltrials.gov, or country-specific IDs, these would be described in alternate metadata tags.

Added by motion to definition: 'A sequence of characters used to identify, name, or characterize the study, assigned and defined by the study sponsor'
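The note on this slide says Study ID, Country and Clinical Study Sponsor should accompany every content item in the eTMF Domain, and that Country uses ISO 3166-1 alpha-3 codes. A rough sketch of such a check follows. The function name and the dictionary representation are hypothetical, and the country test only verifies the three-uppercase-letter shape; it does not confirm that the code is actually assigned in ISO 3166-1.

```python
import re

# Terms the slide's note says should be present on every eTMF Domain content item.
DOMAIN_TERMS = ("Study ID", "Country", "Clinical Study Sponsor")

# Shape of an alpha-3 code only (three uppercase letters); not a lookup
# against the real ISO 3166-1 code list.
ALPHA3_SHAPE = re.compile(r"[A-Z]{3}")


def check_domain_metadata(metadata):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for term in DOMAIN_TERMS:
        if not str(metadata.get(term) or "").strip():
            problems.append(f"missing {term}")
    country = str(metadata.get("Country") or "").strip()
    if country and ALPHA3_SHAPE.fullmatch(country) is None:
        problems.append(f"Country is not alpha-3 shaped: {country!r}")
    return problems
```

For example, an item with Country set to `US` would be reported because the value is a two-letter code rather than the alpha-3 form `USA` used on the slide.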
Page 16: February 17, 2014 9:00 – 10:00 AM PST

General Metadata

Term / Definition / Code

General Metadata

Description: An account of the resource or content item. Dublin Core

Location: A spatial region or named place. Dublin Core

Title: A name given to the resource or content item. Dublin Core

Type: The nature or genre of the resource or content item. Dublin Core

Note: General Metadata is not required. It is obtained from published standards such as Dublin Core and DICOM, and from other standards organizations.

Page 17: February 17, 2014 9:00 – 10:00 AM PST

• Recap on Content Models

– What and Why

– Content Model Format / Exchange

– How Used

• Content Model Versioning under W3C OWL/RDF/XML

Content Models

Page 18: February 17, 2014 9:00 – 10:00 AM PST

What are Content Models (CM):

• Represent content classifications, relationships, metadata in a semantic web taxonomy or ‘Ontology’

• CM’s are created using the W3C OWL2 language and RDF/XML


• Semantic web allows seamless sharing, linking, search of data across domains

• Possibility to link to other semantic models in future like CDISC, HL7, etc

• Industry moving to Semantic web:

– CDISC/FDA/PhUSE project

– HL7, NIH/NCI, many more

Content Models Recap: What and Why

Page 19: February 17, 2014 9:00 – 10:00 AM PST

Content Model Format / Exchange

• Content Model Profile for the eTMF domain represented as W3C OWL2 classes

– Allows for easy editing, sharing by anyone

– Allows for limited validation

• Content Model Instances expressed as W3C RDF/XML (eTMF study specific)

• RDF/XML used as the syntax for content model exchange

• Exchange CM’s using Serialized RDF/XML or RDF/XML as a file with .owl extension:

– etmf.owl

• Exchange Protocol: No specific protocol is specified by RDF/XML, nor is one required for content model exchange.

– Any protocol that supports exchange of RDF/XML files or serialized data may be used, such as W3C HTTP/S, REST, SOAP, RPS, CMIS, etc.

– Application / implementation- specific

Content Models Recap: Content Model Format / Exchange

*Per W3C
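To make the exchange format concrete, the sketch below serializes two categories as OWL classes in RDF/XML using only the Python standard library. It is an illustration under stated assumptions, not the published eTMF content model: the ontology IRI `http://example.org/etmf`, the class local names, and the function `build_content_model` are all hypothetical. Only the RDF, RDFS and OWL namespace IRIs are the real W3C ones.

```python
import xml.etree.ElementTree as ET

# Real W3C namespace IRIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical ontology IRI, used only for this illustration.
ONTOLOGY_IRI = "http://example.org/etmf"

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def build_content_model(categories):
    """Serialize (local_name, label) pairs as owl:Class elements in RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": ONTOLOGY_IRI})
    for local_name, label in categories:
        cls = ET.SubElement(
            root,
            f"{{{OWL}}}Class",
            {f"{{{RDF}}}about": f"{ONTOLOGY_IRI}#{local_name}"},
        )
        ET.SubElement(cls, f"{{{RDFS}}}label").text = label
    return ET.tostring(root, encoding="unicode")


# Category labels taken from the deck's diagrams; the local names are made up.
xml_text = build_content_model(
    [("SiteManagement", "Site Management"), ("CentralFiles", "Central Files")]
)
print(xml_text)
```

The resulting string could be written to a file such as `etmf.owl` or sent over whatever transport the implementation chooses, since the slide leaves the exchange protocol application-specific.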

Page 20: February 17, 2014 9:00 – 10:00 AM PST

CM File Example

• W3C RDF/XML used as the syntax for content model representation and exchange

• Contains RDF and OWL in XML

• Contains reference to Content Model Profile for eTMF

• Contains Content Model Instance for Study

CM File Naming

• The .OWL filename extension is used for RDF/XML files. Example: etmf.owl

• Allowable filename characters: Filenames for content model exchange shall be similar to IETF URL naming as follows:

– Alphanumeric characters

– Special characters:

• Only ‘-’ (hyphen) may be used to ensure future compatibility

Content Models Recap: Content Model Format; Naming

Example W3C RDF/XML Content Model File Snippet: XML V1.0
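The naming rule above (alphanumeric characters plus hyphen, with the .owl extension) lends itself to a simple pattern check. This is a minimal sketch: the function name is hypothetical, and accepting only a lower-case `.owl` extension is an assumption based on the `etmf.owl` example, since the slide does not say whether case matters.

```python
import re

# Alphanumerics and hyphen only, followed by the .owl extension.
# Assumption: the extension is matched in lower case, as in "etmf.owl".
CM_FILENAME = re.compile(r"[A-Za-z0-9-]+\.owl")


def is_valid_cm_filename(filename):
    """Return True if the filename follows the content model naming rule."""
    return CM_FILENAME.fullmatch(filename) is not None
```

Under these assumptions, `etmf.owl` and `etmf-study-123.owl` pass, while `etmf_v1.owl` (underscore) and `etmf.xml` (wrong extension) do not.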


Page 21: February 17, 2014 9:00 – 10:00 AM PST

How Used

• For the eTMF Domain, a core standard set of categories (categories, subcategories, content types) and core metadata will be published:

– Content Model Profile for eTMF Domain

• Core set of categories is included with all Content Models (users can show/hide categories, but not delete them)

• Enables interoperability

• Content models easily downloadable

Organization Specific

• Includes Content Model Profile for eTMF

• Additionally, Orgs can create/add their own categories

• Provides flexibility

• Share, exchange CM’s through RDF/XML format

• Share with published URL

Content Models Recap: How Used

[Figure: two panels. Left panel, "Content Model Profile for eTMF Domain - Core Classes", with the labels Study ID, Site Management, Central Files, Protocol. Right panel, "Org-specific Content Model", with the labels Study ID, Site Management, Central Files, MyCorp SubCategory.]

Page 22: February 17, 2014 9:00 – 10:00 AM PST

Content Model Versioning

• Versioning of Content Models is supported through W3C OWL Versioning Policies

• W3C OWL supports granular level of versioning

• Version management is an application-specific task

• owl:versionInfo provides a hook suitable for use by versioning systems

Content Model Version numbering text:

– Major.Minor numbering

– Major = Content Model Profile Vn #

– Minor = Org Specific Version of CM. May be enhanced with org specific, application specific numbering within W3C OWL versioning policies

– Use with owl:versionInfo in RDF/XML for content model categories, annotation and data properties

– <owl:versionInfo>1.0.0</owl:versionInfo>

Content Models CM Versioning
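As a rough illustration of how an application might use the owl:versionInfo hook, the sketch below reads the annotation from a small RDF/XML document and splits it into major, minor and sub-minor parts. The sample document, its ontology IRI and the function name are hypothetical; treating everything after the second period as an opaque sub-minor string is an assumption, chosen because the deck describes that part as org-specific and application-specific.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Hypothetical content model header, for illustration only.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/etmf">
    <owl:versionInfo>1.0.0</owl:versionInfo>
  </owl:Ontology>
</rdf:RDF>"""


def read_content_model_version(rdf_xml):
    """Return (major, minor, sub_minor) from the ontology's owl:versionInfo."""
    root = ET.fromstring(rdf_xml)
    node = root.find("owl:Ontology/owl:versionInfo", NS)
    if node is None or not (node.text or "").strip():
        raise ValueError("no owl:versionInfo found on the ontology")
    parts = node.text.strip().split(".", 2)
    if len(parts) < 2:
        raise ValueError(f"expected at least Major.Minor, got {node.text!r}")
    major, minor = int(parts[0]), int(parts[1])
    # Assumption: the remainder is an opaque, org-specific sub-minor string.
    sub_minor = parts[2] if len(parts) == 3 else None
    return major, minor, sub_minor


print(read_content_model_version(SAMPLE))
```

Keeping the sub-minor part as a string leaves room for org-specific schemes without the reader having to interpret them.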


[Figure: the Content Model Profile for eTMF Domain core classes (Study ID, Site Management, Central Files, Protocol) beside an org-specific content model (Study ID, Site Management, Central Files, MyCorp SubCategory), annotated with the version numbering scheme below.]

Major Number = Content Model Profile for eTMF Domain – Published Version #

Minor Number = Content Model Profile for eTMF, minor change to metadata, annotation props, data props

Sub-Minor Number = Org-specific versioning, app specific

Example: V1.1.company.com.123

Two types of Versioning: Content Item Versioning, Content Model Versioning:

Page 23: February 17, 2014 9:00 – 10:00 AM PST

Standards-based Architecture:

• Content Classification

– Defined Rules, Policies for Naming, Numbering

• Metadata (‘Tags’)

– Rules to Characterize content

– Controlled vocab

• Content Models

– W3C RDF/XML

Summary: Content Classification Layer

Page 24: February 17, 2014 9:00 – 10:00 AM PST

• Status – New Members:

– Outreach Activity summary / Milestones

– Joined: Tom Johnson, Sharon Elcombe /Mayo Clinic

• In Progress: Shire

– Active Prospects

– Deliverable – Summary Industry outreach / Comments report

Outreach Subcommittee

Page 25: February 17, 2014 9:00 – 10:00 AM PST

Core Metadata – Document Version Numbering Policy

Document Version number text formatting

In the eTMF Standard, the document version text values follow the formatting that is commonly implemented in software and in other health science standards: Major Version.Minor Version. Version numbering text consists of integer values separated by a period, without leading zeros. There can be a new Major version every time the document/content item changes. There can be a new Minor version every time the metadata changes.

Version Numbering Policies (based on NCI/CDISC/FDA/BRIDG def: C93816)

Within eTMF archives, document / content item version management shall be application specific to provide for application flexibility. However, for consistent content item exchange, version number text formatting should be implemented using eTMF document version numbering policies:

Each document Major version number is an integer starting at '1' and incrementing by 1. The first instance or original document should always be valued as '1'. The version number value must be incremented by one when a document is replaced, but can also be incremented more often to meet application specific requirements. Different versions of the same document belong to the same Content Type group.

The document Minor version number is an integer starting at '0' and incrementing by 1. The first instance of an original document with no minor version should always be valued as '1.0', where '0' indicates that no minor version exists. Documents with a change to the metadata values require a minor version. The first minor version for a 1.0 document is indicated as 1.1. Successive changes to any of the document's metadata increment the Minor version by 1; for example, 1.2 indicates major version 1 and minor version 2. The Minor version number value must be incremented by one when a document's metadata is changed, but can also be incremented more often to meet application specific requirements.
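The increment rules can be expressed as a small state transition. The sketch below is one possible reading of the policy, not a prescribed implementation: the class and function names are hypothetical, and resetting the minor number to 0 on a content change is an assumption inferred from the example version history earlier in the deck, where 1.1 is followed by 2.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentVersion:
    major: int = 1  # the original document is always valued as 1
    minor: int = 0  # 0 indicates that no minor version exists

    def after_content_change(self):
        # Assumption: minor resets to 0, as in the 1.1 -> 2.0 example history.
        return DocumentVersion(self.major + 1, 0)

    def after_metadata_change(self):
        return DocumentVersion(self.major, self.minor + 1)

    def __str__(self):
        return f"{self.major}.{self.minor}"


def replay(changes):
    """Apply 'content' / 'metadata' changes in order, starting from 1.0."""
    version = DocumentVersion()
    history = [str(version)]
    for change in changes:
        if change == "content":
            version = version.after_content_change()
        elif change == "metadata":
            version = version.after_metadata_change()
        else:
            raise ValueError(f"unknown change type: {change!r}")
        history.append(str(version))
    return history


# One metadata-only change followed by two document modifications.
print(replay(["metadata", "content", "content"]))
```

Replaying one metadata-only change and two document modifications yields 1.0, 1.1, 2.0, 3.0, matching the example history slide. An application is free to increment more often, as the policy allows.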