Federal Big Data Working Group Meetup: The Yosemite Project: A Roadmap for Healthcare Information Interoperability and The New Book: Building Ontologies with Basic Formal Ontology

Dr. Brand Niemann
Director and Senior Data Scientist/Data Journalist
Semantic Community
Federal Big Data Working Group Meetup
Data Science
August 31, 2015



2

Agenda

• 6:30 p.m. Welcome and Background (New Tutorial and Mentoring) Slides Data Science for EHRs
• 6:40 p.m. David Booth, PhD, HRG and Rancho BioSciences, Slides
• 7:30 p.m. Brief Member Introductions
• 7:40 p.m. Professor Barry Smith, Department of Philosophy, University at Buffalo, New Book: Building Ontologies with Basic Formal Ontology, Slides
• 8:30 p.m. Open Discussion
• 8:45 p.m. Networking
• 9:00 p.m. Depart


3

Schedule

• August 25th, Introduction to the American Community Survey Webinar (2-3 PM EST). http://www.census.gov/mso/www/training/ (See October 5th Meetup Below)

• September 1st, Data Science for EPA Hydraulic Fracturing Webinar.
• September 14th, Big Data Science for Precision Farming Business Online Course.
• September 16-17th, Building Ontologies with Basic Formal Ontology, Barry Smith.
• September 28th, Climate Change Data - Data Science Meetup of Meetups.
• October 5th, Data Science for Census American Community Survey.
• October 19th, Sensing Our Air: The Quest for Big Data About Our Air Quality.
• November 2nd, Data Science for Random Forests.
• November 5-6th, OSTP/NSF Data Science Meetup of Meetups, Ballston, VA.
• November 16th, Data Science for the DataAct Datathon.
• December 7th, TBA.


4

Background

• David Booth, PhD, HRG and Rancho BioSciences, The Yosemite Project. Slides 5-6.
• Barry Smith, Professor of Philosophy and Co-Author of New Book on Ontology, Meeting with Intelligence Community. Slides 7-11.
• Jonathan Hines, ORNL science writer, doing a story on Semantic Medline and the ORNL CADES – Compute and Data Environment for Science. Slides 12-15.
• Joan Aron and Brand Niemann, Data Mining - Science - Questions - Publication Process. Slides 16-28.
• Weifeng Li (Lexie) and Brand Niemann, FDA Precision Medicine. Slides 29-32.
• Chris Crawford and Jay Patkar, TIBCO Software Federal, Random Forests for Kaggle Competitions and Spotfire TERR. November 2nd.
• Steve Hanmer, Mission Source, Allyson Ugarte, Treasury, co-planning Data Science for Data Act Datathon Meetup. He attended the Data Act Datathon and Forum this week and will report. November 16th.


5

http://yosemiteproject.org/


6

About the Yosemite Project

• The Yosemite Project is a collaborative effort to achieve semantic interoperability of all structured healthcare information, using RDF as a universal information representation. It is a follow-up to the Yosemite Manifesto -- a position statement issued by participants of the 2013 workshop on RDF as a Universal Healthcare Exchange Language, held at the Semantic Technology and Business Conference, in response to the 2010 PCAST report's call for a universal exchange language for healthcare information. The Yosemite Manifesto identifies RDF as the "best available candidate" to meet this need, and has been signed by over 100 healthcare and technology experts since it was issued. Add your name!
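As a rough illustration of what "RDF as a universal information representation" can look like in practice, the sketch below expresses one clinical observation as subject-predicate-object triples and serializes them in N-Triples syntax. It is not taken from the Yosemite Project materials: the `http://example.org/health/` namespace, the property names (`subject`, `value`), and the sample reading are hypothetical placeholders, and only the Python standard library is used.

```python
# Hypothetical namespace for illustration; not a real healthcare vocabulary.
EX = "http://example.org/health/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype):
    """Format a typed literal in N-Triples syntax."""
    return f'"{value}"^^<{datatype}>'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) strings, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# One made-up observation: a systolic blood pressure reading of 120
# recorded for a made-up patient.
triples = [
    (iri(EX + "obs1"), iri(RDF_TYPE), iri(EX + "SystolicBloodPressure")),
    (iri(EX + "obs1"), iri(EX + "subject"), iri(EX + "patient123")),
    (iri(EX + "obs1"), iri(EX + "value"), literal("120", XSD_DECIMAL)),
]

ntriples = to_ntriples(triples)
print(ntriples)
```

Because every statement has the same three-part shape, data from different source formats can in principle be merged by concatenating triples, which is the property the Manifesto's "universal exchange language" argument relies on.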


7

https://mitpress.mit.edu/index.php?q=books/building-ontologies-basic-formal-ontology


8

Building Ontologies with Basic Formal Ontology: Table of Contents

• Introduction
• 1 What Is an Ontology?
• 2 Kinds of Ontologies and the Role of Taxonomies
• 3 Principles of Best Practice I: Domain Ontology Design
• 4 Principles of Best Practice II: Terms, Definitions, and Classification
• 5 Introduction to Basic Formal Ontology I: Continuants
• 6 Introduction to Basic Formal Ontology II: Occurrents
• 7 The Ontology of Relations
• 8 Basic Formal Ontology at Work
• Appendix on Implementation: Languages, Editors, Reasoners, Browsers, Tools for Reuse
• Glossary
• Web Links Mentioned in the Text
• Notes
• Bibliography


9

Building Ontologies with Basic Formal Ontology
By Robert Arp, Barry Smith and Andrew D. Spear

• In the era of “big data,” science is increasingly information driven, and the potential for computers to store, manage, and integrate massive amounts of data has given rise to such new disciplinary fields as biomedical informatics. Applied ontology offers a strategy for the organization of scientific information in computer-tractable form, drawing on concepts not only from computer and information science but also from linguistics, logic, and philosophy. This book provides an introduction to the field of applied ontology that is of particular relevance to biomedicine, covering theoretical components of ontologies, best practices for ontology design, and examples of biomedical ontologies in use.

• After defining an ontology as a representation of the types of entities in a given domain, the book distinguishes between different kinds of ontologies and taxonomies, and shows how applied ontology draws on more traditional ideas from metaphysics. It presents the core features of the Basic Formal Ontology (BFO), now used by over one hundred ontology projects around the world, and offers examples of domain ontologies that utilize BFO. The book also describes Web Ontology Language (OWL), a common framework for Semantic Web technologies. Throughout, the book provides concrete recommendations for the design and construction of domain ontologies.
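To make the idea of a domain ontology built on BFO's upper-level categories more concrete, here is a minimal sketch of an is_a hierarchy in plain Python. The BFO category names are a small hand-picked excerpt, not the official OWL release, and the two domain terms (`heart`, `heartbeat`) and their placement are illustrative assumptions rather than examples taken from the book.

```python
# Parent links for a few upper-level BFO categories (child -> parent).
# Hand-written excerpt for illustration only; not the official BFO OWL file.
BFO_PARENT = {
    "continuant": "entity",
    "occurrent": "entity",
    "independent continuant": "continuant",
    "specifically dependent continuant": "continuant",
    "generically dependent continuant": "continuant",
    "process": "occurrent",
    "temporal region": "occurrent",
}

# Hypothetical domain terms attached beneath BFO categories.
DOMAIN_PARENT = {
    "heart": "independent continuant",
    "heartbeat": "process",
}

PARENT = {**BFO_PARENT, **DOMAIN_PARENT}


def ancestors(term):
    """Return the chain of is_a parents from term up to the root."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_a(term, category):
    """True if term is the category itself or falls under it."""
    return term == category or category in ancestors(term)


print(ancestors("heart"))
print(is_a("heartbeat", "occurrent"))
print(is_a("heart", "occurrent"))
```

The split at the top mirrors the book's chapters 5 and 6: continuants are things that persist through time, while occurrents are processes and the regions in which they unfold. A real project would express this in OWL and check it with a reasoner rather than a dictionary walk.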


10

http://ifomis.uni-saarland.de/bfo/


11

http://ifomis.uni-saarland.de/bfo/users


12

http://skr3.nlm.nih.gov/SemMed2/


13

NIH Data Commons

• FAIR Principles:
  • Findable
  • Accessible
  • Interoperable
  • Reusable

• Cloud:
  • Data
  • Software
  • Results

• Federal Science Policy:
  • OSTP Public Access to Scientific Data Memo (February 2013)
  • New Program: Big-Data-to-Knowledge (2013)
  • New Position: Associate Director of Data Science (2014)
  • Digital Enterprise (2015): Data Commons
    • Metadata
    • Open APIs
    • Digital Objects
    • Containers

An NIH – Semantic Medline Data Science Data Publication Commons


14

OSTP/NSF Data Science Meetup of Meetups

• Week of November 2nd:
  • NSF Data Science/Big Data Principal Investigators (About 300)
  • NSF Data Hubs (4)
  • Organizers of Largest Data Science/Big Data Meetups (About 65)

• Pipeline for Return on Investment:
  • PIs put their data, tools and research results in the Data Hubs
  • Data Hubs provide those data, tools, and research results to the world, but especially to the Data Science/Big Data Meetups
  • Data Science/Big Data Meetups collaborate with PIs and Data Hubs to increase usage and feedback


15

We Already Do This!

• Semantic Community:
  • Provides a Community Sandbox that is like a GitHub, Data Hub, Data Commons, etc.
    • Metadata (MindTouch)
    • Open APIs (MindTouch)
    • Digital Objects (MindTouch)
    • Containers (Spotfire)

• Organize the Federal Big Data Working Group Meetup

• Support Agencies and Programs in Crowdsourcing Their Data Sets

• Mentor Data Scientists (Tutorials and MOOCs) and Entrepreneurs (Eastern Foundry)

• Federal Big Data Working Group Meetup:
  • Federal: Supports the Federal Big Data Initiative, but not endorsed by the Federal Government or its Agencies;
  • Big Data: Supports the Federal Digital Government Strategy which is "treating all content as data", so big data = all your content;
  • Working Group: Data Science Teams composed of Federal Government and Non-Federal Government experts producing big data products; and
  • Meetup: The world's largest network of local groups to revitalize local community and help people around the world self-organize like MOOCs (Massive Open Online Courses) now embraced by the White House.


16

Data Mining - Science - Questions - Publication Process

• Data Mining Process:
  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment

• Data Science Process:
  • Data Preparation
  • Data Ecosystem
  • Data Story

• Data Science Questions:
  • How was the data collected?
  • Where is the data stored?
  • What are the data results? and
  • Why should we believe the data results?

• Data Science Data Publication:
  • Knowledge Base
  • Spreadsheet Index
  • Web & PDF Tables to Spreadsheet
  • Data Browser
  • Dynamically Linked Adjacent Visualizations


17

Data Science Data Curation for Sustainable Data Science Meetups of Meetups

• I just finished four data science ecosystems:
  • RDA Climate Data Challenge (July 15):
    • http://semanticommunity.info/Data_Science/Data_Science_for_RDA_Climate_Change_Data_Challenge
  • RDA Information Week 2016 (Ebola Response and Nepal Earthquake) (July 17):
    • http://semanticommunity.info/Data_Science/Data_Science_for_Global_Ebola_Response_Data
  • USDA Microsoft Innovation Challenge (July 27):
    • http://semanticommunity.info/Data_Science/Big_Data_Science_for_Precision_Farming_Business#Story
  • US Data Act (July 28):
    • http://semanticommunity.info/Data_Science/Data_Science_for_the_DataAct_Datathon


18

Collaboration for Data Science Win-Wins

• USDA Open Government Data Training, Innovation Competition, and Online Course in Data-Driven Farming:
  • http://semanticommunity.info/Data_Science/Big_Data_Science_for_Precision_Farming_Business#Story

• Many Curated Government Data Sets and Data Science Products:
  • http://semanticommunity.info

• Pick an Agency and/or a Data Set and Look for a Meetup on That:
  • http://www.meetup.com/Federal-Big-Data-Working-Group/

• Mentor Startups Partnership with Eastern Foundry:
  • http://www.meetup.com/Federal-Big-Data-Working-Group/events/223140032/


19

USDA Collaboration Chronology

• March 16th: USDA CIO and ACDO on Open Data Plan and Roundtable Meetup
• March 25th: Government Technology & Innovation Incubator for Big Data Analytics II Meetup at Eastern Foundry
• May 18th: USDA Data Science MOOC Meetup
• May 21st: USDA Open Data Quarterly Submission to OMB on USDA Data Usage provided (USDA Data Science MOOC)
• July 21st: Data-Driven Farming Online Course Announced by HeatSpring and Semantic Community
• July 27th: USDA Microsoft Innovation Challenge Submission on Farm Data Dashboards
• July 29th: Partnerships Sought for Data-Driven Farming Online Course
• August 19, 2015: Data Science for USDA Big Data, Briefing and Demo
• September 17th: Big Data Science for Precision Farming Business Online Course Meetup and Commercial Examples: Farmers Business Network, FarmLogs, etc.
• October 26-December 18th: Data-Driven Farming Online Course with Partners


20

https://www.farmersbusinessnetwork.com/


21

http://opendataenterprise.org/


22

http://opendataenterprise.org/convene.html


23

http://www.opendataenterprise.org/map/viz/index.html


24

http://www.opendataenterprise.org/map/viz/index.html


25

Agriculture:
Data Type: 8 Pages
Industry Category: 3 Pages

http://www.opendataenterprise.org/map/viz/index.html


29

FDA Mission

• FDA plays an integral role in President Obama’s Precision Medicine Initiative, which foresees the day when an individual’s medical care will be tailored in part based on their unique characteristics and genetic make-up. Yet while more than 80 million genetic variants have been found in the human genome, we don’t understand the role that most of these variants play in health or disease.

• The FDA, an agency within the U.S. Department of Health and Human Services, protects the public health by assuring the safety, effectiveness, and security of human and veterinary drugs, vaccines and other biological products for human use, and medical devices. The agency also is responsible for the safety and security of our nation’s food supply, cosmetics, dietary supplements, products that give off electronic radiation, and for regulating tobacco products.



31

precisionFDA

• Planned for beta release (work in progress) in December 2015, precisionFDA will offer community members access to secure and independent work areas where, at their discretion, their software code or data can either be kept private, or shared with the owner’s choice of collaborators, FDA, or the public. Initially, precisionFDA’s public space will offer a wiki and a set of open source or open access reference genomic data models and analysis tools developed and vetted by standards bodies, such as the National Institute of Standards and Technology (e.g., Genome in a Bottle). We believe precisionFDA will help us advance the science around the accuracy and reproducibility of NGS-based tests, and in doing so, will advance consumer safety. We look forward to continuing to update the community on the development of these new tools.

http://blogs.fda.gov/fdavoice/index.php/2015/08/advancing-precision-medicine-by-enabling-a-collaborative-informatics-community/


32

Previous FDA-Related Meetups

• July 7, 2014: Data Science of White House Big Data Review and Brooke Aker: Big Data Lens on OpenFDA
  • http://www.meetup.com/Federal-Big-Data-Working-Group/events/192336492/

• October 6, 2014: FDA Data Innovation Lab and Predictive Analytics
  • http://www.meetup.com/Federal-Big-Data-Working-Group/events/209068792/

• December 1, 2014: Data Science for NIH/FDA SEMOSS Data Federation and Analytics
  • http://www.meetup.com/Federal-Big-Data-Working-Group/events/210542792/

• April 6, 2015: Data Science for the FDA RFI
  • http://www.meetup.com/Federal-Big-Data-Working-Group/events/219374810/