Semantically Enhanced Model Experiment Evaluation Process (SeMEEP) within the Atmospheric Chemistry Community • Chris Martin 1,2 , Mo Haji 2 , Peter Dew 2 , Peter Jimack 2, Mike Pilling 1 1 School of Chemistry, University of Leeds 2 School of Computing, University of Leeds


Semantically Enhanced Model Experiment Evaluation Process (SeMEEP)

within the Atmospheric Chemistry Community

• Chris Martin 1,2, Mo Haji 2, Peter Dew 2, Peter Jimack 2, Mike Pilling 1

• 1 School of Chemistry, University of Leeds

• 2 School of Computing, University of Leeds


Outline of the Presentation

• Introduction

• Atmospheric community

• SeMEEP

• ELN Provenance capture

• Conclusion and next stage


Section 1 Overview

• Application domain – atmospheric community

– Reliance on computational models to evaluate data

• Motivation

– Study how to transition from today's ad-hoc processes and practices

– Sustainable process of

• Gathering, community evaluation and sharing data & models between scientists

• Minimising changes to proven working practices of the scientist

• Within world-wide co-laboratories


Related projects

• CombeChem

– Experimental organic chemistry

– From source to long-term data preservation (knowledge)

– Semantically-enabled ELN

– Data-driven workflow

• Collaboratory for Multi-Scale Chemical Science

– Multi-layer chemical model

• myGrid

– Bio-informatics and related areas (semantic pattern matching)

– Reusable semantic workflow using SMD (semantic metadata)

– Data quality

• Karma2

– Weather forecasting: computational modelling

– Data-driven workflow

[Figures: workflow example (Add Sample: chem1, chem2); multi-scale chain (Quantum Chemistry, Thermochemistry, Kinetic, Mechanism, Reacting Flow Simulation)]


Section 2 Atmospheric Chemistry

• Seeks to understand the chemical processes (reactions) taking place in the lower atmosphere (e.g. smoke)

• It has significant implications for both:

– Air Quality

– Climate Change


The Master Chemical Mechanism (MCM)

• Data repository of elementary chemical reactions & rate constants

• The mechanism is described by a computational model that is evaluated against experimental data

– Chamber experiments

– Field experiments

[Figure: Methyl glyoxal, 27.11.06. MGLYOX / ppbv plotted against time / s, comparing MCMv3.1 with measured values (calibrated using isoprene)]


Section 3 SeMEEP

• Today

– Typically, within the atmospheric chemistry community, provenance is recorded in an ad-hoc, unstructured fashion, using a combination of traditional lab-books, word-processing documents and spreadsheets.

• Move to a more sustainable evaluation process that supports the gathering, evaluation and sharing of data and models

• Using semantic metadata


SeMEEP Vision

[Diagram labels: Scientist(s) with personal ELN; Laboratory Database(s); Shared Community Semantic Database; Community Evaluation (people); Public Database(s); SeMEEP; data managers (including a community data manager)]

• SeMEEP: semantically-enabled MEEP (Model Experiment Evaluation Process)

– Supports the organisation of information and, critically, records its provenance (for example, to recover secondary data)

Mike Pilling: “SeMEEP approach will radically enhance the effectiveness of a research community to deliver new science”


Requirements for metadata capture for elementary reactions

[Diagram labels: Physical Experiment; Raw Data; Metadata; Analysis Process; Process Data, e.g. k(T, p); Metadata; Publication; ELN; Historical Data; Theory (e.g. quantum mechanics); From other labs; IUPAC (kinetics; International Union of Pure and Applied Chemistry); Community evaluation (subjective); May be partial information]

• Only published data

• Rate constants from several labs

• No access to the raw data

• No access to secondary data

• SeMEEP will provide this.
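
The requirement listed here, that a rate constant from a given lab stays linked to the raw and secondary data behind it, can be sketched as a small provenance record. This is only an illustrative sketch: the class names, fields, identifiers, reaction and numerical values are all hypothetical and are not taken from SeMEEP.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataItem:
    """One item in the chain raw data -> processed data -> publication."""
    identifier: str   # hypothetical resolvable reference
    kind: str         # "raw", "processed" or "publication"
    derived_from: List["DataItem"] = field(default_factory=list)


@dataclass
class RateConstant:
    """A rate constant k(T, p) reported by one laboratory (units left open)."""
    reaction: str
    lab: str
    temperature: float
    pressure: float
    value: float
    source: Optional[DataItem] = None   # None models "only published data"


def trace_to_raw(item: Optional[DataItem]) -> List[str]:
    """Return identifiers of all raw data reachable from an item."""
    if item is None:
        return []
    if item.kind == "raw":
        return [item.identifier]
    found: List[str] = []
    for parent in item.derived_from:
        found.extend(trace_to_raw(parent))
    return found


# Hypothetical example: one lab records provenance, another only publishes.
raw = DataItem("lab-a/run-17/raw", "raw")
processed = DataItem("lab-a/run-17/fit", "processed", [raw])
paper = DataItem("lab-a/paper-1", "publication", [processed])

with_provenance = RateConstant("A + B -> C", "Lab A", 298.0, 760.0, 1.0e-11, paper)
published_only = RateConstant("A + B -> C", "Lab B", 298.0, 760.0, 1.2e-11)
```

With such links in place, an evaluator could walk back from a published value to its raw data; without them, as in today's situation, the trace is empty.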


Current Evaluation Processes for the MCM


Envisioned Evaluation Processes

[Diagram labels: SeMEEP; Semantic-enabled ELN; Laboratory Archive; Community Semantic Database; Scientist’s Personal ELN Archive; Workgroup database; Data sources; Inputs to the modelling process (benchmark data, model parameter sets etc.); ELN Capture of the Model Development Provenance (Model Development, Model Execution, Analysis); Links to experimental data and provenance generation processes; Community Evaluation (subjective)]


Section 4 Electronic Lab Notebooks (ELNs)

• ELNs address the limitations of the current methods of provenance capture.

• Southampton ELN for organic chemistry experiments.

• Benefits to the modeller

– Modelling process can be automatically captured

– Searchable

– Remote access is possible

– Provenance is structured

– Possible to use resolvable references to resources


Will users attach quality metadata?

• Motivate users:

– By demonstrating the value of provenance in their day-to-day work

• Writing publication

• Managing their data

• Reinterpreting the data.

– Management

– Publishers


The Modelling Process - A Three Layer Mapping

[Diagram: three layers (Experiment Layer; Modelling Iteration Layer; Modelling Layer). Labels: Experiment; Experiment Plan; Experiment Conclusions; Modelling Iteration; Iteration Plan; Iteration Conclusion / Plan for Iteration n + 1; Model Development; Model Parameters; Model Execution; Model Output; Analysis; Model Source code; Model Output Data from previous iterations; External Data Sources; ...]
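
The three-layer mapping can be pictured as nested records. The sketch below is one possible reading of the diagram, not the ELN's actual data model: the class names, field names and the assignment of elements to layers are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelRun:
    """Modelling layer: one development / execution / analysis pass."""
    parameters: dict
    output: str = ""
    analysis: str = ""


@dataclass
class ModellingIteration:
    """Modelling iteration layer: a plan, its runs and a conclusion."""
    plan: str
    runs: List[ModelRun] = field(default_factory=list)
    conclusion: str = ""   # doubles as the plan for iteration n + 1


@dataclass
class Experiment:
    """Experiment layer: a plan, its iterations and overall conclusions."""
    plan: str
    iterations: List[ModellingIteration] = field(default_factory=list)
    conclusions: str = ""

    def next_iteration(self) -> ModellingIteration:
        """Start iteration n + 1 from the conclusion of iteration n."""
        plan = self.iterations[-1].conclusion if self.iterations else self.plan
        iteration = ModellingIteration(plan=plan)
        self.iterations.append(iteration)
        return iteration


# Hypothetical usage: the conclusion of one iteration seeds the next plan.
experiment = Experiment(plan="Evaluate mechanism against chamber data")
first = experiment.next_iteration()
first.runs.append(ModelRun(parameters={"mechanism_version": 1}))
first.conclusion = "Adjust one rate constant and rerun"
second = experiment.next_iteration()
```

Carrying the conclusion forward as the next plan mirrors the "Iteration Conclusion / Plan for Iteration n + 1" label that recurs in the diagram.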


MCM Mechanism being investigated


ELN Process

[Diagram labels: Planning the Scientific Process (Modelling Plan; Ontology); Scientific Process (Mechanism Editing; Model Execution; Model Output Analysis); Mechanism version n and Mechanism version n-1 (Compare to generate metadata); Automatic Metadata Capture; Capture Metadata at run time; User Annotation; Metadata Storage]
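
The "compare to generate metadata" step between mechanism version n-1 and version n can be sketched as a simple diff followed by user annotation. This is a minimal illustration under stated assumptions: a mechanism is modelled as a plain mapping from reaction to rate constant, and the function names, reactions, values and annotation text are hypothetical rather than taken from the prototype.

```python
from typing import Dict, List


def compare_mechanisms(previous: Dict[str, float],
                       current: Dict[str, float]) -> List[dict]:
    """Describe the changes between mechanism version n-1 and version n.

    Each mechanism is modelled as {reaction: rate constant}.
    """
    changes = []
    for reaction in sorted(set(previous) | set(current)):
        if reaction not in previous:
            changes.append({"reaction": reaction, "change": "added",
                            "new": current[reaction]})
        elif reaction not in current:
            changes.append({"reaction": reaction, "change": "removed",
                            "old": previous[reaction]})
        elif previous[reaction] != current[reaction]:
            changes.append({"reaction": reaction, "change": "edited",
                            "old": previous[reaction],
                            "new": current[reaction]})
    return changes


def annotate(changes: List[dict], reasons: Dict[str, str]) -> List[dict]:
    """Attach the user's annotation (if any) to each detected change."""
    return [dict(c, reason=reasons.get(c["reaction"], "")) for c in changes]


# Hypothetical mechanism versions and annotation.
version_n_minus_1 = {"A + B -> C": 1.0e-11, "C -> D": 2.0e-3}
version_n = {"A + B -> C": 1.5e-11, "C + E -> F": 4.0e-12}

metadata = annotate(compare_mechanisms(version_n_minus_1, version_n),
                    {"A + B -> C": "updated rate constant (hypothetical reason)"})
```

Changes with an empty reason are the ones for which an ELN could prompt the user, which keeps automatic capture and user annotation as separate steps.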


ELN Screenshots

• Prompts displayed when changing the chemical mechanism:

– Editing a reaction

– Adding a new reaction


ELN Screenshots


ELN Modelling SMD Architecture

[Diagram labels: User Interface (Workflow constructor; Annotation interface; Database Query & Retrieval); SMD Modelling sub-system; SMD creation (e.g. data-driven workflow); Context ontology (e.g. materials/process); 3-level scientific services (model dev; execution; analysis); Semantic Metadata level; SMD Middleware Services (e.g. ontology services, query etc.); DL-based reasoner; Data Storage (SMD, Model Output & Analysis); Simulation server; Grid Fabrics]


Evaluation Methodology

• In-depth interviews with members of the atmospheric chemistry model group here at Leeds, covering:

– Demonstration of the prototype

– User testing of the prototype

– Discussion of scenarios involving the use of the prototype (e.g. )

• Analysis

– Interviews recorded and transcribed

– Analysed using techniques from grounded theory


Evaluation

Barriers to adoption:

– Effort required at modelling time for provenance capture

• “[in] your lab book you can write down what ever you want [but with an ELN] it is going to take time to go through the different protocol steps”.

– When asked if they would use an ELN requiring a similar amount of user input to the prototype the response was positive:

• “Yeah, I think it would be a good thing. I don’t think it is too much extra … work.”

– Rather than viewing the prompts for user annotation as an interruption to their normal work, the user recognised the value of being prompted

• “is a good way to do it because otherwise you won’t [record the provenance].”


Evaluation

• Users intuitively grasped the benefits of recording provenance with an ELN, and recognised that a number of stakeholders would realise those benefits after the time of modelling:

– “if someone else wants to look at … [your provenance], that’s great because the person can see exactly what you have done, where you have been and where to go next. And for yourself, if you are writing up a PhD ... [you can] … see exactly what you’ve done whereas currently you have to rifle through lab-books to see exactly what you have done.”


Section 5 Conclusions and future work

• Outlined SeMEEP and ELN

– User evaluation of the proposed modelling ELN

• Addressed case studies

– IUPAC

– MCM

• Developing a case study with the Geomagnetic community

• User and System issues

– Application of activity theory to capture requirements and user evaluation

– Querying and inference

– Address QoS issues (e.g. security, scalability, dynamic role-based access control)


Questions