A centre of expertise in digital information management (www.ukoln.ac.uk). Codes, Clouds & Constellations: Open Science in the Data Decade. Dr Liz Lyon, Director, UKOLN, University of Bath, UK; Associate Director, UK Digital Curation Centre. CNI Meeting, Baltimore, April 2010. This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0.

Codes, Clouds & Constellations: Open Science in the Data Decade

Presentation given at the CNI Meeting, Baltimore in April 2010.

Page 1: Codes, Clouds & Constellations: Open Science in the Data Decade

                                                             

A centre of expertise in digital information management

www.ukoln.ac.uk

UKOLN is supported by:

Codes, Clouds & Constellations: Open Science in the Data Decade

Dr Liz Lyon, Director, UKOLN, University of Bath, UK

Associate Director, UK Digital Curation Centre

CNI Meeting, Baltimore, April 2010

This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0

Page 2: Codes, Clouds & Constellations: Open Science in the Data Decade

1. Scaling to Share

2. Publication and Attribution

3. Pathways to Participation

4. Institutions and Informatics

http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/publications.html#november-2009

•2010 Perspectives

•November 2009

•Consultation

•eResearch Australasia slides •http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html#2009-november-australasia

•Progress, Prospects?

Page 3: Codes, Clouds & Constellations: Open Science in the Data Decade

Scaling to Share

Human Genome printed http://www.flickr.com/photos/johnjobby/2252981353/sizes/l/

Page 4: Codes, Clouds & Constellations: Open Science in the Data Decade

From the Laboratory bench....

Page 5: Codes, Clouds & Constellations: Open Science in the Data Decade

…to a national crystallography service....

Page 6: Codes, Clouds & Constellations: Open Science in the Data Decade

....to Diamond Light Source

Page 7: Codes, Clouds & Constellations: Open Science in the Data Decade

• “Bridging the chasm” between the local laboratory bench and large scale facilities

• Develop Integrated Information Model

• Use cases and Inter-disciplinary Pilots

• Cost-benefit analysis: before and after

http://www.ukoln.ac.uk/projects/I2S2/

Page 8: Codes, Clouds & Constellations: Open Science in the Data Decade

| | Diamond Light Source | National Crystallography Service (NCS) | Local Earth Sciences Lab, University of Cambridge |
|---|---|---|---|
| Function | International service - multiple communities | UK service - multiple institutions. Also uses Diamond | Lone researcher at institution - uses NCS and ISIS large-scale facility |
| Administration | Peer-reviewed proposal required | Paper-based records – experiments, safety ERA, instrument time | Multiple proposals, multiple forms |
| Metadata | Core Scientific MetaData Model | eBank/eCrystals schema | ? |
| Identifiers | Beam-line number | DOI, InChI | ? |
| Workflow | Formulaic and bespoke | Formulaic, unrecorded | Complex, unrecorded |
| Software | In-house scripts | In-house scripts + open-source suite | In-house scripts + open-source suite |
| Raw data | In-house GDA store | ATLAS data-store | Laptop / local server |
| Derived data | Taken offsite on laptop / USB stick | eCrystals repository | Laptop / local server / USB stick |

Page 10: Codes, Clouds & Constellations: Open Science in the Data Decade

Technology race to market

$1000 genome in <15 minutes ....by 2013?

Page 11: Codes, Clouds & Constellations: Open Science in the Data Decade

...data deluge challenges....

• Large-scale data storage that is:
    – Cost-effective (rent on-demand)
    – Secure (privacy and IPR)
    – Robust and resilient
    – Low entry barrier / ease-of-use
    – Has data-handling / transfer / analysis capability

• Move sequencing out of genome centres

• “....analyse an entire human genome in a single day sitting with a laptop at your local Starbucks.”

...cloud services?

Page 12: Codes, Clouds & Constellations: Open Science in the Data Decade

...data clouds in the media

Page 13: Codes, Clouds & Constellations: Open Science in the Data Decade

Clients in the cloud

Page 15: Codes, Clouds & Constellations: Open Science in the Data Decade

Post-genome decade

Human genomes: >24 published & almost 200 unpublished

Page 16: Codes, Clouds & Constellations: Open Science in the Data Decade

“P4 medicine: predictive, personalised, preventive, participatory.”

Leroy Hood – Institute for Systems Biology

• Each patient’s genome sequenced

• Your genome is the basis of your medical record

• New predictive models of health and disease

• Individualised treatments focusing on preventative therapies

Image from Scientific American

Genome scale network biology

Genomic data as a commodity

Page 17: Codes, Clouds & Constellations: Open Science in the Data Decade

• Sage Bionetworks: Integrative genomics

• Develop predictive models of disease: liver / breast / colon cancer, diabetes, obesity

• Open data in the Sage Commons

• Human and mouse: clinical and genetics data

• Congress San Francisco 23-24 April 2010

Stephen Friend

Page 18: Codes, Clouds & Constellations: Open Science in the Data Decade

They have shared their data….

Page 19: Codes, Clouds & Constellations: Open Science in the Data Decade

Heather Piwowar

…but many researchers don’t share…

…and are reluctant to re-use data…

Page 20: Codes, Clouds & Constellations: Open Science in the Data Decade

Publication and

Attribution

http://www.flickr.com/photos/digitalfemme57/3271063366/

Page 21: Codes, Clouds & Constellations: Open Science in the Data Decade

Calls for action, new metrics

Page 22: Codes, Clouds & Constellations: Open Science in the Data Decade

• Journal

• Article

• Workflow

• Data

• Annotation

• Concept

Macro

Micro / Nano

Attribution granularity

... complexity challenges...

Page 23: Codes, Clouds & Constellations: Open Science in the Data Decade

Citing network models

• Multiple data sources

• Many standards

• Workflow integration

• User requirements

• Service functionality?

Page 24: Codes, Clouds & Constellations: Open Science in the Data Decade

Pathways to Participation

http://www.flickr.com/photos/lemontwist/502860137/sizes/o/

Page 25: Codes, Clouds & Constellations: Open Science in the Data Decade

Continuum of Openness

Open access

Closed Access

Participation

Lone scholar

Professional, experts

Volunteers, interested amateurs

Citizen science

“dark data”

Creative Commons Attribution-Non-Commercial-Share Alike 2.0

Page 26: Codes, Clouds & Constellations: Open Science in the Data Decade

Data Informatics: Logistics dilemma

Professional scientist

Citizens

Capability

Capacity

Data scientists, LIS

Peer production

Volunteers, interested amateurs

Community curation

Creative Commons Attribution-Non-Commercial-Share Alike 2.0

Professional scientist

Observations

Audit

Preservation

Ontologies

Metadata schema

Annotation

Data management plans

Selection & Appraisal

Data cleansing

Training

Visualisation

Page 28: Codes, Clouds & Constellations: Open Science in the Data Decade

Peer Production

Page 29: Codes, Clouds & Constellations: Open Science in the Data Decade
Page 30: Codes, Clouds & Constellations: Open Science in the Data Decade

Using gaming to drive curation

Page 31: Codes, Clouds & Constellations: Open Science in the Data Decade

| Professional Scientists | Enthusiastic amateurs |
|---|---|
| Training | Citizen scientist |
| Standards and ethics | Local: natural history, environ. |
| Peer-review | Global: astronomy |
| Organisational support | Self-supporting |

Page 32: Codes, Clouds & Constellations: Open Science in the Data Decade

Citizen science...

Page 33: Codes, Clouds & Constellations: Open Science in the Data Decade

Privacy issues?

… “participatory urbanism”?

Page 34: Codes, Clouds & Constellations: Open Science in the Data Decade

“You have zero privacy anyway. Get over it”

Scott McNealy, CEO Sun Microsystems, 1999

Page 35: Codes, Clouds & Constellations: Open Science in the Data Decade
Page 36: Codes, Clouds & Constellations: Open Science in the Data Decade

Working with science professionals

...cultural challenges for faculty?

Page 37: Codes, Clouds & Constellations: Open Science in the Data Decade

Institutions and Informatics

University of Edinburgh Informatics Forum http://www.flickr.com/photos/chris_malcolm/2638210422/sizes/l/

Page 38: Codes, Clouds & Constellations: Open Science in the Data Decade

Open Science at Web-Scale Report 2009

Page 39: Codes, Clouds & Constellations: Open Science in the Data Decade

Institutional response: High Throughput Biology

Page 40: Codes, Clouds & Constellations: Open Science in the Data Decade

• North Carolina universities

• Cyber-infrastructure project

• Data cloud across three campuses

• “regional”

• Policy & practice

Page 41: Codes, Clouds & Constellations: Open Science in the Data Decade

New data support structures

Page 42: Codes, Clouds & Constellations: Open Science in the Data Decade

Facilitating team science

- Future Chips

- Biocomputation & Bioinformatics

- Tetherless World

- Integrative Systems Biology

- Graphic designers?

- Animators?

- Social scientists?

- Legal experts?

Page 43: Codes, Clouds & Constellations: Open Science in the Data Decade
Page 44: Codes, Clouds & Constellations: Open Science in the Data Decade
Page 45: Codes, Clouds & Constellations: Open Science in the Data Decade

Embedding data informatics education

...for faculty & LIS...

Page 46: Codes, Clouds & Constellations: Open Science in the Data Decade

Take homes

1. Data sharing requires pragmatic solutions

2. Attribution granularity & citation complexity

3. We need “the crowd”

4. Institutional strategies embrace informatics

5. The prospects are transformational...

http://www.flickr.com/photos/29170077@N05/4412360636/

Page 47: Codes, Clouds & Constellations: Open Science in the Data Decade

Slides will be available at: http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html

http://www.dcc.ac.uk/