Delivering Curated Chemistry to the World via Crowdsourced Deposition and Annotation on ChemSpider. Antony Williams, University of Illinois in Chicago, January 27th 2012


Delivering Curated Chemistry to the World via Crowdsourced Deposition and Annotation on ChemSpider

Antony Williams

University of Illinois in Chicago, January 27th 2012


The World of Online Chemistry

Property databases

Compound aggregators

Screening assay results

Scientific publications

Encyclopedic articles (Wikipedia)

Metabolic pathway databases

ADME/Tox data – eTOX for example

Blogs/Wikis and Open Notebook Science

Contributing Open Source code to projects


We Have… Too Much Data!!!


e-Science and Primary Data

How much data generated in a lab, that COULD go public, is lost forever?


TotallySynthetic.com


e-Science and Primary Data

How much data generated in a lab, that COULD go public, is lost forever?

Public Domain reference databases of value?

Syntheses

Properties

Spectra

CIFs

Images


PubChem


ChEMBL


Collaborative Knowledge Management


e-Science and Primary Data

How much data generated in a lab, that COULD go public, is lost forever?

Public Domain reference databases of value?

Syntheses

Properties

Spectra

CIFs

Images

Much of chemistry is chemical structure-based – where and how could we host these data?


RSC’s ChemSpider


Available Information…

Linked to vendors, safety data, toxicity, metabolism


Available Information…


Crowdsourced “Annotations”

Users can add

Descriptions/Syntheses/Commentaries

Links to PubMed articles

Links to articles via DOIs

Add spectral data

Add Crystallographic Information Files

Add photos

Add MP3 files

Add Videos


Spectra


Spectra


Data on the Web


Chemistry Data online is messy

We have inherited errors

All public compound databases, including ours, have errors

“Incorrect” structures – assertions, timelines, etc.

“Incorrect” names associated with structures

Properties

Links

Publications

ENORMOUS CHALLENGE


The Structure of Vitamin K?


MeSH

A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K 1 (phytomenadione) derived from plants, VITAMIN K 2 (menaquinone) from bacteria, and synthetic naphthoquinone provitamins, VITAMIN K 3 (menadione). Vitamin K 3 provitamins, after being alkylated in vivo, exhibit the antifibrinolytic activity of vitamin K. Green leafy vegetables, liver, cheese, butter, and egg yolk are good sources of vitamin K.


The Structure of Vitamin K1?


What is the Structure of Vitamin K1?


CAS’s Common Chemistry


Wikipedia


ChEBI – Manual Curation


“2-methyl-3-(3,7,11,15-tetramethylhexadec-2-enyl)naphthalene-1,4-dione”

Variants of systematic names on PubChem

2-methyl-3-[(E,7R,11R)-3,7,11,15-tetramethyl

2-methyl-3-[(E,7S,11R)-3,7,11,15-tetramethyl

2-methyl-3-[(E,7R,11S)-3,7,11,15-tetramethyl

2-methyl-3-[(E,7S,11S)-3,7,11,15-tetramethyl

2-methyl-3-[(E,11S)-3,7,11,15-tetramethyl

2-methyl-3-[(E)-3,7,11,15-tetramethyl

2-methyl-3-(3,7,11,15-tetramethyl

2-methyl-3-[(E)-3,7,11,15-tetramethyl


Question Everything online: www.dhmo.org


It’s all on Wikipedia…


Chemistry on The Internet Is Messy


It’s Methane…


What’s Methane?


What’s Methane?


What ELSE is Methane???


NLM’s DailyMed


NLM’s DailyMed


NLM’s DailyMed


PHYSPROP Database

The freely downloadable database under the EPI Suite prediction software

Very Basic filters suggest data quality issues
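
The slide does not say which filters were applied to PHYSPROP, so the following is only a sketch of what "very basic" filters could look like. It assumes a record is a dictionary with hypothetical `smiles` and `cas` fields, and it uses the standard CAS Registry Number check-digit rule (digits weighted 1, 2, 3, … from right to left, summed modulo 10).

```python
import re


def cas_check_digit_ok(cas: str) -> bool:
    """Return True if a CAS Registry Number has a valid check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check


def basic_filters(record: dict) -> list:
    """Return a list of problems found in one record.

    The field names 'smiles' and 'cas' are assumptions for this sketch,
    not the actual PHYSPROP schema.
    """
    problems = []
    if not record.get("smiles"):
        problems.append("missing structure")
    if not cas_check_digit_ok(record.get("cas", "")):
        problems.append("invalid CAS number")
    return problems
```

Filters of this kind cannot prove a record is right, but they are cheap to run and flag records that deserve a curator's attention.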


The Stereochemistry challenge.

12500 chemicals with “missed” stereo


With Great Fanfare…


NPC Browser http://tripod.nih.gov/npc/


NPC Browser http://tripod.nih.gov/npc/


Openness and Quality Issues

Williams and Ekins, DDT, 16: 747-750 (2011)

Science Translational Medicine 2011


Public Domain Databases

Our databases are a mess…

Non-curated databases are proliferating errors

We source and deposit data between databases

Original sources of errors hard to determine

Curation is time-consuming and challenging


Stop Whining – Fix it


Crowdsourced Curation

Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate


Search “Vitamin H”


“Curate” Identifiers


“Curate” Identifiers


“Curate” Identifiers


Standards: Structure Standardization


Standards: Structure Standardization


Standards: Structure Standardization


What needs to happen?

Standards

Standardization of structures ChEBI/PubChem sharing

InChI adoption


The InChI Identifier


Multiple Layers
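
To make the layered structure concrete, here is a minimal sketch that splits an InChI string into its layers. It is not a full parser: it assumes a simple standard InChI in which the formula comes second and every later layer starts with a one-letter prefix (for example `c` for connections and `h` for hydrogens), and it ignores cases where a prefix can repeat. The function name is hypothetical.

```python
def split_inchi_layers(inchi: str) -> dict:
    """Split a simple InChI string into a {layer_prefix: content} mapping."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # Assumption: the second segment is the molecular formula.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Remaining layers are keyed by their one-letter prefix.
            layers[part[0]] = part[1:]
    return layers


# Ethanol as an example input.
ethanol = split_inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
```

For ethanol this yields the version (`1S`), the formula (`C2H6O`), the connection layer (`1-2-3`) and the hydrogen layer (`3H,2H2,1H3`) as separate entries.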


InChI Strings Hash to InChIKeys
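
The idea of hashing a variable-length InChI string down to a fixed-length, search-friendly key can be illustrated with a toy function. This is explicitly not the real InChIKey algorithm and will not reproduce real InChIKeys; it only shows the principle of a deterministic, fixed-length key derived from a hash. The function name and the letter-mapping scheme are assumptions made for the sketch.

```python
import hashlib
import string


def toy_structure_key(inchi: str, length: int = 14) -> str:
    """Derive a fixed-length uppercase key from an InChI string.

    Illustration only: real InChIKeys use a different, specified encoding.
    """
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32")
    digest = hashlib.sha256(inchi.encode("utf-8")).digest()
    letters = string.ascii_uppercase
    # Map each of the first `length` digest bytes onto a letter A-Z.
    return "".join(letters[byte % 26] for byte in digest[:length])


methane_key = toy_structure_key("InChI=1S/CH4/h1H4")
ethanol_key = toy_structure_key("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
```

The same input always gives the same key, and the key length does not grow with the size of the molecule, which is what makes hashed keys practical for web search.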


Vancomycin – Search the Internet


Vancomycin

Search Molecular Skeleton

Search Full Molecule


Full Skeleton Search: 104 Hits


Full Molecule Search: 4 Hits
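
The difference between the skeleton search and the full-molecule search can be sketched with InChIKeys: the first block of a key encodes the connectivity (the skeleton), so matching on that block alone returns every stereo and isotopic variant, while matching the whole key returns only the exact molecule. The records and keys below are made-up placeholders, not real vancomycin data, and the function names are hypothetical.

```python
def skeleton_block(inchikey: str) -> str:
    """Return the first (connectivity) block of an InChIKey."""
    return inchikey.split("-")[0]


def search(records: list, query_key: str, skeleton_only: bool = False) -> list:
    """Return records matching the query key, by skeleton or by full key."""
    if skeleton_only:
        target = skeleton_block(query_key)
        return [r for r in records if skeleton_block(r["inchikey"]) == target]
    return [r for r in records if r["inchikey"] == query_key]


# Placeholder records: three variants of one skeleton plus an unrelated compound.
records = [
    {"source": "page-1", "inchikey": "AAAAAAAAAAAAAA-BBBBBBBBSA-N"},
    {"source": "page-2", "inchikey": "AAAAAAAAAAAAAA-CCCCCCCCSA-N"},
    {"source": "page-3", "inchikey": "AAAAAAAAAAAAAA-BBBBBBBBSA-N"},
    {"source": "page-4", "inchikey": "ZZZZZZZZZZZZZZ-BBBBBBBBSA-N"},
]

query = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
skeleton_hits = search(records, query, skeleton_only=True)
full_hits = search(records, query)
```

A large gap between the two hit counts, as in the 104 versus 4 result for vancomycin, suggests that many pages describe the molecule with incomplete or conflicting stereochemistry.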


Crowdsourcing Works

>130 people have deposited data and participated in data curation

Curators at different levels check each other’s work

More curators and depositors are encouraged!


What needs to happen?

Standards

Standardization of structures ChEBI/PubChem sharing

InChI adoption

Collaboration

Stop reinventing the wheel

Share data, share efforts and speed the process


Antony Williams vs Identifiers

Passport ID

Dad, Tony, others

SSN

Green Card

License

5 email addresses

ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)

OpenID

….


Aspirin names and synonyms

• Text searches depend on correct association

• 335 suggested identifiers for Aspirin just on PubChem!

• Disambiguation dictionaries are necessary, not just for authors!
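
A disambiguation dictionary of this kind can be sketched as a mapping from normalized names to structure identifiers. The class below is a hypothetical illustration, not ChemSpider's implementation; it uses the standard InChIKey for aspirin as the structure identifier, and the `record-1` / `record-2` identifiers are placeholders used only to show how an ambiguous name would surface.

```python
def normalize(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


class NameDictionary:
    """Map chemical names and synonyms to structure identifiers."""

    def __init__(self):
        self._names = {}  # normalized name -> set of structure identifiers

    def add(self, name: str, structure_id: str) -> None:
        self._names.setdefault(normalize(name), set()).add(structure_id)

    def lookup(self, name: str) -> set:
        return set(self._names.get(normalize(name), set()))

    def ambiguous_names(self) -> list:
        """Names that point at more than one structure need curation."""
        return sorted(n for n, ids in self._names.items() if len(ids) > 1)


ASPIRIN = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

names = NameDictionary()
for synonym in ("Aspirin", "Acetylsalicylic acid", "2-Acetoxybenzoic acid"):
    names.add(synonym, ASPIRIN)

# Placeholder identifiers showing one name attached to two different records.
names.add("Yohimbine", "record-1")
names.add("yohimbine", "record-2")
```

Text searches then resolve any registered synonym to one structure, and `ambiguous_names()` lists the names a curator should review.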


The Final Search Strategy


All Those Names, One Structure


Ambiguity in Identifiers


Curated Dictionaries Matter


Success Depends on Dictionaries


Validated Name-Structure Dictionaries

Chemical name dictionaries are used for:

Text-mining (publications, patents): used to index PubMed and link to Google Patents

Linking to other databases – think Biology! When structures are not available, drug names provide the link

Searching the web: names link to structures, which link to InChIs
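
As a rough illustration of dictionary-based text-mining, the sketch below scans a passage for names held in a name-to-structure dictionary and reports each hit with its structure identifier. The function name, the sample dictionary and the `ID-1` / `ID-2` identifiers are all hypothetical; a production system would also need to handle overlapping names, spelling variants and ambiguous synonyms.

```python
import re


def find_chemical_names(text: str, dictionary: dict) -> list:
    """Return (position, matched text, structure id) for each dictionary name found."""
    hits = []
    for name, structure_id in dictionary.items():
        # Require non-word characters on both sides so 'aspirin' does not
        # match inside a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.group(0), structure_id))
    return sorted(hits)


# Placeholder dictionary: two synonyms for one structure, plus a second compound.
dictionary = {
    "aspirin": "ID-1",
    "acetylsalicylic acid": "ID-1",
    "vincristine": "ID-2",
}

text = "Aspirin (acetylsalicylic acid) and vincristine were tested."
hits = find_chemical_names(text, dictionary)
```

Every hit is only as reliable as the dictionary entry behind it, which is why the name-structure relationships need to be validated.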


I want to know about “Vincristine”

If all algorithms work, then everything on the page is correct by default except the name-structure relationship!


Vincristine: Identifiers and Properties


Vincristine: Vendors and Sources

Linked by Structure


Vincristine: Patents

Linked by Name


Vincristine: Articles

Linked by Name


Challenges of Complex Molecules

Yohimbine


Originally 15 compounds “called” Yohimbine

54 Skeletons for Yohimbine


Internal and external content

Built to meet primary use-case

Tailored indexes and GUIs

Internal unique language & metadata

Poor interoperability/integration

Powerpoint, Documents, Excel

Many suppliers of systems and content in a single workflow

Literature, Patents, News, Pipeline, SAR, CSRs, Safety, In vivo, etc.

Pharma Information Tombs


What could create change?

Harvard Business Review (2010)

“One change would make a substantial difference [to drug R&D]: the creation of agreed-upon standards for digitally representing drug assets.”


It is so difficult to navigate…

What’s the structure?

Are they in our file?

What’s similar?

What’s the target?

Pharmacology data?

Known Pathways?

Working On Now?

Connections to disease?

Expressed in right cell type?

Competitors?

IP?


Open PHACTS Project

Develop a set of robust standards…

Implement the standards in a semantic integration hub

Deliver services to support drug discovery programs in pharma and public domain

22 partners, 8 pharmaceutical companies, 3 biotechs

36-month project

Guiding principle is open access, open usage, open source

- Key to standards adoption -


ChemSpider Resources for Chemistry


Internet Data

The Future

Commercial Software

Pre-competitive Data

Open Science

Open Data

Publishers

Educators

Open Databases

Chemical Vendors

Small organic molecules

Undefined materials

Organometallics

Nanomaterials

Polymers

Minerals

Particle bound

Links to Biologicals


The Future of Chemistry on the Web?

Public compound databases federate & build a linked environment of validated data!

Data validation needs are not ignored

Publishers layer on information to make publications discoverable

Public-Private databases can be linked

Open Data proliferate

The “Semantic Web” in action


Acknowledgments

The ChemSpider team

Our data providers, depositors, collaborators and curators

Software providers – OpenEye, ChemDoodle, ACD/Labs, GGA Software, Open Source (Jmol, JSpecView, OpenBabel)

Sean Ekins @collabchem


Thank you

Email: [email protected]

Twitter: ChemConnector

Blog: www.chemspider.com/blog

Personal Blog: www.chemconnector.com

SLIDES: www.slideshare.net/AntonyWilliams