Triples And Access

DESCRIPTION

Presentation given at the inaugural meeting of the Concept Web Alliance, 8 May 2009

Text of Triples And Access

  • 1. Triples & Access Jan Velterop
  • 2. There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact. Mark Twain, Life on the Mississippi
  • 3. Oh yeah? We have far too few returns in terms of usable knowledge out of such an overwhelming investment of fact! A lot of fact is deeply hidden!
  • 4. Current Knowledge Transfer A metaphor (is Greek for truck after all) Needle transport
  • 5. Information overload? Too much knowledge? Stop acquiring it? Just filtering it? Or organisation underload? Lack of conceptual structure? Unprecedented opportunity?
  • 6. Information overload? Too much knowledge? Stop acquiring it? Just filtering it? Or organisation underload? Lack of conceptual structure? Unprecedented opportunity!
  • 7. Another metaphor: What is the use of water?
  • 8. H2O Drink (take in)
  • 9. What is the use of information?
  • 10. Age to Know Read (take in)
  • 11. Publish articles
  • 12. Stretching the water metaphor: It's already raining; we must build the ark
  • 13. The animals to come on board:
  • 14. Slide by Carl Lagoze (Cornell) from this presentation: http://journal.webscience.org/112/3/orechem.pdf
  • 15. Stretching the metaphor further: If you need water, rain is free
  • 16. But if you want quality control and convenience:
  • 17. (node 1, unique ID) (node 2, unique ID) < Source concept > < Relations (edge) > < Target Concept > class date value owner condition DOI. All Triples Smart Triples curated curated curated Curated Remove Co-occ Observational Ambiguity and Redundancy Inferred Knowledge Space
  • 18. (node 1, unique ID) (node 2, unique ID) < Source concept > < Relations (edge) > < Target Concept > class date value author condition DOI } Database facts (multiple attributes) Community Annotations F+C+A+ Co-occurrence sentence (abstracts e.g. PubMed) Co-occurrence Full Text (publisher e.g. Springer) C+A+ Concept Profile Match Co-expression (gene expression Databases) A+ Modelling hypothesis (e.g. Plectix, InWeb) Multiple Triples T-Cell Development Graph Building (e.g. WikiPathways) Unique to 101668678 Cancer Promoting Genes Interleukin-7 Unique to Springer Unique to Plectix
  • 19. Unique to 101668678
  • 20. (node 1, unique ID) (node 2, unique ID) < Source concept > < Relations (edge) > < Target Concept > class date value author condition DOI } Database facts (multiple attributes) Community Annotations F+C+A+ Co-occurrence sentence (abstracts e.g. PubMed) Co-occurrence Full Text (publisher e.g. Springer) C+A+ Concept Profile Match Co-expression (gene expression Databases) A+ Modelling hypothesis (e.g. Plectix, InWeb) Multiple Triples T-Cell Development Graph Building (e.g. WikiPathways) Unique to 101668678 Cancer Promoting Genes Interleukin-7 Unique to Springer Unique to Plectix
  • 21. (node 1, unique ID) (node 2, unique ID) < Source concept > < Relations (edge) > < Target Concept > class date value owner condition Etc. Triples Smart Triples. In these areas significant value is added to the triples. Curated: Remove Ambiguity and Redundancy. Observational: Remove Ambiguity and Redundancy. Inferred; constructed: Remove Ambiguity and Redundancy. Knowledge Space
  • 22. The trustmark CWA™: Triple model, Best practice, Interoperability, Et cetera
  • 23. Download Concept Web Alliance certified triples. Includes edges from: PubMed (400,000,000 sentences, 5,000,000,000 concept co-occurrences) (from public data); Protein databases (UniProt, IntAct, PDB, HPRD 75,000 human curated PPIs) (from public data); Gene (co-expression databases (GEO, Express 25 square genes) (from public data); STRING edges (200,000 gene-gene edges) (from semi public data); InWeb edges (240,000 unique edges from 17 species) (from proprietary data); Reactome edges (240,000 unique edge