NCI/CADD Chemical Structure Web Services
Markus Sitzmann
Computer-Aided Drug Design Group, Chemical Biology Laboratory, Frederick National Laboratory for Cancer Research, NIH, DHHS
http://cactus.nci.nih.gov
Chemical Structure Web API
[Diagram components: NCI/CADD web service (Chemical Identifier Resolver), NCI/CADD Chemical Structure DataBase (CSDB), CACTVS, OPSIN, external web services and other software packages, connected via http]
Chemical Structures
[Diagram: identifiers and representations linked to a chemical structure]
• NCI/CADD Identifiers
• InChI/InChIKey
• ChemSpider ID
• PubChem SID/CID
• chemical names
• CAS Registry Number
• NSC number
• FDA UNII
• ChemNavigator SID
• SMILES
• SD File
• Chemical Formula
• ChEBI ID
• PDB Ligand ID
• MRV
• CML
• SYBYL Line Notation
• GIF image
Chemical Identifier Resolver (CIR)
http://cactus.nci.nih.gov/chemical/structure
CIR works as a resolver for different chemical structure identifiers or representations. It allows one to convert a given structure identifier into another representation or structure identifier.
• officially released in June 2009
• since then four beta versions (for testing, learning, and gaining experience)
• one larger database update in March 2010
• since early 2012: major internal rewrite (which will allow us to add new services and API functionality while not breaking the existing API)
• major database update and services planned for 2013
CIR Usage Statistics
[Chart: requests per month since June 2009; y-axis from 0 to 12,000,000]
Typical number of unique IP addresses per month: 4,000 – 8,000
Top Users (US)
Academic/Hospitals
• St. Olaf College
• Carnegie Mellon
• Drexel University
• Princeton
• Mayo
Pharma/Chemical Industry
• Eli Lilly
• Dow Chemical
• Intermune
• Procter & Gamble
• Vertex
U.S. Government
• EPA
• NIH (NIEHS, NCI, NLM...)
• Lawrence Livermore Natl. Lab.
• CDC
• DoD
Other
• Google
• Amazon
• HP
• Agilent
• Symyx
External web services and applications
Examples using CIR
• CIR node for KNIME, by Talete s.r.l.
• Lab Helper app for Windows Phone
• Avogadro molecule editor
• Jmol/JSmol open-source viewer for chemical structures in 3D
• GChem for Google Spreadsheet
• Bioclipse (CIR plugin)
• Macs in Chemistry
• Accelrys Draw
...and educational tools/sites such as:
• Jmol/JSmol Virtual Molecular Model Kit
• ISU CheMagic
• Caltech Library
Chemical Identifier Resolver (CIR)
SD file:
C7H6O2
APtclcactv03051222202D 0 0.00000 0.00000
 15 15 0 0 0 0 0 0 0 0999 V2000
 2.8660 -2.0600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 3.7321 -1.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 3.7321 -0.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 2.8660 -0.0600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 2.0000 -0.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 2.0000 -1.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 2.8660 0.9400 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
 3.7321 1.4400 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
 2.0000 1.4400 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
 2.8660 -2.6800 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
 4.2690 -1.8700 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
 4.2690 -0.2500 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
 1.4631 -0.2500 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
 1.4631 -1.8700 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
 3.7321 2.0600 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
 1 2 2 0 0 0 0
 2 3 1 0 0 0 0
 3 4 2 0 0 0 0
 4 5 1 0 0 0 0
 5 6 2 0 0 0 0
 1 6 1 0 0 0 0
 4 7 1 0 0 0 0
 7 8 1 0 0 0 0
 7 9 2 0 0 0 0
 1 10 1 0 0 0 0
 2 11 1 0 0 0 0
 3 12 1 0 0 0 0
 5 13 1 0 0 0 0
 6 14 1 0 0 0 0
 8 15 1 0 0 0 0
M END
$$$$
ChemWriter Editor
names:
benzoic acid
65-85-0
WLN: QVR
Unisept BZA
AIDS018010
Salvo liquid
Benzoic acid-ring-UL-14C
ST5213864
Benzoesaeure
CHEBI:30746
NSC 149
benzenecarboxylic acid
phenylformic acid
Benzoic acid (JP15/USP)
Benzoic acid (TN)
18102_RIEDEL
Aromatic hydroxy acid
Benzoic acid (7CI,8CI,9CI)
Benzoic acid [USAN:JAN]
W213128_ALDRICH
47849_SUPELCO
Acide benzoique [French]
Acido benzoico [Italian]
Benzoate (VAN)
Benzoesaeure [German]
Benzoic acid (natural)
Acide benzoique
Benzeneformic acid
Benzenemethanoic acid
Benzoesaeure GK
Benzoesaeure GV
Benzoic acid, tech.
Carboxybenzene
Kyselina benzoova
Phenylcarboxylic acid
InChIKey: InChIKey=WPYMKLBDIGXBTP-UHFFFAOYSA-N
InChI: InChI=1S/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)
SMILES: C1=CC=C(C=C1)C(O)=O
programmatic URL API:
http://cactus.nci.nih.gov/chemical/structure/"identifier"/"representation"
if a request is not successful: HTTP 404 status message
• access from Unix shell level (e.g., via wget):

    shell > wget -qO - http://cactus.nci.nih.gov/chemical/structure/tamiflu/cas
    204255-11-8

• access by programming libraries/languages (e.g. Python):

    from urllib.request import urlopen
    from urllib.error import HTTPError

    url = "http://cactus.nci.nih.gov/chemical/structure/tamiflu/cas"
    try:
        # urlopen raises HTTPError (e.g. 404) when CIR cannot resolve the request
        response = urlopen(url).read().decode()
    except HTTPError:
        raise  # replace with your own error handling
    print(response)
    204255-11-8
examples:
http://cactus.nci.nih.gov/chemical/structure/PGZUMBJQJWIWGJ-ONAKXNSWSA-N/cas
204255-11-8 (MIME type: text/plain)
http://cactus.nci.nih.gov/chemical/structure/tamiflu/image
(MIME type: image/gif)
CIR
http://cactus.nci.nih.gov/chemical/structure/"identifier"/"representation"

"identifier":
• chemical names
• IUPAC names (OPSIN)
• CAS numbers
• SMILES strings
• IUPAC InChI/InChIKeys
• NCI/CADD Identifiers
• CACTVS HASHISY
• NSC number
• PubChem SID
• ZINC Code
• ChemSpider ID
• ChemNavigator SID
• eMolecule VID
• UNII

"representation":
• /smiles
• /names, /iupac_name
• /cas
• /inchi, /stdinchi
• /inchikey, /stdinchikey
• /ficts, /ficus, /uuuuu
• /image
• /file, /sdf
• /mw, /monoisotopic_mass
• /formula
• /twirl
• /urls
• /chemspider_id
• /pubchem_sid
• /chemnavigator_sid
(Partial) InChIKey Lookup
• resolve Standard InChIKey into full structure representation: Ethanol

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA-N/smiles
CCO

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA/smiles
CCO
CC[OH2+]

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ/smiles
C(C(O)([2H])[2H])[2H]
CC(O)([2H])[2H]
C(CO)([2H])([2H])[2H]
CC[17OH]
C(CO)[2H]
[14CH3]CO
CCO
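The three lookups above differ only in how much of the InChIKey is supplied. The following minimal Python sketch classifies a key before it is sent to CIR, assuming the usual InChIKey layout of a 14-letter block, a 10-letter block and a single final letter separated by hyphens. The function name `inchikey_level` and its return labels are illustrative and not part of the CIR API.

```python
import re

# Assumed InChIKey layout: 14 letters, 10 letters, 1 letter (hyphen-separated).
_PATTERNS = {
    "full": re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$"),
    "first_two_blocks": re.compile(r"^[A-Z]{14}-[A-Z]{10}$"),
    "first_block": re.compile(r"^[A-Z]{14}$"),
}


def inchikey_level(key):
    """Return which part of an InChIKey was supplied, or None if nothing matches."""
    key = key.strip()
    # CIR output may carry the "InChIKey=" prefix; drop it before matching.
    if key.startswith("InChIKey="):
        key = key[len("InChIKey="):]
    for level, pattern in _PATTERNS.items():
        if pattern.match(key):
            return level
    return None


if __name__ == "__main__":
    for key in ("LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
                "LFQSCWFLJHTTHZ-UHFFFAOYSA",
                "LFQSCWFLJHTTHZ"):
        print(key, "->", inchikey_level(key))
```

A caller could use the returned level to decide whether to expect a single structure (full key) or a list of candidates (partial key).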
Chemical File Representation
http://cactus.nci.nih.gov/chemical/structure/Aspirin/file?format=sdf
• available file format representations:
alc          Alchemy format
cdxml        CambridgeSoft ChemDraw XML format
cerius       MSI Cerius II format
charmm       Chemistry at HARvard Macromolecular Mechanics file format
cif          Crystallographic Information File
cml          Chemical Markup Language
gjf          Gaussian input data file
gromacs      GROMACS file format
hyperchem    HyperChem file format
jme          Java Molecule Editor format
maestro      Schroedinger MacroModel structure file format
mol          Symyx molecule file
sybyl2/mol2  Tripos Sybyl MOL2 format
mrv          ChemAxon MRV format
pdb          Protein Data Bank
sdf          Symyx Structure Data Format
sdf3000      Symyx Structure Data Format 3000
sln          SYBYL Line Notation
smiles       SMILES
xyz          xyz file format
Chemical Structure Images (GIF, PNG)
Buckyball:
http://cactus.nci.nih.gov/chemical/structure/XMWRBQBLMFGWIX-UHFFFAOYSA-N/image?height=300&width=300&bgcolor=black&bondcolor=white
Aspirin:
http://cactus.nci.nih.gov/chemical/structure/Aspirin/image?height=200&width=200&symbolfontsize=7&footer="Aspirin"
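File and image requests pass their options as URL query parameters. A minimal Python sketch for assembling such URLs is shown here; the helper name `cir_url` is hypothetical, and percent-encoding the identifier is an assumption about what the server accepts rather than documented CIR behaviour. The parameter names (`format`, `height`, `width`, `bgcolor`, `bondcolor`) are the ones shown in the example URLs.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier, representation, **params):
    """Build a CIR request URL; keyword arguments become query parameters."""
    # Percent-encode the identifier so names with spaces or slashes stay in one path segment.
    url = f"{BASE_URL}/{quote(identifier, safe='')}/{representation}"
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(cir_url("Aspirin", "file", format="sdf"))
    print(cir_url("XMWRBQBLMFGWIX-UHFFFAOYSA-N", "image",
                  height=300, width=300, bgcolor="black", bondcolor="white"))
```

The resulting string can be passed to any HTTP client; the sketch itself performs no network access.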
Chemical Properties
• request molecular weight (Aspirin):
http://cactus.nci.nih.gov/chemical/structure/BSYNRYMUTXBXSQ-UHFFFAOYSA-N/weight
180.1598 (MIME type: text/plain)

/mw                          molecular weight
/formula                     formula
/monoisotopic_mass           monoisotopic mass
/h_bond_donor_count          H bond donor count
/h_bond_acceptor_count       H bond acceptor count
/h_bond_center_count         H bond center count
/rotor_count                 number of rotatable bonds
/effective_rotor_count       number of effectively rotatable bonds
/rule_of_5_violation_count   number of Rule-of-5 violations
/xlogp2                      octanol−water partition coefficient XLOGP2
/aromatic                    compound is aromatic
/macrocyclic                 compound is macrocyclic
/heteroatom_count            heteroatom count
/hydrogen_atom_count         H atom count
/heavy_atom_count            heavy atom count
/deprotonable_group_count    number of deprotonable groups
/protonable_group_count      number of protonable groups
/ring_count                  number of rings
/ringsys_count               number of ring systems
Chemical Names
• request (alternative) names:
http://cactus.nci.nih.gov/chemical/structure/Aspirin/names/xml

<?xml version="1.0" encoding="UTF-8" ?>
<request string="Aspirin" representation="names">
<data id="1" resolver="name" string_class="Name">
<item id="1" classification="pubchem_iupac_name">2-acetyloxybenzoic acid</item>
<item id="2" classification="pubchem_iupac_openeye_name">2-Acetoxybenzoic acid</item>
<item id="3" classification="pubchem_generic_registry_name">50-78-2</item>
<item id="4" classification="pubchem_generic_registry_name">11126-35-5</item>
<item id="5" classification="pubchem_generic_registry_name">11126-37-7</item>
<item id="6" classification="pubchem_generic_registry_name">2349-94-2</item>
<item id="7" classification="pubchem_generic_registry_name">26914-13-6</item>
<item id="8" classification="pubchem_substance_synonym">NCGC00090977-04</item>
<item id="9" classification="pubchem_substance_synonym">KBioSS_002272</item>
<item id="10" classification="pubchem_substance_synonym">SBB015069</item>
<item id="11" classification="pubchem_substance_synonym">Aspirin</item>
<item id="12" classification="pubchem_substance_synonym">D00109</item>
[…]
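The XML form of a names request groups every name under a `classification` attribute. As a minimal sketch, the following Python code parses a shortened copy of the Aspirin response with the standard library and collects the names per classification. The sample string is abridged from the response shown on this slide (XML declaration omitted, closing tags added so it is well-formed), and the function name `names_by_classification` is illustrative.

```python
import xml.etree.ElementTree as ET

# Abridged sample of a /names/xml response; closing tags added for well-formedness.
SAMPLE = """<request string="Aspirin" representation="names">
  <data id="1" resolver="name" string_class="Name">
    <item id="1" classification="pubchem_iupac_name">2-acetyloxybenzoic acid</item>
    <item id="3" classification="pubchem_generic_registry_name">50-78-2</item>
    <item id="4" classification="pubchem_generic_registry_name">11126-35-5</item>
    <item id="11" classification="pubchem_substance_synonym">Aspirin</item>
  </data>
</request>
"""


def names_by_classification(xml_text):
    """Map each classification attribute to the list of names carrying it."""
    root = ET.fromstring(xml_text)
    grouped = {}
    for item in root.iter("item"):
        grouped.setdefault(item.get("classification"), []).append(item.text)
    return grouped


if __name__ == "__main__":
    for classification, names in names_by_classification(SAMPLE).items():
        print(classification, names)
```

Grouping by classification makes it easy to pick, for example, only the registry numbers from a long synonym list.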
Chemical Name Pattern Search
• Google-like searches on CIR's name index (approx. 70 million names)
• based on the open source full text search server Sphinx (http://sphinxsearch.com)

example: all chemical names that contain the words "morphine" and "methyl" (name pattern: '+morphine +methyl'):
http://cactus.nci.nih.gov/chemical/structure/+morphine +methyl/stdinchikey/xml?resolver=name_pattern
Search name pattern '+morphine +methyl': 7 matching names

<request string="+morphine +methyl" representation="stdinchikey">
<data id="1" resolver="name_pattern" notation="Morphine 3-methyl ether">
<item id="1">InChIKey=OROGSEYTTFOCAN-DNJOTXNNSA-N</item>
</data>
<data id="2" resolver="name_pattern" notation="6-Methyl-delta(sup 6)-deoxy-morphine">
<item id="1">InChIKey=CUFWYVOFDYVCPM-GGNLRSJOSA-N</item>
</data>
<data id="3" resolver="name_pattern" notation="Morphine, dihydro-6-methyl-">
<item id="1">InChIKey=NBKVWIJQJMEQLE-NGTWOADLSA-N</item>
</data>
<data id="4" resolver="name_pattern" notation="6-METHYL-MORPHINE ETHER">
<item id="1">InChIKey=FNAHUZTWOVOCTL-UHFFFAOYSA-N</item>
</data>
<data id="5" resolver="name_pattern" notation="Morphine alcoholic methyl ether">
<item id="1">InChIKey=FNAHUZTWOVOCTL-XSSYPUMDSA-N</item>
</data>
<data id="6" resolver="name_pattern" notation="N-Methyl morphine chloride">
<item id="1">InChIKey=MJNCZWBHCFTYFU-SCLAZZCHSA-N</item>
</data>
<data id="7" resolver="name_pattern" notation="Morphine, 7-hydroxy-6,6-dimethoxy-3-O-methyl-">
<item id="1">InChIKey=URFKRBIESURBKC-UHFFFAOYSA-N</item>
</data>
</request>
example: chemical names that contain the words "morphine" and "methyl" but not "hydroxyl" (name pattern: '+morphine +methyl -hydroxyl'):
http://cactus.nci.nih.gov/chemical/structure/+morphine +methyl -hydroxyl/stdinchikey/xml?resolver=name_pattern

example: chemical names that contain the substring "morphine" somewhere in the name (name pattern: '*morphine*'):
http://cactus.nci.nih.gov/chemical/structure/*morphine*/stdinchikey/xml?resolver=name_pattern

example: chemical names that contain a single character "m" and the word "benzene" within a maximum distance of 3 words (finds meta-substituted aromatic compounds, name pattern: '"m benzene"~3'):
http://cactus.nci.nih.gov/chemical/structure/(m benzene)~3/stdinchikey/xml?resolver=name_pattern
6 matching names
45 matching names
22 matching names
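Name patterns contain spaces and the characters `+`, `-` and `*`, so a program has to encode them before placing them in a URL path. The following Python sketch composes a pattern from required and excluded words and builds the request URL. The helper names `name_pattern` and `name_pattern_url` are hypothetical, and the sketch assumes that CIR accepts standard percent-encoding of the pattern; only `resolver=name_pattern` and the `/stdinchikey/xml` path are taken from the examples above.

```python
from urllib.parse import quote

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def name_pattern(required=(), excluded=()):
    """Compose a pattern such as '+morphine +methyl -hydroxyl'."""
    parts = [f"+{word}" for word in required] + [f"-{word}" for word in excluded]
    return " ".join(parts)


def name_pattern_url(pattern, representation="stdinchikey"):
    """Build the XML request URL for a name pattern search."""
    # safe='' also encodes '+', which would otherwise be ambiguous in some clients.
    encoded = quote(pattern, safe="")
    return f"{BASE_URL}/{encoded}/{representation}/xml?resolver=name_pattern"


if __name__ == "__main__":
    pattern = name_pattern(required=["morphine", "methyl"], excluded=["hydroxyl"])
    print(pattern)
    print(name_pattern_url(pattern))
```

Keeping pattern construction separate from URL building lets the same pattern string be logged or displayed in its readable form.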
NCI/CADD Chemical Structure DataBase CSDB 2010
Chemical Structure Normalization/Identifier
[Diagram: original structure record (Molfile, SDF, SMILES, ChemDraw cdx, PDB; from SDF, SMILES or database) → structure normalization → parent structure → hashcode calculation (E_HASHISY) → NCI/CADD Identifier]
original structure records, parent structures and identifiers are stored in the database
• stepwise process:
• calculation of a set of parent structures with different sensitivity to chemical features: FICTS, FICuS, uuuuu
[Diagram: original structure record → structure normalization → parent structures (FICTS, FICuS, uuuuu) → hashcode calculation (E_HASHISY) → NCI/CADD Identifiers (FICTS, FICuS, uuuuu)]
all steps are performed using CACTVS
NCI/CADD Identifiers (FICTS, FICuS, uuuuu)
based on CACTVS hashcodes (HASHISY): 16-digit hexadecimal number (64-bit unsigned)
structure normalization - histidine:
[Figure: five histidine input structures (tautomer 1, tautomer 2, salt, S, R) with the identifiers calculated for them]
FICTS: 6C16DE2351F9FF50-FICTS, 9850FD9F9E2B4E25-FICTS, E92E4BA2869F3611-FICTS, 8A7AD1EB498CC76A-FICTS, E5F83F10C5DB080A-FICTS
FICuS: 9850FD9F9E2B4E25-FICuS (twice), E92E4BA2869F3611-FICuS, 8A7AD1EB498CC76A-FICuS, E5F83F10C5DB080A-FICuS
uuuuu: 9850FD9F9E2B4E25-uuuuu (all five structures)
common hashcode: 9850FD9F9E2B4E25
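An NCI/CADD Identifier as written on this slide is a 64-bit unsigned hashcode printed as 16 hexadecimal digits, followed by a hyphen and the identifier type. The Python sketch below only formats and parses that textual form; the hashcode itself is computed by CACTVS and is not reproduced here. The function names are illustrative.

```python
SUFFIXES = ("FICTS", "FICuS", "uuuuu")


def format_identifier(hashcode, suffix):
    """Render a 64-bit unsigned hashcode as '<16 hex digits>-<suffix>'."""
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown identifier type: {suffix!r}")
    if not 0 <= hashcode < 2 ** 64:
        raise ValueError("hashcode must fit into 64 bits (unsigned)")
    # 016X: zero-padded to 16 digits, uppercase hexadecimal.
    return f"{hashcode:016X}-{suffix}"


def parse_identifier(identifier):
    """Split an identifier string into (integer hashcode, suffix)."""
    hex_part, _, suffix = identifier.partition("-")
    if len(hex_part) != 16 or suffix not in SUFFIXES:
        raise ValueError(f"not an NCI/CADD Identifier: {identifier!r}")
    return int(hex_part, 16), suffix


if __name__ == "__main__":
    value, kind = parse_identifier("9850FD9F9E2B4E25-FICuS")
    print(value, kind)
    print(format_identifier(value, "uuuuu"))
```

Comparing the integer parts of two identifiers of the same type is then enough to test whether two records share a parent structure at that sensitivity level.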
• calculation of Standard InChIKey from the union set of parent structures
[Diagram: original structure record → structure normalization → parent structures (FICTS, FICuS, uuuuu) → hashcode calculation (E_HASHISY) → NCI/CADD Identifiers; union set → Standard InChIKey 1.03]
Chemical Structure Database (CSDB)
current status (released March 2010):
• 140 chemical structure databases
• 120 million structure records
• 84.6 million unique structures by FICuS
• 110 million Standard InChIKeys for lookup
sources:
• ChemNavigator iResearch Library (~56%): compilation of commercially available screening compounds from ~300 international chemistry suppliers
• PubChem Substance Database (~38%): including Open NCI database, EPA DSSTox databases, NIAID HIV database, NIST Webbook, NLM ChemIDplus, ChemSpider, …
• Commercial sources / others (~6%): Asinex, Comgenex, eMolecules, …
NCI/CADD Chemical Structure DataBase CSDB 2013
Chemical Structure Database 2013
• >270 small-molecule databases
• >600 database releases (full, incremental, "historic versions")
• 385 million original database records
unique structure count:
• FICTS ~125.0 million
• FICuS ~121.4 million
• uuuuu ~109.0 million
• union set: 141.7 million unique structures
InChI/InChIKey (Version 1.04) calculated with four InChI flag sets:
• Standard: Standard InChIKey
• Set 1: DONOTADDH W0 FIXEDH RECMET NEWPS SPXYZ SAsXYZ Fb Fnud KET 15T
• Set 2: DONOTADDH W0 FIXEDH RECMET NEWPS SPXYZ SAsXYZ Fb Fnud KET 15T
• Set 3: DONOTADDH W0 FIXEDH RECMET NEWPS SPXYZ SAsXYZ Fb Fnud KET 15T
Standard Set, Set 1 & Set 2: addition of hydrogen atoms by CACTVS
Set 3: addition of hydrogen atoms by the InChI library
• calculation of Standard InChIKey
[Diagram: original structure record → structure normalization → parent structures (FICTS, FICuS, uuuuu) → hashcode calculation (E_HASHISY) → NCI/CADD Identifiers; union set → Standard InChIKey 1.04 with flag sets Standard, Set 1, Set 2, Set 3]
• database schema is entirely implemented in Python/SQLAlchemy
• supports many different database engines: Oracle, PostgreSQL, MySQL
• SQLAlchemy provides:
  • the communication layer with the database engine
  • an object-oriented data model representation of the database on the Python side
• table relationships:
  • either defined by Foreign Key relationships in the database or specified on the Python level
  • SQLAlchemy creates table joins on the SQL level
• SQLAlchemy table definition

    from sqlalchemy import Table, Column, Integer, CHAR, Text
    from sqlalchemy.orm import mapper, relationship, backref

    # metadata, schema, Name, name_table, TableRepr and TableInit are defined elsewhere
    structure_table = Table('structure', metadata,
        Column('id', Integer, primary_key=True, autoincrement=True),
        Column('hash', CHAR(16), unique=True),
        Column('smiles', Text()),
        schema=schema
    )

    class Structure(TableRepr, TableInit):
        __table__ = structure_table

    # classical mapping (SQLAlchemy releases before 2.0)
    mapper(Structure, structure_table, properties={
        'name': relationship(Name, backref=backref('structure'),
            primaryjoin=structure_table.c.id == name_table.c.structure_id)
    })
• Query the database

    > s = db.session.query(Structure).filter(Structure.id == 1234).one()
    <object "Structure">
    > s.smiles
    CCO

• if the object-oriented data model representation creates too much overhead, SQLAlchemy supports writing "almost bare" SQL that still follows the Python paradigms

    > q = select([structure_table.c.smiles]).where(structure_table.c.id == 1234)
    > s = q.execute().fetchone()
    (CCO,)
Chemical Structure Database
• Goals: track chemical space
  • index any chemical structure that can be referenced in some way or has a known source
  • may also include virtual chemistry or generic structure collections
  • collect public datasets/databases/structure collections
  • normalize them to our standards
  • make them available in our public web interfaces and APIs (if we are allowed to)
  • no refusal/deletion of structures: curation is performed by "keep the bad and tag it as bad"
NCI/CADD Chemical Web Apps
• implemented with jQuery Mobile (1.3.0)
• HTML5
• supports web browsers on major mobile platforms: iOS, Android, BlackBerry, Windows Phone, Windows 8, Palm, Symbian
• supports major desktop web browsers: Google Chrome, Firefox, IE9/10
• WAI-ARIA compliant (W3C specification draft describing accessibility standards of dynamic Web content for people with disabilities)
• services will be optimized for use on tablet-sized touch-screen devices, but not (yet) for smartphone-sized devices (current development is done on an iPad3)
• all services work on a common platform
Chemical Activity Predictor - GUSAR
chemical structure → prediction of physicochemical properties and activities
characteristics:
• chemical structures are represented by QNA descriptors and MNA descriptors
• mathematical algorithm: a unique algorithm of self-consistent regression selects the best set of descriptors for a robust and reliable QSAR model
• main developer: Alexey Zakharov
GUSAR Software
comparison was performed on the following data sets:
• ligand–enzyme interactions
• ligand–receptor interactions
• acute toxicity
• interaction with drug-metabolism enzymes
[Bar chart: accuracy (R² test, axis 0.00 to 1.00) of CoMFA, CoMSIA, HQSAR, EVA, 2D Cerius2, 3D Cerius2, GOLPE and GUSAR]
• QSAR-based models created by GUSAR can be used separately from the application
• broad spectra of chemical/biological activity and property prediction models for small molecules in development:
  • physicochemical properties
  • assessment of toxicity, metabolism and antineoplastic activities
  • HIV-1-related models
• will be available as Web App and programmatic URL API:
http://cactus.nci.nih.gov/chemical/activity/CCOCC/boiling_point
{in_applicability_domain: True, datatype: 'float', value: 42.660}
Chemical Activities
Categories → Models → Endpoints
• Physicochemical Properties → Physicochemical Models → Boiling point, Density, Flash point, Melting point, Surface tension, Thermal conductivity, Vapor pressure, Viscosity, Water solubility
• Biological Activities → HIV-Models → HIV-1 Integrase (Strand Transfer) Inhibitor, HIV-1 Reverse Transcriptase Inhibitor
Prediction Results (GUSAR):
• value
• unit
• in applicability domain
• quantitative and qualitative models
Chemical Activity Predictor – GUSAR beta
http://cactus.nci.nih.gov/chemical/apps
Chemical Structure Lookup Service (CSLS)
• first version was released in 2006, development stalled in 2008
• new version will be based on CSDB
• new release planned for 2013
• allows easy lookup of chemical structures within the constituting databases in CSDB
InChI/InChIKey Resolver
• "loose coupling" of InChI resolvers provided by different organizations
• central list of resolvers
• each resolver must provide a specific protocol
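The idea of a central list of loosely coupled resolvers can be illustrated with a short Python sketch. Everything in it is hypothetical: the slides do not specify the protocol, so a resolver is modelled here simply as a callable that takes an InChIKey and returns a string or `None`, and the registry entries are in-memory stand-ins rather than real services.

```python
from typing import Callable, Dict, Optional

# Hypothetical protocol: a resolver maps an InChIKey to a result string, or None if unknown.
Resolver = Callable[[str], Optional[str]]


def resolve_everywhere(inchikey: str, registry: Dict[str, Resolver]) -> Dict[str, str]:
    """Ask every registered resolver and collect the answers that came back."""
    answers = {}
    for name, resolver in registry.items():
        try:
            result = resolver(inchikey)
        except Exception:
            # Loose coupling: one failing resolver must not break the others.
            continue
        if result is not None:
            answers[name] = result
    return answers


def _failing_resolver(inchikey: str) -> Optional[str]:
    raise RuntimeError("service unavailable")


# In-memory stand-ins for independent resolver services (illustrative only).
EXAMPLE_REGISTRY: Dict[str, Resolver] = {
    "resolver_a": {"LFQSCWFLJHTTHZ-UHFFFAOYSA-N": "CCO"}.get,
    "resolver_b": lambda key: None,
    "resolver_c": _failing_resolver,
}

if __name__ == "__main__":
    print(resolve_everywhere("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", EXAMPLE_REGISTRY))
```

The central list stays a plain mapping, so adding or removing an organization's resolver does not require changes to the others.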
• Evan Bolton (NCBI, NLM, NIH)
• Valery Tkachenko (RSC/ChemSpider)
• Marc Nicklaus (CADD Group, NCI, NIH)
• Steven Bachrach (Trinity University)
• Antony Williams (RSC/ChemSpider)
• Markus Sitzmann (CADD Group, NCI, NIH)
Chemical Structure Web API
[Diagram components: NCI/CADD web service (Chemical Identifier Resolver), NCI/CADD Chemical Structure DataBase (CSDB), CACTVS, OPSIN, GUSAR, external web services and other software packages, connected via http]
http://cactus.nci.nih.gov/blog
Acknowledgements
• NCI/CADD Team: Alexey Zakharov, Laura Guasch Pàmies, Megan Peach, Marc Nicklaus
• Xemistry GmbH, Germany: Wolf-Dietrich Ihlenfeldt
• ChemNavigator: Scott Hutton, Tad Hurst
• InChI Team
• PubChem
• All other database providers
Acknowledgments - Software
CACTVS
Python Web Framework
ChemWriter
Python SQL Library
Javascript library
Peter Ertl (Novartis)
Fulltext Search Engine
http://cactus.nci.nih.gov