Unlocking chemical information from tables and legacy articles

Daniel Lowe and Roger Sayle, NextMove Software
Aileen Day and Antony Williams, Royal Society of Chemistry

250th ACS National Meeting, Boston MA, USA, 17th August 2015

Topics

• Chemical property extraction
• Application of chemical property extraction to tables
• RSC back-archive mining

Chemical property extraction

• Melting points
• Boiling points
• Mass spectrum
• Textual NMR spectra
• Specific rotation
• Chromatography retention times
• IR/UV spectra
• Activity data e.g. IC50, EC50, Ki
• Etc.

Simple grammar and corresponding state machine

Isotope: ‘1H’ | ‘13C’ | ‘19F’
Nmr: ‘-NMR’
NmrPrelog: Isotope Nmr

[Figure: state machine for NmrPrelog, with transitions labelled 1, 3, C, 9, F, H, -, N, M, R]
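The correspondence between the grammar and a state machine can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: `build_machine` and `accepts` are hypothetical names, and the builder assumes that no alternative within a rule is a prefix of another (true for the three isotopes here).

```python
ISOTOPE = ["1H", "13C", "19F"]   # Isotope: '1H' | '13C' | '19F'
NMR = ["-NMR"]                   # Nmr: '-NMR'


def build_machine(*parts):
    """Chain sets of literal alternatives into one transition table.

    transitions[state][char] gives the next state; state 0 is the start.
    Assumption: no alternative is a prefix of another alternative.
    """
    transitions = [{}]
    start = 0
    for alternatives in parts:
        transitions.append({})
        join = len(transitions) - 1   # state reached after any alternative
        for literal in alternatives:
            state = start
            for ch in literal[:-1]:
                if ch not in transitions[state]:
                    transitions.append({})
                    transitions[state][ch] = len(transitions) - 1
                state = transitions[state][ch]
            transitions[state][literal[-1]] = join
        start = join
    return transitions, start         # the last join state is the accept state


def accepts(machine, text):
    """Walk the table one character at a time; reject on a missing transition."""
    transitions, accept = machine
    state = 0
    for ch in text:
        state = transitions[state].get(ch)
        if state is None:
            return False
    return state == accept


NMR_PRELOG = build_machine(ISOTOPE, NMR)   # NmrPrelog: Isotope Nmr
```

Because "1H", "13C" and "19F" share their first character, the three alternatives share the transition on "1" and only branch afterwards, which is what keeps such machines compact.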

Melting point recognition

Term               Examples of text matched
FromLiterature     “lit.”
MeltingPoint       “mpt”, “melting point”, “m.p.”
Qualifier          “>”; “approximately”
Value              “75° C”, “200° F”, “one hundred degrees Celsius”
Range              “184-186° C”, “191.5 to 192.4° C”
MeasurementError   “50±° C”
OutcomeQualifier   “decomp.”, “with decomposition”, “subl.”

FromLiterature? MeltingPoint Qualifier? (Value | Range | MeasurementError) OutcomeQualifier?

M.p.: 230°C (dec.)
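As a rough illustration of the rule above, the terms can be approximated with a single regular expression with one named group per term. This is a simplified sketch, not the grammar used in the talk: the pattern and the `parse_melting_point` helper are assumptions, and MeasurementError and spelled-out values such as “one hundred degrees Celsius” are deliberately left out.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"
MELTING_POINT = re.compile(
    rf"""
    (?P<from_literature>lit\.\s*)?                  # FromLiterature?
    \b(?P<quantity>m\.?\s?p\.?t?|melting\s+point)   # MeltingPoint
    [:\s]*
    (?P<qualifier>approximately\s+|>)?              # Qualifier?
    (?P<low>{NUMBER})                               # Value, or start of a Range
    (?:\s*(?:-|to)\s*(?P<high>{NUMBER}))?           # rest of a Range
    \s*(?P<unit>°\s?[CF])
    (?:\s*\(?(?P<outcome>dec(?:omp)?\.|subl\.|with\s+decomposition)\)?)?  # OutcomeQualifier?
    """,
    re.IGNORECASE | re.VERBOSE,
)


def parse_melting_point(text):
    """Return the labelled parts of the first melting point in text, or None."""
    m = MELTING_POINT.search(text)
    if m is None:
        return None
    low = float(m["low"])
    high = float(m["high"]) if m["high"] else low
    return {
        "from_literature": m["from_literature"] is not None,
        "qualifier": (m["qualifier"] or "").strip() or None,
        "low": low,
        "high": high,
        "unit": m["unit"].replace(" ", ""),
        "outcome": m["outcome"],
    }
```

For example, `parse_melting_point("M.p.: 230°C (dec.)")` yields a single value of 230.0 with unit "°C" and outcome "dec.".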

NMR recognition

Term             Examples of text matched
Isotope          “1H”, “13C”, “19F”
NMR              “NMR”, “RMN”
NmrMethod        “400 MHz, CDCl3”
Peak             “3.7”
PeakAnnotation   “s, 3H”

Isotope NMR NmrMethod? Peak PeakAnnotation? (Delimiter Peak PeakAnnotation?)*

1H NMR (300 MHz, DMSO): 7.5-7.8 (m, 5H), 7.9 (d, J=8Hz, 2H), 8.33 (d, J=5Hz, 2H)
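The NMR rule can be sketched in the same spirit: match the header once, then loop over `Peak PeakAnnotation?` separated by delimiters. The regular expressions and the `parse_nmr` function below are illustrative assumptions rather than the grammar from the talk; in particular, the method and the annotation are simply taken to be whatever sits inside the parentheses.

```python
import re

HEADER = re.compile(
    r"(?P<isotope>1H|13C|19F)[\s-]*(?P<nmr>NMR|RMN)"   # Isotope NMR
    r"(?:\s*\((?P<method>[^)]*)\))?"                    # NmrMethod?
    r"\s*:?\s*"
)
PEAK = re.compile(
    r"(?P<shift>\d+(?:\.\d+)?(?:-\d+(?:\.\d+)?)?)"      # Peak (single value or range)
    r"(?:\s*\((?P<annotation>[^)]*)\))?"                # PeakAnnotation?
)
DELIMITER = re.compile(r"\s*[,;]\s*")                   # Delimiter


def parse_nmr(text):
    """Split a textual NMR spectrum into isotope, method and a list of peaks."""
    header = HEADER.match(text)
    if header is None:
        return None
    peaks = []
    pos = header.end()
    while True:
        peak = PEAK.match(text, pos)
        if peak is None:
            break
        peaks.append((peak["shift"], peak["annotation"]))
        delimiter = DELIMITER.match(text, peak.end())
        if delimiter is None:
            break
        pos = delimiter.end()
    return {"isotope": header["isotope"], "method": header["method"], "peaks": peaks}
```

Applied to the example line above, this gives three peaks, the first being the range "7.5-7.8" with annotation "m, 5H".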

Recognition and parsing

• Grammar distinguishes parts of an entity of interest, e.g. 25°C → 25 (value) °C (unit)
• Can group constructs together, e.g. 25 to 30 (range)

Example parse tree serialised to XML

Mp: 131.9-132.6 °C

<parse>
  <quantityType quantityType="MeltingPoint">Mp</quantityType>
  <measurement>
    <range>
      <valueOptUnit>
        <decimalValue>131.9</decimalValue>
      </valueOptUnit>
      <rangeDelimiter>-</rangeDelimiter>
      <valueOptUnit>
        <decimalValue>132.6</decimalValue>
        <unitContainer>
          <unit unitType="Temperature" normalizationFactor="1">°C</unit>
        </unitContainer>
      </valueOptUnit>
    </range>
  </measurement>
</parse>
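A parse tree serialised this way is straightforward to consume downstream. The sketch below uses Python's standard `xml.etree.ElementTree` to pull the quantity type, the numeric bounds and the unit out of the XML shown above; `read_measurement` is a hypothetical helper, and the `normalizationFactor` attribute is not interpreted because its exact semantics are not given on the slide.

```python
import xml.etree.ElementTree as ET

PARSE_XML = """<parse>
  <quantityType quantityType="MeltingPoint">Mp</quantityType>
  <measurement>
    <range>
      <valueOptUnit><decimalValue>131.9</decimalValue></valueOptUnit>
      <rangeDelimiter>-</rangeDelimiter>
      <valueOptUnit>
        <decimalValue>132.6</decimalValue>
        <unitContainer>
          <unit unitType="Temperature" normalizationFactor="1">°C</unit>
        </unitContainer>
      </valueOptUnit>
    </range>
  </measurement>
</parse>"""


def read_measurement(xml_text):
    """Collect quantity type, numeric bounds and unit from a parse tree."""
    root = ET.fromstring(xml_text)
    values = [float(v.text) for v in root.iter("decimalValue")]
    unit = root.find(".//unit")
    return {
        "quantity": root.find("quantityType").get("quantityType"),
        "low": min(values),
        "high": max(values),
        "unit": unit.text if unit is not None else None,
    }
```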

 

Recognition and parsing

• Grouping constructs introduces non-determinism, e.g. after seeing “25” both possibilities, being in a range and not being in one, need to be considered

• Same grammar can be used to generate:
  – Single state machine representation
    • Parts of entity not distinguished
    • Extremely fast recognition
    • Allows spelling correction of input that is close to being a match
  – Multi state machine parser representation
    • Slower… but only needs to be run on a small amount of text
    • Distinguishes parts of entity
    • Can group parts into a parse tree
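The two-stage idea can be imitated with ordinary regular expressions: a flat pattern that only answers "where is an entity?", and a labelled pattern that is run only on the spans the first one found. This is an analogy under stated assumptions, not the state machine implementation described in the talk; `RECOGNISER`, `PARSER` and `extract` are hypothetical names, and Python's backtracking regex engine stands in for both kinds of machine.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"

# Stage 1: one flat pattern with no labelled parts, scanned over the whole text.
RECOGNISER = re.compile(rf"{NUMBER}(?:\s*(?:-|to)\s*{NUMBER})?\s*°\s?[CF]")

# Stage 2: the same grammar with labelled parts, run only on recognised spans.
PARSER = re.compile(
    rf"(?P<low>{NUMBER})(?:\s*(?:-|to)\s*(?P<high>{NUMBER}))?\s*(?P<unit>°\s?[CF])"
)


def extract(text):
    """Recognise temperature mentions, then parse each one into its parts."""
    results = []
    for hit in RECOGNISER.finditer(text):
        parts = PARSER.fullmatch(hit.group())
        results.append(parts.groupdict())
    return results
```

In `extract("Stirred at 25 to 30 °C, then cooled to 5°C.")` the first hit is parsed as a range (low "25", high "30") and the second as a single value; after reading "25" the matcher has to keep both readings open until it sees what follows.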

Grammar implementation details

Table Extraction

Melting point table

NMR table

More difficult… Against what?

Need to be looked up elsewhere in the document. Could be in text, might be in images

Even More difficult…

Tables in USPTO patents

…and the XML provided

<row>
  <entry>1</entry>
  <entry>N<sup>1</sup>-hydroxy-N<sup>2</sup>-{[4-(phenyloxy)phenyl]sulfonyl}-</entry>
  <entry>H-NMR; &#x3b4; (CD3OD): 7.79 (d, 2H),</entry>
</row>
<row>
  <entry/>
  <entry>D-lysinamide</entry>
  <entry>7.42 (t, 2H), 7.22 (t, 1H), 7.09 (d, 2H),</entry>
</row>
<row>
  <entry/>
  <entry/>
  <entry>7.05 (d, 2H), 3.63 (t, 1H), 2.87 (t, 2H),</entry>
</row>
<row>
  <entry/>
  <entry/>
  <entry>1.57-1.68 (m, 4H), 1.44 (m, 1H),</entry>
</row>
<row>
  <entry/>
  <entry/>
  <entry>1.37 (m, 1H)</entry>
</row>
<row>

Naïve interpretation (Google patents)

Green: chemical substituent
Purple: chemical molecule
Blue: NMR

SureChEMBL

After heuristically detecting which rows are the same row

Purple: chemical molecule
Blue: NMR
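One possible heuristic for the USPTO layout, where a logical row is spread over several physical rows, is to treat any row whose first cell is empty as a continuation of the row above it. The sketch below is an assumption about how such a merge could work, not the heuristic actually used: `merge_continuation_rows` is a hypothetical name, the `<tbody>` wrapper is added only to make the fragment parseable, and every row is assumed to have the same number of cells.

```python
import xml.etree.ElementTree as ET

TABLE_XML = """<tbody>
<row><entry>1</entry><entry>N<sup>1</sup>-hydroxy-N<sup>2</sup>-{[4-(phenyloxy)phenyl]sulfonyl}-</entry><entry>H-NMR; &#x3b4; (CD3OD): 7.79 (d, 2H),</entry></row>
<row><entry/><entry>D-lysinamide</entry><entry>7.42 (t, 2H), 7.22 (t, 1H), 7.09 (d, 2H),</entry></row>
<row><entry/><entry/><entry>7.05 (d, 2H), 3.63 (t, 1H), 2.87 (t, 2H),</entry></row>
</tbody>"""


def merge_continuation_rows(table_xml):
    """Fold rows with an empty first cell into the preceding row."""
    merged = []
    for row in ET.fromstring(table_xml).iter("row"):
        # itertext() flattens inline markup such as <sup> into plain text
        cells = ["".join(e.itertext()).strip() for e in row.findall("entry")]
        if merged and cells and not cells[0]:
            previous = merged[-1]
            for i, cell in enumerate(cells):
                if not cell:
                    continue
                # A cell ending in "-" is a name broken mid-word: join without a space
                separator = "" if previous[i].endswith("-") else " "
                previous[i] = (previous[i] + separator + cell).strip()
        else:
            merged.append(cells)
    return merged
```

On the three rows above this returns a single row in which the chemical name is rejoined across the hyphen and the NMR peaks form one continuous list.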

What could be extracted?

[Figure: bar chart; y-axis “Name/Identifier to property relationships”, axis ticks up to 600000]

Compound number determination

<heading level="2" id="h-0055">EXAMPLE-14 </heading>
<heading level="2" id="h-0056">2-(2,4-difluorophenoxy)-5...</heading>

<parse>
  <referenceType type="Example">EXAMPLE</referenceType>
  <referenceId>14</referenceId>
</parse>

<heading level="2" id="h-0008">3. (4aS,8aR)-2-(1-Acetyl-pipe..</heading>

2-Chloro-5-iodo-1H-benzo[d]imidazole (1)
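The three heading styles shown above (a reference-only heading, a leading number, and a trailing number in parentheses) could be distinguished with a handful of patterns. This is a minimal sketch under assumptions: `compound_reference` is a hypothetical function, and only the keyword "example" is taken from the slide.

```python
import re

# "EXAMPLE-14": the heading consists only of a reference type and an identifier
REFERENCE_HEADING = re.compile(r"^(?P<type>example)[\s-]*(?P<id>\d+[a-z]?)$", re.IGNORECASE)
# "3. (4aS,8aR)-2-...": the identifier precedes the name
LEADING_NUMBER = re.compile(r"^(?P<id>\d+[a-z]?)\.\s+(?P<name>\S.*)$")
# "2-Chloro-... (1)": the identifier follows the name in parentheses
TRAILING_NUMBER = re.compile(r"^(?P<name>.*\S)\s+\((?P<id>\d+[a-z]?)\)$")


def compound_reference(heading):
    """Return reference type, identifier and name found in a heading, or None."""
    heading = heading.strip()
    m = REFERENCE_HEADING.match(heading)
    if m:
        return {"type": m["type"].capitalize(), "id": m["id"], "name": None}
    for pattern in (LEADING_NUMBER, TRAILING_NUMBER):
        m = pattern.match(heading)
        if m:
            return {"type": None, "id": m["id"], "name": m["name"]}
    return None
```

A heading such as "2-(2,4-difluorophenoxy)-5..." matches none of the patterns, so the locant "2" at the start of the name is not mistaken for a compound number.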

RSC back-archive mining

RSC back archive

• 1841-1999, 211k articles (available as XML derived from OCR and PDF)
• 2000 onwards, 230k articles (available as born digital XML and PDF)
• Also over 150k electronic supporting information files (mostly PDF, but also Word docs, Excel files, videos etc.)

Legacy document handling

• Chemical properties are often implicitly associated with a compound by being in the same experimental section
• This requires section detection, e.g. a heading and/or a paragraph where a compound is being synthesised
• In the XML for pre-2000 papers all sections on a page run together (including page numbers!), and the text position information is lost
• …so back to the source PDF

Heading/Paragraph detection workflow

Results (Melting points)

                                                        1841-1999 RSC      2000-2015 RSC      2001-2015 USPTO
                                                        journal articles   journal articles   patent applications
Compound-value associations                             2,155              29,996             172,886
Suspicious values (typically mistake in the document)   70 (3.2%)          39 (0.13%)         426 (0.25%)
Unique compounds (StdInChI)                             1,830 (84.9%)      27,956 (93.2%)     95,140 (55.0%)

SDF output

[Figure: chemical structure drawing; atom labels F, B–, O, NH, H3C, O+, H3C, F]

[Figure: chemical structure drawing; atom labels Cl, Te, Te, F, F, Cl, F, F]

Results (NMR)

                                                        1841-1999 RSC      2000-2015 RSC      2001-2015 USPTO
                                                        journal articles   journal articles   patent applications
Compound-value associations                             4,972              94,610             1,295,325
Suspicious values (typically mistake in the document)   561 (11.3%)        2,001 (2.11%)      29,775 (2.30%)
Unique compounds (StdInChI)                             2,899              48,137             655,295

Legacy text issues

• OCR errors in important compound names or data
  – Chemical names in italics are problematic… and key compounds are often in italics!
  – ° is more often than not misinterpreted, e.g. as ' or o
• Tools prefer experimental sections where one compound is being synthesised; qualitatively, older documents are less formalised

 

4-ChZoro-6-hydroxy-2-methyZamino~yrimidine.-4-Chloro-6-methoxy-2-methylaminopyrim-idine (10g.) was heated on the steam-bath for 30 min. with concentrated hydrochloric acid (60 c.c.). The hydvoxy-cmfiound which separated on cooling was collected and purified by dis- solution in alkali,etc. as above and had m. p. 266" (decornp.) (6.6 g.) (Found c 38.3 ; H 4.1; N 26-2. C,H,0N3C1 requires C 37.6; H 3.8; N 26.3%).
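In the OCR excerpt above the degree sign after the melting point has come out as a double quote. One way to cope, sketched here as an assumption rather than as the approach actually taken, is to accept a small set of look-alike characters in the position where "°" is expected and rewrite them. `OCR_DEGREE`, `PATTERN` and `normalise_ocr_melting_points` are hypothetical names, the character set is illustrative, and other OCR damage (such as "decornp.") is left untouched.

```python
import re

# Characters that OCR output may contain where the source had a degree sign.
OCR_DEGREE = "°'\"o"

PATTERN = re.compile(
    r"\bm\.\s?p\.\s*"                                   # "m.p." or "m. p."
    r"(?P<low>\d+(?:\.\d+)?)"                           # value, or start of a range
    r"(?:\s?-\s?(?P<high>\d+(?:\.\d+)?))?"              # optional end of the range
    r"\s?[" + re.escape(OCR_DEGREE) + r"](?![A-Za-z])"  # degree look-alike, not a word start
)


def normalise_ocr_melting_points(text):
    """Rewrite OCR-damaged melting points as 'm.p. <value>°'."""
    def repair(match):
        value = match["low"]
        if match["high"] is not None:
            value = f"{value}-{match['high']}"
        return f"m.p. {value}°"
    return PATTERN.sub(repair, text)
```

The negative lookahead keeps ordinary words intact: in "m. p. 50 only" the "o" starts a word, so nothing is rewritten.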

Conclusions

• Grammars facilitate rapid extraction and interpretation of chemical properties
• Table extraction is vital to extracting large quantities of certain data, e.g. activity data
• Large amounts of high quality data can be extracted from journal articles
• …but extraction from older documents remains very challenging, and over time these documents represent a smaller and smaller percentage of the scientific literature

Acknowledgements

• Igor Tetko (melting point quality feedback)
• Carlos Cobas (NMR quality feedback)

Funding provided by:

Sci-Mix, 8:00pm to 10:00pm today, Hall C, Boston Convention & Exhibition Center

Chemistry Enabling Chinese, Japanese and Korean Patents

6-aminopyrimidine-2,4,5-triol

Chinese (Hanzi used for each morpheme): 6-氨基嘧啶-2,4,5-三醇
  ammonia radical / pyrimidine / three / alcohol

Japanese (phonetic translation to Katakana): 6-アミノピリミジン-2,4,5-トリオール
  amino / pyrimidine / tri / ol

Korean (phonetic translation to Hangul): 6-아미노피리미딘-2,4,5-트리올
  amino / pyrimidine / tri / ol

Thank you for your time!

http://nextmovesoftware.com
http://nextmovesoftware.com/blog
