Page 1: Development of a Screening Informatics System at the UNM Center for Molecular Discovery


Powered by: Mesa, OpenEye, OpenBabel, SciTouch

Jeremy Yang, Oleg Ursu, Stephen Mathias, Cristian Bologa, Anna Waller, Annette Evangelisti, Gergely Záhoransky-Köhalmi, and Tudor Oprea University of New Mexico, Albuquerque, New Mexico, USA

ACS National Meeting, San Diego, March 25-29, 2012

Intro to Flow Cytometry at UNMCMD

References:

1. Flow Cytometry Shifting Gears, Genetic Eng & Biotech News, Nov 15, 2011 (Vol. 31, No. 20), http://www.genengnews.com/gen-articles/flow-cytometry-shifting-gears/3913.

2. Edwards BS, Young SM, Saunders MJ, Bologa C, Oprea TI, Ye RD, Prossnitz ER, Graves SW, Sklar LA. High-throughput flow cytometry for drug discovery. Expert Opin. Drug Discov. 2, 685-696, 2007.

3. Haynes MK, Strouse JJ, Waller A, Leitão A, Curpan RF, Bologa C, Oprea TI, Prossnitz ER, Edwards BS, Sklar LA, Thompson TA. Detection of intracellular granularity induction in prostate cancer cell lines by small molecules using the HyperCyt® high throughput flow cytometry system. J. Biomol. Screening, 14, 596-609, 2009.

4. The NIH's Molecular Libraries Program - What's Next? | SLAS Electronic Laboratory Neighborhood, http://www.eln.slas.org/story/1/52-the-nihs-molecular-libraries-program-whats-next.

Multiplexed!

One of the primary motivations for cheminformatics has been drug discovery, which involves bioassay screening and, increasingly, high-throughput screening (HTS).

What is screening informatics?
• Informatics in support of screening for biomolecular discovery, usually pharma discovery: acquisition, processing and storage of bioassay data for use during projects and for retrospective analyses.
• Searching over molecules, assays, activities, targets, etc. I/O & integration in conformance with contractual, legal/regulatory, business, and scientific requirements.
• Applications and interfaces suited to a trans-disciplinary audience (biology, chemistry, pharmacology, medicine, etc.).

Screening Informatics ≠ Cheminformatics !!
Cheminformatics is a key part of screening informatics, but biology is primary. Plates, wells, samples, and measurements are physically real and informatically authoritative, while structure data is a model which may be incorrect or imprecise. Chem- and bio- contexts must be integrated for a successful system. E.g. EC50 = 1.7 µM is about a sample, a well, a plate, an assay, a biological system… and eventually, we hope, about a lead compound.
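To illustrate how a single result stays tied to its physical context, here is a minimal sketch of a record type. All class names, field names, and identifier values are hypothetical and chosen for illustration; they are not taken from the UNMCMD system. The structure field is optional because, as noted above, structure is a model rather than the authoritative record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Well:
    plate_id: str  # hypothetical plate identifier
    row: str
    col: int


@dataclass(frozen=True)
class Sample:
    sample_id: str                 # the physical sample is authoritative
    structure: Optional[str] = None  # structure is a model and may be wrong or missing


@dataclass(frozen=True)
class Result:
    assay_id: str
    well: Well
    sample: Sample
    endpoint: str
    value: float
    unit: str


def describe(r: Result) -> str:
    """Render a result together with its full bio- and chem- context."""
    return (
        f"{r.endpoint} = {r.value} {r.unit} "
        f"(assay {r.assay_id}, plate {r.well.plate_id}, "
        f"well {r.well.row}{r.well.col:02d}, sample {r.sample.sample_id})"
    )


# Illustrative values only
example = Result(
    assay_id="ASSAY-1",
    well=Well(plate_id="P001", row="B", col=7),
    sample=Sample(sample_id="S-42"),
    endpoint="EC50",
    value=1.7,
    unit="uM",
)
print(describe(example))
```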

Why screening informatics?

Major challenges
• New methodology, such as high-content and multiplex bioassays
• More data, internal and external
• New privacy and collaboration models
• Advances in cheminformatics and bioinformatics methodology
• Development concurrent with ongoing projects and deadlines, requiring a continually operational system

No shrink-wrapped solutions
Due to the complexity of modern screening informatics, and in particular our novel, highly versatile multiplex flow-cytometry platform (patented, and commercialized as HyperCyt), there cannot be a shrink-wrapped solution providing all needed functionality for all possible experiments.

Solution: hybrid, agile system of apps & APIs
Heterogeneous software components from (1) commercial vendors, (2) open source projects, and (3) custom code developed at UNM.

AEI & Pipeline Pilot & customization
After licensing the Accelrys Accord Enterprise Informatics (AEI) and Pipeline Pilot (PP) software in 2009, efforts began to configure and customize AEI/PP. Accelrys-UNMCMD consultation, customization and training revealed (1) what components could be used with minor configuration efforts, and (2) the scope of required custom coding. This experience was essential and decisive in the evolutionary design process.

UNMCMD specialized for flow cytometry

Automating when every assay is special
Flow cytometry generates multiple fluorescence measurements per sample and per target, where multiplex = multi-target. Even "singleplex" assays may employ multiple positive and negative control targets. Assays can differ greatly in raw data outputs and in the analysis protocols used to calculate a "response" representing a biological outcome (e.g. binding to a target). In some cases it may seem more appropriate to provide an API (programming interface) for recoding each assay analysis than a flexible but generally constant informatics system; in fact, our solution combines elements of both.

Custom code: Using the right tools for the tasks
Custom software development has included: Oracle SQL w/ AEI, Excel macros, Perl, Java, Python, NCBI EntrezUtils apps, custom PP protocols, Prism batch code, and more. Interfaces include command line apps, web apps, and in-house APIs for rapid development.

Conclusion
The good news is that advances in software and informatics provide choices of solutions and opportunities to effectively manage screening. The complexity of the software landscape is both a challenge and an opportunity. It is hoped that our experiences will be helpful to others similarly tasked with designing and implementing a screening informatics system.

c/o Anna Waller, UNMCMD HyperViewSession_20110603

Accurate data acquisition key pre-requisite

Excel remains an important tool for scientific data processing, analysis and visualization, at UNMCMD and elsewhere. But it has fundamental limitations and drawbacks, especially in data and code access and version control.

E.g. Bcl-2 assay analysis worksheets, UNMCMD, 2007 (PubChem AID=1693).

Microsoft Excel, not going away soon

Screening informatics depends on accurate measurements and brings additional informatics challenges, such as "binning", i.e. correlating fluorescence data to wells and substances.
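As a rough illustration of binning, the sketch below assigns time-stamped fluorescence events to wells using per-well time windows. The poster does not describe how binning is actually implemented, so the time-window approach, the function name bin_events, and the data layout are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def bin_events(events, well_windows):
    """Assign fluorescence events to wells by acquisition time.

    events:       iterable of (time_s, fluorescence) tuples
    well_windows: iterable of (start_s, end_s, well_id) tuples,
                  assumed non-overlapping (hypothetical layout)
    Events falling between windows are left unassigned.
    """
    windows = sorted(well_windows)
    starts = [w[0] for w in windows]
    binned = defaultdict(list)
    for t, fluorescence in events:
        # Index of the last window starting at or before t
        i = bisect_right(starts, t) - 1
        if i < 0:
            continue  # event precedes the first window
        start, end, well_id = windows[i]
        if t < end:
            binned[well_id].append(fluorescence)
    return dict(binned)


# Illustrative values only
windows = [(0.0, 1.0, "A01"), (1.5, 2.5, "A02")]
events = [(0.2, 10.0), (0.9, 12.0), (1.2, 99.0), (1.6, 20.0), (3.0, 5.0)]
binned = bin_events(events, windows)
print(binned)
```

Once events are grouped by well, a plate map (not shown) would link each well to its substance.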

AEVA (Assay Exploration, Viewing & Analysis) web app

PP protocol, via WebPort, to generate PubChem-compliant depositor upload.

Hit Definition: various assays, various methods
• Response: > (activation) or < (inhibition) cutoff.
• SD: > (activation) or < (inhibition) cutoff SDs from plate mean.
• Custom: custom function specified for assays with "special needs". Custom may include counter-targets, multiple +/- controls, etc.
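The three hit-definition methods above can be sketched as follows. This is a minimal illustration, not the UNMCMD implementation: the function names, the "activation"/"inhibition" mode strings, the use of the sample standard deviation, and the example custom rule are all assumptions.

```python
from statistics import mean, stdev


def _check_mode(mode):
    if mode not in ("activation", "inhibition"):
        raise ValueError(f"unknown mode: {mode!r}")


def hit_by_response(response, cutoff, mode="activation"):
    """Response method: compare the response directly to a cutoff."""
    _check_mode(mode)
    return response > cutoff if mode == "activation" else response < cutoff


def hit_by_sd(response, plate_responses, n_sd, mode="activation"):
    """SD method: the response must lie n_sd SDs beyond the plate mean.

    Uses the sample standard deviation; that choice is an assumption.
    """
    _check_mode(mode)
    mu = mean(plate_responses)
    sd = stdev(plate_responses)
    if mode == "activation":
        return response > mu + n_sd * sd
    return response < mu - n_sd * sd


def hit_custom(record, rule):
    """Custom method: apply an assay-specific rule to the whole record."""
    return bool(rule(record))


# Illustrative values only
plate = [10.0, 12.0, 11.0, 9.0, 13.0, 40.0]
sd_hits = [hit_by_sd(r, plate, n_sd=1.5) for r in plate]


# Hypothetical custom rule: active on the target, inactive on a counter-target
def selective(record):
    return record["target"] > 50 and record["counter_target"] < 20


custom_hit = hit_custom({"target": 80, "counter_target": 5}, selective)
print(sd_hits, custom_hit)
```

Passing the rule as a function keeps the surrounding system constant while letting each assay supply its own logic.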
