
Development of a Screening Informatics System at the UNM Center for Molecular Discovery



Powered by: Mesa, OpenEye, OpenBabel, SciTouch

Jeremy Yang, Oleg Ursu, Stephen Mathias, Cristian Bologa, Anna Waller, Annette Evangelisti, Gergely Záhoransky-Köhalmi, and Tudor Oprea University of New Mexico, Albuquerque, New Mexico, USA

ACS National Meeting, San Diego, March 25-29, 2012

Intro to Flow Cytometry at UNMCMD

References:

1. Flow Cytometry Shifting Gears, Genetic Eng & Biotech News, Nov 15, 2011 (Vol. 31, No. 20), http://www.genengnews.com/gen-articles/flow-cytometry-shifting-gears/3913.

2. Edwards BS, Young SM, Saunders MJ, Bologa C, Oprea TI, Ye RD, Prossnitz ER, Graves SW, Sklar LA. High-throughput flow cytometry for drug discovery. Expert Opin. Drug Discov. 2, 685-696, 2007.

3. Haynes MK, Strouse JJ, Waller A, Leitão A, Curpan RF, Bologa C, Oprea TI, Prossnitz ER, Edwards BS, Sklar LA, Thompson TA. Detection of intracellular granularity induction in prostate cancer cell lines by small molecules using the HyperCyt® high throughput flow cytometry system. J. Biomol. Screening, 14, 596-609, 2009.

4. The NIH's Molecular Libraries Program - What's Next? | SLAS Electronic Laboratory Neighborhood, http://www.eln.slas.org/story/1/52-the-nihs-molecular-libraries-program-whats-next.

Multiplexed!

One of the primary motivations for cheminformatics has been drug discovery, which involves bioassay screening and, increasingly, high-throughput screening (HTS).

What is screening informatics?

• Informatics in support of screening for biomolecular discovery, usually pharma discovery: acquisition, processing and storage of bioassay data for use during projects and for retrospective analyses.
• Searching over molecules, assays, activities, targets, etc. I/O & integration in conformance with contractual, legal/regulatory, business, and scientific requirements.
• Applications and interfaces suited to a trans-disciplinary audience (biology, chemistry, pharmacology, medicine, etc.).

Screening Informatics ≠ Cheminformatics !!

Cheminformatics is a key part of screening informatics, but biology is primary. Plates, wells, samples, and measurements are physically real and informatically authoritative, while structure data is a model which may be incorrect or imprecise. Chem- and bio-contexts must be integrated for a successful system. E.g. EC50 = 1.7 µM is about a sample, a well, a plate, an assay, a biological system… and eventually, we hope, about a lead compound.
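The point that an activity value belongs first to a sample, well, plate and assay, and only secondarily to a structure, can be illustrated with a small data model. This is a minimal sketch, not the UNMCMD schema: the class names, fields and identifiers (`Sample`, `Well`, `Activity`, `S-0001`, `P-042`, `A-17`) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    sample_id: str                 # the physical sample is authoritative
    smiles: Optional[str] = None   # structure is a model; may be absent or wrong


@dataclass(frozen=True)
class Well:
    plate_id: str
    row: str       # e.g. "B"
    column: int    # e.g. 7
    sample: Sample


@dataclass(frozen=True)
class Activity:
    assay_id: str
    well: Well
    endpoint: str  # e.g. "EC50"
    value: float
    unit: str

    def provenance(self) -> str:
        """Trace the value back through assay, plate, well and sample."""
        w = self.well
        return (
            f"{self.endpoint} = {self.value} {self.unit}"
            f" | assay {self.assay_id} | plate {w.plate_id}"
            f" | well {w.row}{w.column} | sample {w.sample.sample_id}"
        )


# Hypothetical identifiers; no structure is attached to the sample yet.
sample = Sample(sample_id="S-0001")
well = Well(plate_id="P-042", row="B", column=7, sample=sample)
ec50 = Activity(assay_id="A-17", well=well, endpoint="EC50", value=1.7, unit="µM")
print(ec50.provenance())
```

In this sketch the activity record is complete without any structure, and a structure can be attached or corrected later without touching the measurement.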

Why screening informatics?

Major challenges

• New methodology, such as high-content and multiplex bioassays
• More data, internal and external
• New privacy and collaboration models
• Advances in cheminformatics and bioinformatics methodology
• Development concurrent with ongoing projects and deadlines, requiring a continually operational system

No shrink-wrapped solutions

Due to the complexity of modern screening informatics, and in particular our novel, highly versatile multiplex flow-cytometry platform (patented, and commercialized as HyperCyt), there cannot be a shrink-wrapped solution providing all needed functionality for all possible experiments.

Solution: hybrid, agile system of apps & APIs

Heterogeneous software components from (1) commercial vendors, (2) open source projects, and (3) custom code developed at UNM.

AEI & Pipeline Pilot & customization

After licensing the Accelrys Accord Enterprise Informatics (AEI) and Pipeline Pilot (PP) software in 2009, efforts began to configure and customize AEI/PP. Accelrys-UNMCMD consultation, customization and training revealed (1) which components could be used with minor configuration effort, and (2) the scope of required custom coding. This experience was essential and decisive in the evolutionary design process.

UNMCMD specialized for flow cytometry

Automating when every assay is special

Flow cytometry generates multiple fluorescence measurements per sample and per target, where multiplex = multi-target. Even “singleplex” assays may employ multiple positive and negative control targets. Assays can differ greatly in raw data outputs and in the analysis protocols used to calculate a “response” representing a biological outcome (e.g. binding to a target). In some cases it may seem more appropriate to conceive of an API (programming interface) with which each assay analysis is recoded, rather than an informatics system that is flexible but generally constant. In fact, our solution combines elements of both.
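One way to picture the combination of a constant system with per-assay recoding is a registry of assay-specific response functions behind a single entry point. The following is a minimal sketch under stated assumptions, not the UNMCMD implementation: the assay identifiers, function names and the percent-of-control normalization are hypothetical choices for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List

# A response function turns one raw reading plus control readings into a number.
ResponseFn = Callable[[float, List[float], List[float]], float]

ANALYSES: Dict[str, ResponseFn] = {}  # assay id -> assay-specific analysis


def register(assay_id: str):
    """Decorator that registers an analysis function for one assay."""
    def decorator(fn: ResponseFn) -> ResponseFn:
        ANALYSES[assay_id] = fn
        return fn
    return decorator


@register("demo_binding")  # hypothetical assay id
def percent_of_control(raw, neg_controls, pos_controls):
    """Assumed normalization: 0 at the negative-control mean, 100 at the positive."""
    neg, pos = mean(neg_controls), mean(pos_controls)
    return 100.0 * (raw - neg) / (pos - neg)


@register("demo_ratio")  # hypothetical assay id
def fold_over_negative(raw, neg_controls, pos_controls):
    """Assumed alternative: fold change over the negative-control mean."""
    return raw / mean(neg_controls)


def compute_responses(assay_id, well, neg, pos):
    """Constant entry point: apply the assay's function to every target
    measured in one multiplex well (dicts are keyed by target name)."""
    fn = ANALYSES[assay_id]
    return {target: fn(value, neg[target], pos[target])
            for target, value in well.items()}


responses = compute_responses(
    "demo_binding",
    well={"T1": 550.0, "T2": 100.0},
    neg={"T1": [100.0, 100.0], "T2": [100.0]},
    pos={"T1": [1000.0], "T2": [300.0]},
)
```

The entry point and data layout stay fixed while each new assay contributes only its own function, which is the trade-off described above.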

Custom code: Using the right tools for the tasks

Custom software development has included: Oracle SQL w/ AEI, Excel macros, Perl, Java, Python, NCBI EntrezUtils apps, custom PP protocols, Prism batch code, and more. Interfaces include command line apps, web apps, and in-house APIs for rapid development.

Conclusion

The good news is that advances in software and informatics provide choices of solutions and opportunities to effectively manage screening. The complexity of the software landscape is truly both a challenge and an opportunity. It is hoped that our experiences will be helpful to others similarly tasked with designing and implementing a screening informatics system.

c/o Anna Waller, UNMCMD HyperViewSession_20110603

Accurate data acquisition: key prerequisite

Excel remains an important tool for scientific data processing, analysis and visualization, at UNMCMD and elsewhere. But it has fundamental limitations and drawbacks, especially in data and code access and version control.

E.g. Bcl-2 assay analysis worksheets, UNMCMD, 2007 (PubChem AID=1693).

Microsoft Excel, not going away soon

Screening informatics depends on accurate measurements, with additional informatics challenges such as “binning”, i.e. correlating fluorescence data to wells and substances.
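A rough sketch of what binning involves is given below. It assumes, purely for illustration, that events arrive as a time-ordered stream in which consecutive wells are separated by gaps in event time, and that wells were sampled in row-major plate order. Neither assumption is stated on this poster, and the function names, gap threshold and substance identifiers are hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional, Sequence


def bin_events_by_time(times: Sequence[float], min_gap: float) -> List[List[int]]:
    """Group event indices into bins, opening a new bin whenever the time since
    the previous event exceeds min_gap. Assumes times are sorted ascending."""
    bins: List[List[int]] = []
    for i, t in enumerate(times):
        if i == 0 or t - times[i - 1] > min_gap:
            bins.append([])
        bins[-1].append(i)
    return bins


def well_names(n_rows: int, n_cols: int) -> List[str]:
    """Row-major well names: A1, A2, ..., B1, ... (assumed sampling order)."""
    return [f"{chr(ord('A') + r)}{c + 1}" for r in range(n_rows) for c in range(n_cols)]


def assign_wells(times: Sequence[float], fluorescence: Sequence[float],
                 min_gap: float, wells: List[str],
                 plate_map: Dict[str, str]) -> Dict[str, dict]:
    """Correlate binned events to wells, then wells to substances."""
    bins = bin_events_by_time(times, min_gap)
    if len(bins) != len(wells):
        # A mismatch means a well was missed or split; do not guess.
        raise ValueError(f"found {len(bins)} bins for {len(wells)} wells")
    result: Dict[str, dict] = {}
    for well, indices in zip(wells, bins):
        values = [fluorescence[i] for i in indices]
        substance: Optional[str] = plate_map.get(well)
        result[well] = {"substance": substance,
                        "median": median(values),
                        "n_events": len(values)}
    return result


# Toy data: three bursts of events separated by gaps longer than 0.5 time units.
times = [0.0, 0.1, 0.2, 1.5, 1.6, 3.0, 3.1, 3.2, 3.3]
fluorescence = [10, 12, 11, 50, 54, 20, 22, 21, 23]
plate_map = {"A1": "SID-1", "A2": "SID-2", "A3": "SID-3"}  # hypothetical IDs
binned = assign_wells(times, fluorescence, 0.5, well_names(1, 3), plate_map)
```

Raising an error on a bin/well count mismatch, rather than aligning silently, reflects the point that an error at this step mislabels every downstream measurement.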

AEVA (Assay Exploration, Viewing & Analysis) web app

PP protocol, via WebPort, to generate PubChem-compliant depositor upload.

Hit Definition: various assays, various methods

• Response: response > cutoff (activation) or < cutoff (inhibition).
• SD: response more than a cutoff number of SDs above (activation) or below (inhibition) the plate mean.
• Custom: custom function specified for assays with "special needs". Custom may include counter-targets, multiple +/- controls, etc.
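The three hit-definition methods listed above can be sketched as small functions. This is an illustrative reading, not the AEVA code: the function names, the use of the sample standard deviation, the inclusion of all wells in the plate statistics, and the counter-target rule in the custom example are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def _check_mode(mode: str) -> None:
    if mode not in ("activation", "inhibition"):
        raise ValueError(f"unknown mode: {mode!r}")


def response_hit(response: float, cutoff: float, mode: str) -> bool:
    """Response method: above the cutoff (activation) or below it (inhibition)."""
    _check_mode(mode)
    return response > cutoff if mode == "activation" else response < cutoff


def sd_hit(response: float, plate_responses: Sequence[float],
           n_sd: float, mode: str) -> bool:
    """SD method: more than n_sd standard deviations above (activation) or
    below (inhibition) the plate mean. Sample SD is an assumption here."""
    _check_mode(mode)
    mu, sd = mean(plate_responses), stdev(plate_responses)
    if mode == "activation":
        return response > mu + n_sd * sd
    return response < mu - n_sd * sd


def custom_hit(target_response: float, counter_response: float,
               cutoff: float) -> bool:
    """Hypothetical custom rule: active on the target but not on the
    counter-target, both judged by the same activation cutoff."""
    return (response_hit(target_response, cutoff, "activation")
            and not response_hit(counter_response, cutoff, "activation"))


# Toy plate: mean 11, sample SD about 1.31, so 3 SD spans roughly 7.07 to 14.93.
plate = [10, 12, 11, 9, 13, 10, 11, 12]
```

For example, `sd_hit(20, plate, 3, "activation")` flags a hit on this toy plate, while `sd_hit(14, plate, 3, "activation")` does not.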