brent-leonard
Cloud Computing at
Johnson & Johnson
Pharmaceutical Research & Development LLC
Agenda
• Introduction
• Strategic Goals
• Success Stories
• Lessons Learned
• Future Plans
What is Cloud Computing?
• Utility-based computing and storage
  – Pay for use
    • CPUs per hour of use
    • Storage per gigabyte used
  – Scalable on demand
    • Provisioning new CPUs takes minutes
    • Storage can be grown as needed within minutes
• Multi-tiered solution
  – Infrastructure as a service (e.g., Amazon EC2, S3)
  – Platform as a service (e.g., Azure, Salesforce.com)
  – Software as a service (e.g., Gmail, Google Apps)
Strategic Goals
• Resources for High Performance Computing (HPC) peak demand
  – Additional CPUs to shorten processing time
  – Additional storage for ‘scratch’ space
• Archival storage
• Development instances
• Training instances
• Collaboration environments
• Quickly extend existing infrastructure
Success Story: PK/PD
• Nonmem and bootstrapping
  – Needs additional CPUs to
    • shorten response times for FDA submission inquiries
    • create more detailed models
      – Larger population sizes
      – Greater number of parameters
  – Implemented on Amazon EC2
[Architecture diagram: Client PC, Firewall, Amazon EC2 hosting a Cycle Cloud head node with the Nonmem compiler and four compute nodes; SSL-secured communication/encryption]
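Bootstrapping suits this head-node-plus-compute-nodes layout because every replicate is independent of the others. The following is a minimal sketch of that idea, not the J&J or Cycle Cloud implementation: `fit_model` is a hypothetical stand-in (a plain mean) for a real Nonmem model fit, and a local thread pool stands in for the EC2 compute nodes.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def fit_model(sample):
    """Hypothetical stand-in for a Nonmem fit: returns the sample mean."""
    return statistics.fmean(sample)


def one_replicate(data, seed):
    """Resample the data with replacement and refit on the resample."""
    rng = random.Random(seed)  # per-replicate seed keeps runs reproducible
    resample = rng.choices(data, k=len(data))
    return fit_model(resample)


def bootstrap(data, n_replicates, n_workers):
    """Run independent replicates across a pool of workers.

    The thread pool is an assumption for illustration; in the deck's
    setup the workers would be separate compute nodes.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves input order, so results line up with seeds.
        return list(pool.map(lambda s: one_replicate(data, s),
                             range(n_replicates)))


def percentile_interval(estimates, lower=0.025, upper=0.975):
    """Simple percentile interval over the bootstrap estimates."""
    ordered = sorted(estimates)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]
```

Calling `bootstrap(data, 1000, 8)` and then `percentile_interval(...)` gives an interval for the fitted quantity. Since replicates share no state, adding workers should shorten wall-clock time roughly in proportion until the per-replicate fit time dominates.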
Success Story: Cloud Storage
• Goals
  – Test viability of the Nirvanix Storage Delivery Network
  – Collect transfer speeds for upload and download
  – Integrate into Veritas NetBackup as storage media by creating a Virtual Tape Library using the CloudNAS client
  – Evaluate performance and cost
• Results
  – All base functionality worked as planned
  – Tests showed increased elasticity and scalability of storage
  – Performance met targets, with significant cost savings
• Planned usage for archival and retrieval of
  – NextGen Sequencer data: 10+ TB
  – DNA Chip data: 5 to 10 TB
  – NuGenesis data: 2 TB
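Measured transfer speeds matter because they determine how long the planned archives take to move. The sketch below shows the arithmetic only; the slides do not report the measured speeds, so any throughput passed in is a hypothetical value, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
def throughput_mb_per_s(bytes_moved, seconds):
    """Sustained throughput in MB/s from one timed upload or download."""
    return bytes_moved / 1_000_000 / seconds


def transfer_hours(size_tb, mb_per_s):
    """Hours needed to move size_tb terabytes at a sustained mb_per_s."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return size_mb / mb_per_s / 3600


# Dataset sizes from the slide (upper bound used for the 5 to 10 TB range).
DATASETS_TB = {
    "NextGen Sequencer": 10,
    "DNA Chip": 10,
    "NuGenesis": 2,
}


def plan(mb_per_s):
    """Estimated hours per dataset at an assumed sustained throughput."""
    return {name: transfer_hours(tb, mb_per_s)
            for name, tb in DATASETS_TB.items()}
```

For example, `plan(100)` estimates the hours per dataset at an assumed 100 MB/s; substituting the throughput actually observed during testing gives the realistic figure.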
Success Story: TranSMART
• Partnership/collaboration with Recombinant Data Corporation to leverage Clinical and Biomarker data
  – Use Pathway Analysis and Biomarkers to direct research investment decisions
  – Execute the transition from bench to bedside translational research
  – Provide a collaboration platform for Pre-Clinical and Clinical, Biologists, Clinicians and Bioinformaticists
  – Execute a systems biology approach for Discovery and Development
Success Story: Image Processing
• Business opportunity or challenge
  – Research Capabilities uses a program called Feldkamp to convert 2D CAT scan images to 3D image slices for visualization
  – The processing time for each CAT scan is 22 hours on a local server
  – The next study has 100 images that need to be processed; on a local server this would take ~92 days (meaning the business would not conduct the study at all)
• Solution
  – Launch 11 concurrent servers in the Amazon Cloud to process one CAT scan at a time; reduce processing time per scan from 22 hours to 2 hours
• Expected business results
  – For 100 CAT scan files, processing time will be reduced from 92 days to 8 days
  – Estimated cost for processing one CAT scan file = $13.82 (for all 100 files = $1,382)
  – Saved time = 84 days
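The figures on this slide can be reproduced with a few lines of arithmetic. This sketch uses only the numbers stated above (22 and 2 hours per scan, 100 scans, $13.82 per scan); the variable names are illustrative.

```python
HOURS_PER_SCAN_LOCAL = 22   # local server, from the slide
HOURS_PER_SCAN_CLOUD = 2    # 11 concurrent cloud servers, from the slide
COST_PER_SCAN_USD = 13.82   # from the slide
N_SCANS = 100               # from the slide


def total_days(n_scans, hours_per_scan):
    """Days needed to process scans one after another."""
    return n_scans * hours_per_scan / 24


local_days = total_days(N_SCANS, HOURS_PER_SCAN_LOCAL)  # 2200 h, about 92 days
cloud_days = total_days(N_SCANS, HOURS_PER_SCAN_CLOUD)  # 200 h, about 8 days
saved_days = round(local_days) - round(cloud_days)      # 84 days
speedup = HOURS_PER_SCAN_LOCAL / HOURS_PER_SCAN_CLOUD   # 11x per scan
total_cost_usd = round(N_SCANS * COST_PER_SCAN_USD, 2)  # 1382.0

print(f"local: {local_days:.1f} days, cloud: {cloud_days:.1f} days")
print(f"saved: {saved_days} days, speedup: {speedup:.0f}x, "
      f"cost: ${total_cost_usd:,.2f}")
```

The 11x per-scan speedup equals the number of servers launched, which would be consistent with near-linear scaling for this workload.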
Lessons Learned
• Security
  – Involve security folks early
  – Internal processes bigger hurdle than technical learning
• Applicability
  – Cloud is not the solution for everything
    • HPC heavy on Message Passing Interface (MPI)
    • Business-critical systems
• Architecture
  – Do not split your systems across networks (e.g., app and DB)
  – Include security in your design
• Legal
  – Start work on the hosting agreement ASAP
  – Educate your legal staff
  – Hosting without specifying physical asset
Future Plans
• Evaluate additional applications for Cloud deployment
• Develop enterprise strategy
  – In progress with Corp IT
• Evaluate additional providers and vendors
  – Avoid lock-in to a single platform
• Work cross-sectors on cloud initiatives
  – In progress with MD&D
• Expand internal compute Grid to the Cloud
  – Done in DEV
Molecular Conformations
• Application to perform molecular conformations given an input file of molecules (e.g., SMILES) and a governing rule-set
• Business challenge
  – Molconf currently runs on individual users’ machines
  – Performance limited by the user’s hardware (> 7 hours to run 100,000 molecules)
• Solution
  – Distribute all computations to the Microsoft Azure cloud platform
• Expected Results
  – Processing time in minutes rather than hours
  – Major performance upgrade: all work can be distributed among multiple nodes
  – Scalability: spin up nodes on-demand to handle workload
  – Independence from users’ machines; submit a job and retrieve it later
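Distributing the work among multiple nodes amounts to splitting the input molecules into near-equal batches, one per node. The following is a minimal sketch under stated assumptions: the node count is hypothetical, the time estimate assumes perfectly even scaling with no overhead, and nothing here reflects how Molconf or Azure actually schedules the work.

```python
def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous parts whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks


def ideal_minutes(single_machine_hours, n_nodes):
    """Best-case wall-clock minutes if the work divides evenly across nodes."""
    return single_machine_hours * 60 / n_nodes


# Example SMILES strings (ethanol, benzene, acetic acid, methane, propane).
molecules = ["CCO", "c1ccccc1", "CC(=O)O", "C", "CCC"]
batches = split_into_chunks(molecules, 2)

# Using the slide's 7-hour figure as the single-machine baseline and an
# assumed (hypothetical) 20 nodes.
print(batches)
print(ideal_minutes(7, 20), "minutes in the ideal case")
```

Because the slide gives 7 hours as a lower bound for one machine, the ideal-case estimate is a floor rather than a prediction; real runs add node start-up and data transfer time.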
Observational Medical Outcomes Partnership (OMOP)
• Public-private partnership between FNIH and large Pharma companies
  – Improve monitoring of drug safety, establish Common Data Model schema (CDM)
• OMOP Cup: contest to collect best safety surveillance scripts
  – Open-source scripts that leverage CDM schema
• Business challenge:
  – How to best harness the power of the CDM and the OMOP Cup scripts in a federated, scalable environment
• Solution:
  – Work with data vendors (United Healthcare, Thomson, Premier, etc.) to embrace CDM instead of using proprietary schemas
  – Work with Microsoft to establish cloud solution for accessing this data and running the OMOP Cup scripts
External Collaboration
• Leveraged a neutral third party to develop cloud solutions in the pre-competitive space
  – Eli Lilly and Cycle Cloud => BLAST
  – J&J PRD and Cycle Cloud => Nonmem
  – Both Pharmas have access to BLAST and Nonmem
• Cross-Pharma HPC Group
  – Started 4 years ago for best practices in Advanced Computing
  – Now used for pre-competitive collaboration
    • Vendor Management
    • Open source initiatives
    • Cloud Computing opportunities, trends and challenges
    • Benchmarking of HPC across the Pharma sector
Acknowledgments
• Sebastian Piotrowski
  – IT Lead, R&D Advanced Technology CoE
  – [email protected]
• Tom Messina
  – IT Manager, R&D Advanced Technology CoE
  – [email protected]