
Oracle Database in the Cloud

Bill Hodak, Oracle

Jamie Kinney, Amazon Web Services

Joseph Adler, Recombinant Data

Paul Parsons, The Server Labs

Databases in the Cloud

Jamie Kinney - Amazon Web Services

[email protected]

The Amazon Network

Cloud Computing Attributes

Abstract Resources: Focus on your needs, not on hardware specifications. As your needs change, so should your resources.

On-Demand Provisioning: Ask for what you need, exactly when you need it. Pay only for what you use.

Scalability: Scale up or down depending on usage needs.

No Up-Front Costs: No contracts or long-term commitments. Pay only for what you use.

Efficiency of Experts: Utilize the skills, knowledge and resources of experts.

The AWS Cloud

[Figure: how effort is split on on-premise infrastructure versus AWS cloud infrastructure. Labels: “Your business”, “Managing all the Heavy Lifting”, “More time to focus on your business”, “Configuring cloud assets”; each bar is split 30% / 70%.]

The AWS cloud provides reliable and dependable on-demand infrastructure that frees time and expense for you to focus on your business.

Predictions Cost Money

[Figure: infrastructure cost ($) over time. Curves: predicted demand, actual demand, traditional hardware, automated virtualization. Annotations: large capital expenditure, opportunity cost, “You just lost customers”.]

AWS is Growing Rapidly

Amazon Web Services

Simple Storage Service (S3)
Elastic Compute Cloud (EC2)
Elastic Block Store (EBS)
SimpleDB
CloudFront
Simple Queue Service (SQS)

The Oracle and AWS Partnership

Oracle Software on EC2

A partnership between Oracle and AWS that allows you to develop and deliver your applications on the Amazon Elastic Compute Cloud

Easy to use. Start developing your applications on Oracle software on Amazon EC2 in minutes

No barriers. Oracle is providing software at no charge for development of commercial applications on Amazon EC2. Pay only infrastructure charges - as little as $0.10/hour.

Pay as you go. Run production versions of leading Oracle software products and pay only for what you need, when you need it using the new “Oracle SaaS for ISVs” monthly licensing model.

Portability. Use your existing Oracle licenses for most Oracle software products in the cloud or on premise - it’s now your choice.

Products. Currently Oracle Database 9i-11gR2, TimesTen, Oracle Coherence and all Fusion Middleware. Oracle Enterprise Linux and Enterprise Manager Grid Control are fully supported on AWS. The Oracle Secure Backup Cloud Module allows you to back up your databases directly to Amazon S3.

We are currently working on support for Oracle’s Enterprise Applications and Real Application Clusters.

Oracle on AWS Use Cases

Proof-of-Concept/Development: Many projects involving a new technology stack begin by creating development and test environments. Prior to the cloud, these environments could take months to build out and might require executive approval for the associated CapEx. With AWS, creating a new development environment takes minutes and only costs pennies per hour.

Steady State Usage: Migrate your existing Oracle software licenses to Amazon EC2 Reserved Instances and pay even less per hour for non-stop production servers.

On-Demand Usage: Oracle’s cost-effective monthly “SaaS for ISVs” licensing model combined with the Amazon Elastic Compute Cloud allows you to easily scale up or down the number of instances as your application’s workload changes over time. This model works well for unpredictable or variable workloads.

Backup and Recovery: Use the Oracle Secure Backup Cloud Module to back up your production databases directly to Amazon S3 using your existing RMAN scripts.

Disaster Recovery: Create an in-sync Oracle Data Guard standby database on EC2.

Oracle Cloud Computing Center

http://www.oracle.com/tech/cloud/index.html

EC2 Overview

EC2 Is...

HW as a Service: Root/Admin access to Linux, Windows, and OpenSolaris servers

On Demand: Provision custom servers in minutes

Elastic: Scale up and scale down as needed

Web Scale: 1000’s of cores, multiple availability zones, EU and US locations

Utility: Pay for only what you use. No minimums

Why EC2?

Business Agility: React in real time to market and customer demands

Lower Risk: Remove under/over investment in infrastructure for new initiatives

Reliability: Easy to support highly available, highly scalable applications

Core Competency: Amazon knows datacenter scale, security, and reliability. You can focus on your business

ROI: Spend for average utilization vs. peak. Economies of scale. No Upfront $

From 50 to 3500 Instances in 3 Days

1:Many Relationship Between AMIs and Instances

[Figure: one AMI launching several instances.]

Amazon Machine Images

EC2 Instance Lifecycle

AMI → Instance (Pending) → Instance (Running) → Instance (Shutting Down) → Instance (Terminated)

• RunInstances call to cloud
  – Specify which AMI to launch
  – Provide parameters (# instances, security group, etc)
• Instance launch initiated
  – Copy AMI from S3
  – Assign parameters
• Instance (Running)
  – Attach EBS Storage once running
  – Assign Elastic IP Address
• Instance (Shutting Down) / Instance (Terminated)
  – Resources automatically detached (IP, storage)
  – Can also be initiated as normal operating system shutdown
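The lifecycle above can be read as a small state machine. The following is a minimal sketch in Python, not an AWS client: the class and method names are hypothetical, and the point at which resources are detached (on entering the shutting-down state) is an assumption made for illustration.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SHUTTING_DOWN = "shutting-down"
    TERMINATED = "terminated"


# Allowed transitions, following the order of states on the slide.
TRANSITIONS = {
    State.PENDING: {State.RUNNING},
    State.RUNNING: {State.SHUTTING_DOWN},
    State.SHUTTING_DOWN: {State.TERMINATED},
    State.TERMINATED: set(),
}


class Instance:
    """Toy model of one instance launched from an AMI (hypothetical, not an AWS API)."""

    def __init__(self, ami_id):
        self.ami_id = ami_id
        self.state = State.PENDING
        self.attached = set()  # names of EBS volumes / Elastic IPs

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        if new_state is State.SHUTTING_DOWN:
            # Assumption: model the automatic detach at the start of shutdown.
            self.attached.clear()

    def attach(self, resource):
        # EBS storage and Elastic IPs are attached once the instance is running.
        if self.state is not State.RUNNING:
            raise ValueError("resources can only be attached once running")
        self.attached.add(resource)
```

One AMI can back many such `Instance` objects, which mirrors the 1:many relationship between AMIs and instances.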

Amazon Virtual Private Clouds

AWS Security White Paper

Available to the public: aws.amazon.com/security

Amazon EC2 Instance Types

Instance type | On-Demand hourly price | 1-year Reserved Instance price | Memory | Virtual cores | Storage

Standard On-Demand Instances
Small | $0.10 | $227.50 + $0.03/hour | 1.7 GB | 1 @ 1 ECU | 160 GB
Large | $0.40 | $910 + $0.12/hour | 7.5 GB | 2 @ 2 ECU | 850 GB
Extra Large | $0.80 | $1820 + $0.24/hour | 15 GB | 4 @ 2 ECU | 1690 GB

High-CPU On-Demand Instances
Medium | $0.20 | $455 + $0.06/hour | 1.7 GB | 2 @ 2.5 ECU | 350 GB
Extra Large | $0.80 | $1820 + $0.24/hour | 7 GB | 8 @ 2.5 ECU | 1690 GB
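To compare the two pricing columns above, a short sketch can compute how many hours of use make a 1-year Reserved Instance cheaper than On-Demand. The function name and the choice to work in integer cents are illustrative assumptions; the prices are the ones in the table.

```python
def break_even_hours(on_demand_cents, reserved_upfront_cents, reserved_hourly_cents):
    """First whole hour of use at which reserved pricing costs no more than on-demand."""
    saving_per_hour = on_demand_cents - reserved_hourly_cents
    # Ceiling division on integers avoids floating-point rounding.
    return -(-reserved_upfront_cents // saving_per_hour)


# (on-demand cents/hour, reserved up-front cents, reserved cents/hour)
PRICES = {
    "Small": (10, 22750, 3),
    "Large": (40, 91000, 12),
    "Extra Large": (80, 182000, 24),
    "High-CPU Medium": (20, 45500, 6),
    "High-CPU Extra Large": (80, 182000, 24),
}

HOURS_PER_YEAR = 365 * 24

for name, (on_demand, upfront, hourly) in PRICES.items():
    hours = break_even_hours(on_demand, upfront, hourly)
    share = hours / HOURS_PER_YEAR
    print(f"{name}: break-even after {hours} hours ({share:.0%} of a year)")
```

With these prices every instance type breaks even at 3,250 hours, roughly 37% of a year, so a server that runs less than that over the year is cheaper on demand.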

Coming Soon!

In response to requests from customers running memory and I/O-intensive applications in the cloud, we are planning additions to our EC2 instance family

Aimed at database, memory caching and other high-throughput applications, these instances would offer much larger memory sizes and significantly more network I/O bandwidth.

Stay tuned over the next few weeks for an announcement!

October 09 © The Server Labs S.L. 2009

Scientific Data Processing in Amazon EC2 with Oracle

© The Server Labs S.L. 2009, Images Courtesy of ESA. 29-Oct-09

Who are The Server Labs

European specialised, niche consultancy

IT architects

Extensive experience

Hands-on

Agile execution

Passion for technology


About this presentation

Results of a Feasibility Study to move part of ESA’s Gaia Data Processing to Amazon’s EC2 Cloud


Study Objectives

Two main objectives

Evaluate the Technical Feasibility of using Amazon EC2 to run scientific data processing applications

Evaluate the Financial Viability of using pay on demand compute power vs. traditional in-house data processing


A Stereoscopic Census of our Galaxy

(based on slides from Jos de Bruijne and William O’Mullane)

Seminar, IAC, 30th April 2008, Dr. Ralf Kohley, European Space Astronomy Centre


The Gaia Mission

The primary goal of the Gaia mission is to create an astrometric catalogue of 1 billion stars (approx. 1% of our Galaxy) with microarcsecond precision.

Gaia satellite to be launched in 2011.

Observations done until 2017.

Catalogue ready around 2019.


The Gaia Mission


Credit: Images ESA


If it took 1 millisecond to process one image, just one pass through the data on a single processor would take 30 years.

Obviously the adopted solution is much faster: distributed/parallel processing.
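The arithmetic behind that estimate can be sketched as follows. The image count is not stated on the slide; it is derived here from the 30-year figure, and the perfectly parallel, zero-overhead scaling is an idealising assumption rather than a measured result.

```python
# Julian year of 365.25 days, kept as an integer number of seconds.
SECONDS_PER_YEAR = 36525 * 24 * 3600 // 100

MS_PER_IMAGE = 1        # from the slide: 1 millisecond per image
SINGLE_CPU_YEARS = 30   # from the slide: one pass on a single processor

# Number of images implied by the two figures above (derived, not quoted).
implied_images = SINGLE_CPU_YEARS * SECONDS_PER_YEAR * 1000 // MS_PER_IMAGE


def wall_clock_days(processors):
    """Ideal time for one pass, in days, if the work split perfectly across processors."""
    total_seconds = SINGLE_CPU_YEARS * SECONDS_PER_YEAR
    return total_seconds / processors / 86400


print(f"implied images: {implied_images:.3e}")
for n in (1, 100, 1000):
    print(f"{n:>5} processors: {wall_clock_days(n):,.1f} days")
```

Under these assumptions the single-processor pass covers roughly 9.5 × 10^11 images, and 1,000 processors would bring one pass down to about 11 days.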


AGIS: Astrometric Global Iterative Solution

Sky scans (highest accuracy along scan)

Scan width: 0.7°

1. Objects are matched in successive scans
2. Attitude and calibrations are updated
3. Object positions etc. are solved
4. Higher-order terms are solved
5. More scans are added
6. Whole system is iterated


AGIS Architecture

Datatrains drive through the AGIS Database, passing observations to the algorithms. There can be as many Datatrains in parallel as we wish.

The algorithm does not access data directly.

[Diagram components: Calibration, Global, Attitude, Source; Elementary Takers; AstroElementary; Data Access Layer; Optimised AGIS Database.]


AGIS Architecture - detailed

[Diagram components: RunManager; ConvergenceMonitor; AGIS DB; Store; GaiaTable; Object Factory; AstroElementary; ElementaryDataTrain (requests AstroElementaries between a range (x,y)); Calibration Collector; Attitude Collector; Source Collector; Global Collector; Attitude Update Server; Global Update Server.]


Scheduling

Very simple: keep all machines busy all the time!

Busy = CPU ~90%

Post jobs on whiteboard

Trains/Workers mark jobs and do them; mark finished; repeat until done

Previous attempt had much more general scheduling. It was also ~1000 times slower.

[Diagram: whiteboard listing Job 1, Job 2, Job 3, …, Job N.]
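The whiteboard idea can be sketched in a few lines. This is an illustrative Python model using threads in one process, not the AGIS implementation; the `Whiteboard` class and its method names are hypothetical.

```python
import threading


class Whiteboard:
    """Shared list of jobs; workers mark a job, do it, then mark it finished."""

    def __init__(self, jobs):
        self._lock = threading.Lock()
        self._pending = list(jobs)
        self.finished = []

    def mark_next(self):
        """Claim the next unclaimed job, or return None when none are left."""
        with self._lock:
            return self._pending.pop(0) if self._pending else None

    def mark_finished(self, job, result):
        with self._lock:
            self.finished.append((job, result))


def worker(board, do_job):
    # Each worker keeps itself busy: mark a job, do it, repeat until done.
    while True:
        job = board.mark_next()
        if job is None:
            return
        board.mark_finished(job, do_job(job))


def run(jobs, do_job, n_workers=4):
    board = Whiteboard(jobs)
    threads = [
        threading.Thread(target=worker, args=(board, do_job))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return board.finished
```

There is no central scheduler deciding who does what; a worker that finishes early simply takes the next job, which is what keeps every machine busy.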


The problem

Data centre cost

AGIS run times decrease as more processors are added. Note that the data volume increased from 2005 to 2006 from 18 months to 5 years; the processor power also increased, but the run time went up. This was dramatically improved in 2007. The normalised column shows throughput per processor in the system (total observations/processors/hours), i.e. an indication of the real performance.
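The normalised figure described above is a simple ratio. The sketch below shows the calculation only; the two example runs use made-up numbers purely for illustration and are not taken from the AGIS results.

```python
def normalised_throughput(total_observations, processors, hours):
    """Observations processed per processor per hour."""
    if processors <= 0 or hours <= 0:
        raise ValueError("processors and hours must be positive")
    return total_observations / processors / hours


# Hypothetical runs, for illustration only (not AGIS measurements).
run_a = normalised_throughput(1_000_000, processors=10, hours=5)
run_b = normalised_throughput(4_000_000, processors=40, hours=10)

print(run_a, run_b)
```

In this invented example the second run handles four times the data on four times the processors but takes twice as long, so its normalised throughput is half that of the first. That is the kind of regression the normalised column makes visible.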

Current estimation for in-house data processing for AGIS is around 1.2 million euros


Economics of Cloud Computing

[Figure: two plots of demand and capacity over time, one for a static data center (with unused resources marked) and one for a data center in the cloud.]


AGIS Peaks

Iterative processing: 6-month Data Reduction Cycles

At current estimates AGIS will run 2 weeks every 6 months

Amount of data increases over the 5-year mission

[Chart: AGIS Peak Processing (Hours). Vertical axis: hours, 0 to 2500. Horizontal axis: date, Nov-11 to Nov-17. Series: AGIS 6 monthly processing; Cap Ex.]
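The "2 weeks every 6 months" estimate above implies that dedicated hardware would sit idle most of the time. A small worked calculation, assuming a 6-month cycle of 26 weeks and a hypothetical helper for counting instance-hours:

```python
from fractions import Fraction

RUN_WEEKS = 2      # from the slide: AGIS runs 2 weeks per cycle
CYCLE_WEEKS = 26   # assumption: 6 months taken as half of a 52-week year

busy = Fraction(RUN_WEEKS, CYCLE_WEEKS)
idle = 1 - busy


def instance_hours_per_run(n_instances):
    """Instance-hours billed if n_instances run non-stop for one 2-week AGIS run."""
    return n_instances * RUN_WEEKS * 7 * 24


print(f"busy {float(busy):.1%} of the time, idle {float(idle):.1%}")
print(f"100 instances for one run: {instance_hours_per_run(100)} instance-hours")
```

A cluster used this way is busy about 1/13 of the time, which is the gap that pay-per-use capacity is meant to close.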


The Study: Running AGIS in Amazon EC2

Technical Feasibility:
• Can AGIS run in the cloud?
• What are the restrictions?
• What modifications do we have to make?

Financial Viability:
• What would be the cost of using EC2 for AGIS?
• Can we do a hybrid solution using a local data centre followed by a mix of local/EC2?


EC2 Images

64 bit images: Large, Extra Large and High CPU Large

Oracle ASM Image based on Oracle Database 11g Release 1 Enterprise Edition - 64 Bit (Large instance) - ami-7ecb2f17

AGIS Self configuring Image based on Ubuntu 8.04 LTS Hardy Server 64-Bit (Large, Extra Large and High CPU Large Instances) - ami-e257b08b


Architecture in the Cloud

[Diagram: the AGIS components (RunManager, ConvergenceMonitor, Attitude Update Server, Store, GaiaTable, Object Factory, AstroElementary, ElementaryDataTrain requesting AstroElementaries between a range (x,y), Calibration Collector, Attitude Collector, Source Collector, Global Collector, Data Trains, AGIS DB) mapped onto EC2 instances.]

Instance labels in the diagram:
• 1x Large instance, AGIS AMI, Elastic IP
• <n> x Extra Large or High CPU Large instances, AGIS AMI
• 1x Large instance, Oracle AMI, Elastic IP
• 3 x Extra Large instances, AGIS AMI


Oracle Image

EC2 Large instance (m1.large)
Oracle Enterprise Edition 11g 64 bit (11.0.6)
Oracle ASM
Elastic Block Storage: 5 x 100GB disks, /dev/sdg - /dev/sdk

[Diagram: AGIS DB on an ASM disk group (EBS) made up of /dev/sdg, /dev/sdh, /dev/sdi, /dev/sdj and /dev/sdk; /mnt on /dev/sdb; / on /dev/sda1. Labels: ORACLE; EXTERNAL redundancy best.]


Configuring Oracle

• Launch an m1.large instance of ami-7ecb2f17
• Attach the /mnt partition properly so it has enough space
• Create 5 EBS vols of 100GB each and attach them to the instance
• Set up Oracle ASMLib
  – Install drivers
  – Run oracleasm_debug_link
  – Run oracleasm configure, createdisk
• Copy a pre-recorded Oracle response file up to create an ASM instance
• Run Oracle installer to create the ASM instance
• Copy a pre-recorded Oracle response file up to create the AGIS DB instance
• Run Oracle installer to create the AGIS DB instance
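The EBS step above (5 volumes of 100GB on /dev/sdg to /dev/sdk) can be written down as a small plan before any API call is made. This is a sketch only: `plan_ebs_volumes` is a hypothetical helper that produces the size and device name for each volume, and the actual creation and attachment through the EC2 API or tools is left out.

```python
import string


def plan_ebs_volumes(count=5, size_gb=100, first_letter="g"):
    """Return one {size_gb, device} entry per EBS volume, on consecutive /dev/sdX names."""
    start = string.ascii_lowercase.index(first_letter)
    letters = string.ascii_lowercase[start:start + count]
    if len(letters) < count:
        raise ValueError("not enough device letters after " + first_letter)
    return [{"size_gb": size_gb, "device": f"/dev/sd{letter}"} for letter in letters]


plan = plan_ebs_volumes()
for volume in plan:
    print(f"{volume['size_gb']} GB -> {volume['device']}")
print("total:", sum(v["size_gb"] for v in plan), "GB")
```

Keeping the device names in one place helps when the same names are later handed to oracleasm createdisk.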


Configuring Oracle cont.

• Create an Elastic IP and associate it with the instance
• Change the hostname to be the new public DNS name
    hostname ec2-174-129-223-59.compute-1.amazonaws.com
• Run localconfig remove followed by localconfig add
  – This will run forever unless you edit /etc/inittab and change the following line
    h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null
    to
    h1:345:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null
• Start the ASM instance
• Start the AGIS DB instance

Don’t forget to make a new image!!!
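Because a new image has to be made after these changes, the /etc/inittab edit is worth scripting so it is repeatable. A minimal sketch, assuming the file content is passed in as text; the function name is hypothetical and reading or writing the real file (which needs root) is left to the caller.

```python
OLD = "h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null"
NEW = "h1:345:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null"


def fix_inittab(text):
    """Replace the init.cssd line (runlevels 35) with the runlevels 345 version."""
    lines = text.splitlines()
    fixed = [NEW if line.strip() == OLD else line for line in lines]
    # Preserve a trailing newline if the input had one.
    return "\n".join(fixed) + ("\n" if text.endswith("\n") else "")


sample = "id:3:initdefault:\n" + OLD + "\n"
print(fix_inittab(sample))
```

The function is idempotent: running it on an already fixed file leaves the content unchanged.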


AGIS Image

Instances (m1.large and c1.xlarge).

Java version 1.6.0_13
  Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
  Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed mode)

Apache Tomcat 6.0.14

AGIS software.

Creation of agis user account

rc.local script modified to run the AGIS process

Self configuring using user-data


Configuring AGIS Image

• Launch an m1.large instance of ami-e257b08b
• Check out the source code from the svn server.
• Create an agis user to run the process.
• Set up /etc/rc.local to execute the runAgis.sh
• Create a generic runAgis.sh script that reads data passed to the ami during boot time (using the AWS).
  – Data contains JVM parameters and specific application parameters (depending on the DataTrain that will be executed at that node).
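The self-configuring step above can be illustrated with a small parser. The slides do not show the user-data format or the contents of runAgis.sh, so everything here is an assumption: a `KEY=value` layout, the keys `JVM_OPTS` and `APP_ARGS`, and the main class name are all hypothetical. On a running instance, EC2 exposes the user-data at http://169.254.169.254/latest/user-data; the sketch takes the text as an argument so it can run anywhere.

```python
import shlex


def parse_user_data(text):
    """Parse hypothetical KEY=value user-data lines into lists of arguments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = shlex.split(value)
    return config


def build_command(config, main_class="example.DataTrainMain"):
    """Assemble the java command line; main_class is a placeholder name."""
    return ["java", *config.get("JVM_OPTS", []), main_class, *config.get("APP_ARGS", [])]


user_data = """
# passed at launch time, one set per node
JVM_OPTS=-Xmx6g -server
APP_ARGS=--train elementary --range 0 1000
"""

print(build_command(parse_user_data(user_data)))
```

With this approach one image serves every node type, and only the launch parameters differ between a DataTrain node and, for example, an update server.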


Problems Encountered along the way

Creating new images takes a long time, so make sure you get it right

No ASMLib drivers for 2.6.18-53.1.13.9.1.el5xen

Oracle is very fussy about its IP address, hence the Elastic IP

Oracle instance hostname changed to be the public DNS name

Some work still needed on the startup script to make sure the ASM boots first time

The Attitude Servers took a long time to start up (20 mins). This was due to a race condition caused by spin locks in the type of Java thread pool we were using.


EC2 Instances


AGIS Works!


Oracle Performance in the Cloud

I/O Transfer from disk of up to 50 MB/sec


Demo

Let’s see it in action!


Conclusions

AGIS and Oracle can be run in the cloud!

The Economics work out: EC2 may work out cheaper than buying the hardware!

With the knowledge that EC2 is an option we can delay buying more machines until the middle of the mission (2014) and decide then.

Now running a new feasibility study with 60 million primary stars (1/3 of the final data)

Aim is to try and scale out to 1000 High CPU instances


Contact us!

Dolores Saiz, CEO
Paul Parsons, CTO

Spain: The Server Labs S.L., C/Pinar, 5, 28006 Madrid, Spain. Tel: (+34) 91 745 68 77

UK: The Server Labs Ltd., Aston Court, Kingsmead Business Park, Frederick Place, High Wycombe, HP11 1LA. Tel: (+44) 20 8133 1620

Germany: Trianon, Mainzer Landstraße 16, 60325 Frankfurt. Tel: (+49) (0) 69 971 68 428


Copyright Notice

This presentation contains images and videos which have been released publicly from ESA. You may use ESA images or videos for educational or informational purposes.

The publicly released ESA images may be reproduced without fee, on the following conditions:

* Credit ESA as the source of the images. Examples: Photo: ESA; Photo: ESA/Cluster; Image: ESA/NASA - SOHO/LASCO

* ESA images may not be used to state or imply the endorsement by ESA or any ESA employee of a commercial product, process or service, or used in any other manner that might mislead.

* If an image includes an identifiable person, using that image for commercial purposes may infringe that person’s right of privacy, and separate permission should be obtained from the individual.

If these images are to be used in advertising or any commercial promotion, layout and copy must be submitted to ESA beforehand for approval to:

ESA [email protected]

Some images contained in this presentation have come from other sources, and this is indicated in the Copyright notice. For re-use of non-ESA images contact the designated authority.

Use of ESA videos

The use of ESA video images in streaming and downloadable format is limited to direct viewing and/or file storage on a single computer per stream and/or download. Forwarding of files or streams to other computers, or use on any non-ESA Web is prohibited. For the authorisation of any such use, please contact:

ESA [email protected]

Copyright © 2009 Recombinant Data Corp. All rights reserved.

Joseph Adler, Solutions Architect
Recombinant Data Corp.

About me

• Joseph Adler

– 12 years experience in data warehousing and data mining

– Multiple patents on cryptography and computer security

– Shamelessly plugs books at conferences

About Recombinant

We are a startup from Newton, MA focused on secondary uses of clinical data.

Core Competencies

• Clinical data warehousing & integration services
• Translational research & quality reporting solutions
• Data strategy, governance & compliance consulting
• Open Source implementations & extensions

Representative Clients

• Partners HealthCare
• The Dartmouth Institute
• Health Sciences South Carolina
• UCSF
• UC Davis Health System
• Boston University
• Massachusetts General Hospital
• University of Washington
• UMass Medical School
• Morehouse School of Medicine
• Piedmont Healthcare
• UMass Memorial Health Care
• Maine Health
• UC Irvine
• Moses Cone Health System
• Department of Veterans Affairs
• Cincinnati Children’s Hospital

Translational Medicine Defined


Healthcare and Life Sciences

• What’s unique about healthcare and life sciences data warehousing?
  – Large volume of data
    • Long data (thousands of patients)
    • Wide data (thousands of measurements)
  – Many workers
    • Thousands of employees, multiple locations, all of whom need to access the same data
  – Diverse skill sets
    • Research scientists, statisticians, medical doctors, etc
    • Lots of people with advanced degrees…

Translational Research for Healthcare and Life Sciences: Case Study

Case Study

We were hired by a major pharmaceutical company to build a data warehouse to house pre-clinical, clinical, and third party data and publications

• Helps different types of users (scientists, clinicians, CDTLs, biostatisticians, executives) find and view results
• Allows queries of data across different stages of a study or multiple studies relating to the same subject
• Provides a view of all projects across the organization

Case Study

• Operational and technical challenges
  – Limited budget
    • Time, Money
  – Security concerns
    • Proprietary corporate information
      – Clinical trial results, Research focus, Experimental data
    • Patient data
      – HIPAA, European privacy laws, etc
  – Agile development process
    • Short development cycles, working software, collaboration with customers
  – Scalability
    • Thousands of users, terabytes of data (eventually)
• We chose Amazon Web Services for our development, testing, and production systems.

Case Study

Estimated System Cost: 10 TB Warehouse, Total expenses over 3 years

Case Study

Why AWS?
  – Lowest cost
    • Lowest storage cost
    • Lowest overhead
  – Highest flexibility
    • Add or remove instances at any time
    • Start with a little storage space, scale up over time
  – Easiest deployment
    • Fast, reliable access from our development office in MA, customer sites across US and in Europe

Case Study

• We used these AWS tools and services:
  – S3 Storage
    • Approximately 2 TB in use right now, scaling to 30+
  – EC2 Instances
    • Started with Oracle AMI
      – Oracle Enterprise Linux Release 5 Update 1
      – Oracle Database 11g Release 1 Enterprise Edition
    • 6 Primary instances
      – Separate database and application servers
      – Development, Testing, and Production Systems
    • Additional instances
      – Backup
      – Performance testing

Case Study

• Architecture diagram

[Diagram: the pharmaceutical company network (internal databases, internal documents, internal data files; developers (App, ETL) and users (researchers)) alongside the Amazon Web Services “Cloud”, which hosts a DB and an application server for each of Development, Testing/QA and Production.]

Case Study

• Amazon Web Services is great for data warehouses
  – Data warehouses are different from transactional systems
    • Loading data in batches
    • Querying data across large tables
  – AWS performance characteristics are good for data warehouses
    • Good I/O operations per second, but excellent bandwidth
    • Ample CPU, Memory
  – Unlimited, low cost storage

Lessons Learned

• Optimizing Oracle on AWS:
  – Observations
    • Minimize I/O queries
    • Load large blocks of data
    • Leverage CPU and memory
  – Recommendations
    • Table and index compression
    • Bitmap indexes, bitmap join indexes
    • Disable logging to the redo file (NOLOGGING)
    • “Extra-Large” EC2 instances (15 GB RAM, 4 cores, 64 bit)
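As an illustration of the recommendations above, the sketch below generates the kind of Oracle DDL they point to. The table and column names are hypothetical, the helper is not from the talk, and whether each statement suits a given warehouse (for example, NOLOGGING trades recoverability for load speed) is a judgment for the DBA.

```python
def warehouse_ddl(table, bitmap_column):
    """Return Oracle statements for NOLOGGING, table compression and a bitmap index."""
    index_name = f"{table}_{bitmap_column}_bix"
    return [
        # Skip redo generation for direct-path operations on this table.
        f"ALTER TABLE {table} NOLOGGING",
        # Rebuild the table with basic table compression.
        f"ALTER TABLE {table} MOVE COMPRESS",
        # Bitmap indexes suit low-cardinality columns queried across large tables.
        f"CREATE BITMAP INDEX {index_name} ON {table} ({bitmap_column}) NOLOGGING",
    ]


for statement in warehouse_ddl("lab_results", "study_id"):
    print(statement + ";")
```

Batch loads into such a table would typically use a direct-path insert (`INSERT /*+ APPEND */ ...`), which is where NOLOGGING takes effect.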


Lessons Learned

• Increasing availability on AWS
  – EC2 instances can die
    • Actual story: We lost one day of development work when I changed a configuration parameter and rebooted an AMI instance, locking us out.
  – Recommendations
    • Make frequent backups
      – Daily backups (to another AMI instance)
      – Backup to S3
    • Build your own AMI
      – Build a machine image with your app
      – Make sure you can start, stop the AMI

Lessons Learned

• AWS security
  – Remember that each EC2 instance looks like a public host on the internet
  – Recommendations
    • Use IP filtering (“AWS security groups”) to restrict access during development
    • Get your organization’s security staff involved early
    • Start with a secure AMI
    • Treat the EC2 instance like a Windows/Linux server on the public internet
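The IP filtering recommendation above amounts to an allow-list of source networks and ports. The sketch below models that check locally; it is not the AWS security group API, and the rule format and the example address ranges (reserved documentation ranges) are assumptions for illustration.

```python
import ipaddress


def is_allowed(source_ip, port, rules):
    """Return True if any rule (cidr, first_port, last_port) admits this connection."""
    address = ipaddress.ip_address(source_ip)
    return any(
        address in ipaddress.ip_network(cidr) and first_port <= port <= last_port
        for cidr, first_port, last_port in rules
    )


# Hypothetical development rules: only the office network may reach SSH and the Oracle listener.
DEV_RULES = [
    ("203.0.113.0/24", 22, 22),
    ("203.0.113.0/24", 1521, 1521),
]

print(is_allowed("203.0.113.7", 1521, DEV_RULES))   # office host, Oracle listener
print(is_allowed("198.51.100.9", 1521, DEV_RULES))  # outside host
```

Anything not matched by a rule is refused, which is the posture to keep for a host that is visible on the public internet.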


Thank You

• Recombinant Data Corp.
  255 Washington Street, Suite 235
  Newton, MA 02458
  Tel: (617) 243-3700 Fax: (617) 243-[email protected]

Additional References and Contacts

• Oracle Cloud Computing Center (OTN)
  – http://www.oracle.com/technology/tech/cloud/index.html
  – Provide feedback and ask questions using the “Cloud Computing Discussion Forum”
• Amazon Web Services Website
  – http://aws.amazon.com