Virtualized Infrastructure

Darwin A. Campbell 1; Carson M. Andorf 1; Ethalinda K. Cannon 2; Bremen L. Braun 1,7; Scott M. Birkett 2,7; Jack M. Gardiner 2,6,7; Lisa C. Harper 3,4,7; Mary L. Schaeffer 5; Taner Z. Sen 1,2; Carolyn J. Lawrence 1,2

1 USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011; 2 Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011; 3 USDA-ARS Plant Gene Expression Center, Albany, CA 94710; 4 Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720; 5 USDA-ARS Plant Genetics Research Unit and Division of Plant Sciences, University of Missouri, Columbia, MO 65211; 6 School of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036; 7 These authors are listed alphabetically.

Abstract

Over the past year, the MaizeGDB system infrastructure has been completely redesigned and virtualized to ensure availability and increase performance. Here we show the overall design of the virtualized infrastructure, our server environment, updated software, and methods to gather usage statistics. Moving to a virtualized environment has had a significant impact on how we do business. We are happy to report that previous physical hardware limitations no longer bog down development, thus improving our ability to provide timely software releases that support your research needs.

Funding acknowledgement: United States Department of Agriculture (USDA)

“Virtual Infrastructure” refers to the collective hardware and software working together. At MaizeGDB, we chose VMware ESX version 3.5 (our repurposed hardware is not compatible with the latest version of VMware). Our host servers are Dell 2850s with dual quad-core processors and 16 GB of RAM. The host servers are connected via redundant Fibre Channel to a 5.5 TB HP StorageWorks Storage Area Network (twelve 450 GB, 15K rpm, dual-port drives).

At present, the two host servers support 16 virtual servers. In the past, under ideal conditions, one physical server could provide the functionality of two virtual servers.

The virtualized infrastructure affords us the ability to test and develop (data and software) in an isolated environment. Through the implementation of High Availability, the system automatically moves virtual servers from host to host depending on established resource (RAM, CPU) thresholds.
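The threshold-driven movement of virtual servers described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not VMware's algorithm: the 80% limits, the class names, and the "move the smallest virtual server first" policy are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; the poster does not state the actual values.
CPU_LIMIT = 0.80
RAM_LIMIT = 0.80


@dataclass
class VM:
    name: str
    cpu: float  # fraction of one host's CPU capacity
    ram: float  # fraction of one host's RAM capacity


@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)

    def cpu_load(self):
        return sum(vm.cpu for vm in self.vms)

    def ram_load(self):
        return sum(vm.ram for vm in self.vms)

    def over_threshold(self):
        return self.cpu_load() > CPU_LIMIT or self.ram_load() > RAM_LIMIT

    def can_accept(self, vm):
        return (self.cpu_load() + vm.cpu <= CPU_LIMIT
                and self.ram_load() + vm.ram <= RAM_LIMIT)


def rebalance(hosts):
    """Move VMs off overloaded hosts; return a list of (vm, source, target)."""
    moves = []
    for source in hosts:
        # Try the smallest VMs first so each move disturbs as little as possible.
        for vm in sorted(source.vms, key=lambda v: v.cpu + v.ram):
            if not source.over_threshold():
                break
            target = next((h for h in hosts
                           if h is not source and h.can_accept(vm)), None)
            if target is None:
                continue  # no other host has room for this VM
            source.vms.remove(vm)
            target.vms.append(vm)
            moves.append((vm.name, source.name, target.name))
    return moves
```

With two hosts, as at MaizeGDB, calling `rebalance([host_a, host_b])` would shift load from whichever host exceeds a limit to the other, provided the other has headroom.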

Current versions:

Server OS: Linux RedHat Advance Server 5.4 and Windows Server 2008
Database: Oracle 11g
PHP: PHP 5.2.13
Apache: Apache 2.2.3
BLAST: BLAST+ 2.2.23

In concert with the virtualization of the infrastructure, the method by which MaizeGDB gathers and reports usage statistics underwent a radical redesign. In the past, statistics were gathered on each physical server and then aggregated into a report. It was difficult to monitor and report on all traffic into MaizeGDB, and usage statistics from before mid-2009 are almost certainly artificially low.

Our solution was to implement a proxy server that sits between the internet and the MaizeGDB virtual servers, which support 37 sub-domains (e.g. video.maizegdb.org, cornfab2.maizegdb.org). Now that all traffic is routed through one server, monitoring incoming traffic and reporting usage statistics are more accurate and timely. The proxy server is a virtual server residing on the same physical hardware and does not decrease site performance, since communication between the virtual machines is limited only by hard drive performance.
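The idea of a single entry point that both routes requests and counts them can be sketched in a few lines. This is a minimal illustration, assuming routing is decided by the requested host name; the `Proxy` class, the backend addresses, and the default backend are hypothetical, and only the two sub-domain names come from the text above.

```python
from collections import Counter

# Hypothetical backend addresses for illustration only.
BACKENDS = {
    "video.maizegdb.org": "10.0.0.11",
    "cornfab2.maizegdb.org": "10.0.0.12",
}
DEFAULT_BACKEND = "10.0.0.10"  # assumed address of the main public web server


class Proxy:
    """Route each request by host name and count it in one central place."""

    def __init__(self, backends, default):
        self.backends = backends
        self.default = default
        self.hits = Counter()

    def route(self, host_header):
        # Drop an optional ":port" suffix and normalize case before lookup.
        host = host_header.split(":")[0].lower()
        self.hits[host] += 1
        return self.backends.get(host, self.default)

    def report(self):
        # Sub-domains ordered from most to least requested.
        return self.hits.most_common()
```

Because every request passes through `route`, the usage report needs no aggregation step across servers, which is the property the redesign relies on.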

This illustration shows the integration of the proxy server into the virtual environment:

• Connection to the “public” web interface (two instances),

• Connection to the GBrowse servers (two instances),

• Connection to the BLAST service at MaizeGDB, which resides on a separate physical server/virtual infrastructure, and

• Monitoring of web traffic to all the other virtual servers.

BLAST service requests at MaizeGDB are routed through the proxy server to an independent physical server/virtual infrastructure, connected by gigabit Ethernet. Isolating the BLAST service confines computationally intensive processes and keeps them from affecting the public interface.

Standard BLAST requests are routed through the proxy server to a load balancer. The load balancer submits each BLAST request to one of three BLAST servers in round-robin order.

uBLAST requests are routed through the proxy server and then directly to the uBLAST server.

The common data server distributes data to all BLAST servers and minimizes redundant file storage.
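The routing just described, round-robin for standard BLAST and a direct path for uBLAST, can be sketched as follows. This is an illustrative model only: the `BlastRouter` class, the request-type strings, and the server names are hypothetical and do not reflect the actual load balancer software used at MaizeGDB.

```python
from itertools import cycle


class BlastRouter:
    """Send standard BLAST jobs round-robin and uBLAST jobs to one server."""

    def __init__(self, blast_servers, ublast_server):
        # cycle() repeats the server list endlessly, in order.
        self._next_blast = cycle(blast_servers)
        self.ublast_server = ublast_server

    def route(self, request_type):
        if request_type == "ublast":
            # uBLAST bypasses the load balancer entirely.
            return self.ublast_server
        if request_type == "blast":
            return next(self._next_blast)
        raise ValueError(f"unknown request type: {request_type!r}")


# Hypothetical server names for illustration.
router = BlastRouter(["blast1", "blast2", "blast3"], "ublast1")
```

Round-robin ignores how busy each server is; it spreads requests evenly by count, which works well when jobs are of broadly similar size.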

Monthly usage, reported by year

2011 Maize Meeting Program and abstracts

Stay connected to MaizeGDB!