Using Cloud Technologies to Build Scalable Environments Nipun Rahman June 15, 2015


What is the ‘Cloud’?

The National Institute of Standards & Technology (NIST) defines cloud computing as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction”.

In other words, the cloud is an ecosystem of commodity hardware and software working in a synchronized manner to provide demand based information capabilities to different users over a network.

What is a Scalable Environment?

A scalable environment is one which can grow on demand to meet computing needs of a process/system. Because of the highly scalable nature of cloud computing, the environments outlined in this presentation make extensive use of cloud based products.

Why Scalable Environments?

Cloud based scalable environments provide the following benefits:

• Minimal start up costs. The user pays only for the resources used, thus avoiding expensive licensing fees. Moreover, costs can be predicted at the beginning of the project.
• Scaling. The user can access the desired amount of resources as project needs change.
• Pre-installed software. In most cases, software will be preinstalled. This minimizes software installation issues, thus resulting in fewer technical problems.
• Mobility. The user doesn’t have to be tied to a specific site or network. In most cases, the user can access his/her work from anywhere using an internet browser.


Elements To Build The Cloud Environment

The elements needed to build the environment can be broadly categorized into six groups:

1. Platform: Operating systems that also contain the application software and storage. We will demonstrate two cloud based platforms: AWS (Amazon Web Services) and Microsoft Azure.

2. Storage: Devices where data and files are held prior to processing. We will utilize the following cloud based storage systems: S3, Dropbox and Azure Blob Storage.

3. Language: Formal constructed language designed to send instructions to the computer. Although every analytics software has its own unique language, we will use SQL as our common language for all three tools.

4. Tools: Application software designed to perform a specific set of tasks or functions. We will demonstrate the following three analytical tools: R, Python and SAS.

5. Visualization: Applications that show information in graphical form, which makes it easier for humans to consume and digest. We will demonstrate three data visualization tools: Tableau, TIBCO Spotfire and Shiny (works with R).

6. Version Control/Collaboration: Software that enables multiple people to work on the same software code while ensuring integrity of the source code. We will use the following tool for version control/collaboration: Git/GitHub.

Figure 1. Elements of a cloud based scalable environment. Layers shown: Platform (AWS | Azure); Operating System (Windows | Linux); Storage (Block | SQL Based | NoSQL); Data; Tools (Modeling | Predictive Analytics | NLP | Machine Learning | Geospatial); Visualization/Content Delivery (Reports | Dashboards | KPIs).


Platform

Amazon Web Services (AWS)

Amazon Web Services (AWS) is the cloud computing platform offered by Amazon. Launched in 2006, it is one of the earliest cloud platforms to be launched by a major company, and remains one of the most popular cloud platform services in use today. AWS offers a full range of cloud based services, including virtual servers and storage. In addition, AWS Marketplace offers third party software that works with the AWS cloud computing platform.

You can sign up for AWS and access the platform via the following website: http://aws.amazon.com/. New accounts can use a limited set of AWS resources free of charge, including 750 hours of micro instance usage per month, for up to one year.

Key terms to remember when using AWS:

EC2: Also known as Elastic Compute Cloud, EC2 is the virtual computing environment which enables users to launch ‘instances’ and load software. Users can create, launch and terminate instances as needed, hence the term ‘Elastic’.

Instance: A virtual machine on EC2, which is required for running applications on AWS. Instances provide the system platform for executing the operating system (OS).

AMI (Amazon Machine Image): File system setup that includes the OS, drives and additional software required to deliver a service.

Private/Public Key Encryption: For security purposes, AWS uses public key cryptography to protect the login information for your instances. When you launch an instance, AWS asks you to create or select a key pair. AWS keeps the public key, while you download and store the private key, which you will need when you connect to the instance.

Security Group: Virtual firewall that controls internet traffic to your instances.

Figure 2. User Interface for AWS.


Platform

Azure

Azure is the cloud computing platform offered by Microsoft. Azure was launched in 2010 as Windows Azure, and renamed Microsoft Azure in 2014. Similar to AWS, Azure offers a full range of cloud based services, including virtual servers, building and deploying web based applications, and managing storage and databases. Although Azure offers third party applications, it is tightly integrated with many Microsoft specific products, such as Active Directory, SQL Server, SharePoint and Visual Studio.

Azure provides a 30 day free trial period to try out its service offerings. You can get more information on Azure and its service offerings from the following website: https://azure.microsoft.com/

Key concepts to remember when using the Azure platform:

Virtual Machine (VM): Also called instances, a VM is the operating system setup that allows users to install application software. Azure offers both Windows and Linux based VMs.

Image: In Azure, an image refers to the file system setup of the VM and attached data disks. Saving an image allows the user to initiate new VMs with similar configurations with minimal hassle.

Remote Desktop Protocol (RDP): RDP is a proprietary protocol developed by Microsoft to allow users to connect to a computer over a network connection. In Azure, RDP is the default method to connect to your VM.

Azure Machine Learning (ML): Azure ML is a GUI based Integrated Development Environment (IDE) for constructing and operationalizing machine learning workflows. It provides a drag and drop interface to pull and link modules into a graphical model. Each module performs a specific function. You can upload your data to the IDE or retrieve from an external website or persistent storage device. You can use the pre-built algorithms provided by Azure ML or write your own algorithms using R or Python.

Figure 3. Azure Machine Learning Interface.


Storage

S3

Also known as Simple Storage Service, S3 is a cloud based file storage service offered by Amazon. S3 allows uploading, storing and downloading files of up to 5TB each (up to 5GB in a single upload request). In S3, a folder is known as a Bucket. Files uploaded to the bucket are called objects. An object can be a text file, spreadsheet, picture, audio or video file. S3 provides a web link to access the object anywhere using a web browser. The user can use the permissions option to make the contents of the bucket private or accessible to everybody. In addition, users can enable the versioning option to ensure a version is saved whenever an object is modified.

S3 storage can be paired with other AWS service offerings (e.g., an EC2 instance) through a VPC (Virtual Private Cloud), provided that all services stay in the same region. When S3 storage is paired with an EC2 instance through a VPC, data transfers between the services stay within the same environment and do not leave the network.

Using the Boto library, python users can write code to directly access the bucket, download and upload objects to the S3 bucket from their EC2 instance.
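A minimal sketch of this kind of access, assuming the boto3 package (the successor to the Boto library named above) is installed and AWS credentials are already configured on the instance. The bucket name, file paths and the `object_key` helper are hypothetical and only illustrate the pattern.

```python
import posixpath


def object_key(prefix, filename):
    """Build the object key (the object's name inside the bucket)."""
    return posixpath.join(prefix.strip("/"), filename)


def upload_file(bucket, local_path, key):
    """Upload a local file to the bucket as an object."""
    import boto3  # third-party package, imported lazily so object_key works without it
    boto3.client("s3").upload_file(local_path, bucket, key)


def download_file(bucket, key, local_path):
    """Download an object from the bucket to a local file."""
    import boto3  # third-party package, assumed to be installed on the instance
    boto3.client("s3").download_file(bucket, key, local_path)
```

For example, `upload_file("my-example-bucket", "responses.csv", object_key("survey/2015", "responses.csv"))` would store the local file as the object `survey/2015/responses.csv`, and `download_file` with the same bucket and key would copy it back to the instance.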

For S3, Amazon charges by gigabyte-month usage, as well as additional charges for sending and receiving data. There is also a per request charge.

In addition to S3, AWS also offers other storage options: EBS (block storage), Glacier (low-cost storage for infrequently accessed data, such as backups) and RDS (a relational database service).

To learn more about S3, visit the following site: http://aws.amazon.com/s3/

Figure 4. User Interface for S3.


Storage

Dropbox

Dropbox is a web based file hosting service that offers cloud storage built on S3. Dropbox provides users with a very user friendly interface to create storage on the cloud and upload files. Dropbox also creates a ‘Dropbox’ folder on your PC that is automatically synchronized with the storage on the cloud. Thus, users can create folders and save files on their PCs in the ‘Dropbox’ folder and access the same folders and content on the cloud. Folders and/or contents can be made private or public.

Signing up for a Dropbox account is very easy and can be done through the following website: https://www.dropbox.com/

R users can use the repmis library to write code to directly access a dropbox storage folder, as well as download and upload contents to the folder from their program.

Users are initially given 2GB of free space, which can be expanded to 1TB with a paid subscription. Dropbox also offers business accounts with additional security measures.

Figure 6. Web version of Dropbox.

Figure 5. Desktop version of Dropbox


Storage

Azure BLOB Storage

Azure Binary Large Objects or BLOB storage is a type of persistent cloud based data storage service offered by Microsoft Azure. BLOBs are analogous to files to be uploaded to the storage, while folders are known as Containers. Similar to S3 objects, a BLOB can be a text file, spreadsheet, picture, audio or video file. Azure also provides a web link to access the BLOB anywhere using an internet browser. Unlike hard drives associated with an Azure VM, files saved in BLOB storage are not lost when the VM is terminated. Moreover, BLOB storage can be attached to a VM as an additional drive.

Uploading and downloading BLOBs from the storage container requires separate client software. Several options are available, including Azure Storage Explorer. To connect to the container, you will need the storage account name (which you create when provisioning the storage), as well as an access key and endpoint (both generated by Azure). Both BLOBs and containers can be designated as public or private.
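As a small illustration of how the storage account, container and BLOB name fit together, the sketch below builds the web link for a BLOB. It assumes the default public endpoint pattern `https://<account>.blob.core.windows.net`; the account, container and file names are hypothetical, and the link only opens in a browser if the BLOB or its container is public.

```python
from urllib.parse import quote


def blob_url(account, container, blob_name):
    """Build the web link for a BLOB from its account, container and name."""
    # Assumed default endpoint pattern for a storage account
    endpoint = f"https://{account}.blob.core.windows.net"
    # quote() escapes spaces and other unsafe characters but keeps "/" separators
    return f"{endpoint}/{container}/{quote(blob_name)}"


print(blob_url("mystorageacct", "reports", "2015/q2 summary.csv"))
```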

Azure stores at least three copies of each BLOB to prevent data loss due to hardware failure. Users are given three options: Locally Redundant Storage (LRS), Zone Redundant Storage (ZRS) and Geo Redundant Storage (GRS). In LRS, the file copies are saved in the same data center; thus if the data center is destroyed, all data is lost. In ZRS, which is slightly more expensive, the copies are saved in a different data center within the same region. In GRS, which is the most expensive, three copies are saved in different data centers located within the same region, while three more copies are saved in a data center located in another region. GRS offers the greatest protection against data loss due to hardware failure.

Pricing for blob storage depends on several factors. These include the amount of data stored, requests against the service, and whether the data is being accessed by applications across different service regions.

Figure 8. Azure Storage Explorer interface.

Figure 7. Azure BLOB storage interface.


Language

SQL

Structured Query Language or SQL is a special purpose language originally designed for getting information into and out of databases. SQL works as a command language that allows users to perform a wide range of tasks, including creating data tables, joining data tables, selecting specific fields from a dataset, retrieving data based on certain conditions, adding new data to an existing dataset, performing basic mathematical calculations and so forth. SQL coding standards are maintained by the American National Standards Institute (ANSI) and International Organization for Standardization (ISO). Although many vendors have released their own ‘flavor’ of SQL, much of the standard remains common across systems.

Although originally designed to perform data management tasks for databases, SQL can also be used as a common language in various analytics programming software for performing data analytics. In our demo, we will show how to use SQL code to summarize data in three different applications (R, Python and SAS), even though they have their own distinct programming languages.
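To make the idea concrete, here is a minimal sketch of summarizing data with SQL from inside Python. It assumes the standard library's `sqlite3` module as the SQL engine, which may differ from the package used in the presentation's own demo code; the `sales` table and its rows are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: (region, sales amount)
rows = [("East", 120.0), ("East", 80.0), ("West", 200.0), ("West", 50.0), ("West", 50.0)]


def summarize(rows):
    """Load rows into an in-memory table and summarize them with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        query = """
            SELECT region, COUNT(*) AS n, SUM(amount) AS total, AVG(amount) AS mean
            FROM sales
            GROUP BY region
            ORDER BY region
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


summary = summarize(rows)
for region, n, total, mean in summary:
    print(region, n, total, mean)
```

The same `SELECT ... GROUP BY` statement could be passed largely unchanged to a SQL interface in R or SAS, which is what makes SQL useful as a common language across the three tools.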

Figure 9. Using SQL code in SAS programming.

Figure 10. Using SQL code in Python programming.

Figure 11. Using SQL code in R programming.

Tools

R

R is an open source programming language widely used by statisticians and data scientists for statistics and data analysis. R capabilities can be greatly enhanced through user created packages. These packages allow users to perform different functions, such as import/export data in various formats, run specialized statistical techniques, and generate high quality interactive graphics. For instance, the ‘sqldf’ package allows users to run SQL code in R, and the ‘repmis’ package allows users to retrieve data from folders on Dropbox. Although a core set of packages is included in the initial installation of R, users can access and utilize more than 5,800 additional packages.

Base R can be downloaded from the Comprehensive R Archive Network (CRAN) website: http://cran.r-project.org/.

Most people use R in conjunction with R Studio, which is a freely available Integrated Development Environment (IDE) for R. R Studio provides many benefits. It allows users to run R programs both via scripts and command lines, view table contents, generate logs, check the help menu, etc. – all from a single window. R studio can be downloaded from the following website: http://www.rstudio.com/products/rstudio/download/

Using R on AWS

There are two ways to use R on AWS:

1. Initiate an AMI with R already installed. The following website maintained by Louis Aslett contains AMIs preinstalled with R Studio. Just follow the instructions: http://www.louisaslett.com/RStudio_AMI/

2. Install R on an EC2 instance yourself. I have uploaded the instructions for installing R on an EC2 instance on GitHub: https://github.com/Nipun15/Install-R-Studio-AWS

Figure 12. R Studio Interface.


Tools

Python

Python is an open source general purpose programming language used for everything from system design to web development. Similar to R, Python’s capabilities can be greatly expanded by importing libraries, which allow Python to perform many specialized functions. Although popular among web developers, Python can also be used for performing complex analytics. NumPy is an extension to Python which supports multi dimensional arrays, and contains a large library of mathematical functions.

Python can be downloaded from the following site: https://www.python.org/downloads/

Python downloads come with a free Integrated Development Environment (IDE) called IDLE. IDLE allows users to execute both scripts and command lines.

NumPy can be downloaded from the following site: http://sourceforge.net/projects/numpy/files/

Using Python on AWS

The easiest way to use Python on AWS is to initiate an EC2 instance with Python preinstalled. AWS maintains several Linux AMIs with Python preinstalled (you can use Amazon Linux AMI 2015.03 HVM) and will let you choose one when initiating an instance. After launching the instance, AWS will let you connect to the instance and run Python via PuTTY or the default MindTerm terminal emulator.

Figure 13. Python executed via the MindTerm emulator.


Tools

SAS

SAS is a proprietary statistical software product offered by the SAS Institute. Since 1976, SAS has been one of the leading analytics tools in the market. Although the SAS Institute offers many analytical products, some geared towards specific markets, Base SAS remains its most versatile and popular product. Writing a basic program in Base SAS consists of two steps: the DATA step and the PROC (Procedure) step. The DATA step reads data and prepares it for subsequent DATA or PROC steps. A PROC, which is similar to R packages or Python libraries, is a collection of code that performs a specific task or function. The PROC SQL procedure, for instance, allows users to write SQL code to manipulate data.

Using SAS on AWS

SAS offers its University Edition free of charge on AWS.

To access SAS University Edition on AWS, you will first need to open an AWS account. After registering with AWS, search for SAS University Edition in the AWS market place (https://aws.amazon.com/marketplace/).

SAS provides a quick start guide to launch an EC2 instance preloaded with SAS. The login ID for the instance will be ‘sasdemo’ and the password will be the instance ID provided by AWS. You can stop the instance (which will preserve your work) or you can terminate the instance (which will delete your work). You can also launch as many instances as you want.

Figure 14. SAS Interface.


Visualization

Tableau

Tableau is a proprietary Business Intelligence (BI) suite of software tools, mostly known for its data visualization products that are used to build dashboards. Tableau uses a simple drag and drop interface to allow users to transform data into sophisticated interactive graphs without having to learn any programming skills. Tableau can pull data from a single file (e.g., a spreadsheet or text file) or read data from a database.

Tableau offers different products to create visuals and distribute content. The two main products offered are Tableau Desktop and Tableau Server. Tableau Desktop allows users to read in data and to create visualizations. Tableau Server is subsequently used to share the visualizations created in Tableau Desktop with authorized users or group of users. Tableau Desktop and Tableau Server can be downloaded from the following site: http://www.tableau.com/.

Using Tableau on AWS

Currently Tableau Desktop is not offered on AWS. Only Tableau Server is offered. You can download Tableau Desktop on your PC and connect to Tableau Server on the cloud. Graphs created on Tableau Desktop will be synchronized and appear on Tableau Server for distribution.

Instances preinstalled with Tableau Server are available in the AWS Market place (https://aws.amazon.com/marketplace/). You will need a fairly large instance to run Tableau Server on AWS (m3.2xlarge or bigger).

Unlike other software offered at AWS Marketplace, Tableau Server does not have a per hour usage fee. The monthly charge for using Tableau Server is $750. However, if your organization already has a Tableau license, you can use that license to register your instance.

Figure 16. Tableau Server on AWS Interface.

Figure 15. Tableau Desktop Interface.


Visualization

TIBCO Spotfire

Spotfire is another proprietary BI suite of products with data visualization capabilities similar to Tableau. Like Tableau, Spotfire allows users to use a simple drag and drop menu to create interactive graphs for dashboards. Spotfire can pull data from a single file or connect to a database. Spotfire Desktop is used to read in data and create visualizations, while Spotfire Platform is used to distribute content to authorized users or group of users.

Spotfire can be downloaded from the following web site: http://spotfire.tibco.com/.

Using Spotfire on AWS

Unlike Tableau, both Spotfire Desktop and its content delivery platform are available on AWS. Instances preloaded with Spotfire are available on the AWS Marketplace (https://aws.amazon.com/marketplace/). As with Tableau, there is a minimum instance size: you will need a medium instance (t2.medium) or bigger to launch Spotfire.

On AWS, Spotfire offers a web version and the full desktop version of its products. Both versions synchronize, so you have the same analysis on both the web version and the desktop version, regardless of where it was conducted. When using Spotfire on AWS, the login ID will be ‘spotfireadmin’ and the password will be the instance ID generated by AWS.

Pricing for Spotfire will vary depending on the size of your instance, beginning at $0.802/hr or $6,320/yr for a t2.medium instance.

Figure 17. Spotfire Desktop on AWS Interface.

Figure 18. Spotfire Web Interface.

Visualization

Shiny

Shiny is an R package which allows R users to transform data into interactive graphs. Like other R add-ins, Shiny is open source and free. Shiny is also relatively easy to program, and does not require the user to possess any HTML, CSS or JavaScript programming skills.

The first step to using Shiny would be to download the Shiny package and write the Shiny application. There are three components to a Shiny application: the user interface, the server interface and the launch program application. All three components need to be saved as R programs in the same folder. The user interface contains the code specifying the look and feel of graphical user interface (text, menus, buttons, sliders, etc.) and has to be saved as an R program named ‘ui.r’. The server interface reads in data and specifies the type of visualizations (bars, pie charts, maps, etc.) and has to be saved as an R program named ‘server.r’. The launch program application consists of code that initializes the ‘ui.r’ and ‘server.r’ programs and launches a window containing the interactive graph. The following website contains information, tutorials, examples and sample code for using Shiny: http://shiny.rstudio.com/

Content created by Shiny can be deployed and shared with other users online through your own server or the cloud using shinyapps.io (http://www.shinyapps.io/).

Using Shiny on AWS

Follow the steps in the ‘Using R on AWS’ section to install R Studio on an AWS instance. Launch R Studio and install the Shiny package. Create a folder on the AWS instance. Create the ‘ui.r’ and ‘server.r’ programs, and another R program to contain the launch program code, and save them in the folder just created. Run the launch program. Your Shiny app should launch on AWS. You can use shinyapps.io to publish your visualizations on the web.

Figure 19. Shiny interactive user web interface.


Version Control/Collaboration

Git/GitHub

You can register for a GitHub account and download Git from the following website: https://github.com/.

The following are some common Git commands/functions:

• Init – Create a new local repository.
• Add – Stage files to be included in the next commit to the local repository.
• Commit – Commit changes to the repository.
• Master – The default branch that Git creates when initiating a repository (i.e., the ‘main’ branch).
• Branch – A separate line of development that diverges from the master. Working on a branch will not affect the master or main branch.
• Merge – Merge all changes in the subsidiary branches into the master branch.
• Fork – Create a copy of a remote repository.
• Clone – Download a project with all its versions from a remote repository.
• Pull – Download files from a remote repository and merge into your current branch.
• Push – Upload all changes in your branch to the remote repository.
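The local commands in the list above can be tried end to end with a short script. This is a sketch only: it assumes the `git` command line tool is installed, drives it from Python through `subprocess`, and works in a throwaway temporary folder. The file name, branch name, commit messages and demo identity are all hypothetical. Fork, clone, pull and push are left out because they need a remote repository.

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run one git command inside repo and return its text output."""
    cmd = ["git", "-C", repo,
           "-c", "user.name=Demo User", "-c", "user.email=demo@example.com",
           *args]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def demo_workflow():
    """Walk through init, add, commit, branch and merge in a throwaway repository."""
    if shutil.which("git") is None:
        return None  # Git is not installed, so there is nothing to demonstrate
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")                                  # Init
        notes = os.path.join(repo, "notes.txt")
        with open(notes, "w") as f:
            f.write("first draft\n")
        git(repo, "add", "notes.txt")                            # Add
        git(repo, "commit", "-q", "-m", "Add notes")             # Commit
        # Name of the default branch (master or main, depending on configuration)
        default = git(repo, "rev-parse", "--abbrev-ref", "HEAD").strip()
        git(repo, "checkout", "-q", "-b", "experiment")          # Branch
        with open(notes, "a") as f:
            f.write("idea tried on the branch\n")
        git(repo, "commit", "-q", "-a", "-m", "Try an idea")
        git(repo, "checkout", "-q", default)
        git(repo, "merge", "-q", "experiment")                   # Merge
        return git(repo, "log", "--oneline")


history = demo_workflow()
print(history)
```

After the merge, the log of the default branch lists both commits, which shows that work done on the branch did not touch the default branch until it was merged.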

Git is an open source distributed revision control and source code management system which allows many people to work on the same project. A major advantage of Git is that it allows many people to work on the same code without having to share a common network. Git maintains a complete history of any changes made to your code and saves all your code revisions in a .git directory. GitHub, on the other hand, is a web based Git repository system which offers the capabilities of Git along with its own features, such as documentation, task management and bug tracking. While Git is a command line tool, GitHub provides a web based graphical interface for version control and code tracking. Users can create online repositories in GitHub, which can be made private or public. Using Git, you can push your code to a GitHub repository, where it can be downloaded and worked on by others, and uploaded again to the repository.

Figure 20. Git command tool interface.

Figure 21. GitHub Web interface.

Security

Encryption: AWS uses public key-private key cryptography to protect access to its services. In this method, data encrypted with the public key can only be decrypted with the matching private key. For EC2, AWS keeps the public key, while the user downloads the private key when creating a key pair and uses it to connect to an instance.

Firewall: AWS users can set up virtual firewalls called Security Groups. Security Groups define the remote login method (SSH, HTTPS, etc.) used to access your service, and also allow you to restrict inbound traffic to specific IP addresses. Azure offers similar services.

Access Control List (ACL): AWS users can use the Access Control List to set permissions for their service offerings. The ACL is a table that tells a computer OS what privileges a user or group has to a particular system object, such as a folder or file. System objects have a set of security attributes which allows a user to perform a specific operation, such as read, write or execute.

Virtual Private Cloud (VPC): AWS allows the use of VPCs to bundle AWS service offerings. A VPC is a virtual network dedicated to an account, which isolates the network from other networks, providing an additional security layer. Users can use VPCs to change security group membership, control both outbound and inbound traffic, and add an additional layer of access control through ACLs. Data flows within the VPC and is not exposed to external networks.

Active Directory: Active Directory is a directory service developed by Microsoft to authenticate and authorize users trying to access a network. AD is tightly integrated with Azure.

GovCloud: AWS offers GovCloud, which is an isolated AWS Region designed specifically for US government agencies to move their sensitive data into the cloud by addressing their specific regulatory and compliance requirements. Current GovCloud clients include the DoD, CIA, NASA, US Treasury, USDA, CDC and many others.
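The firewall behavior described under Security Groups can be pictured as an allow list: inbound traffic is refused unless some rule permits its port and source address. The sketch below is a local model of that idea using Python's standard `ipaddress` module, not a call to any AWS API; the rule set, address ranges and the `is_allowed` function are hypothetical.

```python
import ipaddress

# Hypothetical inbound rules: (protocol name, port, allowed source range in CIDR notation)
INBOUND_RULES = [
    ("SSH", 22, "203.0.113.0/24"),   # administrators' office network only
    ("HTTPS", 443, "0.0.0.0/0"),     # open to any IPv4 address
]


def is_allowed(source_ip, port, rules=INBOUND_RULES):
    """Return True if any rule permits traffic from source_ip to port."""
    address = ipaddress.ip_address(source_ip)
    return any(
        port == rule_port and address in ipaddress.ip_network(cidr)
        for _name, rule_port, cidr in rules
    )


print(is_allowed("203.0.113.7", 22))    # SSH from the office network
print(is_allowed("198.51.100.9", 22))   # SSH from an address outside the allowed range
```

Keeping the SSH rule narrow while leaving HTTPS open is a common pattern: anyone can reach the published service, but only known addresses can log in to the instance.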


Pricing

Pricing a cloud based environment is very complex, because there are many elements involved. The figures below should be used as estimates [1]. Moreover, users can negotiate deep discounts with providers.

Platform [2]

Linux OS      Small (1 CPU, 2GB RAM)    Large (16 CPU, 32GB RAM)
AWS           $0.026/hr ($151/yr)       $1.400/hr ($6,507/yr)
Azure         $0.085/hr ($745/yr)       $1.387/hr ($12,150/yr)

Windows OS    Small (1 CPU, 2GB RAM)    Large (16 CPU, 32GB RAM)
AWS           $0.036/hr ($232/yr)       $1.944/hr ($10,932/yr)
Azure         $0.148/hr ($1,296/yr)     $2.372/hr ($20,779/yr)

Storage

Block Storage [3]    Small (1 TB)    Large (10,000 TB)    I/O Costs
AWS EBS              $51             $1,024,000           $0.05 per million requests
Azure Blob           $25             $584,244             $0.036 per million requests

S3
• Storage: 1 TB at $0.03/GB per month ($30/month); 10,000 TB at $0.02875/GB per month ($287,500/month)
• Data Transfer IN to S3: $0/GB
• Data Transfer OUT of S3: same region $0/GB; other regions $0.020/GB
• Requests: PUT, COPY, POST, or LIST requests $0.005 per 1,000 requests; GET requests $0.004 per 10,000 requests

Azure SQL Database
• Small (2GB): $0.0067/hr (~$5/mo)
• Large (500GB): $5/hr (~$3,720/mo)

Dropbox
• Standard, 2GB: Free
• Dropbox Pro, 1 TB: $99/yr
• Dropbox for Business, 5 TB: $750/yr

[1] If no citation is provided, the cost figures have been retrieved from the service offering’s website.
[2], [3] "Cloud Vendor Benchmark 2015: Price and Performance Comparison Among 15 Top IaaS Providers". Cloud Spectator White Paper. April 2015.
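The S3 rates listed above can be combined into a rough monthly estimate. The sketch below is an illustration rather than an official calculator: the function name and parameters are hypothetical, it treats 1 TB as 1,000 GB (which is what the $30/month figure implies), and it expects the caller to pick the per-GB storage rate for the relevant tier.

```python
def s3_monthly_cost(storage_gb, price_per_gb, transfer_out_gb=0.0,
                    put_requests=0, get_requests=0):
    """Estimate one month's S3 bill in US dollars from the rates listed above."""
    storage = storage_gb * price_per_gb
    transfer = transfer_out_gb * 0.020        # data sent out to other regions
    puts = put_requests / 1_000 * 0.005       # PUT, COPY, POST or LIST requests
    gets = get_requests / 10_000 * 0.004      # GET requests
    return round(storage + transfer + puts + gets, 2)


# 1 TB stored, 100 GB sent to another region, 200,000 PUTs and 1,000,000 GETs
print(s3_monthly_cost(1_000, 0.03, transfer_out_gb=100,
                      put_requests=200_000, get_requests=1_000_000))
```

With these inputs the storage charge ($30) dominates: the transfer adds $2, the PUT requests $1 and the GET requests $0.40. For small projects, stored volume is usually the figure to watch.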


Pricing (Continued)

Tools

• R – Free
• Python – Free
• SAS University Edition – Free

Visualization

Tableau (AWS fees are separate and not included)
• Tableau Desktop – $999 per user for first year; $199/yr thereafter
• Tableau Desktop Professional – $1,999 per user for first year; $399/yr thereafter
• Tableau Online – $500/yr per user
• Tableau Server on AWS – $750 per month

TIBCO Spotfire (AWS fees are separate and not included)
• t2.medium instance (2 CPU, 4GB RAM) – $0.802/hr ($6,320/yr)
• r3.8xlarge instance (32 CPU, 244GB RAM) – $208.583/hr ($1,644,465/yr)

Shiny
Shiny charges when you use shinyapps.io to publish your visualizations. There are four tiers:
• Shinyapps.io Free Tier – $0 (only 5 applications allowed)
• Shinyapps.io Basic Tier – $440/yr
• Shinyapps.io Standard Tier – $1,100/yr
• Shinyapps.io Professional Tier – $3,300/yr

Version Control/Collaboration

• Git/GitHub – Free


Websites

List of websites cited in this presentation:

1. Amazon Web Services - http://aws.amazon.com/
2. Microsoft Azure - https://azure.microsoft.com/
3. AWS S3 - http://aws.amazon.com/s3/
4. Dropbox - https://www.dropbox.com/
5. Download Base R - http://cran.r-project.org/
6. Download R Studio - http://www.rstudio.com/products/rstudio/download/
7. Louis Aslett's website containing AMIs with preinstalled R Studio - http://www.louisaslett.com/RStudio_AMI/
8. Installing R Studio on AWS - https://github.com/Nipun15/Install-R-Studio-AWS
9. Download Python - https://www.python.org/downloads/
10. Download NumPy - http://sourceforge.net/projects/numpy/files
11. AWS Marketplace - https://aws.amazon.com/marketplace
12. Tableau - http://www.tableau.com/
13. Spotfire - http://spotfire.tibco.com/
14. Information, tutorials and examples for Shiny - http://shiny.rstudio.com/
15. Publishing Shiny apps online - http://www.shinyapps.io
16. Register on GitHub and download Git - https://github.com/


Q&A

All code used in this presentation has been uploaded to GitHub: https://github.com/Nipun15/Cloud_Presentation_Code

If you have any questions or comments, please contact:

Nipun Rahman
Associate
Booz | Allen | [email protected]
