Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

Page 1: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

変通 [hen-tsoo] noun
1. Resourcefulness – the quality of being able to cope with a difficult situation
2. Adaptability – the ability to change (or be changed) to fit changed circumstances
3. Agility – the power of moving quickly and easily; nimbleness

INFINITELY SCALABLE CLUSTERS
Grid computing on public cloud

Page 2: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

WELCOME TO HENTSŪ

October 2016

Page 3: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

AGENDA
• Grid computing overview
• Trusted tools moving into public cloud
• Alternative cloud services

Page 4: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

SOME BACKGROUND

Page 5: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

TERMINOLOGY
• Public Cloud (AWS, Azure, Google)
• Private Cloud (Your datacentre)
• High Performance Computing (HPC)
• Grid computing
• Compute cluster
• MathWorks MATLAB
• CPUs / Processors / Cores
• RAM (processor storage)
• Disk (physical storage)
• IaaS (virtual hardware and networking)
• PaaS (software services)

Page 6: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

WHAT IS PUBLIC CLOUD?
“A service provider makes resources, such as virtual machines, applications and storage, available to the general public.”
• Utility model
• No contracts
• Shared hardware / multi tenant
• Self managed

Page 7: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

WHAT IS GRID COMPUTING?
Traditional resource limitations:
• Data store performance
• PC Processor / Memory / Storage
• Network bandwidth
The researcher may wait a long time for results.

• Grid computing moves the computational work from the PC to a cluster of servers
• The cluster processes the data on behalf of the researcher and returns the results
• Processing time is reduced
• Larger datasets can be tackled
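
The idea on this slide can be sketched in a few lines of Python: the work is split into independent tasks, handed to a pool of workers, and the results are gathered back for the researcher. This is only an illustration under stated assumptions. A local thread pool stands in for the cluster, and `analyse_chunk`, the chunk size and the worker count are hypothetical names and values, not anything from the presentation.

```python
from concurrent.futures import ThreadPoolExecutor


def analyse_chunk(chunk):
    """Hypothetical stand-in for the researcher's computation on one chunk."""
    return sum(x * x for x in chunk)


def split(data, chunk_size):
    """Cut the dataset into independent chunks, one task per chunk."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def run_on_grid(data, chunk_size=1000, workers=4):
    """Scatter chunks to the workers, then gather and combine the results."""
    chunks = split(data, chunk_size)
    # Assumption: a thread pool plays the role of the cluster's worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(analyse_chunk, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_on_grid(list(range(10_000))))
```

Threads keep the sketch portable, but they only show the scatter and gather shape of the workflow. A real grid would place each task on a separate server through a job scheduler.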

Page 8: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

KEY CONCEPTS
The challenges
[Chart: Number of tasks against Size of data, with regions labelled Big Data, High Throughput Computing, MapReduce and High Performance Computing]

The workflows
Ingest, Process, Analyse, Visualise, Store

Page 9: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

CHOICE OF TOOLS AND PLATFORMS

Page 10: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

TRUSTED TOOLS & PUBLIC CLOUD

Page 11: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

HARDWARE INFLEXIBILITY
• Buy 22 core processors at 2.2GHz or 6 core processors at 3.6GHz?
• Buy 8GB, 16GB or 32GB memory modules (RAM per core ratio)?
• Graphics Processing Units (GPUs)?
• How much local storage per server?
• What network devices between servers (32 or 48 port switches)?
• What size file server?

Grid usage varies depending on research priorities:
[Chart: Jobs per day by day of the week, Monday to Sunday; y-axis 0 to 120]

Page 12: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

PROFILING MATLAB RESOURCE USAGE
• MATLAB uses one processor core at a time (50% on a 2 vCPU machine). Use the Parallel Computing Toolbox for multicore PCs.
• MATLAB stores all data in RAM, very little I/O while processing
• I/O spike when writing out results

[Screenshot: SysInternals Process Explorer]

Page 13: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

MATLAB GRID WITH PUBLIC CLOUD
- Pay only for what you use
- Scale compute resource up AND down
- Minimal capital outlay on hardware
- Experiment with grid computing platforms quickly, cheaply and with no commitment

Page 14: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

A DAY IN A PUBLIC CLOUD CLUSTER
[Chart: Workers and Tasks in Queue over one day, 02:00 to 23:40; y-axis 0 to 180]

- Cluster consisting of 32x 4 cores
- Max 128 worker nodes
- Ramps up as jobs get submitted
- Tears down nodes when jobs finished
- Minimising costs when not in use
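
The ramp-up and tear-down behaviour listed on this slide can be sketched as a simple sizing rule. This is a hedged illustration, not the presenter's autoscaler. It assumes that "32x 4 cores" means 32 machines with 4 cores each, that each core runs one worker handling one task at a time, and that the cluster may shrink to zero machines when idle. The function names are hypothetical.

```python
import math

CORES_PER_NODE = 4   # assumed reading of the slide: 4 cores per machine
MAX_NODES = 32       # assumed reading of the slide: 32 machines, 128 workers


def desired_nodes(queued_tasks, running_tasks,
                  cores_per_node=CORES_PER_NODE, max_nodes=MAX_NODES):
    """Machines needed to give every queued or running task its own worker."""
    demand = queued_tasks + running_tasks
    if demand <= 0:
        return 0  # nothing to do: tear everything down to minimise cost
    return min(max_nodes, math.ceil(demand / cores_per_node))


def scaling_action(current_nodes, queued_tasks, running_tasks):
    """Return how many machines to add (positive) or remove (negative)."""
    return desired_nodes(queued_tasks, running_tasks) - current_nodes
```

Calling `scaling_action` on a timer with the scheduler's current queue length gives the behaviour described above: machines are added while jobs are waiting and removed once the queue drains. A production version would also need to avoid removing a machine that is still running a task.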

Page 15: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

IDEAL CLUSTER SIZE?
[Chart: Job run time in seconds (y-axis, 0 to 1400) against Cores (x-axis: 8, 16, 32, 64, 96, 128, 160, 192, 224)]

Ingest, Process, Analyse, Visualise, Store

Optimise other parts of the workflow?
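
The chart's data points are not recoverable here, but one common way to reason about the question on this slide is Amdahl's law: only the parallel part of a job gets faster as cores are added. The sketch below uses that model as an illustration, not as the presenter's analysis, and the 120 s and 9600 core-second figures are made up for the example.

```python
def run_time(cores, serial_seconds, parallel_seconds):
    """Amdahl-style estimate: only the parallel part shrinks with more cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return serial_seconds + parallel_seconds / cores


if __name__ == "__main__":
    # Hypothetical job: 120 s of work that does not parallelise (for example
    # ingest and store), plus 9600 core-seconds of processing that does.
    for cores in (8, 16, 32, 64, 96, 128, 160, 192, 224):
        print(f"{cores:>3} cores: {run_time(cores, 120, 9600):7.1f} s")
```

Under these made-up numbers, going from 8 to 64 cores cuts the estimate from 1320 s to 270 s, while going from 64 to 224 cores saves only about another 107 s. Past that point the fixed, non-parallel stages dominate, which is where optimising the other parts of the workflow pays off more than adding cores.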

Page 16: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

RUNNING MATLAB CLUSTER ON IAAS
AWS vCPUs are hyper-threaded™
Each vCPU is a hyper thread of an Intel Xeon core for 2nd generation instance types (M4, M3, C4, C3, R3, HS1, G2, I2, and D2)
https://aws.amazon.com/ec2/instance-types/

Azure does not overcommit memory or cores. vCPUs are physical cores. Azure does not use hyper-threading.
https://aws.amazon.com/ec2/instance-types/
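
The practical consequence of this slide is that the same vCPU count can mean a different number of physical cores on each provider. The sketch below turns that into arithmetic. The threads-per-core values are taken from the slide as it stood in 2016 and may not match current instance types, and the rule of one MATLAB worker per physical core is an assumption made for the example. The function names are hypothetical.

```python
def physical_cores(vcpus, threads_per_core):
    """Physical cores behind a VM size, given hardware threads per core."""
    if vcpus % threads_per_core:
        raise ValueError("vCPU count is not a multiple of threads per core")
    return vcpus // threads_per_core


# Threads per core as described on the slide (2016); check current provider
# documentation before relying on these values.
THREADS_PER_CORE = {"aws": 2, "azure": 1}


def matlab_workers(provider, vcpus):
    """Assumed sizing rule: one MATLAB worker per physical core."""
    return physical_cores(vcpus, THREADS_PER_CORE[provider])
```

With these values, `matlab_workers("aws", 8)` gives 4 and `matlab_workers("azure", 8)` gives 8, so an 8 vCPU machine would host half as many workers on the hyper-threaded platform.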

Page 17: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

GRID DEPLOYMENT OPTIONS
1. Infrastructure as a Service (IaaS) DIY
   Spin up a compute cluster on VMs for additional capacity and new workloads

2. Burst
   Use existing on premises compute cluster and burst on cloud as required

3. Software as a Service (SaaS)
   Software vendors and Managed Service Providers provide their own SaaS solutions. Pay for compute and application software per hour

4. Platform as a Service (PaaS)
   Cloud providers’ data analytics platform as a service: Google BigQuery & Datalab, Microsoft HDInsight, Amazon EMR
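
Option 2 (burst) above amounts to a placement rule: use the on premises cluster first and send only the overflow to cloud. A minimal sketch of that rule follows. The function name, the slot counts and the idea of a hard cap on cloud capacity are assumptions made for illustration, not details from the presentation.

```python
def place_jobs(pending_jobs, on_prem_free_slots, cloud_max_slots):
    """Fill on premises slots first, burst the overflow to cloud, queue the rest."""
    on_prem = min(pending_jobs, on_prem_free_slots)
    overflow = pending_jobs - on_prem
    # Assumption: cloud capacity is capped (for example by budget or quota).
    cloud = min(overflow, cloud_max_slots)
    return {"on_prem": on_prem, "cloud": cloud, "queued": overflow - cloud}
```

For example, `place_jobs(100, 16, 128)` keeps 16 jobs on premises and bursts the remaining 84 to cloud, while a quiet day such as `place_jobs(10, 16, 128)` uses no cloud capacity at all.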

Page 18: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

CLOUD HOSTED DATA AND ANALYTICS AS A SERVICE

Page 19: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

GOOGLE BIG DATA REFERENCE ARCHITECTURE

Page 20: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

WHAT IS BIGQUERY?
Hadoop-style “service that enables interactive analysis of massively large datasets”
• Distributed File System – Stores data that’s larger than can fit on a single machine
• Map Reduce – Distributes processing across multiple systems

http://blogs.forrester.com/mike_gualtieri/13-06-07-what_is_hadoop
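
The Map Reduce bullet above can be illustrated with the classic word count, written here as plain Python functions. This is a sketch of the programming model only. It runs on a single machine, does not use BigQuery or Hadoop APIs, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # On a real cluster each document (or file block) would be mapped on a
    # different machine; here the map calls simply run one after another.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))
```

Because each `map_phase` call touches only its own document and each key is reduced independently, both phases can be spread across many systems, which is the property the slide describes.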

Page 21: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

GOOGLE BIGQUERY AND DATALAB DEMO

Page 22: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

DON’T FORGET SECURITY
Security considerations:
• Secure transfer and storage of data and code
• Secure remote access to cloud hosted environment
• Secure authentication
  • Windows AD credentials
  • AWS IAM credentials
  • Google accounts
  • Microsoft accounts
• Auditing (who accessed what, who changed what)

Page 23: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

SUMMARY
• Traditional grid and HPC tools can benefit from moving into cloud
• Vast landscape of available tools
• Off-the-shelf PaaS offerings
• Integrations and ecosystems
• Cheap and very quick to experiment

Page 24: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

MORE INFORMATION?

Hentsu Ltd
1 Fore Street
London EC2Y 9DT

[email protected]
hentsu.com

Page 25: Infinitely Scalable Clusters - Grid Computing on Public Cloud - London

NEXT EVENT: JANUARY 2017
Intellectual Property (IP) security for Public Cloud Services
Securing mobile email and cloud based file storage