Research Computing Muataz Al-Barwani, Ph.D. December 2019

Research Computingasrenorg.net/eage19/sites/default/files/files/Center for...services exceeding the researchers’ and faculty expectations • Expand and adapt HPC services to satisfy


Research Computing

Muataz Al-Barwani, Ph.D.

December 2019


Outline

• Center for Research Computing

• High Performance Computing

• Computational Research at NYUAD

• Research Support Services


Center for Research Computing

The Center for Research Computing (CRC) at NYU Abu Dhabi offers a set of services through which we partner with researchers and support their use of technology as an enabler for their research activities.

Members of our team engage researchers and faculty across all academic divisions, centers and institutes at NYU Abu Dhabi.

We provide High Performance Computing (HPC), Research Application Hosting, Research Professional Services, Research Lab Support and Research Data Science Services.


High Performance Computing at NYUAD


Our goals are to:

• Deliver high-quality, efficient High Performance Computing services that exceed researchers’ and faculty expectations
• Expand and adapt HPC services to satisfy the changing High Performance Computing needs of researchers and faculty
• Maximize the utilization of the HPC by building awareness and promoting the use of HPC services to the research community


HPC Center of Excellence

• Collaboration with the Technology Industry
  • NYUAD is an HPE beta testing site
  • Testing and collaboration with 2CRSI
  • Network testing and collaboration with Mellanox and others
• Training and Knowledge Transfer
  • Executive training for the Navy staff on HPC management and operation
• HPC Research and Development
  • Development of HPC Enhanced Software Management Environment (presented at HPC Saudi 2017)
  • Development of a novel MPI optimization method (patent filed)


HPC Collaboration

• Locally & Regionally:
  • Established the UAE HPC Collaboration Network; members include Khalifa University and UAEU, as well as Ankabut and ADNOC
  • Worked closely with the American University of Sharjah (AUS) to establish an HPC Center at AUS
  • Collaboration with KAUST in KSA
  • Collaboration with Sultan Qaboos University (SQU) in Oman
  • Collaboration with OMREN in Oman


BuTinah: Our first HPC Cluster


What is BuTinah?

BuTinah was NYU Abu Dhabi’s first High Performance Computing (HPC) cluster, delivered in April 2012. It was named after Bu Tinah, a tiny protected nature reserve in the waters of Abu Dhabi.

In brief, it was a 70 TFLOPS cluster, ranked 397 on the TOP500 list in June 2012, built in 15 racks and consisting of:

• 512 Nodes
  • 6144 cores
  • 48 GB RAM each
• 8 High Memory Nodes
  • 96 cores
  • 192 GB RAM each
• 1 Very Large Memory Node
  • 32 cores
  • 1 TB of RAM
• 16 GPU nodes with
  • Single NVIDIA Tesla M2070Q
  • 96 GB RAM
• 16 Visualization nodes
  • NVIDIA Quadro FX 2800M
• All connected through 4xQDR InfiniBand (IB) @ 40 Gb/s
• With 900 TB of Storage (NAS and Distributed/Parallel)
• Tape backup system (server, tape drives and library)
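The node counts above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original slides. It assumes that "6144 cores" and "96 cores" are totals across their node groups, that "96 GB RAM" applies to each GPU node, and that 1 TB is 1024 GB. The visualization nodes are left out because no RAM figure is given for them. All names are illustrative.

```python
# BuTinah node groups as listed above: (label, node count, RAM per node in GB).
# Assumptions: "96 GB RAM" applies to each GPU node, and 1 TB = 1024 GB.
BUTINAH_NODE_GROUPS = [
    ("standard", 512, 48),
    ("high memory", 8, 192),
    ("very large memory", 1, 1024),
    ("GPU", 16, 96),
]


def cores_per_node(total_cores: int, nodes: int) -> int:
    """Split a group's total core count evenly across its nodes."""
    if nodes <= 0 or total_cores % nodes != 0:
        raise ValueError("core count does not divide evenly across nodes")
    return total_cores // nodes


def total_ram_gb(groups) -> int:
    """Sum RAM over all node groups, in GB."""
    return sum(count * ram_gb for _label, count, ram_gb in groups)


if __name__ == "__main__":
    print(cores_per_node(6144, 512))          # 12 cores per standard node
    print(cores_per_node(96, 8))              # 12 cores per high memory node
    print(total_ram_gb(BUTINAH_NODE_GROUPS))  # 28672 GB, i.e. 28 TB
```

Under these assumptions, both the standard and the high memory nodes work out to 12 cores each.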


BuTinah Utilization

[Figure: BuTinah monthly average utilization (%) by month, November 2012 to July 2016; vertical axis from 0 to 80.]


Dalma: NYUAD’s latest HPC Cluster


What is Dalma?

Dalma is NYU Abu Dhabi’s current HPC cluster, launched in 2016. It is named after Dalma Island, one of the oldest known permanent settlements in the UAE, with some of the earliest evidence of date palm cultivation, going back 7,000 years.

In brief, it is a 385 TFLOPS (12,000-core) cluster hosted at the NYUAD Data Center in Saadiyat, in 20 racks, consisting of:

• 432 Nodes, each with
  • 28 Broadwell cores
  • 128 GB RAM
• 3 Very Large Memory Nodes
  • 64 to 72 cores
  • 2 TB of RAM
• 10 GPU Nodes
  • 32 Nvidia Tesla V100 GPUs
• Over 3.5 PB of Parallel Storage (3.3 PB Lustre, 200 TB BeeGFS)
• Over 3 PB Archive (400 TB Disk and 2.5 PB Tape)
• All connected through a 1 to 1 non-blocking Mellanox EDR InfiniBand (IB) @ 100 Gb/s
• Database server and Viz nodes
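The "12,000 core" figure can be checked against the node list, and a theoretical peak can be estimated from it. This is a rough sketch, not the method used on the slides. It assumes 16 double-precision FLOPs per cycle per Broadwell core (AVX2 with two FMA units). The slide gives no clock speed, so `ASSUMED_CLOCK_GHZ` is a placeholder. The estimate covers the 432 standard nodes only and is not meant to reproduce the 385 TFLOPS figure, which may also count GPU and large-memory nodes.

```python
# Placeholder clock speed; the slide does not state one (assumption).
ASSUMED_CLOCK_GHZ = 2.0

# Double-precision FLOPs per cycle per core for AVX2 with two FMA units
# (assumed to apply to the Broadwell cores mentioned on the slide).
FLOPS_PER_CYCLE = 16


def total_cores(nodes: int, cores_per_node: int) -> int:
    """Total cores across identical nodes."""
    return nodes * cores_per_node


def peak_tflops(cores: int, clock_ghz: float,
                flops_per_cycle: int = FLOPS_PER_CYCLE) -> float:
    """Theoretical peak = cores x clock (GHz) x FLOPs per cycle, in TFLOPS."""
    gflops = cores * clock_ghz * flops_per_cycle
    return gflops / 1000.0


if __name__ == "__main__":
    dalma_cores = total_cores(432, 28)
    print(dalma_cores)  # 12096, consistent with "12,000 core"
    print(peak_tflops(dalma_cores, ASSUMED_CLOCK_GHZ))
```

Changing `ASSUMED_CLOCK_GHZ` shows how strongly the estimate depends on the clock speed, which is why the result is only a rough guide.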


Dalma Utilization


Dalma Growth

Academic Year | Compute | Storage - Scratch | Storage - Archive (Disk + Tape)
2016 | 236 Nodes | 900 TB | 1 PB
2017 | + 44 = 280 Nodes; Faculty owned | No change | No change
2018 | + 148 = 428 Nodes; + 10 GPU nodes* | No change | Additional Tapes
2019 | Visualization nodes; Add Year 4 support | + 2.5 PB | + 3 PB
2020 | Refresh: New Compute & Network | No change | No change


NYUAD’s next HPC Cluster

Coming soon!

Planned Launch 2020


New HPC

Reason for Refresh

• Dalma – end of life in 2020
  • Increase in cost of support
  • Obsolete technology
• No room for growth
  • Network limitation
  • Space & power limitations in data center
• Need more Compute
  • More projects
  • New Faculty/Researchers


Computational Research at NYUAD


HPC Research Publications


Publications up to Oct 2018: 82

• Chemistry: 10
• Climate Modeling: 25
• Computer Science: 3
• Engineering: 4
• Genomics: 16
• Mathematics: 5
• Physics: 16
• Social Science: 3
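The per-discipline counts above can be checked against the stated total of 82 and turned into shares. This is a small illustrative sketch; the dictionary simply restates the counts from the slide, and the function name is hypothetical.

```python
# Publication counts per discipline, as listed on the slide (up to Oct 2018).
PUBLICATIONS = {
    "Chemistry": 10,
    "Climate Modeling": 25,
    "Computer Science": 3,
    "Engineering": 4,
    "Genomics": 16,
    "Mathematics": 5,
    "Physics": 16,
    "Social Science": 3,
}


def shares_percent(counts: dict) -> dict:
    """Return each discipline's share of the total, in percent (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print(sum(PUBLICATIONS.values()))  # 82, matching the stated total
    for name, share in shares_percent(PUBLICATIONS).items():
        print(f"{name}: {share}%")
```

Climate Modeling is the largest category, at roughly 30% of the publications listed.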


Artificial Intelligence


Molecular Modeling & Simulations

Chemistry


Climate Modeling


Engineering

Above: Fluid injection. A simulation that ran on 700 cores for about 8 days and produced 325 GB of data, which after post-processing yields 6 seconds of flow visualization.

Left: Q-criterion Iso-Contours
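The cost of the fluid injection run described above can be put in core-hours, the unit HPC centers commonly use for accounting. This is a back-of-the-envelope sketch that treats "about 8 days" as exactly 8 days of continuous use of all 700 cores, which is an assumption. The function names are illustrative.

```python
HOURS_PER_DAY = 24


def core_hours(cores: int, days: float) -> float:
    """Core-hours consumed if all cores are busy for the whole period."""
    return cores * days * HOURS_PER_DAY


def gb_per_visualized_second(data_gb: float, seconds: float) -> float:
    """Raw output volume per second of final flow visualization."""
    return data_gb / seconds


if __name__ == "__main__":
    print(core_hours(700, 8))  # 134400 core-hours
    print(round(gb_per_visualized_second(325, 6), 1))  # about 54.2 GB
```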


100 Date Palm Project

Genomics

http://www.thenational.ae/uae/winners-of-khalifa-international-date-palm-award-announced

NovaSeq 6000


Social Science


Research Support Services


Research Computing Support

Research Application Hosting Services

• Network Storage (Research Storage)

• Co-location services

• Managed Server

• Managed Network

• Managed Storage

• Managed Application

Research Lab Support Services

• Transition support service

• Integration support service

Research Professional Services

• Research grant support

• Scientific applications support

• Training

• Programming and Algorithm development support

Research Data Science Services

• Data Analytics

• Data Visualization

• Data Management

• Big Data

• Artificial Intelligence (AI) Support


Compute & Storage

Compute & Storage @ Saadiyat

• IaaS (Infrastructure as a Service)
• 2 Virtual Hosts providing 100-200 VMs
• 64 Physical Workstation Blades
• Over 1.2 PB of Storage
• Backed up to disk and tape


Research Application Hosting

• Network Storage
  • Total ~ 1.2 PB
  • Allocated ~ 686 TB (56% of total)
  • Utilized ~ 540 TB (44% of total, 78% of allocated)
• Co-Location
  • Hosting 50 Servers / Workstations
• Managed Server
  • 64 Physical Blades
  • 119 various VMs
  • 50 hosted Servers / Workstations
• Managed Application
  • Core labs Scheduling Platform for the CTP
  • E.g. Ansys, Cadence and Synopsys for Engineering
  • GitHub
• Managed Network
  • Malware Lab for CCS
• Managed Storage
  • 3 NAS Storages
  • Backup & Archiving
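The network storage figures above (about 1.2 PB total, 686 TB allocated, 540 TB utilized) can be related to the percentages quoted alongside them. This is a minimal sketch under one stated assumption: 1 PB is taken as 1024 TB, which gives values close to the quoted 56%, 44% and 78%. The slide itself does not say which convention was used, and all figures are approximate.

```python
TB_PER_PB = 1024  # assumption: binary convention, not stated on the slide

TOTAL_TB = 1.2 * TB_PER_PB   # about 1228.8 TB
ALLOCATED_TB = 686.0
UTILIZED_TB = 540.0


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"allocated / total:    {percent(ALLOCATED_TB, TOTAL_TB):.1f}%")
    print(f"utilized / total:     {percent(UTILIZED_TB, TOTAL_TB):.1f}%")
    print(f"utilized / allocated: {percent(UTILIZED_TB, ALLOCATED_TB):.1f}%")
```

With a decimal convention (1 PB = 1000 TB) the first two ratios come out slightly higher, so the rounding on the slide is best read as approximate.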


Research & Development

Hyper Converged Infrastructure

• HCI Hardware
• Virtual Machines (VMware)
• Containers (Kubernetes)
• Management & Support (Rancher)


R & D Hardware

• 4 Compute + 1 GPU Node
• 97 TB Storage (NVMe) via vSAN
• 2.37 TB of Memory
• 2 x V100 GPU Cards
• 25 Gb Ethernet Backend Connectivity
• 36 CPU Cores per Node
• GPU Virtualization through NVIDIA GRID (for VMs and Containers)
• Minimum turnaround time for VM creation and operation through templates


Research Data Science Services

Data Science Services

• Data Analytics
• Data Management
• Big Data
• Data Visualization
• Artificial Intelligence


Research Data Analytics

• Data Management & Big Data
  • Developing customized data management plans
  • Organizing data (data collection and analysis)
  • Database Management and Development
  • Big Data handling and processing
  • High-memory, multi-processing computational support
• Data Analytics
  • Assistance with data analysis using available software or customized tools (e.g. Power BI, Tableau & QlikView)
  • Developing analysis software and customized pipelines
  • Statistical analysis of results


Research Data Science Services

• Artificial Intelligence (AI) Support
  • Advanced algorithm design, development and implementation
  • Parallelization and optimization of code
  • Implementation of state-of-the-art models, such as Image Recognition, Social Network Analysis, Recommendation Engines, and Speech and Text Mining, using Deep Learning frameworks


Research Data Visualization

• Data Visualization
  • Viz Wall (3x3 49” HD screens)
  • Visualization Resources
  • Viz tools (e.g. GIS & Web Maps, ggplot2 and Matplotlib, Power BI, Tableau & QlikView)
  • Visualization Professional Service


Research Data Science Services

Operating Model

• Support and Consulting
• Collaboration and Projects


Research Data Science Services

• Support and Consulting

We provide short-term support on the following:

• Research data organization: sharing and secure storage
• Research data processing / cleaning
• Research programming
• Selection and interpretation of statistical methods
• Research data visualization
• Using HPC & Cloud


Research Data Science Services

• Collaboration and Projects

We provide extended support and partnership over the lifecycle of a research project by embedding a data scientist in a research team.

We can:

• Design and implement a data analysis pipeline (including Data Analytics, Big Data and AI)
• Develop prototypes of research-focused software tools


Questions

Thank You!

Do you have any Questions?
