Introduction to scientific computing infrastructures

Week #1: Basics of Scientific Computing Infrastructures

Hardi Teder, [email protected]

Lauri Anton, [email protected]

University of Tartu, February 12th 2014


Overview

● Administrative information

● Introduction to Computing Infrastructures

● Authentication procedures

Course background

● 2005 – 2009 Basics of Grid computing

– by Prof. Eero Vainikko

– gLite middleware

● 2010 – Basics of Grid and Cloud computing

– Dr Satish Narayana Srirama joined with the Cloud part

– gLite and ARC middlewares

● 2013 – same but

– Eero Vainikko is at the University of Bath, UK

– almost no gLite

● 2014 – Basics of Scientific Computing Infrastructures

– Cloud Computing becomes an independent course (6 ECTS)


Web and discussion

● Course web page

– http://courses.cs.ut.ee/2014/tatar

– Contacts

– Times and rooms

– Lecture slides

– Lab exercises and deadlines

● Mailing list

[email protected]

Course Structure

● Scientific Computing Infrastructures (3 ECTS)

● 8 Lectures

● 8 Labs

– Exercises are usually due within 2 weeks, but not always

● You can earn 100 points:

– 50p labs

– 50p exam

Course given by

● Lauri Anton

– Head of the Infrastructure Service at ITO.ut.ee

– 4 Lectures and Labs on Thursday

● Hardi Teder

– Director of EENet

– 4 Lectures and Labs on Wednesday

Course Outline

● Introduction to scientific computing infrastructures

● Managing compute jobs on cluster (SLURM)

● Managing compute jobs on grid (ARC)

● Information portals (resources, accounting, debugging)

● Data management (SE, Big Data)

● Special types of compute jobs (MPI, OpenMP, parametric jobs)

● Types of resources (CPU, GPGPU, Xeon Phi, Big Memory Machines, InfiniBand)

● Overview of the computing infrastructures in the world (PRACE, EGI, ...)


Introduction to Scientific Computing Infrastructures

● Who needs it?

● What is it?

Driving forces of computational science

● Environment simulation

● Climate changes

● Prediction of the amount of fish in Norwegian fjords

● Ice glacier flow simulation

● Solving fluid dynamics problems

● Weather predictions

● Design of hypersonic airplanes

● Design of more efficient cars

● Extremely quiet submarines

● Design of efficient and safe nuclear power stations

● Simulation of nuclear explosions

● Satellite data analysis

● Data analysis of DNA sequences

● Simulation of 3D protein molecules

● Simulation of global economic processes

● etc. in more and more fields

Driving forces of computational science

● Common to all examples: need for a larger than usual set of resources

– CPU cycles

– Data volumes

– Special devices producing data

– Parallel processing

● Common problems

– How to store data?

– How to move data?

– Which algorithms can be used?


Why the Grid?

● Science is becoming increasingly digital and needs to deal with increasing amounts of data.

– Large Hadron Collider (LHC), Radio telescopes, gene research

● Increasingly complicated simulations outgrew the resources that HPC centres could provide

● Collaboration

– Grid provides infrastructure for sharing resources.

[Figure: a CD stack holding 1 year of LHC data (~20 km) compared with Mt. Blanc (4.8 km), Concorde (15 km) and a balloon (30 km)]

Computing experiment

● Rendering 3D movie

– 1 h movie has 86400 frames (24 fps)

– Rendering 1 frame takes 1 h on 1 CPU core

– Rendering on 1 CPU core takes 3600 days

● Can be rendered in parallel

– Rendering each frame separately

Experiment stages

● Grid experiment stages

– Pre-processing

– Running the experiment: computing grid jobs

– Post-processing

● Rendering 3D movie

– Preparing input data for “rendering jobs” and generating job descriptions

– Sending the jobs, tracking them and collecting the results

– Gluing the frames into a movie
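
The three stages can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `render_frame` is a hypothetical stand-in that returns a file name instead of rendering anything, `scene.dat` is an invented input name, and a local thread pool plays the role of the grid.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(n_frames):
    """Pre-processing: build one job description per frame."""
    # "scene.dat" is a made-up input file name used only for illustration.
    return [{"frame": i, "scene": "scene.dat"} for i in range(n_frames)]


def render_frame(job):
    """Hypothetical rendering job: returns the frame number and a fake file name."""
    return job["frame"], f"frame-{job['frame']:05d}.png"


def run_jobs(jobs, workers=4):
    """Running the experiment: send every job out and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_frame, jobs))


def postprocess(results):
    """Post-processing: order the frames so they can be glued into a movie."""
    return [name for _, name in sorted(results)]


movie = postprocess(run_jobs(preprocess(n_frames=8)))
print(movie[:2])
```

On a real grid, the middle stage would submit job descriptions through the middleware and poll for completion; the other two stages typically stay on the user's machine.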

Parallel experiment

● Rendering 3D movie

– 1 h movie has 86400 frames (24 fps)

– Rendering 1 frame takes 1 h on 1 CPU core

– Rendering on 1 CPU core takes 3600 days

– Rendering on 12 CPU cores takes 300 days

● Rendering on 3600 cores takes only 1 day
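
The numbers on this slide follow from simple arithmetic, which the short sketch below reproduces. It assumes perfect parallelism: frames are independent, every frame takes exactly 1 hour, and there is no scheduling or data transfer overhead. The function name `render_days` is chosen here for illustration.

```python
import math

FPS = 24                 # frames per second of the movie
MOVIE_SECONDS = 60 * 60  # a 1 h movie
HOURS_PER_FRAME = 1      # rendering time of one frame on one core


def render_days(cores):
    """Days needed to render the whole movie on the given number of cores."""
    frames = FPS * MOVIE_SECONDS
    # Each core renders whole frames one after another.
    frames_per_core = math.ceil(frames / cores)
    return frames_per_core * HOURS_PER_FRAME / 24


for cores in (1, 12, 3600):
    print(cores, "cores:", render_days(cores), "days")
```

In practice the speed-up is lower, because jobs wait in queues and input and output files must be moved to and from the compute nodes.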

Grid experiments

● Running an experiment can consume thousands of CPU years

● The results may no longer be relevant by the time you get them

● Sometimes an experiment doesn't fit in an HPC centre

– Then you need Grid

Grid

Grid is a way to securely share distributed resources (computation, storage, etc.) so that users can collaborate within Virtual Organisations (VO).

[Figure: Grid middleware connecting visualization, supercomputers and PC clusters, data storage, sensors and experiments with desktop and mobile access over the Internet and other networks]

Middleware

● Tools and packages for building Grid:

– Globus Toolkit

– NorduGrid ARC

– gLite

– UNICORE

● EMI: European Middleware Initiative


Grid foundations


Resource management

● Computing resources

● Storage resources

● Other specific resources

Information services

● Maintain information about hardware, software, services and people participating in a Virtual Organization

– Should scale with the Grid's growth

– Sharing jobs

– Logging and accounting

– Monitoring


Data management

● Data access and transfer

– Simple, automatic multi-protocol file transfer tools, integrated with the Resource Management service

● Move data from local machine to remote machine, where the job is executed (input file staging)

● Move the output files from the remote computer to the local machines (output file staging)

● Pull executable from a remote location

– Secure, high-performance, reliable file transfer over modern WANs: GridFTP

● Data replication and management
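
Input and output file staging can be illustrated with a toy example. This sketch is an assumption-heavy stand-in, not how Grid middleware does it: two local directories play the local and remote machines, `shutil.copy` replaces a real transfer tool such as GridFTP, and the "job" merely upper-cases its input. The function name `run_with_staging` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(input_file: Path, remote_dir: Path, local_out: Path) -> Path:
    # Input file staging: copy the input to where the job will run.
    staged_in = Path(shutil.copy(input_file, remote_dir))

    # The "job": here it only upper-cases the input text.
    result = remote_dir / "result.txt"
    result.write_text(staged_in.read_text().upper())

    # Output file staging: bring the result back to the local machine.
    return Path(shutil.copy(result, local_out))


with tempfile.TemporaryDirectory() as tmp:
    local, remote = Path(tmp, "local"), Path(tmp, "remote")
    local.mkdir()
    remote.mkdir()
    (local / "input.txt").write_text("hello grid")
    output = run_with_staging(local / "input.txt", remote, local)
    staged_text = output.read_text()
    print(staged_text)
```

The point of integrating staging with resource management is that the user only names the input and output files in the job description; the copies in both directions happen automatically.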


Grid security

● Basic security:

– Authentication: Who are we on the Grid?

– Authorization: Do we have access to a resource/service?

– Protection: Data integrity and confidentiality

● Grid Security Infrastructure (GSI):

– Grid credentials: digital certificate and private key

● International trust

– IGTF

– EUGridPMA

AuthN and AuthZ

[Figure: authentication and authorization flow. Bob sends a certificate request to the Certification Authority (CA) and receives Bob's Grid certificate. A VO membership request goes to the VO Manager, with a VO Service and a VO Database behind the VO. From the User Interface (UI), voms-proxy-init is run. Grid resources (A) and (B) each have a VO account pool with an automatic mapping for Bob.]
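
The difference between authentication, authorization and the automatic mapping to a pool account can be shown with a toy model. Everything in this sketch is hypothetical: the class `GridResource`, the CA name `Example CA`, the VO name `demo-vo` and the account names are invented, and plain strings stand in for the X.509 certificates and VOMS attributes that a real Grid site would verify cryptographically.

```python
from dataclasses import dataclass, field


@dataclass
class GridResource:
    """A site that trusts some CAs and maps VO members to pool accounts."""
    trusted_cas: set
    vo_members: dict       # VO name -> set of member certificate subjects
    free_accounts: list    # pool accounts not yet assigned
    mapping: dict = field(default_factory=dict)

    def admit(self, subject, issuer, vo):
        # Authentication: was the certificate issued by a CA the site trusts?
        if issuer not in self.trusted_cas:
            raise PermissionError("unknown certification authority")
        # Authorization: is the user a member of the VO?
        if subject not in self.vo_members.get(vo, set()):
            raise PermissionError("not a member of this VO")
        # Automatic mapping: reuse the user's pool account or take a free one.
        if subject not in self.mapping:
            self.mapping[subject] = self.free_accounts.pop(0)
        return self.mapping[subject]


site_a = GridResource(
    trusted_cas={"Example CA"},
    vo_members={"demo-vo": {"CN=Bob"}},
    free_accounts=["demo001", "demo002"],
)
print(site_a.admit("CN=Bob", "Example CA", "demo-vo"))
```

Because the site maps users through the VO rather than through individual local accounts, the same certificate and VO membership give Bob access to every resource that supports the VO.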

Virtual Organizations

● Distributed people and resources

[Figure: scattered nodes labelled R]


Virtual Organizations

● People and resources

● Network connections

● Sharing resources

● Dynamic

● Fault tolerant

[Figure: nodes labelled R grouped into two virtual organizations, VO-A and VO-B]


Grid User Interfaces

● Linux command line UI

● GUIs

● Web portals

Estonian Grid

● 2004: Grid development started in Estonia

● Started with NorduGrid middleware

● 2005-2010: BalticGrid and BalticGrid-II projects for developing Grid in the Baltic States

● Estonian NGI in the European Grid Infrastructure (EGI)

ETAIS

● Estonian Scientific Computing Infrastructure

– To provide computing and storage resources for science and education

– 2011-2012

● Partners

– Tartu Ülikool (University of Tartu)

– Tallinna Tehnikaülikool (Tallinn University of Technology)

– Keemilise ja Bioloogilise Füüsika Instituut (National Institute of Chemical Physics and Biophysics)

– EENet

European Grid Infrastructure

● Central body for coordinating international Grid collaboration, standardization and interoperability

● 34 members

● EENet represents Estonian NGI


Cloud computing

● Gartner: “Cloud computing is a style of computing where massively scalable IT-related capabilities are provided ‘as a service’ across the Internet to multiple external customers”

● Why is it getting cloudy?

– Development

– Business model

– Management model

– Virtualization level

Grid vs Cloud

● Grid: rather

– Science

– Real machines

– Similar resources (Linux clusters, mostly SLC5)

– Many resource providers in the same group

– Collaboration in big VOs

– PKI

● Cloud: rather

– Business

– Virtual machines

– Different resources (Linux, Windows, etc.)

– One (or a few) resource providers per cloud

– Services for special groups

– AAI

What is Grid useful for?

● Resource sharing

● Simplified access to remote resources: computing, databases, software

● Aggregation of multiple geographically separate resources

● Flexibility: in case of a sudden need for a large amount of computing resources

● Reliability: network cuts, resource downtimes

● Collaboration: remote working groups and development teams

Thank you

● More information from:

– http://courses.cs.ut.ee/2014/tatar

– [email protected]