
The Science DMZ: Recent Developments

Eli Dart, Network Engineer
ESnet Science Engagement
Lawrence Berkeley National Laboratory

WRNP17

Belém, Brazil

May 16, 2017

© 2017, Energy Sciences Network

Overview

•  Science DMZ As Platform
•  Modern Research Data Portal
•  Pacific Research Platform
   –  PRP
   –  NRP
•  Note: This talk assumes you already understand the Science DMZ
   –  If you haven’t encountered the Science DMZ, several folks in RNP can help you, including Leandro Ciuffo and Alex Moura
   –  Or check out the fasterdata knowledge base:
      •  http://fasterdata.es.net/science-dmz/

2 – ESnet Science Engagement (engage@es.net) - 5/15/17


Science DMZ As A Platform

•  Once there are many Science DMZs in your network, more things become possible
•  Easy file transfer is good, but what else can we do?
   –  Update the architecture of data portals
   –  Build services between institutions
   –  Interconnect facilities
•  Several efforts underway to do these things


Science Data Portals

•  Large repositories of scientific data
   –  Climate data
   –  Sky surveys (astronomy, cosmology)
   –  Many others
   –  Data search, browsing, access
•  Many scientific data portals were designed 15+ years ago
   –  Single-web-server design
   –  Data browse/search, data access, user awareness all in a single system
   –  All the data goes through the portal server
      •  In many cases by design
      •  E.g. embargo before publication (enforce access control)


Legacy Portal Design

[Diagram labels: WAN, border router, firewall, enterprise network, perfSONAR hosts, 10GE links, portal server, filesystem (data store). Legend: browsing path, query path, data path. Portal server applications: web server, search, database, authentication, data service.]

•  Very difficult to improve performance without architectural change
   –  Software components all tangled together
   –  Difficult to put the whole portal in a Science DMZ because of security
   –  Even if you could put it in a DMZ, many components aren’t scalable
•  What does architectural change mean?


Example of Architectural Change – CDN

•  Let’s look at what Content Delivery Networks did for web applications
•  CDNs are a well-deployed design pattern
   –  Akamai and friends
   –  Entire industry in CDNs
   –  Assumed part of today’s Internet architecture
•  What does a CDN do?
   –  Store static content in a separate location from dynamic content
      •  Complexity isn’t in the static content – it’s in the application dynamics
      •  Web applications are complex, full-featured, and slow
         –  Databases, user awareness, etc.
         –  Lots of integrated pieces
      •  Data service for static content is simple by comparison
   –  Separation of application and data service allows each to be optimized


Classical Web Server Model

•  Web browser fetches pages from web server
   –  All content stored on the web server
   –  Web applications run on the web server
      •  Web server may call out to local database
      •  Fundamentally all processing is local to the web server
   –  Web server sends data to client browser over the network
•  Perceived client performance changes with network conditions
   –  Several problems in the general case
   –  Latency increases time to page render
   –  Packet loss + latency cause problems for large static objects

[Diagram: a web server at a hosting provider delivers web content across a transit network, over a long-distance / high-latency path, to a browser on residential broadband.]
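
The combined effect of latency and packet loss on large objects can be illustrated with the Mathis et al. approximation for a single standard TCP flow, throughput ≤ MSS / (RTT · sqrt(p)). The slide does not name this model; it is brought in here as an assumption, and the MSS, RTT, and loss values below are illustrative rather than measured.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on single-flow TCP throughput, in bits per second.

    Uses the Mathis et al. approximation MSS / (RTT * sqrt(p)) with the
    constant factor taken as 1. This is a rough model, not a prediction.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))


# Illustrative inputs (assumptions, not taken from the talk).
MSS_BYTES = 1460      # typical Ethernet-sized TCP segment payload
LOSS_RATE = 1e-4      # 0.01% packet loss

for label, rtt_s in [("short path, 2 ms RTT", 0.002),
                     ("long path, 80 ms RTT", 0.080)]:
    bound = mathis_throughput_bps(MSS_BYTES, rtt_s, LOSS_RATE)
    print(f"{label}: about {bound / 1e6:.1f} Mbit/s")
```

Because the bound scales as 1/RTT, the same loss rate costs 40 times more throughput on the 80 ms path than on the 2 ms path in this sketch, which is why large static objects suffer most on long paths.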


Solution: Place Large Static Objects Near Client

[Diagram: the web server at the hosting provider still reaches the browser on residential broadband across the transit network over a long-distance / high-latency path (WEB); a CDN data server reaches the same browser over a short-distance / low-latency path (DATA).]

•  CDN provides static content “close” to client
   –  Latency goes down
      •  Time to page render goes down
      •  Static content performance goes up
   –  Load on web server goes down (no need to serve static content)
   –  Web server still manages complex behavior
      •  Local reasoning / fast changes for application owner
•  Significant win for web application performance


Client Simply Sees Increased Performance

•  Client doesn’t see the CDN as a separate thing
   –  Web content is all still viewed in a browser
      •  Browser fetches what the page tells it to fetch
      •  Different content comes from different places
      •  User doesn’t know/care
•  CDNs provide an architectural solution to a performance problem
   –  Not brute-force
   –  Work smarter, not harder

[Diagram: two panels. In one, the browser reaches across the ’Net to the web server for web content (rich, slow) and to the CDN data server for data (simple, fast). In the other, the browser reaches across the ’Net to the web server alone.]


Architectural Examination of Data Portals

•  Common data portal functions (most portals have these)
   –  Search/query/discovery
   –  Data download method for data access
   –  GUI for browsing by humans
   –  API for machine access – ideally incorporates search/query + download
•  Performance pain is primarily in the data handling piece
   –  Rapid increase in data scale eclipsed legacy software stack capabilities
   –  Portal servers often stuck in enterprise network
•  Can we “disassemble” the portal and put the pieces back together better?
   –  Use Science DMZ as a platform for the data piece
   –  Avoid placing complex software in the Science DMZ


Legacy Portal Design

[Diagram repeated: the legacy portal design, in which the browsing, query, and data paths all go through the portal server. Portal server applications: web server, search, database, authentication, data service.]


Next-Generation Portal Leverages Science DMZ

[Diagram labels: WAN, border router, Science DMZ switch/router, firewall, enterprise network, perfSONAR hosts, 10GE links, four DTNs marked “API DTNs (data access governed by portal)”, filesystem (data store), portal server. Legend: data transfer path, portal query/browse path. Portal server applications: web server, search, database, authentication.]


Put The Data On Dedicated Infrastructure

•  We have separated the data handling from the portal logic
•  Portal is still its normal self, but enhanced
   –  Portal GUI, database, search, etc. all function as they did before
   –  Query returns pointers to data objects in the Science DMZ
   –  Portal is now freed from ties to the data servers (run it on Amazon if you want!)
•  Data handling is separate, and scalable
   –  High-performance DTNs in the Science DMZ
   –  Scale as much as you need to without modifying the portal software
•  Outsource data handling to computing centers or campus central storage
   –  Computing centers are set up for large-scale data
   –  Let them handle the large-scale data, and let the portal do the orchestration of data placement
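
The idea that a query “returns pointers to data objects” instead of the data itself can be sketched as follows. This is a minimal illustration under stated assumptions, not the design of any real portal: the class names, the `dtn01.example.org` host, the dataset name, and the user names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class DataPointer:
    """Location of a data object; the portal never handles the bytes."""
    dtn_endpoint: str   # hypothetical DTN host in the Science DMZ
    path: str           # path on the filesystem mounted by the DTN
    size_bytes: int


class Portal:
    """Hypothetical portal: search and access control, but no data service."""

    def __init__(self, catalog: Dict[str, List[DataPointer]],
                 authorized: Dict[str, Set[str]]) -> None:
        self._catalog = catalog        # dataset name -> pointers
        self._authorized = authorized  # dataset name -> users allowed to read

    def query(self, user: str, dataset: str) -> List[DataPointer]:
        """Return pointers for a dataset if the user is allowed to see it."""
        if dataset not in self._catalog:
            raise KeyError(dataset)
        if user not in self._authorized.get(dataset, set()):
            # Access control (e.g. an embargo) stays in the portal.
            raise PermissionError(f"{user} may not access {dataset}")
        return list(self._catalog[dataset])


catalog = {
    "climate-run-1": [
        DataPointer("dtn01.example.org", "/data/climate-run-1/part-000.nc", 11_000_000_000),
        DataPointer("dtn01.example.org", "/data/climate-run-1/part-001.nc", 9_500_000_000),
    ],
}
portal = Portal(catalog, authorized={"climate-run-1": {"alice"}})

for pointer in portal.query("alice", "climate-run-1"):
    # A client would hand these to a transfer tool that talks to the DTN directly.
    print(pointer.dtn_endpoint, pointer.path, pointer.size_bytes)
```

Because the portal only hands out locations, adding DTNs or moving the data store changes the catalog entries but not the portal code, which is the scaling property the slide describes.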


The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Freeway System”

NSF CC*DNI Grant $5M 10/2015-10/2020

•  PI: Larry Smarr, UC San Diego Calit2
•  Co-PIs:
   –  Camille Crittenden, UC Berkeley CITRIS
   –  Tom DeFanti, UC San Diego Calit2
   –  Philip Papadopoulos, UC San Diego SDSC
   –  Frank Wuerthwein, UC San Diego Physics and SDSC

PRP Provides Interoperability

•  Science DMZs at participating sites ensure interoperability
•  PRP engineers work to ensure they interoperate
   –  Globus data transfer between DTNs
   –  perfSONAR
•  Some variation in DTNs
   –  Some have FIONA DTNs
      •  FIONA == Flash I/O Network Appliance
      •  Designed by PRP engineers at UC San Diego
      •  https://fasterdata.es.net/science-dmz/DTN/fiona-flash-i-o-network-appliance/
   –  Some have DTNs connected to HPC storage
•  Key – they all interoperate, removing integration burden from scientists


PRP Science Drivers

•  Multiple science areas
   –  Astronomy and astrophysics
   –  Biomedical applications
   –  Life sciences
   –  Particle physics
   –  Virtual reality and data visualization
•  http://prp.ucsd.edu/


National Research Platform (NRP)

•  Replicate the PRP on a national scale
•  Interoperable, high-performance cyberinfrastructure
   –  Built to serve domain science
   –  Scale up to ~200 institutions
•  First workshop to be held this summer
   –  Domain science input
   –  Policy questions
   –  Architecture, scalability
   –  Include campus IT, regional networks, national networks, funding agencies, etc. in a common conversation.


Petascale DTN Project

•  Another example of building on the Science DMZ
•  Supports all data-intensive applications which require large-scale data placement
•  Collaboration between HPC facilities
   –  ALCF, NCSA, NERSC, OLCF
•  Goal: per-Globus-job performance at 1 PB/week level
   –  15 gigabits per second
   –  With checksums turned on, etc.
   –  No special shortcuts, no arcane options
•  Reference data set is 4.4 TB of astrophysics model output
   –  Mix of file sizes
   –  Many directories
   –  Real data!
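
The relationship between “1 PB/week” and “15 gigabits per second” is simple arithmetic, sketched below. The slide does not say whether a petabyte is decimal (10^15 bytes) or binary (2^50 bytes), so both are computed as assumptions; the function names are illustrative only.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604800


def sustained_rate_gbps(num_bytes: float, seconds: float) -> float:
    """Average rate, in gigabits per second, needed to move num_bytes in seconds."""
    return num_bytes * 8 / seconds / 1e9


def transfer_time_s(num_bytes: float, rate_gbps: float) -> float:
    """Time in seconds to move num_bytes at a sustained rate_gbps."""
    return num_bytes * 8 / (rate_gbps * 1e9)


# Assumption: the slide does not state which petabyte definition it uses.
decimal_pb_per_week = sustained_rate_gbps(1e15, SECONDS_PER_WEEK)
binary_pb_per_week = sustained_rate_gbps(2 ** 50, SECONDS_PER_WEEK)

# Reference data set size as stated on the slide (4.4 TB, taken as decimal).
reference_set_time = transfer_time_s(4.4e12, 15.0)

print(f"1 PB/week (10^15 bytes): {decimal_pb_per_week:.1f} Gbps")
print(f"1 PB/week (2^50 bytes):  {binary_pb_per_week:.1f} Gbps")
print(f"4.4 TB at 15 Gbps:       {reference_set_time / 60:.0f} minutes")
```

Under either definition the required sustained rate sits between 13 and 15 Gbps, so the 15 Gbps target is consistent with the 1 PB/week goal; at that rate the 4.4 TB reference set would move in well under an hour.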


Petascale DTN Project

[Diagram: transfer rates between the DTN clusters of four HPC facilities, March 2017, L380 data set.]

Endpoints:
•  alcf#dtn_mira (ALCF)
•  nersc#dtn (NERSC)
•  olcf#dtn_atlas (OLCF)
•  ncsa#BlueWaters (NCSA)

Rates shown on the links between endpoints: 10.0, 17.6, 14.8, 19.3, 17.4, 17.0, 32.4, 25.3, 18.3, 16.3, 24.1, 24.0 Gbps

Data set: L380
Files: 19260
Directories: 211
Other files: 0
Total bytes: 4442781786482 (4.4T bytes)
Smallest file: 0 bytes (0 bytes)
Largest file: 11313896248 bytes (11G bytes)
Size distribution:
   1 - 10 bytes: 7 files
   10 - 100 bytes: 1 files
   100 - 1K bytes: 59 files
   1K - 10K bytes: 3170 files
   10K - 100K bytes: 1560 files
   100K - 1M bytes: 2817 files
   1M - 10M bytes: 3901 files
   10M - 100M bytes: 3800 files
   100M - 1G bytes: 2295 files
   1G - 10G bytes: 1647 files
   10G - 100G bytes: 3 files
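
The “mix of file sizes” in the L380 data set can be summarized directly from the numbers on the slide. The sketch below copies the file counts and totals as given; the variable names and the choice of summary statistics are illustrative assumptions.

```python
# (size bin, file count), copied from the slide's size distribution.
size_histogram = [
    ("1 - 10 bytes", 7),
    ("10 - 100 bytes", 1),
    ("100 - 1K bytes", 59),
    ("1K - 10K bytes", 3170),
    ("10K - 100K bytes", 1560),
    ("100K - 1M bytes", 2817),
    ("1M - 10M bytes", 3901),
    ("10M - 100M bytes", 3800),
    ("100M - 1G bytes", 2295),
    ("1G - 10G bytes", 1647),
    ("10G - 100G bytes", 3),
]
TOTAL_FILES = 19260
TOTAL_BYTES = 4442781786482

counted_files = sum(count for _, count in size_histogram)
files_under_1m = sum(count for _, count in size_histogram[:6])    # bins below 1M bytes
files_1g_or_more = sum(count for _, count in size_histogram[-2:])  # bins from 1G bytes up
mean_file_bytes = TOTAL_BYTES / TOTAL_FILES

print(f"files counted in histogram: {counted_files} of {TOTAL_FILES}")
print(f"files under 1M bytes: {files_under_1m} ({files_under_1m / TOTAL_FILES:.1%})")
print(f"files of 1G bytes or more: {files_1g_or_more} ({files_1g_or_more / TOTAL_FILES:.1%})")
print(f"mean file size: {mean_file_bytes / 1e6:.1f} MB")
```

A large share of small files alongside a smaller number of multi-gigabyte files is what makes a data set like this a demanding test: per-file overhead and bulk throughput both matter.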


Thanks!

Eli Dart
dart@es.net
Energy Sciences Network (ESnet)
Lawrence Berkeley National Laboratory

http://fasterdata.es.net/

http://my.es.net/

http://www.es.net/

Extra Slides


What Is Science Engagement?

•  Technology people working with scientists to help solve problems
   –  Improve data transfer performance
   –  Improve data workflows (e.g. to require less human effort)
   –  Improve experiment operations
   –  …and more…
•  Using experience gained from helping scientists to improve cyberinfrastructure
   –  Network design
   –  Tool design
   –  System design


Engagement Is Important: Old Model

•  Scientist as integrator
   –  Requires scientists to discover new technologies
   –  Requires scientists to become expert in new technologies
   –  Requires scientists to assemble distinct technologies into an integrated solution that works for them
   –  Some scientists do this brilliantly – most do not


Engagement Is Important: New Model

•  Scientist as collaborator
   –  Technologists understand technology
   –  Technologists understand enough of the science to see how technology fits
   –  Technologists help scientists adopt a useful solution
   –  This is much more productive, and requires science engagement
