H2020 – EINFRA – 2015 – 1

Technical Note on e-Learning Services, Intermediate Version

Work package 5: VRE Infrastructure and Services Design and Development
Task 5.5: E-Learning application services

Author(s):            Pedro Gonçalves (Terradue), Fabrice Brito (Terradue)
Reviewer(s):          Helen Glaves (NERC), Paulo Nunes (SatCen)
Approver(s):          Cristiano Slivagni (ESA)
Authorizer:           Mirko Albani (ESA)
Document Identifier:  EVER-EST WP5-D5.5
Dissemination Level:  Public
Status:               Draft to be approved by the EC
Version:              1.0
Date of Issue:        09/12/2016




Document Log

Date        Author           Changes                                       Version  Status
14/10/2016  Pedro Gonçalves  ToC                                           0.1      Draft
28/10/2016  Pedro Gonçalves  Rationale and initial architectural design    0.2      Draft
10/11/2016  Pedro Gonçalves  Scope and Use Cases                           0.3
14/11/2016  Pedro Gonçalves  Data Agency, Jupyter Notebooks, Data Cubes    0.4      Draft
05/12/2016  Pedro Gonçalves  Updates from revision notes                   0.5      Draft
12/12/2016  Pedro Gonçalves  Update after final review                     1.0      Draft to be approved by EC


Table of Contents

1 Introduction
  1.1 Purpose of the document
  1.2 Background
  1.3 Document structure
2 Earth Science e-Learning Services
  2.1 Scope
  2.2 Operational scenarios
    2.2.1 Administrator of e-learning services
    2.2.2 Developer of e-learning modules
    2.2.3 Manager of e-learning courses
    2.2.4 Participant of e-learning courses
3 Components
  3.1 Overview
  3.2 Data Agency
    3.2.1 Data Catalogue
    3.2.2 Data Gateway
    3.2.3 Data Storing
  3.3 Web notebooks
    3.3.1 Jupyter notebook web application
    3.3.2 Kernels
    3.3.3 Jupyter notebook documents
  3.4 Data cube
4 Deployment
  4.1 Data access
  4.2 Provisioning
  4.3 Persistent storage
  4.4 Scalability
  4.5 Authentication
5 e-Learning Catalogue and Portfolio
  5.1 Sentinel-1 product information and metadata
  5.2 Sentinel-1 product subset
  5.3 Sentinel-1 change detection for flood extent
  5.4 Sentinel-2 vegetation indices


List of Figures

Figure 1 - Scope of the e-Learning Service, Components and respective Use Cases
Figure 2 - e-Learning service administrator use case
Figure 3 - e-Learning service module developer use case
Figure 4 - e-Learning module manager use case
Figure 5 - e-Learning course participant use case
Figure 6 - e-Learning Service Architectural Diagram: from Server to Application
Figure 7 - Data Agency services for facilitating the data flow to applications
Figure 8 - Displaying a Notebook file in the browser
Figure 9 - Simple interactive example in Jupyter Notebooks
Figure 10 - Converting a notebook to other output formats
Figure 11 - Earth Observation Data Cubes
Figure 12 - Loading data from the data cube in Jupyter
Figure 13 - Retrieving array data from the data cube
Figure 14 - Plotting a multi-band image from a data cube in Jupyter

List of Tables

N/A


Definitions and Acronyms

Acronym    Description
AGDC       Australian Geoscience Data Cube
AJAX       Asynchronous JavaScript and XML
API        Application Programming Interface
CEOS       Committee on Earth Observation Satellites
DAG        Directed Acyclic Graph
DOI        Digital Object Identifier
EBS        Elastic Block Store
EC2        Elastic Compute Cloud
EO         Earth Observation
ES         Earth Science
ESA        European Space Agency
EVER-EST   European Virtual Environment for Research - Earth Science Themes
FitSM      Standards for free and lightweight IT Management
FTP        File Transfer Protocol
FTPS       FTP over SSL
GDAL       Geospatial Data Abstraction Library
GUI        Graphical User Interface
HDFS       Hadoop Distributed File System
HTML       Hypertext Mark-up Language
HTTP       Hypertext Transfer Protocol
HTTPS      HTTP over TLS, HTTP over SSL, and HTTP Secure
IDE        Integrated Development Environment
ICT        Information and Communication Technology
IS         Identity Server
IT         Information Technology
ITSM       IT service management
JPEG       Joint Photographic Experts Group
JSON       JavaScript Object Notation
OGC        Open Geospatial Consortium
PDF        Portable Document Format
PNG        Portable Network Graphics
PSNC       Poznań Supercomputing and Networking Center
REST       Representational State Transfer
SAR        Synthetic Aperture Radar
SLA        Service Level Agreement
SNAP       Sentinel Application Platform
SSO        Single Sign-On
SVG        Scalable Vector Graphics
S3         Simple Storage Service
URL        Uniform Resource Locator
VM         Virtual Machine
VRC        Virtual Research Community
VRE        Virtual Research Environment
XML        Extensible Mark-up Language
YARN       Yet Another Resource Negotiator

Applicable Documents

Document ID                       Document Title
Grant_Agreement-674907-EVER-EST   EVER-EST Grant Agreement
EVER-EST DEL WP1-D1.1             Project Management and Quality Plan

Reference Documents

Document ID                       Document Title
EVER-EST DEL WP3-D3.1             VRE Detailed Definition of Use Cases
EVER-EST DEL WP5-D5.1             VRE Architecture and Interfaces Definition
FitSM                             Standards for free and lightweight IT Management, http://fitsm.itemo.org/fitsm


1 Introduction

1.1 Purpose of the document

The main purpose of this document is to describe the consolidated design and development of the e-Learning Services according to the specific implementations and requirements outlined in D5.1. It describes the open source components selected to support the VRCs' interactive exploration of EO data and to guide them in adapting their workflows to new data sources. It aims to cover the full EO data lifecycle: data access, data cleansing, exploration, reproducibility and information dissemination. This document is an intermediate version delivered in M14; the final version is to be delivered by M18.

1.2 Background

Earth Observation sensors are currently generating huge amounts of data that are not easily integrated into the processing chains of the EVER-EST VRCs. To improve their usage, it is necessary to train the VRCs on the potential of these data flows and to demonstrate their applicability to specific use cases. The design of the e-Learning Service uses Web Notebooks as a way to develop interactive EO data applications, written in any of a large number of programming languages, in the form of executable documents organized in units. It covers EO data science computing techniques that will support the training, guide future data scientists in overcoming the challenges of increasing EO data volumes, and support their ability to validate, analyse, visualize, store and curate the information.

1.3 Document structure

This introductory chapter gives readers who do not belong to the EVER-EST technical team the context of this document and its place in the overall WP5 activities. For a more general perspective, reading D5.1 is recommended.

Chapter 2 addresses the general scope of the e-learning services, their relation to the EVER-EST infrastructures and the targeted use cases.

Chapter 3 addresses the main components of the e-Learning Services, giving special consideration to the use of the Cloud Platform Data Agency to connect to external EO data storages, together with Jupyter components and Data Cubes.

Chapter 4 addresses deployment approaches and scalability features for the Notebooks and Data Cubes, considering the use of Docker containers.

Chapter 5 shows the initial e-learning services implemented for this intermediate version.


2 Earth Science e-Learning Services

2.1 Scope

The new generation of in-situ and space Earth Observation sensors is currently generating huge amounts of data that are not easily integrated into processing chains outside the ground segments of space agencies and very large institutions. The use of this data in e-Learning Services is limited to a few downloaded scenes: without the computing power and storage capacity needed to explore these new data flows, several processing steps have to be carried out before the data is in a usable form. To overcome this limitation, the main requirement of the EVER-EST e-Learning Services is to allow the development and deployment of virtual laboratories in which the VRCs can explore and execute the e-learning modules. These units will contain data resources, execution code, software libraries and documentation, and will empower the communities to explore the potential of EO data in their existing and future workflows.

Figure 1 - Scope of the e-Learning Service, Components and respective Use Cases

The approach followed in EVER-EST takes advantage of the latest developments in Information and Communication Technology (ICT). It facilitates the handling of large volumes of data and the creation of services and, most importantly, focuses on moving the processing to where the data is, together with new data exploitation capabilities. The availability of large data holdings accessible directly from web applications provides wider and easier access to EO data, and increases software sharing and data dissemination capabilities by empowering end users with relevant technologies. By delivering infrastructure, platform or software as a service, it is possible to support and optimise the use of VRE ICT resources through load balancing and provisioning. The EVER-EST Cloud Platform (Figure 1) provides virtual machines on demand from the ICT resources available at the PSNC infrastructure. These are customised for specific e-learning tasks and provisioned to build virtual laboratories that support users in seamlessly running the courses and their respective modules. The necessary prerequisites are bundled in the preconfigured VM with all the required software and data connectivity capabilities.

The scope of the activity described in this document is to improve EO data access in e-learning services using two unifying technologies: Data Cubes and Web Notebooks. Data Cubes are an effective way to store and access multi-dimensional arrays of values, commonly used to describe a time series of data. For interactively exploring data in a Data Cube, Web Notebooks allow research results to be presented online in executable form, so that others can immediately reproduce, validate and possibly extend them. By using these two technologies, the objective is to develop EO e-learning services with rich exploratory data analysis functionality that take full advantage of the ever-increasing volume of EO information.
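The idea of a data cube as a multi-dimensional array holding a time series can be illustrated with a minimal sketch. It assumes a toy cube indexed by (time, row, column) and stored as nested Python lists; the dates, values and function names are purely illustrative and do not come from any real product or from the Data Cube software itself.

```python
from statistics import mean

# Illustrative acquisition dates, one per layer of the toy cube.
times = ["2016-10-01", "2016-10-13", "2016-10-25"]

# cube[t][row][col]: one small 2x2 raster per acquisition date (made-up values).
cube = [
    [[0.25, 0.5], [0.75, 0.5]],
    [[0.5, 0.5], [0.75, 0.25]],
    [[0.75, 0.5], [0.75, 0.0]],
]


def pixel_time_series(cube, row, col):
    """Return the values of one pixel across all time steps."""
    return [layer[row][col] for layer in cube]


def temporal_mean(cube):
    """Collapse the time axis into a single raster of per-pixel means."""
    rows, cols = len(cube[0]), len(cube[0][0])
    return [
        [mean(layer[r][c] for layer in cube) for c in range(cols)]
        for r in range(rows)
    ]


series = pixel_time_series(cube, 0, 0)
average = temporal_mean(cube)
```

A real data cube would typically expose the same two access patterns (a time series for a location, or a reduction along the time axis) over far larger arrays held in dedicated array storage.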

2.2 Operational scenarios

This section describes the operational scenarios of the platform for four roles:

● Administrator of e-learning services
● Developer of e-learning modules
● Manager of e-learning courses
● Participant of e-learning courses

2.2.1 Administrator of e-learning services

This scenario supports an Administrator in setting up access to the data holdings needed to create data cubes and in managing the allocation of resources to users.

The Service Administrator activities focus on the Data Agency components (catalogue and data gateway) and on cloud management activities. The latter include configuring and monitoring VMs, deploying the necessary application packages and managing all the authorization layers and access roles. The Data Agency components deal with the preparation of data to serve data requests: the catalogue component discovers the necessary data, while the data gateway component facilitates access to the requested data and manages the different data policies and the optimization of the data flow. Both components of the Data Agency are presented further in Chapter 3.

Figure 2 - e-Learning service administrator use case


2.2.2 Developer of e-learning modules

This scenario supports a Developer in defining an e-learning module, including the selection and validation of the data holdings.

Figure 3 - e-Learning service module developer use case

The Module Developer activities focus on development and on the Data Agency components (catalogue and data gateway). The latter guide the developer in assessing the necessary data and in defining the data requirements of the application. Development includes requesting the data buckets, writing the code that runs the application and defining the validation procedures, and is supported by the VM resources and the Cloud Controller. The developer's dashboard enables the developer to check the status of, deploy or stop the different VM resources used to develop the Notebooks and Data Cubes applications. Both of these components are presented further in Chapter 3.

2.2.3 Manager of e-learning courses

This scenario supports a Manager in setting up an e-learning course, including the definition of the course modules and the assignment of participants. It also includes assessing the participants' feedback on the course contents and value.

The Course Manager activities include selecting and allocating the VM resources using the Cloud Controller and, through the developer's dashboard, checking the status of, deploying or stopping the different VM resources used to develop the Notebooks and Data Cubes applications. The Course Manager is also able to collect the inputs from the Course Participants and to assess the course potential and improvement paths.

Figure 4 - e-Learning module manager use case


2.2.4 Participant of e-learning courses

This scenario supports a Participant in attending an e-learning course, including access to the data holdings and the capability to interactively execute and test the code. The participant is also able to provide feedback and suggestions.

Figure 5 - e-Learning course participant use case

The participant is able to discover the available e-Learning Service modules and to interactively execute the Web Notebooks and the required Data Cubes. The Course Participant is also able to provide feedback and suggestions on the course contents.


3 Components

3.1 Overview

While EO datasets are becoming more available, technical challenges remain in efficiently storing, curating and serving them. Furthermore, as applications increasingly need multiple data sources with different dissemination and exploitation policies, users and developers need support to integrate them. Data discovery, access and integration can be achieved in multiple ways, and selecting a proper technology largely depends on the exploitation goals of the data repositories and catalogues. To overcome these challenges, the EVER-EST e-Learning Service uses a data management layer for fast indexing of dataset metadata documents, brought together by the Data Agency, to support two core technologies: Data Cubes [1] and Web Notebooks [2]. These technologies provide easy integration of EO data and a complete set of analysis tools for the e-Learning modules available to the final user. Their web context and the provision of services from both components allow participants to execute the courses interactively.

Figure 6 - e-Learning Service Architectural Diagram: from Server to Application

In this section both components are described and assessed, together with their potential to fully support the EO data lifecycle: data access, data cleansing, exploration, reproducibility and information dissemination.

E-learning services must provide common capabilities that allow users to perform data operations such as processing and re-processing, projection, visualization or analysis. In addition, they must be able to train users for each phase of their research activities, providing, for instance, the capability to search data or to extract single parameters or combined products from remote repositories. For this reason, the e-learning module must interface with data management tools that offer easy and seamless access to all relevant repository search and data retrieval operations, allowing extraction and distribution of single parameters or combined products on demand. To facilitate this, the Jupyter Notebooks take advantage of EO toolboxes (e.g. GDAL [3], SNAP [4]) and access data in HDFS, Docker data buckets and Data Cubes running on top of a Cloud-based cluster, as shown in Figure 6.

[1] http://www.datacube.org.au/
[2] http://jupyter.org/
[3] GDAL is an open source translator library for raster and vector geospatial data formats - http://www.gdal.org/
[4] SNAP is the common software platform of the Sentinel Toolboxes - http://step.esa.int/main/toolboxes/snap/


3.2 Data Agency

The Data Agency is a set of components providing services that facilitate the data flow (discovery and access). For this purpose, it includes a catalogue to store dataset metadata and perform complex queries. The catalogue provides a search engine capable of dealing with different types of queries (geographic, temporal, textual or numeric) and a distributed OpenSearch interface with diverse metadata search capabilities, together with online access points supporting multiple access protocols. It provides a framework to support easy discovery of EO data (remote sensing and in situ), using best practices for search services such as OpenSearch with the Geo, Time and EO extensions as defined by the CEOS (Committee on Earth Observation Satellites) [5], allowing standardized and harmonized access to the metadata and data of the world's satellite Earth observation data providers.

Figure 7 - Data Agency services for facilitating the data flow to applications
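An OpenSearch query with the Geo and Time extensions is built by filling a URL template with parameters such as `geo:box`, `time:start` and `time:end`. The following minimal sketch shows that substitution step only. The catalogue URL and the query-string names (`bbox`, `start`, `stop`) are hypothetical; a real client would read the template from the catalogue's OpenSearch description document.

```python
import re
from urllib.parse import quote

# Hypothetical template, in the style of an OpenSearch description document.
# A trailing "?" inside the braces marks an optional parameter.
TEMPLATE = (
    "https://catalogue.example.org/search?q={searchTerms?}"
    "&bbox={geo:box?}&start={time:start?}&stop={time:end?}&count={count?}"
)


def fill_template(template, values):
    """Substitute {name} and {name?} placeholders with URL-encoded values."""

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            return quote(str(values[name]), safe=",:")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing mandatory parameter: {name}")

    return re.sub(r"\{([^}?]+)(\??)\}", substitute, template)


url = fill_template(TEMPLATE, {
    "geo:box": "-10.0,35.0,5.0,45.0",       # west,south,east,north
    "time:start": "2016-10-01T00:00:00Z",
    "time:end": "2016-10-31T23:59:59Z",
    "count": 10,
})
```

Keeping the parameter semantics (`geo:box`, `time:start`) separate from the query-string names is what lets one client talk to catalogues that expose different URL layouts.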

3.2.1 Data Catalogue

The Data Agency Catalogue stores and queries EO product metadata in indexes and provides an OpenSearch interface for searching the datasets in a catalogue according to a data model. For dataset ingestion, it transforms the metadata feed into indexed JSON documents. For dataset querying, it uses the search engine to retrieve the documents in JSON and transforms them into a metadata feed. The transformation and query semantics are defined through plug-ins, enabling several metadata models and feed formats. It uses Elasticsearch [6], a search server based on Lucene [7] that provides a distributed, multitenant-capable full-text search engine with a REST interface and schema-free JSON documents.
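The plug-in approach to feed formats described above can be sketched as a registry of transformation functions applied to the JSON documents returned by a query. This is a toy model under stated assumptions: the in-memory list stands in for the search index, the metadata fields (`identifier`, `platform`, `start`) are hypothetical, and the "atom" output is Atom-like only, with namespaces and mandatory Atom elements omitted.

```python
import json
import xml.etree.ElementTree as ET

FEED_PLUGINS = {}  # feed format name -> function(list of documents) -> str


def feed_plugin(name):
    """Decorator registering a transformation plug-in under a format name."""
    def register(func):
        FEED_PLUGINS[name] = func
        return func
    return register


@feed_plugin("json")
def to_json_feed(documents):
    return json.dumps({"totalResults": len(documents), "entries": documents})


@feed_plugin("atom")
def to_atom_feed(documents):
    feed = ET.Element("feed")
    for doc in documents:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "id").text = doc["identifier"]
        ET.SubElement(entry, "updated").text = doc["start"]
    return ET.tostring(feed, encoding="unicode")


def search(index, predicate, feed_format):
    """Select matching documents, then hand them to the requested plug-in."""
    hits = [doc for doc in index if predicate(doc)]
    return FEED_PLUGINS[feed_format](hits)


# Hypothetical metadata documents standing in for the indexed catalogue.
INDEX = [
    {"identifier": "S1A_demo_001", "platform": "S1A", "start": "2016-10-01T05:00:00Z"},
    {"identifier": "S2A_demo_001", "platform": "S2A", "start": "2016-10-02T11:00:00Z"},
]

json_feed = search(INDEX, lambda d: d["platform"] == "S1A", "json")
atom_feed = search(INDEX, lambda d: d["platform"] == "S1A", "atom")
```

Adding a new feed format then means registering one more function, without touching the query path.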

3.2.2 Data Gateway

The Data Agency also contains a set of components, called the Data Gateway, which provide services to facilitate data access. This component exposes a data pipe service that determines the best way to deliver data to the user by finding the best location according to parameters such as the processing service and its location. Depending on the applicable data partnership, data is provided directly from the platform infrastructure (mirror) or by re-routing the user directly to the data provider facility (Figure 7).

[5] http://ceos.org/ourwork/workinggroups/wgiss/access/opensearch/
[6] Elasticsearch is a search engine that provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents - https://www.elastic.co/
[7] Apache Lucene is a free and open-source information retrieval software library - http://lucene.apache.org/


The Data Gateway behaves as a Data-as-a-Service platform that resolves the best location of the data and provides access to it based on a set of rules. The rule-based mechanism manages the data partnership, access policies and data processing scenario. This approach allows the data resource targets to evolve and ensures the long-term availability of the existing data resources as well as the addition of new ones. The general approach for the evolution of the data resources is based on the development or configuration of the Data Agency and Data Gateway platform components. The development may include new metadata harvesters, new correlation functions for advanced searches (e.g. cloud coverage for optical data) or new data access functions.
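A rule-based resolution of the data location can be sketched as an ordered list of rules in which the first matching rule decides where the request is served from. The sketch below is illustrative only: the request attributes, rule names and URLs are assumptions made for the example and do not describe the actual Data Gateway rule set.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DataRequest:
    dataset: str         # dataset identifier
    open_licence: bool   # may the platform redistribute this dataset?
    mirrored: bool       # is a copy held on platform storage?


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[DataRequest], bool]
    locate: Callable[[DataRequest], str]


# Ordered rules: the first one whose condition holds wins (hypothetical URLs).
RULES: List[Rule] = [
    Rule("platform-mirror",
         lambda r: r.open_licence and r.mirrored,
         lambda r: f"https://mirror.example.org/{r.dataset}"),
    Rule("provider-redirect",
         lambda r: True,
         lambda r: f"https://provider.example.org/{r.dataset}"),
]


def resolve(request: DataRequest, rules: List[Rule] = RULES) -> Tuple[str, str]:
    """Return (rule name, location) for the first rule that applies."""
    for rule in rules:
        if rule.applies(request):
            return rule.name, rule.locate(request)
    raise LookupError(f"no rule matches {request.dataset}")


served_locally = resolve(DataRequest("S1A_demo", open_licence=True, mirrored=True))
redirected = resolve(DataRequest("S1A_demo", open_licence=False, mirrored=True))
```

Because policies live in the rule list rather than in the resolution code, a new data partnership can be supported by adding or reordering rules.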

3.2.3 Data Storing

To ensure that all the e-Learning Service data requirements are met, EO data is provisioned directly from the Data Gateway and systematically archived on the PSNC storage using one of three methods for connecting to data providers:

● Remote access, either by redirecting the user or by piping the data request download;
● Caching the data resource on PSNC storage for a defined retention time;
● Mirroring the data resource on PSNC storage for an undefined time limit.
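The difference between the three methods above is essentially how long a local copy is kept. A minimal sketch, assuming a hypothetical `fetch` callable that retrieves a resource from its provider and an explicit `now` timestamp (in seconds) supplied by the caller:

```python
import enum


class Policy(enum.Enum):
    REMOTE = "remote"   # never keep a local copy
    CACHE = "cache"     # keep a copy for a defined retention time
    MIRROR = "mirror"   # keep a copy with no time limit


class LocalStore:
    """Toy stand-in for platform storage; not the actual PSNC implementation."""

    def __init__(self, fetch, retention_seconds):
        self._fetch = fetch
        self._retention = retention_seconds
        self._items = {}  # key -> (data, expiry time or None)

    def get(self, key, policy, now):
        if policy is Policy.REMOTE:
            return self._fetch(key)
        item = self._items.get(key)
        if item is not None:
            data, expiry = item
            if expiry is None or now < expiry:
                return data  # local copy is still valid
        data = self._fetch(key)
        expiry = None if policy is Policy.MIRROR else now + self._retention
        self._items[key] = (data, expiry)
        return data


fetched = []  # records every access to the (simulated) data provider


def fetch_from_provider(key):
    fetched.append(key)
    return f"content of {key}"


store = LocalStore(fetch_from_provider, retention_seconds=60)
store.get("scene-a", Policy.CACHE, now=0)    # fetched and cached
store.get("scene-a", Policy.CACHE, now=30)   # served from the cache
store.get("scene-a", Policy.CACHE, now=61)   # retention expired: fetched again
store.get("scene-b", Policy.MIRROR, now=0)   # fetched and mirrored
store.get("scene-b", Policy.MIRROR, now=10**9)  # still served locally
```

Passing `now` explicitly keeps the retention logic deterministic and easy to test; a deployment would more likely read the system clock.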

When applied, data mirroring covers all product types that are fetched and cached in the infrastructure, with an adjustable time window and caching policy. A Data Agent, responsible for all the systematic storing, coordinates the data access and the automatic data flow. This component is in charge of monitoring data sources for new datasets by periodically harvesting the external catalogues and data sources. All new datasets are automatically ingested in the catalogue together with their location.

To read and write data concurrently from a cloud application, technologies such as an Amazon AWS Elastic Block Store (EBS) disk attached to an EC2 instance are a possible solution and can be configured directly from the Platform Cloud Controller. To simply store persistent data on the EVER-EST infrastructure, the PSNC Cloud storage uses S3 [8] and data access occurs via a client tool such as s3cmd. Applications can use the client tool from their own premises or from a Virtual Machine instance. Nevertheless, S3 is an object store rather than a file system: it does not offer random-access updates within a file, it never adds partial objects to the storage space (a success response to an S3 operation means that the entire object was added to the S3 bucket), and it does not provide object locking. Also, if multiple write requests are received for the same object simultaneously, only the last object written will persist. The s3cmd client is a command line tool available to users for uploading, retrieving and managing data on the PSNC cloud storage. It is best suited to power users who are familiar with command line programs, and to batch scripts and automated backups; all its complexity should be hidden within an e-Learning Module.

The Developer Cloud Sandboxes service on the Platform also makes use of the Hadoop Distributed File System (HDFS). Each Hadoop Sandbox comes with an HDFS partition (typically 25 GB) complementing the classic local file system (also sized to 25 GB by default). This setting is the unit of processing space at simulation level, which will aggregate and scale within a cluster. The Application Workflow outputs that need to persist from one processing step to the next (from one job to another) must be published to the HDFS partition, so that the next operation in the DAG can be assigned its input by Hadoop, tapping into the stack of HDFS data to be processed until all of it has been consumed. Standard operations on HDFS use the 'hadoop dfs' utility, and the Developer uses the Hadoop Sandbox API, which provides the 'ciop-publish' and 'ciop-copy' wrappers on top of the 'hadoop dfs' utility, to handle these automated data management functions within a Hadoop workflow. Over a cluster of worker machines, each having an HDFS partition provisioned from the Hadoop Sandbox template, this delivers 'data locality': Hadoop sends the next processing unit to the worker holding the data (hence moving code to the data). With this approach, it is important to manage the standard output (stdout) of each Hadoop task appropriately so that it is passed correctly as input to subsequent Hadoop tasks.
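The role of stdout in chaining tasks can be illustrated with a small in-process model: each task prints one reference per result, and those printed lines become the input of the next task. This is a sketch of the principle only. It does not call Hadoop or the 'ciop-publish' and 'ciop-copy' wrappers, and the task names and `hdfs:///demo/...` references are hypothetical.

```python
import io
from contextlib import redirect_stdout


def run_task(task, input_lines):
    """Run one task over its input lines and capture everything it prints."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        for line in input_lines:
            task(line.strip())
    return buffer.getvalue().splitlines()


def subset(reference):
    # A real step would process the product and publish the result;
    # here only the reference to the (hypothetical) result is printed.
    print(f"{reference}.subset")


def calibrate(reference):
    print(f"{reference}.calibrated")


def run_workflow(tasks, inputs):
    """Feed the stdout of each task to the next one, in order."""
    lines = inputs
    for task in tasks:
        lines = run_task(task, lines)
    return lines


results = run_workflow(
    [subset, calibrate],
    ["hdfs:///demo/S1A_product_1", "hdfs:///demo/S1A_product_2"],
)
```

The practical consequence is that a task should write only result references to stdout and send diagnostics elsewhere (for instance to stderr), since any stray line would be interpreted as an input by the following task.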

[8] S3 is a set of web services interfaces developed by Amazon to store and retrieve data. It is becoming a de facto standard for data access in Cloud systems - http://docs.aws.amazon.com/AmazonS3/latest/dev/Welcome.html


3.3 Web notebooks

The e-learning service delivers a web application that allows platform users to create and share documents that contain live code, equations, visualizations and explanatory text. Typical uses for such documents include data cleaning and transformation, numerical simulation, statistical modelling and machine learning.

The Jupyter Notebook is an interactive computing environment that enables users to author notebook documents that include live code, interactive widgets, plots, narrative text, equations, images and video. The Jupyter Notebook provides a complete and self-contained record of a computation that can be converted into various formats and shared with others. It combines three components:

1. The Jupyter Notebook web application: an interactive web application for writing and running code interactively and for authoring notebook documents.

2. The Jupyter Kernel: a separate process, started by the notebook web application, that runs users' code in a given language and returns the output to the notebook web application. The kernel also handles things like computations for interactive widgets, tab completion and introspection.

3. The Jupyter Notebook documents: self-contained documents that contain a representation of all content visible in the notebook web application, including inputs and outputs of the computations, narrative text, equations, images and rich media representations of objects. Each notebook document has its own kernel.
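The division of labour between the web application and the kernel can be modelled with a toy, in-process "kernel" that receives the source of one cell at a time, keeps state between cells and returns the captured output. This is only an illustration: real Jupyter kernels are separate processes that exchange messages with the web application, and the class and field names below are assumptions made for the example.

```python
import io
from contextlib import redirect_stdout


class ToyKernel:
    """In-process stand-in for a kernel: runs cells in one shared namespace."""

    def __init__(self):
        self._namespace = {}
        self.execution_count = 0

    def execute(self, source):
        """Run the code of one cell and return what it printed."""
        self.execution_count += 1
        buffer = io.StringIO()
        with redirect_stdout(buffer):
            exec(source, self._namespace)
        return {"execution_count": self.execution_count,
                "stdout": buffer.getvalue()}


kernel = ToyKernel()
kernel.execute("a = 10")            # first cell: defines a variable
reply = kernel.execute("print(a)")  # second cell: uses state left by the first
```

The second cell sees the variable defined by the first because both run in the same namespace, which is the behaviour notebook users rely on when executing cells one after another.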

The Notebook web application stores the code, executes it and displays its output together with Markdown9notes, in an editable document. When saved, the result is sent from the browser to the notebook server byHTTP(S),which saves it on disk as a JSON filewith a .ipynb extension (Figure 8). Theweb application, not thekernel, isresponsibleforsavingand loadingnotebooks,so it ispossibletoeditnotebookseven if thekernel forthatlanguageisnotavailable.Thekerneldoesn’tknowanythingaboutthenotebookdocumentitselfasitjustgetscellsofcodetoexecutewhentheuserrunsthem.

Figure 8 - Displaying a Notebook file in the browser
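Because a notebook is plain JSON, the save and load path described above can be reproduced without any kernel. The following is a minimal sketch using only the Python standard library and assuming the version 4 notebook layout; the file name and cell contents are invented for illustration.

```python
import json
import os
import tempfile

# A minimal notebook in the version 4 JSON layout: format version numbers,
# top-level metadata and a list of cells.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Demo"},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": "a = 10\nprint(a)",
        },
    ],
}

# Save it as a JSON text file with the .ipynb extension.
path = os.path.join(tempfile.mkdtemp(), "example.ipynb")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    json.dump(notebook, f, indent=1)

# Load it back; no kernel is involved at any point.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

The code cell is stored as text only; nothing is executed until a kernel is attached and the user runs the cell.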

3.3.1 Jupyter notebook web application

The Jupyter notebook web application is the GUI with IDE capabilities offering a powerful 'scratchpad' paradigm for the creation and management of live computational documents with rich media. Users can execute blocks of code (provided by a given kernel) in the browser with automatic syntax highlighting, indentation and tab completion/introspection. First and foremost, the web application is an interactive environment for writing and running code directly in an associated kernel. For example, the code below is displayed as shown in Figure 9:

    a = 10
    print(a)

⁹ Markdown is a lightweight markup language with plain text formatting syntax designed so that it can be converted to HTML and many other formats - https://daringfireball.net/projects/markdown/

Figure 9 - Simple interactive example in Jupyter Notebooks

In a code cell it is possible to edit and write new code, with full syntax highlighting and tab completion. By default, the language associated with a code cell is Python but, depending on the kernel, other languages such as Julia and R can be handled interactively. When a code cell is executed, its code is sent to the kernel associated with the notebook and the results that are returned from this computation are displayed in the notebook as the cell’s output. The output is not limited to text; many other forms of output are also possible.

The results of the computation are attached to the code that generated them as rich media representations, such as HTML, LaTeX¹⁰, PNG, SVG, PDF, etc. Besides these rich media representations, users can create and use interactive JavaScript widgets, which bind interactive user interface controls and visualizations to reactive kernel-side computations. Alongside the code, users can keep notes and other text by changing the style of a Notebook cell from "Code" to "Markdown". The notes can be organized in a hierarchical structure with different levels of headings and authored as narrative text using the Markdown mark-up language.

The Notebook dashboard is the home page and its main purpose is to display the notebooks and files in the current directory. Notebooks and files can be uploaded to the current directory by dragging a notebook file onto the notebook list. The notebook list shows green “Running” text and a green notebook icon next to running notebooks. Notebooks remain running until explicitly shut down; closing the notebook’s page is not sufficient. To shut down, delete, duplicate or rename a notebook, an array of controls appears at the top of the notebook list; the same controls apply to directories and files where applicable. The main features of the web application can be summarized as:

● In-browser code editing, with automatic syntax highlighting and indentation, together with tab completion and introspection;

● Code execution from the browser, with the results of computations attached to the code which generated them;

● Display of the results of computations using rich media representations, including publication-quality figures rendered by the matplotlib library that can be included inline;

● In-browser editing of rich text using the Markdown mark-up language, which can provide commentary for the code and is not limited to plain text;

● The ability to easily include mathematical notation within Markdown cells using LaTeX, rendered natively by MathJax.
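As an illustration of the rich media representations listed above, an object can offer an HTML view of itself by defining a `_repr_html_` method, which IPython looks for when displaying a value. The class below is a hypothetical helper written for this sketch, not part of any EVER-EST module.

```python
from html import escape


class BandSummary:
    """Hypothetical helper that renders per-band mean values as an HTML table."""

    def __init__(self, stats):
        # stats maps a band name to its mean value
        self.stats = dict(stats)

    def _repr_html_(self):
        # IPython calls this method, when present, to obtain an HTML
        # representation of the object for display in the notebook.
        rows = "".join(
            "<tr><td>{}</td><td>{:.2f}</td></tr>".format(escape(name), mean)
            for name, mean in sorted(self.stats.items())
        )
        return "<table><tr><th>Band</th><th>Mean</th></tr>{}</table>".format(rows)


summary = BandSummary({"red": 0.12, "nir": 0.45})
html = summary._repr_html_()
print(html)
```

In a notebook, leaving `summary` as the last expression of a code cell would show the rendered table rather than the default text representation.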

¹⁰ LaTeX is a document preparation system for high-quality typesetting. It is most often used for medium-to-large technical or scientific documents but it can be used for almost any form of publishing - https://www.latex-project.org/about/

3.3.2 Kernels

Through Jupyter’s kernel and messaging architecture, the Jupyter notebook allows code to be run in a range of different programming languages. For each notebook document that a user opens, the web application starts a kernel that runs the code for that notebook. Kernels are programming language specific processes that run independently and interact with the Jupyter applications and their user interfaces. Each kernel is capable of running code in a single programming language and there are kernels available in several languages. The “Kernel Zero” is IPython and it comes as a dependency of Jupyter. The IPython kernel can be thought of as the reference implementation, but the number of kernels supported by Jupyter is growing, with other languages available such as Julia, R, Ruby, Haskell, Scala, node.js and Go¹¹. The notebook provides a simple way for users to pick which of the kernels is used for a given notebook.

The notebook web server is written in Python and allows server extensions to be written as Python modules. Several popular data science Python libraries are already available, such as NumPy, SciPy, Matplotlib and Pandas, alongside more advanced libraries such as:

● Scikit-learn contains simple and efficient tools for data mining and data analysis and implements a wide variety of machine learning algorithms and processes to conduct advanced analytics;

● Statsmodels allows users to explore data, estimate statistical models, and perform statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions and result statistics is available for different types of data and each estimator;

● NLTK allows the development of programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing and semantic reasoning, as well as an active discussion forum.
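Returning to the kernels described at the start of this section: each installed kernel is announced to Jupyter through a kernel spec, a directory holding a `kernel.json` file that names the command used to start the kernel. The sketch below builds such a file for the IPython kernel; the display name is illustrative, and the launch command may differ between ipykernel versions.

```python
import json

# Contents of a kernel.json file for the IPython kernel. Jupyter replaces
# {connection_file} with the path of a file describing the ports and key
# the kernel must use to talk to the notebook server.
kernel_spec = {
    "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "display_name": "Python 3",  # label shown in the kernel selection menu
    "language": "python",
}

kernel_json = json.dumps(kernel_spec, indent=2)
print(kernel_json)
```

The kernels installed on a given server can be listed with `jupyter kernelspec list`.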

3.3.3 Jupyter notebook documents

As described in the previous sections, Jupyter notebook documents contain the inputs and outputs of an interactive session as well as narrative text that supports the code but is not meant for execution. Rich output generated by running code, including HTML, images, video and plots, is embedded in the notebook, which makes it a complete and self-contained record of a computation. Notebook documents are files on the local file system with a “.ipynb” extension, so users can apply classical workflows such as organizing them into folders or remote repositories to share them with others. The notebook document format is JSON data with binary values encoded in “base64”. This allows Jupyter notebook documents to be read and manipulated programmatically by any programming language and, as JSON is a text format, notebook documents are version control friendly.

Jupyter notebook documents consist of a linear sequence of cells. There are four basic cell types:

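● Code cells: Input and output of live code that is run in the kernel;
● Markdown cells: Narrative text;
● Heading cells: 6 levels of hierarchical organization and formatting (in version 4 of the notebook format these are written as headings inside Markdown cells);
● Raw cells: Output text that is included without modification.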

The Markdown cells are used to document the computational process in a literate way, alternating descriptive text with code, using rich text. The Markdown language provides a simple way to perform this text mark-up, that is, to specify which parts of the text should be emphasized (italics), set in bold, formed into lists, etc. When a Markdown cell is executed, its Markdown source is converted into the corresponding formatted rich text. Markdown allows arbitrary HTML code for formatting. Within Markdown cells, it is possible to include mathematics in a straightforward way, using standard LaTeX notation that is automatically rendered in the HTML output as equations with high quality typography. Raw cells provide a place in which output can be written directly; they are not evaluated by the notebook.
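The statement that binary values are stored as base64 inside JSON can be made concrete with a single code cell. This sketch assumes the version 4 notebook layout, in which outputs carry a `data` dictionary keyed by MIME type; the cell source is a hypothetical call and the image bytes are only a stand-in.

```python
import base64
import json

# Stand-in for the bytes of a PNG image produced by a plot
# (only the 8-byte PNG signature, not a complete image).
png_bytes = b"\x89PNG\r\n\x1a\n"

code_cell = {
    "cell_type": "code",
    "execution_count": 1,
    "metadata": {},
    "source": "plot_scene()",  # hypothetical user code
    "outputs": [
        {
            "output_type": "display_data",
            "metadata": {},
            # Binary payloads are stored as base64 text keyed by MIME type.
            "data": {"image/png": base64.b64encode(png_bytes).decode("ascii")},
        }
    ],
}

# The cell serializes to plain JSON text...
serialized = json.dumps(code_cell)

# ...and any language with a JSON parser can recover the original bytes.
restored = json.loads(serialized)
decoded = base64.b64decode(restored["outputs"][0]["data"]["image/png"])
print(decoded == png_bytes)
```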

¹¹ A complete list of the supported kernels is available at https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages

Jupyter notebook documents available from a public URL or on GitHub can be shared via the nbviewer service. This service loads the Jupyter notebook document and renders it as a static web page. The resulting web page may thus be shared with others without their needing to install the Jupyter Notebook.

The nbconvert tool in Jupyter converts notebook files to other formats, such as HTML, LaTeX or reStructuredText¹². As shown in Figure 10, this conversion goes through a series of steps: pre-processors modify the notebook in memory (for example by running the code in the notebook and updating the output), an exporter converts the notebook to another file format using templates, and post-processors work on the file produced by exporting. The nbviewer website uses this tool with the HTML exporter. When given a URL, it fetches the notebook from that URL, converts it to HTML and serves the HTML back.

Figure 10 - Converting a notebook to other output formats
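The three stages of the conversion can be illustrated with a toy pipeline. This is not nbconvert's implementation or API: the function names are invented for this sketch, and the "exporter" handles only cell sources, without templates or outputs.

```python
import copy
from html import escape


def drop_empty_cells(nb):
    """Pre-processor: return a modified in-memory copy without blank cells."""
    nb = copy.deepcopy(nb)
    nb["cells"] = [cell for cell in nb["cells"] if cell["source"].strip()]
    return nb


def export_html(nb):
    """Exporter: turn each cell into a very simple HTML fragment."""
    parts = []
    for cell in nb["cells"]:
        tag = "pre" if cell["cell_type"] == "code" else "p"
        parts.append("<{0}>{1}</{0}>".format(tag, escape(cell["source"])))
    return "\n".join(parts)


def wrap_page(body):
    """Post-processor: work on the exported result, here by wrapping it."""
    return "<html><body>\n{}\n</body></html>".format(body)


notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "Flood map"},
        {"cell_type": "code", "source": ""},
        {"cell_type": "code", "source": "print(a < b)"},
    ]
}

page = wrap_page(export_html(drop_empty_cells(notebook)))
print(page)
```

The pre-processor works on a copy, so the notebook passed in is left unchanged, mirroring the "in memory" modification described above.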

3.4 Data cube

To support the EVER-EST e-Learning Services effectively it is necessary to improve the collaborative approach for storing, organising and analysing the vast quantities of satellite imagery and other Earth Observations with new functionalities to create EO product data cubes on demand. A data cube (or datacube) is a multidimensional array of values, commonly used to describe a time series of image data. The data cube is used to represent data along some measure of interest. Even though it is called a “cube”, it can be 2-dimensional, 3-dimensional or higher dimensional. Every dimension represents a new attribute in the database and the cells in the cube represent the measure of interest in different temporal and spectral dimensions. Data cubes include a series of structures and tools that calibrate and standardise datasets, enabling the application of time series and the rapid development of quantitative information products. By calibrating the information, data cubes make it more accessible and easier to analyse, and reduce the overall cost for pilot applications and users. The Data Cube is a system designed to:

● Catalogue large amounts of Earth Observation data;
● Provide a Python based API for high performance querying and data access;
● Give scientists and other users the ability to easily perform Exploratory Data Analysis (e.g. combining multi-sensor data on the same reference grid and pixel size);
● Allow scalable continent scale processing of the stored data;
● Track the provenance of all the contained data to allow for quality control and updates.
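The idea of a data cube as a multidimensional array indexed by time and space can be shown with a deliberately tiny example. It uses nested Python lists so that it runs without extra libraries, whereas a real data cube would use NumPy or xarray arrays; all values and dates are invented.

```python
from statistics import mean

# A tiny cube indexed as cube[time][y][x]: three acquisition dates of a
# 2 x 2 pixel image.
times = ["2016-01-01", "2016-01-17", "2016-02-02"]
cube = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.3, 0.5], [0.7, 0.9]],
    [[0.1, 0.3], [0.5, 0.7]],
]


def pixel_series(cube, y, x):
    """Time series of a single pixel: fix y and x, vary time."""
    return [image[y][x] for image in cube]


def temporal_mean(cube):
    """Collapse the time dimension, leaving one 2-D image of per-pixel means."""
    height, width = len(cube[0]), len(cube[0][0])
    return [
        [mean(pixel_series(cube, y, x)) for x in range(width)]
        for y in range(height)
    ]


series = pixel_series(cube, 0, 1)
labelled = dict(zip(times, series))  # attach the time labels to the values
composite = temporal_mean(cube)
print(labelled)
```

Slicing along one dimension while fixing the others is the basic operation behind the time series analysis that data cubes are meant to simplify.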

¹² reStructuredText is an easy-to-read, what-you-see-is-what-you-get plain text markup syntax and parser system - http://docutils.sourceforge.net/rst.html

Figure 11 - Earth Observation Data Cubes

The EVER-EST Service Developer is provided with a VM containing the open source Australian Geoscience Data Cube (AGDC) software package. Supported by the Data Agency, this VM is able to instantiate new data cubes in the Cloud Platform according to the needs of the e-Learning Modules. This allows the provision of data cubes directly to the modules and removes from the course the complexities regarding data discovery and data access. The deployed data cubes can then be directly accessed from Jupyter Notebooks with APIs to perform basic data queries and analysis. For example, Figure 12 shows how to access the data from the data cube using the load function from the datacube library. The load function takes as arguments the product to access and the spatial and temporal extent that define the exact partition of the data cube that is requested.

Figure 12 - Loading data from the data cube in Jupyter
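A call of the kind described above might be wrapped as follows. This is a sketch rather than the code shown in the figure: `load_region` is a helper invented here, and it assumes a `datacube.Datacube` object whose `load` method accepts `product`, `x`, `y` and `time` keyword arguments.

```python
def load_region(dc, product, lon_range, lat_range, time_range):
    """Request one spatial and temporal slice of a product from a data cube.

    `dc` is expected to be a datacube.Datacube instance, or any object
    offering the same load method.
    """
    # x and y give the spatial extent, time gives the temporal extent.
    return dc.load(product=product, x=lon_range, y=lat_range, time=time_range)


# Usage in a notebook with the datacube library installed and configured
# (product name and extents are hypothetical):
#
#   import datacube
#   dc = datacube.Datacube(app="e-learning-demo")
#   data = load_region(dc, "some_product", (12.4, 12.6), (41.8, 42.0),
#                      ("2016-01-01", "2016-06-30"))
```

Keeping the cube object as a parameter makes the helper easy to exercise against a stand-in object when no data cube is available.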

The returned data is an array object (e.g. xarray.Dataset), which is a labelled n-dimensional array wrapping a NumPy array. The NumPy array is the main Python object for representing homogeneous multidimensional arrays and can be used directly in the notebook. With this information it is possible to investigate the data (Figure 13) and to see the variables (measurement bands) and dimensions that were returned using the data_vars dictionary.

Figure 13 - Retrieving array data from the data cube

NumPy is an extension to the Python programming language that adds support for large, multidimensional arrays and matrices, which makes it well suited to operating on data cubes. It already contains a large library of high-level mathematical functions to operate on these arrays. This extension tries to overcome Python’s slower code execution by directly providing multidimensional arrays together with functions and operators that operate efficiently on arrays.

Used together with a Python plotting library like matplotlib, it is also possible to produce graphic representations of the data in the notebook. matplotlib is a Python 2D plotting library which produces figures that can be used in Python scripts. It simplifies the generation of plots and histograms with just a few lines of code while giving full control of line styles, font properties, axes properties, etc., via an object oriented interface. Figure 14 shows how to display composite images directly in the notebook by loading the data from the data cube. The procedural interface is designed to closely resemble that of MATLAB and makes matplotlib easy to learn for experienced MATLAB users, making it a viable alternative in EVER-EST to MATLAB as an e-Learning development tool for EO data processing. The advantages of the combined use of Python, NumPy and matplotlib over MATLAB include:

● Python-based, a full-featured modern object-oriented programming language suitable for large-scale software development;
● Free, open source, no license servers;
● Native SVG support.

Figure 14 - Plotting a multi-band image from a data cube in Jupyter

Nevertheless, in its current implementation (version 2), the AGDC software is still intended only as a working prototype and not for operational use. For the intermediate version of this document, the objective of the work is to evaluate the feasibility of this component to provide a cohesive, sustainable framework for large-scale multidimensional data management and access for EO data in the EVER-EST e-Learning Modules. A complete demonstration is intended for the final version of this document.

4 Deployment

The Jupyter notebook web applications are provisioned in a multi-tenant environment and self-contained in Docker containers. This capacity is made of two components: the JupyterHub, a server that gives multiple users access to Jupyter notebooks by running an independent Jupyter notebook server for each user, and the spawners that control how JupyterHub starts the individual notebook server for each user.

4.1 Data access

The Jupyter notebook servers and data cubes are deployed in Docker containers, with the data access happening within the container itself. In order to be able to save data and share data between Docker containers, Docker provides the concept of Docker volumes. These volumes are directories (or files) that are outside of the default Union File System and exist as normal directories and files on the host file system (the Union File System is a combination of read-only layers with a read-write layer on top that is lost when the containers are dismissed).

The Jupyter notebook servers and data cubes are spawned in a Docker container that accesses data by mounting a Docker volume. The data available in the Docker volume is dictated by the data package that originated it. The definition of a data package relies on the data discovery mechanism offered by the Data Agency. By accessing OpenSearch catalogues featuring tens of data collections and providing advanced query mechanisms driven by the thematic facets of the data (e.g. interferometric search for SAR data or cloud coverage for optical data), the Data Agency builds data packages containing references to one or more catalogue entries. Once stored, the elements referenced within a given data package are fetched from the archives (local or remote) and together populate a Docker volume that is mounted on the Docker container hosting the user notebook server. From the notebook and user perspective, the data contained in the Docker volume is accessed through a typical POSIX file system, thus providing high throughput.
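From inside the notebook, POSIX access means that a data package can be explored with ordinary file operations. The sketch below uses a temporary directory as a stand-in for the mounted volume; the mount path, file names and the `*.zip` pattern are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def list_products(mount_point, pattern="*.zip"):
    """Return the product files found under a mounted data package, sorted."""
    return sorted(p for p in Path(mount_point).rglob(pattern) if p.is_file())


# Stand-in for a mounted Docker volume; in a container this would be a
# fixed path such as /data (hypothetical) holding the fetched products.
mount_point = Path(tempfile.mkdtemp())
(mount_point / "S1A_demo_product.zip").write_bytes(b"")  # invented file name
(mount_point / "notes.txt").write_text("not a product")

products = list_products(mount_point)
print([p.name for p in products])
```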

4.2 Provisioning

Users access JupyterHub via the web browser, as they would with the Jupyter web application, by going to the address of the JupyterHub server. Users authenticate using the defined authenticator (in our case a Single Sign-On) and trigger a new instance of a Jupyter server using the spawner. The approach followed uses the DockerSpawner to deploy Docker containers that provide resources to the Jupyter server. The DockerSpawner package provides two types of spawners: dockerspawner.DockerSpawner, for spawning identical Docker containers for each user, and dockerspawner.SystemUserSpawner, for spawning Docker containers with an environment and home directory for each user.

4.3 Persistent storage

The Jupyter notebooks contain live code that can be run repeatedly, and the outcome of these executions can either be inline results (and thus contained in the notebook) or physical results that are written on the local file system. In the first case (inline results), persistence is guaranteed by Jupyter when the notebook is saved; in the second case, the persistence of the physical results produced must be addressed by other means. In a similar approach as for the data packages, the persistence of the physical results is achieved by creating another Docker volume that is associated with the notebook. This solution also provides the possibility to share the Docker volume as an input data package to another notebook that can be owned by another user within the platform.
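A JupyterHub configuration along the following lines could express the spawner and volume choices described in sections 4.1 to 4.3. It is a sketch, not the EVER-EST deployment: option names are those of recent dockerspawner releases and may differ in other versions, and the image name, volume names and mount paths are hypothetical.

```python
from types import SimpleNamespace

# JupyterHub injects get_config() when it loads jupyterhub_config.py.
# The fallback only allows this sketch to be executed on its own.
try:
    c = get_config()  # noqa: F821
except NameError:
    c = SimpleNamespace(JupyterHub=SimpleNamespace(), DockerSpawner=SimpleNamespace())

# Start each user's notebook server in its own Docker container.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "example/e-learning-notebook:latest"  # hypothetical image

# Read-write volume for results written by the notebook (section 4.3).
# {username} is expanded by DockerSpawner for each user.
c.DockerSpawner.volumes = {"results-{username}": "/home/jovyan/results"}

# Read-only volume holding a data package prepared in advance (section 4.1).
c.DockerSpawner.read_only_volumes = {"data-package-demo": "/data"}
```

Mounting the data package read-only while keeping results in a separate volume reflects the split between input data and persistent outputs described above.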

4.4 Scalability

Jupyter notebooks offer interactive processing of data via a Jupyter Web Application. While vertical scaling, which is by nature limited by host capacity, extends the processing capacity of the resources made available to a user, horizontal scaling supports the execution of the code in Jupyter notebook documents against large archives of Earth Science data. This horizontal scaling of a Jupyter notebook is the process of translating the Jupyter notebook into an operational tool for large-scale and cost effective processing against large sets of data.

The horizontal scaling is done by exploiting the Cloud framework offered by the platform, where YARN plays a central role. As explained in section 6.4 of deliverable D5.1, YARN provides the Cloud Production Center with a resource-management platform responsible for managing computing resources in clusters and using them for scheduling of users' applications. In particular, the capacity of YARN to deploy Docker containers and to support several computational models was explained in detail in deliverable D5.1. A new and dedicated computational model will be adopted in YARN to offer the horizontal scaling and thus support the replication of a single Jupyter notebook processing a batch of input data in several tens, hundreds or even thousands of Docker containers, each processing a subset of the input data.
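The core of this model is the division of a batch of inputs into subsets, one per container. The sketch below shows only that partitioning step; it is not YARN's API, the function name is invented, and scheduling and container start-up are left out.

```python
def partition(items, n_workers):
    """Split items into at most n_workers contiguous, near-equal subsets."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    items = list(items)
    # Never create more subsets than there are items (but at least one).
    n_workers = min(n_workers, len(items)) or 1
    size, extra = divmod(len(items), n_workers)
    subsets, start = [], 0
    for i in range(n_workers):
        # The first `extra` subsets receive one additional item.
        end = start + size + (1 if i < extra else 0)
        subsets.append(items[start:end])
        start = end
    return subsets


inputs = ["product_{:02d}".format(i) for i in range(10)]  # invented identifiers
subsets = partition(inputs, 4)
print([len(s) for s in subsets])
```

Each subset would then be handed to one replica of the notebook, so that adding containers shortens the batch without changing the notebook code.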

4.5 Authentication

The e-Learning services may require user authentication at the level of the Jupyter notebook servers and data cubes. The JupyterHub already provides several ways for users to authenticate via the authenticators layer. This layer is a flexible environment that supports several implementations of the component delivering the mechanism for authorizing users from the EVER-EST identity provider.

5 e-Learning Catalogue and Portfolio

The following section provides a snapshot of the existing e-Learning material. The updated and full list of the modules is available on the EVER-EST GitHub repository¹³.

5.1 Sentinel-1 product information and metadata

This e-learning module targets a simple hands-on lesson on Sentinel-1 data and the Sentinel-1 Toolbox (S1TBX) that is part of the Sentinel Application Platform (SNAP). The S1TBX consists of a collection of processing tools, data product readers and writers and a display and analysis application to support the large archive of data from ESA SAR missions including SENTINEL-1, ERS-1 & 2 and ENVISAT, as well as third party SAR data from ALOS PALSAR, TerraSAR-X, COSMO-SkyMed and RADARSAT-2. This module uses the snappy toolbox, which provides access to the SNAP Java API from Python. This module opens a Sentinel-1 GRD product and extracts a few metadata fields and product information.

Module level: beginner
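The kind of information such a module extracts can be sketched as below. This is not the module's code: the two helper functions are invented here, and they assume that snappy exposes `ProductIO.readProduct` and that the returned product offers the SNAP `Product` accessors used in the body.

```python
def describe_product(product):
    """Collect a few descriptive fields from a SNAP product object."""
    return {
        "name": product.getName(),
        "width": product.getSceneRasterWidth(),
        "height": product.getSceneRasterHeight(),
        "bands": list(product.getBandNames()),
    }


def open_product(path):
    """Open a product with snappy; requires a configured SNAP installation."""
    # Imported lazily so that this sketch can be loaded without SNAP.
    from snappy import ProductIO

    return ProductIO.readProduct(path)


# Usage in a notebook where snappy is available (path is hypothetical):
#
#   info = describe_product(open_product("/data/S1A_demo_product.zip"))
#   print(info["name"], info["width"], info["height"], info["bands"])
```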

5.2 Sentinel-1 product subset

This e-learning module also focuses on a simple hands-on lesson on Sentinel-1 data and the Sentinel-1 Toolbox. This module also uses snappy to process the Sentinel-1 GRD product. This module opens a Sentinel-1 GRD product and extracts a subset product defined with a radar coordinate and extent. This eases any subsequent processing tasks (e.g. change detection for the identification of a flood extent).

Module level: beginner

¹³ https://github.com/ec-everest/e-learning-modules

5.3 Sentinel-1 change detection for flood extent

This e-learning module implements a complex workflow to identify the flood extent using Sentinel-1 data. This module also uses snappy to process the Sentinel-1 GRD product.

The use of SAR satellite imagery for change detection dedicated to flood extent mapping constitutes a viable solution to process images quickly, providing near real-time flooding information to relief agencies. Moreover, flood extent information can be used for damage assessment and risk management, creating scenarios showing the population, economic activities and environment at potential risk from flooding. The workflow contains several steps:

● Step 0: Data preparation - Subset
● Step 1: Pre-processing - Calibration
● Step 2: Pre-processing - Speckle filtering
● Step 3: Binarization
● Step 4: Post-processing - Geometric correction

Module level: advanced
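Of the steps above, binarization is the easiest to show in isolation: smooth open water returns little energy to the radar, so pixels with low backscatter are marked as water. The sketch below is plain Python rather than the module's snappy code, and both the -18 dB threshold and the sample values are assumptions chosen for illustration.

```python
import math


def to_db(sigma0):
    """Convert linear backscatter to decibels (expects a positive value)."""
    return 10.0 * math.log10(sigma0)


def binarize(image, threshold_db=-18.0):
    """Return a mask with 1 where backscatter is below the threshold, else 0."""
    return [
        [1 if to_db(value) < threshold_db else 0 for value in row]
        for row in image
    ]


# Invented 2 x 3 patch of calibrated backscatter (linear scale).
patch = [
    [0.005, 0.200, 0.010],
    [0.150, 0.008, 0.300],
]
mask = binarize(patch)
print(mask)
```

In practice the threshold depends on the scene and polarisation, which is one reason calibration and speckle filtering come before this step.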

5.4 Sentinel-2 vegetation indices

This e-learning module implements a workflow to process a number of vegetation indices from Sentinel-2 data. This module also uses snappy and the S2TBX.

Vegetation indices are spectral transformations of two or more bands designed to enhance the contribution of vegetation properties and allow reliable spatial and temporal inter-comparisons of terrestrial photosynthetic activity and canopy structural variations.

Module level: intermediate
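One widely used index of this kind is the Normalized Difference Vegetation Index, NDVI = (NIR - Red) / (NIR + Red). The sketch below computes it per pixel in plain Python; it is given as an example of a two-band transformation, not as a list of the indices the module implements, and the reflectance values are invented.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here for pixels with no signal
    return (nir - red) / total


# For Sentinel-2, band 8 is near infrared and band 4 is red.
nir_band = [0.50, 0.30, 0.05]
red_band = [0.10, 0.30, 0.15]
values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
print(values)
```

Values close to 1 indicate dense green vegetation, while values near or below zero are typical of bare soil, water or cloud.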
