EARTHCUBE CONCEPTUAL DESIGN: A Scalable Community Driven Architecture

  • EARTHCUBE CONCEPTUAL DESIGN A Scalable Community Driven Architecture http://earthcube.org/group/scalable-community-driven-architecture

    Overview

    PI: G. Djorgovski (Caltech)

    Co-I: D. Pilone, T. Pilone (Element 84), D. Crichton, E. Law (JPL)

    Other key personnel: S. Caltagirone (E84), S. Hughes (JPL), T. Huang (JPL), A. Mahabal (Caltech)

    2016 ESIP Winter Meeting, 1/7/16

  • A high-level system blueprint for the definition, construction, and deployment of both existing and new components, ensuring that they can be unified and integrated into an evolutionary national infrastructure for EarthCube.


  • Methodology

    - Identification of stakeholders, concerns and requirements
    - Identification of architectural use cases and drivers
    - Selection of an architectural framework
    - Development of the architectural principles
    - Development of the architectural models
    - Capture of the architecture artifacts in a consolidated report
    - Generation of recommendations for adopting the architecture for the EarthCube program


  • Stakeholders

    Stakeholder/Actor: Concerns

    - NSF Program Managers: Make decisions and provide guidance at the EarthCube program level. Provide sufficient funding to support the EarthCube mission.
    - EarthCube Scientists: Use EarthCube resources and services to conduct scientific research. Publish scientific results & curate data as needed.
    - EarthCube Developers: Develop technologies and services that can be integrated into EarthCube.
    - EarthCube Architects: Establish EarthCube requirements, framework and operational concept. Develop information model (vocabulary, ontology). Establish standards guidelines. Ensure interoperability between EarthCube Building Blocks.
    - External Data Users: Use EarthCube resources and services for research, education, and decision-making.
    - Curator: Ensure data is properly captured in EarthCube compliant data repositories.
    - Data Owner: Responsible for producing the data. Concerned about its distribution and use.
    - External Data Facility: Responsible for archiving data at other agencies (NASA, NOAA, USGS, etc.); interoperability with the EarthCube Cyberinfrastructure.
    - EarthCube Governance Committees: Responsible for generating and monitoring the governance for the system, including data curation, access, use case priority, interoperability standards, etc.
    - EarthCube Office Staff: Responsible for maintaining the community involvement within EarthCube and communicating changes and how to use the system.

  • Use Cases

    - Big Science Discovery, Comparison, Provenance, Model & Visualization
    - Collaborative Science
    - Dark Data Contribution
    - Tools Contribution
    - Data Documentation
    - Models Sharing
    - High Performance Computing and Storage Resources
    - Real Time Data
    - Physical Sample Curation


  • Drivers

    - Transform and accelerate research and discovery by turning data into knowledge and enabling interdisciplinary data integration.
    - Provide critically needed data, tools, computational resources, and frameworks for cross-domain scientific collaboration and analysis, with long-term preservation, discovery, and use of geoscience software and data.
    - Provide a geosciences cyberinfrastructure and architecture that is scalable, extensible and sustainable.


  • Frameworks

    - Zachman Framework - For organizing stakeholder concerns and perspectives.
    - ISO/IEC/IEEE 42010:2011 - For architectural description guidelines.
    - Reference Model for Open Distributed Processing (RM-ODP) - For architectural patterns for distributed systems.
    - The Open Group Architecture Framework (TOGAF) - For managing the architecture.
    - Federal Enterprise Architecture Framework (FEAF) - For classifying the architecture into architectural elements and viewpoints.
    - ISO 14721:2003, Open Archival Information System (OAIS) Reference Model - Provides a standard for information objects.
    - ISO/IEC 11179-3, Registry Metamodel and Basic Attributes specification - Provides a schema for a metadata registry.


  • Architectural Principles

    - Scalability
    - Community Driven
    - Open Science
    - Interoperability
    - Sustainability
    - Distributed
    - Data Model Driven


  • [Diagram: EarthCube conceptual architecture, built up over four slides]

    - First build: Science Data Manage… (three boxes), Satellite Instrument Data Systems, Airborne Data, Agency Earth Data Archives, Data Provider, EarthCube CI, EarthCube Discovery
    - Second build adds: Other Data Systems (e.g. NOAA), Other Data Systems (In-Situ, University), EarthCube Repository
    - Third build adds: Data Science Infrastructure (Data, Algorithms, Machines), Science Teams
    - Fourth build adds: Applications, Decision Support, Research, Data Analysis, EarthCube Data Analytics Centers

  • Benchmark

    - Earth System Grid Federation (ESGF)
    - Early Detection Research Network (EDRN)
    - NASA's Earth Observing System Data and Information System (EOSDIS)

    ExArch Meeting, October 2012

    Node Architecture

    Internally, each ESGF Node is composed of services and applications that collectively enable data and metadata access, and user management. ESGF software stack combines custom software components developed by ESGF with other freely available applications from eCommerce (Apache Tomcat, Solr, Postgres, ...) and geo-informatics (THREDDS Data Server, LAS, ...).

    Software components are grouped into 4 areas of functionality (aka flavors):

    - Data Node: secure data publication and access
    - Index Node: metadata indexing and searching
      - web portal UI to drive human interaction
      - dashboard suite of admin applications
      - model metadata viewer plugin
    - Identity Provider: user authentication and group membership
    - Compute Node: analysis and visualization

    Node flavors can be installed in various combinations depending on site needs, or to achieve higher performance and scalability.
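    The idea that flavors combine freely on a single installation can be sketched with a bit-flag type. This is only an illustration of the concept, not ESGF's installer or configuration format: the `Flavor` enum, the `ROLES` table, `roles_for`, and the two site profiles are all hypothetical names.

```python
from enum import Flag, auto


class Flavor(Flag):
    """The four areas of functionality a node can be installed with."""
    DATA = auto()
    INDEX = auto()
    IDENTITY_PROVIDER = auto()
    COMPUTE = auto()


# Role of each flavor, paraphrased from the slide text.
ROLES = {
    Flavor.DATA: "secure data publication and access",
    Flavor.INDEX: "metadata indexing and searching",
    Flavor.IDENTITY_PROVIDER: "user authentication and group membership",
    Flavor.COMPUTE: "analysis and visualization",
}


def roles_for(installed: Flavor) -> list:
    """Return the roles provided by a node installed with the given flavors."""
    return [role for flavor, role in ROLES.items() if flavor in installed]


# Hypothetical site profiles: a small site runs every flavor on one host,
# while a larger site dedicates a host to data serving only.
all_in_one = Flavor.DATA | Flavor.INDEX | Flavor.IDENTITY_PROVIDER | Flavor.COMPUTE
data_only = Flavor.DATA

print(roles_for(data_only))
print(roles_for(all_in_one))
```

    Modeling flavors as flags rather than as separate node classes mirrors the statement that they can be installed "in various combinations": a site profile is just a union of flavors.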

    Software Stack: Node Manager

    - Enables continuous exchange of service and state information among Nodes
    - Internally, it collects Node health information and metrics (cpu, disk usage, etc.)
    - Installed for all Node flavors

    Peer-To-Peer (P2P) protocol

    - Gossip protocol: information is exchanged randomly among peers
    - Each Node receives information from one Node, merges it with its own information, and propagates it to two other Nodes at random
    - No central coordination, no single point of failure
    - Nodes can join/leave the federation dynamically
    - Each Node is bootstrapped with knowledge of one default peer
    - Each Node can belong to one or more peer groups within which information is exchanged

    XML Registry

    - XML document that is payload of P2P protocol
    - Contains service endpoints and SSL public keys for all Nodes in the federation
    - Derived products (list of search shards, trusted IdPs, location of Attribute Services, ...) are used by federation-wide services

    Challenge: good news travels fast, bad news travels slow...
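    The gossip behaviour described on this slide (merge what you receive, then pass it on to two random peers, starting from one default peer) can be simulated in a few lines. This is a minimal sketch under stated assumptions, not the ESGF Node Manager: the `Node` class, the `(version, endpoint)` registry layout, the `fanout` parameter, the synchronous rounds, and the `example.org` URLs are all hypothetical, and a plain dictionary stands in for the XML registry payload.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """One federation member holding its own view of the shared registry."""
    name: str
    # Maps node name -> (version, service endpoint); this layout is an assumption.
    registry: dict = field(default_factory=dict)

    def merge(self, incoming: dict) -> None:
        """Merge a received registry into ours, keeping the newest entry per node."""
        for name, (version, endpoint) in incoming.items():
            if name not in self.registry or self.registry[name][0] < version:
                self.registry[name] = (version, endpoint)


def build_federation(size: int) -> list:
    """Bootstrap each node with its own entry plus one default peer (node-0)."""
    nodes = []
    for i in range(size):
        name = f"node-{i}"
        node = Node(name, {name: (1, f"https://{name}.example.org/search")})
        # Version 0 marks a bootstrap placeholder that real data will replace.
        node.registry.setdefault("node-0", (0, "https://node-0.example.org/search"))
        nodes.append(node)
    return nodes


def gossip(nodes: list, fanout: int = 2, seed: int = 0, max_rounds: int = 100) -> int:
    """Run push-style gossip rounds until every node knows every other node.

    Returns the number of rounds taken.
    """
    rng = random.Random(seed)
    by_name = {n.name: n for n in nodes}
    for round_no in range(1, max_rounds + 1):
        for sender in nodes:
            # A node can only contact peers it has already heard about.
            known = [p for p in sender.registry if p != sender.name]
            for peer_name in rng.sample(known, min(fanout, len(known))):
                by_name[peer_name].merge(sender.registry)
        if all(len(n.registry) == len(nodes) for n in nodes):
            return round_no
    raise RuntimeError("federation did not converge")


federation = build_federation(8)
rounds = gossip(federation)
print(f"all {len(federation)} registries complete after {rounds} round(s)")
```

    There is no coordinator in the loop: the default peer only serves as a first contact, after which any node that has a fuller registry spreads it onward. The sketch only adds entries, so it shows the "good news" path; propagating a node's departure would need extra machinery such as timestamps or tombstones, which is consistent with the challenge noted on the slide.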

    [Figure: EOSDIS data centers and SIPSs]

    Data Centers

    - ASF DAAC: SAR Products, Sea Ice, Polar Processes
    - SEDAC: Human Interactions in Global Change
    - LP DAAC: Land Processes & Features
    - PO.DAAC: Ocean Circulation, Air-Sea Interactions
    - ASDC: Radiation Budget, Clouds, Aerosols, Tropo Chemistry
    - ORNL DAAC: Biogeochemical Dynamics, EOS Land Validation
    - GES DISC: Atmos Composition & Dynamics, Global Modeling, Hydrology, Radiance
    - LAADS/MODAPS: Atmosphere
    - OBPG: Ocean Biology & Biogeochemistry
    - GHRC: Hydrological Cycle & Severe Weather
    - CDDIS: Crustal Dynamics, Solid Earth
    - NSIDC DAAC: Cryosphere, Polar Processes

    SIPSs

    - NCAR, U of Col.: HIRDLS, MOPITT, SORCE
    - GSFC: GLAS, MODIS, OMI, OBPG
    - LaRC: CERES, SAGE III
    - GHRC: AMSR-E, LIS, AMSR2
    - JPL: MLS, TES
    - San Diego: ACRIM

    Key: Data Center, SIPSs, ECS Sites


  • [Diagram: EarthCube System Architecture]

    - Process Architecture
    - Data Lifecycle: Data Generation, Data Curation, Data Transport, Data Ingest, Data Management, Search Distribution, Data Analytics, Visualization
    - Software Lifecycle: Technology Planning, Software Development, Release
    - Administrative: Governance, Standards, Technology, Policies, Resource Planning
    - Data Architecture
    - Technology Architecture
    - Ingest (Receive, Validate, Accept), Catalog/Data Management, Storage (Repository), Processing, Search and Discovery, Data Integration, Data Analysis, Distribution, Visualization
    - Information Model