20 January 2004 ESS Technical Colloquium 1 NVO Infrastructure Gretchen Greene THE US NATIONAL VIRTUAL OBSERVATORY

NVO Infrastructure


Page 1: NVO Infrastructure

20 January 2004, ESS Technical Colloquium

NVO Infrastructure

Gretchen Greene

THE US NATIONAL VIRTUAL OBSERVATORY

Page 2: NVO Infrastructure

Collaborators

• Bob Hanisch (STScI), William O’Mullane (JHU), Alex Szalay (JHU), Tamas Budavari (JHU), Maria A. Nieto-Santisteban (JHU), Ani Thakar, Jeongin Lee (GSFC), Tom McGlynn (GSFC), Ray Plante (NCSA), Tony Linde (AstroGrid/Univ. of Leicester), Kevin Benson (AstroGrid/Mullard Space Science Laboratory), Niall Gaffney (STScI), Antonio Volpicelli (STScI/OATo), NVO service providers (including MAST, Randy Thompson)

Page 3: NVO Infrastructure

NVO?

If only I could take HST images and 2MASS data and SDSS spectra and figure out the wonders of the universe!

Page 4: NVO Infrastructure

What is NVO?

• An NSF-funded collaboration between astronomical researchers and information technologists to build a global network of astronomical resources that facilitates scientific discovery and operations in the space science community.

• http://www.us-vo.org/

Page 5: NVO Infrastructure

Who works on the NVO Project?

• Distributed US Projects
  – PI (Alex Szalay/JHU), Project Manager (Bob Hanisch/STScI), Astronomical Institutions…
• International Collaboration
  – IVOA is the “parent” consortium
    • http://www.ivoa.net/
    • 13+ partners: GAVO, JVO, NVO, AstroGrid, others
  – Global network requires this
  – Astronomical community is international

Page 6: NVO Infrastructure

NVO 2003 Milestones

• Demonstrate Science with VO concepts
• Establish standard…
  – Services:
    • Spatial catalog query, “cone search”
    • Simple Image Access protocol (SIAP)
    • SkyNode *new* web service
  – Protocols
    • Data exchange format => XML schemas (VOTable, VOResource)
• Form Collaborations to accomplish goals
  – Working Groups: Metadata, Data Models, Web Service and Grid, DAL (Data Access Layer), Applications
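To make the "cone search" and VOTable milestones concrete, here is a minimal sketch of what a client does: build a request with the three Cone Search parameters (RA, DEC, and SR, all in decimal degrees) and read rows out of the VOTable that comes back. The base URL, the helper names, and the sample table are assumptions made for illustration; the sample omits XML namespaces, which later VOTable versions add.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Cone Search request: RA, DEC and SR (search radius) in decimal degrees."""
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


def votable_rows(votable_xml):
    """Read FIELD names and TABLEDATA rows from a namespace-free VOTable document."""
    root = ET.fromstring(votable_xml)
    names = [field.get("name") for field in root.iter("FIELD")]
    return [
        dict(zip(names, [td.text or "" for td in row.findall("TD")]))
        for row in root.iter("TR")
    ]


# Hypothetical response, hand-written for this sketch (not real catalog data).
SAMPLE_VOTABLE = """<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE>
      <FIELD name="id" datatype="char" arraysize="*"/>
      <FIELD name="ra" datatype="double"/>
      <FIELD name="dec" datatype="double"/>
      <DATA><TABLEDATA>
        <TR><TD>src-1</TD><TD>180.01</TD><TD>-0.52</TD></TR>
        <TR><TD>src-2</TD><TD>180.03</TD><TD>-0.49</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

# example.org stands in for a real service endpoint.
url = cone_search_url("http://example.org/cone", 180.0, -0.5, 0.1)
rows = votable_rows(SAMPLE_VOTABLE)
```

Because every service answers in the same table format, the same `votable_rows` helper can be reused across catalogs, which is the point of standardizing the exchange format.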

Page 7: NVO Infrastructure

NVO 2004 Milestones

• Complete the Simple Spectral Access Protocol (SSAP)
• Incorporate Data Models (DM)
  – For mining more complex data structures
• Tune Standards and Schemas based on 2003 lessons learned
• Network related issues
  – Security, authentication
• Continue to build science tools and applications

Page 8: NVO Infrastructure

VO Framework/Architecture

[Figure: layered architecture diagram. Recoverable labels:]

• Portals, User Interfaces, Tools: VOPlot, Mirage, Topcat, Treeview, DIS, MAST
• Computational Services
• Virtual Data
• Information Discovery: Registries
• Data Access Layer (DAL)
• Protocols and query language: ConeSearch, SIAP, SSAP, ADQL
• Formats: VOTable, FITS, GIF
• Archives, Collections, Catalogs
• HTTP, Web, & Grid Services

Page 9: NVO Infrastructure

Where is NVO locally?

• STScI
  – Project management for NVO
  – VO services, VO science app and registry prototypes, MAST integration
• JHU
  – Science leadership
  – SkyServer, SkyQuery, and prototypes as drivers for NVO
• STScI and JHU
  – Technical collaboration
  – Established local tech exchange meetings
    • [email protected] majordomo
  – Academic + Operational Science gives you the best of both worlds

Page 10: NVO Infrastructure

STScI VO Resources

• Searchable NVO prototype registry
  – http://sdssdbs1.stsci.edu/nvo/voregistry/index.aspx
• Science Demo Portal (Galaxy Morphology)
  – http://www.us-vo.org/prototypes/galaxymorphology.html
• GSC, DSS, Hip, Tycho, HST pointings
  – http://www-gsss.stsci.edu/gscvo/index.jsp
• MAST VO services (including GALEX)
  – http://archive.stsci.edu/index.html

Page 11: NVO Infrastructure

Galaxy Morphology Demo

• Goal:
  – Analyze the morphology of a cluster of galaxies to study the formation properties….
• Technology:
  – NVO standards + Grid + .NET + JAVA +++
• What resources are available?
  – Define required parameters, manually identify resources, form integration plan
• Collaboration…a ‘must’ to obtain all components
  – STScI, JHU, NCSA, Fermi, USC/ISI

HAVE THIS WORKING in a couple months using the NVO……

Page 12: NVO Infrastructure

Galmorph Demo

• Glimpse into the potential applications…
• Show Under the Hood….
  – Go To US-VO site
  – Go To Prototype
  – Run Under the Hood

Page 13: NVO Infrastructure

What was missing?

• A Resource Registry
  – Find the resources and services available to perform this scientific task…..
  – How do I connect these together…..
  – What tools are available to visualize and analyze my results?

Page 14: NVO Infrastructure

The Role of Resource Registries

• Used to discover and locate resources (data and services) that can be used in a VO application
  – May also include tools, e.g. ETC
• Registry: a list of resource descriptions
  – Expressed as structured metadata to enable automated processing and searching
• Registries are themselves VO Resources

Page 15: NVO Infrastructure

Registry Requirements

• Allow user to select resources that are likely to pertain to a scientific question
• Select resources based on characteristics…
  – Type of resource: catalogs, image archives, EPO, services
  – Coverage in space, time, and frequency
  – Where data comes from, who curates it
• Dynamic: resources will come and go
• Distributed: should not depend on a single point of failure or single view of the VO
• Preserve the data providers’ control over their data
  – Curators control what gets registered, content, updates
  – Allow integration with existing resource management
• Allow extension to new types of resources *customized
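The "select resources based on characteristics" requirement can be sketched as a filter over structured resource descriptions. The record layout, field names, and sample entries below are hypothetical and far simpler than any real registry schema; the sketch only shows that once descriptions are structured, selection by type, frequency coverage, or curator becomes a mechanical query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    """Hypothetical, simplified resource description (not the VOResource schema)."""
    identifier: str
    resource_type: str          # e.g. "catalog", "image archive", "service"
    wavebands: frozenset        # frequency coverage, e.g. {"optical", "x-ray"}
    curator: str                # who curates the data


def select_resources(records, resource_type=None, waveband=None, curator=None):
    """Return the records that match every criterion that was supplied."""
    selected = []
    for record in records:
        if resource_type is not None and record.resource_type != resource_type:
            continue
        if waveband is not None and waveband not in record.wavebands:
            continue
        if curator is not None and record.curator != curator:
            continue
        selected.append(record)
    return selected


# Made-up registry content for illustration only.
REGISTRY = [
    ResourceRecord("example:cat-a", "catalog", frozenset({"optical"}), "Archive A"),
    ResourceRecord("example:img-b", "image archive", frozenset({"optical", "x-ray"}), "Archive B"),
    ResourceRecord("example:cat-c", "catalog", frozenset({"radio"}), "Archive A"),
]

optical_catalogs = select_resources(REGISTRY, resource_type="catalog", waveband="optical")
```

Spatial and temporal coverage would be further fields on the record, filtered the same way.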

Page 16: NVO Infrastructure

IVOA Registry Working Group (RWG)

• Common approach to registries
• Work packages
  – Science requirements and use cases
  – Resource metadata
  – Registry interfaces
  – Prototyping
• Distributed model for registries

Page 17: NVO Infrastructure

Registry Model

[Figure: registry model diagram. Recoverable labels:]

• Registries: Local Publishing Registry (two shown), Local Searchable Registry, Full Searchable Registry (two shown)
• Other labels: Data Centers, VO Projects, Specialized Portals & Services

Page 18: NVO Infrastructure

Registry Model

[Figure: registry model diagram. Recoverable labels:]

• Registries: Local Publishing Registry (two shown), Local Searchable Registry, Full Searchable Registry (two shown)
• Other labels: Data Centers, VO Projects, Specialized Portals & Services
• Link: harvest (pull)

Page 19: NVO Infrastructure

Registry Model

[Figure: registry model diagram. Recoverable labels:]

• Registries: Local Publishing Registry (two shown), Local Searchable Registry, Full Searchable Registry (two shown)
• Other labels: Data Centers, VO Projects, Specialized Portals & Services
• Links: harvest (pull), replicate

Page 20: NVO Infrastructure

Registry Model

[Figure: registry model diagram. Recoverable labels:]

• Registries: Local Publishing Registry (two shown), Local Searchable Registry, Full Searchable Registry (two shown)
• Other labels: Data Centers, VO Projects, Specialized Portals & Services
• Links: harvest (pull), replicate, selective harvesting
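The three links in the registry model (harvest, replicate, selective harvesting) can be illustrated with a toy in-memory model. The diagram's arrows are not recoverable from the slide text, so which registry pulls from which is an assumption here: full searchable registries harvest from publishing registries and replicate each other, and a local searchable registry harvests selectively. All class and record names are invented for the sketch.

```python
class PublishingRegistry:
    """Holds the resource descriptions a data provider chooses to publish."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # identifier -> description dict

    def publish(self, identifier, description):
        self.records[identifier] = description


class SearchableRegistry:
    """Collects descriptions from other registries so they can be searched."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def harvest(self, source, keep=lambda description: True):
        """Pull records from `source`; passing `keep` makes the harvest selective."""
        pulled = 0
        for identifier, description in source.records.items():
            if keep(description):
                self.records[identifier] = dict(description)
                pulled += 1
        return pulled

    def replicate(self, other):
        """Copy everything held by another searchable registry."""
        return self.harvest(other)


pub_a = PublishingRegistry("data-center-a")
pub_a.publish("a/catalog", {"type": "catalog"})
pub_a.publish("a/images", {"type": "image archive"})
pub_b = PublishingRegistry("data-center-b")
pub_b.publish("b/cone", {"type": "service"})

full_1 = SearchableRegistry("full-1")
full_1.harvest(pub_a)          # harvest (pull)
full_1.harvest(pub_b)
full_2 = SearchableRegistry("full-2")
full_2.replicate(full_1)       # replicate
local = SearchableRegistry("local")
local.harvest(full_2, keep=lambda d: d["type"] == "catalog")  # selective harvesting
```

Because every transfer is a pull, the publishing side never needs to know who is harvesting it, which keeps control with the data provider.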

Page 21: NVO Infrastructure

NVO Prototype Registry

• To support a Data Inventory Service (DIS)
  What is known about a position in the sky?
  – Use a registry to locate and query standard services:
    • Cone Search Services: querying catalogs
    • Simple Image Access Services: querying image archives and cutout services
  http://heasarc.gsfc.nasa.gov/vo/data-inventory.html
• Components
  – Publishing Registries
  – Searchable Registry
  – Resource Metadata
  – Harvesting Protocol
  – Populated with service descriptions
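The DIS pattern, "use a registry to locate and query standard services", amounts to a fan-out: take the service entries the registry returns and build one request per service for the same sky position. The sketch below assumes a hypothetical list of registry hits (names, URLs, and the `type` key are invented). The query parameters themselves follow the two protocols: RA/DEC/SR for Cone Search, POS/SIZE for Simple Image Access.

```python
from urllib.parse import urlencode


def build_queries(services, ra_deg, dec_deg, radius_deg):
    """Return (service name, request URL) pairs for one sky position."""
    queries = []
    for service in services:
        if service["type"] == "cone":
            # Cone Search: centre plus search radius, decimal degrees.
            params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
        elif service["type"] == "sia":
            # Simple Image Access: centre as "ra,dec" plus the region width.
            params = {"POS": f"{ra_deg},{dec_deg}", "SIZE": 2 * radius_deg}
        else:
            continue  # not a standard service this sketch knows how to query
        url = f"{service['base_url']}?{urlencode(params, safe=',')}"
        queries.append((service["name"], url))
    return queries


# Hypothetical registry search result; example.org stands in for real endpoints.
REGISTRY_HITS = [
    {"name": "catalog-x", "type": "cone", "base_url": "http://example.org/cone"},
    {"name": "images-y", "type": "sia", "base_url": "http://example.org/sia"},
    {"name": "legacy-form", "type": "cgi", "base_url": "http://example.org/form"},
]

queries = build_queries(REGISTRY_HITS, 10.5, 41.2, 0.1)
```

A real inventory service would then issue these requests, ideally in parallel, and merge the returned VOTables.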

Page 22: NVO Infrastructure

Resource Metadata

• KEY means for exchange between Registries
• Under development within the IVOA RWG
• The standard comes in two parts:
  – Prose document that defines concepts independent of an encoding scheme
    • Resource Metadata Document (RSM) by Hanisch et al.
  – XML Schemas
• Draws on Dublin Core metadata
  – An interdisciplinary standard for core resource metadata http://dublincore.org

Page 23: NVO Infrastructure

Resource Metadata: XML Schema

• Classes of Resources: Organization, DataCollection, Service, Registry
  – Specific classes inherit from generic <Resource>
• Organized into separate schemas:
  – Core resource metadata: VOResource
  – Various extension schemas containing specific types
• Capable of describing…
  – Data centers, research organizations, missions, observatories
  – Data collections, archives
  – VO standard services: Cone Search, Simple Image Access
  – Existing Browser/CGI-based services
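The "specific classes inherit from generic Resource" idea maps naturally onto a class hierarchy. The sketch below mirrors the four class names from the slide, but every field name is an illustrative assumption rather than an element of the actual VOResource schema.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Generic core metadata shared by every resource (field names are illustrative)."""
    identifier: str
    title: str
    publisher: str = ""


@dataclass
class Organization(Resource):
    facilities: list = field(default_factory=list)


@dataclass
class DataCollection(Resource):
    wavebands: list = field(default_factory=list)


@dataclass
class Service(Resource):
    access_url: str = ""


@dataclass
class Registry(Resource):
    harvest_url: str = ""


def describe(resource):
    """Code written against the generic Resource works for every specific class."""
    return f"{type(resource).__name__}: {resource.title}"


# Made-up entries for illustration.
entries = [
    Organization("example:org", "Example Data Center"),
    Service("example:cone", "Example Cone Search", access_url="http://example.org/cone"),
]
summaries = [describe(entry) for entry in entries]
```

Keeping the core fields in one base class is what lets extension schemas add new resource types without breaking tools that only understand the generic part.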

Page 24: NVO Infrastructure

Publishing Registries: getting information into registries

• Multiple publishing registries established
• Motivation:
  – Register VO Services
  – Develop techniques for easy registration
• Varied storage solutions for resource descriptions
  – XML Documents
  – Custom File Systems
  – Relational Databases

Page 25: NVO Infrastructure

Harvesting Interface

• Adopted Open Archives Initiative (OAI) Protocol for Metadata Harvesting
  – HTTP/CGI-based protocol for exposing metadata to harvesters (e.g. searchable registries)
• Advantages:
  – Existing, field-tested design we didn’t have to re-invent
  – Fairly easy to implement
  – Existing tools for emitting and harvesting metadata
  – Exposes our metadata to larger digital library community
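Since OAI-PMH is plain HTTP with query parameters, a harvester's request is easy to sketch. The example below builds `ListRecords` requests using the protocol's standard arguments (`verb`, `metadataPrefix`, `from`, `set`, `resumptionToken`). The base URL is a placeholder, and `oai_dc` is used only because it is the one metadata format every OAI-PMH repository supports; the prefix a VO registry would use for its own resource metadata is not stated on the slide.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request, optionally limited by date or set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # only records changed on or after this date
    if set_spec:
        params["set"] = set_spec        # only records in this set
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Continue a partial list; resumptionToken replaces all other arguments."""
    params = {"verb": "ListRecords", "resumptionToken": token}
    return f"{base_url}?{urlencode(params)}"


# example.org stands in for a publishing registry's OAI endpoint.
full = list_records_url("http://example.org/oai")
incremental = list_records_url("http://example.org/oai", from_date="2004-01-01")
```

The `from` and `set` arguments are what make incremental and selective harvesting possible without any custom protocol work.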

Page 26: NVO Infrastructure

Searchable Registry

• Searchable Registry was set up at JHU/STScI
  http://sdssdbs1.stsci.edu/nvo/voregistry/index.aspx
• OAI harvester collects resource descriptions
  – from Publishing Registries
• Use of modification Date for
  – Parses XML and Loads data into relational database
• SOAP Web Service interface
  http://sdssdbs1.stsci.edu/nvo/voregistry/registry.asmx
  – Searching
    • Currently provides specialized SQL querying useful for DIS
• Web Form Access for conventional practices
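The harvest-parse-load step can be sketched with the standard library alone. This is not the registry's actual implementation (the slide describes a SOAP service backed by a relational database); it is a minimal illustration that parses an OAI-PMH `ListRecords` response and loads it into SQLite. Using the record datestamp to skip unchanged entries is one plausible reading of the "modification date" bullet and is an assumption, as are the table layout and the sample response.

```python
import sqlite3
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def load_records(connection, oai_xml):
    """Insert or update harvested records; skip ones that are not newer."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS resource ("
        "identifier TEXT PRIMARY KEY, datestamp TEXT, title TEXT)"
    )
    changed = 0
    for record in ET.fromstring(oai_xml).iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        datestamp = record.findtext("oai:header/oai:datestamp", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        row = connection.execute(
            "SELECT datestamp FROM resource WHERE identifier = ?", (identifier,)
        ).fetchone()
        if row is not None and row[0] >= datestamp:
            continue  # stored copy is already as recent as the harvested one
        connection.execute(
            "INSERT OR REPLACE INTO resource VALUES (?, ?, ?)",
            (identifier, datestamp, title),
        )
        changed += 1
    connection.commit()
    return changed


# Hand-written sample response; identifier and title are invented.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>example:cone-service</identifier>
        <datestamp>2004-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Cone Search</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

db = sqlite3.connect(":memory:")
first = load_records(db, SAMPLE)    # new record is stored
second = load_records(db, SAMPLE)   # same datestamp, nothing to update
```

Once the descriptions sit in a relational table, the specialized SQL queries the DIS needs are ordinary `SELECT` statements.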

Page 27: NVO Infrastructure

Registry Model

[Figure: registry model diagram with the Data Inventory Service. Recoverable labels:]

• Registries: Local Publishing Registry (five shown), Full Searchable Registry
• Sites: Heasarc, STScI, NCSA, Vizier, Caltech, Astrogrid
• Client: Data Inventory Service (DIS)
• Links: harvest (pull), search for services

Page 28: NVO Infrastructure

Registry Model

[Figure: registry model diagram with the Data Inventory Service and data providers. Recoverable labels:]

• Registries: Local Publishing Registry (five shown), Full Searchable Registry
• Sites: STScI, Heasarc, NCSA, Vizier, Caltech, Astrogrid
• Data Providers: Cone Search Service (three shown), Simple Image Access (three shown)
• Client: Data Inventory Service (DIS)
• Links: harvest (pull), search for services

Page 29: NVO Infrastructure


Registry Page

• http://sdssdbs1.stsci.edu/nvo/voregistry/index.aspx

Page 30: NVO Infrastructure


NVO: Data Inventory Service (DIS)

• Rapid retrieval of all registered images, catalogs, and pointed observations for a selected position on the sky

Page 31: NVO Infrastructure


NVO: DIS Registry Findings

Images

Pointed observations

Source catalogs

Page 32: NVO Infrastructure

NVO: Visualization of DIS Findings

Easy comparison of multiwavelength data

Radio

Optical

X-ray

Aladin image viewer, CDS

Page 33: NVO Infrastructure

Lessons Learned

• XML schema needs simplification
  – Hierarchical layering makes parsing complex for very heterogeneous resources
  – Data Integrity
  – Transition from 100 => many K of resources: need efficient means for validating metadata
• Synchronization Between Repositories
  – Rating integrity of resources
    • Stamps
• Large Scale Harvesting
  – Network instabilities need to be accounted for
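One common way to account for network instabilities during large-scale harvesting is to retry failed requests with exponential backoff. The slides do not say how the prototype handled this, so the sketch below is a generic technique, not the project's method; the function names and the simulated failure are invented.

```python
import time


def fetch_with_retry(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `fetch()` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** attempt)


# Simulated harvest request that fails twice before succeeding.
calls = {"count": 0}


def flaky_harvest():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network drop")
    return "<records/>"


delays = []  # record the waits instead of actually sleeping
result = fetch_with_retry(flaky_harvest, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the helper testable; a real harvester would leave it at the default and would likely also cap the maximum delay.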

Page 34: NVO Infrastructure

VO Goals at STScI for 2004

• Quality Data Provider
  – Standard VO services to the archives
• Build Science Discovery mechanisms
  – Efficient User interfaces to Services, Registry, analysis tools
• Scientific Leadership
  – Build next generation applications using the VO technology (planning)
• Coordinated Development
  – Internal: cross division (ESS, ODM, CMO, OPO)
  – Continued collaboration with JHU and other VO partners

Page 35: NVO Infrastructure

NVO Goals for 2004

• Standardized service provider functions
  – heartbeat (is service alive)
  – footprint (spatial field)
    • Cross correlation, HTM, Healpix indexing
• Large Scale Data Correlations
  – Large archives
  – Mining large number of resources
    • Automated discovery and analysis
• Authorization/Authentication/Security
  – Grid and web service technology
• Seamless integration
  – Make client application building simple
    • Underlying model may be complex, yet modular for maintainability
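To show what "cross correlation" of catalogs involves at its simplest, here is a brute-force positional cross-match using the great-circle separation between sources. It deliberately does not implement HTM or HEALPix: those indexing schemes exist precisely to avoid the all-pairs comparison below when catalogs are large. Catalog contents and function names are invented for the sketch.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form, stable for small angles)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_deg):
    """Brute-force O(N*M) match: pair each A source with its nearest B source in range."""
    matches = []
    for id_a, ra_a, dec_a in catalog_a:
        best = None
        for id_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches


# Made-up (id, ra_deg, dec_deg) tuples.
CATALOG_A = [("a1", 10.0, 20.0), ("a2", 50.0, -5.0)]
CATALOG_B = [("b1", 10.0, 20.0005), ("b2", 10.0, 20.0002), ("b3", 120.0, 0.0)]

matches = cross_match(CATALOG_A, CATALOG_B, radius_deg=0.001)
```

A spatial index would restrict the inner loop to sources in neighbouring sky cells, which is what makes correlations across large archives tractable.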
