
Features of the SDSS


Page 1: Features of the SDSS

Features of the SDSS

• Special 2.5 m telescope at Apache Point, NM
– 3 degree field of view
– Zero distortion focal plane

• Two surveys in one
– Photometric survey in 5 bands: 200 million objects
– Spectroscopic redshift survey: 1 million distances

• Automated data reduction
– Over 120 man-years of development (Fermilab + collaboration scientists)

• Very high data volume
– Expect over 40 TB of raw data
– About 2 TB of processed catalogs
– Data made available to the public

Page 2: Features of the SDSS

Data Processing Pipelines

Page 3: Features of the SDSS

SDSS Data Products

• All raw data (40 TB) saved at Fermilab
• Object catalog: 500 GB, parameters of >10^8 objects
• Redshift catalog: 1 GB, parameters of 10^6 objects
• Atlas images: 1500 GB, 5-color cutouts of >10^8 objects
• Spectra: 60 GB, in one-dimensional form
• Derived catalogs: 20 GB, clusters, QSO absorption lines
• 4x4 pixel all-sky map: 60 GB, heavily compressed
• Corrected frames: 15 TB

Page 4: Features of the SDSS

Accessing the Data

• Few fixed access patterns
– one cannot build indices for all possible queries
– worst case scenario is a linear scan of the whole table

• Increasingly large differences between
– random access
– sequential I/O

• Often much faster to scan than to seek
• Good layout of data => more sequential I/O
• Geometric indexing: partitioning in storage
• Using Objectivity/DB
• Ported to MS SQL Server (with Jim Gray)
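
The "scan vs. seek" point can be made concrete with a rough cost model. The sketch below is illustrative only: the disk figures (10 ms per seek, 50 MB/s sequential throughput) and the 1000-byte row size are assumptions chosen for the example, not numbers from the slides. Only the 500 GB object catalog size comes from the deck.

```python
def full_scan_seconds(table_bytes, seq_bytes_per_s):
    """Time to read the whole table sequentially."""
    return table_bytes / seq_bytes_per_s


def random_fetch_seconds(n_rows, row_bytes, seek_s, seq_bytes_per_s):
    """Time to fetch n_rows scattered rows, paying one seek per row."""
    return n_rows * (seek_s + row_bytes / seq_bytes_per_s)


def break_even_fraction(row_bytes, seek_s, seq_bytes_per_s):
    """Fraction of rows above which a full scan beats one-seek-per-row access.

    Scan cost per row is row_bytes / bandwidth; random cost per row adds a seek.
    Setting f * N * (seek + t_row) = N * t_row gives f = t_row / (seek + t_row).
    """
    t_row = row_bytes / seq_bytes_per_s
    return t_row / (seek_s + t_row)


# Assumed (hypothetical) hardware and row size, for illustration only.
SEEK_S = 0.010          # 10 ms per random seek
SEQ_BPS = 50e6          # 50 MB/s sequential read
ROW_BYTES = 1000        # bytes per catalog row

catalog_bytes = 500e9   # object catalog size quoted in the slides
scan_time = full_scan_seconds(catalog_bytes, SEQ_BPS)
fraction = break_even_fraction(ROW_BYTES, SEEK_S, SEQ_BPS)
print(f"full scan: {scan_time:.0f} s, break-even selectivity: {fraction:.4%}")
```

Under these assumed numbers, a query that touches more than a fraction of a percent of randomly scattered rows is already cheaper as a sequential scan, which is why a good physical layout matters more than an index for every possible query.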

Page 5: Features of the SDSS

SDSS in GriPhyN

• Two Tier2 nodes (FNAL + JHU)
– testing framework on real data in different scenarios

• FNAL node
– massive reprocessing of images
   • full regeneration of catalogs from the images (on disk)
   • gravitational lensing, finer morphological classification
   • image coaddition, differencing

• JHU node
– catalog calculations, integrated with database
   • tasks require lots of data, can be run in parallel
   • various statistical calculations, likelihood analyses
   • power spectra, correlation functions, Monte-Carlo

• Public access
– creating virtual data for NVO services (implemented later)

Page 6: Features of the SDSS

The SDSS Southern Survey

• Scanning a single stripe on the sky >30 times over
• Coaddition => extra depth
• Differencing => time dimension
• Multiple ways to combine the stripes
– Rerun the pipelines with custom parameters
– Build a new object catalog
– Perform particular science analysis (lensing map)

• On the right timescale to try GriPhyN framework
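
A minimal sketch of what coaddition and differencing mean at the pixel level, and of how much depth repeated scans might buy. It assumes equal-quality scans with uncorrelated noise, so that signal-to-noise grows as the square root of the number of scans. That idealization is ours, not a statement from the slides, and the real pipelines are far more involved. The function names are hypothetical.

```python
import math


def coadd(frames):
    """Pixel-wise mean of equally sized frames (each a list of pixel values)."""
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]


def difference(frame_a, frame_b):
    """Pixel-wise difference of two frames; nonzero pixels flag variability."""
    return [a - b for a, b in zip(frame_a, frame_b)]


def coadd_depth_gain_mag(n_scans):
    """Approximate depth gain in magnitudes from stacking n_scans frames.

    Assumes S/N improves as sqrt(n_scans), i.e. gain = 2.5 * log10(sqrt(n)).
    """
    return 2.5 * math.log10(math.sqrt(n_scans))


stack = coadd([[1.0, 2.0], [3.0, 4.0]])        # mean of two tiny "frames"
delta = difference([3.0, 4.0], [1.0, 2.0])     # change between two epochs
print(stack, delta, round(coadd_depth_gain_mag(30), 2))
```

Under this idealization, 30 scans of the stripe would reach a bit under 2 magnitudes deeper than a single scan.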

Page 7: Features of the SDSS

Large Scale Statistical Analysis

• Galaxy distribution has non-trivial clustering patterns
– Reflects conditions in the early universe

• Spatial statistical tools to be run on the object catalog, applying many different cuts to the data
– Spatial power spectrum
– Correlation functions

• These algorithms typically scale as N^2 or N^3 in the number of objects!

• Some of the analyses will partition well (likelihood), others will not (pair counts)
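
To illustrate why pair counts scale as N^2 and resist naive partitioning, here is a brute-force sketch. It is a toy with hypothetical function names and made-up points, not the survey's correlation-function code. Every pair of objects is visited once, and pairs that straddle a partition boundary are lost if each partition is processed in isolation.

```python
from itertools import combinations
from math import dist


def pair_counts(points, bin_edges):
    """Histogram of pair separations; visits all N*(N-1)/2 pairs."""
    counts = [0] * (len(bin_edges) - 1)
    for p, q in combinations(points, 2):
        r = dist(p, q)
        for i in range(len(counts)):
            if bin_edges[i] <= r < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts


def partitioned_pair_counts(partitions, bin_edges):
    """Sum of per-partition pair counts; cross-partition pairs are missed."""
    total = [0] * (len(bin_edges) - 1)
    for part in partitions:
        for i, c in enumerate(pair_counts(part, bin_edges)):
            total[i] += c
    return total


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
edges = [0.0, 1.5, 3.0]

exact = pair_counts(points, edges)
naive = partitioned_pair_counts([points[:2], points[2:]], edges)
print(exact, naive)   # the naive split undercounts the wider separation bin
```

A likelihood that is a sum over independent objects has no such cross terms, which is one way to read the slide's contrast between analyses that partition well and those that do not.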

Page 8: Features of the SDSS

Trends in Astronomy

Future dominated by detector improvements

Figure: total area of 3m+ telescopes in the world in m^2, and total number of CCD pixels in megapixels, as a function of time. Growth over 25 years is a factor of 30 in glass, 3000 in pixels.

• Moore’s Law growth in CCD capabilities

• Gigapixel arrays on the horizon

• Improvements in computing and storage will track growth in data volume

• Investment in software is critical, and growing
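
The growth factors quoted for the figure (30x in glass, 3000x in pixels over 25 years) can be converted into doubling times. The sketch below assumes steady exponential growth over the whole period, which is a simplification on our part.

```python
import math


def doubling_time_years(growth_factor, span_years):
    """Doubling time implied by a total growth factor over span_years,
    assuming constant exponential growth."""
    return span_years * math.log(2) / math.log(growth_factor)


glass = doubling_time_years(30, 25)      # telescope collecting area
pixels = doubling_time_years(3000, 25)   # total CCD pixels
print(f"glass doubles every {glass:.1f} yr, pixels every {pixels:.1f} yr")
```

Under that assumption, the pixel count doubles roughly every two years, in line with the slide's "Moore's Law" description, while collecting area doubles only about every five years.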

Page 9: Features of the SDSS

VO- The challenges

• Large number of new surveys
– multi-TB in size, 100 million objects or more
– individual archives planned, or under way

• Multi-wavelength view of the sky
– coverage in more than 13 wavelengths within 5 years

• Size of the archived data
– 40,000 square degrees is 2 trillion pixels
– One band: 4 Terabytes
– Multi-wavelength: 10-100 Terabytes
– Time dimension: 10 Petabytes

• Current techniques inadequate
• Scalable hardware/networking requirements
• Transition to the new astronomy
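
The archive-size figures can be checked with back-of-envelope arithmetic. The slides do not state a pixel scale or pixel depth. The sketch below assumes 0.5 arcsecond pixels and 2 bytes per pixel, values chosen because they happen to reproduce the quoted numbers, not because the deck specifies them.

```python
ARCSEC_PER_DEG = 3600


def sky_pixels(area_sq_deg, pixel_scale_arcsec):
    """Number of pixels needed to tile an area at a given pixel scale."""
    pixels_per_sq_deg = (ARCSEC_PER_DEG / pixel_scale_arcsec) ** 2
    return area_sq_deg * pixels_per_sq_deg


# Assumed values (not from the slides): 0.5 arcsec pixels, 2 bytes per pixel.
PIXEL_SCALE_ARCSEC = 0.5
BYTES_PER_PIXEL = 2

pixels = sky_pixels(40_000, PIXEL_SCALE_ARCSEC)
one_band_bytes = pixels * BYTES_PER_PIXEL
print(f"{pixels:.3e} pixels, {one_band_bytes / 1e12:.1f} TB per band")
```

With these assumptions the result is about 2 trillion pixels and roughly 4 TB per band. A few tens of bands would then land in the 10-100 TB range given for multi-wavelength coverage.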

MACHO, 2MASS, DENIS, SDSS, DPOSS, GSC-II, VISTA, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...