Features of the SDSS

Page 1: Features of the SDSS

• Special 2.5m telescope at Apache Point, NM
– 3 degree field of view
– Zero distortion focal plane

• Two surveys in one
– Photometric survey in 5 bands - 200 million objects
– Spectroscopic redshift survey - 1 million distances

• Automated data reduction
– Over 120 man-years of development (Fermilab + collaboration scientists)

• Very high data volume
– Expect over 40 TB of raw data
– About 2 TB processed catalogs
– Data made available to the public


Page 2: Features of the SDSS

Data Processing Pipelines

Page 3: Features of the SDSS

• All raw data (40 TB) saved at Fermilab
• Object catalog, 500 GB: parameters of >10^8 objects
• Redshift catalog, 1 GB: parameters of 10^6 objects
• Atlas images, 1500 GB: 5-color cutouts of >10^8 objects
• Spectra, 60 GB: in a one-dimensional form
• Derived catalogs, 20 GB: clusters, QSO absorption lines
• 4x4 pixel all-sky map, 60 GB: heavily compressed
• Corrected frames, 15 TB

SDSS Data Products
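As a rough cross-check, the catalog-level products on this slide can be summed and compared with the "about 2 TB processed catalogs" figure from the first slide. The sketch below uses only the sizes quoted here, assumes decimal units (1 TB = 1000 GB), and takes the object counts at their quoted lower bounds (10^8 and 10^6), so the per-object sizes are upper estimates. The variable names are illustrative.

```python
# Sizes in GB as quoted on the slide (raw data and corrected frames excluded).
catalog_products_gb = {
    "object catalog": 500,
    "redshift catalog": 1,
    "atlas images": 1500,
    "spectra": 60,
    "derived catalogs": 20,
    "4x4 pixel all-sky map": 60,
}

total_gb = sum(catalog_products_gb.values())
total_tb = total_gb / 1000  # assumption: decimal units, 1 TB = 1000 GB

# Average catalog bytes per object, using the lower-bound object counts.
bytes_per_object = catalog_products_gb["object catalog"] * 1e9 / 1e8
bytes_per_redshift = catalog_products_gb["redshift catalog"] * 1e9 / 1e6

print(f"catalog products: {total_gb} GB (~{total_tb:.1f} TB)")
print(f"object catalog: ~{bytes_per_object:.0f} bytes per object")
print(f"redshift catalog: ~{bytes_per_redshift:.0f} bytes per object")
```

The total comes to 2141 GB, consistent with the quoted figure of about 2 TB, and the object catalog works out to roughly 5 KB of parameters per object.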

Page 4: Features of the SDSS

Accessing the Data

• Few fixed access patterns
– one cannot build indices for all possible queries
– worst case scenario is linear scan of the whole table

• Increasingly large differences between
– Random access
– Sequential I/O

• Often much faster to scan than to seek
• Good layout of data => more sequential I/O
• Geometric indexing – partitioning in storage
• Using Objectivity/DB
• Ported to MS SQL Server (with Jim Gray)
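The "faster to scan than to seek" point can be illustrated with a simple cost model. This is a minimal sketch, not a description of the SDSS storage system: the seek time, throughput, record size, and table size below are hypothetical round numbers chosen for illustration, and the function names are made up for this example.

```python
def scan_seconds(table_bytes: float, throughput_bytes_per_s: float) -> float:
    """Time to read the whole table sequentially."""
    return table_bytes / throughput_bytes_per_s


def seek_seconds(n_records: int, seek_s: float, record_bytes: float,
                 throughput_bytes_per_s: float) -> float:
    """Time to fetch n_records one by one, paying one seek per record."""
    return n_records * (seek_s + record_bytes / throughput_bytes_per_s)


def break_even_fraction(seek_s: float, record_bytes: float,
                        throughput_bytes_per_s: float) -> float:
    """Fraction of the table above which a full scan beats per-record seeks."""
    transfer_s = record_bytes / throughput_bytes_per_s
    return transfer_s / (seek_s + transfer_s)


# Hypothetical disk and table parameters (assumptions, not SDSS measurements).
SEEK_S = 0.010          # 10 ms per random access
THROUGHPUT = 50e6       # 50 MB/s sequential read
RECORD_BYTES = 5_000    # 5 KB per catalog record
N_RECORDS = 1_000_000

full_scan = scan_seconds(N_RECORDS * RECORD_BYTES, THROUGHPUT)
fraction = break_even_fraction(SEEK_S, RECORD_BYTES, THROUGHPUT)
print(f"full scan: {full_scan:.0f} s")
print(f"scan wins once a query touches more than {fraction:.1%} of the records")
```

With these assumed numbers, a query that touches more than about 1% of the records is already cheaper as a full sequential scan, which is why a layout that turns queries into sequential reads matters.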

Page 5: Features of the SDSS

SDSS in GriPhyN

• Two Tier2 nodes (FNAL + JHU)
– testing framework on real data in different scenarios

• FNAL node
– massive reprocessing of images
  • full regeneration of catalogs from the images (on disk)
  • gravitational lensing, finer morphological classification
  • image coaddition, differencing

• JHU node
– catalog calculations, integrated with database
  • tasks require lots of data, can be run in parallel
  • various statistical calculations, likelihood analyses
  • power spectra, correlation functions, Monte-Carlo

• Public access
– creating virtual data for NVO services (implemented later)

Page 6: Features of the SDSS

The SDSS Southern Survey

• Scanning a single stripe on the sky >30 times over
• Coaddition => extra depth
• Differencing => time dimension
• Multiple ways to combine the stripes
– Rerun the pipelines with custom parameters
– Build a new object catalog
– Perform particular science analysis (lensing map)

• On the right timescale to try the GriPhyN framework
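The two operations named above, coaddition and differencing, can be sketched on toy data. The example assumes equally weighted exposures with independent Gaussian noise, so the noise of the stack falls as 1/sqrt(N); real pipelines also need image registration, PSF matching, and weighting, none of which is modeled here. All function names and parameters are illustrative.

```python
import math
import random
import statistics


def coadd(exposures):
    """Pixel-wise mean of equally weighted exposures (lists of pixel values)."""
    n = len(exposures)
    return [sum(pixels) / n for pixels in zip(*exposures)]


def difference(exposure, reference):
    """Pixel-wise difference between one epoch and a reference image."""
    return [a - b for a, b in zip(exposure, reference)]


def depth_gain_mag(n_exposures):
    """Gain in limiting magnitude if noise averages down as 1/sqrt(n)."""
    return 2.5 * math.log10(math.sqrt(n_exposures))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
N_EXPOSURES, N_PIXELS, SIGMA = 30, 2000, 1.0

# Blank sky: every pixel is pure Gaussian noise around zero.
exposures = [[rng.gauss(0.0, SIGMA) for _ in range(N_PIXELS)]
             for _ in range(N_EXPOSURES)]
stack = coadd(exposures)

noise_ratio = statistics.pstdev(exposures[0]) / statistics.pstdev(stack)
print(f"noise ratio single/stack: {noise_ratio:.2f} "
      f"(sqrt(N) = {math.sqrt(N_EXPOSURES):.2f})")
print(f"depth gain: {depth_gain_mag(N_EXPOSURES):.2f} mag")

# A transient that appears in one epoch stands out in the difference image.
epoch = list(exposures[0])
epoch[100] += 10.0 * SIGMA
residual = difference(epoch, stack)
brightest = max(range(N_PIXELS), key=residual.__getitem__)
print("brightest residual pixel:", brightest)
```

Under the 1/sqrt(N) assumption, 30 scans of the stripe correspond to a depth gain of roughly 1.85 magnitudes, while the difference image isolates what changed between one epoch and the stack.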

Page 7: Features of the SDSS

Large Scale Statistical Analysis

• Galaxy distribution has non-trivial clustering patterns
– Reflects conditions in the early universe

• Spatial statistical tools to be run on object catalog, applying many different cuts to the data
– Spatial power spectrum
– Correlation functions

• These algorithms typically scale as N^2 or N^3 with the number of objects!

• Some of the analyses will partition well (likelihood), others will not (pair counts)
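To make the scaling and partitioning remarks concrete, the sketch below counts pairs of points closer than a given separation in two ways: a naive loop over all N(N-1)/2 pairs, and a grid-partitioned version. It is an illustration in a flat 2D plane rather than on the sky, and it is not taken from any SDSS code. Note how the partitioned version still has to look across cell boundaries, which is why pair counts do not split into fully independent pieces.

```python
import itertools
import math
import random
from collections import defaultdict


def pair_count_naive(points, r_max):
    """Count pairs closer than r_max by testing all N*(N-1)/2 pairs."""
    return sum(1 for p, q in itertools.combinations(points, 2)
               if math.dist(p, q) < r_max)


def pair_count_grid(points, r_max):
    """Same count, but only compare points in the same or adjacent cells."""
    cells = defaultdict(list)
    for p in points:
        cells[(int(p[0] // r_max), int(p[1] // r_max))].append(p)

    count = 0
    for (cx, cy), members in cells.items():
        # Pairs inside one cell.
        count += sum(1 for p, q in itertools.combinations(members, 2)
                     if math.dist(p, q) < r_max)
        # Pairs straddling a cell boundary; visiting only the "forward"
        # neighbours ensures each pair of adjacent cells is handled once.
        for dx, dy in ((1, -1), (1, 0), (1, 1), (0, 1)):
            for q in cells.get((cx + dx, cy + dy), ()):
                count += sum(1 for p in members if math.dist(p, q) < r_max)
    return count


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(500)]
naive = pair_count_naive(points, 0.05)
gridded = pair_count_grid(points, 0.05)
print("naive:", naive, "grid:", gridded)
```

For the more than 10^8 objects in the object catalog, the naive approach would have to test on the order of 5 x 10^15 pairs, so some form of spatial partitioning is needed even though the cells cannot be processed in complete isolation.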

Page 8: Features of the SDSS

Trends in Astronomy

Future dominated by detector improvements

Total area of 3m+ telescopes in the world in m^2, and total number of CCD pixels in megapixels, as a function of time. Growth over 25 years is a factor of 30 in glass, 3000 in pixels.

• Moore’s Law growth in CCD capabilities

• Gigapixel arrays on the horizon

• Improvements in computing and storage will track growth in data volume

• Investment in software is critical, and growing
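The growth factors quoted in the figure caption can be turned into doubling times, which makes the Moore's Law comparison easier to read. The sketch assumes steady exponential growth over the whole 25 years, which the slide does not state; the function name is illustrative.

```python
import math


def doubling_time_years(growth_factor: float, span_years: float) -> float:
    """Doubling time implied by exponential growth of growth_factor over span_years."""
    return span_years * math.log(2) / math.log(growth_factor)


glass = doubling_time_years(30, 25)      # factor 30 in collecting area
pixels = doubling_time_years(3000, 25)   # factor 3000 in CCD pixels

print(f"glass area doubles every ~{glass:.1f} years")
print(f"CCD pixels double every ~{pixels:.1f} years")
```

Under that assumption, collecting area doubles roughly every 5 years while pixel counts double roughly every 2 years, so data volume is driven far more by detectors than by glass.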

Page 9: Features of the SDSS

VO - The challenges

• Large number of new surveys
– multi-TB in size, 100 million objects or more
– individual archives planned, or under way

• Multi-wavelength view of the sky
– coverage in more than 13 wavelengths in 5 years

• Size of the archived data: 40,000 square degrees is 2 trillion pixels
– One band: 4 Terabytes
– Multi-wavelength: 10-100 Terabytes
– Time dimension: 10 Petabytes

• Current techniques inadequate
• Scalable hardware/networking requirements
• Transition to the new astronomy
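The archive-size figures above imply a pixel scale and a storage cost per pixel, which can be recovered with a few lines of arithmetic. The sketch uses only the numbers on the slide and assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes); the variable names are illustrative.

```python
ARCSEC_PER_DEG = 3600

sky_area_deg2 = 40_000   # "40,000 square degrees"
n_pixels = 2e12          # "2 trillion pixels"
band_bytes = 4e12        # "One band: 4 Terabytes", assuming 1 TB = 10^12 bytes
time_bytes = 10e15       # "Time dimension: 10 Petabytes"

# Side length of one pixel if the area is tiled uniformly.
area_arcsec2 = sky_area_deg2 * ARCSEC_PER_DEG ** 2
pixel_scale_arcsec = (area_arcsec2 / n_pixels) ** 0.5

bytes_per_pixel = band_bytes / n_pixels
single_band_epochs = time_bytes / band_bytes

print(f"implied pixel scale: ~{pixel_scale_arcsec:.2f} arcsec")
print(f"implied storage: {bytes_per_pixel:.0f} bytes per pixel")
print(f"10 PB holds ~{single_band_epochs:.0f} single-band passes over the sky")
```

The numbers are self-consistent with pixels of about half an arcsecond stored at 2 bytes each, and the 10 PB time-domain figure corresponds to a few thousand single-band passes at that resolution.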

MACHO, 2MASS, DENIS, SDSS, DPOSS, GSC-II, VISTA, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...


