Data acquisition, handling, and analysis at the Advanced Photon Source

Chris Jacobsen
Associate Division Director, X-ray Science Division, Advanced Photon Source
Professor, Physics & Astronomy; Applied Physics; Chemistry of Life Processes Institute; Northwestern University




[Figure: bar chart of data volume per beamline. One axis: data volume in TB/day, with ticks from 0.000 01 to 1000 in factors of ten. Other axis: Advanced Photon Source (APS) Beamline. Beamline labels recovered from the chart: 1-ID-1, 1-ID-2, 1-ID-3, 1-ID-4, 2-BM, 2-ID-B, 2-ID-D, 2-ID-E, 3-ID-B, 3-ID-C, 7, 8-BM, 8-ID-E, 8-ID-I, 9, 11-ID-B, 11-ID-C, 11-ID-D, 12-BM, 12-ID-B, 12-ID-C/D, 15-ID, 20, 21-ID, 21-ID-D, 21-ID-E, 21-ID-F, 21-ID-G, 22, 23-ID-B, 23-ID-D, 30, 32-ID-1, 32-ID-2, 34-ID.]

Data Volume at present (TB/day). Cumulative: 165 TB/day. Imaging highlighted.

[Diagram: data handling chain]
• Beamline computer (control, collect, visualize, analyze, distribute)
• Light source cluster (store, analyze, track)
• Computing center cluster (analyze)
• Computing center petascale storage (archive, track, distribute)
• Globus Online/GridFTP
• Tracking: where is the data? Verified? Provenance throughout analysis?
• Portable hard drives
• Detector


APS data rate: ~100 TB/day at present

LHC: about 15 petaBytes/year or ~80 TB/day (http://public.web.cern.ch/public/en/LHC/Computing-en.html)

Detectors are part of a data handling system.

Move to cloud computing using Hadoop?
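The "Verified?" question in the data handling chain can be made concrete with checksums. The following is a minimal sketch, not the APS workflow: it assumes a plain file copy as a stand-in for a real transfer tool such as GridFTP, and the file names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of_file(path, block_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()

def transfer_and_verify(source, destination):
    """Copy a file, then confirm the copy matches the source checksum."""
    expected = sha256_of_file(source)
    shutil.copyfile(source, destination)  # stand-in for a real transfer tool
    return sha256_of_file(destination) == expected

# Demonstration with a small temporary file standing in for detector data.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "frame_0001.raw")   # hypothetical file name
    dst = os.path.join(workdir, "frame_0001.copy")  # hypothetical file name
    with open(src, "wb") as handle:
        handle.write(bytes(range(256)) * 1024)
    verified = transfer_and_verify(src, dst)
    print("verified:", verified)
```

Reading in blocks keeps memory use flat regardless of file size, which matters once individual files reach many gigabytes.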


DECTRIS/PSI detectors
• Pilatus 6M, 20 bits at 12 frames/sec: 380 TB/hour flat out
• Eiger 16M, 12 bits at 187 frames/sec: 16,000 TB/hour flat out
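The raw output of a pixel detector follows from pixel count, bit depth, and frame rate. The sketch below works through that arithmetic for the two detectors listed, assuming exactly 6 million and 16 million pixels, tightly packed bits with no headers or padding, no compression, and 1 TB = 10^12 bytes. The function name is illustrative.

```python
def raw_rate_tb_per_hour(pixels, bits_per_pixel, frames_per_sec):
    """Uncompressed detector output in TB/hour (1 TB = 1e12 bytes)."""
    bytes_per_sec = pixels * bits_per_pixel / 8 * frames_per_sec
    return bytes_per_sec * 3600 / 1e12

# Nominal figures from the slide; pixel counts are rounded assumptions.
pilatus = raw_rate_tb_per_hour(6e6, 20, 12)
eiger = raw_rate_tb_per_hour(16e6, 12, 187)

print(f"Pilatus 6M: {pilatus:.2f} TB/hour")
print(f"Eiger 16M:  {eiger:.2f} TB/hour")
```

Under these assumptions the results are about 0.65 TB/hour for the Pilatus 6M and about 16 TB/hour (roughly 16,000 GB/hour) for the Eiger 16M, so the per-hour figures quoted above may have been intended in GB rather than TB.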


May 2012: Workshop on high speed data and HDF5 files held at Paul Scherrer Institut. Nicholas Schwarz of APS participated.


[Diagram: groups and responsibilities]
• APS/XSD beamline scientists
• MCS data pipelining
• Scientific collaborators (ANL and beyond)
• EMC electron microscopists
• Experiments, data collection, analysis, interpretation
• APS/AES SSG: Software Services
• APS/AES BCDA: Beamline controls
• APS/XSD DET: Detectors
• APS/AES IT: Information tech
• Software for data acquisition
• Initial data transfer, long term storage
• Data aggregation, distribution
• MCS mathematical analysis
• Mathematical methods and tools


Organizational complexity


• APS X-ray Science Division: beamline scientists, Detector Group, Scientific Software group (2 people)

• APS Engineering Services (AES): Beamline Controls and Data Acquisition (BCDA), Software Support Group (SSG), Information Technology (IT)

• Mathematics and Computer Science: multiple research groups, multiple large computing systems


[Figure: two data-process workflow diagrams, each divided into "At the Facility" and "At the Home Institution".]

Diagram 1 labels: experiment(s); data reduction; data analysis; reduced data; raw data; preliminary model; re-interpretation; refinement; optimizer; transformation; modeling.

Diagram 2 labels: experiment(s); data reduction; data analysis; publication, presentation, archival; visualization; modeling; reduced data, I(Q); raw data (2-D Intensity, E, T, P, t, etc.); adjustable parameters.


Data processes: changing models?


Figures courtesy F. De Carlo and P. Jemian, from “Living Data for Extreme-Scale Science Facilities”, submitted by I. Foster et al. to DoE’s Advanced Scientific Computing Research (ASCR) call, April 2012


[Figure: image panels labeled Visible light, Red blood cells, Algae, Yeast.]


Finding patterns in spectroscopic imaging

• Automatic classification of cell types followed by histograms of elemental content.

• S. Wang, J. Ward, S. Vogt et al.


J. Lehmann, D. Solomon, J. Kinyangi, L. Dathe, S. Wirick, and C. Jacobsen, Nature Geoscience 1, 238 (2008). Mathematical method: M. Lerotić, C. Jacobsen, T. Schäfer, S. Vogt, Ultramicroscopy 100, 35 (2004).
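As a rough illustration of "classification followed by histograms of elemental content", the sketch below groups cells by two shape features and then bins one element per group. It uses plain k-means as a stand-in and is not the method of the cited papers. The cell measurements, feature names, and units are invented for the example.

```python
import math
from collections import Counter

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as the initial centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical per-cell data: (area in um^2, aspect ratio, zinc content in fg).
cells = [
    (12.0, 1.1, 0.8), (50.0, 1.6, 3.1), (11.0, 1.0, 1.1), (52.0, 1.5, 2.7),
    (13.0, 1.2, 0.9), (49.0, 1.7, 3.4), (10.5, 1.1, 1.3), (51.0, 1.4, 2.9),
]

# Classify on the shape features only (unscaled here for brevity).
features = [(area, aspect) for area, aspect, _ in cells]
labels, centers = kmeans(features, k=2)

# Histogram of zinc content per class, using 1 fg wide bins.
bin_width = 1.0
histograms = {c: Counter() for c in range(2)}
for (_, _, zinc), label in zip(cells, labels):
    histograms[label][math.floor(zinc / bin_width)] += 1

for c, hist in histograms.items():
    print(f"class {c}:", dict(sorted(hist.items())))
```

With real data the features would normally be rescaled before clustering so that no single measurement dominates the distance.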


Storing and managing data
• Automated pipelines
• Metadata on experimental conditions
• Self-documented, compressible, platform-independent format with support for parallel computing: HDF5 (www.hdfgroup.org)
  – Data chunking has big effect on performance
  – Compression by dedicated computer?
  – Parallel writing of precompressed chunked data
  – HDF5 will soon have Single Writer/Multiple Reader (SWMR)
  – HDF5 has tentative plans for Multiple Writer/Multiple Reader (MWMR) but schedule depends on funding of The HDF Group (www.hdfgroup.org)
• Provenance: tracking what you did to process the data
• HDF5-based schema for metadata, provenance: DataExchange (www.aps.anl.gov/DataExchange/)
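The idea behind chunking and precompression can be sketched without the HDF5 library. In the example below, zlib stands in for whatever compression filter is used, a Python list stands in for the file's chunk index, and the chunk size and worker count are arbitrary assumptions. Each chunk is compressed independently, so the work can be spread across workers and any single chunk can later be read without touching the rest.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(data, chunk_size):
    """Cut a byte string into consecutive fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def compress_chunks(chunks, workers=4, level=6):
    """Compress every chunk independently, in parallel, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: zlib.compress(c, level), chunks))

def read_chunk(compressed_chunks, index):
    """Decompress one chunk without touching any of the others."""
    return zlib.decompress(compressed_chunks[index])

# Synthetic 1 MiB "frame" with a repetitive pattern, so it compresses well.
frame = bytes(i % 64 for i in range(1 << 20))
chunks = split_into_chunks(frame, 64 * 1024)
compressed = compress_chunks(chunks)

restored = b"".join(read_chunk(compressed, i) for i in range(len(compressed)))
print(len(chunks), "chunks,", sum(map(len, compressed)), "compressed bytes")
```

Chunk size is the main tuning knob: chunks that match the typical read pattern avoid decompressing data that is never used, while very small chunks compress poorly and add bookkeeping overhead.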


Argonne: team effort between APS beamline scientists, APS detector group, APS engineering support group, and Math and Computer Science division

APS: Nicholas Schwarz
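Provenance, in the sense of "tracking what you did to process the data", can be illustrated with a small processing log. This sketch is not the DataExchange schema: the record fields, step names, and the two toy processing functions are all hypothetical. Each step records its parameters together with checksums of its input and output, so the chain from raw to reduced data can be checked later.

```python
import hashlib
import json

def digest(data):
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def apply_step(data, name, func, params, log):
    """Run one processing step and append a provenance record to the log."""
    result = func(data, **params)
    log.append({
        "step": name,
        "params": params,
        "input_sha256": digest(data),
        "output_sha256": digest(result),
    })
    return result

def subtract_offset(data, offset):
    """Toy reduction step: subtract a constant, clipping at zero."""
    return bytes(max(value - offset, 0) for value in data)

def bin_pairs(data):
    """Toy reduction step: average neighboring values in pairs."""
    return bytes((data[i] + data[i + 1]) // 2
                 for i in range(0, len(data) - 1, 2))

log = []
raw = bytes([10, 12, 14, 16, 18, 20])
corrected = apply_step(raw, "subtract_offset", subtract_offset,
                       {"offset": 10}, log)
reduced = apply_step(corrected, "bin_pairs", bin_pairs, {}, log)

print(list(reduced))
print(json.dumps(log, indent=2))
```

Because each record's input checksum should equal the previous record's output checksum, a break in the chain shows where data was altered outside the logged steps.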


Sesame Street Science


Who are the people in your neighborhood?