
dada-STXM: parallel analysis of Eiger scanning-SAXS data

Scanning nano-SAXS: bridging real space and reciprocal space for biological imaging

► Small-angle X-ray scattering (SAXS): quantify structure sizes, morphology, orientations
  common approach: large ensemble averages in solution
  nano-SAXS: a focused X-ray beam averages over small volumes, so it becomes possible to see local structures and orientations

► real-space resolution: determined by beam size, 500 nm down to 50 nm

► scanning technique
  Example: 20 ms per point, 250×250 points, 400 nm steps: 100 µm square FOV, 40 minutes

► reciprocal-space resolution: scattering to largest q-vector, currently ~ 1 nm⁻¹

► label-free imaging of biological matter on the nano scale

► quantitative information channels available:
  darkfield (how many photons are scattered? – measure the electron density),
  differential phase contrast (where are the photons scattered? – gradients),
  azimuthal and radial analyses – quantify local ordering and structures
  e.g. study the actin cytoskeleton and understand the physical parameters of cells

Typical frame and data rates:
► 10 to 100 Hz EigerX 4M (up to 750 Hz possible)
► 50 to 500 megapixels per second
► ~ 50 MiB/s (compressed, LZ4-HDF5)
► ~ 1 GiB/s (uncompressed, for analysis)

Typical scan parameters:
► 50×50 to 250×250 (sometimes 1000×1000)
► from 1 ms to 50 ms per point
► 10¹¹ pixels in less than half an hour
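The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a detector frame of 4 Mi pixels at 2 bytes each (the "8 MiB @ int16_t" figure quoted elsewhere on the poster) rather than the exact sensor geometry, and the function names are hypothetical.

```python
# Back-of-the-envelope numbers for a scanning nano-SAXS measurement.
# Assumption: one frame = 4 Mi pixels of 2 bytes (8 MiB as int16),
# not the exact detector geometry.

PIXELS_PER_FRAME = 4 * 1024 * 1024
BYTES_PER_PIXEL = 2


def scan_summary(n_horz, n_vert, step_nm, exposure_s):
    """Number of points, field of view, pure exposure time and pixel count."""
    points = n_horz * n_vert
    return {
        "points": points,
        "fov_um": (n_horz * step_nm / 1000.0, n_vert * step_nm / 1000.0),
        "exposure_min": points * exposure_s / 60.0,
        "total_pixels": points * PIXELS_PER_FRAME,
    }


def raw_rate_mib_per_s(frame_rate_hz):
    """Uncompressed data rate at a given frame rate."""
    return frame_rate_hz * PIXELS_PER_FRAME * BYTES_PER_PIXEL / 2**20


summary = scan_summary(250, 250, step_nm=400, exposure_s=0.020)
print(summary["fov_um"])                  # field of view in micrometres
print(round(summary["exposure_min"], 1))  # exposure time alone, in minutes
print(f"{summary['total_pixels']:.1e}")   # detector pixels recorded in the scan
print(raw_rate_mib_per_s(100))            # MiB/s at 100 Hz
```

Under these assumptions the exposure time alone comes to roughly 21 minutes, so the 40 minutes quoted for the example scan covers more than the sketch models. The pixel count for a 250×250 scan lands in the 10¹¹ range and the uncompressed rate at 100 Hz is close to the ~ 1 GiB/s stated above.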

References
[1] M. Bernhardt, J.-D. Nicolas, M. Eckermann, B. Eltzner, F. Rehfeldt, T. Salditt: Anisotropic x-ray scattering and orientation fields in cardiac tissue cells, New Journal of Physics 19, 2017.
[2] B. Weinhausen, J.F. Nolting, C. Olendrowitz, J. Langfahl-Klabes, M. Reynolds, T. Salditt, S. Köster: X-ray nano-diffraction on cytoskeletal networks, New Journal of Physics 14, 2012.
[3] M. Priebe, M. Bernhardt, C. Blum, M. Tarantola, E. Bodenschatz, T. Salditt: Scanning X-Ray Nanodiffraction on Dictyostelium discoideum, Biophysical Journal 107, 2014.
[4] J.-D. Nicolas, M. Bernhardt, M. Krenkel, C. Richter, S. Luther, T. Salditt: Combined scanning X-ray diffraction and holographic imaging of cardiomyocytes, J. Appl. Crystallogr. 50, 2017.
[5] M. Osterhoff: dada – a web-based 2D detector analysis tool, J. Phys.: Conf. Ser. 849, 2017 (XRM 2016).

Acknowledgements
We thank our workshops (electrical and mechanical), especially Thomas Pingel, for support and construction.
We thank DESY Photon Science, especially André Rothkirch, for discussion and support during beamtime.

Single 255×255 STXM scan:
http://heinzel/stxm/GINIX/run60/eiger/desylo/1/1?horz=255&vert=255
+ parameter + colour map etc.

Parallel regions, e.g.
…/1/1?horz=255&vert=10
…/1/2551?horz=255&vert=10
…/1/5101?horz=255&vert=10
…/1/7651?horz=255&vert=10

[Diagram: request flow]
User: web browser
  dada18.html: GUI to define analysis and parameters
  dada18.js: generates the URL
Proxy: lighttpd as load balancer, forwards requests to h001 … h024
Caching database: aerospike
dada18: running on all Heinzelmännchen
  load / process and analyse
  sends results to the user and to the cache

Parallel jobs via HTTP proxy
► parallelisation is “easy”: many independent Eiger images
► optimal strategy depends on scan geometry; e.g. stitching-STXM
► full scan is broken down into patches with individual URLs
► multiple requests to HTTP proxy, lighttpd acts as load balancer
► patches are “glued” to full dataset
► all data can be imported from e.g. Matlab / Python / whatever by URLs
► URLs are “human-readable”
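The patch scheme can be sketched in a few lines of Python. This is an illustration, not the dada18 implementation. It assumes frames are numbered from 1 in row-major order and that the last path component is the first frame of a patch, which is inferred from the example URLs on the poster. It also assumes the spaces in the printed URL are line-break artefacts. `patch_urls` and `glue_rows` are hypothetical helper names, and no assumption is made about the response format.

```python
# Sketch: split a full STXM scan into row patches, one URL per patch.
# Assumptions (not confirmed by the poster): frames are numbered from 1
# in row-major order; the last path component is the patch's first frame.

BASE = "http://heinzel/stxm/GINIX/run60/eiger/desylo/1"


def patch_urls(base, horz, vert, rows_per_patch):
    """Return one URL per patch of at most rows_per_patch scan rows."""
    urls = []
    for first_row in range(0, vert, rows_per_patch):
        rows = min(rows_per_patch, vert - first_row)  # last patch may be shorter
        first_frame = 1 + first_row * horz
        urls.append(f"{base}/{first_frame}?horz={horz}&vert={rows}")
    return urls


def glue_rows(patches):
    """Concatenate patch results (each a list of rows) in request order."""
    full = []
    for patch in patches:
        full.extend(patch)
    return full


urls = patch_urls(BASE, horz=255, vert=255, rows_per_patch=10)
for url in urls[:4]:
    print(url)
```

With ten rows per patch, the first four URLs reproduce the start frames 1, 2551, 5101 and 7651 listed above. Each URL can then be requested independently through the proxy and the returned rows concatenated in order.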

[Figure, panels a–d: example STXM maps with regions marked 1 to 4. Colour scales: intensity I [ph./s] on linear (0.00 to 0.40; 1.00E5 to 1.80E7) and logarithmic (10^0.0 to 10^2.0) scales; ωpa [1]; θpa [°] and θfs [°] (0 to 180, linear scale).]

Performance Benchmarks

[Plot: speedup vs. number of threads (1 to 12) compared with ideal scaling, for “Crabat” (→ 10× @ 32 cores), “Dancer” and 8 PCs; second axis: % calculation vs. latency (20 to 100).]

Estimated analysis times, 1000×1000 scan

[Plot: estimated analysis time (axis ticks 1 min, 4 min, 16 min, 1 h, 4 h, 16 h) vs. nodes × cores (1×1, 1×4, 8×1, 8×4, 16×4, 24×4, 32×4, 48×4, 64×4).]

Poor scaling on multi-core systems
► current trend: more cores per CPU, but ≤ 2 MiB cache per core
► full EigerX 4M image: 8 MiB @ int16_t
► limited speed-up on multi-core
► the share of calculation time relative to latency decreases
► Bottleneck: CPU cache
  throughput from RAM okay, but latency too high
  multi-core analysis is faster than data can flow into the CPU
► 24 Heinzelmännchen: analysing 1000×1000 images in ten minutes
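To put the ten-minute target into per-node terms, the following sketch works out the frame rates it implies. It assumes 4 cores per node (96 cores across 24 systems, as stated in the hardware list) and uses the 8 MiB frame size and the ≤ 2 MiB cache per core quoted above. The function name is hypothetical.

```python
# Sketch: frame rates implied by analysing 1000×1000 frames in ten minutes.
# Assumption: 24 nodes with 4 cores each (96 cores in total).

FRAME_MIB = 8           # full frame as int16, as quoted on the poster
CACHE_PER_CORE_MIB = 2  # upper bound quoted on the poster


def required_rates(n_frames, wall_time_s, n_nodes, cores_per_node):
    """Frames per second needed in total, per node and per core."""
    total = n_frames / wall_time_s
    per_node = total / n_nodes
    per_core = per_node / cores_per_node
    return total, per_node, per_core


total, per_node, per_core = required_rates(1000 * 1000, 10 * 60, 24, 4)
print(f"total:    {total:7.1f} frames/s")
print(f"per node: {per_node:7.1f} frames/s")
print(f"per core: {per_core:7.1f} frames/s")
print(f"uncompressed data per node: {per_node * FRAME_MIB:.0f} MiB/s")
print(f"frame size / cache per core: {FRAME_MIB / CACHE_PER_CORE_MIB:.0f}x")
```

Under these assumptions each core has to get through roughly 17 frames per second, and every frame is at least four times larger than the cache available to that core. This is consistent with the cache bottleneck described above.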

Graphical user interface to generate URLs for analysis

browse detector images, define ROIs, basic pre-processing (e.g. division by or subtraction of an empty image; stack operations)

Composites (2D array of 2D images)

STXM analysis with different algorithms

new: snapshots to share your analysis with colleagues

dada18: web GUI and re-worked C backend
► dada, the data daemon [5]: centralised entry node to data
  ► do not worry about file name, folder structure, data format, compression et al. any longer
  ► reducing obstacles for new students
► centralised entry node to analysis
  ► collecting our students’ developments, so the next generation can use old methods on new data
  ► after testing: optimising performance
► web GUI for the user, HTTP interface for software
  ► GUI to generate URLs for analysis
  ► browsing & pre-processing; composites (2D array of 2D images); STXM analysis, different algorithms
  ► new: snapshots and parallel jobs

Remedy: more cache = more CPUs = more boards
► STXM cluster of individual systems
► 24 systems (“Heinzelmännchen”):
  Boards: ASUS H110M-A M.2
  CPUs: Intel Core i7-7700 @ 3.60 GHz
  RAM: 8 GiB per node
► 1 control system (“Heinzelfrau”):
  CPU: Intel Core i7-8700 @ 3.20 GHz
  RAM: 64 GiB
  SSD: 1× Samsung 850 PRO 256 GiB
  cache: 5× Sandisk Ultra II 960 GiB
► network: 2×10G uplink from Heinzelfrau, 1G per node; 10G external to NFS server
► 96 cores, 192 MiB L3 cache: “cheap, but many”

under construction *now*

[Diagram: storage and network layout]
File server homer4b: NetApp, ≥ 150 TiB, 2×8 G SAN
  ↓ 10 G Ethernet, NFS
heinzelfrau: 3.4 TiB data cache, 128 GiB results cache, 60 GiB page cache
  ↓ 2×10 G to 24×1 G; fuse-based client, TCP + jumbo frames, read-only
h001 … h024: 128 MiB RAM cache for metadata + recent data

dadafs: network filesystem with multiple caches
► raw data accessible via NFS, only visible on Heinzelfrau
► Heinzelmännchen mount via fuse-based dadafs client
► local caching: recently accessed data, metadata, i.e. calls to stat(2)
► Heinzelfrau caching: dadafsd server caches accessed file fragments on SSD (ZFS pool)
► results: cached in aerospike DB (distributed caching database)
► faster than NFS and Samba
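The layered caching described above can be illustrated with a generic read-through cache. The sketch below is not dadafs code: the class, the function, the tier names and the key are hypothetical, eviction and size limits are omitted, and plain dictionaries stand in for the RAM cache on a node and the SSD pool on the control system.

```python
# Sketch of a read-through cache with several tiers (illustrative only).

class Tier:
    """One cache level, e.g. RAM on a node or the SSD pool on the control system."""

    def __init__(self, name):
        self.name = name
        self.store = {}


def read_fragment(key, tiers, read_origin):
    """Return (data, source): try each tier in order, then the origin.

    Every tier that missed is filled on the way back, so the next
    request for the same key is answered closer to the client.
    """
    missed = []
    for tier in tiers:
        if key in tier.store:
            data, source = tier.store[key], tier.name
            break
        missed.append(tier)
    else:
        data, source = read_origin(key), "origin"
    for tier in missed:
        tier.store[key] = data
    return data, source


origin_reads = []


def read_from_nfs(key):
    """Stand-in for the slow read from the file server."""
    origin_reads.append(key)
    return b"frame data for " + key.encode()


tiers = [Tier("local RAM"), Tier("Heinzelfrau SSD")]
key = "scan/frame_000001"  # hypothetical fragment identifier
print(read_fragment(key, tiers, read_from_nfs)[1])  # origin
print(read_fragment(key, tiers, read_from_nfs)[1])  # local RAM
```

The first request goes all the way to the origin, while the repeat is answered from the nearest tier. If the local tier has dropped the entry, the request falls through to the next tier instead of the file server.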

[Figure: experimental setup with focusing optics, pinhole, piezo scanner, alignment microscope and pixel detector.]

Automated segmentation
Cells on 3 mm Si₃N₄ membrane
Chiara Cassini, S. Köster et al.

[1]

Figure 4. Orientation map and dark-field image of the keratin network in a freeze-dried eukaryotic cell reconstructed from a mesh scan with a step size of 100 nm and 1 s exposure time. The inset shows a fluorescence microscopy image of the keratin network recorded before freeze-drying and the scanned region is marked by a red box.

3.2. Radial intensity of averaged and single diffraction patterns

At each scan point a scattering pattern in reciprocal space is recorded. The scattering signal is analyzed by azimuthal integration of single and averaged diffraction patterns. To account for anisotropy in the scattering signal, the reciprocal space is divided into eight angular segments and the first and fifth segments are centered symmetrically around the major axis of the ellipse approximating the scattering signal. Average diffraction patterns recorded on the cellular extension and on the empty region are shown in figures 5(a) and (b) and the angular segments are indicated by dashed, white lines. For azimuthal integration of the average diffraction pattern from the empty region, the same positions of the angular segments as for the cell region are used. The insets indicate from which regions of the scan the diffraction patterns are used to obtain an average diffraction pattern from the cellular extension and the empty region. Only …

New Journal of Physics 14 (2012) 085013 (http://www.njp.org/)

[2]
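The segment-wise azimuthal integration described in the excerpt can be sketched as follows. This is a minimal pure-Python illustration, not the code used in [2]. The function name, the bin width and the convention that the major-axis angle is given in radians in pixel coordinates are assumptions. Segment 0 here corresponds to the "first" segment and segment 4 to the "fifth".

```python
import math


def segment_profiles(image, center, major_axis_rad, n_segments=8, bin_width=1.0):
    """Mean intensity per (angular segment, radial bin).

    image: list of rows (image[y][x]); center: (cx, cy) in pixels.
    Segment 0 is centred on the major axis, so segment n_segments // 2
    is centred on the opposite direction.
    """
    cx, cy = center
    seg_width = 2 * math.pi / n_segments
    sums = [dict() for _ in range(n_segments)]
    counts = [dict() for _ in range(n_segments)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            dx, dy = x - cx, y - cy
            r = math.hypot(dx, dy)
            if r == 0:
                continue  # the centre pixel has no defined angle
            # Shift by half a segment so that segment 0 straddles the major axis.
            phi = (math.atan2(dy, dx) - major_axis_rad + seg_width / 2) % (2 * math.pi)
            seg = min(int(phi / seg_width), n_segments - 1)
            rbin = int(r / bin_width)
            sums[seg][rbin] = sums[seg].get(rbin, 0.0) + value
            counts[seg][rbin] = counts[seg].get(rbin, 0) + 1
    return [
        {rbin: sums[s][rbin] / counts[s][rbin] for rbin in sorted(sums[s])}
        for s in range(n_segments)
    ]


# Tiny example: two bright pixels on opposite sides of the centre.
image = [[0.0] * 5 for _ in range(5)]
image[2][4] = 8.0  # along the major axis (angle 0)
image[2][0] = 4.0  # opposite direction
profiles = segment_profiles(image, center=(2, 2), major_axis_rad=0.0)
print(profiles[0])
print(profiles[4])
```

Each returned dictionary maps a radial bin index to the mean intensity in that segment. Converting bin indices to q-values would additionally require the detector distance, pixel size and photon energy, which are not part of this sketch.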

[3] [4]

Institut für Röntgenphysik – Friedrich-Hund-Platz 1 – 37077 Göttingen
Markus Osterhoff – Jan Goeman – Sarah Köster – Tim Salditt
