
NEURAL NETWORK HIGH PRECISION PROCESSING FOR ASTRONOMICAL IMAGES

Rossella Cancelliere1, Mario Gai2

1Department of Computer Science, University of Turin, c. Svizzera 185, 10149, Torino, Italy

phone: +390116706777, fax: +39011751603, email: [email protected]

2INAF - Astron. Observatory of Turin, V. Osservatorio 20, 10025 Pino T.se (TO), Italy

phone: +390118101943, fax: +390118101930, email: [email protected]

ABSTRACT

In this paper we deal with the diagnosis and removal of chromaticity, a relevant source of error in high precision astrometric measurements, using a feed-forward neural network, and we focus on the usefulness of carefully optimised image processing.

The first problem we study is the image construction via Fourier transform, for which we suggest an effective evaluation method no longer involving the FFT algorithm but direct matrix multiplication. The second problem is related to the necessity of a good choice of the parameters used to encode the images, which we solved with careful preprocessing and filtering; these parameters are then used as inputs to a feed-forward neural network trained by backpropagation to remove chromaticity.

1. INTRODUCTION

In the framework of modern astronomical data processing, the mathematical and computing tools must meet challenging requirements on resolution and precision, consistent with the measurement goals, e.g. an image position accuracy 1000 times smaller than the typical image size.

The measurement is therefore sensitive to small discrepancies between the real signal and its expected profile, so that it is crucial to implement the capability of modelling and calibrating the latter (i.e. the instrument or signal model) at an unprecedented level of precision.

To this purpose, the resolution provided by the DFT may sometimes not be adequate, or at least it is necessary to define the variables in adequate detail. A way to increase the resolution is to add extra zeros to the vector containing the pupil function values; in this paper we apply this method together with a different approach to the DFT evaluation, no longer involving the usual FFT technique.

The position estimate of a stellar source produced by any location algorithm (e.g. the centre of gravity, COG, or barycentre) is affected by the discrepancy of the real image from the one generated by an ideal optical system, i.e. from the nominal position.

The variation of the apparent position with the source spectral distribution is what we call chromaticity, and it is relevant to high precision astrometry. An analysis of chromaticity versus aberrations, optical design aspects, and optical engineering issues is presented in [1], which also deals with design optimisation guidelines.

The imaging problem can be considered as the formulation of appropriate analytical or computing tools for the description of the intensity distribution generated on the focal plane of an optical system.

For an unobstructed circular pupil of diameter D, at wavelength λ, the image of a star, considered as a point-like source at infinity and produced by an ideal telescope, has radial symmetry and is described by the squared Airy function:

$I(r) = k\,[2 J_1(\nu)/\nu]^2\,, \qquad \nu = 2\pi r\,. \qquad (1)$

(see [2] for notation). $J_1$ is the Bessel function of the first kind, order one, $k$ a normalisation constant, and $r = D/2$ the aperture radius.

The diffraction image on the focal plane of any real telescope, described by a set of aberration values for a given pupil geometry, is deduced from the square modulus of the Fourier transform of the pupil function $e^{i\Phi}$:

$$I(r,\phi) = \frac{k}{\pi^2}\left|\int d\rho \int d\theta\,\rho\, e^{i\Phi(\rho,\theta)}\, e^{-i\pi r \rho \cos(\theta-\phi)}\right|^2 \qquad (2)$$

where $\{r,\phi\}$ and $\{\rho,\theta\}$ are the radial coordinates, respectively on the image and pupil plane, and the integration domain corresponds to the pupil (for the circular case, $0 \leq \rho \leq 1$ and $0 \leq \theta \leq 2\pi$).

The phase aberration $\Phi$ describes the wavefront error (WFE), i.e. in a real case the deviation from the ideal flat wavefront, and can be decomposed e.g. by means of the first 21 terms of the Zernike functions (as described in [2]):

$$\Phi(\rho,\theta) = \frac{2\pi}{\lambda}\,\mathrm{WFE} = \frac{2\pi}{\lambda}\sum_{n=1}^{21} A_n\,\phi_n(\rho,\theta)\,. \qquad (3)$$

If $\Phi = 0$ (non-aberrated case, $\{A_n\} = 0$), we obtain a flat wavefront, i.e. WFE = 0, and Eq. (1) is retrieved for the circular pupil.

The main part of Eq. (2) can easily be recognised as aFourier transform, so that it is evident that the implementa-tion of this basic tool is crucial to the overall model perfor-mance.
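To make this concrete, the following minimal sketch (our illustration, not the authors' code) evaluates Eq. (2) for a circular pupil by sampling the pupil function $e^{i\Phi}$ on a grid and taking the squared modulus of its zero-padded 2-D FFT; the grid size K and padding factor are arbitrary choices.

```python
import numpy as np

def pupil_psf(K=64, pad=8, phi=None):
    """PSF as the squared modulus of the FT of the pupil function exp(i*Phi).

    K   : samples across the pupil diameter
    pad : zero-padding factor (increases focal-plane resolution)
    phi : phase aberration Phi sampled on the K x K grid (None -> flat wavefront)
    """
    x = np.linspace(-1.0, 1.0, K)
    X, Y = np.meshgrid(x, x)
    rho = np.hypot(X, Y)
    pupil = (rho <= 1.0).astype(complex)      # unit circular aperture
    if phi is not None:
        pupil *= np.exp(1j * phi)             # apply wavefront error
    # zero-pad to a (pad*K) x (pad*K) array, then FFT -> focal-plane field
    field = np.fft.fftshift(np.fft.fft2(pupil, s=(pad * K, pad * K)))
    psf = np.abs(field) ** 2
    return psf / psf.max()                    # normalise the peak to 1

psf = pupil_psf()
```

With `phi=None` the flat-wavefront case of Eq. (1) is reproduced, with the peak of the Airy pattern at the array centre.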

In this paper we also investigate the diagnostic capabilities of a sigmoidal neural network (see [3] for basic definitions) for identification of the chromatic effect, on the basis of the previously suggested method to improve resolution; in [4] a

©2007 EURASIP 1774

15th European Signal Processing Conference (EUSIPCO 2007), Poznan, Poland, September 3-7, 2007, copyright by EURASIP


sigmoidal neural network was used for correction of the chromaticity in the framework of the astrometric mission Gaia [5], which remains the reference context in this work as well. With respect to that previous work, here we increase the computational efficiency by evaluating the Fourier transform via direct matrix multiplication, and we compare the results obtained with those of the classical FFT algorithm.

In Section 2 we briefly describe the data processing techniques, and in Section 3 we describe the current results.

2. DATA PROCESSING

In this section we describe the generation of the Fourier transform, the identification of the most convenient image parameters, the main features of the data, and the generation of the training and test sets. Below, we describe the usage of the Discrete Fourier Transform in our application, the image encoding by moments, and the data filtering.

2.1 Discrete Fourier Transform

There are many science applications for which an accurate frequency spectrum or Fourier transform of the signal is necessary. The Discrete Fourier Transform (DFT) of a sequence q(k) of samples of a signal is:

$$Q(\Omega) = \sum_{k=0}^{K-1} q(k)\,\exp(-jk\Omega) \qquad (4)$$

where $\Omega = \omega T$, $T$ is the sampling interval and $q(k) = q(kT)$ is the $k$-th sample of the signal $q(t)$.

In this equation we assume that the signal has a finite duration, so that there are only K contiguous nonzero samples. Usually Q(Ω) is evaluated at a set of K evenly spaced points in the interval [0, 2π] by the algorithm called Fast Fourier Transform (FFT).

The DFT gives the Fourier transform of the signal at the points $\Omega = 2\pi n/K$ ($n = 0 \ldots K-1$), so in terms of real frequencies we have a resolution $\Delta\Omega = 2\pi/(KT)$, or $\Delta f = 1/(KT)$. When this resolution is not sufficient, it can be increased by simply padding the samples with extra zeros; if we want to multiply the frequency resolution by M, we can add (M − 1)K zeros to the sequence q(k), thus obtaining a new sequence of length N = KM. The alternative of simple interpolation is often acceptable, but the choice requires caution with respect to the effective information content (signal resolution and duration) and to the possible introduction of artefacts.

The zero-padding approach is associated with a computational cost, for the usual FFT algorithm, of order $N \log N = MK \log(MK)$; however, in many applications it is necessary to increase the resolution only within a small frequency interval, which makes the evaluation of the DFT by direct matrix multiplication convenient. In this case the computational cost of the Fourier transform of M samples for K frequency values in a given interval is MK, i.e. smaller than the cost of the FFT algorithm by a factor $\log(MK)$.
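A minimal sketch of this band-limited evaluation (our illustration; the sizes and the frequency band are arbitrary), checked against the zero-padded FFT:

```python
import numpy as np

K, M = 32, 8                        # samples and resolution multiplier
rng = np.random.default_rng(0)
q = rng.standard_normal(K)          # signal samples q(k)

# Reference: zero-pad to N = M*K samples and take the full FFT
N = M * K
Q_fft = np.fft.fft(q, n=N)          # Q(Omega) at Omega = 2*pi*n/N

# Direct DFT restricted to a narrow frequency band, as one matrix product;
# cost ~ (band size) x K instead of N log N for the full padded FFT
band = np.arange(40, 56)                          # frequency indices of interest
omega = 2 * np.pi * band / N                      # Omega values in the band
W = np.exp(-1j * np.outer(omega, np.arange(K)))   # (band, K) DFT matrix
Q_band = W @ q

assert np.allclose(Q_band, Q_fft[band])           # same values, fewer operations
```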

The practical implementation can be further optimised, of course; besides, the estimate is valid only within the computation-limited regime: for any real computer, at increasing size of the processed arrays the case becomes input/output-limited, when the physical memory is saturated and the virtual memory mechanisms start swapping data towards the mass storage devices.

[Plot: processing time [s] vs. number of points, logarithmic vertical scale, with curves for FFT and DFT.]

Figure 1: FFT and DFT processing times as functions of K.

We verify these considerations by performing the FT evaluation by both DFT and FFT, on square format images, over the range 10 ≤ K ≤ 65 points: the results are shown in Fig. 1, and it is possible to see that the DFT is always faster.

Since the actual image format for the FFT is larger than K because of the padding, the largest array considered is 4K × 4K, i.e. 16 mega-pixels. At this point, our desktop computers already incur significant virtual memory access.

2.2 Image encoding and filtering

To maximise the field of view, i.e. to observe simultaneously a large area, typical astronomical images are sampled over a small number of pixels.

The minimum sampling requirement, related to the Nyquist-Shannon criterion, is of the order of two pixels over the full width at half maximum, or about five pixels within the central diffraction peak. The signal detected in each pixel is then affected by strong variations depending on the initial phase (or relative position) of the parent intensity distribution (the continuous image) with respect to the pixel array, even in a noiseless case. The pixel intensity distribution of the measured images, then, is not convenient for evaluating the discrepancy of the effective image with respect to the nominal image.

In the encoding scheme we adopt, each input image is described by the centre of gravity and the first central moments, as follows:

$$\mu_y = \int dy\; y \cdot I(y)\,/\,I_{\mathrm{int}}$$
$$\sigma_y^2 = \int dy\; (y-\mu_y)^2 \cdot I(y)\,/\,I_{\mathrm{int}}$$
$$M(j) = \int dy\,\left(\frac{y-\mu_y}{\sigma_y}\right)^{j} \cdot I(y)\,/\,I_{\mathrm{int}}\,, \quad j > 2 \qquad (5)$$

where $I_{\mathrm{int}} = \int dy\, I(y)$ is the integrated photometric level of the measurement.
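As an illustrative sketch (not the authors' code), the moments of Eq. (5) can be computed for a sampled one-dimensional profile by replacing the integrals with sums over pixels; the Gaussian test profile below is an arbitrary example.

```python
import numpy as np

def image_moments(y, I, orders=(3, 5)):
    """COG, RMS width and standardized central moments of a 1-D profile I(y),
    following Eq. (5) with the integrals replaced by sums over pixels."""
    I_int = I.sum()                                       # integrated photometric level
    mu = (y * I).sum() / I_int                            # centre of gravity
    sigma = np.sqrt((((y - mu) ** 2) * I).sum() / I_int)  # RMS width
    M = {j: ((((y - mu) / sigma) ** j) * I).sum() / I_int for j in orders}
    return mu, sigma, M

# Example: a symmetric (Gaussian) profile has vanishing odd moments
y = np.linspace(-10.0, 10.0, 2001)
I = np.exp(-0.5 * (y - 1.2) ** 2)
mu, sigma, M = image_moments(y, I)
# mu ~ 1.2, sigma ~ 1, M[3] ~ 0
```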

The central moments are much less sensitive than thepixel intensity values to the effects related to the finite pixelsize and the position of the image peak with respect to thepixel borders, i.e. the relative phase between optical imageand pixel array. The encoding technique based on using mo-ments as image description parameters for neural processing



Figure 2: Distribution of chromaticity vs. image COG, RMSwidth and M(3) over the train set.

Figure 3: Sigmoidal neural network with one hidden layer

was first introduced in [6], where it was applied to the detection of astronomical Seidel aberrations.

We verify that the across-scan moments (x in the Gaia reference frame) are all irrelevant, i.e. their effect on chromaticity is negligible. Usage of the standard one-dimensional science data is therefore appropriate.

Some of the along-scan (y) moments do not show an apparent signature associated with chromaticity, as can be seen in Fig. 2, where chromaticity vs. image COG (top plot) and root mean square width (central plot) are shown; also, the other even order moments do not show an apparent structure in their dependence on chromaticity, so they are not used as input features to the neural network.

The odd moment selection was verified on the neural network through a pruning technique, i.e. removing some of them until reaching the minimum number of parameters compatible with good convergence of the training.

The neural network input can therefore be defined in terms of the local instrument response, encoded in the nominal moments for a reference star, and of the individual measurement moments. The COG of the reference object is the deviation of the image position with respect to an ideal system, and it is associated with the classical distortion. The other reference object inputs we used are the third and fifth order moments.

The input associated with the measured signal, from a star of unknown type, is a simple pair of values, i.e. the variation in the third and fifth order moments with respect to the known reference case. The moments are all computed with straightforward operations from the measured data, as well as their variation with respect to the nominal moment values of a selected reference spectral type.

In this work we use the multilayer perceptron, first introduced in 1986 (see [3]) as an extension of the perceptron model [7].

The multilayer perceptron, with sigmoidal units in the hidden layers, is one of the best known and most widely used NN models: it computes distances in the input space using a metric based on inner products, and it is usually trained by the backpropagation algorithm. The architecture of a sigmoidal neural network is schematically shown in Fig. 3, which depicts the most common three-layer case.

The network is described by Eqs. (6):

$$a_j^{k+1} = \sum_{j'} w_{jj'}\, o_{j'}^{k} + \mathrm{bias}_j$$
$$o_j^{k+1} = \sigma\!\left(a_j^{k+1}\right) \equiv \frac{1}{1 + e^{-a_j^{k+1}}}$$
$$o_m^{\mathrm{out}} \equiv \sum_{j} w_{mj}\, o_j^{\mathrm{out}-1} \qquad (6)$$

Here $a$ is the input to each unit, $o$ is its output, and $w_{ij}$ is the weight associated to the connection between units $i$ and $j$; each unit is identified by two indices, a superscript specifying its layer (i.e. input, hidden or output layer) and a subscript labelling each unit within a layer.

The training procedure aims to find the best set of weights $w_{ij}$ solving the approximation problem $o(x_i) \approx F(x_i)$; this is usually achieved by the iterative process corresponding to the standard backpropagation algorithm.

At each step, each weight is modified according to the gradient descent rule (a more detailed description can be found in [3]), completed with the momentum term: $w_{ij} \leftarrow w_{ij} + \Delta w_{ij}$, with $\Delta w_{ij} = -\eta\,\partial E/\partial w_{ij}$, where $E$ is the error functional. This procedure is iterated many times over the complete set of examples $\{x_i, F(x_i)\}$ (the training set), and under appropriate conditions it converges to a suitable set of weights defining the desired approximating function.
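A minimal sketch of this update rule for a one-hidden-layer network of the kind in Eq. (6) (our illustration: the sizes, learning rate and momentum coefficient below are arbitrary, not the values used in the paper; the hidden bias is kept fixed for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid = 5, 20                  # the paper uses 5 inputs and 300 hidden units
W1 = rng.standard_normal((n_hid, n_in)) * 0.1   # input -> hidden weights
b1 = np.zeros(n_hid)                             # hidden biases (fixed here)
W2 = rng.standard_normal(n_hid) * 0.1            # hidden -> output weights

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def forward(x):
    o_hid = sigmoid(W1 @ x + b1)     # hidden layer activations, as in Eq. (6)
    return W2 @ o_hid, o_hid         # linear output unit

eta, alpha = 0.01, 0.5               # learning rate and momentum coefficient
dW1 = np.zeros_like(W1)
dW2 = np.zeros_like(W2)

def train_step(x, target):
    """One gradient-descent step with momentum on E = 0.5*(out - target)^2."""
    global W1, W2, dW1, dW2
    out, o_hid = forward(x)
    err = out - target                                  # dE/d(out)
    g2 = err * o_hid                                    # dE/dW2
    g1 = np.outer(err * W2 * o_hid * (1 - o_hid), x)    # dE/dW1 (chain rule)
    dW2 = -eta * g2 + alpha * dW2    # Delta w = -eta dE/dw, plus momentum term
    dW1 = -eta * g1 + alpha * dW1
    W2 = W2 + dW2
    W1 = W1 + dW1
    return 0.5 * err ** 2
```

Iterating this step over the training examples until the error functional drops below a pre-selected threshold corresponds to the convergence criterion described in the text.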

Convergence is usually defined in terms of the error functional, evaluated over the whole training set; when a pre-selected threshold $E_T$ is reached, the NN can be tested using a different set of data $\{x'_i, F(x'_i)\}$, the so-called test set.

In this work we start from a reasonable sampling of theaberration space, using a uniform random distribution ofthe 21 lowest order Zernike coefficients, within the range±50 nm on each term.

For each aberration case, defined by the set of 21 Zernike coefficient values, we build the PSF for two source cases, i.e. emission in the blue or red spectrum; on each PSF, the photo-centre position is evaluated as the COG, and the statistical moments up to order five are computed, as defined in the


sections that follow. The chromaticity is directly derived asCOG difference.

The almost linear dependence in the distribution of chromaticity with respect to M(3) for the training set, shown in Fig. 2 (bottom plot), suggests that some preprocessing can be useful in order to ease the subsequent neural processing.

Taking advantage of this data structure, we tried subtracting the linear trend from the target values (the chromaticity) in the training set, so defining a preprocessed chromaticity; this preprocessing is supposed to ease the NN learning task. The inverse transformation is applied to the output data on the test set.

The best-fit parameters of the linear trend are 283.15 µas/mas (slope) and 0.45 µas (offset); the interesting results obtained are discussed in the next section.
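The detrending and its inverse can be sketched as follows (synthetic data for illustration only; the fitted slope and offset play the role of the paper's 283.15 µas/mas and 0.45 µas):

```python
import numpy as np

rng = np.random.default_rng(2)
m3 = rng.uniform(-1.5, 1.5, 1000)                      # M(3) values (arbitrary units)
chrom = 283.15 * m3 + 0.45 + rng.normal(0, 20, 1000)   # chromaticity with linear trend

# Fit the linear trend on the training targets and subtract it
slope, offset = np.polyfit(m3, chrom, 1)
chrom_pre = chrom - (slope * m3 + offset)              # preprocessed chromaticity

# Inverse transformation, applied to the NN output on the test set
def postprocess(nn_out, m3_test):
    return nn_out + slope * m3_test + offset
```

The round trip is exact by construction: applying `postprocess` to the preprocessed targets recovers the original chromaticity values.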

3. NEURAL NETWORK PROCESSING ANDRESULTS

We used a sigmoidal feed-forward neural network with five inputs (three nominal and two measured values), one output (the chromaticity), and a single hidden layer with 300 units.

The training and test sets include the data of 8000 and 2000 aberration instances respectively, i.e. properly codified images, built according to the observations made in the previous sections.

[Figure 4: top panel, estimated vs. nominal chromaticity [µas]; central panel, nominal chromaticity distribution [µas]; bottom panel, residual chromaticity distribution [µas].]

Figure 4: Performance results.

The NN performance is evaluated on the test set; in particular, the discrepancy between the NN output (estimated chromaticity) and the target (actual chromaticity for the test set data instances) can be considered as the residual chromaticity after the correction based on the NN results.

          Input Chromaticity    Residual Chromaticity
                [µas]                  [µas]
  Min.        -653.86                 -2.36
  Mean          -7.50                 -1.23
  Max.         617.92                  0.29
  RMS          190.52                  0.48

Table 1: Statistics over the test set of input and residual chro-maticity

The main statistical parameters of the residual chromatic-ity distribution, compared with the corresponding values inthe input test sets, are listed in Tab. 1.

In Fig. 4 we show the initial chromaticity distributionand the residual chromaticity distribution (central and bottomplots). We remark that 99.9% of the data are within the ±3σinterval in both experiments.

Since the goal is the computation of output values coincident with the pre-defined target values, the plot of output vs. target should ideally be a straight line ($y = a + bx$) at angle π/4 passing through the origin, i.e. with parameters ($a = 0$, $b = 1$).

We computed the best-fit parameters of this straight line (top plot in Fig. 4) and their standard deviations, obtaining the values $a = -1.249 \pm 0.007$, $b = 0.9981 \pm 0.39 \times 10^{-4}$; the results are very satisfactory.

4. CONCLUSION

In this paper we dealt with the problems of i) increasing the computational efficiency in the evaluation of the Fourier transform, necessary for image construction, and ii) diagnosing chromaticity from the images themselves.

The computational cost is usefully reduced, by a factor $\log(MK)$, by evaluating the Fourier transform via direct matrix multiplication, where MK is the length of the sampled signal sequence.

The main statistical moments are then computed and usedas image describing inputs to the neural network trained todiagnose chromaticity.

We obtained some really interesting results, reducing the chromaticity from an initial distribution in the interval [-653.86, 617.92] µas to a final distribution in the interval [-2.36, 0.29] µas.

REFERENCES

[1] M. Gai et al., “Chromaticity in all-reflective telescopesfor astrometry,” Astronomy and Astrophysics, 2004.

[2] M. Born and E. Wolf, Principles of Optics. 1985.

[3] D. Rumelhart et al., "Learning internal representations by error propagation," in Parallel Distributed Processing, D. E. Rumelhart and J. L. McClelland, Eds., Cambridge, MA: MIT Press, vol. 1, pp. 318–362, 1986.

[4] R. Cancelliere and M. Gai, “Neural network correctionof astrometric chromaticity,” Mon. Not. R. Astron. Soc.,vol. 362, no. 4, pp. 1483–1488, 2005.

[5] M. Perryman et al., “Composition, formation and evolu-tion of the galaxy. concept and technology study,” Rep.and Exec. Summary, ESA-SCI, European Space Agency,Munich, Germany, 2000.


[6] R. Cancelliere and M. Gai, "A comparative analysis of neural network performances in astronomy imaging," Applied Numerical Mathematics, vol. 45, no. 1, pp. 87–98, 2003.

[7] M. Minsky and S. Papert, Perceptrons. Cambridge, MA: MIT Press, 1969.
