
Image formation by a general optical system. 1: General theory

Harold H. Hopkins

The most general optical system is considered, in which the object and image may also be curved surfaces. For each object point a base ray is chosen, whose intersection with the image surface defines the geometrical image point. It is shown that, with suitable choices of axes for the object and image spaces, principal azimuths always exist around the object space and image space parts of the base ray. These azimuths are such that, in a paraxial type approximation, rays entering with coordinates $(X_S,0)$ and $(0,Y_T)$, respectively, emerge with coordinates $(X'_S,0)$ and $(0,Y'_T)$. This theorem enables the definition of canonical entrance and exit pupil variables, $(x_S,y_T)$ and $(x'_S,y'_T)$, and object and image space variables, $(G_S,H_T)$ and $(G'_S,H'_T)$, such that the geometrical image of $(G_S,H_T)$ is at $G'_S = G_S$, $H'_T = H_T$, and (for an isoplanatic image point) any finite aperture ray entering with coordinates $(x_S,y_T)$ emerges with $x'_S = x_S$, $y'_T = y_T$. The analysis also leads to formulas for the two principal local magnifications of the image and to the sine condition for a general optical system. Both the geometrical and diffraction theory of image formation for a general optical system are in this manner shown to reduce to exactly the same forms as for an axially symmetric system. In Part 2, the necessary computing methods are obtained for the practical application of the theory.

(1) Introduction

There is a long history of investigations into the problem of image formation about a ray passing through a general optical system, many of the later publications merely reproducing results obtained earlier. One of the earliest, and perhaps the most comprehensive, is a paper by Sampson¹ published as early as 1897. Later publications include those of Smith² and Herzberger.³ All these treatments of the problem were limited to what is effectively a paraxial approximation and offer little of use to the practical optical designer. In part this has been due to the rather abstract approaches employed and the absence of results with simple bearing on the practical tasks of optical design, but in addition there is the fact that in practice one has to be concerned with image formation by pencils of finite aperture. Moreover, nowadays the designer is not only concerned with aberrations but has to be able to calculate the forms of diffraction images and the optical transfer function, in which these earlier studies provide no help.

In the case of systems having rotational symmetry, the writer⁴ has shown that, for both the geometrical and the diffraction theory of image formation, it is an advantage to regard the reference spheres, centered on the object and geometrical image points, respectively, as the pupil surfaces and not to use the pupil planes. It is then found that the intensity in the point spread function is given with high accuracy by the squared modulus of the Fourier transform of the distribution of complex amplitude over the exit pupil sphere, which relationship is essential to the calculation of the optical transfer function from the constructional data of the optical system. Furthermore, the transverse ray aberrations are then also given accurately by the aperture derivatives of the wave front aberration, and the departure from isoplanatism for the rays to any given image point is given by the differences between their exit pupil and entrance pupil coordinates. This latter property is of considerable practical use in that, for a nominally isoplanatic image point, the emergent rays have (apart from quantities of aberrational magnitude) exit pupil coordinates equal to the entrance pupil coordinates of the incident rays. A square mesh of entering rays, for example, then emerges as a square mesh. The resulting variables have been termed canonical coordinates.

The author is with University of Reading, Physics Department, Whiteknights, Reading RG6 2AF, U.K.

Received 21 February 1984.
0003-6935/85/162491-15$02.00/0.
© 1985 Optical Society of America.

More recently⁵ the writer has shown that the entrance and exit pupils as ordinarily defined do not have any precise preferential role in aberration and image assessment calculations and has given in detail the methods and formulas necessary for these computations for general optical systems. In what follows it is shown that canonical coordinates may also be defined for image formation by a completely general optical system and that these variables have precisely the same properties as for rotationally symmetric systems. The most general case is considered, where the object and image are not necessarily planes perpendicular to any axes.

15 August 1985 / Vol. 24, No. 16 / APPLIED OPTICS 2491

Fig. 1. Base and local coordinates in the object and image planes.

(2) General Description of Coordinate Systems

Figure 1, shown for simplicity in only two dimensions, has $S$ as the surface of the object and $S'$ as the receiving surface for the image. $O_0$ and $O'_0$ are any conveniently chosen origins at the object and image surfaces; $O_0E_0$ and $E'_0O'_0$ are suitably selected axes used for ray tracing. $O'_0$ is in no sense necessarily the image of $O_0$; and the axis $E'_0O'_0$ in the image space is, again, not necessarily the emergent ray corresponding to the object-space axis $O_0E_0$ when considered as an incident ray. The axes $(\xi_0,\eta_0,\zeta_0)$, with origin at $O_0$ and $\zeta_0$ along $O_0E_0$, are rectangular axes for the object surface $S$; and $(\xi'_0,\eta'_0,\zeta'_0)$, with origin at $O'_0$ and $\zeta'_0$ along the continuation of $E'_0O'_0$, are axes for the image surface $S'$. The orientations of $(\xi_0,\eta_0)$ around $\zeta_0$ and of $(\xi'_0,\eta'_0)$ around $\zeta'_0$ are not necessarily related in any way: they will be chosen, like the axes $O_0E_0$ and $E'_0O'_0$, as merely those suitable for ray tracing of the given optical system. We shall refer to $(\xi_0,\eta_0,\zeta_0)$ and $(\xi'_0,\eta'_0,\zeta'_0)$ as the base coordinate systems.

We now choose planes through $E_0$ and $E'_0$, perpendicular to the axes $O_0E_0$ and $E'_0O'_0$, as entrance and exit pupil planes, as described in Ref. 5. That is to say, $E_0$ and $E'_0$ are not necessarily in any way images through the front and rear parts of the system of a limiting aperture stop. Indeed for a system of crossed cylindrical lenses, for example, these images of an aperture stop will be highly astigmatic, and no unique image positions exist for defining pupils in the ordinary sense. Corresponding to the coordinates $(\xi_0,\eta_0,\zeta_0)$ and $(\xi'_0,\eta'_0,\zeta'_0)$ we choose systems $(X_0,Y_0,Z_0)$ and $(X'_0,Y'_0,Z'_0)$, with origins at $E_0$ and $E'_0$ and with $Z_0$ and $Z'_0$ along $O_0E_0$ and $E'_0O'_0$, as the base coordinates in the initial and final image spaces, respectively, used for ray tracing. The axes $(X_0,Y_0)$ are chosen to be parallel to $(\xi_0,\eta_0)$, and similarly $(X'_0,Y'_0)$ are chosen to be parallel to $(\xi'_0,\eta'_0)$, as shown in Fig. 1.

We now consider any object point $Q$ on the object surface $S$, from which exploratory rays are first traced through the optical system to the image surface $S'$. The central ray, in any conveniently defined sense, from $Q$ is then chosen. This is the ray $QE$ in Fig. 1, intersecting the entrance pupil plane at $E$ and emerging to cut the exit pupil plane at $E'$ and the image surface at $Q'$. The ray $QE \ldots E'Q'$ is taken to be the base or axis ray for the imagery of the point $Q$, and $Q'$ is taken to be the geometric image of $Q$.

For the element of object surface at $Q$, the object surface will coincide locally with the tangent plane at $Q$. We then define local object coordinates $(\xi,\eta,\zeta)$, as shown in Fig. 1, with $(\xi,\eta)$ lying in the tangent plane at $Q$ and with origin at $O$, which point is the foot of the perpendicular from $E$ to the tangent plane at $Q$. The orientation of the axis of $\eta$ is conveniently chosen to be in the plane through $O$ and parallel to the $(\eta_0,\zeta_0)$ plane of the base coordinates in the object space. Alternatively we could equally take the axis of $\xi$ to be in the plane through $O$ and parallel to the $(\xi_0,\zeta_0)$ plane of the base system. For the entrance pupil, we use a similarly rotated coordinate system, with origin at $E$ and $(X,Y)$ parallel to $(\xi,\eta)$ and so lying in the plane through $E$ which is parallel to the tangent plane at $Q$. The $Z$ axis is along the continuation of $OE$; and so, like the $\zeta$ axis, it is along the perpendicular $OE$ from $E$ to the tangent plane to the object surface at $Q$.

In a similar manner in the image space, we take local coordinate axes $(\xi',\eta',\zeta')$ with $(\xi',\eta')$ lying in the tangent plane at $Q'$ to the image surface, and with $\eta'$ in the plane through $O'$ parallel to the $(\eta'_0,\zeta'_0)$ plane of the base system of the image space. The origin of $(\xi',\eta',\zeta')$ is at $O'$, which is the foot of the perpendicular from $E'$ to the tangent plane at $Q'$. Correspondingly, for the exit pupil we use local coordinates $(X',Y',Z')$, where $(X',Y')$ lie in the plane through $E'$ parallel to the tangent plane at $Q'$ and are parallel to the axes $(\xi',\eta')$. The origin for $(X',Y',Z')$ is at $E'$; and, like the $\zeta'$ axis, the $Z'$ axis is along the perpendicular $E'O'$ from $E'$ to the tangent plane of the image surface at $Q'$.


Fig. 2. Local coordinates for the object and image points Q, Q'.

It will be noted that the need for the local coordinate systems $(\xi,\eta,\zeta)$, $(X,Y,Z)$, $(\xi',\eta',\zeta')$, and $(X',Y',Z')$, which are rotated relative to the base systems, arises solely from the assumption that the object and image are curved surfaces. In the special case where these are planes, these axes will usually be parallel to the corresponding base coordinate systems to be used for the ray tracing.

It will be shown next that, for any optical system, principal azimuths exist (for the imagery of any point $Q$ at $Q'$) around the base ray in the object and image spaces. The precise sense in which these are principal azimuths is explained later. We remark here that, if the $(X,Y)$ and $(X',Y')$ axes are rotated around the $Z$ and $Z'$ axes, respectively, of Fig. 1 to give new axes, $(X_S,Y_T)$ and $(X'_S,Y'_T)$, lying in the principal azimuths in the object and image spaces, these new axes are those for which canonical relations exist. They may, therefore, be termed the canonical axes.

(3) Principal Azimuths

Figure 2 shows the local axes in the object and image spaces for the imagery of the point $Q$ at $Q'$. Thus $QE \ldots E'Q'$ is the base ray. A general ray from $Q$, such as $QB$ in the diagram, is specified by the local coordinates $(X,Y)$ of the point $B$ in which the ray cuts a reference sphere centered on $Q$ and passing through $E$. After traversing the optical system, this ray emerges to cut the image space reference sphere in $B'$. This reference sphere has its center at $Q'$ and passes through $E'$. The point $B'$ is specified by its local coordinates $(X',Y')$. The coordinate axes $(X,Y,Z)$ and $(X',Y',Z')$ are the local coordinate systems of Fig. 1.

We shall now establish a general result which is fundamental to the existence of canonical coordinates for the imagery of the point $Q$ at $Q'$. This result is the existence of principal azimuths in the object and image spaces such that, if the coordinate axes $(X,Y)$ and $(X',Y')$ of Figs. 1 and 2 are rotated around the $Z$ and $Z'$ axes to give new axes $(X_S,Y_T)$ and $(X'_S,Y'_T)$ which lie in these azimuths, all rays entering the optical system with $Y_T = 0$ emerge (to a first order in the aperture) with $Y'_T = 0$; and, similarly, all rays entering with $X_S = 0$ emerge with $X'_S = 0$ to first-order accuracy. To establish this basic theorem we require only that the surfaces of the optical system encountered by the rays accepted from $Q$ shall be continuous and smooth. Thus, for refraction or reflection at a conical surface, the ray patch at the surface must not include the apex of the cone. Similarly, in a gradient-index system, the variation of refractive index must be a continuous and smooth function of position.

To specify any general ray entering the optical system, that is not necessarily a ray from $Q$, we employ the above local coordinates $(X,Y)$ of the point $B$ where the ray cuts the reference sphere $EB$, and the difference in direction cosines of the general ray and the base ray. If these latter are $(L,M,N)$ and $(\bar L,\bar M,\bar N)$, when referred to the local axes $(X,Y,Z)$ in Fig. 2, we define

$$\lambda = L - \bar L, \qquad \mu = M - \bar M, \tag{3.1}$$

and the ray is then completely specified, relative to the base ray, by the four quantities $(X,\lambda,Y,\mu)$. In the image space we define

$$\lambda' = L' - \bar L', \qquad \mu' = M' - \bar M', \tag{3.2}$$

where $(L',M',N')$ and $(\bar L',\bar M',\bar N')$ are the direction cosines of the general ray and the base ray in the image space when referred to the local axes $(X',Y',Z')$. If $(X',Y')$ are the local coordinates of the point $B'$ where the emergent ray cuts the reference sphere $E'B'$, the four quantities $(X',\lambda',Y',\mu')$ completely specify the emergent ray relative to the emergent base ray.

Given the above conditions that all surfaces of the optical system are continuous and smooth, each of the image space ray data will be a continuous function of the object space data of the given ray. We write this as

$$
\begin{aligned}
X' &= X'(X,\lambda,Y,\mu), \\
\lambda' &= \lambda'(X,\lambda,Y,\mu), \\
Y' &= Y'(X,\lambda,Y,\mu), \\
\mu' &= \mu'(X,\lambda,Y,\mu),
\end{aligned}
\tag{3.3}
$$

in which $X' = \lambda' = Y' = \mu' = 0$ when $X = \lambda = Y = \mu = 0$, because the general ray then coincides with the base ray $QE \ldots E'Q'$. Since the four functions in Eqs. (3.3) are also continuous and smooth, we may expand each by Maclaurin's theorem to give the forms

$$
\begin{aligned}
X' &= a_{11}X + a_{12}\lambda + a_{13}Y + a_{14}\mu + D(2), \\
\lambda' &= a_{21}X + a_{22}\lambda + a_{23}Y + a_{24}\mu + D(2), \\
Y' &= a_{31}X + a_{32}\lambda + a_{33}Y + a_{34}\mu + D(2), \\
\mu' &= a_{41}X + a_{42}\lambda + a_{43}Y + a_{44}\mu + D(2),
\end{aligned}
\tag{3.4}
$$

where $D(n)$ denotes terms which are of total degree $\ge n$ in powers and products of the variables $(X,\lambda,Y,\mu)$. The coefficients $a_{ij}$ depend on the form of the optical system and the choice of base ray.
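The first-order coefficients in Eqs. (3.4) are the partial derivatives of the image space ray data with respect to the object space ray data, evaluated at the base ray. The following is a minimal sketch of how they could be estimated numerically by central differences. The function `toy_ray_map` is a hypothetical smooth map invented purely for illustration; it does not model any real optical system, and in practice the map would come from ray tracing.

```python
import math


def toy_ray_map(X, lam, Y, mu):
    """Hypothetical smooth map (X, lam, Y, mu) -> (X', lam', Y', mu').

    It vanishes at the base ray (all arguments zero), as Eqs. (3.3) require.
    The numbers are arbitrary illustrative values, not optical data.
    """
    X_out = 0.8 * X + 20.0 * lam + 0.1 * Y + 0.05 * X * Y
    lam_out = -0.01 * X + 1.0 * lam + 0.3 * math.sin(mu)
    Y_out = -0.2 * X + 0.9 * Y + 25.0 * mu + 0.1 * lam * lam
    mu_out = -0.02 * Y + 1.1 * mu + 0.5 * X * X
    return (X_out, lam_out, Y_out, mu_out)


def first_order_coefficients(ray_map, h=1e-5):
    """Estimate the 4x4 coefficients a[i][j] of Eqs. (3.4) by central differences."""
    a = [[0.0] * 4 for _ in range(4)]
    for j in range(4):
        plus = [0.0] * 4
        minus = [0.0] * 4
        plus[j] = h
        minus[j] = -h
        f_plus = ray_map(*plus)
        f_minus = ray_map(*minus)
        for i in range(4):
            a[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return a


a = first_order_coefficients(toy_ray_map)
for row in a:
    print(["%.4f" % value for value in row])
```

The second-degree terms of the toy map play the role of $D(2)$: they do not contribute to the estimated coefficients.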

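For the special case of a ray from $Q$, such as $QB$ in Fig. 2, where $Q$ has coordinates $(\xi,\eta)$ with origin at $O$, the coordinates of $B$ and $E$ will be given by

$$
\begin{aligned}
X &= \xi + L(-R), & 0 &= \xi + \bar L(-R), \\
Y &= \eta + M(-R), & 0 &= \eta + \bar M(-R),
\end{aligned}
\tag{3.5}
$$

where $R = (EQ)$, which is negative for the case shown, is the radius of the reference sphere. From Eqs. (3.5) we obtain

$$\lambda = L - \bar L = -\frac{X}{R}, \qquad \mu = M - \bar M = -\frac{Y}{R}, \tag{3.6}$$

so that, for any ray from $Q$, the relations (3.4) reduce to the forms

$$
\begin{aligned}
X' &= \bar a_{11}X + \bar a_{12}Y + A(2), \\
\lambda' &= \bar a_{21}X + \bar a_{22}Y + A(2), \\
Y' &= \bar a_{31}X + \bar a_{32}Y + A(2), \\
\mu' &= \bar a_{41}X + \bar a_{42}Y + A(2),
\end{aligned}
\tag{3.7}
$$

in which the higher-order terms $D(2)$ in Eqs. (3.4) are now written $A(2)$, where $A(n)$ denotes terms of total degree $\ge n$ in the aperture variables $(X,Y)$ only of the given ray from $Q$. The new coefficients in Eqs. (3.7) are given by

$$
\begin{aligned}
\bar a_{11} &= a_{11} - \frac{a_{12}}{R}, & \bar a_{12} &= a_{13} - \frac{a_{14}}{R}, \\
\bar a_{21} &= a_{21} - \frac{a_{22}}{R}, & \bar a_{22} &= a_{23} - \frac{a_{24}}{R}, \\
\bar a_{31} &= a_{31} - \frac{a_{32}}{R}, & \bar a_{32} &= a_{33} - \frac{a_{34}}{R}, \\
\bar a_{41} &= a_{41} - \frac{a_{42}}{R}, & \bar a_{42} &= a_{43} - \frac{a_{44}}{R},
\end{aligned}
\tag{3.8}
$$

and they depend, through $R$, on the position of the object point $Q$.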
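The reduction from Eqs. (3.4) to Eqs. (3.7) can be checked numerically. The sketch below builds the reduced coefficients of Eqs. (3.8) and confirms that, for a ray from $Q$ satisfying Eqs. (3.6), the general first-order relations and the reduced ones give the same image space data. The coefficient values, the radius `R` and the aperture point are hypothetical numbers chosen only for illustration.

```python
def reduce_coefficients(a, R):
    """Form the 4x2 coefficients of Eqs. (3.8) from the 4x4 coefficients of Eqs. (3.4)."""
    return [[row[0] - row[1] / R, row[2] - row[3] / R] for row in a]


def image_data_general(a, X, lam, Y, mu):
    """First-order part of Eqs. (3.4): returns [X', lam', Y', mu']."""
    return [row[0] * X + row[1] * lam + row[2] * Y + row[3] * mu for row in a]


def image_data_from_Q(a_bar, X, Y):
    """First-order part of Eqs. (3.7) for a ray from Q."""
    return [row[0] * X + row[1] * Y for row in a_bar]


# Hypothetical first-order coefficients a_ij (illustrative values only).
a = [
    [0.8, 20.0, 0.1, 0.0],
    [-0.01, 1.0, 0.0, 0.3],
    [-0.2, 0.0, 0.9, 25.0],
    [0.0, 0.0, -0.02, 1.1],
]
R = -50.0          # radius (EQ) of the reference sphere, negative as in Fig. 2
X, Y = 0.4, -0.7   # aperture coordinates of the point B

# Eqs. (3.6): for a ray from Q the direction-cosine differences follow from (X, Y).
lam, mu = -X / R, -Y / R

general = image_data_general(a, X, lam, Y, mu)
reduced = image_data_from_Q(reduce_coefficients(a, R), X, Y)
max_difference = max(abs(g - r) for g, r in zip(general, reduced))
print(max_difference)
```

Because `R` enters every reduced coefficient, repeating the calculation with a different `R` changes all eight values, which is the dependence on the object position noted above.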

We now introduce polar coordinates $(r,\phi)$ and $(r',\phi')$ for the points $B$ and $B'$, as shown in Fig. 2, such that

$$X = r\sin\phi, \quad Y = r\cos\phi, \qquad X' = r'\sin\phi', \quad Y' = r'\cos\phi', \tag{3.9}$$

using which Eqs. (3.7) give

$$
\begin{aligned}
X' &= r'\sin\phi' = \bar a_{11}\,r\sin\phi + \bar a_{12}\,r\cos\phi + A(2), \\
Y' &= r'\cos\phi' = \bar a_{31}\,r\sin\phi + \bar a_{32}\,r\cos\phi + A(2),
\end{aligned}
\tag{3.10}
$$

from which

$$\tan\phi' = \frac{\bar a_{11}\tan\phi + \bar a_{12}}{\bar a_{31}\tan\phi + \bar a_{32}} + A(2), \tag{3.11}$$

so that, in the paraxial type of approximation (which retains only terms of the first degree in the aperture), all rays entering in any given azimuth $\phi$ will emerge in the same azimuth $\phi'$ given by Eq. (3.11). Moreover, since $\tan(\phi + \pi) = \tan\phi$ and $\tan(\phi' + \pi) = \tan\phi'$, Eq. (3.11) shows that all rays entering in the azimuth $(\phi + \pi)$ will emerge in the azimuth $(\phi' + \pi)$. With reference to Fig. 2, rays in the azimuths $\phi$ and $(\phi + \pi)$ will intersect the reference sphere $EB$ in the curve of intersection ($BE$ and its continuation) with the reference sphere of a plane through the $Z$ axis making an angle $\phi$ with the $Y$ axis. These rays emerge to cut the reference sphere $E'B'$ in the curve of intersection ($B'E'$ and its continuation) with the reference sphere of a plane through the $Z'$ axis making the angle $\phi'$ with the $Y'$ axis. Deviations from this condition will depend on the square and higher powers of the aperture.
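As a small numerical illustration of Eq. (3.11) in the paraxial approximation, the sketch below maps an entering azimuth to the emergent azimuth using only the first-order terms of Eqs. (3.10), and checks that the azimuths $\phi$ and $\phi + \pi$ emerge in azimuths with the same tangent. The four coefficient values are hypothetical, and azimuths are measured from the $Y$ axis as in Eqs. (3.9).

```python
import math


def emergent_azimuth(a11, a12, a31, a32, phi):
    """Emergent azimuth phi' from the first-order part of Eqs. (3.10).

    Azimuths are measured from the Y axis, so X = r sin(phi), Y = r cos(phi).
    The common factor r cancels, so a unit radius is used.
    """
    x_out = a11 * math.sin(phi) + a12 * math.cos(phi)
    y_out = a31 * math.sin(phi) + a32 * math.cos(phi)
    return math.atan2(x_out, y_out)


def tan_emergent_azimuth(a11, a12, a31, a32, phi):
    """Right-hand side of Eq. (3.11) without the higher-order term."""
    t = math.tan(phi)
    return (a11 * t + a12) / (a31 * t + a32)


# Hypothetical reduced coefficients (illustrative values only).
a11, a12, a31, a32 = 1.1, 0.2, -0.3, 0.9

phi = math.radians(35.0)
phi_out = emergent_azimuth(a11, a12, a31, a32, phi)
phi_out_opposite = emergent_azimuth(a11, a12, a31, a32, phi + math.pi)

print(math.tan(phi_out), tan_emergent_azimuth(a11, a12, a31, a32, phi))
print(math.tan(phi_out_opposite))
```

Using `math.atan2` on the emergent coordinates avoids the division in Eq. (3.11) when the denominator is small, while giving the same tangent whenever that equation is defined.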

We now enquire whether the azimuth $\phi$ may be so chosen that, in addition to rays in the azimuth $\phi$ emerging in the azimuth $\phi'$ given by Eq. (3.11), rays in the azimuth $[\phi + (\pi/2)]$, perpendicular to $\phi$, will emerge in the azimuth $[\phi' + (\pi/2)]$, perpendicular to $\phi'$. With $\phi$ replaced by $[\phi + (\pi/2)]$, we see from Eq. (3.11) that rays in the azimuth $[\phi + (\pi/2)]$ will emerge in an azimuth $\bar\phi'$ given by

$$\tan\bar\phi' = \frac{\bar a_{11}(-\cot\phi) + \bar a_{12}}{\bar a_{31}(-\cot\phi) + \bar a_{32}} + A(2), \tag{3.12}$$

and, if $\bar\phi' = [\phi' + (\pi/2)]$, the value of $\phi$ has to be such that $\tan\bar\phi' = -\cot\phi'$. Using Eqs. (3.11) and (3.12), this condition is

$$\frac{\bar a_{11}(-\cot\phi) + \bar a_{12}}{\bar a_{31}(-\cot\phi) + \bar a_{32}} = -\,\frac{\bar a_{31}\tan\phi + \bar a_{32}}{\bar a_{11}\tan\phi + \bar a_{12}}, \tag{3.13}$$

where the higher-order terms $A(2)$ are omitted since we now consider the paraxial approximation. Equation (3.13) reduces to

$$\tan^2\phi - \frac{(\bar a_{11}^2 + \bar a_{31}^2) - (\bar a_{12}^2 + \bar a_{32}^2)}{\bar a_{11}\bar a_{12} + \bar a_{31}\bar a_{32}}\,\tan\phi - 1 = 0, \tag{3.14}$$

as the equation to be satisfied by $\phi$ for the given condition to hold. Since the constant term in Eq. (3.14) is negative, both of the roots ($\tan\phi_1$ and $\tan\phi_2$) of this equation will be real. Thus azimuths $\phi$ and $[\phi + (\pi/2)]$ always exist such that rays entering in them emerge in azimuths $\phi'$ and $[\phi' + (\pi/2)]$, respectively. We shall call these the principal azimuths. It will be noted that the values of $\phi$ and $\phi'$ will in general be different for different object positions because the $\bar a$ coefficients of Eqs. (3.8) depend on the value of $R$.
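The quadratic (3.14) is easy to solve numerically. The sketch below, using hypothetical coefficient values, computes its two roots and then checks the defining property of the principal azimuths in the paraxial approximation: two perpendicular entering directions are mapped to two perpendicular emergent directions. The handling of a vanishing middle-term denominator is an added assumption for robustness, not something discussed in the text.

```python
import math


def principal_azimuth_tangents(a11, a12, a31, a32):
    """Roots (tan phi_1 > 0, tan phi_2 < 0) of Eq. (3.14): t**2 - b*t - 1 = 0."""
    p = a11 ** 2 + a31 ** 2
    s = a12 ** 2 + a32 ** 2
    q = a11 * a12 + a31 * a32
    if q == 0.0:
        # Assumption for this sketch: the (X, Y) axes then already satisfy the
        # perpendicularity condition, so phi = 0 is returned as tan phi_1 = 0.
        return 0.0, -math.inf
    b = (p - s) / q
    disc = math.sqrt(b * b + 4.0)
    # Choose the algebraic form that avoids cancellation for either sign of b.
    t1 = (b + disc) / 2.0 if b >= 0.0 else 2.0 / (disc - b)
    t2 = -1.0 / t1
    return t1, t2


def first_order_image(a11, a12, a31, a32, phi):
    """First-order (X', Y') for a unit-radius aperture point at azimuth phi."""
    return (a11 * math.sin(phi) + a12 * math.cos(phi),
            a31 * math.sin(phi) + a32 * math.cos(phi))


# Hypothetical reduced coefficients (illustrative values only).
coeffs = (1.1, 0.2, -0.3, 0.9)

t1, t2 = principal_azimuth_tangents(*coeffs)
phi_1 = math.atan(t1)

out_a = first_order_image(*coeffs, phi_1)
out_b = first_order_image(*coeffs, phi_1 + math.pi / 2.0)
dot_product = out_a[0] * out_b[0] + out_a[1] * out_b[1]

print(math.degrees(phi_1), t1 * t2, dot_product)
```

The product of the two roots equals the constant term of Eq. (3.14), which is why the two solutions are always perpendicular azimuths.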

To treat the theory of image formation, we rotate the coordinates $(X,Y)$ and $(X',Y')$ of Fig. 2 around the $Z$ and $Z'$ axes to lie in the principal azimuths $[\phi, \phi + (\pi/2)]$ and $[\phi', \phi' + (\pi/2)]$, with $\phi$ given by Eq. (3.14) and then $\phi'$ by Eq. (3.11). These principal axes are then denoted by $(X_S,Y_T)$ and $(X'_S,Y'_T)$ as in Fig. 3. It should be noted here that the values $\phi_1$ and $\phi_2$ of $\phi$ satisfying Eq. (3.14) are such that $\tan\phi_2\tan\phi_1 = -1$, so that $\phi_2 = \phi_1 + \pi/2$. Thus, if we choose the positive root $\tan\phi_1 > 0$ of Eq. (3.14) and then let the $Y_T$ axis lie in this azimuth, the $X_S$ axis will be in the azimuth $[\phi_1 + (\pi/2)]$; and choosing the other root for the azimuth of $Y_T$ would merely interchange the azimuths of $X_S$ and $Y_T$. Similar considerations apply to the axes $(X'_S,Y'_T)$ in the image space.

Fig. 3. Base coordinates and canonical axes for the object and image points Q, Q'.

Corresponding to the axes $(X_S,Y_T)$ and $(X'_S,Y'_T)$, we rotate the object space and image space axes $(\xi,\eta)$ and $(\xi',\eta')$ around the axes $\zeta$ and $\zeta'$, shown in Figs. 1 and 2, to be parallel to the axes $(X_S,Y_T)$ and $(X'_S,Y'_T)$, respectively. The orientations of these new axes, $(\xi_S,\eta_T)$ and $(\xi'_S,\eta'_T)$, are then as shown in Fig. 3. It is with reference to these axes that we shall find that canonical relations exist and accordingly term them the canonical or proper axes for image formation around the base ray $QE \ldots E'Q'$. Since the roots of Eq. (3.14) are necessarily real it follows that such canonical axes exist for any optical system.

Although this result is fundamental to the existence of canonical coordinates and the consequent simplification of image theory, it should be noted that no property of the optical system had to be invoked to prove it. The result rests solely on the existence of a one-to-one correspondence between lines in the object space and in the image space, with the correspondence being representable by continuous smooth functions as in Eqs. (3.3). It applies, for example, to holographic optical elements.

Following the rotation of the coordinates $(X,Y)$ through the angle $\phi$ and of $(X',Y')$ through the angle $\phi'$, the new coordinate axes $(X_S,Y_T)$ and $(X'_S,Y'_T)$ will lie in the principal azimuths around the base ray $QE \ldots E'Q'$; and in place of Eqs. (3.7) we shall have relations of the forms

$$
\begin{aligned}
X'_S &= \tilde a_{11}X_S + A(2), \\
\lambda'_S &= \tilde a_{21}X_S + \tilde a_{22}Y_T + A(2), \\
Y'_T &= \tilde a_{32}Y_T + A(2), \\
\mu'_T &= \tilde a_{41}X_S + \tilde a_{42}Y_T + A(2)
\end{aligned}
\tag{3.15}
$$

between the image space and object space variables of the rays from $Q$, with new coefficients $\tilde a_{ij}$ in place of $\bar a_{ij}$. In Eqs. (3.15) the object and image space quantities $\lambda_S = (L_S - \bar L_S)$, $\mu_T = (M_T - \bar M_T)$ and $\lambda'_S = (L'_S - \bar L'_S)$, $\mu'_T = (M'_T - \bar M'_T)$ are differences in the direction cosines of the general ray from $Q$ and those of the base ray, when these direction cosines are referred to the axes $(X_S,Y_T)$ and $(X'_S,Y'_T)$ in the principal azimuths.
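The decoupling of the first and third of Eqs. (3.15) can be illustrated numerically. The sketch below is a hedged example with hypothetical coefficient values: it finds a principal azimuth from Eq. (3.14), sets up unit vectors along the rotated axes in both spaces, and re-expresses the first-order map between $(X,Y)$ and $(X',Y')$ in those axes. The cross terms then vanish, so that $X'_S$ depends only on $X_S$ and $Y'_T$ only on $Y_T$. Only the two transverse relations are treated; the direction-cosine relations are not modelled here.

```python
import math


def principal_azimuth(m):
    """Positive-root principal azimuth of Eq. (3.14) for m = ((a11, a12), (a31, a32))."""
    (a11, a12), (a31, a32) = m
    p = a11 ** 2 + a31 ** 2
    s = a12 ** 2 + a32 ** 2
    q = a11 * a12 + a31 * a32
    if q == 0.0:
        return 0.0  # assumption: the given axes are already principal
    b = (p - s) / q
    disc = math.sqrt(b * b + 4.0)
    t1 = (b + disc) / 2.0 if b >= 0.0 else 2.0 / (disc - b)
    return math.atan(t1)


def unit(phi):
    """Unit vector (X, Y) at azimuth phi measured from the Y axis."""
    return (math.sin(phi), math.cos(phi))


def apply(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def coefficients_in_principal_axes(m):
    """2x2 first-order map from (X_S, Y_T) to (X'_S, Y'_T)."""
    phi = principal_azimuth(m)
    e_T, e_S = unit(phi), unit(phi + math.pi / 2.0)
    out_T, out_S = apply(m, e_T), apply(m, e_S)
    # Emergent azimuth of the rays entering along the Y_T axis, as in Eq. (3.11).
    phi_out = math.atan2(out_T[0], out_T[1])
    e_T_out, e_S_out = unit(phi_out), unit(phi_out + math.pi / 2.0)
    return [[dot(e_S_out, out_S), dot(e_S_out, out_T)],
            [dot(e_T_out, out_S), dot(e_T_out, out_T)]]


# Hypothetical reduced coefficients (illustrative values only).
m = ((1.1, 0.2), (-0.3, 0.9))
rotated = coefficients_in_principal_axes(m)
for row in rotated:
    print(["%.6f" % value for value in row])
```

In this sketch `rotated[0][0]` and `rotated[1][1]` play the roles of the coefficients multiplying $X_S$ and $Y_T$ in the first and third of Eqs. (3.15).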

The presence of the terms $\tilde a_{22}Y_T$ and $\tilde a_{41}X_S$ in Eqs. (3.15) is because, in the general case, the emergent pencil will be astigmatic and its principal sections will not necessarily coincide with the principal azimuths found above. This is illustrated in Fig. 4, which shows the special case when the canonical axes $(X'_S,Y'_T)$ are perpendicular to the base ray $E'Q'$, and the principal sections of the astigmatic pencil also coincide with the principal azimuths. In this case, the rays passing through the $Y'_T$ axis are associated with the wave front section $B'_T\bar B'_T$ and focus at the point $T'$: these rays thus lie in the plane $X'_S = 0$. For such rays $\lambda'_S = (L'_S - \bar L'_S) = 0$ for all values of $Y_T$ (in the paraxial approximation), so that we should have $\tilde a_{22} = 0$ in Eqs. (3.15). At the same time, the rays passing through the $X'_S$ axis are associated with the wave front section $B'_S\bar B'_S$ and focus at the point $S'$: these rays thus lie in the plane $Y'_T = 0$. For such rays $\mu'_T = (M'_T - \bar M'_T) = 0$ for all values of $X_S$, and so we should also have $\tilde a_{41} = 0$ in Eqs. (3.15). Rays, such as $B'Q'_SQ'_T$ in Fig. 4, which do not lie in one of the principal sections will cut the focal lines in points such as $Q'_S$ and $Q'_T$, and all such rays are skew to the base ray $E'Q'$. Thus, if the principal azimuths were such that $B'$ in Fig. 4 were contained in one of them, we should not have $\lambda'_S = (L'_S - \bar L'_S)$ and $\mu'_T = (M'_T - \bar M'_T)$ equal to zero for all rays with $Y_T = 0$ and $X_S = 0$, respectively, and the coefficients $\tilde a_{22}$ and $\tilde a_{41}$ would then also not be zero.

It is worth remarking that quite a number of the results obtained in earlier publications by means of laborious algebra are almost self-evident if the above nature of the pencil of rays associated with an astigmatic element of wave front is kept in mind. Thus, if we take all the rays passing through any pair of lines perpendicular to and spaced apart along the base ray in the object space and lying in orthogonal planes, there will be a unique astigmatic element of wave front which is an orthogonal surface to these rays. Such an astigmatic wave front, after traversing any optical system, will emerge as an astigmatic wave, all of whose rays will pass through two focal lines as in Fig. 4. The proof of the existence of this correspondence between any pair of such lines in the object space and a corresponding pair of (focal) lines in the image space is by no means simple if the paths of the rays are treated algebraically surface by surface.

Fig. 4. Astigmatic pencil in the case when the principal sections coincide with the principal azimuths.

Fig. 5. Optical paths along image space associated rays.

(4) Aberration Differences Along Associated Rays

Before applying the theorem of the preceding section we shall need to obtain a formula for the difference in wave front aberration between corresponding rays from the object point $\bar Q$ and from any nearby point $Q$, the geometrical images of which are formed at $\bar Q'$ and $Q'$, respectively. For this purpose it is important to define precisely which ray from $Q$ is to be taken to correspond with any given ray from $\bar Q$. In Fig. 5, the base ray, $\bar Q\bar E \ldots E'$, from $\bar Q$ emerges to cut the image surface in $\bar Q'$, which defines the position of the geometric image of $\bar Q$. The entrance and exit pupils are at $\bar E$ and $E'$, and the reference spheres $\bar E\bar B_0$ and $E'\bar B'_0$ are again centered on $\bar Q$ and $\bar Q'$. We then take the base ray from $Q$ to be that ray which emerges to pass through $E'$, as shown in Fig. 5, and the geometric image of $Q$ is taken to be at $Q'$ where this ray, $E'Q'$ in Fig. 5, cuts the image surface or equally its tangent plane at $\bar Q'$.

$\bar Q\bar B_0$ in Fig. 5 is a general ray from $\bar Q$ cutting the object space reference sphere for $\bar Q$ in $\bar B_0$, and emerging to cut the image space reference sphere for $\bar Q'$ in $\bar B'_0$. This emergent ray will in general have aberration and thus not pass exactly through $\bar Q'$. We take the corresponding general ray from $Q$ to be again that ray which intersects the image space reference sphere for $\bar Q'$ in the same point, $\bar B'_0$, as the ray from $\bar Q$. The rays $QE \ldots E'$ and $QB \ldots \bar B'_0$ will be termed the image space associated rays. These rays cut the reference sphere $EB$, for the object point $Q$, in the points $E$ and $B$, respectively, and they cut the reference sphere for $\bar Q$ in $E_0$ and $B_0$. The latter points will not usually coincide with the intersection points $\bar E$ and $\bar B_0$ of the rays from $\bar Q$. The object space associated rays from $Q$ would, in contrast, be those rays passing through the points $\bar E$ and $\bar B_0$ on the object space reference sphere for $\bar Q$, and these rays would not cut the image space reference sphere for $\bar Q'$ in $E'$ and $\bar B'_0$. The result which follows depends on the image space associated rays being used.

Using square brackets to denote optical path lengths, the wave front aberration of the ray $\bar Q\bar B_0 \ldots \bar B'_0$ of the pencil forming the image at $\bar Q'$ will be given by

$$W_{\bar Q'} = [\bar E \ldots E']_{\bar Q} - [\bar B_0 \ldots \bar B'_0]_{\bar Q}, \tag{4.1}$$

the subscripts denoting that we are concerned with rays from $\bar Q$. Similarly, the wave front aberration of the ray $QB \ldots B'$ will be given by

$$W_{Q'} = [E \ldots E']_{Q} - [B \ldots B']_{Q}, \tag{4.2}$$

$E$ and $B$ being the points where the two rays from $Q$ cut the reference sphere $BE$ centered on $Q$, and $B'$ being the point where the ray $QB \ldots B'$ cuts the image space reference sphere $B'E'$ centered on $Q'$. The difference in the wave front aberrations of these image space associated rays will thus be given by

$$W_{Q'} - W_{\bar Q'} = \{[E \ldots E']_{Q} - [\bar E \ldots E']_{\bar Q}\} - \{[B \ldots \bar B'_0]_{Q} - [B'\bar B'_0] + [\beta\bar B_0] - [\beta\bar B_0 \ldots \bar B'_0]_{\bar Q}\}, \tag{4.3}$$

where $\beta$ is the point in which the ray $\bar Q\bar B_0$ cuts the reference sphere for $Q$. We shall show that certain terms in Eq. (4.3) cancel by virtue of Fermat's theorem.

Since the ray QEoE in Fig. 5 is a radius of the refer-

ence sphere BEE, the line EE is perpendicular to theray QEOE. It follows from Fermat's theorem, since theangle between the base rays QE and QE is of the orderof the object size (QQ), that

[EI . .E]Q = [E... 'i] + F(2), (4.4)

where F(n) denotes terms of total degree > n in powersand products of the coordinates (Q - (,- f) of Q rel-ative to Q, that is, powers of the object size. In a similarmanner, since the ray QBBO is also a radius of the ref-erence sphere BEE, it follows that

[BO ... B]Q = [BBo... Bo] + F(2), (4.5)

the line BB being perpendicular to the rayQBBo fromQ, with the angle between this and the ray QBBO frm Qbeing the magnitude of F(1). Using Eqs. (4.4) and (4.5)in Eq. (4.3), we have

WQ' - = [B'BO] - [AB0] + F(2), (4.6)

which is seen to depend, apart from the higher-orderterm F(2), only on the optical path lengths of the raysegments BBO and B'B0 ,.which are independent of thedetailed structure of the optical system concerned.

Since the centers of the reference spheres in Fig. 5 areat Q' and Q' on the base rays 'Q' and E'Q', each of thewave fronts forming the images at Q' and Q' will havea common tangent plane with its reference sphere, andthe aberrations WQ' and W-Q will thus both be of at leastthe second degree in the aperture variables. The for-mula (4.6) must, therefore, be written

WQ' - WQ = [B'B] - [ABO] + A(2)F(2) (4.7)

to take account of the fact that the higher-order terms must also depend on at least the square of the aperture.

It remains to find formulas for the optical path lengths [B'B] and [BBO]. To the order of accuracy needed here, these are equal to transverse focal shifts, that is, the additions to the wave front aberration of the ray QBo ... Bo when the centers of the reference spheres are moved, respectively, from Q to Q and from Q' to Q'. For this reason, we shall refer to them as transverse focal shifts. Since formulas for these transverse focal shifts require approximations, it is important in the present context that they be established with the nature of the error terms explicitly expressed. Such a treatment is given in what follows next.

(5) Transverse Focal Shift Formulas

Fig. 6. Image space transverse focal shift.

Figure 6, shown in only two dimensions, reproduces the essentials of the image space part of Fig. 5, so that Bo is the point where the general ray QBo ... Bo from Q cuts the reference sphere of center Q', and B'BCN' is the image space associated ray of the pencil forming the image at Q'. We join Q'B to cut the reference sphere for Q' in C' and put p' = (C'B). Now if the ray B'B'I had no transverse aberration, M' would coincide with Q', and the angle $\delta U'$ = ∠M'B'OQ' would be zero. It follows that $\delta U' = A(1)$, where A(n) again denotes terms of degree ≥ n in the aperture variables. Since C'BQ' is a radius of the reference sphere C'B'E,

$$(B'B'_0) = \frac{p'}{\cos \delta U'} = p'[1 + A(2)],$$

that is, since p' = A(1)F(1) as seen later in Eq. (5.7), where F(n) again denotes terms of degree ≥ n in the field variables specifying the position of Q' relative to Q',

$$(B'B'_0) = p' + A(3)F(1), \tag{5.1}$$

and we now require a formula for p'.

If R' = (E'Q') = (C'Q') is the radius of the reference sphere for Q', we shall have

(B0QI)2 = (R' - p') 2 = (' - X/) 2 + (ii' - Y') 2 + (R' - Z-)2,(5.2)

where (X',Y',Z') and (Q',71',R') are the local coordinatesof B' and Q', both with origin at E'. Similarly, if R' =

(E'Q') = (BRQ) is the radius of the reference sphere forQ',I

(B'QQ) = RI 2 = (t' _ X/)2 + (Wi - y')2 + (R' - Z'), (5.3)

where (S',ij',R') are the coordinates of Q', also with origin at E'. Subtraction of Eq. (5.2) from Eq. (5.3) gives

El 2 - (R' - p') 2 = {t' 2 + i7' 2 - 2(t'X' + i'Y')J- {t 2 + n' 2 - 2(Q'X' + 'Y')I, (5.4)

the remaining terms canceling. We now note that

(E'Q') 2 = R' 2 ' 2 + 2 + RI 2,

(E'Q') 2 = R' 2 2 + q' 2 + RI 2,

subtraction of which gives

( 2 + V 2) - (Q 2 + 7'2) =R 2 - ,R 2,

using which identity in Eq. (5.4) leaves

R 2 - (' - p')2 = 21(' - ')X' + (71' -')Yl (5.5)

as an (exact) equation for p'.

If we factorize the left-hand side of Eq. (5.5) and write (2R' - p') = 2R'(1 - p'/2R'), we obtain

(5.6)


J)Y' PI -1PI = Q, - OX' + (n, - � 1

I I - Tf� 11 I


Fig. 7. Object space transverse focal shift.

in which |p'/2R'| << 1, so that replacing the final factor in Eq. (5.6) by unity will still leave a numerically accurate value. This first approximation for p' consists of terms which are of magnitude A(1)F(1), where F(n) denotes terms of total degree ≥ n in powers and products of the field size variables $(\xi' - \bar{\xi}', \eta' - \bar{\eta}')$. Thus Eq. (5.6) may be written

P' , = X ' W) ')Y + A(2)F(2)

or, since (1/R-') = (1/R') + F(1), we may equally write

P = - )' + (+ - ')Y' + A(1)F(2) + A(2)F(2) (5.7)Rt'

for the value of p'. Using Eq. (5.7) in Eq. (5.1), with n' for the refractive index of the image space, we obtain

[B'B] = n'(B'B0) = n(' - ')X' + ( - r}[BIQ = OB%)~~~~~it+ [A(1)F(2) + A(2)F(2)] + A(3)F(1) (5.8)

for the optical path length [B'B] in the formula (4.7) for the aberration difference $W_{Q'} - W_{\bar{Q}'}$.
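The size of the error made by replacing the final factor by unity can be illustrated numerically. The following Python sketch treats an exact relation of the form of Eq. (5.5), written generically as R² − (R − p)² = 2S, and compares its solution with the first approximation p ≈ S/R. The names `S`, `exact_shift`, and `first_approximation`, and the sample values, are assumptions introduced only for illustration; `S` stands for the bracketed sum on the right-hand side of Eq. (5.5).

```python
import math


def exact_shift(R, S):
    """Root of R**2 - (R - p)**2 = 2*S that tends to zero with S."""
    # (R - p) keeps the sign of R, so R - p = sign(R) * sqrt(R**2 - 2*S).
    return R - math.copysign(math.sqrt(R * R - 2.0 * S), R)


def first_approximation(R, S):
    """First approximation p = S / R (final factor replaced by unity)."""
    return S / R


# Illustrative values only (arbitrary length units, not from the paper).
R = 100.0   # reference sphere radius
S = 0.5     # stands for the bracketed sum on the right of Eq. (5.5)

p_exact = exact_shift(R, S)
p_first = first_approximation(R, S)

# Factorizing gives p = (S / R) / (1 - p / (2 R)), so the relative error
# of the first approximation is p / (2 R).
relative_error = (p_exact - p_first) / p_exact
print(p_exact, p_first, relative_error, p_exact / (2.0 * R))
```

The same function handles a negative radius, as in the object space case, because the sign of R is carried through `math.copysign`.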

It is to be expected, of course, that the formula for the optical path length [BO] in Eq. (4.7) will be of the same form as that for [B'B'], apart possibly from the forms of higher-order terms. Nevertheless, because of the fundamental importance of the result, and particularly the form of dependence on the aperture and field variables of the higher-order terms, a separate treatment is called for.

The essential features of the object space part of Fig. 5 are shown, again for simplicity, only in two dimensions in Fig. 7. We join QBo to cut the reference sphere BE for Q in C and put p = (CBO). Since QCBo is a radius of the reference sphere for Q the line BC on this sphere will be perpendicular to QCBo. Thus, if $\delta U$ = ∠QBoQ, which is of the order of magnitude F(1),

$$(BB_0) = \frac{p}{\cos \delta U} = p[1 + F(2)],$$

that is, since p = A(1)F(1),

$$(BB_0) = p + A(1)F(3), \tag{5.9}$$

where A(n) and F(n) again denote quantities of degree ≥ n in the aperture and the field variables, respectively.

It will be noted that, $\delta U'$ in Fig. 6 being the angle between the reference sphere radius C'BJQ' and the (possibly aberrated) ray B'B0N' of the pencil forming the image at Q', the angle $\delta U'$ is a measure of the angular aberration of the ray B'BN' and is thus of magnitude A(1). In the object space, however, the angle $\delta U$ in Fig. 7 is that between the two rays QCBo and QBBO, and this angle goes to zero as $Q \to \bar{Q}$: it follows that $\delta U = F(1)$. We now need to obtain the formula for p.

The radius of the reference sphere for Q is denotedbyR = (EQ) = (CQ); and, having the same sign as R =(EO), the radius R is negative for the case shown in Fig.7. Thus, (QBo) = (-R + p), and we shall have

(QBO)2 = ( - p)2 = (X - )2 + (Y - )2 + (Z - R)2, (5.10)where (X,Y,Z) and (Qr,R) are, respectively, the localcoordinates of Bo and Q, both with origin at E. Simi-larly, if R = (EQ) = (BoQ) is the radius of the referencesphere for Q, which is again negative for the case shown,we have

(QBO)2 = R-2 = (X - )2 + (y - i)2 + (Z - R)2, (5.11)where (Q,iyR) are the coordinates of Q, again with originat E. Subtraction of Eq. (5.10) from Eq. (5.11) gives

it2 - ( - p) 2= (42 + 2) - 2(#X + ijY)J

- JQ2 + 72) - 2X + Y)J, (5.12)the remaining terms canceling. We now note that

(QE) 2 = 2 = 2 +l2+R2,

(QE) 2 =A 2 =2+ 172 +R 2,

so that

( + i2) - (2 + 72) = R2 - 2,

using which identity in Eq. (5.12) leaves

R?'2 ( - p)2 = 2 - )X + ( - ) Y (5.13)as an (exact) equation for p.

We factorize the left-hand side of Eq. (5.13) and write2R - p = 2R (1 - p/2R), and Eq. (5.13) then gives

- O)X + ( _ i)y P[1 p -1P = R 11 2Rif (5.14)

in which |p/2R| << 1, so that replacing the final factor in Eq. (5.14) by unity will still leave a numerically accurate value for p. As with p', this first approximation for p will be of magnitude A(1)F(1), and Eq. (5.14) may thus be written

(P - )X + ( - VW + A(2)F(2)R

or, since (1/R) = (1/R) + F(1), we may equally write-)X + - + A(1)F(2) + A(2)F(2) (5.15)R

for the value of p. Using Eq. (5.15) in Eq. (5.9), with n for the refractive index of the object space, we obtain

[A0BB nAB- n - )X + (- )YJ

+ [A(1)F(2) + A(2)F(2)] + A(1)F(3) (5.16)

for the optical path length $[\bar{B}B_0]$ in the formula (4.7) for the aberration difference $W_{Q'} - W_{\bar{Q}'}$.

We shall now show how substitution for [B'B] and [f3BO], from Eqs. (5.8) and (5.16), respectively, into the formula (4.7) for $W_{Q'} - W_{\bar{Q}'}$ leads to the canonical relations for the imagery of Q at Q'. Formulas for the two local magnifications between the element of object at Q and its image at Q' are also obtained.

(6) Canonical Coordinates and Local Magnifications

The substitutions for [B'B'] and [BBO], from Eqs.(5.8) and (5.16), into Eq. (4.7) give

-- = )X'+ ( M -I')YI

+ [A(1)F(2) + A(2)F(2) + A(3)F(1)]

n{(Q - $,X + 0o - V)R

- [A(1)F(2) + A(2)F(2) + A(1)F(3)] + A(2)F(2)(6.1)

for the difference in wave front aberration between the image space associated rays from Q and Q. The higher-order terms in Eq. (6.1) are left there in the groups that arise in Eqs. (5.8), (5.16), and (4.7), in order to show the origins of the different terms. Although terms such as A(2)F(2) will be included in A(1)F(2), it is nevertheless useful to keep them separate. With this in mind, and grouping some of the higher-order terms together, Eq. (6.1) gives

WQ'- W-Q'_ n'('- ')X' + n'(n'- T)Y'

RI

n(Q,- ,)X + n( - ij)Y

+ [A(3)F(1) + A(1)F(2) + A(2)F(2) + A(1)F(3)](6.2)

into which we shall substitute suitably reduced forms for the aperture and field variables.

The derivations of the formulas (4.7), (5.8), and (5.16), leading to the result (6.2), made no explicit reference to the orientations of the (X,Y) and (X',Y') axes around the base ray QE ... E'Q'. We now stipulate that these shall be the canonical axes, lying in the principal azimuths for the imagery of Q at Q'. The axes $(X_S, Y_T)$, $(X'_S, Y'_T)$ and $(\xi_S, \eta_T)$, $(\xi'_S, \eta'_T)$ are then as shown in Fig. 3. The determination of the azimuths 0 and 4/ of the principal azimuths from ray tracing is considered in Part 2.

We now introduce normalized forms of the canonical coordinates $(X_S, Y_T)$ and $(X'_S, Y'_T)$. A ray, QPS in Fig. 3, is assumed to be traced with pupil coordinates $(X_S = \hat{X}_S, Y_T = 0)$. This is conveniently chosen to be the edge S ray. Similarly, the ray QPT of Fig. 3 is assumed to be traced, where PT has coordinates $(X_S = 0, Y_T = \hat{Y}_T)$, it again being convenient to choose the edge T ray. The coordinates of the points PS and PT are used to define the reduced coordinates

$$x_S = \frac{X_S}{\hat{X}_S}, \qquad y_T = \frac{Y_T}{\hat{Y}_T} \tag{6.3}$$

for any point on the reference sphere for Q.

The exit pupil coordinates of the ray $(\hat{X}_S, 0)$ will be given by

XS = dilXs + A(2), YT = A(2) (6.4)

in accordance with Eqs. (3.15), which relations hold when the canonical axes are employed. For the same reason the exit pupil coordinates of the ray $(0, \hat{Y}_T)$ will be of the forms

X = A(2), YT = a31'T + A(2), (6.5)

again as in Eqs. (3.15). We employ the quantities

X's = an1Xs, YT = G31yT (6.6)

as normalizing factors for the exit pupil. The reduced exit pupil coordinates are then defined to be

$$x'_S = \frac{X'_S}{\hat{X}'_S}, \qquad y'_T = \frac{Y'_T}{\hat{Y}'_T} \tag{6.7}$$

for any point $(X'_S, Y'_T)$. It will be noted from Eqs. (6.4) and (6.5) that in a paraxial type approximation, which omits terms in the square and higher powers of the aperture, the rays $(\hat{X}_S, 0)$ and $(0, \hat{Y}_T)$ will emerge with pupil coordinates $(\hat{X}'_S, 0)$ and $(0, \hat{Y}'_T)$, respectively.

Using the forms (3.15) for $X'_S$ and $Y'_T$, and the definitions (6.6) of $\hat{X}'_S$ and $\hat{Y}'_T$, the reduced exit pupil coordinates (6.7) for any ray with entrance pupil coordinates $(X_S, Y_T)$ will be given by

X'5 a1 1Xs + A(2)XS = , = a = xs + A(2),

I YT 31YT (+ A ,Y=TT =' 3lkT =T+A(

(6.8)

using the definitions (6.3) of $x_S$ and $y_T$. The manner in which the canonical pupil variables $(x_S, y_T)$ and $(x'_S, y'_T)$ are employed in ray tracing is dealt with in Part 2.
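The normalization of the pupil coordinates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `a_s` stands for the linear coefficient of the S-azimuth relation in Eqs. (3.15), and `c_s` is a hypothetical coefficient standing in for the A(2) terms; neither value comes from the paper. The sketch shows that the reduced exit pupil coordinate differs from the reduced entrance pupil coordinate only by the second-degree term.

```python
# Hypothetical coefficients, chosen only for illustration.
a_s = -0.8      # linear (paraxial type) coefficient, stands for that of Eqs. (3.15)
c_s = 0.002     # stand-in for the second-degree A(2) contribution

X_hat = 10.0                # entrance pupil coordinate of the edge S ray
X_hat_prime = a_s * X_hat   # normalizing factor for the exit pupil, as in Eqs. (6.6)


def exit_pupil_X(X):
    """Exit pupil coordinate: linear term plus an illustrative A(2) term."""
    return a_s * X + c_s * X ** 2


X = 5.0
x = X / X_hat                               # reduced entrance coordinate, Eq. (6.3)
x_prime = exit_pupil_X(X) / X_hat_prime     # reduced exit coordinate, Eq. (6.7)

# The difference is purely the second-degree term, as in Eqs. (6.8).
difference = x_prime - x
print(x, x_prime, difference)
```

Setting `c_s = 0` reproduces the paraxial type approximation, in which the reduced coordinates are equal.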

We now return to the formula (6.2) and consider the case when the axes of $(X,Y)$, $(X',Y')$, $(\xi,\eta)$, and $(\xi',\eta')$ are taken to be the canonical axes $(X_S,Y_T)$, $(X'_S,Y'_T)$, $(\xi_S,\eta_T)$, and $(\xi'_S,\eta'_T)$. We write $X_S = \hat{X}_S x_S$, $Y_T = \hat{Y}_T y_T$ and $X'_S = \hat{X}'_S x'_S$, $Y'_T = \hat{Y}'_T y'_T$; and Eq. (6.2) then gives

WQ-W~ JnleT -| R. ) XS + |r4 R. HI) Y1'WQ' - WQ' = nZ 5 I Is~ I+ Y

-{nS(s - _s)} XS {n'|T(?T - IT)} YT

+ [A(3)F(1) + A(1)F(2) + A(2)F(2) + A(1)F(3)],(6.9)

and we now define normalized object and image coordinates

nkS(s - (s)Gs =R

RlT(77T iT)HT =

GS g~(s = - k') HT = 07l -G~~ = RI , H~~~ = R~i'

(6.10)

(6.11)

for the canonical coordinates of the points Q and Q', respectively. The formula (6.9) is then written

$$W_{Q'} - W_{\bar{Q}'} = (G'_S x'_S - G_S x_S) + (H'_T y'_T - H_T y_T) + [A(3)F(1) + A(1)F(2) + A(2)F(2) + A(1)F(3)], \tag{6.12}$$

where A(n) and F(n) now denote quantities of degree ≥ n in powers and products of the variables $(x_S, y_T)$ and $(G_S, H_T)$, respectively.

We shall first show that, in the limit of small object height, $G'_S = G_S$ and $H'_T = H_T$ provided (as here) the canonical axes in the principal azimuths are employed. In this case, and only in this case, $(x'_S, y'_T)$ have the forms (6.8), namely,

$$x'_S = x_S + A(2), \qquad y'_T = y_T + A(2), \tag{6.13}$$

and Eq. (6.12) may therefore be written

$$W_{Q'} - W_{\bar{Q}'} = (G'_S - G_S)x_S + (H'_T - H_T)y_T + A(2)F(1) + [A(3)F(1) + A(1)F(2) + A(2)F(2) + A(1)F(3)], \tag{6.14}$$

which, since it holds for any ray from Q, must be valid for all values of $x_S$ and $y_T$. Now the base rays E'Q' and R'Q' in the image space will be normal to the elements at E' of the wave fronts forming the images at Q' and Q'. These wave fronts will thus each have a common tangent plane at E' with its reference sphere. It follows that $W_{Q'}$ and $W_{\bar{Q}'}$ will both contain only terms of the second degree and higher in the aperture variables $(x_S, y_T)$. The aberration $W_{\bar{Q}'}$ will, of course, be independent of the field variables, but $W_{Q'}$ could contain terms of the first degree in these variables. We may thus write Eq. (6.14) as

$$[A(2) + A(2)F(1)] - [A(2)] = (G'_S - G_S)x_S + (H'_T - H_T)y_T + [A(2)F(1) + A(3)F(1) + A(1)F(2) + A(2)F(2) + A(1)F(3)],$$

and, since this will be valid for all values of $(x_S, y_T)$, we may equate the linear terms in these variables to obtain

$$(G'_S - G_S)x_S + (H'_T - H_T)y_T - [A(1)F(2) + A(1)F(3)] = 0, \tag{6.15}$$

which, for $y_T = 0$ and $x_S = 0$, give

$$G'_S = G_S + F(2) + F(3), \qquad H'_T = H_T + F(2) + F(3) \tag{6.16}$$

for the (reduced) coordinates $(G'_S, H'_T)$ of the image of the point $(G_S, H_T)$, after having canceled the aperture dependence from the error terms.

In the limit of small field size, Eqs. (6.16) will reduce to $G'_S = G_S$ and $H'_T = H_T$. For a differentially small object, the geometrical image of the point Q, coordinates $(G_S, H_T)$, will thus be at Q' with coordinates

$$G'_S = G_S, \qquad H'_T = H_T, \tag{6.17}$$

and we shall now take these to define the position of the ideal image of Q, regarding the quantities F(2) and F(3) in Eqs. (6.16) as geometrical distortion terms. This is a correct interpretation, since they derive from the terms A(1)F(2) and A(1)F(3) in Eq. (6.2), which will not be present in $W_{\bar{Q}'}$ and represent terms in $W_{Q'}$ which are linear in the aperture. Such terms in the wave front aberration correspond to transverse focal shifts by amounts depending on terms of the second degree and higher in the coordinates of $Q'$ relative to $\bar{Q}'$.

Taking Eqs. (6.17) to define the position of the image point Q', the definitions (6.10) and (6.11) of the canonical object and image coordinates give

s- s n.Xs/R

M = rlT - li nr/R (6.18)M T-i = TR

liT - flT nl'rTR'~/

for the two principal magnifications for the imagery around the base ray from Q to Q'. These, like the Gaussian magnification for a rotationally symmetric optical system, are strictly speaking differential magnifications. The geometrical significance of the magnification formulas (6.18) is considered later. In the limit of small object size they are, of course, exact formulas. A square with sides parallel to the canonical axes $(\xi_S, \eta_T)$ is thus always imaged as a rectangle or a square with sides parallel to the axes $(\xi'_S, \eta'_T)$, with magnifications $M_S$ and $M_T$.

Taking the geometrical image point Q' to be given by $G'_S = G_S$ and $H'_T = H_T$, the general formula (6.12) becomes

$$W_{Q'} - W_{\bar{Q}'} = (x'_S - x_S)G'_S + (y'_T - y_T)H'_T + [A(3)F(1) + A(1)F(2) + A(2)F(2) + A(1)F(3)], \tag{6.19}$$

in which the higher-order term A(3)F(1) is that which occurs in the formula (5.1) for (B'B'), arising from the presence of the factor $1/\cos \delta U' = 1 + A(2)$. The term A(3)F(1) in Eq. (6.19) is thus zero when the image space associated ray, B'B0N' in Fig. 6, passes exactly through the geometrical image point Q' and so has no angular aberration; it is also very small for any nominally corrected system. Thus, for a corrected system, the term A(3)F(1) may be omitted from Eq. (6.19).

We now denote the wave front aberration of the ray with exit pupil coordinates $(x'_S, y'_T)$ on the reference sphere for Q', and belonging to the pencil forming the image at Q', by $W(G'_S, H'_T; x'_S, y'_T)$. With this notation, and omitting the higher-order term A(3)F(1), the formula (6.19) is written

$$W(G'_S, H'_T; x'_S, y'_T) = W(0, 0; x'_S, y'_T) + (x'_S - x_S)G'_S + (y'_T - y_T)H'_T + [A(1)F(2) + A(2)F(2) + A(1)F(3)],$$

from which it follows that

$$\frac{\partial W}{\partial G'_S} = x'_S - x_S, \qquad \frac{\partial W}{\partial H'_T} = y'_T - y_T, \tag{6.20}$$

the derivatives being those at $\bar{Q}'$, that is, for $G'_S = H'_T = 0$.

The formulas (6.20) are the first pair of canonical relations. They are exact for a corrected system and of good accuracy for any system which is nominally corrected. An important aspect of these relations is in relation to the theory of the optical transfer function. This requires that the isoplanatism condition be satisfied. For the image in the neighborhood of Q', this condition is $\partial W/\partial G'_S = 0$ and $\partial W/\partial H'_T = 0$. Thus, from Eqs. (6.20) we find that the image patch at Q' will be isoplanatic if

$$x'_S = x_S, \qquad y'_T = y_T \tag{6.21}$$

for the pencil of rays forming the image at Q'. Conversely, for any corrected system any (finite aperture) ray entering with coordinates $(x_S, y_T)$ will emerge with coordinates $x'_S = x_S$ and $y'_T = y_T$. Moreover, since

$$\delta W = \frac{\partial W}{\partial G'_S}\,\delta G'_S + \frac{\partial W}{\partial H'_T}\,\delta H'_T = (x'_S - x_S)\,\delta G'_S + (y'_T - y_T)\,\delta H'_T \tag{6.22}$$

gives the change in aberration for a shift of the image point from $\bar{Q}'$ to $Q'$, the values of $(x'_S - x_S)$ and $(y'_T - y_T)$ express the departure from fulfillment of the generalized sine condition, which is the condition for isoplanatism. The geometrical significance of this is considered later in connection with the principal magnifications.
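The use of the relation (6.22) as a test of isoplanatism can be sketched as follows. This Python fragment is a minimal illustration, not a method taken from the paper: the function names, the tolerance, and the sample ray data are all assumptions. Each ray is described by its reduced entrance pupil coordinates and its reduced exit pupil coordinates.

```python
def aberration_change(x, y, x_p, y_p, dG, dH):
    """Change in wave front aberration for a small image shift (dG, dH),
    in the form of Eq. (6.22): (x' - x) dG' + (y' - y) dH'."""
    return (x_p - x) * dG + (y_p - y) * dH


def is_isoplanatic(rays, tol=1e-9):
    """True if every ray satisfies x' = x and y' = y within tol, Eq. (6.21)."""
    return all(abs(x_p - x) <= tol and abs(y_p - y) <= tol
               for (x, y, x_p, y_p) in rays)


# Hypothetical ray data: (x_S, y_T, x'_S, y'_T) in reduced coordinates.
ideal_rays = [(0.5, 0.0, 0.5, 0.0), (0.0, 1.0, 0.0, 1.0)]
real_rays = [(0.5, 0.0, 0.502, 0.0), (0.0, 1.0, 0.0, 0.997)]

print(is_isoplanatic(ideal_rays))   # every departure is zero
print(is_isoplanatic(real_rays))    # small departures from the sine condition

# Aberration change of the first real ray for a shift dG' = 2, dH' = 0.
print(aberration_change(0.5, 0.0, 0.502, 0.0, 2.0, 0.0))
```

The quantities `x_p - x` and `y_p - y` are the departures from the generalized sine condition referred to in the text.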

The second pair of canonical relations is that the (reduced) transverse aberrations of the rays of the pencil forming the image at Q' are given by the aperture derivatives of the wave front aberration. Thus, if the general ray, B'D'M' in Fig. 8, cuts the image plane at M' with coordinates $(\xi'_S + \delta\xi'_S, \eta'_T + \delta\eta'_T)$, where $(\xi'_S, \eta'_T)$ are the coordinates of Q', the components of transverse aberration of the ray are given by

$$\delta\xi'_S = -\frac{R'}{n'}\frac{\partial W}{\partial X'_S}, \qquad \delta\eta'_T = -\frac{R'}{n'}\frac{\partial W}{\partial Y'_T}, \tag{6.23}$$

in which R' = (E'Q') is the radius of the reference sphere and n' is the refractive index of the image space. With $X'_S = \hat{X}'_S x'_S$ and $Y'_T = \hat{Y}'_T y'_T$ from the definitions (6.7), the formulas (6.23) become

$$\delta G'_S = -\frac{\partial W}{\partial x'_S}, \qquad \delta H'_T = -\frac{\partial W}{\partial y'_T},$$

where

$$\delta G'_S = \frac{n'\hat{X}'_S\,\delta\xi'_S}{R'}, \qquad \delta H'_T = \frac{n'\hat{Y}'_T\,\delta\eta'_T}{R'}$$

are the reduced forms of the components of transverse aberration of the ray. In Eqs. (6.23) and (6.24), the wave front aberration is that shown as W = [B'0 D'] in Fig. 8 and is $W(0,0; x'_S, y'_T)$ in the notation used above. The geometrical significance of the normalizing factors used for $\delta\xi'_S$ and $\delta\eta'_T$, which are the same as those used in the definitions (6.11) of $G'_S$ and $H'_T$, is considered later.
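The statement that the reduced transverse aberrations are the negative aperture derivatives of the wave front aberration can be illustrated numerically. The Python sketch below is an assumption-laden example: the wave front function (defocus plus primary spherical aberration with coefficients `w20` and `w40`), the step size, and the values of n', R', and the normalizing factor are all hypothetical and chosen only to show the computation.

```python
def wavefront(x, y, w20=1.0, w40=0.5):
    """Illustrative wave front aberration W(0,0; x', y') in reduced pupil
    coordinates: defocus plus primary spherical aberration (assumed form)."""
    r2 = x * x + y * y
    return w20 * r2 + w40 * r2 * r2


def reduced_transverse_aberration(W, x, y, h=1e-6):
    """(dG', dH') = (-dW/dx', -dW/dy') by central differences."""
    dG = -(W(x + h, y) - W(x - h, y)) / (2.0 * h)
    dH = -(W(x, y + h) - W(x, y - h)) / (2.0 * h)
    return dG, dH


def real_space_aberration(dG, n_prime, X_hat_prime, R_prime):
    """Invert dG' = n' X_hat' d(xi') / R' to recover the real-space component."""
    return dG * R_prime / (n_prime * X_hat_prime)


dG, dH = reduced_transverse_aberration(wavefront, 0.5, 0.0)
print(dG, dH)

# Hypothetical image space data: n' = 1, R' = 50, X_hat' = 10.
print(real_space_aberration(dG, 1.0, 10.0, 50.0))
```

For this wave front the analytic derivative at (0.5, 0) is 2·w20·x + 4·w40·x³ = 1.25, so the finite difference result can be checked directly against it.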

The formulas (6.25) are the second pair of canonical relations. We have thus shown that principal azimuths always exist around the base ray in the object and image spaces; also that, if the axes of the object and entrance pupil and those of the image and exit pupil are taken to lie in these azimuths, canonical relations exist which are identical in form to those for rotationally symmetric systems. These results are equally valid for gradient-index systems and for holographic optical elements.

Fig. 8. Transverse ray aberration.

Moreover, as shown above, if the axes are chosen to lie in the principal azimuths around the base ray, a square with sides parallel to these axes in the object is always imaged as a rectangle or a square with sides parallel to the corresponding axes in the image. It is particularly important to note this fundamental result, because of the misleading distinction made by earlier writers between so-called orthogonal and nonorthogonal systems. Luneburg,6 for example, states that a nonorthogonal system images a square as a parallelogram, giving torsion in the image, and also that it is only with an orthogonal system that there is no such torsion. This basic error arises from the irrelevant choice of the principal sections of the astigmatic pencil forming the image at Q' as the axes to which the images of neighboring object points are referred, coupled with ignorance of the existence of the principal azimuths which have now been shown to exist. The correct but scarcely necessary distinction is simply that these astigmatic axes lie in the principal azimuths for an orthogonal system and in different azimuths for a nonorthogonal system.

(7) Geometrical Significance of the Principal Magnifications and the Generalized Sine-Condition Formulas

We shall now consider the geometrical significance of the definitions of the reduced object coordinates $(G_S, H_T)$ and the corresponding image coordinates $(G'_S, H'_T)$, and also how these relate to ray trace data. Figure 9(a) shows the base ray QE and the edge ray QPS in the object space. The point PS, on the reference sphere PSE of center Q, has coordinates $(X_S = \hat{X}_S, Y_T = 0)$, Q is the point (Qs,VT,R) with origin at E, and (EQ) = R. In accordance with what has been said above, the sign of R is negative for the case shown. We shall use $(L_S, M_T, N)$ to denote the direction cosines of any ray relative to the canonical axes. Then, if $\hat{L}_S$ and $\bar{L}_S$ are the direction cosines of the rays QPS and QE relative to the $X_S$ axis, the coordinates of PS and E will be given by

Is = s + Ls(-R), O = + Ls(-R),

so that

= = (-Ls) -(-s),

and the definition (6.11) thus gives


Gsns(s - = n[(-Ls) - (-ls)](Qs - s), (7.1)

where (QSMT) are the coordinates of the nearby object point Q.

In Fig. 9(a), the plane QFJJ is parallel to the coordinate plane Xs = 0; and QJ and QJ are the projections on this plane of the rays QPS and QE, respectively. We denote by $\alpha_S$ and $\bar{\alpha}_S$ the angles between the rays QPS and QE and this plane, choosing the signs of these angles to be opposite the signs of the direction cosines $\hat{L}_S$ and $\bar{L}_S$, respectively. We then have $\hat{L}_S = -\sin\alpha_S$ and $\bar{L}_S = -\sin\bar{\alpha}_S$; and Eq. (7.1) is written

$$G_S = n(\sin\alpha_S - \sin\bar{\alpha}_S)(\xi_S - \bar{\xi}_S), \tag{7.2}$$

which is the form suitable for the case when the object plane is at a finite distance.

If the object plane is at infinity, we express the definition in Eq. (7.1) using the notation of Fig. 9(b). The lines FE and FE are, respectively, the projections of the (object space associated) base ray from Q, that is, QE, and the base ray from Q, that is, QE, on the plane XT = 0. We denote the angles between these lines and the axis OE by $\beta_S$ and $\bar{\beta}_S$, respectively, as indicated in Fig. 9(b). The signs of $\beta_S$ and $\bar{\beta}_S$ are chosen to be such that the coordinates of $Q$ and $\bar{Q}$ are given by

$$\xi_S = -R\tan\beta_S, \qquad \bar{\xi}_S = -R\tan\bar{\beta}_S, \tag{7.3}$$

where R = (EO) is negative for the case shown in Fig. 9. This choice of signs is to give agreement with the convention that would be used in the case of an axially symmetric system, with $Q$ and $\bar{Q}$ both on the $\xi_S$ axis. Using Eqs. (7.3), we find from Eq. (7.1) that

$$G_S = -n\hat{X}_S N(\tan\beta_S - \tan\bar{\beta}_S) \tag{7.4}$$

since RIR = N, the third direction cosine of the base rayQE relative to the canonical axes. In terms of the di-rection cosines (LsMT,N) of this base ray the angle Osis given numerically by

costs TV Age(7.5)

and similarly for the angle O3s, the signs being chosen as implied in Eqs. (7.3).

We shall regard $n(\sin\alpha_S - \sin\bar{\alpha}_S)$ as the (angular) numerical aperture in the $X_S$ direction, and $n\hat{X}_S$ as the linear measure of the numerical aperture. In Eq. (7.2) the linear coordinate of the point $Q$ relative to $\bar{Q}$ uses the (angular) numerical aperture as the normalizing factor; in Eq. (7.4) the angular coordinate is normalized using the linear numerical aperture. The canonical variable $G_S$ is, of course, the same in all cases, and either Eq. (7.2) or Eq. (7.4) is used to express $G_S$ in real space depending on whether the object is at a (not too large) finite distance or at a large (or infinite) distance. There is, of course, no approximation in the transition from Eq. (7.2) to Eq. (7.4).

Fig. 9. Angles $\alpha_S, \bar{\alpha}_S$ and $\beta_S, \bar{\beta}_S$.

Fig. 10. Angles $\alpha_T, \bar{\alpha}_T$ and $\beta_T, \bar{\beta}_T$.

The reduced coordinate $H_T$ is similarly treated, the corresponding angles being indicated in Fig. 10. The ray QPT in Fig. 10(a) cuts the reference sphere centered on Q in PT, with coordinates $(X_S = 0, Y_T = \hat{Y}_T)$, and we shall thus have, for the coordinates of PT and E,

T = T + fT(-R), 0 = T + MT(-R),

where $\hat{M}_T$ and $\bar{M}_T$ are direction cosines of the ray QPT and the base ray QE relative to the canonical axes. The definition (6.11) of $H_T$ thus gives

HT - i) = n[(-MT) - (-MT)]( - V), (7.6)

where $(\xi_S, \eta_T)$ and $(\bar{\xi}_S, \bar{\eta}_T)$ are again the coordinates of the object points $Q$ and $\bar{Q}$.

In Fig. 10(a) the plane QGKK is parallel to the coordinate plane YT = 0; and QK and QK are the projections on this plane of the rays QPT and QE, respectively. We denote by $\alpha_T$ and $\bar{\alpha}_T$ the angles between the rays QPT and QE and this plane, choosing the signs of these angles to be opposite the signs of the direction cosines $\hat{M}_T$ and $\bar{M}_T$, respectively. We then have $\hat{M}_T = -\sin\alpha_T$ and $\bar{M}_T = -\sin\bar{\alpha}_T$; and Eq. (7.6) is written

$$H_T = n(\sin\alpha_T - \sin\bar{\alpha}_T)(\eta_T - \bar{\eta}_T), \tag{7.7}$$

which is the form suitable for the case when the object plane is at a finite distance.

If the object plane is at infinity, the definition in Eq. (7.6) is expressed using the notation of Fig. 10(b) in which the lines GE and GE are the projections of the base rays QE and QE on the plane As = 0. We denote the angle between these lines and the axis OE by $\beta_T$ and $\bar{\beta}_T$, as indicated in Fig. 10(b), the signs being chosen to give, for the coordinates of $Q$ and $\bar{Q}$,

$$\eta_T = -R\tan\beta_T, \qquad \bar{\eta}_T = -R\tan\bar{\beta}_T, \tag{7.8}$$

where R = (EO) is negative for the case shown. Using Eqs. (7.8), we find from Eq. (7.6)

$$H_T = -n\hat{Y}_T N(\tan\beta_T - \tan\bar{\beta}_T), \tag{7.9}$$

where N = RIR is the direction cosine of the base rayQE. Corresponding to Eq. (7.5), the angle AT is givennumerically by

coslr =X (7.10)

and similarly for the angle AT, the signs being as impliedin Eqs. (7.8).

We regard $n(\sin\alpha_T - \sin\bar{\alpha}_T)$ as defining the (angular) numerical aperture in the $Y_T$ direction and $n\hat{Y}_T$ as the linear measure of this numerical aperture. The comments made regarding the normalization of the object coordinate $G_S$ in the $X_S$ direction also apply to the reduced coordinate $H_T$.

For the image space the diagrams will correspond exactly to those of Figs. 9 and 10 with primes, except for the sign of R' which, in all the earlier diagrams, has been shown for the case when R' is positive. This, of course, makes no difference to the essentially algebraic formulas. The points $P'_S$ and $P'_T$ on the reference sphere for Q' are defined to be those having coordinates $(X'_S = \hat{X}'_S, Y'_T = 0)$ and $(X'_S = 0, Y'_T = \hat{Y}'_T)$, respectively, where $\hat{X}'_S$ and $\hat{Y}'_T$ are the paraxial approximations, defined in Eqs. (6.6), to the exit pupil coordinates of the finite rays assumed to be traced with $(X_S = \hat{X}_S, Y_T = 0)$ and $(X_S = 0, Y_T = \hat{Y}_T)$. These rays will in general have small but nonzero values of $X'_S - \hat{X}'_S = A(2)$, $Y'_T = A(2)$ and $X'_S = A(2)$, $Y'_T - \hat{Y}'_T = A(2)$, respectively. They will thus not pass exactly through the points $P'_S$ and $P'_T$. Also, these rays will in general not pass exactly through the geometrical image point Q'. In the object space the lines QPS and QPT of Figs. 9(a) and 10(a) are actual rays from Q. In the image space the corresponding lines, $P'_S Q'$ and $P'_T Q'$ indicated in Fig. 3, are merely the joins of $P'_S$ and $P'_T$ to Q'. If the image around Q' is exactly isoplanatic, the reduced coordinates of the rays satisfy Eqs. (6.21), namely, $x'_S = x_S$ and $y'_T = y_T$. In this case the emergent rays in question will pass exactly through the points $P'_S$ and $P'_T$. If these rays also have no transverse aberration, they will both pass exactly through Q'. In this case, the lines $P'_S Q'$ and $P'_T Q'$ will coincide exactly with the rays: in other cases they are merely geometrical constructions which are close approximations to the actual rays. These lines and the coordinate axes are as in the image part of Fig. 3.

We shall use $(L'_S, M'_T, N')$ for the direction cosines of any ray or line relative to the image space canonical axes. The coordinates of the point $P'_S$ and E' will then be given by

1s = As - SRl, 0 = As,-LR',

where L' and L' are direction cosines of PSQ' and E'Q',respectively, ('s,iT,R') are the coordinates of Q' withorigin at E', and R' = (E'Q'). Similarly, if M'T and MTare direction cosines of PTQ' and E'', respectively, thecoordinates of PT and E' are given by

1'T = X-OT R -° = XT - R'k,

again with the origin at E'. Using the above relations, the definitions (6.11) give

GS = = n[(-L) - (-Es)](s -

H' = = n'[(-&'T) - (-M)I(n' - HIT) (7.11)

for the reduced coordinates of the geometrical image point Q' relative to Q'.

Corresponding to the angles $\alpha_S$ and $\bar{\alpha}_S$ of Fig. 9(a) for the object space, we now use the angles $\alpha'_S$ and $\bar{\alpha}'_S$ between the line $P'_S Q'$ and the base ray E'Q', respectively, and the plane $\xi'_S = 0$, so that $\hat{L}'_S = -\sin\alpha'_S$ and $\bar{L}'_S = -\sin\bar{\alpha}'_S$, the signs of the angles $\alpha'_S$ and $\bar{\alpha}'_S$ being chosen to be opposite those of the direction cosines. Similarly, corresponding to the angles $\alpha_T$ and $\bar{\alpha}_T$ of Fig. 10(a), if $\alpha'_T$ and $\bar{\alpha}'_T$ are angles between the line $P'_T Q'$ and the base ray E'Q', respectively, and the plane $\eta'_T = 0$, we shall have $\hat{M}'_T = -\sin\alpha'_T$ and $\bar{M}'_T = -\sin\bar{\alpha}'_T$. Using the above angles, the formulas of Eqs. (7.11) give

$$G'_S = n'(\sin\alpha'_S - \sin\bar{\alpha}'_S)(\xi'_S - \bar{\xi}'_S), \qquad H'_T = n'(\sin\alpha'_T - \sin\bar{\alpha}'_T)(\eta'_T - \bar{\eta}'_T), \tag{7.12}$$

corresponding to Eqs. (7.2) and (7.7) for the object space.

When the image is at infinity, we again proceed as in the object space. Thus $\beta'_S$ and $\bar{\beta}'_S$ are the angles between the projections of the base rays E'Q' and E'Q', respectively, on the plane $\eta'_T = 0$ and the axis E'O', corresponding to the angles $\beta_S$ and $\bar{\beta}_S$ in Fig. 9(b). Similarly, $\beta'_T$ and $\bar{\beta}'_T$ correspond to the angles $\beta_T$ and $\bar{\beta}_T$ in Fig. 10(b), and they are the angles between the projections of the base rays E'Q' and E'Q', respectively, on the plane $\xi'_S = 0$ and the axis E'O'. Thus, as in Eqs. (7.3) and (7.8) for the object space, the coordinates of the image points $Q'$ and $\bar{Q}'$ are given by

$$\xi'_S = -R'\tan\beta'_S, \qquad \bar{\xi}'_S = -R'\tan\bar{\beta}'_S,$$

$$\eta'_T = -R'\tan\beta'_T, \qquad \bar{\eta}'_T = -R'\tan\bar{\beta}'_T, \tag{7.13}$$

where R' = (E'O'), and the signs of the four angles are chosen to be consistent with Eqs. (7.13). Thus, in terms of the direction cosines of the base ray E'Q', the angles I's and igT are given numerically by

cos/'s = , cos'r = (7.14)-MT ~ -ias in Eqs. (7.5) and (7.10) for the object space and sim-ilarly for the angles fs and fT. Using Eqs. (7.13) for theimage coordinates (Ts,7'T) and ('s,?1T), the definitionsin Eqs. (7.11) give

$$G'_S = -n'\hat{X}'_S N'(\tan\beta'_S - \tan\bar{\beta}'_S),$$

$$H'_T = -n'\hat{Y}'_T N'(\tan\beta'_T - \tan\bar{\beta}'_T), \tag{7.15}$$

corresponding to Eqs. (7.4) and (7.9) for the object space.

The (angular) numerical apertures for the image space in the principal azimuths are taken to be $n'(\sin\alpha'_S - \sin\bar{\alpha}'_S)$ and $n'(\sin\alpha'_T - \sin\bar{\alpha}'_T)$; and $n'\hat{X}'_S$ and $n'\hat{Y}'_T$ are the linear measures of these numerical apertures. Using these, the normalized transverse ray aberrations for any ray, defined by Eqs. (6.25), are written

$$\delta G'_S = n'(\sin\alpha'_S - \sin\bar{\alpha}'_S)\,\delta\xi'_S,$$

$$\delta H'_T = n'(\sin\alpha'_T - \sin\bar{\alpha}'_T)\,\delta\eta'_T \tag{7.16}$$

or alternatively

G = -n'g'shN[tan(3's + 63s) - tan's],

HT = -n'?'rN'[tan(g'T + 601T) - tanT], (7.17)

where $\delta\beta'_S, \delta\beta'_T$ are the components of the angular aberration of the ray in question.

We may now express the principal magnifications, for the image at Q' of an element of the object at Q, directly in terms of data obtained from ray tracing. These are the two magnifications, $M_S$ and $M_T$ of Eqs. (6.18), obtained by equating the reduced coordinates $(G'_S, H'_T)$ of the geometrical image point Q' to the reduced coordinates $(G_S, H_T)$ of the object point Q. In the same manner, but using Eq. (7.1) for $G_S$, Eq. (7.6) for $H_T$, and Eqs. (7.11) for $G'_S$ and $H'_T$, we obtain

$$M_S = \frac{\xi'_S - \bar{\xi}'_S}{\xi_S - \bar{\xi}_S} = \frac{n[(-\hat{L}_S) - (-\bar{L}_S)]}{n'[(-\hat{L}'_S) - (-\bar{L}'_S)]},$$

$$M_T = \frac{\eta'_T - \bar{\eta}'_T}{\eta_T - \bar{\eta}_T} = \frac{n[(-\hat{M}_T) - (-\bar{M}_T)]}{n'[(-\hat{M}'_T) - (-\bar{M}'_T)]}, \tag{7.18}$$

or writing the direction cosines in terms of the angles $(\alpha_S, \alpha_T)$, $(\bar{\alpha}_S, \bar{\alpha}_T)$, $(\alpha'_S, \alpha'_T)$, and $(\bar{\alpha}'_S, \bar{\alpha}'_T)$:

$$M_S = \frac{n(\sin\alpha_S - \sin\bar{\alpha}_S)}{n'(\sin\alpha'_S - \sin\bar{\alpha}'_S)}, \qquad M_T = \frac{n(\sin\alpha_T - \sin\bar{\alpha}_T)}{n'(\sin\alpha'_T - \sin\bar{\alpha}'_T)}, \tag{7.19}$$

as also follows from equating the values of $G_S, H_T$ and $G'_S, H'_T$ in the formulas (7.2), (7.7), and (7.12). For the axial image of an axially symmetric system, we shall have $\bar{\alpha}_S = \bar{\alpha}_T = \bar{\alpha}'_S = \bar{\alpha}'_T = 0$, and $\alpha_S = \alpha_T = \alpha$, $\alpha'_S = \alpha'_T = \alpha'$: the two formulas of Eqs. (7.19) then both reduce to the classical magnification formula. For an extra-axial image of an axially symmetric system, we shall have Zes = 0, aT = °, a-s = 0, and aT = , and Eqs. (7.19) then give the formulas (3.19) of Ref. 7 for the magnifications in the sagittal and tangential sections, respectively.
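The evaluation of a principal magnification from ray trace angles, in the form of Eqs. (7.19), can be sketched as follows. The function name, argument names, and sample angles are assumptions made for illustration; the angles would in practice come from tracing the base ray and the edge ray in the relevant principal azimuth.

```python
import math


def principal_magnification(n, alpha, alpha_bar, n_prime, alpha_prime, alpha_bar_prime):
    """Magnification in one principal azimuth, in the form of Eqs. (7.19):
    n (sin a - sin a_bar) / [n' (sin a' - sin a_bar')].
    Unbarred angles belong to the edge ray, barred angles to the base ray."""
    numerator = n * (math.sin(alpha) - math.sin(alpha_bar))
    denominator = n_prime * (math.sin(alpha_prime) - math.sin(alpha_bar_prime))
    return numerator / denominator


# Axial image of an axially symmetric system: the base ray angles vanish,
# and the formula reduces to the classical n sin(a) / (n' sin(a')).
alpha = math.asin(0.1)          # hypothetical object space edge ray angle
alpha_prime = math.asin(-0.2)   # hypothetical image space edge ray angle
M_axial = principal_magnification(1.0, alpha, 0.0, 1.0, alpha_prime, 0.0)
print(M_axial)
```

The same function serves for either azimuth; only the angles supplied differ between the S and T cases.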

If the object or the image or both are at infinity, using one or other of the forms (7.4), (7.9), and (7.15) for $(G_S, H_T)$ and $(G'_S, H'_T)$, we may define and obtain formulas for either two principal focal lengths or (when both object and image are at infinity) the two principal angular magnifications.

We shall now note the forms taken by the generalized sine condition when written in terms of ray trace data. If $QB$ is a general ray from $Q$, cutting the reference sphere for $Q$ in $B$, and this ray emerges to cut the reference sphere for $Q'$ in $B'$, as in Fig. 2, for example, the wave front aberration of the image space associated ray through $B'$ for the image point $Q'$ will be given by

$$W(G'_S,H'_T;x'_S,y'_T) = W(0,0;x'_S,y'_T) + G'_S\frac{\partial W}{\partial G'_S} + H'_T\frac{\partial W}{\partial H'_T} \tag{7.20}$$

if higher powers of the image coordinates are neglected. Using Eqs. (6.20) for the two derivatives, Eq. (7.20) is written

$$W_{Q'} - W_{\bar{Q}'} = (x'_S - x_S)\,G'_S + (y'_T - y_T)\,H'_T \tag{7.21}$$

and since the geometric image of the object point $(G_S,H_T)$ is at $(G'_S = G_S, H'_T = H_T)$, we may use either the image or the object coordinates in Eqs. (7.20) and (7.21).
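The first-order change of aberration described above can be sketched numerically. The example assumes, as the Introduction states for the symmetric case, that the two derivatives in Eq. (7.20) are the differences between the exit pupil and entrance pupil canonical coordinates of the ray; the function name and the sample numbers are hypothetical.

```python
def aberration_change(x_in, y_in, x_out, y_out, G, H):
    """First-order change of wave front aberration for a ray.

    (x_in, y_in)   : canonical entrance pupil coordinates of the ray
    (x_out, y_out) : canonical exit pupil coordinates of the same ray
    (G, H)         : reduced coordinates of the displaced image point
    The derivatives of W are taken to be (x_out - x_in) and (y_out - y_in),
    which is an assumption of this sketch.
    """
    return (x_out - x_in) * G + (y_out - y_in) * H

# A ray whose exit pupil coordinate differs slightly from its entrance value.
dW = aberration_change(0.5, 0.2, 0.51, 0.2, 2.0, 3.0)
```

When the exit and entrance pupil coordinates agree, the function returns zero for every displacement $(G,H)$, which is the isoplanatic case.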

The sine condition, or the condition for isoplanatism, is the condition that the aberration shall not change for first-order displacements of the image point from $\bar{Q}'$ to $Q'$. For the ray $(x_S,y_T)$, this condition is thus expressed by

$$x'_S = x_S, \qquad y'_T = y_T, \tag{7.22}$$

which we term the generalized sine condition. From the definitions (6.7) and (6.3), the condition (7.22), written in terms of the ray data, becomes

$$\frac{X'_S}{\hat{X}'_S} = \frac{X_S}{\hat{X}_S}, \qquad \frac{Y'_T}{\hat{Y}'_T} = \frac{Y_T}{\hat{Y}_T}, \tag{7.23}$$

which are the forms appropriate to the case when the object and/or the image are at infinity.
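A ray-trace program could test the generalized sine condition for a single ray along the following lines. This is a sketch only: it assumes that each canonical pupil coordinate is the ray's intersection coordinate divided by the coordinate of the aperture-defining point in the same principal azimuth, and the function and parameter names are hypothetical.

```python
def sine_condition_defect(X_in, X_edge_in, X_out, X_edge_out):
    """Departure from the generalized sine condition in one azimuth.

    X_in, X_out           : ray coordinates on the entrance and exit
                            reference spheres
    X_edge_in, X_edge_out : coordinates of the aperture-defining points
                            used for normalization (assumed nonzero)
    Returns the exit pupil canonical coordinate minus the entrance one;
    zero means the condition holds for this ray.
    """
    return X_out / X_edge_out - X_in / X_edge_in

def is_isoplanatic(rays, tolerance=1e-6):
    """True if every (X_in, X_edge_in, X_out, X_edge_out) tuple passes."""
    return all(abs(sine_condition_defect(*ray)) <= tolerance for ray in rays)
```

The tolerance is a free choice of this sketch; in practice it would be set from the aberration change one is prepared to accept over the isoplanatic patch.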

For other cases, we note that, with $(L_S,M_T,N)$ for the direction cosines (relative to the canonical axes) of the general ray $(x_S,y_T)$, the coordinates of its intersection point with the reference sphere for $Q$ and those of the points $P_S$, $P_T$, and $E$ will be given by

$$X_S = \bar{\xi}_S + L_S(-R), \qquad \hat{X}_S = \bar{\xi}_S + \hat{L}_S(-R), \qquad 0 = \bar{\xi}_S + \bar{L}_S(-R),$$

$$Y_T = \bar{\eta}_T + M_T(-R), \qquad \hat{Y}_T = \bar{\eta}_T + \hat{M}_T(-R), \qquad 0 = \bar{\eta}_T + \bar{M}_T(-R),$$

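or, using the final of these relations for the values of $(\bar{\xi}_S,\bar{\eta}_T)$,

$$x_S = \frac{X_S}{\hat{X}_S} = \frac{(-L_S) - (-\bar{L}_S)}{(-\hat{L}_S) - (-\bar{L}_S)}, \qquad y_T = \frac{Y_T}{\hat{Y}_T} = \frac{(-M_T) - (-\bar{M}_T)}{(-\hat{M}_T) - (-\bar{M}_T)},$$

with corresponding expressions for $x'_S$ and $y'_T$. In terms of direction cosines, the generalized sine condition (7.22) is thus written

$$\frac{L'_S - \bar{L}'_S}{\hat{L}'_S - \bar{L}'_S} = \frac{L_S - \bar{L}_S}{\hat{L}_S - \bar{L}_S}, \qquad \frac{M'_T - \bar{M}'_T}{\hat{M}'_T - \bar{M}'_T} = \frac{M_T - \bar{M}_T}{\hat{M}_T - \bar{M}_T}, \tag{7.24}$$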

which we may usefully rewrite in terms of the angles $(\alpha_S,\alpha'_S;\bar{\alpha}_S,\bar{\alpha}'_S)$ and $(\alpha_T,\alpha'_T;\bar{\alpha}_T,\bar{\alpha}'_T)$ of Figs. 9(a) and 10(a) defined earlier. For the ray $(x_S,y_T)$, we define corresponding angles $(\theta_S,\theta'_S)$ and $(\theta_T,\theta'_T)$, where for the object space

$$L_S = -\sin\theta_S, \qquad M_T = -\sin\theta_T \tag{7.25}$$

are the direction cosines of the incident general ray from $Q$, and

$$L'_S = -\sin\theta'_S, \qquad M'_T = -\sin\theta'_T \tag{7.26}$$

are the direction cosines of the line joining the point $B'$ (where this ray cuts the reference sphere for $Q'$) to the image point $Q'$. Using these angles, the generalized sine condition (7.24) takes the form

$$\frac{\sin\theta'_S - \sin\bar{\alpha}'_S}{\sin\alpha'_S - \sin\bar{\alpha}'_S} = \frac{\sin\theta_S - \sin\bar{\alpha}_S}{\sin\alpha_S - \sin\bar{\alpha}_S}, \qquad \frac{\sin\theta'_T - \sin\bar{\alpha}'_T}{\sin\alpha'_T - \sin\bar{\alpha}'_T} = \frac{\sin\theta_T - \sin\bar{\alpha}_T}{\sin\alpha_T - \sin\bar{\alpha}_T}, \tag{7.27}$$

which reduces to the form of the classical sine condition when $\bar{\alpha}_S = \bar{\alpha}'_S = \bar{\alpha}_T = \bar{\alpha}'_T = 0$.

For the aberration of the image point $Q'$ to be the same as that of $\bar{Q}'$ for all rays, the relations (7.27) have to be satisfied for all rays. It is, thus, merely the ratios of the differences in the sines rather than of the sines themselves which have to be constant.

(8) Diffraction Theory of the Image

The coordinate axes $(X',Y',Z')$ and $(\xi',\eta')$ used here are identical to those used for the image space in Ref. 5. Equation (10.5) of this reference gives, for the complex amplitude at any point such as $P'$ in Fig. 8 in the point spread function of the image at $Q'$, the formula

$$U_{P'} = \exp(-i\epsilon) \iint f(X',Y') \exp\left\{ i2\pi \left[ \frac{n'X'(\xi' - \bar{\xi}')}{\lambda R'} + \frac{n'Y'(\eta' - \bar{\eta}')}{\lambda R'} \right] \right\} dX'\,dY', \tag{8.1}$$

apart from irrelevant constant factors. In Eq. (8.1), $f(X',Y')$ is the pupil function of the optical system, written in terms of the coordinates $(X',Y')$ of points on the reference sphere centered on $Q'$, and $(\xi',\eta')$ are the coordinates of the point $P'$. In the factor outside the integral in Eq. (8.1), the quantity $\epsilon$ is given by

2irn' J(t' -tI)2 + (1 - ij) 2 + 2(Q' -~)(Oq' - X (8.2)X I 2it'

with $\lambda$ the wavelength of the light.

Assuming the canonical axes to have been used, we now replace the variables in Eq. (8.1) by the canonical variables $(x'_S,y'_T)$ and $(G'_S,H'_T)$. Thus, we write

$$\frac{n'X'(\xi' - \bar{\xi}')}{\lambda R'} = \frac{n'\hat{X}'_S x'_S(\xi' - \bar{\xi}')}{\lambda R'} = \frac{G'_S}{\lambda}\,x'_S, \qquad \frac{n'Y'(\eta' - \bar{\eta}')}{\lambda R'} = \frac{n'\hat{Y}'_T y'_T(\eta' - \bar{\eta}')}{\lambda R'} = \frac{H'_T}{\lambda}\,y'_T, \tag{8.3}$$

and then denote by

$$u'_S = \frac{G'_S}{\lambda}, \qquad v'_T = \frac{H'_T}{\lambda} \tag{8.4}$$

the reduced coordinates $(G'_S,H'_T)$ of $P'$ relative to $Q'$ when $\lambda$ is employed as the unit of length. Using Eqs. (8.4) and omitting the constant factor $\hat{X}'_S\hat{Y}'_T$ from $dX'\,dY' = \hat{X}'_S\hat{Y}'_T\,dx'_S\,dy'_T$, the formula (8.1) gives

$$U_{P'} = \exp(-i\epsilon)\,F(u'_S,v'_T) \tag{8.5}$$

for the complex amplitude at $P'$, where

$$F(u'_S,v'_T) = \iint f(x'_S,y'_T) \exp\{ i2\pi(u'_S x'_S + v'_T y'_T) \}\,dx'_S\,dy'_T \tag{8.6}$$

is the Fourier transform of the pupil function expressed in terms of the canonical pupil variables $(x'_S,y'_T)$. The intensity in the point spread function is thus given by the squared modulus of the Fourier transform of the pupil function, and, like the geometrical theory, the whole of the diffraction theory of image formation for a general optical system becomes essentially the same as for an axially symmetric system.
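The statement that the intensity is the squared modulus of the Fourier transform of the pupil function can be illustrated numerically. The sketch below evaluates the transform by a plain midpoint sum over a pupil that is assumed to fill the unit circle in canonical coordinates; the function names, the grid size, and the aberration-free test pupil are choices of this example and are not taken from the paper.

```python
import cmath
import math

def pupil_transform(u, v, pupil, n=81):
    """Approximate the Fourier transform of the pupil function:
    the integral of f(x, y) exp{i 2 pi (u x + v y)} over the unit circle,
    evaluated by a midpoint sum on an n-by-n grid."""
    step = 2.0 / n
    total = 0j
    for i in range(n):
        x = -1.0 + (i + 0.5) * step
        for j in range(n):
            y = -1.0 + (j + 0.5) * step
            if x * x + y * y <= 1.0:  # keep only points inside the pupil
                phase = 2j * math.pi * (u * x + v * y)
                total += pupil(x, y) * cmath.exp(phase)
    return total * step * step

def psf_intensity(u, v, pupil, n=81):
    """Intensity as the squared modulus of the pupil transform."""
    return abs(pupil_transform(u, v, pupil, n)) ** 2

def clear_pupil(x, y):
    """Aberration-free pupil of uniform amplitude (hypothetical test case)."""
    return 1.0 + 0j

peak = psf_intensity(0.0, 0.0, clear_pupil)
dark_ring = psf_intensity(0.6098, 0.0, clear_pupil)
```

For this clear circular pupil the central value approaches the square of the pupil area, and the intensity falls close to zero near a reduced radius of 0.61, the first dark ring of the Airy pattern. An aberrated system would be modeled by passing a complex-valued pupil function in place of `clear_pupil`.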

References

1. R. W. Sampson, "A Continuation of Gauss's 'Dioptrische Untersuchungen,'" Proc. London Math. Soc. 24, 33 (1897).
2. T. Smith, "Imagery Around a Skew Ray," Trans. Opt. Soc. 31, 131 (1929/30).
3. M. Herzberger, "First-Order Laws in Asymmetrical Optical Systems," J. Opt. Soc. Am. 26, 354 (1936).
4. H. H. Hopkins, "Canonical Pupil Coordinates in Geometrical and Diffraction Image Theory," Jpn. J. Appl. Phys. Suppl. 4, 31 (1965).
5. H. H. Hopkins, "Calculation of the Aberrations and Image Assessment for a General Optical System," Opt. Acta 28, 667 (1981).
6. R. K. Luneburg, Mathematical Theory of Optics (publisher, location, 1964), pp. 234-243.
7. H. H. Hopkins, "Canonical and Real-Space Coordinates Used in the Theory of Image Formation," in Applied Optics and Optical Engineering, R. R. Shannon and J. C. Wyant, Eds. (Academic, New York, 1983), pp. 307-369.
8. H. H. Hopkins, "The Use of Diffraction-Based Criteria of Image Quality in Automatic Optical Design," Opt. Acta 13, 343 (1966).

15 August 1985 / Vol. 24, No. 16 / APPLIED OPTICS 2505