Matrix Methods for Optical Layout


Tutorial Texts Series

Matrix Methods for Optical Layout, Gerhard Kloos, Vol. TT77
Fundamentals of Infrared Detector Materials, Michael A. Kinch, Vol. TT76
Practical Applications of Infrared Thermal Sensing and Imaging Equipment, Third Edition, Herbert Kaplan, Vol. TT75
Bioluminescence for Food and Environmental Microbiological Safety, Lubov Y. Brovko, Vol. TT74
Introduction to Image Stabilization, Scott W. Teare, Sergio R. Restaino, Vol. TT73
Logic-based Nonlinear Image Processing, Stephen Marshall, Vol. TT72
The Physics and Engineering of Solid State Lasers, Yehoshua Kalisky, Vol. TT71
Thermal Infrared Characterization of Ground Targets and Backgrounds, Second Edition, Pieter A. Jacobs, Vol. TT70
Introduction to Confocal Fluorescence Microscopy, Michiel Müller, Vol. TT69
Artificial Neural Networks: An Introduction, Kevin L. Priddy and Paul E. Keller, Vol. TT68
Basics of Code Division Multiple Access (CDMA), Raghuveer Rao and Sohail Dianat, Vol. TT67
Optical Imaging in Projection Microlithography, Alfred Kwok-Kit Wong, Vol. TT66
Metrics for High-Quality Specular Surfaces, Lionel R. Baker, Vol. TT65
Field Mathematics for Electromagnetics, Photonics, and Materials Science, Bernard Maxum, Vol. TT64
High-Fidelity Medical Imaging Displays, Aldo Badano, Michael J. Flynn, and Jerzy Kanicki, Vol. TT63
Diffractive Optics: Design, Fabrication, and Test, Donald C. O'Shea, Thomas J. Suleski, Alan D. Kathman, and Dennis W. Prather, Vol. TT62
Fourier-Transform Spectroscopy Instrumentation Engineering, Vidi Saptari, Vol. TT61
The Power- and Energy-Handling Capability of Optical Materials, Components, and Systems, Roger M. Wood, Vol. TT60
Hands-on Morphological Image Processing, Edward R. Dougherty, Roberto A. Lotufo, Vol. TT59
Integrated Optomechanical Analysis, Keith B. Doyle, Victor L. Genberg, Gregory J. Michels, Vol. TT58
Thin-Film Design: Modulated Thickness and Other Stopband Design Methods, Bruce Perilloux, Vol. TT57
Optische Grundlagen für Infrarotsysteme, Max J. Riedl, Vol. TT56
An Engineering Introduction to Biotechnology, J. Patrick Fitch, Vol. TT55
Image Performance in CRT Displays, Kenneth Compton, Vol. TT54
Introduction to Laser Diode-Pumped Solid State Lasers, Richard Scheps, Vol. TT53
Modulation Transfer Function in Optical and Electro-Optical Systems, Glenn D. Boreman, Vol. TT52
Uncooled Thermal Imaging Arrays, Systems, and Applications, Paul W. Kruse, Vol. TT51
Fundamentals of Antennas, Christos G. Christodoulou and Parveen Wahid, Vol. TT50
Basics of Spectroscopy, David W. Ball, Vol. TT49
Optical Design Fundamentals for Infrared Systems, Second Edition, Max J. Riedl, Vol. TT48
Resolution Enhancement Techniques in Optical Lithography, Alfred Kwok-Kit Wong, Vol. TT47
Copper Interconnect Technology, Christoph Steinbrüchel and Barry L. Chin, Vol. TT46
Optical Design for Visual Systems, Bruce H. Walker, Vol. TT45
Fundamentals of Contamination Control, Alan C. Tribble, Vol. TT44
Evolutionary Computation: Principles and Practice for Signal Processing, David Fogel, Vol. TT43
Infrared Optics and Zoom Lenses, Allen Mann, Vol. TT42
Introduction to Adaptive Optics, Robert K. Tyson, Vol. TT41
Fractal and Wavelet Image Compression Techniques, Stephen Welstead, Vol. TT40
Analysis of Sampled Imaging Systems, R. H. Vollmerhausen and R. G. Driggers, Vol. TT39
Tissue Optics: Light Scattering Methods and Instruments for Medical Diagnosis, Valery Tuchin, Vol. TT38
Fundamentos de Electro-Óptica para Ingenieros, Glenn D. Boreman, translated by Javier Alda, Vol. TT37
Infrared Design Examples, William L. Wolfe, Vol. TT36
Sensor and Data Fusion Concepts and Applications, Second Edition, L. A. Klein, Vol. TT35



Bellingham, Washington USA

Tutorial Texts in Optical Engineering

Volume TT77



Library of Congress Cataloging-in-Publication Data

Kloos, Gerhard.
Matrix methods for optical layout / Gerhard Kloos.

p. cm. -- (Tutorial texts series ; TT 77)

ISBN 978-0-8194-6780-5
1. Optics--Mathematics. 2. Matrices. 3. Optical instruments--Design and construction. I. Title.

QC355.3.K56 2007

681'.4--dc22
2007025587

Published by

SPIE
P.O. Box 10
Bellingham, Washington 98227-0010 USA

Phone: +1 360 676 3290
Fax: +1 360 647 1445

Email: [email protected]

Web: spie.org

Copyright 2007 Society for Photo-optical Instrumentation Engineers

All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means without written permission of the publisher.

The content of this book reflects the work and thought of the author(s). Every effort has been made to publish reliable and accurate information herein, but the publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon.

Printed in the United States of America.



Introduction to the Series

Since its conception in 1989, the Tutorial Texts series has grown to more than 70 titles covering many diverse fields of science and engineering. When the series

was started, the goal of the series was to provide a way to make the material

presented in SPIE short courses available to those who could not attend, and to

provide a reference text for those who could. Many of the texts in this series are

generated from notes that were presented during these short courses. But as

stand-alone documents, short course notes do not generally serve the student or

reader well. Short course notes typically are developed on the assumption that

supporting material will be presented verbally to complement the notes, which

are generally written in summary form to highlight key technical topics and

therefore are not intended as stand-alone documents. Additionally, the figures, tables, and other graphically formatted information accompanying the notes

require the further explanation given during the instructor's lecture. Thus, by

adding the appropriate detail presented during the lecture, the course material can

be read and used independently in a tutorial fashion.

What separates the books in this series from other technical monographs and textbooks is the way in which the material is presented. To keep in line with the

tutorial nature of the series, many of the topics presented in these texts are

followed by detailed examples that further explain the concepts presented. Many

pictures and illustrations are included with each text and, where appropriate, tabular reference data are also included.

The topics within the series have grown from the initial areas of geometrical

optics, optical detectors, and image processing to include the emerging fields of

nanotechnology, biomedical optics, and micromachining. When a proposal for a

text is received, each proposal is evaluated to determine the relevance of the

proposed topic. This initial reviewing process has been very helpful to authors in

identifying, early in the writing process, the need for additional material or other

changes in approach that would serve to strengthen the text. Once a manuscript is

completed, it is peer reviewed to ensure that chapters communicate accurately the essential ingredients of the processes and technologies under discussion.

It is my goal to maintain the style and quality of books in the series, and to

further expand the topic areas to include new emerging fields as they become of

interest to our reading audience.

Arthur R. Weeks, Jr.

University of Central Florida



Contents

Preface
1 An Introduction to Tools and Concepts
1.1 Matrix Method
1.2 Basic Elements
1.2.1 Propagation in a homogeneous medium
1.2.2 Refraction at the boundary of two media
1.2.3 Reflection at a surface
1.3 Comparison of Matrix Representations Used in the Literature
1.4 Building up a Lens
1.5 Cardinal Elements
1.6 Using Matrices for Optical-Layout Purposes
1.7 Lens Doublet
1.8 Decomposition of Matrices and System Synthesis
1.9 Central Theorem of First-Order Ray Tracing
1.10 Aperture Stop and Field Stop
1.11 Lagrange Invariant
1.11.1 Derivation using the matrix method
1.11.2 Application to optical design
1.12 Petzval Radius
1.13 Delano Diagram
1.14 Phase Space
1.15 An Alternative Paraxial Calculation Method
1.16 Gaussian Brackets
2 Optical Components
2.1 Components Based on Reflection
2.1.1 Plane mirror
2.1.2 Retroreflector
2.1.3 Phase-conjugate mirror
2.1.4 Cat's-eye retroreflector
2.1.5 Roof mirror
2.2 Components Based on Refraction
2.2.1 Plane-parallel plate
2.2.2 Prisms
2.2.3 Axicon devices
2.3 Components Based on Reflection and Refraction
2.3.1 Integrating rod
2.3.2 Triple mirror
3 Sensitivities and Tolerances
3.1 Cascading Misaligned Systems
3.2 Axial Misalignment
3.3 Beam Pointing Error
4 Anamorphic Optics
4.1 Two Alternative Matrix Representations
4.2 Orthogonal and Nonorthogonal Anamorphic Descriptions
4.3 Cascading
4.4 Rotation of an Anamorphic Component with Respect to the Optical Axis
4.4.1 Rotation of an orthogonal system
4.4.2 Rotation of a nonorthogonal system
4.5 Examples
4.5.1 Rotated anamorphic thin lens
4.5.2 Rotated thin cylindrical lens
4.5.3 Cascading two rotated thin cylindrical lenses
4.5.4 Cascading two rotated thin anamorphic lenses
4.5.5 Quadrupole lens
4.5.6 Telescope built by cylindrical lenses
4.5.7 Anamorphic collimation lens
4.6 Imaging Condition
4.7 Incorporating Sensitivities and Tolerances in the Analysis
5 Optical Systems
5.1 Single-Pass Optics
5.1.1 Triplet synthesis
5.1.2 Fourier transform objectives and 4f arrangements
5.1.3 Telecentric lenses
5.1.4 Concatenated matrices for systems of n lenses
5.1.5 Dyson optics
5.1.6 Variable single-pass optics
5.2 Double-Pass Optics
5.2.1 Autocollimator
5.3 Multiple-Pass Optics
5.4 Systems with a Divided Optical Path
5.4.1 Fizeau interferometer
5.4.2 Michelson interferometer
5.4.3 Dyson interferometer
5.5 Nested Ray Tracing
6 Outlook
Bibliography
Index



Preface

This book is intended to familiarize the reader with the method of Gaussian matrices and some related tools of optical design. The matrix method provides a means to study an optical system in the paraxial approximation.

In optical design, the method is used to find a solution to a given optical task, which can then be refined by optical-design software or analytical methods of aberration balancing. In some cases, the method can help to demonstrate that no solution is possible under the given boundary conditions. Quite often it is of practical importance and theoretical interest to get an overview of the solution space of a problem. The paraxial approach might then serve as a guideline during optimization, much as a map does in an unknown landscape.

Once a solution has been found, it can be analyzed from different points of view using the matrix method. This approach gives insight into how degrees of freedom couple in an optical device. The analysis of sensitivities and tolerances is

common practice in optical engineering, because it serves to make optical devices

or instruments more robust. The matrix method allows one to do this analysis in a

first order of approximation. With these results, it is then possible to plan and to

interpret refined numerical simulations.

In many cases, the matrix description gives useful classification schemes of

optical phenomena or instruments. This can provide insight and might in addition

be considered as a mnemonic aid.

An aspect that should not be underestimated is that the matrix description represents a useful means of communicating among people designing optical instruments, because it gives a kind of shorthand description of the main features of an optical instrument.

The book contains an introductory first chapter and four more specialized chapters that are based on this first chapter. Sections 1.1 to 1.14 are intended to provide a self-contained introduction to the method of Gaussian transfer matrices in paraxial optics. The remaining sections of the chapter contain additional material on how this approach compares to other paraxial methods.

The emphasis of Chapters 3 and 4 is on refining and expanding the method of analysis to additional degrees of freedom and to optical systems of lower symmetry. The last part of Chapter 4 can be skipped at first reading.

To my knowledge, the text contains new results such as theorems on the design of variable optics, on integrating rods, on the optical layout of prism devices, etc.


I tried to derive the results in a step-by-step way so that the reader might apply the methods presented here to her/his design problems with ease. I also tried to organize the book in a way that might facilitate looking up results and the ways to obtain them.

It would be a pleasure for me if the reader might find some of the material

presented in this text useful for her/his own engineering work.

Gerhard Kloos

June 2007



Chapter 1

An Introduction to Tools and Concepts

1.1 Matrix Method

Ray-transfer matrices are one of the possible methods for describing optical systems in the paraxial approximation. The method is widely used for first-order layout and for analyzing optical systems (Gerrard and Burch, 1975). The reason why the paraxial approximation is often used in the first phase of a design or of an optical analysis becomes obvious if we look at the law of refraction in vectorial form:

$$n_1\,(\mathbf{a} \times \mathbf{N}) = n_2\,(\mathbf{b} \times \mathbf{N}), \tag{1.1}$$

where $\mathbf{a}$ is the vector of the ray incident on the interface with the normal $\mathbf{N}$. This interface separates two homogeneous media with indices of refraction $n_1$ and $n_2$. The refracted ray is described by the vector $\mathbf{b}$. For optical-layout purposes, we need an explicit expression of this ray in terms of the other quantities because we have to trace the ray through the optical system. Using vector algebra, Eq. (1.1) can be rewritten in the following way:

$$\mathbf{b} = \frac{n_1}{n_2}\,\mathbf{a} - \left[\frac{n_1}{n_2}\,(\mathbf{N}\cdot\mathbf{a}) - \sqrt{1 - \left(\frac{n_1}{n_2}\right)^{2}\left[1 - (\mathbf{N}\cdot\mathbf{a})^{2}\right]}\,\right]\mathbf{N}. \tag{1.2}$$

This form is complicated, and it is difficult to trace the ray without making use of a computer. Therefore, a linearized form of this law would be helpful for thinking about the optical system, and this is the motivation for starting with a paraxial layout.
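The explicit vector form of the refraction law is what a numerical ray tracer would evaluate. The following Python sketch is only an illustration and not part of the original text: the function names `refract`, `dot`, and `cross` and the numerical values are assumptions, and the normal is assumed to be a unit vector pointing into the second medium (so that the scalar product of normal and incident ray is positive).

```python
import math

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def refract(a, normal, n1, n2):
    """Explicit vector refraction law for unit vectors a and normal.

    Assumes the normal points into the second medium.
    Returns None when the square root has no real value
    (total internal reflection).
    """
    mu = n1 / n2
    cos_i = dot(normal, a)
    radicand = 1.0 - mu ** 2 * (1.0 - cos_i ** 2)
    if radicand < 0.0:
        return None
    k = mu * cos_i - math.sqrt(radicand)
    return tuple(mu * ai - k * ni for ai, ni in zip(a, normal))

# Illustrative values: ray in the y-z plane, 30 degrees to the normal, air to glass.
n1, n2 = 1.0, 1.5
theta = math.radians(30.0)
a = (0.0, math.sin(theta), math.cos(theta))
N = (0.0, 0.0, 1.0)
b = refract(a, N, n1, n2)

# Both sides of the implicit law n1 (a x N) = n2 (b x N) should agree.
lhs = tuple(n1 * c for c in cross(a, N))
rhs = tuple(n2 * c for c in cross(b, N))
print(b, lhs, rhs)
```

Even this short routine needs a square root and a case distinction, which illustrates why a linearized law is attractive for first-order layout work.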

It would be a valuable tool for analyzing optical instruments if the approximate description also allowed subsystems to be cascaded to describe a compound system. The method of ray-transfer matrices provides this advantage: cascading of subsystems is performed by matrix multiplication.

Another aspect, which is sometimes underestimated, is that paraxial descriptions, and especially the matrix method, provide a convenient shorthand notation for communicating and discussing ideas with other optical designers. In a way, this branch of optics is axiomatic like thermodynamics, for example. The framework of the underlying theory can be reduced to a limited number of basic rules and elements. But combining these rules and elements allows one to study a great variety of optical systems.

1.2 Basic Elements

We will now look for linearized relations that describe three situations, namely,

propagation of a ray, and its refraction and reflection. The matrices obtained in this

way serve as building blocks of the matrix description.

1.2.1 Propagation in a homogeneous medium

Let us first consider the propagation of a ray in a homogeneous medium. We assume that the ray propagates in the $y$-$z$ plane and choose the $z$-axis as the optical axis. In any plane perpendicular to the optical axis, the ray can now be described by its distance from the optical axis, $y$, and by the angle $\alpha$ that it makes with a line parallel to the optical axis. As the ray propagates along the optical axis, these coordinates may change and take different values in different planes perpendicular to the optical axis. We now choose two reference planes separated by a distance $t$ inside a homogeneous medium (Fig. 1.1) and determine the input-output relationship. The ray starts with the coordinates $[y^{(1)}, \alpha^{(1)}]$. Because the ray propagates along a straight line, the angle remains unchanged,

$$\alpha^{(2)} = \alpha^{(1)}. \tag{1.3}$$

The height in the second reference plane depends on the distance traveled and on the starting angle,

$$y^{(2)} = y^{(1)} + t \tan \alpha^{(1)}. \tag{1.4}$$

Figure 1.1 Propagation in a homogeneous medium. The two reference planes are at a distance $t$.

Under the assumption that the paraxial approximation is valid, i.e., for small angles $\alpha^{(1)}$, we can linearize the trigonometric function in Eq. (1.4) as

$$y^{(2)} = y^{(1)} + t\,\alpha^{(1)}. \tag{1.5}$$

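Equations (1.3) and (1.5) can now be combined and written as a matrix relation,

$$\begin{pmatrix} y^{(2)} \\ \alpha^{(2)} \end{pmatrix} = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y^{(1)} \\ \alpha^{(1)} \end{pmatrix}. \tag{1.6}$$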
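To get a feeling for the range of validity of the linearization, the exact propagation law (with the tangent) can be compared numerically with the matrix relation. The sketch below is an illustration under assumed values (a propagation distance of 100 length units, angles of 1, 5, and 20 degrees); the function names are hypothetical.

```python
import math

def propagate_exact(y, alpha, t):
    """Exact propagation: the height changes with tan(alpha), the angle is unchanged."""
    return y + t * math.tan(alpha), alpha

def propagate_paraxial(y, alpha, t):
    """Paraxial propagation: apply the matrix ((1, t), (0, 1)) to (y, alpha)."""
    T = ((1.0, t), (0.0, 1.0))
    return (T[0][0] * y + T[0][1] * alpha,
            T[1][0] * y + T[1][1] * alpha)

def height_error(alpha_deg, t=100.0, y=0.0):
    """Absolute difference between exact and paraxial ray height."""
    alpha = math.radians(alpha_deg)
    y_exact, _ = propagate_exact(y, alpha, t)
    y_paraxial, _ = propagate_paraxial(y, alpha, t)
    return abs(y_exact - y_paraxial)

for alpha_deg in (1.0, 5.0, 20.0):
    print(alpha_deg, height_error(alpha_deg))
```

The error grows roughly with the third power of the angle, which is the leading term neglected when the tangent is replaced by its argument.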

The matrix depends on the distance between the two reference planes. We will later refer to it as the translation matrix $T$, defined as

$$T = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}. \tag{1.7}$$

1.2.2 Refraction at the boundary of two media

Now, we will try to obtain a linearized expression for the refraction of a ray at a spherical surface described by the radius $R$. This surface separates two homogeneous media of refractive indices $n_1$ and $n_2$. Let us first draw a line representing the ray as it hits the spherical surface in a reference plane (Fig. 1.2). We consider how the input and output variables are changed in this single reference plane where the refraction takes place. The distance from the optical axis remains unchanged for the ray leaving the reference plane, i.e.,

$$y^{(2)} = y^{(1)}. \tag{1.8}$$

Figure 1.2 Refraction at a spherical surface. The spherical surface separates two media with refractive indices $n_1$ and $n_2$.

The change in angle is described by the law of refraction,

$$n_1 \sin i^{(1)} = n_2 \sin i^{(2)}, \tag{1.9}$$

where the angles $i^{(1)}$ and $i^{(2)}$ are measured with respect to the normal to the surface.

Assuming that the paraxial approximation is valid, Eq. (1.9) can be linearized as

$$n_1\, i^{(1)} = n_2\, i^{(2)}. \tag{1.10}$$

But we need expressions in terms of the angles $\alpha^{(1)}$ and $\alpha^{(2)}$ that are measured with respect to a line parallel to the optical axis. To obtain relations between these angles and the angles appearing in Eq. (1.9), we take a closer look at the triangles in Fig. 1.2. Applying the exterior angle theorem for triangles twice, we have

$$i^{(1)} = \alpha^{(1)} + \varphi, \tag{1.11}$$

$$i^{(2)} = \alpha^{(2)} + \varphi, \tag{1.12}$$

where $\varphi$ is the angle between the surface normal at the point of incidence and the optical axis. Substituting these equations into Eq. (1.10), we find

$$\alpha^{(2)} = \frac{n_1}{n_2}\,\alpha^{(1)} + \frac{n_1 - n_2}{n_2}\,\varphi. \tag{1.13}$$

Neglecting the small distance between the intersection of the spherical surface with the optical axis and the reference plane, we approximate the angle $\varphi$ appearing in Eq. (1.13) as

$$\tan \varphi = \frac{y^{(1)}}{R}. \tag{1.14}$$

Linearizing the trigonometric function for small angles ($\tan \varphi \approx \varphi$), Eq. (1.14) is substituted into Eq. (1.13) and we have

$$\alpha^{(2)} = \frac{n_1}{n_2}\,\alpha^{(1)} + \frac{n_1 - n_2}{n_2 R}\,y^{(1)}. \tag{1.15}$$

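This is the linearized input-output relation we were looking for. In combination with Eq. (1.8), we can write it in matrix form as

$$\begin{pmatrix} y^{(2)} \\ \alpha^{(2)} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \dfrac{n_1 - n_2}{n_2 R} & \dfrac{n_1}{n_2} \end{pmatrix} \begin{pmatrix} y^{(1)} \\ \alpha^{(1)} \end{pmatrix}. \tag{1.16}$$

The corresponding matrix will be used later as the refraction matrix $\mathcal{R}$, defined as

$$\mathcal{R} = \begin{pmatrix} 1 & 0 \\ \dfrac{n_1 - n_2}{n_2 R} & \dfrac{n_1}{n_2} \end{pmatrix}. \tag{1.17}$$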
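As a plausibility check, the refraction matrix can be applied to a ray that enters parallel to the optical axis. In the sketch below (illustrative values: air to glass with index 1.5, radius 50, ray height 2; the function names are assumptions), the refracted ray crosses the axis at the distance $n_2 R/(n_2 - n_1)$ behind the surface, which is the familiar paraxial focal distance of a single refracting surface.

```python
def refraction_matrix(n1, n2, radius):
    """Paraxial refraction matrix of a spherical surface acting on (y, alpha)."""
    return ((1.0, 0.0),
            ((n1 - n2) / (n2 * radius), n1 / n2))

def apply(matrix, ray):
    """Apply a 2x2 ray-transfer matrix to a ray (y, alpha)."""
    y, alpha = ray
    return (matrix[0][0] * y + matrix[0][1] * alpha,
            matrix[1][0] * y + matrix[1][1] * alpha)

n1, n2, radius = 1.0, 1.5, 50.0
y_out, alpha_out = apply(refraction_matrix(n1, n2, radius), (2.0, 0.0))

# Distance behind the surface at which the refracted ray reaches y = 0.
crossing_distance = -y_out / alpha_out
print(y_out, alpha_out, crossing_distance)
```

The ray height is unchanged by the refraction, and only the angle is modified, in line with the structure of the matrix (first row equal to 1, 0).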


Figure 1.3 The unfolding of a spherical mirror.

1.2.3 Reflection at a surface

A geometrical consideration quite similar to the one that led to Eq. (1.17) can also

be used to find the matrix for a spherical concave mirror. In this case, the output

ray remains on the same side of the reference plane.

It is interesting to note that we can formally obtain the matrix of an unfolded spherical concave mirror by setting $n_1 = 1$ and $n_2 = -1$ in Eq. (1.17), i.e.,

$$S = \begin{pmatrix} 1 & 0 \\ -\dfrac{2}{R} & -1 \end{pmatrix}. \tag{1.18}$$

Unfolding refers to the symmetry operation (or coordinate break) depicted in Fig. 1.3. This can be helpful in finding the matrix chain of a compound optical system. Please note that some signs might change in the system matrix with respect to the starting system because reference is made to an optical axis with a different direction after the coordinate break.

1.3 Comparison of Matrix Representations Used in the Literature

In the literature, different notations for writing the ray-transfer matrices can be found. Many authors use coordinates that have $n\alpha$ as the second coordinate, where $n$ is the index of refraction (Guillemin and Sternberg, 1984). An advantage of this notation is that the determinant of the ray-transfer matrices is always 1. This provides a useful check during calculations and can also simplify theoretical arguments based on the determinant. In the description used here, the determinant of the ray-transfer matrix $A$ has the value

$$\det A = \frac{n_1}{n_2}, \tag{1.19}$$

with $n_1$ as the refractive index of the medium at the entrance reference plane and $n_2$ as the refractive index of the medium at the exit reference plane.


The second coordinate $n\alpha$ can also be introduced as a modified ray slope (Siegman, 1986) as

$$r'(z) = n(z)\,\frac{dr(z)}{dz}. \tag{1.20}$$

The interpretation of this coordinate in terms of slope can be fruitful in some circumstances.
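The relation between the two conventions can be made concrete with a small numerical sketch. It is an illustration only: the helper `to_reduced` is a hypothetical name, and the conversion it performs (scaling the angle by the local refractive index on both sides of the matrix) is the assumption being demonstrated. For a refracting surface, the matrix acting on $(y, \alpha)$ has the determinant $n_1/n_2$, whereas the converted matrix acting on $(y, n\alpha)$ has the determinant 1.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def refraction_matrix(n1, n2, radius):
    """Refraction matrix acting on (y, alpha)."""
    return ((1.0, 0.0),
            ((n1 - n2) / (n2 * radius), n1 / n2))

def to_reduced(matrix, n_in, n_out):
    """Re-express a matrix acting on (y, alpha) as one acting on (y, n*alpha)."""
    scale_out = ((1.0, 0.0), (0.0, n_out))        # (y, alpha) -> (y, n*alpha) at the exit
    unscale_in = ((1.0, 0.0), (0.0, 1.0 / n_in))  # (y, n*alpha) -> (y, alpha) at the entrance
    return matmul(scale_out, matmul(matrix, unscale_in))

n1, n2, radius = 1.0, 1.5, 50.0
A = refraction_matrix(n1, n2, radius)
A_reduced = to_reduced(A, n1, n2)
print(det(A), det(A_reduced), A_reduced)
```

In the converted form, the lower-left entry of the refraction matrix becomes $(n_1 - n_2)/R$ and the lower-right entry becomes 1.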

1.4 Building up a Lens

With the prerequisite of Eqs. (1.7) and (1.17), we can determine the matrix of a spherical lens. The refraction at the first surface is expressed by the matrix $\mathcal{R}^{(a)}$. The ray is then propagated through the lens using the translation matrix $T$ and finally refracted at the second surface of the lens. To describe this refraction, the matrix $\mathcal{R}^{(b)}$ is used. The combined effect is calculated as the product of these matrices,

$$S = \mathcal{R}^{(b)}\, T\, \mathcal{R}^{(a)}. \tag{1.21}$$

More explicitly, this equation reads as

$$S = \begin{pmatrix} 1 & 0 \\ \dfrac{n_2 - n_3}{n_3 R_2} & \dfrac{n_2}{n_3} \end{pmatrix} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \dfrac{n_1 - n_2}{n_2 R_1} & \dfrac{n_1}{n_2} \end{pmatrix}, \tag{1.22}$$

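where $t$ is the thickness of the lens and $R_1$ and $R_2$ are the radii of the first and the second surfaces of the lens, respectively. Because the lens is in air, we can specialize the set of refractive indices as $n_1 = 1$, $n_2 = n$, and $n_3 = 1$. Therefore, we have

$$S = \begin{pmatrix} 1 - \dfrac{n-1}{R_1}\dfrac{t}{n} & \dfrac{t}{n} \\ -\dfrac{n-1}{R_1} - \dfrac{1-n}{R_2} + \dfrac{n-1}{R_1}\dfrac{1-n}{R_2}\dfrac{t}{n} & 1 - \dfrac{1-n}{R_2}\dfrac{t}{n} \end{pmatrix}. \tag{1.23}$$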

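This might suggest the following abbreviations:

$$P_1 = \frac{n-1}{R_1}, \tag{1.24}$$

$$P_2 = \frac{1-n}{R_2}. \tag{1.25}$$

With these abbreviations, Eq. (1.23) then takes the form

$$S = \begin{pmatrix} 1 - P_1 \dfrac{t}{n} & \dfrac{t}{n} \\ -P_1 - P_2 + P_1 P_2 \dfrac{t}{n} & 1 - P_2 \dfrac{t}{n} \end{pmatrix}. \tag{1.26}$$

The so-called thin lens is obtained by letting the lens thickness $t$ tend to zero in Eq. (1.26),

$$S = \begin{pmatrix} 1 & 0 \\ -P_1 - P_2 & 1 \end{pmatrix}. \tag{1.27}$$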
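The chain of refraction, translation, and refraction can be multiplied out numerically and compared with the closed-form thick-lens matrix. The following sketch is an illustration with assumed values (a biconvex lens with index 1.5, radii of +50 and -50, and thickness 5); all function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def translation(t):
    """Translation matrix for a distance t."""
    return ((1.0, t), (0.0, 1.0))

def refraction(n1, n2, radius):
    """Refraction matrix of a spherical surface between media n1 and n2."""
    return ((1.0, 0.0), ((n1 - n2) / (n2 * radius), n1 / n2))

def thick_lens(n, r1, r2, t):
    """Lens in air as the product: second surface, translation, first surface."""
    return matmul(refraction(n, 1.0, r2),
                  matmul(translation(t), refraction(1.0, n, r1)))

def thick_lens_closed_form(n, r1, r2, t):
    """Closed-form lens matrix written with the surface powers P1 and P2."""
    p1 = (n - 1.0) / r1
    p2 = (1.0 - n) / r2
    return ((1.0 - p1 * t / n, t / n),
            (-p1 - p2 + p1 * p2 * t / n, 1.0 - p2 * t / n))

n, r1, r2, t = 1.5, 50.0, -50.0, 5.0
S_product = thick_lens(n, r1, r2, t)
S_closed = thick_lens_closed_form(n, r1, r2, t)
S_thin = thick_lens(n, r1, r2, 0.0)
print(S_product, S_closed, S_thin)
```

Note that the matrices are multiplied from right to left in the order in which the ray meets the elements; setting the thickness to zero reproduces the thin-lens matrix.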




1.5 Cardinal Elements

To identify the lower-left entry in the matrix of the thin lens, we first look at a lens described by a more general matrix of the form

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \tag{1.28}$$

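Its focal plane can be found by letting a ray parallel to the optical axis pass through the lens and determining the distance $b$ from the exit reference plane to the plane where it intersects the optical axis. Expressing this in matrix notation, we have

$$\begin{pmatrix} 0 \\ \alpha_{\mathrm{out}} \end{pmatrix} = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{\mathrm{in}} \\ 0 \end{pmatrix}, \tag{1.29}$$

or

$$\begin{pmatrix} 0 \\ \alpha_{\mathrm{out}} \end{pmatrix} = \begin{pmatrix} a_{11} + b\,a_{21} & a_{12} + b\,a_{22} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{\mathrm{in}} \\ 0 \end{pmatrix}. \tag{1.30}$$

This implies that

$$0 = (a_{11} + b\,a_{21})\,y_{\mathrm{in}}. \tag{1.31}$$

This equation should hold for all values of $y_{\mathrm{in}}$. Therefore, it follows that

$$a_{11} + b\,a_{21} = 0. \tag{1.32}$$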

The position of the second focal plane of the lens described by the matrix $A$ is therefore determined by

$$b = -\frac{a_{11}}{a_{21}}, \tag{1.33}$$

and we can identify $b$ as the focal length $f$ of the lens.

Applying this result to the thin lens of Eq. (1.27), for which $a_{11} = 1$ holds, we see that the lower-left entry represents the negative inverse of its focal length, i.e., the matrix of the thin lens is

$$F = \begin{pmatrix} 1 & 0 \\ -\dfrac{1}{f} & 1 \end{pmatrix}. \tag{1.34}$$

The second focal plane is one of the cardinal elements of a lens. The position of the first focal plane is calculated on the same footing, but by letting a parallel ray enter the system from the other side or by finding the distance for which the light from a point source in front of the lens is collimated. In both ways, the following result is obtained for the position of the first focal plane:

$$a = -\frac{a_{22}}{a_{21}}. \tag{1.35}$$
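The positions of the two focal planes follow directly from the matrix entries. The sketch below is an illustration, not the author's code: the function names and the numerical lens (index 1.5, radii +50 and -50, thickness 5) are assumptions, and the lens matrix is entered in its closed form with the surface powers $P_1$ and $P_2$. The focal length is taken as the negative inverse of the lower-left entry; for a thin lens ($a_{11} = a_{22} = 1$) all three quantities coincide.

```python
def thick_lens_matrix(n, r1, r2, t):
    """Closed-form matrix of a thick lens in air, written with P1 and P2."""
    p1 = (n - 1.0) / r1
    p2 = (1.0 - n) / r2
    return ((1.0 - p1 * t / n, t / n),
            (-p1 - p2 + p1 * p2 * t / n, 1.0 - p2 * t / n))

def focal_data(matrix):
    """Focal length and focal-plane distances from a ray-transfer matrix.

    back_focal_distance is measured from the exit reference plane,
    front_focal_distance from the entrance reference plane.
    """
    (a11, _a12), (a21, a22) = matrix
    if a21 == 0.0:
        raise ValueError("afocal system: no focal planes at finite distance")
    return {
        "focal_length": -1.0 / a21,
        "back_focal_distance": -a11 / a21,
        "front_focal_distance": -a22 / a21,
    }

lens = thick_lens_matrix(1.5, 50.0, -50.0, 5.0)
data = focal_data(lens)
thin = focal_data(((1.0, 0.0), (-1.0 / 50.0, 1.0)))
print(data, thin)
```

For the symmetric lens used here, the two focal-plane distances are equal and slightly shorter than the focal length, because the focal length is measured from the principal planes inside the lens.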

A straightforward way to obtain other cardinal elements is by direct comparison with the thin lens. We are interested in finding the positions of the planes with respect to which a lens given by the matrix $A$ can be described like a thin lens. To this end, we take the following approach:

$$\begin{pmatrix} 1 & h_2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & h_1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\dfrac{1}{f} & 1 \end{pmatrix}, \tag{1.36}$$

where $h_1$ and $h_2$ are the distances that have to be determined. The corresponding planes are called principal planes and, together with the focal planes, they are the cardinal elements of a lens. After performing the matrix multiplication on the left-hand side, we have

$$\begin{pmatrix} a_{11} + a_{21} h_2 & a_{12} + a_{11} h_1 + a_{22} h_2 + a_{21} h_1 h_2 \\ a_{21} & a_{22} + a_{21} h_1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\dfrac{1}{f} & 1 \end{pmatrix}. \tag{1.37}$$

Comparing the lower-right entries, the position of the first principal plane is therefore given by

$$h_1 = \frac{1 - a_{22}}{a_{21}}, \tag{1.38}$$

measured with respect to the first reference plane of the lens. Comparing the upper-left entries, the position of the second principal plane is at

$$h_2 = \frac{1 - a_{11}}{a_{21}}, \tag{1.39}$$

measured with respect to the second reference plane of the lens.

A beautiful illustration of the principal planes concept is given by Lipson et al. (1997). We can trace typical rays through the lens and draw this on a piece of paper. If we now fold this paper along the lines that represent the principal planes, we can hold it in such a way that the part between the principal planes is perpendicular to the other parts. These other parts are combined to represent a simplified arrangement (Fig. 1.4), which corresponds to a thin lens.

This is in complete analogy to Eq. (1.36). The results on the cardinal elements are collected in Fig. 1.5.

With these prerequisites, we can state the cardinal elements of the thick lens given by Eq. (1.23). The equation for the focal length $f$ of the lens is

$$\frac{1}{f} = \frac{n-1}{R_1} + \frac{1-n}{R_2} - \frac{(n-1)(1-n)\,t}{n R_1 R_2}. \tag{1.40}$$

Its principal planes are at

$$h_1 = -f\,\frac{1-n}{R_2}\,\frac{t}{n}, \tag{1.41}$$

$$h_2 = -f\,\frac{n-1}{R_1}\,\frac{t}{n}. \tag{1.42}$$


Figure 1.4 Principal planes visualized by folding. The optical system is described by the matrix $A$. It has the focal points $F_1$ and $F_2$, and its principal planes are at $h_1$ and $h_2$, respectively.

Figure 1.5 Cardinal elements. The focal points $F_1$ and $F_2$ and the positions $h_1$ and $h_2$ of the principal planes serve to characterize the optical system given by the matrix $A$.



1.6 Using Matrices for Optical-Layout Purposes

In the derivation of the position of the second focal plane, we considered the optical arrangement formed by a lens, which was given by the matrix $A$, and a translation matrix $T$, i.e.,

$$S = T A. \tag{1.43}$$

On this combined optical arrangement, the condition $s_{11} = 0$ was then imposed to ensure that the ray height in the output plane was independent of the ray height in the input plane. We used this condition because, in the paraxial approximation, it characterizes a point in the second focal plane. This way of reasoning can also be applied to other situations.

Its application to the first focal plane is convenient; to this end, we consider the combined arrangement given by the matrix product,

$$S = A T. \tag{1.44}$$

We then impose a condition on the combined matrix $S$ that expresses (in the linear approximation) that a bundle of rays at a given ray height $y_{\mathrm{in}}$ but with different angles $\alpha_{\mathrm{in}}$ in the entrance plane of $S$ will be transformed into a parallel beam, i.e., a bundle of rays with the same angle, at the exit plane. The general input-output relation is

$$\begin{pmatrix} y_{\mathrm{out}} \\ \alpha_{\mathrm{out}} \end{pmatrix} = \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix} \begin{pmatrix} y_{\mathrm{in}} \\ \alpha_{\mathrm{in}} \end{pmatrix}. \tag{1.45}$$

To ensure that $\alpha_{\mathrm{out}}$ has a single value for a given ray height $y_{\mathrm{in}}$, it has to be independent of $\alpha_{\mathrm{in}}$. A look at the input-output relation suggests that this condition is met if we choose

$$s_{22} = 0. \tag{1.46}$$

This choice determines the distance contained in the translation matrix and thereby the position of the first focal plane, which corresponds to the matrix $A$.

At this point, we have conditions for the first focal plane ($s_{22} = 0$) and for the second focal plane ($s_{11} = 0$), and we might ask: what is the characteristic feature of a ray-transfer matrix $S$ that describes imaging? The rays leaving a point at $y_{\mathrm{in}}$ in the object plane with different angles $\alpha_{\mathrm{in}}$ intersect in a point at $y_{\mathrm{out}}$ in the image plane. If the matrix $A$ describes a lens, we have to add two spacings on both sides to model imaging, so we have

$$S = B A G, \tag{1.47}$$

with $B = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ and $G = \begin{pmatrix} 1 & g \\ 0 & 1 \end{pmatrix}$. Considering the input-output relation again, we find that $y_{\mathrm{out}}$ is independent of $\alpha_{\mathrm{in}}$ if

$$s_{12} = 0. \tag{1.48}$$

This is the characteristic feature of a matrix $S$ that represents imaging.
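For a given system matrix and object distance, the imaging condition can be solved for the image distance. Multiplying out the chain of image-side spacing, system, and object-side spacing gives the upper-right entry $a_{11} g + a_{12} + b\,(a_{21} g + a_{22})$, which vanishes for $b = -(a_{11} g + a_{12})/(a_{21} g + a_{22})$. The sketch below illustrates this with assumed values (a thin lens of focal length 50 and an object distance of 75); the function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def translation(t):
    """Translation matrix for a distance t."""
    return ((1.0, t), (0.0, 1.0))

def image_distance(system, g):
    """Image distance b for which the upper-right entry of B*A*G vanishes."""
    (a11, a12), (a21, a22) = system
    denominator = a21 * g + a22
    if denominator == 0.0:
        raise ValueError("object lies in the first focal plane: image at infinity")
    return -(a11 * g + a12) / denominator

def imaging_matrix(system, g):
    """Return the matrix B*A*G of the imaging arrangement and the image distance b."""
    b = image_distance(system, g)
    return matmul(translation(b), matmul(system, translation(g))), b

thin_lens = ((1.0, 0.0), (-1.0 / 50.0, 1.0))
S, b = imaging_matrix(thin_lens, 75.0)
magnification = S[0][0]
print(b, S, magnification)
```

Once the upper-right entry is zero, the upper-left entry of the resulting matrix is the lateral magnification, because the output height then depends on the input height alone.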




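We can apply this condition immediately to find the imaging relation for a thin lens. The corresponding matrix chain is

$$S = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\dfrac{1}{f} & 1 \end{pmatrix} \begin{pmatrix} 1 & g \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 - \dfrac{b}{f} & g + b - \dfrac{b g}{f} \\ -\dfrac{1}{f} & 1 - \dfrac{g}{f} \end{pmatrix}. \tag{1.49}$$

Using $s_{12} = 0$, we have the well-known imaging condition

$$\frac{1}{g} + \frac{1}{b} = \frac{1}{f}, \tag{1.50}$$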

which expresses that $b$ varies in a hyperbolic way as a function of $g$ and vice versa. The signs of the distances are positive in Eq. (1.50) because the direction of the distances is chosen as the direction of the optical axis.

To find a relation for the first focal plane, we asked under which condition the angle of the rays leaving the system is independent of the input angle. Alternatively, we can consider the situation where the angle of the rays leaving the system is independent of the ray height in the entrance plane. This is the case if a collimated input beam is transformed into a collimated output beam. Making reference to the input-output relation for $S$, we see that setting

$$s_{21} = 0 \tag{1.51}$$

ensures that $\alpha_{\mathrm{out}}$ does not depend on the ray height $y_{\mathrm{in}}$ in the input reference plane. Because collimated rays are considered, no additional translation matrices have to be introduced here and therefore $S = A$. Earlier, we related the matrix entry $a_{21}$ to the negative inverse of the focal length of an optical system via Eq. (1.33). This matrix entry takes the value of zero now, which corresponds to the case of an afocal system.

Typical examples for such systems are telescopes. In the paraxial approximation, we might model a telescopic arrangement using thin lenses. We choose two lenses with focal lengths $f_1$ and $f_2$, separated by a distance $d$. Concatenation of the corresponding matrices gives us the system matrix

$$S = \begin{pmatrix} 1 & 0 \\ -\dfrac{1}{f_2} & 1 \end{pmatrix} \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\dfrac{1}{f_1} & 1 \end{pmatrix} = \begin{pmatrix} 1 - \dfrac{d}{f_1} & d \\ -\dfrac{1}{f_1} - \dfrac{1}{f_2} + \dfrac{d}{f_1 f_2} & 1 - \dfrac{d}{f_2} \end{pmatrix}. \tag{1.52}$$

Now, we impose the condition that $s_{21} = 0$ should hold. This implies that

$$-\frac{1}{f_1} - \frac{1}{f_2} + \frac{d}{f_1 f_2} = 0. \tag{1.53}$$

The setting $d = f_1 + f_2$ solves this equation and we have

$$S = \begin{pmatrix} -\dfrac{f_2}{f_1} & f_1 + f_2 \\ 0 & -\dfrac{f_1}{f_2} \end{pmatrix} \tag{1.54}$$

for the system matrix of the telescopic arrangement. It represents a Keplerian telescope if both focal lengths are positive. If the focal length of the first lens is negative, the matrix describes a Galilean telescope, which is composed of a concave and a convex lens. Optical arrangements of this type also serve as transmissive beam expanders (Das, 1991) and intracavity telescopes (Siegman, 1986). The results on the significance of special matrix entries are summarized in Fig. 1.6.

Figure 1.6 The significance of zero-matrix entries.
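The afocal condition for two thin lenses can be verified numerically. The sketch below is an illustration with assumed focal lengths (100 and 25 for an arrangement of two positive lenses, -25 and 100 for an arrangement with a negative first lens); the function names are hypothetical. With the separation set to the sum of the focal lengths, the lower-left entry of the system matrix vanishes, and the diagonal entries give the lateral and angular magnification of the collimated beam.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def thin_lens(f):
    """Thin-lens matrix with focal length f."""
    return ((1.0, 0.0), (-1.0 / f, 1.0))

def translation(d):
    """Translation matrix for a distance d."""
    return ((1.0, d), (0.0, 1.0))

def two_lens_system(f1, f2, d):
    """System matrix of two thin lenses separated by d (first lens f1, then f2)."""
    return matmul(thin_lens(f2), matmul(translation(d), thin_lens(f1)))

# Two positive lenses, separated by the sum of their focal lengths.
f1, f2 = 100.0, 25.0
S = two_lens_system(f1, f2, f1 + f2)

# Negative first lens used as a beam expander, same separation rule.
S_expander = two_lens_system(-25.0, 100.0, -25.0 + 100.0)
print(S, S_expander)
```

In the first case the beam diameter is reduced by the factor $f_2/f_1$ and the image is inverted (negative diagonal entries); in the second case the beam is expanded by a factor of 4 without inversion.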

1.7 Lens Doublet

We encountered telescopic arrangements as the first examples of a lens doublet, and we now take a closer look at optical systems composed of two lenses. The matrix that describes two lenses separated by a distance $d$ forms the starting point of our discussion:

$$S = \begin{pmatrix} 1 - \dfrac{d}{f_1} & d \\ -\dfrac{1}{f_1} - \dfrac{1}{f_2} + \dfrac{d}{f_1 f_2} & 1 - \dfrac{d}{f_2} \end{pmatrix}. \tag{1.55}$$


The term $s_{21}$ is related to the focal length of the doublet (measured with respect to its second principal plane):

$$
\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{d}{f_1 f_2}. \tag{1.56}
$$

[As shown before, this principal plane is at a distance $z = (1 - s_{11})/s_{21}$ from the second reference plane of the system.] To facilitate the discussion, it is convenient to reference the intermediate distance to the second focal plane of the first lens and to the first focal plane of the second lens by setting

$$
d = f_1 + E + f_2. \tag{1.57}
$$

With this setting, the equation for the focal length of the doublet reduces to

$$
f = -\frac{f_1 f_2}{E}. \tag{1.58}
$$

It can now be discussed in terms of the signs of the three parameters involved. Depending on whether $f_1 < 0$ or $f_1 > 0$, $f_2 < 0$ or $f_2 > 0$, and $E < 0$, $E = 0$, or $E > 0$, twelve cases can be distinguished. The case where $f_1 > 0$ and $f_2 > 0$ and $E = 0$, for example, represents the Keplerian telescope. At this point, it is natural to distinguish between divergent ($f < 0$) and convergent ($f > 0$) doublets in terms of their three parameters. A compound microscope represented as a doublet is characterized by $f_1 > 0$ and $f_2 > 0$ and $E > 0$, and it is interesting to note that it is an example of a divergent system (Pérez, 1996).
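As a small numerical illustration of Eqs. (1.56) to (1.58), the sketch below evaluates the doublet focal length from the parameter $E$ and compares it with the value obtained from the matrix entry $s_{21}$. The function names and the numbers are assumptions chosen for the example.

```python
def doublet_focal_length(f1, f2, E):
    """Focal length from Eq. (1.58), with E = d - f1 - f2.

    E == 0 is the afocal (telescopic) case, reported here as infinity.
    """
    if E == 0:
        return float("inf")
    return -f1 * f2 / E

def doublet_power_from_matrix(f1, f2, d):
    """Refractive power 1/f = -s21, with s21 taken from Eq. (1.55)."""
    s21 = -1.0 / f1 - 1.0 / f2 + d / (f1 * f2)
    return -s21

f1, f2, E = 10.0, 20.0, 5.0  # microscope-like case: all three positive
f = doublet_focal_length(f1, f2, E)
f_matrix = 1.0 / doublet_power_from_matrix(f1, f2, f1 + E + f2)
print(f, f_matrix)  # both negative: a divergent doublet
```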

1.8 Decomposition of Matrices and System Synthesis

In the layout of a new optical system, it is advantageous to know how the ray-transfer matrix of a given optical system can be factorized. Suppose that we want to design an optical device with given properties and that some of these properties can be expressed in terms of a system matrix. To realize the device, it is of interest to explore systematically in which ways these properties can be obtained. To this end, it is useful to divide the device into subsystems, the combination of which creates the desired functionality. In the matrix description, this is equivalent to writing the target matrix as a product of matrices, and this is where factorizing the system matrix comes into play. The synthesis of optical systems using this approach has been studied in depth by Casperson (1981).

In what follows, we will consider optical systems that have both their object and image planes in air. Therefore, $n_1 = 1$ and $n_2 = 1$, and the determinant of the system matrix $S$ can be written as

$$
\det S = \frac{n_1}{n_2} = 1. \tag{1.59}
$$

Therefore, the condition

$$
s_{11} s_{22} - s_{12} s_{21} = 1 \tag{1.60}
$$

is contained implicitly in Eqs. (1.61) and (1.62). A generalization is possible and can, for example, be found in the work of Casperson (1981).

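The appropriate factorization depends on the matrix entries. If we consider a nonimaging problem, we can assume $s_{12} \neq 0$ for the system matrix. Such a matrix can be factorized as

$$
S = \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ \frac{s_{22} - 1}{s_{12}} & 1 \end{pmatrix}
\begin{pmatrix} 1 & s_{12} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ \frac{s_{11} - 1}{s_{12}} & 1 \end{pmatrix}. \tag{1.61}
$$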

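If the lower-left entry of the system matrix can be assumed to be nonzero ($s_{21} \neq 0$), i.e., if we do not look for an afocal system, the following matrix decomposition is appropriate:

$$
S = \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix}
= \begin{pmatrix} 1 & \frac{s_{11} - 1}{s_{21}} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ s_{21} & 1 \end{pmatrix}
\begin{pmatrix} 1 & \frac{s_{22} - 1}{s_{21}} \\ 0 & 1 \end{pmatrix}. \tag{1.62}
$$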

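What is left are the cases in which both $s_{12} = 0$ and $s_{21} = 0$. These cases correspond to optical systems that are both imaging and afocal. In the above-cited work, four possibilities for a decomposition of this diagonal matrix are given. The system matrix is either decomposed into a product of matrices $A$ and $B$ with $a_{21} = 0$ and $b_{21} = 0$ as

$$
S = \begin{pmatrix} s_{11} & 0 \\ 0 & s_{22} \end{pmatrix}
= \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} s_{11} & -t s_{22} \\ 0 & s_{22} \end{pmatrix}, \tag{1.63}
$$

$$
S = \begin{pmatrix} s_{11} & 0 \\ 0 & s_{22} \end{pmatrix}
= \begin{pmatrix} s_{11} & -t s_{11} \\ 0 & s_{22} \end{pmatrix}
\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \tag{1.64}
$$

or into a product of matrices with $a_{12} = 0$ and $b_{12} = 0$ as

$$
S = \begin{pmatrix} s_{11} & 0 \\ 0 & s_{22} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix}
\begin{pmatrix} s_{11} & 0 \\ \frac{s_{11}}{f} & s_{22} \end{pmatrix}, \tag{1.65}
$$

$$
S = \begin{pmatrix} s_{11} & 0 \\ 0 & s_{22} \end{pmatrix}
= \begin{pmatrix} s_{11} & 0 \\ \frac{s_{22}}{f} & s_{22} \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix}. \tag{1.66}
$$

Depending on the application, the matrices appearing in the product can then be further decomposed by applying the same set of rules.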
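The factorizations of Eqs. (1.61) and (1.62) can be verified by multiplying the factors back together. The sketch below does this for one arbitrarily chosen matrix with unit determinant; the function names and the sample matrix are assumptions made for illustration.

```python
def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def factor_for_nonzero_s12(S):
    """Three factors of Eq. (1.61); requires s12 != 0 and det S = 1."""
    (s11, s12), (s21, s22) = S
    return ([[1.0, 0.0], [(s22 - 1.0) / s12, 1.0]],
            [[1.0, s12], [0.0, 1.0]],
            [[1.0, 0.0], [(s11 - 1.0) / s12, 1.0]])

def factor_for_nonzero_s21(S):
    """Three factors of Eq. (1.62); requires s21 != 0 and det S = 1."""
    (s11, s12), (s21, s22) = S
    return ([[1.0, (s11 - 1.0) / s21], [0.0, 1.0]],
            [[1.0, 0.0], [s21, 1.0]],
            [[1.0, (s22 - 1.0) / s21], [0.0, 1.0]])

def product(factors):
    """Multiply three factors in the written order."""
    a, b, c = factors
    return matmul(a, matmul(b, c))

S = [[2.0, 3.0], [1.0, 2.0]]  # sample matrix with determinant 1
print(product(factor_for_nonzero_s12(S)))
print(product(factor_for_nonzero_s21(S)))
```

Read as optical elements, the factors of Eq. (1.61) have the form lens, translation, lens, and those of Eq. (1.62) the form translation, lens, translation.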

1.9 Central Theorem of First-Order Ray Tracing

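We will now turn to a theorem that is of prime importance to ray tracing using the matrix method. It can be applied to different sets of rays. Its main content is that the number of rays necessary to characterize an optical system in the linear approximation is rather small.

Let us consider two rays labeled $a$ and $b$ that are traced through the optical system described by the matrix $A$. Each ray vector entering the system is mapped onto an output ray vector as follows:

$$
\begin{pmatrix} y_a \\ \alpha_a \end{pmatrix} \rightarrow \begin{pmatrix} y_a^{\mathrm{out}} \\ \alpha_a^{\mathrm{out}} \end{pmatrix}, \tag{1.67}
$$

$$
\begin{pmatrix} y_b \\ \alpha_b \end{pmatrix} \rightarrow \begin{pmatrix} y_b^{\mathrm{out}} \\ \alpha_b^{\mathrm{out}} \end{pmatrix}. \tag{1.68}
$$

The mapping is given by the system matrix $A$. Therefore, we have

$$
\begin{pmatrix} y_a^{\mathrm{out}} \\ \alpha_a^{\mathrm{out}} \end{pmatrix} = A \begin{pmatrix} y_a \\ \alpha_a \end{pmatrix}, \tag{1.69}
$$

$$
\begin{pmatrix} y_b^{\mathrm{out}} \\ \alpha_b^{\mathrm{out}} \end{pmatrix} = A \begin{pmatrix} y_b \\ \alpha_b \end{pmatrix}. \tag{1.70}
$$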

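We assume that we can completely determine the four ray coordinates and that we want to use this information to determine the system matrix $A$. Its entries are therefore the unknown variables of the problem, and we can state it by rewriting the above equations as the following system of linear equations:

$$
\begin{pmatrix} y_a & \alpha_a & 0 & 0 \\ y_b & \alpha_b & 0 & 0 \\ 0 & 0 & y_a & \alpha_a \\ 0 & 0 & y_b & \alpha_b \end{pmatrix}
\begin{pmatrix} a_{11} \\ a_{12} \\ a_{21} \\ a_{22} \end{pmatrix}
= \begin{pmatrix} y_a^{\mathrm{out}} \\ y_b^{\mathrm{out}} \\ \alpha_a^{\mathrm{out}} \\ \alpha_b^{\mathrm{out}} \end{pmatrix}. \tag{1.71}
$$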

Because the matrix is partitioned, two sets of linear equations can be solved independently. If the determinant

$$
D = \det \begin{pmatrix} y_a & \alpha_a \\ y_b & \alpha_b \end{pmatrix} \neq 0, \tag{1.72}
$$

the problem has a unique solution, namely,

$$
a_{11} = \frac{1}{D} \det \begin{pmatrix} y_a^{\mathrm{out}} & \alpha_a \\ y_b^{\mathrm{out}} & \alpha_b \end{pmatrix}, \tag{1.73}
$$

$$
a_{12} = \frac{1}{D} \det \begin{pmatrix} y_a & y_a^{\mathrm{out}} \\ y_b & y_b^{\mathrm{out}} \end{pmatrix}, \tag{1.74}
$$

$$
a_{21} = \frac{1}{D} \det \begin{pmatrix} \alpha_a^{\mathrm{out}} & \alpha_a \\ \alpha_b^{\mathrm{out}} & \alpha_b \end{pmatrix}, \tag{1.75}
$$

$$
a_{22} = \frac{1}{D} \det \begin{pmatrix} y_a & \alpha_a^{\mathrm{out}} \\ y_b & \alpha_b^{\mathrm{out}} \end{pmatrix}. \tag{1.76}
$$

$D \neq 0$ is equivalent to the condition that the input ray vectors are linearly independent. We can therefore conclude that the ray-transfer matrix is completely determined if we know a set of two linearly independent input ray vectors and the corresponding output ray vectors. In the linear approximation, the passage of any third ray through the system is then known because we can trace it through the system using the matrix $A$. In other words, the theorem states that in the approximation used, the input-output relation is completely characterized once the input and output data of two linearly independent rays are known.

This gives the theoretical basis of why an optical system can be characterized to such an extent by just tracing the principal ray and the axial ray.
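The theorem can be tried out directly: given the input and output coordinates of two linearly independent rays, Eqs. (1.73) to (1.76) return the matrix. The following sketch assumes rays stored as `(y, alpha)` tuples; the function names and the test matrix are hypothetical.

```python
def apply(M, ray):
    """Map a ray (y, alpha) through the 2x2 matrix M."""
    y, a = ray
    return (M[0][0] * y + M[0][1] * a, M[1][0] * y + M[1][1] * a)

def matrix_from_two_rays(ray_a, out_a, ray_b, out_b):
    """Recover the ray-transfer matrix via Cramer's rule, Eqs. (1.72)-(1.76)."""
    (ya, aa), (yb, ab) = ray_a, ray_b
    (ya_o, aa_o), (yb_o, ab_o) = out_a, out_b
    D = ya * ab - yb * aa
    if D == 0:
        raise ValueError("input rays are linearly dependent")
    a11 = (ya_o * ab - yb_o * aa) / D
    a12 = (ya * yb_o - yb * ya_o) / D
    a21 = (aa_o * ab - ab_o * aa) / D
    a22 = (ya * ab_o - yb * aa_o) / D
    return [[a11, a12], [a21, a22]]

# A matrix that plays the role of the "unknown" system (assumed values).
A_true = [[0.5, 30.0], [-0.02, 0.8]]
ray_a, ray_b = (0.0, 0.1), (2.0, -0.05)
A_found = matrix_from_two_rays(ray_a, apply(A_true, ray_a),
                               ray_b, apply(A_true, ray_b))
print(A_found)
```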

1.10 Aperture Stop and Field Stop

The aperture stop is defined as the opening of an optical system that limits the input angle at zero height in the object plane. A ray with these coordinates can be transported through the system. If the input angle of a ray is slightly greater than this critical angle, the ray is blocked. We might have several candidates in the system to cause this blockage, and which of them forms the aperture stop can be determined in the following way using the matrix method. We label the free diameters of the candidates as $y^{(k)}$. To every candidate now corresponds a matrix $P^{(k)}$ that maps the start ray into the reference plane at $z^{(k)}$,

$$
\begin{pmatrix} y^{(k)} \\ \alpha'^{(k)} \end{pmatrix} = P^{(k)} \begin{pmatrix} 0 \\ \alpha^{(k)} \end{pmatrix}, \tag{1.77}
$$

where $\alpha'^{(k)}$ is the ray angle in that plane. This implies that

$$
\alpha^{(k)} = \frac{y^{(k)}}{p_{12}^{(k)}}. \tag{1.78}
$$

The aperture stop is at the position $z^{(k)}$ for which $\alpha^{(k)}$ takes the minimum value of all the candidates. It has the height $y_{as} = y^{(k)}$ if $k$ is the label for that minimum.

The axial ray is the ray that starts at zero height in the object plane and that passes through the aperture stop at the maximum possible height. If we suppose that the matrix $P$ describes the mapping of the ray from the object plane to the aperture plane, we can trace this ray to that plane using

$$
P \begin{pmatrix} 0 \\ \alpha \end{pmatrix} = \begin{pmatrix} y_{as} \\ \alpha' \end{pmatrix}. \tag{1.79}
$$

Its start coordinates in the object plane are

$$
\begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix} = \begin{pmatrix} 0 \\ \frac{y_{as}}{p_{12}} \end{pmatrix},
$$

and this ray can now be traced through the complete optical system. We describe the second part of the system, i.e., the part between the aperture plane and the image plane, by $Q$. Therefore, the system matrix is

$$
S = QP, \tag{1.80}
$$

and the coordinates of the axial ray as it leaves the system are

$$
\begin{pmatrix} y_{out} \\ \alpha_{out} \end{pmatrix} = \frac{y_{as}}{p_{12}} \begin{pmatrix} s_{12} \\ s_{22} \end{pmatrix}. \tag{1.81}
$$

While the axial ray starts at zero height in the object plane and passes through the aperture stop at its margin, the principal ray starts at the marginal height of the object (if this corresponds to the field stop) and passes through the aperture stop at zero height. In the matrix description, we can express this relation by using the matrix $P$, which describes the mapping from the object plane to the plane of the aperture stop, as

$$
P \begin{pmatrix} y_{field} \\ \alpha \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha' \end{pmatrix}. \tag{1.82}
$$

To be able to trace the principal ray through a complete system, we need its input angle, which we can calculate from the following equation:

$$
\alpha = -\frac{p_{11}}{p_{12}}\, y_{field}. \tag{1.83}
$$

Therefore, the input coordinates of the principal ray are given by

$$
\begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix} = \begin{pmatrix} y_{field} \\ -\frac{p_{11}}{p_{12}}\, y_{field} \end{pmatrix}, \tag{1.84}
$$

and the output coordinates after passage through the whole system are

$$
\begin{pmatrix} y_{out} \\ \alpha_{out} \end{pmatrix} = y_{field}\, S \begin{pmatrix} 1 \\ -\frac{p_{11}}{p_{12}} \end{pmatrix}. \tag{1.85}
$$

It is interesting to note the following symmetry that exists between the axial ray and the principal ray:

$$
P \begin{pmatrix} 0 \\ \alpha \end{pmatrix} = \begin{pmatrix} y_{as} \\ \alpha' \end{pmatrix} \quad \text{for the axial ray,} \tag{1.86}
$$

$$
\begin{pmatrix} y_{field} \\ \alpha \end{pmatrix} = P^{-1} \begin{pmatrix} 0 \\ \alpha' \end{pmatrix} \quad \text{for the principal ray,} \tag{1.87}
$$

where the corresponding inverse matrix has been used. Das (1991) expressed this symmetry relation by writing ". . . the field stop is nothing but the new aperture stop, when the object is placed at the center of the actual aperture stop."

(Please note that in writing the symmetry relation it was assumed that the extension of the object can be identified with the extension of the field stop. This is quite often the case, but more intricate situations are possible.)
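The selection rule of Eq. (1.78) and the start coordinates of the principal ray in Eq. (1.84) can be sketched in a few lines. The layout below (an object 100 units in front of a thin lens, followed by a diaphragm) is a made-up example, the helper names are assumptions, and the absolute value in `aperture_stop` is an added safeguard against sign conventions rather than something stated in the text.

```python
def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def translation(d):
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

def aperture_stop(candidates):
    """Return (label, limiting input angle) following Eq. (1.78).

    candidates: list of (label, free height y_k, matrix P_k from the object
    plane to the candidate). Each p12 must be nonzero.
    """
    limits = [(abs(y / P[0][1]), label) for label, y, P in candidates]
    alpha, label = min(limits)
    return label, alpha

def principal_ray(P, y_field):
    """Input coordinates of the principal ray, Eq. (1.84)."""
    return (y_field, -P[0][0] / P[0][1] * y_field)

# Hypothetical layout: lens (f = 50, rim height 10) at distance 100 from the
# object, diaphragm (height 4) placed 20 behind the lens.
P_lens = translation(100.0)
P_diaphragm = matmul(translation(20.0),
                     matmul(thin_lens(50.0), translation(100.0)))
label, alpha_max = aperture_stop([("lens rim", 10.0, P_lens),
                                  ("diaphragm", 4.0, P_diaphragm)])
y_in, alpha_in = principal_ray(P_diaphragm, 5.0)
y_at_stop = P_diaphragm[0][0] * y_in + P_diaphragm[0][1] * alpha_in
print(label, alpha_max, y_at_stop)  # the principal ray crosses the stop on axis
```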



1.11 Lagrange Invariant

1.11.1 Derivation using the matrix method

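We know that

$$
\det A = \frac{n_1}{n_2} \tag{1.88}
$$

holds for a ray-transfer matrix. During the derivation of the central theorem of first-order ray tracing, the following result was obtained for two linearly independent rays that pass through the system described by this matrix:

$$
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
= \frac{1}{D} \begin{pmatrix}
y_a^{\mathrm{out}} \alpha_b - y_b^{\mathrm{out}} \alpha_a & y_a y_b^{\mathrm{out}} - y_b y_a^{\mathrm{out}} \\
\alpha_a^{\mathrm{out}} \alpha_b - \alpha_b^{\mathrm{out}} \alpha_a & y_a \alpha_b^{\mathrm{out}} - y_b \alpha_a^{\mathrm{out}}
\end{pmatrix}. \tag{1.89}
$$

If we use this result in Eq. (1.88), we have

$$
\frac{(y_a^{\mathrm{out}} \alpha_b - y_b^{\mathrm{out}} \alpha_a)(y_a \alpha_b^{\mathrm{out}} - y_b \alpha_a^{\mathrm{out}}) - (\alpha_a^{\mathrm{out}} \alpha_b - \alpha_b^{\mathrm{out}} \alpha_a)(y_a y_b^{\mathrm{out}} - y_b y_a^{\mathrm{out}})}{(y_a \alpha_b - y_b \alpha_a)^2} = \frac{n_1}{n_2}. \tag{1.90}
$$

Performing the multiplications, we find

$$
\frac{y_a^{\mathrm{out}} \alpha_b^{\mathrm{out}} - \alpha_a^{\mathrm{out}} y_b^{\mathrm{out}}}{y_a \alpha_b - \alpha_a y_b} = \frac{n_1}{n_2}. \tag{1.91}
$$

This equation can now be rearranged slightly, to separate input and output quantities, as

$$
n_1 (y_a \alpha_b - \alpha_a y_b) = n_2 \left( y_a^{\mathrm{out}} \alpha_b^{\mathrm{out}} - \alpha_a^{\mathrm{out}} y_b^{\mathrm{out}} \right). \tag{1.92}
$$

We can therefore conclude that the following quantity is conserved during the passage through the system:

$$
L = n (y_a \alpha_b - \alpha_a y_b). \tag{1.93}
$$

This is the Lagrange invariant.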

1.11.2 Application to optical design

We now turn to an application of the Lagrange invariant that is useful when designing imaging systems. Let us consider the invariance condition given by Eq. (1.92) and specialize it for the case where ray $a$ is the axial ray and ray $b$ is the principal ray. From the discussion earlier, we know that these are two linearly independent rays and are therefore suitable to characterize the system. These rays pass through an imaging system. As reference planes, the object plane and the image plane are natural choices, so we have

$$
n (y_{ar} \alpha_{pr} - \alpha_{ar} y_{pr}) = n' (y'_{ar} \alpha'_{pr} - \alpha'_{ar} y'_{pr}). \tag{1.94}
$$

In both these planes, the height of the axial ray is equal to zero. Therefore, the above equation reduces to

$$
n (\alpha_{ar} y_{pr}) = n' (\alpha'_{ar} y'_{pr}). \tag{1.95}
$$

The quantities $\alpha_{ar}$ and $\alpha'_{ar}$ can be identified with the aperture of the system in the object plane and the image plane, respectively, and $y_{pr}$ and $y'_{pr}$ are the heights of the object and the image. In this way, the Lagrange invariant allows one to establish a direct relation between these quantities. This relation is, for example, of use to study the magnification and angular magnification of an optical instrument.
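The conservation expressed by Eq. (1.92) can be checked for a single refracting surface. The sketch below assumes the standard paraxial refraction matrix with determinant $n_1/n_2$; this matrix, the function names, and the numerical values are assumptions made for the example.

```python
def apply(M, ray):
    """Map a ray (y, alpha) through the 2x2 matrix M."""
    y, a = ray
    return (M[0][0] * y + M[0][1] * a, M[1][0] * y + M[1][1] * a)

def refraction(n1, n2, R):
    """Paraxial refraction at a spherical surface of radius R.

    Assumed standard form; its determinant is n1/n2 as in Eq. (1.88).
    """
    return [[1.0, 0.0], [(n1 - n2) / (n2 * R), n1 / n2]]

def lagrange(n, ray_a, ray_b):
    """Lagrange invariant of Eq. (1.93) for rays given as (y, alpha)."""
    (ya, aa), (yb, ab) = ray_a, ray_b
    return n * (ya * ab - aa * yb)

n1, n2 = 1.0, 1.5
M = refraction(n1, n2, 50.0)
ray_a, ray_b = (0.0, 0.1), (5.0, -0.02)  # axial-like and principal-like rays
L_in = lagrange(n1, ray_a, ray_b)
L_out = lagrange(n2, apply(M, ray_a), apply(M, ray_b))
print(L_in, L_out)  # equal up to rounding
```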

1.12 Petzval Radius

The Petzval radius is the reciprocal of the curvature of the image field. This is a concept of aberration theory, and it is surprising that we can determine its value from paraxial quantities. The quantities to be considered are quite similar to those that we encountered building up a lens, namely,

$$
P_1 = \frac{n - 1}{R_1}, \qquad P_2 = \frac{1 - n}{R_2}.
$$

We state the relation for the Petzval radius $R_p$ of a system made up of $N$ refractive surfaces, without proof, as

$$
\frac{1}{R_p} = \sum_{k=1}^{N} \frac{D_k}{n_k}, \tag{1.96}
$$

where $D_k = \Delta n_k / R_k$ is the refractive power of the $k$th surface, $\Delta n_k$ is the difference in refractive indices, and $R_k$ is the radius of curvature of the $k$th surface. A proof can be found in textbooks on optics (Born, 1933; Born and Wolf, 1980).

The Petzval radius should tend to infinity to have a flat image field. This corresponds to finding a combination for the right-hand side of the equation that brings its value close to zero. An example of an optical device that minimizes this quantity is Dyson's system (Dyson, 1959). This optical arrangement will be considered in more detail in Section 5.1.

1.13 Delano Diagram

The Delano diagram or $y$–$\bar{y}$ diagram is a visual tool of paraxial analysis (Delano, 1963; Shack, 1973; Besenmatter, 1980). It is created by tracing a principal ray ($\bar{y}$) and an axial ray ($y$) through an optical system and by drawing the corresponding ray heights ($\bar{y}$, $y$) in a single diagram. The position on the optical axis does not appear explicitly in this diagram, and this representation is therefore somewhat abstract compared to a usual ray trace. But in many cases it can give an overview that serves as a kind of shorthand notation for the system. Figure 1.7 shows four Delano diagrams that represent the different cases of matrices with zero entries.

Figure 1.7 Delano diagram, four cases.

1.14 Phase Space

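Looking at the set of coordinates used in matrix optics, it is natural to try a representation in a plane using the ray height as one coordinate and the ray angle as the other coordinate. A set of rays in a given reference plane can then be represented as a surface in this abstract plane.

To familiarize ourselves with this concept, let us choose a set of rays whose coordinates are represented by a rectangle in this so-called phase space. We would like to see what might be the effect of the translation matrix $T$ on this set of rays. The corners of the rectangle are mapped as follows:

$$
\begin{pmatrix} 0 \\ \alpha_0 \end{pmatrix} \rightarrow \begin{pmatrix} 0 \\ \alpha_0 \end{pmatrix} + \begin{pmatrix} t \alpha_0 \\ 0 \end{pmatrix}, \tag{1.97}
$$

$$
\begin{pmatrix} y_0 \\ \alpha_0 \end{pmatrix} \rightarrow \begin{pmatrix} y_0 \\ \alpha_0 \end{pmatrix} + \begin{pmatrix} t \alpha_0 \\ 0 \end{pmatrix}, \tag{1.98}
$$

$$
\begin{pmatrix} 0 \\ 0 \end{pmatrix} \rightarrow \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{1.99}
$$

$$
\begin{pmatrix} y_0 \\ 0 \end{pmatrix} \rightarrow \begin{pmatrix} y_0 \\ 0 \end{pmatrix}. \tag{1.100}
$$

Graphically, this can be expressed as in Fig. 1.8. The corresponding mapping for the refraction matrix is shown in Fig. 1.9.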


Figure 1.8 Effect of the translation matrix in phase space. The spatial coordinate is represented along the $y$-axis and the angular coordinate is represented along the $\alpha$-axis.

Figure 1.9 Effect of the refraction matrix in phase space. As in Fig. 1.8, the spatial coordinate is represented along the $y$-axis and the angular coordinate is represented along the $\alpha$-axis.

It is an important feature of this phase space that the volume (or surface in the two-dimensional case considered here) is conserved if we consider mappings between input and output planes where the refractive indices are 1. This conservation is a consequence of Eq. (1.19).

From the theoretical point of view, it is more convenient to use coordinates $(y, n\alpha)$. The corresponding volume (or surface) is then always conserved. The phase-space approach is common in laser technology (Hodgson and Weber, 1997).
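The conservation of the phase-space surface under translation can be illustrated with the rectangle of Eqs. (1.97) to (1.100). The sketch below shears the four corners and compares the enclosed areas using the shoelace formula; the function names and the values of $y_0$, $\alpha_0$, and $t$ are assumptions.

```python
def translate(point, t):
    """Effect of the translation matrix on a phase-space point (y, alpha)."""
    y, alpha = point
    return (y + t * alpha, alpha)

def polygon_area(points):
    """Area of a polygon given by its corners in order (shoelace formula)."""
    area = 0.0
    for (y1, a1), (y2, a2) in zip(points, points[1:] + points[:1]):
        area += y1 * a2 - y2 * a1
    return abs(area) / 2.0

y0, alpha0, t = 2.0, 0.1, 50.0  # illustrative values
rectangle = [(0.0, 0.0), (y0, 0.0), (y0, alpha0), (0.0, alpha0)]
sheared = [translate(p, t) for p in rectangle]
print(sheared)
print(polygon_area(rectangle), polygon_area(sheared))  # same area
```

The rectangle becomes a parallelogram: the corners with zero angle stay in place, while those with angle $\alpha_0$ move by $t\alpha_0$ along the $y$-axis.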

1.15 An Alternative Paraxial Calculation Method

An alternative paraxial method (Berek, 1930) uses distances $s_k$ measured along the optical axis and ray heights $h_k$ measured perpendicular to it as a set of coordinates for the description of a ray. For readers who use this method, it might be interesting to see how both methods are connected. They will be familiar with the so-called transition equations. These equations state how a ray that leaves a lens surface labeled $k$ is transformed when it reaches another surface (labeled $k+1$) situated at a distance $e'_k$:

$$
s_{k+1} = s'_k - e'_k, \tag{1.101}
$$

$$
s_{k+1} = \frac{h_{k+1}}{u_{k+1}}, \tag{1.102}
$$

$$
s'_k = \frac{h_k}{u'_k}, \tag{1.103}
$$

where $u_{k+1}$ and $u'_k$ are angles. We can combine these equations to obtain

$$
\frac{h_{k+1}}{u_{k+1}} = \frac{h_k}{u'_k} - e'_k. \tag{1.104}
$$

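Setting $u_{k+1} = u'_k$, we can write the linear relation,

$$
\begin{pmatrix} h_{k+1} \\ u_{k+1} \end{pmatrix} = \begin{pmatrix} 1 & -e'_k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} h_k \\ u'_k \end{pmatrix}, \tag{1.105}
$$

which is quite similar to the relation for the translation matrix $T$.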

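The other method also makes use of the relation

$$
n_k \left( \frac{1}{R_k} - \frac{1}{s_k} \right) = n'_k \left( \frac{1}{R_k} - \frac{1}{s'_k} \right) \tag{1.106}
$$

for the two sides of a refracting surface with radius $R_k$. Using $u_k = h_k / s_k$ for the paraxial angle, we can write this equation as follows:

$$
u'_k = \frac{h_k}{R_k} \left( 1 - \frac{n_k}{n'_k} \right) + \frac{n_k}{n'_k}\, u_k. \tag{1.107}
$$

Assuming $h'_k = h_k$ in addition, we have an augmented linear relation of the form

$$
\begin{pmatrix} h'_k \\ u'_k \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \frac{n'_k - n_k}{R_k n'_k} & \frac{n_k}{n'_k} \end{pmatrix} \begin{pmatrix} h_k \\ u_k \end{pmatrix}. \tag{1.108}
$$

This relation corresponds to the refraction matrix $R$ of the matrix description.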

1.16 Gaussian Brackets

While cascading matrices in order to determine a system matrix, the recursive character of the problem became clear. But at this point, we were unable to state the underlying recursion law that would allow us to express the ray that finally leaves the system, and that is described by the product matrix, as a function of the input ray without performing the matrix multiplication. The use of the mathematical concept of Gaussian brackets makes it possible to state this recursion formula. An introduction to the algebra of Gaussian brackets and the recursion law for cascaded linearized optical systems was given by Herzberger (1943). In this text, the Gaussian brackets are defined by the following recursion formula:

$$
[a_1, \ldots, a_k] = a_1 [a_2, \ldots, a_k] + [a_3, \ldots, a_k], \tag{1.109}
$$

with $[\ ] = 1$. This implies

$$
[a_1] = a_1, \tag{1.110}
$$

$$
[a_1, a_2] = a_1 a_2 + 1, \tag{1.111}
$$

$$
[a_1, a_2, a_3] = a_1 a_2 a_3 + a_1 + a_3, \tag{1.112}
$$

$$
[a_1, a_2, a_3, a_4] = a_1 a_2 a_3 a_4 + a_1 a_2 + a_1 a_4 + a_3 a_4 + 1. \tag{1.113}
$$

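From Herzberger's article, we take a description of a lens using his symbols, but arrange the linear transformation as a matrix as follows:

$$
\begin{pmatrix} x_2 \\ \alpha_2 \end{pmatrix} =
\begin{pmatrix}
1 - \frac{d_{12}}{n_{12}} \phi_1 & -\frac{d_{12}}{n_{12}} \\
\phi_1 + \phi_2 - \frac{d_{12}}{n_{12}} \phi_1 \phi_2 & 1 - \frac{d_{12}}{n_{12}} \phi_2
\end{pmatrix}
\begin{pmatrix} x_1 \\ \alpha_{01} \end{pmatrix}, \tag{1.114}
$$

where $x$ is the distance between intersections of the optical axis and the ray with the reference plane and $\alpha$ is the inclination angle of the ray. The subscripts indicate the corresponding surfaces. $\phi_k$ is the refractive power of the $k$th surface, $d_{k,k+1}$ is the distance between the $k$th and the $(k+1)$st surface, and $n_{k,k+1}$ is the refractive index between the $k$th and the $(k+1)$st surface. Using Eqs. (1.110)–(1.112), this can be recast in the following way:

$$
\begin{pmatrix} x_2 \\ \alpha_2 \end{pmatrix} =
\begin{pmatrix}
\left[ \phi_1, -\frac{d_{12}}{n_{12}} \right] & \left[ -\frac{d_{12}}{n_{12}} \right] \\
\left[ \phi_1, -\frac{d_{12}}{n_{12}}, \phi_2 \right] & \left[ -\frac{d_{12}}{n_{12}}, \phi_2 \right]
\end{pmatrix}
\begin{pmatrix} x_1 \\ \alpha_{01} \end{pmatrix}. \tag{1.115}
$$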

For the general case, the recursion law reads as follows (Herzberger, 1943):

$$
\begin{pmatrix} x_k \\ \alpha_k \end{pmatrix} =
\begin{pmatrix}
\left[ \phi_1, -\frac{d_{12}}{n_{12}}, \ldots, -\frac{d_{k-1,k}}{n_{k-1,k}} \right] & \left[ -\frac{d_{12}}{n_{12}}, \phi_2, \ldots, -\frac{d_{k-1,k}}{n_{k-1,k}} \right] \\
\left[ \phi_1, \ldots, -\frac{d_{k-1,k}}{n_{k-1,k}}, \phi_k \right] & \left[ -\frac{d_{12}}{n_{12}}, \ldots, \phi_k \right]
\end{pmatrix}
\begin{pmatrix} x_1 \\ \alpha_{01} \end{pmatrix}. \tag{1.116}
$$

This equation establishes a link between the input coordinates and the output coordinates of an optical system. The optical properties of the system are described by the matrix entries, which are Gaussian brackets. In order to state these entries explicitly in terms of refractive powers, distances, and refractive indices, the recursive relations have to be evaluated using Eq. (1.109) in a kind of backtracking procedure. The approach of the matrix method is advantageous because it allows us to obtain the system matrix by concatenating matrices, i.e., by matrix multiplication. In cases where it is useful to state input-output relations in a recursive way, the method of Gaussian brackets provides an alternative approach.

In the framework of the matrix method presented in this text, I found no way to state the recursion law with a similar conciseness. Therefore, I would like to draw the reader's attention to the Gaussian-brackets method that allows such a formulation. It is beyond the scope of this book to derive the method here, and interested readers are referred to Herzberger's work. In optical design, Gaussian brackets were applied to the layout of zoom systems (Pegis and Peck, 1962; Tanaka, 1979, 1982).
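The recursion of Eq. (1.109) translates almost literally into a recursive function. The following sketch is one possible implementation (the name `gaussian_bracket` is an assumption); it treats the single-element bracket of Eq. (1.110) as a base case and reproduces Eqs. (1.111) to (1.113) for sample integers.

```python
def gaussian_bracket(*a):
    """Gaussian bracket [a1, ..., ak] via the recursion of Eq. (1.109).

    The empty bracket equals 1, and a single-element bracket equals its
    element, as stated in Eq. (1.110).
    """
    if len(a) == 0:
        return 1
    if len(a) == 1:
        return a[0]
    return a[0] * gaussian_bracket(*a[1:]) + gaussian_bracket(*a[2:])

print(gaussian_bracket(2, 3))        # a1*a2 + 1
print(gaussian_bracket(2, 3, 4))     # a1*a2*a3 + a1 + a3
print(gaussian_bracket(2, 3, 4, 5))  # five-term expansion of Eq. (1.113)
```

The plain recursion recomputes sub-brackets and is meant only for short argument lists; for long chains, an iterative evaluation from the right end would avoid the repeated work.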



Chapter 2

Optical Components

2.1 Components Based on Reflection

Mirrors are a key component of many optical devices. The matrix for the reflection

at a spherical mirror was considered in the introductory chapter. We will now

discuss reflectors more generally.

2.1.1 Plane mirror

The matrix describing the reflection at a plane mirror can be obtained by taking the matrix for reflection at a spherical reflector and letting the radius of the spherical mirror tend to infinity. In this way, the unity matrix is obtained as

$$
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{2.1}
$$

The signs that appear in this matrix are surprising at first, and it is instructive to

derive the matrix also in an alternative way. In Fig. 2.1, the reflection of a ray at a

plane mirror, which is perpendicular to the optical axis, is depicted. In the matrix

representation used here, we unfold the ray using the reference plane of the mirror

as the plane of the coordinate break. Figure 2.2 shows the result of this unfolding.

It is this coordinate break that causes the positive signs in the matrix of the plane

reflector.

Figure 2.1 Plane mirror.

Figure 2.2 Unfolded plane mirror.

2.1.2 Retroreflector

Unlike a plane mirror, a retroreflector redirects the beam back into the direction from which it came. In the matrix description, this is expressed by a negative sign of the matrix entry $a_{22}$. Reflection at a retroreflector is combined with a change in the height of the beam. The height of the incident beam is changed from $y_{in}$ to $y_{out} = -y_{in}$. This corresponds to a parallel shift and is expressed by a negative sign of the matrix entry $a_{11}$. The complete matrix of the retroreflector reads as

$$
A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.2}
$$

More generally, a retroreflector has the following matrix:

$$
A = \begin{pmatrix} -1 & a_{12} \\ 0 & -1 \end{pmatrix}. \tag{2.3}
$$

2.1.3 Phase-conjugate mirror

An element that redirects an incident beam into itself without any shift in ray height would be described by the following matrix (Lam and Brown, 1980):

$$
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.4}
$$

There are devices that exploit nonlinear optical effects and that are able to operate in such a way on an incident laser beam. To describe these so-called phase-conjugate mirrors in the paraxial approximation, the matrix stated above can be used. The nonlinear effect itself is far beyond the scope of this approach. Retroreflectors, on the other hand, can be advantageously modeled using the matrix method.

2.1.4 Cat's-eye retroreflector

A prominent example of this type of optics is the cat's-eye retroreflector. It basically consists of a lens and a plane mirror (Fig. 2.3). The distance between the lens and the mirror is chosen as the focal length of the lens (with respect to the backward principal plane of the lens). The lens will be approximated as a thin lens. To describe the plane mirror, we can use the matrix given earlier [Eq. (2.1)]. The distances involved are first designated by $g$ and $b$. Following the ray through the unfolded arrangement, we find the following matrix chain:

$$
S = \begin{pmatrix} 1 & g \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix}
\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix}
\begin{pmatrix} 1 & g \\ 0 & 1 \end{pmatrix}. \tag{2.5}
$$

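After performing the matrix multiplications, we have

$$
S = \begin{pmatrix}
1 + 2\frac{b}{f}\left( \frac{g}{f} - 1 \right) - 2\frac{g}{f} & 2g + 2b - 4\frac{bg}{f} + 2\frac{g^2}{f}\left( \frac{b}{f} - 1 \right) \\
\frac{2}{f}\left( \frac{b}{f} - 1 \right) & \frac{2g}{f}\left( \frac{b}{f} - 1 \right) + 1 - 2\frac{b}{f}
\end{pmatrix}. \tag{2.6}
$$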

We know that letting $b = f$ makes the arrangement work as a cat's-eye retroreflector. It is instructive to see what happens if we choose the entrance and exit planes of the system either at the position of the lens or at a distance $f$ in front of it. The first alternative corresponds to setting $g = 0$ and the second one to letting $g = f$. In this way, we find

$$
S(g = 0, b = f) = \left. \begin{pmatrix} -1 & 2(f - g) \\ 0 & -1 \end{pmatrix} \right|_{g = 0} = \begin{pmatrix} -1 & 2f \\ 0 & -1 \end{pmatrix} \tag{2.7}
$$

and

$$
S(g = f, b = f) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.8}
$$

The form of the second matrix is equal to the matrix of a retroreflector stated in Eq. (2.2).

Figure 2.3 Cat's-eye arrangement.
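The matrix chain of Eq. (2.5) is easy to evaluate numerically, which gives an independent check of the special cases with $b = f$. In the sketch below, the helper names and the numerical values of $f$ and $g$ are assumptions.

```python
from functools import reduce

def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def translation(d):
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

PLANE_MIRROR = [[1.0, 0.0], [0.0, 1.0]]  # unfolded plane mirror, Eq. (2.1)

def cats_eye(f, g, b):
    """System matrix of Eq. (2.5); elements in the order the ray meets them."""
    elements = [translation(g), thin_lens(f), translation(b), PLANE_MIRROR,
                translation(b), thin_lens(f), translation(g)]
    return reduce(lambda acc, m: matmul(m, acc), elements)

f = 50.0
print(cats_eye(f, 20.0, f))  # close to [[-1, 2*(f - g)], [0, -1]]
print(cats_eye(f, f, f))     # close to the retroreflector matrix of Eq. (2.2)
```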

2.1.5 Roof mirror

A roof mirror is formed by two plane mirrors meeting at a right angle. This optical arrangement is also referred to as a double mirror (DeWeerd and Hill, 2004). We will consider the plane that is perpendicular to the roof edge. In this plane, the mirror can be unfolded as shown in Fig. 2.4 using an $xy$ plane as the plane of symmetry. The coordinates of the rays that leave the mirror after reflection can be found by tracing a rectilinear line through the unfolded arrangement. If we first ignore the coordinate break caused by the unfolding operation, the figure shows the passage of a ray through a distance $2t$. This propagation can be described by the following matrix:

$$
T = \begin{pmatrix} 1 & 2t \\ 0 & 1 \end{pmatrix}. \tag{2.9}
$$

Additionally, it has to be taken into account that the orientation of the reference axis changes due to the unfolding operation. The corresponding sign change of the coordinates in the new coordinate system can be seen in Fig. 2.4 and expressed by the following matrix:

$$
\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.10}
$$

Combining, this gives the component matrix,

$$
S = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 2t \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & -2t \\ 0 & -1 \end{pmatrix}. \tag{2.11}
$$

Therefore, a roof reflector acts like a retroreflector [Eq. (2.3)] in the plane that is perpendicular to the roof edge.

Figure 2.4 Unfolding the roof mirror.


Figure 2.5 Arrangement with roof mirror and plane mirror.

It is interesting to see how a combination of this roof mirror and a plane mirror as depicted in Fig. 2.5 would act on a ray. To this end, it is advantageous to unfold the optical arrangement with respect to the reference plane of the plane mirror. This leads to the following matrix chain:

$$
S = \begin{pmatrix} -1 & -2t \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & -2t \\ 0 & -1 \end{pmatrix}. \tag{2.12}
$$

The evaluation of this matrix product gives the system matrix of the arrangement of Fig. 2.5 as follows:

$$
S = \begin{pmatrix} 1 & 4t + 2b \\ 0 & 1 \end{pmatrix}. \tag{2.13}
$$

This equation has some similarity to the equation of a plane mirror, but is of the following more general form:

$$
A = \begin{pmatrix} 1 & a_{12} \\ 0 & 1 \end{pmatrix}. \tag{2.14}
$$

2.2 Components Based on Refraction

Lenses are of course very prominent examples of optical components based on refraction. Because their description using the matrix method is treated in other chapters, other components based on refraction are considered here.

2.2.1 Plane-parallel plate

The plane-parallel plate is a component that is often encountered in optical setups. It can be a simple glass plate used for path-length compensation or a component that is used to influence the polarization and has the form of a plate. Its system matrix can be obtained as a special case of the system matrix of a thick lens [Eq. (1.23)] in air by letting both radii tend to infinity. In this way, we find

$$
S = \begin{pmatrix} 1 & \frac{t}{n} \\ 0 & 1 \end{pmatrix}, \tag{2.15}
$$

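where $t$ is the thickness of the plane-parallel plate and $n$ is its refractive index.

To recall this equation during design work, we may also apply the linearized law of refraction twice. In matrix notation, this reads as

$$
\begin{pmatrix} y^{(1)} \\ \alpha^{(1)} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{n} \end{pmatrix} \begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix} \tag{2.16}
$$

for the rays entering the plate and

$$
\begin{pmatrix} y_{out} \\ \alpha_{out} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & n \end{pmatrix} \begin{pmatrix} y^{(2)} \\ \alpha^{(2)} \end{pmatrix} \tag{2.17}
$$

for those leaving it. In between, they pass through a homogeneous medium of refractive index $n$. Putting this together, we have

$$
S = \begin{pmatrix} 1 & 0 \\ 0 & n \end{pmatrix} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{n} \end{pmatrix} = \begin{pmatrix} 1 & \frac{t}{n} \\ 0 & 1 \end{pmatrix}. \tag{2.18}
$$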

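This matrix can be used to determine the shift $\Delta z$ depicted in Fig. 2.6, which occurs if a plane-parallel plate is introduced into a convergent beam. For comparison, we first describe the situation without a plane-parallel plate in the following way:

$$
\begin{pmatrix} y_{out,1} \\ \alpha_{out,1} \end{pmatrix} = \begin{pmatrix} 1 & z_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix} = \begin{pmatrix} 1 & t + z_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix}. \tag{2.19}
$$

At the intersection with the optical axis, we have $y_{out,1} = 0$. This implies

$$
z_1 = -\frac{y_{in}}{\alpha_{in}} - t. \tag{2.20}
$$

Figure 2.6 Shift caused by a plane-parallel plate.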


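The situation with a plane-parallel plate can be described by

$$
\begin{pmatrix} y_{out,2} \\ \alpha_{out,2} \end{pmatrix} = \begin{pmatrix} 1 & z_2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & \frac{t}{n} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix} = \begin{pmatrix} 1 & \frac{t}{n} + z_2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_{in} \\ \alpha_{in} \end{pmatrix}. \tag{2.21}
$$

Using $y_{out,2} = 0$, we find

$$
z_2 = -\frac{y_{in}}{\alpha_{in}} - \frac{t}{n}. \tag{2.22}
$$

The difference of Eqs. (2.22) and (2.20) gives the shift $\Delta z$ we are looking for in the paraxial approximation,

$$
\Delta z = z_2 - z_1 = -t \left( \frac{1}{n} - 1 \right). \tag{2.23}
$$

Because the term in brackets is generally negative, the shift $\Delta z$ takes a positive value as is expected from Fig. 2.6.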
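For a quick estimate during layout work, Eqs. (2.20), (2.22), and (2.23) can be evaluated as in the following sketch. The function names and the sample ray are assumptions; the plate values ($t = 6$, $n = 1.5$) are chosen only for illustration.

```python
def axis_crossing(y_in, alpha_in, reduced_thickness):
    """Distance behind the plate position at which the ray meets the axis.

    Solves y_in + (reduced_thickness + z) * alpha_in = 0 for z, which covers
    Eq. (2.20) with reduced_thickness = t and Eq. (2.22) with t / n.
    """
    return -y_in / alpha_in - reduced_thickness

def focal_shift(t, n):
    """Paraxial shift of Eq. (2.23) for a plate of thickness t and index n."""
    return -t * (1.0 / n - 1.0)

t, n = 6.0, 1.5
y_in, alpha_in = 1.0, -0.1  # a convergent ray (assumed values)
z1 = axis_crossing(y_in, alpha_in, t)      # without plate
z2 = axis_crossing(y_in, alpha_in, t / n)  # with plate
print(z2 - z1, focal_shift(t, n))  # both give the same positive shift
```

The shift does not depend on the input ray, which is why a plane-parallel plate displaces a paraxial focus as a whole.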

2.2.2 Prisms

Prisms have a vast range of applications in several fields of optics. In optical spectroscopy, for example, an important branch of science, prisms serve both as dispersive elements (Demtröder, 1999) and as components for beam shaping. In optical data storage technology, prisms are also used for beam shaping purposes, namely, to circularize the asymmetric beam emitted by a semiconductor laser (Marchant, 1990).

2.2.2.1 Two types of prisms

It is useful to divide prisms into two distinct groups that have different properties with respect to dispersion. A convenient graphical tool to make this distinction is the so-called tunnel diagram (Yoder, 1985). To draw this diagram, the prism has to be optically unfolded as if it has a surface from which the beam of light is reflected back into the prism. Figures 2.7–2.10 show two examples that are representative of the two groups. The tunnel diagram in Fig. 2.8 shows that the Dove prism (Fig. 2.7) is equivalent to a plane-parallel plate. There is a variety of prisms that can be understood in terms of this basic component. The second prism (Fig. 2.9), on the other hand, can be reduced to a prism with an apex angle using the tunnel diagram (Fig. 2.10).

Figure 2.7 Dove prism.

Figure 2.8 Tunnel diagram of the Dove prism.

Figure 2.9 Prism with one internal reflection.

Figure 2.10 Tunnel diagram of the prism with one internal reflection.

We can conclude from these examples that there are the following:

1. Prisms that are reducible to a plane-parallel plate.

2. Prisms that are not reducible to a plane-parallel plate.

Another way of stating this important difference is talking of nondispersive (Wolfe,

1995) and dispersive prisms (Zissis, 1995). The first case can be treated with the

matrix of a plane-parallel plate, which has already been derived. We will therefore

have a closer look at the other type of prisms.

Thin-prism approximation for dispersive prisms. If both the prism angle and the incidence angle measured against the normal of the first face of the prism are small, Snell's law can be linearized and the approximated expression for the deviation angle $\delta$ is as follows (Heavens and Ditchburn, 1991):

$$
\delta = (n - 1)\varepsilon, \tag{2.24}
$$

where $\varepsilon$ denotes the prism angle. It is interesting to observe that the angle of incidence does not appear in this approximated expression.

In terms of the matrix method, the thin prism can be written in the following way:

$$
\begin{pmatrix} y' \\ \alpha' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y \\ \alpha \end{pmatrix} + \begin{pmatrix} 0 \\ (n - 1)\varepsilon \end{pmatrix}. \tag{2.25}
$$

We will apply this later to get an approximated expression for an axicon.

Trigonometric description of dispersive prisms In many cases, the linear approxi-

mation is not sufficient and it is necessary to perform the trigonometric calculations

to trace a ray through a prism.

Beam shaping, especially laser beam circularization, is a representative example of how prism arrangements can be analyzed with trigonometric transfer functions. Semiconductor lasers emit with a high beam divergence perpendicular to the junction and with a low beam divergence parallel to it. In several applications, it is of importance to transform this elliptical Gaussian beam into a circular Gaussian beam or, more generally speaking, to adapt the elliptical Gaussian beam to a given optical system by expanding or compressing it along an axis. There are also applications in which the intensity profile that results in a plane perpendicular to the optical axis differs from a circular one after the Gaussian beam is shaped. Only beam shaping of collimated light is considered here, i.e., we describe prisms that can be used to perform the expansion or compression after the beam emitted by the laser diode has passed a collimating lens.

Of course, prism arrangements are not the only way to realize beam shaping

in an optical system. The reader who would like to know more about alternative

techniques is referred to Dickey and Holswade (2000).

2.2.2.2 Brewster condition

To minimize losses, the so-called Brewster condition important. Reflective losses

at a surface are minimal if a polarized beam is parallel to the plane of incidence,

i.e., a p-polarized beam, is incident at Brewsters angle (Young, 1997; Marchant,

1990). This angle depends on the refractive index in the following way:

0 =arctan(n). (2.26)

The Brewster condition is also often expressed by saying that the ray reflected by

the surface and the ray refracted into the medium are perpendicular with respect toeach other, i.e.,

0 =90 deg 1, (2.27)


47/133

34 Chapter 2

where $\alpha_1$ is the angle that the refracted ray forms with the surface normal pointing into the medium. This implies that the following equation is an alternative expression of the Brewster condition in terms of the angle $\alpha_1$:

$$\alpha_1 = \operatorname{arccot}(n). \tag{2.28}$$
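The three forms of the Brewster condition can be checked numerically. The following is a minimal sketch, not taken from the book; the function name `brewster_angles` is an assumption, and the index value is the BK 7 figure quoted later in this section.

```python
import math

def brewster_angles(n):
    """Return (alpha0, alpha1) in radians for incidence from air onto index n."""
    alpha0 = math.atan(n)                      # Eq. (2.26)
    alpha1 = math.asin(math.sin(alpha0) / n)   # Snell's law at the surface
    return alpha0, alpha1

n = 1.53024  # BK 7 at 405 nm, as quoted in the tolerancing discussion
alpha0, alpha1 = brewster_angles(n)

# Eq. (2.27): reflected and refracted rays are perpendicular
print(math.degrees(alpha0 + alpha1))
# Eq. (2.28): alpha1 = arccot(n), written here as arctan(1/n)
print(math.isclose(alpha1, math.atan(1.0 / n)))
```

Writing arccot(n) as `math.atan(1.0 / n)` is valid for the positive indices considered here.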

2.2.2.3 Refracting prism

Relation of incident and exit angle in the case of a single prism. To derive a relation between the angle of incidence $\alpha_0$ and the exit angle $\alpha_3$ of a refracting prism characterized by its apex angle $\alpha$ and its refractive index $n$, the law of refraction has to be applied twice. This leads to

$$\alpha_3 = \arcsin(n \sin\alpha_2), \tag{2.29}$$

$$\alpha_1 = \arcsin\left(\frac{1}{n}\sin\alpha_0\right), \tag{2.30}$$

where $\alpha_1$ and $\alpha_2$ are the corresponding angles inside the prism as depicted in Fig. 2.11. In a prism, these angles are related by

$$\alpha_1 + \alpha_2 = \alpha. \tag{2.31}$$

If these equations are combined, the following relation is obtained:

$$\alpha_3 = \arcsin\left\{ n \sin\left[ \alpha - \arcsin\left( \frac{1}{n}\sin\alpha_0 \right) \right] \right\}. \tag{2.32}$$

In an analogous way, the corresponding equation for the inverse relation can be derived as

$$\alpha_0 = \arcsin\left\{ n \sin\left[ \alpha - \arcsin\left( \frac{1}{n}\sin\alpha_3 \right) \right] \right\}. \tag{2.33}$$

Figure 2.11 Refracting prism.
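The relation between incidence and exit angle, and its inverse, can be sketched in a few lines. This is an illustrative sketch only: the function names and the sample values (index 1.5, apex angle 30 deg, incidence 20 deg) are assumptions chosen for the example, not values from the text.

```python
import math

def exit_angle(alpha0, apex, n):
    """Exit angle of a single prism, Eq. (2.32); all angles in radians."""
    alpha1 = math.asin(math.sin(alpha0) / n)   # Eq. (2.30)
    alpha2 = apex - alpha1                     # Eq. (2.31)
    # math.asin raises ValueError if n*sin(alpha2) > 1 (total internal reflection)
    return math.asin(n * math.sin(alpha2))     # Eq. (2.29)

def incidence_angle(alpha3, apex, n):
    """Inverse relation, Eq. (2.33): same form with the angles exchanged."""
    return exit_angle(alpha3, apex, n)

n = 1.5                       # illustrative refractive index (assumption)
apex = math.radians(30.0)     # illustrative apex angle (assumption)
alpha0 = math.radians(20.0)   # illustrative incidence angle (assumption)

alpha3 = exit_angle(alpha0, apex, n)
roundtrip = incidence_angle(alpha3, apex, n)
print(math.degrees(alpha3), math.degrees(roundtrip))
```

Applying the inverse relation to the computed exit angle returns the original incidence angle, which is a convenient consistency check on both formulas.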




From Eq. (2.32), a relation can be derived that links the incidence angle and the apex angle for the case where the beam exiting the prism is perpendicular to the second surface of the prism, i.e., $\alpha_3 = 0$. For incidence at Brewster's angle, where $\sin\alpha_0 / n = \cos\alpha_0$ by Eq. (2.26), the trigonometric identity $\arcsin x = 90^\circ - \arccos x$ allows Eq. (2.32) to be reformulated as follows:

$$\alpha_3 = \arcsin\left[ n \sin\left( \alpha - 90^\circ + \alpha_0 \right) \right]. \tag{2.34}$$

For $\alpha_3 = 0$, this equation implies that

$$\alpha - 90^\circ + \alpha_0 = 0. \tag{2.35}$$

This relation is used later.

The transfer function for the beamwidth altered by a single prism. To study the beam-shaping effect of a prism on a collimated beam, it is appropriate to consider first how a single refracting surface changes an incident beam of width $w_0$. The angle of incidence of this beam with respect to the surface normal is designated by $\alpha_0$. The angle $\alpha_1$ with respect to this normal inside a medium of refractive index $n$ is determined by Snell's law as

$$\alpha_1 = \arcsin\left(\frac{1}{n}\sin\alpha_0\right). \tag{2.36}$$

If $\alpha_0$ is not zero, the beamwidth is increased after transition from a medium of lower refractive index to a medium of higher refractive index. Figure 2.12 shows a cut through a surface that forms the boundary between air and the prism material. The beamwidth after refraction is called $w_1$. With $h$ denoting the length of the beam footprint on the surface in this cut, the following relations for the cosines of the two angles can be read from the figure:

$$\cos\alpha_0 = \frac{w_0}{h}, \tag{2.37}$$

$$\cos\alpha_1 = \frac{w_1}{h}. \tag{2.38}$$

Figure 2.12 Change of beamwidth at a refracting surface.



If they are combined, a relation follows that expresses the ratio of the beamwidths as a function of the incident and exiting angles, i.e.,

$$\frac{w_1}{w_0} = \frac{\cos\alpha_1}{\cos\alpha_0}. \tag{2.39}$$

Together with Snell's law [Eq. (2.36)], this leads to the beamwidth transfer function for a refracting surface,

$$\frac{w_1}{w_0} = \frac{\cos\{\arcsin[(1/n)\sin\alpha_0]\}}{\cos\alpha_0}. \tag{2.40}$$

The corresponding equation for the other refracting surface of the prism, i.e., the transition from a medium with refractive index $n$ to air, is

$$\frac{w_3}{w_1} = \frac{\cos[\arcsin(n\sin\alpha_2)]}{\cos\alpha_2}. \tag{2.41}$$

In a prism of apex angle $\alpha$, the link between the known angle $\alpha_1$ and the angle $\alpha_2$ is given by $\alpha = \alpha_1 + \alpha_2$. In this way, one finds

$$\frac{w_3}{w_1} = \frac{\cos[\arcsin(n\sin\{\alpha - \arcsin[(1/n)\sin\alpha_0]\})]}{\cos\{\alpha - \arcsin[(1/n)\sin\alpha_0]\}}. \tag{2.42}$$

Substituting for $w_1$ from Eq. (2.40), the beamwidth transfer function for a prism with the apex angle $\alpha$ is obtained as

$$\frac{w_3}{w_0} = \frac{\cos[\arcsin(n\sin\{\alpha - \arcsin[(1/n)\sin\alpha_0]\})]}{\cos\{\alpha - \arcsin[(1/n)\sin\alpha_0]\}} \cdot \frac{\cos\{\arcsin[(1/n)\sin\alpha_0]\}}{\cos\alpha_0}. \tag{2.43}$$

This equation shows that the transfer functions of the refracting surfaces can be multiplied to obtain the transfer function of the prism. This is a general property of the beamwidth transfer functions.
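The product structure of the prism transfer function lends itself to a direct implementation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the sample prism (index 1.5, apex 30 deg) is chosen only to exercise two cases where the result is known in advance.

```python
import math

def surface_factor(angle_in, angle_out):
    """Beamwidth ratio at one refracting surface, Eq. (2.39)."""
    return math.cos(angle_out) / math.cos(angle_in)

def prism_beamwidth_ratio(alpha0, apex, n):
    """w3/w0 of Eq. (2.43) as the product of the two surface factors."""
    alpha1 = math.asin(math.sin(alpha0) / n)   # refraction at the entrance surface
    alpha2 = apex - alpha1                     # Eq. (2.31)
    alpha3 = math.asin(n * math.sin(alpha2))   # refraction at the exit surface
    return surface_factor(alpha2, alpha3) * surface_factor(alpha0, alpha1)

n = 1.5                      # illustrative index (assumption)
apex = math.radians(30.0)    # illustrative apex angle (assumption)

# Symmetric passage (inner angles both apex/2): the two factors cancel.
alpha0_sym = math.asin(n * math.sin(apex / 2))
ratio_sym = prism_beamwidth_ratio(alpha0_sym, apex, n)

# Zero apex angle (plane-parallel plate): the beamwidth is also unchanged.
ratio_plate = prism_beamwidth_ratio(0.3, 0.0, n)
print(ratio_sym, ratio_plate)
```

Both test cases should give a ratio of 1, since in each the second surface exactly undoes the change introduced by the first.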

A prism for expansion along one axis. It is of interest to consider the situation where the light beam is incident on the prism in accordance with the Brewster condition [Eq. (2.26)], i.e.,

$$\frac{\sin\alpha_0}{\cos\alpha_0} = n. \tag{2.44}$$

To avoid reshaping at the other surface of the prism, i.e., compression along the axis that has just been expanded, it is convenient to have the beam exit the prism perpendicular to its rear surface ($\alpha_3 = 0$). Combining this condition [Eq. (2.35)] with the Brewster condition leads to the equation

$$\alpha = 90^\circ - \arctan(n). \tag{2.45}$$




Figure 2.13 Expanding prism.

This implies that the following relation between the apex angle of the prism and its refractive index should hold:

$$\alpha = \operatorname{arccot}(n). \tag{2.46}$$

In this situation of practical importance (Fig. 2.13), the relation for the beamwidths is especially simple, i.e.,

$$\frac{w_3}{w_0} = n. \tag{2.47}$$

This formula can be derived from Eq. (2.43) using the Brewster condition and the additional condition expressed in Eq. (2.46). To this end, it is helpful to consider the argument $\alpha - \arcsin(\sin\alpha_0 / n)$ first. Using Snell's law and the trigonometric identity $\arcsin x = 90^\circ - \arccos x$, one has

$$\alpha - \arcsin\left(\frac{\sin\alpha_0}{n}\right) = \alpha - 90^\circ + \alpha_0. \tag{2.48}$$

Combining this equation with Eq. (2.46) and the Brewster condition, it follows that

$$\alpha - \arcsin\left(\frac{\sin\alpha_0}{n}\right) = 0. \tag{2.49}$$

This implies that the first factor in Eq. (2.43) is equal to 1. Using Snell's law and the trigonometric identity stated before, the second factor can be expressed as

$$\frac{\cos[\arcsin(\sin\alpha_0 / n)]}{\cos\alpha_0} = \frac{\sin\alpha_0}{\cos\alpha_0}. \tag{2.50}$$

If this is combined with the Brewster condition ($\tan\alpha_0 = n$), one finds that the second factor is equal to the refractive index $n$.
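The result that a Brewster-incidence prism with apex angle arccot(n) expands the beam by exactly the factor n can be confirmed numerically. The following is a minimal check written for this purpose; the variable names are assumptions and the index is the BK 7 value quoted in the tolerancing discussion.

```python
import math

n = 1.53024                     # BK 7 at 405 nm, as quoted in the text
alpha0 = math.atan(n)           # Brewster incidence, Eq. (2.26)
apex = math.atan(1.0 / n)       # apex angle arccot(n), Eq. (2.46)

alpha1 = math.asin(math.sin(alpha0) / n)   # angle inside the prism at entry
alpha2 = apex - alpha1                     # angle inside the prism at exit
alpha3 = math.asin(n * math.sin(alpha2))   # exit angle, expected to vanish

first_factor = math.cos(alpha3) / math.cos(alpha2)    # exit surface
second_factor = math.cos(alpha1) / math.cos(alpha0)   # entrance surface
ratio = first_factor * second_factor

print(math.degrees(alpha3), first_factor, second_factor, ratio)
```

Up to rounding, the exit angle is zero, the exit-surface factor is 1, and the total ratio equals the refractive index.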



It seems worthwhile to present a shortcut to derive Eq. (2.47) for the case where it can already be assumed that the second surface of the prism leaves the beamwidth unaltered, i.e., $\alpha_3 = 0$ and

$$\frac{w_3}{w_1} = 1. \tag{2.51}$$

In this case, it is sufficient to consider

$$\frac{w_1}{w_0} = \frac{\cos\alpha_1}{\cos\alpha_0}. \tag{2.52}$$

Using the Brewster condition ($1/\cos\alpha_0 = n/\sin\alpha_0$), it is concluded that

$$\frac{w_1}{w_0} = \frac{\cos\alpha_1}{\sin\alpha_0}\, n. \tag{2.53}$$

The sine term can now be expressed by Snell's law ($\sin\alpha_0 = n\sin\alpha_1$) as

$$\frac{w_1}{w_0} = \frac{\cos\alpha_1}{\sin\alpha_1} = \cot\alpha_1. \tag{2.54}$$

At this point, it is convenient to make use of the alternative expression of the Brewster condition ($\cot\alpha_1 = n$) in order to obtain Eq. (2.47):

$$\frac{w_3}{w_0} = \frac{w_3}{w_1}\,\frac{w_1}{w_0} = n.$$

The type of prism considered here has numerous applications in optical devices. In the literature, the abbreviation $M = w_1/w_0$ for magnification can sometimes be encountered in conjunction with the following equation (Hanna et al., 1975):

$$M = \frac{1}{n}\sqrt{\frac{n^2 - \sin^2\alpha_0}{1 - \sin^2\alpha_0}}. \tag{2.55}$$

Its equivalence with Eq. (2.52) can be seen directly using Snell's law ($\sin\alpha_0 = n\sin\alpha_1$).
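The equivalence can be written out in two steps. The following worked sketch uses only Snell's law and the identity relating sine and cosine; it assumes angles between 0 and 90 deg so that both cosines are positive, and it writes the angles outside and inside the medium as $\alpha_0$ and $\alpha_1$.

```latex
\begin{align*}
\cos\alpha_1 &= \sqrt{1-\sin^2\alpha_1}
              = \sqrt{1-\frac{\sin^2\alpha_0}{n^2}}
              = \frac{1}{n}\sqrt{n^2-\sin^2\alpha_0},\\[4pt]
\frac{w_1}{w_0} &= \frac{\cos\alpha_1}{\cos\alpha_0}
              = \frac{1}{n}\sqrt{\frac{n^2-\sin^2\alpha_0}{1-\sin^2\alpha_0}}
              = M.
\end{align*}
```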

A prism for compression along one axis. An analogous relation holds for a prism that reduces the beamwidth along an axis instead of expanding it, namely,

$$\frac{w_3}{w_0} = \frac{1}{n}. \tag{2.56}$$

Such a prism can be realized by letting a beam pass undeviated through the first surface, so that the beam shaping occurs at the exit surface (Fig. 2.14). In contrast to the beam-shaping effect described earlier, the transition of importance in this case is from a medium with a higher refractive index to a medium with a lower index of refraction ($n\sin\alpha_2 = \sin\alpha_3$).




Figure 2.14 Compressive prism.

Equation (2.56) can be derived in a way very similar to the derivation of Eq. (2.47) if it is assumed that $w_2/w_0 = 1$ holds, i.e., that perpendicular incidence on the first surface is realized. Additionally, it has to be assumed that

$$\tan\alpha_3 = n \quad\Leftrightarrow\quad \cot\alpha_2 = n. \tag{2.57}$$

Combined with Snell's law, the two equivalent equations lead to

$$\frac{w_3}{w_2} = \frac{\cos\alpha_3}{\cos\alpha_2} = \frac{1}{\cot\alpha_2} = \frac{1}{n}, \tag{2.58}$$

and combined with the assumption of perpendicular incidence finally gives

$$\frac{w_3}{w_0} = \frac{w_2}{w_0}\,\frac{w_3}{w_2} = \frac{1}{n}. \tag{2.59}$$
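The compressive case can be checked in the same way as the expanding one. In this sketch, which is an illustration rather than the book's procedure, normal incidence on the first surface makes the inner angle at the exit surface equal to the apex angle, so the condition that this inner angle has cotangent n fixes the apex angle at arccot(n). The variable names are assumptions.

```python
import math

n = 1.53024                    # BK 7 at 405 nm, as quoted in the text
alpha0 = 0.0                   # perpendicular incidence on the first surface
apex = math.atan(1.0 / n)      # inner exit angle equals the apex angle; cot = n

alpha1 = math.asin(math.sin(alpha0) / n)   # zero: the beam passes undeviated
alpha2 = apex - alpha1                     # inner angle at the exit surface
alpha3 = math.asin(n * math.sin(alpha2))   # exit angle in air

entrance_factor = math.cos(alpha1) / math.cos(alpha0)   # equals 1
exit_factor = math.cos(alpha3) / math.cos(alpha2)       # Eq. (2.58)
ratio = entrance_factor * exit_factor

print(math.degrees(alpha3), ratio, 1.0 / n)
```

The exit angle comes out as arctan(n), i.e., Brewster's angle measured on the air side, and the beamwidth ratio equals 1/n.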

Tolerancing. For the purpose of tolerancing, it is advantageous to have explicit formulas for the quantities of interest as functions of physical quantities that are controlled by adjustment (the incidence angle $\alpha_0$) or by manufacturing (the refractive index $n$, the apex angle $\alpha$). Equations (2.32) and (2.43) are appropriate starting points for such a sensitivity analysis.

An important question is how the exiting angle changes if the input angle $\alpha_0$ is not well adjusted. In Fig. 2.15, the output angle is plotted as a function of the deviation $\delta = \alpha_0 - \alpha_0^{\mathrm{opt}}$ from the optimum angle $\alpha_0^{\mathrm{opt}} = \arctan(n)$ for an expanding prism (Fig. 2.13) made of the standard glass BK 7. The wavelength considered is $\lambda = 405$ nm. This determines the refractive index used in Eq. (2.32), which is $n = 1.53024$. In Fig. 2.16, the change of beamwidth calculated from Eq. (2.43) is shown as a function of the deviation $\delta$ for the same prism.
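A sensitivity scan of this kind is straightforward to reproduce. The sketch below tabulates the exit angle and the beamwidth ratio of the expanding prism against the adjustment error. The scan range of plus or minus 2 deg, the number of steps, and all function names are assumptions made for illustration; the ranges used in the book's figures are not stated in the text.

```python
import math

N_BK7 = 1.53024  # BK 7 at 405 nm, as quoted in the text

def trace(alpha0, apex, n):
    """Return (exit angle, w3/w0) from Eqs. (2.32) and (2.43); radians."""
    alpha1 = math.asin(math.sin(alpha0) / n)
    alpha2 = apex - alpha1
    alpha3 = math.asin(n * math.sin(alpha2))
    ratio = (math.cos(alpha3) / math.cos(alpha2)) * (math.cos(alpha1) / math.cos(alpha0))
    return alpha3, ratio

def sweep(n=N_BK7, max_dev_deg=2.0, steps=9):
    """Tabulate exit angle and beamwidth ratio versus adjustment error."""
    alpha0_opt = math.atan(n)     # optimum (Brewster) incidence angle
    apex = math.atan(1.0 / n)     # apex angle of the expanding prism, Eq. (2.46)
    rows = []
    for i in range(steps):
        dev_deg = -max_dev_deg + 2.0 * max_dev_deg * i / (steps - 1)
        alpha3, ratio = trace(alpha0_opt + math.radians(dev_deg), apex, n)
        rows.append((dev_deg, math.degrees(alpha3), ratio))
    return rows

rows = sweep()
for dev_deg, alpha3_deg, ratio in rows:
    print(f"{dev_deg:+.2f} deg   exit {alpha3_deg:+.4f} deg   w3/w0 {ratio:.5f}")
```

At zero deviation the scan reproduces the design values (zero exit angle, ratio equal to n). The same `trace` function can be reused with a perturbed index or apex angle to study manufacturing tolerances.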



Figure 2.15 Exit angle versus adjustment error for the prism of Fig. 2.13.

Figure 2.16 Beamwidth versus adjustment error for the prism of Fig. 2.13.

Figure 2.17 shows how the exiting angle changes with misadjustment for a compressive prism (Fig. 2.14). To allow for comparison with Fig. 2.9, the same material and wavelength as before are chosen in this example. Figure 2.18 shows the dependence of the beamwidth on a deviation from the optimum input angle.

Figure 2.17 Exit angle versus adjustment error for the prism of Fig. 2.14.

Figure 2.18 Beamwidth versus adjustment error for the prism of Fig. 2.14.

Being a function of wavelength and temperature, the refractive index that has to be considered in the analysis might change in practice, depending on the conditions under which the laser is operated. Figure 2.19 shows how the exiting angle changes in the case of an expanding prism (Fig. 2.13) if there is a deviation $\Delta n = n - n_{\mathrm{opt}}$ from the optimum refractive index. The apex angle is considered to be a fixed value. The same holds for the angle of incidence. Again, Eq. (2.32) can be used for a simple analysis. Figure 2.20 shows the same dependence for a compressive prism (Fig. 2.14).




Figure 2.21 Arrangement of two prisms to increase dispersion.

Figure 2.22 Arrangement of two prisms to decrease dispersion.

in optical spectroscopy to increase dispersion, while the second case is of impor-

tance here. Arrangements of two prisms in which the second case is realized are

common in devices for optical recording (Okuda et al., 1995) and also have been

used in dye laser systems (Niefer and Atkinson, 1988).

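Relation of incident and exit angle in the case of a two-prism arrangement. In view of the result for a single prism, the following equations can be written immediately:

$$\alpha_7 = \arcsin\left\{ n_{\mathrm{II}} \sin\left[ \alpha_{\mathrm{II}} - \arcsin\left( \frac{\sin\alpha_4}{n_{\mathrm{II}}} \right) \right] \right\}, \tag{2.60}$$

$$\alpha_3 = \arcsin\left\{ n_{\mathrm{I}} \sin\left[ \alpha_{\mathrm{I}} - \arcsin\left( \frac{\sin\alpha_0}{n_{\mathrm{I}}} \right) \right] \right\}, \tag{2.61}$$

where the angles are designated as depicted in Fig. 2.22, $\alpha_{\mathrm{II}}$ is the apex angle of the second prism, and $n_{\mathrm{II}}$ is its index of refraction, while the variables $\alpha_{\mathrm{I}}$ and $n_{\mathrm{I}}$ describe the first prism. The relative orientation of the two prisms enters the equation for $\alpha_4$. It is expressed by the variable $\beta = \alpha_3 - \alpha_4$ as

$$\alpha_4 = \arcsin\left\{ n_{\mathrm{I}} \sin\left[ \alpha_{\mathrm{I}} - \arcsin\left( \frac{\sin\alpha_0}{n_{\mathrm{I}}} \right) \right] \right\} - \beta. \tag{2.62}$$

If this expression is substituted into the equation for $\alpha_7$, the output angle of the two-prism arrangement is found as a function of the input angle, i.e.,

$$\alpha_7 = \arcsin\left\{ n_{\mathrm{II}} \sin\left[ \alpha_{\mathrm{II}} - \arcsin\left( \frac{1}{n_{\mathrm{II}}} \sin\left( \arcsin\left\{ n_{\mathrm{I}} \sin\left[ \alpha_{\mathrm{I}} - \arcsin\left( \frac{\sin\alpha_0}{n_{\mathrm{I}}} \right) \right] \right\} - \beta \right) \right) \right] \right\}. \tag{2.63}$$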
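The nested expression for the output angle is easier to follow when written as two applications of the single-prism relation joined by the relative-orientation variable. The sketch below is an illustration under stated assumptions: the function names and the variable name `beta` (the difference between the exit angle of the first prism and the incidence angle on the second) are chosen here, and the sign convention is purely algebraic, since the figure that fixes the geometry is not reproduced.

```python
import math

def prism_exit_angle(alpha_in, apex, n):
    """Single-prism relation, Eq. (2.32); all angles in radians."""
    inner = apex - math.asin(math.sin(alpha_in) / n)
    return math.asin(n * math.sin(inner))

def two_prism_exit_angle(alpha0, apex1, n1, apex2, n2, beta):
    """Output angle of two prisms in series, following Eqs. (2.60)-(2.63)."""
    alpha3 = prism_exit_angle(alpha0, apex1, n1)   # Eq. (2.61)
    alpha4 = alpha3 - beta                         # Eq. (2.62)
    return prism_exit_angle(alpha4, apex2, n2)     # Eq. (2.60)

# Example: two identical Brewster expanding prisms (BK 7 index from the text).
n = 1.53024
brewster = math.atan(n)
apex = math.atan(1.0 / n)
beta = -brewster   # first prism exits at 0, so the second is met at Brewster's angle

alpha7 = two_prism_exit_angle(brewster, apex, n, apex, n, beta)
print(math.degrees(alpha7))
```

With this choice of orientation, each prism operates at its design point and the beam again leaves the second prism perpendicular to its rear surface.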

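The transfer function for the beamwidth altered by two prisms. Exploiting the fact stated earlier that the beamwidth transfer function can be composed as the product of the transfer functions of the refracting surfaces involved, the transfer function for the two prisms can now be written directly as

$$\begin{aligned}
\frac{w_7}{w_0} ={}& \frac{\cos(\arcsin\{n_{\mathrm{II}}\sin[\alpha_{\mathrm{II}} - \arcsin(\sin\alpha_4/n_{\mathrm{II}})]\})}{\cos[\alpha_{\mathrm{II}} - \arcsin(\sin\alpha_4/n_{\mathrm{II}})]} \cdot \frac{\cos[\arcsin(\sin\alpha_4/n_{\mathrm{II}})]}{\cos\alpha_4} \\
&\times \frac{\cos(\arcsin\{n_{\mathrm{I}}\sin[\alpha_{\mathrm{I}} - \arcsin(\sin\alpha_0/n_{\mathrm{I}})]\})}{\cos[\alpha_{\mathrm{I}} - \arcsin(\sin\alpha_0/n_{\mathrm{I}})]} \cdot \frac{\cos[\arcsin(\sin\alpha_0/n_{\mathrm{I}})]}{\cos\alpha_0}.
\end{aligned} \tag{2.64}$$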
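Because the transfer function is a product of four surface factors, it can be evaluated by chaining two single-prism calculations. The following sketch is illustrative only; the function names and the variable `beta` for the relative orientation of the prisms (exit angle of the first prism minus incidence angle on the second) are assumptions.

```python
import math

def surface_factor(angle_in, angle_out):
    """Beamwidth ratio at one refracting surface, Eq. (2.39)."""
    return math.cos(angle_out) / math.cos(angle_in)

def prism_factors(alpha_in, apex, n):
    """Return (exit angle, beamwidth ratio) for one prism; radians."""
    a1 = math.asin(math.sin(alpha_in) / n)     # entrance surface
    a2 = apex - a1
    a_out = math.asin(n * math.sin(a2))        # exit surface
    return a_out, surface_factor(alpha_in, a1) * surface_factor(a2, a_out)

def two_prism_beamwidth_ratio(alpha0, apex1, n1, apex2, n2, beta):
    """w7/w0 as the product of the four surface factors of Eq. (2.64)."""
    alpha3, ratio1 = prism_factors(alpha0, apex1, n1)
    alpha4 = alpha3 - beta                     # orientation of the second prism
    _, ratio2 = prism_factors(alpha4, apex2, n2)
    return ratio2 * ratio1

# Example: two identical Brewster expanding prisms (BK 7 index from the text).
n = 1.53024
brewster = math.atan(n)
apex = math.atan(1.0 / n)
ratio = two_prism_beamwidth_ratio(brewster, apex, n, apex, n, beta=-brewster)
print(ratio, n * n)
```

For two identical expanding prisms, each used at its design point, the expansion factors multiply, so the total ratio is the square of the refractive index.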

The link bet
