
Principles of modern technology

Adrian C. Melissinos
Department of Physics
University of Rochester

The right of the University of Cambridge to print and sell all manner of books was granted by Henry VIII in 1534. The University has printed and published continuously since 1584.

CAMBRIDGE UNIVERSITY PRESS
Cambridge   New York   Port Chester   Melbourne   Sydney

Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia

© Cambridge University Press 1990

First published 1990

British Library cataloguing in publication data
Melissinos, Adrian C.
Principles of modern technology.
1. Technology
I. Title
600

Library of Congress cataloguing in publication data available

ISBN 0 521 35249 5 hard covers
ISBN 0 521 38965 8 paperback

Transferred to digital printing 2003


For John and Andrew

Contents

Part A Microelectronics and computers
1 The transistor
    1.1 Intrinsic semiconductors
    1.2 Doped semiconductors
    1.3 Charge transport in solids
    1.4 The p-n junction
    1.5 The junction transistor
    1.6 Manufacture of transistors; the planar geometry
    1.7 The field effect transistor (FET)
    1.8 Transistor-transistor-logic
    1.9 Logic gates
    Exercises
2 Digital electronics
    2.1 Elements of Boolean algebra
    2.2 Arithmetic and logic operations
    2.3 Decoders and multiplexers
    2.4 Flip-flops
    2.5 Registers and counters
    2.6 Data representation and coding
    2.7 Computer memories
    2.8 Magnetic storage
    2.9 The compact disk
    2.10 Computer architecture
    Exercises

Part B Communications
3 The transmission of signals
    3.1 The electromagnetic spectrum and the nature of the signals
    3.2 Fourier decomposition
    3.3 Carrier modulation
    3.4 Digital communications
    3.5 Noise in communications channels
    3.6 Sources of noise
    3.7 Elements of communication theory
    3.8 Channel capacity
    Exercises
4 Generation and propagation of electromagnetic waves
    4.1 Maxwell's equations
    4.2 Radiation and antennas
    4.3 Directional antennas
    4.4 Reflection, refraction and absorption
    4.5 The ionosphere
    4.6 Satellite communications
    4.7 Waveguides and transmission lines
    4.8 Fiber optics
    4.9 The laser
    4.10 Properties of laser radiation
    Exercises

Part C Nuclear energy
5 Sources of energy
    5.1 Introduction
    5.2 The terrestrial energy balance
    5.3 The atomic nucleus
    5.4 Nuclear binding, fission and fusion
    5.5 Nuclear reactors
    5.6 Radioactivity
    5.7 Controlled fusion
    5.8 Solar energy
    Exercises
6 Nuclear weapons
    6.1 Fission and fusion explosives
    6.2 The effects of nuclear weapons
    6.3 Delivery systems and nuclear arsenals
    6.4 Reconnaissance satellites
    6.5 Proposed defense systems
    6.6 Arms limitation treaties
    Exercises

Part D Space travel
7 Airplane and rocket flight
    7.1 Fluid flow and dynamic lift
    7.2 Airplane flight
    7.3 The effects of viscosity
    7.4 Supersonic flight
    7.5 Propulsion dynamics
    7.6 Rocket engines
    7.7 Multistage rockets
    7.8 The NASA shuttle
    Exercises
8 To the stars
    8.1 The solar system
    8.2 Motion in a central field of force
    8.3 Transfer orbits
    8.4 Encounters
    8.5 The Voyager-2 grand tour of the planets
    8.6 Interstellar travel
    8.7 Inertial guidance
    Exercises

Appendices
    1. The Fourier transform
    2. The power spectrum
    3. The equations of fluid mechanics
    4. The speed of sound

References and suggestions for further reading

Index

Preface

Ours is the age of technology, rivaling the industrial revolution in its impact on the course of civilization. Whether the great achievements of technology, and our dependence on them, have improved our lot, or lead inexorably to a 'strange new world' we shall not debate here. Instead we focus on the physical laws that make technology possible in the first place. Our aim is to understand and explain modern technology, as distinct from describing it.

Even when the principles underlying a technical process or device are well understood, a great deal of engineering effort and a long manufacturing infrastructure are needed to translate them into practice. In turn, the technical skills that are developed lead to new possibilities in basic research and to new applications. For instance, the laser could have been easily built at the turn of the century; yet it was a long road starting with the development of radar and followed by the invention of the maser that led to the proposal for the laser. The use of computers in so many manufacturing areas and research fields is another example of the interplay between technology and basic science.

Because of the complexity of modern devices and of the rapid advances in all scientific fields, the need for specialization is acute. Thus, often, science students are only vaguely aware of the applications of the principles they have learned, whereas engineering students are too involved to appreciate the power of the physical law. Such disparity may carry over even into one's professional career, as modern life leaves little time for reflection. However, unless we understand the principles on which our technological developments are based, we remain simple users and are destined to be enslaved by our own inventions. It is therefore important that the common citizen be familiar with the principles of modern technology and we believe that this is possible with only modest scientific background but sufficient determination.

This book evolved from a course on the 'Physics of Modern Technology' that has been taught at the University of Rochester over the past 15 years. It was initiated by the late Elliott Montroll, a distinguished theoretical physicist of broad interests and intense intellectual curiosity. Elliott was a fascinating raconteur of scientific history and a wonderful colleague and friend. The course is directed to juniors and seniors in science or engineering, the only prerequisite being an introductory sequence in physics. Fortunately the subject is so wide and rich that it remains interesting and valuable to students of differing background. It is hoped that this will also be the case for the more general readership of this book.

In discussing a subject as vast as modern technology one can easily become descriptive and even encyclopedic. This is not our aim and the presentation is quantitative but without complicated mathematical expressions. In particular we have tried to convey the simplicity and beauty of the physical laws, and the excitement of discovering that they do indeed work in practice. The choice of subjects is far from exhaustive and to some extent arbitrary. Nevertheless, the explosion in microelectronics, the development of a nuclear capability and the realization of space travel are undoubtedly among the most prominent technological achievements of the late twentieth century.

The material is divided into four parts, each consisting of two chapters. We begin with microelectronics and devote the first chapter to the properties of semiconductors, the realization of the transistor and to very large scale integration. In the second chapter we build on the previous material; we start with simple circuits and end by discussing the operation of a model computer. We also examine information storage devices such as magnetic tape and the optical disk. The second part deals with communications and with the fascinating scientific discoveries that made modern communications possible and so pervasive. For instance, in Chapter 3 we discuss the harmonic analysis of arbitrary waveforms and their digital representation; we also consider noise in communication channels and elements of information theory. The transmission of signals and the propagation of electromagnetic radiation over a broad spectrum are discussed in Chapter 4. This covers radio waves, microwaves and laser radiation, as well as propagation in optical fibers.

The third part of the book is devoted to nuclear energy. In the first of the two chapters we consider the production of energy and global energy needs. We discuss nuclear fission and fusion and also examine the use of solar energy. The second chapter is devoted to nuclear weapons and to the dilemma facing the world as a consequence of the existing nuclear arsenals. We made an effort not to editorialize and to only present quantitative information; this includes defense systems and a review of arms limitation treaties. Since this field is evolving rapidly the reader should update his information as saner policies are adopted worldwide.

It was originally planned to consider and discuss all modes of transportation. However, to keep the book in proportion only airplane and rocket propulsion are discussed in the final part of this book. These are, of course, the modes which enable us to travel beyond our planet and where the future lies. In Chapter 7 we concentrate on the fundamental principles of flight in the atmosphere, including elements of fluid dynamics, and end with a description of the NASA shuttle. The next, and last, chapter is devoted to extra-terrestrial travel. We discuss the dynamics of interplanetary missions and in particular the journey of the Voyager-2 craft. We conclude by assessing the possibilities of travel beyond the solar system and leave the reader to ponder where the future will lead us.

The book should be useful as a general reference but also as a text. Typically, five chapters can be covered in one academic term. These can be chosen with some flexibility since the major parts are fairly independent. We have also included a few simple exercises at the end of each chapter, for the entertainment of the reader. Some material of more mathematical nature is presented in the appendices; it refers to Chapters 3 and 7 and is not needed for following the main text. Finally, a word of caution about units: they are mixed. To the extent possible we have tried to use MKS units, but in many cases, replacing the traditional engineering units is counterproductive. To provide references to original work was neither practical nor possible. We have however included a list of books and review articles for further reading; many of these were used as sources for the course and others expand on the material presented.

I am indebted to many colleagues for their help and suggestions for the course and for their comments on various stages of the manuscript. In particular Drs R. Forward, S. Kerridge, M. Shlesinger and C. S. Wu provided me with original and published material for the course. Drs M. Bocko, S. Craxton, R. Forward, E. Jones, L. Mandel, M. Migliuolo, B. Moskowitz, R. Polling, J. Rogers and Y. Semertzidis read and commented on parts of the draft and their input was invaluable. Ms Constance Jones and Ms Judith Mack typed the many versions of the manuscript skillfully, efficiently and cheerfully and I wish to thank them sincerely. The artwork is due to Mr Roman Kuril. Finally I thank the students who took the course over the years and provided the enthusiasm and the rationale for completing the manuscript. Last but not least, thanks to my wife Joyce for her support and understanding of 'book writing'.

On a more personal note, the study of the physics of modern technology engenders a sense of admiration, but also of affection, for the men and women who had the vision, the perseverance and the good fortune to understand the physical law and use it correctly in their intriguing applications. What a wonderful adventure it must have been, an adventure which still goes on and in which, hopefully, the reader will participate in his own way.

A. C. Melissinos
Rochester, New York

Acknowledgements

We wish to thank the following organizations for permission to use their original material. Addison-Wesley Inc. for use of Fig. 1.32; J. Wiley and Sons for use of Figs. 5.6, 6.2, 6.4(a), and 7.21; American Association of Physics Teachers and Professor T. D. Rossing for use of Fig. 2.32; McGraw-Hill Inc. for use of Figs. 6.4(b), 7.10, and 7.16; Gordon and Breach Publishers for use of Fig. 3.20; The Dallas Times Herald for use of Fig. 6.6; Itek Corporation for use of Fig. 6.12(a); U. S. Department of the Air Force for use of Fig. 6.12(b); the American Institute of Physics for use of Table 6.1; the U. S. National Aeronautics and Space Administration for use of Figs. 7.25 and 7.26; Bantam-Doubleday-Dell for use of Fig. 7.14; Jet Propulsion Laboratory of the California Institute of Technology for use of Figs. 8.13, 8.14 and 8.15.

PART A

MICROELECTRONICS AND COMPUTERS

Microelectronics are found today at the heart of almost every device or machine. Be it an automobile, a cash register or just a digital watch, it is controlled by electronic circuits built on small semiconductor chips. While the complexity of the functions performed by these devices has increased by several orders of magnitude their size is continuously decreasing. It is this remarkable achievement that has made possible the development of powerful processors and computers and has even raised the possibility of achieving artificial intelligence.

The basic building block of all microcircuits is the transistor, invented in 1948 by John Bardeen, Walter Brattain and William Shockley at Bell Telephone Laboratories. The first chapter is devoted to a discussion of the transistor beginning with a brief review of the structure of semiconductors and of the motion of charge carriers across junctions. We discuss the p-n junction and bipolar as well as field-effect transistors. We then consider modern techniques used in very large scale integration (VLSI) of circuit elements as exemplified by Metal-Oxide-Silicon (MOS) devices.

In the second chapter we take a broader look at how a processor, or computer, is organized and how it can be built out of individual logical circuit elements or gates. We review binary algebra and consider elementary circuits and the representation of data and of instructions; we also discuss the principles of mass data storage on magnetic devices. Finally we examine the architecture of a typical computer and analyze the sequence of operations in executing a particular task.

1 THE TRANSISTOR

1.1 Intrinsic semiconductors

It is well known that certain materials conduct electricity with little resistance whereas others are good insulators. There also exist materials whose resistivity is between that of good conductors and insulators, and is strongly dependent on temperature; these materials are called semiconductors. Silicon (Si), germanium (Ge) and compounds such as gallium arsenide (GaAs) are semiconductors, silicon being by far the most widely used material. Solids, in general, are crystalline and their electrical properties are determined by the atomic structure of the overall crystal. This can be understood by analogy to the energy levels of a free atom.

A free atom, for instance the hydrogen atom, exhibits discrete energy levels which can be exactly calculated. A schematic representation of such an energy diagram is shown in Fig. 1.1(a). If two hydrogen atoms are coupled, as in the hydrogen molecule, the number of energy levels doubles as shown in part (b) of the figure. If the number of atoms that are coupled to each other is very large - as is the case for a crystal - the energy levels coalesce into energy bands as in Fig. 1.1(c). The electrons in the crystal can only have energies lying in these bands.

When an atom is not excited the electrons occupy the lowest possible energy levels. In accordance with the Pauli principle only two electrons (one with spin projection up and the other down) can be found at any one particular energy level. Thus the levels - or states - become progressively filled from the bottom. The same holds true in the crystal. The electrons progressively fill the energy levels within a given band, and only when the band is completely filled do they begin to populate the next band. The energies of the electrons are typically a few electron-volts (eV).

In an insulator the occupied energy bands are completely filled. As a result the electrons cannot move through the crystal. This is because motion implies slightly increased energy for the electrons but the next available energy level is in the conduction band which is far removed from the valence band. Thus the electron must acquire enough energy to overcome the energy gap between the valence band and the conduction band as shown in Fig. 1.2(a). In a conductor the valence and conduction bands overlap and the outermost electron of the atom is free to move through the lattice (Fig. 1.2(c)). In a semiconductor the energy gap is much smaller than for insulators and due to thermal motion electrons have a finite probability of finding themselves in the conduction band. Furthermore when an electron makes a transition from the valence to the conduction band it leaves a vacancy in the valence band. This vacancy can move through the lattice (just as a bubble 'moves' through a liquid) and contribute to the flow of current; we speak of transport of electric charge by the motion of holes (Fig. 1.2(b)).

Fig. 1.1. Energy levels of an atomic system: (a) single atom, (b) two coupled atoms, (c) in a many-atom system the energy levels coalesce into energy 'bands'.

Fig. 1.2. Energy band structure (conduction band above, valence band below) for: (a) an insulator such as SiO2, (b) a semiconductor such as Ge, (c) a good conductor such as Al.

To obtain a feeling for the occupancy of the energy levels in a solid we can consider the following simple model for a conductor. We assume that one electron in each atom is so loosely bound that it is practically free inside the crystal. This is the case for copper which has $Z = 29$, and thus every atom has 29 electrons. Of these, 28 electrons completely fill the $n = 1$ (2 electrons), $n = 2$ (8 electrons) and $n = 3$ (18 electrons) shells, leaving one electron outside the closed shells. Such an electron is loosely bound to the atom and in fact it occupies a level in the conduction band; thus it can move freely through the crystal. It is simple to calculate the density of free electrons in copper. We have $Z = 29$, $A \approx 63$, $\rho = 8.9$ g/cm$^3$ and assume one free electron per atom; then

$$n_e = \frac{N_0}{A}\left(\frac{\text{atoms}}{\text{g}}\right) \times \rho\left(\frac{\text{g}}{\text{cm}^3}\right) = \frac{6 \times 10^{23}}{63} \times 8.9 = 8.5 \times 10^{22}\ \text{cm}^{-3} \tag{1.1}$$

where $N_0 = 6 \times 10^{23}$ is Avogadro's number.

The free electrons in a metal can be described approximately as particles confined within a cubic box but with no other forces acting on them. This situation is depicted for one dimension in Fig. 1.3 and we speak of a 'potential well' of length $2L$. In this case the solution of Schrödinger's equation leads to wave functions of the form

$$\psi_n(x) = \frac{1}{\sqrt{L}}\cos(k_n x) \quad \text{or} \quad \frac{1}{\sqrt{L}}\sin(k_n x)$$

where the wave number $k_n$ can take only the discrete values

$$k_n = \frac{n\pi}{2L}, \qquad n = 1, 2, 3, \ldots$$

so as to satisfy the boundary conditions $\psi(-L) = \psi(L) = 0$. Thus the allowed energies of the particles in the potential well are quantized and given by

$$E_n = \frac{p_n^2}{2m} = \frac{\hbar^2 k_n^2}{2m} = \frac{\hbar^2 \pi^2}{8mL^2}\, n^2 \tag{1.3}$$

Fig. 1.3. The wave function for the lowest and next to lowest energy states of a particle confined to the region $-L < x < L$ by an infinitely high potential.

If we generalize to three dimensions, we must use three quantum numbers, $n_x$, $n_y$ and $n_z$ and the energy is given by

$$E_n = \frac{\hbar^2 \pi^2}{8mL^2}\,(n_x^2 + n_y^2 + n_z^2), \qquad n_x, n_y, n_z = 1, 2, 3, \ldots \tag{1.3'}$$

Every particular combination of $n_x$, $n_y$, $n_z$ represents a different energy level and only two electrons can occupy it. Note that several energy levels (different combinations of $n_x$, $n_y$, $n_z$) can have the same energy; we say that these levels are degenerate.
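As a small illustration of degeneracy, the following Python sketch enumerates the combinations $(n_x, n_y, n_z)$ and groups them by the sum $n_x^2 + n_y^2 + n_z^2$, which according to Eq. (1.3') fixes the energy in units of $\hbar^2\pi^2/8mL^2$. The function name `levels_by_energy` and the cutoff `n_max` are our own choices for this illustration and do not come from the text.

```python
from collections import defaultdict
from itertools import product


def levels_by_energy(n_max):
    """Group the states (nx, ny, nz), 1 <= n <= n_max, by nx^2 + ny^2 + nz^2.

    By Eq. (1.3') this sum is the energy in units of hbar^2 pi^2 / (8 m L^2).
    """
    groups = defaultdict(list)
    for nx, ny, nz in product(range(1, n_max + 1), repeat=3):
        groups[nx * nx + ny * ny + nz * nz].append((nx, ny, nz))
    return dict(groups)


levels = levels_by_energy(3)
for energy in sorted(levels)[:6]:
    states = levels[energy]
    # Each combination can hold two electrons (spin up and spin down).
    print(energy, len(states), 2 * len(states), states)
```

The lowest level (1, 1, 1) is non-degenerate, whereas the sum 6 is reached by three combinations and the sum 14 by six.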

We can use Eq. (1.3') to calculate the energy of the highest filled level given the density of free electrons $n_e$ in the crystal. This level is called the Fermi level and its energy is the Fermi energy for the system. It is given by

$$E_F = \frac{\hbar^2}{2m}\,(3\pi^2 n_e)^{2/3} \tag{1.4}$$

where $m$ is the mass of the electron. For Cu we use the result of Eq. (1.1) and

$$\hbar c = 2 \times 10^{-5}\ \text{eV-cm}, \qquad mc^2 = 0.5 \times 10^{6}\ \text{eV}$$

to find $E_F = 7.1$ eV, in good agreement with observation. To see how Eq. (1.4) is derived we must count the number of $(n_x, n_y, n_z)$ combinations available when the maximal value of $(n_x^2 + n_y^2 + n_z^2)$ is specified. In Fig. 1.4 every combination of $(n_x, n_y, n_z)$ is indicated by a dot in 3-dimensional space. When $n_x$, $n_y$, $n_z$ are large, a given value of $(n_x^2 + n_y^2 + n_z^2)^{1/2} = \text{constant}$ defines the surface of a sphere in this space; all levels on the surface of the sphere have the same energy. The number of levels inside the sphere equals its volume, because the dots are spaced one unit apart from one another. Since $n_x$, $n_y$, $n_z$ must be positive the number of combinations $N_c$ is given by the volume of one octant

$$N_c = \frac{1}{8}\,\frac{4\pi}{3}\,(n_x^2 + n_y^2 + n_z^2)^{3/2} = \frac{\pi}{6}\,(n_x^2 + n_y^2 + n_z^2)^{3/2}$$

Because of the Pauli principle the number of electrons occupying the $N_c$ levels is $N_e = 2N_c$. Thus the energy of the highest occupied level is

$$E_F = \frac{\hbar^2 \pi^2}{8mL^2}\,(n_x^2 + n_y^2 + n_z^2) = \frac{\hbar^2 \pi^2}{8mL^2}\left(\frac{3N_e}{\pi}\right)^{2/3} = \frac{\hbar^2}{2m}\left[3\pi^2\,\frac{N_e}{8L^3}\right]^{2/3} \tag{1.4'}$$

Note that the $N_e$ electrons are confined in a volume of size $V = (2L)^3$ and therefore in Eq. (1.4') $(N_e/8L^3) = n_e$ is the free electron density, establishing the result of Eq. (1.4).
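The numbers quoted for copper can be checked with a few lines of Python. The sketch below evaluates Eq. (1.1) and Eq. (1.4), writing the latter as $E_F = (\hbar c)^2 (3\pi^2 n_e)^{2/3} / 2mc^2$. It uses slightly more precise constants than the rounded values in the text ($\hbar c = 1.97 \times 10^{-5}$ eV-cm, $mc^2 = 0.511 \times 10^6$ eV); the function names are hypothetical and chosen only for this example.

```python
import math

N_AVOGADRO = 6.0e23   # atoms per mole, rounded as in the text
HBAR_C = 1.97e-5      # eV*cm (the text rounds this to 2e-5)
M_E_C2 = 0.511e6      # eV    (the text rounds this to 0.5e6)


def free_electron_density(rho_g_cm3, atomic_mass, electrons_per_atom=1):
    """Eq. (1.1): free electrons per cm^3, given density and atomic mass."""
    return N_AVOGADRO / atomic_mass * rho_g_cm3 * electrons_per_atom


def fermi_energy_ev(n_e_cm3):
    """Eq. (1.4): E_F = (hbar c)^2 (3 pi^2 n_e)^(2/3) / (2 m c^2), in eV."""
    k_f = (3.0 * math.pi ** 2 * n_e_cm3) ** (1.0 / 3.0)  # wave number, cm^-1
    return (HBAR_C * k_f) ** 2 / (2.0 * M_E_C2)


n_cu = free_electron_density(rho_g_cm3=8.9, atomic_mass=63)
e_f_cu = fermi_energy_ev(n_cu)
print(f"n_e(Cu) = {n_cu:.2e} cm^-3, E_F(Cu) = {e_f_cu:.1f} eV")
```

With these inputs the density comes out close to $8.5 \times 10^{22}$ cm$^{-3}$ and the Fermi energy close to 7 eV; the rounded constants of the text give a value a few per cent higher.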

Fig. 1.4. Counting the number of states labeled by the integers $n_x$, $n_y$ and $n_z$ such that $(n_x^2 + n_y^2 + n_z^2) \le R^2$. Each state is represented by a dot.

Let us now return to the free electron model. In the absence of excitations, that is at very low temperature, only the levels below the Fermi energy, $E_F$, will be occupied. Let $f(E)$ indicate the probability that a level at energy $E$ is occupied; clearly $f(E)$ is bounded between 0 and 1. If we plot $f(E)$ as a function of $E$ for $T = 0$ it must have the square form indicated by curve A in Fig. 1.5. As the temperature increases some of the levels above $E_F$ will become occasionally occupied, and correspondingly some levels below $E_F$ will be empty. The probability of occupancy, $f(E)$, for a finite temperature $T_1 \neq 0$ is indicated by curve B in Fig. 1.5. The function $f(E)$ is known as the Fermi function and is given by

$$f(E) = \frac{1}{e^{(E - E_F)/kT} + 1} \tag{1.5}$$

In the limit $T \to 0$ Eq. (1.5) reduces to $f(E) = 1$ if $E < E_F$ or to $f(E) = 0$ if $E > E_F$, in agreement with curve A of Fig. 1.5.

Fig. 1.5. The Fermi distribution function for zero temperature (A) and for finite temperature (B); $E_F$ is the Fermi energy.

For finite $T$, consider an energy level $E_k$ lying above $E_F$; we define $\epsilon = (E_k - E_F)$. As long as $\epsilon \gtrsim 3kT$ Eq. (1.5) can be approximated by

$$f(\epsilon) \approx e^{-\epsilon/kT} \tag{1.6}$$

For a level $E_j$ lying below $E_F$ we define $\epsilon' = (E_F - E_j)$. We are now interested in the probability that the level $E_j$ is empty, namely in $f'(\epsilon') = 1 - f(\epsilon')$. As long as $\epsilon' \gtrsim 3kT$ a valid approximation to Eq. (1.5) is

$$f'(\epsilon') = 1 - \frac{1}{e^{-\epsilon'/kT} + 1} \approx e^{-\epsilon'/kT} \tag{1.6'}$$

Eqs. (1.6) show that at finite temperature there are as many occupied states above the Fermi level as there are empty states below it. This result can serve as a rigorous definition of the Fermi level. Finally we note that the expansions of Eqs. (1.6, 6') coincide with the classical Boltzmann distribution.
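The symmetry about the Fermi level and the quality of the Boltzmann approximation can be checked numerically. The following sketch assumes the standard form of the Fermi function, $f(E) = 1/(e^{(E - E_F)/kT} + 1)$, and measures energies from $E_F$; the helper name `fermi` and the sample values of $\epsilon$ are our own.

```python
import math


def fermi(E, E_F, kT):
    """Fermi function: probability that a level at energy E is occupied."""
    return 1.0 / (math.exp((E - E_F) / kT) + 1.0)


kT = 0.026   # eV, roughly room temperature
E_F = 0.0    # energies are measured from the Fermi level

rows = []
for x in (1, 3, 5):                                   # epsilon in units of kT
    eps = x * kT
    occupied_above = fermi(E_F + eps, E_F, kT)        # f(eps)
    empty_below = 1.0 - fermi(E_F - eps, E_F, kT)     # f'(eps')
    boltzmann = math.exp(-eps / kT)                   # Eqs. (1.6) and (1.6')
    rows.append((x, occupied_above, empty_below, boltzmann))
    print(f"eps = {x} kT: f = {occupied_above:.4f}, "
          f"1 - f = {empty_below:.4f}, exp(-eps/kT) = {boltzmann:.4f}")
```

The first two columns agree for every $\epsilon$, which is the symmetry stated above. The exponential overestimates $f$ by the factor $1 + e^{-\epsilon/kT}$, that is by about 5% at $\epsilon = 3kT$ and by less than 1% at $\epsilon = 5kT$.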

To get a better feeling for the implications of the Fermi function on the distribution of carriers in a semiconductor we first calculate $kT$ at room temperature. Boltzmann's constant is

$$k = 1.38 \times 10^{-23}\ \text{J/K}$$

and if we take $T = 300$ K,

$$kT = 4.1 \times 10^{-21}\ \text{J} = 0.026\ \text{eV}$$

The energy gap for an insulator is of order $\Delta E \sim 5$ eV whereas for semiconductors it is $E_g \sim 1$ eV. Thus for semiconductors at room temperature a small fraction of the electrons in the valence band can be thermally excited into the conduction band.
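To see how strongly the size of the gap matters, one can compare the Boltzmann-type factor $e^{-E_{gap}/2kT}$ for a 1 eV and a 5 eV gap. The sketch below is only an order-of-magnitude illustration: it assumes that the Fermi level lies in the middle of the gap, so that the relevant excitation energy is half the gap, and the function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.38e-23    # J/K
EV_IN_JOULE = 1.602e-19   # joules per electron-volt


def thermal_energy_ev(temperature_k):
    """kT expressed in eV."""
    return K_BOLTZMANN * temperature_k / EV_IN_JOULE


def excitation_factor(gap_ev, temperature_k=300.0):
    """exp(-E_gap / 2kT); assumes the Fermi level sits at mid-gap."""
    return math.exp(-gap_ev / (2.0 * thermal_energy_ev(temperature_k)))


kT_room = thermal_energy_ev(300.0)
print(f"kT at 300 K = {kT_room:.4f} eV")
for gap in (1.0, 5.0):
    print(f"gap = {gap} eV: factor = {excitation_factor(gap):.1e}")
```

For a 1 eV gap the factor is of order $10^{-9}$, small but not negligible when multiplied by the very large number of available states; for a 5 eV gap it is smaller by more than thirty orders of magnitude, which is why such a material behaves as an insulator.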

For a pure semiconductor we designate the number density of (intrinsic) electrons in the conduction band by $n_i$. For an intrinsic semiconductor the density of holes will also equal $n_i$ and therefore the Fermi level will lie in the middle of the energy gap as shown in Fig. 1.2(b). The intrinsic carrier density is then given by the probability of occupancy $f(\epsilon)$ multiplied by $N_s$, the number of available states per unit volume. Using Eq. (1.6) we find

$$n_i = N_s\, e^{-E_g/2kT} \tag{1.7}$$

($N_s$ is an effective density of states near the band edge and for silicon it is of order $\sim 10^{19}$ cm$^{-3}$.) In general $n_i$ is much smaller than the free electron density in a good conductor. For instance, for silicon where $E_g = 1.1$ eV, at room temperature $n_i \sim 10^{10}$ cm$^{-3}$, whereas for germanium ($E_g = 0.7$ eV), $n_i \sim 10^{13}$ cm$^{-3}$. This should be compared to the free electron density in copper which we calculated to be $n_e \sim 10^{23}$ cm$^{-3}$. Of course the crystal as a whole remains electrically neutral, but if an electric field is applied the carriers will be set in motion and this will lead to the transport of charge. It is evident from Eq. (1.7) that the conductivity of a pure semiconductor will be highly temperature dependent.
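The orders of magnitude quoted above follow directly from Eq. (1.7). The sketch below evaluates it for silicon and germanium at 300 K and for silicon at 350 K. It assumes, for simplicity, the same value $N_s = 10^{19}$ cm$^{-3}$ for both materials and takes $N_s$ to be independent of temperature; both are simplifications of ours, and the function name is hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617e-5   # eV/K


def intrinsic_density(gap_ev, temperature_k=300.0, n_states=1.0e19):
    """Eq. (1.7): n_i = N_s exp(-E_g / 2kT), in cm^-3.

    n_states (N_s) is assumed equal for all materials and independent of T.
    """
    kT = K_BOLTZMANN_EV * temperature_k
    return n_states * math.exp(-gap_ev / (2.0 * kT))


n_si = intrinsic_density(1.1)               # silicon, 300 K
n_ge = intrinsic_density(0.7)               # germanium, 300 K
n_si_hot = intrinsic_density(1.1, 350.0)    # silicon, 350 K

print(f"Si, 300 K: {n_si:.1e} cm^-3")
print(f"Ge, 300 K: {n_ge:.1e} cm^-3")
print(f"Si, 350 K: {n_si_hot:.1e} cm^-3 (x{n_si_hot / n_si:.0f})")
```

Under these assumptions a rise of only 50 K increases the intrinsic carrier density of silicon by more than an order of magnitude, which illustrates the strong temperature dependence noted in the text.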

1.2 Doped semiconductors

We saw in the previous section that the intrinsic carrier densities are quite small. Thus, unless a semiconductor is free of impurities to a high degree, the phenomena associated with the motion of the intrinsic carriers will not be manifest. On the other hand, by introducing a particular impurity into the semiconductor one can greatly enhance the number of carriers of one or of the other kind (i.e. of electrons or of holes). The great technical advances in selectively and accurately controlling the concentration of impurities in silicon have made possible the development of microelectronics. We speak of doped semiconductors.

To understand the effect of doping we note that the electronic structure of Si or Ge is such as to have four electrons outside closed shells; they are elements of chemical valence 4.

                      Filled shells                           Valence
Si   Z = 14   A ~ 28   (n = 1, n = 2): 10 electrons            (3s)^2 (3p)^2
Ge   Z = 32   A ~ 72   (n = 1, n = 2, n = 3): 28 electrons     (4s)^2 (4p)^2

If one examines the periodic table in the vicinity of Si and Ge, one finds the valence 3 elements boron (B, Z = 5), aluminum (Al, Z = 13) or indium (In, Z = 49). On the other side are valence 5 elements such as phosphorus (P, Z = 15), arsenic (As, Z = 33) or antimony (Sb, Z = 51). What will happen if impurities from these elements are introduced into pure silicon?

If valence 5 elements are introduced into the silicon lattice the extra electron will be loosely bound and can be easily excited into the conduction band. We say that these elements are donor impurities. If valence 3 elements are introduced they will have an affinity for attracting an electron from the lattice, creating a vacancy or hole in the valence band. We say that valence 3 elements are acceptor impurities.

Because of their different electronic structure as compared to that of the crystal lattice, the donor levels are situated just below the conduction band, as shown in Fig. 1.6(a). The acceptor levels are instead located slightly above the valence band (Fig. 1.6(b)). This energy difference is so small that at room temperature the impurity levels are almost completely ionized. Thus in the case of donor impurities the charge carriers are electrons and we speak of an n-type semiconductor whereas for acceptor impurities the carriers are the holes and we speak of a p-type semiconductor. Recall that the crystal is always electrically neutral and that the charge of the carriers is compensated by the (opposite) charge of the ionized impurity atoms; the ions, however, remain at fixed positions in the lattice.

In the presence of impurities the position of the Fermi level is determined by the concentration of the impurities and moves toward the conduction band if the dominant free carriers are electrons, toward the valence band if the dominant free carriers are holes. This is sketched in Fig. 1.6. The position of the donor level is indicated by the plus signs in Fig. 1.6(a), of the acceptor level by the minus signs in (b) of the figure.

As an example we consider an n-type semiconductor, and as usual, designate the (extrinsic) conduction electron density by $n$, and the hole density by $p$. Then according to Eqs. (1.6, 6')

$$n = N_c\, e^{-(E_c - E_F)/kT}, \qquad p = N_v\, e^{-(E_F - E_v)/kT} \tag{1.8}$$

Here we introduced a new concept, the effective density of states $N$. This is the number of available energy states per unit volume, the subscripts c and v referring to the conduction and valence band respectively. In general $N_c$ and $N_v$ need not be equal to one another.

Fig. 1.6. Energy band diagram for doped semiconductors. Dots represent electrons and open circles holes: (a) for an n-type semiconductor (note the position of the donor level), (b) for a p-type semiconductor (note the position of the acceptor level).

Similar relations hold for the intrinsic carriers except that we designate the corresponding Fermi level by $E_i$, and $n_i$ must equal $p_i$. Thus

$$n_i = N_c\, e^{-(E_c - E_i)/kT} = N_v\, e^{-(E_i - E_v)/kT} = p_i \tag{1.9}$$

This relationship can be solved to yield the exact value of $E_i$

$$E_i = \tfrac{1}{2}(E_c + E_v) + \tfrac{1}{2}kT \ln(N_v/N_c) \tag{1.9'}$$

as well as a convenient expression for $n_i$

$$n_i = (N_v N_c)^{1/2}\, e^{-(E_c - E_v)/2kT} \tag{1.9''}$$

Finally we can multiply the two Eqs. (1.8) with one another

$$np = N_c N_v\, e^{-(E_c - E_v)/kT}$$

and by comparing with Eq. (1.9'') obtain the very important relation

$$np = n_i^2 \tag{1.10}$$

The product of the electron and hole densities is independent of the doping and depends on the intrinsic properties of the semiconductor and the temperature. This is true under equilibrium conditions and provided the intrinsic carriers are not highly excited.
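That the product $np$ does not depend on the position of the Fermi level can be verified numerically from Eqs. (1.8) to (1.10). In the sketch below energies are measured from the valence band edge ($E_v = 0$, $E_c = 1.1$ eV, as for silicon). The effective densities of states $N_c = 2.8 \times 10^{19}$ cm$^{-3}$ and $N_v = 1.0 \times 10^{19}$ cm$^{-3}$ are illustrative values assumed for this example, and the function names are our own.

```python
import math

K_BOLTZMANN_EV = 8.617e-5   # eV/K


def carrier_densities(E_F, E_c, E_v, N_c, N_v, T=300.0):
    """Eqs. (1.8): electron and hole densities for a given Fermi level."""
    kT = K_BOLTZMANN_EV * T
    n = N_c * math.exp(-(E_c - E_F) / kT)
    p = N_v * math.exp(-(E_F - E_v) / kT)
    return n, p


def intrinsic_level(E_c, E_v, N_c, N_v, T=300.0):
    """Eq. (1.9'): position of the intrinsic Fermi level E_i."""
    kT = K_BOLTZMANN_EV * T
    return 0.5 * (E_c + E_v) + 0.5 * kT * math.log(N_v / N_c)


def intrinsic_density(E_c, E_v, N_c, N_v, T=300.0):
    """Eq. (1.9''): intrinsic carrier density n_i."""
    kT = K_BOLTZMANN_EV * T
    return math.sqrt(N_c * N_v) * math.exp(-(E_c - E_v) / (2.0 * kT))


# Assumed, illustrative parameters (energies in eV, densities in cm^-3).
E_V, E_C = 0.0, 1.1
N_C, N_V = 2.8e19, 1.0e19

E_i = intrinsic_level(E_C, E_V, N_C, N_V)
n_i = intrinsic_density(E_C, E_V, N_C, N_V)
print(f"E_i = {E_i:.3f} eV above the valence band, n_i = {n_i:.1e} cm^-3")

for E_F in (E_i, 0.7, 0.9):   # intrinsic, then increasingly n-type
    n, p = carrier_densities(E_F, E_C, E_V, N_C, N_V)
    print(f"E_F = {E_F:.3f} eV: n = {n:.2e}, p = {p:.2e}, np/n_i^2 = {n * p / n_i**2:.6f}")
```

As the Fermi level moves toward the conduction band, $n$ grows and $p$ shrinks by the same factor, so the ratio $np/n_i^2$ stays equal to one.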

In a doped semiconductor we have majority and minority carriers. For instance in an n-type semiconductor the electrons are the majority carriers and the holes the minority carriers; the opposite is of course true for p-type semiconductors. The density of ionized donors and acceptors is designated by $N_D$ and $N_A$ respectively. Then if the electrons are the majority carriers it holds

$$N_D \gg N_A \quad \text{and} \quad n \approx N_D$$

We can obtain more accurate relations by taking into account the electrical neutrality of the crystal. The charge density $\rho$ must equal zero and therefore

$$\rho = q(p - n + N_D - N_A) = 0 \tag{1.11}$$

Solving Eq. (1.11) for $p$ and inserting the result in Eq. (1.10) we obtain a quadratic equation for $n$, whose solution is

$$n = \tfrac{1}{2}(N_D - N_A) \pm \tfrac{1}{2}\left[(N_D - N_A)^2 + 4n_i^2\right]^{1/2}$$

For an n-type semiconductor, where $(N_D - N_A) > 0$, we must keep the solution with the positive radical. And if $(N_D - N_A) \gg n_i$ we have

$$n_n \approx (N_D - N_A) \approx N_D, \qquad p_n = \frac{n_i^2}{(N_D - N_A)} \approx \frac{n_i^2}{N_D}$$

where the subscript indicates the type of semiconductor. For instance, $n_n$ or $p_p$ are majority carrier concentrations. Similar relations are valid for p-type semiconductors.
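The solution of the neutrality condition together with $np = n_i^2$ is easily put into code. The sketch below keeps the positive radical for the majority carrier and obtains the minority carrier from Eq. (1.10), which avoids subtracting two nearly equal large numbers. The function name `equilibrium_carriers` and the sample doping of $10^{16}$ donors per cm$^3$ are assumptions made for the illustration.

```python
import math


def equilibrium_carriers(N_D, N_A, n_i):
    """Return (n, p) satisfying p - n + N_D - N_A = 0 and n * p = n_i**2.

    The majority carrier is taken from the root with the positive radical;
    the minority carrier then follows from n * p = n_i**2.
    """
    net = N_D - N_A
    root = math.sqrt(net * net + 4.0 * n_i * n_i)
    if net >= 0:                       # n-type (or intrinsic)
        n = 0.5 * (net + root)
        p = n_i * n_i / n
    else:                              # p-type: same algebra with roles exchanged
        p = 0.5 * (-net + root)
        n = n_i * n_i / p
    return n, p


n_i = 1.0e10                                          # cm^-3, silicon at room temperature
n_n, p_n = equilibrium_carriers(N_D=1.0e16, N_A=0.0, n_i=n_i)
print(f"n-type: n = {n_n:.3e} cm^-3, p = {p_n:.3e} cm^-3")

n_p, p_p = equilibrium_carriers(N_D=0.0, N_A=1.0e16, n_i=n_i)
print(f"p-type: n = {n_p:.3e} cm^-3, p = {p_p:.3e} cm^-3")
```

With $N_D = 10^{16}$ cm$^{-3}$ the electron density is essentially $N_D$, while the hole density drops to $n_i^2/N_D = 10^4$ cm$^{-3}$, twelve orders of magnitude below the majority carriers.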

Thus we see that by the controlled introduction of impurities we can create materials with a particular type of majority carriers. It is the junction of two or more such materials that makes possible the control and amplification of electric current by solid state devices.

1.3 Charge transport in solids

One is familiar with the notion of an electric current 'flowing' through a wire. What we are referring to is the transport of electric charges through the wire, and this in turn is a consequence of the motion of the carriers in the wire. In a good conductor the carriers are electrons, while in a gas discharge or in a liquid the carriers are both electrons and positive ions. In a semiconductor the carriers are electrons, or holes, or both, depending on the material. The current at a point $x$ along the conductor is defined as the amount of charge crossing that point in unit time, $I = \Delta Q/\Delta t$. It is more convenient to use the current density $J$ which is the amount of charge crossing unit area (normal to the direction of $J$) per unit time. By definition then

$$J = q n v_d \tag{1.13}$$

Here $q$ is the charge of the electron (carrier), $n$ is the carrier density, and $v_d$ is the drift velocity of the carriers.
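Eq. (1.13) can be turned around to estimate how slowly the carriers actually drift. The sketch below does this for a copper wire, using the free electron density found in Eq. (1.1) converted to m$^{-3}$; the current of 1 A and the cross section of 1 mm$^2$ are values assumed for the example, and the function name is hypothetical.

```python
ELECTRON_CHARGE = 1.602e-19   # coulombs


def drift_velocity(current_a, area_m2, carrier_density_m3, q=ELECTRON_CHARGE):
    """Solve J = q n v_d for v_d, with J = I / A; the result is in m/s."""
    current_density = current_a / area_m2
    return current_density / (q * carrier_density_m3)


# Copper: n = 8.5e22 cm^-3 = 8.5e28 m^-3; assumed 1 A through 1 mm^2.
v_d = drift_velocity(current_a=1.0, area_m2=1.0e-6, carrier_density_m3=8.5e28)
print(f"drift velocity = {v_d:.2e} m/s")
```

The result is less than a tenth of a millimetre per second, far smaller than the velocities of the random thermal motion of the carriers.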

The carriers in a solid are in continuous motion because of their thermal energy. This motion is completely random as the carriers scatter from the lattice and it does not contribute to net transport of charge. Thus to transport charge a drift velocity must be superimposed on the random motion. This can be achieved by applying an external electric field. The motion is then modified as shown schematically in Figs. 1.7(a), (b). (We have assumed that the carriers are electrons so their motion is opposite to the direction of $\mathcal{E}$.) Another cause for net carrier motion is the presence of density gradients. The carriers will then move so as to equalize the density and we speak of diffusion. Finally, carriers can be lost by recombination with impurities, or conversely, they can be created by photo-ionization or thermal excitation.

Fig. 1.7. Idealized motion of free electrons in a metal: (a) in the absence of an external electric field, (b) in the presence of an external electric field a net drift current is established.

We first examine the motion of carriers under the influence of an electric field $\mathcal{E}$. The acceleration of a charged particle will be $a = F/m = q\mathcal{E}/m^*$, where we have replaced the mass, $m$, of the particle by an effective mass $m^*$ because the carriers do not move in free space but in the lattice. If the time between collisions is $t_{coll}$ then the average or drift velocity in the direction of the electric field will be

$$v_d = \tfrac{1}{2}\, a\, t_{coll} = \frac{q\, t_{coll}}{2 m^*}\,\mathcal{E} \tag{1.14}$$

Namely, the drift velocity is proportional to the electric field $\mathcal{E}$. This is a very general result and the proportionality coefficient is called the mobility $\mu$. Thus

$$v_d = \mu\,\mathcal{E} \tag{1.14'}$$

From Eqs. (1.13) and (1.14') we can express the current density as

$$J = q n \mu\,\mathcal{E} \tag{1.15}$$

This result is equivalent to Ohm's law, which states that the current density is proportional to the electric field and is related to it by the conductivity $\sigma$

$$J = \sigma\,\mathcal{E} \tag{1.15'}$$

Thus

$$\sigma = q n \mu \tag{1.16}$$

Conductivity has dimensions of ohm$^{-1}$ m$^{-1}$; the inverse ohm is named the siemens, so that conductivity is expressed in siemens per meter. When both types of carriers contribute to the transport of charge, Eq. (1.16) must be modified to read

$$\sigma = q\,(n\mu_- + p\mu_+) \tag{1.16'}$$

The inverse of the conductivity is the resistivity, $\rho$, and the resistance of a conductor of cross sectional area $A$ and length $L$ is given by

$$R = \rho\,\frac{L}{A} = \frac{1}{\sigma}\,\frac{L}{A} \tag{1.16''}$$
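As a rough numerical illustration of Eqs. (1.16') and (1.16''), the sketch below evaluates the conductivity of a doped sample and the resistance of a small bar. The carrier density, mobility and bar dimensions are assumed values chosen only for illustration, and all quantities are converted to SI units before use.

```python
Q_E = 1.602e-19  # magnitude of the electron charge, coulomb


def conductivity(n, mu_n, p=0.0, mu_p=0.0):
    """sigma = q (n mu_- + p mu_+); densities in m^-3, mobilities in m^2/V-s."""
    return Q_E * (n * mu_n + p * mu_p)


def resistance(sigma, length, area):
    """R = L / (sigma A); length in m, area in m^2, result in ohm."""
    return length / (sigma * area)


# Assumed values: n = 1e15 cm^-3 = 1e21 m^-3, mu = 1000 cm^2/V-s = 0.1 m^2/V-s,
# holes neglected; bar of length 1 mm and cross section 1 mm^2.
sigma = conductivity(1e21, 0.1)
R = resistance(sigma, 1e-3, 1e-6)
print(f"sigma = {sigma:.2f} S/m, R = {R:.1f} ohm")
```

Keeping every input in SI units avoids the most common slip in such estimates, namely mixing cm-based mobilities and densities with lengths in meters.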

We could evaluate the mobility if we knew the time between collisions $t_{coll}$. Instead, it is convenient to introduce the mean free path (m.f.p.), $l$, between collisions. Then, $t_{coll} = l/v_{rms}$ where $v_{rms}$ is the velocity due to thermal motion. We can write

$$\tfrac{1}{2}\, m^* v_{rms}^2 = \tfrac{3}{2}\, kT$$

and therefore $v_{rms} = (3kT/m^*)^{1/2}$, so that from Eq. (1.14)

$$\mu = \frac{q\,l}{2\,(3kTm^*)^{1/2}}$$

Thus the mobility is a property of the crystal and depends on the temperature. As the electric field is increased the drift velocity increases and reaches a saturation value $v_s$. Typical values for negative carriers in silicon are

$$\mu \sim 10^2\ \text{cm}^2/\text{V-s} \qquad v_s \sim 10^7\ \text{cm/s}$$

In general, the mobility of the positive carriers is much smaller than that of the negative carriers.

When density gradients are present in the solid, the carriers will diffuse from regions of high concentration to those of lower concentration. The flux of carriers is proportional to the density gradient. In one dimension we have

$$F_x = -D\,\frac{dn}{dx} \tag{1.17}$$

where $D$ is the diffusion coefficient. We can expect that the diffusion coefficient is related to the mobility of the carriers, and this relationship was first established by Einstein. One finds that

$$D = \frac{kT}{q}\,\mu \tag{1.17'}$$

Therefore the current due to diffusion is given by

$$J_D = q F_x = -qD\,\frac{dn}{dx} = -kT\mu\,\frac{dn}{dx} \tag{1.17''}$$

a result that should be compared to Eq. (1.15).

In addition to the drift and diffusion currents, carriers may be lost due to recombination. Recombination often takes place at traps, which are locations in the crystal where a hole is trapped near the conduction band. In many semiconductors, under the influence of light or other radiation, an electron can become excited from the valence to the conduction band, increasing the density of carriers; thus a photocurrent can flow through the circuit if it is suitably biased.
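The Einstein relation and the resulting diffusion current can be evaluated numerically. This is a minimal sketch, assuming room temperature (300 K), a mobility of 1000 cm$^2$/V-s and an arbitrary density gradient; the function names and the `carrier_sign` parameter are illustrative choices, not notation from the text.

```python
K_B = 1.381e-23  # Boltzmann constant, J/K
Q_E = 1.602e-19  # magnitude of the electron charge, C


def diffusion_coefficient(mobility, temperature=300.0):
    """Einstein relation D = (kT/q) mu; mobility in m^2/V-s, D in m^2/s."""
    return K_B * temperature / Q_E * mobility


def diffusion_current_density(mobility, dn_dx, carrier_sign=-1, temperature=300.0):
    """J_D = -(carrier charge) * D * dn/dx, in A/m^2.

    carrier_sign is -1 for electrons and +1 for holes; dn_dx is in m^-4.
    """
    d = diffusion_coefficient(mobility, temperature)
    return -(carrier_sign * Q_E) * d * dn_dx


mu = 0.1  # assumed: 1000 cm^2/V-s expressed in m^2/V-s
D = diffusion_coefficient(mu)
J = diffusion_current_density(mu, dn_dx=1e25)  # assumed gradient, electrons
print(f"D = {D * 1e4:.1f} cm^2/s, J_D = {J:.0f} A/m^2")
```

Electrons diffuse down the gradient, but because their charge is negative the conventional current points up the gradient, which is why the sketch carries the sign of the carrier explicitly.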

1.4 The p-n junction

So far we have considered current flow in semiconductors which were uniformly doped to make n-type or p-type material. If two such semiconductor materials of different type are joined, the current flow through the junction depends on the polarity of the external bias. The technology for making p-n junctions is an important development which we discuss later. For the analysis of the junction it is sufficient to use a one-dimensional approximation as shown in Fig. 1.8. We assume that for the p-type material (to the left of the junction) the ionized acceptor density is $N_A$, while for the n-type material the ionized donor density is $N_D$, and that $N_A > N_D$; this is shown in Fig. 1.8(a). In the idealized case the distribution of positive carriers follows the impurity distribution and would be as in Fig. 1.8(b), where the numbers in brackets are typical values. To the left of the junction the holes are the majority carriers while they are minority carriers to the right of the junction; the converse is true for the negative carriers as indicated in Fig. 1.8(c). For silicon at room temperature $n_i = 10^{10}$ cm$^{-3}$, so we assume that $np = n_i^2 = 10^{20}$ cm$^{-6}$.

Fig. 1.8. Carrier density distribution in the immediate vicinity of a p-n semiconductor junction: (a) donor and acceptor densities define a step junction, (b) positive and (c) negative carrier densities for the idealized case, (d) and (e) represent the realistic equilibrium distribution of carrier densities. Values in brackets are typical concentrations per cm$^3$; the bracketed values legible in the figure include [$10^4$] and [$10^{15}$].

The idealized distributions shown in Figs. 1.8(b), (c) are modified in practice because the majority carriers diffuse across the junction. As the holes move into the n-type material they very quickly recombine with the free electrons and this results in a reduction of the majority carriers to the right of the junction; similarly, as the electrons diffuse into the p-type material recombination takes place, reducing the majority carriers to the left of the junction. Thus the carrier distribution has the form shown in (d), (e) of Fig. 1.8; a finite depletion zone is created in the vicinity of the junction.

Following the above discussion we sketch our model junction as in Fig. 1.9(a), where we indicate the holes by open circles and the electrons by dots; the junction is at $x = 0$ and the boundaries of the depletion region are labeled by $-x_p$ and $x_n$. For $x < 0$ there exists an excess of ionized acceptors, that is an excess of negative charge. For $x > 0$ there exists an excess of ionized donors, that is an excess of positive charge. Thus the charge density $\rho(x)$ is distributed as shown in Fig. 1.9(b). A non-zero charge distribution gives rise to an electric field which, in its simplest form, is directed from the positive to the negative charge. Thus the field is negative, as shown in Fig. 1.9(c). Finally, by integrating the electric field we can find the potential in the vicinity of the junction; this is indicated in (d) of the figure.

Fig. 1.9. The electrostatic parameters in the vicinity of a junction: (a) definition of the depletion region, (b) electric charge density, (c) electric field, (d) electrical potential.

Clearly, the electric field 'pushes' the electrons toward positive $x$, and the holes toward negative $x$; that is, against the direction in which the carriers tend to diffuse. The electric field can be calculated by integrating Gauss' law

$$\mathcal{E}(x) = \frac{1}{K_s\varepsilon_0}\int_{-\infty}^{x} \rho(x')\,dx' \tag{1.18}$$

Here $K_s$ is the dielectric constant of silicon; $K_s \approx 11.8$. In our example the peak field is reached at $x = 0$ and $\mathcal{E}(x = 0)$ is negative. Similarly, the potential $V(x)$ is given by integrating the electric field

$$V(x) = -\int_{-\infty}^{x} \mathcal{E}(x')\,dx' \tag{1.18'}$$

The difference in potential across the junction is designated by $V_{bi}$ (where bi stands for 'built-in') as shown in Fig. 1.9(d).

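We can evaluate $V_{bi}$ by noting that under equilibrium conditions both the electron current and the hole current across the junction must be zero. The total current is the sum of the drift current $J_{dr}$ and the diffusion current $J_D$. Looking at the electron current we have, using Eqs. (1.15, 1.17'')

$$J_{dr}\big|_n + J_D\big|_n = q n \mu_n\,\mathcal{E} - kT\mu_n\,\frac{dn}{dx} = 0$$

or

$$V_{bi} = \left|\int_{-\infty}^{\infty} \mathcal{E}\,dx\right| = \frac{kT}{|q|}\int_{n_{-\infty}}^{n_{\infty}} \frac{dn}{n} = \frac{kT}{|q|}\,\ln\!\left(\frac{n_{\infty}}{n_{-\infty}}\right)$$

The electron densities at large positive $x$ and large negative $x$ can be taken as $n_{\infty} = N_D$ and $n_{-\infty} = n_i^2/N_A$ so that

$$V_{bi} = \frac{kT}{|q|}\,\ln\!\left(\frac{N_A N_D}{n_i^2}\right) \tag{1.18''}$$

For the densities used in Fig. 1.8 we find $V_{bi} = 0.65$ V, which is typical of most commercial junctions. The typical thickness of the depletion region is of the order of 1 micron ($10^{-6}$ m) or less.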
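The built-in potential of Eq. (1.18'') is easy to evaluate. The sketch below is illustrative only: it assumes room temperature (300 K), $n_i = 10^{10}$ cm$^{-3}$, and doping densities $N_A = 10^{16}$ cm$^{-3}$ and $N_D = 10^{15}$ cm$^{-3}$, which are an inferred reading of the bracketed values in Fig. 1.8 rather than numbers stated in the text.

```python
import math

K_B = 1.381e-23  # Boltzmann constant, J/K
Q_E = 1.602e-19  # magnitude of the electron charge, C


def built_in_potential(n_acceptor, n_donor, n_i, temperature=300.0):
    """V_bi = (kT/q) ln(N_A N_D / n_i^2), in volts.

    The three densities only need to share the same units.
    """
    thermal_voltage = K_B * temperature / Q_E
    return thermal_voltage * math.log(n_acceptor * n_donor / n_i ** 2)


# Assumed densities in cm^-3 (see the lead-in above)
v_bi = built_in_potential(1e16, 1e15, 1e10)
print(f"V_bi = {v_bi:.3f} V")
```

Because the doping enters only through a logarithm, changing either density by a factor of ten shifts $V_{bi}$ by only about 60 mV at room temperature.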

A most convenient way of looking at the potentials and the carrier motion at a junction is to consider the energy band diagram. This is shown in Fig. 1.10. As we recall, the position of the Fermi level with respect to the edge of the valence or the conduction band is different for p-type and n-type materials (see Fig. 1.6). However, when the two materials are joined, the Fermi levels must be at the same energy when the system is in equilibrium; otherwise there would be flow of charge until the Fermi levels equalized.* Thus the band diagram takes the form shown in Fig. 1.10, the relative displacement of the bands being given by $qV_{bi}$ (here negative potential is toward the top of the page, in contrast to Fig. 1.9). The importance of this diagram is that the electrons must gain energy to move upwards, thus their motion from the n-region to the p-region is impeded. Similarly the holes must gain energy to move downwards (uphill for the holes is down), thus their motion from the p-region to the n-region is impeded. The depletion region is characterized by the sloping part of the band edges and in fact the electric field is proportional to that slope.

* As an analogy one can think of two containers of fluid which are filled to different heights; when the containers are put in communication the fluid will flow until the levels equalize.

Fig. 1.10. Energy band diagram for a p-n junction; at thermal equilibrium the Fermi level must be at the same energy in both parts of the junction. The depletion region is indicated.

When an external voltage is applied between the ends of the p-type and n-type material, the potential difference will appear across the junction and modify the energy diagram by displacing the relative position of the Fermi levels. We say that the junction is biased. There are two possibilities: if a positive voltage is applied to the n-type material the potential difference across the junction will increase and there can be no current flow across the junction, as shown in Fig. 1.11(a). The junction is reverse biased. If a negative voltage is applied to the n-type material the potential across the junction is reduced and there is flow of electrons toward the left and of holes toward the right. When the carriers cross the junction they recombine, but the flow is sustained because the potential drives the majority carriers on both sides toward the junction. In this configuration the diode is forward biased, as shown in Fig. 1.11(b). In the case of a forward biased junction there are enough carriers lying sufficiently high in the conduction band to have energy $E' > E_c$; these carriers drift across the junction under the influence of the external potential. As can be deduced from Eq. (1.18''), for a typical p-n junction a bias of 0.5-1.0 volt is sufficient to reach saturation. We remind the reader that the current is carried by the majority carriers in each part of the bulk material, i.e. by holes in the p-region and by electrons in the n-region.

The simple p-n junction such as described here forms a very useful device widely used in electrical circuits. It is referred to as a diode and represented by the symbol shown in Fig. 1.12(a). For positive voltage the junction is forward biased and the current flow grows exponentially until it reaches saturation. The current versus voltage (I-V) characteristic of a typical diode is shown in Fig. 1.12(b); the non-linear nature of the diode is clearly exhibited. The I-V curve can be described analytically by an equation of the form

$$I = I_0\left(e^{qV/kT} - 1\right)$$

where $V$ is the biasing voltage.

Fig. 1.11. Energy band diagram for a biased p-n junction: (a) reverse bias, (b) forward bias. The physical connection to the voltage source is indicated in the sketches.

Fig. 1.12. A p-n junction forms a diode: (a) circuit symbol, (b) I-V characteristic (the voltage axis is marked at 0.5 V).
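The shape of the diode characteristic can be explored with a short calculation. This sketch assumes the standard ideal-diode form $I = I_0(e^{qV/kT} - 1)$, with $I_0$ the reverse saturation current; the value $I_0 = 10^{-12}$ A and the temperature of 300 K are assumed purely for illustration.

```python
import math

K_B = 1.381e-23  # Boltzmann constant, J/K
Q_E = 1.602e-19  # magnitude of the electron charge, C


def diode_current(voltage, i_sat=1e-12, temperature=300.0):
    """Ideal-diode current I = I0 (exp(qV/kT) - 1), in amperes.

    i_sat (I0) is an assumed reverse saturation current.
    """
    thermal_voltage = K_B * temperature / Q_E
    # expm1 keeps precision for small |V| where exp(x) - 1 would lose digits
    return i_sat * math.expm1(voltage / thermal_voltage)


for v in (-1.0, 0.0, 0.3, 0.5, 0.6):
    print(f"V = {v:+.1f} V  ->  I = {diode_current(v):.3e} A")
```

In this idealized form the reverse current levels off at $-I_0$, while the forward current rises by roughly a factor of 50 for each additional 0.1 V; series resistance, which limits the forward current in a real device, is not modeled.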

1.5 The junction transistor

The junction transistor consists of two p-n junctions connected back to back with the common region between the two junctions made very thin. A model of the n-p-n transistor is shown in Fig. 1.13. Note that one junction is forward biased at a relatively low voltage, whereas the other junction is reverse biased at a considerable voltage. The three distinct regions of differently doped material are labeled emitter, base and collector respectively.

Fig. 1.13. An n-p-n transistor consists of two back to back diode junctions. The biasing scheme and current flow are indicated (emitter-base junction: forward bias, $V$ small; base-collector junction: reverse bias, $V$ large).

According to the biasing shown in the figure, electrons will flow from the emitter into the base; one would expect a large positive current $I_B$ from the base to the emitter. If however the base is thin enough, the electrons injected into the base will reach the base-collector junction before recombining or diffusing in the base. Once electrons cross the base-collector junction they can move freely in the collector since they are majority carriers. Furthermore, the base-collector voltage difference is large so that the electrons gain much more energy than they lost in overcoming the voltage difference between the emitter and base. Such a system can provide power amplification. Thus $I_B$ is a small current, while $I_E$ and $I_C$ are much larger when the transistor is in the conducting state.

The energy band diagram for the n-p-n transistor is shown in Fig. 1.14. The majority carriers are electrons and therefore once they reach the collector they fall through the potential hill. The small voltage between base and emitter can be used to control the flow of current across the much larger base-collector voltage. For the device to operate in this fashion the electrons injected from the emitter must traverse the junction without attenuation. In a good transistor about 0.95 to 0.99 of the injected carriers traverse the base. Typical widths for the base are of order $W_B \sim 1$-$5\ \mu$m. The emitter is heavily doped with donors so as to be able to provide the necessary current even with small base-emitter bias. Junction transistors are referred to also as 'bipolar' transistors to distinguish them from field effect devices.

Fig. 1.14. Energy band diagram for a biased n-p-n transistor. For electrons, positive energy is toward the top of the page (uphill); thus electrons flow in the direction indicated.

The symbols for a transistor are shown in Fig. 1.15. The arrows indicate the direction of positive current flow so that on the left of the figure we recognize an n-p-n transistor (as in Fig. 1.13) and a p-n-p transistor on the right. Typical values are $V_{BE} \sim 0.2$ V whereas $V_{CE} \sim 5$-$10$ V. The performance of a transistor can be characterized by the current transfer ratio $\alpha$, which is defined as

$$\alpha = \frac{\Delta I_C}{\Delta I_E} \tag{1.19}$$

Here we use $\Delta$ to indicate changes in current rather than steady state currents that may result from any particular biasing arrangement. Clearly $\alpha < 1$, but for a good transistor $\alpha$ must be close to one.

To calculate the current gain of a transistor, we recall that the emitter current is the sum of the collector and base currents.

$$\Delta I_E = \Delta I_C + \Delta I_B \tag{1.19'}$$

The current gain $\beta$ is defined by

$$\beta = \Delta I_C/\Delta I_B \tag{1.20}$$

Using this definition and that of Eq. (1.19) in Eq. (1.19') we obtain

$$\frac{\Delta I_C}{\alpha} = \Delta I_C + \frac{\Delta I_C}{\beta}$$

or

$$\beta = \frac{\alpha}{1 - \alpha} \tag{1.20'}$$

For a typical value of $\alpha \sim 0.98$ one finds $\beta \sim 49$. Namely, we can use a small current into the base of the transistor to control the flow of a much larger current from the collector to the emitter. Thus a transistor is a device that controls current flow.
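The sensitivity of the current gain to the current transfer ratio can be seen by tabulating Eq. (1.20') for a few values. This is a small illustrative sketch; the function names are hypothetical.

```python
def current_gain(alpha):
    """Current gain beta = alpha / (1 - alpha) for a transfer ratio 0 <= alpha < 1."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    return alpha / (1.0 - alpha)


def transfer_ratio(beta):
    """Inverse relation: alpha = beta / (1 + beta)."""
    return beta / (1.0 + beta)


for alpha in (0.95, 0.98, 0.99):
    print(f"alpha = {alpha:.2f}  ->  beta = {current_gain(alpha):.0f}")
```

The table makes the point that a change of a few percent in $\alpha$ changes $\beta$ by a factor of several, so $\beta$ varies considerably from one device to another.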

The transistor is a three-terminal device and thus there are more biasing possibilities than for a diode. There are three basic biasing configurations which can be classified as: common or grounded base; common emitter; and common collector, also referred to as emitter follower. These configurations are shown in Figs. 1.16 to 1.18 and serve different functions. For instance the grounded base configuration shown in Fig. 1.16 leads to voltage gain. The base-emitter resistance can be taken to be of order $R_{BE} \sim 100\ \Omega$. If the load resistor is $R_L = 10^4\ \Omega$ then the voltage gain is of order $R_L/R_{BE} \sim 100$, as shown below. The base-emitter voltage $\Delta V_{BE}$ will in general be small ($\sim 10^{-3}$ V) and

$$\Delta I_E = \frac{\Delta V_{BE}}{R_{BE}}$$

Further

$$V_{CE} = V_{CC} - R_L I_C$$

but, with $\Delta I_C = \alpha\,\Delta I_E$ from Eq. (1.19),

$$\Delta V_{CE} = -R_L\,\Delta I_C = -\alpha R_L\,\frac{\Delta V_{BE}}{R_{BE}}$$

so that (note $\Delta V_{EB} = -\Delta V_{BE}$)

$$\frac{\Delta V_{CE}}{\Delta V_{EB}} = \alpha\,\frac{R_L}{R_{BE}} \approx \frac{R_L}{R_{BE}} \tag{1.21}$$

Note that the collector voltage $V_C$ maintains itself only slightly above the emitter voltage when the transistor is conducting; this is true in this configuration where the base is grounded.

Fig. 1.15. Circuit symbol for a junction transistor; the current flow and voltage definitions are indicated: (a) n-p-n, (b) p-n-p.
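The order of magnitude of the grounded base voltage gain follows directly from the resistances quoted above. The sketch below assumes the gain has the form $\alpha R_L/R_{BE}$ and uses $\alpha = 0.98$ as an illustrative value; the function name is hypothetical.

```python
def grounded_base_voltage_gain(r_load, r_be, alpha=1.0):
    """Voltage gain of the grounded base stage, alpha * R_L / R_BE."""
    return alpha * r_load / r_be


# Values quoted in the text: R_BE ~ 100 ohm, R_L = 1e4 ohm; alpha = 0.98 is assumed
gain = grounded_base_voltage_gain(1e4, 100.0, alpha=0.98)
delta_v_in = 1e-3  # volts, the small base-emitter signal quoted in the text
print(f"gain = {gain:.0f}, output swing = {gain * delta_v_in:.3f} V")
```

A millivolt input therefore produces an output swing of about a tenth of a volt, consistent with the gain of order 100 stated above.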

The common or grounded emitter configuration shown in Fig. 1.17 is used to provide current gain. In this case the collector current is obtained directly from the definition of Eq. (1.20'). A small current flow into the base controls the current $\Delta I_C$ flowing through a particular device, such as the light bulb shown in the figure. The current gain is given by $\beta$, provided of course that the device is not saturated ($\beta\,\Delta I_B < V_{CC}/R_L$).

Fig. 1.16. Circuit diagram for grounded base operation. Values indicated in the figure: $\Delta V_{BE} \sim 10^{-3}$ V, $\Delta V_C \sim 10^{-1}$ V, $V_{CC} \sim +5$ V.

The third configuration is that of common collector, better known as the emitter follower, and is shown in Fig. 1.18. As the name indicates the emitter voltage follows the base voltage, but the current flowing through the emitter resistor $R_E$ is $\beta$ times larger than the current supplied to the base. Therefore a high impedance source can drive a load of low impedance; the emitter follower acts as an impedance transformer without voltage gain. In view of the condition $\Delta V_E = \Delta V_B$ we have

$$\Delta I_E = \frac{\Delta V_E}{R_E} = \frac{\Delta V_B}{R_E}$$

Furthermore

$$\Delta I_B = (1 - \alpha)\,\Delta I_E = (1 - \alpha)\,\frac{\Delta V_B}{R_E}$$

and since the impedance of the source (input impedance) is defined as $Z_{in} = \Delta V_B/\Delta I_B$, we have

$$Z_{in} = \frac{R_E}{1 - \alpha} \approx \beta R_E \tag{1.22}$$

where $R_E$ is the output impedance. Emitter followers are used to drive long lines, meters, or other devices that have low impedance.

Fig. 1.17. Circuit diagram for grounded emitter operation.

Fig. 1.18. Emitter follower circuit (the figure indicates $V_B \sim 0.5$ V).
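The impedance transformation provided by the emitter follower can be illustrated numerically. This is a sketch under the assumption that the input impedance is $R_E/(1-\alpha)$, which follows from $\Delta I_B = (1-\alpha)\Delta I_E$; the emitter resistor of 1 k$\Omega$ and $\alpha = 0.98$ are assumed values.

```python
def emitter_follower_input_impedance(r_emitter, alpha):
    """Input impedance Z_in = R_E / (1 - alpha), in the same units as R_E."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    return r_emitter / (1.0 - alpha)


# Assumed values: R_E = 1 kilohm, alpha = 0.98 (so beta = 49)
z_in = emitter_follower_input_impedance(1e3, 0.98)
print(f"Z_in = {z_in / 1e3:.0f} kilohm")
```

The source therefore sees a load roughly $\beta$ times larger than the resistor actually connected to the emitter, which is what allows a high impedance source to drive a low impedance load.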

The design of electrical circuits is a vast subject of great technical importance. The interested reader is referred to the excellent text The Art of Electronics by P. Horowitz and W. Hill, Cambridge University Press, N.Y., 1980.

1.6 Manufacture of transistors; the planar geometry

The manufacture of transistors depends first on the availability of pure silicon. The second requirement is the ability to introduce donors or acceptors in a highly controlled fashion. Finally it must be possible to make very thin junctions between materials of different types. This technology is by now highly developed and relies on manufacturing transistors and entire circuits on the surface of thin silicon wafers; we speak of planar geometry.

Ultrapure silicon is obtained by zone melting and other refining techniques. Single crystals of silicon are grown from a melt and the resulting ingots are typically 4 inches in diameter and 20 inches long. The silicon can be doped by adding the desired impurity to the melt. The ingot is then sliced into wafers approximately 0.5 mm thick and polished; the cut is usually along the (1,0,0) crystallographic plane. When good resistivity is desired the silicon is doped so as to result in p-type material. Germanium and gallium arsenide are also produced in wafers.

To make junctions one introduces the desired impurity, or silicon of opposite doping, into the substrate (wafer). This can be done by alloying, by epitaxy, by diffusion, or by ion implantation. Alloying was one of the first techniques used in producing junctions; it is shown schematically in Fig. 1.19. A pellet of p-type silicon is placed on top of an n-type wafer. When this assembly is heated in an oven the pellet melts and penetrates into the wafer. As the system cools and the silicon recrystallizes, a structure such as shown in (c) of the figure is attained. Repeating this sequence, one can obtain a junction transistor. In this process the base thickness can be controlled reasonably well and several units can be manufactured simultaneously. Molecular beam epitaxy (MBE) is the preferred technique when very thin surface layers are to be deposited; it is becoming fairly widely used. Ion implantation allows the very exact positioning of impurity ions in the substrate and also the creation of very thin layers of donors or acceptors.

Fig. 1.19. Formation of a junction by alloying: (a) initial configuration (p-type pellet on an n-type wafer), (b) melting at high temperature (melted pellet), (c) the junction after re-crystallization.

Diffusion is by far the most common manufacturing process and it led to a real breakthrough because it could be combined with photolithography to make devices of very small dimensions. This was facilitated by the discovery that silicon dioxide (SiO$_2$) was easily formed by simply exposing the pure silicon to an atmosphere of oxygen. SiO$_2$ is glass and even thin layers completely prevent the diffusion of impurities in or out of the substrate.

The manufacture of transistors in planar geometry by the diffusion process is outlined in the sketches of Fig. 1.20. One starts with p-type silicon and the first step is to produce a thin layer of SiO$_2$ on its surface; it suffices to heat the wafer in an oxygen-rich atmosphere. Next the wafer is coated with a layer of photoresist. This is a chemical whose properties are affected when it is exposed to radiation such as visible light, U.V., or X-rays. The exposure takes place through a photographic mask which has been highly reduced from its original layout; the mask carries the pattern that is to be transferred to the wafer. Then the exposed resist is dissolved away by dipping the wafer in a solvent, leaving the unexposed areas intact. At that stage the assembly looks as in (e) of the figure.

Next the wafer is introduced in an etching solution, for instance HF, and the exposed SiO$_2$ is removed, exposing the silicon in the pattern inscribed by the mask. Impurities are then diffused into the wafer as indicated in (g). By controlling the temperature and length of the exposure one can obtain the desired concentration and depth in the diffused layer to great accuracy. When the diffusion process is completed, the remaining resist coat is washed away using a stronger solvent. Finally a new oxide layer is established over the wafer to provide electrical isolation between the different parts of the circuit and to protect the silicon. If a second layer of diffusion is desired, the process is repeated. Eventually, metallic contacts are placed at the appropriate locations to provide the electrical connections for the circuit. This process involves quite a lot of toxic chemicals and the workers must be adequately protected.


The physical arrangement of the n- and p-type silicon regions in a junction transistor manufactured by diffusion technology is shown in Fig. 1.21. In general one starts with a p-type substrate into which an n-type 'island' is diffused. This provides electrical isolation and long-term stability. For an n-p-n type transistor (see Fig. 1.21(a)) a p-type island is established to provide the base and then two highly doped n-type regions are introduced to serve as the emitter and collector. The insulating SiO$_2$ layer and the electrical connections are also indicated. In this case the majority carriers are electrons and they flow along the arrow (the positive current is in the inverse direction). Note that the thin base region (p-type) is between the highly doped emitter region and the n-type island through which the carriers can flow to the collector.

Fig. 1.20. Steps involved in the production of a transistor: (a) p-type silicon is used for the substrate, (b) a SiO$_2$ layer is formed, (c) the wafer is coated with photoresist, (d) the photoresist is exposed through a mask, (e) exposed photoresist is washed away, (f) in the exposed area the oxide layer is chemically removed, (g) n-type material is diffused into the substrate, (h) the remaining photoresist is removed. The process can now be repeated to introduce additional n-type or p-type regions.

For a p-n-p transistor the current flow geometry is different, as indicated in Fig. 1.21(b). The emitter and collector islands are adjacent to one another and the holes have to travel across the base, which is part of the n-type island. A highly doped n-type region around the base electrode provides the necessary minority carriers to control the emitter-collector current flow.

The semiconductor technology developed in the manufacture of junctions could be adapted to make circuit elements such as resistors or capacitors on the surface of a silicon chip. Examples are shown in Fig. 1.22. For instance, the thickness of the n-type region in Fig. 1.22(a) determines the resistance $R$ between the electrodes at A and B. In (b) of the figure the capacitance $C$ between the two electrodes is controlled by the thickness of the dielectric layer. In this way it became possible to integrate the transistor with the discrete components needed to make a complete circuit.

Fig. 1.21. Typical layout of junction transistors manufactured by diffusion methods. Note the 'islands', location of the contacts and current flow: (a) n-p-n, (b) p-n-p.

Fig. 1.22. Circuit elements can be realized with semiconductor materials: (a) resistor, (b) capacitor.

An example of the evolution of a simple circuit from its discrete component form to its integrated form is shown in Fig. 1.23. The n-regions are indicated by the dots, the p-regions are shaded whereas the undoped silicon is left clear. By controlling the depth of the diffused layers and their area the performance of the circuit can be optimized. This type of integrated circuit is not very common any more, having been replaced by transistor-transistor logic. For high performance circuits, however, discrete components are still utilized.

1.7 The field effect transistor (FET)

In the junction transistor the current flow from the collector to the emitter is controlled by the current flowing into the base. In the field effect transistor the current flow is controlled by the electric field between the gate and the other electrodes. In this sense the FET resembles a triode electronic vacuum tube and efforts to construct an FET go back to the late thirties, even though the first FET was not built until 1953. The great success of the FET is due to the advances in the technology which permitted the construction of very thin SiO$_2$ layers, the diffusion of n-type silicon into a substrate, and the application of metallic gates and contacts. Thus the technology is known as MOS for Metal-Oxide-Silicon.

Fig. 1.23. Monolithic integrated circuit construction: (a) circuit diagram, (b) discrete elements and junctions, (c) combined elements, (d) wafer layout.

As a first example we consider a small slab of n-type silicon to which source and drain contacts are made as shown in Fig. 1.24(a). Furthermore, two heavily doped p-regions are diffused at the other two sides of the slab; they form the gate of the device. With no voltages applied a depletion region will be created around the p$^+$-islands as shown in Fig. 1.24(a). If a positive voltage $V_D$ is applied to the drain, with the source grounded, the electrons in the n-type substrate will move from source to drain; the depletion region will widen because the electrons are being pulled toward the drain. As $V_D$ is increased, the current $I_{DS}$ increases and so does the depletion region until it reaches 'pinch off' as shown in Fig. 1.24(b); at that point $I_{DS}$ becomes independent of $V_D$. The $I_{DS}$ versus $V_D$ characteristics for $V_G = 0$ and $V_G < 0$ are shown in Fig. 1.24(c).

If the gate voltage $V_G$ is made negative, holes will be pulled toward the gate and electrons pushed away from it. Thus the depletion zone will widen and as a consequence the impedance to the flow of electrons from source to drain will increase. Furthermore, 'pinch-off' will occur at lower values of $V_D$ as shown by the $I_{DS}$-$V_D$ characteristic for $V_G < 0$. The region between source and drain where the carriers flow is called the channel, and the particular device we described is known as a J-FET.

Fig. 1.24. Representation of a junction field effect transistor where the source and drain electrodes and the gate regions are shown: (a) no bias, in which case the depletion region is small, (b) biased to saturation, (c) the current-voltage characteristic for different gate voltages.

The most commonly used FETs are manufactured by MOS technology and are referred to as MOSFETs; they lend themselves readily to very large scale integration (VLSI). There are two types of MOSFETs, the enhancement and depletion types. An enhancement FET is shown in Fig. 1.25. Starting with a p-type substrate two n-type islands are introduced to form the source and drain. Since the channel is p-type material it will not conduct when the gate is at ground voltage. If however a sufficiently positive voltage is applied to the gate the electric field between the gate and the source will draw negative carriers into the channel and electrons will flow from the source to the drain unimpeded as indicated in (b) of the figure.

The depletion FET is shown in Fig. 1.26. Here we want the channel to be conducting when no voltage is applied to the gate. To achieve this, donors are introduced by ion implantation into a very shallow channel between source and drain as shown in (a) of the figure. If a negative voltage is applied at the gate the channel becomes narrower as the electrons are repelled from the gate and eventually conduction stops altogether as shown in Fig. 1.26(b). Thus the operation of the depletion FET resembles that of the J-FET.

Fig. 1.25. Realization of an enhancement FET in MOS technology: (a) with no voltage on the gate the device is not conducting, (b) when $V_{GS}$ is positive there is current flow.

Fig. 1.26. Realization of a depletion FET in MOS technology: (a) with no voltage on the gate the device conducts, (b) negative $V_{GS}$ turns the current off.

The physical construction of an FET can be better understood with the help of the sketches of Fig. 1.27; note that the transverse dimensions are highly exaggerated. The silicon dioxide layer is extremely thin and its thickness $D$ is of order $D \sim 0.02\ \mu$m = 200 Å; this is only a small fraction of a wavelength of visible light and of the order of a hundred atomic layers or less. On top of the SiO$_2$ the gate is grown out of polycrystalline silicon (poly); the lateral dimensions of the gate are length $L$ and width $W$, typically 5 $\mu$m each. The drain and source are formed by diffusing donors into the substrate, creating the n-type regions. A plan view of the FET is shown in part (b) of the figure; note that the gate is placed transversely with respect to the two diffused regions. For a depletion type FET a very thin layer of donors is introduced into the channel by ion implantation.

The symbols used to indicate FETs are shown in Fig. 1.28. For each particular manufacturing technology a typical supply voltage is used. This is referred to as $V_{DD}$ and is of the order of a few volts, i.e. $V_{DD} \sim 5$ volts. An enhancement FET will conduct if the gate-source voltage exceeds a certain threshold $V_{thr}$; typically $V_{thr} \sim 0.2\,V_{DD}$. When the FET conducts the drain-source voltage, $V_D$, can be less than $V_{DD}$. The dependence of the drain-source current $I_{DS}$ on $V_G$ and $V_D$ is as shown in Fig. 1.29.

Fig. 1.27. FET construction in MOS planar geometry: (a) perspective view with the vertical dimensions greatly exaggerated, (b) plan view.

To become familiar with orders of magnitude we estimate the current through an FET. We first calculate the capacitance of the gate

$$C_g = K_s \varepsilon_0 \frac{A}{D} = K_s \varepsilon_0 \frac{WL}{D} \tag{1.23}$$

where Ks is the relative dielectric constant of SiO2, Ks ~ 11. The charge in transit in the channel is

$$Q = C_g (V_G - V_{thr}) \tag{1.23'}$$

and the time of transit is given by

$$\Delta t = \frac{L}{v_{dr}} = \frac{L}{\mu \mathcal{E}} = \frac{L^2}{\mu V_D} \tag{1.23''}$$

Combining the above equations the source to drain current can be expressed as

$$I_{DS} = \frac{Q}{\Delta t} = K_s \varepsilon_0 \frac{W}{LD}\, \mu\, (V_G - V_{thr})\, V_D \tag{1.24}$$

Fig. 1.28. Circuit symbols for FETs: (a) enhancement FET, (b) depletion FET.

Fig. 1.29. FET current-voltage characteristics for different gate voltages: (a) enhancement FET (curves labeled VG = 0.4 VDD and VG = 0.3 VDD), (b) depletion FET (curves labeled VG = 0, VG = -0.2 VDD and VG = -0.3 VDD).

In the above equations W, L and D are the width, length and depth of the gate as defined in Fig. 1.27, and μ is the mobility. Eq. (1.24) is valid in the linear region where IDS is proportional to VD. The saturation current is independent of VD and given to a good approximation by

$$I_{sat} = K_s \varepsilon_0 \frac{W}{LD}\, \mu\, \frac{(V_G - V_{thr})^2}{2} \tag{1.25}$$

This quadratic dependence on (VG - Vthr) can be seen in the characteristic curves of Fig. 1.29.

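As a numerical example we choose W = L = 6 μm, D = 0.02 μm and (VG - Vthr) = 1 V; furthermore let μ = 1000 cm²/V-s = 10⁻¹ m²/V-s and we have that Ks = 11 and ε0 = 8.85 × 10⁻¹² coul/V-m, so that Ks ε0 ≈ 10⁻¹⁰ coul/V-m. Thus

$$I_{sat} \approx \frac{(10^{-10}\ \text{coul/V-m})\,(10^{-1}\ \text{m}^2/\text{V-s})\,(1\ \text{V})^2}{2 \times (2 \times 10^{-8}\ \text{m})} \approx 2.5 \times 10^{-4}\ \text{A}$$

which is typical for a high mobility FET.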
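The estimate can be checked numerically. The following is a minimal sketch, assuming that Eq. (1.25) has the form I_sat = Ks ε0 (W/LD) μ (VG - Vthr)²/2 and that all quantities are first converted to SI units; the function and variable names are illustrative only.

```python
# Evaluate the saturation current of Eq. (1.25) for the values quoted in the text.
EPS0 = 8.85e-12  # permittivity of free space, coul/V-m


def saturation_current(width, length, depth, mobility, k_s, v_over):
    """I_sat = K_s * eps0 * W / (L * D) * mu * (VG - Vthr)**2 / 2, in SI units."""
    return k_s * EPS0 * width / (length * depth) * mobility * v_over**2 / 2.0


i_sat = saturation_current(
    width=6e-6,        # W = 6 um
    length=6e-6,       # L = 6 um
    depth=0.02e-6,     # D = 0.02 um (oxide thickness)
    mobility=1000e-4,  # 1000 cm^2/V-s expressed in m^2/V-s
    k_s=11.0,          # relative dielectric constant quoted in the text
    v_over=1.0,        # VG - Vthr = 1 V
)
print(f"I_sat = {i_sat:.2e} A")
```

With these inputs the sketch evaluates to roughly 2.4 × 10⁻⁴ A, i.e. a fraction of a milliampere.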

1.8 Transistor-transistor logic

One of the simplest digital logic operations is that of inversion, and we show how it can be easily implemented using FETs. The circuit diagram for an inverter is shown in Fig. 1.30(a). If we designate by RDS the drain to source impedance when the FET is in the conducting state we have the conditions

$$V_{in} < V_{thr}: \quad I_{DS} = 0, \quad V_{out} = V_{DD}$$

$$V_{in} > V_{thr}: \quad I_{DS} \approx V_{DD}/R_L, \quad V_{out} \approx 0$$

Fig. 1.30. Circuit diagram for an inverter using FETs: (a) with load resistor, (b) with depletion FET replacing the load.

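The result for Vin > Vthr assumes that RL >> RDS, because the load and the conducting FET form a voltage divider

$$V_{out} = V_{DD} \frac{R_{DS}}{R_L + R_{DS}}$$

Thus the output is low when the input is high, and high when the input is low, which is the inversion operation.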

In MOS technology it is difficult to create a suitably high resistance RL to serve as the load for the inverter circuit such as the one shown in Fig. 1.30(a). This is because all paths are kept short to accommodate a high density of elements on the chip; short paths have low resistance. The solution is to use a depletion FET with its gate connected to its source as shown in (b) of the figure. Under these conditions the FET acts like a resistive load (see Fig. 1.29(b) for VG = 0) and it is always in the conducting state. However, the current flowing through it is controlled by the state of the enhancement FET to which it is connected in series. The circuit is made up of two transistors coupled to one another, hence the designation as transistor-transistor-logic.

The physical realization of the inverter using an enhancement and a depletion FET is shown in Fig. 1.31. The gate of the depletion FET is wide so as to provide the necessary resistance. The VDD, ground and output connections are made directly to the diffused n-type regions, whereas the input connection is made directly to the gate of the enhancement FET. Note the connection between the gate and source of the depletion FET.

Fig. 1.31. Plan view of the MOS inverter using transistor-transistor-logic.

The actual processes involved in the construction of such an inverter gate using nMOS technology can be chronicled with the help of Fig. 1.32 (from Introduction to VLSI Systems by C. Mead and L. Conway, Addison-Wesley, 1980). The transverse dimensions are, as usual, exaggerated to make them visible in the drawing. First a thin layer of SiO2 is placed on the p-type substrate. Then a pattern as shown in (a) is etched away. Note the design in the plane of the wafer; the cross-section shown is taken at the registration marks. Next the SiO2 is masked so that only the implantation region is exposed and the donors are implanted to form the channel of the depletion FET; this is shown in (b). The photoresist is removed, a thin SiO2 layer is formed on the exposed silicon, and a new mask is used to form the polycrystalline gates for both FETs; this is shown in (c). The SiO2 is etched away from the diffusion regions and the donors are introduced to form the n-type regions as indicated in (d). Our device is now finished but we still must make contacts to the ground, VDD, A and A' points of the inverter. An insulating layer is placed over the wafer except in the region where the contacts will be made, as shown in (e), and finally the metallic contacts are evaporated onto the assembly. The gate to source connection for the depletion FET is made at this stage as well as the four connections of the circuit. Final steps in the manufacturing process are connection of the leads and placing the wafer in an appropriate package. The whole idea of large scale integration is to interconnect many gates on the same wafer.

Fig. 1.32. Steps involved in the construction of the inverter shown in the previous figure: (a) patterning SiO2, (b) patterning ion implantation, (c) patterning polysilicon, (d) placing diffused region, (e) placing contact cuts, (f) patterning the metal layer. (From C. Mead and L. Conway, Introduction to VLSI Systems, Addison-Wesley, Reading, MA, 1980.)

1.9 Logic gates

The circuits that perform the simplest digital logic functions are quite generally referred to as gates. In a digital circuit a signal level (voltage) can be in one of two conditions: either high or low. The exact voltage levels and their tolerances depend on the technology used. One of the conditions corresponds to the true, or asserted, or '1' state while the other condition is the false, or negated, or '0' state.

A particular logic function is specified in terms of a truth table that gives the output state for all possible combinations of input states. For instance the truth table for the inverter (also referred to as NOT gate) and its symbol are given below. The circuits of Figs. 1.30 or 1.31 perform exactly this function. In general we will use primes to indicate the complement of any variable: that is if B is true, B' is false and if B is false then B' is true. The open circle at the output of a gate indicates inversion.

A    A'
0    1
1    0

Truth table and symbol for inverter

In addition to inversion there are two other basic logic functions. These are the AND and OR functions. The output of an AND function is true if all inputs are true. The output of an OR function is true if any one of its inputs is true. From these basic conditions several variations can be generated. We shall discuss one particular variation, that of the NAND and of the NOR gate. These are AND and OR gates but with their outputs complemented.

The truth table and symbol for the NAND gate (Not AND) with two inputs are shown below. The circuit diagram and its physical implementation in the nMOSFET technology are shown in Fig. 1.33.

A    B    (A·B)'
0    0    1
0    1    1
1    0    1
1    1    0

Truth table and symbol for NAND gate

Fig. 1.33. Circuit diagram and plan view of a NAND gate in MOS technology.

When input A is high but not input B the enhancement FET, A, conducts but no current will flow because FET, B, is off. The same is true when input B is high but input A is low. When both inputs are high, both FETs conduct and current flows from VDD to ground. This brings the output of the circuit (at the source of the depletion FET) to ground, namely to the low state. There is a limitation on how many inputs can be attached to a gate because of the increase of the impedance as many channels are connected in series, but four or five inputs can be easily used; we speak of a two-fold, three-fold, etc. NAND gate. The NOR gate (Not OR), for two inputs, has the truth table and symbol shown below. The circuit diagram and its physical implementation are shown in Fig. 1.34. Here if either input A or input B is high, current will flow from VDD to ground because one of the enhancement FETs conducts. Thus the output of the circuit will drop towards ground, that is it will go to the low state. One can OR several logic signals together, with a limit of five to ten inputs to a single gate.

A    B    (A + B)'
0    0    1
0    1    0
1    0    0
1    1    0

Truth table and symbol for NOR gate
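The behaviour of the two gates can be summarized in a short sketch. It assumes an idealized switch model (an enhancement FET is either fully on or fully off, and the depletion load always pulls the output to VDD unless a conducting path to ground exists); the function names are illustrative only.

```python
from itertools import product


def nmos_gate(inputs, series):
    """Output of an idealized nMOS gate with a depletion-FET load.

    The pull-down network conducts when all inputs are high (FETs in series,
    NAND) or when any input is high (FETs in parallel, NOR). A conducting
    network brings the output to ground (0); otherwise it stays at VDD (1).
    """
    conducts = all(inputs) if series else any(inputs)
    return 0 if conducts else 1


def nand(a, b):
    return nmos_gate((a, b), series=True)


def nor(a, b):
    return nmos_gate((a, b), series=False)


# Print the two truth tables side by side: A, B, NAND, NOR.
for a, b in product((0, 1), repeat=2):
    print(a, b, nand(a, b), nor(a, b))
```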

Fig. 1.34. Circuit diagram and plan view of a NOR gate in MOS technology.

We will see in the next chapter how all logic circuits can be reduced to the basic functions that we introduced here and can therefore be built in terms of simple logic gates. We should also keep in mind that any particular signal is asserted (or negated) only for a finite time interval. Thus, for instance, an AND gate will not be asserted unless the two inputs are true at the same time. These time intervals can be very short, of order 100 ns or less, and this is why a digital circuit can perform many consecutive operations in one second, typically in excess of 10⁶.

Exercises

Exercise 1.1

(a) Look up the atomic mass number A, and the density ρ of Si and Ge and find the number of atoms per cm³.

(b) Assuming that the atoms are in a diamond structure (8 atoms/unit cell) find the lattice spacing.

(c) Find the resistivity of Ge at room temperature if it is doped with 10¹⁵ atoms/cm³ of Sb. Assume a mobility of the donor's electrons of μe = 1200 cm²/V-s.

Exercise 1.2

Calculate the forward saturation current in a silicon n-p junction of area 10⁻⁴ cm². Let the impurity concentration be 1/10⁶ for both holes and electrons. Assume that the width of the depletion region is W = 200 μm and a forward bias of 0.5 V; assume some reasonable values for the mobility of electrons and holes.

Exercise 1.3

Consider germanium doped with 10¹⁴/cm³ atoms of arsenic.

(a) Find the conductivity assuming a reasonable value for the mobility of the impurities.

(b) The energy gap of germanium is Eg = 0.67 eV and the density of states at the edge of the conduction band can be taken as Nc = 10¹⁹/cm³. Estimate the intrinsic carrier density for germanium at room temperature.

(c) Use the result of (b) to find the density of holes in the doped sample.


Exercise 1.4

Make a plot of the Fermi-Dirac distribution at T = -78°C, room temperature, and at T = 500°C when EF = 1 eV.

Exercise 1.5

(a) Sketch an FET transistor in pMOS technology, that is, one using an n-type substrate. Estimate all dimensions including the depth of the diffusion layer.

(b) Give the type and density of carriers in each region.

(c) Give typical values for the voltages and currents through the device when in the 'on' state and when in the 'off' state.

Digital electronics

Modern electronic devices operate, in general, on digital principles. That is, signals are transmitted in numerical form such that the numbers are coded by binary digits. A binary digit has only two states: 'one' and 'zero', or 'high' and 'low' etc. The reason for relying almost exclusively on digital information is that binary data can be easily manipulated and can be reliably stored and retrieved. That this approach is practical and economically advantageous is due to the great advances in large scale integration and chip manufacture as already discussed. In this chapter we will consider digital systems and the representation and storage of binary data. We will conclude by discussing the architecture of a small 3-bit computer which, nevertheless, contains all the important features of large machines.

2.1 Elements of Boolean algebra

In digital logic circuits a variable can take only one of the two possible values: 1 or 0. The rules for operating with such variables were first discussed by the British mathematician George Boole (1815-64) and are now referred to by his name. Since in pure logic a statement is either true or false, Boolean algebra can be applied when manipulating logic statements as well. This material is conceptually simple yet it is most relevant to the understanding of complex logic circuits.

Boolean algebra contains three basic operations: AND, OR and Complement. The result of these operations can be best represented by a truth table as introduced in Section 1.9, where also the symbols for the corresponding circuits were given. We will use a + sign to indicate the OR function, the · sign to define the AND and a prime to represent the Complement of any variable. As an example of a Boolean function of three variables we consider

$$F = x' + (y \cdot z) \tag{2.1}$$

The truth table for Eq. (2.1) and the corresponding logic circuit are shown in Fig. 2.1.

Fig. 2.1. Truth table and circuit for the function F = x' + (y·z).

x    y    z    F
0    0    0    1
0    0    1    1
0    1    0    1
0    1    1    1
1    0    0    0
1    0    1    0
1    1    0    0
1    1    1    1

Table 2.1. The rules of Boolean algebra

OR                                AND
(1) x + 0 = x                     (2) x·0 = 0
(3) x + 1 = 1                     (4) x·1 = x
(5) x + x = x                     (6) x·x = x
(7) x + x' = 1                    (8) x·x' = 0

Addition and multiplication
(9) x + y = y + x                 (10) x·y = y·x                commutative
(11) x + (y + z) = (x + y) + z    (12) x·(y·z) = (x·y)·z        associative
(13) x·(y + z) = (x·y) + (x·z)                                  distributive

Complementation
(14) (x')' = x
(15) (x + y)' = x'·y'             (16) (x·y)' = x' + y'         De Morgan's theorems

The rules of Boolean algebra are summarized in Table 2.1. Relations (1-8) seem obvious and define the basic arithmetic operations; relations (9-13) define the properties for operating on products and sums, and are similar to those found in ordinary algebra. However the remaining three relations are peculiar to Boolean algebra. Relations (15, 16) are known as De Morgan's theorems. Relation (14) is similar to what holds true for the inverse of a number, (A⁻¹)⁻¹ = A, but is by no means equivalent; recall that x'·x = 0 whereas A⁻¹A = 1; a corresponding equality in binary is x' + x = 1. As a further illustration a relation such as

$$x + (y \cdot z) = (x + y) \cdot (x + z) \tag{2.2}$$

is true by definition for binary variables but is far from valid for ordinary algebra.
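Because each variable takes only the values 0 and 1, any such relation can be verified by trying every combination. The sketch below does this for De Morgan's theorems and for Eq. (2.2), and also shows that Eq. (2.2) fails when + and · are read as ordinary addition and multiplication. The helper names `NOT` and `holds` are illustrative only.

```python
from itertools import product


def NOT(x):
    """Complement of a binary variable."""
    return 1 - x


def holds(law, n_vars):
    """True if law(...) is satisfied for every assignment of 0/1 to its variables."""
    return all(law(*bits) for bits in product((0, 1), repeat=n_vars))


# De Morgan's theorems: (x + y)' = x'.y' and (x.y)' = x' + y'
de_morgan_or = holds(lambda x, y: NOT(x | y) == (NOT(x) & NOT(y)), 2)
de_morgan_and = holds(lambda x, y: NOT(x & y) == (NOT(x) | NOT(y)), 2)

# Eq. (2.2) with + as OR and . as AND
eq_2_2 = holds(lambda x, y, z: (x | (y & z)) == ((x | y) & (x | z)), 3)

# The same relation with ordinary arithmetic fails, e.g. for x = y = z = 1
ordinary = holds(lambda x, y, z: x + y * z == (x + y) * (x + z), 3)

print(de_morgan_or, de_morgan_and, eq_2_2, ordinary)
```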

Boolean algebra can be used to simplify logical expressions or circuits.For instance the expression

can be reduced using De Morgan's theorems as follows

where by relation (16) the last result is also equivalent to F = A' + B'. The two equivalent circuits and the truth table are shown in Fig. 2.2. As further examples the reader should convince himself that

$$x' + y' \neq (x + y)'$$

and establish the equivalence of the two relations shown below:

(a) $F = [(A \cdot B')' \cdot B']' = A + B$

(b) $G = (A' + B') \cdot (C + D') \cdot (B' + D) = B' \cdot D' + B' \cdot C + (A' \cdot C) \cdot D$

It suffices to show that the two sides of the equation obey the same truth table, or one can use Boolean algebra techniques as in the previous example.

Fig. 2.2. Two equivalent digital circuits.

A    B    F
0    0    1
0    1    1
1    0    1
1    1    0

Fig. 2.3. Simple circuit with two inputs and two outputs; G = A + B.

A    B    F    G
0    0    1    0
0    1    1    1
1    0    1    1
1    1    0    1

Fig. 2.4. Symbol for a combinatorial circuit with n inputs and m outputs.
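The truth-table method can be sketched as follows. The two relations are read here as (a) F = [(A·B')'·B']' = A + B and (b) G = (A' + B')·(C + D')·(B' + D) = B'·D' + B'·C + (A'·C)·D; this reading of the printed expressions is an assumption, and the helper names are illustrative only.

```python
from itertools import product


def NOT(x):
    """Complement of a binary variable."""
    return 1 - x


def same_truth_table(f, g, n_vars):
    """Compare two Boolean functions on all 2**n_vars input combinations."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n_vars))


# (a) left-hand and right-hand sides
lhs_a = lambda A, B: NOT(NOT(A & NOT(B)) & NOT(B))
rhs_a = lambda A, B: A | B

# (b) left-hand and right-hand sides
lhs_b = lambda A, B, C, D: (NOT(A) | NOT(B)) & (C | NOT(D)) & (NOT(B) | D)
rhs_b = lambda A, B, C, D: (NOT(B) & NOT(D)) | (NOT(B) & C) | (NOT(A) & C & D)

print(same_truth_table(lhs_a, rhs_a, 2), same_truth_table(lhs_b, rhs_b, 4))
```

For n variables the comparison runs over 2ⁿ rows, which is the size of the truth table itself.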

A number of digital inputs can be combined in different ways to give more than one output. For instance in Fig. 2.3 we show a circuit with 2 inputs and 2 output functions. In general a combinatorial circuit can have n inputs and m outputs. It is specified by a truth table with 2ⁿ input rows and m output columns. Symbolically it is designated by a box with n input and m output lines as in Fig. 2.4.

2.2 Arithmetic and logic operations

By a suitable arrangement of gates it is possible to perform arithmetic operations on two operands, x and y. We first consider addition of two binary digits (bits) A and B; the sum bit is designated by S and the carry bit by C, and they have their usual meaning; they are defined as follows

$$S = (A + B) \cdot (A \cdot B)' \tag{2.3}$$

$$C = A \cdot B \tag{2.3'}$$

A circuit that performs these functions is called a half-adder, a straightforward realization being shown in the top of Fig. 2.5.

Fig. 2.5. Two equivalent circuits for the half-adder.

A    B    S    C
0    0    0    0
0    1    1    0
1    0    1    0
1    1    0    1


The sum bit can also be expressed by the relation

$$S = (A + B) \cdot (A \cdot B)' = (A + B) \cdot (A' + B') = A \cdot B' + B \cdot A' \tag{2.3''}$$

which is implemented by the second circuit in Fig. 2.5. The circuit that forms the sum bit of two input bits (i.e. Eqs. (2.3) or (2.3'')) is given a special name, the exclusive-OR. The symbol and truth table for the exclusive-OR are shown in Fig. 2.6 and that compact notation is often used when discussing higher level logic circuits.
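As a small illustration, the half-adder relations can be written directly with bitwise operators. This is a sketch only; the function names `half_adder` and `xor_form` are not from the text.

```python
def half_adder(a, b):
    """Return (S, C) for two input bits, following Eqs. (2.3) and (2.3')."""
    s = (a | b) & (1 - (a & b))  # S = (A + B).(A.B)'
    c = a & b                    # C = A.B
    return s, c


def xor_form(a, b):
    """Sum bit in the form of Eq. (2.3''): S = A.B' + B.A'."""
    return (a & (1 - b)) | (b & (1 - a))


# Both forms of the sum bit agree for every input combination.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b), xor_form(a, b))
```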

Fig. 2.6. Symbol and truth table for the exclusive-OR.

x    y    F
0    0    0
0    1    1
1    0    1
1    1    0

Fig. 2.7. Truth table and circuit for the full-adder.

A    B    C    S    φ
0    0    0    0    0
0    0    1    1    0
0    1    0    1    0
0    1    1    0    1
1    0    0    1    0
1    0    1    0    1
1    1    0    0    1
1    1    1    1    1

A full-adder must accommodate, in addition to the two input bits, a carry input from the addition of the bits in the preceding lower order. The truth table is then as shown in Fig. 2.7 where the input carry is designated by C and the output carry by φ. A full-adder circuit using the exclusive-OR notation is shown in Fig. 2.7 and the complete circuit diagram for a full-adder using enhancement FETs is given in Fig. 2.8. In analysing the circuit of Fig. 2.8 note that the elementary gates are NANDs, and that the OR function is accomplished by tying the drains to a common line; also note the inclusion of inverters at the output to assure that the circuit obeys the truth table of Fig. 2.7. Finally, we see again that a particular logic function can be realized by more than one specific circuit.
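A full-adder can be sketched in terms of two exclusive-ORs plus carry logic. The arrangement below is one common realization and is not claimed to be the exact circuit of Fig. 2.7 or Fig. 2.8; `ripple_add` is a hypothetical helper showing how the carry output of one stage feeds the next higher-order stage.

```python
def full_adder(a, b, c_in):
    """Return (sum bit, output carry) for two input bits and an input carry."""
    partial = a ^ b                     # first exclusive-OR
    s = partial ^ c_in                  # second exclusive-OR gives the sum bit
    c_out = (a & b) | (partial & c_in)  # carry to the next higher order
    return s, c_out


def ripple_add(x_bits, y_bits):
    """Add two equal-length bit lists given least significant bit first."""
    carry, out = 0, []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry


# 3 + 5 with three bits each: the sum bits are all zero and the carry is set.
print(ripple_add([1, 1, 0], [1, 0, 1]))
```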

Subtraction is achieved by forming the twos-complement of the subtrahend and adding it to the minuend; the overflow bit is to be ignored. Forming the twos-complement is equivalent to complementing all the bits and then adding 1. As an example we consider the subtraction of decimal 171 from decimal 372, where in binary representation we have

$$\begin{array}{rrl} 372 & 101\,110\,100 & (A) \\ -171 & -\,010\,101\,011 & (B) \\ \hline 201 & 011\,001\,001 & (C) \end{array} \tag{2.4}$$

According to the proposed algorithm we complement B and add to A

$$\begin{array}{rr} & 101\,110\,100 \\ + & 101\,010\,100 \\ \hline 1 & 011\,001\,000 \end{array} \tag{2.4'}$$

we also add 1 and ignore the overflow and obtain C

$$\begin{array}{rrl} 1 & 011\,001\,000 & (D) \\ + & 000\,000\,001 & (+1) \\ \hline & 011\,001\,001 & (C) \end{array} \tag{2.4''}$$

Fig. 2.8. Full-adder circuit using NAND gates made from enhancement FETs. (Adapted from W. C. Holton, The large scale integration of microelectronic circuits, Scientific American, September 1977.)
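The same steps can be sketched in a few lines. The example assumes a fixed word length of 9 bits, as in the worked example, and a non-negative result; the function name is illustrative only.

```python
def subtract_twos_complement(a, b, n_bits=9):
    """Compute a - b (for a >= b >= 0) by adding the twos-complement of b."""
    mask = (1 << n_bits) - 1      # n_bits ones, e.g. 111 111 111
    b_complement = ~b & mask      # complement every bit of b
    total = a + b_complement + 1  # add the complement, then add 1
    return total & mask           # ignore the overflow bit


result = subtract_twos_complement(372, 171)
print(result, format(result, "09b"))
```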

The proof of the algorithm is straightforward. Note that the complement of B is given by

$$B' = (111\,111\,111 - B) \tag{2.5}$$

Therefore

$$\begin{aligned} A + B' + 1 &= A + (111\,111\,111 - B) + 1 \\ &= A - B + 1\,000\,000\,000 \\ &= A - B + 000\,000\,000 = A - B \end{aligned} \tag{2.5'}$$

where in the last step the overflow bit was ignored. (See also Section 2.6(c).)

Multiplication involves bit-shifting operations and addition. Shift registers are discussed in Section 2.5 and as their name indicates they are devices that shift the bit pattern to the left (or right) by any desired number of positions. For example, we consider the multiplication of the two decimal numbers 11 and 5, which are represented in binary form by 1011 and 101. When the bit of the multiplier is zero the partial product is zero whereas when the multiplier bit is one, the partial product consists of the multiplicand shifted left by as many times as indicated by the position of the multiplier bit; we carry out the operation as follows

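$$\begin{array}{rll} 1011 & \text{multiplicand} & 11 \\ \times\ 101 & \text{multiplier} & \times\ 5 = 55 \\ \hline 1011 & \text{no shift, multiply by 1} & \\ 0000\phantom{0} & \text{shift left once, multiply by 0} & \\ 1011\phantom{00} & \text{shift left twice, multiply by 1} & \\ \hline 110111 & \text{add} & = 55_{10} \end{array}$$

Division can be implemented by a similar algorithm where the divisor is shifted left and subtracted from the dividend.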
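The shift-and-add procedure for multiplication can be sketched as below, for non-negative integers; the function name is illustrative only.

```python
def shift_and_add(multiplicand, multiplier):
    """Multiply two non-negative integers using only shifts and additions."""
    product, shift = 0, 0
    while multiplier:
        if multiplier & 1:                    # multiplier bit is one:
            product += multiplicand << shift  # add the shifted multiplicand
        multiplier >>= 1                      # move to the next multiplier bit
        shift += 1
    return product


p = shift_and_add(0b1011, 0b101)
print(p, format(p, "b"))
```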

As an example of a logic operation we discuss the comparison of two single-bit binary numbers A and B. The comparator is a very important device because it allows a computer to branch to different locations in its program (the familiar 'IF' statement in programming) on comparison of two logical or arithmetic functions. We use the notation

A .LT. B    A Less Than B       A < B
A .EQ. B    A Equal to B        A = B
A .GT. B    A Greater Than B    A > B

The truth table for the 1-bit comparator is specified by Eqs. (2.6) and involves three simple Boolean functions

$$A\ \text{.LT.}\ B = A' \cdot B \tag{2.6}$$

$$A\ \text{.EQ.}\ B = A \cdot B + A' \cdot B' \tag{2.6'}$$

$$A\ \text{.GT.}\ B = A \cdot B' \tag{2.6''}$$

The circuit for a comparator that obeys Eqs. (2.6) is shown in Fig. 2.9. It is straightforward to extend these operations to the comparison of binary numbers with any number of bits.

Note that the operation A .EQ. B represented by the circuit of Fig. 2.9 is

$$F = [(A \cdot B') + (A' \cdot B)]'$$

With the help of De Morgan's theorems this is equivalent to

$$F = (A \cdot B')' \cdot (A' \cdot B)' = (A' + B) \cdot (A + B') = (A' \cdot B') + (A \cdot B)$$

in agreement with the definition of Eq. (2.6').
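The three comparator functions, and the alternative form of the equality output, can be written out as a sketch; the function names are illustrative only.

```python
def compare_bits(a, b):
    """Return (LT, EQ, GT) for single bits a and b, following Eqs. (2.6)-(2.6'')."""
    na, nb = 1 - a, 1 - b
    lt = na & b               # A' . B
    eq = (a & b) | (na & nb)  # A . B + A' . B'
    gt = a & nb               # A . B'
    return lt, eq, gt


def eq_from_circuit(a, b):
    """Equality output in the form F = [(A.B') + (A'.B)]'."""
    return 1 - ((a & (1 - b)) | ((1 - a) & b))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, compare_bits(a, b), eq_from_circuit(a, b))
```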

Fig. 2.9. Truth table and circuit for a 1-bit comparator.

A    B    A .LT. B    A .EQ. B    A .GT. B
0    0    0           1           0
0    1    1           0           0
1    0    0           0           1
1    1    0           1           0

2.3 Decoders and multiplexers

The devices discussed in this section are built out of basic logic gates just as were the arithmetic units. Their purpose however is to interpret instructions sent in binary form and to switch a signal along a particular path. Their operation and usefulness will become apparent as we discuss their performance and applications.


A decoder is a device that activates (sets high) a particular hardware line in response to a binary input. If the decoder has n inputs it can have as many as 2ⁿ output lines, which is the number of distinct combinations of n bits. As an example we show in Fig. 2.10 the truth table and circuit for a 2-bit decoder, also referred to as a 2 x 4 decoder. The decoded outputs are labeled D0, D1, D2, D3 and are asserted respectively by the binary signals 00, 01, 10, 11. An enable line has been included so that no output is generated unless the enable is true.

The presence of the enable line makes it possible to combine two 2 x 4 decoders into a 3 x 8 decoder. This is shown in Fig. 2.11 where the additional higher order bit acts on the enable line.
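A behavioural sketch of the decoder and of the way the enable line is used to build a larger decoder is given below. It assumes that the high-order bit enables one 2 x 4 decoder directly and the other through an inverter, which is one natural reading of the construction; the function names are illustrative only.

```python
def decoder_2x4(a, b, enable=1):
    """Return [D0, D1, D2, D3]; D_k is high when (a, b) encodes k and enable is high."""
    code = 2 * a + b
    return [1 if (enable and code == k) else 0 for k in range(4)]


def decoder_3x8(c, a, b):
    """Combine two 2x4 decoders; the high-order bit c selects which one is enabled."""
    low = decoder_2x4(a, b, enable=1 - c)  # active for codes 000 to 011
    high = decoder_2x4(a, b, enable=c)     # active for codes 100 to 111
    return low + high


print(decoder_2x4(1, 0))     # binary 10 asserts D2
print(decoder_3x8(1, 0, 1))  # binary 101 asserts output 5
```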

A multiplexer switches one of several inputs onto a single output line. The input to be selected is identified by the binary information on the control (or select) lines. A 4 x 1 multiplexer is shown in Fig. 2.12, where the inputs are labeled I0, I1, I2, I3 and the control lines by S0, S1. Depending on the gate selected by the S1S0 instruction, the input line attached to that gate is effectively connected to the output line whereas all the other input lines are isolated from the output.
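The selection logic can be sketched with one AND term per input and a final OR. This is a behavioural model under the assumption that S1 is the high-order select bit, not a transcription of Fig. 2.12; the function name is illustrative only.

```python
def multiplexer_4x1(inputs, s1, s0):
    """Route one of four input bits to the output according to the select lines."""
    i0, i1, i2, i3 = inputs
    n1, n0 = 1 - s1, 1 - s0
    # One AND gate per input, enabled only by its own select code;
    # the OR gate merges the four gated lines onto the single output.
    return (i0 & n1 & n0) | (i1 & n1 & s0) | (i2 & s1 & n0) | (i3 & s1 & s0)


# Select input I2 (select code 10) from the pattern I0..I3 = 0, 1, 1, 0.
print(multiplexer_4x1((0, 1, 1, 0), 1, 0))
```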

The multiplexer is the key device in telephone communication. For instance when one lifts the receiver that particular telephone line (one of the inputs I0 to I3) is connected to a trunk line at the substation. Depending on the number dialed the trunk line is connected to a particular customer (one of the outputs D0 to D3 in Fig. 2.10), through a demultiplexer. In fact a decoder is also a demultiplexer, if the enable line is used as the signal input line.

Fig. 2.10. Truth table and circuit for a 2 x 4 decoder.

A    B    D0    D1    D2    D3
0    0    1     0     0     0
0    1    0     1     0     0
1    0    0     0     1     0
1    1    0     0     0     1

Fig. 2.11. Combining two 2 x 4 decoders into a 3 x 8 decoder.

Fig. 2.12. Circuit for a 4 x 1 multiplexer.

2.4 Flip-flops

The circuits that we discussed up to now have the property that their output depends on the instantaneous value of the input. In another class of circuits a momentary input fixes the output state; the output will remain in that state, even after the input is removed and until a reset command is received. We can think of such circuits as devices that maintain the 'memory' of the input signal. They are commonly referred to as latches.

The simplest form of a latch is a trigger circuit: whenever the input exceeds a certain threshold the circuit outputs a square pulse of predetermined amplitude and width as shown by the typical waveform in Fig. 2.13. The simple trigger is a monostable device in the sense that it has only one stable state (in Fig. 2.13 this is the low voltage level) and under the influence of the signal it will give a high output for some predetermined time interval, but will then return to the low level. A bistable circuit has two stable states and can remain in either of these states indefinitely. Bistable circuits are referred to as flip-flops and always involve a certain amount of feedback.

We can make a flip-flop using two NAND or NOR gates which are interconnected. In Fig. 2.14 the output of the top NOR gate is fed back to form one of the inputs of the lower NOR gate, and vice versa. Such a circuit will be in one of two states: either Q is high (and Q' is low) or Q' is high (and Q low). Usually, both the S and R inputs are held low so that if Q is high, the lower NOR gate has one input high and forces Q' to be low. With Q' low, the top NOR gate has both of its inputs low and therefore Q remains high. Similarly, if Q' is high and Q is low, the circuit will remain in that state, unless an external signal is applied at the input. It is not possible for both Q and Q' to be high at the same time because the system becomes unstable and one or the other of the NOR gates will switch off allowing the circuit to settle in one of its stable states.

Fig. 2.13. Input and output waveforms for a Schmitt trigger circuit.

Fig. 2.14. Simple flip-flop using two NOR gates, with inputs R (Reset/Flop) and S (Set/Flip).

Suppose now that the circuit is in the state with Q' high and that momentarily the S input goes high. This will drive Q' low; but as Q' goes low it forces the top NOR gate off, namely Q high. As Q goes high the lower NOR gate is latched into its on position (Q' low) even though S may return to its low level. The circuit has been switched to the Q-high state. If instead the R input was set high momentarily while the circuit was in the Q-low state no transition would occur. Thus the S (set) and R (reset) commands drive the flip-flop into its set state (Q-high) or its reset state (Q-low). Flip-flops (latches) are characterized by a table which gives the state of the flip-flop after a certain command is issued at its input, rather than by a truth table.

State of a flip-flop following the R-S commands

Command                   Q    Q'
S (momentarily high)      1    0
R (momentarily high)      0    1

Note that if both R and S go simultaneously high, both Q and Q' will want to go low, which is an undefined state for the latch. Furthermore it is not possible to predict in what state the latch will be when R and S are returned to low.
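The latching action can be illustrated by iterating the two cross-coupled NOR gates until their outputs stop changing. This is a simplified sketch: it assumes ideal gates, evaluates the top gate before the lower one on each pass, and takes the top gate to have inputs R and Q' and the lower gate inputs S and Q. The function names are illustrative only.

```python
def nor(a, b):
    return 0 if (a or b) else 1


def settle(s, r, q, q_bar, max_steps=10):
    """Iterate the two cross-coupled NOR gates until the outputs are stable."""
    for _ in range(max_steps):
        new_q = nor(r, q_bar)      # top gate: inputs R and Q'
        new_q_bar = nor(s, new_q)  # lower gate: inputs S and Q
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


def pulse(command, q, q_bar):
    """Apply a momentary S or R pulse, then return both inputs to low."""
    s, r = (1, 0) if command == "S" else (0, 1)
    q, q_bar = settle(s, r, q, q_bar)
    return settle(0, 0, q, q_bar)


state = (0, 1)             # start in the reset state: Q low, Q' high
state = pulse("S", *state)
print(state)               # the latch is now set
state = pulse("R", *state)
print(state)               # and reset again
```

The state survives after the pulse is removed, which is the 'memory' property described above.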

In complex logic circuits it is desirable that the changes of state and other operations occur through the entire system at well-defined time intervals. This is assured by using a clock which issues pulses at some fixed rate. A flip-flop which can undergo transitions only in synchronism with the clock pulse is shown in Fig. 2.15. Here we used NAND gates but the operation is the same as for the circuit of Fig. 2.14, except that the position of the S, R inputs is reversed. The characteristic table gives the state of the Q output at the time specified by the i, (i + 1), (i + 2), ... etc. clock pulse. It is also convenient to have an excitation table which shows the status of the Q output before and after a command; here we have marked by an X the don't care conditions, namely when the output is independent of the state of that particular input line.

Fig. 2.15. Clocked R-S flip-flop.

Characteristic table for a clocked R-S flip-flop

S    R    Q(t_{i+1})    Function
0    0    Q(t_i)        No change
0    1    0             Clear
1    0    1             Set
1    1    ?             Not allowed

Excitation table for a clocked R-S flip-flop

Q(t_i)    Q(t_{i+1})    S    R
0         0             0    X
0         1             1    0
1         0             0    1
1         1             X    0

The asymmetries inherent in the R-S flip-flop are absent in a more elaborate type of flip-flop. This is referred to as a J-K flip-flop with J the set and K the clear inputs. In this case, if both the J and K inputs are simultaneously asserted the flip-flop undergoes a transition, i.e. it complements its output. The characteristic table, the symbol and the excitation table for the J-K flip-flop are shown below.

Characteristic table for a J-K flip-flop

J    K    Q(t_{i+1})    Function
0    0    Q(t_i)        No change
0    1    0             Clear
1    0    1             Set
1    1    Q'(t_i)       Complement

Excitation table for a J-K flip-flop

Q(t_i)    Q(t_{i+1})    J    K
0         0             0    X
0         1             1    X
1         0             X    1
1         1             X    0
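The characteristic table can be condensed into a single next-state expression, Q(t_{i+1}) = J·Q' + K'·Q, which is the standard characteristic equation of a J-K flip-flop and is not quoted from the text. The sketch below uses it to regenerate both tables; the function names are illustrative only.

```python
def jk_next(q, j, k):
    """Next state of a J-K flip-flop after one clock pulse: J.Q' + K'.Q."""
    return (j & (1 - q)) | ((1 - k) & q)


def excitation(q_now, q_next):
    """Return (J, K) needed for the transition, with 'X' for don't care."""
    def needed(values):
        return values.pop() if len(values) == 1 else "X"

    pairs = [(j, k) for j in (0, 1) for k in (0, 1) if jk_next(q_now, j, k) == q_next]
    return needed({j for j, _ in pairs}), needed({k for _, k in pairs})


for q_now in (0, 1):
    for q_next in (0, 1):
        print(q_now, q_next, excitation(q_now, q_next))
```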

2.5 Registers and counters

A set of latches can be used as a register which will hold any specific pattern of digits. Registers are used in all computers to hold data or instructions. The data can be transferred from one register to another or to and from arithmetic units. A simple 3-bit register built from R-S flip-flops is shown in Fig. 2.16. The register can be loaded with the information on the three input lines I1, I2, I3, when and only when the load command (line) is asserted; furthermore the loading is executed in synchronism with the clock pulse. The contents of the register appear as levels at the output lines A1, A2, A3, which are connected to the Q output of the flip-flop. The entire register can be simultaneously cleared by pulsing the clear line.

Fig. 2.16. Three-bit register using R-S flip-flops.

Information can be transferred between registers rapidly and efficiently, as shown in the following example which uses a two-phase clock. In this case clock pulses appear in two time sequences φ1(t) and φ2(t) which are displaced in time as shown in Fig. 2.17. The two signals never overlap, a condition that can be written in Boolean notation as φ1(t)·φ2(t) = 0. The transfer of the information from register X to register Y under clock control is realized as shown in Fig. 2.18(a), which is based on MOS technology enhancement FETs. The transistors controlled by the clock pulses are referred to as pass transistors since they allow the signal to flow down the line. A simpler notation for the same circuit is shown in (b) of the figure, and an even simpler form, known as a stick diagram, is shown in (c). Evidently, during the clock cycle φ1, the inverter A will be set; in the next cycle, i.e. φ2, the inverter B will be set and so on down the line. Thus the inverters can carry different information as a function of time.

Fig. 2.17. Timing diagram for a two-phase non-overlapping clock.

Fig. 2.18. Transfer of signals from X to Y under clock control: (a) circuit diagram, (b) mixed notation, (c) stick diagram.

A particularly useful form of register is the shift register, in which a bit pattern is shifted by one position to the right or left. The stick diagram for a 3-bit shift register is shown in Fig. 2.19 where in practice more bits are used. As compared to Fig. 2.18 the lines in the shift register contain two additional pass transistors (sticks) indicated by the heavier lines. The shift command is indicated by S and if S is not asserted, S'·φ2 is true while S·φ2 is false, letting data flow along the horizontal path. When S is true, the transistors attached to the S·φ2 line conduct, whereas those connected to S'·φ2 are turned off. As a result the signals flow along the diagonals and the bit pattern is shifted one position up: X3 → Y2, X4 → Y3, X5 → Y4, etc.

Fig. 2.19. Stick diagram of a shift register.

Fig. 2.20. Three-bit binary counter.

A related device is a binary counter. As the name indicates it counts the number of input pulses it has received since the last time it was cleared. A binary counter using J-K flip-flops is shown in Fig. 2.20. Whenever an input signal arrives it asserts simultaneously both the J and K inputs of the first flip-flop. Thus the flip-flop complements itself. The second flip-flop does not see the input signal unless line A0 is asserted; similarly the third flip-flop does not see the input signal unless A1 and A0 are asserted. As drawn, the counter changes states only in synchronism with the clock pulse. If the counter was initially cleared the bit pattern changes as in the table below, namely as a binary count. By using appropriate feedback paths one can build decimal counters or counters in any base.

Bit pattern for 3-bit binary counter

Input      A2   A1   A0
(Clear)    0    0    0
Yes        0    0    1
Yes        0    1    0
Yes        0    1    1
Yes        1    0    0
Yes        1    0    1
Yes        1    1    0
Yes        1    1    1
Yes        0    0    0    Carry pulse out
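The toggling rule described above (a flip-flop complements itself only when all lower bits are asserted) can be sketched in a few lines of Python. This is an illustrative model only, not the circuit of Fig. 2.20; the function names `step` and `count` are hypothetical, and the carry is taken here to be asserted on the pulse that wraps the count back to zero.

```python
def step(bits):
    """Advance a synchronous binary counter by one input pulse.

    bits is a list [A0, A1, A2, ...] with the lowest bit first.
    Flip-flop k toggles (J = K = 1) only when all lower bits are 1.
    Returns (new_bits, carry).
    """
    new_bits = list(bits)
    enable = 1                       # the input pulse itself
    for k, bit in enumerate(bits):
        if enable:
            new_bits[k] = bit ^ 1    # J = K = 1: the flip-flop complements itself
        enable = enable & bit        # next stage sees the pulse only if this bit was 1
    return new_bits, enable          # enable after the last stage acts as the carry


def count(n_pulses, width=3):
    """Return the (bit pattern, carry) pairs after each of n_pulses inputs."""
    bits = [0] * width               # counter initially cleared
    rows = []
    for _ in range(n_pulses):
        bits, carry = step(bits)
        rows.append(("".join(str(b) for b in reversed(bits)), carry))
    return rows


for pattern, carry in count(8):
    print(pattern, "carry" if carry else "")
```

Running the loop for eight pulses reproduces the rows of the table, with the carry appearing on the eighth pulse.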

2.6 Data representation and coding

(a) Number systems: Numerical information can be expressed in any base system. In daily life we use the decimal (base ten) system, whereas for digital logic the most natural system is binary (base two). The number of symbols required in any particular system equals the base r.

System        Base   Symbols
Binary        2      0, 1
Octal         8      0, 1, 2, 3, 4, 5, 6, 7
Decimal       10     0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Hexadecimal   16     0-9, A, B, C, D, E, F

As an example consider conversion from hexadecimal to decimal:

$$(E4)_{16} = (E \times 16) + (4 \times 1) = 224 + 4 = (228)_{10}$$

$$(AE4)_{16} = [A \times (16)^2] + (E \times 16) + (4 \times 1) = (2788)_{10}$$

Next we convert a decimal number $(47.8125)_{10}$ to binary. To convert the integer part we should find the largest power of 2 that fits in the number; this gives the position of the highest binary bit. The difference between the number and that highest power of 2 is then converted to binary, and so on until the difference is either 0 or 1. An algorithm for the conversion process is indicated below on the left side: divide the number by 2 and save the remainder; in the next line divide the quotient by 2 and save the remainder; the quotient from the second line is divided by 2 in the third line, and so on until the quotient is zero. The binary number is then made up by reading the remainders from the bottom up; the highest bit is the last remainder. Thus

$$(47)_{10} = (101{,}111)_2$$

For the fractional part of the decimal number, the algorithm is shown on the right side: we multiply by 2 and keep the integer part of the result, and continue multiplying until no more fraction is left. Note that this process may not terminate. The bit pattern is now read from the top down: $(0.8125)_{10} = (0.1101)_2$.

Integer part (quotient, remainder)     Fractional part
47 / 2 = 23, remainder 1               0.8125 x 2 = 1.625 = 1 + 0.625
23 / 2 = 11, remainder 1               0.625  x 2 = 1.25  = 1 + 0.25
11 / 2 =  5, remainder 1               0.25   x 2 = 0.5   = 0 + 0.5
 5 / 2 =  2, remainder 1               0.5    x 2 = 1.0   = 1 + 0
 2 / 2 =  1, remainder 0
 1 / 2 =  0, remainder 1

Therefore

$$(47.8125)_{10} = (101{,}111.110{,}1)_2$$

as the reader can easily verify.

Octal numbers are often used instead of binary numbers because their bit image is the same as for the corresponding binary number, yet they can be expressed with one third as many digits. Hexadecimal notation is even more economical, even though using letter symbols for numbers is less familiar. This correspondence can be seen by a simple example. Consider the 16-bit binary number

$$(X)_2 = 1{,}111{,}011{,}100{,}101{,}010$$

To obtain the octal representation we express every group of 3 binary bits (starting from the right) by its corresponding octal symbol. Thus by inspection

$$(X)_8 = (173{,}452)_8$$

For the hexadecimal representation we break the bit pattern into groups of 4 bits

$$(X)_2 = 1111{,}0111{,}0010{,}1010$$

so that by inspection

$$(X)_{16} = (F72A)_{16}$$


Finally, in decimal representation the number we have been considering is

$$(X)_{10} = 2^{15} + 2^{14} + 2^{13} + 2^{12} + 2^{10} + 2^{9} + 2^{8} + 2^{5} + 2^{3} + 2^{1} = (63{,}274)_{10}$$
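The conversion procedures of this section (repeated division for the integer part, repeated multiplication for the fraction, and regrouping of bits for octal and hexadecimal) can be sketched as follows. The function names are hypothetical, and `max_bits` is an assumed cut-off introduced because the fractional expansion may not terminate.

```python
def int_to_binary(n):
    """Repeated division by 2; the remainders are read from the bottom up."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)
        remainders.append(str(r))
    return "".join(reversed(remainders))


def frac_to_binary(x, max_bits=16):
    """Repeated multiplication by 2; the integer parts are read from the top down.

    max_bits caps the loop because the process may not terminate.
    """
    bits = []
    while x > 0 and len(bits) < max_bits:
        x *= 2
        whole = int(x)
        bits.append(str(whole))
        x -= whole
    return "".join(bits)


def regroup(bits, group):
    """Re-express a binary string in base 2**group (3 for octal, 4 for hexadecimal)."""
    width = -(-len(bits) // group) * group       # round the length up to a full group
    padded = bits.zfill(width)                   # pad with leading zeros on the left
    digits = "0123456789ABCDEF"
    return "".join(digits[int(padded[i:i + group], 2)]
                   for i in range(0, width, group))


print(int_to_binary(47), frac_to_binary(0.8125))
x = "1111011100101010"
print(regroup(x, 3), regroup(x, 4), int(x, 2))
```

Grouping from the right is handled here by padding on the left, so that the leading group may be shorter than the others, as in the octal example above.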

The largest decimal number that can be encoded with 16 bits is

$$2^{16} - 1 = (65{,}535)_{10}$$

Groups of 8 bits are referred to as forming 1 byte.

(b) Codes: Numerical data can also be represented by a code instead of by a number system. One common system is the so-called binary-coded-decimal (BCD). In this code 4 bits are used to represent the numbers from 0 to 9. Decimal numbers are then constructed by adjoining such groups of 4 bits (half-bytes). For instance the number $(189)_{10}$ is represented in BCD by

$$(189)_{10} = (1 \times 100) + (8 \times 10) + (9 \times 1) = [0001{,}1000{,}1001]_{BCD} = [110{,}001{,}001]_{BCD}$$

Note that this is quite different from the binary bit pattern of the number

$$(189)_{10} = (10{,}111{,}101)_2$$
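The difference between the BCD code and the binary number can be checked with a short sketch. The helper names `to_bcd` and `from_bcd` are hypothetical; the code simply encodes each decimal digit separately in 4 bits.

```python
def to_bcd(n):
    """Encode a non-negative integer in BCD, one 4-bit group per decimal digit."""
    return ",".join(format(int(digit), "04b") for digit in str(n))


def from_bcd(code):
    """Decode a comma-separated BCD string back to an integer."""
    return int("".join(str(int(group, 2)) for group in code.split(",")))


print(to_bcd(189))          # BCD groups for the digits 1, 8, 9
print(format(189, "b"))     # pure binary pattern of the same number
```

For this example the BCD form needs 12 bits whereas the binary form needs 8, which illustrates the loss of economy in exchange for a simpler digit-by-digit correspondence.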

Codes are generally less economical in terms of bits than the base-$2^n$ number systems but they are simpler. Furthermore, to encode alphabetic letters we must use a code if we wish to store and manipulate them by digital techniques. Various codes are in existence, one example being the telegraph code introduced by Samuel Morse. Another code that was quite popular and used on 'IBM' punched cards was the Hollerith code.

Today most devices use the ASCII code (American Standard Code for Information Interchange). This is a 7-bit code and therefore can encode up to $2^7 = 128$ different symbols. These include the 26 upper case and 26 lower case letters, the 10 numbers, 11 special symbols, 23 format controllers such as carriage return, indent, skip etc., as well as data flow control signals. When a parity bit is included, every ASCII symbol occupies one byte. A partial list of the ASCII code is shown in Table 2.2.

Another example of a code is the Gray code, which is well suited for encoding the rotation of a shaft. Consider for instance a shaft onto which the three wheels shown in Fig. 2.21 are rigidly attached. The shaded areas of each wheel will give rise to a '1' when they are in contact with the corresponding brush, which is located near the vertical. The position of the shaft can then be encoded modulo 45° as shown in the table. In the Gray code only one bit changes at any transition, and this helps resolve ambiguities when a brush first makes contact with the conducting part of the wheel.

Fig. 2.21. Encoding of shaft rotation in Gray code.

Angle         Encoded signal      Angle         Encoded signal
0°-45°        000                 180°-225°     110
45°-90°       001                 225°-270°     111
90°-135°      011                 270°-315°     101
135°-180°     010                 315°-360°     100
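The sequence in the table coincides with the standard reflected Gray code, which can be generated from the sector number n as n XOR (n shifted right by one bit). The sketch below assumes that construction; the function names are hypothetical, and the wheel layout of Fig. 2.21 is not modelled.

```python
def gray(n):
    """Reflected Gray code of the integer n."""
    return n ^ (n >> 1)


def shaft_code(angle_deg):
    """3-bit Gray code of the 45-degree sector containing the given shaft angle."""
    sector = int((angle_deg % 360) // 45)
    return format(gray(sector), "03b")


codes = [format(gray(k), "03b") for k in range(8)]
print(codes)

# Check that exactly one bit changes at every transition, including 360° -> 0°.
for k in range(8):
    changed = gray(k) ^ gray((k + 1) % 8)
    assert bin(changed).count("1") == 1
```

The final loop confirms the property stated in the text: neighbouring sectors differ in a single bit, so a brush that is just making contact can produce an error of at most one sector.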

Table 2.2. American National Standard Code for Information Interchange (ASCII)

Character   Binary code      Character   Binary code
A           100 0001         0           011 0000
B           100 0010         1           011 0001
C           100 0011         2           011 0010
D           100 0100         3           011 0011
E           100 0101         4           011 0100
F           100 0110         5           011 0101
G           100 0111         6           011 0110
H           100 1000         7           011 0111
I           100 1001         8           011 1000
J           100 1010         9           011 1001
K           100 1011
L           100 1100         blank       010 0000
M           100 1101         .           010 1110
N           100 1110         (           010 1000
O           100 1111         +           010 1011
P           101 0000         $           010 0100
Q           101 0001         *           010 1010
R           101 0010         )           010 1001
S           101 0011         -           010 1101
T           101 0100         /           010 1111
U           101 0101         ,           010 1100
V           101 0110         =           011 1101
W           101 0111
X           101 1000
Y           101 1001
Z           101 1010


(c) Data representation: We have indicated how integers can be expressed in different bases or encoded. Numbers, however, can be either positive or negative, and furthermore it is often desirable to use exponential notation, also referred to as the floating point representation, where very large and/or very small numbers can be expressed with a limited number of digits.

To indicate negative numbers the left-most bit of the computer word is frequently used as the sign bit. A '0' indicates positive numbers and a '1' negative numbers. For instance in a 16-bit machine

0 100 000 001 001 001 = $+(16457)_{10}$

1 100 000 001 001 001 = $-(16457)_{10}$

A more efficient format for manipulating negative numbers is to represent them by their complement. For binary numbers the ones-complement consists in complementing every bit. The twos-complement consists in subtracting the number from zero: this is equivalent to forming the ones-complement and then adding 1 to the result. For example, in a 16-bit machine the number $(16{,}457)_{10}$ used above has

1s-complement   1 011 111 110 110 110
2s-complement   1 011 111 110 110 111
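The two complements can be verified with ordinary integer arithmetic restricted to 16 bits. This is a minimal sketch; the names `WIDTH`, `MASK` and `grouped` are illustrative choices, not part of the text.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1          # sixteen 1s, used to confine results to one word


def ones_complement(n):
    """Complement every bit of the 16-bit word."""
    return ~n & MASK


def twos_complement(n):
    """Subtract the number from zero, keeping only 16 bits."""
    return (-n) & MASK


def grouped(n):
    """Format a word as a sign bit followed by five groups of three bits."""
    bits = format(n, "016b")
    return bits[0] + " " + " ".join(bits[i:i + 3] for i in range(1, WIDTH, 3))


n = 16457
print(grouped(n))
print(grouped(ones_complement(n)))
print(grouped(twos_complement(n)))
# Adding a number to its twos-complement gives zero within the word length.
assert (n + twos_complement(n)) & MASK == 0
```

The last line shows why the twos-complement is convenient: subtraction can be carried out by the same adder that performs addition.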

In floating point any number is represented by the mantissa and the exponent. Thus, we need two registers to store the number; as an example consider

$$A = (457.63)_{10}$$

We write A as a pure decimal

$$A = 0.45763 \times 10^3$$

and therefore the two registers will contain

Sign bit   Contents
0          45763        mantissa
0          3            exponent

The same procedure is applicable to binary numbers. For instance $(8.375)_{10}$ is expressed in binary as

$$(8.375)_{10} = (1000.011)_2 = (0.1000011)_2 \times 2^4$$

and therefore the two registers will contain

Sign bit   Contents
0          1000 011     mantissa
0          100          exponent
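The normalization to a pure binary fraction can be reproduced with the standard library function `math.frexp`, which returns a mantissa between 0.5 and 1 and the corresponding power of 2. The register layout in the sketch (a sign bit plus a 7-bit mantissa string) is an assumption made for illustration and does not correspond to any particular machine format.

```python
import math


def binary_float_registers(x, mantissa_bits=7):
    """Return (sign, mantissa bit string, exponent) with 0.5 <= mantissa < 1."""
    sign = 0 if x >= 0 else 1
    m, e = math.frexp(abs(x))        # abs(x) == m * 2**e with 0.5 <= m < 1
    bits = []
    for _ in range(mantissa_bits):   # expand the fraction by repeated doubling
        m *= 2
        bit = int(m)
        bits.append(str(bit))
        m -= bit
    return sign, "".join(bits), e


sign, mantissa, exponent = binary_float_registers(8.375)
print(sign, mantissa, format(exponent, "b"))
```

With more mantissa bits the number is represented more precisely, whereas the number of exponent bits sets the range of magnitudes that can be reached.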

The precision with which a number is represented depends, of course, on the number of bits available. For instance a 32-bit machine has a precision of about 10 decimal digits, since $2^{32} \approx 4.3 \times 10^9$. Often, when special precision is needed in a particular calculation, two computer words can be used to represent a single number; such calculations are said to be executed in double precision. On the other hand, the largest or smallest number that can be represented depends on the bits assigned to express the exponent.

(d) Error checking: We have seen that a binary bit pattern can be used to represent numbers or characters. It is possible to add an extra bit to this pattern so as to make the total number of ones in the word always odd (or always even). This is referred to as the parity of the word. As an example we consider the word 'WHAT' encoded in ASCII (see Table 2.2); we have added a parity bit so that all letters are represented by bytes of odd parity.

Character   ASCII      Parity bit   Number of 1s (odd)
W           101 0111   0            5
H           100 1000   1            3
A           100 0001   1            3
T           101 0100   0            3

If this message is transmitted over a teletype line, or is stored on tape, the parity of each word is checked upon retrieval. If the parity is not odd, a mistake in transmission must have occurred. A simple odd parity generator for 3-bit words is shown in Fig. 2.22(a). It uses exclusive-ORs, and the reader can construct the truth table and verify that either 1 or 3 of the output lines A, B, C and P are always asserted. An odd parity checker circuit is shown in (b) of the figure. Parity generating and checking circuits are now incorporated in most VLSI packages.

Fig. 2.22. Odd parity generating and checking circuits for 3-bit words: (a) generator, (b) checker.

The form of parity discussed here is known as transverse parity and is extensively used when data is written on magnetic tape. It is the simplest error checking method and has clear limitations; for instance it would not detect an error if an even number of bits were 'dropped'. More elaborate checking methods are available by using more complicated codes, and these can even locate the bit that was in error.
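Odd parity generation and checking for the word 'WHAT' can be sketched as below. Placing the parity bit after the 7-bit code (as the lowest bit of the byte) follows the layout of the table above but is otherwise an assumption; the function names are hypothetical.

```python
def odd_parity_bit(code, width=7):
    """Parity bit that makes the total number of 1s odd."""
    ones = bin(code & ((1 << width) - 1)).count("1")
    return 0 if ones % 2 == 1 else 1


def encode(text):
    """Append an odd parity bit to the 7-bit ASCII code of each character."""
    return [(ord(ch) << 1) | odd_parity_bit(ord(ch)) for ch in text]


def parity_ok(byte):
    """True if the byte contains an odd number of 1s."""
    return bin(byte).count("1") % 2 == 1


message = encode("WHAT")
print([format(b, "08b") for b in message])
print([parity_ok(b) for b in message])

corrupted_once = message[0] ^ 0b00000100     # one bit dropped: detected
corrupted_twice = message[0] ^ 0b00000110    # two bits dropped: not detected
print(parity_ok(corrupted_once), parity_ok(corrupted_twice))
```

The last line illustrates the limitation mentioned above: a single flipped bit changes the parity and is caught, whereas two flipped bits restore odd parity and pass unnoticed.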

2.7 Computer memories

A computer is a high speed electronic calculating machine. Before the advent of electronics, mechanical calculators were in use, and the first design of such a device is attributed to Blaise Pascal (1623-62). The concepts used in modern computers first appear in a proposal by Charles Babbage (1792-1871), who suggested the storage of intermediate results in the calculation. The memory of Babbage's machine would have consisted of wheels with 10 positions for each decimal digit. He proposed a memory for 1000 numbers, each with 50 digit precision, which would require 50 000 wheels. It took almost 100 years before these ideas could be used in practice in electronic computer memories. The memory must hold the program that the computer executes, the input and output data, as well as intermediate results. It must be possible to store and retrieve data from specific memory locations, every location being assigned a specific address.

In principle, a computer memory could consist of a set of registers such as those used for transferring or holding data. However, in the early computers electronic storage was impractical and the first commercial memories used small cylindrical ferrite cores. The cores could be driven into saturation with the magnetization clockwise or counterclockwise, indicating the two binary states. Even though ferrite core memories are no longer in use, magnetic recording on disk and tape is the principal technique for mass storage of data, as discussed in Section 2.8.

Flip-flops or latches are well suited as elements of an electronic memory. However it is simpler, and therefore less expensive, to use a single capacitor as a memory cell. If the capacitor is charged the cell is assumed to be in the '1' state, whereas when the capacitor is discharged the cell is in the '0' state. Today a charge of Q = 500 fC can be easily measured, so that for a system using 5 volts the capacitance must be

$$C = \frac{Q}{V} = \frac{500 \times 10^{-15}}{5} = 10^{-13}\ \mathrm{F} = 0.1\ \mathrm{pF}$$

Such a capacitance covers only a small area and can be easily implemented in VLSI technology. For instance a $10^6$ bit memory can be placed on a 3 x 3 mm$^2$ chip.

The memory cells are arranged in an array and are addressed by two simultaneous signals as shown in Fig. 2.23. Here the memory capacitor is charged through a pass transistor. One of the address lines connects the source of the FET to ground, whereas the other address line enables the gate of the FET. The memory capacitor, however, will not hold its charge forever and therefore the charge must be regenerated at fixed time intervals. Typically a read or write operation takes about 200 ns and the charge must be regenerated at time intervals of order 1-2 milliseconds, leading to a dead time of the order of a few percent.

In memories such as the one shown in Fig. 2.23 any cell can be accessed at random. They are referred to as RAMs, for random access memory, and a 1 Mbyte card can be purchased today for a few hundred dollars. To address a 1 Mbyte RAM arranged as a 1024 x 1024 array, 1024 row lines and 1024 column lines are required; these can be encoded using 20 bits. Therefore, it suffices to bring into the RAM (10 + 10) = 20 lines of binary information to address the $10^6$ memory locations. The block diagram of a memory unit with its supporting circuitry can be represented as in Fig. 2.24.
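The (10 + 10)-line addressing scheme amounts to splitting a 20-bit address into a row number and a column number, each of which is then decoded into one of 1024 lines. In the sketch below the assignment of the upper 10 bits to the row and the lower 10 bits to the column is an assumption made for illustration; the text does not specify it.

```python
ROW_BITS = 10
COL_BITS = 10


def split_address(address):
    """Split a 20-bit address into (row, column) line numbers."""
    row = address >> COL_BITS                    # upper 10 bits (assumed to select the row)
    col = address & ((1 << COL_BITS) - 1)        # lower 10 bits (assumed to select the column)
    return row, col


def one_hot(line, n_lines=1024):
    """Decoder output: exactly one of the n_lines address lines is asserted."""
    return [1 if k == line else 0 for k in range(n_lines)]


print(2 ** (ROW_BITS + COL_BITS))                # number of addressable locations
row, col = split_address(0b0000000001_0000000011)
print(row, col, sum(one_hot(row)), sum(one_hot(col)))
```

Each decoder asserts a single line, and only the cell at the intersection of the asserted row and column lines is accessed.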

A variant of the RAM is the read-only memory (ROM). In this case the contents of the memory are introduced at the manufacture stage and cannot be altered. However the memory is addressable and the contents of each location can be read out. Such memories are typically used in small pocket calculators. In more advanced types of ROM any particular program can be 'burnt' into the ROM, i.e. the memory can be permanently loaded according to the user's specifications. Memories using flip-flops are more versatile than the simple capacitor memories that we discussed, but they are more expensive as well. A block diagram of a memory cell using an R-S flip-flop is shown in Fig. 2.25.

Fig. 2.23. Schematic of an electronic memory array.

Random access memories are indispensable for the operation of a computer, but when there is need to store large amounts of data it becomes necessary to use cheaper and more compact devices such as magnetic tape or magnetic disks, where the access is serial. A memory intermediate in access and cost between the RAMs and the mass storage devices is the CCD: the abbreviation stands for Charge-Coupled Device. This is a two-dimensional register through which the data is continuously shifted at a high rate, so that data can be accessed fairly quickly (even though not randomly); yet CCDs can be manufactured very compactly.

Fig. 2.24. Block diagram of computer memory.

Fig. 2.25. Read/write memory cell using an R-S flip-flop.

A schematic of a 64 x 64 bit CCD is shown in Fig. 2.26; it operates under clock control. The data comes in as a 64 bit word and is transferred to the first row of the CCD. At the next clock cycle the 64 bit word is pushed to the second row and new data can be stored in the first row. When a word reaches the last row, it exits the CCD and can be sent out via the output line, or otherwise it is recirculated to the first row of the CCD. To estimate the access time consider a CCD with a shift rate of 80 kHz. Then any one row takes at most

$$\Delta t_{max} = 64 \times \frac{1}{8 \times 10^4} = 8 \times 10^{-4}\ \mathrm{s}$$

before appearing at the output. Since each row contains 8 bytes, the rate of access is

$$\mathrm{Rate} = 8 \times (1/\Delta t_{max}) = 10^4\ \mathrm{bytes/s}$$

If blocks of data are transferred the rate is much higher, approaching $8 \times (8 \times 10^4) = 6.4 \times 10^5$ bytes/s.
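The access-time estimate can be repeated numerically. The variable names below are illustrative; the values are those quoted in the text.

```python
ROWS = 64
BITS_PER_ROW = 64
SHIFT_RATE_HZ = 8e4                      # 80 kHz shift rate

max_wait = ROWS / SHIFT_RATE_HZ          # worst case: the row must traverse all 64 rows
bytes_per_row = BITS_PER_ROW // 8
random_rate = bytes_per_row / max_wait   # one row retrieved per worst-case wait
block_rate = bytes_per_row * SHIFT_RATE_HZ   # consecutive rows, one per shift

print(f"maximum wait  = {max_wait:.1e} s")
print(f"random access = {random_rate:.1e} bytes/s")
print(f"block access  = {block_rate:.1e} bytes/s")
```

The factor of 64 between the two rates is simply the number of rows: in block transfer every shift delivers a useful row, whereas in random access up to 64 shifts are spent waiting for one.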

Fig. 2.26. Block diagram of a 64 x 64 CCD.

2.8 Magnetic storage

The invention of magnetic tape in the late 1950s has made possible the storage of large amounts of information in highly compact form. For instance, a reel of magnetic tape is typically 2400 feet long and 1/2 inch wide, and a total of 9 tracks are encoded. The 9 tracks correspond to 8 bits (i.e. one byte) of information and one transverse parity bit. The density that can be written/read by modern tape drives is 6250 bits per inch (bpi). Thus if we use one byte to encode an alphanumeric character the tape can hold $6.25 \times 10^3 \times (2.4 \times 10^3 \times 12) = 1.8 \times 10^8$ characters. The average word in the English language has 6 letters and a typical book (500 pages) contains 200 000 words, so that one reel of tape (2400 feet) written at 6250 bpi can hold

$$\frac{1.8 \times 10^8}{6 \times (2 \times 10^5)} = 150\ \text{books!}$$

Storage of data in digital form not only has large capacity but also provides rapid access and ease of information retrieval as compared to written records. First of all the density of information storage is high: given the width of the tape, the density of bits is

$$(6.25 \times 10^3) \times 8/0.5 = 10^5\ \text{bits/square inch}$$

or about $10^4$ characters/square inch. Such density can be obtained on microfilm, but in that case the information cannot be retrieved automatically. Secondly, magnetic tape is an extremely cheap storage device; a reel of 1/2-inch tape costs less than $10. As to the speed with which information can be retrieved, a modern tape drive will operate at 200 inches per second (ips). Television images (so-called video) are stored at even higher density because one can tolerate a larger fraction of errors and because the format is encoded in a highly repetitive fashion.
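The tape estimates above can be collected in a short calculation. The last two quantities (transfer rate and time to read a full reel) are not stated in the text; they follow from the quoted density, length and tape speed under the assumption that the tape runs continuously with no gaps between records.

```python
TAPE_LENGTH_IN = 2400 * 12       # 2400 feet expressed in inches
DENSITY_BPI = 6250               # bits per inch along each track
TAPE_WIDTH_IN = 0.5
DATA_TRACKS = 8                  # the ninth track carries the parity bit
SPEED_IPS = 200                  # inches per second

characters = DENSITY_BPI * TAPE_LENGTH_IN          # one byte across the tape per bit cell
books = characters / (6 * 200_000)                 # 6 letters/word, 200 000 words/book
bits_per_sq_inch = DENSITY_BPI * DATA_TRACKS / TAPE_WIDTH_IN

# Derived under the no-gap assumption stated above:
bytes_per_second = DENSITY_BPI * SPEED_IPS
seconds_per_reel = TAPE_LENGTH_IN / SPEED_IPS

print(characters, books, bits_per_sq_inch)
print(bytes_per_second, seconds_per_reel)
```

In practice inter-record gaps and start/stop times reduce both the usable capacity and the effective transfer rate, so these figures should be read as upper limits.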

Magnetic recording is based on the properties of ferromagnetic materials. Certain materials can be magnetized, namely the atomic magnetic dipole moments can become aligned along a particular direction. This gives rise to a macroscopic magnetization M inside the material. The magnetization is defined as the net magnetic dipole moment per unit volume

$$\mathbf{M} = \frac{1}{V}\sum_i \boldsymbol{\mu}_i$$

and in the MKS system it is measured in amperes/meter. Often the alignment of the magnetic dipoles is due to the presence of an external magnetizing field H; H is also measured in amperes/meter and is generated by currents. The magnetic field B is the vector sum of M and H

$$\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$$

and in the MKS system it is measured in tesla (V-s/m$^2$). While M is confined inside the material, B can exist in free space, and

$$\mu_0 = 4\pi \times 10^{-7}\ \mathrm{V\text{-}s/A\text{-}m}$$

is the magnetic permeability of the vacuum. In practical applications, cgs-emu units are still frequently used, and therefore we give the relevant conversion factors

Magnetic field B       1 gauss     = $10^{-4}$ T
Magnetizing field H    1 oersted   = $10^3/4\pi$ A/m
Magnetization M        1 emu/cm$^3$ = $10^3$ A/m

For most materials the magnetization is linearly related to the external magnetizing field

$$\mathbf{M} = \chi \mathbf{H}$$

where $\chi$ is the magnetic susceptibility, $|\chi| \ll 1$, and $\chi$ is positive for paramagnetic materials, negative for diamagnetic materials. In ferromagnetic materials $\chi \gg 1$, and the magnetization can persist even after the external field H is removed. The relation between M and H, or B and H, is complicated and takes the form shown in Fig. 2.27(a).

The graphs of Fig. 2.27 are called hysteresis loops. When the magnetizing field H is first applied, the material follows the 'virgin curve' (a) until saturation is reached for some value $B_s$. As H is reduced the upper curve (b) is followed, and even for H = 0 the magnetic dipoles remain partially aligned and give rise to a remanent magnetization $B_r$. As H is reversed the magnetic field will reach zero for some value $H_c$, the coercive field strength; when H is further increased in the reverse direction saturation is reached again. If now H is decreased the material follows the lower curve (c) to complete the loop.

The magnetization curve differs for various ferromagnetic materials and by special preparation can take the form of Fig. 2.27(b). In this case the ferromagnetic material has many of the properties of a bistable system, its two states being characterized by $B = +B_r$ or $-B_r$; furthermore, for all but a small range of H the ferrite is in one of those states. The area enclosed by the magnetization curve is a measure of the energy stored in the aligned magnetic dipoles; this is the energy that must be provided in order to flip the state of magnetization. Such materials could be used as memory cells for binary digits if we are able to switch them from one state to the other and can read their state of magnetization. The switching is accomplished by applying an external magnetic field, whereas reading is achieved by sensing the field outside the material with a small coil.

Fig. 2.27. Magnetization curves for ferromagnetic materials: (a) soft iron, (b) special high remanence material.

Magnetic tape is made by coating a thin layer of iron oxide ($\gamma$-Fe$_2$O$_3$) in the form of fine particles onto a plastic backing. The coating is approximately 10 $\mu$m thick and the size of the particles is about 0.5 $\mu$m. The material is ferromagnetic with a coercive field of order 300 oersted and a remanent magnetization of 1500 gauss. The tape is transported over the head, which serves to write onto, and read from, the tape. The head consists of a slotted electromagnet with sufficient leakage field to magnetize the tape. Similarly, changes in the magnetization of the tape change the flux through the gap of the electromagnet and thus induce an emf in the read coil, as indicated in Fig. 2.28(b). There is a variety of orientations that can be chosen to magnetize the tape. For digital tapes, longitudinal recording is used and the magnetization is saturated in one or the other direction. For analogue tapes, as in cassette recorders, the magnetization is in the linear region of the hysteresis curve.

Fig. 2.28. Sketch of magnetic tape head: (a) positioning of the tape, (b) field lines in the gap region and through the core.

Fig. 2.29. Magnetic disk and movable head assembly.
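The tape coating values quoted above are given in cgs-emu units; they can be converted to MKS units with the factors listed earlier in this section. The function names in this sketch are hypothetical.

```python
import math


def oersted_to_ampere_per_meter(h_oersted):
    """Magnetizing field H: 1 oersted = 10^3 / (4 pi) A/m."""
    return h_oersted * 1e3 / (4 * math.pi)


def gauss_to_tesla(b_gauss):
    """Magnetic field B: 1 gauss = 10^-4 T."""
    return b_gauss * 1e-4


def emu_per_cm3_to_ampere_per_meter(m_emu):
    """Magnetization M: 1 emu/cm^3 = 10^3 A/m."""
    return m_emu * 1e3


print(f"coercive field : {oersted_to_ampere_per_meter(300):.3g} A/m")
print(f"remanent field : {gauss_to_tesla(1500):.3g} T")
```

The coercive field of the coating thus corresponds to a few times $10^4$ A/m, and the remanent magnetization to 0.15 T.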

A form of mass storage that provides much faster access than tape is the magnetic disk. Typically an aluminum disk, 14 inches in diameter, is coated with ferromagnetic material. The disk spins at 3000 rpm and the head can be moved radially to access different tracks as shown in Fig. 2.29. The data is written on as many as 1256 circular tracks and is subdivided into sectors, typically 50. For large disks the densities are 512 bytes per sector, leading to approximately 25 Mbytes per disk surface. More often than not, both sides of the disk are used and a pack of 8 disks with 16 read/write heads can be stacked together. Such an assembly can store some 500 Mbytes of information. The access time for a sector (which defines a block) is about 30 ms. Disks are used to transfer large blocks of data or program code in and out of the computer memory, greatly increasing the capability of the computer. Floppy disks, common on smaller computers, use a flexible backing and therefore operate at slower speed (300 rpm); they can now hold a Mbyte of data or even more.

Fig. 2.30. Formats for digital encoding: (a) return to zero (RZ), (b) non-return to zero inverse (NRZI), (c) phase encoding.

There are various methods for encoding the binary information onto the tape or disk. For instance one orientation of magnetization could signify a '1' whereas the other a '0'; this code is called RZ (return to zero) and is shown in the first two rows of Fig. 2.30 for the binary sequence 01101001. The first line represents the magnetization on the tape and the second line shows the signal that is picked up in the read head. The NRZI (non-return to zero inverse) method is indicated in the next two lines; here a transition indicates a 'one' and no transition in a cell is a 'zero'. In the PE (phase encoding) method the cell available for each bit is split into two parts: a 'zero' has a transition down in the middle of the cell whereas a 'one' has a transition up. In PE only the transitions in the middle of the cycle are significant; it is much easier to maintain the timing and synchronism between the tape motion and the read process in PE encoding, which is the most widely used method.
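The NRZI and PE rules just described can be sketched for the sequence 01101001. Magnetization is represented here simply as the levels 0 and 1 (an idealization of the two saturated states), the initial level is assumed to be 0, and the function names are hypothetical.

```python
def nrzi_encode(bits, level=0):
    """One level per cell: a '1' flips the magnetization, a '0' leaves it unchanged."""
    levels = []
    for b in bits:
        if b == "1":
            level ^= 1
        levels.append(level)
    return levels


def nrzi_decode(levels, level=0):
    """A transition with respect to the previous cell is read as '1'."""
    bits = []
    for current in levels:
        bits.append("1" if current != level else "0")
        level = current
    return "".join(bits)


def phase_encode(bits):
    """Two half-cells per bit: '1' goes low-to-high (up), '0' goes high-to-low (down)."""
    halves = []
    for b in bits:
        halves.extend((0, 1) if b == "1" else (1, 0))
    return halves


def phase_decode(halves):
    """Only the transition in the middle of each cell is significant."""
    return "".join("1" if halves[i] < halves[i + 1] else "0"
                   for i in range(0, len(halves), 2))


sequence = "01101001"
print(nrzi_encode(sequence))
print(phase_encode(sequence))
```

In the phase-encoded stream every cell contains a transition, which is what allows the read electronics to recover the timing from the data itself.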

Present magnetic storage devices operate with a high signal to noise ratio (S/N) and are therefore quite reliable. In principle the area reserved for encoding a single bit could be reduced, thus increasing the density of the stored data but also reducing the S/N ratio. It appears however that it will be easier to achieve higher densities by using optical storage, as exemplified by compact disk technology which we discuss next.

2.9 The compact disk

Compact disks are used to store audio information in digital form and have been widely used since they were first introduced in 1982. It is estimated that two million disks are sold annually. In the phonograph and the cassette player audio was recorded in analogue form. Analogue to digital conversion is discussed in Chapter 3, from where it can be appreciated that digital recording offers significant advantages in the fidelity and quality of the reproduced sound.

Compact disks are equivalent to 'read-only' memories, since the information is written onto the disk at the time of manufacture and cannot be altered. On the other hand they have much higher storage density; a 12 cm disk could hold as much as 800 Mbytes of data, which is equivalent to some 2000 floppy disks. Such high density of information storage is achieved because the disk is read out optically, using a beam from a solid state laser. The beam can be focused to a spot of rms radius of 1 $\mu$m, which sets the ultimate resolution limit.

The disk is constructed of plastic and the information is placed on the disk in the form of pits 0.5 $\mu$m wide and 2-3 $\mu$m long. The pits are laid out on a spiral track, consecutive tracks being separated by ~2 $\mu$m as shown in Fig. 2.31(a). After the pits are impressed in the plastic a reflective coating is placed on the disk, as shown in (b) of the figure. The optical read-out beam enters the disk from the opposite side, and if it is directed onto the flat 'land' area between pits it is totally reflected. If the beam strikes a pit area, which now appears as a bump, the optical path of the reflected light is shorter by twice the pit depth; this depth has been chosen so that the light reflected from the bump arrives at the detector with a $\lambda/2$ phase advance and interferes destructively with the light reflected from the land area. Thus as the disk rotates past the optical beam, the pits appear as no reflected light and are characterized as '1', whereas the flat land appears as reflected light, '0', providing the binary information to the digital circuitry.

Fig. 2.31. Encoding of a compact disk: (a) plan view showing the tracks and a sequence of 'pits', (b) elevation profile and details of the illumination of the tracks.


The optical read out technique used here is based on the coherent nature of the optical beam which is obtained from a solid state laser (see Chapter 4). These lasers are made by depositing layers of compounds of GaAs with different doping and are extremely compact. A schematic of the optics is shown in Fig. 2.32. The reading speed is maintained constant at 1.25 m/s, which corresponds to between 4 and 8 revolutions per second. The disks hold 74 minutes of data, namely 5.5 km of track. The information is encoded on the disk by one of the codes discussed in the previous section. In view of the high information density, sound in the full audio range of 20 Hz to 20 kHz can be reproduced with a dynamic range of 90 dB.
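The track length follows directly from the constant reading speed and the playing time, and the quoted rotation rates imply the radii at which the track is being read. The radii computed below are derived from the figures in the text and are not quoted in it; the names are illustrative.

```python
import math

READ_SPEED_M_PER_S = 1.25
PLAY_TIME_S = 74 * 60

track_length_m = READ_SPEED_M_PER_S * PLAY_TIME_S


def radius_for(rev_per_s):
    """Radius at which a track moving at the reading speed makes rev_per_s turns per second."""
    return READ_SPEED_M_PER_S / (2 * math.pi * rev_per_s)


print(f"track length        : {track_length_m / 1000:.2f} km")
print(f"radius at 8 rev/s   : {radius_for(8) * 1000:.1f} mm")
print(f"radius at 4 rev/s   : {radius_for(4) * 1000:.1f} mm")
```

Since the linear speed is held constant, the disk must spin faster when the beam reads near the center and more slowly near the edge.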

Research on erasable optical disks is being actively pursued. This is based on magneto-optic effects in thin films of GdFe and similar materials. Such disks could exceed the storage density of magnetic disks by a factor of 10-100 and would also offer faster access to the data.

Fig. 2.32. Schematic of the optics used for the read out of the optical disk. (From T. Rossing, The compact disc digital audio system, The Physics Teacher (1987) by permission.)

2.10 Computer architecture

The fundamental principle of a computer is to use a finite number of logic circuits and devices and change the interconnections between them so that they can perform any one of a variety of logical or arithmetic operations. The interconnection of the circuits and memory cells is done by the program, which is provided to the computer by its user. A major advance in the development of computers was the suggestion by John von Neumann that binary format be used to specify the instructions that the computer should execute.

To illustrate how a computer works, we will use as an example a 3-bit machine. With three bits only 8 memory locations can be addressed directly, and in our example the instruction set will contain only five operations. Nevertheless the principles and methods of operation are exactly the same as in larger machines which use from 8 up to 60 bits. This example has appeared in an article by W. C. Holton in the September 1977 issue of Scientific American.

A computer operates under clock control which defines the timing for each 'cycle'. Cycles are divided into two parts

FETCH      1st half-cycle
EXECUTE    2nd half-cycle

During the fetch part of the cycle the computer retrieves the instruction, which is then carried out during the execute half-cycle. Even the half-cycles are further subdivided into two clock periods each, during which very specific operations are carried out.

In the fetch half-cycle we have

1st period: The computer finds the memory address from which to fetch the instruction; this is provided by the program counter.

2nd period: The instruction is taken from memory, placed on the bus and routed to its destination.

During the execute half-cycle we have

3rd period: Find and retrieve the data from memory or the appropriate register.

4th period: Perform the designated operation.

It is also possible that in the 3rd period a designated operation is carried out and then the 4th period is devoted to storing the data. Some complex operations may take more than one computer cycle.

The hardwired connections between the various units of our example computer are shown in Fig. 2.33. Note the presence of four registers which are all connected to the (3-line) bus via appropriate buffers. At any one time only the two registers between which data transfer takes place should be connected to the bus; this is accomplished by the buffers, which are activated by the control lines and are also under clock control. The role of each of the registers used in the machine is as follows:

(a) The instruction register contains the instruction that is to be executed. It is connected to a decoder which activates the appropriate control lines, one for each of the five instructions of the set.

(b) The program counter keeps track of the place at which the current operation is within the program. It is incremented every half-cycle unless it is inhibited.

(c) The general register is used to temporarily hold information during arithmetic operations.

(d) The accumulator is the register which holds data that is passed to and from the adder.

Fig. 2.33. Elements and interconnections of a hypothetical 3-bit computer. (Adapted from W. C. Holton, The large scale integration of microelectronic circuits, Scientific American, September 1977.)

In addition to the registers and the memory stack, the computer containsthe arithmetic and logic unit (ALU) which in our example is a simpleadder. We also note in Fig. 2.33 two decoders: the memory decoder isused to enable the particular memory cell that is being addressed. Theinstruction decoder enables the various buffers thus interpreting theinstruction that is being carried out. The ALU is connected to the busand can also be thought of as a register since information must becommunicated to it from the bus. The contents of the memory aretransferred to, or received from, the bus through the memory buffer whichis enabled by the clock and the relevant control lines.

The instruction set, as also indicated in the diagram, includes the following five commands:

000   Halt
001   Load accumulator (with the contents of the next memory location)
010   Load general register (with the contents of the next memory location)
011   Store the contents of the accumulator in memory (at location held in general register)
100   Add (the contents of the general register and accumulator, and retain the result in the accumulator)

The implementation of these instructions by the hardware can be seen in the diagram.

Let us then write a program to add two numbers, contained in memory, and store the result in memory. Such a program is indicated below and will have to be loaded in the corresponding memory locations.

Memory location   Content   Instruction
000               010       Load general register with A
001               011       Data A ($011_2 = 3$)
010               001       Load accumulator with B
011               001       Data B ($001_2 = 1$)
100               100       Add and place in accumulator
101               011       Store in memory location 011
110               000       Halt


We can follow the execution of the program with the help of the 'timing diagram' shown in Table 2.3. Initially the program counter is set to 000 and the computer is allowed to run. During the first cycle the following events occur:

1st period: The program counter is connected to the bus and therefore the 000 memory location is addressed.

2nd period: The contents of memory location 000 are transferred to the instruction register, which now contains 010.

3rd period: The next memory location (001) is enabled.

4th period: The contents of memory location 001 are transferred into the general register.

In the second cycle the same sequence is repeated to load the accumulator with the contents of memory location 011. In the third cycle the contents of the general register and of the accumulator are added and returned to the accumulator. In the fourth cycle the contents of the accumulator are transferred to memory location 011. The fifth cycle is used to interpret the 'halt' instruction and the computing halts. At this point the contents of the memory are the same as before the execution of the program, except for location 011, which now contains 100 instead of 001. Note that in general, during the 1st clock period the contents of the program counter go on to the bus and are used to select the memory address. During the 2nd clock period the contents of that memory location are placed on the bus and must be interpreted as an instruction, i.e. placed into the instruction register.
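The fetch and execute sequence described above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the function name `run`, the constant names, and the choice of filling the unused eighth memory cell (111) with 000 are ours and do not appear in Fig. 2.33; the buffers, clock periods and control lines of the real hardware are not modelled.

```python
# Hypothetical simulator of the five-instruction, 3-bit example computer.
HALT, LOAD_ACC, LOAD_GEN, STORE, ADD = 0b000, 0b001, 0b010, 0b011, 0b100


def run(memory):
    """Execute the program held in `memory` (eight 3-bit words); return the final memory."""
    mem = list(memory)
    pc = acc = gen = 0                     # program counter, accumulator, general register
    while True:
        instruction = mem[pc]              # FETCH: the program counter addresses the memory
        pc = (pc + 1) % 8
        if instruction == HALT:
            return mem
        if instruction == LOAD_ACC:        # operand is the next memory location
            acc = mem[pc]
            pc = (pc + 1) % 8
        elif instruction == LOAD_GEN:      # operand is the next memory location
            gen = mem[pc]
            pc = (pc + 1) % 8
        elif instruction == STORE:         # address is held in the general register
            mem[gen] = acc
        elif instruction == ADD:           # keep only three bits, as on a 3-line bus
            acc = (acc + gen) & 0b111
        else:
            raise ValueError(f"unknown instruction {instruction:03b}")


# The program of the text; location 111 is unused and set to 000 (assumption).
program = [0b010, 0b011, 0b001, 0b001, 0b100, 0b011, 0b000, 0b000]
final_memory = run(program)
print([format(word, "03b") for word in final_memory])
```

Comparing `final_memory` with `program` shows which memory cell the stored program has modified.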

The example shows how the program stored in memory can control the operations that are carried out by the computer. In addition we need input/output (I/O) devices if a human operator is to communicate with the computer. These functions are performed by terminals, which have keyboards and alphanumeric or graphic displays, and by hardcopy devices. A computer can also communicate directly with electromechanical devices from which it receives data and to which it issues control signals. Microprocessors, minicomputers, hand calculators and supercomputers are all based on similar architecture. The principal differences are in the length of the word (i.e. the number of parallel bits used), the versatility of the instruction set, the cycle speed, and the size of the available direct access memory.

Table 2.3. Timing diagram for the example computer

Machine cycle  Clock period  Program counter  Information on the bus              Bus  Instruction register  Memory address
1  FETCH       1             000              Contents of program counter         000                        000
               2             000              Instruction: load register          010  010                   000
   EXECUTE     3             001              Contents of program counter         001  010                   001
               4             001              Data                                011  010                   001
2  FETCH       1             010              Contents of program counter         010  010                   010
               2             010              Instruction: load accumulator       001  001                   010
   EXECUTE     3             011              Contents of program counter         011  001                   011
               4             011              Data                                001  001                   011
3  FETCH       1             100              Contents of program counter         100  001                   100
               2             100              Instruction: add                    100  100                   100
   EXECUTE     3             100              idle                                     100                   100
               4             100              Contents of general register        011  100                   100
4  FETCH       1             101              Contents of program counter         101  100                   101
               2             101              Instruction: store                  011  011                   101
   EXECUTE     3             101              Contents of general register        011  011                   011
               4             101              Contents of accumulator             100  011                   011
5  FETCH       1             110              Contents of program counter         110  011                   110
               2             110              Instruction: halt                   000  000                   110
   EXECUTE     3             110
               4             110

Meaning of instruction:
Cycle 1: Load contents of next memory location into general register
Cycle 2: Load contents of next memory location into accumulator
Cycle 3: Add contents of general register and accumulator and retain result
Cycle 4: Store contents of accumulator in memory at location specified by contents of general register
Cycle 5: Stop operations


Exercises

Exercise 2.1

(a) Draw the diagram of gates necessary to perform the following two operations

X = A·B + C·D, Y = A'B + CD (A, B, C, D are inputs; X, Y are outputs).

(b) Construct the truth table.

(c) For what values of the input is X·Y = 1?

Exercise 2.2

A 12-bit register holds a decimal floating point number represented in binary. The mantissa occupies 8 bits and is assumed to be a normalized integer. Negative numbers in the mantissa and exponent have their left-most bit in the 1-state.

(a) What are the largest and smallest positive quantities that can be represented (excluding zero)?

(b) Give the binary image in that register of the decimal numbers $(12.2)_{10}$ and $(-122)_{10}$.

Exercise 2.3

(a) Design a flip-flop circuit using NAND gates. At what level are the 'set' and 'reset' lines kept normally?

(b) Construct the characteristic table.

(c) How does it differ from a flip-flop with NOR gates?

Exercise 2.4

Construct the equivalent gate diagram for the full adder shown in Fig. 2.8. Construct the truth table and show that it is indeed that of a full adder.

Exercise 2.5

(a) Convert decimal 225.225 to octal and hexadecimal.

(b) Represent your first name, middle initial and last name in binary using ASCII; include blanks between names and a period after the middle initial.

PART B

COMMUNICATIONS

Communication implies the transmission of messages and is the basis of human civilization. Speech, smoke signals, or written notes are all forms of communication. We will be concerned principally with communication over large distances, often referred to as telecommunications. Telecommunications are based on the transmission of electromagnetic (em) waves from a sending to a receiving station. The em wave can propagate either in a guided structure, such as a pair of conductors, a waveguide or an optical fiber, or it can propagate in free space. As technology progressed, higher frequency em waves became available and they offer important advantages as information carriers.

In Chapter 3 we introduce some general principles of information transmission. We examine the analysis of an arbitrary signal into a Fourier series, methods for modulating the carrier, and the sampling theorem for digital encoding of analog signals. The topic of noise in communication channels and of the expected level of random noise is treated next. Finally a brief overview of information theory is given. Information theory assigns a quantitative measure to the information contained in a message and is used to define the capacity of a communication channel.

Chapter 4 is devoted to the problems of the generation, propagation and detection of electromagnetic radiation at different frequencies. The physical laws governing these phenomena are Maxwell's equations and are universally valid. Different frequencies, however, present different problems in their transmission through the atmosphere and in their propagation along guided structures. We thus treat separately the reflection of radio waves from the layers of the ionosphere, the propagation and focusing of microwaves, and the topic of optical fiber communications. The laser and the properties of laser radiation are discussed in the concluding sections.

THE TRANSMISSION OF SIGNALS

3.1 The electromagnetic spectrum and the nature of the signals

Information is transmitted over long distances by electromagnetic waves which are modulated by the information signal. It is interesting that the first telecommunication system, the telegraph, invented by Samuel Morse, used a binary encoding system. In that case the carrier was a direct current and two types of symbols were transmitted: short and long dashes. Higher frequency carriers can be modulated at higher speed and thus can carry more information.

The relation between the frequency $f$ and the wavelength $\lambda$ of a wave is

$$f\lambda = c \quad \text{or} \quad f = c/\lambda \qquad (3.1)$$

where $c$ is the velocity of propagation of the wave. For em waves in free space

$$c = 3 \times 10^{8}\ \text{m/s}$$

In a material of refractive index $n(\omega)$ the velocity of propagation of an em wave of angular frequency $\omega$ becomes

$$v = \frac{c}{n(\omega)} \qquad (3.2)$$

and

$$\lambda = \frac{v}{f} = \frac{c}{n(\omega)\, f} \qquad (3.2')$$

Since for visible light the refractive index for most materials is larger than one, the wavelength inside the material appears shorter than in free space. Frequency is expressed in hertz (Hz), which measures the number of oscillations (or cycles) completed in one second. Angular frequency $\omega$ is defined through

$$\omega = 2\pi f$$

and is expressed in radians/s.

The em spectrum and the designation of the radiation at different frequencies is indicated in Fig. 3.1. Such apparently disparate phenomena as radio waves, visible light or X-rays are all em radiation, subject to the same physical laws, and differ only in frequency. The corresponding wavelength is shown by the lower scale of the figure. Television occupies the VHF and part of the UHF bands. Microwave links are currently used extensively in civilian communications, including satellite traffic. The infrared band is reserved for military communications and optical fibers transmit visible light. The UV and X-ray bands are not used much for communications because of the technical difficulties inherent in the modulation and propagation through the atmosphere of waves within these bands. Other methods of communication, using neutrinos or even seismic waves on the earth's surface, have been proposed for military purposes but have not as yet found any practical application.

The message that we wish to transmit may be stationary, such as a graph or typed page, or time dependent, as for instance a spoken message. Even when the message is stationary it will have to be transmitted as a function of time. Thus we will treat all signals as being functions of time, and we classify them as analog signals if they are continuous functions (Fig. 3.2(a)) or as digital signals if they are discrete functions of time. The digital signals shown in Fig. 3.2(b) have only two levels (they are binary) but in general could be more complex.

In order to transmit the signal we must impose it onto a carrier. This is because the carrier, usually a high frequency em wave, has much better transmission characteristics than the signal, which is at low frequency. Furthermore, a high frequency carrier can support simultaneously many low frequency signals. The signal is imposed onto the carrier by modulating either the amplitude or frequency of the carrier. Since the carrier is a sinusoidal wave of well-defined frequency, it is convenient for calculational purposes to decompose the signal into a sum of sinusoidal (or harmonic) waves of definite frequencies. When the signal is periodic, such a sum is called a Fourier series, whereas an aperiodic signal is represented by a Fourier integral. Fourier decomposition is an indispensable tool in discussing communications because it provides the connection between the time and frequency domains.

Fig. 3.1. The electromagnetic spectrum. (Upper scale: frequency $f$ from 1 Hz to $10^{20}$ Hz; lower scale: wavelength, with marks at 3000 km, 300 m, 3 cm, 3 μm and 3 Å.)

3.2 Fourier decomposition

The Fourier theorem, named after its discoverer, the French mathematician J. Fourier (1768-1830), states that any arbitrary function of time can be represented by a linear superposition of sine and cosine functions with angular frequencies $\omega$ varying from zero to infinity. If the function is periodic in time at frequency $\omega_0$, then the frequencies $\omega$ contributing to the linear superposition are only the harmonics of $\omega_0$, namely

$$\omega = \omega_0,\ 2\omega_0,\ 3\omega_0,\ \ldots,\ n\omega_0,\ \ldots \qquad n \text{ an integer}$$

As a first example, we consider the series of square pulses shown in Fig. 3.3, where it is assumed that the pulse train extends from $t = -\infty$ to $t = +\infty$; that is, the function $V(t)$ is truly periodic in time with a period $T$. Then the fundamental frequency is

$$\omega_0 = 2\pi f_0 = \frac{2\pi}{T} \qquad (3.3)$$

Fig. 3.2. Time-dependent voltage conveys a signal: (a) in analog form, (b) in digital form.


We designate the width of the pulse by $\Delta\tau < T$ and its amplitude by $V_0$. According to the Fourier theorem we can write

$$V(t) = A_0 + \sum_{n=1}^{\infty} A_n \cos(n\omega_0 t) + \sum_{n=1}^{\infty} B_n \sin(n\omega_0 t) \qquad (3.4)$$

The $t = 0$ point on the time axis is, in general, arbitrary and we can assume that it is in the middle of one of the square pulses. Then the function $V(t)$ is symmetric with respect to the variable, namely $V(-t) = V(t)$, and therefore the terms in Eq. (3.4) containing sines must vanish; this is so because $\sin(n\omega_0 t) = -\sin(-n\omega_0 t)$. Thus we set all $B_n = 0$ and the Fourier series reduces to

$$V(t) = A_0 + \sum_{n=1}^{\infty} A_n \cos(n\omega_0 t) \qquad (3.4')$$

The coefficients $A_n$ can be determined if $V(t)$ is known. As shown in Appendix 1 one finds that

$$A_0 = \frac{1}{T}\int_{-T/2}^{T/2} V(t)\, dt \qquad (3.5)$$

$$A_n = \frac{2}{T}\int_{-T/2}^{T/2} V(t) \cos(n\omega_0 t)\, dt \qquad (3.5')$$

If we introduce the function $V(t)$ shown in Fig. 3.3 into Eqs. (3.5) we can find its Fourier amplitudes. The result for the first 12 coefficients $A_n$ (for the particular choice $\Delta\tau/T = 0.2$) is shown in Fig. 3.4. It is important to realize that the representation of the signal by specifying $V(t)$ (as in the graph of Fig. 3.3) or by specifying all its Fourier amplitudes $A_n$ (as in Fig. 3.4) are completely equivalent. We can view a signal either in the time domain or in the frequency domain. Note that only a few amplitudes are dominant; in fact the dominant amplitudes are contained in the frequency interval $\Delta\omega$, which in this case appears to satisfy the condition

$$\Delta\omega \approx 5\omega_0 = \frac{T}{\Delta\tau}\,\omega_0 = \frac{2\pi}{\Delta\tau}$$

where we used $T/\Delta\tau \approx 5$. We conclude that

$$\Delta\omega\, \Delta\tau \sim 2\pi \qquad (3.6)$$

a result which is generally valid, provided $\Delta\omega$ and $\Delta\tau$ are appropriately defined.

Fig. 3.3. Periodic signal consisting of a sequence of square pulses of width $\Delta\tau$ and of period $T$.
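As an illustration, the cosine coefficients of the square pulse train can be estimated numerically from the integrals of Eqs. (3.5) and compared with the closed form $A_0 = V_0\,\Delta\tau/T$, $A_n = (2V_0/n\pi)\sin(n\pi\,\Delta\tau/T)$ that follows from carrying out those integrals over a pulse centred on $t = 0$. The sketch below assumes $V_0 = 1$, $T = 1$ and a pulse width of $0.2\,T$; the function names and the midpoint integration rule are our own choices, not taken from the text.

```python
import math


def pulse(t, period, width, v0=1.0):
    """Square pulse train centred on t = 0 (one pulse per period)."""
    t = (t + period / 2) % period - period / 2      # fold t into [-T/2, T/2)
    return v0 if abs(t) < width / 2 else 0.0


def cosine_coefficient(n, period=1.0, width=0.2, v0=1.0, steps=20000):
    """Midpoint-rule estimate of A_n over one period, following Eqs. (3.5)."""
    w0 = 2 * math.pi / period
    dt = period / steps
    total = 0.0
    for k in range(steps):
        t = -period / 2 + (k + 0.5) * dt
        total += pulse(t, period, width, v0) * math.cos(n * w0 * t) * dt
    return total / period if n == 0 else 2 * total / period


def closed_form(n, ratio=0.2, v0=1.0):
    """Analytic coefficients for a centred pulse of relative width `ratio`."""
    if n == 0:
        return v0 * ratio
    return 2 * v0 / (n * math.pi) * math.sin(n * math.pi * ratio)


coefficients = [cosine_coefficient(n) for n in range(13)]
for n, a_n in enumerate(coefficients):
    print(n, round(a_n, 4), round(closed_form(n), 4))
```

The printed list shows the amplitudes falling to zero at $n = 5$ and again at $n = 10$, which is the origin of the estimate $\Delta\omega \approx 5\omega_0$ used above.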

The result of Eq. (3.6) is of fundamental importance for all wave phenomena. It shows that the bandwidth $\Delta f = \Delta\omega/2\pi$ is the inverse of the pulse duration $\Delta\tau$. If we wish to transmit many pulses per unit time we must make the pulses short; this implies that $\Delta f$ will be large, namely that high frequencies will be involved in the transmission of short pulses. In a different context, in the quantum mechanical description of physical phenomena, Eq. (3.6) leads to the uncertainty relation between complementary variables such as momentum and position, or time and energy.

When the signal is not periodic, there is no fundamental frequency $\omega_0$ to provide the basis for discrete harmonics. Instead we can think of the signal as having a very long period, $T \to \infty$; thus $\omega_0 \to 0$ and the harmonics contain all frequencies: they form a continuous spectrum. In this case the signal is represented by a Fourier integral. This is discussed in Appendix 1, but it is convenient to use exponential notation

$$V(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} A(\omega)\, e^{-i\omega t}\, d\omega \qquad (3.7)$$

where the function $A(\omega)$ is given by the inverse expression

$$A(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} V(t)\, e^{i\omega t}\, dt \qquad (3.8)$$

Note that $A(\omega)$ is in general a complex function and has dimensions [V] × [time].

Fig. 3.4. The first 12 Fourier coefficients $A_n$ for the signal shown in the previous figure when $\Delta\tau/T = 0.2$. (Vertical scale in units of $V_0$, with marks at $0.1\,V_0$ and $0.3\,V_0$; horizontal scale $n$.)


When $V(t)$ is symmetric, $V(t) = V(-t)$, Eqs. (3.7, 8) simplify to

$$V(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} A(\omega) \cos\omega t\, d\omega \qquad (3.7')$$

$$A(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} V(t) \cos\omega t\, dt \qquad (3.8')$$

and if in addition $V(t)$ is real, $A(\omega)$ is real and symmetric in $\omega$. This is indicated for an isolated square pulse in Fig. 3.5, where the representations in the time domain and in the frequency domain are shown. The two representations are completely equivalent. We again note that the product $\Delta\omega\,\Delta\tau \sim 2\pi$, as given in Eq. (3.6). The appearance of negative frequencies in $A(\omega)$ is a mathematical convenience; physically it implies a wave of frequency $\omega$ but with reversed phase.

We have seen that the representations of a communication signal in the time or in the frequency domain are equivalent. Since the frequency of a signal is often the determining factor in its propagation characteristics, it is important to know the structure of the signal in the frequency domain. For instance, a pure sinusoidal signal has only one frequency component, at its own frequency $\omega_0$. A very narrow pulse approaching a $\delta$-function in time (see Appendix 1) contains all frequencies with equal amplitudes. In reality no pulse is perfectly sharp, and we can always assign to it a width $\Delta\tau$; then only frequencies up to $\omega \approx 2\pi/\Delta\tau$ are contained in its Fourier representation.

Fig. 3.5. Square pulse and its Fourier transform: (a) voltage as function of time, (b) Fourier amplitude as a function of frequency.

3.3 Carrier modulation

We now examine how information can be imposed on a carrier wave. For instance, voice communication involves frequencies, $f$, in the range $1 < f < 15$ kHz; such low frequencies cannot be transmitted efficiently over long distances and this is why typical radio carrier waves have frequencies $f_c \sim 600$-$1600$ kHz. We write the carrier wave as

$$y_c(t) = A_c \cos(\omega_c t + \phi_c) \qquad (3.9)$$

and express the analog signal that we wish to transmit by $g(t)$. We can use $g(t)$ to modulate the amplitude of the carrier so that the high frequency wave takes the form

$$y(t) = [A_c + K g(t)] \cos(\omega_c t + \phi_c) \qquad (3.10)$$

as shown schematically in Fig. 3.6.

If the signal $g(t)$ has a pure sinusoidal form at frequency $\omega_m$

$$g(t) = \cos(\omega_m t) \qquad (3.11)$$

then by elementary trigonometry the form of the amplitude modulated carrier can be expressed as a sum of three terms; we have set $\phi_c = 0$ in Eq. (3.10).

$$y(t) = A_c \cos(\omega_c t) + K \cos(\omega_m t) \cos(\omega_c t)$$

$$\phantom{y(t)} = A_c \cos(\omega_c t) + \frac{K}{2} \cos[(\omega_c + \omega_m) t] + \frac{K}{2} \cos[(\omega_c - \omega_m) t] \qquad (3.12)$$

Eq. (3.12) is equivalent to a Fourier expansion and shows that in addition to the carrier frequency $\omega_c$ the signal contains two pure cosine waves at frequencies $(\omega_c + \omega_m)$ and $(\omega_c - \omega_m)$. These frequencies are called sidebands, because in practice $\omega_m \ll \omega_c$.
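The equivalence between the amplitude modulated form and the carrier-plus-two-sidebands form can be checked numerically. The sketch below evaluates both expressions on a grid of times and reports the largest difference; the carrier and modulation frequencies (1 MHz and 1 kHz), the values of $A_c$ and $K$, and the function names are illustrative assumptions only.

```python
import math


def am_direct(t, a_c, k, w_c, w_m):
    """Amplitude modulated carrier, Eq. (3.10) with g(t) = cos(w_m t) and zero phase."""
    return (a_c + k * math.cos(w_m * t)) * math.cos(w_c * t)


def am_sidebands(t, a_c, k, w_c, w_m):
    """Carrier plus upper and lower sidebands, the last line of Eq. (3.12)."""
    return (a_c * math.cos(w_c * t)
            + 0.5 * k * math.cos((w_c + w_m) * t)
            + 0.5 * k * math.cos((w_c - w_m) * t))


A_C, K = 1.0, 0.5                      # 50% modulation (illustrative)
W_C = 2 * math.pi * 1.0e6              # carrier at 1 MHz (illustrative)
W_M = 2 * math.pi * 1.0e3              # modulation at 1 kHz (illustrative)

times = [i * 1.0e-7 for i in range(2000)]
max_difference = max(
    abs(am_direct(t, A_C, K, W_C, W_M) - am_sidebands(t, A_C, K, W_C, W_M))
    for t in times
)
print(max_difference)
```

The difference is at the level of floating point rounding, as expected for an exact trigonometric identity.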

The magnitude of $K$ as compared to $A_c$ determines the depth of modulation. When $K = A_c$, or $K/A_c = 1$, we have 100% modulation; this situation is shown in the time and frequency domains in Fig. 3.7. When $K \gg A_c$ we have 200% modulation and there is phase reversal of the carrier wave when $\cos(\omega_m t) < 0$; in this case the carrier is suppressed and only sidebands appear.

Fig. 3.6. Amplitude modulated carrier signal.

To retrieve the information at the receiving end we must 'demodulate' the carrier. This can be accomplished either by mixing the received signal with a local oscillator at the carrier frequency or by a square law detector. In the latter case the output of the detector is proportional to the square of the incoming amplitude; this contains the modulation signal $|g(t)|^2$, which can be separated (filtered out) from the other high frequencies. In square law detection the phase information of $g(t)$ is lost, but this may not be important in many applications.

While we have discussed only the simplest form for $g(t)$, we know from the Fourier theorem that any arbitrary $g(t)$ can be decomposed into an integral over Fourier amplitudes. If the highest frequency contained in $g(t)$ is $\omega_{\max}$, then the sidebands will be contained in the interval

$$\omega_c - \omega_{\max} < \omega < \omega_c + \omega_{\max}$$

Thus the transmission will take place over a frequency interval $2\omega_{\max}$ centered at $\omega_c$. In practice, the carrier and even one of the sidebands are often suppressed in order to restrict the frequency band used for transmission. This explains why high frequency carriers can support many more information channels than lower frequencies do. For instance, for television transmission, a bandwidth of 6 MHz is required. At a carrier frequency of 200 MHz the channel width is comfortably smaller than the carrier frequency and spans 3% of the carrier. At a UHF frequency of 1 GHz it is possible to accommodate 5 television channels in a band which represents the same fraction of the carrier's frequency.

Fig. 3.7. Carrier signal of angular frequency $\omega_c$ modulated sinusoidally at angular frequency $\omega_m$: (a) signal envelope as function of time, (b) the frequency spectrum contains the carrier frequency and two sidebands.

An alternate form of modulation is to vary the frequency of the carrier while the amplitude remains fixed. The unmodulated carrier is given as in Eq. (3.9)

$$y_c(t) = A_c \cos(\omega_c t + \phi_c) \qquad (3.9)$$

When the frequency is modulated its value changes in time and

$$\omega(t) = \omega_c + g(t) \qquad (3.13)$$

The transmitted wave has the instantaneous frequency $\omega(t)$ and the instantaneous phase angle $\phi(t)$, where $d\phi/dt = \omega(t)$. Thus

$$\phi(t) = \int_0^t \omega(t')\, dt' + \phi_c = \omega_c t + \int_0^t g(t')\, dt' + \phi_c \qquad (3.14)$$

and therefore

$$y(t) = A_c \cos[\phi(t)] = A_c \cos\left[\omega_c t + \int_0^t g(t')\, dt'\right] \qquad (3.15)$$

where we have set $\phi_c = 0$. Eq. (3.15) shows that frequency modulation is equivalent to modulation of the initial phase of the carrier. Frequency modulation has several advantages over amplitude modulation, one of which is that $A_c$ remains constant. It does, however, require more complex equipment to modulate and demodulate the carrier.

Let us assume, as before, that $g(t)$ is a pure cosine function, $g(t) = K_f \cos(\omega_m t)$. Inserting this form into Eq. (3.15), the transmitted wave has the form

$$y(t) = A_c \cos\left[\omega_c t + \frac{K_f}{\omega_m} \sin(\omega_m t)\right] \qquad (3.16)$$

The ratio $K_f/\omega_m$ is dimensionless and the instantaneous angular frequency varies from $\omega_c - K_f$ to $\omega_c + K_f$. The modulating signal $g(t)$ and the transmitted signal are shown in the time domain in Fig. 3.8.

To discuss a frequency modulated (FM) signal in the frequency domain we define the frequency deviation, $f_d$, through

$$f_d = \frac{K_f}{2\pi} \qquad (3.17)$$

and the modulation index, $m_p$, which is the ratio of the frequency deviation to the modulation frequency

$$m_p = \frac{f_d}{f_m} = \frac{K_f}{\omega_m} \qquad (3.18)$$

In terms of the modulation index, Eq. (3.16) can be written as

$$y(t) = A_c \cos[\omega_c t + m_p \sin(\omega_m t)] \qquad (3.16')$$

By a theorem on special functions, the above expression can be expanded in a Fourier series

$$y(t) = A_c \sum_{n=-\infty}^{+\infty} J_n(m_p) \cos[(\omega_c + n\omega_m) t] \qquad (3.19)$$

Thus an infinite number of sidebands appear, symmetrically located above and below the carrier frequency $\omega_c$

$$\omega_n = \omega_c \pm n\omega_m \qquad n = 0, 1, 2, \ldots$$

The amplitude of the sidebands depends on the value of the Bessel functions $J_n$ evaluated at $m_p$. When $m_p < 1$ the amplitudes fall off quickly, an example being shown in Fig. 3.9 where only the $n = 0$, 1 and 2 sidebands are significant. For reference, the Bessel functions for the first few values of $n$ are shown in Fig. 3.10 as a function of their dimensionless argument, $x$.

Fig. 3.8. Frequency modulation: (a) modulating signal as function of time, (b) the modulated carrier as function of time.

Fig. 3.9. Frequency spectrum of a sinusoidally frequency modulated signal; the relative amplitude of the sidebands depends on the modulation index $m_p$.

The presence of several sidebands is an advantage in FM systems because they can be used to improve the quality of reception and in particular eliminate atmospheric noise. However, FM requires a much wider passband than AM, and this is why FM is found only at the higher frequencies. As an example we consider a VHF radio channel that operates at a carrier frequency $f_c = 100$ MHz; it transmits audio so that $f_m = 2$-$15$ kHz, and uses a frequency deviation $f_d = 75$ kHz. Thus the modulation index is $m_p = 75/15 = 5$. For such a modulation index, approximately 16 sidebands are significant; thus the required bandwidth is ±240 kHz, as compared to ±15 kHz if it was an AM transmission.
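To get a feeling for how the sideband amplitudes behave at a modulation index of 5, the Bessel functions can be evaluated from their standard integral representation, $J_n(x) = \frac{1}{\pi}\int_0^{\pi} \cos(n\tau - x\sin\tau)\, d\tau$. The sketch below is an illustration only: the function name `bessel_j`, the midpoint integration rule and the cut-off at $n = 12$ are our assumptions. It also uses the known identity $J_0^2 + 2\sum_{n\ge 1} J_n^2 = 1$ to track what fraction of the total power lies within the first $N$ sideband pairs.

```python
import math


def bessel_j(n, x, steps=2000):
    """Bessel function J_n(x) from its integral representation (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


M_P = 5.0                                   # modulation index of the VHF example
amplitudes = {n: bessel_j(n, M_P) for n in range(13)}

# Fraction of the transmitted power contained in the carrier and the
# first n_max sideband pairs.
power_fraction = {}
running = amplitudes[0] ** 2
power_fraction[0] = running
for n in range(1, 13):
    running += 2 * amplitudes[n] ** 2
    power_fraction[n] = running

for n in range(13):
    print(n, round(amplitudes[n], 4), round(power_fraction[n], 5))
```

The table printed by the loop lets the reader judge at which order the sidebands become negligible for a given criterion of significance.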

In closing we mention one method of frequency modulating the carrier. For instance, in the circuit of Fig. 3.11, if the capacitance is varied according to the signal $g(t)$ the resonant frequency of the circuit will change accordingly. There exist devices whose capacity is changed by the application of an electric signal; they are called varactors.

Fig. 3.10. Bessel functions $J_n(x)$ for $n = 0, 1, 2, \ldots$ plotted as a function of $x$.

Fig. 3.11. Frequency modulating network where the capacity of the 'varactor' element depends on an externally applied voltage, i.e. $C = C_0 \cos\omega t$.

3.4 Digital communications

In the previous section we discussed the transmission of analog signals using the modulation of a carrier wave. It is possible to convey the same signal by representing it in digital form and transmitting the digital information over the communication channel. In fact most present day communication signals are transmitted in digital form because it is easier to recover from errors and because high frequency technology has made possible digital encoding at extremely high rates.

Consider a continuous signal $v(t)$, of duration $\Delta t = T$, as shown in Fig. 3.12(a); we assume that the structure (i.e. complexity) of the signal is such that its frequency spectrum extends up to frequencies $f_{\max} = W$. This signal is sampled with a series of square pulses represented by the waveform $s(t)$ shown in Fig. 3.12(b); the sampling frequency is $f_s$. The sampled signal will then be given by the product of $v(t)$ and the sampling signal $s(t)$

$$V(t) = v(t)\, s(t)$$

as shown in Fig. 3.12(c). The question is whether $V(t)$ is a faithful representation of $v(t)$, or more precisely what is the required sampling frequency $f_s$ such that $v(t)$ can be completely recovered from a knowledge of $V(t)$. The answer is that when

$$f_s = 2W \qquad (3.20)$$

the representation is faithful. This very important result is due to Nyquist and is known as the sampling theorem; we will not prove the theorem here but we can give a non-rigorous argument to indicate its plausibility. Since the function $v(t)$ has a duration of $T$ seconds we can think of it as a segment of a periodic function with period $T$; we ignore the region $t < -T/2$, $t > T/2$. Then there is a fundamental angular frequency $\omega_0$ that can be associated with $v(t)$, where

$$\omega_0 = 2\pi/T$$

Since $v(t)$ is treated as periodic it can be expanded into a Fourier series, and by definition the highest significant amplitude is at angular frequency $\omega_{\max} = 2\pi W$. The number of the highest harmonic is $n = \omega_{\max}/\omega_0$, and the total number of amplitudes required to represent $v(t)$ is

$$(2n + 1) = 2\left(\frac{\omega_{\max}}{\omega_0}\right) + 1 = 2\left(\frac{2\pi W}{2\pi/T}\right) + 1 = 2WT + 1 \qquad (3.21)$$

(There are $n$ coefficients $A_n$ for the cosine terms, $n$ coefficients $B_n$ for the sine terms and the constant $A_0$; see Eq. (3.4).) Eq. (3.21) shows that to obtain a faithful representation of $v(t)$ we must make $(2WT + 1)$ measurements in the time interval $T$. Thus the sampling rate must be

$$\frac{1}{T}(2WT + 1) = 2W + \frac{1}{T}$$

which for $T \gg 1/W$ reduces to $2W$, as given in Eq. (3.20).

Fig. 3.12. Analog to digital conversion: (a) analog signal as a function of time, (b) sampling signal, (c) digitized output signal; this can be realized by multiplying the waveforms in (a) and (b).

The previous analysis assumes that every sampled point is measured with infinite precision. In practice, using a 6-bit word to encode an amplitude gives a dynamic range of 64 (an amplitude ratio of about 36 dB), which is adequate for most applications; even a 3-bit word offers a dynamic range of 8. Conversion of an analog level to a digitized signal is accomplished electronically with a so-called A to D circuit. The simplest example of A-D is a digital voltmeter. Modern A-D chips can complete a cycle in 1 μs and thus could be used to sample at a rate of $f_s = 1$ MHz. For comparison, compact disk recorders sample at a rate of 44 kHz. This is twice the frequency range of audio, as expected from the sampling theorem (Eq. (3.20)). Telephone communications are sampled at even slower rates.

Fig. 3.13. The effect of noise and distortion in the transmission of a digital signal; the original waveform can, however, be recovered. (Traces shown: transmitted, received and regenerated waveforms, with the transmission delay indicated.)
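The two numbers that govern digitization, the count of samples and the number of amplitude levels, can be illustrated with a short sketch. The helper names `samples_required`, `dynamic_range` and `quantize`, and the convention that the analog level lies between 0 and a full-scale value of 1, are assumptions made for this example and are not part of any particular A-D circuit.

```python
def samples_required(bandwidth_hz, duration_s):
    """Number of amplitudes 2WT + 1 needed for a signal of bandwidth W and duration T."""
    return round(2 * bandwidth_hz * duration_s) + 1


def dynamic_range(bits):
    """Number of distinct levels offered by a word of the given length."""
    return 2 ** bits


def quantize(level, bits, full_scale=1.0):
    """Integer code for an analog level in [0, full_scale], clipped to the valid range."""
    levels = dynamic_range(bits)
    code = int(level / full_scale * levels)
    return min(max(code, 0), levels - 1)


# One second of audio with a 22 kHz bandwidth (illustrative figures).
print(samples_required(22_000, 1.0))
print(dynamic_range(6), dynamic_range(3))
print(format(quantize(0.5, 3), "03b"))
```

For long signals the extra sample in $2WT + 1$ is negligible, so the sampling rate is effectively $2W$.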

The digital signal is transmitted in binary form and this is advantageous in terms of power requirements and error recovery. This latter point is illustrated in Fig. 3.13. Even when the signal is considerably distorted during transmission it is possible to reconstruct the information that was initially sent, by using a simple threshold circuit.
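The action of a threshold circuit can be sketched as follows. We assume, purely for illustration, that the two binary levels are 0 and 1, that the threshold sits halfway between them, and that the channel adds bounded noise of at most ±0.4; the function name `regenerate` is hypothetical.

```python
import random


def regenerate(received, threshold=0.5):
    """Decide each received level as 1 if it exceeds the threshold, else 0."""
    return [1 if level > threshold else 0 for level in received]


rng = random.Random(0)                        # fixed seed so the run is repeatable
bits = [rng.randint(0, 1) for _ in range(1000)]

# Assumed channel: each level is shifted by noise bounded by +/- 0.4.
received = [b + rng.uniform(-0.4, 0.4) for b in bits]

recovered = regenerate(received)
errors = sum(1 for sent, got in zip(bits, recovered) if sent != got)
print(errors)
```

As long as the distortion stays below half the separation between the two levels, every bit is recovered; larger noise would begin to produce decision errors.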

3.5 Noise in communication channels

So far we have considered idealized communication channels; in reality any channel contains some amount of noise. For instance, if we try to talk at a crowded gathering we must raise our level of speech in order to be heard. This analogy can be carried even further, because we can often carry on a meaningful conversation even if we do not hear exactly every word but can still successfully guess at it from part of its structure. Thus, information can be transmitted even in the presence of noise. For em signals, noise can enter the system due to distortion in transmission, spurious noise at the receiver, or even outright interference from unrelated sources.

One form of noise that is always present in any physical system, and therefore also in communication channels, is that due to the thermal motion of the molecules; the Brownian motion of suspended particles is due to the same cause. If we examine a constant voltage across a resistor, the deflection angle of a galvanometer, or many other physical quantities that are in equilibrium, we will observe that they fluctuate about their equilibrium position as shown in Fig. 3.14(a). If we look in more detail by expanding the scale of $X(t)$, the fluctuations will be magnified as in Fig. 3.14(b). We assume that the average value of $X(t)$ is well defined and designate it by $\langle X(t) \rangle$. Thus we can introduce the new variable

$$y(t) = X(t) - \langle X(t) \rangle \qquad (3.22)$$

which has by definition zero mean, $\langle y(t) \rangle = 0$. Clearly the mean value of the signal does not carry information about the noise level. Instead we should introduce a measure which is positive definite and evaluate its mean value. The mean square value of $y(t)$ is defined through

$$\langle y^2(t) \rangle = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} y^2(t)\, dt \qquad (3.23)$$

Note that $\langle y^2(t) \rangle$ is always positive and it is independent of time. Its magnitude is a measure of the fluctuations (or noise) on the signal $X(t)$. We note that $\langle y^2(t) \rangle$ is related to the signal $X(t)$ through

$$\langle y^2(t) \rangle = \langle [X - \langle X \rangle]^2 \rangle = \langle X^2 \rangle - 2\langle X \rangle \langle X \rangle + \langle X \rangle^2 = \langle X^2 \rangle - \langle X \rangle^2 \qquad (3.23')$$

When we measure the variable $X$ at any particular time $t$, the result $X(t)$ will differ from the mean value $\langle X \rangle$ by the amount $y(t)$. Thus knowledge of $y(t)$ would enable us to determine the desired mean value $\langle X \rangle$ from a single measurement. However, $y(t)$ represents random fluctuations and thus we cannot know a priori the function $y(t)$; but we can determine the probability distribution of $y(t)$. That is, there exists a function $f(y)$ such that $f(y_0)\, dy$ gives the probability that at any time $t$ the measured value of $y$ will be between $y_0$ and $y_0 + dy$. Thus we can write

$$P[y_0 < y < y_0 + dy] = f(y_0)\, dy \qquad (3.24)$$

The function $f(y)$ is called a probability density function. To determine $f(y_0)$ experimentally we can divide the record of Fig. 3.14 into $N$ regions and count the number of times, $n_y$, that $y_0 < y < y_0 + dy$. Then $f(y_0)\, dy = n_y/N$ in the limit $N \to \infty$ and $dy \to 0$. Fluctuations are a stochastic, or random, phenomenon and are described by the function $f(y)$.
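The definitions above can be tried out on a simulated record. In the sketch below the record $X(t)$ is replaced by 50 000 independent samples drawn around a mean of 5 with a spread of 0.1; these numbers, the use of Gaussian random samples, and the helper names are all assumptions chosen for illustration. The code forms $y = X - \langle X \rangle$, checks the relation $\langle y^2 \rangle = \langle X^2 \rangle - \langle X \rangle^2$, and estimates $f(y_0)\,dy$ by counting how often $y$ falls in a small interval.

```python
import math
import random

rng = random.Random(42)                       # fixed seed so the run is repeatable
MEAN_X, SIGMA = 5.0, 0.1                      # assumed mean and spread of the record
samples = [rng.gauss(MEAN_X, SIGMA) for _ in range(50_000)]


def mean(values):
    """Arithmetic mean, using an accurate floating point sum."""
    return math.fsum(values) / len(values)


mean_x = mean(samples)
y = [x - mean_x for x in samples]             # fluctuation about the mean, Eq. (3.22)
mean_y = mean(y)
mean_square_y = mean([v * v for v in y])
mean_x_squared = mean([x * x for x in samples])
identity_gap = abs(mean_square_y - (mean_x_squared - mean_x ** 2))


def probability_in_bin(values, y0, dy):
    """Fraction n_y / N of the values that fall between y0 and y0 + dy."""
    hits = sum(1 for v in values if y0 < v < y0 + dy)
    return hits / len(values)


p_bin = probability_in_bin(y, 0.0, 0.02)
print(mean_y, mean_square_y, identity_gap, p_bin)
```

With a finite record the counted fraction only approximates $f(y_0)\,dy$; the estimate improves as the number of samples grows and the interval shrinks.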

It is a remarkable fact of nature that the fluctuations in many phenomena, especially when several random causes contribute to the noise, follow a Gaussian distribution. Furthermore the Gaussian distribution is characterized by a single parameter, its standard deviation $\sigma$. The normalized Gaussian distribution function is

$$F(y) = \frac{1}{(2\pi)^{1/2}\sigma}\, e^{-y^2/2\sigma^2} \qquad (3.25)$$

The square of the standard deviation, $\sigma^2$, is equal to the mean square value of the fluctuations

$$\sigma^2 = \langle y^2 \rangle \qquad (3.26)$$

Note that $F(y)$ has dimensions of $1/y$, as required for a probability density. In terms of the variable $X$, the distribution function is

$$F(X) = \frac{1}{(2\pi)^{1/2}\sigma}\, e^{-(X - \langle X \rangle)^2/2\sigma^2} \qquad (3.25')$$

Fig. 3.14. Random fluctuations about: (a) the mean value $\langle X \rangle$, (b) about zero, are shown as a function of time.

The normalized Gaussian distribution is plotted in Fig. 3.15 in terms of its dimensionless normalized argument $y/\sigma$. Given a normalized distribution function $f(y)$, the mean value of a function $\phi(y)$ is in general defined through

$$\langle \phi(y) \rangle = \int_{-\infty}^{+\infty} f(y)\, \phi(y)\, dy$$

The first few 'moments' of the Gaussian distribution are as follows

$$\int_{-\infty}^{+\infty} F(y)\, dy = \frac{1}{(2\pi)^{1/2}\sigma} \int_{-\infty}^{+\infty} e^{-y^2/2\sigma^2}\, dy = 1 \qquad (3.27)$$

$$\langle y \rangle = \int_{-\infty}^{+\infty} y\, F(y)\, dy = 0 \qquad (3.27')$$

$$\langle y^2 \rangle = \int_{-\infty}^{+\infty} y^2 F(y)\, dy = \sigma^2 \qquad (3.27'')$$

Eq. (3.27') follows immediately from symmetry, whereas Eq. (3.27'') can be obtained by integration by parts followed by use of the normalization condition of Eq. (3.27).

Fig. 3.15. Normalized Gaussian distribution function plotted in units ofits standard deviation a.

- 3 - 2 - 1 0

Noise in communication channels 99

Knowing the distribution function makes it possible to calculate any pertinent probability. For instance, the probability that the fluctuations will reach a value $y > y_0$ is given by the integral

$$P(y > y_0) = \int_{y_0}^{\infty} F(y')\,dy'$$

It is convenient to tabulate the values of this integral as a function of the normalized deviation $y_0/\sigma$; we give below the probabilities that $|y|/\sigma$ is larger than 0.5, 1, 2 and 3.

P(|y|/σ > 0.5) = 0.6175    P(|y|/σ > 1) = 0.3174
P(|y|/σ > 2) = 0.0456      P(|y|/σ > 3) = 0.0026

The probability that $y$ exceeds $y_0$ ($y_0 > 0$) is half that of $|y|$ exceeding $y_0$; for instance the shaded area in Fig. 3.15 represents $P(y/\sigma > 2)$ and must equal 0.0228. The probability that $y/\sigma$ exceeds some very large number $m$ is finite but decreases very rapidly (exponentially), becoming zero for all practical purposes. The Gaussian function has universal application to many statistical phenomena, in particular as they manifest themselves in most sciences.
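The tail probabilities quoted above can be checked with the complementary error function, since for a zero-mean Gaussian $P(|y|/\sigma > m) = \mathrm{erfc}(m/\sqrt{2})$. The following is a minimal sketch; the function names are illustrative and not part of the text, and the computed values agree with the tabulated ones to about three decimal places.

```python
import math

def prob_abs_exceeds(m):
    """Two-sided tail P(|y|/sigma > m) for a zero-mean Gaussian."""
    return math.erfc(m / math.sqrt(2.0))

def prob_exceeds(m):
    """One-sided tail P(y/sigma > m) for m >= 0: half the two-sided value."""
    return 0.5 * prob_abs_exceeds(m)

for m in (0.5, 1, 2, 3):
    print(f"P(|y|/sigma > {m}) = {prob_abs_exceeds(m):.4f}")
print(f"P(y/sigma > 2) = {prob_exceeds(2):.4f}")
```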

If a function $z$ is the sum of two functions, $x_1$ and $x_2$, which are Gaussian distributed, then $z$ is also Gaussian distributed. Let

$$z = x_1 + x_2$$

and $\sigma_1$, $\sigma_2$ be the standard deviations of $x_1$ and $x_2$. Then, if $x_1$ and $x_2$ are uncorrelated, the standard deviation $\sigma_z$ of the Gaussian distribution of $z$ is

$$\sigma_z = (\sigma_1^2 + \sigma_2^2)^{1/2} \tag{3.28}$$

We say that the distributions are combined in quadrature. Note that the standard deviation of $z$ is always larger than that of either $x_1$ or $x_2$.
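Combination in quadrature can be illustrated numerically. The sketch below draws two independent Gaussian variables with assumed standard deviations of 3 and 4 (arbitrary illustrative values) and compares the sample standard deviation of their sum with $(\sigma_1^2 + \sigma_2^2)^{1/2} = 5$. The seed and sample size are likewise arbitrary choices.

```python
import math
import random

def sample_std(values):
    """Standard deviation of a list of samples about their own mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

rng = random.Random(0)                   # fixed seed for reproducibility
sigma1, sigma2, n = 3.0, 4.0, 100_000    # illustrative values

# z = x1 + x2 with x1, x2 independent (hence uncorrelated) Gaussians
z = [rng.gauss(0.0, sigma1) + rng.gauss(0.0, sigma2) for _ in range(n)]

sigma_z = sample_std(z)
expected = math.hypot(sigma1, sigma2)    # (sigma1^2 + sigma2^2)^(1/2)
print(f"measured {sigma_z:.3f}, expected {expected:.3f}")
```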

Now we would like to examine the effects of the noise on a signal as they appear in the frequency domain. In general, in communications systems we are primarily interested in the transmitted power rather than in the amplitude of the signal. For instance, if an antenna receives a voltage $V$, the received power is

$$P = \frac{V^2}{R} = i^2 R \tag{3.29}$$

where $R$ is the characteristic impedance (resistance) of the antenna, and $i$ is the current flowing through it. Thus we are interested in a quadratic measure of the signal and of the noise. Such a measure can be expressed in a convenient form which has been adopted in the description of all fluctuation phenomena.

If $y(t)$ is the fluctuation signal we can evaluate its Fourier transform (see Eq. (3.8)) over the interval $T$

$$g_T(\omega) = \frac{1}{(2\pi)^{1/2}} \int_{-T/2}^{T/2} y(t)\, e^{-i\omega t}\,dt \tag{3.30}$$

where we use the subscript $T$ to indicate the interval of integration. We now define the power spectral density or 'power spectrum' of the fluctuations (noise) by the relation*

$$G(\omega) = \lim_{T\to\infty} \frac{2\pi}{T}\,\langle |g_T(\omega)|^2\rangle \tag{3.31}$$

The 'power' due to the fluctuations in a frequency interval $df$ is given by

$$G(\omega)\,df = \frac{1}{2\pi}\, G(\omega)\,d\omega \tag{3.32}$$

We have used quotation marks in referring to power because the term is not exact. For instance if $y(t)$ represents a voltage then

$$P = \frac{V^2}{R} \quad\text{and}\quad \langle P\rangle = \frac{\langle y(t)^2\rangle}{R}$$

Thus in the frequency domain the noise power per unit frequency, $dP_N/df$, is obtained from

$$\frac{dP_N}{df} = \frac{G(\omega)}{R}$$

If $y(t)$ represents a current, then we multiply by $R$, etc. We will follow the adopted convention and treat $G(\omega)$ as 'power' per unit frequency. Note that $g_T(\omega)$ has dimensions of ($y(t)$ × time); thus $G(\omega)$ has dimensions of ($y(t)^2$ × time) or ($y(t)^2$/frequency).

Various spectra of $G(\omega)$ are shown in Fig. 3.16. In (a) we show a constant $G(\omega)$ which is referred to as random or white noise; however at some frequency $\omega_{\max}$ the spectrum must fall off because otherwise the total noise power would become infinite. In (b) is given the noise spectrum of a system which has strong resonant behavior. Yet another spectrum, that of the $1/f$ noise, is indicated in (c) of the figure.

Fig. 3.16. Examples of noise power spectra: (a) white noise, (b) resonant system, (c) $1/f$ noise.

* The brackets indicate the ensemble average; note also that $G(\omega)$ is defined over positive and negative frequencies.

The fluctuation spectra in the frequency and time domain are connected through the Wiener-Khintchine relationship. We first define the autocorrelation function $R(\tau)$ through

$$R(\tau) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} y(t)\,y(t-\tau)\,dt \tag{3.33}$$

The power spectrum can then be expressed as the Fourier transform of the autocorrelation function and vice versa†

$$G(\omega) = \int_{-\infty}^{\infty} R(\tau)\, e^{i\omega\tau}\,d\tau \tag{3.34}$$

$$R(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} G(\omega)\, e^{-i\omega\tau}\,d\omega \tag{3.35}$$

A discussion and proofs of these relations are given in Appendix 2. Here we give only an important result that can be derived from Eqs. (3.33, 3.35), as follows. We set $\tau = 0$ in Eq. (3.33) to obtain

$$R(0) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} y(t)^2\,dt = \langle y(t)^2\rangle \tag{3.36}$$

If we also set $\tau = 0$ in Eq. (3.35) we find

$$R(0) = \int_{-\infty}^{\infty} G(\omega)\,\frac{d\omega}{2\pi} = \langle y(t)^2\rangle \tag{3.37}$$

Namely, the total 'power' in the fluctuations can be obtained either by integrating the power spectral density over all frequencies, or by evaluating the mean square value of the fluctuations in the time domain. The two calculations yield the same result.

3.6 Sources of noise

White noise. In this case the noise is completely random in time. Thus the autocorrelation function must be zero unless $\tau = 0$. This is so because the noise never reproduces itself and the overlap of two different time intervals will always average to zero. When $\tau = 0$ we measure the mean square value of the noise; more precisely we have $R(\tau) \approx A$ over a small interval $-\tau_c < \tau < \tau_c$. $\tau_c$ measures the time scale over which the noise changes and is called the correlation time. Furthermore $2\tau_c = \pi/\omega_{\max}$, where $\omega_{\max}$ is the highest frequency component contained in the noise. We write

$$R(\tau) = 2A\tau_c\,\delta(\tau) \tag{3.38}$$

Using this expression in Eq. (3.34) we immediately find

$$G(\omega) = 2A\tau_c = A\,\frac{\pi}{\omega_{\max}} \tag{3.39}$$

Namely, the power spectrum is flat.

To obtain a convergent result we must assume that the spectrum rolls off at $\omega_{\max}$, as shown in Fig. 3.16(a). Then the integral $\int_{-\infty}^{\infty} G(\omega)\,df = A$, as expected. A more realistic form for the autocorrelation of white noise is

$$R(\tau) = A\, e^{-|\tau|/\tau_c} \tag{3.40}$$

and the resulting power spectrum

$$G(\omega) = A\,\frac{2\tau_c}{1 + (\omega\tau_c)^2} \tag{3.41}$$

When $\omega = 1/\tau_c$ the spectrum has fallen to half its flat value, and we need not introduce additional convergence conditions. The evaluation of the transform leading to Eq. (3.41) is elementary and is left to the reader.

† The normalization differs from that of Eqs. (3.7, 8) in order to conform with the usual definition of $R(\tau)$.
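As a numerical check of Eq. (3.41) against Eq. (3.37), the sketch below integrates the spectrum $G(\omega) = 2A\tau_c/(1 + \omega^2\tau_c^2)$ over a wide but finite frequency range and compares the result with $R(0) = A$. The values of $A$ and $\tau_c$, the integration limit, and the midpoint rule are illustrative choices, so the integral is expected to fall slightly short of $A$.

```python
import math

def lorentzian_spectrum(omega, A, tau_c):
    """Power spectrum of Eq. (3.41) for autocorrelation A*exp(-|tau|/tau_c)."""
    return A * 2.0 * tau_c / (1.0 + (omega * tau_c) ** 2)

def total_power(A, tau_c, omega_limit, steps=200_000):
    """Midpoint-rule estimate of the integral of G(omega) d(omega)/(2 pi)
    over the finite range [-omega_limit, +omega_limit]."""
    d_omega = 2.0 * omega_limit / steps
    total = 0.0
    for k in range(steps):
        omega = -omega_limit + (k + 0.5) * d_omega
        total += lorentzian_spectrum(omega, A, tau_c)
    return total * d_omega / (2.0 * math.pi)

A, tau_c = 2.0, 1e-3                      # illustrative values
power = total_power(A, tau_c, omega_limit=1000.0 / tau_c)
half = lorentzian_spectrum(1.0 / tau_c, A, tau_c) / lorentzian_spectrum(0.0, A, tau_c)
print(f"integrated power = {power:.4f} (R(0) = A = {A})")
print(f"G(1/tau_c)/G(0) = {half:.2f}")
```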

Shot noise. In a diode, or in any electronic device, the current fluctuates due to the statistical distribution in the flow of the electrons. This gives rise to shot noise and we will analyse it in terms of the diode shown in Fig. 3.17(a). With sufficient time resolution we are able to detect the passage of individual electrons through the meter and we can plot their time of arrival as shown in Fig. 3.17(b). Since the electrons arrive at random times $t_n$ the instantaneous current can be written as

$$i(t) = q \sum_n \delta(t - t_n) \tag{3.42}$$

Fig. 3.17. Shot noise in a diode: (a) the current flowing through the detector fluctuates, (b) plot of the time of arrival of individual electrons.

Here
$q$ is the electron charge
$t_n$ is the time of arrival of the $n$th electron
$R$ is the number of electrons arriving per unit time
$i_0 = Rq$ is the average current

To obtain the power spectrum we evaluate the Fourier transform of $i(t)$

$$g_T(\omega) = \int_{-T/2}^{T/2} e^{-i\omega t}\, i(t)\,dt = q \sum_n e^{-i\omega t_n}$$

Here $n$ runs from 1 to $n_{\max} = RT$, the number of electrons that have arrived in the (long) time interval $T$. When we form $|g_T(\omega)|^2$ the cross-terms average out to zero so that

$$\langle |g_T(\omega)|^2\rangle = q^2 n_{\max} = q^2 R T$$

and the power spectrum is given by

$$G(\omega) = \lim_{T\to\infty} \frac{\langle |g_T(\omega)|^2\rangle}{T} = q^2 R = q i_0 \tag{3.43}$$

This spectrum is flat in frequency as expected for a random noise source; it has a physical cut-off at some high frequency. Since $2G(\omega)\Delta f$ gives the power in the frequency interval $\Delta f$ (the factor of 2 is introduced because we want to deal only with positive frequencies) the current fluctuations in a frequency interval $\Delta f$ are

$$\langle i^2\rangle = 2 q i_0 \Delta f \tag{3.44}$$

This result was first derived by W. Schottky in 1918.

As an application of Eq. (3.44) consider the flow of direct current at an average value of 0.1 μA. If this current is viewed on an oscilloscope with a 1 MHz bandwidth ($\Delta f = 10^6$ Hz) the rms fluctuations will be of order

$$\sqrt{\langle i^2\rangle} = [2 \times 1.6 \times 10^{-19} \times 10^{-7} \times 10^{6}]^{1/2} = 1.8 \times 10^{-10}\ \text{A}$$

Namely, the fluctuations are about 0.2% of the average current.
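The arithmetic of this shot-noise example can be reproduced in a few lines. This is a minimal sketch of Eq. (3.44); the function name is illustrative and the electron charge is rounded to $1.6 \times 10^{-19}$ C as in the text.

```python
import math

ELECTRON_CHARGE = 1.6e-19  # coulomb, rounded as in the text

def shot_noise_rms(i0, bandwidth_hz):
    """rms current fluctuation from Eq. (3.44): sqrt(2 q i0 df)."""
    return math.sqrt(2.0 * ELECTRON_CHARGE * i0 * bandwidth_hz)

i0 = 0.1e-6          # average current, 0.1 microampere
bandwidth = 1e6      # oscilloscope bandwidth, 1 MHz
i_rms = shot_noise_rms(i0, bandwidth)
print(f"rms shot noise = {i_rms:.2e} A, relative to i0: {i_rms / i0:.1e}")
```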

Thermal noise. This is the most important source because it is present in any physical system and is the limiting factor in measuring small signals. In electronic devices, for instance in a resistor, the electrons have random motion, but there is no net transport of charge; thus there is no dc current but an ac component is present. The thermal noise in a resistor is referred to as Johnson noise.

We will use a heuristic approach to derive the power spectrum as follows: when a current $i$ flows through a resistor $R$ the power dissipation is $P = i^2 R$. The thermal energy per degree of freedom is $\tfrac{1}{2}kT$ and therefore the thermal power in a frequency band $\Delta f$ is $P = kT\Delta f$. If we equate the thermal power to the current dissipation we obtain for the current fluctuation

$$\langle i^2\rangle = \frac{4kT\Delta f}{R} \tag{3.45}$$

The factor of 4 follows from a rigorous derivation of Eq. (3.45) which was first obtained by Nyquist. The thermal noise power that can be coupled out of a resistor $R$ into a matched load is given by

$$P_{\text{noise}} = kT\Delta f \tag{3.46}$$

Thus the power spectrum of the Johnson noise is flat and equal to

$$G(\omega) = \tfrac{1}{2}kT \tag{3.47}$$

In the above expressions $k$ is Boltzmann's constant, $k = 1.38 \times 10^{-23}$ J/K, and $T$ the absolute temperature in kelvin.

As an application of Eq. (3.45) we take $R = 50\ \Omega$, $T = 300$ K and $\Delta f = 1$ MHz. This yields $\sqrt{\langle i^2\rangle} = 1.8 \times 10^{-8}$ A, which exceeds by two orders of magnitude the rms current due to shot noise as calculated in the previous example. Measurements of Johnson noise can be used to accurately determine the value of Boltzmann's constant.
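The Johnson-noise example follows directly from Eqs. (3.45) and (3.46). A minimal sketch, with illustrative function names and Boltzmann's constant rounded as in the text:

```python
import math

BOLTZMANN = 1.38e-23  # J/K, rounded as in the text

def johnson_current_rms(resistance, temperature, bandwidth_hz):
    """rms thermal noise current from Eq. (3.45): sqrt(4 k T df / R)."""
    return math.sqrt(4.0 * BOLTZMANN * temperature * bandwidth_hz / resistance)

def matched_noise_power(temperature, bandwidth_hz):
    """Thermal noise power delivered to a matched load, Eq. (3.46): k T df."""
    return BOLTZMANN * temperature * bandwidth_hz

i_thermal = johnson_current_rms(50.0, 300.0, 1e6)
p_noise = matched_noise_power(300.0, 1e6)
print(f"rms Johnson current = {i_thermal:.2e} A")
print(f"matched noise power = {p_noise:.2e} W")
```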

Quantum noise. At very high frequencies the quantization of electromagnetic energy leads to statistical fluctuations in the detection of individual photons; we refer to these fluctuations as quantum noise. Every photon carries energy $\hbar\omega$, so that if a signal of duration $T$ contains $N$ photons (the rate is $R = N/T$), the energy and power in the signal are

$$E = N\hbar\omega \tag{3.48}$$

and

$$P = \frac{N\hbar\omega}{T} = R\hbar\omega \tag{3.49}$$

We note that $1/T = \Delta f$ is the bandwidth of the measurement. If no photons are received in the interval $T$, the fluctuation is still one photon. This corresponds to a power level $\hbar\omega/T$, and therefore the noise power $P_N$ is

$$P_N = \hbar\omega\,\Delta f \tag{3.50}$$

The power spectrum is $G(\omega) = \hbar\omega/2$ and is proportional to the frequency. At very high frequencies quantum noise can become dominant. If the number of photons received in the measuring time is larger than one, then

(3.50^)

Noise in amplifiers. As we have seen, noise will be present in any communication channel and will coexist with the signal. The ratio of signal power to noise power is called the signal to noise ratio and is written as S/N. An amplifier is characterized by a gain $G$; thus if the input signal power is $S_1$ the output power will be $S_2 = GS_1$. Similarly, we designate the input noise power by $N_1$ and the output noise power by $N_2$. However $N_2 \neq GN_1$ because, in general, an amplifier will contribute some additional noise. These definitions are indicated in Fig. 3.18.

Fig. 3.18. Amplifier of gain $G$ with signal and noise power $S_1$, $N_1$ at its input; the signal and noise power at the output are labeled $S_2$, $N_2$.

The noise introduced by the amplifier can be characterized as an additive equivalent input noise power $N_e$, such that

$$N_2 = G(N_1 + N_e), \quad N_e \geq 0; \qquad\text{or}\qquad N_e = \frac{N_2 - GN_1}{G} \tag{3.51}$$

We express $N_e$ by an equivalent amplifier noise temperature by analogy to the thermal noise power given by Eq. (3.46), namely

$$N_e = kT_e\Delta f \tag{3.52}$$

where $\Delta f$ is the bandwidth of the amplifier. The noise temperature depends on many parameters, including the frequency at which the amplifier operates. In a good amplifier $T_e$ can be less than the ambient input temperature.

An alternate convention for expressing the noise contribution of an amplifier is to give its noise figure (NF). The noise figure is given in dB (decibel)

$$NF = 10\log_{10}(F) \tag{3.53}$$

where $F$ is the noise factor. The noise factor, in turn, is defined as the ratio of the (S/N) at the input of the amplifier to that at its output; by definition the noise factor is larger than one,

$$F = \frac{(S/N)_{\text{input}}}{(S/N)_{\text{output}}} = \frac{S_1/N_1}{S_2/N_2} = \frac{N_2}{GN_1} = \frac{G(N_1 + N_e)}{GN_1} = 1 + \frac{N_e}{N_1} = 1 + \frac{T_e}{T_1} \tag{3.54}$$

In the last step we assumed that the input noise is thermal, and at temperature $T_1$. The difficulty that arises with the noise figure is that its value depends on some standard value for $T_1$. It is agreed that $T_1 = 290$ K.

Thus for instance an amplifier with a noise temperature $T_e = 58$ K has a noise factor $F = 1 + (58/290) = 1.2$, and a noise figure $NF = 10\log_{10}(1.2) = 0.8$ dB.
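The conversion between noise temperature, noise factor and noise figure in Eqs. (3.53) and (3.54) is easy to script. In this sketch the function names are illustrative, and the inverse conversion (noise figure back to temperature) is simply Eq. (3.54) solved for $T_e$.

```python
import math

T_REFERENCE = 290.0  # K, the agreed standard input temperature T1

def noise_factor(te, t1=T_REFERENCE):
    """Noise factor F = 1 + Te/T1, Eq. (3.54)."""
    return 1.0 + te / t1

def noise_figure_db(te, t1=T_REFERENCE):
    """Noise figure NF = 10 log10(F) in dB, Eq. (3.53)."""
    return 10.0 * math.log10(noise_factor(te, t1))

def noise_temperature(nf_db, t1=T_REFERENCE):
    """Invert Eqs. (3.53)-(3.54): Te = (10^(NF/10) - 1) * T1."""
    return (10.0 ** (nf_db / 10.0) - 1.0) * t1

print(f"F  = {noise_factor(58.0):.2f}")
print(f"NF = {noise_figure_db(58.0):.2f} dB")
```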

When two or more amplifiers are cascaded the noise factor of the first stage usually dominates the overall noise in the system. This can be easily seen by constructing the noise factor $F$ for the combined system of the two amplifiers sketched in Fig. 3.19. It is easy to show that

$$T_N = T_{N1} + \frac{T_{N2}}{G_1} \tag{3.55}$$

Thus as long as $T_{N2}/G_1 \ll T_{N1}$ the noise contribution of the second amplifier becomes insignificant.

It is possible to detect signals even when $(S/N) \ll 1$ provided the signal is present for an adequately long time interval. We can then make repeated measurements and average them. If $n$ averages are taken, the fluctuations in the noise level are reduced by a factor of $1/\sqrt{n}$ whereas the signal is unchanged. Thus it becomes possible to distinguish even small signal power over the 'smooth' noise level.

Fig. 3.19. Cascaded amplifiers.
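Eq. (3.55) can be sketched as a small function. The version below also handles more than two stages by dividing each stage's noise temperature by the total gain preceding it, which is the standard generalization (the Friis formula) and goes slightly beyond what the text states; the stage values are made-up illustrative numbers.

```python
def cascade_noise_temperature(stages):
    """Equivalent input noise temperature of cascaded amplifiers.

    stages: list of (noise_temperature_K, power_gain) in signal order.
    For two stages this reduces to Eq. (3.55): TN = TN1 + TN2/G1.
    """
    total, preceding_gain = 0.0, 1.0
    for temperature, gain in stages:
        total += temperature / preceding_gain
        preceding_gain *= gain
    return total

# Illustrative values: a quiet first stage followed by a noisy second stage.
quiet_first = cascade_noise_temperature([(50.0, 100.0), (1000.0, 10.0)])
noisy_first = cascade_noise_temperature([(1000.0, 10.0), (50.0, 100.0)])
print(quiet_first, noisy_first)
```

Putting the low-noise stage first keeps the overall noise temperature close to that of the first stage alone, which is the point made in the text.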

3.7 Elements of communication theory

In order to optimize a communication channel we must have a quantitative measure of the information that we wish to transmit. The efforts in this direction were pioneered at Bell Labs by J. W. Tukey, R. L. V. Hartley, H. Nyquist and others, and placed in a formal context by C. E. Shannon.* Language messages have statistical properties which have been studied and are well known to cryptographers. Furthermore messages carry a varying degree of information and a communications channel has a given capacity for transferring information. Messages are not restricted to language but can be coded in a variety of forms; for instance a television picture. We will begin, nevertheless, by reviewing the statistical properties of the English language and use this analysis as an example of the more general concepts.

* C. E. Shannon, The Mathematical Theory of Communication, The University of Illinois Press, 1962.

The letters of the alphabet form a finite set and in every language we can define the frequency $P(i)$ with which the letter 'i' appears. We can similarly define the frequency for digrams $P(ij)$, namely the probability that the combination 'ij' appears in the language; the conditional probability $P(i|j)$ gives the frequency with which, once 'i' is found, it is followed by the letter 'j'. In summary

P(i)     frequency of letter i
P(ij)    frequency of digram ij
P(i|j)   conditional probability that j follows i

These probabilities are normalized, in obvious manner, by

$$\sum_i P(i) = 1, \qquad \sum_{i,j} P(ij) = 1, \qquad \sum_j P(i|j) = 1, \qquad i, j = 1, \ldots, N \ \text{\{all letters of the alphabet\}}$$

Given a set of symbols we can select a sample from it according to theprobability of each symbol. Such a sample is called stochastic. A stochasticsample of the letters of the alphabet will, in general, not be an intelligentmessage. This is so because intelligent messages contain informationwhereas a stochastic sample does not. We can improve on this selectionprocess by using not only the probability of the letters, but also the higherorder probabilities such as digrams, trigrams etc. In this case the samplemay resemble a message in English but still it cannot contain informationbecause it was constructed randomly.

As examples we give below messages which have been obtained bysampling the alphabet (including the blank space) through differentapproximations.

Zero order - All letters equiprobable

BTLCCVNFYRRKDXQR . . .

First order - Letters selected according to their frequency in the English language

TH EEI ALHENHTTPA NAH . . .

Second order - Letters selected according to their digram frequency. A practical way of doing this is to select the first letter, say E, and then open a book at random and find the letter E; enter in the sample the letter following E, say K. Now turn the page and begin at the top until you find K; enter in the sample the letter that follows K, say R, and so on

TEDELICEPY CHE SUS ORAINOS . . .

One can proceed to higher orders in forming such stochastic messages.

The same procedure can be used with words rather than letters. Here we have a much larger finite set of symbols and correspondingly the process is more laborious. For instance messages where the words have been selected from a first order and a second order approximation are given below.

First order word message

REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME . . .

Second order word message

WHEN NOISE RATIO AT FIRST STEP, DEPENDS UPON . . .

Note that the second order message was constructed using a text on telecommunications and therefore is not completely random. It contains information as to the set of words from where it was drawn.
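The first and second order letter approximations described above can be sketched in code. This is only an illustration under stated assumptions: the short `SAMPLE_TEXT` stands in for the book that one would open at random, and drawing uniformly from the list of observed followers replaces the manual page-turning procedure. Both choices are hypothetical and not from the text.

```python
import random
from collections import defaultdict

# A short stand-in corpus (an assumption; any long English text would do).
SAMPLE_TEXT = ("THE NOISE IN A COMMUNICATION CHANNEL LIMITS THE RATE AT WHICH "
               "INFORMATION CAN BE TRANSMITTED AND THE SIGNAL MUST BE STRONGER "
               "THAN THE NOISE TO BE DETECTED ")

def first_order(text, length, rng):
    """Pick each symbol independently, weighted by its frequency in text."""
    return "".join(rng.choice(text) for _ in range(length))

def second_order(text, length, rng):
    """Pick each symbol according to which symbols follow the previous one."""
    followers = defaultdict(list)
    # Wrap around so that the last symbol of the text also has a follower.
    for current, following in zip(text, text[1:] + text[0]):
        followers[current].append(following)
    message = [rng.choice(text)]
    while len(message) < length:
        message.append(rng.choice(followers[message[-1]]))
    return "".join(message)

rng = random.Random(1)  # fixed seed so the run is reproducible
msg1 = first_order(SAMPLE_TEXT, 40, rng)
msg2 = second_order(SAMPLE_TEXT, 40, rng)
print(msg1)
print(msg2)
```

Every digram in the second order message occurs somewhere in the sample text, which is why such messages begin to resemble the source language without carrying any information.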

More details on the structure of the English language can be found in the references. Suffice it to say that in English there are

26 letters
16 357 words (this is an approximation to the commonly used words)
4.5 letters/word

The most common letter is E, having a frequency of 0.13105, and the least common is Z with a frequency of 0.00077. The frequency of occurrence of letters in English is reproduced in Table 3.1 below (taken from Secret and Urgent by Fletcher Pratt, Blue Ribbon Books, 1939). A tabulation of word frequencies can be found in Relative Frequency of English Speech Sounds by G. Dewey, Harvard University Press, 1923.

The frequency of words appears to follow a law typical of large samples; namely if the symbols of the set are ranked in order of descending frequency, then the frequency of the $r$th symbol is given by

$$P(r) = \frac{C}{r} \tag{3.56}$$

where $C$ is a constant. Thus

$$\log P(r) = \log C - \log r$$


This relationship is known as Zipf's law and is shown in Fig. 3.20 for thewords of the English language, where C ~ 0.1.

From the discussion it should be clear that in an English language message, transmittal of the letter X (which has low frequency) conveys much more information than transmittal of the letter E (which has high frequency). This relationship is reminiscent of the definition of entropy in statistical systems. Entropy is a macroscopic measure of the probability that a random arrangement (of the positions and momenta of the molecules) will lead to that state of the system. High entropy means lack of information about the specific arrangement of the molecules because many microstates can lead to that particular macroscopic state. When only a few microstates can contribute to a macroscopic state of the system the entropy is small, hence the expression that 'information is negative entropy'.

Table 3.1. Frequency of occurrence of letters in English

Rank   Letter   Frequency of occurrence   Frequency of occurrence
                in 1000 words             in 1000 letters
 1.    E        591                       131.05
 2.    T        473                       104.68
 3.    A        368                        81.51
 4.    O        360                        79.95
 5.    N        320                        70.98
 6.    R        308                        68.32
 7.    I        286                        63.45
 8.    S        275                        61.01
 9.    H        237                        52.59
10.    D        171                        37.88
11.    L        153                        33.89
12.    F        132                        29.24
13.    C        124                        27.58
14.    M        114                        25.36
15.    U        111                        24.59
16.    G         90                        19.94
17.    Y         89                        19.82
18.    P         89                        19.82
19.    W         68                        15.39
20.    B         65                        14.40
21.    V         41                         9.19
22.    K         19                         4.20
23.    X          7                         1.66
24.    J          6                         1.32
25.    Q          5                         1.21
26.    Z          3                         0.77

Entropy is defined through

$$S = k\ln W \tag{3.57}$$

where $W$ is the number of microscopic states and $k$ is Boltzmann's constant; the form of Eq. (3.57) is the only one that assures the additivity of the entropy when two systems A, B are coalesced into a single system C

$$S_C = S_A + S_B = k\ln(W_A W_B) = k\ln W_C$$

By analogy we define the information content $h_i$ of a message $i$ through the probability $p_i$ of its occurrence

$$h_i = -p_i\log_2 p_i \tag{3.58}$$

The minus sign is necessary since by definition $p_i \leq 1$. Suppose we are sending a message in binary form and that the two symbols '0' and '1' have equal probability of being transmitted. Therefore, each symbol has probability $\tfrac{1}{2}$ and according to Eq. (3.58) the corresponding information content is

$$h_1 = h_0 = -\tfrac{1}{2}\log_2\left(\tfrac{1}{2}\right) = \tfrac{1}{2} \tag{3.59}$$

The information content of a message consisting of one binary digit is the sum of the information carried by the two symbols, or states

$$H(\text{binary digit}) = h_1 + h_0 = 1 \tag{3.60}$$

The unit of information has been defined in terms of the binary digit as in Eq. (3.60) and is called a bit; this word has been derived from a contraction of bi(nary) (digi)t.

Fig. 3.20. Word frequency v. word order for the English language obeys Zipf's law. (From E. W. Montroll and W. W. Badger, Introduction to Quantitative Aspects of Social Phenomena, Gordon and Breach Publishers, New York (1974) by permission.)

We can extend the argument to binary transmission where the two states have unequal probability, $p_0 \neq p_1$. If $N$ digits have been transmitted, state '0' will be received $Np_0$ times while state '1' will be received $Np_1$ times. The information content of this transmission of $N$ digits is

$$H_N = H_{N0} + H_{N1} = -Np_0\log_2 p_0 - Np_1\log_2 p_1 = -N\sum_{i=0}^{1} p_i\log_2 p_i \tag{3.61}$$

The information per digit is defined through $H = H_N/N$; in general $H \leq 1$, and $H = 1$ if and only if $p_0 = p_1$. If a larger set of symbols is used in the message, the information content per message has the same form as for the binary case but with the sum extending over all symbols.

$$H = -\sum_i p_i\log_2 p_i \quad \text{(bits)} \tag{3.62}$$

The information is maximal when all $p_i$ are equal to each other. If one $p_i = 1$ and all others are zero, the information content of a message is zero.
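Eq. (3.62) translates directly into a short function. In this sketch the name `entropy_bits` is illustrative, and terms with $p_i = 0$ are skipped, using the usual convention that $p\log_2 p \to 0$ as $p \to 0$.

```python
import math

def entropy_bits(probabilities):
    """Information per symbol, H = -sum p_i log2 p_i, Eq. (3.62)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy_bits([0.5, 0.5]))     # equiprobable binary digit
print(entropy_bits([0.9, 0.1]))     # unequal binary probabilities, below 1 bit
print(entropy_bits([1.0, 0.0]))     # a certain outcome carries no information
print(entropy_bits([1 / 26] * 26))  # 26 equiprobable letters
```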

We can apply this formalism to the English language with its 26 letters. If all letters were equiprobable ($p_i = \tfrac{1}{26}$), receipt of one letter would contain

$$H = -\sum_{i=1}^{26} \tfrac{1}{26}\log_2\left(\tfrac{1}{26}\right) = \log_2(26) = 4.7 \text{ bits} \tag{3.63}$$

This is not surprising because with 4 binary digits we can encode only 16 symbols, whereas with 5 binary digits we can encode 32 symbols. Thus $H$ can also be interpreted as the number of binary digits needed to fully encode the set of symbols used to transmit the message. On the other hand we have seen that in the ASCII code, 7 bits are used to encode each letter of the alphabet (see Table 2.2). This may appear wasteful, but it provides redundancy. In reality in the English language the letters are not equiprobable and using the correct probabilities leads to

$$H = -\sum_{i=1}^{26} p_i\log_2 p_i = 4.2 \text{ bits} \tag{3.63'}$$

This result implies that receipt of one letter in a message, such as in a teletype text, carries less than maximal information.

We can define the relative entropy of a set of symbols as the ratio of the information content of a single symbol to the maximum possible information content. As an example, for the letters of the English alphabet

$$\text{Relative entropy} = \frac{H}{H_{\max}} = \frac{4.2}{4.7} = 0.9$$

We say that the redundancy of the letters in the English language is

$$\text{Redundancy} = 1 - (\text{Relative entropy}) = 0.1$$

If we take into account digram and trigram probabilities the information content is even smaller and the redundancy of the language is approximately 0.5. This means that in writing a text in English only half the letters can be freely chosen; the other half is imposed by the structure of the language.
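The single-letter entropy and redundancy can be estimated from the per-1000-letters column of Table 3.1. The sketch below assumes the entry for D is 37.88, the value for which the column sums to about 1000; with these frequencies the result comes out slightly above 4.1 bits, close to the rounded figures quoted in the text.

```python
import math

# Occurrences per 1000 letters, in the order E T A O N R I S H D L F C M U
# G Y P W B V K X J Q Z (Table 3.1; the D entry is taken as 37.88).
PER_1000_LETTERS = [
    131.05, 104.68, 81.51, 79.95, 70.98, 68.32, 63.45, 61.01, 52.59,
    37.88, 33.89, 29.24, 27.58, 25.36, 24.59, 19.94, 19.82, 19.82,
    15.39, 14.40, 9.19, 4.20, 1.66, 1.32, 1.21, 0.77,
]

total = sum(PER_1000_LETTERS)
probabilities = [f / total for f in PER_1000_LETTERS]  # normalize to sum 1

letter_entropy = -sum(p * math.log2(p) for p in probabilities)
max_entropy = math.log2(len(probabilities))            # equiprobable limit
relative_entropy = letter_entropy / max_entropy
redundancy = 1.0 - relative_entropy

print(f"H = {letter_entropy:.2f} bits, Hmax = {max_entropy:.2f} bits")
print(f"relative entropy = {relative_entropy:.2f}, redundancy = {redundancy:.2f}")
```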

It is possible to devise coding schemes that are maximally efficient andapproach the theoretical limit given by Eq. (3.62). Such codes are knownas Huffman codes but do not have the simplicity of less compact codessuch as ASCII. Furthermore, redundancy in a coding system is useful forerror recovery as we discuss in the next section. Shannon's work has madepossible the quantitative evaluation of the information content ofmessages; this content, measured in bits, is independent of the method ofcoding or transmission.
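To make the idea of a Huffman code concrete, here is a minimal sketch that computes only the code lengths, by repeatedly merging the two least probable groups of symbols. The four-symbol probability set is an illustrative assumption, chosen so that the average code length equals the entropy of Eq. (3.62) exactly; for general probabilities the average length lies between $H$ and $H + 1$ binary digits.

```python
import heapq
import math

def huffman_code_lengths(probabilities):
    """Return the Huffman code length (in binary digits) of each symbol."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    lengths = [0] * len(probabilities)
    counter = len(probabilities)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for symbol in group1 + group2:
            lengths[symbol] += 1      # merged symbols sit one level deeper
        heapq.heappush(heap, (p1 + p2, counter, group1 + group2))
        counter += 1
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]     # illustrative symbol probabilities
lengths = huffman_code_lengths(probs)
average_length = sum(p * n for p, n in zip(probs, lengths))
entropy = -sum(p * math.log2(p) for p in probs)
print(lengths, average_length, entropy)
```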

3.8 Channel capacity

Since information is measured in bits, the rate at which information is transmitted will be measured in bits per second. We write

$$I = \text{(bits/s)}$$

If we wish to transmit $R$ messages/s, each message having an information content of $H$ bits, then the transmission rate $I$ must be

$$I = HR \quad \text{(baud)} \tag{3.64}$$

Here we use the commonly used notation where 1 bit/s = 1 baud, this unit having been named in honor of the French engineer J. M. E. Baudot.

A communications channel has a capacity $C$ at which it can transmit information. If the bandwidth of the channel is $W$ and the number of symbols that are being transmitted is $m$, then the capacity is

$$C = 2W\log_2(m) \quad \text{(baud)} \tag{3.65}$$

In a binary channel, $m = 2$ and therefore $C = 2W$. This result is analogous to the sampling theorem (Eq. (3.20)) which states that an analog signal of harmonic content (or bandwidth) $W$ must be sampled at rate $f_s = 2W$. If the digitization is done at several levels, the channel capacity increases logarithmically. There are however good reasons for staying with binary transmission because of the lower probability for errors.

Eq. (3.65) is given for a noiseless channel. For a channel with signal to noise ratio S/N the channel capacity is given by Shannon's equation

$$C = W\log_2(1 + S/N) \tag{3.66}$$

The above equation is the basis for the design of any communication channel. Note that when S/N = 3 the capacity is equal to that of a noiseless binary channel (Eq. (3.65)). For larger values of S/N one can use multilevel encoding rather than straight binary so as to have the same capacity as an optimally chosen noiseless channel.
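Eqs. (3.65) and (3.66) can be compared numerically. In this sketch the function names and the 10 kHz bandwidth are illustrative; it confirms that a channel with S/N = 3 has the same capacity as a noiseless binary channel of the same bandwidth.

```python
import math

def noiseless_capacity(bandwidth_hz, levels):
    """Capacity of a noiseless channel, Eq. (3.65): C = 2 W log2(m)."""
    return 2.0 * bandwidth_hz * math.log2(levels)

def shannon_capacity(bandwidth_hz, snr):
    """Capacity of a noisy channel, Eq. (3.66): C = W log2(1 + S/N).

    snr is the power ratio S/N (not in dB).
    """
    return bandwidth_hz * math.log2(1.0 + snr)

W = 1e4  # illustrative bandwidth in Hz
print(noiseless_capacity(W, 2), shannon_capacity(W, 3))
```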

As an application of Shannon's equation we consider facsimile transmission. Let the desired transmission parameters be

Transmission rate                  R = 1 frame/minute
Size of frame                      10 × 10 inch
Horizontal line density            100 lines/inch
Resolution of lines                100 dots/inch
Intensity shadings for each dot    32

We want to find the minimal required bandwidth if the channel has a signal to noise ratio S/N = 20 dB.

First we calculate the information content of each frame, which has

[(10 × 100) = 10³ lines] × [(10 × 100) = 10³ dots/line] = 10⁶ dots

Each dot can be any one of 32 symbols (all a priori equiprobable) so that according to Eq. (3.62)

$$H = 10^6\log_2(32) = 5 \times 10^6 \text{ bits}$$

The message rate is $R = \tfrac{1}{60}\ \text{s}^{-1}$, and therefore the transmission rate is

$$I = HR = 8.3 \times 10^4 \text{ baud}$$

A S/N ratio of 20 dB corresponds to

$$S/N = 10^{(20/10)} = 10^2$$

and thus the channel capacity from Eq. (3.66) is

$$C = W\log_2(1 + 10^2) = 6.66\,W$$

If we equate the channel capacity to the desired transmission rate we find the minimum bandwidth required for the channel

$$W_{\min} = \frac{I}{\log_2(1 + S/N)} = \frac{8.3 \times 10^4}{6.66} = 1.2 \times 10^4 \text{ Hz} \tag{3.67}$$

If we had used the noiseless channel formula with $m = 32$ we would have concluded

$$W_{\min} = 0.83 \times 10^4 \text{ Hz} \tag{3.67'}$$

which is below the Shannon limit. Even the result of Eq. (3.67) is too optimistic because it does not allow for any redundancy. Since the bandwidth of typical telephone lines is of order $2 \times 10^4$ Hz we conclude that the specifications of this example correspond to realistic parameters for facsimile transmission.
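The facsimile example can be reproduced step by step. This is a minimal sketch with illustrative variable names; the inputs are the parameters listed in the text.

```python
import math

# Frame: 10 x 10 inch, 100 lines/inch, 100 dots/inch, 32 shadings per dot.
dots_per_frame = (10 * 100) * (10 * 100)
bits_per_frame = dots_per_frame * math.log2(32)   # H, Eq. (3.62)
frames_per_second = 1.0 / 60.0                    # 1 frame per minute
rate = bits_per_frame * frames_per_second         # I = H R, Eq. (3.64)

snr = 10.0 ** (20.0 / 10.0)                       # 20 dB as a power ratio
w_shannon = rate / math.log2(1.0 + snr)           # from Eq. (3.66)
w_noiseless = rate / (2.0 * math.log2(32))        # from Eq. (3.65), m = 32

print(f"H = {bits_per_frame:.2e} bits, I = {rate:.2e} baud")
print(f"Wmin (Shannon)   = {w_shannon:.3e} Hz")
print(f"Wmin (noiseless) = {w_noiseless:.3e} Hz")
```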

From Shannon's equation it appears as if one could maintain a given channel capacity by increasing the bandwidth at the expense of S/N, or vice versa. In the limit that the receiver noise is thermal, no further improvement in S/N is possible, and for transmission over very long distances more often than not $S/N \ll 1$. It is in these cases that Shannon's equation is of great importance. We can then recast Eq. (3.66) in a form where the channel capacity depends only on the received power, $P_S$, and the noise temperature $T$ of the receiver. Since $S/N \ll 1$

$$\log_2\left(1 + \frac{S}{N}\right) = \frac{1}{\log_e(2)}\log_e\left(1 + \frac{S}{N}\right) \approx 1.44\,\frac{P_S}{P_N}$$

The noise power is assumed to be thermal (see Eq. (3.46))

$$P_N = kT\Delta f = kTW \tag{3.68}$$

where $\Delta f = W$ is the bandwidth of the receiver. Thus Eq. (3.66) becomes

$$C \approx 1.44\,\frac{P_S}{kT} \tag{3.69}$$

In this limit, increasing the bandwidth does not improve the channel capacity because it increases the noise power in the same proportion.

The limit of Eq. (3.69) is representative of communication with distant spacecraft where the transmitter power is limited and the distance over which the signal is transmitted is very long. As an application let us consider transmission of messages from a satellite orbiting Mars. We assume the following plausible parameters.

Transmitter power              $P_t = 1$ W
Transmitter antenna gain       $G_t = 10^5$ (this corresponds to an antenna diameter of 6 m assuming microwave transmission, $\lambda \approx 3$ cm)
Receiving antenna diameter     $D_r = 20$ m
Distance Mars-Earth            $R \approx 4.3 \times 10^{11}$ m
Receiver temperature           $T_r = 300$ K

We first calculate the signal power received from the orbiter

$$P_S = \frac{P_t G_t}{4\pi R^2}\,\frac{\pi D_r^2}{4} \approx 1.4 \times 10^{-17} \text{ W}$$

The channel capacity is calculated from Eq. (3.69) and we find

$$C \approx 1.44\,\frac{P_S}{kT} = 1.44 \times \frac{1.4 \times 10^{-17}}{(1.38 \times 10^{-23}) \times 300} \approx 5 \times 10^3 \text{ baud}$$

The messages being transmitted could be TV pictures of Mars obtained at low resolution with a matrix of 512 × 512 dots (pixels), each with 4 shadings (black and white picture). The information content of such a frame is

$$H = (512 \times 512) \times \log_2(4) = 5 \times 10^5 \text{ bits}$$

Thus the rate at which the pictures can be transmitted is

$$R = \frac{C}{H} = \frac{5 \times 10^3}{5 \times 10^5} = \frac{1}{100} \text{ pictures/s}$$

This corresponds to 36 pictures/hour, a fairly realistic rate for the present state of technology.
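The last steps of the Mars example can be checked numerically. This sketch takes the received power of $1.4 \times 10^{-17}$ W quoted in the text as its starting point rather than recomputing it from the antenna parameters, and uses the exact factor $1/\ln 2$ in place of 1.44. Without the intermediate rounding to $5 \times 10^3$ baud and $5 \times 10^5$ bits, the picture rate comes out a little below the rounded 36 pictures/hour.

```python
import math

BOLTZMANN = 1.38e-23             # J/K, rounded as in the text

received_power = 1.4e-17         # W, value quoted in the text
receiver_temperature = 300.0     # K

# Low S/N limit of Shannon's equation, Eq. (3.69): C = (1/ln 2) Ps / (k T)
capacity = received_power / (math.log(2.0) * BOLTZMANN * receiver_temperature)

# One picture: 512 x 512 pixels with 4 shadings each.
bits_per_picture = 512 * 512 * math.log2(4)
pictures_per_hour = 3600.0 * capacity / bits_per_picture

print(f"C = {capacity:.0f} baud")
print(f"H = {bits_per_picture:.0f} bits per picture")
print(f"rate = {pictures_per_hour:.1f} pictures/hour")
```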

Exercises

Exercise 3.1

A communications channel has a bandwidth B=10 MHz and a signal tonoise ratio of 15. The receiver operates at room temperature (T= 300 K)and the noise is only thermal.

(a) Find the maximum rate of information that the channel cantransmit.

(b) Find the power at the receiver.

Exercise 3.2

What are the relevant 'orders of magnitude' answers to the followingquestions

(a) The highest frequency electromagnetic wave that can be carriedon two open wires.

(b) The frequency of visible light.

(c) The sampling frequency necessary to digitize an audio signal such as a telephone message.


Exercise 3.3

We wish to transmit television pictures consisting of

200 bits/line
200 lines/frame

at the rate of 50 frames/second.

(a) What is the minimum bandwidth required for the channel?

(b) Calculate the power due to thermal noise in the above bandwidth interval for T = 300 K.

(c) Assume that the transmitter emits with 0.2 W power isotropically and that the receiving antenna has an equivalent area of 0.4 m². What is the largest distance at which signals can be received with a signal to noise ratio of 10?

Exercise 3.4

(a) Search through 100 words to establish the frequency of the lettersE, P, X in the English language. Establish the conditionalprobability that H follows T.

(b) Search through 1000 words in an English text and establish the frequency of the 10 most frequently encountered words. Make a log-log plot of the word frequency, f, v. the word order, n (most frequent word, second most frequent, etc.) to show that Zipf's law, shown in Fig. 3.20, is indeed obeyed.

Exercise 3.5

Show that the effect of combining two Gaussian distributions of standard deviation $\sigma_1$ and $\sigma_2$ results in a Gaussian distribution of standard deviation $\sigma = (\sigma_1^2 + \sigma_2^2)^{1/2}$.

GENERATION AND PROPAGATION OF ELECTROMAGNETIC WAVES

4.1 Maxwell's equations

All electromagnetic (em) phenomena are completely and uniquely determined by a set of differential equations, the famous Maxwell's equations. These equations predict the existence of em waves and govern their propagation and generation. We will write the equations for a region of space where no dielectric or permeable materials are present, and use the MKS system.

$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0 \tag{4.1}$$

$$\frac{1}{\mu_0}(\nabla\times\mathbf{B}) - \epsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mathbf{J} \tag{4.2}$$

$$\nabla\cdot\mathbf{B} = 0 \tag{4.3}$$

$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0 \tag{4.4}$$

Here $\mathbf{E}$ and $\mathbf{B}$ are the electric and magnetic fields, which are vectors ($\mathbf{B}$ is really a pseudovector); $\rho$ and $\mathbf{J}$ are the electric charge density and electric current density respectively. The dielectric permittivity of the vacuum is designated by $\epsilon_0$ and the magnetic permeability of the vacuum by $\mu_0$. Introduction of these quantities is required to ensure consistency of units and dimensions, and in the MKS system they have the values

$$\epsilon_0 = 8.854 \times 10^{-12}\ \text{coul/V-m}, \qquad \mu_0 = 4\pi \times 10^{-7}\ \text{V-s/A-m} \tag{4.5}$$

Maxwell's equations exhibit a great degree of symmetry: Eqs. (4.1) and (4.2) are inhomogeneous and consist of one scalar and one vector equation; the driving terms are the charge density $\rho$ and the current density $\mathbf{J}$. Eqs. (4.3) and (4.4) are analogous to Eqs. (4.1) and (4.2) except that they are homogeneous; this reflects the absence of magnetic monopoles in nature. Finally there is the very important difference in sign between Eqs. (4.2) and (4.4). This is what makes possible the existence of em waves in free space as we show below.

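In free space there are no charges or currents so we set $\rho = 0$, $\mathbf{J} = 0$ in Eqs. (4.1) and (4.2). We take the curl of Eq. (4.4) and use a relation from vector calculus

$$\nabla\times(\nabla\times\mathbf{E}) = -\nabla^2\mathbf{E} + \nabla(\nabla\cdot\mathbf{E})$$

Since by Eq. (4.1), $\nabla\cdot\mathbf{E} = 0$ in free space, we have from Eq. (4.4)

$$\nabla\times(\nabla\times\mathbf{E}) = -\nabla^2\mathbf{E} = -\nabla\times\frac{\partial\mathbf{B}}{\partial t} = -\frac{\partial}{\partial t}(\nabla\times\mathbf{B}) \tag{4.6}$$

where in the last step we interchanged the order of differentiation. We note that $(\nabla\times\mathbf{B})$ is given by Eq. (4.2), and in free space

$$\nabla\times\mathbf{B} = \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t} \tag{4.6'}$$

By taking the time derivative of Eq. (4.6') and equating it to Eq. (4.6) we obtain

$$\nabla^2\mathbf{E} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{4.7}$$

Eq. (4.7) is a wave equation where the velocity of propagation $c$ is

$$c^2 = \frac{1}{\mu_0\varepsilon_0} \tag{4.8}$$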

Using the values of $\varepsilon_0$, $\mu_0$ given in Eqs. (4.5) we find that $c = 3\times10^8$ m/s, the observed velocity of propagation of light in free space. Thus light is an electromagnetic wave.
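The numerical claim above is easy to check. The following is a minimal sketch in Python, assuming the standard MKS values $\varepsilon_0 = 8.854\times10^{-12}$ coul/V-m and $\mu_0 = 4\pi\times10^{-7}$ V-s/A-m; the function name `wave_speed` is only an illustrative choice.

```python
import math

EPS0 = 8.854e-12         # vacuum permittivity, coul/(V m)
MU0 = 4.0e-7 * math.pi   # vacuum permeability, V s/(A m)

def wave_speed(eps, mu):
    """Propagation velocity from c^2 = 1/(mu * eps), Eq. (4.8)."""
    return 1.0 / math.sqrt(eps * mu)

# Velocity of em waves in free space, in m/s
print(f"c = {wave_speed(EPS0, MU0):.4e} m/s")
```

The same function applies to a material if the vacuum constants are replaced by the permittivity and permeability of that material.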

The solutions of Eq. (4.7) are waves, as can be seen by considering the simple case where $\mathbf{E}$ has only one component, say $E_x$, and varies only along one direction in space, say the Z-axis. Then Eq. (4.7) simplifies to

$$\frac{\partial^2E_x}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2} = 0$$

with solutions

$$E_x = f(z \pm ct)$$

where $f$ is any arbitrary twice-differentiable function of the argument $(z \pm ct)$. The solution $E_x = f(z - ct)$ represents waves propagating to positive $z$ whereas $E_x = f(z + ct)$ propagates to negative $z$. The function $f(z \pm ct)$ can always be represented by a Fourier expansion, thus it suffices to study the propagation of harmonic waves, such as given by

$$E_x = E_0\cos(\omega t - kz) \tag{4.9}$$

where

$$\omega/k = c \tag{4.10}$$

and $k = 2\pi/\lambda$ is the wave vector.

An em wave has both a magnetic and an electric field, normal to one another, and both transverse to the direction of propagation. We show this for the simple example of a plane wave propagating along $z$ and having only an $E_x$ component, but the result is absolutely general. Under these assumptions only the $y$-component of the curl in Eq. (4.4) is different from zero and Eq. (4.4) reads

$$\frac{\partial E_x}{\partial z} + \frac{\partial B_y}{\partial t} = 0$$

Using Eq. (4.9) for $E_x$ we find

$$\frac{\partial E_x}{\partial z} = kE_0\sin(\omega t - kz) = -\frac{\partial B_y}{\partial t}$$

so that

$$B_y = \frac{k}{\omega}E_0\cos(\omega t - kz) = \frac{1}{c}E_x, \qquad B_x = B_z = 0 \tag{4.9'}$$

Thus the plane wave of this example has the form shown in Fig. 4.1 (at $\omega t = \pi/2$).

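The electromagnetic wave carries energy along its direction of propagation, and the energy flux, that is the energy crossing unit area in unit time, is given by the Poynting vector

$$\mathbf{S} = \frac{1}{\mu_0}(\mathbf{E}\times\mathbf{B}) \tag{4.11}$$

Fig. 4.1. Electromagnetic plane wave propagating along the Z-axis and polarized along the X-axis.

$\mathbf{S}$ points along the direction of propagation $\mathbf{k}/|\mathbf{k}|$. The time dependence of $\mathbf{S}$ is $\cos^2(\omega t - kz)$ and the net energy flux is obtained by time averaging $\mathbf{S}$, which introduces a factor of $\tfrac{1}{2}$. Thus

$$\langle\mathbf{S}\rangle = \frac{1}{2\mu_0}E_0B_0\frac{\mathbf{k}}{|\mathbf{k}|} = \frac{1}{2}\left(\frac{\varepsilon_0}{\mu_0}\right)^{1/2}E_0^2\,\frac{\mathbf{k}}{|\mathbf{k}|}$$

The energy density in a region where an electric and magnetic field are present is in general

$$u = \frac{1}{2}\left(\varepsilon_0E^2 + \frac{1}{\mu_0}B^2\right) \tag{4.12}$$

Thus, the energy flux is related to the energy density in the wave through

$$|\mathbf{S}| = c\,u$$

as can be easily verified by comparing Eqs. (4.11) and (4.12) and using Eq. (4.9').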

The electric and magnetic fields are vectors, so that in principle they can have three independent components. In an em wave however both $\mathbf{E}$ and $\mathbf{B}$ must be transverse to the direction of propagation and this reduces their independent components to only two. To be concrete we will assume that the wave propagates along $z$ and we label the two components by $E_x$ and $E_y$ as in Fig. 4.2. The direction along which the E-field oscillates is called the direction of polarization of the wave. If the two components are in phase, the polarization vector remains fixed in space and we say that the wave is linearly polarized. If the two components are equal and 90° out of phase the polarization vector rotates - at frequency $\omega$ of course - and we say that the wave is circularly polarized. In general, the wave can be elliptically polarized. Even though an em wave always has definite polarization a source often emits em radiation containing waves of all possible polarizations; in that case we say that the light, or radiation, is unpolarized.

Fig. 4.2. The electric field vector in a plane transverse to the direction of propagation: (a) linear polarization, (b) circular polarization, (c) elliptical polarization.

4.2 Radiation and antennas

We now want to examine how em waves are generated. From Maxwell's inhomogeneous equations we see that the sources of the field are a time-dependent charge or current density. In fact uniformly moving electric charges do not radiate but accelerated charges do. The total power radiated by a charge $e$ subject to an acceleration $\dot{v}$ is given by Larmor's equation

$$P = \frac{2}{3}\frac{e^2}{4\pi\varepsilon_0}\frac{\dot{v}^2}{c^3} \tag{4.13}$$

Thus accelerated electrons in atoms emit light; microwave radiation is emitted from the motion of electrons in solid state devices. The motion of the conduction electrons in an antenna or the motion of free electrons in a klystron gives rise to radiation in the HF or VHF bands.

If an em wave is emitted isotropically from a source, the $\mathbf{E}$ and $\mathbf{B}$ fields must fall off with the distance $R$ from the source as $1/R$. Then, the Poynting vector $\mathbf{S}$ is proportional to $1/R^2$ and the total radiated power remains constant and independent of the distance from the source

$$P = \oint\mathbf{S}\cdot d\mathbf{A} = 4\pi R^2|\mathbf{S}| \propto 4\pi R^2\,\frac{1}{R^2} = \text{constant} \tag{4.14}$$

We can try to compensate this reduction in flux with distance by focussing the radiation by means of directional antennas; for instance, in a plane wave propagating along $z$ the time-averaged Poynting vector is independent of $z$.

The simplest radiating system is an electric dipole whose moment oscillates in time. This is shown in Fig. 4.3(a) and we assume that the dipole moment $\mathbf{p}$ has the time dependence

$$|\mathbf{p}| = p_0\cos\omega t = ed\cos\omega t \tag{4.15}$$

We could produce such a moment if the two charges executed harmonic motion, their position being given by $z = \pm(d/2)\cos\omega t$. Thus their acceleration would be $\dot{v} = \ddot{z} = \mp\omega^2(d/2)\cos\omega t$, and according to the Larmor equation the total radiated power is

$$P = \frac{2}{3}\frac{e^2}{4\pi\varepsilon_0}\frac{\left\langle\left(2\,\frac{d}{2}\,\omega^2\cos\omega t\right)^2\right\rangle}{c^3} = \frac{1}{3}\frac{1}{4\pi\varepsilon_0}\frac{p_0^2\,\omega^4}{c^3} \tag{4.16}$$

where the brackets indicate a time average. The moving charges can be thought of as a current $I = I_0\sin\omega t = 2e\omega\sin\omega t$, and using $2\pi c/\lambda = \omega$ we can rewrite Eq. (4.16) as

$$P = \frac{\pi}{12}\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}\left(\frac{d}{\lambda}\right)^2I_0^2 \tag{4.17}$$

This result ignores retardation effects and is strictly valid only when $(d/\lambda) \ll 1$. In that case however the antenna is inefficient and little power is radiated for a given current. Instead we must use antennas with dimensions of the order of the wavelength. Such a simple dipole antenna is shown in Fig. 4.3(b); it is fed at its center and the current distribution is assumed of the form

$$I = I_0\cos\left(\frac{2\pi z}{\lambda}\right)\cos\omega t$$

with

$$\lambda = 2d \qquad\text{therefore}\qquad \omega = \frac{2\pi c}{\lambda} = \frac{\pi c}{d}$$

Such an arrangement is called a half-wave antenna. The angular distribution is very similar to that of a dipole antenna, which in the limit $d/\lambda \ll 1$ is

$$\frac{dP}{d\Omega} = \frac{I_0^2}{128\pi^2}\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}(kd)^2\sin^2\theta \tag{4.18}$$

This is shown in Fig. 4.3(c); the dipole is assumed oriented along the Z-axis, and obviously the distribution is symmetric in azimuth. The energy flux, i.e. the energy $dE$ crossing the area $dA$ in unit time, at a distance $R$ from the source is

$$\frac{dP}{dA} = \frac{1}{R^2}\frac{dP}{d\Omega}$$

Integration of Eq. (4.18) over all angles is easily carried out by noting that

$$\int\sin^2\theta\,d\Omega = \frac{8\pi}{3}$$

and therefore Eq. (4.18) predicts a total power in agreement with Eq. (4.17) which was derived from Larmor's equation.

Fig. 4.3. Dipole radiation: (a) two equal and opposite charges separated by a distance $d$ form an electric dipole, (b) dipole half-wave antenna (the distribution of current flow at a particular instant of time is indicated by the dashed curve), (c) the angular distribution of the power emitted by the antenna shown in (b).

We note that in the MKS system that we are using, $(\mu_0/\varepsilon_0)^{1/2}$ has dimensions of an impedance and

$$\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2} = 377\ \Omega \tag{4.19}$$

is termed the impedance of free space. Then Eq. (4.17) can be written in the form

$$P = \tfrac{1}{2}I_0^2R_{\text{eff}} \tag{4.17'}$$

and an antenna is now characterized by its effective 'radiation' impedance. For instance, for the half-wave antenna where $\lambda = 2d$, $R_{\text{eff}} = 50\ \Omega$; this result is not exact because $d/\lambda \ll 1$ is not fulfilled but it is quite close to the exact value $R_{\text{eff}} = 73\ \Omega$.
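The 50 Ω figure can be reproduced numerically. The sketch below assumes the short-dipole result $P = (\pi/12)(\mu_0/\varepsilon_0)^{1/2}(d/\lambda)^2I_0^2$ together with $P = \tfrac{1}{2}I_0^2R_{\text{eff}}$, which gives $R_{\text{eff}} = (\pi/6)(\mu_0/\varepsilon_0)^{1/2}(d/\lambda)^2$; the function name is an illustrative choice, not part of the text.

```python
import math

Z0 = 376.73  # impedance of free space (mu0/eps0)^(1/2), in ohms

def short_dipole_radiation_resistance(d_over_lambda):
    """Effective radiation resistance (ohms) of a short dipole of length d.

    Assumes P = (pi/12) * Z0 * (d/lambda)^2 * I0^2 and P = (1/2) * I0^2 * R_eff.
    Strictly valid only for d/lambda << 1.
    """
    return (math.pi / 6.0) * Z0 * d_over_lambda**2

# Extrapolating the short-dipole formula to the half-wave antenna (lambda = 2d)
print(f"R_eff = {short_dipole_radiation_resistance(0.5):.1f} ohm")
```

Because $R_{\text{eff}}$ grows as $(d/\lambda)^2$, an antenna ten times shorter than the half-wave antenna radiates about one hundred times less power for the same current, which is the inefficiency referred to above.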

4.3 Directional antennas

We saw that the simple dipole antenna emits more radiation in the equatorial plane and much less at small polar angles. Thus it has some directionality. We can improve the directionality further by using an array of dipoles which are fed by signals with a definite phase relationship. As the simplest example we consider two dipole antennas positioned a distance $\lambda/4$ apart along the Y-axis, as shown in Fig. 4.4(a). For an observer on the Y-axis at $y > 0$, the signal from antenna 2 will arrive a quarter of a period later than from antenna 1. If however the source driving antenna 2 has its phase advanced by $\pi/2$ the signals from both antennas will arrive in phase and interfere constructively. On the other hand, if the observer is located at $y < 0$ he receives the signal from antenna 2 first and, given its phase advance, the two signals arrive with a phase difference of 180° and therefore interfere destructively. The radiation pattern from such an arrangement in the plane normal to the dipoles is shown in Fig. 4.4(b).

To express the previous argument analytically we consider the electric field on the Y-axis where at a point $y > 0$ each antenna contributes

$$E_1 = E_0\cos\left(\omega t - ky + \frac{k\lambda}{8} + \phi_1\right)$$

$$E_2 = E_0\cos\left(\omega t - ky - \frac{k\lambda}{8} + \phi_2\right)$$

The total field is given by $(E_1 + E_2)$; by using the trigonometric relation for the sum of the cosines (and noting that $k\lambda = 2\pi$), we find

$$E_1 + E_2 = 2E_0\cos\left(\omega t - ky + \frac{\phi_1+\phi_2}{2}\right)\cos\left(\frac{\pi}{4} + \frac{\phi_1-\phi_2}{2}\right)$$

The energy flux is proportional to $|E|^2$. If we take the time average, the first cosine contributes $\tfrac{1}{2}$ and

$$\langle|E|^2\rangle = 2E_0^2\cos^2\left(\frac{\pi}{4} + \frac{\phi_1-\phi_2}{2}\right)$$

Thus, if $\Delta\phi = \phi_1 - \phi_2 = -\pi/2$ maximum power is radiated toward $+y$; if $\Delta\phi = \pi/2$ no power is radiated towards $+y$ but maximum power is radiated towards $-y$. If $\Delta\phi = 0$ only one half of maximal power is radiated along the Y-axis.
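The three cases just listed can be checked with a few lines of Python. This is a minimal sketch, assuming two dipoles $\lambda/4$ apart so that the geometric phase term is $\pm\pi/4$ (plus sign toward $+y$, minus sign toward $-y$); the function name and its arguments are illustrative only.

```python
import math

def relative_power(delta_phi, toward_positive_y=True):
    """Time-averaged power on the Y-axis, relative to its maximum value.

    delta_phi is phi1 - phi2 in radians. The two dipoles are lambda/4 apart,
    which contributes +pi/4 toward +y and -pi/4 toward -y.
    """
    geometric = math.pi / 4.0 if toward_positive_y else -math.pi / 4.0
    return math.cos(geometric + delta_phi / 2.0) ** 2

for label, dphi in (("-pi/2", -math.pi / 2), ("0", 0.0), ("+pi/2", math.pi / 2)):
    fwd = relative_power(dphi, toward_positive_y=True)
    back = relative_power(dphi, toward_positive_y=False)
    print(f"delta_phi = {label:>5}: toward +y {fwd:.2f}, toward -y {back:.2f}")
```

Changing `delta_phi` continuously moves power from one direction to the other without any mechanical motion, which is the principle of the phased array.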

As more individual antennas are used in an array, the radiation pattern becomes narrower and the side lobes become smaller. The advantage of a phased array antenna is that one can change the direction in which it points by simply changing the phase of the elements. This is much more efficient than trying to mechanically rotate a massive antenna.

A very simple but common example of a phased array is the 'Yagi' antenna, used for household reception of TV, shown schematically in Fig. 4.5. If we take the frequency to be in the UHF band where $f = 500$ MHz, then $\lambda = c/f = 0.6$ m. Thus we can expect the individual dipoles to be spaced $\lambda/4$, or about half a foot, apart from each other. The length of the dipoles establishes the phase of the individually received signals. Note that television transmission has the polarization vector in the horizontal plane so that the array also must be horizontal.

Fig. 4.4. Directional antenna: (a) two dipoles separated by $\lambda/4$, (b) the resulting radiation pattern in the plane normal to the dipoles.

It is a general law that the directionality of an antenna is related to its overall dimensions $L$ by $L\,\delta\theta \sim \lambda$. For instance, in the example of Fig. 4.4 we have approximately $L \sim \lambda$, and therefore we expect $\delta\theta \sim 1$ radian. Thus, large antennas are needed for long wavelengths as also indicated by Eq. (4.17). For microwaves, $\lambda$ is of order of centimeters and one can obtain excellent directionality with parabolic reflectors of a few meters diameter.

A parabola is the locus of all points that are equidistant from the directrix and the focus $F$ as shown in Fig. 4.6. If we choose the Y-axis as the directrix and place the focus, $F$, on the X-axis at the coordinate $x_F = 2a$, the equation of the parabola becomes $y^2 = 4a(x - a)$. Given that the angle of reflection equals the angle of incidence it is easy to show that any ray originating at the focus is reflected in a direction parallel to the X-axis. A parabolic reflector is generated by rotating a parabola around the X-axis.

Fig. 4.5. A 'Yagi' antenna.

Fig. 4.6. A parabolic reflector focuses parallel light to a point; for large curvature the parabola can be approximated by a spherical surface of radius $R = 2a$. (The figure marks the equality AB = AF that defines the parabola.)


Thus the emerging pattern will always have azimuthal symmetry. As long as the diameter $D$ of the reflector does not exceed the focal distance $a$, the parabolic shape can be approximated by a spherical reflector of radius of curvature $R = 2a$.

The larger the diameter of the reflector, the more power will be directed into the X-direction. The ultimate limit in the directivity (or focussing) of the reflector comes from the wave nature of light. This is so because even an idealized source must have dimensions at least of order $\lambda$ (see for instance also the discussion following Eq. (4.17)). It can be shown that a ray originating at a distance $\delta x$ from the focus emerges at an angle $\delta\theta$ with respect to the horizontal, where

$$\delta\theta = \frac{\delta x}{2a} \tag{4.20}$$

If we set $2a \sim R \sim D/2$, and $\delta x \sim \lambda/2$, we obtain

$$\delta\theta = \lambda/D \tag{4.20'}$$

Eqs. (4.20) are a form of 'uncertainty relations' such as given by Eq. (3.6). They are equally applicable to optical lenses and are referred to as the diffraction limit.

We define the antenna gain, at angle $\theta, \phi$, as the ratio of the power per unit solid angle radiated at angle $\theta, \phi$ to the power that would have been radiated from an isotropic source, i.e. in the absence of the antenna. Namely

$$G(\theta,\phi) = \frac{dP}{d\Omega}(\theta,\phi)\,\frac{1}{(P_t/4\pi)} \tag{4.21}$$

where $P_t$ is the total power. Obviously the total power is unaffected by the antenna and

$$P_t = \int\frac{dP}{d\Omega}\,d\Omega = \frac{P_t}{4\pi}\int G(\theta,\phi)\,d\Omega$$

so that the antenna gain function is normalized as

$$\int G(\theta,\phi)\,d\Omega = 4\pi \tag{4.21'}$$

Parabolic antennas or 'dishes' are symmetric in $\phi$ and we will assume that all the radiation is contained in the angular range $\delta\theta \ll 1$. With this simplified model we can integrate Eq. (4.21')

$$2\pi\int G(\theta)\,d\cos\theta = 2\pi\int_0^{\delta\theta}G\sin\theta\,d\theta = \pi G\,\delta\theta^2$$

where $G$ is the gain in the forward cone, which we presumed constant. Thus $G = 4/\delta\theta^2$; a better calculation gives for the gain in the forward direction (which is the maximum gain) $G_0 = \pi^2/\delta\theta^2$. We can use $\delta\theta$ from Eq. (4.20') and since the area of the dish is $\pi D^2/4 = A$, the expression for $G_0$ becomes

$$G_0 = \left(\frac{4\pi}{\lambda^2}\right)A \tag{4.22}$$

Thus the energy flux in the forward direction at a distance $R$ from the antenna will be

$$\frac{dP}{dA} = \frac{1}{R^2}\frac{dP}{d\Omega} = \frac{P_tG_0}{4\pi R^2} \tag{4.23}$$

Eq. (4.23) shows the importance of a large area dish and of short wavelength.
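To get a feeling for the magnitudes involved, the relations $\delta\theta = \lambda/D$, $G_0 = (4\pi/\lambda^2)A$ and $dP/dA = P_tG_0/(4\pi R^2)$ can be evaluated directly. The sketch below is only an illustration; the function names and the sample dish (1 m² area, 10 cm wavelength) are assumptions chosen for the example.

```python
import math

def beamwidth(diameter_m, wavelength_m):
    """Diffraction-limited angular width in radians, delta_theta = lambda / D."""
    return wavelength_m / diameter_m

def forward_gain(area_m2, wavelength_m):
    """Maximum (forward) gain of a dish, G0 = 4 * pi * A / lambda^2."""
    return 4.0 * math.pi * area_m2 / wavelength_m**2

def forward_flux(total_power_w, gain, distance_m):
    """Energy flux in W/m^2 in the forward direction, Pt * G0 / (4 * pi * R^2)."""
    return total_power_w * gain / (4.0 * math.pi * distance_m**2)

area = 1.0                                   # dish area in m^2 (illustrative)
wavelength = 0.10                            # 10 cm, i.e. 3 GHz (illustrative)
diameter = math.sqrt(4.0 * area / math.pi)   # from A = pi * D^2 / 4

g0 = forward_gain(area, wavelength)
print(f"beamwidth = {beamwidth(diameter, wavelength):.3f} rad")
print(f"G0        = {g0:.0f}")
print(f"flux at 100 km for 1 kW = {forward_flux(1e3, g0, 1e5):.2e} W/m^2")
```

Halving the wavelength at fixed dish area quadruples the forward gain, which is the dependence expressed by Eq. (4.22).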

Example: As an application of the above we will derive the power received by a radar operating with the following parameters

Target distance: $R$
Power in pulse: $P_t$
Antenna forward gain: $G_0 = (4\pi/\lambda^2)A$
Reflection cross section: $\sigma$

The energy flux at the target is given by Eq. (4.23) and the energy reflected from the target equals the incident flux multiplied by the reflection cross-section (an area). Thus

$$P_{\text{reflected}} = \sigma\,\frac{P_tG_0}{4\pi R^2}$$

The reflected power is radiated isotropically so that the reflected energy flux at the receiver is

$$\frac{dP_r}{dA} = \frac{P_{\text{reflected}}}{4\pi R^2}$$

The received power is $P_r = A(dP_r/dA)$ where $A$ is the area of the dish. Thus

$$P_r = \frac{P_tG_0\,\sigma A}{(4\pi R^2)^2} \tag{4.24}$$

where we have assumed optimal pointing. We see again the importance of short wavelength for radar; the received signal drops off as the fourth power of the target distance.

We can introduce typical numerical values

$\sigma \sim 10$ m²
$\lambda \sim 10$ cm (3 GHz)
$P_t \sim 1$ kW
$G_0 \sim 10^3$ (this corresponds to $A \sim 1$ m²)

To have good resolution the pulse width must be narrow, say $\Delta t \sim 10^{-7}$ s, which would imply a large bandwidth. However we can integrate the signal in time and therefore assume $\Delta f = 10^3$ Hz. If the receiver amplifier temperature is $T_N = 300$ K the noise power is (see Eq. (3.46))

$$P_N = kT\Delta f = 4\times10^{-18}\ \text{W}$$

If we demand a S/N ~ 3 the received power must be in excess of $10^{-17}$ W which will set the range of the radar. From Eq. (4.24)

$$R^4 = \frac{\sigma P_t}{4\pi}\left(\frac{A}{\lambda}\right)^2\frac{1}{P_r} = \frac{10\times10^3}{4\pi}\left(\frac{1}{0.1}\right)^2\frac{1}{10^{-17}} \approx 10^{22}\ \text{m}^4$$

Thus $R \sim 300$ km which is a long distance. If we had demanded a signal to noise ratio S/N ~ 250, the range would have decreased to $R \sim 100$ km.
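The arithmetic of this example can be reproduced as follows. The sketch assumes the received power $P_r = P_tG_0\sigma A/(4\pi R^2)^2$ with $G_0 = 4\pi A/\lambda^2$, solved for $R$, and the noise power $P_N = kT\Delta f$; the function names are illustrative.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K

def noise_power(temperature_k, bandwidth_hz):
    """Thermal noise power kT * delta_f in watts."""
    return K_BOLTZMANN * temperature_k * bandwidth_hz

def radar_range(sigma_m2, pulse_power_w, dish_area_m2, wavelength_m, min_received_w):
    """Maximum range in meters from R^4 = (sigma * Pt / 4 pi) * (A / lambda)^2 / Pr."""
    r4 = (sigma_m2 * pulse_power_w / (4.0 * math.pi)
          * (dish_area_m2 / wavelength_m) ** 2 / min_received_w)
    return r4 ** 0.25

p_noise = noise_power(300.0, 1e3)
print(f"noise power = {p_noise:.1e} W")
for p_min in (1e-17, 1e-15):   # roughly S/N ~ 3 and S/N ~ 250
    r = radar_range(10.0, 1e3, 1.0, 0.10, p_min)
    print(f"P_r > {p_min:.0e} W  ->  R = {r / 1e3:.0f} km")
```

Because of the fourth-power law, raising the required received power by a factor of 100 reduces the range only by a factor of about 3.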

4.4 Reflection, refraction and absorption

So far we have considered the propagation of em waves in free space. The presence of matter affects the externally applied electric and magnetic fields and therefore the propagation properties of the wave as well. This comes about because matter contains electric charges and/or magnetic dipoles that are influenced by the external fields. For instance when an electric field is applied to a dielectric material, the dielectric will remain overall neutral but the electrons will be displaced from their equilibrium positions giving rise to a net dipole moment. The induced dipole moment per unit volume is called the polarization, $\mathbf{P}$, of the material.* The polarization is related to the externally applied field $\mathbf{E}$ through $\mathbf{P}/\varepsilon_0 = \chi_e\mathbf{E}$ where $\chi_e$ is the dielectric susceptibility of the material.

Maxwell's equations in the presence of matter retain their form (Eqs. (4.1)-(4.4)) if we replace the dielectric permittivity and magnetic permeability of the vacuum $\varepsilon_0$, $\mu_0$ by their values in the material

$$\varepsilon = \varepsilon_0(1 + \chi_e) = K_e\varepsilon_0, \qquad \mu = \mu_0(1 + \chi_m) = K_m\mu_0 \tag{4.25}$$

Here, $\chi_e$ and $\chi_m$ are the dielectric and magnetic susceptibilities of the material; $K_e$ is the dielectric constant and $K_m$ is the relative magnetic permeability.† When we introduce these values in Eq. (4.2) the velocity of propagation of the em wave in the material becomes

$$c' = \frac{1}{(\varepsilon\mu)^{1/2}} = \frac{c}{(K_eK_m)^{1/2}} = \frac{c}{n} \tag{4.26}$$

The ratio of the velocity of an em wave in free space to the velocity in the material is the refractive index $n$ already introduced in Section 3.1. Eq. (4.26) expresses the refractive index in terms of the electromagnetic properties of the material.

* It is unfortunate that the same nomenclature is used as for the 'polarization vector' of an em wave; in both cases however we are trying to indicate a preferred direction in space.

† In the literature the dielectric constant is designated by $K$, and the relative magnetic permeability by $\mu_r$. We prefer $K_e$ and $K_m$ to emphasize the symmetry between electric and magnetic phenomena.

When an em wave reaches the boundary (or interface) between two materials of different refractive index, part of the wave is reflected while the transmitted part changes its direction; it is refracted. These phenomena have been known since antiquity and are shown in Fig. 4.7 for the case $n_2 > n_1$. We define the angles of incidence $\theta_i$, reflection $\theta_r$, and transmission $\theta_t$ with respect to the normal to the interface. The transmitted and reflected rays lie in the plane defined by the normal and the incident ray, the plane of incidence. The angle of reflection equals the angle of incidence

$$\theta_i = \theta_r \tag{4.27}$$

whereas the angle of refraction is given by Snell's law

$$n_i\sin\theta_i = n_t\sin\theta_t \tag{4.28}$$

The laws of reflection and refraction can be derived by using the boundary conditions that must be obeyed by the em fields. Namely

$$(E_t)_1 = (E_t)_2, \qquad \varepsilon_1(E_n)_1 = \varepsilon_2(E_n)_2$$
$$(B_t)_1/\mu_1 = (B_t)_2/\mu_2, \qquad (B_n)_1 = (B_n)_2 \tag{4.29}$$

where the subscripts 't' and 'n' refer to the components of the fields 'tangential' and 'normal' to the interface. The boundary conditions are a direct consequence of Maxwell's equations and we will use them later.

Fig. 4.7. Reflection and refraction of light at the interface between two media with different refractive indices.

The laws of reflection and refraction can also be derived from a 'least-time' principle for the trajectory between two points A and B (Fermat's principle). This is shown in Fig. 4.8 where the velocity in medium 1 is $c_1 = c/n_1$ and in medium 2 it is $c_2 = c/n_2$. In terms of the coordinates shown in the figure the time of travel is

$$t = \frac{n_1}{c}\left(x_1^2 + y_1^2\right)^{1/2} + \frac{n_2}{c}\left[(x - x_1)^2 + y_2^2\right]^{1/2}$$

Minimizing $t$ with respect to $x_1$ yields

$$\frac{n_1x_1}{(x_1^2 + y_1^2)^{1/2}} = n_1\sin\theta_1 = \frac{n_2(x - x_1)}{[(x - x_1)^2 + y_2^2]^{1/2}} = n_2\sin\theta_2$$

The law of reflection can be obtained in a similar fashion if we assume that on its way from A to C the ray has to reach the interface at some point.

If the wave moves from a region of high refractive index to a region where the index is smaller, i.e. $n_2 < n_1$, Snell's law can be satisfied only when the angle of incidence $\theta_i$ is smaller than the critical angle $\theta_c$, where

$$\sin\theta_c = \frac{n_2}{n_1}$$

When $\theta_i > \theta_c$ there can be no transmitted wave and the incident wave is totally reflected.
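Snell's law and the condition for total reflection are easily put in numerical form. The following is a minimal sketch; the function names and the sample indices (1.5 for a glass-like medium, 1.0 for air) are illustrative assumptions and do not come from the text.

```python
import math

def refraction_angle(n1, n2, theta_i):
    """Angle of the transmitted ray (radians) from n1 sin(theta_i) = n2 sin(theta_t).

    Returns None when no transmitted wave exists (total reflection).
    """
    s = n1 * math.sin(theta_i) / n2
    if s > 1.0:
        return None
    return math.asin(s)

def critical_angle(n1, n2):
    """Critical angle (radians) for a wave going from index n1 to a smaller index n2."""
    if n2 >= n1:
        raise ValueError("total reflection requires n2 < n1")
    return math.asin(n2 / n1)

# Air into glass: the ray is bent toward the normal
print(math.degrees(refraction_angle(1.0, 1.5, math.radians(30.0))))
# Glass into air: critical angle, and total reflection beyond it
print(math.degrees(critical_angle(1.5, 1.0)))
print(refraction_angle(1.5, 1.0, math.radians(45.0)))
```

Returning `None` rather than raising an error in the totally reflected case is a design choice made here so that a sweep over angles of incidence can be written as a simple loop.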

We will now show how the refractive index can be obtained from simple considerations about the structure of matter. Furthermore we will consider gaseous materials, in which case we can ignore the interaction between different atoms. We choose the polarization vector along the X-axis, and write for the electric field

$$E = E_x = E_0e^{-i\omega t}$$

The atomic electrons will feel a force $F_x = eE_x$; we assume that the electrons are bound to the atom by a linear restoring force, and that their motion is damped. Then the equation of motion is of the form

$$\ddot{x} + 2\gamma\dot{x} + \omega_0^2x = \frac{e}{m}E_0e^{-i\omega t} \tag{4.30}$$

Fig. 4.8. Illustration of Fermat's 'principle of least time' for refraction and reflection.

The solution of Eq. (4.30) is obtained by choosing $x = x_0e^{-i\omega t}$, which leads to

$$x_0 = \frac{E_0(e/m)}{(\omega_0^2 - \omega^2) - i2\gamma\omega} \tag{4.31}$$

where $m$ is the mass of the electron and $\omega_0$ and $\gamma$ are parameters depending on properties of the particular atoms involved; $\omega_0$ is the angular frequency of strong absorption lines, and $\gamma$ their relative width.

If there are $N$ electrons per unit volume the polarization of the material is

$$P = Nex_0 = \frac{Ne^2}{m}\,\frac{E_0}{(\omega_0^2 - \omega^2) - i2\gamma\omega} \tag{4.32}$$

The electric susceptibility is given by

$$\chi_e = \frac{P}{\varepsilon_0E_0}$$

and from Eq. (4.25)

$$\frac{\varepsilon}{\varepsilon_0} = 1 + \chi_e = 1 + \frac{P}{\varepsilon_0E_0} \tag{4.33}$$

In most materials the magnetic susceptibility at optical frequencies is much smaller than $\chi_e$, so that from Eq. (4.26) the refractive index becomes

$$n = \left(\frac{\varepsilon\mu}{\varepsilon_0\mu_0}\right)^{1/2} \approx \left(\frac{\varepsilon}{\varepsilon_0}\right)^{1/2} = \left(1 + \frac{P}{\varepsilon_0E_0}\right)^{1/2}$$

Using Eq. (4.32) for the polarization we obtain

$$n = \left[1 + \frac{Ne^2}{m\varepsilon_0}\,\frac{1}{(\omega_0^2 - \omega^2) - i2\gamma\omega}\right]^{1/2} = \bar{n}(1 + i\kappa) \tag{4.34}$$

In this case, the refractive index has both a real part, $\bar{n}$, and an imaginary part, $\bar{n}\kappa$. The imaginary part gives rise to absorption of the wave.

In general, in gases $\chi_e \ll 1$ and $n$ is close to 1 so that we can expand the radical in Eq. (4.34) to find the real and imaginary parts of $n$

$$\text{Re}(n) = \bar{n} \approx 1 + \frac{Ne^2}{2m\varepsilon_0}\,\frac{(\omega_0^2 - \omega^2)}{(\omega_0^2 - \omega^2)^2 + 4\gamma^2\omega^2} \tag{4.35}$$

$$\text{Im}(n) = \bar{n}\kappa \approx \frac{Ne^2}{2m\varepsilon_0}\,\frac{2\gamma\omega}{(\omega_0^2 - \omega^2)^2 + 4\gamma^2\omega^2} \tag{4.35'}$$

These functions are shown in Fig. 4.9. Near the resonance frequency $\omega \sim \omega_0$, the absorption becomes very strong and the medium can be opaque to the transmission of radiation at that particular wavelength. Similarly the real part of the refractive index can become less than 1 just above the resonance. This implies that the phase velocity $c'$ exceeds $c$; however the group velocity $v_g = d\omega/dk$ remains less than $c$ as required by special relativity. Because $(n - 1)$ depends on frequency, it leads to 'dispersion'; thus different Fourier components propagate with different velocities, and short pulses become distorted when propagating through the medium.

Fig. 4.9. Variation of the refractive index as a function of frequency in the region of a resonance. The real part is given by the curve labeled $(\bar{n} - 1)$ and the imaginary part by the curve labeled $\bar{n}\kappa$.

For isolated atoms such as in gases the most important resonances are in the ultraviolet, so that for incident visible light we have $\omega \ll \omega_0$; furthermore the lines are not very broad, $2\gamma < \omega$. Thus Eqs. (4.35) simplify to

$$n \approx \bar{n} = 1 + \frac{Ne^2}{2m\varepsilon_0\,\omega_0^2}$$

and $\kappa = 0$. The above expression is reasonably accurate for gases in spite of the simple model that was used. For instance for hydrogen gas, if we take $\omega_0/2\pi \sim 3\times10^{15}$ Hz (which corresponds to excitation from the ground state), and $N = 5\times10^{25}/\text{m}^3$ we find

$$n - 1 = 2.2\times10^{-4}$$

very close to the measured value $(n - 1) = 1.4\times10^{-4}$.
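The hydrogen estimate, and the behaviour near resonance, can be explored numerically. The sketch below assumes the damped-oscillator result $n^2 = 1 + (Ne^2/m\varepsilon_0)/[(\omega_0^2 - \omega^2) - i2\gamma\omega]$ and its limit $n - 1 \approx Ne^2/(2m\varepsilon_0\omega_0^2)$ for $\omega \ll \omega_0$; the function names are illustrative, and any value of $\gamma$ supplied to the first function is a free parameter of the model rather than a number taken from the text.

```python
import cmath
import math

E_CHARGE = 1.6e-19   # electron charge, coul
M_E = 9.11e-31       # electron mass, kg
EPS0 = 8.854e-12     # vacuum permittivity, coul/(V m)

def refractive_index(omega, n_electrons, omega0, gamma):
    """Complex refractive index of a dilute gas in the damped-oscillator model."""
    strength = n_electrons * E_CHARGE**2 / (M_E * EPS0)
    denominator = (omega0**2 - omega**2) - 2j * gamma * omega
    return cmath.sqrt(1.0 + strength / denominator)

def index_minus_one_below_resonance(n_electrons, omega0):
    """Limit omega << omega0 with narrow lines: n - 1 = N e^2 / (2 m eps0 omega0^2)."""
    return n_electrons * E_CHARGE**2 / (2.0 * M_E * EPS0 * omega0**2)

omega0 = 2.0 * math.pi * 3e15   # resonance taken at 3 x 10^15 Hz, as in the text
print(f"n - 1 = {index_minus_one_below_resonance(5e25, omega0):.2e}")
```

Evaluating `refractive_index` on a grid of frequencies around `omega0` for some chosen `gamma` traces out curves of the kind described for Fig. 4.9.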

4.5 The ionosphere

One of the most interesting phenomena of early radio transmission was the reception of signals at large distances far beyond the line of sight between transmitter and receiver. Such transmission is due to the reflection of signals from the ionosphere; it is much more pronounced at night and is strongly affected by atmospheric conditions. We can explain the reflection of em waves from the ionosphere if we treat the latter as a partially ionized plasma containing free electrons. The em wave interacts with the plasma and we can describe this interaction in terms of a refractive index.

The plasma is overall electrically neutral; the ionized electrons are free to move whereas the much heavier atomic ions do not contribute to current flow. Maxwell's equations then take the form

$$\nabla\cdot\mathbf{B} = 0 \qquad \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$$
$$\nabla\cdot\mathbf{E} = 0 \qquad \nabla\times\mathbf{B} - \varepsilon\mu\frac{\partial\mathbf{E}}{\partial t} = \mu\mathbf{J} = \mu\sigma\mathbf{E} \tag{4.36}$$

In the last equation we used Ohm's law to express the current density in terms of the conductivity $\sigma$ of the plasma

$$\mathbf{J} = \sigma\mathbf{E} \tag{4.37}$$

By following the same steps that led to Eq. (4.7) the wave equation in this case becomes

$$\nabla^2\mathbf{E} - \mu\sigma\frac{\partial\mathbf{E}}{\partial t} - \mu\varepsilon\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{4.38}$$

where the term $-\mu\sigma(\partial\mathbf{E}/\partial t)$ introduces 'damping' in the propagation of the wave and thus affects the velocity of propagation. To solve Eq. (4.38) we will assume, as before, a plane wave solution

$$E = E_x = E_0e^{-i(\omega t - kz)} \tag{4.39}$$

and we wish to establish the relation between $\omega$ and $k$ imposed by Eq. (4.38). Introducing Eq. (4.39) into Eq. (4.38) we obtain

$$(-k^2 + i\mu\sigma\omega + \omega^2\mu\varepsilon)E_x = 0$$

with solution

$$k^2 = \omega^2\mu\varepsilon\left[1 + i\frac{\sigma}{\varepsilon\omega}\right] \tag{4.40}$$

We can evaluate the conductivity $\sigma$ by using the model introduced in Eq. (4.30) but setting $\omega_0 = \gamma = 0$ since the electrons are not bound to atoms. Then

$$\ddot{x} = \frac{d}{dt}\dot{x} = \frac{e}{m}E_0e^{-i\omega t}$$

and

$$\dot{x} = -\frac{e}{i\omega m}E_0e^{-i\omega t}$$

Furthermore $\mathbf{J} = eN\mathbf{v}$, which in our case is $J_x = eN\dot{x} = \sigma E_0e^{-i\omega t}$, with $N$ the density of free electrons. Thus the conductivity of the plasma is

$$\sigma = i\,\frac{e^2N}{\omega m} \tag{4.41}$$

The conductivity is imaginary (the current is 90° out of phase with respect to the applied electric field) and is frequency dependent. Introducing $\sigma$ from Eq. (4.41) in Eq. (4.40) and setting $\varepsilon \approx \varepsilon_0$, $\mu \approx \mu_0$ we find for the wave vector squared

$$k^2 = \frac{\omega^2}{c^2}\left[1 - \frac{4\pi N}{m\omega^2}\frac{e^2}{4\pi\varepsilon_0}\right] \tag{4.42}$$

which is the desired relation between $k$ and $\omega$. It is convenient to introduce the plasma frequency, $\omega_p$, through

$$\omega_p^2 = \frac{4\pi N}{m}\,\frac{e^2}{4\pi\varepsilon_0}$$

so that

$$k^2 = \frac{\omega^2}{c^2}\left[1 - \frac{\omega_p^2}{\omega^2}\right] \tag{4.43}$$

When $\omega < \omega_p$ the wave vector is imaginary, $k = i\beta$, and Eq. (4.39) takes the form

$$E = E_x = E_0e^{-i(\omega t - kz)} = E_0e^{-i\omega t}e^{-\beta z}$$

That is, the wave is attenuated in the plasma over a distance $l = 1/\beta$. Since the wave cannot propagate in the plasma it is reflected from it.* When $\omega > \omega_p$, the wave vector is real and the plasma is transparent to the wave. We can express Eq. (4.43) in terms of the refractive index $n = ck/\omega$

$$n = \left(1 - \frac{\omega_p^2}{\omega^2}\right)^{1/2} \tag{4.44}$$

Thus when $\omega > \omega_p$ the refractive index is real and $n < 1$. Even in that case the wave will undergo total reflection if the angle of incidence is larger than the critical angle $\theta_c$, where $\sin\theta_c = n$.

The ionosphere extends from approximately 50 to 300 km above the earth's surface. At such height the density is low as compared to that on the surface and one finds layers of ionized gas distributed roughly as indicated in Table 4.1. The ionization is produced by sunlight, in particular its UV component, and to a much lesser extent by cosmic rays and X-rays. The degree of ionization depends on the time of the day, and is maintained, in part, due to the earth's magnetic field. The effect of the field is to trap the electrons, just as protons are trapped by the earth's magnetic field to form the Van Allen belts which are located at a distance of ~1000 km from the surface.

* For the same reason, high frequency em waves cannot propagate inside a metal and visible light is reflected from the surface of metals. In metals $N \sim 10^{23}$ but the conductivity is dominated by the collisions of the electrons with the lattice sites and thus Eq. (4.41) is not directly applicable.

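We evaluate the plasma frequency for a free electron density $N_e = 10^5/\text{cm}^3$, to find

$$\omega_p^2 = \frac{4\pi N_e}{m_e}\,\frac{e^2}{4\pi\varepsilon_0} = \frac{10^{11}\,(1.6\times10^{-19})^2}{(0.9\times10^{-30})\,(8.85\times10^{-12})} = 3.2\times10^{14}\ (\text{rad/s})^2$$

or

$$\nu_p = \frac{\omega_p}{2\pi} = \frac{1.8\times10^7}{2\pi} = 2.8\times10^6\ \text{Hz}$$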
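The same evaluation can be repeated for any electron density. The sketch below assumes $\omega_p^2 = Ne^2/(m\varepsilon_0)$ and $n = (1 - \omega_p^2/\omega^2)^{1/2}$; the function names, and the choice to return `None` below cutoff, are illustrative.

```python
import math

E_CHARGE = 1.6e-19   # electron charge, coul
M_E = 9.11e-31       # electron mass, kg
EPS0 = 8.854e-12     # vacuum permittivity, coul/(V m)

def plasma_frequency_hz(electrons_per_m3):
    """Plasma frequency nu_p = omega_p / 2 pi, with omega_p^2 = N e^2 / (m eps0)."""
    omega_p = math.sqrt(electrons_per_m3 * E_CHARGE**2 / (M_E * EPS0))
    return omega_p / (2.0 * math.pi)

def plasma_refractive_index(frequency_hz, electrons_per_m3):
    """Refractive index of the plasma, or None below cutoff (wave is reflected)."""
    ratio = (plasma_frequency_hz(electrons_per_m3) / frequency_hz) ** 2
    if ratio >= 1.0:
        return None
    return math.sqrt(1.0 - ratio)

for label, density_per_cm3 in (("day", 1e5), ("night", 1e3)):
    nu_p = plasma_frequency_hz(density_per_cm3 * 1e6)   # convert cm^-3 to m^-3
    print(f"{label:5s}: nu_p = {nu_p:.2e} Hz")
```

Since $\nu_p$ scales as the square root of the density, a hundredfold drop in the free electron density lowers the cutoff frequency by a factor of ten.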

Frequencies below $\nu_p$ cannot penetrate the ionosphere. At night, the free electron density decreases by a factor of about 100, so that $\nu_p \sim 300$ kHz which is typical of AM radio transmission. Signals at this frequency can now be reflected from high lying layers of the ionosphere and reach long distances as shown in Fig. 4.10. It should be appreciated that the ionospheric layers do not present sharp interfaces but that the refractive index changes gradually as the electron density first increases and then drops as a function of height.

Table 4.1. Ionosphere layers

Height (km)    Designation    $N_e$ (free electrons/cm³)
50-90          D layer        $10^2$
90-130         E layer        $10^5$
130-300        F layer        $10^5$-$10^6$

Fig. 4.10. The layers of the ionosphere give rise to the reflection of electromagnetic waves.

Propagation in the layers of the ionosphere is further complicated by the presence of the earth's magnetic field which causes the electrons to spiral at their cyclotron frequency

$$\omega_B = \frac{eB}{m}$$

This modifies the refractive index from the expression given by Eq. (4.44) to

$$n = \left(1 - \frac{\omega_p^2}{(\omega \pm \omega_B)\,\omega}\right)^{1/2} \tag{4.44'}$$

The earth's field is weak, $B \sim 0.5$ gauss $= 5\times10^{-5}$ T, and results in a cyclotron frequency for electrons

$$\omega_B = \frac{(1.6\times10^{-19})\times(5\times10^{-5})}{0.9\times10^{-30}} \sim 10^7\ \text{rad/s}$$

This frequency is in the MHz range, and thus the earth's field has a significant influence on the propagation of radio transmission.

4.6 Satellite communications

Reflection from the ionosphere provided the principal means for over the horizon communications until the introduction of communication satellites in the 1960s. It is now possible to transmit a signal from an earth station to an orbiting satellite; the signal is amplified and retransmitted from the satellite, thus providing coverage over very large distances with high frequency carriers. For even longer range communications such as world-wide coverage, links are established between satellites in different orbits.

Communication satellites can be placed in polar, inclined or equatorial orbits, as shown in Fig. 4.11(a). The most common positioning for communication purposes is in geostationary orbits. For a satellite to remain fixed over the same earth location, it must be in equatorial orbit and rotate with a 24 hour period. This fixes the orbit radius since for a circular orbit

$$\frac{GM_\oplus}{r^2} = \omega^2r \tag{4.45}$$

where $G$ is Newton's constant, $M_\oplus$ the mass of the earth and $\omega$, $r$ the angular velocity and radius of the orbit. We can rewrite Eq. (4.45) in terms of $g$, the acceleration of gravity on the earth's surface, and the earth's radius $R_\oplus$

$$g = \frac{GM_\oplus}{R_\oplus^2} \qquad\text{or}\qquad r^3 = \frac{gR_\oplus^2}{\omega^2} \tag{4.45'}$$

Using $g = 9.8$ m/s², $R_\oplus = 6.37\times10^6$ m and $\omega = 2\pi/(24\ \text{hours}) = 7.27\times10^{-5}$ rad/s we find

$$r = 4.22\times10^7\ \text{m}$$

Thus the height of the equatorial orbit is $h = r - R_\oplus$; the exact value is $h = 35\,786$ km.
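The orbit radius follows from a one-line calculation. This sketch assumes the relation $r^3 = gR_\oplus^2/\omega^2$ with a period of exactly 24 hours; the function name and default arguments are illustrative.

```python
import math

def geostationary_radius(g=9.8, earth_radius_m=6.37e6, period_s=24 * 3600.0):
    """Radius of a circular orbit with the given period, from r^3 = g R^2 / omega^2."""
    omega = 2.0 * math.pi / period_s
    return (g * earth_radius_m**2 / omega**2) ** (1.0 / 3.0)

r = geostationary_radius()
print(f"r = {r:.3e} m, h = {(r - 6.37e6) / 1e3:.0f} km")
```

The small difference between this estimate and the exact height comes mainly from the rounded input values and from using the 24 hour solar day rather than the slightly shorter sidereal day.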

Even when placed in a geostationary (also called geosynchronous) orbit, the satellite does not remain perfectly fixed with respect to the earth, but wobbles about its intended reference point.

We can calculate the coverage provided by a satellite located at a height $h$ above the earth with the help of Fig. 4.11(b). The satellite is at S and the observer at A; $\phi_i$ is the angle of inclination of the satellite above the horizon of the observer and $\phi_E$ is the angle subtended by the observer at the center of the earth with respect to the satellite direction. The coverage angle $\beta$ as seen from the satellite is

$$\beta = \frac{\pi}{2} - \phi_E - \phi_i$$

Since

$$\frac{\sin\beta}{R_\oplus} = \frac{\sin(\pi/2 + \phi_i)}{R_\oplus + h}$$

we obtain

$$\phi_E + \phi_i = \arccos\left(\frac{R_\oplus}{R_\oplus + h}\cos\phi_i\right)$$

Fig. 4.11. (a) Various possible satellite orbits, (b) earth coverage by a satellite located at a height $h$ above the earth.

For a geostationary satellite $h \approx 35\,900$ km; requiring an inclination $\phi_i \geq 20°$, one finds at $\phi_i = 20°$ that $\phi_E + \phi_i = 82°$, so that $\phi_E \leq 62°$. This covers a large part of a hemisphere but not regions near the poles. In terms of solid angle

$$\Delta\Omega = 2\pi(1 - \cos\phi_E) = 2\pi\times(0.53)$$

Since at most $2\pi$ can be covered (half of the earth is invisible even from infinite distance and at zero inclination), the satellite can be viewed from 53% of the maximum area. In practice, communication satellites have directional antennas which increase the power of the transmitted signal at the expense of coverage.
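The coverage numbers can be recomputed for other heights and minimum inclinations. The sketch assumes $\phi_E + \phi_i = \arccos[(R_\oplus/(R_\oplus + h))\cos\phi_i]$ and a covered fraction $(1 - \cos\phi_E)$ of the maximum solid angle $2\pi$; the function names are illustrative.

```python
import math

R_EARTH = 6.37e6  # earth radius in meters

def earth_central_angle(height_m, inclination_rad, earth_radius_m=R_EARTH):
    """Angle phi_E (radians) at the earth's center between observer and satellite."""
    ratio = earth_radius_m / (earth_radius_m + height_m)
    return math.acos(ratio * math.cos(inclination_rad)) - inclination_rad

def coverage_fraction(phi_e_rad):
    """Fraction of the maximum solid angle 2 pi from which the satellite is visible."""
    return 1.0 - math.cos(phi_e_rad)

phi_e = earth_central_angle(3.59e7, math.radians(20.0))
print(f"phi_E = {math.degrees(phi_e):.1f} deg, coverage = {coverage_fraction(phi_e):.2f}")
```

Setting the inclination to zero gives the geometric horizon limit, which for a geostationary satellite is a little over 81°.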

We briefly analyse the power requirements for satellite communications and will consider separately the uplink from the downlink. If $P_t$ is the total power of the transmitter and $G_0$ the maximum antenna gain, the power received by a geostationary satellite is

$$P_r = \frac{P_t G_0 A_s}{4\pi h^2} \tag{4.46}$$

where $A_s$ is the area of the satellite antenna. We assume $P_t = 10^3$ W, $G_0 = 10^4$ which is easy to achieve for a transmitter on the earth; also $A_s = 1$ m$^2$ and $h \simeq 35\,900$ km. Then $P_r \simeq 6 \times 10^{-10}$ W. We can assume a noise temperature $T_N = 10^3$ K for the receiver in the satellite and a bandwidth $\Delta f = 3 \times 10^7$ Hz; thus the noise power is (see Eq. (3.46))

$$P_N = kT_N\,\Delta f \simeq 4 \times 10^{-13}\ \text{W}$$

Then the S/N at the satellite's receiver is in excess of 1000 and therefore poses no problems.

Fig. 4.12. Locations of the geostationary communication satellites as of 1985.

It is more difficult to achieve a good S/N in the downlink because satellite power and antenna size are restricted. We assume $P_t = 1$ kW as before, but $G_0 = 10^2$; this still allows coverage on the earth of an area of radius $R \sim 4500$ miles, that is in excess of the entire continental U.S. We also take the receiving antenna of typical home TV dishes with an area of 6 m$^2$; the received power then is $P_r = 4 \times 10^{-11}$ W, and given the same bandwidth and receiver noise we find S/N = 100, which is adequate for home reception.
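The uplink and downlink estimates follow from the same two formulas, so they are easy to reproduce. The sketch below assumes Eq. (4.46) for the received power and $P_N = kT\Delta f$ for the noise; the function names are illustrative only.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K


def received_power(p_t, gain, area, distance):
    """Eq. (4.46): P_r = P_t * G_0 * A / (4 * pi * h^2), SI units."""
    return p_t * gain * area / (4.0 * math.pi * distance ** 2)


def noise_power(noise_temp, bandwidth):
    """Thermal noise power k * T * delta_f."""
    return K_BOLTZMANN * noise_temp * bandwidth


h = 3.59e7                           # geostationary height in m
p_noise = noise_power(1e3, 3e7)      # T_N = 1000 K, 30 MHz bandwidth

p_up = received_power(1e3, 1e4, 1.0, h)    # earth station to satellite
p_down = received_power(1e3, 1e2, 6.0, h)  # satellite to 6 m^2 home dish

snr_up = p_up / p_noise
snr_down = p_down / p_noise
print(f"uplink:   P_r = {p_up:.1e} W, S/N = {snr_up:.0f}")
print(f"downlink: P_r = {p_down:.1e} W, S/N = {snr_down:.0f}")
```

The uplink S/N comes out above 1000 and the downlink S/N near 100, matching the rounded values in the text.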

There are already a large number of communication satellites in equatorial orbit and more are planned. These are indicated in Fig. 4.12 (courtesy of the New York Times, September 15, 1985). A problem of interference arises when the angular separation between orbiting satellites is smaller than the beamwidth of the earth antenna and if they operate in the same frequency band. Thus the number of satellites that can be placed in equatorial orbit is limited, and this limit is being rapidly approached.

4.7 Waveguides and transmission lines

It is possible to transmit em energy by using a conducting medium, such as a pair of 'electrical wires': this is the standard way by which electrical power is distributed. For communications we use high frequency em waves which do not penetrate deep into a conductor but are reflected from its surface. Thus for short wavelengths, $\lambda < 10$ cm, it is more efficient to contain the em radiation inside a metallic structure, a waveguide, just as water is contained inside a garden hose. In general the wavelength has to be shorter than the dimensions of the guide if the wave is to propagate without attenuation. We will see in the following section that even visible light can be trapped in a waveguide but in that case the walls are dielectric rather than conducting.

The behavior of the em fields at an interface is governed by the boundary conditions given by Eqs. (4.29). Inside a perfect conductor the electric field vanishes; since the tangential component, $E_t$, must be continuous at the boundary (see the first of Eqs. (4.29)) it follows that $E_t$ for the external field must also vanish at the boundary. Thus, if an em wave is normally incident on a conductor, as in Fig. 4.13(a), the electric field of the reflected wave must be reversed with respect to the electric field of the incident wave. As a result the incident and reflected waves combine to give rise to a standing wave as in (b) of the figure. Standing waves are apparent in many physical phenomena, as for instance when we wiggle a rope that is tied at one end.

Mathematically, we can write for the two waves

$$E_i = E_0\,e^{-i(\omega t - kz)}$$

$$E_r = -E_0\,e^{-i(\omega t + kz)}$$

where $E_i$ propagates to positive $z$, whereas the reflected wave, $E_r$, propagates to negative $z$ and has reversed amplitude at $z = 0$. Then

$$E = E_i + E_r = E_0\,e^{-i\omega t}\left[e^{ikz} - e^{-ikz}\right] = 2iE_0\,e^{-i\omega t}\sin kz \tag{4.47}$$

which is a standing wave of wavelength $\lambda = 2\pi/k$. A standing wave does not transmit energy along the $z$-axis, in contrast to traveling waves which carry energy along their direction of propagation.

Next we consider a wave which is obliquely incident onto the conducting surface as shown in Fig. 4.14(a). The electric field is polarized parallel to the interface and the angle of incidence is $\theta$. We define the wave vector $\mathbf{k}$ along the direction of propagation and can resolve it into two components $k_x$ and $k_z$, indicated in (b) of the figure.

$$|\mathbf{k}| = \frac{2\pi}{\lambda} = \frac{\omega}{c} = (k_x^2 + k_z^2)^{1/2} \tag{4.48}$$

Fig. 4.13. Generation of standing waves by reflection from a conductor: (a) the two traveling waves at three different instants of time ($\omega t = 0$, $\pi/4$, $\pi/2$), (b) the resulting sum of the two waves is a standing wave.


Since Maxwell's equations are linear we can treat independently the wave that is incident normal to the interface, $k_n = k_x = k\cos\theta$, and the wave that is propagating parallel to the interface, where $k_p = k_z = k\sin\theta$. The parallel component is unaffected by the boundary, but the normal component is reflected and must always have a node at the interface, that is the electric field amplitude must be zero. We now place a second conducting surface, parallel to the first one at a distance $b$, as shown in Fig. 4.15(a). Since the normal component must have a node at that surface as well, the wavevector must obey

$$k_n b = m\pi$$

where $m$ is an integer. Using $k_n = k\cos\theta = (2\pi/\lambda)\cos\theta$, we find that propagation can occur only at angles $\theta$ that satisfy the condition

$$\lambda = \frac{2b\cos\theta}{m} \qquad m = 1, 2, 3, \ldots \tag{4.49}$$

If $\lambda > 2b$ Eq. (4.49) cannot be satisfied and the wave will not propagate.

The different values of the integer $m$ correspond to different distributions of the amplitude of the standing wave in the $x$-direction. This is shown in Fig. 4.15 and we speak of different modes. Furthermore we have assumed that the electric field is along the $y$-direction, parallel to the interface; thus there is no electric field component along the $z$-axis and the modes are called transverse electric (TE-modes). We can think of the energy propagating between the two conducting surfaces by a zig-zag path as the wave bounces off the two conducting surfaces. The higher the mode, $m$, the more bounces occur for a given length of propagation.

Fig. 4.14. Reflection from a metallic boundary: (a) the angle of incidence is $\theta$ and the wave is polarized along the $y$-axis (out of the plane of the paper), (b) resolution of the wavevector into two components along the $x$- and $z$-axes, $k_x = k\cos\theta$ and $k_z = k\sin\theta$.

Fig. 4.15. Propagation between two metallic surfaces results in a standing wave in the plane normal to the direction of propagation: (a) lowest mode, $m = 1$, (b) next higher mode, $m = 2$.

It is interesting to examine the field distribution along the $z$-direction (the direction of propagation) as well. We have $k_p = k\sin\theta$, so that the 'wavelength' $\lambda_p$ which determines the periodicity along $z$, and is defined by $k_p = 2\pi/\lambda_p$, has the value

$$\lambda_p = \frac{2\pi}{k_p} = \frac{2\pi}{k\sin\theta} = \frac{\lambda}{\sin\theta}$$

Thus the phase velocity $v_p$ is

$$v_p = \nu\lambda_p = \frac{\nu\lambda}{\sin\theta} = \frac{c}{\sin\theta} \tag{4.50}$$

which is larger than $c$. However, the velocity at which energy propagates down the $z$-axis, the group velocity $v_g$, is determined by the zig-zag pattern and is less than $c$. We easily find from Fig. 4.15 that

$$v_g = c\sin\theta \tag{4.50'}$$

When $\theta \to 0$ the group velocity $v_g \to 0$ and there is no propagation in that particular mode. The wavelength at which this occurs is called the cut-off wavelength $\lambda_0$ and is given (see Eq. (4.49)) by

$$\lambda_0 = \frac{2b}{m} \tag{4.51}$$
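As an illustration of Eqs. (4.49) to (4.51), the sketch below lists the modes that can propagate between two plates and their phase and group velocities. The plate spacing of 5 cm and the wavelength of 3 cm are assumed values chosen only for the example, and the function name is hypothetical.

```python
import math

C = 3.0e8  # speed of light in m/s


def allowed_modes(wavelength, b):
    """Return (m, theta_deg, v_phase, v_group) for every mode m that
    satisfies lambda = 2*b*cos(theta)/m with lambda below cut-off."""
    modes = []
    m = 1
    while m * wavelength < 2.0 * b:       # cut-off wavelength is 2b/m
        theta = math.acos(m * wavelength / (2.0 * b))
        v_phase = C / math.sin(theta)     # Eq. (4.50), exceeds c
        v_group = C * math.sin(theta)     # Eq. (4.50'), below c
        modes.append((m, math.degrees(theta), v_phase, v_group))
        m += 1
    return modes


modes = allowed_modes(wavelength=0.03, b=0.05)
for m, theta_deg, v_p, v_g in modes:
    print(f"m={m}: theta={theta_deg:5.1f} deg, "
          f"v_p={v_p:.2e} m/s, v_g={v_g:.2e} m/s")
```

For every mode the product $v_p v_g$ equals $c^2$, and the higher modes have smaller $\theta$, hence a lower group velocity, consistent with the zig-zag picture.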

So far we have considered propagation between two parallel conducting surfaces. If we add two more surfaces to form an enclosure, we obtain a waveguide. The field amplitudes must satisfy the boundary conditions at all four 'walls' and this gives rise to modes characterized by two indices, $n$ and $m$. The field patterns for the lowest mode in a rectangular waveguide, the TE$_{01}$ mode, are shown in Fig. 4.16. Cylindrical waveguides obey the same principles but are not used as much as rectangular guides.

The coaxial line: At frequencies below the microwave band, it is not convenient to use waveguides since they become bulky, while transmission in a conductor still involves too much attenuation. In this case the em fields are contained between two conducting surfaces (that can be at different potentials) and we speak of a transmission line. The transmission line can be thought of as the transition between bulk conductors and waveguides.

The most widely used type of transmission line is the coaxial line where the conducting surfaces are two coaxial cylinders. This is shown in Fig. 4.17; the inner radius is $r_{in} = a$ and the outer radius $r_{out} = b$ and the region between the two conductors is filled with a dielectric ($\sigma = 0$) of permittivity $\varepsilon$ and permeability $\mu$. The simplest mode of propagation is TEM (transverse electric and magnetic) where the electric field is radial and the magnetic field tangential as indicated in the figure.

To find the propagation properties of an em wave in this structure we will directly solve Maxwell's equations (Eqs. (4.1)-(4.4)) but in cylindrical coordinates. We are interested only in the region between the two conductors, where there is no current flow, $\mathbf{J} = 0$, and thus Eqs. (4.2) and (4.4) become

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \qquad \nabla \times \mathbf{B} = \mu\varepsilon\frac{\partial \mathbf{E}}{\partial t} \tag{4.52}$$

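In cylindrical coordinates the components of the curl of a vector $\mathbf{A}$ are given in general by

$$(\nabla \times \mathbf{A})_\rho = \frac{1}{\rho}\frac{\partial A_z}{\partial \phi} - \frac{\partial A_\phi}{\partial z}$$

$$(\nabla \times \mathbf{A})_\phi = \frac{\partial A_\rho}{\partial z} - \frac{\partial A_z}{\partial \rho} \tag{4.53}$$

$$(\nabla \times \mathbf{A})_z = \frac{1}{\rho}\frac{\partial}{\partial \rho}(\rho A_\phi) - \frac{1}{\rho}\frac{\partial A_\rho}{\partial \phi}$$

where $\rho$ is the radial coordinate.

Fig. 4.16. Field line patterns for the TE$_{01}$ mode propagating in a rectangular waveguide. The graphs show a 'snapshot' of the E and B fields at a particular instant of time.

Fig. 4.17. Coaxial transmission line.

We will choose the fields as sketched in Fig. 4.17, that is

$$E_\rho = E_0\,e^{-i(\omega t - kz)} \qquad B_\phi = B_0\,e^{-i(\omega t - kz)} \tag{4.54}$$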

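and all other components are set to zero. The above fields represent a wave traveling in the positive $z$-direction, and are transverse to the direction of propagation and to one another. Then the first of Eqs. (4.52) gives

$$(\nabla \times \mathbf{E})_\phi = \frac{\partial E_\rho}{\partial z} = ikE_\rho = -\frac{\partial B_\phi}{\partial t} = i\omega B_\phi$$

The second of Eqs. (4.52) gives

$$(\nabla \times \mathbf{B})_\rho = -\frac{\partial B_\phi}{\partial z} = -ikB_\phi = \mu\varepsilon\frac{\partial E_\rho}{\partial t} = -\mu\varepsilon\,i\omega E_\rho$$

We summarize the two results

$$E_\rho = \frac{\omega}{k}B_\phi \qquad B_\phi = \mu\varepsilon\frac{\omega}{k}E_\rho \tag{4.55}$$

For these equations to hold we must have

$$k^2 = \varepsilon\mu\,\omega^2 = K_e K_m\frac{\omega^2}{c^2}$$

Thus the wave propagates in the coaxial line with phase velocity

$$v = \frac{\omega}{k} = \frac{1}{(\varepsilon\mu)^{1/2}} = \frac{c}{(K_e K_m)^{1/2}} \tag{4.55'}$$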

Note that there is no cut-off frequency in this case, and all frequencies, even dc current, can propagate in a coaxial line. The key feature of this geometrical arrangement is that the high frequency fields are shielded by the outer conductor.

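We can relate the fields $E_\rho$, $B_\phi$ to the current $I$ flowing in the conductor and to the voltage difference $V$ between the two conductors. From Ampère's law we have

$$B_\phi = \frac{\mu I}{2\pi\rho} \tag{4.57}$$

Using Eqs. (4.55) we can express $E_\rho$ in terms of the current $I$

$$E_\rho = \frac{\omega}{k}\,\frac{\mu I}{2\pi\rho} = \left(\frac{\mu}{\varepsilon}\right)^{1/2}\frac{I}{2\pi\rho}$$

where in the second step we set $\omega/k = 1/(\varepsilon\mu)^{1/2}$ according to Eq. (4.55'). From Eq. (4.57) we see that the current $I$ must have the same space-time dependence as the electric field. Thus, recalling Eq. (4.54) we write

$$I = I_0\,e^{-i(\omega t - kz)}$$

The voltage can be found by integrating the radial electric field from its value at the inner conductor to that at the outer conductor

$$V = \int_a^b E_\rho\,d\rho = \left(\frac{\mu}{\varepsilon}\right)^{1/2}\frac{I}{2\pi}\int_a^b\frac{d\rho}{\rho} = \frac{1}{2\pi}\left(\frac{\mu}{\varepsilon}\right)^{1/2} I\,\ln\left(\frac{b}{a}\right)$$

The impedance, $Z$, of the coaxial line is given by the ratio $V/I$,

$$Z = \frac{1}{2\pi}\left(\frac{\mu}{\varepsilon}\right)^{1/2}\ln\left(\frac{b}{a}\right) = \frac{Z_0}{2\pi}\left(\frac{K_m}{K_e}\right)^{1/2}\ln\left(\frac{b}{a}\right) \tag{4.58}$$

where we have used $Z_0$ to designate the impedance of free space

$$Z_0 = \left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2} = \left(\frac{4\pi \times 10^{-7}\ (\text{V-s/A-m})}{8.85 \times 10^{-12}\ (\text{coul/V-m})}\right)^{1/2} = 377\ \Omega$$

already introduced in Section 4.2 (see Eq. (4.19)).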

As an example we can consider a coaxial cable with conductor ratios $b/a = 4$ and filled with polystyrene which has $K_e \simeq 2.5$, $K_m \simeq 1.0$. Then the impedance is

$$Z = \frac{377}{2\pi}\,\frac{1}{\sqrt{2.5}}\,\ln(4) = 53\ \Omega$$

which is a typical value for the impedance used in r.f. systems.
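The impedance formula of Eq. (4.58) is simple enough to evaluate directly. The following sketch reproduces the polystyrene example; the function name and argument names are illustrative choices.

```python
import math

Z_FREE_SPACE = 377.0  # ohms, impedance of free space


def coax_impedance(b_over_a, k_e, k_m=1.0):
    """Eq. (4.58): Z = (Z0 / 2 pi) * sqrt(K_m / K_e) * ln(b / a)."""
    return (Z_FREE_SPACE / (2.0 * math.pi)
            * math.sqrt(k_m / k_e) * math.log(b_over_a))


z_polystyrene = coax_impedance(b_over_a=4.0, k_e=2.5)
print(f"Z = {z_polystyrene:.1f} ohm")
```

Because the impedance depends only logarithmically on $b/a$, practical cables cluster in a narrow range of a few tens of ohms.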

4.8 Fiber optics

We saw in the previous section that high frequency em waves can propagate in a guide with conducting walls. It is also possible for an em wave to propagate in a guide with dielectric walls. While the principle of dielectric waveguides has been known since 1910, only recent technological advances have made such guides practical. Dielectric waveguides are used in the visible and consist of very thin fibers whose refractive index is precisely controlled; light can be trapped inside such a fiber and will propagate with very little attenuation.

The optical fiber is constructed of layers of dielectric with differing refractive index (smaller values of $n$ at larger radii) as shown in Fig. 4.18(a). For simplicity we will consider first a flat dielectric slab of index $n_1$ bounded by two plane dielectrics of index $n_2 < n_1$ as shown in (b) of the figure. An em wave is incident onto the interface, at angle $\theta$. We know from Snell's law (Eq. (4.28)) that if $\theta > \theta_c$ there can be no transmitted ray and all the energy is reflected back into medium 1. The critical angle $\theta_c$ is found by setting $\theta_2 = 90^\circ$ (or $\sin\theta_2 = 1$) in Eq. (4.28) so that

$$\sin\theta_c = \frac{n_2}{n_1} \tag{4.59}$$

The complement of $\theta$ is the propagation angle $\psi = 90^\circ - \theta$. As an example, if we have a lucite rod ($n_1 = 1.5$) bounded by air ($n_2 = 1.0$), the critical angle is given by $\sin\theta_c = 0.67$, so that $\theta_c = 42^\circ$ and $\psi_c = 48^\circ$; rays with propagation angles $\psi < \psi_c$ will be trapped inside the lucite.
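Evaluating Eq. (4.59) for the lucite rod gives $\theta_c \approx 41.8^\circ$ and a complementary propagation angle $\psi_c \approx 48.2^\circ$. The short sketch below makes the arithmetic explicit; the function name is illustrative.

```python
import math


def critical_angle_deg(n1, n2):
    """Eq. (4.59): sin(theta_c) = n2 / n1, valid for n2 < n1."""
    return math.degrees(math.asin(n2 / n1))


theta_c = critical_angle_deg(1.5, 1.0)  # lucite bounded by air
psi_c = 90.0 - theta_c                  # limiting propagation angle
print(f"theta_c = {theta_c:.1f} deg, psi_c = {psi_c:.1f} deg")
```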

In optical fibers the refractive index is $n_1 \sim 1.5$, roughly the same as for lucite, but the boundary layer has a refractive index $n_2$ very close to $n_1$, within a few percent. Thus it is convenient to define the difference between refractive indices through

$$\Delta = \frac{n_1^2 - n_2^2}{2n_1^2} \simeq \frac{n_1 - n_2}{n_1} \tag{4.60}$$

Fig. 4.18. Optical fiber consisting of two layers of dielectric with different refractive index ($n_2 < n_1$): (a) the radii and indices are as indicated, (b) simple model of the fiber assumes only the presence of two plane boundaries.

We can express $\psi_c$ in terms of $\Delta$, the fractional change in the refractive index

$$\psi_c = \sin^{-1}\left[1 - \left(\frac{n_2}{n_1}\right)^2\right]^{1/2} = \sin^{-1}(2\Delta)^{1/2}$$

Thus

$$\sin\psi_c = (2\Delta)^{1/2} \simeq \psi_c \tag{4.61}$$

where in the last step we assumed that $\Delta \ll 1$. For a ray to have $\psi < \psi_c$ inside the fiber, the entrance angle $\psi_{ext}$ must satisfy $\sin\psi_{ext} < n_1\sin\psi_c$ as shown in Fig. 4.19. As an example we consider a fiber with $\Delta = 0.01$. Then

$$\psi_c \simeq (2\Delta)^{1/2} = 0.14\ \text{rad}$$

The limiting entrance angle $\psi_a \simeq n_1\psi_c$ is referred to as the numerical aperture of the fiber; in the present example N.A. $\simeq \psi_a = 0.21$.

Since the em wave will be reflected at the top and the bottom interface, a standing wave is established in the transverse direction, exactly as for a metallic wall waveguide (see Fig. 4.15). Thus the angle of incidence must satisfy the condition

$$2b\cos\theta = (N + 1)\frac{\lambda_0}{n_1} \tag{4.62}$$

where $\lambda_0$ is the free-space wavelength and $N$ can be zero or an integer, $N = 0, 1, 2, 3, \ldots$ For a fiber of circular cross-section of radius $a$, Eq. (4.62) remains valid if we set $b \simeq 2a$. Thus, with $k_0 = 2\pi/\lambda_0$ and $\cos\theta = \sin\psi$, we can write for the modes propagating in an optical fiber

$$4an_1k_0\sin\psi = 2\pi(N + 1) \qquad N = 0, 1, 2, \ldots \tag{4.63}$$

As an application we introduce into Eq. (4.63) typical parameters for an optical fiber, where we have chosen

$\lambda_0 = 850$ nm
$a = 40\ \mu$m
$\Delta = 0.01$

Fig. 4.19. Definition of the limiting entrance and cut-off angles for propagation of light in an optical fiber. The rays shown are labeled as partially transmitted or totally reflected.


The propagation angle for the lowest mode ($N = 0$) is found to be

$$\sin\psi_0 = \frac{\lambda_0}{4an_1} = \frac{850 \times 10^{-3}}{4 \times 40 \times 1.5} = 3.5 \times 10^{-3}\ \text{rad}$$

or $\psi_0 \simeq 0.2^\circ$. Thus the ray propagates in a direction very close to the fiber axis. Next we calculate the highest mode that can be supported by this fiber. For this we can insert Eq. (4.61) into (4.63) to obtain

$$N_{max} + 1 = \frac{4an_1}{\lambda_0}(2\Delta)^{1/2} = 40$$
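The fiber numbers above all follow from Eqs. (4.61) and (4.63). The sketch below collects them in one place, assuming the parameters quoted in the text ($\lambda_0 = 850$ nm, $a = 40\ \mu$m, $n_1 = 1.5$, $\Delta = 0.01$); the function and key names are illustrative only.

```python
import math


def fiber_summary(lambda0_um, a_um, n1, delta):
    """Small-angle estimates for a step-index fiber of core radius a."""
    psi_c = math.sqrt(2.0 * delta)             # Eq. (4.61), radians
    numerical_aperture = n1 * psi_c            # limiting entrance angle
    sin_psi0 = lambda0_um / (4.0 * a_um * n1)  # Eq. (4.63) with N = 0
    mode_count = 4.0 * a_um * n1 / lambda0_um * psi_c  # N_max + 1
    return {
        "psi_c": psi_c,
        "numerical_aperture": numerical_aperture,
        "sin_psi0": sin_psi0,
        "mode_count": mode_count,
    }


fiber = fiber_summary(lambda0_um=0.85, a_um=40.0, n1=1.5, delta=0.01)
for name, value in fiber.items():
    print(f"{name} = {value:.4g}")
```

Reducing the core radius or the index step lowers the mode count; a fiber in which only $N = 0$ survives would avoid the mode-dependent group velocity altogether.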

The result of the previous example shows that a signal injected into a fiber can propagate in several modes. This is a drawback because each mode has a different group velocity and thus the signal is dispersed. The group velocity is given by

$$v_g = v\cos\psi = \frac{c}{n_1}\cos\psi$$

and depends on the mode, through $\cos\psi$. For the $N$th mode we have

$$v_g^{(N)} = \frac{c}{n_1}\left[1 - \sin^2\psi_N\right]^{1/2} \simeq \frac{c}{n_1}\left\{1 - \frac{1}{2}\left[\frac{(N + 1)\lambda_0}{4an_1}\right]^2\right\} \tag{4.64}$$

The effects of dispersion can be compensated for by constructing fibers with a graded index, as shown in Fig. 4.20(a). Modes with small propagation angles $\psi_N$ are confined to the center of the fiber, whereas as $N$ (and thus $\psi_N$) increases, the rays can reach larger radii as shown in (b) of the figure. Clearly when $N$ (and thus $\psi_N$) is large, the path length is longer and if the refractive index was homogeneous, the group velocity of the mode would be smaller as indicated by Eq. (4.64). However, the index $n$ is a function of radius and large angle rays propagate in a region of smaller refractive index, that is where their velocity is higher. As a result the longer path length is compensated by the higher velocity and all modes propagate along the fiber axis with very similar velocity. The change in the refractive index is usually proportional to the square of the radius (parabolic). In that case, as can be seen from Fig. 4.20(b), the fiber has focussing properties and a short section of fiber can be used as a lens.

Fig. 4.20. Graded index optical fiber: (a) radial profile of the refractive index, (b) propagation of rays entering at different angles (note the compensation of dispersion).

Attenuation in commercial fibers is of order of 1 dB/km, namely a decrease in intensity of about 20% in one km. In communication applications, fibers are driven by solid state lasers and the signal is detected by photodiodes. The bandwidth of an optical fiber channel is extremely large as compared to VHF or microwave carriers. If we assume that the optical carrier is modulated to 1% of its frequency, this would result in a bandwidth $W \sim 10^{13}$ Hz, that is in the terahertz range. In practice the limitation on the usable bandwidth does not come from the properties of the channel but from the availability of devices for modulating and demodulating the carrier. Such devices are based on various electro-optic effects and can operate at best in the GHz range.
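A loss quoted in decibels converts to a transmitted fraction through $I/I_0 = 10^{-\text{dB}/10}$. The sketch below applies this standard definition to a loss of 1 dB/km; the function name and the 10 km example length are illustrative assumptions.

```python
def transmitted_fraction(loss_db):
    """Fraction of the intensity that survives a loss of loss_db decibels."""
    return 10.0 ** (-loss_db / 10.0)


after_1km = transmitted_fraction(1.0)    # 1 dB/km over 1 km
after_10km = transmitted_fraction(10.0)  # the same fiber over 10 km
print(f"1 km:  {after_1km:.3f} transmitted, {1 - after_1km:.1%} lost")
print(f"10 km: {after_10km:.3f} transmitted")
```

A 1 dB loss therefore removes roughly a fifth of the intensity, and ten kilometres of such fiber still deliver a tenth of the input.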

4.9 The laser

The laser is a source of intense, very highly collimated and monochromatic, coherent em radiation. The radiation is usually in the visible but lasers operate effectively in the infrared and at other frequencies as well. The laser was first proposed by C. H. Townes and A. L. Schawlow in 1958 and its name is an acronym for 'Light Amplification by Stimulated Emission of Radiation'; this is a precise description of the process involved in a laser. The radiation is emitted when atoms, or molecules or electrons in a solid, make a transition to a state of lower energy; because the atoms are stimulated to radiate, the resulting radiation has the same direction, frequency and phase as the radiation already present in the laser cavity. It is this property that makes the laser such a unique source of em radiation. Needless to say, practical applications of the laser appear today in all areas of technology, of medicine, as well as in every scientific field.

As we know, atoms have discrete energy levels and transitions can take place between these levels. In the absence of external excitations the atom is found in its lowest level, the ground state. A transition from a state of energy $E_2$ to a state of lower energy $E_1$ is accompanied by the emission of a photon of angular frequency $\omega$, where

$$\hbar\omega = E_2 - E_1 \tag{4.65}$$

Conversely, an atom in the state with energy $E_1$ can absorb a photon of frequency $\omega$ and make a transition to the state of energy $E_2$ provided the condition of Eq. (4.65), $E_2 - E_1 = \hbar\omega$, is satisfied. These two processes are indicated graphically in Fig. 4.21; spontaneous emission in (a), and absorption in (b).

If the atom is in the state $E_2$ and radiation of frequency $\omega$, where $\hbar\omega = E_2 - E_1$, is incident on it, the atom makes a transition to the state of energy $E_1$ and emits a photon of frequency $\omega$. Thus if initially only one photon of frequency $\omega$ was present, after the interaction of the em radiation with the atom, two photons of that frequency are present. This process is called stimulated emission and is indicated in (c) of the figure. Einstein was the first to show that stimulated emission must take place in order to assure the equilibrium between an assembly of atoms and the electromagnetic field. He also showed that for the same radiation intensity the probability of absorption equals the probability of stimulated emission.

We will designate the probability per unit time for each of the processes shown in Fig. 4.21 as follows:

$A$: probability/unit time for spontaneous emission
$W_{12} = B(du/d\omega)$: probability/unit time for absorption
$W_{21} = B(du/d\omega)$: probability/unit time for stimulated emission

Here $du/d\omega$ is the energy density of the em radiation, per unit frequency interval;* the energy density in an interval $d\omega$, near the frequency $\omega_0$, is given by

$$u(\omega_0, d\omega) = \left.\frac{du}{d\omega}\right|_{\omega_0} d\omega \tag{4.66}$$

The coefficients $A$ and $B$ will be determined later. Note however that the above definitions imply $W_{21} = W_{12}$.

Fig. 4.21. Electromagnetic transitions in a two-level system: (a) spontaneous emission, (b) absorption, (c) stimulated emission.

* Often $u(\omega)$ is used to designate the energy density per unit frequency; we are using the more cumbersome expression $du/d\omega$ to remain consistent with the definition of energy density already given by Eq. (4.12).

We now consider an assembly of atoms with a density of $N_t$ atoms/cm$^3$, where $N_1$ is the density of atoms in state 1 and $N_2$ the density in state 2, and $N_1 + N_2 = N_t$ (see Fig. 4.21). The energy difference of the two states is $E_2 - E_1 = \hbar\omega$ and radiation at frequency $\omega$ is incident on the assembly; the energy density of the radiation $(du/d\omega)\,d\omega$ corresponds to a photon density of $N_\gamma$ photons/cm$^3$. Because of the three processes indicated in Fig. 4.21 photons will be created as well as absorbed; the rate of change of the photon density will be

$$\frac{dN_\gamma}{dt} = AN_2 - W_{12}N_1 + W_{21}N_2 \tag{4.67}$$

In most applications $W_{21}N_2 \gg AN_2$ so that Eq. (4.67) simplifies to

$$\frac{dN_\gamma}{dt} = B\left(\frac{du}{d\omega}\right)(N_2 - N_1) \tag{4.67'}$$

If $N_2 > N_1$ the number of photons increases and the assembly of atoms acts as an amplifier of the incident radiation. If $N_2 < N_1$ the number of photons decreases, and the radiation is absorbed by the atoms.

Under conditions of thermal equilibrium the population of levels of energy $E$ is governed by the Boltzmann distribution

$$N(E) = N_0\,e^{-E/kT} \tag{4.68}$$

where $T$ is the temperature and $k$ is Boltzmann's constant. Thus

$$\frac{N_2}{N_1} = \frac{N_0\,e^{-E_2/kT}}{N_0\,e^{-E_1/kT}} = e^{-(E_2 - E_1)/kT}$$

Since $E_2 > E_1$, we find that $(N_2/N_1) < 1$. An assembly of atoms in thermal equilibrium will absorb radiation. In order to amplify the incident radiation we must create a population inversion in the assembly of atoms. This can be achieved in several atomic, molecular or condensed matter systems which therefore can be made to lase.
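To get a feeling for how small $N_2/N_1$ is in thermal equilibrium, the ratio can be evaluated for an optical transition. The sketch below assumes a level spacing of 2 eV (the photon energy used in a later example) and room temperature, 300 K; both inputs and the function name are illustrative.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann's constant in eV/K


def population_ratio(delta_e_ev, temperature_k):
    """Thermal-equilibrium ratio N2/N1 = exp(-(E2 - E1) / kT)."""
    return math.exp(-delta_e_ev / (K_BOLTZMANN_EV * temperature_k))


ratio_optical = population_ratio(2.0, 300.0)
print(f"N2/N1 at 300 K for a 2 eV transition: {ratio_optical:.1e}")
```

For a visible transition the upper level is essentially empty at room temperature, which is why an external pump is needed before any amplification can occur.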

To achieve population inversion, the atoms are pumped from their ground state to a state of higher energy from where the level 2 is populated. The simplest scheme is that of the three-level laser shown in Fig. 4.22(a).* The action of the pump is to excite the state 3, which spontaneously decays by a fast transition to state 2; the population $N_2$ can then exceed $N_1$ and a lasing transition can occur between levels 2 and 1. To sustain the lasing action, the pump must remove atoms from the ground state faster than the lasing transition populates that state. The four-level laser shown in (b) of the figure is better suited to continuous laser operation. State 1, the lower level of the laser transition, decays fast to the ground state 4; thus $N_1$ is always much less than $N_2$. The upper state 2 is populated by a fast spontaneous transition from the pumped state 3; the upper state should have a fairly long lifetime so that spontaneous decay does not compete with the stimulated emission.

* It is evident that in a two-level system the pump would depopulate the upper level leading only to an equalization of the population of the two states but not to inversion.

We will now consider these effects quantitatively. The probability rate for spontaneous emission is given by

$$A = \frac{4}{3}\,\frac{\omega^3}{c^2}\left(\frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{\hbar c}\right)|\langle x\rangle_{12}|^2 \tag{4.69}$$

$A$ has units of inverse time and $\hbar$ is Planck's constant divided by $2\pi$

$$\hbar = \frac{h}{2\pi} = 1.05 \times 10^{-34}\ \text{J-s} \tag{4.70}$$

We also note the appearance of the dimensionless combination

$$\frac{e^2}{4\pi\varepsilon_0\hbar c} = \alpha \simeq \frac{1}{137} \tag{4.70'}$$

which is called the fine structure constant, $\alpha$. The quantity $\langle x\rangle_{21} = \langle x\rangle_{12}^*$ is the 'matrix element' of $x$ between the states 1 and 2; it is necessary to know the quantum mechanical wave functions of the two states 1, 2 in order to calculate $\langle x\rangle_{12}$ explicitly. However $\langle x\rangle$ has dimensions of length and is of the order of atomic dimension, $\langle x\rangle \sim 10^{-8}$ cm. If we use this value and $\omega = 3 \times 10^{15}$ rad/s in Eq. (4.69) we find $A \sim 10^7$ s$^{-1}$, that is, atomic states should have lifetimes in the range of $10^{-7}$ s; this agrees with observation.
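The order-of-magnitude estimate of $A$ can be reproduced by inserting the quoted values into Eq. (4.69). The sketch below works in SI units, so the matrix element $10^{-8}$ cm is entered as $10^{-10}$ m; the function name is illustrative.

```python
ALPHA = 1.0 / 137.0  # fine structure constant, Eq. (4.70')
C = 3.0e8            # speed of light in m/s


def spontaneous_rate(omega, x_matrix_element_m):
    """Eq. (4.69): A = (4/3) * (omega^3 / c^2) * alpha * |<x>|^2, in 1/s."""
    return (4.0 / 3.0) * omega ** 3 / C ** 2 * ALPHA * x_matrix_element_m ** 2


rate = spontaneous_rate(omega=3.0e15, x_matrix_element_m=1.0e-10)
lifetime = 1.0 / rate
print(f"A = {rate:.1e} 1/s, lifetime = {lifetime:.1e} s")
```

The result is a few times $10^7$ s$^{-1}$, consistent with the order of magnitude quoted in the text. Since $A$ scales as $\omega^3$, transitions at lower frequency have much longer spontaneous lifetimes.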

Fig. 4.22. The principle of laser operation: (a) three-level laser, (b) four-level laser. The diagram labels mark the pump, the fast spontaneous decays and the laser transition.

The absorption probability $W_{12}$ will depend on the intensity of the em radiation as indicated by the definition $W_{12} = B(du/d\omega)$, but also on how close the frequency $\omega$ of the radiation is to $\omega_0$, the energy difference of the two states (divided by $\hbar$): $\omega_0 = (E_2 - E_1)/\hbar$. We will adopt the point of view that the incident radiation is monochromatic at frequency $\omega$, and that the atomic levels have an energy width $\Delta\omega_0$. Absorption is maximal when $\omega = \omega_0$ and falls off quickly when $|\omega - \omega_0| \gtrsim \Delta\omega_0/2$; this can be described quantitatively by introducing a line shape function $g(\omega - \omega_0)$ as shown in Fig. 4.23. The line shape function is symmetric about $\omega_0$ and normalized to unity through

$$\int_{-\infty}^{+\infty} g(\omega - \omega_0)\,d\omega = 1 \tag{4.71}$$

Thus $g(\omega - \omega_0)$ has dimensions of inverse frequency.

We can now express $W_{12}$ in terms of the intensity of the incident radiation $I$, that is the energy crossing unit area in unit time. The intensity is related to the energy density $u$ of the radiation through

$$I = uc \tag{4.72}$$

and the corresponding flux of photons is

$$F = \frac{I}{\hbar\omega} = \frac{uc}{\hbar\omega} \tag{4.72'}$$

The absorption rate equals the stimulated emission rate and they are given by

$$W_{12} = W_{21} = \frac{4\pi^2}{3\hbar}\left(\frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{\hbar c}\right)|\langle x\rangle_{12}|^2\,g(\omega - \omega_0)\,I \tag{4.73}$$

This equation contains the same matrix element as for spontaneous emission (Eq. (4.69)); it also depends on the line-shape function of the atomic levels introduced in Eq. (4.71). It is convenient to write Eq. (4.73) in terms of the photon flux $F$ and an absorption cross-section $\sigma$, as

$$W_{12} = \sigma F \tag{4.73'}$$

where $\sigma$ depends only on the atomic system and on the frequency $\omega$. From Eqs. (4.73) the cross-section is found to be given by

$$\sigma = \frac{4\pi^2}{3}\left(\frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{\hbar c}\right)|\langle x\rangle_{12}|^2\,\omega\,g(\omega - \omega_0) \tag{4.74}$$

The cross-section has the dimensions of area and if the density of atoms is $N$ (atoms/cm$^3$) the product $dP = \sigma N\,dz$ gives the probability that a photon will be absorbed in a path of length $dz$. For atomic systems $\omega\,g(\omega - \omega_0) \sim \omega/\Delta\omega_0 \sim 10^5$ and for excited states $\langle x\rangle \sim 10^{-9}$ to $10^{-10}$ cm; thus $\sigma$ ranges from $10^{-14}$ to $10^{-16}$ cm$^2$.

Fig. 4.23. Typical line-shape function $g(\omega - \omega_0)$ characterizing the transitions between energy levels.

The rate at which the photon flux changes when a beam of photons traverses a path length $dz$ in the atomic medium can be obtained from Eqs. (4.67). As before we ignore spontaneous transitions so that

$$dF = W_{12}(N_2 - N_1)\,dz = \sigma F(N_2 - N_1)\,dz$$

or

$$\frac{dF}{F} = \sigma(N_2 - N_1)\,dz \tag{4.75}$$

For a finite length $z$,

$$\Delta F = F(z) - F(0) = \left[e^{\sigma(N_2 - N_1)z} - 1\right]F(0) \tag{4.75'}$$

Thus, as also found in Eq. (4.67'), if $N_2 > N_1$ the photon flux will increase and the medium will act as an amplifier. If we are able to feed back through the medium a fraction of the amplified radiation, the amplifier will behave as an oscillator provided the gain exceeds the losses in the system.

To achieve feedback the radiation is trapped between two mirrors on either side of the medium. On each reflection a small fraction, $\beta$, of the incident beam is absorbed, and another small fraction, $T$, is transmitted through the mirror; furthermore for each pass through the cavity, a small fraction $\eta$ of the beam is lost due to a variety of causes. If the fractional decrease in the flux, for one pass, is $\delta F/F$, then

$$1 - \frac{\delta F}{F} = (1 - \beta)(1 - T)(1 - \eta) = e^{-\gamma} \tag{4.76}$$

In general $\beta$, $T$ and $\eta$ are much less than one and so is $\gamma$. Introducing the losses in Eq. (4.75') we obtain

$$\Delta F = \left\{e^{[\sigma(N_2 - N_1)l - \gamma]} - 1\right\}F \tag{4.77}$$

where $l$ is the length of the active medium. If the exponent is positive, that is if

$$(N_2 - N_1) > \frac{\gamma}{\sigma l} \tag{4.78}$$

the system will lase. The value of $(\gamma/\sigma l)$ defines the threshold value of the population inversion density.

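We can write Eq. (4.77) in more familiar form by noting that $dF/F = dI/I$ and that the time for one traversal is $L/c$, where $L$ is the length of the cavity, thus

$$\frac{dI}{dt} = \frac{c}{L}\left\{e^{[\sigma(N_2 - N_1)l - \gamma]} - 1\right\}I \tag{4.79}$$

Near threshold the exponent is close to zero and Eq. (4.79) is approximated by

$$\frac{dI}{dt} \simeq \frac{c}{L}\left[\sigma(N_2 - N_1)l - \gamma\right]I \tag{4.79'}$$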

As an example, we consider an optical system where the total losses in one pass are $\gamma = 0.03$, the absorption cross-section is $\sigma = 10^{-16}$ cm$^2$ and the active medium length $l = 30$ cm. Then the threshold inversion density is

$$(N_2 - N_1)_{thr} = \frac{\gamma}{\sigma l} = \frac{0.03}{30 \times 10^{-16}} = 10^{13}\ \text{atoms/cm}^3$$

This is a very small fraction of all the atoms contained in 1 cm$^3$ and this is why it is possible readily to observe lasing in so many different atomic systems. We recall that even in a gas, the atomic density is in excess of $10^{19}$/cm$^3$. The factor $\sigma(N_2 - N_1)$ is referred to as the gain per unit path. Typically gas lasers operate with gains of 0.2 dB/m whereas molecular systems can have gains as high as 40 dB/m.
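The threshold condition of Eq. (4.78) is a one-line calculation. The sketch below reproduces the example and compares the result with a gas density of $10^{19}$ atoms/cm$^3$; the function name is illustrative.

```python
def threshold_inversion(gamma, sigma_cm2, length_cm):
    """Eq. (4.78): threshold inversion density gamma / (sigma * l), in cm^-3."""
    return gamma / (sigma_cm2 * length_cm)


n_threshold = threshold_inversion(gamma=0.03, sigma_cm2=1e-16, length_cm=30.0)
fraction_of_gas = n_threshold / 1e19  # compared with a typical gas density
print(f"threshold inversion = {n_threshold:.1e} atoms/cm^3")
print(f"fraction of a 1e19 /cm^3 gas = {fraction_of_gas:.0e}")
```

Only about one atom in a million needs to be inverted in this example, which illustrates why so many systems can be made to lase.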

In Eqs. (4.77) or (4.79) we have an expression for the rate of change of the photon flux. For a complete description of the system we must also have expressions for the rate of change of the population densities $N_2$ and $N_1$. For simplicity we will consider a four-level lasing system so that we can set $N_1 = 0$. Then

$$\frac{dN_2}{dt} = W_pN_g - \sigma(cN_\gamma)N_2 - AN_2$$

$$\frac{dN_\gamma}{dt} = \frac{c}{L}\left[\sigma(N_2 - N_1)l - \gamma\right]N_\gamma \tag{4.80}$$

Here $W_p$ is the probability rate of the pump, and $N_g = N_t - N_2$ is the density of atoms in the ground state. We have replaced the flux $F$ by $F = cN_\gamma$, where $N_\gamma$ is the photon density in the cavity. The two coupled equations must be solved for $N_2$ and $N_\gamma$ given appropriate initial conditions; we can usually set $N_1 = 0$ in the second equation.

Under equilibrium conditions, $dN_2/dt = dN_\gamma/dt = 0$, and ignoring the spontaneous term and also setting $N_1 = 0$, the solution of Eqs. (4.80) is simply

$$N_2 = \frac{\gamma}{\sigma l} \qquad N_\gamma = \frac{W_pN_g}{\sigma cN_2}$$

or

$$N_\gamma = \frac{W_pN_g\,l}{\gamma c} \tag{4.81}$$

The power extracted from the laser is given by the photon flux in the cavity $F = cN_\gamma$ multiplied by the transmission loss $T$, the cross-sectional area of the beam $A$ and the photon energy $\hbar\omega$. Thus

$$P = \left(\frac{T}{\gamma}\right)W_pN_g\,l\,A\,\hbar\omega \tag{4.82}$$

As a simple example consider $\hbar\omega = 2$ eV, $N_g = 10^{19}$/cm$^3$, $A = 0.03$ cm$^2$, $l = 30$ cm and $(T/\gamma) =$ . If the pump rate is $W_p = 1$/s then the transmitted power is 1 Watt, which corresponds to a laser of significant power.
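The output power of the example can be checked with a short calculation, assuming the power is $P = (T/\gamma)\,W_p N_g\,l\,A\,\hbar\omega$ as follows from multiplying $F = cN_\gamma$ by $T$, $A$ and $\hbar\omega$. The value of $T/\gamma$ is not legible in the text as it survives here, so the sketch first evaluates the power for $T/\gamma = 1$ and then infers which ratio would give the quoted 1 W; that inferred ratio is an assumption, not a value taken from the source.

```python
ELECTRON_VOLT = 1.602e-19  # joules per eV


def output_power(t_over_gamma, pump_rate, n_ground_cm3,
                 length_cm, area_cm2, photon_ev):
    """P = (T/gamma) * W_p * N_g * l * A * (photon energy), in watts."""
    pumped_atoms = n_ground_cm3 * length_cm * area_cm2  # atoms in the beam
    return t_over_gamma * pump_rate * pumped_atoms * photon_ev * ELECTRON_VOLT


# Power if every pumped photon left through the output mirror (T/gamma = 1).
p_unit_ratio = output_power(1.0, 1.0, 1e19, 30.0, 0.03, 2.0)

# Hypothetical: the T/gamma ratio that would reproduce the quoted 1 W.
inferred_ratio = 1.0 / p_unit_ratio
print(f"P(T/gamma = 1) = {p_unit_ratio:.2f} W")
print(f"T/gamma giving 1 W = {inferred_ratio:.2f}")
```

Under these assumptions a ratio near one third reproduces the stated power, meaning that about a third of the cavity losses would be useful output coupling.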

4.10 Properties of laser radiation

We have seen that in the laser the radiation must be trapped in a cavity so as to interact effectively with the active medium. In the visible, the cavity is formed by two mirrors which also serve to focus the radiation as shown schematically in Fig. 4.24. The reflectivity of the mirrors is $R = 1 - \beta$ where $\beta$ is the reflection loss. The number of reflections $N_{1/e}$ that can be achieved before the light intensity is decreased to $1/e$ of its initial value is obtained from

$$R^N = \frac{1}{e} \quad\text{or}\quad N(\ln R) = -1$$

But

$$\ln R = \ln(1 - \beta) \simeq -\beta \quad\text{for}\quad \beta \ll 1$$

Thus

$$N_{1/e} = \frac{1}{\beta} = \frac{1}{1 - R} \tag{4.83}$$

If the cavity length is $d$, the mean path-length of a photon in the cavity is $L_\gamma = dN_{1/e} = d/(1 - R)$. Typical reflectivities for laser mirrors are in excess of $R \simeq 0.99$.

The optical cavity, as any cavity, can be characterized by a quality factor, $Q$, where

$$Q = 2\pi\,\frac{\text{energy stored in system}}{\text{energy lost/cycle}} = \frac{2\pi}{T}\,\frac{U}{dU/dt} = \omega\,\frac{U}{dU/dt} = \frac{\omega}{\Delta\omega} \tag{4.84}$$

The resonant frequency of the cavity is $\omega$ and $\Delta\omega$ is the width of the resonance. Fig. 4.25 shows two examples of systems with high and low $Q$ respectively. By definition $U/(dU/dt)$ is the 'lifetime', $\tau$, of the energy stored in the cavity; thus it follows from Eq. (4.84) that

$$Q = \omega\tau \tag{4.84'}$$

To calculate the $Q$ of the optical cavity we note that the lifetime of a photon in the cavity is given by

$$\tau = \frac{L_\gamma}{c} = \frac{d}{c(1 - R)}$$

and therefore

$$Q = \frac{2\pi\nu d}{c(1 - R)} = 2\pi\,\frac{d}{\lambda}\,\frac{1}{(1 - R)} \tag{4.85}$$

We see that the $Q$ (which is dimensionless) can also be interpreted as giving the number of oscillations that the em field undergoes before the stored energy decays to $1/e$ of its initial value.

In the cavity the radiation travels in both directions and consequentlya pattern of standing waves will be established, with nodes at the mirrorsurfaces. Thus not all wavelengths can be supported in the cavity, but

Fig. 4.24. Confocal cavity resonator.

Fig. 4.25. Definition of the quality factor of an optical cavity in termsof the resulting line-width.

±co0 co0

High Q Low Q


only those for which an integral number of wavelengths fits in the spacingbetween mirrors which is given by d (see Fig. 4.26). Hence

4 = 1 , 2 , 3 , . . . (4.86)

Typically, $d$ is of order of a meter whereas $\lambda \sim 5 \times 10^{-7}$ m. Thus $d/\lambda \sim 10^6$ and therefore $q$ is a very large number. Different values of $q$ correspond to different modes of oscillation. The spacing between modes $\Delta\nu_{\min}$ is given by

$$\Delta\nu_{\min} = \frac{c}{\lambda_{q+1}} - \frac{c}{\lambda_q} = \frac{c}{2d}\,[\,q + 1 - q\,] = \frac{c}{2d} \tag{4.87}$$

For $d = 1$ m, we find $\Delta\nu_{\min} = 1.5 \times 10^8$ Hz.

The line width for the individual modes is determined by the Q of the cavity. Using Eqs. (4.84) and (4.85) we write

$$\frac{\nu}{\Delta\nu_s} = Q = 2\pi\,\frac{d}{\lambda}\,\frac{1}{1-R}$$

or

$$\Delta\nu_s = \frac{c\,(1-R)}{2\pi d}$$

For $d = 1$ m and $R = 0.99$ one obtains $\Delta\nu_s = 5 \times 10^5$ Hz which is indeed a narrow line; recall that in the visible $\nu \sim 5 \times 10^{14}$ Hz. Finally we ask how many modes will be contained in the laser beam. This depends on the line shape function for the transition (see Fig. 4.23). For atomic systems the line width is often given in terms of wave numbers; in these units the line width is typically $\Delta\bar{\nu} = 1\ \text{cm}^{-1}$. The line width in hertz is obtained by multiplying the wave number by the speed of light. Thus

$$\Delta\nu_g = c\,\Delta\bar{\nu} = (3 \times 10^{10}\ \text{cm/s}) \times (1\ \text{cm}^{-1}) = 3 \times 10^{10}\ \text{Hz}$$

which is due in large part to Doppler broadening. We can now reconstruct the spectrum of the radiation emitted from a laser, which will be of the general form of Fig. 4.27. However, the system may lase simultaneously at more than one optical line.
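The three frequency scales just introduced (mode spacing, single-mode line width, and width of the atomic line) can be compared directly. The sketch below is only an illustration: the function names are hypothetical, c is rounded to 3 x 10^8 m/s, and the count of modes is the simple ratio of the atomic line width to the mode spacing, an estimate that is not quoted in the text.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded, an assumption of this sketch)


def mode_spacing(d):
    """Frequency spacing between adjacent longitudinal modes, Eq. (4.87)."""
    return C / (2.0 * d)


def mode_linewidth(d, R):
    """Width of a single cavity mode, from Q = nu / delta_nu and Eq. (4.85)."""
    return C * (1.0 - R) / (2.0 * math.pi * d)


def line_width_hz(wavenumber_width_per_cm):
    """Convert a line width in cm^-1 to hertz (c expressed in cm/s)."""
    return (C * 100.0) * wavenumber_width_per_cm


spacing = mode_spacing(1.0)
single_mode = mode_linewidth(1.0, 0.99)
atomic_line = line_width_hz(1.0)
modes_under_line = atomic_line / spacing  # rough estimate only

print(f"mode spacing     = {spacing:.2e} Hz")
print(f"mode line width  = {single_mode:.2e} Hz")
print(f"atomic line      = {atomic_line:.2e} Hz")
print(f"modes under line = {modes_under_line:.0f}")
```

For a 1 m cavity this gives roughly two hundred modes under a 1 cm^-1 line, each mode being several hundred times narrower than the spacing between modes.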

Fig. 4.26. Standing waves in an optical cavity.

In any laser mode the electric and magnetic fields are very nearly perpendicular to the cavity axis; thus we speak of TEM modes. In addition to the condition given by Eq. (4.86) a mode is also specified by the distribution of the em field in the transverse plane. This situation is analogous to the field distribution in waveguides or optical fibers. The distribution in the plane transverse to the direction of propagation is sketched in Fig. 4.28 for the lowest and next to lowest mode.

Lasers can also be used to produce very short pulses of intense radiation. If all the radiation stored in the cavity could be extracted in one pass, a short pulse of high peak power would result. This is achieved by suddenly changing the Q of the cavity, and one speaks of a Q-switched laser. The pulse duration is of order

$$\Delta t \sim \frac{d}{c} = \frac{1\ \text{m}}{3 \times 10^{8}\ \text{m/s}} \simeq 3 \times 10^{-9}\ \text{s}$$

Much narrower pulses can be obtained by taking advantage of the many modes contained in the spectrum. If these modes can be made to oscillate with the same phase, then they are equivalent to the discrete frequency amplitudes of a Fourier series. Correspondingly, in the time domain, the pulse will have a width typical of

$$\Delta t \sim \frac{1}{\Delta\nu_g} = \frac{1}{30\ \text{GHz}} \simeq 30 \times 10^{-12}\ \text{s}$$

Pulses as narrow as $10 \times 10^{-15}$ s can be obtained by special techniques of 'compression' in the time domain.

Fig. 4.27. Typical frequency spectrum of the radiation emitted by a laser. The envelope of the modes has a width $\Delta\nu_g \sim 30$ GHz.

Fig. 4.28. Spatial distribution of the two lowest modes of laser radiation (TEM$_{00}$ and the next-lowest mode).
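The connection between phase-locked modes and a short pulse can be checked by adding the modes explicitly. The sketch below is a simplified illustration, not a model of a real laser: it assumes equal amplitudes for all modes, drops the common optical carrier, and uses 200 modes spaced by 1.5 x 10^8 Hz (the spacing of a 1 m cavity) so that the total bandwidth is 30 GHz. The name `locked_intensity` is hypothetical.

```python
import cmath
import math


def locked_intensity(t, n_modes, spacing_hz):
    """Intensity |E(t)|^2 of n_modes equal-amplitude modes oscillating in phase."""
    field = sum(cmath.exp(2j * math.pi * k * spacing_hz * t)
                for k in range(n_modes))
    return abs(field) ** 2


N_MODES = 200        # assumed number of locked modes
SPACING = 1.5e8      # mode spacing in Hz for d = 1 m, Eq. (4.87)

peak = locked_intensity(0.0, N_MODES, SPACING)
first_zero = 1.0 / (N_MODES * SPACING)   # time at which the modes first cancel
residual = locked_intensity(first_zero, N_MODES, SPACING)

print(f"peak intensity      = {peak:.0f} (N^2, in units of one mode)")
print(f"first zero at       = {first_zero:.2e} s")
print(f"intensity at zero   = {residual:.2e}")
print(f"Q-switch time scale = {1.0 / 3.0e8:.2e} s for d = 1 m")
```

The peak intensity scales as the square of the number of modes and the pulse falls to zero after a time equal to the inverse of the total bandwidth, about 30 ps here, two orders of magnitude shorter than the Q-switched estimate $d/c$.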

The minimum width of the laser beam usually occurs in the middle of the cavity and is given by

$$w_s = \left(\frac{\lambda L}{\pi}\right)^{1/2} \tag{4.88}$$

where $L$ is the focal length of the cavity mirror. $w_s$ is the 'waist' of the beam and one can recognize the similarities of Eq. (4.88) to the diffraction limit of Eq. (4.20'). For typical values of $\lambda = 600$ nm and $L = 0.5$ m one finds

$$w_s \simeq 3 \times 10^{-4}\ \text{m} = 0.3\ \text{mm}$$

One can take advantage of the small spot size to make the transmitted beam highly parallel. Using a lens of the same focal length as the cavity mirrors the resulting beam divergence is

$$\theta_d \simeq \frac{w_s}{L}$$

For the parameters defined above, $\theta_d \simeq 6 \times 10^{-4}$.

Wave phenomena are characterized by their degree of coherence, which measures how well the phase relationship is maintained at different space-time points. Consider for instance two points $P_1$ and $P_2$ and let the relative phase of the wave amplitude at these points be $\phi_{12}$ at some time $t = 0$. If the relative phase, at $P_1$ and $P_2$, remains equal to $\phi_{12}$ at all times the wave has perfect spatial coherence. Usually spatial coherence deteriorates as the separation between $P_1$ and $P_2$ increases. We define the coherence length $l_c$ as the distance over which the phase relationship is maintained.

We can also define the temporal coherence of the wave by considering the phase difference between times $t_1$ and $(t_1 + \tau)$ at a fixed point $P$. If this phase difference $\phi_1$ is maintained for all later times $t_2$ and $(t_2 + \tau)$, the wave has perfect temporal coherence. The coherence time $\tau_c$ measures how long the wave is emitted without 'glitches' or other changes in its phase. If the frequency of the source has a width $\Delta\nu$, then clearly, the coherence time cannot exceed $\tau_c \sim 1/\Delta\nu$. In this case the coherence length is also limited to $l_c \sim c/(2\pi\Delta\nu)$.
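To get a feeling for the magnitudes, the bounds $\tau_c \sim 1/\Delta\nu$ and $l_c \sim c/(2\pi\Delta\nu)$ can be evaluated for the two line widths met earlier in this section: a single cavity mode (about 5 x 10^5 Hz) and the full atomic line (about 3 x 10^10 Hz). This is only an order-of-magnitude sketch; the function names are hypothetical and c is rounded.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded, an assumption of this sketch)


def coherence_time(delta_nu):
    """Upper bound on the coherence time for a source of width delta_nu (Hz)."""
    return 1.0 / delta_nu


def coherence_length(delta_nu):
    """Corresponding bound on the coherence length, l_c ~ c / (2 pi delta_nu)."""
    return C / (2.0 * math.pi * delta_nu)


single_mode = coherence_length(5.0e5)   # one cavity mode
full_line = coherence_length(3.0e10)    # whole Doppler-broadened line

print(f"single mode: tau_c ~ {coherence_time(5.0e5):.1e} s, l_c ~ {single_mode:.1f} m")
print(f"full line:   tau_c ~ {coherence_time(3.0e10):.1e} s, l_c ~ {full_line * 1e3:.2f} mm")
```

A laser running in a single mode can therefore stay coherent over tens of meters, while light spread over the whole atomic line loses coherence within a few millimeters.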

The idealized 'plane wave' that we use in calculating wave phenomena has by definition perfect spatial and temporal coherence. Ordinary sources of visible light are very far from meeting these conditions and one needs to select a very small collimated beam to have a spatially coherent wave. In contrast, laser beams exhibit a degree of coherence several orders of magnitude higher than that of ordinary sources. Obviously, a non-coherent beam cannot exhibit interference phenomena.

There are two final properties of laser radiation that we mention: brightness and polarization. Brightness is defined as the power radiated into unit solid angle by unit surface of the source. Because of their high directionality and small beam size, even low power lasers have a brightness that exceeds that of other sources in the visible by orders of magnitude. Laser beams are usually linearly polarized with good accuracy. This is accomplished by orienting the windows on the discharge tube at the 'Brewster angle' with respect to the cavity axis. The polarization of the transmitted beam is then in the plane defined by the cavity and the normal to the window.

Exercises

Exercise 4.1

(a) Show that a parabolic reflector generates a parallel beam of light when a point source is placed at its focus.
(b) Find the position of the focus.
(c) Show qualitatively that when em radiation of wavelength $\lambda$ is focussed by means of a reflector of aperture $\pi R^2$ the angular aperture of the antenna is

$$\Delta\theta \sim \lambda/R$$

(d) Find the dimensions of the Arecibo radiotelescope and calculate its 'directionality' $\Delta\theta$ for frequencies near $f = 1$ GHz.

Exercise 4.2

Consider a fiber optics channel operating at a wavelength $\lambda = 600$ nm. The channel will support 1% changes in wavelength.

(a) Find the bandwidth of the channel.
(b) Given a S/N ratio of 10 find the capacity of the channel.
(c) Assuming that telephone communication needs a sampling rate $W = 10$ kHz, how many simultaneous telephone conversations can the channel support?

Exercise 4.3

An optical fiber of radius $a = 10\ \mu$m operates in the visible ($\lambda = 600$ nm). The refractive index is $n = 1.5$ and the difference in the refractive index between the core and the cladding is

2n\

(a) Find the limiting entrance angle.
(b) Find the maximum mode number that the fiber will support.

Exercise 4.4

In a laser cavity the spot size in the TEM$_{00}$ mode is characterized by a radius

where $L$ is the distance between the mirrors.
(a) Give a qualitative derivation of this result based on the uncertainty relation for wave phenomena.
(b) Consider a CO$_2$ laser (where $\lambda = 10.6\ \mu$m) of length $L = 2$ m and calculate the spot size.
(c) For the above laser calculate the frequency difference between two adjacent TEM$_{00}$ modes. Given that the FWHM of the CO$_2$ laser line is 50 MHz, find how many TEM$_{00}$ modes fall within this width.

PART C

NUCLEAR ENERGY

The first sustained nuclear chain reaction was achieved by Enrico Fermi and his collaborators at the University of Chicago on December 2, 1942. Since then, nuclear energy has been one of the dominant factors in our society. Fission reactors are widely used to supplement the generation of electric power and, when it is realized, controlled thermonuclear fusion promises to be an inexhaustible source of energy for mankind. Yet the most decisive and terrifying aspect of nuclear energy so far has been in the production of weapons of unprecedented destructive power. These weapons if used in the large quantities presently available can alter the ecology of the planet and completely destroy, or at the least, radically change human and animal life from what we know it to be today.

In Chapter 5 we begin by discussing the units of energy, the various levels of energy consumption and supply, and the global balance of energy on the earth. The earth receives its energy from the sun, which generates energy by nuclear fusion. Next we review the facts associated with nuclear forces in order to discuss the release of energy in fission and fusion processes. We also discuss radioactivity, its detection and its effect on living organisms. One section is devoted to nuclear reactors and another section to the principles of controlled nuclear fusion. For completeness we also consider solar energy, which even though still economically impractical is an inexhaustible source of clean energy.

Chapter 6 addresses the use of nuclear energy in weapons and the effects produced by high yield nuclear explosions. We discuss the existing weapons arsenals and the efforts to control the spread and proliferation of nuclear weapons by treaty. This brings up the consideration of delivery vehicles and of intelligence gathering by reconnaissance satellites. Finally we examine proposed defensive systems, such as the 'Strategic Defense Initiative', and the limitations imposed on them by the laws of physics.

SOURCES OF ENERGY

5.1 Introduction

When a force acts on a body and displaces it, the force does work and increases the energy of the body. For instance, a rock released in the earth's gravitational field gains kinetic energy as it falls, but it loses potential energy. While work and energy are equivalent definitions of the same physical quantity we think of energy as the 'capacity to do work'. Energy can manifest itself in various forms, but overall energy is always conserved in any physical process. Examples of different forms of energy are chemical energy, the rest-mass energy $E = mc^2$ of a body of mass $m$, or the random thermal energy of a material body.

Energy is measured, in the MKS system, in Joules. Power is energy produced (or consumed) per unit time and its unit is the Watt: 1 Watt = 1 Joule/s. Energy flux is the energy crossing unit area in unit time and is measured in Watts/m$^2$. Other commonly used units of energy are the calorie and the btu (British thermal unit). The relevant definitions and conversion factors are summarized below.

1 Joule = 1 N-m
1 calorie = 4.186 J
1 btu = 1055 J
1 Watt = 1 J/s

Even though energy is conserved, it is the conversion of energy from one form to another that is the basis of life, and also of modern technology. For instance, in an automobile engine chemical energy is first transformed to thermal energy; the thermal energy is converted to the mechanical energy that drives the automobile; finally the mechanical energy is again transformed to heat energy through friction, air resistance, etc. Through these processes the automobile and its passengers have reached their destination, which was the desired goal.

A crucial fact in the conversion of energy from one form to another is that some processes are reversible, while others are irreversible. At the microscopic scale of atomic and nuclear phenomena, all processes are presumed to be reversible. However even for systems of modest complexity, and in particular at the level of human experience, most processes are irreversible. This fact is contained in the second law of thermodynamics which states that in any process the total entropy of the system will remain the same or will increase. In reversible processes the total entropy of the system is unchanged; in irreversible processes the total entropy increases. When fuel is burned in an automobile engine the process is irreversible. The price we pay for the transportation is the consumption of the fuel, that is of an ordered form of energy which is converted to disordered energy, with a resultant increase in entropy. Thus, when we speak of the energy needs of the world we imply the need for ordered forms of energy.

Thermal energy can be converted to mechanical energy but only if two heat reservoirs are available. Let the reservoirs be at temperatures $T_1$ and $T_2$ with $T_2 > T_1$. To convert some of the thermal energy of the reservoir at $T_2$ to mechanical energy, we must extract from it energy $Q_2$ (in the form of heat) and deliver energy $Q_1$ to the reservoir at $T_1$; the difference between $Q_2$ and $Q_1$ has been converted to mechanical energy (or work) $W$

$$W = Q_2 - Q_1 \tag{5.1}$$

The efficiency, $\eta$, for this conversion process is defined by $\eta = W/Q_2 = (Q_2 - Q_1)/Q_2$. The highest possible efficiency is achieved when the thermal engine is reversible, in which case it can be easily proven that

$$\eta = 1 - \frac{T_1}{T_2} \tag{5.2}$$

Eq. (5.2) shows that when $T_2 \gg T_1$ the conversion efficiency will be high. As an example, the temperature of superheated steam (at 3000 psi) is $T_2 = 700$°F ($T_2 = 673$ K), and as in most practical applications, $T_1$ is fixed at the ambient temperature, $T_1 \sim 300$ K. Thus the efficiency of a typical steam power plant cannot exceed

$$\eta = 1 - (300/673) = 0.55$$

In practice the efficiency is in the range of 40%, close to the upper limit. Eqs. (5.1) and (5.2) show that if we have sources of thermal energy at temperature much above the ambient level, they can be used to satisfy our needs for ordered energy: for instance, by producing electrical power. Hereafter we will simply refer to 'energy consumption' without reference to the fact that we really mean ordered energy.
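The limit of Eq. (5.2) is easy to tabulate for other reservoir temperatures. The helper below is a minimal sketch; the name `carnot_efficiency` is introduced here for illustration, and both temperatures must be given in kelvin.

```python
def carnot_efficiency(t_cold, t_hot):
    """Maximum efficiency of a reversible engine between two reservoirs (kelvin)."""
    if not 0.0 < t_cold < t_hot:
        raise ValueError("temperatures must satisfy 0 < t_cold < t_hot (in kelvin)")
    return 1.0 - t_cold / t_hot


# Steam plant of the text: T2 = 673 K, T1 = 300 K.
steam_limit = carnot_efficiency(300.0, 673.0)
print(f"upper limit for the steam plant: {steam_limit:.2f}")

# A source only slightly above ambient is a poor source of ordered energy.
print(f"source at 350 K: {carnot_efficiency(300.0, 350.0):.2f}")
```

The second call illustrates why low-temperature heat is of little use for producing work: a reservoir at 350 K allows at most about 14% of the extracted heat to be converted.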


Energy is used by living organisms, the intake being in the form of food, and allows them to accomplish biological functions as well as mechanical work. The main uses of energy, however, in our civilization are for heating, for industrial production, and for transportation. The energy needed or released in some typical processes is listed in Table 5.1, and is also shown in Fig. 5.1 on a logarithmic scale. Note that throughout this and the following chapter we always refer to metric tons, 1 ton = $10^3$ kg; we use the abbreviations kt and Mt for $10^6$ and $10^9$ kg respectively.

For early man the main use of energy was for heating, and the source was the burning of wood. At present the principal sources of energy are fossil fuels: coal, oil and natural gas. Nuclear energy is an important component of electric power generation in the developed countries; it provides about 13% of the total electrical power in the U.S. Mechanical sources of energy such as hydroelectric power and wind power have been known and exploited for a long time but supply only a small fraction (less than 5%) of the total energy consumption. In recent years the use of geothermal, solar and even tidal energy has been researched and pilot plants have been constructed; nevertheless the exploitation of these sources is at present not economically advantageous and they are used only in very special cases. Fossil fuels, but also uranium and other fissile materials, are all depletable resources and the existing reserves will be exhausted in a short time (100-200 years at the present rate of consumption). Thus the realization of an alternate energy source such as controlled fusion appears to be an imperative for the survival of our civilization in its present form.

Table 5.1. Energy use and energy release in some typical processes

Process                                        Energy or energy per day
Energy flow from sun                           3.2 × 10^31 J/day
Energy flow to earth                           1.5 × 10^22 J/day
Total world human energy use                   9.0 × 10^17 J/day
Total U.S. human energy use                    2.0 × 10^17 J/day
Electrical output, large power plant (1 GW)    9.0 × 10^13 J/day
Fission energy of 1 kilogram Uranium-235       8.0 × 10^13 J
Combustion energy of 1 barrel of oil           6.0 × 10^9 J
U.S. energy use per capita                     9.0 × 10^8 J/day
World energy use per capita                    2.0 × 10^8 J/day
Food energy use per capita (2000 kcal/day)     9.0 × 10^6 J/day

Relative use of energy (U.S. 1970)
Heating           0.22
Refrigeration     0.05
Industrial        0.48
Transportation    0.25

The development of our industrial society has been made possible by the availability of energy, and this in turn has created a greater demand for energy. In fact, the energy consumption globally has been growing exponentially

$$P(t) = P(0)\,e^{at} \tag{5.3}$$

so that the time in which $P(t)$ doubles is given by $T_D = (\ln 2)/a$. Until 1977 the doubling time was $T_D = 29$ years but the rate of growth seems to be slowing down so that at present $T_D \sim 50$ years. This is still too rapid a rate of growth in energy demand for the limited resources of our planet especially as the developing nations become industrialized demanding a greater share of the global energy use. Thus conservation of resources is a necessary component of any long term energy policy.
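The relation between the doubling time and the growth constant $a$ of Eq. (5.3) can be made concrete with a few lines of arithmetic. In this sketch the function names are hypothetical, and the 100-year projection simply assumes that the present doubling time stays fixed.

```python
import math


def growth_rate(doubling_time_years):
    """Growth constant a (per year) such that P(t) = P(0) exp(a t) doubles in T_D."""
    return math.log(2.0) / doubling_time_years


def growth_factor(rate_per_year, years):
    """Ratio P(t) / P(0) after the given number of years, Eq. (5.3)."""
    return math.exp(rate_per_year * years)


for t_d in (29.0, 50.0):
    a = growth_rate(t_d)
    print(f"T_D = {t_d:.0f} yr -> a = {100.0 * a:.2f}% per year, "
          f"factor after 100 yr = {growth_factor(a, 100.0):.1f}")
```

Even the slower present rate, about 1.4% per year, quadruples the demand in a century, which is the point made in the text about the limited resources of the planet.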

Fig. 5.1. Comparative orders of magnitude of possible energy production and of energy consumption world wide. Logarithmic scale in joules, from $10^8$ to $10^{22}$, marking in descending order: energy flow from the sun reaching the earth (daily); world energy consumption, $9 \times 10^{17}$ (daily); U.S. energy consumption, $2 \times 10^{17}$ (daily); daily production of a 1000 MW power plant; 1 kg of U-235; combustion of 1 ton of coal or 1 barrel of oil.


5.2 The terrestrial energy balance

Life on earth is crucially dependent on the average ambient temperature; a planet's temperature is a function of its distance from the sun and is determined by balancing the energy received from the sun with the total energy radiated by the planet. Clearly, no convection or conduction of heat is possible in the interplanetary space and thus energy transfer to and from the earth proceeds by radiation of em energy, as shown schematically in Fig. 5.2. The energy received from the sun is near the visible part of the spectrum while the radiated energy is in the infrared.

To discuss the balance in radiant energy we recall (see Chapter 4) that excited atoms emit radiation. If we have an assembly (i.e. a large number) of atoms that are maintained in a cavity of temperature $T$, and remain in equilibrium with the radiation, the spectrum of the radiation in the cavity is given by Planck's equation. The energy density per unit angular frequency ($\omega = 2\pi\nu$) interval is

$$\frac{du}{d\omega} = \frac{\hbar\omega^3}{\pi^2c^3}\,\frac{1}{e^{\hbar\omega/kT} - 1} \tag{5.4}$$

The Planck distribution is plotted as a function of frequency in Fig. 5.3 and has a peak which shifts to higher frequencies as the temperature increases; the maximum in Eq. (5.4) occurs when $(h\nu/kT) = 2.821$.
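The position of the maximum of Eq. (5.4) can be checked numerically. Writing $x = \hbar\omega/kT$, the distribution is proportional to $x^3/(e^x - 1)$, and setting its derivative to zero gives the condition $x = 3(1 - e^{-x})$. The sketch below solves this by simple fixed-point iteration; the function names and the starting guess are choices made here for illustration, and the Boltzmann constant is taken as 8.617 x 10^-5 eV/K.

```python
import math

K_EV = 8.617e-5  # Boltzmann constant in eV/K


def planck_peak_x(tol=1e-12, max_iter=100):
    """Solve x = 3 (1 - exp(-x)), the maximum of x^3 / (exp(x) - 1)."""
    x = 3.0  # starting guess near the expected root
    for _ in range(max_iter):
        x_new = 3.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def peak_photon_energy_ev(temperature_k):
    """Photon energy (eV) at the maximum of the frequency distribution."""
    return planck_peak_x() * K_EV * temperature_k


print(f"x at maximum = {planck_peak_x():.4f}")
print(f"T = 5800 K: peak at {peak_photon_energy_ev(5800.0):.2f} eV")
print(f"T =  300 K: peak at {peak_photon_energy_ev(300.0):.3f} eV")
```

The iteration converges quickly because the slope of the right-hand side near the root is well below one. The same helper reproduces the peak energies used later for the sun and for the earth.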

The energy radiated into unit solid angle by unit normal area of the cavity in unit time can be shown to be

$$\frac{dI(\omega)}{d\Omega} = \frac{c}{4\pi}\,\frac{du}{d\omega} \tag{5.4'}$$

Then the energy radiated over all angles is found from (5.4') by integrating it over $\cos\theta\,d\Omega$ from $\theta = 0$ to $\theta = \pi/2$ so that

$$\frac{dI}{d\omega} = \frac{c}{4\pi}\,\frac{du}{d\omega}\int_0^1 \cos\theta\,d\cos\theta \int_0^{2\pi} d\phi = \frac{c}{4}\,\frac{du}{d\omega} \tag{5.4''}$$

Fig. 5.2. Illumination of the earth by the sun's radiant energy. The earth is labelled by its temperature $T_\oplus$.

Finally integration of Eq. (5.4'') over all frequencies yields the total radiated energy per unit surface of the emitter, per unit time

$$I_T = \left(\frac{\pi^2k^4}{60\,\hbar^3c^2}\right)T^4 \tag{5.5}$$

which is seen to depend on the fourth power of the temperature. The $T^4$ dependence of the total radiated power was known before Planck's discovery of his equation and is called the Stefan-Boltzmann law. Using $S_T$ (instead of $I_T$) for the energy radiated per unit area and unit time we write

$$S_T = \varepsilon\sigma T^4 \tag{5.6}$$

Here $\sigma$ is Stefan's constant as given by the expression in parentheses in Eq. (5.5); it has the numerical value

$$\sigma = 5.67 \times 10^{-8}\ \text{Joule/m}^2\text{-s-(K)}^4$$
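The numerical value of Stefan's constant follows directly from the combination of constants in Eq. (5.5). The check below is a small sketch using standard SI values for $k$, $\hbar$ and $c$ (these values are supplied here and are not quoted in the text); `radiated_flux` is a hypothetical helper implementing Eq. (5.6).

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s

# Stefan's constant from the expression in parentheses in Eq. (5.5)
SIGMA = math.pi ** 2 * K_B ** 4 / (60.0 * HBAR ** 3 * C ** 2)


def radiated_flux(temperature_k, emissivity=1.0):
    """Energy radiated per unit area and unit time, Eq. (5.6), in W/m^2."""
    return emissivity * SIGMA * temperature_k ** 4


print(f"sigma = {SIGMA:.3e} W m^-2 K^-4")
print(f"black body at 300 K radiates {radiated_flux(300.0):.0f} W/m^2")
```

Because of the fourth power, doubling the temperature raises the radiated flux sixteenfold.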

The coefficient $\varepsilon$, where $0 \le \varepsilon \le 1$, characterizes the ability of a physical system to radiate; it is called the emissivity. When $\varepsilon = 1$ the system radiates according to Planck's law and it is called a black body; this is the maximum possible rate of radiation. At the other limit, $\varepsilon = 0$ characterizes a system that does not radiate at all.

A physical system may also absorb em radiation, or it may reflect it. If energy flux $S_i$ is incident on the system, the absorbed energy flux is $S_a$ where

$$S_a = \frac{\text{energy absorbed}}{\text{unit normal area} \cdot \text{unit time}} = aS_i$$

and $a$ is called the absorptivity of the system. In general the emissivity and the absorptivity are functions of the frequency and the coefficients should be written as $\varepsilon(\nu)$ and $a(\nu)$. However it is a very important and general result, known as Kirchhoff's law, that

$$\varepsilon(\nu) = a(\nu) \tag{5.7}$$

This follows from conditions of equilibrium between the system and the surrounding em radiation. For instance, painting a surface white decreases the absorption of radiant heat, but also decreases its ability to radiate at that frequency.

Fig. 5.3. The Planck distribution plotted as a function of frequency. Axes: $dI/d\omega$ versus $\omega$, with the peak near $\omega \simeq 3kT/\hbar$.

From the Planck distribution we can understand the spectrum of radiation emitted by the sun and by the earth. The sun's surface is at a temperature $T_s \sim 5800$ K and therefore the peak of the Planck distribution is at $h\nu \sim 1.4$ eV, or $\lambda \sim 900$ nm. This corresponds to an energy slightly below the visible window but a large fraction of the sun's spectrum falls into that window. Conversely the temperature of the earth's surface is $T = 300$ K and thus the spectrum of its radiated energy peaks at $h\nu \sim 0.07$ eV or $\lambda \sim 17\ \mu$m, which is well into the infrared.* The energy flux reaching the earth from the sun is

$$S = 1.36 \times 10^3\ \text{Watts/m}^2 \tag{5.8}$$

and this value is known as the solar constant. We can use it to calculate the total energy radiated by the sun, what we call its luminosity $\mathcal{L}_\odot$; since the earth-sun distance is $R_s = 1.5 \times 10^{11}$ m, we find $\mathcal{L}_\odot = 3.8 \times 10^{26}$ Watts.
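The luminosity follows from spreading the solar constant over a sphere whose radius is the earth-sun distance. The sketch below repeats that arithmetic and also converts the result to joules per day for comparison with the first entry of Table 5.1; the function name is hypothetical.

```python
import math

SOLAR_CONSTANT = 1.36e3      # W/m^2, Eq. (5.8)
EARTH_SUN_DISTANCE = 1.5e11  # m
SECONDS_PER_DAY = 86400.0


def solar_luminosity(flux, distance):
    """Total power radiated by the sun, assuming isotropic emission."""
    return 4.0 * math.pi * distance ** 2 * flux


luminosity = solar_luminosity(SOLAR_CONSTANT, EARTH_SUN_DISTANCE)
energy_per_day = luminosity * SECONDS_PER_DAY

print(f"luminosity     = {luminosity:.2e} W")
print(f"energy per day = {energy_per_day:.2e} J")
```

The daily output agrees, within the rounding of the inputs, with the energy flow from the sun listed in Table 5.1.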

* Since $dI/d\lambda = -(c/\lambda^2)(dI/d\nu)$, the peak of the distribution $dI/d\lambda$ is at $(hc/kT\lambda) \simeq 5$; this is at a different frequency than the peak of $dI/d\nu$, which occasionally can lead to confusion.

The balance of energy for the earth is expressed by writing

$$\left.\frac{dE}{dt}\right|_{\text{absorbed}} = \left.\frac{dE}{dt}\right|_{\text{radiated}}$$

Since at any time, only half of the earth faces the sun, the normal area for absorption is $A_N = \pi R^2$ where $R = R_\oplus$ is the radius of the earth (see Fig. 5.2). Thus

$$\left.\frac{dE}{dt}\right|_{\text{absorbed}} = Sa(\pi R^2) \tag{5.9}$$

The rate at which energy is radiated is given by Stefan's law (Eq. (5.6)).

$$\left.\frac{dE}{dt}\right|_{\text{radiated}} = \varepsilon\sigma T^4(4\pi R^2) \tag{5.9'}$$

where the radiating area is the total surface of the earth $A = 4\pi R^2$. Equating the rates given by Eqs. (5.9) and (5.9'), and if as a first approximation we assume that (averaged over wavelength) $\varepsilon = a$, we find that the temperature of the earth's surface should be

$$T = \left(\frac{S}{4\sigma}\right)^{1/4} = 279\ \text{K} = 6\,^\circ\text{C} \tag{5.10}$$

While this result is in reasonable agreement with the range of temperatures found on the earth, the assumption $\varepsilon = a$ is too crude to use in any detailed model.
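The estimate of Eq. (5.10) can be reproduced, and its sensitivity to the ratio of absorptivity to emissivity explored, with the sketch below. The function name and the keyword arguments are introduced here for illustration; the general expression $T = (Sa/4\varepsilon\sigma)^{1/4}$ is what one obtains by equating Eqs. (5.9) and (5.9') before setting $\varepsilon = a$.

```python
SIGMA = 5.67e-8          # Stefan's constant, W m^-2 K^-4
SOLAR_CONSTANT = 1.36e3  # W/m^2, Eq. (5.8)


def equilibrium_temperature(flux, absorptivity=1.0, emissivity=1.0):
    """Temperature at which absorbed and radiated power balance for a sphere."""
    return (flux * absorptivity / (4.0 * emissivity * SIGMA)) ** 0.25


t_simple = equilibrium_temperature(SOLAR_CONSTANT)
print(f"emissivity = absorptivity: T = {t_simple:.0f} K")

# If only 70% of the incident flux were absorbed while the emissivity stayed at 1:
t_reflecting = equilibrium_temperature(SOLAR_CONSTANT, absorptivity=0.7)
print(f"absorptivity 0.7, emissivity 1: T = {t_reflecting:.0f} K")
```

The first result is close to the 279 K of Eq. (5.10). The second shows that reflection alone, without any trapping of the outgoing infrared radiation, would leave the surface well below freezing.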

To improve the calculation we will include (a) the effect of latitude, (b) the reflection of the incident energy by the cloud cover, and (c) the absorption and re-radiation of energy by the layers of the upper atmosphere. The first of these effects is indicated in Fig. 5.4(a) and we assume that the radiation is always incident parallel to the plane of the equator; in reality, of course, the angle of incidence depends on the position of the earth in its orbit around the sun (on the seasons). We consider a strip at latitude $\phi$ and of width $w = R\Delta\phi$. The total area of the strip is $A = w\,2\pi(R\cos\phi)$. The area normal to the direction of the incident radiation is $A_\perp = w\,2R\cos^2\phi$ where we have included a factor of $\cos\phi$ for the inclination of the surface and replaced $2\pi R$ by $2R$ for the normal projection. Proceeding as in Eqs. (5.9), we set $SA_\perp = \sigma T^4A$ to obtain

$$T = \left(\frac{S}{\pi\sigma}\right)^{1/4}(\cos\phi)^{1/4} = (296\ \text{K})(\cos\phi)^{1/4} \tag{5.10'}$$

Fig. 5.4. (a) The effect of latitude on the flux received per unit area of the earth's surface. (b) Reflection ($a_r$) and absorption ($a_{in}$) of the radiation incident on the earth ($a_s$). $a_e$ represents the radiation emitted by the earth; it is partially transmitted ($a_c$) and partially reflected ($a_c$) by the clouds and upper atmosphere.

For instance if the equator is at $T = 23\,^\circ$C then at a latitude $\phi = 44°$ the weather would be freezing. Of course for a realistic model we must take into account the conduction of the earth's surface and convection through the motion of water and air masses.
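The latitude dependence of Eq. (5.10') is readily tabulated. The sketch below assumes, as in the text, a black-body strip with no transport of heat between latitudes; the function name is hypothetical.

```python
import math

SIGMA = 5.67e-8          # Stefan's constant, W m^-2 K^-4
SOLAR_CONSTANT = 1.36e3  # W/m^2


def strip_temperature(latitude_deg):
    """Equilibrium temperature (K) of a strip at the given latitude, Eq. (5.10')."""
    phi = math.radians(latitude_deg)
    return (SOLAR_CONSTANT * math.cos(phi) / (math.pi * SIGMA)) ** 0.25


for latitude in (0.0, 30.0, 44.0, 60.0):
    t = strip_temperature(latitude)
    print(f"latitude {latitude:4.0f} deg: T = {t:5.1f} K = {t - 273.15:6.1f} C")
```

Because of the fourth root the temperature falls slowly with latitude, yet the model already drops below the freezing point near 44°, as stated above.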

Since we live on the earth's surface we are primarily interested in the surface temperature and thus in the radiation received and emitted by the surface. This is greatly affected by the reflection and absorption properties of the clouds and of the upper layers of the atmosphere. For instance, the thin ozone layer in the upper atmosphere strongly absorbs the ultraviolet providing the necessary protection to living beings; CO$_2$ layers strongly absorb in the infrared giving rise to the 'greenhouse effect'. On the average, half of the earth is covered by clouds which have a reflectivity of order 50% for visible, so that the solar radiation reaching the surface is typically ~0.7 of the solar flux. Because of the presence of such absorbing layers and since the incident and emitted radiation are at different wavelengths we cannot any more assume that $\langle a \rangle = \langle \varepsilon \rangle$.

We can construct a simple model taking the above factors into consideration, as shown in Fig. 5.4(b), where the energy balance involves the following components:

$a_s$     incident radiation
$a_r$     reflected radiation
$a_{in}$  radiation reaching the earth's surface
$a_e$     total power radiated by the earth's surface; all the energy is assumed to be completely absorbed by the upper atmosphere
$a_c$     total power radiated by the upper atmosphere; the radiation is assumed isotropic

Equilibrium implies the relations

$$a_{in} = a_s - a_r, \qquad a_s = a_r + a_c \qquad\text{and}\qquad a_e = a_{in} + a_c$$

from where we immediately find that

$$a_c = a_{in} \qquad\text{and}\qquad a_e = 2a_{in}$$

The radiation incident on the earth is $a_s = S\pi R^2$ and since only 0.7 of the incident radiation reaches the surface we write

$$a_{in} = 0.7a_s = 0.7S\pi R^2 \qquad\text{and}\qquad a_e = 1.4S\pi R^2$$

The radiated power $a_e$ is determined by the temperature of the earth's surface

$$\left.\frac{dE}{dt}\right|_{\text{rad}} = \sigma T^4(4\pi R^2)$$

Finally, equating this result to $a_e = 1.4S\pi R^2$ we find $T = 303$ K = 30°C. This is in good agreement with observation and shows how the absorbing layers of the upper atmosphere can significantly affect the surface temperature.
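The bookkeeping of this simple model can be written out step by step. In the sketch below all powers are expressed per unit of the cross-section $\pi R^2$, so the radius of the earth drops out; the function name and the parameter `fraction_reaching_surface` are introduced here for illustration.

```python
SIGMA = 5.67e-8          # Stefan's constant, W m^-2 K^-4
SOLAR_CONSTANT = 1.36e3  # W/m^2


def surface_temperature(fraction_reaching_surface=0.7):
    """Surface temperature (K) in the one-layer absorbing-atmosphere model."""
    a_s = SOLAR_CONSTANT                    # incident power per unit cross-section
    a_in = fraction_reaching_surface * a_s  # power reaching the surface
    a_c = a_in                              # from the equilibrium relations
    a_e = a_in + a_c                        # power the surface must radiate
    # sigma T^4 (4 pi R^2) = a_e (pi R^2)  =>  T = (a_e / 4 sigma)^(1/4)
    return (a_e / (4.0 * SIGMA)) ** 0.25


t_surface = surface_temperature()
print(f"surface temperature = {t_surface:.0f} K = {t_surface - 273.15:.0f} C")
```

The factor of two between $a_e$ and $a_{in}$ is the whole effect of the absorbing layer in this model: the surface receives the atmospheric radiation in addition to the sunlight and must radiate both away.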

Even from our simple calculations we can appreciate how complex the energy balance of the earth is. Moreover, the equilibrium is not necessarily stable and may be unstable to an increase in atmospheric pollutants or major changes in the ecology. For instance, an increase in CO$_2$ would increase the 'greenhouse effect' and warm up the earth; this in turn would create more evaporation, increase the cloud cover and the density of the upper layers, further accentuating the greenhouse effect. Conversely, if the snow cap increases, it would result in more reflection from the earth's surface and thus lower the surface temperature; this would extend the snow cover even further, and so on. It is for these reasons that effects such as a 'nuclear winter' can have a permanent and catastrophic effect on the earth's climate.

5.3 The atomic nucleus

The atomic nucleus is an assembly of protons (p) and neutrons (n) which are bound (held together) by the nuclear force. The neutron, discovered by J. Chadwick in 1932, has the same mass and shares many of the properties of the proton except that it is electrically neutral; the proton and neutron are collectively called nucleons. The typical dimension of nuclei is $r_N \sim 10^{-13}$ cm, to be compared with that of atomic systems which have radii $r_a \sim 10^{-8}$ cm. The number of protons in the nucleus is designated by $Z$, which therefore also indicates its charge; the number of neutrons is designated by $N$ and $A = Z + N$ is the atomic number. In nature stable nuclei are found only with charge less than $Z = 92$ and atomic number less than $A = 238$. Nuclei with values of $Z$ as high as 106 have been produced artificially but they are unstable.

Since positive charges repel one another, a force stronger than the electromagnetic force must be present in order to hold the nucleus together; this force must have a short range because we do not feel its effects on a macroscopic scale. We speak of the nuclear force and it has been found that it has the same strength for the pp, pn and nn systems. We can model such a nucleon-nucleon force by a square potential well of radius $a \sim 10^{-13}$ cm, as shown in Fig. 5.5. The depth of the well is of order 20 MeV, a million times deeper than the energy of the ground state of the hydrogen atom, and is just barely sufficient to bind a proton and a neutron into a deuteron.


To simplify the calculations related to nuclear phenomena it is customary to express momentum, mass and energy in MeV and length in fermis.

$$1\ \text{fermi} = 1\ \text{F} = 10^{-13}\ \text{cm} = 1\ \text{fm}$$

According to special relativity the energy $E$ of a particle of mass $m$ is given by Einstein's celebrated relation

$$E = mc^2 \tag{5.11}$$

Thus we express the mass of a particle, or nucleus, by its equivalent energy in MeV. For instance the mass of the proton is $m_p = 1.673 \times 10^{-27}$ kg; instead it is easier to use

$$\begin{aligned}
m_pc^2 &= 938.28\ \text{MeV} &&\text{proton mass}\\
m_nc^2 &= 939.57\ \text{MeV} &&\text{neutron mass}\\
m_ec^2 &= 0.511\ \text{MeV} &&\text{electron mass}
\end{aligned} \tag{5.11'}$$
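The conversion between a mass in kilograms and its rest energy in MeV is a one-line application of Eq. (5.11). The sketch below uses standard values for $c$ and for the joule equivalent of 1 MeV, which are supplied here and not quoted in the text; the function names are hypothetical.

```python
C = 2.99792458e8              # speed of light, m/s
JOULE_PER_MEV = 1.602176634e-13


def rest_energy_mev(mass_kg):
    """Rest energy m c^2 of a mass given in kilograms, expressed in MeV."""
    return mass_kg * C ** 2 / JOULE_PER_MEV


def mass_kg(rest_energy_in_mev):
    """Inverse conversion: mass in kilograms for a rest energy in MeV."""
    return rest_energy_in_mev * JOULE_PER_MEV / C ** 2


proton_mev = rest_energy_mev(1.673e-27)
electron_kg = mass_kg(0.511)
print(f"proton:   {proton_mev:.1f} MeV")
print(f"electron: {electron_kg:.3e} kg")
```

The proton value differs slightly from 938.28 MeV only because the mass in kilograms is given to four figures.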

Similarly, we express the momentum $p$ of a particle in MeV by giving the corresponding value of $cp$, which has units of energy. Finally the combination of the constants $(\hbar c)$ appears often in calculations; it has dimensions of energy-length and in our units

$$\hbar c \simeq 200\ \text{MeV-F} \tag{5.12}$$

As an exercise, we estimate the depth of the potential well of Fig. 5.5 from simple arguments based on the uncertainty principle. This principle, which forms the foundation of quantum mechanics, states that we cannot measure simultaneously two complementary variables with arbitrary precision. For instance, if $\Delta p_x$ and $\Delta x$ are the uncertainties in the measurement of momentum and position, then

$$\Delta p_x\,\Delta x \ge \hbar$$

Thus, if we take $\Delta x$ to be of the order of the dimensions of the nuclear potential well $\Delta x \sim a \sim 1$ F, we find $c\Delta p_x \ge 200$ MeV; we now assume that $cp \sim c\Delta p_x$ so that the kinetic energy is

$$\text{K.E.} = \frac{p^2}{2m} = \frac{(cp)^2}{2mc^2} \simeq 20\ \text{MeV}$$

where it is the nucleon (proton or neutron) mass that appears in the denominator. Since a nucleus is bound, the total energy must be negative; thus $U \sim -20$ MeV which is of the correct order of magnitude.

Fig. 5.5. A potential well of finite width and depth can be used as a model for the nuclear force. The well has depth $V_0$ of about 20 MeV and width of about 1 F.
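The uncertainty-principle estimate is easiest to follow in the MeV and fermi units introduced above. The sketch below repeats it with $\hbar c = 200$ MeV-F and the proton rest energy; the function name is hypothetical, and the second call, for an electron confined to an atom of size 10^5 F, is added only as a comparison.

```python
HBAR_C = 200.0       # MeV-F, Eq. (5.12)
NUCLEON_MC2 = 938.28  # MeV, proton rest energy
ELECTRON_MC2 = 0.511  # MeV


def confinement_energy_mev(delta_x_fermi, rest_energy_mev):
    """Non-relativistic kinetic energy (cp)^2 / (2 m c^2) with cp ~ hbar c / delta_x."""
    cp = HBAR_C / delta_x_fermi
    return cp ** 2 / (2.0 * rest_energy_mev)


nucleon = confinement_energy_mev(1.0, NUCLEON_MC2)
electron = confinement_energy_mev(1.0e5, ELECTRON_MC2)
print(f"nucleon confined to 1 F:       {nucleon:.1f} MeV")
print(f"electron confined to 10^-8 cm: {electron * 1e6:.1f} eV")
```

The same argument thus yields tens of MeV for a nucleon in a nucleus and a few eV for an electron in an atom, which is the millionfold difference in scale mentioned earlier.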

The existence of the nuclear force is the reason why a system of positive charges can be bound, but it is not sufficient to explain the structure of nuclei. We must take into account the Pauli exclusion principle according to which two protons (or two neutrons) with the same spin projection cannot occupy the same state. As we add protons and neutrons to a nucleus the width and depth of the effective potential well increases, but these particles cannot occupy filled energy levels and must therefore have increasingly higher energies. At a certain point the kinetic energy of any additional nucleon exceeds the depth of the potential well and the particle cannot be bound. It is for this reason that no nuclei exist with more than ~250 nucleons. We recall that the structure of atoms and of the periodic table are also a consequence of the exclusion principle.

The interplay between the repulsive Coulomb force and the exclusion principle favors nuclei where $N \sim Z$, as observed for light nuclei. However for high $Z$, nuclei have $N > Z$ because the short range nuclear force is less effective in balancing the em repulsion. In Fig. 5.6 the values of $Z$ and $N$ for the stable and radioactive nuclei are shown. By radioactive, we mean nuclei that spontaneously decay into another nucleus. Nuclei with the same value of $Z$ but different values of $N$ are called isotopes because they have the same atomic (chemical) properties. We designate a nucleus by the atomic symbol of its corresponding chemical element with a left subscript for $Z$, a right subscript for $N$ and a left superscript for $A$. For instance

$$^{238}_{\ 92}\text{U}_{146}$$

is pronounced 'Uranium two-thirty-eight' and represents the nucleus of the Uranium ($Z = 92$) isotope with atomic number $A = 238$. Since $N = A - Z$ the right subscript is redundant and we will omit it. Other examples are

$^{1}_{1}$H      the proton, which we will also designate by p
$^{2}_{1}$H      the deuteron, also designated by d
$^{4}_{2}$He     the nucleus of Helium-4, also known as an 'alpha-particle' and designated by $\alpha$
$^{16}_{\ 8}$O   Oxygen-16, etc.

Nuclei with an odd atomic number have half-integer spin; for instance the proton and neutron have spin $\frac{1}{2}$, $^{17}_{\ 8}$O has spin $\frac{5}{2}$ etc. When the atomic number is even the nucleus can have spin 0 and this is generally the case, but it can also have integer spin; for instance the deuteron has spin 1, whereas the $\alpha$-particle has spin 0. The $\alpha$-particle consists of two protons with their spins antiparallel and two neutrons also with antiparallel spins; because of this configuration it is an extremely tightly bound system.

Fig. 5.6. Chart of the 'nuclides' as a function of the number of protons, $Z$, and neutrons, $N = A - Z$; no nuclei can be formed far away from the stability region where $Z \sim N$. (From F. Bitter, Currents, Fields and Particles, John Wiley (1956).) Axes: $N$ (0 to 140) versus $Z$ (0 to 100). Legend: stable nuclei; naturally radioactive nuclei; artificially radioactive nuclei.

Ordinarily a nucleus is found in its ground state, the state of lowest energy. However, following a nuclear collision a nucleus can be in an excited state; it can then return to its ground state by the emission of a photon as shown in Fig. 5.7(a). This process is analogous to radiative transitions in atoms except that the difference between energy levels is of the order of MeV; photons in that energy range, which exceeds the energy of X-rays, are referred to as γ-rays. Radiative transitions in nuclei are usually prompt, with lifetimes in the range of $10^{-12}$ s.

An unstable nucleus can also decay by emitting an electron. In this case, given a parent nucleus A, Z, the daughter nucleus has the same A-value but the Z-value has increased to Z + 1. We speak of beta-decay, as shown schematically in Fig. 5.7(b). Positron emission is also possible in which case the Z-value of the daughter nucleus is Z - 1. These processes are due to the weak interaction which changes a neutron into a proton as in Eq. (5.13) and vice versa as in Eq. (5.13'). For instance, since the neutron-proton mass difference exceeds the electron mass (see Eqs. (5.11')) a free neutron can decay according to

$n \to p + e^- + \bar{\nu}_e$ (5.13)

This process takes place with a lifetime of 15 minutes. The inverse process

$p \to n + e^+ + \nu_e$ (5.13')

cannot occur for a free proton, but is allowed when the proton is inside the nucleus, since a rearrangement of the nucleons inside the well can provide the necessary energy. The particle designated by $\nu_e$ is the electron neutrino and is a neutral particle of negligible mass ($\bar{\nu}_e$ is the electron antineutrino); it escapes from the nucleus but carries away energy and momentum. Nuclear reactors are copious sources of antineutrinos. The lifetime for β-decay varies for different transitions but can be of order of hours, days or even longer.

Fig. 5.7. Examples of nuclear decay: (a) de-excitation by emission of a photon, (b) beta decay (emission of an electron and of an antineutrino), (c) decay by the emission of an α-particle (helium nucleus). [Figure not reproduced: level diagrams showing (a) A, Z to A, Z; (b) A, Z to A, Z + 1; (c) A, Z to A - 4, Z - 2.]

Heavy unstable nuclei often decay by the emission of an α-particle. In this case a parent nucleus A, Z decays to a daughter with A - 4, Z - 2 as shown in Fig. 5.7(c). That this process is possible is due to the great stability of the α-particle and to the quantum mechanical tunnel-effect. The lifetime for α-decay is very long and can be of order of thousands or millions of years. The naming of these decays as α, β, γ radiations dates from the early studies of natural radioactivity, around 1900, by Becquerel and the Curies. Under very special circumstances a heavy nucleus can decay by fission into two lighter nuclei. This process is the source of energy in nuclear reactors and in the smaller nuclear weapons.

5.4 Nuclear binding, fission and fusion

We saw in the previous section that the nucleus is a bound system of protons and neutrons. This means that if one wishes to separate all the constituents of the nucleus energy must be supplied; the necessary amount of energy is called the binding energy of that nucleus. By convention the binding energy is a negative number, being defined as the difference between the rest mass of the nucleus and the sum of the rest masses of all its constituent nucleons

$\mathrm{B.E.} = {}^{A}_{Z}m - [Z m_p + (A - Z) m_n]$

The heavier the nucleus the larger the magnitude of the binding energy. In fact, the binding energy per nucleon, B.E./A, has approximately the value of -8 MeV for most nuclei. Thus the binding energy decreases the mass of a nucleus from that of its free constituents by about 0.8%.

As examples we calculate B.E./A for the deuteron and for the helium nucleus. The deuteron has a rest mass $m_d c^2 = 1875.63$ MeV; the mass of its two constituents, the proton and neutron, was given in Eqs. (5.11'). Thus the binding energy for the deuteron is

$(\mathrm{B.E.})_d = m_d c^2 - (m_p + m_n)c^2 = 1875.63 - 1877.85 = -2.22$ MeV

and we find B.E./A = -1.1 MeV. The mass of the ${}^{4}_{2}\mathrm{He}$ nucleus is $m_\alpha c^2 = 3727.41$ MeV and therefore

$(\mathrm{B.E.})_\alpha = m_\alpha c^2 - (2m_p + 2m_n)c^2 = -28.29$ MeV

which results in B.E./A = -7.1 MeV indicating that the ${}^{4}_{2}\mathrm{He}$ nucleus is a tightly bound system as we have already remarked.
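The arithmetic of these two examples can be checked with a short Python sketch. The deuteron and α-particle rest energies are those quoted above; the individual proton and neutron rest energies are assumed values, chosen so that their sum reproduces the 1877.85 MeV used in the text, and the function name is illustrative only.

```python
# Rest energies in MeV. M_D and M_ALPHA are quoted in the text; M_P and M_N
# are assumed values whose sum reproduces the quoted 1877.85 MeV.
M_P = 938.28
M_N = 939.57
M_D = 1875.63
M_ALPHA = 3727.41


def binding_energy(rest_energy, z, a):
    """Return (B.E., B.E./A) in MeV for a nucleus with Z protons and A nucleons."""
    be = rest_energy - (z * M_P + (a - z) * M_N)
    return be, be / a


be_d, be_d_per_a = binding_energy(M_D, z=1, a=2)
be_alpha, be_alpha_per_a = binding_energy(M_ALPHA, z=2, a=4)

print(f"deuteron: B.E. = {be_d:.2f} MeV, B.E./A = {be_d_per_a:.2f} MeV")
print(f"alpha:    B.E. = {be_alpha:.2f} MeV, B.E./A = {be_alpha_per_a:.2f} MeV")
```

With the sign convention used here a more negative B.E./A means a more tightly bound nucleus.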

The values of B.E./A for all nuclei are shown in Fig. 5.8 as a function of A; they vary slowly with A except for the lightest elements. The minimum is in the region of the iron nucleus, A ~ 56, and this is why Fe is the most abundant element in burned out stars. It is the small deviation from flatness in the curve of B.E./A that makes possible the release of energy in nuclear fission or fusion processes. For instance a nucleus in the region A ~ 200 has B.E./A ~ -8 MeV; if the nucleus breaks up into two nuclei in the region A ~ 60 where B.E./A ~ -8.5 MeV, the final system is more tightly bound, and the difference in binding will appear as kinetic energy of the fission products. Similarly, if two light nuclei fuse into a heavier nucleus, for instance $d + d \to {}^{4}_{2}\mathrm{He} + \gamma$, the final system is more tightly bound and again the energy difference appears as kinetic and/or radiative energy.

We see that in order to release nuclear energy we must either induce the fission of a heavy nucleus or the fusion of two light nuclei. As long as the final system is more tightly bound than the initial state, energy will be released. This is equivalent to stating that the rest mass of the final system must be less than the rest mass of the initial state. That is, some of the rest mass is converted into kinetic energy as predicted by Einstein (see Eq. (5.11)), where

$\Delta E = (m_i - m_f)c^2 = (\Delta m)c^2$ (5.14)

To calculate the energy released in a nuclear reaction from Eq. (5.14) we need to know the masses of all nuclei with adequate precision. For practical reasons it is convenient to give the mass of the corresponding neutral atom in terms of the atomic mass unit (amu). One amu is 1/12th of the mass of the carbon-12 atom, where

$1\ \mathrm{amu} = (931.5016 \pm 0.0026)\ \mathrm{MeV}/c^2 = m_a$

Fig. 5.8. The binding energy per nucleon as a function of mass number. [Figure not reproduced: vertical axis B.E./A, from 0 to -9 MeV; horizontal axis A, up to 200.]

Furthermore the mass excess is defined through

$(M - A) = {}^{A}_{Z}M - A m_a$

where ${}^{A}_{Z}M$ is the mass of the neutral atom with nucleus of charge Z and mass number A. The mass excess has been tabulated for all stable isotopes, and is usually given in MeV/$c^2$.* Thus the mass of a nucleus can be found from

${}^{A}_{Z}m = A m_a + (M - A) - Z m_e$ (5.15)

In Eq. (5.15) the rest energy of the electrons has been included whereas the electron binding energy can be safely ignored.

As an example we calculate the energy released in the fission of ${}^{235}_{92}\mathrm{U}$ when it absorbs a slow neutron

${}^{1}_{0}n + {}^{235}_{92}\mathrm{U} \to {}^{144}_{56}\mathrm{Ba} + {}^{89}_{36}\mathrm{Kr} + 3\,{}^{1}_{0}n$ (5.16)

The number of protons and the number of neutrons are the same on both sides of the equation since these particles cannot be created or destroyed during the reaction and no weak interactions such as in Eqs. (5.13) take place; thus there is also no change in the number of electrons nor are any neutrinos produced. The mass excess on the left-hand side is

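$(M - A)\ {}^{235}_{92}\mathrm{U} = 40.916$
$(M - A)\ {}^{1}_{0}n = 8.071$
total = 48.987 MeV/$c^2$

whereas on the right-hand side

$(M - A)\ {}^{144}_{56}\mathrm{Ba} \approx -80.0$
$(M - A)\ {}^{89}_{36}\mathrm{Kr} = -76.79$
$3 \times (M - A)\ {}^{1}_{0}n = 24.21$
total = -132.58 MeV/$c^2$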

We see that the final products are more tightly bound, and the energy released in this process is

$\Delta E = (m_i - m_f)c^2 = 48.99 - (-132.58) = 181.57$ MeV

The ${}^{235}_{92}\mathrm{U}$ nucleus fissions into two nuclei of almost equal mass, but not always those indicated in Eq. (5.16). In each fission, on the average, 200 MeV of energy are released and 2.4 neutrons are produced. The fission products are usually unstable nuclei, and are therefore highly radioactive.
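The bookkeeping of this example can be sketched in a few lines of Python. Because A and Z are the same on both sides of Eq. (5.16), the terms $A m_a$ and $Z m_e$ of Eq. (5.15) cancel and only the mass excesses need to be summed. The mass-excess values are those quoted above; the dictionary keys and the helper name are illustrative only.

```python
# Mass excesses (M - A) in MeV/c^2, as quoted in the text.
MASS_EXCESS = {"n": 8.071, "U235": 40.916, "Ba144": -80.0, "Kr89": -76.79}


def total_mass_excess(side):
    """Sum the mass excesses for one side of a reaction, given {species: count}."""
    return sum(count * MASS_EXCESS[name] for name, count in side.items())


initial = {"n": 1, "U235": 1}
final = {"Ba144": 1, "Kr89": 1, "n": 3}

# Energy released = (m_i - m_f) c^2; the A*m_a and Z*m_e terms cancel.
energy_released = total_mass_excess(initial) - total_mass_excess(final)
print(f"energy released: {energy_released:.2f} MeV")
```

The same helper applies to any reaction that conserves A and Z, provided the mass excesses of all participants are known.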

* See for instance Introduction to Modern Physics by J. D. McGervey, Academic Press, New York, 1983.

Heavy nuclei are, in general, subject to fission when bombarded by neutrons. The explanation of this phenomenon is given in terms of the 'liquid drop' model sketched in Fig. 5.9. Since the neutron has no electric charge it can easily penetrate the nucleus. The addition of the neutron increases the energy of the nucleus by ~8 MeV (the average B.E./A) and to dissipate that extra energy, the nucleus begins to oscillate, the oscillation amplitude grows, and eventually the nucleus fissions. The cross-section for fission varies drastically for the various nuclei; for ${}^{235}_{92}\mathrm{U}$ the cross-section increases for low energy or 'slow' neutrons and reaches $\sigma_f \sim 10^{-21}$ cm$^2$ for $E_n \sim 0.01$ eV. Neutrons of such low energy are referred to as thermal neutrons. In contrast ${}^{238}_{92}\mathrm{U}$ absorbs 'fast' neutrons ($E_n \gtrsim 1$ MeV) to form the isotope ${}^{239}_{92}\mathrm{U}$ which beta-decays to neptunium rather than fission. This reaction is indicated below, the ${}^{239}_{93}\mathrm{Np}$ beta-decaying with a 2-day lifetime to plutonium.

${}^{1}_{0}n + {}^{238}_{92}\mathrm{U} \to {}^{239}_{92}\mathrm{U} \xrightarrow{20\ \mathrm{min}} {}^{239}_{93}\mathrm{Np} + e^- + \bar{\nu}_e$

${}^{239}_{93}\mathrm{Np} \xrightarrow{2\ \mathrm{days}} {}^{239}_{94}\mathrm{Pu} + e^- + \bar{\nu}_e$ (5.17)

The artificial isotope ${}^{239}_{94}\mathrm{Pu}$ has a very large fission cross-section and is used extensively as fuel for weapons; it has a lifetime of 24 000 years.

The fact that more than one neutron is produced for every absorbed neutron opens up the possibility of a 'chain reaction' which would grow exponentially. However, the neutrons from the reaction of Eq. (5.16) are produced with kinetic energy in the range of 1-2 MeV. Before these neutrons can be absorbed to induce fission they must be slowed down to thermal energies. This is achieved by using a moderating material consisting of light nuclei, such as water or graphite. When a neutron scatters elastically by 180° from a nucleus of mass M, the kinetic energy after the collision is

$(\mathrm{K.E.})_{\mathrm{after}} = \left(\frac{M - m_n}{M + m_n}\right)^2 (\mathrm{K.E.})_{\mathrm{before}}$

Thus the efficiency of elastic collisions for slowing down neutrons improves greatly as $M \to m_n$, and this is why light materials are effective moderators.
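A minimal Python sketch of the head-on collision formula makes the trend explicit. It assumes, as an approximation not stated in the text, that the nuclear mass is $M \approx A\,m_n$, so that only the mass number enters; the function name is illustrative.

```python
def fraction_kept_head_on(mass_number):
    """Fraction of kinetic energy a neutron keeps after a 180-degree elastic
    collision with a nucleus of mass M ~ mass_number * m_n (an approximation)."""
    return ((mass_number - 1) / (mass_number + 1)) ** 2


for name, a in [("hydrogen", 1), ("deuterium", 2), ("carbon", 12), ("uranium", 238)]:
    print(f"{name:10s} A = {a:3d}  fraction kept = {fraction_kept_head_on(a):.3f}")
```

In this idealized head-on case a single collision with hydrogen stops the neutron, whereas a collision with uranium leaves its energy almost unchanged.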

Fig. 5.9. Schematic representation of nuclear fission in the liquid drop model. [Figure not reproduced.]

The rate of growth of the chain reaction depends on the probability that a neutron will induce a fission reaction as compared to its probability for escaping the volume of the fissile material or of being absorbed in a reaction that produces no further neutrons. We can introduce the concept of a reproduction factor k which is defined as the average number of neutrons produced for every absorbed neutron in a system of very large dimensions. For instance for a graphite natural uranium lattice k ~ 1.04 to 1.07. Clearly if k < 1 no chain reaction is possible. If the probability that a neutron will escape is $P_e$, then the multiplication factor $K = k(1 - P_e)$. For a reactor operating at constant power we must have K = 1; this is achieved by controlling the reaction rate through the introduction of materials, such as cadmium, that absorb neutrons. For a weapon we want K > 1 so that the reaction will proceed very fast and give rise to an explosion.
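As a small illustration of the relation $K = k(1 - P_e)$, the following Python sketch computes the escape probability at which a lattice with the quoted values of k is exactly critical. The function names are illustrative, and absorption in control material is ignored here.

```python
def multiplication_factor(k, p_escape):
    """K = k (1 - P_e)."""
    return k * (1.0 - p_escape)


def critical_escape_probability(k):
    """Escape probability P_e for which K = 1."""
    return 1.0 - 1.0 / k


for k in (1.04, 1.07):
    p_crit = critical_escape_probability(k)
    print(f"k = {k:.2f}: critical P_e = {p_crit:.3f}, "
          f"K = {multiplication_factor(k, p_crit):.3f}")
```

For k only a few percent above unity, the assembly can tolerate the loss of only a few percent of its neutrons, which is why such lattices must be large.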

Natural uranium is composed 99.3% of ${}^{238}\mathrm{U}$ which absorbs neutrons and only 0.7% of the fissile ${}^{235}\mathrm{U}$. Thus to increase the reproduction factor, commercial reactors use uranium that has been enriched in ${}^{235}\mathrm{U}$ up to 3%. However with a suitable moderator reactors using natural uranium can operate satisfactorily. In contrast highly enriched uranium will undergo an explosive reaction if it is assembled in large enough quantities so as to exceed its 'critical mass'.

As an example, let us calculate the mean free path for a neutron in pure metallic ${}^{235}\mathrm{U}$. In this case there is no moderator present and we take the fission cross-section (for fast neutrons E > 10 keV), $\sigma_F \sim 5 \times 10^{-24}$ cm$^2$. Since the density of uranium is 18 g/cm$^3$, the (number) density of nuclei is $n_F \sim 5 \times 10^{22}$ nuclei/cm$^3$ and the mean free path is

$l = 1/(n_F \sigma_F) \approx 4$ cm (5.18)

It is reported that the critical mass for pure ${}^{235}\mathrm{U}$ corresponds to a sphere of radius $r \approx 8$ cm in good agreement with our simple estimate for the mean free path. Such a sphere has a mass of ~45 kg and a chain reaction will proceed explosively if that amount of ${}^{235}\mathrm{U}$ is tightly assembled. For 3% enriched uranium, the critical mass is of order of several tons.
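The estimate of Eq. (5.18) can be reproduced with the Python sketch below. The density, cross-section and radius are the values quoted in the text; Avogadro's number and the molar mass of 235 g/mol are assumed inputs, and the variable names are illustrative.

```python
import math

AVOGADRO = 6.022e23        # nuclei per mole (assumed constant)
MOLAR_MASS = 235.0         # g/mol for 235U (assumed)
DENSITY = 18.0             # g/cm^3, as quoted in the text
SIGMA_F = 5e-24            # cm^2, fast-neutron fission cross-section from the text
RADIUS = 8.0               # cm, reported critical radius from the text

number_density = DENSITY * AVOGADRO / MOLAR_MASS          # nuclei/cm^3
mean_free_path = 1.0 / (number_density * SIGMA_F)         # cm, Eq. (5.18)
sphere_mass_kg = DENSITY * (4.0 / 3.0) * math.pi * RADIUS**3 / 1000.0

print(f"n_F = {number_density:.2e} nuclei/cm^3")
print(f"mean free path = {mean_free_path:.1f} cm")
print(f"mass of an 8 cm sphere = {sphere_mass_kg:.0f} kg")
```

With these rounded inputs the sphere mass comes out near 40 kg, of the same order as the figure quoted in the text.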

The control of fusion reactions is much more difficult than for fission. This is because the initial particles must have high energy in order to overcome the Coulomb repulsion. In weapons this is achieved by triggering a fission explosive which produces enough energy to ignite the fusion reaction. Methods for maintaining a fusion reaction under controlled conditions will be discussed in Section 5.7.

5.5 Nuclear reactors

Nuclear reactors are devices in which a fission chain reaction can be sustained in equilibrium and the generated heat is converted to mechanical energy, usually by a steam-driven turbine. The fuel is enriched uranium and is encased in hermetically sealed zirconium tubes;* these are the fuel elements and are assembled in the core of the reactor. Most commercial U.S. reactors use water under high pressure as a moderator in which the fuel elements are immersed. The water acts as a primary coolant and generates steam in a heat exchanger. A schematic is shown in Fig. 5.10. The secondary loop uses the steam to drive the turbine and a second heat exchanger condenses the exhaust steam. The heat exchanger in the secondary loop uses cooling water from nearby rivers or lakes or air cooling towers. In that respect the reactor has the same properties as any large fossil fuel electric power generating plant.

Fig. 5.10. Schematic representation of the components of power generating reactors: (a) pressurized water reactor, (b) boiling water reactor. [Figure not reproduced: labeled components include core, pressure vessel, rods, pump, steam generator, turbine, generator, condenser, cooling water and containment structure.]

* Zirconium has a low neutron absorption cross-section and can withstand high temperatures.

Under normal operating conditions the power generated by the reactor is kept constant. This implies that the number of fissions per unit time is constant, and thus also the neutron flux in the reactor core remains constant. This means that for every neutron that is absorbed (either in a fission-producing reaction or otherwise) or leaves the core, exactly one neutron must be produced; no more and no less. The ratio of neutrons produced to neutrons lost is the multiplication factor K, already introduced in the previous section. Neutrons are produced by fission of the fuel but also to a much lesser extent by the decay of unstable isotopes. This is balanced by the loss of neutrons which occurs because of (a) capture in the fuel that leads to fission, (b) capture in ${}^{238}\mathrm{U}$ or in the reactor structure that does not lead to fission and (c) escape from the core. Thus when

K < 1 the power is falling and the reactor is sub-critical
K = 1 the power is constant and the reactor is critical
K > 1 the power is rising and the reactor is super-critical

The multiplication factor depends on various parameters of the reactor and is controlled by adjusting the cadmium or boron control rods. To increase the power level the control rods are retracted making K slightly larger than 1, and the power increases. When the desired power level is reached the control rods are inserted deeper into the core to make K = 1. The power is directly proportional to the neutron flux which is carefully monitored and used to provide the necessary feedback for stable operation.

The power generated in the reactor is given by

$P = \phi_n n_F \sigma_F V Q$ (5.19)

Here, $\phi_n$ is the neutron flux in the core, $\phi_n = n_n v_n$ with $v_n$ the velocity of the thermal neutrons and $n_n$ their number density. The fission cross-section is $\sigma_F$ and $n_F$ is the number density of fissile nuclei; V is the volume of the core and Q is the energy released per fission. Assuming an enrichment of 3%, the number density of ${}^{235}\mathrm{U}$ nuclei is

$n_F = 0.03 \times n = 0.03 \times (5 \times 10^{22}) = 1.5 \times 10^{21}$ cm$^{-3}$

where n is the number density of uranium nuclei averaged over the dimensions of the core. For the fission cross-section in the presence of a moderator we will take

$\sigma_F \sim 10^{-22}$ cm$^2$

and for the neutron flux we choose the realistic value

$\phi_n = 10^{14}$ cm$^{-2}$ s$^{-1}$

Thus for a core of two cubic meters, $V = 2 \times 10^6$ cm$^3$, recalling that the energy released per fission is of order $Q = 200$ MeV $= (2 \times 10^8) \times (1.6 \times 10^{-19})$ J, we find

$P = 10^{14} \times (1.5 \times 10^{21}) \times 10^{-22} \times (2 \times 10^6) \times (3.2 \times 10^{-11}) \sim 10^9$ Watts

Namely, we can expect a thermal power of 1 GW. If we assume a 40% efficiency for conversion to electrical power, the reactor can provide ~400 MW which is equivalent to the rating of large fossil fuel power plants.
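The power estimate of Eq. (5.19) is easily reproduced numerically. The Python sketch below uses only the values quoted in the text; the variable names are illustrative.

```python
EV_TO_J = 1.6e-19                 # joules per eV, rounded as in the text

flux = 1e14                       # neutrons / (cm^2 s)
n_fissile = 0.03 * 5e22           # 235U nuclei per cm^3 at 3% enrichment
sigma_f = 1e-22                   # cm^2, with moderator
volume = 2e6                      # cm^3 (two cubic meters)
q_fission = 200e6 * EV_TO_J       # J per fission (200 MeV)

thermal_power = flux * n_fissile * sigma_f * volume * q_fission   # W, Eq. (5.19)
electrical_power = 0.40 * thermal_power                           # W, 40% efficiency

print(f"thermal power    = {thermal_power:.2e} W")
print(f"electrical power = {electrical_power:.2e} W")
```

The product comes out slightly below $10^9$ W, which the text rounds to 1 GW thermal and ~400 MW electrical.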


The mass of the core is M ~ (20 g/cm$^3$) x ($2 \times 10^6$ cm$^3$) = 40 tons and the number of fissile nuclei is $N_F = n_F \times V = 3 \times 10^{27}$. The fission rate on the other hand is $R_F = P/Q \sim 3 \times 10^{19}$/s; thus the total fuel would be exhausted in $10^8$ s, or in 3 years. Therefore it is necessary to refuel nuclear reactors approximately once a year. Refueling consists in removing the partially spent fuel elements and replacing them with new fuel elements containing uranium enriched to the desired level. The used fuel elements are then reprocessed to extract the remaining ${}^{235}\mathrm{U}$ but also the ${}^{239}\mathrm{Pu}$ that has been produced by the bombardment of the ${}^{238}\mathrm{U}$. Refueling, transportation of the used fuel elements and reprocessing are exacting and potentially hazardous operations.
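The fuel-exhaustion time follows from the same numbers. In the Python sketch below the power is taken as the rounded 1 GW, and the number of seconds in a year is an assumed conversion constant; the names are illustrative.

```python
SECONDS_PER_YEAR = 3.156e7        # assumed conversion constant

n_fissile_total = 1.5e21 * 2e6    # N_F = n_F * V, number of 235U nuclei in the core
thermal_power = 1e9               # W, rounded value from the text
q_fission = 200e6 * 1.6e-19       # J per fission

fission_rate = thermal_power / q_fission            # fissions per second
burn_time_s = n_fissile_total / fission_rate        # s, if every nucleus fissions
burn_time_years = burn_time_s / SECONDS_PER_YEAR

print(f"fission rate = {fission_rate:.2e} /s")
print(f"burn time    = {burn_time_s:.1e} s = {burn_time_years:.1f} years")
```

This is an upper limit, since the reactor cannot remain critical until the last fissile nucleus is consumed.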

The stability of nuclear reactors is due in part to a detail in the chain of neutron production. The time it takes for a neutron to be thermalized and induce a fission is of order of 1 ms ($10^{-3}$ s). Thus if the reactor was adjusted so that K = 1.001, then in one second, that is after 1000 generations, the reactor power would nearly triple ($1.001^{1000} \approx 2.7$). This shows that the feedback and motion of the control rods would have to be extremely fast. However a small fraction of the neutrons in the core do not come directly from fission but are due to the decay of fission products, as for instance in the process

${}^{87}\mathrm{Br} \to {}^{87}\mathrm{Kr} + e^- + \bar{\nu}_e \to {}^{86}\mathrm{Kr} + {}^{1}_{0}n$

which has a long lifetime. About 1.1% of the neutron flux is due to these secondary neutrons which are produced with an average delay of 14 seconds. Thus the time from the birth of a neutron to the instant at which it induces a fission, averaged over all neutrons, is of order $\tau = (0.011 \times 14 + 0.989 \times 10^{-3}) \sim 0.15$ s which is significantly longer than $10^{-3}$ s. The reactor is operated so that only 98.9% of the fissions are produced by prompt (i.e. direct fission) neutrons and the remainder by delayed neutrons. When this condition is not observed the reactor becomes prompt-critical and cannot be controlled.
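The effect of the delayed neutrons on the time scale can be illustrated with a short Python sketch. It assumes a deliberately simple model, not taken from the text, in which the power is multiplied by K once per generation, so that after a time t it has grown by $K^{t/\tau}$; the names are illustrative.

```python
PROMPT_TIME = 1e-3         # s, thermalization-to-fission time from the text
DELAYED_TIME = 14.0        # s, average delay of the secondary neutrons
DELAYED_FRACTION = 0.011   # fraction of neutrons that are delayed

mean_generation_time = (DELAYED_FRACTION * DELAYED_TIME
                        + (1.0 - DELAYED_FRACTION) * PROMPT_TIME)


def growth_factor(k, generation_time, elapsed=1.0):
    """Power growth after `elapsed` seconds, assuming one factor of K per generation."""
    return k ** (elapsed / generation_time)


prompt_growth = growth_factor(1.001, PROMPT_TIME)
delayed_growth = growth_factor(1.001, mean_generation_time)

print(f"mean generation time = {mean_generation_time:.3f} s")
print(f"growth in 1 s, prompt neutrons only  = {prompt_growth:.2f}")
print(f"growth in 1 s, with delayed neutrons = {delayed_growth:.4f}")
```

In this model the delayed neutrons reduce the growth in one second from a factor of order unity to well under one percent, which is slow enough for mechanical control rods to follow.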

The power vs. time curve for a nuclear reactor would typically look as in Fig. 5.11. At the start-up the neutron flux grows exponentially and after some overshoot remains constant. Under normal conditions when the reactor is being shut down the power and thus also the flux are reduced exponentially. It is important to realize that the reactor depends crucially on the cooling fluid to maintain steady operation because small variations in fuel temperature and density or in moderator density affect the reaction rate. Even if the chain reaction is stopped, for instance by the insertion of emergency control rods, the fuel elements are still extremely radioactive and generate enough heat to melt their cladding. To minimize the consequences of such possible accidents, reactors are surrounded by a structure designed to contain the radioactive materials that would be released from the core, and to a lesser extent from the primary coolant.

Commercial power reactors in the U.S. are mostly of the pressurized water or boiling water type shown in Fig. 5.10. In the pressurized water system the steam circuit is isolated from the core, while in the boiling water system it is part of the primary coolant. Water reactors are designed to be thermally stable, that is if the temperature of the core increases the multiplication factor decreases, reducing the reactor power and thus also the temperature. When high neutron flux is desired, as in reactors used to produce weapons material, graphite rather than water is used for the moderator. Typically it takes a neutron approximately 114 collisions before it thermalizes in graphite as compared with only 18 collisions in water. However in water, neutrons are absorbed by the free protons in the reaction $n + p \to d + \gamma$ which has a large cross-section. To avoid this loss, heavy water reactors using D₂O as a moderator have been built, and they can operate with natural uranium. A different approach is to use a liquid metal as the coolant since this permits operation at higher temperatures. In this case however the moderating effect of the water is absent and the fuel must be considerably enriched in fissile material.

The enrichment of uranium fuel can be achieved by diffusion of UF₆ gas through a long sequence of porous membranes. Alternative methods are centrifugal separation and a proposed technique based on selective laser ionization but they have not found much practical use. In general these methods are expensive, whereas fissile material can be produced in a reactor at much lower cost. We have already indicated the plutonium-producing reaction in Eq. (5.17). Another fissile material is ${}^{233}_{92}\mathrm{U}$ which can be produced from thorium when it is bombarded by fast (1-2 MeV) neutrons

${}^{1}_{0}n + {}^{232}_{90}\mathrm{Th} \to {}^{233}_{90}\mathrm{Th} \to {}^{233}_{91}\mathrm{Pa} + e^- + \bar{\nu}_e$

${}^{233}_{91}\mathrm{Pa} \to {}^{233}_{92}\mathrm{U} + e^- + \bar{\nu}_e$ (5.20)

${}^{233}\mathrm{U}$ has a lifetime $\tau \sim 160\,000$ years. Producing fuel in a reactor has the advantage of preserving energy resources and in particular of using the much more abundant elements ${}^{238}\mathrm{U}$ and ${}^{232}\mathrm{Th}$ as primary sources. Reactors used in this mode are named breeders. The obvious disadvantage is that the fissile material that is produced is of very high grade and thus suitable for use in weapons.

Fig. 5.11. Power level curve for a reactor as a function of time; note the normal powering up and shutdown segments. [Figure not reproduced: power plotted against time, with segments labeled 'Normal power up' and 'Normal shutdown'.]

In the early days of nuclear power it was believed that fission reactors could satisfy the world energy demand well into the future. This hope has not materialized for two reasons. First, the available uranium reserves are limited, and secondly the construction of nuclear reactors and the necessary safety measures are expensive both in energy and capital. For instance the total reserves of uranium in the U.S. are estimated at $M_R \sim 2.5 \times 10^7$ tons, including highly uneconomical mining. From our example we estimated that 40 tons of 3% enriched uranium could yield about 1 GW-year of electric power. Since natural uranium contains only 0.7% of ${}^{235}\mathrm{U}$, the reserves are equivalent to $W_T \sim 1.5 \times 10^5$ GW-years. Given that the U.S. energy consumption is of order $P \approx 2 \times 10^3$ GW, we see that the uranium supply would last for ~75 years. This estimate is by far too optimistic because only 10% of the uranium reserves can be easily mined; furthermore reactors are developed only for electric power generation. Recovery of unspent fuel could double the above figure and in particular a breeder program would alleviate the need for an extended supply of uranium.
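The reserve estimate can be followed step by step in the Python sketch below. All inputs are the figures quoted in the text; the intermediate quantity, the amount of natural uranium needed per GW-year, is our own way of organizing the arithmetic, and the names are illustrative.

```python
reserves_tons = 2.5e7                 # U.S. uranium reserves, natural uranium
enriched_tons_per_gw_year = 40.0      # 3% enriched uranium per GW-year (electric)
enrichment = 0.03                     # 235U fraction in reactor fuel
natural_abundance = 0.007             # 235U fraction in natural uranium
consumption_gw = 2e3                  # U.S. energy consumption, GW

# Natural uranium containing the same amount of 235U as 40 tons of 3% fuel.
natural_tons_per_gw_year = enriched_tons_per_gw_year * enrichment / natural_abundance

total_gw_years = reserves_tons / natural_tons_per_gw_year
supply_years = total_gw_years / consumption_gw

print(f"natural uranium per GW-year = {natural_tons_per_gw_year:.0f} tons")
print(f"total energy = {total_gw_years:.2e} GW-years")
print(f"supply lasts ~{supply_years:.0f} years")
```

With unrounded intermediate values the result is slightly below the ~75 years quoted, which uses the rounded figure of $1.5 \times 10^5$ GW-years.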

We can compare the potential energy from uranium to the existing fossil fuel resources. In the U.S. the reserve estimates are

oil reserves ~ $500 \times 10^9$ barrels → $3 \times 10^{21}$ J
coal reserves ~ $1500 \times 10^9$ tons → $4 \times 10^{22}$ J
uranium reserves ~ $2.5 \times 10^7$ tons → $5 \times 10^{21}$ J

This energy content is to be compared to the present total yearly energy consumption in the U.S. which is ~$7 \times 10^{19}$ J. Note that coal could satisfy our energy needs for several centuries provided we can overcome the ecological consequences of its exploitation (greenhouse effect, acid rain) and use it in suitable form in transportation as an alternative to liquid fuels. There are also apparently vast reserves of natural gas on earth. The final determining factor in the choice of an energy source is the cost of production and one point of view holds that inevitably we will have to return to nuclear power in spite of the great hazards that it poses.


5.6 Radioactivity

By radioactivity, we mean the spontaneous emission of penetrating radiation by certain nuclei, a phenomenon first observed in 1896 by H. Becquerel. We discussed in Section 5.3 that the radiation may consist of γ-rays, that is high frequency em radiation, of electrons or positrons (β-rays), or of helium nuclei (α-particles). Typically the emitted photons or particles have energies in the MeV range which allows them to penetrate through matter. However, as the particles traverse the material, they interact with individual atoms and ionize them, break up molecules and on occasion cause a nuclear interaction. These interactions result in changes in the structure of the material and are particularly detrimental to living systems.

For X-rays and γ-rays the primary type of interaction is ionization, that is the creation of electron-ion pairs; the energy necessary to ionize the atoms is supplied by the radiation. The Roentgen is defined as the amount of radiation that produces in 1 cm$^3$ of air an ionization equal to 1 esu = $3.3 \times 10^{-10}$ coulombs, namely $2 \times 10^9$ electron-ion pairs. Since the density of air is ~$1.29 \times 10^{-3}$ g/cm$^3$, 1 Roentgen of radiation produces ~$1.6 \times 10^{12}$ electron-ion pairs/g of air. The ionization potential of air can be taken as $E_I \sim 30$ eV so that 1 R corresponds to a deposition of $48 \times 10^{12}$ eV/g in air, or

1 R → 78 ergs/g in air
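The chain of conversions leading from the definition of the Roentgen to an energy deposition can be traced with the Python sketch below. The inputs are those quoted in the text; the electron charge and the eV-to-erg factor are assumed rounded constants, and the names are illustrative.

```python
ELECTRON_CHARGE = 1.6e-19     # coulombs (assumed rounded constant)
EV_TO_ERG = 1.6e-12           # ergs per eV (assumed rounded constant)

charge_per_cm3 = 3.3e-10      # coulombs per cm^3 of air, i.e. 1 esu
air_density = 1.29e-3         # g/cm^3
ionization_energy = 30.0      # eV per electron-ion pair

pairs_per_cm3 = charge_per_cm3 / ELECTRON_CHARGE
pairs_per_gram = pairs_per_cm3 / air_density
ev_per_gram = pairs_per_gram * ionization_energy
ergs_per_gram = ev_per_gram * EV_TO_ERG

print(f"pairs per cm^3 = {pairs_per_cm3:.2e}")
print(f"pairs per gram = {pairs_per_gram:.2e}")
print(f"energy deposited = {ev_per_gram:.2e} eV/g = {ergs_per_gram:.0f} ergs/g")
```

With these rounded constants the result agrees with the quoted figure to within a few percent; the small difference reflects rounding of the inputs.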

Because we are interested in the effects of radiation in living systems, the standard unit is the rad which is defined as the amount of X-ray radiation that deposits 100 ergs/g in tissue

1 rad → 100 ergs/g in tissue

Finally, since different types of radiation have different biological efficiency, a new unit is introduced; roentgen equivalent mammal, or rem. One rem of radiation causes the same biological effect when absorbed by a mammal as would 1 rad of X-rays. The radiation dose in rem equals the radiation dose in rad multiplied by the relative biological effectiveness (RBE). Charged particles and fast neutrons have RBE ~ 10 because they are more effective than X-rays in causing biological damage.

While the roentgen is a unit of radiation, we need also a measure for the intensity of a radioactive source. One Curie (Ci) is defined as a source that undergoes $3.7 \times 10^{10}$ disintegrations per second. This is the activity of 1 g of radium (including the activity of the surviving radium decay products). There is no exact conversion between the curie and the roentgen because the relation depends on the type and energy of the radiation involved, as well as on the geometry. However, an approximate connection is that 1 Ci of ${}^{60}\mathrm{Co}$ will produce at a distance of 1 m a radiation of 1.3 rem/hr.

1 Ci = $3.7 \times 10^{10}$ disintegrations/s
1 Ci of ${}^{60}\mathrm{Co}$ at 1 m = 1.3 rem/hr (5.21)

The above units of radiation have been in use since the discovery of radiation phenomena and are of historic origin. Recently, MKS units have been adopted to replace the previous units as follows: one becquerel (Bq) measures the intensity of a radioactive source and corresponds to one disintegration per second; therefore

1 Ci = $3.7 \times 10^{10}$ Bq

The radiation dose is measured in gray (Gy) which is equal to 100 rad. It corresponds to the deposition of 1 J of energy per kg of tissue; therefore

1 Gy = 100 rad

Finally the sievert (Sv) replaces the rem, where

1 Sv = 100 rem

Radiation is attenuated when passing through matter; for X-rays or γ-rays the intensity after traversing a distance l is given by

$I = I_0 e^{-l/\lambda}$

Here λ is called the radiation length; for MeV γ-rays passing through concrete λ ~ 10 cm, whereas for lead λ = 0.5 cm. Charged particles have a definite range to which they can penetrate; typically they lose 2 MeV per (g cm$^{-2}$) of material traversed. Thus a proton of 100 MeV kinetic energy will not penetrate beyond approximately 50 g/cm$^2$ which corresponds to 5.5 cm of copper.
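Both statements lend themselves to a quick numerical check. In the Python sketch below the radiation lengths and the energy loss of 2 MeV per g/cm$^2$ are taken from the text, while the density of copper (8.96 g/cm$^3$) and the attenuation factor of 1000 are assumed for illustration; the function names are hypothetical.

```python
import math


def transmitted_fraction(thickness_cm, radiation_length_cm):
    """I / I_0 = exp(-l / lambda) for X-rays or gamma-rays."""
    return math.exp(-thickness_cm / radiation_length_cm)


def thickness_for_attenuation(factor, radiation_length_cm):
    """Thickness that reduces the intensity by the given factor."""
    return radiation_length_cm * math.log(factor)


def range_cm(kinetic_energy_mev, density_g_cm3, loss_mev_per_g_cm2=2.0):
    """Range of a charged particle losing energy at a constant rate."""
    return kinetic_energy_mev / loss_mev_per_g_cm2 / density_g_cm3


COPPER_DENSITY = 8.96   # g/cm^3 (assumed value)

concrete_cm = thickness_for_attenuation(1000.0, 10.0)
lead_cm = thickness_for_attenuation(1000.0, 0.5)
proton_range_cu = range_cm(100.0, COPPER_DENSITY)

print(f"factor-1000 shield: {concrete_cm:.0f} cm concrete or {lead_cm:.1f} cm lead")
print(f"100 MeV proton range in copper: {proton_range_cu:.1f} cm")
```

The constant energy-loss rate is only a rough approximation, since the loss rate rises as the particle slows down, but it reproduces the range quoted in the text.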

As noted previously the lifetime for radioactive decay varies by many orders of magnitude for different nuclei. If the lifetime is τ, then given a sample of $N_0$ nuclei, there will be only N(t) surviving after a time t.

$N(t) = N_0 e^{-t/\tau}$ (5.21')

Thus the intensity of the source is

$-\frac{dN}{dt} = \frac{N_0}{\tau} e^{-t/\tau} = \frac{N(t)}{\tau}$ (5.21'')

Sources with short lifetimes have the advantage of decaying rapidly, even though the intensity is high. Sources with long lifetime on the other hand are less intense, but can last for an extremely long time.
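The trade-off between lifetime and intensity can be made concrete with a minimal Python sketch of the exponential decay law. The sample size and the two lifetimes below are hypothetical values chosen only for illustration, and the function names are our own.

```python
import math

SECONDS_PER_YEAR = 3.156e7    # assumed conversion constant
CURIE = 3.7e10                # disintegrations per second in 1 Ci


def surviving(n0, t, tau):
    """N(t) = N0 exp(-t / tau)."""
    return n0 * math.exp(-t / tau)


def activity(n0, t, tau):
    """Source intensity -dN/dt = N(t) / tau, in disintegrations per second."""
    return surviving(n0, t, tau) / tau


n0 = 1e20                               # hypothetical number of nuclei
tau_short = 1.0 * SECONDS_PER_YEAR      # hypothetical 1-year lifetime
tau_long = 1000.0 * SECONDS_PER_YEAR    # hypothetical 1000-year lifetime

for label, tau in [("short-lived", tau_short), ("long-lived", tau_long)]:
    a0 = activity(n0, 0.0, tau) / CURIE
    a10 = activity(n0, 10.0 * SECONDS_PER_YEAR, tau) / CURIE
    print(f"{label}: {a0:.3g} Ci initially, {a10:.3g} Ci after 10 years")
```

For equal numbers of nuclei the initial activities differ by the ratio of the lifetimes, while after ten years the short-lived sample has essentially disappeared.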

The danger that radioactive substances pose to living organisms is not only from direct exposure but also from the possibility of ingestion of long-lived radioactive materials. These substances can then locate in various parts of the body and subject it to continuous radiation. The short-term effects of radiation on humans are fairly well documented: for whole body exposure received in a few hours the following effects occur

10 rem detectable blood changes
200 rem injury and some disability
400 rem 50% deaths in 30 days
600 rem 100% deaths in 30 days

The present safety standards require that occasional exposure of the public not exceed 0.5 rem/year. For radiation workers the limit is 3 rem in any consecutive 13 weeks and less than 5 rem/year. Cosmic rays and natural radioactivity contribute about 100 mrem/year while man-made sources contribute another 70 mrem/yr. The maximum permissible quantities of ingested radioisotopes depend on the lifetime of the isotope, but are typically of the order of 10-100 μCi.

We can apply these considerations to the operation of a nuclear reactor with the parameters used in the example of the previous section. We had found that the fission rate was $R_F = 3 \times 10^{19}$/s; thus the radioactive intensity of the core is about $I = 10^9$ Ci. Without shielding, the flux at a distance of 100 m would be $F \sim 10^5$ rem/hour (see Eq. (5.21)). Since 600 rem is a lethal dose, exposure to the reactor flux for $\Delta t = 6 \times 10^{-3}$ hr ~ 20 s would bring certain death. This example slightly exaggerates the problem because in practice most of the radioactive materials are contained inside the fuel elements and it is γ-rays and neutrons that are the primary components of the radiation; it shows however the importance of shielding and the potential for exposure to those working near the core.
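This estimate can be retraced with the Python sketch below. It assumes, as the text implicitly does, that the core radiates like a point source of ${}^{60}\mathrm{Co}$ obeying Eq. (5.21), and that the dose rate falls off as the inverse square of the distance; the names are illustrative.

```python
CURIE = 3.7e10                  # disintegrations per second in 1 Ci
REM_PER_HR_PER_CI_AT_1M = 1.3   # Eq. (5.21), value for 60Co

fission_rate = 3e19             # fissions per second, from the text
distance_m = 100.0
lethal_dose_rem = 600.0

activity_ci = fission_rate / CURIE
# Inverse-square scaling from the 1 m reference distance (an assumption).
dose_rate = REM_PER_HR_PER_CI_AT_1M * activity_ci / distance_m**2   # rem/hr
lethal_time_s = lethal_dose_rem / dose_rate * 3600.0

print(f"core activity  = {activity_ci:.1e} Ci")
print(f"dose rate at 100 m = {dose_rate:.1e} rem/hr")
print(f"time to lethal dose = {lethal_time_s:.0f} s")
```

The result is an order-of-magnitude figure only, since the spectrum of the core is not that of ${}^{60}\mathrm{Co}$ and absorption in the air and in the reactor structure is ignored.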

When a nucleus fissions the fragments are neutron rich and therefore radioactive. They return to the region of stable nuclei primarily by $\beta^-$ decay but also by neutron emission. Among the many isotopes produced during fission are radioactive gases such as xenon-135 (${}^{135}_{54}\mathrm{Xe}$), which has a 9.2 hour lifetime, and long-lived elements such as cesium-137 (${}^{137}_{55}\mathrm{Cs}$), or strontium-90 (${}^{90}_{38}\mathrm{Sr}$), the latter having a lifetime of 28 years. If these elements escape from the reactor they will be distributed not only over the immediate area, but can also be transported over long distances in the form of radioactive fallout. For instance during the Chernobyl accident in 1986 the release of radioactive material amounted to $10^7$ Ci on the first day and persisted at the level of ~$2 \times 10^6$ Ci/day for several days; this amounted to about 3% of the total activity of the core. Since the maximum burden for ingestion by humans is of order 10-100 μCi, this amount of activity posed a considerable threat to human life in the area near the reactor. Similar considerations apply to the disposal of the used fuel elements and other nuclear wastes because of the danger that the long-lived isotopes will, over many years, find their way back into the ecosystem.


5.7 Controlled fusion

If one takes a long-term view of the future of our technological civilization - on the scale of millennia - one concludes that fusion must be made to work. The fuel is limitless, there is no CO₂ produced and there are no fusion products to be disposed of. Of course no energy creation process is totally risk-free. A fusion reactor would be radioactive and there would still be a waste disposal problem, but the reactor cannot 'run away' as in the case of fission. Thus it is important to persist in attacking the challenging problems that have stood in the way of controlled fusion for so many years.

The energy of the sun is produced from the fusion of its hydrogen into helium. It is therefore instructive to review the processes that take place in the sun and lead to a continuous, as contrasted to an explosive, fusion reaction. There is convincing evidence to indicate that the first step in the fusion cycle is the production of deuterons through

$$p + p \to d + e^+ + \nu_e \qquad Q = 0.93\ \text{MeV} \qquad (5.22)$$

Once deuterons are produced there are various paths that can lead to the tightly bound $^4_2$He nucleus. For instance

$$p + d \to {}^3_2\text{He} + \gamma \qquad Q = 5.49\ \text{MeV} \qquad (5.23)$$

$${}^3_2\text{He} + {}^3_2\text{He} \to {}^4_2\text{He} + p + p \qquad Q = 12.86\ \text{MeV} \qquad (5.23')$$

In this case, six protons have fused into one $^4_2$He nucleus with 2 protons and 2 positrons free in the final state. The total energy release is 25.7 MeV. Another possible path is

$$d + d \to {}^3_2\text{He} + n \qquad Q = 3.3\ \text{MeV} \qquad (5.24)$$

followed by the reaction of Eq. (5.23'). The (d + d) reaction leads also to the formation of tritons ($^3_1$H), for which we will use the symbol t

$$d + d \to t + p \qquad Q = 4.0\ \text{MeV} \qquad (5.24')$$

This is then followed by the reaction

$$d + t \to {}^4_2\text{He} + n \qquad Q = 17.6\ \text{MeV} \qquad (5.25)$$

All of the above reactions, except for the first one, involve only the nuclear force and have a reasonably large cross-section. The first reaction however (Eq. (5.22)) involves a positron and thus the weak force, therefore it is not very probable. It is the probability for the weak reaction that sets the rate at which the sun burns its fuel. In stars larger than our sun another cycle involving heavier nuclei, the carbon cycle, is the primary mechanism for the fusion of protons to $^4_2$He.
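The total of 25.7 MeV quoted for the path through Eqs. (5.22), (5.23) and (5.23') follows from counting how often each step must occur. A minimal sketch of that bookkeeping is given here; the constant and function names are illustrative only.

```python
# Q-values in MeV, copied from Eqs. (5.22), (5.23) and (5.23') of the text.
Q_PP = 0.93         # p + p -> d + e+ + nu_e
Q_PD = 5.49         # p + d -> 3He + gamma
Q_HE3_HE3 = 12.86   # 3He + 3He -> 4He + p + p


def chain_energy_mev():
    """Energy released in forming one 4He nucleus along this path."""
    # The last step consumes two 3He nuclei, so the first two steps
    # must each occur twice (six protons in, two protons out).
    return 2 * Q_PP + 2 * Q_PD + Q_HE3_HE3


print(f"Total energy release: {chain_energy_mev():.1f} MeV")
```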

For any of the above reactions to take place the charged particles must have sufficient energy to overcome the repulsive Coulomb force in order to reach within the range of the nuclear force. As an example we calculate the electrostatic potential energy between a deuteron and a triton at a distance $r = 10$ F (1 F = $10^{-15}$ m) from one another; setting $Q_1 = Q_2 = 1$ we obtain

$$V = \frac{1}{4\pi\epsilon_0}\,\frac{Q_1 Q_2 e^2}{r} \sim 150\ \text{keV}$$

Each particle must have kinetic energy (K.E.) $\sim 75$ keV. This energy will be due to the thermal motion of the particles. Thus by setting $kT_f = (\text{K.E.})_f$, we find a temperature

$$T_f \sim 10^9\ \text{K} \qquad (5.26)$$

This is much higher than the temperature of the interior of the sun, which has the value

$$T_\odot(\text{interior}) \sim 1.6 \times 10^7\ \text{K} \qquad (5.26')$$

The apparent contradiction between Eqs. (5.26) is resolved if we recall that the Maxwellian thermal distribution has a long tail, as shown in Fig. 5.12. Thus, even when the mean value $\langle kT \rangle = 0.75$ keV there are sufficient particles with kinetic energy $\sim 10^3 \langle kT \rangle$ to sustain the reaction of Eq. (5.25). At these temperatures the atoms are completely ionized and the sun's interior is a plasma held together by the gravitational force.
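The barrier height and the corresponding temperature can be reproduced numerically. The sketch below assumes the standard values $e^2/4\pi\epsilon_0 \approx 1.44$ MeV F and $k \approx 8.617 \times 10^{-5}$ eV/K, which are not quoted in the text; the function names are illustrative.

```python
# Assumed constants (standard values, not given in the text).
E2_OVER_4PI_EPS0 = 1.44e3   # e^2 / (4 pi eps0) in keV * fermi
K_BOLTZMANN = 8.617e-8      # Boltzmann constant in keV / K


def coulomb_energy_kev(r_fermi, q1=1, q2=1):
    """Electrostatic potential energy of charges q1*e and q2*e at separation r."""
    return q1 * q2 * E2_OVER_4PI_EPS0 / r_fermi


def temperature_for_energy(energy_kev):
    """Temperature T (in K) at which kT equals the given energy."""
    return energy_kev / K_BOLTZMANN


barrier = coulomb_energy_kev(10.0)        # deuteron and triton at 10 F
per_particle = barrier / 2.0              # shared equally by the two particles
print(f"Barrier: {barrier:.0f} keV, per particle: {per_particle:.0f} keV")
print(f"Equivalent temperature: {temperature_for_energy(per_particle):.1e} K")
```

The result, somewhat below $10^9$ K, agrees with the order-of-magnitude estimate of Eq. (5.26).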

The rate at which a reaction proceeds is usually given by the product of the cross-section and of the incident flux. In the case of fusion we do not have a well-defined beam and it is preferable to give the rate of reactions per unit volume; we introduce the symbol $R$ for the number of fusions per cm$^3$ per s. If the density (cm$^{-3}$) of particles of species 1 is $n_1$ and that of species 2 is $n_2$ then we write

$$R = n_1 n_2 \langle \sigma v \rangle_{12} \qquad (5.27)$$

where $\langle \sigma v \rangle_{12}$ is the reactivity and is expressed in (cm$^3$/s). It is the average value of the cross-section multiplied by the relative velocity between the particles of the two species. Note that the thermal velocity is not unique and that the cross-section depends exponentially on the velocity; thus we must use an average value such as $\langle \sigma v \rangle_{12}$. The reactivity for the d-d, d-t and d-$^3$He reactions (Eqs. (5.24, 25)) is shown in Fig. 5.13 as a function of the kinetic energy of the particles.

Fig. 5.12. Maxwell distribution of the velocity $v$ of particles in thermal equilibrium at temperature $T$. [Axes: $dN/dv$ versus $v$; the particles in the high-velocity 'tail' are marked.]

From the data in Fig. 5.13 we see that the (d-t) reaction is the most favored one at the lower temperatures. As a numerical example let us assume the peak reactivity, $\langle \sigma v \rangle_{12} = 10^{-15}$ cm$^3$/s and a density of particles, $n_1 = n_2 = n = 10^{14}$/cm$^3$. This would result in a fusion rate

$$R = n_1 n_2 \langle \sigma v \rangle_{12} = 10^{13}/\text{cm}^3\text{-s}$$

Thus, it would take 10 seconds for all the particles to fuse. Therefore the plasma must be contained for that length of time in order to fuse all the fuel. In general, if the containment time is $\tau$, we want the fusion rate density $R$ to be such that $R\tau \geq n$, or

$$n\tau \geq \frac{1}{\langle \sigma v \rangle} \qquad (5.28)$$
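The numerical example leading to Eq. (5.28) can be sketched as follows. The function names are illustrative, and the burn time uses the same simple estimate as the text, $n/R$, which ignores the depletion of the fuel as it burns.

```python
def fusion_rate(n1, n2, reactivity):
    """Fusions per cm^3 per second, R = n1 * n2 * <sigma v>, Eq. (5.27)."""
    return n1 * n2 * reactivity


def burn_time(n, reactivity):
    """Simple estimate n / R of the time needed to fuse all the fuel (seconds)."""
    return n / fusion_rate(n, n, reactivity)


n = 1e14            # particles per cm^3, as in the text
reactivity = 1e-15  # cm^3/s, peak d-t value assumed in the text

print(f"R = {fusion_rate(n, n, reactivity):.1e} per cm^3 per s")
print(f"Containment time to fuse all fuel: {burn_time(n, reactivity):.1f} s")
```

Since the burn time is $1/(n \langle \sigma v \rangle)$, doubling the density halves the containment time that is needed, which is the content of Eq. (5.28).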

Fig. 5.13. The reactivity (cross-section times velocity) for various fusion reactions as a function of particle energy. [Logarithmic axes: reactivity versus ion energy from 10 to 1000 keV.]

The requirements of Eq. (5.28) are to some extent contradictory and point out the key problem in achieving controlled fusion. One needs high density, but as soon as the fuel ignites the available energy disperses it, lowering the density in a very short time. In fact, the energy released by the fusion reaction is proportional to $n^2\tau$

$$E_F = a n^2 \tau$$

whereas the energy required to heat the fuel is proportional to $n$

$$E_H = b n$$

with $a$, $b$ coefficients that can be calculated from the basic properties of the reaction. If the ratio

$$Q = \frac{E_F}{E_H} = \frac{a}{b}\, n\tau \qquad (5.29)$$

exceeds unity there is energy gain; equivalently

$$n\tau > \frac{b}{a} \qquad (5.30)$$

This relation is called the Lawson criterion and for the d-t reaction $b/a \sim 10^{14}$ cm$^{-3}$ s. The Lawson criterion is less stringent than the inequality of Eq. (5.28) where it was assumed that all the fuel would fuse. Nevertheless, we must heat the plasma to temperatures of order $T \sim 1 \times 10^8$ K and contain it for about one second. The typical pressure in the plasma is of order $\sim 10^{-5}$ atmospheres, which corresponds to the density of $10^{14}$ atoms/cm$^3$.
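A short sketch of how the Lawson criterion of Eq. (5.30) might be applied in practice is given below. It uses only the d-t threshold quoted in the text; the function names are illustrative.

```python
LAWSON_DT = 1e14  # b/a for the d-t reaction in cm^-3 s, as quoted in the text


def lawson_product(n, tau):
    """The product n * tau in cm^-3 s."""
    return n * tau


def energy_gain_possible(n, tau, threshold=LAWSON_DT):
    """True if n * tau exceeds the threshold b/a of Eq. (5.30)."""
    return lawson_product(n, tau) > threshold


def min_confinement_time(n, threshold=LAWSON_DT):
    """Confinement time (s) at which n * tau just reaches the threshold."""
    return threshold / n


print(f"Minimum confinement time at n = 1e14: {min_confinement_time(1e14):.1f} s")
print(f"Gain possible for n = 1e14, tau = 0.1 s: {energy_gain_possible(1e14, 0.1)}")
```

For a density of $10^{14}$/cm$^3$ the threshold is reached at about one second, ten times shorter than the 10 s needed to fuse all the fuel in the earlier example.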

Various methods have been proposed for confining the hot plasma based on the use of magnetic fields in various configurations. For instance charged particles will spiral around the magnetic field lines as shown in Fig. 5.14(a). If the magnetic field is configured as a torus, then the particles will be trapped as indicated in (b) of the figure. While this idea is simple in principle, its practical realization involves large, complex and highly sophisticated apparatus. The heating of the plasma requires special techniques and obviously the particles must not be allowed to touch the vacuum chamber wall.

Fig. 5.14. Confinement of charged particles by a magnetic field: (a) linear field, (b) toroidal field. [The path of the ions is indicated.]

Considerable effort is being devoted toward achieving controlled fusion, but the gain factor (see Eq. (5.29)) is still below unity. When $Q > 1$ is reached the released energy will have to be converted to useful electrical energy. This can be achieved by stopping the fusion products in a vessel surrounding the fusion chamber and using the heat to drive a thermal engine. Since the fusion products include neutrons (see Eqs. (5.22-25)), a possible use of a fusion reactor could be to breed fuel for fission reactors.

An alternative to magnetic confinement of the plasma is inertial fusion, where high density (solid) targets are bombarded by intense laser beams or by charged particle beams. These techniques must also reach $Q \geq 1$ and in addition compensate for the efficiency with which the laser or particle beams are produced. The fuel that is envisaged is deuterium and tritium. Deuterium is abundant and can easily be extracted from sea water. Tritium is radioactive with a lifetime of 12.3 years. It is produced in nuclear reactors where it is bred from the reaction $n + {}^6\text{Li} \to {}^4\text{He} + t$. The first task is to demonstrate in the laboratory that controlled fusion is feasible; beyond that, considerable engineering effort will be required to produce operating fusion reactors.

In 1989 it was reported in the press that fusion had been achieved at room temperature, by a simple chemical reaction. These reports turned out to be completely unfounded but there exists a known reaction where fusion takes place without reaching the high temperature needed to overcome the Coulomb barrier. This process is mediated by $\mu$-mesons, particles which are like heavy electrons ($m_\mu \sim 210\, m_e$) but which are short-lived; the lifetime of the muon is $\tau_\mu = 2 \times 10^{-6}$ s. The process is known as muon catalysis and was discovered in 1956 by the Alvarez group at Berkeley.

Because the negative muon has the same properties as the electron, if it is stopped in a mixture of deuterium (D$_2$) and tritium (T$_2$) gas, it can substitute for one of the electrons in these molecules, forming a 'muonic atom' or 'muonic molecule'. The following chain occurs

$$\mu^- + D \to d\mu + e^-$$

$$d\mu + t \to t\mu + d$$

However, because of the large mass of the muon, the 'Bohr orbit' in the $dt\mu$ muonic molecular ion is 200 times smaller than for the normal electronic ion. As a result, the deuteron and triton have a high probability of being close enough to fuse according to the reaction

$$dt\mu \to ({}^5\text{He})\mu \to {}^4\text{He} + \mu^- + n + 17.6\ \text{MeV}$$

Another way of thinking of this process is to realize that the $t\mu$ atom is electrically neutral (acts like a neutron) and thus can approach the deuteron nucleus without seeing a Coulomb barrier.

Muon catalyzed fusion is efficient and the same muon can initiate as many as 150 fusions before decaying. This corresponds to an energy release of $\sim 3$ GeV, which is near the break-even point for the production of the muon (from $\pi^-$-decay) in energetic particle collisions. Energy production by muon catalyzed fusion is still at an early stage of exploration but could be promising.
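The energy released per muon, and the shrinking of the orbit with the mass of the orbiting particle, can be checked with the short sketch below. It uses the d-t energy release of 17.6 MeV and the mass ratio $m_\mu \sim 210\, m_e$ from the text; the inverse scaling of the Bohr radius with mass is the standard hydrogen-like result, and the function names are illustrative.

```python
Q_DT_MEV = 17.6        # energy release per d-t fusion, from the text
MASS_RATIO = 210.0     # m_mu / m_e, as quoted in the text


def energy_per_muon_gev(fusions_per_muon):
    """Total energy (GeV) released by the fusions that one muon catalyzes."""
    return fusions_per_muon * Q_DT_MEV / 1000.0


def orbit_shrink_factor(mass_ratio=MASS_RATIO):
    """Factor by which the Bohr orbit shrinks; the radius scales as 1/mass."""
    return mass_ratio


print(f"Energy for 150 fusions: {energy_per_muon_gev(150):.2f} GeV")
print(f"Muonic orbit is about {orbit_shrink_factor():.0f} times smaller")
```

With 150 fusions per muon the release is about 2.6 GeV, which the text rounds to 3 GeV.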

5.8 Solar energy

In principle, solar energy is a non-polluting inexhaustible source of energy. Wind power, the growth of organic materials that can be used as fuels, as well as the fossil fuels available are derived indirectly from the energy delivered to the earth by the sun. The direct exploitation of solar energy is made difficult by its relatively low concentration, the day/night effect and the dependence on cloud cover coupled with the complexity of energy storage. At present, the conversion efficiency is relatively low so that solar power is significantly more expensive than power derived from fossil fuels. As a result the motivation for exploiting solar energy and for research on new conversion techniques is damped.

As indicated in Eq. (5.8) the solar constant has the value

$$S = 1.36\ \text{kW/m}^2 \qquad (5.8)$$

To estimate the average flux incident per unit surface of land area we must account for the inclination of the sun's rays with respect to the vertical, and consider the night/day effect, the atmospheric absorption and the cloud cover at any particular location. As a result the average flux incident per square meter of land area in the U.S. is

$$\bar{S} = 0.20\ \text{kW/m}^2 \qquad (5.31)$$

From Table 5.1 the total energy consumption in the U.S. in 1970 was approximately $7 \times 10^{19}$ J, corresponding to a power rating

$$P_{US} \sim 2 \times 10^9\ \text{kW}$$

If all of this power were to be obtained from the sun, even with 100% efficiency, the collector would have to cover an area of $10^{10}$ m$^2$, that is, a region 60 miles by 60 miles.
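The collector area follows directly from the power rating and the average flux of Eq. (5.31). The sketch below repeats the estimate and lets the efficiency be varied; the length of the year in seconds and the mile-to-meter conversion are standard values not given in the text, and the function names are illustrative.

```python
import math

SECONDS_PER_YEAR = 3.156e7   # assumed standard value
METERS_PER_MILE = 1609.344   # assumed standard value


def average_power_kw(energy_joules_per_year):
    """Average power (kW) corresponding to a yearly energy consumption."""
    return energy_joules_per_year / SECONDS_PER_YEAR / 1e3


def collector_area_m2(power_kw, flux_kw_per_m2=0.20, efficiency=1.0):
    """Land area needed to collect the given power at the given efficiency."""
    return power_kw / (flux_kw_per_m2 * efficiency)


def square_side_miles(area_m2):
    """Side of a square of the given area, in miles."""
    return math.sqrt(area_m2) / METERS_PER_MILE


print(f"Average power: {average_power_kw(7e19):.1e} kW")
area = collector_area_m2(2e9)                 # rounded power used in the text
print(f"Area at 100% efficiency: {area:.1e} m^2, "
      f"side {square_side_miles(area):.0f} miles")
area_10 = collector_area_m2(2e9, efficiency=0.10)
print(f"Area at 10% efficiency: {area_10:.1e} m^2, "
      f"side {square_side_miles(area_10):.0f} miles")
```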


The above estimate does not account for conversion efficiency, which presently averages $\sim 10\%$. One could place the collector in a favorable location, such as a desert, but then the power would have to be transported to the urban centers. The investment in a plant of that size as well as its maintenance could prove prohibitively costly. Exotic schemes, such as placing the collectors in space in a geostationary orbit, face similar problems of very large capital investment. On the other hand, use of solar energy to reduce in part the dependence on fossil and nuclear fuels is practical, and would add significantly to the conservation of the non-renewable sources of energy. It can be argued that short of the realization of abundant fusion energy, the present growth of global energy consumption cannot be sustained into the long-term future.

The most efficient exploitation of solar energy is by direct conversion to electrical power. This can be achieved by using 'photovoltaic cells'; silicon cells are extensively used on spacecraft and in other specialized applications. Another approach involves the focussing of sunlight onto small areas so that temperatures sufficiently high to drive thermal engines can be reached. Yet the simplest use of solar energy is in heating buildings and water for household use and in other applications of 'low quality' heat. We recall that these needs are almost 25% of the total energy consumption; of course where heating is most needed, the insolation is the lowest.

The basic collector. The central idea in a collector of solar energy is the same as in the heating of the earth by the sun. The visible light enters the collector and is absorbed, but the infrared emitted by the collecting surface is prevented from being radiated away. A sketch of a simple collector is shown in Fig. 5.15. Glass surfaces are adequate to trap the IR while admitting the visible; the absorber can be a coil through which water is circulated, and the whole assembly is insulated to reduce leakage by conduction and convection. Selective coating of the glass surfaces can improve the performance of the collector by reflecting wavelengths $\lambda > 2\ \mu$m.

Fig. 5.15. Simple solar collector for heating water in household applications. [Labels: glass surfaces, absorber, insulation.]

Simple collectors can achieve fairly high temperature differentials. When no energy is withdrawn the collector reaches its maximum temperature difference, $\Delta T_{max}$; this is shown below for various levels of insolation. The last column is calculated for a collector area of 50 m$^2$ and a 50% efficiency; in that case the temperature rise of the circulating water reaches half of $\Delta T_{max}$.

Insolation       $\Delta T_{max}$    Useful energy (50 m$^2$)
0.32 kW/m$^2$    48°C                8 kW
0.64 kW/m$^2$    85°C                16 kW
1.00 kW/m$^2$    120°C               25 kW

The average household power needs are estimated at 6 kW whereas typical commercial collector units have areas of 5 m$^2$. Thus for single dwellings it would be possible to find sufficient area to install adequate solar heating. In contrast, this could not work in densely populated cities.
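The last column of the table, and the number of collector units a household would need, can be reproduced as follows. This is only a sketch under the assumptions stated in the text (50% efficiency, 5 m$^2$ per commercial unit, 6 kW household demand); the function names are illustrative.

```python
import math


def useful_power_kw(insolation_kw_per_m2, area_m2=50.0, efficiency=0.5):
    """Useful power delivered by a collector of the given area and efficiency."""
    return insolation_kw_per_m2 * area_m2 * efficiency


def units_needed(demand_kw, insolation_kw_per_m2, unit_area_m2=5.0, efficiency=0.5):
    """Number of whole collector units needed to meet the demand."""
    per_unit = useful_power_kw(insolation_kw_per_m2, unit_area_m2, efficiency)
    return math.ceil(demand_kw / per_unit)


for insolation in (0.32, 0.64, 1.00):
    print(f"{insolation:.2f} kW/m^2 -> {useful_power_kw(insolation):.0f} kW useful, "
          f"{units_needed(6.0, insolation)} units for a 6 kW household")
```

Even at the lowest insolation in the table a household needs fewer than ten 5 m$^2$ units, which is why the text regards single dwellings as feasible.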

The basic concentrator. Sunlight is highly collimated and therefore it can be precisely focussed. Everyone has toyed with lenses or mirrors to show that sunlight can ignite a fire at the focus. The sun subtends at the earth an angle

$$\theta_\odot = 9 \times 10^{-3}\ \text{rad}$$

so that the image produced by a mirror of focal length $f$ has area $a = \pi (f\theta_\odot/2)^2$. If further the mirror has a radial aperture $R = f$, the resulting concentration of the solar flux is

$$C = \frac{A}{a} = \frac{\pi R^2}{\pi (R\theta_\odot/2)^2} = \frac{4}{\theta_\odot^2} \approx 5 \times 10^4 \qquad (5.32)$$

Thus, very high energy densities can be reached.

The technical difficulty with spherical concentrators is that the large area mirror must track the sun. Alternate schemes such as cylindrical troughs, or plane mirrors have been used in different installations. The most ambitious such plant was built by the French in the Pyrenees in 1970 and generated 1 MW of power. At that altitude (2000 m) an insolation of 1 kW/m$^2$ can be expected with sunshine for half of the days of the year. The collector was fixed as shown in Fig. 5.16 and consisted of 10 000 mirrors, each of area $0.4 \times 0.4$ m$^2$. A movable array of plane mirrors on the side of the mountain tracked the sun to illuminate the collector mirrors; it comprised some 12 000 mirrors of $0.5 \times 0.5$ m$^2$ area. Temperatures as high as $T = 3500$°C were easily obtained at the focus. However the support of the mirrors and other engineering problems made the operation of the plant unprofitable.
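The concentration factor of an ideal focussing mirror, as estimated in Eq. (5.32), depends only on the ratio of aperture to focal length and on the angle subtended by the sun. A minimal sketch of that estimate is given below; the function names are illustrative and the mirror is idealized (no losses, perfect figure).

```python
import math

THETA_SUN = 9e-3  # angle subtended by the sun at the earth, in radians (from the text)


def image_area(focal_length, theta=THETA_SUN):
    """Area of the solar image formed by a mirror of the given focal length."""
    return math.pi * (focal_length * theta / 2.0) ** 2


def concentration(aperture_radius, focal_length, theta=THETA_SUN):
    """Ratio of the mirror aperture area to the area of the solar image."""
    return math.pi * aperture_radius ** 2 / image_area(focal_length, theta)


# Radial aperture equal to the focal length, as assumed for Eq. (5.32).
print(f"Concentration for R = f: {concentration(1.0, 1.0):.2e}")
```

For $R = f$ the result is $4/\theta_\odot^2$, about $5 \times 10^4$, independent of the size of the mirror.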

An alternative to exact focussing of the sunlight is the use of fixed collectors as shown in Fig. 5.17. These are named 'Winston cones' and light incident on the aperture of the cone is guided to the collecting area, irrespective of its angle of entrance.

Photovoltaic cells. Solar energy is highly ordered, as compared to random thermal energy and for this reason it is possible to convert it directly to electrical energy. Visible photons can easily produce electron-hole pairs in a semiconductor where the energy gap is of order $E_g \sim 1$ eV (see Section 1.1). By using p-type silicon with an n-type layer on its surface we can construct a junction such that electrons liberated in the silicon will move toward the surface layer. This is indicated in Fig. 5.18 which gives the energy diagram for the junction (see also Section 1.4) with no bias applied. Electrons generated in the p-type material near the junction will cross over the junction into the n-type layer, producing a voltage difference between the two surfaces of the cell.

Fig. 5.16. High concentration solar collector installation in the French Pyrenees; note the dimensions of the mirror system. [Labels: sunlight, mirrors, mountain; indicated dimension 43 m.]

Fig. 5.17. A reflecting 'Winston' cone for trapping and collecting solar radiation.

The I-V characteristic for a typical silicon cell of $10 \times 10$ cm$^2$ area is shown in Fig. 5.19. It can support a voltage of 2.5 to 3 V and at an insolation of $\sim 1$ kW/m$^2$ it can deliver about 1.5 W; this corresponds to an efficiency of $\epsilon \sim 14\%$.

To reduce the cost, solar cells are manufactured from amorphous silicon, but even so they are still priced at ~$20/cell. To be economically competitive the price would have to be further reduced by a factor of 100. For instance, a 3.5 kW system built in the Papago Indian Reservation in Arizona cost $330 000 including the storage batteries. If the capital is to be amortized at 6% per year, the cost of the electrical energy comes to 70 cents per kWh; this is to be compared to commercial rates of ~10 cents per kWh. The lifetime of the cells is not known and furthermore present manufacturing techniques are still very energy intensive.
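The cell efficiency and the cost per kWh quoted above can be reproduced with simple arithmetic. The sketch below assumes that the 3.5 kW system delivers its rated power continuously throughout the year, which is an assumption made here for illustration and is not stated in the text; the function names are also illustrative.

```python
HOURS_PER_YEAR = 8760.0


def cell_efficiency(output_w, area_m2, insolation_w_per_m2):
    """Ratio of electrical output to the solar power incident on the cell."""
    return output_w / (area_m2 * insolation_w_per_m2)


def energy_cost_per_kwh(capital_dollars, yearly_rate, power_kw, capacity_factor=1.0):
    """Yearly capital charge divided by the energy produced in a year ($/kWh).

    capacity_factor = 1.0 assumes continuous output at rated power.
    """
    yearly_charge = capital_dollars * yearly_rate
    yearly_energy_kwh = power_kw * HOURS_PER_YEAR * capacity_factor
    return yearly_charge / yearly_energy_kwh


# 10 cm x 10 cm cell delivering 1.5 W at 1 kW/m^2.
print(f"Cell efficiency: {cell_efficiency(1.5, 0.01, 1000.0):.0%}")
# $330 000 system, 6% per year, 3.5 kW.
print(f"Energy cost: {100 * energy_cost_per_kwh(330_000, 0.06, 3.5):.0f} cents/kWh")
```

Both results land close to the rounded figures of the text; any capacity factor below one would raise the cost per kWh further.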

Fig. 5.18. Energy band diagram for a photovoltaic cell; the cell is a diode made of p-type and n-type semiconductor with an energy gap $E_g \sim 1$ eV. [Label: electrons.]

Fig. 5.19. I-V characteristic for a typical photovoltaic cell made of silicon. [Axes: current I from 0 to 600 mA versus voltage V from 0 to 3 volts.]


Generating stations in space. It has been proposed to place the converters in geostationary orbit and to transport the generated power to earth. This is an elegant solution to the insolation and area coverage problems, and the energy would be transported by a high power microwave link. The capital investment has been estimated at $2000/kW, approximately the same as for nuclear reactors; for the solar plant the largest part of the cost is in placing the generator in orbit. Notwithstanding the above arguments, the system does not seem practical. If we assume a 10 GW plant we will need at least 10 000 tons in orbit; at the present rate this corresponds to 300 trips of the shuttle! Furthermore, an efficient 10 GW continuous microwave link is a very difficult undertaking.

In conclusion, solar energy will certainly be used in the future to cover a part of the energy needs of our civilization. Yet it does not seem probable that it can fully replace our immediate dependence on fossil fuels. In particular for the long term future a civilization based on energy consumption will have to rely on thermonuclear fusion, rather than depend only on solar energy.

Exercises

Exercise 5.1

A nuclear reactor uses 1 ton of $^{235}$U per year.

(a) Assume 40% thermal efficiency conversion, and calculate the power in watts produced by the reactor.

(b) If the entire U.S. demand in energy were to be satisfied by such reactors, how much uranium would have to be mined per year? What fraction of the total U.S. reserves in uranium does this represent?

Exercise 5.2

The height of the orbit of geostationary satellites is 36 000 km.

(a) If such a satellite is to transmit at 1 W calculate the size of the solar panels needed. Assume reasonable numbers for solar power conversion efficiency.

(b) Calculate the size of the antenna of the satellite if it transmits at $f = 4$ GHz and the antenna gain is 500.

Exercise 5.3

Good grassland will support one 1000 lb cow per acre; the cow in turn produces 4000 lb of milk per year. Milk is 88% water, 4% fat (9 kcal/g) and 8% non-fat solids (4 kcal/g). Calculate the ratio of the energy content (calories) of the milk to that of the solar energy incident on the field in one year. This gives a measure of the energy conversion efficiency in agriculture.

Exercise 5.4

(a) Assume that the proton is unstable and decays with lifetime $\tau$. Calculate the number of protons in a human being and note that the radiation level is less than 10 mrem/yr which corresponds to less than 1 $\mu$Ci of radioactivity in the body. From these two facts find a limit for the lifetime $\tau$ of the proton. In addition:

(b) Compare with the age of the universe.

(c) Do you know what the experimental limit is on $\tau$?

NUCLEAR WEAPONS

6.1 Fission and fusion explosives

The realities of World War II led to the rapid development of nuclear technology in the U.S. culminating in the construction of the first nuclear weapons. The first nuclear explosion took place in the Alamogordo Desert in New Mexico on July 16, 1945 in a test code-named Trinity. Soon thereafter, on August 6, the first nuclear bomb was dropped over Hiroshima in Japan and totally destroyed the city, causing over 150 000 casualties. A second bomb was dropped on Nagasaki on August 9, and on August 14 Japan surrendered unconditionally thus ending W.W.II. Today the nuclear capability of the two superpowers (the U.S. and the U.S.S.R.) is more than a hundred thousand times that of W.W.II and many other nations have acquired nuclear weapons. Yet it is a sign of hope that no nuclear weapons have been used in hostilities since those first two explosions in 1945.

In the previous chapter and in particular in Sections 5.3 and 5.4 we discussed extensively the fission of heavy nuclei when bombarded by neutrons. In the fission of $^{235}$U, on the average 2.4 neutrons are released and if these neutrons can be used to initiate further fission, a chain reaction can develop. This process is exploited in nuclear reactors where, however, the reaction rate is kept constant. As already indicated by Eq. (5.18) the probability $P_F$ that a neutron will cause fission after traversing a length $\Delta l$ of the fuel is

$$P_F = 1 - e^{-n_F \sigma_F \Delta l} \approx n_F \sigma_F \Delta l \qquad (6.1)$$

where $n_F$ is the number density of fuel nuclei and $\sigma_F$ the fission cross-section. This probability competes with the probability that a neutron will escape from the fuel volume or that it will be absorbed in a non-fissioning reaction. Thus we can introduce an efficiency factor

$$\eta = \frac{P_F}{P_F + P_{NF}} \qquad (6.1')$$

where $P_{NF}$ is the probability for escape and non-fissioning absorption. If the average number of neutrons produced in every fission is $m$, then the multiplication factor (see Section 5.3) is given by

$$K = \eta m = \frac{P_F}{P_F + P_{NF}}\, m \qquad (6.2)$$

Recall that $P_{NF}$ depends on the geometry of the fuel volume but also on the quantity and type of the non-fissile material present.

In a weapon we want $K$ as large as possible, that is we must maximize $\eta$; to achieve this we wish to increase $P_F$ and decrease $P_{NF}$. Another necessary condition is that the reaction proceed quickly (in $10^{-6}$ to $10^{-3}$ s) in order to create an explosion. This imposes the condition that we use fast neutrons in the chain reaction since the moderation process takes $10^{-3}$ s for each generation of neutrons. However, the fission cross-section for fast neutrons is smaller than that for thermal neutrons, typically $\sigma_F(E \sim 1\ \text{MeV}) \sim 5 \times 10^{-24}$ cm$^2$. To compensate for the small cross-section the density of fissile material should be high (see Eq. (6.1)), which is achieved by using highly enriched fuel. Furthermore, the purer the fuel the smaller the probability of non-fission captures, reducing $P_{NF}$. Typically, if a value of $\eta = 0.5$ can be achieved (for fast neutrons), $K = 1.2$ leading to rapid neutron multiplication.

The efficiency factor $\eta$ depends on the dimensions of the fuel volume through both $P_F$ and $P_{NF}$. Thus for every type of fuel assembly when a certain mass is exceeded it will spontaneously undergo a fission chain reaction. This mass is called the critical mass and we gave a rough estimate for it in Section 5.3. The fuel used in fission weapons is either $^{235}$U or $^{239}$Pu and we give below estimates for the critical mass according to D. Schroeer (Science Technology and the Nuclear Arms Race, J. Wiley, New York, 1984). The efficiency $\eta$ can be increased by surrounding the fuel volume with materials such as beryllium, which reflect escaping neutrons back into the volume. Consequently, when reflectors are used criticality can be reached for smaller masses of fuel.

Table 6.1. Critical mass for different fissile materials

Uranium enriched in $^{235}$U by 20%     160 kg
Uranium enriched in $^{235}$U by 50%     68 kg
Uranium enriched in $^{235}$U by 100%    47 kg
Plutonium-239 (pure)                     19 kg

An explosion is the result of energy release in a very short time interval. One of the most powerful chemical explosives is TNT (trinitrotoluene, C$_6$H$_2$(CH$_3$)(NO$_2$)$_3$); the explosion of 1 kg of TNT releases approximately 1000 kilocalories of energy. This corresponds to $W \sim 4 \times 10^6$ J which is the energy consumed by a 40 W light bulb burning for one day. Thus it is not a large amount of energy, but its sudden release produces dire results. The yield of nuclear weapons is measured in terms of the TNT tonnage that would create similar effects. Here 1 ton = $10^3$ kg of TNT and we will use the notation kt and Mt as units of explosive yield.

The energy released in the fission of the $^{235}$U nucleus is $\sim 200$ MeV as compared with a few eV released in chemical reactions. Thus, for a given mass of explosive, nuclear materials release $10^7$ times more energy than conventional weapons. Furthermore the energy is released in a very short time as we show below. We designate by $\tau$ the time between the creation (birth) of a neutron and its subsequent absorption (leading to fission). For fast neutrons the time between 'generations' is of order $\tau \sim 10^{-8}$ s (as compared to $10^{-3}$ s for the moderated neutrons in a reactor). If the chain reaction starts at $t = 0$, then the number of nuclei that have fissioned by the time $t$ is

$$N(t) = N_0 K^{t/\tau} \qquad (6.3)$$

where $K$ is the multiplication factor.* Thus if $N_0 = 1$, the number of generations needed to completely fission $N$ nuclei is

$$\frac{t}{\tau} = \log(N)/\log(K) \qquad (6.3')$$

For 10 kg of fissile material, $N \sim 2.5 \times 10^{25}$ and if we take $K = 2$ the number of generations is $(t/\tau) \sim 85$. Thus the total time for the explosion is of order $t \sim 10^{-6}$ s. Even if we had chosen $K = 1.1$ the explosion time would have been only about seven times longer. Most of the energy is of course released by the last few generations of neutrons as the entire fuel mass fissions. Thus it is important to contain the fuel and prevent its dispersion as the explosion proceeds. This is achieved in part because of the inertia of the fuel and because of the very short explosion time. The addition of 'tamper materials' is designed to increase the inertia and thus the confinement of the fuel in order to achieve a more efficient explosion.
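The chain-reaction estimate of Eqs. (6.2) and (6.3') can be sketched numerically as follows. The sketch uses Avogadro's number (a standard value), assumes the 10 kg of fuel is pure $^{235}$U, and starts from a single fission ($N_0 = 1$) as in the text; the function names are illustrative.

```python
import math

AVOGADRO = 6.022e23      # nuclei per mole (standard value)
GENERATION_TIME = 1e-8   # seconds between fast-neutron generations, from the text


def multiplication_factor(eta, neutrons_per_fission):
    """K = eta * m, Eq. (6.2)."""
    return eta * neutrons_per_fission


def nuclei_in_mass(mass_grams, mass_number):
    """Number of nuclei in a given mass of a pure isotope."""
    return mass_grams / mass_number * AVOGADRO


def generations_needed(n_nuclei, k):
    """Number of generations t/tau to fission n_nuclei, Eq. (6.3') with N0 = 1."""
    return math.log(n_nuclei) / math.log(k)


n = nuclei_in_mass(1e4, 235)  # 10 kg, assumed to be pure 235U
for k in (2.0, 1.1):
    g = generations_needed(n, k)
    print(f"K = {k}: {g:.0f} generations, {g * GENERATION_TIME:.1e} s")
print(f"K for eta = 0.5, m = 2.4: {multiplication_factor(0.5, 2.4):.1f}")
```

Because the number of generations depends only logarithmically on $N$, even a multiplication factor barely above unity keeps the explosion time in the microsecond range.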

* The multiplication factor should be defined with reference to the intergeneration time $\tau$.

The weapon used against Hiroshima was constructed from 70% enriched $^{235}$U and, apparently, was assembled in two subcritical masses as shown conceptually in Fig. 6.1. The central cylindrical plug was fired by chemical explosives into a hollowed out sphere of $^{235}$U. When the two parts matched, critical mass was achieved and the assembly exploded. The yield of the weapon was $\sim 12$ kt of TNT.

Enrichment of natural uranium to a high concentration of $^{235}$U, such as needed for weapons grade material, is costly and labor intensive. On the other hand plutonium-239 is also fissile and can be produced with comparative ease in high flux reactors. It appears that most present day nuclear weapons use plutonium as the fissile material; the device exploded in the Trinity test and the bomb used against Nagasaki were $^{239}$Pu weapons. Because the plutonium is rather pure and because of its large fission cross-section the realization of the critical mass has to occur more rapidly than is possible by the 'gun-firing' technique illustrated in Fig. 6.1. Therefore chemical explosives are used to compress the plutonium core to almost twice its natural density in a very short time. The arrangement is shown in Fig. 6.2 where all segments must be detonated simultaneously and the explosives are so shaped as to create an inward-going shock-wave.

Fig. 6.1. Possible design of a nuclear fission device based on enriched $^{235}$U. The chemical explosive drives the two subcritical masses together. [Labels: cannon, trigger, chemical explosive.]

When the spherical core is compressed its volume changes from $V$ to $V'$, its density from $\rho$ to $\rho'$ and the radius from $r$ to $r'$; for instance if

$$V' = V/2, \quad \text{then} \quad r' = r/2^{1/3}, \quad \rho' = 2\rho \qquad (6.4)$$

From Eq. (6.1) we know that the critical length $l_c$ is proportional to $(1/\rho)$ and that the core will go critical if $r > l_c$. Before the implosion, $r/l_c = a < 1$ so that the weapon is subcritical. The implosion compresses the core to a radius $r' = r/2^{1/3}$ and reduces the critical distance to $l_c' = l_c/2$ so that the ratio $r'/l_c'$ becomes

$$\frac{r'}{l_c'} = \frac{r}{2^{1/3}(l_c/2)} = 1.6\,\frac{r}{l_c} = 1.6a \qquad (6.4')$$

Thus if $a$ is chosen such that $0.65 < a < 1.0$ the compressed core is super-critical and will explode. Note also that the compression technique reduces the amount of material needed to reach criticality by the square of the compression ratio.
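The scaling behind Eq. (6.4') generalizes to any compression ratio: the radius shrinks as the cube root of the density increase while the critical length shrinks in direct proportion to it. A minimal sketch, with illustrative function names, follows.

```python
def compressed_ratio(a, compression=2.0):
    """Ratio r'/l_c' after the density is increased by the factor `compression`.

    r' = r / compression**(1/3) and l_c' = l_c / compression,
    so r'/l_c' = a * compression**(2/3), where a = r/l_c before compression.
    """
    return a * compression ** (2.0 / 3.0)


def min_initial_ratio(compression=2.0):
    """Smallest a = r/l_c for which the compressed core becomes critical."""
    return compression ** (-2.0 / 3.0)


print(f"Gain in r/l_c for twofold compression: {compressed_ratio(1.0):.2f}")
print(f"Smallest subcritical ratio that goes critical: {min_initial_ratio():.2f}")
```

For a twofold compression the ratio increases by $2^{2/3} \approx 1.6$, so any core with $a$ above roughly 0.63 becomes super-critical, consistent with the range quoted in the text.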

The Nagasaki bomb had an explosive yield of 22 kt, namely an energy release of $W \sim 10^{14}$ J. If we assume that the core had a mass of 10 kg of $^{239}$Pu the expected energy release from the fissioning of all the core material is

$$W = \frac{6 \times 10^{23}}{239} \times [10^4\ \text{g}] \times 200\ \text{MeV} \sim 8 \times 10^{14}\ \text{J}$$

This would indicate that only 12% of the core fissioned before being dispersed by the power of the explosion. Modern weapons are built with higher efficiency.
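The efficiency estimate for the Nagasaki weapon can be retraced step by step. The sketch below takes 1 kg of TNT as 1000 kcal, as stated earlier in the chapter, and uses standard values for Avogadro's number and the MeV-to-joule conversion; the 10 kg core mass is the assumption made in the text, and the function names are illustrative.

```python
AVOGADRO = 6.022e23          # nuclei per mole (standard value)
JOULES_PER_MEV = 1.602e-13   # standard conversion
JOULES_PER_KT = 4.184e12     # 10^6 kg of TNT at 1000 kcal (4.184e6 J) per kg


def yield_joules(yield_kt):
    """Energy release corresponding to a yield given in kilotons of TNT."""
    return yield_kt * JOULES_PER_KT


def full_fission_energy(mass_grams, mass_number, mev_per_fission=200.0):
    """Energy (J) released if every nucleus in the given mass fissions."""
    nuclei = mass_grams / mass_number * AVOGADRO
    return nuclei * mev_per_fission * JOULES_PER_MEV


released = yield_joules(22.0)
available = full_fission_energy(1e4, 239)   # assumed 10 kg core of 239Pu
print(f"Released: {released:.1e} J, available: {available:.1e} J")
print(f"Fraction of core fissioned: {released / available:.0%}")
```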

Fig. 6.2. Possible design for the $^{239}$Pu implosion bomb; the carefully shaped chemical explosives compress the plutonium so as to reach critical density. (After D. Schroeer, Science Technology and the Nuclear Arms Race, J. Wiley, New York, 1984.) [Labels: explosion trigger signal, high-energy chemical explosives in two layers, explosion initiator, focused shock wave.]

The existence of a critical mass places an upper limit on the size and yield of fission nuclear explosives. There is no such limitation on the fusion of light nuclei, but in this case the nuclei must have very high relative energies to overcome the repulsive Coulomb barrier; that is, the fuel must be heated to high temperature. As was discussed in Section 5.7, the most favorable reaction is the fusion of a deuteron and a triton into helium-4 according to the reaction

$${}^2_1 d + {}^3_1 t \to {}^4_2\text{He} + {}^1_0 n + 17.6\ (\text{MeV}) \qquad (6.5)$$

Since both deuterium and tritium are gaseous at normal temperatures it was first believed that it would not be possible to reach the necessary density for a self-sustaining fusion reaction. However, the explosion of a fission weapon by the U.S.S.R. in 1949 convinced the U.S. government to initiate a program for the construction of a fusion weapon. The first test of a thermonuclear device took place at the Eniwetok Atoll in the Pacific in late 1952; it involved cryogenic d, t and had a yield of 10 Mt.

What has made fusion explosives practical is the use of lithium deuteride, LiD, as a fuel, a technique which, apparently, was first introduced by the Soviets. When bombarded by fast neutrons, lithium produces tritons according to the reaction

$${}^1_0 n + {}^6_3\text{Li} \to {}^4_2\text{He} + {}^3_1 t + 4.8\ (\text{MeV}) \qquad (6.6)$$

and in an environment of high temperature the tritons fuse with the deuterons as shown in Eq. (6.5). The necessary neutron flux and the high temperature $T \sim 10^7$ K are achieved by exploding a fission device in a LiD environment.

Fig. 6.3. Possible design for a thermonuclear weapon triggered by a fission explosive. (After H. Morland, The Secret that Exploded, Random House, New York, 1981.) [Labels: electric trigger; layered mix of $^{235}$U, $^{239}$Pu and D + T; layered mix of $^{235}$U, D, T, and Li; chemical high explosive; Be tamper; $^{238}$U tamper.]

The LiD fuel is solid, and since we are seeking an explosion, the necessary confinement time is small. Thus the Lawson criterion of Eq. (5.30) can be easily satisfied. The density of LiD is of order 1 g/cm$^3$, or $n \approx 8 \times 10^{22}$ nuclei/cm$^3$. For a confinement time $\tau \sim 10^{-6}$ s we find $n\tau \sim 10^{17}$ cm$^{-3}$ s, well in excess of the requirement $n\tau > 10^{14}$ cm$^{-3}$ s.

Fusion weapons, also called thermonuclear weapons or hydrogen bombs, always contain a fission device. One possible construction is shown in Fig. 6.3. The fuel consists of LiD, tritium gas (which must be replenished) as well as fissile material such as $^{239}$Pu or $^{235}$U. A plutonium fission explosive is used as a trigger. The X-rays from the fission trigger transfer the energy to the fuel igniting the fusion reaction. The whole assembly is encased in $^{238}$U which under the intense neutron flux fissions, contributing perhaps as much as 50% of the total yield. The containment of the fuel and the energy transfer between the components of the weapon are surely important details of the design. For strategic warheads the typical yield of fusion weapons is in the 1 Mt range.

6.2 The effects of nuclear weapons

The explosion of a nuclear weapon releases a large amount of energy in a very brief time interval. Following the explosion the energy is distributed roughly as follows

50% in the pressure blast
35% in the heat radiation
5% in prompt radiation
10% in delayed radiation

The pressure and heat are immediate effects that destroy structures, ignite fires, and, of course, kill humans. The resulting prompt and delayed radiation is also lethal and can poison the environment for a very long time.

The cost in human life depends on the area that has been targeted. As an indication of the enormity of destructive power we consider the casualties from the two bombs used in W.W.II.

             Yield    Killed    Injured
Hiroshima    12 kt    70 000    80 000
Nagasaki     22 kt    40 000    20 000

Many present day weapons have a 1 Mt yield and it is estimated that in a full scale war between the U.S. and the U.S.S.R. 5000 Mt of explosives will be used. To emphasize the annihilation potential of such an exchange we can make a linear extrapolation from W.W.II casualties to find that

$$ \frac{5 \times 10^{6}\ \mathrm{kt}}{34\ \mathrm{kt}} \times (210\,000) \approx 30\ \text{billion people} $$

would be killed or injured. Of course the linear extrapolation is inappropriate and the total population of the two nations is only $\sim 0.5 \times 10^{9}$ people. Nevertheless enough weapons have been built to eliminate the entire population of the earth several times over.
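The arithmetic of the linear extrapolation can be laid out as a short sketch. The dictionary layout and function name are illustrative; the yields and casualty counts are the ones tabulated above, and the 210 000 figure is the sum of killed and injured for both cities.

```python
# Sketch of the (deliberately naive) linear casualty extrapolation in the text.
# Data are the W.W.II figures tabulated above; names are illustrative.

ww2 = {
    "Hiroshima": {"yield_kt": 12, "killed": 70_000, "injured": 80_000},
    "Nagasaki": {"yield_kt": 22, "killed": 40_000, "injured": 20_000},
}


def linear_extrapolation(total_yield_kt: float) -> float:
    """Scale the combined W.W.II casualties linearly with explosive yield."""
    reference_yield = sum(city["yield_kt"] for city in ww2.values())
    reference_casualties = sum(
        city["killed"] + city["injured"] for city in ww2.values()
    )
    return total_yield_kt / reference_yield * reference_casualties


exchange_kt = 5000 * 1000  # 5000 Mt expressed in kt
estimate = linear_extrapolation(exchange_kt)
print(f"Linear extrapolation: {estimate:.2e} people")
```

The result, about $3 \times 10^{10}$, exceeds the world population several times over, which is exactly why the text calls the linear scaling inappropriate.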

Nuclear explosives reach a much higher temperature than chemical explosives and for this reason the dominant form of energy release is in the form of radiation. From Eq. (5.6) we know that the total radiated energy per unit area per second from a black body at temperature T is $S = \sigma T^{4}$ with $\sigma$ Stefan's constant. The energy density in the source is

$$ u = \frac{4}{c}\,S = 7.6 \times 10^{-16}\ T^{4}\ \ \mathrm{J/m^{3}} \quad (T\ \text{in K}) \qquad (6.7) $$

If we assume a temperature in the explosive of order $T \approx 4 \times 10^{7}$ K, we obtain $u \sim 10^{15}$ J/m$^3$, which for a volume of 1 m$^3$ is a significant part of the total energy produced in the explosion. In contrast, in a chemical weapon $T \approx 5 \times 10^{3}$ K and the radiation is insignificant in comparison to the kinetic energy of the explosion products.
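A brief numerical sketch shows how steeply the radiation energy density of Eq. (6.7) depends on temperature. The constants are standard values for the Stefan-Boltzmann constant and the speed of light; the function name is an illustrative choice.

```python
# Sketch: black-body radiation energy density u = (4/c) * sigma * T^4, Eq. (6.7).

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.998e8       # speed of light, m/s

COEFFICIENT = 4.0 * SIGMA / C  # about 7.6e-16 J m^-3 K^-4


def radiation_energy_density(temperature_k: float) -> float:
    """Energy density of black-body radiation in J/m^3."""
    return COEFFICIENT * temperature_k**4


u_nuclear = radiation_energy_density(4e7)   # nuclear explosive, text value of T
u_chemical = radiation_energy_density(5e3)  # chemical explosive, text value of T

print(f"coefficient = {COEFFICIENT:.2e} J m^-3 K^-4")
print(f"u(4e7 K) = {u_nuclear:.1e} J/m^3")
print(f"u(5e3 K) = {u_chemical:.2f} J/m^3")
```

The $T^4$ dependence separates the two cases by more than fifteen orders of magnitude, which is the point of the comparison in the text.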

About half of the radiation energy is immediately transferred to the surrounding air (which is opaque to X-rays) and gives rise to the pressure blast. An overpressure of 5 psi destroys frame houses, whereas 50 psi is lethal to humans.* The em radiation is rapidly absorbed giving rise to thermal radiation where deposition of 8 cal/cm$^2$ is sufficient to cause 2nd degree burns. For a weapon with a 1 Mt yield the following effects are expected within a radius R from the center of the explosion

R ≈ 1500 m      50 psi overpressure
R ≈ 5000 m      5 psi overpressure
R ≈ 13 000 m    8 cal/cm^2

Because of the intense heat at the point of explosion a column of hot air rises (since it is less dense than cold air) from the surface, and this creates the characteristic 'mushroom cloud' of nuclear explosions. There is a very strong draft inside the column and correspondingly strong surface winds as shown in Fig. 6.4(a). In (b) of the figure is shown the actual cloud from the 10 Mt explosion of the 'Mike' test at the Eniwetok Atoll in 1952. For a 1 Mt explosion the top of the cloud reaches 60 000 ft and therefore radioactive debris and ashes are carried into the upper atmosphere from where they can be transported over long distances.

* 14.7 psi (pounds per square inch) = 1 atmosphere $\sim 10^{5}$ N/m$^2$.


Furthermore, the intense surface winds cause fires to spread rapidly and to retain their intensity.

In the explosion process, the fuel becomes completely ionized and a large number of free electrons move with high velocity. This gives rise to an intense pulse of HF and VHF em radiation (EMP). This pulse is known to disrupt communications over long distances and to damage microcircuits.

Fig. 6.4. Thermonuclear explosion: (a) the cloud from a 1 Mt explosion, one minute after detonation. (After Glasstone and Dolan, The Effects of Nuclear Weapons, U.S. Government Printing Office, 1977.) (b) U.S. test shot of October 1952 at the Eniwetok Atoll; it demonstrated the feasibility of fusion weapons. [Figure labels in (a): updraft through center of toroid; toroidal circulation of hot gases; cool air being drawn up into hot cloud.]


In addition to the intense heat radiation, nuclear radiation in the form of X-rays and neutral and charged particles is also present. We discussed in Section 5.6 that ~500 rem will cause death in 50% of the exposed population. For a 1 Mt explosion the integrated dose due to prompt radiation will exceed 500 rem within a radius R ≈ 2500 m. More significantly, the fallout from the explosion will result in a 500 rem exposure in the first day due to latent radioactivity within a radius R ≈ 25 000 m. While a shelter with 1 m thick concrete walls could provide adequate protection against prompt radiation, survivors will have to deal with the problem of ground and food contamination.

The great danger from fallout is due to the long lived isotopes. Some of these are indicated in Table 6.2; the lifetime, the average time for retention in the body, and the maximum safe concentration in humans are indicated for each of the isotopes. In a fusion weapon the fallout comes from the fission trigger as well as from the other materials in the device that have become radioactive under the intense neutron flux. We can estimate the fallout from a 1 Mt weapon by assuming that a 10 kt trigger is used which implies $\sim 10^{25}$ fissions. Approximately 6% of these fissions lead to $^{90}$Sr, so that the amount of $^{90}$Sr produced is

$6 \times 10^{23}$ ($^{90}$Sr nuclei) or ~0.1 kg

The lifetime of $^{90}$Sr is $\tau = 28$ years $\sim 9 \times 10^{8}$ s, and hence the activity is

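$$ \frac{\Delta N}{\Delta t} = \frac{N}{\tau} = \frac{6 \times 10^{23}}{9 \times 10^{8}\ \mathrm{s}} \approx 7 \times 10^{14}\ \mathrm{s^{-1}} \approx 2 \times 10^{4}\ \mathrm{Ci} $$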

while this is less activity than released in the Chernobyl accident (see Section 5.6) it is a large amount as compared to the maximum safe concentration for humans.
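The activity estimate is simply the number of nuclei divided by the mean lifetime, converted to curies. The sketch below uses the standard definition 1 Ci = $3.7 \times 10^{10}$ decays/s; the function name is illustrative, and the inputs are the text's figures for $^{90}$Sr together with the one-day, $10^{26}$-nuclei case the text uses for short-lived isotopes.

```python
# Sketch: activity A = N / tau, expressed in curies.

DECAYS_PER_CURIE = 3.7e10  # definition of the curie, decays per second
SECONDS_PER_DAY = 86_400.0


def activity_curies(n_nuclei: float, mean_life_s: float) -> float:
    """Activity of n_nuclei with mean lifetime mean_life_s, in curies."""
    return n_nuclei / mean_life_s / DECAYS_PER_CURIE


# 90Sr from the fission trigger: 6e23 nuclei, tau ~ 9e8 s (text values).
sr90_ci = activity_curies(6e23, 9e8)

# Short-lived products: 1e26 nuclei with tau = 1 day (text values).
short_lived_mci = activity_curies(1e26, SECONDS_PER_DAY) / 1e6

print(f"90Sr activity        ~ {sr90_ci:.1e} Ci")
print(f"first-day activity   ~ {short_lived_mci:.0f} MCi")
```

The second figure comes out near 30 000 MCi, the value the text compares with the 50 MCi released at Chernobyl.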

Table 6.2. Long lived isotopes from weapons fallout

Isotope    τ               Retention    Limit of safe concentration
90Sr       28 years        36 years     20 μCi
137Cs      30 years        70 days      30 μCi
14C        5600 years      10 days      400 μCi
239Pu      24 400 years    180 years    0.4 μCi

Isotopes with shorter lifetimes result in much higher activity. If we take the mean lifetime as $\tau = 1$ day and consider that $10^{26}$ radioactive nuclei were produced, then the radioactivity on the first day after a 1 Mt explosion would be 30 000 MCi. By comparison, in the 1986 accident at the Chernobyl reactor in the U.S.S.R. the overall fallout reached 50 MCi which was about 3% of the total activity of the core; this shows the magnitude of the problem that would be raised by even one 1 Mt explosion. That fallout can be transported over long distances was demonstrated dramatically in the 1954 U.S. test at Bikini. A Japanese fishing vessel located at a distance of over 100 miles from the explosion became coated with radioactive ash. Unaware of the radiation the crew made no attempt to decontaminate their ship. When they returned to port two weeks later they had received an integrated dose of 200 rem and were seriously sick from its effects. Ultimately in 1963, the U.S. and U.S.S.R. and many other (but not all) nations signed the 'Atmospheric Test Ban Treaty' which forbids nuclear tests in the atmosphere.

In the case of a major nuclear exchange, in addition to the destruction of the targeted areas one expects effects of global nature that will permanently change our ecosystem. The most serious effects come from the large amount of soot generated by the fires raging over cities and the wooded countryside. The soot will be transported to the higher atmosphere and will block the sunlight. We can estimate the change in the earth's surface temperature by a simple model along the lines indicated in Fig. 5.4(b). In addition to the greenhouse layer we will assume a layer of soot as shown in Fig. 6.5 which completely absorbs the visible and retransmits the energy in the infrared. The energy balance is then as indicated in the figure and the earth receives (and radiates) only $a_e = 0.5\,a_s$, where $a_s$ is the incident solar radiation. This is to be compared to $a_e = 1.4\,a_s$ in the model of Fig. 5.4 which resulted in a surface temperature T = 303 K. Thus in the presence of the soot layer

$$ \frac{T}{303\ \mathrm{K}} = \left(\frac{0.5}{1.4}\right)^{1/4} $$

and therefore T = 236 K = -37°C.
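Since the radiated power scales as $T^4$, the surface temperature scales as the fourth root of the flux the earth must radiate. A minimal sketch of that scaling follows; the function name is illustrative, and the flux ratios 0.5 and 1.4 and the 303 K reference are the values quoted in the text.

```python
# Sketch: fourth-root scaling of surface temperature with radiated flux.

def scaled_temperature(t_reference_k: float, flux_ratio: float) -> float:
    """Temperature after the radiated flux changes by flux_ratio (T ~ flux^(1/4))."""
    return t_reference_k * flux_ratio ** 0.25


t_soot = scaled_temperature(303.0, 0.5 / 1.4)
print(f"T with soot layer ~ {t_soot:.0f} K = {t_soot - 273.15:.0f} C")
```

Evaluated with these inputs the scaling gives roughly 234 K, in line with the approximately 236 K quoted in the text given the rounding of the flux ratios.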

Fig. 6.5. Model for estimating the change in the earth's temperature due to a heavy layer of soot in the upper atmosphere, such as would be created by intense fires over large areas. [Figure labels: soot layer; upper atmosphere; earth.]


Like all climate calculations our model is subject to uncertainties especially since some of the input parameters are not known. It is, however, agreed among the experts that a global exchange will bring some form of 'nuclear winter' where sunlight will be blocked for a long time, affecting plant and animal life, and that temperatures will be severely depressed, affecting food production and human life. Furthermore the ozone layer will be partially destroyed so that when the soot is dispersed, UV radiation on the earth's surface will be intensified. These effects will be more serious in the Northern hemisphere where the conflict is expected to take place, but they will be felt over the entire globe as well. In Table 6.3 we summarize the predictions for the global consequences of a 10 000 Mt nuclear exchange as given by Ehrlich et al. (Science, Vol. 222, 1293, 1983). In the table NML = northern middle latitudes, and NH = northern hemisphere.

Table 6.3. Global consequences of a 10 000 Mt exchange

                                Effect*           Duration          Area affected
Sunlight intensity              x 0.01            1.5 months        NML
                                x 0.05            3 months          NML
                                x 0.25            5 months          NH
                                x 0.50            8 months          NH
Surface temperature             -40°C             4 months          NML
                                -20°C             9 months          NH
                                -3°C              1 year            NH
UV radiation                    x 4               1 year            NH
                                x 3               3 years           NH
Radioactive fallout exposure    ≥ 500 R           1 hour-1 month    30% of NML
                                ≥ 100 R           1 day-1 month     50% of NML
                                ≥ 10 R            ≥ 1 month         50% of NH
Fallout burden                  106Ru 10^4 MCi    1 year            NH
                                137Cs 650 MCi     30 years          NH
                                90Sr 400 MCi      30 years          NH

* Multiply present value by indicated number

Estimates of the overall casualties, direct and indirect, from an all out 10 000 Mt nuclear exchange are given in Table 6.4 where it is assumed that the major cities of the two opposing nations and of their allies have been targeted. Thus about 50% of the globe's population, much more than the population of the nations at war, would be wiped out. Life for the survivors, as seen from Table 6.3, may be even more difficult or impossible.

Table 6.4. Direct casualties from a 10 000 Mt exchange

Direct deaths from blast        750 M
Deaths from blast and heat      1100 M
Injuries needing treatment      1000 M-3000 M

6.3 Delivery systems and nuclear arsenals

The U.S. and the U.S.S.R. have built a vast nuclear arsenal. In addition Britain, France, China and possibly other nations have nuclear capability. Nuclear capability implies not only the possession of weapons but also the means to deliver them in enemy territory. The principal delivery vehicles are long-range rockets, referred to as ballistic missiles. The U.S. defense policy is based on three delivery systems - the so-called 'triad' of strategic weapons.

(1) Land based intercontinental ballistic missiles (ICBM)
(2) Long range bombers equipped with gravity bombs and cruise missiles
(3) Submarine launched ballistic missiles (SLBM).

Furthermore a variety of 'tactical nuclear weapons' are deployed in areas of potential conflict. The Soviets have similar capability.

Both nations have proclaimed that the nuclear arsenal is to be used only for defense, in case of an enemy attack. Thus the nuclear capability is such that even after a first strike by the adversary there will be enough weapons left to obliterate the attacking nation. The policy is referred to variously as 'mutual assured destruction', 'deterrence', or 'massive retaliation'; it is vividly illustrated in the 1983 cartoon by Willis reproduced in Fig. 6.6. Unfortunately the policy of nuclear deterrence has led to a continuing escalation in armaments, as one or the other nation perceived that it was being overtaken by its enemy, so that its retaliatory power would be threatened or become ineffective. As a result, while it is estimated that only 400 Mt are sufficient to obliterate either country, present arsenals are ten times larger. This is referred to as 'overkill' capacity.

Technological improvements in the past two decades have made delivery systems highly efficient and extremely precise. One missile can now carry several warheads that are independently targeted, thus multiplying the effective destructive power; the acronym MIRV (for multiple independent reentry vehicles) is used for such payloads. The accuracy with which a warhead can be delivered is such that the circle of probable error (CEP) has radius R < 300 m. Thus cities and industrial complexes can be targeted with confidence. But even hardened missile sites can now be subject to direct hits. Table 6.5 lists the U.S. and U.S.S.R. arsenals as they were estimated in 1982.*

For these immense arsenals to be effective they must be accompanied by an extensive information, communication and command system. These include large radar and reconnaissance and navigation satellites and a complex computer system. To protect against a 'preemptive strike' by the enemy, land based missiles are placed in hardened silos and a mobile system has been proposed. There is continued debate on the value and effectiveness of a defensive system, as we will discuss in Section 6.5. In answer to these questions both nations have opted to increase their nuclear capacity. The growth in the number of warheads in the arsenals of the two superpowers is shown in Fig. 6.7 as a function of time. The sharp upturn around 1970 is attributed to the introduction of the MIRV concept.

The forerunner of modern long range missiles was the German V2 rocket used against London near the end of W.W.II. The exploration of space and also the present military missiles are based on many of the principles used in the V2. A ballistic missile, like an artillery shell, is given an initial velocity and then allowed to return (fall) to earth under the influence of gravity. However, while a shell is propelled in the gun barrel, a rocket sustains a continued thrust due to the exhaust of its fuel at very high relative velocity. Fig. 6.8(a) shows the direct and indirect trajectories that can be followed by a ballistic missile to reach a target at range x.

Fig. 6.6. An accurate illustration of the principle of mutual deterrence (from The Dallas Times Herald, c. 1983). [Cartoon title: Deterrence.]

* B. G. Levi, The nuclear arsenals of the U.S. and U.S.S.R., Physics Today, March 1983.

Table 6.5. Strategic nuclear arsenals

Delivery vehicle           Number of Warheads  Yield per Total     Equivalent Range    CEP (m) or
                           vehicles  per       warhead   warheads  megatons   (km)     max. speed
                                     vehicle   (Mt)      (N)       (N Y^2/3)           (Mach)

U.S.
ICBMs
  Titan II                 52        1         9.0       52        225        15 000   1300
  Minuteman II             450       1         1.2       450       508        11 300   370
  Minuteman III            250       3         0.17      750       230        13 000   280
  Minuteman III (improved) 300       3         0.335     900       440        n.a.     220
SLBMs
  Poseidon C-3             304       10        0.05      3040      413        4 600    450
  Trident D-4              216       8         0.1       1728      372        7 400    450
Aircraft
  B-52D                    75                            300                  9 900    Mach 0.95
    Gravity bombs                    4
  B-52G                    151                           1208                 12 000   Mach 0.95
    Gravity bombs                    4         ~1
    SRAMs*                           4         0.2
  B-52H                    90                            720                  16 000   Mach 0.95
    Gravity bombs                    4         ~1
    SRAMs*                           4         0.2
  Total gravity                                                    1114
  Total SRAMs                                                      330
  FB-111A                  60        2         ~1        120       120        4 700    Mach 2.5

U.S.S.R.
ICBMs
  SS-11                                                  570       570
    Model 1                570       1         1.00                           10 500   1400
    Model 2                Some      3         0.1-0.3                        8 800    1100
  SS-13 Model 1            60        1         0.75      60        50         10 000   2000
  SS-17                                                  ~600      ~495
    Model 1                150       4         0.75                           10 000   450
    Model 2                Few       1         6.00                           11 000   450
  SS-18                    250, 58                       ~2500     ~2300
    Model 1                          1         20.00                          12 000   450
    Model 2                          8         0.90                           11 000   450
    Model 3                          1         20.00                          10 500   350
    Model 4                          10        0.50                           9 000    300
    Model 5                          10        0.75                           9 000    250
  SS-19                                                  ~1500     ~1200
    Model 2                Few       1         5.00                           10 000   300
    Model 3                310       6         0.55                           10 000   300
SLBMs
  SS-N-5                   57        1         1.00      57        57         1 400    2800
  SS-N-6                   400                           ~400      ~400
    Model 1                          1         1.00                           2 400    900
    Model 2                          1         1.00                           3 000    900
    Model 3                          2         0.2                            3 000    1400
  SS-N-8                   292                           ~300      ~250
    Model 1                          1         1.0                            7 800    1300
    Model 2                          1         0.8                            9 100    900
    Model 3                          3         0.2                            n.a.     450
  SS-NX-17                 12        1         1.0       12        12         3 900    1500
  SS-N-18                  208                           ~1040     ~430
    Model 1                          3         0.45                           7 400    1400
    Model 2                          1         0.45                           8 300    600
    Model 3                          7         0.2                            6 500    600
Aircraft
  TU-95 (Bear)             105       2         1         210       210        12 800   Mach 0.78
  Mya-4 (Bison)            45        2         1         90        90         11 200   Mach 0.87

Data from The Military Balance 1981-2, The International Institute for Strategic Studies, London (1982).
* Short-range attack missile.
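The 'equivalent megatons' column of Table 6.5 is defined in its header as $N Y^{2/3}$, with N the number of warheads and Y the yield per warhead in Mt. A small sketch, with an illustrative function name, reproduces a few of the tabulated U.S. entries from the N and Y values listed for them.

```python
# Sketch: equivalent megatonnage N * Y^(2/3), as defined in the table header.

def equivalent_megatons(n_warheads: float, yield_mt: float) -> float:
    """Equivalent megatons for n_warheads each of yield yield_mt (in Mt)."""
    return n_warheads * yield_mt ** (2.0 / 3.0)


# (system, total warheads, yield per warhead in Mt), taken from Table 6.5
entries = [
    ("Titan II", 52, 9.0),
    ("Minuteman II", 450, 1.2),
    ("Poseidon C-3", 3040, 0.05),
    ("Trident D-4", 1728, 0.1),
]

for name, n, y in entries:
    print(f"{name:14s} {equivalent_megatons(n, y):7.0f}")
```

The 2/3 power weights many small warheads more heavily than a few large ones: the 3040 Poseidon warheads of 0.05 Mt count for almost twice the 52 Titan II warheads of 9 Mt.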


Assuming that the earth is flat and neglecting air resistance, the parameters of the trajectory as a function of the elevation angle $\theta$ are

$$ \text{Range} \quad x = \frac{1}{g}\, v_0^{2} \sin(2\theta) $$

$$ \text{Maximum height} \quad y = \frac{1}{2g}\, v_0^{2} \sin^{2}\theta \qquad (6.8) $$

$$ \text{Time of flight} \quad \Delta t = \frac{2 v_0}{g} \sin\theta $$

where $v_0$ is the initial velocity, and g the acceleration of gravity on the earth's surface, g = 9.81 m/s$^2$; maximum range occurs for $\theta = 45°$.
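The flat-earth relations can be checked numerically with a short sketch. The function names are illustrative, the time of flight is taken as the standard result $2 v_0 \sin\theta / g$ for an elevation angle measured from the horizontal, and the example velocity of 5.5 km/s is the one the text lists for a 3000 km trajectory.

```python
import math

# Sketch: flat-earth, drag-free ballistic trajectory parameters.

G = 9.81  # acceleration of gravity at the earth's surface, m/s^2


def ballistic_range(v0: float, elevation_deg: float) -> float:
    """Horizontal range in metres."""
    return v0**2 * math.sin(2.0 * math.radians(elevation_deg)) / G


def max_height(v0: float, elevation_deg: float) -> float:
    """Maximum height in metres."""
    return (v0 * math.sin(math.radians(elevation_deg))) ** 2 / (2.0 * G)


def time_of_flight(v0: float, elevation_deg: float) -> float:
    """Time of flight in seconds."""
    return 2.0 * v0 * math.sin(math.radians(elevation_deg)) / G


v0 = 5500.0  # m/s
for angle in (30.0, 45.0, 60.0):
    print(
        f"theta = {angle:4.1f} deg: range = {ballistic_range(v0, angle) / 1e3:6.0f} km, "
        f"height = {max_height(v0, angle) / 1e3:5.0f} km, "
        f"time = {time_of_flight(v0, angle) / 60:4.1f} min"
    )
```

At 45° the range comes out near 3100 km, close to the 3000 km quoted for this launch velocity, while the computed heights show why the flat-earth, uniform-gravity picture is only a rough guide at intercontinental distances.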

Fig. 6.7. Evolution of the number of strategic nuclear warheads of the U.S. and the U.S.S.R.; the dotted curves are projections. (After P. Craig and J. Jungerman, Nuclear Arms Race, McGraw-Hill, New York, 1986.) [Horizontal axis: year, 1960 to 1990.]

Fig. 6.8. (a) Ballistic trajectory in a uniform gravitational field. (b) Typical trajectories of ballistic missiles as a function of the desired range; note that the main part of the trajectory is outside the atmosphere. [Figure label: R = 6300 km.]


Trajectories typical of ballistic missiles are sketched in Fig. 6.8(b). We note below the range, initial velocity and flight time for these trajectories

(1) R = 3000 km    $v_0$ = 5.5 km/s    $\Delta t$ = 21 min
(2) R = 5000 km    $v_0$ = 6.4 km/s    $\Delta t$ = 32 min
(3) R = 8000 km    $v_0$ = 7.4 km/s    $\Delta t$ = 50 min

For reference, we recall that the initial velocity required for achieving geostationary orbit is $v_0 = 10.7$ km/s and the escape velocity is $v_s = 11.2$ km/s.

We will discuss rocket propulsion in the following chapter in some detail. Suffice it to say here that in order to impart high velocity to a payload it is essential to use several stages so as not to waste thrust in accelerating parts of the spent rocket. We can derive the equation governing rocket propulsion, as shown pictorially in Fig. 6.9. The instantaneous mass and velocity of the rocket (as seen in the earth's frame) are M and V. The velocity of the exhaust gases relative to the rocket is $v_e$ and $dm_e/dt$ is the mass of gas exhausted per unit time; note that

$$ \frac{dM}{dt} = -\frac{dm_e}{dt} \qquad (6.9) $$

In the absence of gravity the rocket together with the exhaust gases are an isolated system and thus their total momentum is conserved, $P_t$ = constant, or $dP_t/dt = 0$. Therefore

$$ \frac{dP_t}{dt} = \frac{d(MV)}{dt} + (V - v_e)\frac{dm_e}{dt} = 0 $$

We find

$$ V\frac{dM}{dt} + M\frac{dV}{dt} + V\frac{dm_e}{dt} - v_e\frac{dm_e}{dt} = 0 $$

and because of Eq. (6.9) the first and third terms cancel so that

$$ M\frac{dV}{dt} = v_e\frac{dm_e}{dt} \qquad (6.11) $$

Fig. 6.9. Rocket propulsion: (a) as seen in the rocket's rest frame, (b) as seen in the earth's frame, (c) for vertical launch.


The term $v_e(dm_e/dt)$ has dimensions of force and is called the thrust of the rocket; it is as if a force of that magnitude were acting on the rocket imparting to it the acceleration $dV/dt$.

If the rocket is launched vertically against the force of gravity as shown in Fig. 6.9(c), the equation of motion becomes

$$ M\frac{dV}{dt} + v_e\frac{dM}{dt} = -Mg \qquad (6.12) $$

which we can easily integrate. We multiply by dt and divide by M so that

$$ \int_{0}^{V(t)} dV = -v_e \int_{M_0}^{M(t)} \frac{dM}{M} - g\int_{0}^{t} dt $$

Here $M_0$ is the initial mass of the rocket at time t = 0 when V = 0. The velocity attained at time t is V(t) and the mass of the rocket is M(t), where

$$ V(t) = v_e \ln\!\left[\frac{M_0}{M(t)}\right] - g\,t \qquad (6.13) $$

It is clear that to achieve large final velocities $V_f$ we must have large exhaust velocity $v_e$, but also large values of $\ln(M_0/M_f)$, where the final mass $M_f$ includes the payload as well as the empty rocket structure. For given $M_0$ and payload, the best way to minimize $M_f$ is by using several stages.

As an example the Minuteman III missile has the following parameters

$M_0$ = 24.5 tons
$M_f$ = 1.5 tons
$v_e$ ~ 4 km/s

and the burn time is t = 200 s. The Minuteman is a three stage rocket but we will use these parameters in Eq. (6.13) to obtain an upper limit on the final velocity reached after the launch

In reality the achieved launch velocity is $V_f \sim 7.3$ km/s. We can also calculate the thrust developed by the rocket if we assume that the (24.5 - 1.5) = 23 tons of fuel are exhausted uniformly in t = 200 s.
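Both estimates can be evaluated directly from Eq. (6.13) and the definition of thrust, $v_e\,(dm_e/dt)$. The sketch below treats the missile as a single stage, assumes metric tons (1 ton = 1000 kg), and uses illustrative function names; the numbers it prints are therefore rough figures under those assumptions, not values taken from the source.

```python
import math

# Sketch: single-stage estimate for the Minuteman III parameters in the text.
# Assumption: "tons" are metric tons (1000 kg).

G = 9.81  # m/s^2


def burnout_velocity(v_exhaust: float, m_initial: float, m_final: float,
                     burn_time: float) -> float:
    """Eq. (6.13): V = v_e * ln(M0 / Mf) - g * t, for a vertical launch."""
    return v_exhaust * math.log(m_initial / m_final) - G * burn_time


def mean_thrust(v_exhaust: float, fuel_mass: float, burn_time: float) -> float:
    """Thrust v_e * dm_e/dt for fuel exhausted uniformly over burn_time."""
    return v_exhaust * fuel_mass / burn_time


v_e = 4000.0   # exhaust velocity, m/s
m0 = 24.5e3    # initial mass, kg
mf = 1.5e3     # final mass, kg
t_burn = 200.0 # burn time, s

v_final = burnout_velocity(v_e, m0, mf, t_burn)
thrust = mean_thrust(v_e, m0 - mf, t_burn)

print(f"upper limit on final velocity ~ {v_final / 1e3:.1f} km/s")
print(f"mean thrust ~ {thrust:.2e} N")
```

Under these assumptions the single-stage estimate lies above the achieved velocity of about 7.3 km/s quoted in the text, as expected for an upper limit.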

6.4 Reconnaissance satellites

The collection of military intelligence about one's adversary is an important activity going on both in war and in peace time. Reconnaissance satellites, which by international agreement can overfly any part of the globe, have become the primary tool for such operations. Equally importantly, reconnaissance satellites play an essential role in the verification of the various arms limitation treaties that are in effect or that are being negotiated. Satellites are also used for navigation, for communications, and as part of an early warning system against an enemy attack.

To obtain the best resolution the satellite should be at low altitude when it overflies the territory that is to be photographed; this however implies increased drag and therefore shortened time in orbit. To minimize this effect reconnaissance satellites are launched into elliptical near-polar orbits with the perigee (lowest point to the earth) at ~160 km, which places them just outside the atmosphere. As the earth rotates, different parts of it pass under a polar orbit as can be seen in Fig. 6.10; thus in 24 hours the entire globe can be scanned by the satellite. The time for completing such low orbits is of order of 90 minutes.

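We can easily calculate the period of a circular orbit at height h above the earth's surface. The angular velocity of the satellite is designated by $\omega$, its mass by m and the radius of the orbit by r; $M_\oplus$, $R_\oplus$ stand for the mass and radius of the earth and G is Newton's constant. Then

$$ m\omega^{2} r = G\,\frac{M_\oplus m}{r^{2}} \qquad (6.14) $$

or

$$ \omega^{2} = \frac{G M_\oplus}{r^{3}} = \frac{g}{R_\oplus}\left(1 + \frac{h}{R_\oplus}\right)^{-3} $$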

Fig. 6.10. A satellite in polar orbit will eventually cover all of the earth's surface.


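Here we introduced the acceleration of gravity at the earth's surface

$$ g = \frac{G M_\oplus}{R_\oplus^{2}} = 9.81\ \mathrm{m/s^{2}} \qquad (6.15) $$

and used $r = R_\oplus + h$. The radius of the earth is $R_\oplus = 6370$ km; thus we can expand in the small quantity $(h/R_\oplus)$ and we find for the period T,

$$ T = \frac{2\pi}{\omega} = 2\pi\left(\frac{R_\oplus}{g}\right)^{1/2}\left(1 + \frac{h}{R_\oplus}\right)^{3/2} \approx 2\pi\left(\frac{R_\oplus}{g}\right)^{1/2}\left(1 + \frac{3}{2}\,\frac{h}{R_\oplus}\right) \qquad (6.16) $$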

Note that $T_0 = 2\pi (R_\oplus/g)^{1/2} = 84.4$ minutes is the period of oscillation of a body dropped through the center of the earth; it is often encountered in geodesic problems. If we assume h ~ 160 km, then the term $(3/2)(h/R_\oplus)$ in Eq. (6.16) contributes only a 4% correction to $T_0$.
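A quick numerical check of the period formula is sketched below, comparing the exact circular-orbit expression with the first-order expansion. Function names are illustrative; g and the earth's radius are the values given in the text.

```python
import math

# Sketch: period of a circular orbit at height h, exact versus first-order in h/R.

G_SURFACE = 9.81  # m/s^2
R_EARTH = 6.37e6  # m

T0 = 2.0 * math.pi * math.sqrt(R_EARTH / G_SURFACE)  # period for h -> 0, in s


def period_exact(height_m: float) -> float:
    """T = T0 * (1 + h/R)^(3/2)."""
    return T0 * (1.0 + height_m / R_EARTH) ** 1.5


def period_first_order(height_m: float) -> float:
    """T ~ T0 * (1 + (3/2) h/R)."""
    return T0 * (1.0 + 1.5 * height_m / R_EARTH)


h = 160e3  # perigee height used in the text, m
print(f"T0           = {T0 / 60:.1f} min")
print(f"T (exact)    = {period_exact(h) / 60:.1f} min")
print(f"T (1st order)= {period_first_order(h) / 60:.1f} min")
```

At h = 160 km the two expressions agree to well under a tenth of a percent, and both give a period a little under 88 minutes, consistent with the 'of order 90 minutes' quoted earlier.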

The resolution with which objects on the ground can be observed depends on the distance from the object, the resolution of the recording medium, be it film or an electronic digitizer, and on the quality of the optics. The relation between these variables can be found with the help of Fig. 6.11 where s and s' are the object and image size respectively; F is the focal length of the lens and A is the distance to the object. The resolution of the film is characterized by R in lines/mm; thus the smallest distance that can be resolved in the image plane is $\delta = 1/R$ (mm). From the geometry of the figure s' = s(F/A), and if we set $s' = \delta$, then s is the ground resolution G, or

$$ G = \frac{A}{F}\,\delta = \frac{A}{F}\,\frac{1}{R} \qquad (6.17) $$

Fig. 6.11. An imaging lens and typical ray traces.


As an example let us take the focal length to be F = 1 m (slightly over 3 feet), the distance A = 160 km, and the resolution R = 200 lines/mm, that is a 5 μm resolution in the film plane. The ground resolution is then

$$ G = \frac{160 \times 10^{3}}{1} \times \frac{1}{200 \times 10^{3}\ (\mathrm{m^{-1}})} = 0.8\ \mathrm{m} $$

This is rather remarkable and we see that such images can reveal military targets in great detail. Much better resolution is to be expected from modern reconnaissance cameras with increased focal length.
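Eq. (6.17) is easy to explore for other focal lengths or altitudes. The sketch below uses an illustrative function name and takes the film resolution in lines/mm, as in the text.

```python
# Sketch: ground resolution G = (A / F) * (1 / R), Eq. (6.17).

def ground_resolution(distance_m: float, focal_length_m: float,
                      lines_per_mm: float) -> float:
    """Smallest resolvable ground distance in metres."""
    film_resolution_m = 1e-3 / lines_per_mm  # delta = 1/R, converted to metres
    return distance_m / focal_length_m * film_resolution_m


g_example = ground_resolution(160e3, 1.0, 200.0)
g_longer_lens = ground_resolution(160e3, 3.0, 200.0)  # hypothetical 3 m focal length

print(f"F = 1 m: G = {g_example:.2f} m")
print(f"F = 3 m: G = {g_longer_lens:.2f} m")
```

The second case, a 3 m focal length chosen here only for illustration, shows the linear gain in resolution with focal length that the text alludes to.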

In deriving Eq. (6.17) and in the examples we did not consider distortions introduced by the optics or the atmosphere which can be limiting factors. Furthermore, large focal length implies lenses of large diameter because of the limits imposed by diffraction. We recall from Eq. (4.20') that the smallest angle that can be resolved using radiation at wavelength $\lambda$, with a lens of diameter D is

$$ \theta_{\min} = 1.2\,\frac{\lambda}{D} \qquad (6.18) $$

Thus, since $\theta = s'/F$, we find that

$$ s' > F\,\theta_{\min} = 1.2\,\lambda\,(F/D) $$

If, as before, we set $s' = \delta = 1/R$, then the ratio (F/D), the so-called f-stop of the lens, must be smaller than

$$ \frac{F}{D} < \frac{1}{1.2\,\lambda R} $$

If we assume green light, $\lambda = 500$ nm, to achieve the resolution calculated in our previous example the lens diameter must exceed D, where

$$ \frac{F}{D} < \frac{1}{1.2 \times (500 \times 10^{-9})(200 \times 10^{3})} \approx 8 $$

namely, D > 12.5 cm, since we had F = 1 m. A good quality lens of such diameter is only moderately difficult to manufacture. For longer focal lengths, however, the demands on the optics are severe.
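The diffraction limit on the lens can be sketched in the same way. The function name is illustrative, and the factor 1.2 is the one used in Eq. (6.18).

```python
# Sketch: minimum lens diameter so that diffraction does not exceed the film
# resolution, D > 1.2 * lambda * F * R (with R converted to lines per metre).

def min_lens_diameter(focal_length_m: float, wavelength_m: float,
                      lines_per_mm: float, factor: float = 1.2) -> float:
    """Smallest lens diameter in metres for diffraction blur below 1/R."""
    lines_per_m = lines_per_mm * 1e3
    return factor * wavelength_m * focal_length_m * lines_per_m


d_min = min_lens_diameter(1.0, 500e-9, 200.0)
f_stop_limit = 1.0 / d_min  # F/D for F = 1 m

print(f"D > {d_min * 100:.0f} cm, i.e. F/D < {f_stop_limit:.1f}")
```

The unrounded limit is F/D < 8.3, or D > 12 cm; rounding the f-stop down to 8 gives the 12.5 cm quoted in the text.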

The high resolution, combined with the large field of view, results in a very rapid stream of data that must be recorded. Furthermore the recording time is limited because of the fast motion of the satellite. Typically, the velocity of the satellite is v = 8.2 km/s, and for a 3° field of view the area covered has a radius $l(\theta = 3^\circ) = A \times \theta \approx 160 \times 0.05 = 8$ km. Thus for complete coverage, images must be recorded at a rate of approximately one per second. For the focal length used in the example, the image radius is $l' = l(F/A) = F \times \theta = 5$ cm; given the film resolution $\delta = 5$ μm, a 10 x 10 cm$^2$ frame contains $(20\,000)^{2} = 4 \times 10^{8}$ pixels. To preserve the ground resolution the frame must be 'shot' in a time shorter than $t = G/v = 10^{-4}$ s. For these reasons fast film is the preferred medium for recording the image; it has both the speed and the resolution. More recently, CCD cameras (see Section 2.7) are being developed and used as substitutes for film.
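The data-rate argument combines three small calculations: the ground footprint, the pixel count per frame and the maximum exposure time. The sketch below collects them under the text's numbers; the function names are illustrative.

```python
import math

# Sketch: footprint, pixels per frame and exposure limit for the example camera.


def footprint_radius(distance_m: float, field_of_view_deg: float) -> float:
    """Ground radius covered, l = A * theta (small-angle approximation)."""
    return distance_m * math.radians(field_of_view_deg)


def pixels_per_frame(frame_side_m: float, film_resolution_m: float) -> int:
    """Number of resolution elements in a square frame."""
    per_side = round(frame_side_m / film_resolution_m)
    return per_side * per_side


def max_exposure_time(ground_resolution_m: float, speed_m_s: float) -> float:
    """Longest exposure before satellite motion smears one resolution element."""
    return ground_resolution_m / speed_m_s


radius = footprint_radius(160e3, 3.0)
pixels = pixels_per_frame(0.10, 5e-6)
exposure = max_exposure_time(0.8, 8.2e3)

print(f"footprint radius ~ {radius / 1e3:.1f} km")
print(f"pixels per frame = {pixels:.1e}")
print(f"exposure limit   ~ {exposure:.1e} s")
```

At roughly one frame per second this corresponds to several hundred million resolution elements per second, which is the 'very rapid stream of data' referred to above.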

In the early reconnaissance satellites the film was dropped in canisters that deployed a parachute in the atmosphere and were collected by special airplanes. As larger payloads were placed in orbit, the film was developed in the satellite and either dropped to earth or scanned on board and transmitted by a communications link. Electronically recorded and transmitted images have less resolution than film but can be seen in real time and thus are useful for early warning systems. Note that the analysis of high resolution images is in itself a tedious and lengthy task especially when large areas are being investigated.

The atmosphere is transparent not only to visible light but also to certain bands in the infrared. This provides another window for reconnaissance photography and has the advantage that it is not obstructed by cloud cover, and can be used by day or by night. IR photography is particularly sensitive to sources of heat so that rocket and jet engine exhausts are easily located. Extensive radar installations are also part of the intelligence gathering network. While conventional radar is limited to line of sight targets, over the horizon (OTH) radar uses the reflection from the ionosphere (see Section 4.5) to increase its range. Finally, tracking missile launches and their trajectories, and listening in on radio transmissions are other important components in the effort to assess the adversary's intentions and his technical progress.

Two examples of the capabilities of aerial and ground based photography are shown in Fig. 6.12. Part (a) is a section from a photograph taken from the Apollo 15 spacecraft at a distance of approximately 100 km from the surface of the moon. The lunar landing module which has dimensions of order 5 m can be distinguished at the center of the photograph. Note that the resolution of the original film was much better than shown here and that the Apollo camera did not have as high a resolution as cameras carried by reconnaissance satellites. Part (b) is a photo of the multiple independently targeted reentry vehicles from an MX launch as they pass through the atmosphere and impact in the western Pacific near Kwajalein Atoll. The photo is taken from a Navy P-3 Orion aircraft at high altitude. Similar photos of Soviet missiles targeted in the Pacific are routinely obtained by U.S. reconnaissance and vice versa.


Fig. 6.12. (a) Section of a photograph taken from the Apollo 15 spacecraft at a distance of ~100 km; the lunar landing module which has dimensions of ~5 m can be easily distinguished. (Courtesy ITEK Corporation.) (b) Picture of the multiple independently targeted reentry vehicles from an MX launch as they pass through the atmosphere and impact in the western Pacific near Kwajalein Atoll. The missile was launched from the Vandenberg Air Force Base in California. This photo is a time exposure (thus the wavy lines) taken by a Navy P-3 Orion aircraft from a high altitude. (Courtesy U.S. Department of the Air Force.)


6.5 Proposed defense systems

The possibility of defending a nation against a nuclear attack has been often considered and much discussed. Systems designed to destroy incoming ballistic warheads were deployed in the U.S.S.R. in 1968 and by the U.S. in 1974 to protect specific geographic regions. These defense installations were not considered to be effective against a major nuclear attack, and in 1972 a treaty was signed, limiting the further deployment of such anti-ballistic missile systems (ABM treaty). In 1983 the U.S. announced the development of a comprehensive defense against a Soviet nuclear attack; it was named the Strategic Defense Initiative (SDI), or more commonly 'Star Wars' because it envisioned the use of a large number of space-based weapons.

The pursuit of a defensive alternative has great moral and ideological appeal. However there is strong divergence of views as to whether such a system is at all technically feasible. There are obvious questions that can be debated: a system designed to preserve the retaliatory force has different characteristics from one that would protect the population. Even if the defense was 90% effective, it would hardly prevent an almost complete destruction of the target. How easily can the system be foiled by countermeasures or by building more attack capability? Finally, is the deployment of a defense by one of the superpowers 'destabilizing' in the sense that it would lead to further escalation in the arms race?

The typical trajectory of a ballistic missile launched from the Soviet Union and targeted against the U.S. mainland is shown in Fig. 6.13. Four phases are outlined: (a) The boost phase during which the fuel is burnt and the missile acquires its launch velocity; it lasts 200-400 s. During this time the missile is most vulnerable because of its low velocity, its large size, and the intense heat from the rocket engine exhaust which can be used to sense its position. (b) The busing phase during which the payload travels on its ascending trajectory but the warheads have not yet separated. (c) A mid-course phase in which warheads and decoys have been released and fall freely under the earth's gravitational attraction; this is the longest part of the flight and lasts approximately 40 minutes. Finally (d) is the re-entry phase when the warheads are entering the atmosphere above their respective targets.

Depending on the phase during which the missile is to be attacked, different problems arise. In the boost phase, because of the short time that is available, the weapon must be in place - over enemy territory - and one has to rely on directed energy weapons. Since the locations of enemy silos are known and because of the heat plume, targeting is relatively easy. An attack during the busing phase is more difficult in terms of tracking, aiming at, and hitting a small target in a short time. During mid-course there is sufficient time but the deployment of multiple warheads and decoys may require the targeting and destruction of some 10 000 objects. For the reentry phase, schemes such as exploding a nuclear weapon in the path of the incoming warheads have been considered; they have obvious drawbacks in terms of radiation and fallout over the area that is to be defended.

The weapons proposed for missile defense are either directed energy weapons such as intense laser beams or particle beams, or kinetic energy weapons. The former have the advantage that they propagate with the speed of light while kinetic energy weapons are much more effective in destroying a missile. In a 1984 test, the U.S. demonstrated that a Minuteman missile launched from California could be intercepted during its re-entry over Hawaii and destroyed in flight by the impact of a kinetic energy weapon. In general, the time available for ascertaining a threat and launching the defensive weapons - especially in the boost phase - is only a few minutes, posing serious command problems. Furthermore, since many of the proposed weapons cannot be tested under real conditions it will be difficult to know their effectiveness beforehand.

As an example of ballistic defense weapons we will consider lasers and estimate the required power levels. To destroy a missile a laser beam must deliver to it $10^4$ J/cm$^2$ in 1 second or less. Such a pulse would vaporize a 3 mm thick sheet of copper. Of course warheads or missiles can be hardened but we will assume that the 'lethal flux' is

$$F_l = 10^8\ \mathrm{W/m^2} \tag{6.19}$$

Fig. 6.13. Typical trajectory and the various phases of the flight of an ICBM. (Labels in the figure: launch, boost phase, hit area.)

For any source of radiation the brightness B is defined as

$$B = \frac{\text{energy/s}}{\text{steradian}} \tag{6.20}$$

and the total radiated power is $P = \int B\, d\Omega$. If the source is well collimated so that all the energy is contained in a small solid angle $\Delta\Omega$, we can write

$$P \simeq B\,\Delta\Omega \tag{6.20'}$$

and small $\Delta\Omega$ leads to high brightness. The flux F at a distance R from the source is given by

$$F = B/R^2 \tag{6.20''}$$

as follows from the definition of Eq. (6.20).

If the laser is placed in orbit at a height of ~1000 km over the enemy's launching sites the required brightness is

$$B = F_l R^2 = 10^8 \times (10^6)^2 = 10^{20}\ \mathrm{W/sr} \tag{6.21}$$

Alternatively the laser can be earth based and the beam reflected from a geostationary mirror to a fighting mirror and from there onto the target as shown in Fig. 6.14. In this case the distance involved is $R \sim 2 \times 36\,000$ km and therefore the required brightness is of the order of $B \sim 10^{24}$ W/sr; in addition losses in the atmosphere must be compensated for.

The power needed to reach the required brightness depends on how well the beam is collimated. The smallest possible solid angle is given by the diffraction limit and is therefore determined by the wavelength $\lambda$ and the diameter D of the last focussing mirror

$$\Delta\Omega_{\min} \simeq 4.5\left(\frac{\lambda}{D}\right)^2 \tag{6.22}$$

Fig. 6.14. Proposed scheme for using a high power earth based laser to attack enemy missiles far beyond the horizon. (Labels in the figure: land based laser; mirror in geostationary orbit (36 000 km); fighting mirror (1000 km); ICBM in boost phase (100 km).)

We assume that D = 5 m and that near-UV light, $\lambda = 300$ nm, will be used. Then to achieve a brightness of $10^{20}$ W/sr

$$P_{\min} = B\,\Delta\Omega_{\min} = 10^{20} \times 4.5\left(\frac{3 \times 10^{-7}}{5}\right)^2 = 1.6 \times 10^6\ \mathrm{W}$$

and this power level must be sustained for 1 second, leading to 1.6 MJ of optical energy. For earth based lasers, the required power is $10^4$ times higher, a level of performance that has not been achieved as yet.
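The estimate above can be reproduced numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the diffraction factor 4.5 is simply the value used in the numerical evaluation above. It computes the required brightness B = F R^2 and the minimum power for the orbiting and the earth based laser.

```python
# Sketch of the laser power estimate; all names are illustrative.
LETHAL_FLUX = 1.0e8  # W/m^2, the 'lethal flux' of Eq. (6.19)


def required_brightness(flux, distance_m):
    """Brightness in W/sr needed to deliver `flux` at `distance_m` (B = F R^2)."""
    return flux * distance_m ** 2


def min_solid_angle(wavelength_m, mirror_diameter_m):
    """Diffraction-limited solid angle; 4.5 is the factor used in the text."""
    return 4.5 * (wavelength_m / mirror_diameter_m) ** 2


def min_power(brightness, wavelength_m, mirror_diameter_m):
    """Minimum laser power in W for a diffraction-limited beam."""
    return brightness * min_solid_angle(wavelength_m, mirror_diameter_m)


WAVELENGTH = 300e-9   # m, near UV
MIRROR_DIAMETER = 5.0  # m

# Laser in orbit, about 1000 km above the launch sites
b_orbit = required_brightness(LETHAL_FLUX, 1.0e6)
p_orbit = min_power(b_orbit, WAVELENGTH, MIRROR_DIAMETER)

# Earth based laser relayed by a geostationary mirror, R ~ 2 x 36 000 km
b_ground = required_brightness(LETHAL_FLUX, 2 * 3.6e7)
p_ground = min_power(b_ground, WAVELENGTH, MIRROR_DIAMETER)

print(f"orbit : B = {b_orbit:.2e} W/sr, P = {p_orbit:.2e} W")
print(f"ground: B = {b_ground:.2e} W/sr, P = {p_ground:.2e} W")
```

The ground based case comes out several thousand times more demanding than the orbiting one, which is the order of magnitude quoted in the text; atmospheric losses are not included.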

Various high power lasers have been considered for the SDI program and are being further developed. The CO$_2$ laser is not a prime contender because of its long wavelength, $\lambda \approx 10\ \mu$m. In chemical lasers population inversion is established when the molecules are first formed in a reaction such as F + H$_2$ → HF + H; lasing takes place among the vibrational levels of the HF molecule resulting in wavelengths $\lambda = 2.7$-$2.9\ \mu$m, and power levels of 1 MW have been achieved. In the excimer laser the KrF (krypton fluoride) molecule is pumped by an electron beam and radiates in the visible and near UV; power levels of order 10 kW have been reached.

When SDI was first proposed there was great confidence that an X-ray laser could be constructed. The idea is sketched in Fig. 6.15 and consists of an assembly of thin rods which are pumped by the X-rays from a small nuclear explosive (30 kt). Because of the geometry, the X-rays emitted by stimulated emission are collimated along the direction of the rods and even though there is only a single pass there is substantial gain. The advantage of such a device - if it works - is that it does not require a bulky power source and can be easily placed in orbit. However, X-rays are absorbed by the atmosphere so that the enemy missile would have to be intercepted during the busing phase. Placing an X-ray laser, and therefore a nuclear weapon, in space would be in violation of the U.S.-U.S.S.R. anti-ballistic missile treaty; an alternative is to launch the devices when needed.

Fig. 6.15. Schematic drawing of a possible X-ray laser driven by a nuclear explosive. (Labels in the figure: nuclear explosive, laser rods, collimated X-rays.)

A different very bright source of radiation is the recently developed free electron laser (FEL). In the FEL a relativistic electron beam of energy E is passed through a 'wiggler', a region of an alternating transverse magnetic field. The trajectory of the electron is then curved as shown in Fig. 6.16(a) and therefore the electrons emit synchrotron radiation. By using an optical cavity the emission of radiation can be stimulated, resulting in an intense coherent beam. The wavelength of the radiation is given by

$$\lambda \simeq \frac{\lambda_w}{2\gamma^2} = \frac{\lambda_w}{2(E/mc^2)^2} \tag{6.23}$$

where $\lambda_w$ is the wavelength of the wiggler, and $\gamma = E/mc^2$ is the ratio of the electron energy to its rest energy. Thus the laser can be tuned by changing the electron energy. As an example we take $\lambda_w = 4$ cm and E = 100 MeV and this leads to $\lambda \sim 500$ nm, which is in the visible. Results from the free electron laser operated at the French laboratory at Orsay are shown in Fig. 6.16(b). The principal peak at $\lambda = 648$ nm is accompanied by two sidebands. The lower trace shows the radiation spectrum without the optical cavity and is ~1000 times less intense.
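The tuning relation can be checked with a few lines of code. This is a minimal sketch under the assumption that the electron rest energy is 0.511 MeV; the function names are illustrative and not from the text. It evaluates the wavelength for the example above and inverts the relation to find the beam energy needed for a chosen wavelength.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # m c^2 of the electron (assumed value)


def fel_wavelength(wiggler_period_m, beam_energy_mev):
    """Radiated wavelength lambda = lambda_w / (2 gamma^2), with gamma = E / mc^2."""
    gamma = beam_energy_mev / ELECTRON_REST_ENERGY_MEV
    return wiggler_period_m / (2.0 * gamma ** 2)


def beam_energy_for(wiggler_period_m, wavelength_m):
    """Beam energy in MeV that tunes the laser to `wavelength_m`."""
    gamma = math.sqrt(wiggler_period_m / (2.0 * wavelength_m))
    return gamma * ELECTRON_REST_ENERGY_MEV


# Example from the text: lambda_w = 4 cm, E = 100 MeV
lam = fel_wavelength(0.04, 100.0)
print(f"lambda = {lam * 1e9:.0f} nm")

# Energy needed to reach the near UV (300 nm) with the same wiggler
e_uv = beam_energy_for(0.04, 300e-9)
print(f"E for 300 nm = {e_uv:.0f} MeV")
```

Because the wavelength scales as the inverse square of the energy, a modest increase in beam energy moves the output from the visible into the near UV.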

Fig. 6.16. The free electron laser. (a) A beam of high energy electrons 'wiggles' as it passes through a spatially periodic alternating magnetic field; the emitted radiation is coherent and with the addition of mirrors can have a significant component of stimulated emission. (b) The emitted radiation as a function of wavelength $\lambda$ (nm), for the tuned and the untuned cavity; in this case the presence of the optical cavity enhanced the radiated power by a factor of $10^3$.

We have considered only a few of the various proposals for antiballistic weapons, and tried to illustrate the problems of power requirements, timing, and of vulnerability of a space-based defense system. A highly detailed technical discussion can be found in the report of a study group formed by the American Physical Society, which was published in Reviews of Modern Physics, Vol. 59, July 1987. Furthermore, defense against submarine launched missiles, with their much shorter flight time, and against radar-evading low-flying planes and cruise missiles, introduces a new set of parameters and problems. How effective a defense system can be, and the political consequences of its deployment, are also important factors and are briefly touched upon in the following section.

6.6 Arms limitation treaties

Ever since the first nuclear explosions in 1945, efforts have been made to limit the spread of nuclear weapons and the capability to manufacture them. Nuclear weapons technology has been kept secret for military reasons, and while such secrecy cannot be effective over the long run it inhibits the spread of information. A more effective approach is through the adherence to multilateral agreements among nations and by the establishment of bilateral agreements between the U.S. and U.S.S.R. In spite of these efforts the stockpile of nuclear weapons has grown to the absurdly large quantities discussed in Section 6.3. The best hope for the avoidance of a nuclear catastrophe is a continuing dialogue between the superpowers leading eventually to a de-escalation and reduction of the inventories of nuclear weapons.

One of the reasons for the present predicament is the philosophy of escalation, coupled with the doctrine of massive retaliation. Furthermore, the proliferation of nuclear weapons among other nations increases the danger of a nuclear accident which could then precipitate a full-scale exchange. As has been discussed, such a conflict would create global effects that could permanently destroy life on earth as we know it. The agreements and treaties that are in effect have helped in checking the uncontrolled growth of nuclear materials, nuclear weapons and of their testing; but to assure a future free of nuclear war the existing agreements must be expanded and strengthened.

Below we list the most important treaties that are currently in effect and/or are being negotiated.

(1) The limited test ban treaty, signed in 1963. This was an important milestone in that it stopped all tests in the atmosphere and in the sea (with only minor exceptions). It curtailed the spread of radioactive fallout from nuclear testing and effectively limited the size of the weapons that are being tested.

(2) The outer space treaty (1967-72). This treaty prohibits the use of outer space and of the seabeds for the deployment of nuclear weapons. It may be of major importance in preventing the placement of weapons in orbit around the earth as envisioned in certain strategic scenarios for the future.

(3) The non-proliferation agreement, negotiated between 1968 and 1970 but not signed by all nations. According to this treaty no new nations can join the 'nuclear club', and nuclear fuel technology is to be strictly controlled by those countries that do possess it. So far this treaty has been reasonably effective but it is evident that several nations have secretly acquired or are seeking a nuclear capability.

The bilateral agreements between the U.S. and the U.S.S.R. are grouped under the 'Strategic Arms Limitation Treaties' known as SALT I and SALT II. The latter has not as yet been ratified by the U.S. Senate but with some exceptions its provisions are respected by both countries. Included in SALT I were:

(1) The antiballistic missile (ABM) treaty of 1972. This prevents the construction of new ABM systems and forbids interference with reconnaissance satellites. This last step is essential in the efforts for verification of any arms limitation agreement.

(2) The agreement on limits for certain offensive weapons which resulted in a freeze of the number of deployed missiles (but not warheads) at their level of October 1972.

SALT II, which was signed in 1979, has two important provisions:

(1) Restrictions on MIRVs (multiple independently targeted reentry vehicles) and on other new weapons.

(2) The threshold treaty which limits tests of nuclear weapons to a level below 150 kt. There have been protests from both sides that the treaty has been occasionally violated, but it remains true that overall the testing has been limited to relatively small devices.

Finally, in 1987 an agreement on intermediate nuclear forces (INF) was signed. This agreement covers weapons of the NATO allies deployed in Europe and similar Soviet weapons deployed against Europe. The INF treaty may well be a landmark agreement because it calls for the dismantling and removal of weapons already deployed and operational.


Future proposed agreements are:

(1) The comprehensive test ban treaty whereby all testing of nuclear weapons would be eliminated. This would be an important step but it is opposed by the military who are concerned about the preparedness of their forces and the reliability of the weapons stockpile. Yet, on occasion, one or the other country has declared a unilateral moratorium on testing.

(2) A treaty against testing of antisatellite weapons (ASAT).

(3) A true arms reduction treaty and even possibly a 'freeze' on all nuclear weapons construction.

The willingness of a nation to sign a treaty and to adhere to it depends in large measure on the nation's ability to make sure that its adversary is complying with the provisions of the treaty. Reconnaissance satellites play a key role in treaty verification and more recently on-site inspection by foreign teams has been included in the agreements. Political considerations, economic pressures, the concern about local conflicts and even nuclear posturing are all factors making the negotiation and ratification of arms limitation treaties difficult and protracted.

Arms limitation concerns are not restricted to nuclear arms and should eventually cover conventional weapons. Biological and chemical weapons can also produce disaster on a global scale and a convention was signed in 1977 to limit their production. Unfortunately, war has been part of man's history but it has now reached the point where it could completely annihilate our civilization. It is incumbent upon all of us to control the super-weapons that have been built and to prevent their use under any circumstances.

Exercises

Exercise 6.1

Consider a fission weapon containing 20 kg of $^{235}$U. Assume that the $^{235}$U nucleus fissions to the following channel in 10% of the cases

(a) Calculate the amount of $^{90}$Sr released.
(b) Given that the lifetime of $^{90}$Sr is 28 years, find the total radioactivity released.
(c) Assuming that the $^{90}$Sr is dispersed over 2000 km$^2$ find the contamination in $\mu$Ci/m$^2$.
(d) Estimate the yield of the weapon in kt.

Exercise 6.2

Consider the trajectory of a ballistic missile in its most simplified form (ignore the curvature of the earth and air resistance). Let the missile be launched for maximum range ($\theta = 45°$), and let the range of the missile be the shortest distance from Moscow to New York (approximately 9000 km).

(a) Calculate the required launch velocity.
(b) Calculate the time of flight.
(c) Assuming an acceleration of 5g calculate the burn (or boost) time.
(d) Calculate the maximum height of the trajectory.

Exercise 6.3

Consider the illumination of an ICBM booster by blue light, $\lambda = 0.3\ \mu$m. A mirror 4 m in diameter is stationed at a distance of 1500 km.

(a) What is the minimum size of the spot on the missile due to the diffraction limit of the mirror?

(b) If the energy deposition (surface density) required to damage any part of the missile is $10^4$ J/cm$^2$ find the energy required to destroy the missile.

(c) How does this compare with the power of commercially available lasers?

Exercise 6.4

Consider a fire that would destroy a city of one million inhabitants.

(a) How much soot do you expect to be generated?
(b) Estimate the thickness of the soot layer that would reduce the sun's intensity by a factor of 10.
(c) Given the thickness and the total mass of soot calculated in (a), what is the maximal area that could be covered?
(d) What fraction of the earth's surface is that?

For details see the article by B. Levi and T. Rothman in Physics Today (September 1985) - 'Nuclear Winter' and references therein.

Exercise 6.5

An empirical formula for the radius of destruction of a missile silo hardened to H (H expressed in thousands of psi) by a weapon of yield Y (Y in megatons) is

$$R_B = 460\,(Y/H)^{1/3}\ \text{(meters)}$$

The probability of hitting the target within a radius R is given by

given that H = 2000 psi, Y = 1 Mt, CEP = 360 m.

(a) Find the probability that the silo is destroyed by one such missile.
(b) Estimate the long lived radioactivity that is released.

Exercise 6.6

An X-ray laser is placed in polar orbit at an altitude h = 1000 km. To destroy a missile the laser must deliver $10^4$ J/cm$^2$ on target.

(a) Find the required brightness.
(b) If the beam divergence is 1000 times the diffraction limit and if $\lambda = 0.01$ nm and the device diameter is D = 1 m, find the necessary power in the laser.
(c) Estimate the spot diameter at the target.

PART D

SPACE TRAVEL

Transportation, of people and materials, is among the major factors that have made our civilization possible. The harnessing of animals and the use of ships were exploited early in the history of man. The sailing boat was an extraordinary invention because the sea offered reduced friction to the point where the wind would suffice to propel the ship. Railroads provided the freight capacity that made possible the industrial revolution, to be followed by the introduction of the automobile in the 20th century. The first airplane flight by the Wright brothers took place in 1903, and today transportation has brought within easy access all parts of the globe. This speed and ease in transportation has had and continues to have a profound effect in shaping the social and economic structure of the world community. In 1969 man landed on the moon, and unmanned spacecraft have reached to the edge of our planetary system. More ambitious missions into space can be foreseen as technology advances and the desire to carry them out persists.

Chapter 7 is devoted to a discussion of airplane and rocket flight and propulsion. In contrast to ships which are buoyant, airplanes are heavier than air and are supported by the dynamically produced lift. The reduced friction in air allows airplanes to reach high velocities, even in excess of the speed of sound. While airplanes must fly in the atmosphere, rockets are not subject to such a restriction. Rocket velocities are ten times higher than those of airplanes, to the point that rockets can exceed the escape velocity and leave the gravitational field of the earth.

To understand airplane flight a knowledge of elements of fluid dynamics is necessary, and these are developed in the text. Interestingly, rocket engines also involve fluid dynamics since the exhaust gases emerge at very high temperature and at supersonic speed. As a final application of these ideas the NASA shuttle is analysed in the concluding section of the chapter. For completeness, a brief review of the basic equations of fluid dynamics is given in Appendix 3. The speed of sound is discussed in Appendix 4.

Space travel proper is considered in Chapter 8. We focus on travel within our planetary system since such missions have been successfully completed. The solar system and orbit theory are reviewed briefly, to be followed by a discussion of transfer orbits and related maneuvers necessary for planetary flight. The Voyager 2 mission is considered in some detail and exemplifies the application of the principles discussed previously. Next we discuss ideas and proposals for future travel outside the solar system; we emphasize first principles such as energy and momentum conservation since they help us distinguish the physical from the fictional world. The final section is devoted to inertial guidance in view of the essential role it plays in any rocket or space mission.

7 AIRPLANE AND ROCKET FLIGHT

7.1 Fluid flow and dynamic lift

The motion of a fluid is extremely complex because the individual molecules are subject to random thermal motion as well as to the collective motion of the fluid as a whole. Thus we consider a small element $d\tau$ of the fluid and follow its motion as a function of time. We will assume that the fluid is incompressible, so that the mass $dm = \rho\, d\tau$ contained in the volume $d\tau$ remains fixed and the density $\rho$ is constant throughout the fluid; we will also assume that the fluid is non-viscous, that is there are no internal frictional forces. These two assumptions are applicable to motion through air when the velocity v is small as compared to the velocity of sound $v_s$, i.e. $v \ll v_s$. The velocity of sound is a measure of the random thermal velocity of the molecules; its value for air at s.t.p. is $v_s \sim 330$ m/s. When necessary we will relax these assumptions.

The simplest form of flow occurs when the velocity at each point of the liquid remains constant in time. This is illustrated in Fig. 7.1(a) where the element $d\tau$ follows the path from the point P to Q to R and has the velocity $v_P$, $v_Q$, $v_R$; at a later time another element of the fluid will be at P but it will again follow the path to Q to R and have the same velocity. The path followed by a fluid element is called a streamline and the velocity is always tangential to the streamlines; such flow is called steady or laminar flow.

Streamlines cannot cross one another because at the crossing point the velocity would be undefined; thus the fluid contained within a flow tube, such as shown in Fig. 7.1(b), remains always within the tube. If the area at the entrance of the tube is $A_1$, the mass of fluid entering the tube per unit time is

$$Q_1 = \rho_1 A_1 v_1$$

and correspondingly the mass leaving the tube at the exit is $Q_2 = \rho_2 A_2 v_2$. The conservation of mass, also referred to as the continuity condition, demands that $Q_1 = Q_2$. For an incompressible fluid the density is constant so that $\rho_1 = \rho_2$ and the continuity equation takes the form

$$A_1 v_1 = A_2 v_2 \tag{7.1}$$

Namely, the area of the flow tube is inversely proportional to the velocity at that point. Conversely, the density of the streamlines is proportional to the velocity of the field.*

Streamlines are analogous to electric or magnetic field lines and thus should begin on a source and terminate on a sink. In practice we draw streamlines by extending them to infinity; they can also close onto themselves. Some examples of steady flow are shown in Fig. 7.2. In (a) the fluid is flowing through a tube that has a (smooth) constriction and at that point the velocity of the fluid is greatly increased. As we shall see later the pressure is reduced at the throat and this arrangement, called a Venturi tube, has many applications as for instance in carburetors for internal combustion engines. Fig. 7.2(b) shows the streamlines for the steady flow past a cylinder. The flow divides itself above and below the cylinder, while the velocity at the singular points $S_1$, $S_2$ is zero; these are called stagnation points. Note that the pattern of the streamlines is the same whether the liquid flows by the cylinder with velocity v, or whether the cylinder moves with velocity v through a stationary liquid.

Bernoulli's equation. In the absence of friction, we can use energy conservation to derive an important relation between the forces acting on a fluid element and its velocity along a streamline. As before we will consider an element of fluid enclosed in a flow tube as shown in Fig. 7.3.

Fig. 7.1. (a) Fluid flow can be visualized by 'streamlines', continuous curves that are tangent to the local velocity vector. (b) Streamlines form the boundary of 'flow tubes'.

* Figures of streamlines are drawn, or photographed, in two dimensions. In reality of course the pattern is three dimensional but often one dimension can be taken as infinitely long, or as having circular symmetry, etc.

At the entrance to the tube the area, pressure and velocity are $A_1$, $P_1$ and $v_1$, while at the exit they are $A_2$, $P_2$ and $v_2$. Furthermore we can let the entrance and exit areas be at different elevations, $h_1$ and $h_2$, with respect to the horizontal. The mass of fluid contained in an element crossing the entrance area for a time interval dt is

$$dm = \rho_1 A_1 v_1\, dt = \rho_2 A_2 v_2\, dt$$

and must be equal at the entrance and exit regions for continuity reasons. We also assume that the fluid is incompressible, so that $\rho_1 = \rho_2 = \rho$. The work done on this element by the force due to the pressure is

$$\Delta W_P = W_1 - W_2 = F_1\, ds_1 - F_2\, ds_2 = (P_1 A_1)(v_1\, dt) - (P_2 A_2)(v_2\, dt)$$

where $ds_1$ and $ds_2$ are the thickness of the element along the streamline. The work done by the force of gravity is

$$\Delta W_g = dm\, g h_1 - dm\, g h_2 = (\rho A_1 v_1\, dt)\, g (h_1 - h_2)$$

In the absence of friction the work must equal the change in the kinetic energy of the fluid element

$$\Delta(\mathrm{K.E.}) = \tfrac{1}{2}\, dm\, (v_2^2 - v_1^2) = \tfrac{1}{2} (\rho A_1 v_1\, dt)(v_2^2 - v_1^2)$$

Fig. 7.2. (a) Flow through a 'Venturi tube', i.e. a smooth constriction. (b) Streamlines for steady flow past a cylinder; note the two stagnation points.

Fig. 7.3. Demonstration of Bernoulli's principle with the help of a flow tube of changing cross-section and of differing elevation at its two ends.


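Setting $\Delta(\mathrm{K.E.}) = \Delta W_P + \Delta W_g$ and using $A_1 v_1 = A_2 v_2$ we then find

$$P_1 + \rho g h_1 + \tfrac{1}{2}\rho v_1^2 = P_2 + \rho g h_2 + \tfrac{1}{2}\rho v_2^2 \tag{7.2}$$

which is known as Bernoulli's principle and is valid along a particular streamline.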

Bernoulli's principle is often expressed by the statement

$$P + \rho g h + \tfrac{1}{2}\rho v^2 = \text{constant} \tag{7.2'}$$

which should be used carefully because the constant can have a different value for different streamlines. For horizontal flow or when the effects of gravity are negligible we write Eq. (7.2') in the simpler form

$$P + \tfrac{1}{2}\rho v^2 = \text{constant} \tag{7.3}$$

From Eq. (7.3) we see that when the velocity increases the pressure drops and vice versa; this gives rise to slight paradoxes which can be observed in everyday life, as for instance the low pressure in a tornado, etc. Finally by taking the derivative of Eq. (7.3) we can obtain a differential form of Bernoulli's equation

$$dP + \rho v\, dv = 0 \tag{7.3'}$$

Again this equation should be used with care because dP and dv are the total differentials of P and v as a function of space and time; for steady flow dP/dt and dv/dt are of course zero.
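As an illustration of how the continuity condition and Bernoulli's principle work together, the following minimal sketch estimates the pressure drop at the throat of a horizontal Venturi tube. The numerical values (air density of 1.2 kg/m^3, inlet speed, area ratio) are assumptions chosen for the example and do not come from the text, and the function names are illustrative.

```python
def throat_velocity(v1, area1, area2):
    """Continuity for incompressible flow, A1 v1 = A2 v2 (Eq. 7.1)."""
    return v1 * area1 / area2


def throat_pressure(p1, v1, v2, rho):
    """Horizontal Bernoulli, P + rho v^2 / 2 = constant (Eq. 7.3)."""
    return p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2)


# Assumed example values (not from the text)
RHO_AIR = 1.2     # kg/m^3
P_INLET = 101325  # Pa
V_INLET = 10.0    # m/s
AREA_INLET = 4.0  # arbitrary units; only the ratio matters
AREA_THROAT = 1.0

v_throat = throat_velocity(V_INLET, AREA_INLET, AREA_THROAT)
p_throat = throat_pressure(P_INLET, V_INLET, v_throat, RHO_AIR)

print(f"throat velocity = {v_throat:.1f} m/s")
print(f"pressure drop   = {P_INLET - p_throat:.0f} Pa")
```

With a fourfold reduction in area the velocity quadruples and the pressure falls by 900 Pa, which is the suction exploited in a carburetor. The velocities here are well below the speed of sound, so the incompressible approximation is reasonable.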

As an important application of Bernoulli's principle let us re-examine the steady flow past a cylinder shown in Fig. 7.2(b) and reproduced in Fig. 7.4(a). At the stagnation points the velocity is zero so that the pressure $P_1 = P_2 = P_0$ and the resulting forces are equal and opposite. At points 3 and 4, the velocity is high and the pressure is low, $P_3 = P_4 = P_0 - \tfrac{1}{2}\rho v_3^2 = P_0 - \tfrac{1}{2}\rho v_4^2$; however $P_3$ and $P_4$ give rise to equal and opposite forces so that the cylinder remains in equilibrium. Let us next assume that the cylinder rotates as indicated in Fig. 7.4(b). Then because of frictional forces, some of the fluid is carried along by the cylinder's motion and the streamlines are deformed as indicated. In this case $v_3 > v_4$ and therefore $P_3 < P_4$; as a result the cylinder is subject to a net force in the upward direction and will move accordingly.

Fig. 7.4. Flow past a cylinder. (a) Cylinder at rest; the pressure exerted on the cylinder surface is balanced in all directions. (b) When the cylinder rotates the streamline density is affected and the pressure is no longer balanced.

The forces acting on a rotating cylinder or sphere that moves through a fluid or air are well known to baseball and tennis players where the effects of spin are quite pronounced. Lord Rayleigh appears to have been the first to appreciate that a tennis ball spinning in one direction (Fig. 7.5(a)) would reach much farther than when spinning in the opposite direction (Fig. 7.5(b)). In the first case the Bernoulli force reduces the effect of gravity, while in the second case it increases it. The same principle gives rise to the lift on an airplane wing, or airfoil.

An airfoil is designed so as to preserve the streamline flow around it, but in such a way that the velocity of the airstream above the foil is increased. This depends on the angle $\alpha$ of the airfoil with respect to the airstream. For instance for the position of Fig. 7.6(a) the flow is evenly divided above and below the airfoil. When positioned as in (b) of the figure the flow pattern is changed and there is an upward lifting force acting on the airfoil. The angle between the airfoil axis and the velocity of the airstream is called the angle of attack.

Fig. 7.5. A spinning tennis ball moving in the direction opposite to the streamlines: (a) if the ball spins in the sense indicated by the arrow it is subject to a 'lifting' force and can reach farther than a non-spinning ball, (b) if the spin is reversed the additional force drives the ball towards the ground and therefore its range is reduced.

Fig. 7.6. An airfoil in a fluid stream: (a) the attack angle is zero and there is no lift, (b) for a finite attack angle $\alpha$ the airfoil experiences a lifting force.

Circulation. We are all familiar with eddies that are produced in the flow of water, in which case the streamlines are closed onto themselves as shown in Fig. 7.7(a). If the fluid elements are moving in a circular pattern there must be a force acting perpendicular to the direction of motion; namely a pressure gradient in the radial direction. We consider an element of radial thickness dr, width w and of length l (in the direction perpendicular to the plane of the paper) as shown in Fig. 7.7(b). The force acting on the fluid element is

$$F = A\, dP = l w\, dP$$

The mass of the element is $dm = \rho\, dV = \rho l w\, dr$, and if the fluid element is moving in a circle of radius r, Newton's law demands that

$$\frac{v^2}{r}\, dm = A\, dP$$

or

$$dP = \rho v^2\, \frac{dr}{r} \tag{7.4}$$

From Bernoulli's equation in its differential form (Eq. (7.3')) we can express $dP = -\rho v\, dv$ and therefore Eq. (7.4) is written as

$$-\rho v\, dv = \rho v^2\, \frac{dr}{r}$$

or

$$\frac{dv}{v} = -\frac{dr}{r} \tag{7.4'}$$

Eq. (7.4') can be easily integrated to give

$$\ln v + \ln r = \text{constant}$$

or

$$v = \frac{\text{constant}}{r} \tag{7.5}$$

Fig. 7.7. Circulation around a vortex: (a) the streamlines are closed and their density increases as we approach the vortex, (b) an element of the fluid showing the forces acting on it.

From Eq. (7.5) we see that the velocity at the center of the eddy becomes infinite, $v \to \infty$ as $r \to 0$; this singular point is called a vortex. A further consequence of Eq. (7.5) is that the line integral of the velocity taken along any streamline has always the same value. The line integral is defined through $\oint \mathbf{v}\cdot d\mathbf{l}$; if we take a circular streamline, v is constant and $\mathbf{v}$ is always parallel to $d\mathbf{l}$. Thus $\oint \mathbf{v}\cdot d\mathbf{l} = v\, 2\pi r$ and since $vr = C$ (where C is the constant appearing in Eq. (7.5)) we have the important result

$$\Gamma = \oint \mathbf{v}\cdot d\mathbf{l} = 2\pi C \tag{7.6}$$

While the result of Eq. (7.6) was derived for the special case of a circular path along a streamline it is generally true for any path as long as it encloses the vortex. If the vortex is not enclosed in the path, the line integral vanishes. Thus the strength of a vortex can be characterized by the value $\Gamma$ of the line integral of the fluid velocity as given by Eq. (7.6); $\Gamma$ is called the circulation of the vortex. The importance of this concept is that when referring to the flow around a rotating cylinder, or an airfoil, we can represent it as the sum of a uniform flow and of a circulation $\Gamma$ surrounding them.
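The claim that the line integral equals 2πC for any path enclosing the vortex, and vanishes otherwise, can be checked numerically. The sketch below is an illustration only: it assumes a vortex at the origin with v = C/r directed counterclockwise, and integrates the velocity along the edges of two square paths with the midpoint rule. The function names and the choice of paths are hypothetical.

```python
import math


def vortex_velocity(x, y, c):
    """Velocity (vx, vy) of a vortex at the origin: magnitude c/r, tangential."""
    r2 = x * x + y * y
    return (-c * y / r2, c * x / r2)


def circulation(vertices, c, steps_per_edge=2000):
    """Midpoint-rule line integral of v . dl around a closed polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        dx = (x1 - x0) / steps_per_edge
        dy = (y1 - y0) / steps_per_edge
        for k in range(steps_per_edge):
            t = (k + 0.5) / steps_per_edge
            vx, vy = vortex_velocity(x0 + t * (x1 - x0), y0 + t * (y1 - y0), c)
            total += vx * dx + vy * dy
    return total


C = 3.0  # assumed vortex constant in v = C/r

# Counterclockwise square that encloses the vortex
enclosing = [(1, -1), (1, 1), (-1, 1), (-1, -1)]
# Counterclockwise square that does not enclose it
outside = [(2, -1), (4, -1), (4, 1), (2, 1)]

gamma_enclosing = circulation(enclosing, C)
gamma_outside = circulation(outside, C)

print(f"enclosing path: {gamma_enclosing:.6f}  (2 pi C = {2 * math.pi * C:.6f})")
print(f"outside path  : {gamma_outside:.6f}")
```

The square path was chosen deliberately because the velocity is neither constant nor parallel to the path along it, so the agreement with 2πC is not a consequence of the circular symmetry used in the derivation.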

7.2 Airplane flight

We have seen in Fig. 7.6 that when an airfoil moves through air, a circulation develops around it and therefore it is subject to a lifting force. The lifting force per unit length of span, which we designate by L (see Fig. 7.8), is given by the Kutta-Jukowski law

$$L = \rho v \Gamma \tag{7.7}$$

with $\rho$ the air density, v the velocity of the airstream and $\Gamma$ the circulation. In turn the circulation can be expressed by

$$\Gamma = \pi \xi v c \alpha \tag{7.8}$$

where c is the chord of the airfoil and $\alpha$ is the angle of attack defined in Fig. 7.6(b); $\xi$ is an empirical factor of order unity and hereafter we will use $\xi = 1$.

The circulation is equivalent to a vortex as shown in the previous section and therefore we can replace the effect of the airfoil by imagining a line of vortices along its axis. However vortex lines, just as streamlines, must extend to infinity, and in the case of a finite wing only part of the vortex line moves along with the airplane. As a result the vortex line bends backwards as shown in Fig. 7.9(a) and leaves a trail of vortices emanating from the wingtips. The air flow induced by the vortex is known as the downwash and it exerts a pressure on the wing as shown in (b) of the figure; it is strongest at the wing tip. The condensation trails that we see when airplanes fly at high altitude are due to the detached vortices, where because of the increased air velocity, the pressure and thus also the temperature drop, and condensation occurs. Fig. 7.10(a) shows a vortex formed by flow around a sharp edge and in (b) of the figure the vortex trails of an airplane (emitting dust) can be clearly seen.

The presence of the downwash is unavoidable for any finite wing andhas the important consequence that the airstream velocity is not parallel

Fig. 7.8. Flow and circulation around an airfoil.

Fig. 7.9. (a) The line of vortices associated with an airfoil of finite length.(b) 'Downwash' forces acting on the airfoil as a result of vortex formation.

, V> Y r (r7 I I I I CTT

Line ofvortices

(a) (b)

Airplane flight 249

anymore to the motion of the airplane. This is illustrated in Fig. 7.11,where the effective velocity v' is the vector sum of the airplane velocity vand of the downwash w. The aerodynamic force is always perpendicularto the airstream velocity, and therefore in the presence of the downwash,the force R is not normal but has a component D which opposes themotion of the airplane. This is called the induced drag, and the airplaneengine does work against the induced drag even in level flight. The induceddrag per unit length of span, D\ is given by

$$\frac{D'}{L'} = \frac{w}{v} \quad\text{or}\quad D' = \frac{\rho v \Gamma w}{v} = \rho\,\Gamma w \qquad (7.9)$$

Fig. 7.10. Demonstration of vortex flow: (a) due to streamline flow past a sharp object, (b) vortex trails emanating from the wing tips can be clearly seen in this picture because the plane was emitting dust. (From Th. von Karman, Aerodynamics, McGraw-Hill, 1954. Courtesy of McGraw-Hill Book Co.)

Fig. 7.11. Analysis of lift and drag forces acting on an airplane wing of finite length. The airplane velocity is $-v$, but because of the downwash $w$ the airstream velocity is $v'$.

The exact analytic calculation of lift, drag and other aerodynamic forces on any particular structure is obviously too complex for practical purposes. Instead very precise results can be obtained by measuring the same parameters on models in a wind tunnel and then scaling them to the full size of the airplane. It is therefore customary to express the forces in terms of coefficients defined as follows

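$$C_L = \text{lift coefficient} = \frac{\text{Lift}}{\tfrac{1}{2}\rho v^2 S}, \qquad C_D = \text{drag coefficient} = \frac{\text{Drag}}{\tfrac{1}{2}\rho v^2 S} \qquad (7.10)$$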

Here S is the area of the wing and $\tfrac{1}{2}\rho v^2$ is the 'dynamic pressure'. Finally a reasonable approximation for the downwash velocity is

$$w \simeq \frac{2\alpha v}{\mathrm{AR}} \qquad (7.10')$$

where $\mathrm{AR} = b^2/S$ is the 'aspect ratio' of the wing with b the length or 'span' of the wing. Thus we also have $\mathrm{AR} \sim b/c$ (c is the chord), from where we see that, in general, $\mathrm{AR} > 1$.

From Eqs. (7.7) to (7.10) we can easily deduce in the same spirit of approximation that

$$C_L \simeq 2\pi\alpha \qquad (7.11)$$

and that

$$C_D = 4\pi\alpha^2\,(S/b^2) \qquad (7.11')$$

Finally the necessary engine power to keep the plane in level flight is

$$P = Dv = \frac{C_L}{\pi\,\mathrm{AR}}\,Lv \qquad (7.12)$$

where L is the lift force and D the drag force.
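The thin-airfoil estimates of Eqs. (7.11) and (7.11') are easy to evaluate numerically. The following is a minimal sketch, not a design tool: the function names, the 5° angle of attack, the aspect ratio of 8 and the 1000 kg airplane are all illustrative assumptions, and the power is computed from P = Dv with D/L = C_D/C_L.

```python
import math

def lift_coefficient(alpha):
    """Eq. (7.11): C_L ~ 2*pi*alpha, with alpha in radians."""
    return 2.0 * math.pi * alpha

def induced_drag_coefficient(alpha, aspect_ratio):
    """Eq. (7.11'): C_D = 4*pi*alpha**2 * (S/b**2), where b**2/S is the aspect ratio."""
    return 4.0 * math.pi * alpha ** 2 / aspect_ratio

def induced_drag_power(lift, speed, alpha, aspect_ratio):
    """Power P = D*v spent against induced drag, using D/L = C_D/C_L."""
    drag = lift * induced_drag_coefficient(alpha, aspect_ratio) / lift_coefficient(alpha)
    return drag * speed

# Illustrative (assumed) numbers: a 1000 kg airplane flying level at 55 m/s.
alpha = math.radians(5.0)   # assumed angle of attack
aspect_ratio = 8.0          # assumed b**2/S
lift = 1000.0 * 9.81        # in level flight the lift balances the weight (N)
power = induced_drag_power(lift, 55.0, alpha, aspect_ratio)

print(f"C_L = {lift_coefficient(alpha):.3f}")
print(f"C_D = {induced_drag_coefficient(alpha, aspect_ratio):.4f}")
print(f"induced-drag power = {power / 1e3:.1f} kW")
```

Within this approximation the ratio C_D/C_L reduces to 2α divided by the aspect ratio, so a long, narrow wing lowers the induced drag for a given lift.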

7.3 The effects of viscosity

So far we have assumed that the fluid is incompressible and non-viscous. The effects of the compressibility of air become important when the velocity approaches the speed of sound and we will consider them in the following section. The viscosity of a fluid is due to internal friction and is always present; the fact that circulation is established around an airfoil is due to the frictional forces between the surface of the wing and the air molecules, so that a thin boundary layer is carried along with the wing. At high velocities the flow ceases to be laminar, eddies develop and momentum is transferred between adjacent fluid elements; the flow has become turbulent.

As in our previous treatment we consider the fluid as consisting of small elements $d\tau$, carrying momentum $d\mathbf{p} = \rho\mathbf{v}\,d\tau$; the density and velocity are assumed to be continuous variables of position and time. We distinguish between 'body-forces' which act on the bulk of the material and 'surface-forces' which are specific to each element. Examples of body-forces are gravitation and, in the case of a plasma, electrical forces; we will not concern ourselves with such forces here. The surface-forces are the normal stress, that is the force resulting from the pressure in the fluid, and the tangential stress or shear which is due to the viscous forces in the liquid. In solids the shearing force depends on the corresponding normal force $F_N$, i.e. we write for the frictional force $F_f = \mu F_N$ where $\mu$ is the coefficient of friction. In liquids the shearing stress depends on the rate of change of the velocity through the fluid. This can be illustrated if we think of a container filled with a fluid at rest, up to a height d as shown in Fig. 7.12. We now place a flat plate on the surface of the fluid and move it with constant velocity $v_0$ along the x-direction. As long as the flow remains laminar, the fluid layer next to the plate (z = d) has velocity $v = v_0$, whereas the layer at the bottom of the container (z = 0) must have velocity v = 0. If the area of the plate is A, it is found that the tangential force $F_t$ required to maintain the velocity $v_0$ is given by

$$F_t/A = \eta\,(v_0/d) \qquad (7.13)$$

The proportionality factor $\eta$ is called the coefficient of viscosity of the fluid and can be defined more generally through

$$\eta = \frac{F_t/A}{dv_x/dz} \qquad (7.13')$$

The coefficient of viscosity has dimensions of kg/(m·s), as can be deduced from the defining equation, and varies by orders of magnitude for different fluids. Some typical values are given in Table 7.1. The viscosity of a fluid is due to the intermolecular forces and depends strongly on the temperature. Some materials such as tar or glass, which at normal temperature appear to be solids, can be thought of as extremely viscous fluids.

Fig. 7.12. Demonstration of the effects of viscosity by moving a plate A over a liquid of depth d. The velocity of the liquid as a function of depth is indicated by the arrows.

When we take frictional forces into account, Newton's equation for the fluid elements takes the form

$$\mathbf{F}_P + \mathbf{F}_f = m\mathbf{a} \qquad (7.14)$$

with $\mathbf{F}_f$ the frictional force, $\mathbf{F}_P$ the force due to pressure gradients and $\mathbf{a}$ the acceleration of the element. We can of course replace $m\mathbf{a}$ on the r.h.s. of Eq. (7.14) by introducing on the l.h.s. an inertial force $\mathbf{F}_I = -m\mathbf{a}$; we then obtain an equilibrium condition between the three forces

$$\mathbf{F}_P + \mathbf{F}_f + \mathbf{F}_I = 0 \qquad (7.14')$$

These forces are shown schematically in Fig. 7.13 and need not coincide with the velocity vector or be collinear. Eq. (7.14) makes it easy to visualize different regimes of fluid flow. For instance at low velocities, the acceleration is also low and therefore the inertial force is weak. Thus the pressure force must balance the frictional force. If on the other hand $\eta$ is low, then the pressure force balances the inertial force. In general, if the ratio of $F_I$ to $F_f$ is the same the flow pattern is identical, and geometrically similar objects have similar dynamic properties. The ratio of the inertial to the frictional force is called the Reynolds number $\mathcal{R}$.

Table 7.1. Viscosity of selected fluids

Fluid              Viscosity η (kg/(m·s))
Lubricating oil    ~1.0
Water              0.9 × 10⁻³
Air at s.t.p.      1.8 × 10⁻⁵

Fig. 7.13. The forces acting on a fluid element: $F_P$ is the static pressure force, $F_I$ the inertial force and $F_f$ the frictional force, and the three forces must balance.

To define the Reynolds number, which is dimensionless, we must introduce a scale factor or characteristic length L. For instance in flow through pipes, L is the diameter of the pipe; for an airfoil, L would be the chord. Now the frictional force (see Eq. (7.13)) is

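$$F_f = \eta A\,\frac{v}{d} \simeq \eta A\,\frac{v}{L} \qquad (7.15)$$

while the inertial force is given by

$$F_I = m|a| = (\rho A L)\,\frac{dv}{dt} \qquad (7.15')$$

We assume that $\Delta t \sim L/v$ and that in that time $\Delta v \sim v$, leading to $a \sim v^2/L$, which is typical of circular motion. We then find

$$F_I \simeq \rho A v^2 \qquad (7.15'')$$

Therefore the Reynolds number is defined as

$$\mathcal{R} = \frac{F_I}{F_f} = \frac{\rho v L}{\eta} \qquad (7.16)$$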

From this result we see that:

$\mathcal{R}$ small corresponds to viscous flow at low velocity. As we know, this is the regime of laminar flow.

$\mathcal{R}$ large corresponds to low viscosity and high velocity. This is the regime of turbulent flow.

For flow in pipes the transition from laminar to turbulent flow occurs when $\mathcal{R} \sim 10^3$.

As an example of flow at low Reynolds number we consider a small sphere of radius a which is dropped into a viscous fluid. Eventually the sphere reaches a terminal velocity $v_T$; at this point the inertial force is zero and the pressure gradient is small as compared to the force of gravity. Therefore the force of gravity $F_g = mg$ equals the frictional drag force $F_f$. From Eq. (7.13) we expect that $F_f$ will be proportional to $\eta$, to $v_T$ and to the dimensions of the sphere. In this case $F_f$ can be calculated and is given by Stokes' law

$$F_D = F_f = 6\pi a\,\eta\,v_T \qquad (7.17)$$

Next, let us compare the flow at high Reynolds number typical of an airfoil and of a large fish. For the airfoil the following parameters are realistic:

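v = 200 km/hr ≈ 55 m/s
L = 5 m (chord length)
ρ = 1.29 kg/m³ (density of air at s.t.p.)
η = 2 × 10⁻⁵ kg/(m·s) (from Table 7.1)

Then

$$\mathcal{R} = \frac{1.3 \times 55 \times 5}{2\times10^{-5}} \simeq 1.8\times10^{7} \quad\text{(airfoil)}$$

For the fish we can use

v = 20 knots ≈ 10 m/s (fast fish)
L = 1.5 m
ρ = 10³ kg/m³ (density of water)
η = 10⁻³ kg/(m·s) (from Table 7.1)

$$\mathcal{R} = \frac{10^{3}\times 10\times 1.5}{10^{-3}} = 1.5\times10^{7} \quad\text{(fish)}$$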

We see that the airfoil and the fish have similar Reynolds numbers and this explains why their cross-sectional shapes are so similar. In both cases 'streamlining' is important to avoid turbulence.
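The comparison between the airfoil and the fish can be reproduced with a few lines of code. This is a minimal sketch of Eq. (7.16); the function name is our own, and the inputs are the rounded values quoted in the text.

```python
def reynolds_number(density, speed, length, viscosity):
    """Eq. (7.16): ratio of inertial to frictional force, rho*v*L/eta (dimensionless)."""
    return density * speed * length / viscosity

# Airfoil in air: rho = 1.3 kg/m^3, v = 55 m/s, chord L = 5 m, eta = 2e-5 kg/(m s)
airfoil = reynolds_number(1.3, 55.0, 5.0, 2e-5)

# Fast fish in water: rho = 1e3 kg/m^3, v = 10 m/s, L = 1.5 m, eta = 1e-3 kg/(m s)
fish = reynolds_number(1e3, 10.0, 1.5, 1e-3)

print(f"airfoil: R = {airfoil:.2e}")
print(f"fish:    R = {fish:.2e}")
```

Both values come out within about 20% of each other, even though the two fluids differ by a factor of nearly a thousand in density and fifty in viscosity.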

Some turbulence always develops behind an airfoil and as a result the pressure behind the airfoil is reduced. This produces a pressure drag in addition to the frictional drag. Fig. 7.14(a) shows the onset of turbulence behind an airfoil, whereas in (b) of the figure the attack angle has been increased to the point that the flow is no longer laminar and therefore there is no lifting force; we say that the foil is in a stall. At large Reynolds number the drag is mainly due to the inertial force rather than to friction. Thus

$$F_D \sim \rho v^2 A_\perp$$

where $A_\perp$ is the area perpendicular to the direction of flow. As a result, the power required to maintain constant velocity is

$$P = F_D v \sim \rho v^3 A_\perp \qquad (7.18)$$

as also found from Eq. (7.10), where $F_D = C_D\,\tfrac{1}{2}\rho v^2 S$. The dependence of the engine power on the cube of the velocity explains why increments in air speed were so dependent on more powerful propulsion systems rather than on refinements in aerodynamic design.

As we have already stated, the circulation that is established around an airfoil is due to the boundary layer that becomes attached to the wing. The thickness of the boundary layer $\delta$ depends on the viscosity of the fluid and for a typical linear dimension L, it holds that

$$\delta = L/\mathcal{R}^{1/2} \qquad (7.19)$$

For instance for the airfoil analysed in the last example, $\mathcal{R} \sim 10^7$ and the chord of the foil was L = 5 m. Then we find that

$$\delta = 5\ \text{m}/(10^{7})^{1/2} \simeq 1.6\ \text{mm}$$

Indeed the boundary layer is very thin and mechanical perturbations of the wing surface at that scale can cause the boundary layer to become detached with a consequent loss of lift. On the other hand, for extremely smooth surfaces the boundary layer becomes detached at lower Reynolds number than for a slightly irregular surface finish. This is exemplified by the 'dimpled' golf balls which, for the same stroke, have a range almost five times longer than that of a smooth ball.

7.4 Supersonic flight

Variations in the local density and pressure of a fluid can propagate through a medium and are detected by the ear giving rise to the sensation of sound. The velocity of sound is given in general by

$$v_s = (dP/d\rho)^{1/2} \qquad (7.20)$$

where P is the pressure and $\rho$ the density. For nearly ideal gases Eq. (7.20) can be evaluated and because sound propagation is adiabatic one finds that

$$v_s = (\gamma R T_0/M)^{1/2} \qquad (7.20')$$

where $\gamma = c_p/c_v$, $R = N_0 k$ is the universal gas constant, M is the molecular weight of the gas and $T_0$ the ambient temperature. The derivation of Eqs. (7.20) is given in Appendix 4.

Fig. 7.14. Turbulence developing from the motion of an airfoil through a viscous fluid: (a) small angle of attack, (b) for a large angle of attack the turbulence results in loss of lift and thus to a stall. (From A. H. Shapiro, Shape and Flow, copyright 1961 by Educational Services Inc. used by permission of Doubleday, a division of Bantam, Doubleday, Dell Publishing Group Inc.)

To evaluate $v_s$ for air at s.t.p. we use

R = 8.31 J/(g-mole·K)
M = 28.8 g/mole = 0.0288 kg/g-mole
γ = 1.4
T = 4°C = 277 K

and find

$$v_s = 335\ \text{m/s} \qquad (7.21)$$
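As a quick check of Eq. (7.21), the ideal-gas expression (7.20') can be evaluated directly. The sketch below uses the constants listed above; the function name and argument order are our own choices.

```python
import math

def speed_of_sound(gamma, temperature, molar_mass, gas_constant=8.31):
    """Eq. (7.20'): v_s = sqrt(gamma * R * T / M), with M in kg/mole and T in kelvin."""
    return math.sqrt(gamma * gas_constant * temperature / molar_mass)

# Air with gamma = 1.4, T = 277 K, M = 0.0288 kg/mole
v_s = speed_of_sound(1.4, 277.0, 0.0288)
print(f"v_s = {v_s:.1f} m/s")
```

Because $v_s$ scales as the square root of the temperature, the same function also gives the estimate of the sound speed in a hot combustion chamber used later in the chapter.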

The speed of sound is a measure of the typical random velocity of the molecules in the gas. Thus, when an object moves through a gas at velocities approaching $v_s$ we can no longer assume that the random velocity within each fluid element has averaged to zero; of course, we must also account for the compressibility of the fluid.

Fig. 7.15. (a) Pressure profile of a shock wave, (b) Generation of a shock wave by a body moving faster than the speed of sound.

When the velocity of the body exceeds the speed of sound a shock wave is produced, namely a sharp discontinuity in pressure and density propagates through the medium as illustrated in Fig. 7.15(a). The creation of the shock wave can be understood with the help of the sketch in part (b) of the figure. The body moves uniformly along the x-axis starting from the point $x_A$ at time $t_A$; at the present time $t_D$ it is at the point $x_D$ and therefore its velocity is $v = (x_D - x_A)/(t_D - t_A)$. The circles represent the locus at the time $t_D$ of the pressure waves produced when the body was passing the points $x_A$, $x_B$, ... etc. Since the pressure wave propagates with the speed of sound, the radii of these circles are given by $r_A = v_s(t_D - t_A)$, $r_B = v_s(t_D - t_B)$, ... etc. It is clear from the geometry that the pressure waves from all points of the trajectory arrive simultaneously on the surface of a cone with apex in the present position and of angle $\theta$, where

$$\sin\theta = \frac{AA'}{AC} = \frac{v_s(t_D - t_A)}{v\,(t_D - t_A)} = \frac{v_s}{v} \qquad (7.22)$$

As long as $v > v_s$ the angle $\theta$ is real and a shock wave is produced. It is customary to call the ratio $v/v_s$ the Mach number, $\mathcal{M}$. We then find that

$$\frac{1}{\tan\theta} = \frac{(1 - \sin^2\theta)^{1/2}}{\sin\theta} = (\mathcal{M}^2 - 1)^{1/2} \qquad (7.22')$$

The shock waves can be easily observed using an optical interference technique known as Schlieren photography. An example corresponding to $\mathcal{M} = 1.45$ is shown in Fig. 7.16; the angle of the shock wave approximately satisfies Eq. (7.22).
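For the Mach number of the photograph, the cone angle predicted by Eq. (7.22) can be computed as follows. This is a minimal sketch; the function name is our own, and the check at the end simply confirms the trigonometric identity relating the cotangent of the angle to the Mach number.

```python
import math

def mach_cone_half_angle(mach):
    """Eq. (7.22): sin(theta) = v_s / v = 1 / Mach; returns theta in radians."""
    if mach <= 1.0:
        raise ValueError("a shock cone exists only for Mach numbers above 1")
    return math.asin(1.0 / mach)

mach = 1.45
theta = mach_cone_half_angle(mach)
print(f"Mach {mach}: theta = {math.degrees(theta):.1f} degrees")

# Identity: cot(theta) = sqrt(Mach**2 - 1)
print(1.0 / math.tan(theta), math.sqrt(mach ** 2 - 1.0))
```

As the Mach number increases the cone becomes narrower, and as it approaches 1 from above the angle opens up toward 90°.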

The creation of a shock wave requires energy and this must be provided by the airplane engine. Consequently the drag is greatly increased as $v \to v_s$. Fig. 7.17(a) shows the lift coefficient of a flat plate as a function of Mach number; it exhibits a resonant behavior tending to infinity at $v = v_s$. When the velocity $v \to v_s$ maximum stress is developed and this is why we speak of 'breaking through the sound barrier'. The lift and drag coefficients for a more realistic structure, the German V2 rocket used in W.W.II, are shown in (b) of the figure. Once $v > v_s$ the drag is mainly due to friction which at these velocities greatly exceeds the induced and pressure drag.

Fig. 7.16. Shock wave produced by a wedge moving at Mach number 1.45; it is visualized by Schlieren photography. (From Th. von Karman, Aerodynamics, McGraw-Hill, 1954.)

For supersonic flight the airfoil design differs from that which is optimum for subsonic velocities. Sharp edges are preferable to streamlined profiles and swept-back and delta wings offer better performance. At even higher velocities we speak of hypersonic flight, where the main problem is how to cool the wing and body surfaces, which are heated to very high temperature by air friction. The thrust required to reach supersonic velocities cannot be developed by propellers, and instead jet or rocket engines are necessary; we discuss this subject in the following section.

7.5 Propulsion dynamics

So far we have examined airplane flight by tacitly assuming that the plane was moving with a given velocity v through the air. There are, however, retarding forces acting on the airplane such as the induced and frictional drag and, if the plane is climbing, the force of gravity. Therefore a force must act on the plane not only to accelerate it but also to maintain its speed. In the first airplanes the propulsive force was obtained from the action of the propeller whereas at present jet engines are used for all but small airplanes.

Fig. 7.17. Supersonic flight: (a) theoretical lift coefficient of a flat plate as a function of Mach number ($\alpha$ is the angle of attack), (b) lift and drag coefficients for the German V2 rocket used in W.W.II. (From G. P. Sutton, Rocket Propulsion Elements, J. Wiley, 1963.)

In contrast, rockets are not accelerated continuously but reach their final velocity during a short time interval when they are launched. The thrust that propels a rocket is obtained from the exhaust at high velocity of a stream of gases; the gases are produced by the burning of the fuel in a high temperature chamber and are accelerated as they exit through a nozzle. Once launched, rockets continue on their flight without further major propulsive impulses except for course corrections and similar maneuvers.

One can think of the action of the propeller as that of a screw that is forced to advance through the medium because of its rotational motion. While this picture is suitable for describing the propulsion of a ship, it is not applicable in air where there is very large slippage. Instead, we consider the airstream produced by the propeller, which has acquired a momentum $\Delta p$; since momentum must be conserved an equal and opposite force $\Delta p/\Delta t$ acts on the airplane. In Fig. 7.18 we show the streamlines and flow tube ahead and beyond the propeller. The area swept out by the propeller is S and its linear speed is v (the linear speed is given by the pitch of the propeller multiplied by its angular velocity in rev/s); the velocity of the plane is u, so that the air velocity before the propeller is u and after the propeller it is u + v. The mass of air moved by the propeller per unit time will be designated by Q where

$$Q = \frac{dm}{dt} = \rho S v \qquad (7.23)$$

Then the thrust acting on the plane is

$$F = \frac{dp}{dt} = v\,\frac{dm}{dt} = Qv \qquad (7.23')$$

Fig. 7.18. Streamlines and their modification by propeller action.


It is of interest to consider the efficiency of the propeller, namely the ratio of the power used to maintain the airplane in flight to the total engine power. The propulsive power is

$$P_p = Fu = Qvu \qquad (7.24)$$

The total power expended is equal to the rate of increase of the kinetic energy of the airstream

$$P_t = \frac{dW}{dt} = \tfrac{1}{2}Q\left[(u+v)^2 - u^2\right] = Q\left[uv + \frac{v^2}{2}\right] \qquad (7.24')$$

Therefore the propulsion efficiency is

$$\eta = \frac{P_p}{P_t} = \frac{vu}{vu + v^2/2} = \frac{1}{1 + v/(2u)} \qquad (7.24'')$$

Thus, to have high efficiency we want (v/u) to be small; however for maximum thrust we need v to be large. Consequently variable pitch propellers are used to provide large v at take-off and when high thrust is required, but permitting v to be reduced for level flight. We also see that the efficiency increases for high speed airplanes. The results of Eq. (7.24') are of general validity and are also applicable to jet engines and to rockets.
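The trade-off between thrust and efficiency can be made concrete with a short calculation. The sketch below evaluates the ratio of Eq. (7.24) to Eq. (7.24') per unit mass flow Q; the function name and the sample speeds are illustrative assumptions.

```python
def propeller_efficiency(u, v):
    """Ratio P_p / P_t from Eqs. (7.24) and (7.24'), per unit mass flow Q.

    u: airplane speed, v: velocity added to the airstream (same units, u > 0).
    """
    propulsive_power = v * u            # P_p / Q
    total_power = v * u + 0.5 * v * v   # P_t / Q
    return propulsive_power / total_power

# Assumed cruise speed u = 100 m/s, with increasing added airstream velocity v
for v in (10.0, 20.0, 50.0, 100.0):
    print(f"v/u = {v / 100.0:.1f}  ->  efficiency = {propeller_efficiency(100.0, v):.3f}")
```

The efficiency falls steadily as v/u grows, which is the reason for reducing v in level flight once the high thrust needed at take-off is no longer required.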

In practical applications one is interested in the power required to maintain a given thrust. When u = 0 we can obtain a simple result by combining Eq. (7.24') with Eqs. (7.23) and find

$$P_t = \frac{1}{2}\left(\frac{F^3}{\rho S}\right)^{1/2} \qquad (7.25)$$

Apart from the numerical factor of $\tfrac{1}{2}$ this result is valid for a hovering helicopter and gives the correct dependence of power on the thrust developed. As an example, a propeller with disk S = 10 m² and delivering a thrust F = 10⁴ N (~2200 lb) would require $P_t \simeq 190$ horsepower.
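The horsepower figure quoted above follows from setting u = 0 in Eq. (7.24'), so that $P_t = Qv^2/2$, and eliminating v with $F = Qv = \rho S v^2$. A minimal sketch of that arithmetic is given below; the air density of 1.29 kg/m³ and the conversion 1 hp = 745.7 W are assumptions of the example.

```python
import math

WATTS_PER_HORSEPOWER = 745.7  # assumed conversion factor

def static_thrust_power(thrust, disk_area, air_density=1.29):
    """Power for a given static thrust: P_t = 0.5 * sqrt(F**3 / (rho * S)), in watts."""
    return 0.5 * math.sqrt(thrust ** 3 / (air_density * disk_area))

power_watts = static_thrust_power(1.0e4, 10.0)
print(f"P_t = {power_watts / 1e3:.0f} kW = {power_watts / WATTS_PER_HORSEPOWER:.0f} hp")
```

Since the power grows as $F^{3/2}$ but only as $S^{-1/2}$, doubling the thrust costs almost three times the power, while doubling the disk area saves about 30%.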

Jet engines too rely on creating a high velocity airstream; this is achieved by heating the intake air in the combustion chamber and then exhausting it through the rear of the engine. A schematic of a turbojet engine is shown in Fig. 7.19. The airstream enters from the left and is compressed before being ignited; on their way to being exhausted the hot gases drive a turbine which provides the power for the compressor. In general, jet engines are simpler in construction than reciprocating (piston) engines, but require materials that can withstand high temperature and pressure. The first operational jet engine was used in a German fighter plane toward the end of W.W.II in 1945; it developed a thrust of ~1000 lb. In contrast, typical jet engines today have thrust of 20 000 to 30 000 lb.

Rocket engines are in principle similar to jet engines in that the thrust is developed by the exhaust at very high velocity of a stream of gases. Since rockets must be able to operate outside the atmosphere and because very large thrust is required, the gases that are heated and exhausted are carried by the rocket itself. The thrust is given by

$$F = \frac{dm}{dt}\,v_e \qquad (7.26)$$

as derived in Eq. (6.11); $v_e$ is the exhaust velocity and typically $v_e \sim 2000$ m/s, namely a supersonic velocity, and (dm/dt) is the mass of the gases exhausted per unit time. Eq. (7.26) is valid in the rocket's rest frame; recall that $v_e$ is the exhaust velocity with respect to the rocket.

To calculate the efficiency of a rocket engine we proceed as we did for the propeller (see Eqs. (7.24)). The power delivered to the rocket is

$$P_p = Fu = (dm/dt)\,v_e u \qquad (7.27)$$

with u the instantaneous velocity of the rocket in an absolute frame. The total power delivered by the engine is the rate of increase of the energy of the rocket and of the exhaust gases; note that in the absolute frame the velocity of the gases is $v' = u - v_e$. Thus

$$P_t = P_p + \frac{1}{2}\frac{dm}{dt}(v')^2 = \frac{dm}{dt}\left[v_e u + \frac{1}{2}(u - v_e)^2\right] \qquad (7.27')$$

And therefore the efficiency is

$$\eta = \frac{P_p}{P_t} = \frac{2(u/v_e)}{1 + (u/v_e)^2} \qquad (7.27'')$$

The efficiency is maximal and equals one when $u = v_e$, that is when the exhaust gases are at rest in the absolute frame after leaving the rocket.
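The dependence of the efficiency on the ratio $u/v_e$ is symmetric about the maximum in the sense that $u/v_e = x$ and $u/v_e = 1/x$ give the same value. A minimal sketch follows, with a function name of our own choosing and an assumed exhaust velocity of 2000 m/s.

```python
def rocket_efficiency(u, v_e):
    """Eq. (7.27''): eta = 2x / (1 + x**2) with x = u / v_e."""
    x = u / v_e
    return 2.0 * x / (1.0 + x * x)

v_e = 2000.0  # assumed exhaust velocity in m/s
for u in (200.0, 1000.0, 2000.0, 4000.0):
    print(f"u = {u:6.0f} m/s  ->  efficiency = {rocket_efficiency(u, v_e):.2f}")
```

Early in the launch, when u is much smaller than $v_e$, most of the engine power goes into the kinetic energy of the exhaust rather than into the rocket.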

Fig. 7.19. Schematic of a turbojet engine.

It is customary to characterize rocket engines by their specific impulse $I_{sp}$, which is defined as the ratio of the thrust to the weight (as measured on the surface of the earth) of the fuel consumed per unit time*

$$I_{sp} = \frac{F}{dw/dt} = \frac{F}{g\,(dm/dt)} = \frac{v_e}{g} \qquad (7.28)$$

Here g is the acceleration of gravity at the surface of the earth. When the specific impulse is defined as in Eq. (7.28), it has dimensions of time and is given in seconds. For typical rocket engines $I_{sp} \sim 200$-400 seconds; this implies exhaust velocities of 2000-4000 m/s.
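Since the specific impulse is just the exhaust velocity divided by g, converting between the two is a one-line calculation. The sketch below assumes g = 9.81 m/s²; the function names are our own.

```python
G0 = 9.81  # acceleration of gravity at the earth's surface, m/s^2 (assumed value)

def exhaust_velocity_from_isp(isp_seconds):
    """Invert Eq. (7.28): v_e = g * I_sp, in m/s."""
    return G0 * isp_seconds

def isp_from_thrust(thrust, mass_flow):
    """Eq. (7.28): I_sp = F / (g * dm/dt), in seconds."""
    return thrust / (G0 * mass_flow)

for isp in (200.0, 300.0, 400.0):
    print(f"I_sp = {isp:.0f} s  ->  v_e = {exhaust_velocity_from_isp(isp):.0f} m/s")
```

The round numbers quoted in the text correspond to taking g ≈ 10 m/s².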

Rocket engines use solid or liquid propellants as fuel; for instance liquid oxygen and liquid hydrogen can be mixed in the reaction

$$2\mathrm{H} + \mathrm{O} \to \mathrm{H_2O}$$

which releases 3 eV of energy. The energy released per unit mass of fuel is

$$Q_R = N_0 \times (3 \times 1.6\times10^{-19})\ \text{J/mole} = 3\times10^{5}\ \text{J/mole} \simeq 2\times10^{7}\ \text{J/kg}$$

Thus for a large rocket carrying 100 tonnes of fuel the total energy release is of order $Q = Q_R M \eta_c$ where $M = 10^5$ kg is the mass of the fuel and the combustion efficiency $\eta_c \sim 0.4$-0.7; thus $Q \sim 10^{12}$ J. Only a fraction of the energy Q appears as kinetic energy of the payload; a large part is expended in the residual energy of the exhaust gases and in the kinetic energy of the rejected first stages of the rocket. We examine these considerations in the following sections.
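The order-of-magnitude estimate of the energy release can be checked numerically. In the sketch below the molar mass of water (0.018 kg/mole) is used to convert from J/mole to J/kg, and a combustion efficiency of 0.5 is picked from the middle of the quoted range; both choices, like the constant names, are assumptions of the example.

```python
AVOGADRO = 6.022e23       # molecules per mole
EV_TO_JOULE = 1.6e-19     # joules per electron volt
WATER_MOLAR_MASS = 0.018  # kg per mole of H2O (assumed reaction product)

def energy_per_kg(ev_per_reaction):
    """Energy released per kg of propellant, given the energy per H2O molecule formed."""
    joules_per_mole = AVOGADRO * ev_per_reaction * EV_TO_JOULE
    return joules_per_mole / WATER_MOLAR_MASS

q_r = energy_per_kg(3.0)            # J/kg
fuel_mass = 1.0e5                   # 100 tonnes, in kg
combustion_efficiency = 0.5         # assumed, within the 0.4-0.7 range
total_energy = q_r * fuel_mass * combustion_efficiency

print(f"Q_R = {q_r:.2e} J/kg")
print(f"Q   = {total_energy:.1e} J")
```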

7.6 Rocket engines

A simplified model of a rocket engine consists of a combustion chamber followed by a nozzle through which the gases exhaust; this is shown in the schematic of Fig. 7.20(a). Of course there must be provisions for supplying the fuel (usually under pressure) to the combustion chamber and controlling the flow of the two components; often part of the nozzle must be cooled in spite of the ability of the various components to withstand high temperatures. A more realistic view of a rocket engine is given in Fig. 7.21.

* The specific impulse was first introduced as the ratio of the thrust, in pounds-force, divided by the mass of fuel, expressed in pounds-mass, per second. This was a convenient, even if not correct, definition; we will adopt the definition of Eq. (7.28) which shows that $I_{sp}$ is strictly equivalent to specifying the exhaust velocity.

In the model of Fig. 7.20(a) we identify three regions: the combustion chamber (c), the throat (t) and the exhaust plane (e). The pressure, density and temperature of the gases will take different values at different positions, but we can assume cylindrical symmetry about the axis. The force resulting from pressure always acts normal to the walls and we will designate the outside pressure by $P_0$. Differences between $P_0$ and the exhaust pressure $P_e$ modify Eq. (7.26) so that the thrust is given by

$$F = \frac{dm}{dt}\,v_e + (P_e - P_0)A_e \qquad (7.29)$$

with $A_e$ the exhaust area. If $P_e > P_0$ we have increased thrust but we are not efficiently using the energy in the chamber. If $P_e < P_0$ a 'pressure drag' acts on the rocket, while for optimal design of the engine $P_e = P_0$. Since $P_0$ is a function of altitude it is difficult to maintain this condition throughout the flight.

We will now attempt to find the exhaust velocity $v_e$ in terms of the temperature $T_c$ in the combustion chamber. Once heated, the expansion of the exhaust gases is adiabatic and therefore

$$U + PV = \text{constant}$$

where U is the internal energy of the gas, and V the volume. The mechanical energy PV is converted to the ordered kinetic energy of the gases, $PV = \tfrac{1}{2}mv^2$, whereas $\Delta U = mc_p\Delta T$, with $c_p$ the specific heat under constant pressure. Thus

$$\tfrac{1}{2}(v_2^2 - v_1^2) + c_p(T_2 - T_1) = 0 \qquad (7.30)$$

If we now compare the parameters in the chamber ($v_1 = 0$, $T_1 = T_c$) to those at the exhaust, we find

$$v_e^2 = 2c_p(T_c - T_e) \qquad (7.30')$$

We recall that $\gamma = c_p/c_v$, $c_p - c_v = R$ and that the velocity of sound is given by $v_s = (\gamma RT/M)^{1/2}$ with M the molecular weight (see Eq. (7.20')). We can then write for Eq. (7.30')

$$v_e = (v_s)_c\left[\frac{2\eta}{\gamma - 1}\right]^{1/2} \qquad (7.31)$$

where $(v_s)_c$ is the velocity of sound in the combustion chamber and $\eta$ is an efficiency factor which in actual engines takes values 0.5-1.0. To evaluate $\eta$ we note that

$$\eta = 1 - \frac{T_e}{T_c} = 1 - \left(\frac{P_e}{P_c}\right)^{(\gamma-1)/\gamma} \qquad (7.31')$$

where the last step follows for an adiabatic process in an ideal gas.

We know that for air, $\gamma \sim 1.4$, and for the oxygen-hydrogen mixture used in rocket engines $\gamma$ is even closer to unity, $\gamma \sim 1.25$. Thus even with $\eta = 0.5$, the exhaust velocity is $v_e \sim 2(v_s)_c$, namely supersonic. As the gases exit the combustion chamber they gain velocity, and the throat of the nozzle is defined as the plane where the Mach number $\mathcal{M} = 1$; in the exhaust region $\mathcal{M} > 1$. This is shown in Fig. 7.20(b) which is drawn for an idealized 'DeLaval' nozzle.

Fig. 7.20. Rocket engine: (a) schematic showing the combustion chamber and throat and exhaust channel, (b) flow through an idealized 'DeLaval' nozzle.

To obtain the thrust of the engine we must evaluate dm/dt, and we do so by considering the flow through the throat area $A_t$. Then

$$\frac{dm}{dt} \sim \frac{A_t P_c}{(v_s)_c} \qquad (7.32)$$

Since we also know that $v_e$ is of order $(v_s)_c$, we can write for the thrust (see Eq. (7.29))

$$F \simeq v_e\,\frac{dm}{dt} \simeq C_F A_t P_c \qquad (7.33)$$

where we ignored the pressure difference term. Here $C_F \sim 1.5$-2.0. Eq. (7.33) is useful for calculating the required throat area to achieve a given thrust.

Fig. 7.21. (a) Schematic diagram of a liquid propellant rocket.

Fig. 7.21. (b) F-1 liquid propellant turbopump-fed rocket engine manufactured by Rocketdyne; this engine develops a thrust of ~7.5 × 10⁶ N. (From G. P. Sutton, Rocket Propulsion Elements, J. Wiley, 1963.) Labeled parts: fuel inlet, dual oxygen pump discharge to injector, pitch and yaw actuators, gimbal mount (carries the thrust load), combustion chamber, dual fuel pump discharge piping, turbine exhaust duct, uncooled nozzle extension.


As an application of the relationships that we derived we consider a rocket engine operating at $T_c$ = 3000 K and $P_c$ = 200 atmospheres (~3000 psi). If the throat area is $A_t$ = 0.2 m² then the thrust is

$$F \simeq 1.5\,(0.2\ \text{m}^2)\times(2\times10^{7}\ \text{N/m}^2) \simeq 6\times10^{6}\ \text{N}$$

or equivalently a thrust of about one million pounds. Next we calculate the exhaust velocity from Eqs. (7.31). Assuming that $P_e$ = 100 psi, so that $P_e/P_c \simeq 0.033$, and that $\gamma = 1.25$, the efficiency factor $\eta$ is

$$\eta = 1 - (0.033)^{0.2} \simeq 0.5$$

The speed of sound in the combustion chamber can be estimated by assuming that the exhaust gases have the same properties as air, so that

$$(v_s)_c \simeq v_s\,(T_c/T_0)^{1/2} \simeq v_s\sqrt{10} \simeq 1000\ \text{m/s}$$

Thus from Eq. (7.31)

$$v_e \simeq 2(v_s)_c \simeq 2\times10^{3}\ \text{m/s}$$

and the specific impulse for the engine is

$$I_{sp} = 200\ \text{seconds}$$

Finally by using the values for the thrust and the exhaust velocity, we can obtain the mass rate of propellant flow. We have

$$\frac{dm}{dt} = \frac{F}{v_e} = \frac{6\times10^{6}}{2\times10^{3}} = 3\times10^{3}\ \text{kg/s}$$

If the total mass of the fuel is 100 tonnes then the burn time would be $\Delta t \simeq 33$ s.
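The chain of estimates in this example can be collected into a short script. The following is a minimal sketch, assuming the adiabatic relation $\eta = 1 - (P_e/P_c)^{(\gamma-1)/\gamma}$ and $v_e = (v_s)_c[2\eta/(\gamma-1)]^{1/2}$ as the content of Eqs. (7.31), a thrust coefficient of 1.5, and g = 9.81 m/s²; the function names are our own.

```python
import math

G0 = 9.81  # m/s^2, assumed

def efficiency_factor(p_exit, p_chamber, gamma):
    """eta = 1 - (P_e / P_c)**((gamma - 1) / gamma) for adiabatic expansion."""
    return 1.0 - (p_exit / p_chamber) ** ((gamma - 1.0) / gamma)

def exhaust_velocity(sound_speed_chamber, eta, gamma):
    """v_e = (v_s)_c * sqrt(2 * eta / (gamma - 1))."""
    return sound_speed_chamber * math.sqrt(2.0 * eta / (gamma - 1.0))

gamma = 1.25
eta = efficiency_factor(100.0, 3000.0, gamma)  # pressures in psi; only the ratio matters
v_e = exhaust_velocity(1000.0, eta, gamma)     # (v_s)_c ~ 1000 m/s
thrust = 1.5 * 0.2 * 2.0e7                     # C_F * A_t * P_c, in N
isp = v_e / G0                                 # seconds
mass_flow = thrust / v_e                       # kg/s
burn_time = 1.0e5 / mass_flow                  # 100 tonnes of propellant, in s

print(f"eta = {eta:.2f}, v_e = {v_e:.0f} m/s, I_sp = {isp:.0f} s")
print(f"dm/dt = {mass_flow:.0f} kg/s, burn time = {burn_time:.0f} s")
```

The script reproduces the rounded values of the text: an efficiency factor close to 0.5, an exhaust velocity close to 2000 m/s and a burn time of about half a minute.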

7.7 Multistage rockets

We have already discussed the rocket equation in Section 6.3. For flight at an angle $\theta$ as shown in Fig. 7.22, Newton's equation takes the form

$$M\frac{du}{dt} = F - D - Mg\cos\theta \qquad (7.34)$$

where the thrust F, the drag force D and the velocity u are all along the flight direction at an angle $\theta$ to the vertical. By integrating Eq. (7.34) we find the velocity as a function of time, u(t), and a second integral yields the displacement along the line of flight, s(t).

We can apply Eq. (7.34) to the special case of a vertical launch, $\theta = 0$, and if we ignore the drag force we obtain

$$\frac{du}{dt} = \frac{F}{M} - g = \frac{I_{sp}g_0}{M}\frac{dm}{dt} - g = -I_{sp}g_0\,\frac{1}{M}\frac{dM}{dt} - g \qquad (7.35)$$

In the last step we used dm/dt = −dM/dt, where M is the mass of the rocket. If $I_{sp}$ and g are constant, then Eq. (7.35) can be directly integrated to yield

$$u_{bo} - u_0 = g_0 I_{sp}\ln\left[\frac{M_{\text{full}}}{M_{\text{empty}}}\right] - g\,t_p \qquad (7.36)$$

which is the same expression as given by Eq. (6.13). Here $u_{bo}$ stands for the burn-out velocity and $t_p$ for the time of powered flight; we also used $g_0$ to indicate the acceleration of gravity at the surface of the earth, and $u_0$ is the initial velocity. Note that according to Eq. (7.36) the first term contributing to the velocity increment is independent of the specific value of dm/dt, but the second term is not, because $t_p$ depends on dm/dt. For instance, if we assume a uniform burn

$$t_p = \frac{M_f - M_e}{dm/dt} = \frac{(M_f - M_e)\,g_0 I_{sp}}{F} \qquad (7.37)$$

The height at burn-out, $h_{bo}$, is obtained by integration of Eq. (7.36).

The distance traveled after burn-out up to the highest point of the trajectory is called the coasting height $h_c$, and $h_m = h_c + h_{bo}$. If we set $u_0 = h_0 = 0$ but take into account the variation of g with height above the earth's surface, we find

$$h_c = \frac{u_{bo}^2\,(R_\oplus + h_{bo})}{2g_0\left[R_\oplus^2/(R_\oplus + h_{bo})\right] - u_{bo}^2} \qquad (7.39)$$

For flights near the earth's surface, $h_{bo} \ll R_\oplus$ and Eq. (7.39) reduces to the familiar expression $h_c = u_{bo}^2/2g_0$. Conversely, when $u_{bo}$ exceeds the escape velocity, $u_{bo} > (2g_0R_\oplus)^{1/2}$, the denominator becomes negative indicating that the rocket does not return to earth.
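The two limits of the coasting height can be explored numerically. The sketch below is written from energy conservation in an inverse-square gravitational field for a purely vertical coast, which is the assumption behind Eq. (7.39); the function name and the values g₀ = 9.81 m/s² and R⊕ = 6.371 × 10⁶ m are our own choices.

```python
import math

G0 = 9.81         # m/s^2 at the surface (assumed)
R_EARTH = 6.371e6  # m (assumed mean radius)

def coasting_height(u_bo, h_bo=0.0):
    """Height gained after burn-out for a vertical coast without drag.

    Returns math.inf when the burn-out speed is at or above the local escape speed.
    """
    r0 = R_EARTH + h_bo
    denominator = 2.0 * G0 * R_EARTH ** 2 / r0 - u_bo ** 2
    if denominator <= 0.0:
        return math.inf
    return u_bo ** 2 * r0 / denominator

print(coasting_height(100.0))          # slow rocket: close to u**2 / (2 g0)
print(coasting_height(3000.0, 50e3))   # faster rocket: exceeds the flat-earth value
print(coasting_height(12000.0))        # above escape velocity: never returns
```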

Fig. 7.22. Forces acting on a rocket flying at an angle $\theta$ with respect to the vertical.


For engineering applications it is convenient and customary to introduce a notation in terms of dimensionless fractions (these are less than unity) and dimensionless ratios (which are greater than unity). If

$M_f$ is the full mass of the rocket,
$M_s$ is the mass of the structure after the fuel has been consumed, and
$M_\ell$ is the mass of the payload,

we define the following

$s = M_s/M_f$    dead weight fraction
$\ell = M_\ell/M_f$    payload fraction
$r = F/g_0M_f$    thrust to weight ratio
$R = M_f/(M_s + M_\ell) = 1/(s + \ell)$    mass ratio

In this notation, and using Eq. (7.37) for a uniform burn, we can rewrite Eq. (7.36) as

$$u_{bo} - u_0 = g_0 I_{sp}\left[\ln R - \frac{1}{r}\left(1 - \frac{1}{R}\right)\right] \qquad (7.40)$$

where we assumed that $g = g_0$. The velocity increment $(u_{bo} - u_0)$ for a rocket with $I_{sp}$ = 300 s is plotted in Fig. 7.23 as a function of the mass ratio R and for different values of the thrust to weight ratio r. It is evident from the graph that increasing r has only a small effect on the final velocity increment.
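The dependence of the velocity increment on R and r can be tabulated directly. The sketch below combines Eq. (7.36) with the uniform-burn time of Eq. (7.37), which gives a gravity-loss term $(1 - 1/R)/r$ in the dimensionless notation; the function name and g₀ = 9.81 m/s² are assumptions, and an infinite thrust to weight ratio stands for level flight.

```python
import math

G0 = 9.81  # m/s^2 (assumed)

def velocity_increment(isp, mass_ratio, thrust_to_weight=math.inf):
    """u_bo - u_0 = g0 * I_sp * [ln R - (1 - 1/R) / r] for a vertical launch with g = g0."""
    gravity_loss = (1.0 - 1.0 / mass_ratio) / thrust_to_weight
    return G0 * isp * (math.log(mass_ratio) - gravity_loss)

for r in (1.5, 2.0, 4.0, math.inf):
    row = ", ".join(f"R={R}: {velocity_increment(300.0, R, r):6.0f}" for R in (5, 10, 20))
    print(f"r = {r}: {row}  (m/s)")
```

The table shows that the logarithm of the mass ratio dominates, while raising r from 2 to 4 changes the result by only a few hundred m/s.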

Fig. 7.23. Velocity increment given a rocket with $I_{sp}$ = 300 s as a function of the mass ratio R; the curves are for different values of r, the thrust to weight ratio. (Axes: velocity increment in m/s versus mass ratio R.)

To optimize the velocity increment for fixed payload one would try to increase $I_{sp}$ or decrease s; however increasing $I_{sp}$ may involve a more sophisticated engine with a consequent increase in dead weight. The best way for increasing the final velocity is to use a multistage rocket, where each stage contributes its own velocity increment to the payload. When a stage reaches burnout, it is detached and dropped off so that only the subsequent stages are accelerated further. After n stages have been burned out, the velocity of the payload is

$$u = u_{bo}(n) + u_{bo}(n-1) + \cdots + u_{bo}(1) \qquad (7.41)$$

and the overall payload ratio is given by

$$G = \prod_{i=1}^{n}\frac{1}{\ell_i}$$

We will illustrate these ideas by considering a two stage rocket. For simplicity we will restrict ourselves to level flight (or equivalently we can assume that $r \to \infty$); then the two consecutive velocity increments result in a payload velocity

$$u = g_0 I_1\ln\left(\frac{1}{s_1 + \ell_1}\right) + g_0 I_2\ln\left(\frac{1}{s_2 + \ell_2}\right) \qquad (7.42)$$

where $I_1$, $I_2$ are the specific impulses of the two stages. We want to maximize u while keeping the overall payload ratio fixed

$$G = \frac{1}{\ell_1\ell_2} \qquad (7.42')$$

and under the assumption that the two stages are similar. The similarity can be expressed by requiring that the structure parameter

$$\delta = \frac{s}{1 - \ell} \qquad (7.42'')$$

be the same for both stages; $\delta$ is the ratio of the mass of the structure to the mass of the propellant plus structure. With these two conditions and without varying $I_1$, $I_2$ we find that maximum u occurs when

$$I_1(1 - \delta_1 R_1) = I_2(1 - \delta_2 R_2)$$

Thus, if the specific impulse is the same for both stages, the payload velocity is largest if the mass ratios are equal for both stages.

The effect of rocket staging is illustrated graphically in Fig. 7.24. The final velocity is given as a function of the overall payload ratio G for a different number of stages n = 1-4 and for identical specific impulse for all stages. The curves have been obtained from Eq. (7.42) (and its extension for n = 3, 4) and show how important staging is when high payload velocity is required.

As an example consider a rocket fueled by liquid oxygen and kerosene with the following parameters

  Dead weight fraction       $s = 0.12$
  Payload fraction           $l = 0.08$
  Mass ratio                 $R = 1/(s + l) = 5$
  Specific impulse           $I_{sp} = 300$ s
  Thrust to weight ratio     $r = 2.0$

Using Eq. (7.40) we then find for the final velocity

  $u_{bo} = 3600$ m/s for vertical launch
  $u_{bo} = 4750$ m/s for horizontal launch

In contrast, if we used two stages with the same parameters as given above we would have

$$ \delta = \frac{s}{1 - l} = \frac{0.12}{1 - 0.08} = 0.13 $$

and

$$ l_1 = l_2 = 0.08 $$

so that $G = 156$. The final velocity for a horizontal launch would then be

$$ u_{bo} = 2 g I_{sp} \ln\left[\frac{1}{\delta(1 - l) + l}\right] = 9500 \text{ m/s} $$
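The single-stage and two-stage numbers above can be checked with a few lines of Python. This is only a sketch: it assumes $g = 9.8$ m/s² and level flight, and the function names are made up for this illustration.

```python
import math

G0 = 9.8  # m/s^2, assumed value of the gravitational acceleration


def level_flight_velocity(isp_s, s, l):
    """Burnout velocity for level flight, u = g * Isp * ln(1 / (s + l))."""
    return G0 * isp_s * math.log(1.0 / (s + l))


def staged_velocity(isp_s, delta, l, n_stages):
    """Payload velocity for n identical stages with structure parameter delta."""
    s = delta * (1.0 - l)  # dead weight fraction of each stage
    return n_stages * level_flight_velocity(isp_s, s, l)


if __name__ == "__main__":
    isp, s, l = 300.0, 0.12, 0.08
    delta = s / (1.0 - l)
    print("single stage  :", round(level_flight_velocity(isp, s, l)), "m/s")
    print("two stages    :", round(staged_velocity(isp, delta, l, 2)), "m/s")
    print("payload ratio :", round(1.0 / l**2))
```

The results should land close to the 4750 m/s and 9500 m/s quoted in the text; small differences come from rounding in the value of $g$.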

Fig. 7.24. Terminal velocity as a function of the number $n$ of stages used and of the mass to payload ratio $G$; the structure factor is fixed at $\delta = 0.10$. (From H. Seiffert, Space Technology, J. Wiley, 1959.) [Plot: mass to payload ratio, 0 to 2000, versus velocity in ft/s, 5000 to 35 000.]

7.8 The NASA shuttle

The space shuttle was conceived by the 'National Aeronautics and Space Administration' (NASA) as a manned reusable vehicle that could reach low earth orbit and return from orbit by landing as a winged craft. The shuttle is launched by three rocket engines fueled by liquid propellants from an external tank and assisted by two solid propellant booster rockets. The boosters are separated and jettisoned after two minutes of flight; they are allowed to drop by parachute and are retrieved and reused. The external tank separates after eight minutes, at which point the orbiter climbs into an earth orbit of altitude 200-300 km. On return to earth the orbiter enters the atmosphere at a large attack angle (34°) and is slowed down by atmospheric drag; it lands as a glider with a touchdown speed of approximately 150 miles/hr.

A schematic of the orbiter mounted on the main tank is shown in Fig. 7.25. The orbiter itself is 125 feet long and has a wing span of 78 feet and a weight of 270 000 lb including fuel and cargo. It is constructed similarly to commercial aircraft and is provided with a large cargo bay in which payloads up to 65 000 lb can be carried. Once launched the shuttle can maneuver by using its thrusters, which are powered by monomethylhydrazine (MMH) and nitrogen tetroxide (N2O4); normally they can provide an on-orbit $\Delta V$ of ~300 m/s. The first shuttle was launched in 1981 and many successful flights followed. On January 28, 1986 one of the shuttles exploded 74 seconds after lift-off, killing its crew. The tragedy slowed down the U.S. space program and raised questions about the use of manned flight as contrasted to simple rocket missions for the exploration of space. Shuttle flights were resumed in the U.S. in 1988.

As an application of the equations that we derived in the previous sections we can calculate the velocity of the orbiter from the known mass of fuel and the specific impulse of the engines. We first collect the pertinent data.

Fig. 7.25. Side view of the NASA shuttle mounted on its external fuel tank.

Boosters (two; parameters are for one booster)
  Mass        $M_b \approx 6 \times 10^5$ kg
  Thrust      $F_b \sim 1.2 \times 10^7$ N
  Burn time   $t_b \sim 120$ s

Therefore, assuming a uniform burn, we estimate that $dm/dt = 5 \times 10^3$ kg/s and thus

$$ I_{sp} = 240 \text{ s} \qquad (7.43) $$

which is reasonable.

Main engines (three; parameters are for the total system)
  Liquid hydrogen (fuel)     $M \sim 10^5$ kg (383 000 gallons $\approx 10^8$ g-mole)
  Liquid oxygen (oxidizer)   $M \sim 6 \times 10^5$ kg (143 000 gallons $\approx 4 \times 10^7$ g-mole)
  External tank (empty)      $M \approx 3.5 \times 10^4$ kg
  Thrust                     $F_t \sim 5 \times 10^6$ N
  Burn time                  $t_t \sim 500$ s

Assuming uniform burn we estimate $dm/dt = 1.4 \times 10^3$ kg/s and thus

$$ I_{sp} = 360 \text{ s} \qquad (7.44) $$

  Mass of orbiter            $M_o = 1.2 \times 10^5$ kg
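The two specific impulse estimates follow from the usual definition $I_{sp} = F / (g\, dm/dt)$. A minimal sketch, assuming $g = 9.8$ m/s² and a uniform burn of the whole propellant mass (the helper name is illustrative only):

```python
G0 = 9.8  # m/s^2, assumed value of the gravitational acceleration


def specific_impulse(thrust_n, mass_flow_kg_s):
    """Specific impulse in seconds from thrust (N) and mass flow (kg/s)."""
    return thrust_n / (G0 * mass_flow_kg_s)


# Uniform-burn mass flows, using the data collected above.
BOOSTER_FLOW = 6e5 / 120.0        # one booster: mass / burn time
MAIN_FLOW = (1e5 + 6e5) / 500.0   # hydrogen + oxygen over the burn time

if __name__ == "__main__":
    print("booster Isp ~", round(specific_impulse(1.2e7, BOOSTER_FLOW)), "s")
    print("main    Isp ~", round(specific_impulse(5e6, MAIN_FLOW)), "s")
```

Both values come out within a few seconds of the rounded figures in Eqs. (7.43) and (7.44).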

The trajectory followed by the shuttle during launch is shown schematically in Fig. 7.26. Initially the trajectory is at 45° and at the time the boosters are burned out it is almost horizontal. Even though the main engines are ignited at launch, for simplicity of calculation we will assume that they are turned on after the boosters are jettisoned. For the booster phase we then have from Eq. (7.36)

$$ u_{bo} = g I_{sp} \ln\frac{M_f}{M_e} - g\, t_p \cos\theta \qquad (7.36) $$

where we use

$$ M_f = 2M_b + M_t + M_o = 2.05 \times 10^6 \text{ kg} $$

$$ M_e = M_t + M_o = 8.5 \times 10^5 \text{ kg} $$

and

$$ I_{sp} = 240 \text{ s}, \quad t_p = 120 \text{ s}, \quad \theta = 60° $$

Thus we obtain

$$ u_{bo}^{(1)} = 2100 - 600 = 1500 \text{ m/s} \qquad (7.45) $$

The velocity increment obtained from the main engines can be calculated from the same equation where now we use

$$ M_f = M_t + M_o = 8.5 \times 10^5 \text{ kg}, \quad M_e = M_o = 1.2 \times 10^5 \text{ kg} $$

and

$$ I_{sp} = 360 \text{ s}, \quad t_p = 500 \text{ s}, \quad \theta = 90° $$

Thus we obtain

$$ \Delta u = u_{bo}^{(2)} - u_0 = 6900 \text{ m/s} \qquad (7.45') $$

The velocity required for low earth orbit is

$$ V_o = \sqrt{g R_e} = 7900 \text{ m/s} $$

and we found $u_{bo}^{(2)} = u_{bo}^{(1)} + \Delta u = 8400$ m/s. Thus the orbiter can enter into a low orbit by maneuvering its thrusters after the main tank is jettisoned.
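The two launch phases can be reproduced numerically. The sketch below assumes that Eq. (7.36) has the form $u = g I_{sp} \ln(M_f/M_e) - g t_p \cos\theta$, with $\theta$ measured from the vertical, which is the form that reproduces the numbers in the text; $g = 9.8$ m/s² and the function name are assumptions of this example.

```python
import math

G0 = 9.8  # m/s^2, assumed value of the gravitational acceleration


def burnout_increment(isp_s, m_full, m_empty, burn_time_s, theta_deg):
    """Velocity gain of one phase: rocket equation minus the gravity loss."""
    ideal = G0 * isp_s * math.log(m_full / m_empty)
    gravity_loss = G0 * burn_time_s * math.cos(math.radians(theta_deg))
    return ideal - gravity_loss


BOOSTER_PHASE = burnout_increment(240.0, 2.05e6, 8.5e5, 120.0, 60.0)
MAIN_PHASE = burnout_increment(360.0, 8.5e5, 1.2e5, 500.0, 90.0)
ORBITAL_SPEED = math.sqrt(G0 * 6.378e6)  # low earth orbit, sqrt(g * R_e)

if __name__ == "__main__":
    print("booster phase :", round(BOOSTER_PHASE), "m/s")
    print("main engines  :", round(MAIN_PHASE), "m/s")
    print("total         :", round(BOOSTER_PHASE + MAIN_PHASE), "m/s")
    print("needed        :", round(ORBITAL_SPEED), "m/s")
```

The total exceeds the low-orbit speed by a few hundred m/s, in line with the conclusion drawn in the text.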

We can also check the value of the overall energy delivered by the main engines. In the reaction 2H + O -> H2O, the energy released is 3 eV and since $4 \times 10^7$ g-mole react, the total energy is

$$ E = N_0 \times (4 \times 10^7) \times 3 \text{ eV} \approx 10^{13} \text{ J} \qquad (7.46) $$

The kinetic energy of the orbiter is

$$ \text{K.E.} = \tfrac{1}{2} M_o V_{orbit}^2 = 4 \times 10^{12} \text{ J} \qquad (7.46') $$

which indicates reasonable efficiency. Note that we ignored the energy expended by the boosters, which is of the same order as in Eq. (7.46). We also ignored the gain in potential energy of the orbiter, but this is small: for instance for $h = 100$ km, $\Delta U = M_o g h = 1.2 \times 10^{11}$ J. The power developed by the main engines is

$$ P = \frac{E}{t_p} = \frac{10^{13}}{500} = 2 \times 10^{10} \text{ W} \approx 25 \text{ million horsepower} \qquad (7.47) $$

This is to be compared to the power delivered to the craft, which is given by $P_0 = F_t V$ where $F_t$ is the thrust. If we use an average velocity $V \sim 4$ km/s and $F_t = 5 \times 10^6$ N

$$ P_0 = F_t V = 2 \times 10^{10} \text{ W} \qquad (7.47') $$

in close agreement with the result of Eq. (7.47). In practice the propulsive power is not constant, but the average values we calculated here are typical of the flight.

Fig. 7.26. Trajectory and maneuvers of the shuttle during its launch; approximate separation times are indicated. [Annotations on the figure: 9 seconds: roll begins. 2 minutes, 7 seconds: solid fuel boosters separate from the external fuel tank and fall into the Atlantic Ocean, where they are recovered by the Navy. 6 minutes, 30 seconds: shuttle begins a long shallow dive to prepare for separation of the external fuel tank, which falls into the Indian Ocean; under power from its small on-board fuel supply, the shuttle then heads toward orbit. Labels: solid-fuel booster, external fuel tank.]
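The order-of-magnitude energy bookkeeping above is easy to verify. The following sketch uses standard values for Avogadro's number, the electron volt and the horsepower; the variable names are illustrative and the inputs are the rounded figures from the text.

```python
N_AVOGADRO = 6.022e23   # particles per g-mole
EV_IN_JOULE = 1.602e-19
WATT_PER_HP = 745.7

# Chemical energy: 4e7 g-mole of H2O formed, 3 eV released per reaction.
CHEMICAL_ENERGY = N_AVOGADRO * 4e7 * 3.0 * EV_IN_JOULE

# Kinetic and potential energy of the orbiter (mass 1.2e5 kg).
KINETIC_ENERGY = 0.5 * 1.2e5 * 7900.0**2
POTENTIAL_ENERGY = 1.2e5 * 9.8 * 100e3   # h = 100 km

# Average power of the main engines versus thrust times average speed.
ENGINE_POWER = 1e13 / 500.0
THRUST_POWER = 5e6 * 4e3

if __name__ == "__main__":
    print(f"chemical energy  ~ {CHEMICAL_ENERGY:.2e} J")
    print(f"kinetic energy   ~ {KINETIC_ENERGY:.2e} J")
    print(f"potential energy ~ {POTENTIAL_ENERGY:.2e} J")
    print(f"engine power     ~ {ENGINE_POWER / WATT_PER_HP / 1e6:.0f} million hp")
```

The ratio of kinetic to chemical energy comes out at roughly one third, which is what the text calls reasonable efficiency.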

Finally, we consider the propulsion parameters. Given that the combustion chamber pressure, temperature and density are

$$ P_c = 3000 \text{ psi} = 220 \text{ atm} = 2.2 \times 10^7 \text{ N/m}^2, \quad T_c = 2700°\text{C}, \quad \rho_c = 16 \text{ kg/m}^3 $$

we find from Eqs. (7.31) that the exhaust velocity is

$$ v_e \sim 3.15 \times 10^3 \text{ m/s} $$

Since the main engine mass flow is $dm/dt = 7 \times 10^5 / 500 = 1.4 \times 10^3$ kg/s, the expected thrust is

$$ F_t = v_e \frac{dm}{dt} = 3.15 \times 10^3 \times 1.4 \times 10^3 \approx 4.5 \times 10^6 \text{ N} $$

as already assumed.

In conclusion we see that the main parameters of the shuttle flights can be calculated from first principles. Yet the technical realization of these principles involves great ingenuity and effort, and a malfunction of any one component can easily lead to disaster.

Exercises

Exercise 7.1

An airplane wing moves at a speed of 200 km/hr through air at standard density; the lift per meter of span is 3000 Newtons.

(a) Determine the circulation around the wing.
(b) Assume that the angle of attack was $\alpha \sim 5°$ and find the chord of the wing.
(c) Calculate the Reynolds number for the air flow past the wing.

Exercise 7.2

A helicopter has a propeller with the following parameters: radius $R = 5$ m, pitch $d = 2$ m/turn, rotational speed 1200 rpm (turns/minute) and no slippage.

(a) Calculate the thrust of the propeller and therefore the maximum weight that can be lifted.
(b) Calculate the power delivered.
(c) Calculate the Reynolds number at the tip of the propeller, by using a reasonable estimate for the chord of the blade.

Exercise 7.3

Consider a rocket engine with the following parameters: $P_c = 300$ atmospheres, $T_c = 2200°$C, $dm/dt = 1$ kg/s, operating at sea level ($P_0 = 1$ atm). Furthermore let $\gamma = 1.25$, $M = 18$.

(a) Find the throat area from the simplified expression

$$ \frac{dm}{dt} = \frac{A_t P_c}{v_{sc}} $$

where $v_{sc}$ is the speed of sound in the combustion chamber.
(b) Use an approximate expression to calculate the thrust using the result of (a).
(c) If the nozzle is ideally designed ($P_e = P_0$), find the exhaust velocity $v_e$, and recalculate the thrust.

Exercise 7.4

From your own experience, or data, make a log-log plot of the maximum speed (in km/hr) versus the specific power (horsepower per ton of mass) for various modes of transportation. Note that on a log-log plot almost any relationship falls on a straight line.

Exercise 7.5

Evaluate the speed of sound in the earth's atmosphere: (a) as a function of temperature, (b) as a function of altitude above sea level.

Exercise 7.6

Determine the burnout velocity, burnout altitude and maximum altitude for a dragless projectile in vertical flight given the following parameters: $v_e = 7250$ ft/s; $M_p/M_0 = 0.57$ (propellant mass/full mass); $t_p = 5.0$ s and $u_0 = h_0 = 0$.

8 TO THE STARS

8.1 The solar system

We shall begin this chapter with a survey of our solar system, of which the earth is one of nine planets. The planets follow closed orbits around the sun because of the attractive gravitational force exerted between the sun and the planets. The planetary orbits are nearly circular and the influence of the other planets is much weaker than that of the sun. Furthermore, with the exception of Pluto, planetary orbits lie approximately in the same plane, the plane of the ecliptic, which is defined by the earth's orbit. Planets spin about their own axis and are accompanied by satellites. In addition the solar system contains a large number of asteroids and an unknown number of comets.

The planets, their mean distance from the sun, their mass and their orbital period are listed in Table 8.1. Note that distances are given in astronomical units (AU), where one AU equals the length of the semi-major axis of the earth's orbit around the sun

$$ 1 \text{ AU} = 1.495 \times 10^{11} \text{ m} \qquad (8.1) $$

Masses are given in earth masses $M_\oplus$, where

$$ M_\oplus = 5.978 \times 10^{24} \text{ kg} \qquad (8.2) $$

The orbital period $T$ is related to the semi-major axis of the orbit $a$ through Kepler's third law

$$ T = \frac{2\pi\, a^{3/2}}{(G M_\odot)^{1/2}} \qquad (8.3) $$

Here $G$ is Newton's constant and $M_\odot$ the mass of the sun. The product $G M_\odot$ is determined to better accuracy than $G$ or $M_\odot$ individually, and we will use the symbol $K_\odot$ for it, so that

$$ K_\odot = G M_\odot = 1.324 \times 10^{20} \text{ m}^3/\text{s}^2 \qquad (8.4) $$

For reference we also give the accepted value of Newton's constant

$$ G = 6.668 \times 10^{-11} \text{ N m}^2/\text{kg}^2 \qquad (8.5) $$

and the mean value of the earth's radius

$$ R_\oplus = 6.378 \times 10^6 \text{ m} \qquad (8.6) $$

From Eqs. (8.4, 8.5) we infer that the mass of the sun is

$$ M_\odot = 3.325 \times 10^5\, M_\oplus \qquad (8.4') $$

The orbits of the inner and of the outer planets are sketched approximately to scale in Figs. 8.1(a, b). As seen from the north pole of the ecliptic all planets and most satellites move counterclockwise.
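Kepler's third law, Eq. (8.3), can be checked against the orbital periods of Table 8.1 with a short calculation. In this sketch the length of the year (365.25 days) and the function name are assumptions of the example; the constants are those of Eqs. (8.1) and (8.4).

```python
import math

K_SUN = 1.324e20          # m^3/s^2, Eq. (8.4)
AU = 1.495e11             # m, Eq. (8.1)
YEAR = 365.25 * 86400.0   # s, assumed length of the earth year


def orbital_period_years(a_au):
    """Orbital period from Kepler's third law, T = 2*pi*a^(3/2)/sqrt(K)."""
    a = a_au * AU
    return 2.0 * math.pi * a**1.5 / math.sqrt(K_SUN) / YEAR


if __name__ == "__main__":
    for name, a_au in [("Earth", 1.000), ("Mars", 1.524), ("Jupiter", 5.203)]:
        print(f"{name:8s} T ~ {orbital_period_years(a_au):6.2f} years")
```

The periods obtained this way agree with the tabulated values to within about one part in a thousand.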

When a particle moves in a central field of force, the orbit lies in a plane and angular momentum about the center is conserved. Of course, the total energy is also conserved and for bound orbits the total energy is negative

$$ E = T + U = T - |U| < 0 \qquad (8.7) $$

It is understood that the potential energy is defined to be zero for a particle at infinite distance from the attracting center. If the force has a $1/r^2$ dependence, then the average values of the kinetic and potential energy are related by the virial theorem

$$ \langle T \rangle = -\tfrac{1}{2} \langle U \rangle \qquad (8.8) $$

Just as the planets are subject to the sun's attractive force, natural or artificial satellites are subject to the $1/r^2$ attractive force of the planet. For interplanetary probes one must consider the attraction of the sun and that of the planet from which and/or toward which the probe is launched.

Table 8.1. The planets of the solar system

Planet     Symbol   Semi-major axis (AU)   Mass (in M_⊕)   Orbital period (in earth years)
Mercury    ☿          0.387                   0.054            0.241
Venus      ♀          0.723                   0.814            0.606
Earth      ⊕          1.000                   1.000            1.000
Mars       ♂          1.524                   0.108            1.88
Jupiter    ♃          5.203                 318.4             11.86
Saturn     ♄          9.539                  95.3             29.46
Uranus     ♅         19.18                   14.6             84.0
Neptune    ♆         30.06                   17.3            164.8
Pluto      ♇         39.52                    0.83           247.7

Such probes follow, with respect to the planet, a hyperbolic trajectory. In general, motion in a $1/r^2$ central force field leads to trajectories which are conic sections and can be bound (circle, ellipse) or unbound (parabola, hyperbola). Because of the large mass of the sun as compared to that of the planets we can treat the sun as stationary and account for the effect of the other planets by perturbative techniques.

Conic sections are generated by the intersection of a cone with a plane surface as shown in Fig. 8.2. Let $\psi$ be the angle between the normal to the plane surface and the axis of the cone, whose apex half-angle is $\theta$. When $\psi = 0$ the intersection is a circle; when $\psi < \pi/2 - \theta$ it is an ellipse; when $\psi = \pi/2 - \theta$ it is a parabola and when $\psi > \pi/2 - \theta$ it is a hyperbola. This family of curves can be characterized by their semi-major axis $a$ and their eccentricity $e$. It is convenient to give the equation of the conics in polar coordinates $r$, $\phi$, and we will use the convention indicated in the sketches of Figs. 8.3.

Ellipse. The locus of points P for which the sum of the distances from the two foci is constant

$$ r + r' = 2a = \text{constant}, \qquad b = a(1 - e^2)^{1/2} $$

$$ r = \frac{a(1 - e^2)}{1 + e \cos\phi} \qquad (8.9) $$

Fig. 8.1. The solar system: (a) the inner planets, (b) the outer planets. [Scale marker on the figure: 5.4 light-hours.]

Hyperbola. The locus of points P for which the difference of the distances from the two foci is constant. A hyperbola has two disconnected branches, and the asymptotes have a slope given by $\cos\phi = 1/e$

$$ r' - r = 2a = \text{constant}, \qquad b = a(e^2 - 1)^{1/2} $$

$$ r = \frac{a(e^2 - 1)}{1 - e \cos\phi} \qquad (8.10) $$

Circle. The locus of points P for which the distance from the center is constant

$$ r = a $$

namely, the limiting case of an ellipse with $e = 0$.

Parabola. The locus of points P which are at an equal distance from the directrix and the focus. It is the limiting case between an ellipse and a hyperbola, namely

$$ e = 1 $$

However, $a \to \infty$, while $a(e^2 - 1) = 2d$ remains finite

$$ r = \frac{2d}{1 - \cos\phi} \qquad (8.11) $$

For the Cartesian coordinates indicated in Fig. 8.3(c) the equation of the parabola is

$$ y^2 = 4d(x - d) \qquad (8.11') $$

from which it also follows that $x = r$.

Fig. 8.2. Conic sections: (a) ellipse, (b) parabola, (c) hyperbola.

According to the above discussion, the orbits of the planets must be ellipses; the eccentricity of the earth's orbit is very small, $e_\oplus = 0.0167$, giving rise to a difference between the aphelion and perihelion of $R_A - R_P = 0.033$ AU. Pluto and Mercury have the most eccentric orbits with $e \sim 0.2$. Recurring comets also have elliptic orbits but with $e \to 1$. The axis of spin of most planets is normal to their orbit plane, with the notable exception of Uranus. For the earth the inclination of the spin axis with respect to the normal to the ecliptic is $\epsilon = 23° 27'$ and this is the primary effect that gives rise to the seasons of the year.
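The polar equations (8.9) and (8.10) are simple to evaluate. The sketch below (function name chosen for this illustration) uses the sign convention of those two equations and checks the aphelion to perihelion difference quoted for the earth, which for an ellipse equals $2ae$.

```python
import math


def conic_radius(a, e, phi):
    """Distance from the focus for an ellipse (e < 1) or hyperbola (e > 1)."""
    if e < 1.0:
        return a * (1.0 - e * e) / (1.0 + e * math.cos(phi))   # Eq. (8.9)
    if e > 1.0:
        return a * (e * e - 1.0) / (1.0 - e * math.cos(phi))   # Eq. (8.10)
    raise ValueError("for e = 1 use the parabola, r = 2d / (1 - cos(phi))")


if __name__ == "__main__":
    a_earth, e_earth = 1.0, 0.0167          # AU, eccentricity from the text
    perihelion = conic_radius(a_earth, e_earth, 0.0)
    aphelion = conic_radius(a_earth, e_earth, math.pi)
    print(f"perihelion ~ {perihelion:.4f} AU, aphelion ~ {aphelion:.4f} AU")
    print(f"difference ~ {aphelion - perihelion:.4f} AU")
```

With the convention of Eq. (8.9) the perifocus lies at $\phi = 0$, where $r = a(1 - e)$.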

While the earth has only one satellite, the moon, and Mars has two, the outer planets have a large number of satellites. The matter density of the inner planets is of the same order as that of the earth, $\rho \sim 5$ g/cm³, whereas for the outer planets as well as for the sun, the density is much lower, $\rho \sim 1$ g/cm³. The surface temperature and the composition of the atmosphere of the planets depend crucially on their distance from the sun, with Mars being the most 'hospitable' planet in terms of the conditions prevailing on earth.

Fig. 8.3. Parameters characterizing conic sections: (a) ellipse, (b) hyperbola, (c) parabola.

8.2 Motion in a central field of force

Motion under the influence of a central force is a classical problem in mechanics, with which most readers will be familiar. Nevertheless we will give its formal solution and then derive the velocity equation, which is a first integral of the equations of motion. The formalism of the velocity equation is well adapted to the analysis of orbital maneuvers such as will be discussed in Sections 8.3 and 8.4. Since the force is central, the motion is confined to a plane and we will use polar coordinates $r$, $\phi$.

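Gravitation is an inverse square law force and therefore the equation of motion is

$$ \mathbf{F} = -G \frac{M_\odot m}{r^2}\, \mathbf{u}_r = m \frac{d^2 \mathbf{r}}{dt^2} \qquad (8.12) $$

Here $\mathbf{u}_r$ is the unit vector along the radial direction, and $\mathbf{u}_\phi$ the orthogonal unit vector. As can be seen from Fig. 8.4 the following relations hold

$$ \frac{d\mathbf{u}_r}{dt} = \frac{d\phi}{dt}\, \mathbf{u}_\phi, \qquad \frac{d\mathbf{u}_\phi}{dt} = -\frac{d\phi}{dt}\, \mathbf{u}_r \qquad (8.13) $$

Since $\mathbf{r} = r\, \mathbf{u}_r$, and using Eqs. (8.13) we find that

$$ \frac{d\mathbf{r}}{dt} = \frac{dr}{dt}\, \mathbf{u}_r + r \frac{d\phi}{dt}\, \mathbf{u}_\phi \qquad (8.13') $$

$$ \frac{d^2\mathbf{r}}{dt^2} = \left[ \frac{d^2 r}{dt^2} - r \left( \frac{d\phi}{dt} \right)^2 \right] \mathbf{u}_r + \left[ r \frac{d^2\phi}{dt^2} + 2 \frac{dr}{dt} \frac{d\phi}{dt} \right] \mathbf{u}_\phi \qquad (8.13'') $$

Fig. 8.4. Polar coordinates for the description of arbitrary motion in a plane; $\mathbf{u}_r$ and $\mathbf{u}_\phi$ are the unit vectors and are position dependent.

Therefore, the equation of motion (Eq. (8.12)) is equivalent to two scalar equations

$$ \frac{d^2 r}{dt^2} - r \left( \frac{d\phi}{dt} \right)^2 = -\frac{K}{r^2} \qquad (8.14) $$

$$ r \frac{d^2\phi}{dt^2} + 2 \frac{dr}{dt} \frac{d\phi}{dt} = 0 \qquad (8.14') $$

where we used the notation $K = K_\odot = G M_\odot$. Hereafter we will use dots for total time derivatives, $\dot{r} = dr/dt$, $\dot{\phi} = d\phi/dt$, etc.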

The angular momentum about the center of force is given by

$$ l = |\mathbf{r} \times \mathbf{p}| = m |\mathbf{r} \times \dot{\mathbf{r}}| = m r^2 \dot{\phi} \qquad (8.15) $$

Taking the derivative of Eq. (8.15) we obtain

$$ \frac{1}{m} \frac{dl}{dt} = 2 r \dot{r} \dot{\phi} + r^2 \ddot{\phi} = r (2 \dot{r} \dot{\phi} + r \ddot{\phi}) $$

which is equal to zero by virtue of Eq. (8.14'). Thus

$$ l = \text{constant} \qquad (8.16) $$

as was to be expected since a central force exerts no torque on the particle. Eq. (8.16) is a first integral of the equation of motion, and since the equation is of second order there must also exist a second integral. This is the total energy of the particle

$$ E = T + U \qquad (8.17) $$

To express the total energy in suitable form we return to Eq. (8.14), which we write in terms of the angular momentum as

$$ \ddot{r} = -\frac{K}{r^2} + \frac{l^2}{m^2 r^3} \qquad (8.18) $$

Multiplying both sides by $m \dot{r}$, Eq. (8.18) can be cast as a total derivative

$$ \frac{d}{dt} \left[ \tfrac{1}{2} m \dot{r}^2 - \frac{K m}{r} + \frac{l^2}{2 m r^2} \right] = 0 $$

The integral is therefore a conserved quantity which we identify with the total energy

$$ E = \tfrac{1}{2} m \left[ \dot{r}^2 + \frac{l^2}{m^2 r^2} \right] - \frac{K m}{r} = \text{constant} \qquad (8.19) $$

In Eq. (8.19) we recognize the kinetic energy term

$$ T = \tfrac{1}{2} m v^2 = \tfrac{1}{2} m (\dot{r}^2 + r^2 \dot{\phi}^2) \qquad (8.20) $$

and the potential energy term

$$ U = -\frac{K m}{r} \qquad (8.21) $$


The physical interpretation of Eq. (8.19) is best understood by analogy to one-dimensional motion. In that case the kinetic energy is $T' = \tfrac{1}{2} m \dot{r}^2$ and we introduce an effective potential

$$ U_{eff} = -\frac{K m}{r} + \frac{l^2}{2 m r^2} \qquad (8.22) $$

so that

$$ E = T' + U_{eff} \qquad (8.19') $$

Thus, the Newtonian potential is modified by a 'centrifugal barrier' term leading to a one-dimensional potential well as shown in Fig. 8.5. We now see clearly that the total energy of the particle determines the allowed range of radial distances, and therefore the nature of the orbit.

Fig. 8.5. Effective potential for motion under the influence of a central inverse square law force. [Levels marked on the figure: $E > 0$, $E = 0$, $E < 0$.]

Since in Eq. (8.19) $E$ and $l$ are constant we can solve for $\dot{r}$ and integrate to obtain $r(t)$ (it is given by an elliptic integral). Instead we are interested in a parametric equation of the orbit of the form $f(r, \phi) = 0$. This is facilitated by returning to the equation of motion and making the standard change of variable

$$ u = 1/r \qquad (8.23) $$

Therefore, in terms of the new variable

$$ \dot{\phi} = \frac{l}{m} \frac{1}{r^2} = \frac{l}{m} u^2 \qquad (8.23') $$

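and

$$ \dot{r} = \frac{dr}{d\phi} \dot{\phi} = \frac{d}{d\phi} \left( \frac{1}{u} \right) \dot{\phi} = -\frac{l}{m} \frac{du}{d\phi} $$

$$ \ddot{r} = -\frac{l}{m} \left[ \frac{d}{d\phi} \left( \frac{du}{d\phi} \right) \right] \dot{\phi} = -\left( \frac{l}{m} \right)^2 \frac{d^2 u}{d\phi^2}\, u^2 \qquad (8.23'') $$

Introducing these expressions for $\ddot{r}$, $r$ and $\dot{\phi}$ in Eq. (8.14) we obtain

$$ \frac{d^2 u}{d\phi^2} + u = \frac{K}{(l/m)^2} \qquad (8.24) $$

which can be solved by elementary methods. It is a harmonic oscillator equation with a constant driving term, and has the solution

$$ u = \frac{1}{r} = k \cos(\phi - \phi_0) + \frac{K}{(l/m)^2} \qquad (8.25) $$

where $k$ and $\phi_0$ are arbitrary integration constants.

To interpret the orbit equation (Eq. (8.25)), we set the initial phase $\phi_0 = 0$ and express $k$ in terms of the total energy to obtain

$$ \frac{1}{r} = \frac{K}{(l/m)^2} \left[ 1 \pm e \cos\phi \right] \qquad (8.26) $$

where

$$ \frac{K}{(l/m)^2} = \pm \frac{1}{a(1 - e^2)}, \qquad k = \frac{e}{a(1 - e^2)} \qquad (8.27) $$

$e$ is the eccentricity and $a$ the semi-major axis of the orbit. Since $K/(l/m)^2$ must be positive, we use

  + sign when $e < 1$, i.e. an ellipse, or $E < 0$
  - sign when $e > 1$, i.e. a hyperbola, or $E > 0$

The orbit equations are:

$$ r = \frac{a(1 - e^2)}{1 + e \cos\phi}, \qquad E = -\frac{1}{2} \frac{K m}{a}, \qquad e < 1 \qquad (8.28) $$

which represents an ellipse (see Eq. (8.9)), or

$$ r = \frac{a(e^2 - 1)}{1 - e \cos\phi}, \qquad E = +\frac{1}{2} \frac{K m}{a}, \qquad e > 1 \qquad (8.29) $$

which represents a hyperbola (see Eq. (8.10)), or

$$ r = \frac{(l/m)^2}{K} \frac{1}{1 - \cos\phi}, \qquad E = 0, \qquad e = 1 \qquad (8.30) $$

which represents a parabola (see Eq. (8.11)). Finally we note that

$$ e^2 = 1 + \frac{(l/a)^2}{2 m E} = 1 + \frac{2 m E\, (l/m)^2}{(K m)^2} \qquad (8.31) $$

and recall that $E$ can be either positive or negative; $a$ can then be determined from the first of Eqs. (8.27).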
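The relations (8.27) and (8.31) give the shape of the orbit once the two constants of motion are known. The sketch below is a round-trip check under stated assumptions: the function name is invented for this example, and the test orbit uses the earth's $a$ and $e$ with an arbitrary 1000 kg mass.

```python
import math


def orbit_elements(E, l, m, K):
    """Eccentricity and semi-major axis from total energy E and angular momentum l.

    e follows from the second form of Eq. (8.31); a from the first of Eqs. (8.27).
    """
    l_over_m_sq = (l / m) ** 2
    e = math.sqrt(1.0 + 2.0 * m * E * l_over_m_sq / (K * m) ** 2)
    if e == 1.0:
        return e, math.inf            # parabola: a is not finite
    a = l_over_m_sq / (K * abs(1.0 - e * e))
    return e, a


if __name__ == "__main__":
    K, m = 1.324e20, 1000.0           # K of the sun; m is a hypothetical probe mass
    a0, e0 = 1.495e11, 0.0167         # earth-like orbit
    E = -K * m / (2.0 * a0)                        # Eq. (8.28)
    l = m * math.sqrt(K * a0 * (1.0 - e0 * e0))    # from Eq. (8.27)
    e, a = orbit_elements(E, l, m, K)
    print(f"e ~ {e:.4f}, a ~ {a:.4e} m")
```

For a positive energy the same function returns $e > 1$, with $a$ then read as the semi-major axis of the hyperbola.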

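The velocity equation is a first integral of the equations of motion, also known by its Latin name vis-viva or 'live force'. We start with the equation of motion (Eq. (8.12)), which is

$$ \ddot{\mathbf{r}} = -\frac{K}{r^3}\, \mathbf{r} \qquad (8.32) $$

Here we used $\mathbf{r} = r\, \mathbf{u}_r$, and the relevant vectors are shown again in Fig. 8.6. We form the scalar product of the vector equation, Eq. (8.32), with $2\dot{\mathbf{r}}$ and obtain

$$ 2\, \dot{\mathbf{r}} \cdot \ddot{\mathbf{r}} = -\frac{2K}{r^3}\, \dot{\mathbf{r}} \cdot \mathbf{r} $$

or

$$ \frac{d}{dt} (\dot{\mathbf{r}} \cdot \dot{\mathbf{r}}) = -\frac{2K}{r^2} \frac{dr}{dt} \qquad (8.33) $$

since $\mathbf{r} \cdot \dot{\mathbf{r}} = r \dot{r}$ (see Fig. 8.6). Eq. (8.33) can be immediately integrated because

$$ -\frac{2K}{r^2} \frac{dr}{dt} = 2K \frac{d}{dt} \left( \frac{1}{r} \right) $$

Noting that $\dot{\mathbf{r}} \cdot \dot{\mathbf{r}} = V^2$, with $V$ the velocity of the particle, we obtain

$$ V^2 - [V(t = 0)]^2 = 2K \left[ \frac{1}{r} - \frac{1}{r(t = 0)} \right] \qquad (8.34) $$

Our goal is to simplify this result by a proper choice of $t = 0$. We consider an elliptical orbit and choose $t = 0$ to be the time when the particle is at the perifocus, the point of closest approach to the center of force. We indicate all the variables at that instant of time by a zero subscript

$$ r_0 = a(1 - e), \qquad \dot{r}_0 = 0 $$

$$ V_0^2 = r_0^2 \dot{\phi}_0^2 = \left( \frac{l}{m} \right)^2 \frac{1}{r_0^2} = \left( \frac{l}{m} \right)^2 \frac{1}{a^2 (1 - e)^2} $$

Fig. 8.6. The position vector and its derivatives.

If we make use of Eq. (8.27) we can rewrite

$$ V_0^2 = \frac{K}{a} \frac{1 + e}{1 - e} $$

and therefore Eq. (8.34) becomes

$$ V^2 = \frac{2K}{r} + \frac{K}{a} \left[ \frac{1 + e}{1 - e} - \frac{2}{1 - e} \right] $$

or

$$ V^2 = K \left( \frac{2}{r} \mp \frac{1}{a} \right) \qquad (8.35) $$

where the minus sign is for elliptic orbits and the plus sign for hyperbolic orbits; $a$ is the semi-major axis of the orbit. For circular orbits $r = a$ and we obtain the familiar result $V^2 = K/a$.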

We will refer to Eq. (8.35) as the velocity equation, because it relates the velocity of the particle to the distance from the center of force for different initial conditions. These conditions are expressed through the semi-major axis $a$. As a check on our result we calculate the total energy of the particle

$$ E = T + U = \tfrac{1}{2} m V^2 - \frac{K m}{r} $$

Using Eq. (8.35) we find immediately

$$ E = \mp \frac{K m}{2a} = \text{constant} $$

in agreement with Eqs. (8.28, 8.29). As another example, for a parabolic orbit $a \to \infty$, and therefore

$$ V^2 = 2K/R \qquad (8.36) $$

which yields the expression for the escape velocity from a planet of radius $R$ and gravitational attraction $K = GM$. For a low orbit around a planet we have $r = a = R$ and therefore $V^2 = K/R$.
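The velocity equation lends itself to a one-line function. The sketch below evaluates Eq. (8.35) for the earth, using $K_\oplus = g R_\oplus^2$ with an assumed $g = 9.8$ m/s²; the function and constant names are chosen for this example only.

```python
import math

G0 = 9.8                      # m/s^2, assumed surface gravity
R_EARTH = 6.378e6             # m, Eq. (8.6)
K_EARTH = G0 * R_EARTH**2     # m^3/s^2, K = GM expressed through g and R


def vis_viva_speed(K, r, a, hyperbolic=False):
    """Speed at distance r from Eq. (8.35): V^2 = K (2/r -/+ 1/a)."""
    sign = 1.0 if hyperbolic else -1.0
    return math.sqrt(K * (2.0 / r + sign / a))


if __name__ == "__main__":
    low_orbit = vis_viva_speed(K_EARTH, R_EARTH, R_EARTH)    # circular, r = a = R
    escape = vis_viva_speed(K_EARTH, R_EARTH, math.inf)      # parabolic, a -> infinity
    print(f"low orbit speed ~ {low_orbit:.0f} m/s")
    print(f"escape speed    ~ {escape:.0f} m/s")
```

Passing `math.inf` for `a` reproduces the parabolic limit of Eq. (8.36), since the $1/a$ term then vanishes.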

8.3 Transfer orbits

During interplanetary travel the spacecraft executes certain well-defined maneuvers which we can classify as follows

(a) Escape from the local gravitational field of a planet.
(b) Capture in the local gravitational field of a planet.
(c) Transfer from the heliocentric orbit of one planet to that of another.
(d) Encounter with a planet, which affects both the direction and the magnitude of the heliocentric velocity of the craft.

The first three of these maneuvers are accomplished by thrusting, that is, by changing the velocity of the vehicle. The velocity increment $\Delta V$ is given by the integrated thrust,

$$ \Delta V = \int \frac{F}{m}\, dt \qquad (8.37) $$

and is a measure of the fuel that is needed for a particular maneuver. The change in the total energy of the vehicle is proportional to

$$ \mathbf{V} \cdot \Delta \mathbf{V} $$

and thus can be positive or negative or zero depending on the relative orientation of $\mathbf{V}$ and $\Delta \mathbf{V}$. Keep in mind that when a vehicle is in orbit around a planet its heliocentric velocity is the vector sum of the local vehicle velocity and of the planet's heliocentric velocity. In an encounter the velocity and orbit of the vehicle change due to the gravitational attraction of a planet or other massive body.

We will now consider the most economic trajectory for transferring a vehicle from an initial orbit I to a final orbit F. For simplicity we will take the initial and final orbits to be circular with respective radii $a_I$ and $a_F$. These are indicated in Fig. 8.7, where we also show an elliptical trajectory tangent to the initial orbit at A and to the final orbit at B. This trajectory is called the Hohmann ellipse and we can calculate the necessary velocity increments at the points A and B. Note that B lies on the line joining A with the force center.

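Fig. 8.7. Hohmann transfer from one circular trajectory to another.

We use the 'velocity equation' (Eq. (8.35)) to find the velocity of the vehicle in the initial and final circular orbits

$$ V_I = \left( \frac{K}{a_I} \right)^{1/2} \quad \text{and} \quad V_F = \left( \frac{K}{a_F} \right)^{1/2} \qquad (8.38) $$

The semi-major axis of the Hohmann transfer ellipse is obviously

$$ a_T = (a_I + a_F)/2 \qquad (8.39) $$

and therefore the velocity of a vehicle following that ellipse is, at the point A (the perifocus),

$$ V_A^2 = K \left[ \frac{2}{a_I} - \frac{2}{a_I + a_F} \right] = \frac{K}{a_I} \left[ \frac{2 a_F}{a_I + a_F} \right] \qquad (8.40) $$

At the point B (the apofocus) the vehicle velocity on the transfer ellipse is

$$ V_B^2 = K \left[ \frac{2}{a_F} - \frac{2}{a_I + a_F} \right] = \frac{K}{a_F} \left[ \frac{2 a_I}{a_I + a_F} \right] $$

By comparing Eqs. (8.38) and (8.40) we see that the velocity of the vehicle must be increased at the point A in order to follow the Hohmann ellipse; it must be again increased at the point B in order to leave the Hohmann ellipse. We speak of velocity increments $\Delta V$, where

$$ \Delta V_A = V_A - V_I = V_I \left[ \left( \frac{2 a_F}{a_I + a_F} \right)^{1/2} - 1 \right], \qquad \Delta V_B = V_F - V_B = V_F \left[ 1 - \left( \frac{2 a_I}{a_I + a_F} \right)^{1/2} \right] $$

The total velocity increment $\Delta V_H = \Delta V_A + \Delta V_B$ is often expressed in fractional terms

$$ \frac{\Delta V_H}{V_I} = \left[ \left( \frac{2R}{1 + R} \right)^{1/2} - 1 \right] + \frac{1}{\sqrt{R}} \left[ 1 - \left( \frac{2}{1 + R} \right)^{1/2} \right] \qquad (8.42) $$

where

$$ R = a_F / a_I $$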

As an example, let us calculate the velocity increment for a vehicle in a high earth orbit if it is to escape the solar system. In a high earth orbit the vehicle is free of the earth's attraction, but moves with the heliocentric velocity of the earth, which for a circular orbit is

$$ V_\oplus = \left( \frac{K_\odot}{a_\oplus} \right)^{1/2} = \left( \frac{1.324 \times 10^{20} \text{ m}^3/\text{s}^2}{1.495 \times 10^{11} \text{ m}} \right)^{1/2} = 29\,800 \text{ m/s} $$

If the vehicle is to escape from the solar system, then $a_F = \infty$, whereas $a_I = a_\oplus$. Using this input, namely $R = a_F / a_I = \infty$, in Eq. (8.42) we find for the fractional velocity increment

$$ \Delta V_H / V_I = \sqrt{2} - 1 = 0.41 $$

and therefore

$$ \Delta V_H = 0.41\, V_I = 12\,200 \text{ m/s} \qquad (8.43) $$

This additional velocity must be imparted in a direction parallel to the vehicle's initial velocity, which is the same as the earth's heliocentric velocity.

In practice, if we wish to launch from the surface of the earth a vehicle that is to escape the solar system, we will have to provide not only the velocity increment shown in Eq. (8.43) but also the velocity increment necessary to place the vehicle in high earth orbit. The latter is given by

$$ V_{esc}^{\oplus} = \left( \frac{2 K_\oplus}{R_\oplus} \right)^{1/2} = (2 g R_\oplus)^{1/2} = 11\,200 \text{ m/s} $$

Thus the total velocity increment is

$$ \Delta V_{total} = V_{esc}^{\oplus} + \Delta V_H \approx 23\,400 \text{ m/s} \qquad (8.43') $$

It is important, however, to appreciate that if the vehicle was launched directly from the earth's surface (in a direction parallel to the earth's heliocentric motion) without first going into a high orbit, the total velocity increment would be smaller even though the total energy gained by the probe is the same in both cases. To see this note that $V_I = V_\oplus$ as before, and the final velocity $V_f$ must be such that

$$ V_f^2 = \frac{2 K_\odot}{a_I} + \frac{2 K_\oplus}{R_\oplus} = 2 V_I^2 + (V_{esc}^{\oplus})^2 \qquad (8.44) $$

in order to overcome the sun's and the earth's gravitational fields. Thus

$$ \Delta V = V_f - V_I = \left[ 2 V_I^2 + (V_{esc}^{\oplus})^2 \right]^{1/2} - V_I = 13\,800 \text{ m/s} \qquad (8.44') $$

which is significantly smaller than the result obtained in Eq. (8.43'). This is so because the escape velocity is added in 'quadrature' in Eq. (8.44');* note that if $V_{esc}^{\oplus}$ was zero we would regain exactly the result of Eq. (8.43).
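The comparison between the two launch strategies can be reproduced numerically. In this sketch the fractional Hohmann increment is written as $\sqrt{2/(1 + 1/R)} - 1$ plus the arrival term so that $R = \infty$ can be passed directly; the function name and the use of unrounded constants are choices made for this example, so the results differ slightly from the rounded values in the text.

```python
import math

K_SUN = 1.324e20      # m^3/s^2
A_EARTH = 1.495e11    # m
G0 = 9.8              # m/s^2, assumed surface gravity
R_EARTH = 6.378e6     # m


def hohmann_fraction(R):
    """Fractional increments (dV_A / V_I, dV_B / V_I) for radius ratio R = a_F / a_I."""
    dv_a = math.sqrt(2.0 / (1.0 + 1.0 / R)) - 1.0
    dv_b = (1.0 - math.sqrt(2.0 / (1.0 + R))) / math.sqrt(R)
    return dv_a, dv_b


V_EARTH = math.sqrt(K_SUN / A_EARTH)        # heliocentric speed of the earth
V_ESC = math.sqrt(2.0 * G0 * R_EARTH)       # escape speed from the earth's surface

# Escape from the solar system starting in a high earth orbit, Eq. (8.43).
DV_HIGH_ORBIT = sum(hohmann_fraction(math.inf)) * V_EARTH
# Same mission launched directly from the surface, Eq. (8.44').
DV_DIRECT = math.sqrt(2.0 * V_EARTH**2 + V_ESC**2) - V_EARTH

if __name__ == "__main__":
    print(f"via high orbit: {V_ESC + DV_HIGH_ORBIT:.0f} m/s")
    print(f"direct launch : {DV_DIRECT:.0f} m/s")
```

The direct launch needs roughly 10 km/s less total velocity increment, which illustrates the point about adding the escape velocity in quadrature.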

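* For a given velocity increment $\Delta V$, the gain in energy is $\Delta E = m \mathbf{V} \cdot \Delta \mathbf{V}$ and it is largest when $V$ is large.

Earth to Mars. As a further application we consider the total velocity increment needed to launch a vehicle from earth and have it land on Mars. We will use (see Table 8.1)

$$ M_{♂} = 0.11\, M_\oplus \quad \text{and} \quad R = \frac{a_F}{a_I} = 1.524 $$

From Eq. (8.42) we obtain

$$ \frac{\Delta V_A}{V_I} = \left( \frac{2R}{1 + R} \right)^{1/2} - 1 = 0.099; \qquad \Delta V_A = 2950 \text{ m/s} $$

$$ \frac{\Delta V_B}{V_I} = \frac{1}{\sqrt{R}} \left[ 1 - \left( \frac{2}{1 + R} \right)^{1/2} \right] = 0.089; \qquad \Delta V_B = 2650 \text{ m/s} $$

It is most economic to escape from a low earth orbit and to be captured into a low Mars orbit. The radius of Mars is $R_{♂} = 0.52\, R_\oplus$ and therefore the corresponding escape velocities are

$$ V_{esc}^{\oplus} = 11\,200 \text{ m/s}, \qquad V_{esc}^{♂} = \left( \frac{2 G M_{♂}}{R_{♂}} \right)^{1/2} = 5150 \text{ m/s} $$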

With respect to the earth the vehicle's kinetic energy must be such that after it has been decreased by the potential energy corresponding to the earth's attraction it still corresponds to the velocity $\Delta V_A$. Thus if we designate by $v_A$ the velocity that the vehicle must have in order to successfully leave the earth orbit and enter the transfer ellipse, $v_A$ must satisfy

$$ \tfrac{1}{2} v_A^2 - \tfrac{1}{2} (V_{esc}^{\oplus})^2 = \tfrac{1}{2} (\Delta V_A)^2 $$

The required velocity increment is the difference between $v_A$ and the velocity in low earth orbit $v_\oplus$, where $(v_\oplus)^2 = \tfrac{1}{2} (V_{esc}^{\oplus})^2$. Thus

$$ \Delta v_A = v_A - v_\oplus = \left[ (V_{esc}^{\oplus})^2 + (\Delta V_A)^2 \right]^{1/2} - (1/\sqrt{2})\, V_{esc}^{\oplus} = 3660 \text{ m/s} $$

and correspondingly

$$ \Delta v_B = v_B - v_{♂} = \left[ (V_{esc}^{♂})^2 + (\Delta V_B)^2 \right]^{1/2} - (1/\sqrt{2})\, V_{esc}^{♂} = 2150 \text{ m/s} $$

Fig. 8.8. Transfer trajectory from a parking orbit around the earth to a parking orbit around Mars; the wiggles are greatly exaggerated. [Labels on the figure: circle w.r.t. Mars; ellipse w.r.t. Sun; hyperbola w.r.t. Earth, Mars.]

In the total velocity balance we must, of course, add the velocity increments required to bring the vehicle into low earth orbit and to land it from low Mars orbit. Thus

$$ \Delta V_T = v_\oplus + \Delta v_A + \Delta v_B + v_{♂} = 17\,370 \text{ m/s} $$

We stress that the thrust imparted to the vehicle must be parallel to its own motion and to its heliocentric motion for the above calculations to remain valid. The reader can easily show that if the transfer to the Hohmann ellipse was executed from a high earth orbit (and into a high Mars orbit) the total velocity increment would have been $\Delta V'_T = 22\,000$ m/s. The trajectory of the vehicle as seen from the heliocentric system is shown in Fig. 8.8, where the effect of the planetary gravitational fields is included. Finally note that the velocity increments for the missions that we discussed are within the reach of present technology (see Chapter 7), especially for not too massive vehicles.
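The Earth to Mars velocity budget can be assembled step by step. The sketch below takes the rounded inputs quoted in the text (29 800, 11 200 and 5150 m/s, and $R = 1.524$); the function names are invented for this example.

```python
import math

V_EARTH = 29800.0        # m/s, heliocentric speed of the earth
V_ESC_EARTH = 11200.0    # m/s
V_ESC_MARS = 5150.0      # m/s
R = 1.524                # ratio of the orbit radii, Mars / earth


def hohmann_increments(v_initial, ratio):
    """Heliocentric increments (dV_A, dV_B) of the Hohmann transfer."""
    dv_a = (math.sqrt(2.0 * ratio / (1.0 + ratio)) - 1.0) * v_initial
    dv_b = (1.0 - math.sqrt(2.0 / (1.0 + ratio))) / math.sqrt(ratio) * v_initial
    return dv_a, dv_b


def burn_from_low_orbit(v_esc, dv_far):
    """Increment needed in low orbit to retain dv_far far from the planet."""
    return math.sqrt(v_esc**2 + dv_far**2) - v_esc / math.sqrt(2.0)


DV_A, DV_B = hohmann_increments(V_EARTH, R)
LOW_ORBIT_TOTAL = (V_ESC_EARTH / math.sqrt(2.0)
                   + burn_from_low_orbit(V_ESC_EARTH, DV_A)
                   + burn_from_low_orbit(V_ESC_MARS, DV_B)
                   + V_ESC_MARS / math.sqrt(2.0))
HIGH_ORBIT_TOTAL = V_ESC_EARTH + DV_A + DV_B + V_ESC_MARS

if __name__ == "__main__":
    print(f"dV_A ~ {DV_A:.0f} m/s, dV_B ~ {DV_B:.0f} m/s")
    print(f"total via low orbits : {LOW_ORBIT_TOTAL:.0f} m/s")
    print(f"total via high orbits: {HIGH_ORBIT_TOTAL:.0f} m/s")
```

The two totals reproduce, to within rounding, the 17 370 m/s and 22 000 m/s given in the text.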

8.4 Encounters

When a vehicle approaches a planet from a large distance, the total energy of the vehicle with respect to the planet is positive, and in that reference frame the vehicle executes a hyperbolic trajectory with focus at the position of the planet. Thus, the asymptotic velocity with respect to the planet before and after the encounter will have the same magnitude but different direction. This change in direction of the relative velocity results in a change in the magnitude and direction of the heliocentric velocity of the vehicle. By properly chosen encounters the velocity of a vehicle can be boosted; this possibility is exploited in missions to the outer planets - such as carried out by the Pioneer and Voyager craft.

That the heliocentric velocity of a vehicle changes in an encounter with a planet can be easily understood from the sketches of Fig. 8.9. The velocity relative to the planet, at large distance (i.e. its asymptotic value) will be designated by $\mathbf{v}_\infty$, and its magnitude $v_\infty$ is referred to as the hyperbolic excess velocity. The heliocentric velocity of the planet is $\mathbf{u}$ and the heliocentric velocities of the vehicle before and after the encounter are labeled by $\mathbf{V}$ and $\mathbf{V}'$. By definition

$$ \mathbf{V} = \mathbf{u} + \mathbf{v}_\infty, \qquad \mathbf{V}' = \mathbf{u}' + \mathbf{v}'_\infty \qquad (8.45) $$

For the duration of the encounter we can set $\mathbf{u} = \mathbf{u}'$ and even though $\mathbf{v}_\infty$ has changed direction, $|\mathbf{v}_\infty| = |\mathbf{v}'_\infty|$. The vector diagram before and after the encounter is shown in Figs. 8.9(a) and 8.9(b), where the new direction of $\mathbf{v}_\infty$ (i.e. $\mathbf{v}'_\infty$) is such that $|\mathbf{V}'| > |\mathbf{V}|$.

We can reach the same conclusion by a more detailed argument as follows: The total energy of the vehicle with respect to the sun is

$$ E = T + U \quad \text{or} \quad h = \frac{2E}{m} = V^2 - \frac{2K}{R} \qquad (8.46) $$

where $R$ is the heliocentric distance and $h$ is known as the energy constant. If we use the velocity equation (Eq. (8.35)) whereby $V^2 = (2K/R - K/a)$ we find immediately $h = -K/a$, in agreement with Eq. (8.28). The change in the energy of the vehicle during an encounter (where $R$ can be considered constant) is then

$$ \frac{dh}{dt} = 2 \mathbf{V} \cdot \frac{d\mathbf{V}}{dt} = 2 (\mathbf{u} + \mathbf{v}) \cdot \frac{d\mathbf{v}}{dt} \qquad (8.47) $$

Here we expressed $\mathbf{V} = \mathbf{u} + \mathbf{v}$ with $\mathbf{v}$ the velocity relative to the planet; further $d\mathbf{V} = d\mathbf{v}$ because during the encounter $\mathbf{u}$ is practically constant.

The relative acceleration of the vehicle $d\mathbf{v}/dt$ depends only on the gravitational attraction of the planet $K_p = G M_p$ and on the distance $r$ from the planet

$$ \frac{d\mathbf{v}}{dt} = -\frac{K_p}{r^2}\, \mathbf{u}_r $$

The change in the total energy of the vehicle is the integral of Eq. (8.47) over time

$$ \Delta h = 2 \int (\mathbf{u} + \mathbf{v}) \cdot \frac{d\mathbf{v}}{dt}\, dt = -2 K_p \int \frac{\mathbf{u} \cdot \mathbf{u}_r}{r^2}\, dt \qquad (8.48) $$

The last step in Eq. (8.48) follows because by symmetry the integral of $\mathbf{v} \cdot (d\mathbf{v}/dt)$ vanishes. In Fig. 8.10 we show two trajectories of encounters with a planet that moves with heliocentric velocity $\mathbf{u}$. For case (a) the unit vector $\mathbf{u}_r$ is antiparallel to $\mathbf{u}$ (for small $r$) so that the integral of Eq. (8.48) is positive and the vehicle gains energy in the encounter. For case (b) the opposite is true and the vehicle loses energy. The angle by which the direction of $\mathbf{v}_\infty$ changes depends on $r_p$, the distance of closest approach to the planet.

Fig. 8.9. Vector diagrams for the velocity change during an encounter of a probe with a planet. The velocity of the planet is $\mathbf{u}$ and that of the probe $\mathbf{V}$, so that $\mathbf{v}_\infty$ is the relative velocity between probe and planet: (a) before the encounter, (b) after the encounter; even though the magnitude of $\mathbf{v}_\infty$ is not changed the magnitude of the heliocentric velocity of the probe $\mathbf{V}$ has increased.

In a more familiar context the change in the heliocentric velocity of the vehicle is analogous to the change in the speed of a tennis ball that is bounced off a moving backboard. If the backboard is moving towards the ball (case (a)) the ball will bounce off with higher speed; if the backboard is moving away from the ball (case (b)), the ball moves slower on its return path. Encounters are also referred to as 'swing-bys' or 'slingshot trajectories'. Their accuracy depends on the correct aiming of the vehicle when it is still far away from the planet. In what follows we will give a quantitative analysis of the expected velocity increment in an encounter and use it to discuss the Voyager-2 mission.

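In Figs. 8.11(a, b) we show the orbit of a planet moving with heliocentric velocity $\mathbf{u}$ and the heliocentric trajectory of a vehicle before and after the encounter. The vehicle velocity is $\mathbf{V}_1$ before and $\mathbf{V}_2$ after the encounter, and its corresponding velocity relative to the planet is designated by $\mathbf{v}_\infty$ and $\mathbf{v}'_\infty$. As shown in the figures

$$\mathbf{V}_1 = \mathbf{u} + \mathbf{v}_\infty \qquad \mathbf{V}_2 = \mathbf{u} + \mathbf{v}'_\infty \tag{8.49}$$

and we identify the angles

$$\beta_1 = \angle(\mathbf{V}_1, \mathbf{u}) \qquad \beta_2 = \angle(\mathbf{V}_2, \mathbf{u}) \tag{8.50}$$

and the angle $\Delta\xi$ by which the relative velocity has been rotated

$$\Delta\xi = \angle(\mathbf{v}_\infty, \mathbf{v}'_\infty) \qquad |\mathbf{v}_\infty| = |\mathbf{v}'_\infty| \tag{8.50'}$$

Given $\mathbf{u}$ and $\mathbf{V}_1$ we can find $\mathbf{V}_2$ if we know $\Delta\xi$.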

Fig. 8.10. Encounter of a probe with a planet moving with velocity $\mathbf{u}$; $\mathbf{v}_\infty$, $\mathbf{v}'_\infty$ are the relative velocities of the probe with respect to the planet before and after the encounter. In case (a) the probe gains velocity in the heliocentric system; in case (b) the heliocentric velocity of the probe decreases.

To calculate $\Delta\xi$ we consider the encounter in the reference frame of the planet as shown in Fig. 8.12. Here the trajectory of the vehicle is a hyperbola with semi-major axis $a$. The semi-major axis is related to the total energy (see Eq. (8.29)), $E = Km/2a$. At large distance from the planet the total energy is simply the kinetic energy $E = mv_\infty^2/2$. Therefore we obtain (setting $K = K_p$ for clarity)

$$a = \frac{K_p}{v_\infty^2} \tag{8.51}$$

Fig. 8.11. Vehicle trajectory and planet orbit during an encounter: (a) before the encounter, (b) after the encounter (in this case the vehicle gains velocity).

Fig. 8.12. Trajectory of a vehicle encountering a planet, as seen in the planet's rest frame (i.e. relative motion).

The distance of closest approach $r_p$ for the encounter is given and is related to the eccentricity (see Eq. (8.10)) through

$$r_p = a(e - 1) \qquad \text{or} \qquad e = 1 + \frac{r_p}{a} \tag{8.52}$$

We also know that the angle of the asymptotes is related to the eccentricity (see Fig. 8.3(b) or Eq. (8.10) for $r \to \infty$)

$$\cos\phi_\infty = 1/e \tag{8.53}$$

where the angle $\phi_\infty$ is shown in Fig. 8.12 and defines $\Delta\xi$, since

$$1/e = \cos\phi_\infty = \sin\left(\frac{\pi}{2} - \phi_\infty\right) = \sin(\Delta\xi/2) \tag{8.54}$$

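The calculation of $V_2$ then proceeds as follows:

(a) Given $\mathbf{u}$ and $\mathbf{V}_1$ we find $\mathbf{v}_\infty = \mathbf{V}_1 - \mathbf{u}$.

(b) We find $\beta_1$ from Eq. (8.49), since

$$(v_\infty)^2 = u^2 + (V_1)^2 - 2uV_1\cos\beta_1 \tag{8.55}$$

and $\xi_1$, the angle between $\mathbf{v}_\infty$ and $\mathbf{u}$, from

$$v_\infty\sin\xi_1 = V_1\sin\beta_1 \tag{8.55'}$$

(c) Next we calculate the parameters of the hyperbolic trajectory from a knowledge of $K_p$ and $r_p$ (Eqs. (8.51) and (8.52))

$$a = K_p/v_\infty^2 \qquad e = 1 + r_p/a \tag{8.56}$$

and therefore we find $\Delta\xi$ from Eq. (8.54)

$$\Delta\xi = 2\sin^{-1}(1/e) \tag{8.57}$$

(d) Finally we form the angle $\xi_2$ between $\mathbf{v}'_\infty$ and $\mathbf{u}$, obtained by rotating $\mathbf{v}_\infty$ through $\Delta\xi$, and calculate $\mathbf{V}_2 = \mathbf{u} + \mathbf{v}'_\infty$, where

$$(V_2)^2 = u^2 + (v'_\infty)^2 + 2uv'_\infty\cos\xi_2 \tag{8.58}$$

and $\beta_2$ from

$$V_2\sin\beta_2 = v'_\infty\sin\xi_2$$

We recall that $|\mathbf{v}'_\infty| = |\mathbf{v}_\infty|$. Furthermore, as $r_p \to 0$, $e \to 1$ and therefore $\Delta\xi \to 180^\circ$; of course, $r_p$ cannot be smaller than the radius of the planet. In practice, the desired value of $\Delta\xi$ fixes $r_p$; to achieve this $r_p$ the vehicle must be carefully aimed by suitable thrusting maneuvers before it comes under the influence of the gravitational attraction of the planet.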
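Steps (a) to (d) can be sketched numerically as follows. This is a minimal illustration, not the author's procedure: it assumes a planar encounter and performs the rotation of $\mathbf{v}_\infty$ directly in Cartesian components instead of going through the angles $\xi_1$ and $\xi_2$. The function name `swing_by` and the parameter `sense` (which selects the direction of rotation) are hypothetical.

```python
import math

def swing_by(u, V1, beta1_deg, K_p, r_p, sense=-1):
    """Return (V2, delta_xi_deg) for a planar encounter.

    u         : heliocentric speed of the planet (km/s), taken along +x
    V1        : heliocentric speed of the vehicle before the encounter (km/s)
    beta1_deg : angle between V1 and u (degrees)
    K_p       : G*M of the planet (km^3/s^2)
    r_p       : distance of closest approach (km)
    sense     : +1 or -1, assumed sense in which v_inf is rotated
    """
    b1 = math.radians(beta1_deg)
    # Step (a): relative velocity v_inf = V1 - u, in components.
    vx = V1 * math.cos(b1) - u
    vy = V1 * math.sin(b1)
    v_inf = math.hypot(vx, vy)
    # Step (c): hyperbola parameters, Eqs. (8.56) and (8.57).
    a = K_p / v_inf**2
    e = 1.0 + r_p / a
    delta_xi = 2.0 * math.asin(1.0 / e)
    # Step (d): rotate v_inf through delta_xi; its magnitude is unchanged.
    rot = sense * delta_xi
    wx = vx * math.cos(rot) - vy * math.sin(rot)
    wy = vx * math.sin(rot) + vy * math.cos(rot)
    # Heliocentric speed after the encounter, V2 = |u + v'_inf|.
    V2 = math.hypot(u + wx, wy)
    return V2, math.degrees(delta_xi)

# Saturn values quoted in Section 8.5 (u, V1, beta1, K, r_p).
V2, dxi = swing_by(9.66, 13.3, 53.0, 3.79e7, 161094.0, sense=-1)
print(f"delta_xi = {dxi:.1f} deg, V2 = {V2:.1f} km/s")
```

With the Saturn values the result should come out close to the deflection and departure velocity quoted in Section 8.5; choosing the opposite `sense` gives the energy-losing encounter of case (b).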

8.5 The Voyager-2 grand tour of the planets

The Voyager missions were designed to explore the outer planets by taking advantage of a very special conjunction of the planetary orbits. The trajectories followed by the two craft are shown in Fig. 8.13. Voyager-2 was launched on August 20, 1977 and swung by Jupiter, Saturn and Uranus; it encountered Neptune in 1989 and will then leave the solar system. Communications have been maintained with the craft and the information acquired during the encounters has been received on earth in spite of the very large distance involved: the signal transit time from Uranus is 2 hr 40 min. A sketch of the Voyager craft is shown in Fig. 8.14.

From Fig. 8.13 we can obtain the approximate data for the mission collected in Table 8.2. We have given the average velocity between encounters even though in practice the velocity of the craft is not constant. One would expect that the velocity would decrease as the craft moved farther away from the sun. However, during each planetary encounter the craft receives a velocity boost, and this is reflected by the tabulated mean velocities, which are relatively constant. For convenience we note that a speed of 1 AU/year = 4750 m/s. The velocity of Voyager-2 when it leaves the solar system (at approximately 60 AU) will be $V_f = 16.1$ km/s.
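The mean velocities can be checked directly from the path lengths and elapsed times of Table 8.2. The short sketch below does this using the conversion 1 AU/year = 4750 m/s noted above; the list `legs` and the function name `mean_velocity_km_s` are illustrative choices.

```python
AU_PER_YEAR_IN_KM_S = 4.75  # 1 AU/year = 4750 m/s, as noted in the text

# (leg, path length in AU, elapsed time in years), read from Table 8.2
legs = [
    ("Earth-Jupiter", 6.9, 1.89),
    ("Jupiter-Saturn", 7.5, 2.13),
    ("Saturn-Uranus", 18.0, 4.41),
    ("Uranus-Neptune", 14.5, 3.58),
]

def mean_velocity_km_s(ds_au, dt_years):
    """Mean speed over a leg, converted from AU/year to km/s."""
    return ds_au / dt_years * AU_PER_YEAR_IN_KM_S

for name, ds, dt in legs:
    print(f"{name:15s} {mean_velocity_km_s(ds, dt):5.1f} km/s")
```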

It is instructive to analyse in some detail one of the encounters, and we will consider the Saturn swing-by. This is shown in Figs. 8.15(a, b, c), which are drawn to indicate progressively more detail. The distance of closest approach to the satellites of Saturn is indicated by the arrows and is a concern, since too close an approach would alter the Voyager's trajectory. For instance, the craft approached to within 93 000 km of Tethys while it passed Phoebe at 2 076 000 km. The closest approach to Saturn was

$$r_p = 161\,094\ \mathrm{km} \tag{8.59}$$

or approximately 2.67 Saturn radii.

Table 8.2. Approximate data for the Voyager-2 mission

Event                Date       R (AU)   Δs (AU)   Δt (years)   V (km/s)
Earth                20/8/77     1.0
Jupiter              9/7/79      5.2       6.9       1.89         17.3
Saturn               26/8/81     9.5       7.5       2.13         16.8
Uranus               24/1/86    19.2      18.0       4.41         19.3
Neptune              24/8/89    30.1      14.5       3.58         19.3
Leave solar system   ~2000 AD   60                                V_f = 16.1

(Δs, Δt and the mean velocity V refer to the leg ending at the listed encounter.)

Fig. 8.13. Voyager-1 and Voyager-2 missions as seen in the plane of the ecliptic. (Provided through the courtesy of the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California.)

For the gravitational attraction of
Saturn we have

$$K_{\mathrm{S}} = 3.79 \times 10^{7}\ \mathrm{km^3/s^2} \tag{8.60}$$

and the heliocentric velocity of Saturn is

$$u = \frac{2\pi a}{T} = \frac{2\pi \times 9.54}{29.46}\ \mathrm{(AU/yr)} = 9660\ \mathrm{m/s} \tag{8.61}$$

where we used data from Table 8.1. We are given that the excess hyperbolic velocity of the craft with respect to Saturn is

$$v_\infty = 10\,680\ \mathrm{m/s} \tag{8.62}$$

and from Fig. 8.13 we infer that $\beta_1 = 53^\circ$. We can then calculate $V_1$ by solving the quadratic equation* (see Eq. (8.55))

$$v_\infty^2 = u^2 + V_1^2 - 2uV_1\cos\beta_1$$

to find

$$V_1 = 13.3\ \mathrm{km/s} \tag{8.63}$$

This is a reasonable value, since the mean velocity between Jupiter and Saturn was $V = 16.8$ km/s.

Also we find the eccentricity of the hyperbolic orbit using Eq. (8.56)

$$a = \frac{K_{\mathrm{S}}}{v_\infty^2} = \frac{3.79 \times 10^{7}\ \mathrm{km^3/s^2}}{114\ \mathrm{km^2/s^2}} = 3.32 \times 10^{5}\ \mathrm{km} \qquad e = 1 + r_p/a = 1.484$$

* One can also use a graphical method unless high precision is desired.

Fig. 8.14. The Voyager-2 spacecraft.

Fig. 8.15. Encounter of Voyager-2 with Saturn shown in the equator plane for three different scales: (a) outer satellites, (b) middle satellites, (c) inner satellites. (Provided through the courtesy of the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California.)


Thus

$$\Delta\xi = 2\sin^{-1}(1/e) = 85^\circ \tag{8.64}$$

in agreement with the trajectory indicated in Fig. 8.15(a).

To calculate $V_2$, the departure velocity from Saturn, we first evaluate $\xi_1$ (see Eq. (8.55'))

$$\xi_1 = 86^\circ$$

Therefore $\xi_2$ follows from $\xi_1$ and $\Delta\xi$, and finally from Eq. (8.58) we obtain

$$V_2 = \left[u^2 + (v'_\infty)^2 + 2uv'_\infty\cos\xi_2\right]^{1/2} = 20.2\ \mathrm{km/s} \tag{8.65}$$

Thus Voyager-2 gained a velocity increment $\Delta V \sim 7$ km/s on swinging by Saturn. This is to be compared with the velocity increments in thrusting maneuvers, which are typically $\Delta V \sim 15$ m/s.

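Having found $V_2$ let us estimate the arrival velocity at Uranus. From energy conservation we have

$$V_{\mathrm{U}}^2 - \frac{2K_\odot}{R_{\mathrm{U}}} = V_{\mathrm{S}}^2 - \frac{2K_\odot}{R_{\mathrm{S}}} \tag{8.66}$$

where for simplicity we assumed circular orbits for Uranus and Saturn, with radii $R_{\mathrm{U}}$ and $R_{\mathrm{S}}$. Setting $V_{\mathrm{S}} = V_2 = 20.2$ km/s as found in Eq. (8.65) we obtain

$$V_{\mathrm{U}} = \left[V_{\mathrm{S}}^2 - 2K_\odot\left(\frac{1}{R_{\mathrm{S}}} - \frac{1}{R_{\mathrm{U}}}\right)\right]^{1/2} = 17.9\ \mathrm{km/s}$$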

Fig. 8.16. Trajectories of the Pioneer and Voyager spacecraft after leaving the solar system; the positions are shown for the year 2000 AD.

Therefore we predict an average velocity between Saturn and Uranus of $V = 19.05$ km/s, in good agreement with the value $V = 19.3$ km/s listed in Table 8.2. Given that the final velocity after escape from the solar system is $V_f = 16.1$ km/s, the departure velocity from Uranus must be approximately 19 km/s. Thus Voyager-2 received a boost at the Uranus swing-by as well.
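The energy-conservation estimate for the Saturn to Uranus leg can be reproduced with a few lines. This is a sketch under stated assumptions: the value of $K_\odot$ used below is the standard solar gravitational parameter, which is not quoted in this section, and the function name `speed_at` is hypothetical. Small differences from the quoted 17.9 km/s reflect rounding of the constants.

```python
import math

K_SUN = 1.327e11   # G*M of the sun in km^3/s^2 (standard value, assumed here)
AU_KM = 1.495e8    # 1 AU in km

def speed_at(R2_au, V1, R1_au):
    """Heliocentric speed at R2 from V^2 - 2K/R = constant (Eq. (8.66))."""
    h = V1**2 - 2.0 * K_SUN / (R1_au * AU_KM)   # energy constant, km^2/s^2
    return math.sqrt(h + 2.0 * K_SUN / (R2_au * AU_KM))

V_saturn = 20.2                                  # departure velocity, Eq. (8.65)
V_uranus = speed_at(19.2, V_saturn, 9.54)        # arrival velocity at Uranus
V_mean = 0.5 * (V_saturn + V_uranus)             # simple average over the leg
print(f"V_U = {V_uranus:.1f} km/s, mean = {V_mean:.1f} km/s")
```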

We can also work backwards from the data to estimate the initial launch velocity relative to the earth. A Hohmann transfer from earth to Jupiter ($R = 5.2$ AU) requires a fractional velocity increment $\Delta V_H/V_1$ that gives $\Delta V_H = 14.3$ km/s, where we used $V_1 = V_\oplus = 29.8$ km/s. If we include the escape velocity from the earth, we find for the launch velocity

$$v = [(14.3)^2 + (11.2)^2]^{1/2} = 18.2\ \mathrm{km/s}$$

This is a slight underestimate because the orbit to Jupiter is hyperbolic, but it shows that the vehicle velocities are within the capabilities of available launching systems.
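As a rough cross-check, the sketch below evaluates the standard two-burn Hohmann transfer between circular orbits. Treating the 14.3 km/s figure as the sum of both burns is an assumption about how it was obtained; the function name `hohmann_total_dv` is hypothetical.

```python
import math

def hohmann_total_dv(V1, ratio):
    """Sum of the two Hohmann burns between circular coplanar orbits.

    V1    : circular speed on the inner orbit (km/s)
    ratio : outer radius divided by inner radius
    """
    dv_depart = V1 * (math.sqrt(2.0 * ratio / (1.0 + ratio)) - 1.0)
    dv_arrive = V1 / math.sqrt(ratio) * (1.0 - math.sqrt(2.0 / (1.0 + ratio)))
    return dv_depart + dv_arrive

dv_h = hohmann_total_dv(29.8, 5.2)       # earth to Jupiter
v_launch = math.hypot(14.3, 11.2)        # quadrature sum used in the text
print(f"Hohmann dV = {dv_h:.1f} km/s, launch velocity = {v_launch:.1f} km/s")
```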

In closing, we show in Fig. 8.16 an oblique view of the trajectories of the Pioneers and Voyagers at their expected position in the year 2000 AD. These are the first probes that have been sent from earth to explore the region outside the solar system. They have also provided us with a wealth of information about the planets of our own solar system. The missions were feasible only by taking advantage of the gravitational field of the planets and required a highly refined tracking and command network. No doubt there will be much more exploration of our solar system, but we already have a fair understanding of the conditions prevailing on the planets as well as of the properties of the sun.

8.6 Interstellar travel

Interstellar travel is often chronicled in fascinating detail by science fiction writers. Yet, travel to another solar system is not practical today and there are fundamental limitations on the type of travel, the length of the journey and the energy requirements for such a venture. We will consider these aspects of space travel and discuss the methods of propulsion that have been proposed. These methods are based on correct scientific fact but their practical realization is far from proven, and the energy necessary for a single mission dwarfs the present energy production capabilities on earth.


The presumption motivating interstellar travel is that at least some of the stars that we see in the sky have planetary systems similar to our own, and may harbor planets hospitable to human life. While this may be plausible, it is not proven, because planets are practically impossible to observe at astronomical distances. Furthermore, in our part of the galaxy the density of stars is low, and the nearest stars are several light years away. We give below a list of the four nearest stars and show their relative positions in Fig. 8.17.

Star               Distance    Properties
Alpha-Centauri     4.3 LY      Triple star
Barnard's star     6.0 LY      Red dwarf
Sirius             8.2 LY      Large bright star
Epsilon-Eridani    10.8 LY     Similar to sun

Note that distances are now measured in light years (LY), where

1 AU = 1.495 x 10^11 m
1 LY = 9.458 x 10^15 m
1 pc = 3.086 x 10^16 m = 3.26 LY

The last unit is the 'parsec' and is defined as the distance at which 1 AU subtends a parallax of one second of arc. It is the unit commonly used to describe extended astronomical objects and distances. It follows that the parallax subtended by the above stars over one earth revolution is of the order of 1 arcsecond and therefore can be measured directly. Still, as compared to the dimensions of the solar system (50 AU), the projected travel represents an extrapolation in distance by a factor of $10^4$.

Kinematics. As our first task we want to explore the time necessary to complete a mission. Supposing that an unmanned craft is sent to α-Centauri, we would have to wait for 4.3 years after its arrival to receive the signal on earth. To complete the mission in a reasonable time (as measured by the span of human life) the craft would have to reach a velocity which is a fraction of the speed of light. We can imagine that the acceleration imparted to the craft is small (low thrust acceleration) but sustained for a very long time.

Fig. 8.17. The nearest stars to our solar system.

As an example we choose an acceleration

$$a = 0.1g \simeq 1\ \mathrm{m/s^2}$$

and calculate the time needed to complete a mission to α-Centauri for different final velocities $V_f$ of the craft. The results are given in Table 8.3 and it is evident that little is gained for $V_f/c \gtrsim 0.3$. The table has been calculated using non-relativistic kinematics, since relativistic corrections introduce only very small differences. In a realistic mission the craft should decelerate when it approaches its target and this would further add to the mission time.
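The entries of Table 8.3 can be regenerated with the non-relativistic sketch below. It assumes constant acceleration up to $V_f$, coasting for the remaining distance, no deceleration at the target, and a data return time equal to the distance in light years. The function name `mission_time` and the rounded constants are illustrative; results agree with the table only to within the table's rounding.

```python
import math

C = 2.998e8      # speed of light, m/s
YEAR = 3.156e7   # seconds per year
LY = 9.458e15    # metres per light year, as quoted in the text

def mission_time(vf_over_c, a=1.0, distance_ly=4.3):
    """Return (t_accel, t_coast, t_signal, t_total) in years."""
    vf = vf_over_c * C
    d = distance_ly * LY
    t_acc = vf / a                      # time to reach the final velocity
    d_acc = 0.5 * a * t_acc**2          # distance covered while accelerating
    if d_acc >= d:
        # Target is reached while still accelerating: no coasting phase.
        t_acc = math.sqrt(2.0 * d / a)
        t_coast = 0.0
    else:
        t_coast = (d - d_acc) / vf
    t_signal = distance_ly              # signal returns at the speed of light
    total = (t_acc + t_coast) / YEAR + t_signal
    return t_acc / YEAR, t_coast / YEAR, t_signal, total

for beta in (1.0, 0.5, 0.3, 0.1, 0.05):
    t_a, t_c, t_s, t_tot = mission_time(beta)
    print(f"{beta:5.2f}  {t_a:5.1f}  {t_c:5.1f}  {t_s:4.1f}  {t_tot:5.1f}")
```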

Next we make an estimate of the energy requirements. For a probe of mass $M = 100$ kg (roughly one-tenth of the Voyager mass, and about one-thousandth of that of the shuttle orbiter) at velocity $V_f/c = 0.1$, the kinetic energy is

$$T = \tfrac{1}{2}MV_f^2 = 4.5 \times 10^{16}\ \mathrm{J} \simeq 10^{10}\ \text{kW-hours} \tag{8.67}$$

This corresponds to one day's energy production in the U.S. Therefore the propulsion engine should be efficient, but it must also be highly compact. These conditions are best met by nuclear fuels. Furthermore, the recoil velocities of the nuclear fragments are in the range $v_f/c \sim 0.1$ and are therefore well matched to the speed of the craft.

Apart from self-propelled vehicles it has also been proposed to use an earth-based energy source, such as a laser or beamed microwaves, to 'push' the craft along. A variant is the direct use of the solar radiation impinging on a solar sail extended by the spaceship. In analogy to the planetary swing-by one could consider a craft that 'rides a comet' out of the solar system. These methods yield rather low velocities as compared to the desired $V_f/c \sim 0.1$ and can accelerate only small payloads. We will return to these ideas after first discussing a proposal for a large ship propelled by nuclear explosions.

Table 8.3. Total mission time to α-Centauri for a = 1 m/s²

Final velocity   Acceleration time   Coasting time   Data return   Mission time
(V_f/c)          (years)             (years)         (years)       (years)
~1.0             10                  0               4.3           14
0.5              5                   6               4.3           15
0.3              3                   13              4.3           20
0.1              1                   43              4.3           48
0.05             0.5                 85              4.3           90

Dyson's ship. In the 1950s, F. Dyson and S. Ulam proposed the construction of a spaceship based on the concepts sketched in Fig. 8.18. The ship carries a large complement of nuclear bombs which are dropped one by one behind the ship and exploded; forward-going particles are caught by a shield so that about half of the explosion's momentum is transferred to the ship. An ablation shield is used to absorb the generated heat and a shock absorber to buffer the ship from the impact of the explosion. Note that the payload is large, of the order of an old-fashioned ocean liner, and that the explosions must take place well behind the craft in order to assure the radiation protection of the crew.

The shock absorber limits the velocity increment that we can impart to the vehicle in each explosion. We choose $\Delta V = 30$ m/s per explosion, which is reasonable. Further, we take the average mass of the ship to be $M = 150\,000$ tonnes, so that $\Delta V = 30$ m/s corresponds to a momentum change

$$\Delta p = M\Delta V = 4.5 \times 10^{9}\ \mathrm{kg\,m/s}$$

A rough calculation shows that a 1 megatonne (Mt) nuclear explosive can provide the thrust, as follows. Let the mass of the fragments be $dm_f = 10^3$ kg and their velocity $v_f = 0.1c$. If $\tfrac{1}{2}$ of the total momentum is transferred to the shield, the thrust is

$$\tfrac{1}{2}\,dm_f\,v_f \sim 10^{10}\ \mathrm{kg\,m/s}$$

However, to sustain an average acceleration $a = 0.1g = 1$ m/s² by velocity increments of $\Delta V = 30$ m/s, we will have to explode one thermonuclear device every 30 s!

Fig. 8.18. Schematic of a very large spaceship propelled by nuclear explosions as proposed by F. Dyson.

Next we calculate the total fuel required for a final velocity of the ship $V_f/c = 0.03$. From the simple non-relativistic expression $V_f = a\Delta t$ the acceleration time is $\Delta t \sim 10^7$ s $\sim 0.3$ years. Therefore we will need approximately

$$3 \times 10^{5}\ \text{1 Mt bombs} \tag{8.68}$$

each of a mass of 1 tonne. That amount of fuel is approximately 30 times the present nuclear arsenal of the U.S. In terms of cost, if the price of deuterium is \$200/kg and even if we ignore all other components of the bombs, we have an expenditure of \$$6 \times 10^{10}$, which is already several percent of the (yearly) U.S. federal budget. Thus a mission of that type is well outside our present capabilities.
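The bookkeeping for Dyson's ship is simple enough to lay out explicitly. The sketch below only repeats the arithmetic of the text with the quoted parameters; the variable names are illustrative.

```python
C = 3.0e8                        # speed of light, m/s

ship_mass = 150_000 * 1000.0     # kg (150 000 tonnes)
dv_per_bomb = 30.0               # velocity increment per explosion, m/s
accel = 1.0                      # desired mean acceleration, m/s^2
vf = 0.03 * C                    # final velocity, m/s

dp_per_bomb = ship_mass * dv_per_bomb    # momentum needed per explosion
interval = dv_per_bomb / accel           # seconds between explosions
t_accel = vf / accel                     # total acceleration time, s
n_bombs = t_accel / interval             # number of explosions
fuel_cost = n_bombs * 1000.0 * 200.0     # 1 tonne each at $200/kg

print(f"dp per bomb  = {dp_per_bomb:.2e} kg m/s")
print(f"interval     = {interval:.0f} s")
print(f"bombs needed = {n_bombs:.2e}")
print(f"fuel cost    = ${fuel_cost:.1e}")
```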

Beamed power. Here we consider a payload attached to a reflector as shown in Fig. 8.19. The reflector should be as large as possible and also as light as possible. It could be constructed out of a thin mesh in the case of a microwave beam, or out of thin aluminum in the case of optical radiation. We will assume a thickness $t = 16$ nm (160 Å; 5% of the laser light passes through); the areal density of the reflector, or sail, is then $\rho = 0.04$ g/m². This density is sufficient to sustain the thermal heating. If the diameter of the reflector is $d = 3.6$ km, the total area and total mass (divided approximately equally between the sail, the structure and the payload) would be

$$A = 10^{7}\ \mathrm{m^2} \qquad M = 10^{3}\ \mathrm{kg}$$

Next we calculate the necessary power to impart an acceleration $a = 0.03g = 0.3$ m/s² to the craft. For a perfect reflector $dp/dt = 2P/c$, where $P$ is the electromagnetic power (in watts) incident on the reflector and $dp/dt$ is the momentum transfer. Since $p = Mv$, and using $M = 10^3$ kg, we obtain

$$P = \frac{c}{2}\frac{dp}{dt} = \frac{Mc}{2}\frac{dv}{dt} = \frac{Mca}{2} \simeq 5 \times 10^{10}\ \mathrm{W} \tag{8.69}$$

Namely, 50 GW of laser power must be focussed to within a narrow cone. For reference we recall that a large U.S. city with a population of half a million people has a power consumption of the order of 1 GW.

Fig. 8.19. Principle of propulsion by beamed power.

If the angular divergence of the beam is set by the diffraction limit and we use 10 km optics at $\lambda = 300$ nm, we could achieve $\delta\theta \sim 3.6 \times 10^{-11}$ rad. Such an angle would preserve efficient collection of the beamed power for a distance of 350 AU, giving a final velocity to the craft $V_f \sim 0.02c$. We could envisage even larger lenses constructed in space by robots; to achieve this a significant infrastructure in space technology will be needed.
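The beamed-power numbers can be reproduced approximately as follows. Two choices in this sketch are assumptions rather than statements from the text: the diffraction limit is taken in the Rayleigh form $1.22\lambda/D$, and the useful range is taken as the distance at which the beam spot diameter ($2\,\delta\theta\,s$) equals the sail diameter. With these choices the range and final velocity come out near, but not exactly at, the quoted 350 AU and $0.02c$.

```python
C = 3.0e8          # speed of light, m/s
AU = 1.495e11      # metres per AU

M = 1.0e3          # total mass of craft, kg
a = 0.3            # acceleration, m/s^2
d_sail = 3.6e3     # sail diameter, m
D_optics = 1.0e4   # transmitting aperture, m
lam = 300e-9       # wavelength, m

P = M * C * a / 2.0                  # required power, Eq. (8.69)
theta = 1.22 * lam / D_optics        # diffraction-limited divergence (assumed form)
s_max = d_sail / (2.0 * theta)       # range where spot diameter equals sail diameter
v_final = (2.0 * a * s_max) ** 0.5   # non-relativistic, constant acceleration

print(f"P       = {P:.2e} W")
print(f"theta   = {theta:.2e} rad")
print(f"range   = {s_max / AU:.0f} AU")
print(f"v_final = {v_final / C:.3f} c")
```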

Solar sails. The idea here is to use the momentum of solar radiation either by absorbing or reflecting the radiation as shown in Fig. 8.20. At the earth's orbit the solar flux is $S_0 = 1.36 \times 10^3$ W/m² and therefore at a distance $R$ (expressed in AU)

$$S = \frac{S_0}{R^2} \qquad \text{and} \qquad \frac{dV}{dt} = \frac{2S_0A}{Mc}\frac{1}{R^2} \tag{8.70}$$

where $M$ is the mass of the craft, and $A$ the area of the reflector. If we use the parameters of the previous example, $A = 10^7$ m², $M = 10^3$ kg, we find

$$a = \frac{0.1\ \mathrm{m/s^2}}{[R(\mathrm{AU})]^2} \tag{8.70'}$$

Thus, in the vicinity of the earth a solar sailing craft could be effective, but the acceleration decreases rapidly as the craft distances itself from the sun. For a craft with the above parameters, starting from earth radius and escaping the solar system, the final velocity would be $V_f \sim 0.005c$ which, even though slow for interstellar travel purposes, still equals 100 times the escape velocity of the Voyager craft.
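The coefficient in Eq. (8.70') follows directly from Eq. (8.70) for a perfectly reflecting sail. The sketch below evaluates it with the quoted sail parameters; the function name `sail_acceleration` is hypothetical, and solar gravity is ignored, as in the equation itself.

```python
S0 = 1.36e3    # solar flux at 1 AU, W/m^2
A = 1.0e7      # sail area, m^2
M = 1.0e3      # total mass, kg
C = 3.0e8      # speed of light, m/s

def sail_acceleration(R_au):
    """Radiation-pressure acceleration (m/s^2) of a reflecting sail at R (AU)."""
    return 2.0 * S0 * A / (M * C) / R_au**2

for R in (1.0, 2.0, 5.0, 10.0):
    print(f"R = {R:4.1f} AU   a = {sail_acceleration(R):.4f} m/s^2")
```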

Antimatter propulsion. The annihilation of antimatter with matter is well studied in the laboratory and is the most efficient nuclear reaction in the sense that over half of the initial mass is converted into energy. Typically, antiproton-proton annihilations lead to final states with π-mesons

$$\bar{p}p \to n_+\pi^+ + n_-\pi^- + n_0\pi^0 \tag{8.71}$$

where $n_+$, $n_-$, $n_0$ are the number of positive, negative and neutral π-mesons (or pions) produced. On the average five pions are produced, so that the rest mass of the annihilation products is $5m_\pi \sim 700$ MeV, as compared to the initial mass $2m_p = 1880$ MeV; the balance is converted into the kinetic energy of the decay products. The $\pi^0$ decays to two γ-rays with a lifetime $\tau = 10^{-17}$ s; charged pions also decay but with longer lifetimes.

Fig. 8.20. Solar sailing using: (a) an absorbing or (b) reflecting sail.

The annihilation products are emitted isotropically and therefore it will be necessary to focus them to the rear of the ship in order to gain thrust. This can be accomplished with magnetic fields. Typical exhaust velocities will be very near the speed of light, $v_e \sim 0.8c$. Containing and handling antimatter in macroscopic quantities is a problem of unprecedented challenge and may not be solvable in practice. We will nevertheless assume that our craft carries a fuel of antiprotons of mass $M_f = 9$ kg and can use the antiprotons, intelligently, for propulsion.

For the parameters of the craft we choose a payload of $M_L = 1$ tonne and a mass of matter (to be exhausted at high velocity) $M_e = 4$ tonnes. We assume that all of the $\bar{p}p$ rest-mass is converted to energy and that this energy is transferred with 100% efficiency to the craft. Then the total energy gained by the payload will be

$$E_{\bar{p}p} = 2(M_f c^2) = 2 \times 9 \times (3 \times 10^{8})^2 \simeq 1.6 \times 10^{18}\ \mathrm{J} \tag{8.71'}$$

and the final velocity of the payload

$$V = (2E_{\bar{p}p}/M_L)^{1/2} = 6 \times 10^{7}\ \mathrm{m/s} \simeq 0.2c \tag{8.71''}$$

The above estimates are based on optimal efficiency and on a very large amount of antimatter fuel.
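The two estimates above amount to the following arithmetic, repeated here as a short sketch under the same idealized assumptions (complete conversion of rest mass to energy, 100% transfer to the payload, non-relativistic kinetic energy). The variable names are illustrative.

```python
import math

C = 3.0e8        # speed of light, m/s
M_f = 9.0        # mass of antiproton fuel, kg
M_L = 1000.0     # payload mass, kg (1 tonne)

# Each kilogram of antiprotons annihilates with a kilogram of protons.
E_annihilation = 2.0 * M_f * C**2                 # Eq. (8.71')
V_payload = math.sqrt(2.0 * E_annihilation / M_L) # Eq. (8.71'')

print(f"E = {E_annihilation:.2e} J")
print(f"V = {V_payload:.2e} m/s = {V_payload / C:.2f} c")
```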

At present antimatter is produced at high energy accelerators, where the antiprotons are stored at high velocity in a magnetically confining evacuated ring. The storage rate is typically $2 \times 10^{11}\ \bar{p}$/day, which translates to $10^{-10}$ g/year. While this rate could be increased by several orders of magnitude, we still must find alternate ways of producing antimatter if we want to use it as fuel for an interstellar mission.

It is generally accepted that chemical fuels are not adequate for interstellar travel, because of their low specific efficiency and the relatively low exhaust velocities. Ion propulsion can produce higher exhaust velocities but does not resolve the energy problem. Nuclear fuels are better suited to interstellar travel but the propulsion systems are highly complex and expensive. This will not stop humans from thinking about new techniques for interstellar travel and, hopefully, future generations may succeed in venturing beyond our own solar system.

8.7 Inertial guidance

The flight of missiles and spacecraft is controlled by instruments which can sense the rotation of the craft's axis and the acceleration of the craft with respect to a given reference frame. In certain space flights the reference frame may be fixed to the stars, whereas in terrestrial flights the earth is used as reference. Rotation is sensed by comparing the orientation of the vehicle to the axis of one or more free gyroscopes; a gyroscope on which no torques act maintains its axis of rotation in a fixed direction. Special relativity precludes the possibility of measuring the uniform velocity of a vehicle by purely internal instruments. In contrast, the acceleration of the vehicle can be measured through the inertial force $\mathbf{F}_i = -m\mathbf{a}$ which is experienced by any massive body in accelerated motion. Integrating the measured acceleration yields the velocity, and a further integration of the velocity gives the position of the vehicle.
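The double integration described above can be sketched in one dimension as follows. This is a minimal illustration assuming evenly spaced acceleration samples and trapezoidal integration; real inertial platforms work in three dimensions and must also correct for gravity and sensor drift. The function name `integrate_track` is hypothetical.

```python
def integrate_track(accels, dt, v0=0.0, x0=0.0):
    """Integrate sampled acceleration twice to obtain velocity and position.

    accels : list of acceleration samples (m/s^2), spaced dt seconds apart
    Returns a list of (time, velocity, position) tuples.
    """
    v, x = v0, x0
    track = [(0.0, v, x)]
    for i in range(1, len(accels)):
        # Mean acceleration over the step (trapezoidal rule).
        a_mid = 0.5 * (accels[i - 1] + accels[i])
        x += v * dt + 0.5 * a_mid * dt * dt
        v += a_mid * dt
        track.append((i * dt, v, x))
    return track

# Example: constant acceleration of 1 m/s^2 for 10 s, sampled every 0.1 s.
track = integrate_track([1.0] * 101, 0.1)
t_end, v_end, x_end = track[-1]
print(f"t = {t_end:.1f} s, v = {v_end:.2f} m/s, x = {x_end:.2f} m")
```

Because errors in the measured acceleration are integrated twice, small biases grow quadratically in position, which is why the accuracy demanded of accelerometers and gyroscopes is so high.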

The above statements are true only in an inertial frame, a frame where Newton's laws are strictly valid. The earth is not an inertial frame because of its daily rotation; however, the acceleration of the earth due to its motion around the sun is a much smaller effect and can be ignored. Furthermore, in the vicinity of the earth all bodies are acted upon by the force of gravity and this must be known in order to deduce the correct acceleration. Therefore, accelerometers must be mounted on a stable platform with respect to the earth. It can be said that one navigates in reference to the direction of the earth's gravitational field.

The principle of operation of an accelerometer can be understood by the analogy to 'Einstein's' elevator. We suspend a mass from a vertical spring which in turn is fastened to the roof of the elevator. When the elevator is at rest, as in Fig. 8.21(a), the spring is stretched by an amount $\Delta x$ where $k\Delta x = mg$, with $k$ the restoring constant of the spring and $m$ the mass. If the elevator is in free fall as in (b) of the figure, the spring will be in its free, or equilibrium, position ($\Delta x = 0$) because no force acts on it; note that in this case the elevator accelerates downwards with acceleration $a = g$. Finally, if the elevator accelerates upwards with acceleration $a = -g$, the spring will indicate twice the displacement of case (a) because $k\Delta x = m(g - a)$; this is sketched in (c) of the figure.

Fig. 8.21. Principle of operation of an accelerometer demonstrated with the help of 'Einstein's' elevator: (a) elevator at rest (the force of gravity acts on the mass and extends the spring), (b) elevator in free fall (no force acts on the mass), (c) elevator accelerates upwards with $a = -g$ (the force on the mass is twice that of gravity and extends the spring twice as much as in (a)).

In a frame of reference which has acceleration $\mathbf{a}$ with respect to the earth, a body of mass $m$ feels a specific force

$$\mathbf{f} = \mathbf{F}_g + \mathbf{F}_i = m(\mathbf{g} - \mathbf{a}) \tag{8.72}$$

where $\mathbf{F}_g$ is the force of gravity and $\mathbf{F}_i$ the inertial force. Accelerometers measure $(\mathbf{g} - \mathbf{a})$ along three orthogonal axes, and if the direction and magnitude of $\mathbf{g}$ are precisely known, one obtains $\mathbf{a}$. The simplest type of accelerometer is a pendulum with very low friction, one realization being sketched in Fig. 8.22(a). The effects of friction can be canceled by using a suitable feedback system; when the shaft of the pendulum turns, a signal is produced which is used to drive a torque motor to restore the pendulum to its equilibrium position. The current in the servo loop is a direct measure of the specific force. Another design is based on the mass-spring idea, but instead of a spring the mass is supported by a 'force generator' driven by an electric current as shown in Fig. 8.22(b). The moving mass is kept at the equilibrium position by a servo system where, as before, the signal current is a measure of the specific acceleration along the axis of the instrument.
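The three elevator cases follow from Eq. (8.72) applied along the vertical. In the sketch below the sign convention of the text is used (accelerations measured positive downwards, along $\mathbf{g}$); the function name `spring_extension` and the numerical values of $m$ and $k$ are illustrative.

```python
G = 9.81   # acceleration of gravity, m/s^2

def spring_extension(m, k, a_down):
    """Spring extension (m) for a mass m (kg) on a spring of constant k (N/m).

    a_down : acceleration of the elevator, positive downwards (along g),
             so free fall is a_down = G and upward acceleration is negative.
    """
    return m * (G - a_down) / k

m, k = 0.1, 50.0
at_rest = spring_extension(m, k, 0.0)     # case (a)
free_fall = spring_extension(m, k, G)     # case (b)
upward = spring_extension(m, k, -G)       # case (c)
print(at_rest, free_fall, upward)
```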

Fig. 8.22. Practical accelerometers always involve a feedback mechanism: (a) the sensing element is a pendulous mass, (b) the motion of the sensing element is linear.

To maintain a fixed direction in space, whether in airplanes, missiles, or ships, gyroscopes are always used. Mechanical gyroscopes have been developed to great perfection, but recently laser ring gyros are being introduced in many commercial applications. Fundamentally, a gyroscope is a wheel spinning fast about its principal axis of inertia as shown in Fig. 8.23. The suspension of the shaft must be as free as possible so that no external torques are applied; in the absence of torques the angular momentum is conserved and the axis of the gyro will remain fixed in the inertial frame. To show this explicitly we will present an elementary analysis of gyroscopic motion.

We use the coordinate system shown in Fig. 8.23 and designate the moment of inertia with respect to the x-axis by $I$, whereas for the transverse directions we use $I_t$. The gyro is constructed so that $I \gg I_t$. The angular momentum is

$$\mathbf{L} = I\omega_x\mathbf{i} + I_t\omega_y\mathbf{j} + I_t\omega_z\mathbf{k} \tag{8.73}$$

where the unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are fixed to the body of the gyro. It is clear from Eq. (8.73) that the angular momentum vector does not necessarily lie along the symmetry axis (the x-axis in this case); however, since $I \gg I_t$ and $\omega_x \gg \omega_y, \omega_z$, in a good gyro $\mathbf{L}$ is always near the principal axis. The equation of motion for $\mathbf{L}$ in vector notation is

$$\frac{d\mathbf{L}}{dt} = \boldsymbol{\tau} \tag{8.74}$$

and as $\mathbf{L}$ changes, the gyro rotates so as to maintain its principal axis aligned with the angular momentum vector.

Fig. 8.23. A gyroscope is a wheel spinning fast about its principal axis of inertia; $\mathbf{L}$ is the angular momentum vector.

The characteristic property of gyros is that when a torque $\boldsymbol{\tau}$ is applied, they usually rotate in a direction normal to $\boldsymbol{\tau}$. Such response is contrary to the experience with non-spinning bodies but is a simple consequence of Eq. (8.74). In Fig. 8.24(a) we show a torque $\boldsymbol{\tau}$ acting on a gyro of angular momentum $\mathbf{L}$, where $\boldsymbol{\tau}$ is orthogonal to $\mathbf{L}$. In a time interval $\Delta t$ the change in $\mathbf{L}$ is $\Delta\mathbf{L} = \boldsymbol{\tau}\,\Delta t$ and is directed along $\boldsymbol{\tau}$ as shown in the figure. Thus $\mathbf{L}$ changes its direction to $\mathbf{L}'$ but not its magnitude; $\mathbf{L}$ is rotated by an angle $\Delta\theta$ where

$$\Delta\theta = \frac{|\Delta\mathbf{L}|}{|\mathbf{L}|}$$

As long as $\boldsymbol{\tau}$ continues to act on the gyro in a direction normal to the gyro axis, the gyro rotates about the direction normal to $\mathbf{L}$ and $\boldsymbol{\tau}$ at a rate

$$\Omega = \lim_{\Delta t \to 0}\frac{\Delta\theta}{\Delta t} = \frac{|\boldsymbol{\tau}|}{|\mathbf{L}|}$$

This result can be expressed vectorially by the equation

$$\boldsymbol{\tau} = \boldsymbol{\Omega} \times \mathbf{L} \tag{8.75}$$

where $\boldsymbol{\Omega}$ specifies both the axis and magnitude of the slow rotation; such motion is called precession.
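Equation (8.75) can be inverted for the precession vector when the torque is orthogonal to $\mathbf{L}$: in that case $\boldsymbol{\Omega} = (\mathbf{L} \times \boldsymbol{\tau})/|\mathbf{L}|^2$ satisfies it, as the sketch below verifies numerically. The helper names `cross` and `precession_vector` and the sample values are illustrative.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precession_vector(L, tau):
    """Omega satisfying tau = Omega x L, valid for tau orthogonal to L."""
    L2 = sum(c * c for c in L)
    return tuple(c / L2 for c in cross(L, tau))

L = (2.0, 0.0, 0.0)      # angular momentum along the spin (x) axis, kg m^2/s
tau = (0.0, 0.1, 0.0)    # torque along y, N m
omega = precession_vector(L, tau)

print("Omega =", omega)                  # points along z, normal to L and tau
print("Omega x L =", cross(omega, L))    # recovers the applied torque
```

The magnitude of `omega` equals $|\boldsymbol{\tau}|/|\mathbf{L}|$, so a gyro with large angular momentum precesses slowly under a given disturbing torque.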

Fig. 8.24. (a) Precession of a gyroscope under the influence of a torque $\boldsymbol{\tau}$. (b) Nutation of a gyroscope in the absence of external torques; the principal rotation axis describes a cone around the angular momentum vector, which is fixed in space.

In the absence of torques, $\mathbf{L}$ must remain fixed in space, but this does not imply that the principal axis of the gyroscope will also necessarily remain fixed. In general, the principal axis will slowly rotate around the direction of $\mathbf{L}$, as shown in Fig. 8.24(b); this motion is called nutation and will be superimposed on the precession that results from the presence of a torque. To find the half-angle of the nutation cone, we consider the case of no external torques; thus $\mathbf{L}$ and $L^2$ are conserved

$$L^2 = I^2\omega_x^2 + I_t^2(\omega_y^2 + \omega_z^2) = \text{constant} \tag{8.76}$$

The kinetic energy of the gyro is also conserved

$$E = \tfrac{1}{2}\mathbf{L}\cdot\boldsymbol{\omega} = \tfrac{1}{2}I\omega_x^2 + \tfrac{1}{2}I_t(\omega_y^2 + \omega_z^2) = \text{constant} \tag{8.76'}$$

Since $I$ and $I_t$ are independent, the two Eqs. (8.76) can be satisfied simultaneously only if $\omega_x^2$ and $(\omega_y^2 + \omega_z^2)$ are independently constant throughout the motion. Therefore the principal axis lies on a cone centered on $\mathbf{L}$ and with half-angle $\phi$, where

$$\phi = \tan^{-1}\left[\frac{I_t(\omega_y^2 + \omega_z^2)^{1/2}}{I\omega_x}\right] = \text{constant} \tag{8.76''}$$

In terms of the sketch of Fig. 8.24(b), the transverse angular momentum is free to rotate in the (gyro-fixed) y-z plane, and its motion is determined by the initial conditions.*
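The half-angle of the nutation cone is the angle between the principal (x) axis and $\mathbf{L}$, so it can be evaluated from the ratio of the transverse to the axial angular momentum. The sketch below does this; the function name `nutation_half_angle` and the sample values are illustrative.

```python
import math

def nutation_half_angle(I, I_t, wx, wy, wz):
    """Half-angle (radians) between the principal axis and L, Eq. (8.76'')."""
    L_transverse = I_t * math.hypot(wy, wz)   # angular momentum normal to the spin axis
    L_axial = I * wx                          # angular momentum along the spin axis
    return math.atan2(L_transverse, L_axial)

# Fast spin about x with a small transverse angular velocity.
phi = nutation_half_angle(I=1.0, I_t=0.5, wx=100.0, wy=3.0, wz=4.0)
print(f"phi = {math.degrees(phi):.2f} degrees")
```

With $I \gg I_t$ and $\omega_x \gg \omega_y, \omega_z$ the cone is very narrow, which is the sense in which $\mathbf{L}$ stays near the principal axis in a good gyro.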

Gyros are built with freedom to rotate about one or both axes, inaddition to the principal axis. As an example we will discuss thesingle-degree-of-freedom gyro shown in Fig. 8.25. This type of gyro wasdeveloped by C. S. Draper at M.I.T. and is widely used; the spinningwheel assembly is contained in an enclosure which floats at neutralbuoyancy in a high viscosity fluid to reduce gravitational forces on thebearings. When the case rotates about the 'input axis' the spin referenceaxis tends to precess about the free axis - labeled in the figure as the'output axis' - in the direction shown. Such rotation induces in thegenerator a signal which after amplification drives the torque motor.However, exerting a torque on the gyro (along the output axis) forces thegyro, and therefore its case, to precess around the input axis; the precession

Fig. 8.25. The single-degree-of-freedom hermetically sealed gyroscope developed by C. S. Draper at MIT.

[Fig. 8.25 labels: spin motor, gyro wheel, torque generator, output axis, case, signal generator, spin reference axis, float.]

* The exact motion of the spin axis is obtained from solving Eq. (8.74) for $\boldsymbol{\tau} = 0$ after introducing $\mathbf{L}$ from Eq. (8.73).


is such that the reference axis is maintained exactly at its original position in space. Different arrangements in the servomechanism make it possible to measure either the rate of precession or the total angle of rotation.

The measurement of the very small rate of rotation that is necessary for precise inertial guidance can also be achieved by optical means. If two electromagnetic waves, for instance light beams, travel in a closed path in opposite directions, the transit time will be different for the two waves if the whole assembly rotates with respect to an inertial frame. This can be understood with the help of Fig. 8.26(a) if it is argued that, because of the relative motion between the sending and receiving end, the path in one direction is effectively shorter than in the other direction.

We can obtain the correct answer for the path difference by an elementary argument as follows: the path difference will be $\Delta L = T\,\Delta v$ where $\Delta v$ is the difference in the velocity of propagation of the two beams, and $T$ is the time of propagation around the loop. If we take a circular path of radius $R$, rotating with angular velocity $\Omega$

$$\Delta v = 2R\Omega \qquad T = 2\pi R/c$$

and therefore

$$\Delta L = \frac{4\pi R^2\Omega}{c} = \frac{4A\Omega}{c} \tag{8.77}$$

We expressed the path difference in terms of the area $A$ enclosed by the loop, and the result of Eq. (8.77) is exact even though the precise derivation involves considerations on the propagation of light in an

Fig. 8.26. Principle of operation of an optical (ring laser) gyro. (a) The effective path length for light propagating in an optical fiber loop along, or counter to, the direction of rotation is different and can be detected by interference techniques. (b) The open ring gyro is formed using mirrors. (c) An active medium is introduced in the path so that the system can lase, eliminating the need to inject and extract the light.

[Fig. 8.26 labels: long path, short path, out; panels (a), (b), (c).]


accelerated frame of reference. This phenomenon is known as the Sagnac effect, first discussed in 1913.
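The elementary argument for the path difference can be checked numerically. The sketch below assumes a circular loop whose rotation axis is normal to its plane; the loop radius and the use of the earth's rotation rate are illustrative choices, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def path_difference_from_area(area, omega):
    """Sagnac path difference 4*A*Omega/c for a loop of enclosed area A (m^2)."""
    return 4.0 * area * omega / C

def path_difference_circular(radius, omega):
    """Same quantity from the elementary argument: transit time times velocity difference."""
    delta_v = 2.0 * radius * omega               # velocity difference of the two beams
    transit_time = 2.0 * math.pi * radius / C    # time to go once around the loop
    return transit_time * delta_v

RADIUS = 0.10          # m, assumed loop radius
OMEGA_EARTH = 7.29e-5  # rad/s, rotation rate of the earth

dl_area = path_difference_from_area(math.pi * RADIUS ** 2, OMEGA_EARTH)
dl_loop = path_difference_circular(RADIUS, OMEGA_EARTH)
print(dl_area, dl_loop)
```

Both routes give the same number, of order $10^{-14}$ m for these values, which makes plain why interferometric techniques and many passes around the loop are needed.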

In spite of the smallness of $\Delta L$, one can use interferometric techniques to measure the phase difference between the two beams, and amplify the effect by letting the beams make many, say $N$, revolutions around the ring. Then

$$\Delta\phi = \frac{8\pi N A}{\lambda c}\,\Omega \tag{8.78}$$

To appreciate the orders of magnitude involved we choose $A = 100\ \text{cm}^2$, $N = 1000$, $\lambda = 500$ nm and a rotation rate $\Omega \approx 7 \times 10^{-8}$ rad/s; this corresponds to $\Omega \approx 10^{-3}\,\Omega_E$ where $\Omega_E$ is the rotation rate of the earth. For these parameters $\Delta\phi \approx 10^{-5}$ radians, namely $3 \times 10^{-6}$ of an interference fringe. Ring laser gyros have reached this accuracy, which in more familiar units corresponds to $\delta\Omega \approx 0.01^\circ$/hour.

Ring laser gyros can be configured as shown in Fig. 8.26(a) by using optical fibers and injecting/extracting the two beams. Other possibilities are to use open mirrors arranged in a ring pattern as shown in (b) of the figure and again inject and extract the beams. More interestingly, a lasing (gain) medium can be included in the ring path as in (c). In this case the two beams have different frequencies, with $\Delta f = 4A\Omega/(l\lambda)$, where $l$ is the length of the ring. The development of laser gyros is being pursued because of their compactness and relatively simpler construction as compared to mechanical gyros, rather than because of superior precision.
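As a rough illustration of the beat frequency of an active ring, the sketch below evaluates $\Delta f = 4A\Omega/(l\lambda)$ for a hypothetical square ring. The side length, the 633 nm wavelength and the choice of the earth's rotation rate (with the rotation axis taken normal to the ring) are assumptions made only for this example.

```python
def ring_laser_beat_frequency(area, perimeter, wavelength, omega):
    """Frequency difference (Hz) between the two counter-propagating beams."""
    return 4.0 * area * omega / (perimeter * wavelength)

SIDE = 0.10            # m, assumed side of a square ring
WAVELENGTH = 633e-9    # m, assumed laser wavelength
OMEGA_EARTH = 7.29e-5  # rad/s, rotation rate of the earth

beat_hz = ring_laser_beat_frequency(SIDE ** 2, 4.0 * SIDE, WAVELENGTH, OMEGA_EARTH)
# The beat frequency is linear in the rotation rate.
beat_hz_slow = ring_laser_beat_frequency(SIDE ** 2, 4.0 * SIDE, WAVELENGTH, 1e-3 * OMEGA_EARTH)
print(beat_hz, beat_hz_slow)
```

For these assumed dimensions the earth's rotation alone produces a beat of roughly ten hertz, so the rotation rate is read out as a frequency rather than as a small fraction of a fringe.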

Exercises

Exercise 8.1

The term 'low orbit' implies a circular orbit around a planet with $r \simeq R_{\text{planet}}$.

(a) Calculate the speed of a satellite in low orbit around the earth and around Mercury.

(b) Find the time for one revolution.

Exercise 8.2

Consider a Hohmann transfer between two circular orbits of radius $a_I$ and $a_F$ ($a_F > a_I$). Show that the energy increments corresponding to the velocity increments $\Delta V_A$ and $\Delta V_B$ add up, as they must, to

$$\frac{Km}{2}\left[\frac{1}{a_I} - \frac{1}{a_F}\right]$$

Here $K = GM_\odot$ and $m$ is the mass of the vehicle.


Exercise 8.3

Analyse the Voyager-2 swing by Uranus that took place in January of 1986. The trajectory in the heliocentric system is shown in Fig. 8.13 and the hyperbolic excess velocity is

$$v_\infty(\text{Uranus}) = 14\,730\ \text{m/s}$$

The aim point is at a distance of closest approach

$$\text{C.A.} = 4.21\ (\text{Uranus radii}) = 107\,080\ \text{km}$$

(a) Find the heliocentric velocities of approach and departure from Uranus.

(b) Find the heliocentric velocity of approach to Neptune and the time of flight.

(c) Assuming that the Neptune encounter does not alter the trajectory significantly, find the velocity of Voyager-2 after escape from the solar system.

Exercise 8.4

(a) If you were to use a 100% reflecting surface as a solar sail so as to navigate away from the earth, how would you deploy it?

(b) For a sail area of 1 km$^2$ what would be the thrust?

(c) What is the velocity change for a 1000 kg vehicle after a year of sailing at the earth's orbit?

The solar constant at the radius of the earth is $S = 0.136\ \text{W/cm}^2$.

Exercise 8.5

Consider a rocket accelerated by the ejection of ions. Use reasonable values to calculate the rocket velocity that can be achieved for a payload of 10 tons.

Exercise 8.6

Show that a coordinate system fixed to the center of the earth can be considered as an inertial coordinate system to an accuracy of a few parts in $10^8$.

Appendix 1
THE FOURIER TRANSFORM

Consider a periodic function of time $V(t)$ with period

$$T = \frac{1}{f_0} = \frac{2\pi}{\omega_0} \tag{A1.1}$$

$V(t)$ can always be decomposed into an even and an odd part

$$V(t) = V_e(t) + V_o(t) \qquad V_e(t) = V_e(-t) \qquad V_o(t) = -V_o(-t) \tag{A1.2}$$

The Fourier theorem assures us that we can express $V_e(t)$ and $V_o(t)$ as a series of harmonics of $\omega_0$

$$V_e(t) = A_0 + \sum_{n=1}^{\infty} A_n\cos(n\omega_0 t) \tag{A1.3}$$

$$V_o(t) = \sum_{n=1}^{\infty} B_n\sin(n\omega_0 t) \tag{A1.4}$$

The coefficients $A_n$ are found by multiplying both sides of Eq. (A1.3) by $\cos(m\omega_0 t)$ and integrating over $dt$ from $-T/2$ to $+T/2$

$$\int_{-T/2}^{T/2} V_e(t)\cos(m\omega_0 t)\,dt = \int_{-T/2}^{T/2} A_0\cos(m\omega_0 t)\,dt + \sum_{n=1}^{\infty}\int_{-T/2}^{T/2} A_n\cos(n\omega_0 t)\cos(m\omega_0 t)\,dt \tag{A1.5}$$

The cosine functions of the harmonics of $\omega_0$ form an orthogonal set in the interval $-T/2$ to $+T/2$

$$\int_{-T/2}^{T/2}\cos(n\omega_0 t)\cos(m\omega_0 t)\,dt = \frac{T}{2}\,\delta_{mn} \qquad m, n \neq 0$$


Thus we obtain from Eq. (A1.5)

$$A_0 = \frac{1}{T}\int_{-T/2}^{T/2} V(t)\,dt \tag{A1.6}$$

$$A_n = \frac{2}{T}\int_{-T/2}^{T/2} V(t)\cos(n\omega_0 t)\,dt \tag{A1.7}$$

By a similar argument and using the orthogonality of the sine functions we obtain for the $B_n$ coefficients

$$B_n = \frac{2}{T}\int_{-T/2}^{T/2} V(t)\sin(n\omega_0 t)\,dt \tag{A1.8}$$

For a continuous, non-periodic function we must use a Fourier integral. We will use complex notation, so that even if the function $V(t)$ is real, the Fourier transform $A(\omega)$ may be complex. Complex notation greatly simplifies calculations and is widely used; note that a complex function contains both amplitude and phase information. We express the function $V(t)$ through the integral

$$V(t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} A(\omega)\,e^{-i\omega t}\,d\omega \tag{A1.9}$$

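To determine $A(\omega)$ we multiply by $e^{i\omega' t}$ and integrate over time

$$\int_{-\infty}^{\infty} V(t)\,e^{i\omega' t}\,dt = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} A(\omega)\,d\omega\int_{-\infty}^{\infty} e^{i(\omega' - \omega)t}\,dt = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} A(\omega)\,2\pi\,\delta(\omega - \omega')\,d\omega = (2\pi)^{1/2}A(\omega')$$

Thus

$$A(\omega) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} V(t)\,e^{i\omega t}\,dt \tag{A1.10}$$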

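If $V(t)$ is real, $A(\omega)$ can be expressed in terms of trigonometric functions

$$\text{Re}[A(\omega)] = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} V(t)\cos(\omega t)\,dt \qquad \text{Im}[A(\omega)] = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} V(t)\sin(\omega t)\,dt \tag{A1.11}$$

Furthermore when $V(t)$ is real, negative and positive frequencies are related through

$$A(-\omega) = A^*(\omega) \tag{A1.12}$$

as follows immediately from Eqs. (A1.11).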


If $V(t)$ is real and an even function of $t$, $A(\omega)$ is real and even in $\omega$; then

$$V(t) = \left(\frac{2}{\pi}\right)^{1/2}\int_0^{\infty} A(\omega)\cos(\omega t)\,d\omega \tag{A1.13}$$

If $V(t)$ is real and an odd function of $t$, $A(\omega)$ is imaginary and odd in $\omega$; then

$$V(t) = -i\left(\frac{2}{\pi}\right)^{1/2}\int_0^{\infty} A(\omega)\sin(\omega t)\,d\omega \tag{A1.14}$$

Note that in the expressions (A1.13, 14) the integration is only over positive frequencies. If the symmetry properties of $V(t)$ are specified, the integration over $dt$ in Eqs. (A1.11) can be restricted to the interval $0 < t < \infty$.

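In deriving Eq. (A1.10) we made use of the fundamental relation

$$\int_{-\infty}^{\infty} e^{i(\omega - \omega')t}\,dt = 2\pi\,\delta(\omega - \omega') \tag{A1.15}$$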

Eq. (A1.15) is a representation of the Dirac delta function which can be defined through

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0) \tag{A1.16}$$

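One way of arriving at the $\delta$-function is based on a limiting procedure that can be obtained from Eq. (A1.15). We let $\omega - \omega' = \lambda$, then

$$\delta(\lambda) = \lim_{T\to\infty}\frac{1}{2\pi}\int_{-T}^{T} e^{i\lambda t}\,dt = \lim_{T\to\infty}\frac{1}{2\pi i\lambda}\left(e^{i\lambda T} - e^{-i\lambda T}\right) = \lim_{T\to\infty}\frac{\sin(\lambda T)}{\pi\lambda}$$

As $T \to \infty$, $\delta(\lambda) \to 0$ if $\lambda \neq 0$. Thus

$$\delta(\lambda) \to (T/\pi) \to \infty \qquad \lambda = 0$$

But

$$\int_{-\infty}^{\infty}\delta(\lambda)\,d\lambda = \lim_{T\to\infty}\int_{-\infty}^{\infty}\frac{\sin(\lambda T)}{\pi\lambda}\,d\lambda = \frac{2}{\pi}\int_0^{\infty}\frac{\sin x}{x}\,dx = 1$$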
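The limiting procedure can be made concrete by testing the sifting property (A1.16) at finite $T$. The sketch below integrates a Gaussian test function $f(\lambda) = e^{-\lambda^2}$ against $\sin(\lambda T)/(\pi\lambda)$; for this particular choice the integral equals $\operatorname{erf}(T/2)$, which tends to $f(0) = 1$ as $T$ grows. The function names, the integration range and the step count are assumptions of this example.

```python
import math

def delta_T(lam, T):
    """Finite-T representation sin(lam*T)/(pi*lam) of the delta function."""
    if lam == 0.0:
        return T / math.pi  # limiting value at lam = 0
    return math.sin(lam * T) / (math.pi * lam)

def gaussian(lam):
    return math.exp(-lam * lam)

def sift(f, T, half_width=8.0, steps=40000):
    """Midpoint-rule estimate of the integral of f(lam) * delta_T(lam, T)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        lam = -half_width + (k + 0.5) * h
        total += f(lam) * delta_T(lam, T)
    return total * h

results = {T: sift(gaussian, T) for T in (1.0, 4.0, 10.0)}
for T, value in results.items():
    print(T, value, math.erf(T / 2.0))
```

The estimates approach $f(0)$ as $T$ increases, even though the kernel itself keeps oscillating for $\lambda \neq 0$; it is the integral against a smooth function that converges.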

Appendix 2
THE POWER SPECTRUM

We define the correlation $C(\tau)$ between two functions of time $f_1(t)$ and $f_2(t)$ as the average value of their shifted overlap*

$$C(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} f_1(t)f_2(t - \tau)\,dt \tag{A2.1}$$

The autocorrelation $R(\tau)$ is defined through

$$R(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} f(t)f(t - \tau)\,dt \tag{A2.2}$$

$R(\tau)$ measures how the function $f(t)$ remains self-similar over the time interval $\tau$. For instance if $f(t) = A$ is constant, then

$$R(\tau) = \frac{1}{T}\int_{-T/2}^{T/2} A^2\,dt = A^2$$

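If $f(t) = A\cos\omega t$, then

$$R(\tau) = \frac{A^2}{T}\int_{-T/2}^{T/2}\cos\omega t\,\cos(\omega t - \omega\tau)\,dt = \frac{A^2}{T}\int_{-T/2}^{T/2}\left[\cos^2\omega t\,\cos\omega\tau + \cos\omega t\,\sin\omega t\,\sin\omega\tau\right]dt = \frac{A^2}{2}\cos\omega\tau$$

as expected.

Next we evaluate the Fourier transform of the autocorrelation of $y(t)$

$$\int_{-\infty}^{\infty} R(\tau)\,e^{i\omega\tau}\,d\tau = \int_{-\infty}^{\infty}\lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} y(t)\,y(t - \tau)\,e^{i\omega\tau}\,dt\,d\tau \tag{A2.3}$$

* In more advanced treatments the ensemble average is used instead of an integral over time.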


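We can insert $e^{i\omega t}e^{-i\omega t} = 1$ into the integrand and obtain for the r.h.s. of Eq. (A2.3)

$$\lim_{T\to\infty}\frac{2\pi}{T}\left[\frac{1}{(2\pi)^{1/2}}\int_{-T/2}^{T/2} y(t)\,e^{i\omega t}\,dt\right]\left[\frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} y(t - \tau)\,e^{-i\omega(t - \tau)}\,d\tau\right]$$

This is the product of two terms, each of which is the Fourier transform of $y(t)$, the second one being complex conjugated. Thus the r.h.s. of Eq. (A2.3) becomes

$$\lim_{T\to\infty}\frac{2\pi}{T}\left[g(\omega)\,g^*(\omega)\right] = G(\omega)$$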

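and therefore we have shown that the power spectrum is the Fourier transform of the autocorrelation function (Eq. (3.34)).

$$G(\omega) = \int_{-\infty}^{\infty} R(\tau)\,e^{i\omega\tau}\,d\tau \tag{A2.4}$$

The inverse relation is obviously (Eq. (3.35))

$$R(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(\omega)\,e^{-i\omega\tau}\,d\omega \tag{A2.5}$$

Therefore it follows immediately that

$$R(0) = \int_{-\infty}^{\infty} G(\omega)\,df \tag{A2.6}$$

where $f = \omega/2\pi$.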

This result is an expression of Parseval's theorem which states that the same total power is obtained either in the time or frequency domain

$$\int_{-\infty}^{\infty}|y(t)|^2\,dt = \int_{-\infty}^{\infty}|g(\omega)|^2\,d\omega \tag{A2.7}$$

where $g(\omega)$ is the Fourier transform of $y(t)$.
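Two results of this appendix lend themselves to a quick numerical check: the autocorrelation of a cosine, $(A^2/2)\cos\omega\tau$, and Parseval's theorem (A2.7). The sketch below uses midpoint sums; the Gaussian test pulse, the integration ranges, the step counts and all function names are assumptions chosen for illustration, and the transform uses the symmetric $(2\pi)^{-1/2}$ convention of Appendix 1.

```python
import cmath
import math

def autocorrelation_of_cosine(amplitude, omega, tau, periods=20, steps_per_period=200):
    """Time-averaged R(tau) of Eq. (A2.2) for f(t) = amplitude * cos(omega * t)."""
    T = periods * 2.0 * math.pi / omega  # whole periods, so finite T already gives the limit
    n = periods * steps_per_period
    dt = T / n
    total = 0.0
    for k in range(n):
        t = -T / 2.0 + (k + 0.5) * dt
        total += amplitude * math.cos(omega * t) * amplitude * math.cos(omega * (t - tau))
    return total * dt / T

def gaussian_pulse(t):
    return math.exp(-0.5 * t * t)

def transform(y, omega, t_max=10.0, steps=400):
    """Fourier transform g(omega) of y(t), approximated on [-t_max, t_max]."""
    dt = 2.0 * t_max / steps
    total = 0j
    for k in range(steps):
        t = -t_max + (k + 0.5) * dt
        total += y(t) * cmath.exp(1j * omega * t)
    return total * dt / math.sqrt(2.0 * math.pi)

def total_power_time(y, t_max=10.0, steps=400):
    dt = 2.0 * t_max / steps
    return sum(abs(y(-t_max + (k + 0.5) * dt)) ** 2 for k in range(steps)) * dt

def total_power_frequency(y, w_max=10.0, steps=200):
    dw = 2.0 * w_max / steps
    return sum(abs(transform(y, -w_max + (k + 0.5) * dw)) ** 2 for k in range(steps)) * dw

r_numeric = autocorrelation_of_cosine(2.0, 3.0, 0.4)
r_expected = 0.5 * 2.0 ** 2 * math.cos(3.0 * 0.4)
power_t = total_power_time(gaussian_pulse)
power_w = total_power_frequency(gaussian_pulse)
print(r_numeric, r_expected, power_t, power_w)
```

For the Gaussian pulse both sides of Eq. (A2.7) equal $\sqrt{\pi}$, so the two printed powers should agree with each other and with that value to within the accuracy of the sums.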

Appendix 3
THE EQUATIONS OF FLUID MECHANICS

The dynamics of a non-viscous fluid are completely determined by two equations: the continuity equation, and Euler's equation which expresses Newton's equations for a fluid. We have made use of these equations in the text but always in a simplified form. For completeness we give here their full expressions using the notation of vector calculus, but without seeking or suggesting solutions.

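(a) The continuity equation

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{A3.1}$$

For an incompressible fluid, $\rho$ is constant in space and time so that

$$\frac{\partial\rho}{\partial t} = \frac{\partial\rho}{\partial x} = \frac{\partial\rho}{\partial y} = \frac{\partial\rho}{\partial z} = 0 \tag{A3.2}$$

and the continuity equation takes the form

$$\nabla\cdot\mathbf{v} = 0 \tag{A3.3}$$

For stationary (or steady) flow

$$\frac{\partial\rho}{\partial t} = \frac{\partial P}{\partial t} = \frac{\partial\mathbf{v}}{\partial t} = 0 \tag{A3.4}$$

and the continuity equation takes the form

$$\nabla\cdot(\rho\mathbf{v}) = \rho(\nabla\cdot\mathbf{v}) + \mathbf{v}\cdot(\nabla\rho) = 0 \tag{A3.5}$$

(b) Euler's equation

$$\frac{d\mathbf{v}}{dt} = \mathbf{f} - \frac{1}{\rho}\nabla P \tag{A3.6}$$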


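Here $\mathbf{f}$ represents the external body forces per unit mass and $\mathbf{v}$ is the velocity of a fluid element; $P$ is the pressure. Note that $d\mathbf{v}/dt$ is a total derivative so that

$$\frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial x}\frac{dx}{dt} + \frac{\partial\mathbf{v}}{\partial y}\frac{dy}{dt} + \frac{\partial\mathbf{v}}{\partial z}\frac{dz}{dt} + \frac{\partial\mathbf{v}}{\partial t} \tag{A3.7}$$

where $x$, $y$, $z$ and $t$ are treated as independent variables. Eq. (A3.7) can be written in vector notation as

$$\frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} \tag{A3.7'}$$

where $(\mathbf{v}\cdot\nabla)$ has the special meaning implied by Eqs. (A3.7). Thus we introduce the concept of convective derivative

$$\frac{d}{dt} = \frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla \tag{A3.8}$$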

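(c) Special cases: low velocity

When $v \ll v_s$ we can neglect terms proportional to $\mathbf{v}$, but we keep the derivatives of $\mathbf{v}$. Then the continuity equation becomes

$$\rho(\nabla\cdot\mathbf{v}) = -\frac{\partial\rho}{\partial t} \tag{A3.9}$$

Euler's equation, with $\mathbf{f} = 0$, yields

$$\frac{d\mathbf{v}}{dt} = (\mathbf{v}\cdot\nabla)\mathbf{v} + \frac{\partial\mathbf{v}}{\partial t} = -\frac{1}{\rho}\nabla P$$

and ignoring the non-linear term

$$\frac{\partial\mathbf{v}}{\partial t} = -\frac{1}{\rho}\nabla P \tag{A3.10}$$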

Next we take the divergence of Eq. (A3.10) and the partial time derivative of Eq. (A3.9) (where we neglect a term of order $\rho(\nabla\cdot\mathbf{v})^2$), to eliminate the velocity term. We then find

$$\nabla^2 P = \frac{\partial^2\rho}{\partial t^2} \tag{A3.11}$$
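The elimination of the velocity term can be written out step by step. The following is a sketch of the intermediate algebra under the stated low-velocity approximation, in which products of small quantities are dropped; it is a reconstruction of the steps described above rather than a quotation of the original derivation.

```latex
\begin{gather*}
% Divergence of Eq. (A3.10); the term containing (\nabla\rho)\cdot(\nabla P)
% is a product of two small quantities and is dropped.
\frac{\partial}{\partial t}(\nabla\cdot\mathbf{v}) = -\frac{1}{\rho}\,\nabla^{2}P \\[4pt]
% Partial time derivative of Eq. (A3.9).
\frac{\partial\rho}{\partial t}\,(\nabla\cdot\mathbf{v})
  + \rho\,\frac{\partial}{\partial t}(\nabla\cdot\mathbf{v})
  = -\frac{\partial^{2}\rho}{\partial t^{2}} \\[4pt]
% By Eq. (A3.9) the first term equals -\rho(\nabla\cdot\mathbf{v})^{2} and is neglected.
% Substituting the first line into what remains gives Eq. (A3.11).
\rho\left(-\frac{1}{\rho}\,\nabla^{2}P\right) = -\frac{\partial^{2}\rho}{\partial t^{2}}
  \quad\Longrightarrow\quad
  \nabla^{2}P = \frac{\partial^{2}\rho}{\partial t^{2}}
\end{gather*}
```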

The equation of state of the gas or liquid provides a relation between $\rho$ and $P$ for the particular process that takes place. This allows us to express $P$ as a function of $\rho$ (or vice versa) so as to obtain a wave equation for the pressure or density. This is the subject of Appendix 4 but we wanted to stress here the approximations that go into deriving the linear Eq. (A3.11). For an adiabatic process in an ideal gas we find

$$\nabla^2\rho - \frac{1}{c^2}\frac{\partial^2\rho}{\partial t^2} = 0 \tag{A3.12}$$

with

$$c^2 = \gamma P/\rho \tag{A3.12'}$$

and $\gamma = c_P/c_V$. The speed of sound is then $v_s = \sqrt{c^2}$.

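(d) Special cases: supersonic flow

From the equation of state (see Eqs. (A3.11 and 12)) we have the relation

$$\nabla P = c^2\,\nabla\rho \tag{A3.13}$$

so that Euler's equation with $\mathbf{f} = 0$ gives

$$\frac{d\mathbf{v}}{dt} = -\frac{c^2}{\rho}\nabla\rho \tag{A3.14}$$

We take the dot product of Eq. (A3.14) with $\mathbf{v}$, to obtain

$$\mathbf{v}\cdot\frac{d\mathbf{v}}{dt} = -\frac{c^2}{\rho}\,\mathbf{v}\cdot(\nabla\rho) \tag{A3.14'}$$

We now expand the continuity equation

$$\rho(\nabla\cdot\mathbf{v}) + \mathbf{v}\cdot(\nabla\rho) = -\frac{\partial\rho}{\partial t} \tag{A3.15}$$

Comparing Eqs. (A3.14' and 15) we have the equation for supersonic flow

$$\rho\left[(\nabla\cdot\mathbf{v}) - \frac{1}{c^2}\,\mathbf{v}\cdot\frac{d\mathbf{v}}{dt}\right] = -\frac{\partial\rho}{\partial t} \tag{A3.16}$$

This is a highly non-linear equation because $(d\mathbf{v}/dt)$ is already a non-linear term (see Eq. (A3.7')) and it is further multiplied by $\mathbf{v}$.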

(e) Viscous fluid

In the presence of viscosity the Euler equation is replaced by the Navier-Stokes equation

$$\frac{d\mathbf{v}}{dt} = \mathbf{f} - \frac{1}{\rho}\nabla P + \frac{1}{3}\frac{\eta}{\rho}\nabla(\nabla\cdot\mathbf{v}) + \frac{\eta}{\rho}(\nabla^2\mathbf{v}) \tag{A3.17}$$

Here $\eta$ is the coefficient of viscosity and we note again that the equation is highly non-linear. It is for this reason that the solutions to fluid flow problems are highly complex and difficult to obtain analytically. The Navier-Stokes equation is valid for small Reynolds numbers.

Appendix 4
THE SPEED OF SOUND

Sound is due to small local variations in the pressure and density of a liquid, a gas or a solid. These variations obey a wave equation and therefore can propagate through the medium. In Appendix 3 we have shown that small variations in density and pressure are related through Eq. (A3.11).

$$\nabla^2 P = \frac{\partial^2\rho}{\partial t^2} \tag{A4.1}$$

For a gas the relation between pressure and density depends on the thermodynamic process that takes place. Sound is an adiabatic process because the variations in density and pressure are too fast to permit the transfer of heat. Thus

$$PV^\gamma = \text{constant} \tag{A4.2}$$

Since the density $\rho$ is given by $\rho = nM/V$ ($n$ = number of moles, $M$ = molecular weight) we have

$$P = K\rho^\gamma \tag{A4.3}$$

where $K$ is an arbitrary constant and $\gamma = c_P/c_V$.

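We take the gradient of Eq. (A4.3)

$$\nabla P = \gamma K\rho^{\gamma - 1}\,\nabla\rho = \frac{\gamma}{\rho}K\rho^\gamma\,\nabla\rho \tag{A4.4}$$

and if we take the divergence of Eq. (A4.4) then we obtain to first order

$$\nabla\cdot(\nabla P) = \nabla^2 P = \frac{\gamma}{\rho}K\rho^\gamma\,\nabla^2\rho = \left(\frac{\gamma P}{\rho}\right)\nabla^2\rho \tag{A4.4'}$$

Thus Eq. (A4.1) becomes a wave equation

$$\nabla^2\rho - \frac{1}{c^2}\frac{\partial^2\rho}{\partial t^2} = 0 \tag{A4.5}$$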


with

$$c^2 = (\gamma P/\rho) \tag{A4.6}$$

For an ideal gas

$$PV = nRT$$

with $n$ the number of moles and $R$ the gas constant. Thus

$$\frac{P}{\rho} = \frac{RT}{M}$$

and the speed of sound in a gas is given by

$$v_s = (\gamma RT/M)^{1/2} \tag{A4.7}$$

Eq. (A4.7) was used in the main text to evaluate $v_s$ for air at s.t.p.

The result of Eq. (A4.4') can be written in general in the form

$$\nabla^2 P = \left(\frac{\partial P}{\partial\rho}\right)\nabla^2\rho \tag{A4.8}$$

Therefore the speed of sound is given by

$$v_s = (\partial P/\partial\rho)^{1/2} \tag{A4.9}$$

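For gases we found $\partial P/\partial\rho$ from Eq. (A4.3); for liquids we can use the compressibility $\kappa$, which is defined through

$$\kappa = \frac{1}{\rho}\frac{\partial\rho}{\partial P} \tag{A4.10}$$

For instance for water $\kappa \approx 50 \times 10^{-6}\ (\text{atm}^{-1})$. Then

$$\frac{\partial P}{\partial\rho} = \frac{1}{\kappa\rho} = \frac{1}{(10^3\ \text{kg/m}^3)(50 \times 10^{-11}\ \text{m}^2/\text{N})} = 2 \times 10^6\ \frac{\text{m}^2}{\text{s}^2}$$

and therefore

$$v_s = (2 \times 10^6)^{1/2} = 1.4 \times 10^3\ \text{m/s}$$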

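For solids $(\partial P/\partial\rho)$ is equivalent to $(Y/\rho)$ where $Y$ is Young's modulus (defined as the ratio of normal stress $S_n = F/A$ to strain $\varepsilon = \delta l/l$). Then

$$Y = \frac{F/A}{\delta l/l} \quad\text{and}\quad \frac{Y}{\rho} = \frac{Fl}{\rho A\,\delta l} = \frac{Fl}{\rho\,\delta V} = \frac{Fl/V}{\rho\,\delta V/V} = \frac{\delta P}{\delta\rho} = v_s^2 \tag{A4.11}$$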

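For steel, $Y = 2 \times 10^{11}\ \text{N/m}^2$, $\rho = 8 \times 10^3\ \text{kg/m}^3$, and we find for the speed of sound

$$v_s = \left(\frac{Y}{\rho}\right)^{1/2} = \left(\frac{2 \times 10^{11}}{8 \times 10^3}\right)^{1/2} = 5 \times 10^3\ \text{m/s}$$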
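The three expressions for the speed of sound, Eq. (A4.7) for an ideal gas, Eq. (A4.9) with the compressibility for a liquid, and Eq. (A4.11) for a solid, can be collected in a short sketch. The water and steel inputs are the ones quoted in this appendix; the values used for air ($\gamma = 1.4$, $M = 0.029$ kg/mol, $T = 273.15$ K) and the conversion 1 atm = 101 325 N/m$^2$ are assumptions supplied for the example, and the function names are hypothetical.

```python
import math

R_GAS = 8.314      # J/(mol K), gas constant
ATM = 101_325.0    # N/m^2 per atmosphere

def sound_speed_ideal_gas(gamma, temperature, molar_mass):
    """v_s = (gamma * R * T / M)**0.5, with M in kg/mol and T in kelvin."""
    return math.sqrt(gamma * R_GAS * temperature / molar_mass)

def sound_speed_liquid(compressibility, density):
    """v_s = (dP/drho)**0.5 = (1 / (kappa * rho))**0.5, with kappa in m^2/N."""
    return math.sqrt(1.0 / (compressibility * density))

def sound_speed_solid(youngs_modulus, density):
    """v_s = (Y / rho)**0.5, with Y in N/m^2 and rho in kg/m^3."""
    return math.sqrt(youngs_modulus / density)

v_air = sound_speed_ideal_gas(gamma=1.4, temperature=273.15, molar_mass=0.029)
v_water = sound_speed_liquid(compressibility=50e-6 / ATM, density=1.0e3)
v_steel = sound_speed_solid(youngs_modulus=2.0e11, density=8.0e3)
print(v_air, v_water, v_steel)
```

With these inputs the water case reproduces the value of about $1.4 \times 10^3$ m/s obtained above, and the air case gives a speed of a few hundred metres per second.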


INDEX

absorption, 128cross section, 153of radiation, 150

absorptivity, 171acceleration of gravity, 262accelerometer, 308acceptor impurities, 9accumulator, 77addition, 45adiabatic process, 323aerodynamic design, 254agriculture, 203airfoil, 245airplane flight, 241, 247airstream velocity, 249alloying, of semiconductor junctions, 25alpha-Centauri, 302alpha decay, 179alpha particle, 176alphanumeric character, 67AM radio, 135ambient temperature, 166Ampere's law, 145amplification, 21amplifier,

gain, 105of radiation, 151

amplitude modulation, 89analogue to digital (A to D), 94, 96AND, 38, 42angle of attack, 246angular frequency, 83angular momentum, 310

antenna, 121dipole, 122directional, 123half-wave, 122gain, 114

anti-ballistic missile system, 228antimatter propulsion, 306Apollo 15, 227Arecibo telescope, 161arms limitation treaty, 233ASCII code, 60,61, 80, 111aspect ratio, 250astronomical unit, 276atmospheric drag, 271atomic mass unit (amu), 180attenuation, in fibers, 149autocorrelation, 101, 319average value, 96Avogadro's number, 5

Babbage, C , 64balance of energy, 169ballistic missile, 216, 236ballistic trajectory, 220bandwidth, 87, 112base, 20baud,112beamed power, 305Becquerel, H., 189Becquerel (Bq), 190Bernoulli's equation, 242Bernoulli's principle, 244Bessel functions, 92


beta decay, 178biased junction, 18bilateral agreement, 234binary counter, 57binary number, 58binary transmission, 111binding energy, 179black body, 170bpi (bits per inch), 67Boltzmann's constant, 8Boltzmann distribution, 151boiling water reactor, 184Boolean algebra, 42, 43boost phase, 228booster rocket, 271boundary condition, 129boundary layer, 250breeder reactor, 188brightness, 161, 230btu, 165burn-out velocity, 267busing phase, 228byte, 67

calorie, 165capture maneuver, 286carrier,

density distribution, 15modulation, 89wave, 89

carry bit, 45Cartesian coordinates, 279casualties, nuclear war, 215cavity, laser, 154CCD camera, 226central field, 277, 281centrifugal barrier, 283chain reaction, 183channel, FET transistor, 31, 32channel capacity, 112, 113characteristic table, 54charge,

density, 17, 117transport, 12

charged coupled device (CCD), 66chart of the nuclides, 177chemical energy, 165Chernobyl accident, 191circulation, 246, 274

clear line, 53CO2 laser, 162CO2 layers, 173coal reserves, 188coasting height, 267coaxial cable, 145coaxial line, 142code, 60coding, 58coefficient of viscosity, 251coherence length, 150coherence time, 150collector, 20collector, solar energy, 198combinatorial circuit, 45communications, 81

satellite, 136, 138communication theory, 106compact disk, 72comparator, 48complement, 42, 47complex notation, 317compressibility, 319computer, 1

architecture, 75memory, 64

concentrator, solar energy, 199conditional probability, 107conduction band, 4conductivity, 13

of plasma, 133confocal resonator, 157containment, in nuclear fusion, 194conic sections, 278continuity condition, 242continuity equation, 242, 321control rods, for reactor, 186controlled fusion, 192convective derivative, 322coolant, reactor, 184core, reactor, 184correlation function, 319correlation time, 102Coulomb force, 174, 192counter, 53coverage angle, 137critical mass, 183, 205cross-section, fission, 205crystal, 3


Curie (Ci), 189current,

density, 12, 117gain, 22transfer ratio, 22

current-voltage characteristic, 33cut-off wavelength, 142cyclotron frequency, 136cylindrical coordinates, 143

data representation, 58d + d reaction, 192dead weight fraction, rocket, 268decay, of proton, 203decimal number, 58decibel, 113decoder, 49, 50defense system, 228delayed neutrons, 186delayed radiation, 210De Laval nozzle, 264De Morgan theorems, 43density,

distribution of carriers in p-n junction,15

of free electrons, 5, 7of gas in rocket engine, 262of intrinsic carriers, 8of states, 10

depletion FET, 31, 35depletion zone, 16detached layer, 254deterrence, nuclear, 216deuterium, 196deuteron, 176dielectric constant, 128

of SiO, 33dielectric permittivity, 117diffusion, 26

coefficient, 14method, 28

diffraction limit, 126, 225digital circuit, 44digital signal, 84digital tape, 70diode, 19dipole antenna, 122Dirac delta function, 318directed energy weapons, 229

directional antenna, 123directionality, 25dish antenna, 126dispersion, 132

in fibers, 148disposal, nuclear wastes, 191distortion, of signal, 95division, 48donor impurities, 9doped semiconductors, 9doped germanium, 40Doppler broadening, 158double precision, 63doubling time, 168downlink, 138downwash, 248drag coefficient, 250drain, 30, 32drift velocity, 12D2O as moderator, 187Dyson's spaceship, 303, 304dynamic lift, 241

earth's field, 136earth orbit, 271, 290earth to Mars, 289ecliptic, 276effective mass, 13effective potential, 283Efficiency,

nuclear reaction, 205propeller, 260rocket engine, 266thermal, 166turbojet, 261

Einstein's elevator, 308electric dipole, 121electric field, 117electric susceptibility, 131electromagnetic spectrum, 83electron, 178ellipse, 278, 284emissivity, 170emitter, 20emitter follower, 23, 24EMP, 212enable, 50encounter, 287, 291, 293


energy,balance, terrestrial, 169band diagram, 10, 18, 19, 21bands, 3consumption, 166density, 120, 150, 169deposition, 236flux, 119, 165, 170gap, 4level, 3, 149production, 168release, 167, 181, 206sources, 165

English language, 111Eniwetok atoll, 209, 211enhancement FET, 31, 35enrichment, of uranium, 207entrance angle, 162entropy, 109, 166equation of motion, 283equatorial orbit, 136error checking, 63escape maneuver, from solar system, 289escape probability, neutron, 204Euler's equation, 321excimer laser, 231excitation table, 54execute, 75exhaust .velocity, 221, 263, 307explosion time, 206explosive, nuclear, 204

fallout, radioactive, 213FET (field effect transistor), 29, 35fetch, 75Fermat's principle, 129Fermi (fm), 175

distribution, 7, 41energy, 7Enrico, 163function, 8level, 6

ferromagnetic material, 69fiber optics, 145final velocity, 301fine structure constant, 152finite wing, 248fissile material, 183fissile nuclei, 185

fission nuclear, 79flip-flop, 51floating point, 62floppy disk, 72fluid flow, 241fluid mechanics equations, 321focal distance, 126focal length, 224fossil fuels, 167, 198forward bias, 19Fourier,

coefficient, 86decomposition, 85transform, 316

free electron laser, 231free neutron decay, 178frequency, 83

deviation, 91domain, 86modulation, 91of letters, 107, 109, 116optical, 131plasma, 134, 135spectrum, laser, 159

frictional force, 245fuel, fossil, 167, 198full-adder, 46, 47fusion,

inertial, 196nuclear, 179rate, 194reactor, 192

gallium arsenide, 3gain,

amplifier, 105laser, 154

gamma rays, 178gas constant, 265, 325gas laser, 155gate, 32Gaussian,

distribution, 97, 98integral, 99

general register, 77generating station, energy, 202germanium, 3geostationary orbit, 136global consequences, of nuclear war, 215


graphite as moderator, 187gravitation, 281gravitational attraction, 292gray (Gy), 190gray code, 60Greenhouse effect, 173, 174ground resolution, 225grounded base, 22grounded emitter, 23group velocity, 131, 142gyro axis, 311gyroscope, 308, 310

single degree of freedom, 312

half-adder, 45harmonic, 85, 119, 316harmonic oscillator, 284H, 6, 175head, magnetic tape, 70heat radiation, 210heating, 167heavy nuclei, 181heliocentric velocity, 288, 315helium-4, 174hexadecimal, 58Hohman ellipse, 287Hollerith code, 60hole, 5hydrazine, 271hydroelectric power, 167hyperbola, 278, 279, 284hyperbolic excess velocity, 291hyperbolic orbit, 298

ICBM, 216, 236impedance,

coaxial cable, 145free space, 123input, 24output, 25

implosion, 207implosion bomb, 208impurities, in semiconductors, 9incompressible fluid, 241induced drag, 249inertial force, 252, 308inertial fusion, 196inertial guidance, 307

information,content, 109, 110, 113rate, 115storage, 68transfer, 106

input impedance, 24input/output (I/O), 78insolation, 199instruction register, 76intensity of radiation, 153integrated circuit, 29interstellar travel, 301interstellar mission time, 302intrinsic carriers, 8inverter, 34

construction, 36truth table, 36

ionosphere, 132layers, 135

irreversible process, 166island, /i-type, 27isotope, long-lived, 213I-V curve, for diode, 19, 201

jet engine, 260J-FET, 31Johnson noise, 103joule, 165junction transistor, 20Jupiter, 296

Kepler's law, 276kilocalorie, 206kinematics, interstellar travel, 302kinetic energy, 282kinetic energy weapons, 229Kirchoff's law, 171

laminar flow, 241, 253Larmor's equation, 121laser, 149

mirrors, 156in orbit, 230radiation, 156

latch, 52latitude effect, 172launch, horizontal, 270Lawson criterion, 195lethal flux, 229


LiD, 210lifetime, laser, 157lifting force, 247light year, 302limited test-ban treaty, 233line of vortices, 248line shape, 153liquid drop model, 181liquid oxygen, 269liquid propellant rocket, 262, 264load command, 55logic gates, 37logic operation, 45low orbit, 286, 314low velocity flow, 322luminosity, sun, 171

Mach number, 257magnetic disk, 71magnetic field, 117magnetic permeability, 117magnetic storage, 67magnetization, remanent, 69magneto-optic effect, 74main engines, of Shuttle, 272majority carriers, 11manufacture of transistors, 25Mars, 281

-earth distance, 115orbit, 291

mass excess, 181mass ratio rocket, 268matrix element, 153Maxwell's equations, 117mean square value, 97mean free path, 183megabyte, 71megaton, 211memory array, 65memory, random access (RAM), 65memory, read only (ROM), 65Mercury, 314metallic boundary, 141metric ton, 167microelectronics, 1microwaves, 125minority carriers, 11Minuteman III, 222mirrors, laser, 156

MIRV, 217mixing, of signals, 90MKS system, 117mobility, 13mode,

highest, 148of oscillation, laser, 158in optical fiber, 147TEM laser, 159in waveguide, 141

moderator, 182modulation, frequency, 91monostable circuit, 52MOS (metal oxide silicon), 30MOSFET, 31MOS planar geometry, 32motion, in central field, 281multiplexer, 49, 50multiplication, 48

factor, 183, 185, 205multistage rocket, 266muon catalysis, 196mutual assured destruction, 216MX missile, 227

NAND gate, 38, 46NASA, 270neutrino, 178neutron, 174

flux, 185, 187rich elements, 191fast, 209

Newton's constant, 223, 276nitrogen tetroxide, 271H M O S transistor, 35node, 141, 157noise,

in communication channel, 961//, 100figure, 105Johnson, 103power, 114quantum, 104resonant, 100shot, 102temperature, 105, 114thermal, 103white, 100, 101

non-linear behavior, 19


non-proliferation agreement, 234NOR gate, 39normal stress, 251nozzle, 262, 275n-p-n transistor, 20NRZI coding, 71, 72n-type island, 27n-type semiconductor, 10nuclear arsenals, 216nuclear bomb, 304nuclear binding, 179nuclear exchange, major, 214nuclear explosives, 211nuclear fission, 179nuclear force, 174nuclear fusion, 179nuclear reactor, 183nuclear waste disposal, 191nuclear weapons, 204nuclear winter, 174, 215nucleon, 174number systems, 58nutation, 311

octal number, 59oil reserves, 188ones-complement, 62optical frequency, 131optical gyroscope, 313OR, 38, 42orbiter, 263ordered energy, 166oscillation period, 224output impedance, 25overpressure, 210

parabola, 125, 279parabolic orbit, 286parabolic reflector, 125parity generator, 63parsec, 302Parseval's theorem, 320Pauli principle, 176payload, 222

ratio, 268velocity, 269

perfect conductor, 139periodic function, 85, 316periodic signal, 86

permeability, magnetic, 117, 128phase reversal, 89phased array, 124photon flux, laser, 155photovoltaic cell, 200Pioneer, 291, 300Planck distribution, 171Planck's constant, 152Planck's equation, 169plane of incidence, 129plane wave, 160planets, 277

grand tour, 295inner, outer, 278

plasma, 133frequency, 134, 135in fusion, 195

plutonium, 182production, 187

p-n junction, 14p-n-p transistor, 21point source, 161polar coordinates, 281polar orbit, 136, 223polarization, 120, 130polycrystalline silicon, 32population inversion, 151position vector, 285potential difference in p-n junction, 17potential energy, 282potential well, 175power, 165

amplification, 21level, in reactor, 187radiated, 173spectral density, 100

Poynting vector, 119precession, of gyroscope, 311preemptive strike, 217pressure blast, 210pressure, of gases, 262pressurized water reactor, 184probability distribution, 97program, computer, 75program counter, 76prompt-critical reactor, 186prompt radiation, 210propagation, of wave, 141propagation angle, 146


propeller action, 259propulsion dynamics, 258propulsion parameters, shuttle, 274proton, 174p-type semiconductor, 10pumping, optical, 151

Q-switched laser, 159quadrature, 99quadratic measure, 99quality factor (Q), 157quantum noise, 104

rad, 189radar, 127radial coordinate, 144radiation, 121

impedance, 123length, 190

radioactive material, 186radioactive nucleus, 176radioactivity, 189. 237random access memory (RAM), 65rate, of information, 115RBE, 189reactivity, in nuclear fusion, 193read only memory (ROM), 65recombination, 14reconnaissance satellite, 222, 226redundancy, 112references, 326reflected power, 127reflection, 128reflection loss, 156reflector, propulsion, 306refraction, 128refractive index, 83, 129

fiber optics, 146imaginary, 131of plasma, 132

refueling of reactor, 186register, 55

general, 77instruction, 77shift, 56

relative entropy, 112rem, 189remanent magnetization, 69reproduction factor, 183

resistive load, 35resistivity, 13reset, 53resolution, optical, 224reverse bias, 18reversible process, 166Reynolds number, 252, 274ring laser gyroscope, 313rocket engine, 260, 262, 265rocket flight, 241rocket propulsion, 221Roentgen, 189rotation rate, of gyro, 313

Sagnac effect, 314SALT II, 234sampling frequency, 94, 115sampling theorem, 95satellite communications, 136satellite reconnaissance, 222saturation current, 34saturation, forward, 40Saturn encounter, 299Saturn swing-by, 296Schmidt trigger, 52semiconductor, 9

p-type, 10n-type, 10

semi-major axis, 284servo system, 309set, 53Shannon's equation, 113shear, 251shelter, radioactive, 213shift register, 56shock wave, 256shot noise, 102sidebands, 89, 90, 92sievert (Sv), 190signal to noise ratio, 105signal power, 115silicon, 3silicon dioxide, 26, 32slingshot trajectory, 293Snell's law, 129, 146solar constant, 171, 197solar energy, 197solar flux, 173solar sail, 306

solar system, 276
solid angle, 230
solid state laser, 74
source, 30, 32
sources of energy, 165
space,
    shuttle, 270, 272
    travel, 239
    vehicle, 294
spatial coherence, laser, 160
spatial distribution, laser radiation, 159
specific force, 309
specific impulse, 261, 266
spectral density, power, 100
speed of sound, 256, 324
spin, nuclear, 177
spontaneous emission, 150
square pulse, 88
staging, of rocket, 269
standard deviation, 98
standing wave, 140
    in optical cavity, 158
stars, 276
    travel to, 304
Stefan-Boltzmann law, 170
stick diagram, 56
stimulated emission, 149, 150
stochastic process, 107
storage, data, 64
strategic defense initiative, 228
strategic nuclear arsenals, 218
streamline, 241
strontium-90, 213
subtraction, 47
sun, interior, 193
sum bit, 45
supersonic exhaust, 264
supersonic flight, 255
supersonic flow, 323

tangential electric field, 139
tape, magnetic, 72
telecommunications, 81
telephone, 96
television, 90
temperature of gases, 262
thermal neutrons, 182, 186
thermal noise, 103
thermal power, 185
thermonuclear weapon, 210
threshold, lasing, 155
throat, of rocket engine, 263
thrust, 222, 266
thrust to weight ratio, 268
time domain, 86
TNT, 206
toroidal field, 195
torque, 282
total differential, 244
total reflection, 130
total time derivative, 282
trajectory, of shuttle, 272
transfer orbit, 286, 290
transistor, 3, 20
    field-effect, 29
    manufacture, 25
    production, 27, 36
TTL (transistor-transistor logic), 34
transmission line, 139, 142
transportation, 239
trapping, 146
trigger circuit, 52
trigonometric function, 317
Trinity test, 207
tritium, 196, 209
triton, 192
truth table, 37
turbojet, 260, 261
turbulence, 255
turbulent flow, 250, 253
twos-complement, 62

UHF, 84, 91
uplink, 138
uranium, 182
uranium reserves, 188
Uranus, 315
UV (ultraviolet), 84

V2 rocket, 217, 258
Van Allen belts, 135
valence band, 135
varactor, 92
vector diagram, 292
velocity,
    equation, 281, 285
    increment, 269, 287, 288
    of propagation of light, 118

Venturi tube, 277
VHF, 84
virial theorem, 277
viscosity, 250
    of selected fluids, 252
viscous flow, 323
viscous fluid, 241
VLSI (very large scale integration), 31, 37
vortex, 247
    flow, 249
Voyager-2, 291, 300
    grand tour of the planets, 295
    mission, 296

waist, laser beam, 160
watt, 165
wave equation, 118, 324
wave vector, 119, 140
waveguide, 139
    dielectric, 146
    rectangular, 142
wavelength, 83
weak interaction, 178
weapons,
    biological, 235
    nuclear, 183, 204
white noise, 101
whole body exposure, 191
wiggler magnet, 232
Winston cone, 200
World War II, 204
word message, 108

X-ray, 84, 189
    laser, 231, 237

Yagi antenna, 125
Young's modulus, 325

zirconium tubes, 184
Zipf's law, 109, 110
