University of New Mexico
UNM Digital Repository
Physics & Astronomy ETDs Electronic Theses and Dissertations
7-11-2013
Continuous Measurement and Stochastic Methods in Quantum Optical Systems
Robert Cook
Follow this and additional works at: https://digitalrepository.unm.edu/phyc_etds
This Dissertation is brought to you for free and open access by the Electronic Theses and Dissertations at UNM Digital Repository. It has been accepted for inclusion in Physics & Astronomy ETDs by an authorized administrator of UNM Digital Repository. For more information, please contact disc@unm.edu.
Recommended Citation
Cook, Robert. "Continuous Measurement and Stochastic Methods in Quantum Optical Systems." (2013). https://digitalrepository.unm.edu/phyc_etds/12
Candidate: Robert Lawrence Cook
Department: Physics and Astronomy
This dissertation is approved, and it is acceptable in quality and form for publication:
Approved by the Dissertation Committee:
Ivan H. Deutsch, Chairperson
Carlton M. Caves
Sudhakar Prasad
Terry A. Loring
Continuous Measurement and Stochastic Methods in Quantum Optical Systems
by
Robert Lawrence Cook
B.S., University of California Santa Cruz, 2003
DISSERTATION
Submitted in Partial Fulfillment of the
Requirements for the Degree of
Doctor of Philosophy
Physics
The University of New Mexico
Albuquerque, New Mexico
May, 2013
©2013, Robert Lawrence Cook
Dedication
Kylie, I promise we’ll take a walk when this is all over.
Acknowledgments
First of all I’d like to thank my most recent and final advisor Ivan Deutsch. When I started at UNM 10 years ago I had no idea what I wanted to study, only that a masters program seemed better than a job at the latest flying-Starbucks. It was your undergraduate quantum mechanics lectures that showed me how strange and rich the quantum world can be and they ultimately set me on the path to where I am today. I will be forever grateful for your help and guidance through the bumpier parts of my graduate career. I also have to thank Brad Chase. Without you this dissertation would have taken a very different form. Prior to reading the epic works of van Handel et al. I would never have guessed that I’d become an advocate for mathematical formalism. To Ben Baragiola I thank you for your friendship, enthusiasm and willingness to talk through a problem. And to Heather Partner I will always be grateful for your support and camaraderie on the roller coaster ride that started at Los Alamos, ran through UNM and ended in Sandia.
In my latest academic home of Room 30, I need to thank Carlos Riofrío for your friendship, warmth and immediate inclusion into Deutsch group, Josh Combes for your shared enthusiasm for QSDEs, Leigh Norris for your kind hearted adoption of the luckiest goldfish on the planet, and Vaibhav Madhok for just being Vaibhav. To the rest of Deutsch group - Bob Keating, Charlie Baldwin, and Krittika Goya - thanks for listening to me prattle on in group meeting about stochastic calculus and statistical estimation. I hope I didn’t bore you too much. In the greater quantum information group I need to thank Professors Carl Caves and Andrew Landahl, current and former CQuIC students Jonas Anderson, Chris Cesare, Seth Merkel, Iris Reichenbach, Alexandre Tacla, Zhang Jiang, Matthias Lang, and Shashank Pandey. I must also thank Vicky Bird for feeding us so well during arXiv review. From my short tenure at Sandia National Labs I need to thank Cort Johnson, Dan Stick, Todd Barrick, Dave Moehring, Francisco Benito, Peter Schwindt, Yuan-Yu Jau, Mike Mangan, Tom Hamilton, and Grant Biedermann for the help and support as I learned that cryogenic experiments are not for me. I will never forget the time spent working with Roy Keyes, Tom Jones, Thomas Loyd, and Paul Martin. While we may not have gotten a lot done we had a whole lot of fun doing it. To Laura Zschaechner thanks for being a good friend and a shoulder to cry on. And finally I’d like to thank my parents and family for their love and support.
Continuous Measurement and Stochastic Methods in Quantum Optical Systems
by
Robert Lawrence Cook
B.S., University of California Santa Cruz, 2003
Ph.D., Physics, University of New Mexico, 2013
Abstract
This dissertation studies the statistics and modeling of a quantum system probed by
a coherent laser field. We focus on an ensemble of qubits dispersively coupled to a
traveling wave light field. The first research topic explores the quantum measurement
statistics of a quasi-monochromatic laser probe. We identify the shortest timescale
over which successive measurements approximately commute. Our model predicts that for
a probe in the near infrared, noncommuting measurement effects are apparent for
subpicosecond times.
The second dissertation topic attempts to find an approximation to a conditional
master equation, one that maps identical product states to identical product states.
Through a technique known as projection filtering, we find such an equation for an
ensemble of qubits experiencing a diffusive measurement of a collective angular momentum
projection, in addition to global rotations. We then test the quality of the
approximation through numerical simulations. This measurement model is known
to be entangling, and without the rotations we find poor agreement between the exact
and approximate predictions. However, in the presence of strong randomized
rotations, the approximation reproduces the exact expectation values to within 95%
accuracy.
The final topic applies the projection filter to the problem of state reconstruction.
We find an initial state estimate based on a single continuous measurement
of an identically prepared atomic ensemble. Given the ability to make a continuous
collective measurement while simultaneously applying time-varying controls, it is possible
to find an accurate estimate based upon a single measurement realization.
Previous experiments implementing this method found high fidelity estimates, but
were ultimately limited by decoherence. Here we explore the fundamental limits of
this protocol by studying an idealized model for pure qubits, which is limited only
by measurement backaction. This ultimately makes the measurement statistics a
nonlinear function of the initial state. Via the projection filter, we find an efficiently
computed approximation to the log-likelihood function. Using the exact dynamics to
produce simulated measurements, we then numerically search for a maximum likelihood
estimate based on the approximate expression. We ultimately find that our
estimation technique nearly achieves an average fidelity bound set by an optimum
POVM.
Contents
1 Introduction
1.0.1 A note on quantum foundations
1.1 An executive summary
1.1.1 Quantum optics and quantum stochastic differential equations
1.1.2 Classical and quantum probability theory
1.1.3 Projection filtering for qubit ensembles
1.1.4 Qubit state reconstruction
2 Quantum Optics and Quantum Stochastic Differential Equations
2.1 Quantum Stochastic Process in Optical Fields
2.1.1 Free space quantization
2.2 Wave Packets, Fock Space and Stochastic Processes
2.2.1 Wave packets
2.2.2 Weyl operators
2.2.3 Fock space
2.2.4 A basis independent expression for the wave packet inner product
2.2.5 Fock space and stochastic processes
2.2.6 Localized wave packets and stochastic processes
2.3 Paraxial Envelopes and Measurable Pulses
2.3.1 Paraxial wave packets in the time domain
2.3.2 The measurable subspace
2.4 The one-dimensional limit
2.5 Quantum Wiener processes and the continuous-time decomposition
2.5.1 The continuous-time tensor decomposition
2.5.2 The quantum Wiener process
2.5.3 The units of quantum noise
2.6 Systems Interacting with Quantum Noise
2.6.1 Quantum white noise in paraxial wave packets
2.6.2 The scattering process
2.6.3 The limiting stochastic propagator
2.6.4 A simple 1D example
2.7 The Faraday Interaction
2.7.1 The quadratic Faraday interaction
3 Classical and Quantum Probability Theory
3.1 Classical Probability Theory
3.1.1 Stochastic processes and random variables
3.1.2 Expectation values, the conditional expectation, and measurability
3.1.3 Special processes - time-adaption and martingales
3.1.4 The Wiener process
3.2 Quantum Probability Theory
3.2.1 Embedding the quantum into the classical
3.2.2 Quantum probability
3.2.3 The quantum conditional expectation
3.2.4 The conditional expectation and generalized measurements
3.3 Quantum Filtering Theory
3.4 The Conditional Master Equation
3.4.1 The innovation process
3.4.2 The Ito correction in the conditional master equation
3.4.3 The conditional Schrodinger equation
4 Projection Filtering for Qubit Ensembles
4.1 Introduction
4.1.1 An introduction to differential projections
4.1.2 The conditional master equation
4.2 Differential Manifolds
4.2.1 Tangent spaces
4.2.2 Riemannian Metrics and orthogonal projections
4.2.3 Differentials on abstract manifolds
4.2.4 Stochastic calculus on differential manifolds
4.3 The Bloch Sphere as a Riemannian Manifold
4.3.1 Projecting the unconditional master equation
4.4 Projections in the tensor product submanifold
4.4.1 The metric in spherical coordinates
4.4.2 Calculating collective operator inner products
4.4.3 The spherical projection of the CME
4.5 The Projection Filter
4.5.1 Special cases for the projection filter
4.6 Simulations and Performance
4.6.1 Simulation parameters
4.6.2 Spin squeezing comparisons
4.6.3 Squeezing simulations
4.6.4 Projection filter simulations
5 Qubit State Reconstruction
5.1 Previous reconstruction results
5.2 The Estimation Procedure
5.3 The Model
5.3.1 Observability and randomized controls
5.4 The Likelihood Function
5.4.1 The reconstruction procedure
5.4.2 Coupled CMEs and filter stability
5.4.3 Backaction in continuous quantum measurement
5.5 Numeric Simulations
5.5.1 Simulation parameters
5.5.2 Results and discussions
6 Summary and Outlook
6.1 Quantum optics and quantum stochastic differential equations
6.2 Classical and quantum probability theory
6.3 Projection filtering for qudit ensembles
6.4 Qubit State Reconstruction
A Paraxial Optics
B Classical Stochastic Calculus
B.1 Ito Calculus
B.1.1 The Ito conversion
C Quantum Stochastic Calculus
C.1 The Quantum Stochastic Unitary
C.1.1 Unitary evolution
D The Quantum Wong-Zakai Theorem
D.1 Quantum white noise
D.2 The quantum Wong-Zakai theorem
D.2.1 Quantum stochastic calculus and operator ordering
D.2.2 Gauge freedom in the Ito correction
D.3 The Limiting Propagator
Chapter 1
Introduction
Within the past three decades, the ability to engineer individual quantum systems
into highly nonclassical states has become a reality. The fundamental technology
that facilitated these revolutionary experiments is the coherent laser with its ability
to address specific electronic transitions in matter. A quasi-monochromatic laser can
also introduce optical forces on position degrees of freedom. While initially used to
laser cool and trap atoms, coherent electronic superpositions can also be transferred
to external superpositions allowing for atom interferometers [1] or highly nonclassical
states in trapped ions [2, 3].
In addition to providing control of an atomic system at a quantum level, the same
laser systems can be used to measure the atomic state of the system. The simplest
of all detection methods is resonant fluorescence, where laser light resonant with a
single transition is applied to an atom, which will scatter photons if that level is
occupied. However, if the internal state is in a superposition between the resonant
level and an additional off-resonant ‘dark’ state, the presence or absence of scattered
light provides information about the internal state of the system to the experimenter
[4].
The quantum nature of the atom-light interaction carries over to off-resonant
applications. In a low intensity regime, a free space laser with a carrier frequency
significantly detuned from an atomic transition predominantly induces a state-dependent
energy shift without significantly exciting that transition. The specific form
of the interaction also depends upon the polarization of the exciting laser. In a given
parameter regime the resulting Hamiltonian dominates over the decoherence from
absorption and subsequent emission, resulting in a controllable coupling between the
atoms and the polarization of the probe laser [5]. This coupling affects both the
atomic and polarization quantum states. The fact that the laser is a traveling wave
means that this is a fundamentally open quantum system and the output state of
light carries with it some information about the atomic state.
Quantum mechanics is at its core a probabilistic theory where the wave function
is a tool for computing the probability of observing experimental events. Upon the
receipt of a measurement outcome, an accurate description of the quantum system
must reflect this new information. This is true in an idealized projective measurement
or in indirect measurements like those described above. However, there are significant
differences between the method of state detection via resonance fluorescence and off-
resonance polarization spectroscopy.
In fluorescence detection the vast majority of the scattered light is ultimately lost,
either because only a small fraction of possible emission directions are observed or
due to losses in the detection apparatus. It takes a considerable experimental effort
to measure as little as 5% of the total scattered light from a single trapped ion [6, 7].
A single ion will have to scatter a lot of light in order for an experimenter to be
able to discriminate a bright state from a dark state with any reasonable confidence.
This means that after a relatively short time period it is likely that an ion prepared
in a superposition of bright and dark states has scattered several photons that were
lost to the experimenter. Honest scientists would be forced to admit that while they
are still uncertain as to the outcome of the measurement, they are quite certain
that any coherence between the bright and dark states has been destroyed.
In the off-resonant scheme, almost all of the probe light can be collected, meaning
that a clever experimentalist has access to nearly all of the information available.
After a short interaction time the measured state will in general change, but only
in proportion to the amount of information gained. The point is that armed with a
complete measurement record it is possible to track the evolution of the state from
an initial superposition to a final projected outcome. This kind of measurement is
known as a weak quantum nondemolition (QND) measurement, and has been demonstrated
in several different experiments involving ensembles of monatomic gases in
various parameter regimes. One important consequence of this kind of measurement
is that the projective outcome is a highly nonclassical state involving strong quantum
coherence between all of the atoms in the ensemble. While the experimental realities
of photon scattering ultimately limit the system from reaching this eigenstate, an
intermediate squeezed state has been observed on several occasions [8–10], where
the uncertainty of the measured observable is reduced below the standard quantum
limit.
This dissertation is focused on the modeling of a quantum atomic system inter-
acting with a quantized traveling-wave optical probe when that field is also measured
continuously in time. The most interesting quantum effects occur in the idealized
case with no loss of light and a noise-free measurement, which is the only case considered
here. The progression from an initial superposition state to a final measurement
eigenstate is neither a time-stationary process nor a linear transformation, and so a
model capable of tracking the full transition must be both time-adaptive and nonlinear.
Finding such a mathematical description is not a trivial exercise but one that
has been extensively studied previously.
The ultimate objective is to apply this continuous measurement model to the
problem of quantum state tomography. Constructing an estimate for an arbitrary
quantum state based upon experimental data is very resource intensive. Specifying
an arbitrary quantum state for a d-dimensional system requires at most $d^2 - 1$ parameters,
and for each parameter N uncorrelated projective measurements generally
give a statistical uncertainty that scales as $1/\sqrt{N}$. Through an alternative
protocol proposed by Silberfarb et al., these inefficiencies can be largely sidestepped
by applying a weak continuous measurement plus a well-chosen dynamical control
collectively to an ensemble of identically prepared systems [11]. If the control
drives the system in such a way as to make the measurement informationally complete,
then a single measurement record has encoded information about all $d^2 - 1$
parameters, albeit with a varying level of certainty and noise corruption.
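The resource cost of conventional tomography described above can be illustrated with a short numerical sketch. It is not taken from this dissertation: the function names, the choice of a single qubit observable, and the shot counts are assumptions made for illustration only. The sketch counts the $d^2 - 1$ parameters and simulates repeated projective measurements to show the statistical error shrinking roughly as $1/\sqrt{N}$.

```python
import math
import random


def num_state_parameters(d):
    """Real parameters needed to specify a density matrix in dimension d."""
    return d * d - 1


def estimate_sigma_z(p_up, n_shots, rng):
    """Estimate <sigma_z> = 2*p_up - 1 from n_shots simulated projective outcomes."""
    ups = sum(1 for _ in range(n_shots) if rng.random() < p_up)
    return 2.0 * ups / n_shots - 1.0


def rms_error(p_up, n_shots, n_trials, rng):
    """Root-mean-square error of the estimate over n_trials repetitions."""
    truth = 2.0 * p_up - 1.0
    total = 0.0
    for _ in range(n_trials):
        total += (estimate_sigma_z(p_up, n_shots, rng) - truth) ** 2
    return math.sqrt(total / n_trials)


rng = random.Random(0)  # fixed seed so the run is repeatable
err_100 = rms_error(0.5, 100, 2000, rng)
err_400 = rms_error(0.5, 400, 2000, rng)
print(num_state_parameters(16))  # parameter count for d = 16
print(err_100, err_400, err_100 / err_400)
```

Quadrupling the number of shots should roughly halve the error, so the printed ratio is expected to be close to 2, and for d = 16 the parameter count is 255.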
In particular, we consider an atomic ensemble prepared in an identical tensor
product state $\rho_{\rm tot} = \rho_0^{\otimes n}$ that experiences a known Hamiltonian while simultaneously
coupled to a traveling wave probe via a collective degree of freedom. A continuous
measurement of this probe then generates a measurement record that is strongly correlated
with the evolution of the system. With sufficient signal to noise, a statistical
estimate of an unknown initial system state will in general have high fidelity with the
true initial condition. Using the weak measurement generated by an off-resonance
probe, this state reconstruction procedure has been implemented in the laboratory,
allowing reconstruction of the full hyperfine, d = 16, ground state manifold of a
laser cooled neutral Cs atom ensemble [12, 13]. However, these experiments were
performed in a parameter regime where the amount of information lost to the environment
dominated over any measurement-induced backaction. Chap. 5 explores
how this procedure performs in an opposite regime where decoherence is negligible
and we retain a complete measurement record. Arriving at this result requires several
intermediate steps, particularly a detailed knowledge of how a classical statistical
estimate is made and how that is applied to a quantum system.
The estimation of a possibly random signal from an observation corrupted by
unwanted noise is known as filtering and takes its origin from the work of Wiener
[14], where the signal was assumed to be generated with time stationary statistics. In
a linear system with additive Gaussian noise, an optimum estimate of a continuous
nonstationary signal was computed by Kalman and Bucy [15] and is an indispensable
tool in engineering and classical signal processing and control. Not surprisingly, an
estimate of a nonlinear signal is significantly more challenging than in a linear system.
The nonlinear classical filter began with Stratonovich [16] and was later expressed
in the useful language of Ito calculus by Kushner [17]. Important contributions were
made by Kallianpur and Striebel [18] and Zakai [19], steps that are particularly useful
in formulating a quantum analog. Nonlinear filtering theory has an extremely wide
range of applications including GPS based navigation, optimal stochastic control,
financial portfolio optimization, audio and imaging noise removal and enhancement,
speech recognition, weather prediction, and so on [20]. For each application there
is a rich body of literature with a wealth of numerical and approximation methods
allowing for practical and, in some cases, real-time implementations.
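As a concrete illustration of filtering in the linear Gaussian setting, the following is a minimal sketch of a scalar discrete-time Kalman filter. It is a discrete analog chosen for simplicity, not the continuous-time Kalman-Bucy filter itself, and the random-walk signal model, the noise variances q and r, and all function names are assumptions made for this example.

```python
import random


def kalman_step(x_est, p_est, y, q, r):
    """One predict/update step of a scalar Kalman filter for a random walk."""
    p_pred = p_est + q                   # predict: the random walk adds variance q
    gain = p_pred / (p_pred + r)         # gain weighs the prediction against the data
    x_new = x_est + gain * (y - x_est)   # update using the innovation y - x_est
    p_new = (1.0 - gain) * p_pred        # posterior variance after the update
    return x_new, p_new


def run_filter(n_steps, q, r, rng):
    """Simulate a hidden random walk, observe it in noise, and filter it."""
    x_true, x_est, p_est = 0.0, 0.0, 1.0
    sq_err_filter = 0.0
    sq_err_raw = 0.0
    for _ in range(n_steps):
        x_true += rng.gauss(0.0, q ** 0.5)       # hidden signal
        y = x_true + rng.gauss(0.0, r ** 0.5)    # noisy observation
        x_est, p_est = kalman_step(x_est, p_est, y, q, r)
        sq_err_filter += (x_est - x_true) ** 2
        sq_err_raw += (y - x_true) ** 2
    return sq_err_filter / n_steps, sq_err_raw / n_steps


mse_filter, mse_raw = run_filter(5000, q=0.01, r=1.0, rng=random.Random(2))
print(mse_filter, mse_raw)
```

With a slowly varying signal and comparatively large observation noise, the filtered mean-square error should come out well below that of the raw observations.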
One of the fundamental tools that makes continuous-time classical estimation
possible is stochastic calculus. In the same way that a global function can be built
up from an integral over local infinitesimals, a random signal can also be constructed
from random increments. In order for the filtering problem to be remotely tractable,
it is necessary that the random increments originate from an uncorrelated noise
source. This means that the fundamental noise injected into an otherwise deterministic
system is assumed to be uncorrelated white noise. Under this assumption the
filtering equation is Markovian, meaning that the estimate updates only according to
the latest measurement and its most recent value. Classically, a white noise approximation
is often well justified when the input noise is actually an aggregate effect
from a large number of uncorrelated sources. The canonical example is a particle
experiencing Brownian motion. Each impulse is a collision with a background molecule
that imparts a small amount of momentum to the larger particle. For times that are longer
than the time between collisions, the net displacement is uncorrelated with previous
intermediate times. In this case not only are the collisions uncorrelated, the particle’s
displacement is Gaussian distributed with a variance that grows in proportion
to time. Brownian motion is a classic example of a system influenced by Gaussian
white noise, but due to the central limit theorem, these models are ubiquitous in
systems with continuous trajectories.
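The construction of a random signal from uncorrelated Gaussian increments, and the linear growth of its variance with time, can be checked with a small simulation. This is only a sketch under assumed parameters: the step size, number of paths, and function names are illustrative and do not come from the dissertation.

```python
import math
import random


def wiener_path(n_steps, dt, rng):
    """Build W(t) on a grid by summing independent Gaussian increments."""
    path = [0.0]
    for _ in range(n_steps):
        # each increment has mean 0 and variance dt, independent of the past
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(1)  # fixed seed so the run is repeatable
dt = 0.01
paths = [wiener_path(100, dt, rng) for _ in range(4000)]
var_half = sample_variance([p[50] for p in paths])    # W at t = 0.5
var_one = sample_variance([p[100] for p in paths])    # W at t = 1.0
print(var_half, var_one)
```

The two printed variances should be close to 0.5 and 1.0 respectively, reflecting a variance proportional to elapsed time.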
Working with white noise directly adds another layer of sophistication to an
already mathematically challenging topic [21]. This is due to, among other problems,
the fact that as white noise is defined to have completely uncorrelated fluctuations at
any point in time, it is therefore discontinuous; continuity from one point to the next
would imply correlations. To build a mathematical framework that is both useful
and provably consistent, the hard-learned lesson is to frame the problem not in terms
of the white noise itself but to instead use its integral, which is at least continuous¹
[22]. The most widely used representation for an integral over Gaussian white noise
(in other words a mathematical model for Brownian motion) is the Wiener process.
Chap. 3 reviews many of its defining properties, which we are able to leverage into a
maximum likelihood estimate of an initial quantum state based upon a polarimetry
measurement.
Beyond a successful model for integrated white noise, a classical filter needs to
be able to manipulate these integrals in a full-fledged calculus. The subtlety of
dealing with a randomized integral is that different limiting approximations lead to
fundamentally different stochastic calculi. The two most common forms of stochastic
integration are the Stratonovich integral and the Ito integral, with differing calculus
rules and statistical properties [22]. Both forms of integration are used here and a
brief review is presented in Appendix B. One drawback of initializing a model
with a stochastic integral is that any predictions can depend dramatically on what
kind of integral is used. At one level the choice of integral is no different from a
number of other approximations one makes in formulating a statistical model of a
given physical system.
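The dependence on the choice of integral can be seen in the standard textbook example of integrating a Wiener process against its own increments. The sketch below is illustrative only and assumes a simple discretization: the Ito sum evaluates the integrand at the left endpoint of each step, while the Stratonovich sum is approximated by averaging the integrand over the two endpoints. The function name and step count are hypothetical.

```python
import math
import random


def integrate_w_dw(n_steps, t_final, rng):
    """Discretized Ito and Stratonovich versions of the integral of W dW."""
    dt = t_final / n_steps
    w = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito += w * dw                   # Ito: integrand at the left endpoint
        strat += (w + 0.5 * dw) * dw    # Stratonovich: average of both endpoints
        w += dw
    return ito, strat, w


t_final = 1.0
ito, strat, w_end = integrate_w_dw(20000, t_final, random.Random(3))
print(strat - 0.5 * w_end ** 2)   # Stratonovich obeys the ordinary chain rule
print(strat - ito)                # the two integrals differ by about t_final / 2
```

The first printed number should vanish up to rounding, since the endpoint-averaged sum telescopes to $W_T^2/2$, while the second should be close to $T/2$, which is the Ito correction for this integrand.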
¹ The calculus of randomized distributions characterizing white noise is referenced back to the integrated expressions anyway [21].
The lessons from developing a stochastic calculus for classical systems have also
been applied to quantum models. In the mid 80’s Hudson and Parthasarathy
developed an operator-valued quantum version of the classical Ito calculus [23]. This
quantum Ito calculus is indispensable in modeling open quantum optical systems
and has been applied to not only continuous quantum measurement [24, 25], but
also quantum control (see [26] for an overview and introduction).
The Ito integral is based upon the assumption that the noise is completely uncorrelated,
so that the integral between times $[t_0, t_1)$ will be independent from the
integral over times $[t_1, t_2)$ for any times $0 \le t_0 < t_1 < t_2$, no matter how small
the difference. For an actual Brownian particle, this is only an approximation as
the particle’s velocity is correlated for times between atomic collisions. The quantum
Ito integral also makes such an assumption but it does so by assuming that
the operators representing equivalent integrals commute for nonoverlapping times,
no matter how small. Before immediately applying Hudson and Parthasarathy’s
formalism to the laser probed system, Chap. 2 investigates for what times such an
approximation applies, given that the resulting operators must also be consistent
with a quasi-monochromatic description of a traveling wave light field.
The similarities between classical filtering and a quantum system subject to an
indirect measurement should not be ignored. In a classical setting one seeks an estimate
of an unobserved system state consistent with a noisy measurement. The
fundamental goal in a quantum system is to predict the results of future measurements
accurately and consistently with past measurements. The unobserved atomic
system is then the estimated quantity and the measured probe gives the noisy data.
To fully exploit this similarity and apply the techniques developed for classical systems,
several fundamental questions about the nature of quantum statistics and
classical probability must be addressed.
The development of a continuous-time filter for a quantum system was pioneered
by the work of Belavkin starting in the early 80’s [24, 27–29]. These mathematically
rigorous results developed and applied a deep relation between the algebraic and
commutative properties of operators on a Hilbert space and the expression of classical
stochastic processes. In elementary quantum theory, experimental observations
are represented by Hermitian operators, and upon making a measurement the
random outcome corresponds to an eigenvalue of that operator. The connection is
then made through the following two insights. The first insight is to associate
operators with random variables. In classical probability all random variables are
consistent, in the sense that all random variables will agree that the same underlying
outcome of the system occurred, no matter in what order they are queried. For
quantum systems it is a hard-learned fact that only commuting operators will return
consistent results. Thus, the second insight is that in order to use a sequence of
measurements for statistical inference, all of the measured operators must commute.
Additionally, any operator whose statistics we wish to infer must also commute with
all measurements to date. The utility of considering sets of commuting operators for
the purposes of statistical inference is better known in the physics community as
the defining property of QND [30]. Working within these limitations, the problem of
noncommutativity is no longer an issue, leading to a real and useful mapping between
quantum measurements and classical probabilities.
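The commutation requirement can be made concrete with a small check on Pauli matrices. This is a generic illustration rather than a calculation from the dissertation: the helper functions are hypothetical, and the two-qubit operators are chosen only to show that observables acting on different subsystems commute while two different Pauli observables on the same qubit do not.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


def kron(a, b):
    """Tensor product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def is_zero(m):
    """True when every matrix element is numerically zero."""
    return all(abs(x) < 1e-12 for row in m for x in row)


identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

# Two observables on the same qubit: these do not commute.
same_system = commutator(sigma_x, sigma_z)

# sigma_z on the first qubit and sigma_x on the second: these commute.
z_first = kron(sigma_z, identity)
x_second = kron(identity, sigma_x)
different_systems = commutator(z_first, x_second)

print(is_zero(same_system), is_zero(different_systems))
```

Only the second pair could be treated jointly as classical random variables in the sense described above, since only for that pair is the order of measurement irrelevant.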
A mapping between the quantum and the classical descriptions of probability
can be more than just a guiding principle. Through a formal isomorphism between
commuting quantum operators and the language of classical filtering theory all of
the above classical results can be easily applied. The quantum filter developed by
Belavkin is nothing more than a noncommuting analog of the classical Kushner-Stratonovich
equation of nonlinear filtering [25]. Chap. 3 reviews how a mapping
between quantum operators and a formal classical probability model is made. The
purpose of this review is twofold. The first is to provide the necessary background for
a quantum filter. The second is to shed light on how the algebraic language of classical
probability theory can be applied to quantum systems, thereby gaining new insights
and intuitions into the quantum/classical divide. When the quantum and classical
coincide, nearly 50 years of engineering experience can either be immediately applied
or adapted with some modifications. Finally, this chapter shows how the quantum
filter is equivalent to a generalized measurement by making a unitary extension to a
larger dimensional Hilbert space.
Chap. 4 applies one such method to the quantum system of an identical spin
ensemble undergoing a polarimetry measurement in idealized conditions. Brigo et al.
applied the methods of differential geometry to simplify a classical filter [31, 32].
This method of making differential projections was adapted to a quantum system by
van Handel and Mabuchi, who simplified a continuous quantum measurement
of a strongly driven atom-cavity system by restricting it to a manifold of states in which
the cavity has a Gaussian Q-function [33]. The method has subsequently been applied to
other cavity QED systems [34–36], to collective spin systems in a linearized-Gaussian
regime [37], and to finding a low rank approximation to a general master equation in
Lindblad form [38]. Chap. 4 computes the orthogonal projection of an ensemble of n qubits
onto the manifold of identical separable states of the form $\rho^{\otimes n}$ and numerically compares
the accuracy of such an approximation to a complete evolution.
Using the projected filter as a computational tool, Chap. 5 turns the problem
of quantum state tomography essentially into a classical parameter estimation prob-
lem, where the classical parameters are the pointing angles of a spin coherent state
constructed from the initial n qubits. The parameters are estimated by numerically
computing a maximum likelihood estimate based upon a polarimetry measurement
when the measurement statistics are strongly affected by quantum backaction. By
only including the conditional effects present in the projection filter we achieve an
average reconstruction fidelity that nearly saturates an optimum bound given by any
generalized measurement scheme [39].
1.0.1 A note on quantum foundations
Any work that addresses the quantum world and in particular quantum measurement
eventually encounters some issue rooted in the foundations of quantum mechanics
and the various interpretations one could assume. This dissertation does not address
quantum mechanical foundations in any meaningful way and attempts to remain
agnostic about the reality of a quantum state or even the existence of a more fun-
damental theory. Wherever possible we take a statistical perspective and implicitly
assume that the simple models we construct may not be error free, in the sense that
they do not include the whole reality of a given experiment.
When considering quantum state tomography, we compare the conditional evo-
lution of an ensemble of initial conditions and then select the state that maximizes
a likelihood function. In our numerical simulations, no member of that ensemble
corresponds with arbitrary precision to the initial condition used to simulate the
measurement record. So in one sense the conditional state we calculate will always
be incorrect. However, in a field where the ontology of a quantum state is still debated,
we take a conservative position and will not presume to know that any
conditional state is the true conditional state. Instead we will only take the stance
that what we calculate is a quantum state that best predicts any future measurement
in a manner that is consistent with past results and the assumptions of the model.
We identify this state through the framework of quantum probability theory, a for-
malism that is less well known to physicists working in quantum information theory.
The final object that we calculate is ultimately no different from what is given by
the usual stochastic Schrödinger/master equations that are used in quantum optics.
The purpose of working with quantum probability theory is that it illustrates an
immediate connection to classical probability and estimation theory. In the classical
setting, an estimator is a tool that is used to predict or estimate some quantity given
a series of measurements. The stochastic Schrödinger equation is in a very real sense
a quantum estimator. We would rather not comment as to whether it is
estimating the state of the system because it lacks knowledge of a theory extending
beyond standard quantum mechanics, or whether it is the fundamental limit and there is
no more information in existence. In effect, we assume that the quantum state is
simply a tool for making predictions about a quantum system.
1.1 An executive summary
This is a terse summary of the fundamental results of this dissertation, presented
in the same order as the subsequent chapters. This is not intended to be a gentle
introduction to the material and assumes a strong familiarity with the background
material. We encourage an interested but nonexpert reader not to struggle too
hard trying to comprehend this section and instead consult the main text and the
associated appendices.
If any single global thesis can be applied to the entirety of this work it is that clas-
sical probabilistic methods are useful and that with some care they can be adapted
to quantum systems. The previous introduction discussed how to connect stochastic
calculus and nonlinear filtering theory to a quantum system continuously probed by
an optical field. Chaps. 2 and 3 provide a physical and mathematical foundation for
this connection while Chaps. 4 and 5 apply it to the specific problem of efficiently
estimating an initial qubit state. Chap. 6 discusses possible directions this work
could take. In addition to this main matter, we include several appendices providing
background material such as a review of paraxial optics (Appendix A), stochastic
differential equations (Appendix B), quantum stochastic differential equations (Ap-
pendix C) and the quantum Wong-Zakai theorem (Appendix D).
1.1.1 Quantum optics and quantum stochastic differential
equations
Chap. 2 shows how a second quantized picture of classical traveling wave packets
reproduces the mathematical structure necessary for defining a formal quantum Ito
stochastic calculus. It also identifies the timescales for which a quasi-monochromatic
field can be approximated as generating quantum white noise. This is a regime that
is independent from any system coupling or measurement apparatus and applies for
a large family of states, including highly nonclassical states such as multi-mode Fock
states.
The specific model we consider is the second quantization of quasi-monochromatic
wave packets [40–42] where the single particle Hilbert space is the space of coherent
state amplitudes for an associated classical field. We assume a paraxial model where
there is a factorization between a carrier plane wave $\exp(-i\omega_0 (t - z/c))$, a spatial
mode function $u^{(+)}_T(x, y, z)$, and a longitudinal envelope function $f(t - z/c)$. The
quasi-monochromatic approximation means that the longitudinal function must satisfy the
inequality
\[
  |f(t)| \gg \frac{1}{\omega_0} \left| \frac{\partial}{\partial t} f(t) \right|
         \gg \frac{1}{\omega_0^2} \left| \frac{\partial^2}{\partial t^2} f(t) \right| .
\]

We ultimately seek creation and annihilation operators that are simultaneously
quasi-monochromatic as well as consistent with a quantum white noise approximation.
To do so, we define $a^\dagger[f(0)]$ to be the operator that creates a single quantum in
a given spatial mode $u^{(+)}_T(x, y, z)$, with an envelope function $f$, referenced to some
point along the optical axis. The operator $a[f(t)]$ annihilates a quantum in a similar
mode but one that has experienced free propagation for a time $t$. We derive the
unequal time commutation relation,
\[
  \left[ a[f(t_1)],\, a^\dagger[f(t_2)] \right] \propto e^{-i\omega_0 (t_2 - t_1)}
  \left( f \star f\,(t_2 - t_1) - i\, \frac{1}{\omega_0}\, \frac{df}{dt} \star f\,(t_2 - t_1) \right)
  \tag{1.1}
\]
where $g \star f$ is the cross-correlation function of $g$ and $f$ and the proportionality
constant is simply a scaling factor that can be absorbed into the definition of $f$.
Physically it is entirely reasonable that if a classical envelope is no longer temporally
correlated then the associated field operators should commute. To the best of our
knowledge this is a new result in the characterization of quantized fields.
The canonical definition of quantum white noise is that there exist creation
and annihilation operators satisfying $[a(t), a^\dagger(t')] \propto \delta(t - t')$. Therefore, in
order for quasi-monochromatic light to be consistent with a white noise approximation, not only
must $f \star f\,(t_2 - t_1) \to \delta(t_2 - t_1)$ in a suitable limit, but also
$\frac{1}{\omega_0} \frac{df}{dt} \star f\,(t_2 - t_1) \to 0$.
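
As a rough numerical illustration of these two conditions (it is not part of the derivation in Chap. 2), the sketch below evaluates both correlation terms for a Gaussian envelope. The envelope shape, the width `sigma`, the carrier frequency `omega0`, and the helper `cross_correlation` are all assumptions chosen for this example. For such an envelope the derivative term is suppressed by roughly 1/(ω0 σ) relative to the peak of f ⋆ f.

```python
import numpy as np

# Assumed illustrative parameters: a Gaussian envelope with omega0 * sigma = 100.
sigma = 1.0
omega0 = 100.0

t = np.linspace(-8 * sigma, 8 * sigma, 2001)
dt = t[1] - t[0]

# Envelope normalized so that the integral of |f|^2 is one.
f = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-t**2 / (4 * sigma**2))
dfdt = np.gradient(f, dt)


def cross_correlation(g, f, dt):
    """Discrete (g * f)(tau) = integral of conj(g(t)) f(t + tau) dt over all lags."""
    return np.correlate(f, g, mode="full") * dt


ff = cross_correlation(f, f, dt)               # f * f, peaked at zero lag
dff = cross_correlation(dfdt, f, dt) / omega0  # (1/omega0) (df/dt) * f

peak = ff.max()
correction = np.abs(dff).max()
print(f"peak of f*f: {peak:.4f}, largest correction term: {correction:.5f}")
```

With these assumed values the correction term stays well below one percent of the peak, which is the sense in which it can be neglected when ω0 σ is large.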
From an approximation to white noise (in a rotating frame), we then use the
limiting white noise operators and a recent theorem by Gough [43], reviewed in
Appendix D, to consider the dispersive Faraday interaction, in an idealized regime
where the possibility for multiple scattering events is nonnegligible. The limiting
object is a quantum stochastic Ito equation for the propagator that describes the
unitary evolution between the field and the atomic system. Using well-known results
in quantum stochastics we write down the equivalent master equation in Lindblad
form.
1.1.2 Classical and quantum probability theory
Chap. 3 is a mathematical review of well known results from classical and quantum
probability theory, which serves as a foundation for the novel work in later chapters.
This review is conducted with an emphasis for physicists and attempts to explain
and justify the concepts while omitting the proofs. The end goal is to have an
understanding of how the conditional master equation results from a mapping between
sets of commuting operators and a classical probability space. This is a critical point
as the driving noise in the conditional master equation is not a Wiener process, but
is instead the random outcomes of a continuous quantum limited measurement. The
resulting classical stochastic process $\{y_t\}_{t \ge 0}$ is only a Brownian motion when (i) the
measurements are of a field quadrature in the vacuum state and (ii) there is no
system coupling to that quadrature, i.e. the measurement has no system information.
The second objective of this chapter is to emphasize the general power of this
technique and to discuss how the language of classical probability theory can be used
to identify semiclassical subspaces embedded in a quantum system. In order to do
so in a relatively self-contained manner we review the basic elements in the triple
(Ω, F, P) forming a classical probability space. The infinite dimensional example
we focus on is the space of sample paths of a Brownian motion, for which we explicitly describe the
relevant σ-algebra. In order to introduce the quantum conditional expectation, we
first review the classical conditional expectation and more generally how expectation
values are computed in the measure theoretic framework. We then introduce the
concept of time-adapted processes and martingales as both are crucial in the quantum
case. The review of classical probability theory concludes by discussing the Wiener
process and the Wiener measure over the space of continuous functions.
From a firm description of classical probability theory we then discuss the quan-
tum analog. We explicitly show how one identifies a classical probability space from
the set of mutually commuting observables by taking Ω as the set of possible eigenval-
ues, F as the σ-algebra generated by those eigenvalues, and the probability measure
P as the quantum expectation under the state ρ of the associated projectors. From
this semiclassical description we then introduce the noncommutative analog where
one omits a sample space of compatible outcomes, identifies the σ-algebra with a ∗-
algebra of operators (or a von Neumann algebra in the infinite dimensional case), and
the probability measure with a valid quantum state ρ. We then explain the power of
generating sub-∗-algebras from sets of operators focusing on the important object of
the commutant. We specifically explain how the commutant is the largest space of
operators that we can condition on a sequence of commuting observations and how it
contains noncommuting elements. Armed with that description we identify the prop-
erties of the quantum conditional expectation, and provide an explicit construction
for how it is in correspondence with the generalized measurements found in quantum
information theory. From the discussion of the quantum conditional expectation we
then state the resulting quantum filter as it is generated from the observation
process $\{ Y_t = U_t^\dagger (A_t + A_t^\dagger) U_t \}_{t \ge 0}$
under vacuum expectation.
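
The identification of a classical probability space from a commuting observable, described above, can be sketched numerically. The example below is only an illustration under assumed inputs: the two-qubit observable Jz, the product state, and the helper name `classical_model` are choices made here and are not taken from Chap. 3. The sample space is the set of distinct eigenvalues, and the probability of each outcome is the expectation of the associated eigenprojector.

```python
import numpy as np


def classical_model(observable, rho):
    """Return (omega, probs): the distinct eigenvalues of `observable`
    and the probabilities Tr(rho P) of their eigenprojectors P."""
    evals, evecs = np.linalg.eigh(observable)
    omega = np.unique(np.round(evals, 12))
    probs = []
    for lam in omega:
        cols = evecs[:, np.isclose(evals, lam)]   # eigenvectors of this outcome
        projector = cols @ cols.conj().T
        probs.append(np.real(np.trace(rho @ projector)))
    return omega, np.array(probs)


# Assumed example: Jz for two qubits, each prepared with <sigma_z> = 0.6.
sigma_z = np.diag([1.0, -1.0])
identity = np.eye(2)
Jz = 0.5 * (np.kron(sigma_z, identity) + np.kron(identity, sigma_z))
rho_single = 0.5 * (identity + 0.6 * sigma_z)
rho = np.kron(rho_single, rho_single)

omega, probs = classical_model(Jz, rho)
mean_classical = float(omega @ probs)            # expectation on (Omega, F, P)
mean_quantum = float(np.real(np.trace(rho @ Jz)))
```

Here the classical mean computed from the outcome probabilities agrees with the quantum expectation Tr(ρ Jz), which is the sense in which the commuting case is an ordinary probability model.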
While the quantum filter is an elegant expression for a conditional operator, it
rarely closes to a finite set of quantum stochastic differential equations. Rather it is
more useful to work with an effectively semiclassical equation, the conditional master
equation. Here we use the term semiclassical in a sense that does not imply a subop-
timal approximation but rather to indicate that the quantum measurement process
$\{Y_t\}_{t \ge 0}$ (a family of operators) is treated as a classical stochastic process
$\{y_t\}_{t \ge 0}$ (a family of classical random variables defined on the probability space
(Ω, F, P)). The probability measure P matches the statistics of $\{y_t\}_{t \ge 0}$ to the
quantum measurement statistics of $\{Y_t\}_{t \ge 0}$. While this mapping “demotes” the
measurement operators to a classical process, it still treats the system quantum mechanically,
by propagating a density operator $\{\rho_t\}_{t \ge 0}$. Generally the statistics of
$\{y_t\}_{t \ge 0}$ will depend upon a quantum
system expectation value, and so this is semiclassical and not a fully classical
probability model. This system density operator matches the quantum conditional
expectation by enforcing the equality
\[
  \pi_t(X) \big|_{Y_t = y_t} = \mathrm{Tr}(\rho_t X) \tag{1.2}
\]
for every system operator X and time t.
The quantum filter is derived in terms of a quantum Ito equation and so the
resulting semiclassical conditional master equation is a matrix-valued classical Ito
equation. In Chap. 4 we are required to express it in terms of a Stratonovich
integral and so we derive the associated correction factor here. The chapter closes
by finding a conditional Schrödinger equation that corresponds to the more general
master equation in the case of pure states. This equation is useful for numerical
simulation as propagating a complex vector is more efficient than a complex matrix.
1.1.3 Projection filtering for qubit ensembles
Chap. 4 derives an approximate form of the conditional master equation for an
ensemble of n qubits under the assumption that the state will remain nearly an
identical separable state. The approximation is made through a technique known as
projection filtering, developed to reduce the dimension of a classical filtering equation
by formulating the space of solutions as a Riemannian manifold and then making
an orthogonal projection onto a lower dimensional manifold. The lower dimensional
manifold that we wish to project onto is the space of density matrices that can be
written as $\varrho = \rho^{\otimes n}$ for some valid single qubit state ρ. The appeal of the projection
filtering technique is that it is algorithmic in nature, in that after identifying the
desired manifold and making a choice of metric, finding the optimal projection is
reduced to a problem of matrix algebra. Due to the simplicity of the qubit we are
able to solve for this projection analytically.
A third of this chapter reviews the fundamentals of differential geometry, focusing
on the mapping between qubit states and the Bloch ball. We refer to the set of
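valid quantum states for a d < ∞ dimensional quantum system as S(d) and the
three-dimensional unit ball as B. The metric we use is the trace inner product
$\langle A, B \rangle_\varrho = \mathrm{Tr}(A^\dagger B)$ for $A, B \in T_\varrho S(d)$. From the standard
mapping between points in the Bloch ball and qubit states, $\rho : B \subset \mathbb{R}^3 \to S(2)$,
\[
  \rho(x) = \tfrac{1}{2} \left( \mathbb{1} + x^i \sigma_i \right) \tag{1.3}
\]
we identify a basis $\{ D_i \equiv \tfrac{1}{2} \sigma_i \}$ for the tangent space $T_{\rho(x)} S(2)$.
The resulting trace inner product induces a Euclidean metric on B,
\[
  \langle D_i, D_j \rangle_\rho = \tfrac{1}{4} \mathrm{Tr}(\sigma_i \sigma_j) = \tfrac{1}{2} \delta_{ij}. \tag{1.4}
\]
The manifold we ultimately want to consider is the set of density operators
\[
  P \equiv \left\{ \rho(x)^{\otimes n} : x \in B \right\} \subset S(2^n). \tag{1.5}
\]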
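Any derivative we define for $\varrho \in P$ must distribute over the tensor product structure,
and so we identify the tangent space
\[
  T_{\varrho(x)} P = \mathrm{span} \left\{ D_i(x) = \sum_{\ell=1}^{n}
  \rho(x)^{\otimes \ell - 1} \otimes \tfrac{1}{2} \sigma_i \otimes \rho(x)^{\otimes n - \ell} \right\}. \tag{1.6}
\]
For $n \neq 1$, the metric on the Bloch ball induced from the trace inner product is no
longer Euclidean. Instead it is given by the matrix
\[
\begin{aligned}
  g_{ij}(x) &= \mathrm{Tr}\big( D_i(x)\, D_j(x) \big) \\
            &= \frac{n}{2^n} \left( 1 + |x|^2 \right)^{n-1} \delta_{ij}
             + \frac{n (n-1)}{2^n} \left( 1 + |x|^2 \right)^{n-2} x^k x^\ell\, \delta_{ki}\, \delta_{\ell j}.
\end{aligned}
\tag{1.7}
\]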
This metric is, however, isotropic, which can be seen by converting to spherical
coordinates. The resulting line element is
\[
  ds^2 = \frac{n}{2^n} \left( 1 + r^2 \right)^{n-1}
  \left( \frac{1 + n r^2}{1 + r^2}\, dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2 \right). \tag{1.8}
\]
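With this non-Euclidean metric, we then wish to apply the projection map
$\Pi_P : T_\varrho S(2^n) \to T_\varrho P$, defined as
\[
  \Pi_P(X) = g^{ij}(x)\, \langle D_j(x), X \rangle_\varrho\, D_i(x) \tag{1.9}
\]
with $g^{ij}(x)$ the inverse of the metric $g_{ij}(x)$, to each term in the conditional master equation.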
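
The closed form of the metric in Eq. (1.7) can be checked numerically for small n by building the tangent vectors of Eq. (1.6) explicitly. The sketch below is an illustration only; the function names (`rho`, `tangent_basis`, `metric_numeric`, `metric_formula`) and the test point are assumptions introduced here, and the brute-force construction scales as 2^n so it is only practical for a handful of qubits.

```python
from functools import reduce

import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def rho(x):
    """Single-qubit state for Bloch vector x, Eq. (1.3)."""
    return 0.5 * (np.eye(2) + sum(xi * s for xi, s in zip(x, PAULI)))


def tangent_basis(x, n):
    """D_i(x) of Eq. (1.6): sigma_i / 2 placed at each site, rho(x) elsewhere."""
    r = rho(x)
    basis = []
    for s in PAULI:
        terms = [
            reduce(np.kron, [r] * site + [0.5 * s] + [r] * (n - site - 1))
            for site in range(n)
        ]
        basis.append(sum(terms))
    return basis


def metric_numeric(x, n):
    """g_ij = Tr(D_i D_j), evaluated by explicit matrix multiplication."""
    D = tangent_basis(x, n)
    return np.array([[np.trace(Di @ Dj).real for Dj in D] for Di in D])


def metric_formula(x, n):
    """Closed form of Eq. (1.7)."""
    x = np.asarray(x, dtype=float)
    r2 = x @ x
    return (n / 2**n) * (1 + r2) ** (n - 1) * np.eye(3) + (
        n * (n - 1) / 2**n
    ) * (1 + r2) ** (n - 2) * np.outer(x, x)


x_test = [0.2, -0.3, 0.5]   # arbitrary point inside the Bloch ball
g_brute = metric_numeric(x_test, 3)
g_closed = metric_formula(x_test, 3)
```

For n = 1 the same functions reduce to the Euclidean result of Eq. (1.4), since the second term of Eq. (1.7) vanishes.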
A general unconditioned master equation written in Lindblad form is
\[
  \frac{d}{dt} \varrho = -i [H, \varrho] + \mathcal{D}[L](\varrho) \tag{1.10}
\]
for some Hamiltonian $H$ and jump operator $L$. As the master equation describes a
valid quantum evolution, the right-hand side of this equation must describe a vector
in the tangent space $T_\varrho S(2^n)$. Applying the projector $\Pi_P$ to the general master
equation results in a new master equation, describing the evolution of a modified
state $\varrho|_P$,
\[
  \frac{d}{dt} \varrho|_P = g^{ij}(x) \Big( \langle D_j(x),\, -i [H, \varrho(x)] \rangle_\varrho
  + \langle D_j(x),\, \mathcal{D}[L](\varrho) \rangle_\varrho \Big) D_i(x). \tag{1.11}
\]
This new equation is guaranteed to both produce a valid quantum evolution as well
as constrain the state to remain in P . Performing this projection in the case of a
conditional master equation is essentially no different, with one caveat due to the
subtle nature of stochastic integrations.
Converting the derivative $\frac{d}{dt} \varrho|_P$ into a differential form $d\varrho|_P$ has no ambiguity in
interpretation as the differential
\[
  d\varrho|_P = a^i(x)\, D_i(x)\, dt \tag{1.12}
\]
describes a valid mapping between the tangent space $T_t \mathbb{R}_+$ and the tangent space
$T_{\varrho(x)} P$. However, interpreting a general stochastic differential in terms of a
differential form is problematic because the explicit path-wise derivative generally does not
exist. Even if one were to solve the stochastic differential equation
\[
  d\varrho|_P = B(x_t)\, dw_t \tag{1.13}
\]
for $B(x_t) \in T_{\varrho(x_t)} P$ there is no a priori reason to assume that the resulting solution
will remain in P . In developing the projection filtering technique, Brigo et al. found
that a solution to a general Ito equation decidedly does not satisfy this property
[31]. The problem is that the drift induced by the second order nature of the Ito
rule causes the solution to leave P even when the integrand is in the proper tangent
space. The saving grace is that the orthogonal projection method does constrain the
solution when the original equation is written in Stratonovich form. The bottom line
is that in order to project the conditional master equation into the tangent space
$T_\varrho P$ it must first be written as a Stratonovich integral. Chap. 3 calculates the
proper conversion factor, generating the Ito correction map $\mathcal{I}_c[L](\varrho)$ for a general
measurement operator L.
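
The difference between the two stochastic calculi can be seen in a toy problem that is much simpler than the qubit manifold: a point driven around the unit circle by a single Wiener process. This example is an assumption made purely for illustration and is not the calculation of Chap. 4. The integrand (-y, x) dw is always tangent to the circle, yet an Ito equation with only that term drifts outward. Adding the Ito correction -(1/2)(x, y) dt, which is what the Stratonovich form of the same equation implies, keeps the solution on the circle up to discretization error.

```python
import numpy as np


def simulate_radius(include_ito_correction, steps=10_000, total_time=1.0, seed=0):
    """Euler-Maruyama integration of a tangent-driven point on the unit circle.

    Without the correction: dX = J X dw (Ito), with J the rotation generator.
    With the correction:    dX = -X/2 dt + J X dw, the Ito form of dX = J X o dw.
    Returns the final distance from the origin.
    """
    rng = np.random.default_rng(seed)
    dt = total_time / steps
    drift = -0.5 * dt if include_ito_correction else 0.0
    x, y = 1.0, 0.0
    for dw in rng.normal(0.0, np.sqrt(dt), size=steps):
        x, y = x + drift * x - y * dw, y + drift * y + x * dw
    return float(np.hypot(x, y))


radius_naive = simulate_radius(include_ito_correction=False)
radius_corrected = simulate_radius(include_ito_correction=True)
print(f"tangent-only Ito equation: r = {radius_naive:.3f}")
print(f"with Ito correction:       r = {radius_corrected:.3f}")
```

In this toy case the uncorrected radius grows steadily with time while the corrected one stays close to 1, mirroring the reason the conditional master equation is converted to Stratonovich form before it is projected.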
The conditional master equation is given by the Stratonovich equation
\[
  d\varrho = -i [H_{\mathrm{tot}}, \varrho]\, dt + \mathcal{D}[L_{\mathrm{tot}}](\varrho)\, dt
  + \mathcal{I}_c[L_{\mathrm{tot}}](\varrho)\, dt + \mathcal{H}[L_{\mathrm{tot}}](\varrho) \circ dv_t \tag{1.14}
\]
where the maps $\mathcal{D}[L_{\mathrm{tot}}](\cdot)$, $\mathcal{H}[L](\cdot)$, and $\mathcal{I}_c[L](\cdot)$ are given in
Eqs. (4.5), (4.6), and (4.10), respectively. The subscript tot specifies that these operators act
on the joint Hilbert space over all n qubits.
The projections of each term are computed relatively generally, but under the
assumption that the operators Htot and Ltot act identically and independently on
each qubit in the ensemble. This means that for the single qubit operators H and L
the joint operators are equal to
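\[
  H_{\mathrm{tot}} = \sum_{\ell=1}^{n} \mathbb{1}^{\otimes \ell - 1} \otimes H \otimes \mathbb{1}^{\otimes n - \ell} \tag{1.15}
\]
and
\[
  L_{\mathrm{tot}} = \sum_{\ell=1}^{n} \mathbb{1}^{\otimes \ell - 1} \otimes L \otimes \mathbb{1}^{\otimes n - \ell}. \tag{1.16}
\]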
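From the general expressions we also specialize to the example of $L = \sqrt{\kappa}\, \tfrac{1}{2} \sigma_z$ and
$H = \tfrac{1}{2} \left( f^1(t)\, \sigma_x + f^2(t)\, \sigma_y + f^3(t)\, \sigma_z \right)$ for a constant rate κ and
deterministic real-valued control fields $f^i(t)$. This specialized example corresponds to an
idealized model of a dispersive measurement of a collective angular momentum and a time-varying but
uniform magnetic field. The final expression we calculate for this example, which we call
the projection filter, is a system of coupled Ito stochastic differential equations for
the single particle Bloch vector components $\mathbf{x}_t = (x_t, y_t, z_t)$,
\[
\begin{aligned}
  dx_t &= a^1(\mathbf{x}, t)\, dt - \sqrt{\kappa}\, x\, z\, dv_t, \\
  dy_t &= a^2(\mathbf{x}, t)\, dt - \sqrt{\kappa}\, y\, z\, dv_t, \\
  dz_t &= a^3(\mathbf{x}, t)\, dt + \sqrt{\kappa}\, (1 - z^2)\, dv_t.
\end{aligned}
\tag{1.17}
\]
The deterministic integrands $a^i(\mathbf{x}, t)$ are
\[
\begin{aligned}
  a^1(\mathbf{x}, t) &= f^2(t)\, z - f^3(t)\, y - \tfrac{1}{2} \kappa\, x + \kappa\, \gamma(r)\, x\, z^2, \\
  a^2(\mathbf{x}, t) &= f^3(t)\, x - f^1(t)\, z - \tfrac{1}{2} \kappa\, y + \kappa\, \gamma(r)\, y\, z^2, \\
  a^3(\mathbf{x}, t) &= f^1(t)\, y - f^2(t)\, x - \kappa\, (n - 1) \left( \frac{1 - r^2}{1 + r^2} \right) z - \kappa\, \gamma(r)\, z^3
\end{aligned}
\tag{1.18}
\]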
with the function
\[
  \gamma(r) \equiv (1 - r^2) \left( \frac{n (n + 1)}{2 (1 + n r^2)} - \frac{1}{1 + r^2} \right). \tag{1.19}
\]
Finally, the stochastic increment $dv_t$ is the innovation process, calculated from the
measurement process $y_t$ (no relation to the Bloch vector component) with a differential
\[
  dv_t = dy_t - n \sqrt{\kappa}\, z_t\, dt. \tag{1.20}
\]
The nonlinear function γ(r) has two important zeros that simplify the projection
filter dramatically. The first is that when n = 1, γ(r) = 0 for every value of r.
Furthermore it is easy to compute that when evaluating the projection filter for
n = 1, the equations are identical to a set of conditional Bloch vector equations one
obtains directly from the conditional master equation. In other words, the projected
space is the whole manifold of solutions, P = S(2). The second zero is
γ(r = 1) = 0 for any n. The n and r dependent terms in $a^3(\mathbf{x}, t)$ also evaluate to
zero for r = 1, meaning that for any n the projected pure state evolution is essentially
identical to the evolution of a single qubit state. The only remaining n dependence is
that the innovation process requires the expected measurement outcome to be scaled
by a factor of n.
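
Equations (1.17) through (1.20) translate directly into a short numerical routine. The sketch below is one possible transcription, not the code used for the simulations in Chap. 4: the function names and the choice of a plain Euler-Maruyama step are assumptions made here for illustration.

```python
import numpy as np


def gamma(r, n):
    """Nonlinear function of Eq. (1.19)."""
    return (1 - r**2) * (n * (n + 1) / (2 * (1 + n * r**2)) - 1 / (1 + r**2))


def drift(bloch, f, kappa, n):
    """Deterministic integrands a^i of Eq. (1.18)."""
    x, y, z = bloch
    f1, f2, f3 = f
    r2 = x * x + y * y + z * z
    g = gamma(np.sqrt(r2), n)
    a1 = f2 * z - f3 * y - 0.5 * kappa * x + kappa * g * x * z**2
    a2 = f3 * x - f1 * z - 0.5 * kappa * y + kappa * g * y * z**2
    a3 = (f1 * y - f2 * x
          - kappa * (n - 1) * ((1 - r2) / (1 + r2)) * z
          - kappa * g * z**3)
    return np.array([a1, a2, a3])


def diffusion(bloch, kappa):
    """Coefficients multiplying dv_t in Eq. (1.17)."""
    x, y, z = bloch
    return np.sqrt(kappa) * np.array([-x * z, -y * z, 1 - z**2])


def euler_maruyama_step(bloch, dy, f, kappa, n, dt):
    """Advance the Bloch vector by one measurement increment dy (Ito form)."""
    bloch = np.asarray(bloch, dtype=float)
    dv = dy - n * np.sqrt(kappa) * bloch[2] * dt   # innovation, Eq. (1.20)
    return bloch + drift(bloch, f, kappa, n) * dt + diffusion(bloch, kappa) * dv


# Example: one step from the +x pure state with no control field and dy = 0.
next_state = euler_maruyama_step([1.0, 0.0, 0.0], dy=0.0, f=(0.0, 0.0, 0.0),
                                 kappa=1.0, n=50, dt=1e-3)
```

Whatever the ensemble size, only three real numbers are propagated per step, and `gamma` evaluates to zero both for n = 1 and on the surface of the Bloch ball, in line with the two zeros discussed above.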
There are three elements that make this projection filter tractable for obtaining
an analytic expression. The first is the isotropic nature of the trace inner product metric,
which dramatically simplifies the calculation. The second is that the Pauli matrices form
a simple basis for 2 × 2 complex matrices and share the same eigenvalues. The third
is the identical and independent assumption for the joint operators. This allows for
terms that would in general result in ensemble averages to be given by identical single
particle values.
The final element of Chap. 4 is a series of numerical experiments testing the
accuracy and performance of the projection filter against exact simulations for initial
pure spin coherent states. The rate κ sets the measurement timescale and so all
times in the simulations are compared to this rate, effectively setting it to 1. Each
simulation ran for a fixed time $t = 0.2\, \kappa^{-1}$. As the quality of the projection filter
should explicitly depend upon n, these simulations test the qubit numbers n =
1, 25, 50, 75, 100. The average performance data included a sample of ν = 100
isotropically sampled initial qubit states with a single noise realization for each initial
state.
In addition to testing the performance as a function of n, the simulations also test two different
control functions $f^i(t)$. The first is $f^i(t) = 0$ for all t and i. This corresponds to a
QND measurement of the z projection of the total angular momentum formed by the
qubit ensemble, Jz, and is known to produce spin squeezing. We find that an initial
state involving 50 qubits prepared in the +Jx eigenstate produced ≈ 10 dB of squeezing
in one measurement duration. A squeezed state is inherently not a product state,
and so serves as a worst case scenario for the projection filter and acts as a lower
bound on its performance.
The zero field measurement is compared to the case of a strong randomized
control sequence. Chap. 5 uses the projection filter in an algorithm to reconstruct
the initial condition of a spin coherent state (SCS) from a continuous measurement of Jz,
characterized by the rate κ. In order to obtain information about observables other than Jz, an
external control Hamiltonian must be applied. For reasons discussed in Sec. 5.3.1,
this takes the form of a sequence of global π/2 rotations, where each rotation is about
an axis $\mathbf{n}$ independently sampled from a uniform distribution. Fully characterizing
the control amplitude $\mathbf{f}(t)$ requires specifying the amplitude and duration of each
pulse, as a larger Larmor frequency is needed to enact the same rotation in a shorter
time. For simplicity, we fix $\mathbf{f}(t)$ to have a constant magnitude, and so for a pulse
duration τ the control field is given by
\[
  \mathbf{f}(t) = \frac{\pi}{2 \tau} \sum_{m=1} \chi_{[m-1,\, m)}(t / \tau)\, \mathbf{n}_m \tag{1.21}
\]
where $\chi_{[a,b)}(t)$ is the indicator function for the interval [a, b) and the $\mathbf{n}_m$ are i.i.d.
unit vectors drawn from an isotropic distribution.
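
The piecewise-constant control field of Eq. (1.21) is simple to generate numerically. The following sketch is illustrative only: the helper names, the seed, the pulse duration `tau`, and the number of pulses are assumptions chosen here, and the field is taken to be zero once the last pulse has ended.

```python
import numpy as np


def random_unit_vectors(num, rng):
    """Draw `num` i.i.d. unit vectors from the isotropic distribution on the sphere."""
    v = rng.normal(size=(num, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def control_field(t, tau, axes):
    """Evaluate Eq. (1.21): magnitude pi / (2 tau) along axis n_m on the m-th pulse."""
    m = int(np.floor(t / tau))          # zero-based index of the active pulse
    if m < 0 or m >= len(axes):
        return np.zeros(3)              # assumed: no field outside the pulse train
    return (np.pi / (2 * tau)) * axes[m]


rng = np.random.default_rng(1)
tau = 0.005                             # assumed pulse duration, in units of 1/kappa
axes = random_unit_vectors(40, rng)     # one rotation axis per pulse

field_now = control_field(0.0123, tau, axes)
```

Each pulse rotates the Bloch vector by (π / 2τ) τ = π/2 about its axis, so shortening τ raises the required field strength, as noted in the text.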
The accuracy of the projection filter was tested by comparing how well it is
capable of reproducing the conditional expectation values of the collective angular
momentum components $J_i$ as well as the squared overlap between the exact state and
the equivalent spin coherent state that is made from an ensemble of n identical pure
qubits. The time-dependent results are presented in Figures 4.4 and 4.5. The RMS
error in the conditional expectation values was generally independent of the number
of qubits, likely due to the fact that for pure states the projection filter dynamics
are essentially independent of n. With the randomized controls the RMS error was
≲ 5% of the total spin length in all 3 expectation values. In the absence of a control
field, there was a general linear increase in the Jx and Jy errors, also reaching the
5% level, while there was a noticeable increase in the Jz error with a final value in
the ∼ 5–10% range. The poorer performance is attributable to the effect the spin
squeezing has on the mean values.
The squared overlap between the exact state and the equivalent spin coherent
state exhibits a strong dependence upon both the number of qubits and the control
fields. In the uncontrolled case this metric monotonically decreased for all n > 1,
dropping to 0.75 for n = 25 and 0.48 for n = 100. This is in stark contrast to the
simulations including the randomized controls. While the resulting average fidelity
was noticeably poorer for large n, the minimum value was > 0.8 for all n. While the
trend was to have poorer fidelities at longer times, the decrease was not monotonic
implying that the control waveform could be optimized to maximize the average
overlap with the spin coherent state, thereby minimizing the information lost by
performing the projection.
We hypothesize that the state remains closer to a product state because the
randomized controls tend to mix both the squeezed and anti-squeezed components
leading to a near zero average. Not only does the mean spin rotate, but the orien-
tation of the squeezing ellipse also rotates. As the rotation axes are chosen from a
uniform distribution, the squeezed component is just as likely as the anti-squeezed
component to be oriented along the measurement axis. At any given time, the uncer-
tainty in the Jz component is equally likely to be above or below the uncertainty of an
equivalent spin coherent state. Therefore it is difficult for any significant squeezing
to develop, which keeps the exact state closer to the product state description.
1.1.4 Qubit state reconstruction
Chap. 5 describes how to use the quantum filtering formalism in order to construct
a tomographic estimate for an unknown initial quantum state from an ensemble of
identical copies experiencing a joint continuous measurement. We make a maximum
likelihood estimate (MLE) of the initial state, based upon the statistics of a single
continuous measurement realization. The purpose of this work is to extend previous
results [11–13], which used a continuous measurement for quantum state tomography,
into a regime where the quantum backaction significantly affects the measurement
statistics. In an idealized numerical study, we find that such an estimate can nearly
saturate an optimum reconstruction bound. Much is known about the fundamental
quantum limits of reconstructing pure qubit states from a finite number of measure-
ments. Massar and Popescu showed that given n copies of a pure qubit state, it is
possible to find a generalized measurement that returns the highest average fidelity
between the estimated state and the correct initial state [39]. The average is made
not just over measurement outcomes but also over an unbiased set of possible input
states. The optimum average fidelity bound is simply given by $\langle F \rangle_{\mathrm{opt}} = (n+1)/(n+2)$.
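
To give a sense of scale for this bound, the short sketch below evaluates (n+1)/(n+2) for a few ensemble sizes. The function name and the particular values of n are chosen here for illustration.

```python
from fractions import Fraction


def optimal_average_fidelity(n):
    """Optimum average fidelity bound (n + 1) / (n + 2) for n copies of a pure qubit."""
    return Fraction(n + 1, n + 2)


bounds = {n: optimal_average_fidelity(n) for n in (1, 25, 100)}
for n, bound in bounds.items():
    print(f"n = {n:3d}: bound = {bound} = {float(bound):.4f}")
```

The gap between the bound and unity shrinks as 1/(n+2), so for ensembles of tens of qubits an estimator must reach average fidelities above 0.96 to approach the optimum.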
We consider here an idealized model of an ensemble of n qubits identically coupled
to a single traveling wave quantum light field via a linearized Faraday interaction.
Under certain approximations discussed in Sec. 2.7, a measurement of the orthogonal
quadrature contains information about the collective angular momentum variable
Jz, with a coupling rate κ. The ensemble is assumed to be prepared in a pure spin
coherent state characterized by the unknown polar angles (θ, φ). However the initial
qubit state is not a QND variable, meaning that ρ(θ, φ)⊗n does not commute with the
fundamental interaction Hamiltonian. The implication of this is that it is impossible
to find a consistent method for inverting the forward time dynamics to arrive at a
conditional expression for the initial state.
To circumvent this problem we instead map the quantum state estimation problem
to a parameter estimation problem to find an MLE of (θ, φ). A single continuous
measurement realization of a noncommuting output quadrature results in a stochastic
process $\{y_t\}_{t \ge 0}$ that contains information about the atomic ensemble. Because
of this information, its statistics parametrically depend upon the unknown angles.
While an MLE based upon a single data point would perform quite poorly, we find a
conditional estimate based upon the entire trajectory performs quite well when the
measurement is informationally complete. To ensure informational completeness, a
known time-varying control Hamiltonian is applied to the system, thereby mixing
all spin projections with the measurement axis. Riofrío et al. found that an efficient
and unbiased control policy is to choose a set of operations capable of generating
any single particle state and then randomly vary the magnitude of each control
in a piecewise constant way [13]. Here we use a similar control policy by including
in the model a uniform magnetic field with a constant field strength that rotates the
collective angular momentum vector by π/2 in a period τ about randomly chosen
rotation axes. In a fixed final time we find 40 rotations provide enough information
to obtain high fidelity estimates.
In the semiclassical probability space induced by a measurement realization, the
appropriate probability measure P has a parametric dependence on the initial angles
(θ, φ). This dependence is best seen by considering not the conditional
statistics of $\{y_t\}_{t \ge 0}$ but instead the calculated innovation process $\{v_t\}_{t \ge 0}$. For the
measurement model considered here this process is given by
\[
  v_t = y_t - 2 \sqrt{\kappa} \int_0^t ds\, \mathrm{Tr}\big( J_z\, \rho_s(\theta, \phi) \big), \tag{1.22}
\]
where $\rho_s(\theta, \phi)$ is the system density operator calculated via the conditional master
equation assuming that the initial condition is given by the angles (θ, φ). The innovation
process $v_t$ is shown in Sec. 3.4.1 to have the statistics of a Wiener process
if $\mathrm{Tr}( J_z\, \rho_s(\theta, \phi) )$ corresponds to the exact quantum conditional expectation of the
Heisenberg picture operator $U_t^\dagger J_z U_t$. If this correspondence cannot be made because
$\rho_s(\theta, \phi)$ used an incorrect initial condition, then $v_t$ will not have the statistics of a Wiener
process for every measurement realization. Here we use this fact to find the MLE for
(θ, φ). We seek the initial condition that makes the innovation process most likely to
be a Wiener process. As the conditional master equation is a nonlinear equation, we
resort to a mixture of numerical and analytical methods for finding an approximation
to the true likelihood function.
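The Wiener measure gives the probability that a Wiener process sampled at times
$\{ t_i : i = 1, \dots, n \}$ will be within the associated intervals $I_i = (a_i, b_i)$ and is given by the
integral
\[
  P( v_{t_i} \in I_i ) = \int_{a_1}^{b_1} dv_1 \dots \int_{a_n}^{b_n} dv_n
  \prod_{i=1}^{n} \left( \frac{1}{\sqrt{2 \pi \Delta t_i}}
  \exp\left( - \frac{(v_i - v_{i-1})^2}{2 \Delta t_i} \right) \right). \tag{1.23}
\]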
Because this is a Gaussian probability density, the MLE coincides with the least
squares estimate. For an equally spaced mesh of finite time intervals both the normalization
factor and the denominator of the exponent are irrelevant for the purposes
of computing an MLE. Therefore maximizing the likelihood function is equivalent to
minimizing the quadratic variation,
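\[
\mathrm{QV}(v_t) \equiv \sum_{i=1}^{n} \left( \Delta y_i - 2\sqrt{\kappa}\, \Delta t\, \mathrm{Tr}\big( J_z\, \rho_{t_{i-1}}(\theta,\phi) \big) \right)^2. \tag{1.24}
\]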
The minimization of this function with respect to (θ, φ) is computed numerically as we
are unable to find an analytic solution to the conditional master equation and the
dependence upon the initial condition is nonlinear.
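
As an illustration of how the cost function in Eq. (1.24) can be evaluated and minimized over candidate initial conditions, the following is a minimal sketch. It assumes a hypothetical stand-in `toy_jz` for Tr(J_z ρ_t(θ, φ)), namely a static spin with no dynamics, rather than the conditional master equation or the projection filter; all function names and parameter values are illustrative only.

```python
import math
import random


def quadratic_variation(dy, jz_pred, kappa, dt):
    """Discrete cost of Eq. (1.24): sum_i (dy_i - 2 sqrt(kappa) dt <Jz>_{i-1})^2."""
    return sum((d - 2.0 * math.sqrt(kappa) * dt * j) ** 2
               for d, j in zip(dy, jz_pred))


def toy_jz(theta, n_steps, n_qubits=10):
    # Hypothetical stand-in for Tr(Jz rho_t(theta, phi)): a static spin
    # with no control rotations and no backaction (an assumption).
    return [0.5 * n_qubits * math.cos(theta)] * n_steps


def estimate_theta(dy, kappa, dt, candidates):
    """Return the candidate angle whose predicted signal minimizes the cost."""
    return min(candidates,
               key=lambda th: quadratic_variation(dy, toy_jz(th, len(dy)), kappa, dt))


if __name__ == "__main__":
    rng = random.Random(0)
    kappa, dt, n_steps, true_theta = 0.01, 0.01, 2000, 1.0
    signal = toy_jz(true_theta, n_steps)
    # Simulated record increments: deterministic signal plus Wiener increments.
    dy = [2.0 * math.sqrt(kappa) * dt * j + rng.gauss(0.0, math.sqrt(dt))
          for j in signal]
    grid = [i * math.pi / 50 for i in range(51)]
    print(estimate_theta(dy, kappa, dt, grid))
```

In the setting described in the text, `toy_jz` would be replaced by a numerical integration of the filter for each candidate (θ, φ), which is why each evaluation of the cost is expensive.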
Every evaluation of this function requires a full numerical integration of the con-
ditional master equation. Numerically integrating an exact conditional pure state is
a computationally intensive task. Here we test spin ensembles of 25 ≤ n ≤ 100
qubits, which require of order n complex numbers to fully describe the relevant conditional
dynamics. While it is computationally feasible to integrate the exact equation
to generate a simulated measurement record, for every measurement record we used
a total of 500 evaluations of the quadratic variation cost function. We found it
infeasible to use an exact expression for computing QV(vt) and instead sought a
reasonably accurate approximation. The approximation we use is the projection fil-
ter developed in Chap. 4. Under identical conditions to the dynamics used here,
the projection filter is able to match the expectation value Tr(Jz ρt) to within 95%
accuracy of the exact value while only propagating 3 real numbers for any number
of qubits. By minimizing with respect to the approximate filter, the limiting com-
putational element became generating the simulated measurement record. While in
a higher dimensional space one could approach the problem with a gradient descent
algorithm, we find it more efficient to simply make a dense Monte Carlo sampling of
the entire Bloch sphere (see footnote 2) and then select the most likely sample.
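
The sampling strategy described above can be sketched as follows. This is a simplified illustration, not the procedure used for the reported results: it samples only directions on the sphere surface (pure states), and the cost passed to `most_likely` is a hypothetical placeholder for the quadratic variation. All names are assumptions made for the example.

```python
import math
import random


def sample_sphere(n_samples, rng):
    """Draw (theta, phi) uniformly with respect to area on the unit sphere."""
    samples = []
    for _ in range(n_samples):
        theta = math.acos(1.0 - 2.0 * rng.random())  # cos(theta) uniform on [-1, 1]
        phi = 2.0 * math.pi * rng.random()
        samples.append((theta, phi))
    return samples


def most_likely(samples, cost):
    """Select the sample with the smallest cost (largest likelihood)."""
    return min(samples, key=cost)


def angular_distance(a, b):
    """Great-circle angle between two directions given as (theta, phi)."""
    (t1, p1), (t2, p2) = a, b
    c = (math.sin(t1) * math.sin(t2) * math.cos(p1 - p2)
         + math.cos(t1) * math.cos(t2))
    return math.acos(max(-1.0, min(1.0, c)))


if __name__ == "__main__":
    rng = random.Random(1)
    target = (1.0, 2.0)  # hypothetical "true" direction, for illustration only
    samples = sample_sphere(500, rng)
    # Placeholder cost: distance to the target instead of the quadratic variation.
    best = most_likely(samples, lambda s: angular_distance(s, target))
    print(best, angular_distance(best, target))
```

Drawing cos θ uniformly, rather than θ itself, is what makes the samples uniform in area; sampling θ uniformly would over-weight the poles.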
In order to understand what role backaction plays in limiting the reconstruction
fidelity, we compare the performance of the projection filter estimate to one that
ignores completely the conditional dynamics. Instead of propagating a conditional
state, this estimate only considers the Hamiltonian dynamics generated by the mag-
netic field rotations. In other words by solving the Heisenberg equation of motion,
\[
\frac{d}{dt}\sigma_z(t) = +i\left[ \tfrac{1}{2} f^{j}(t)\, \sigma_j(t),\ \sigma_z(t) \right] \tag{1.25}
\]
for the controls f^i(t) given in Eq. (1.21), the expectation value √κ n Tr(σ_z(t) ρ(θ, φ))
reproduces the expected signal, ignoring the backaction. The purpose of computing
this in the Heisenberg picture is so that we are able to solve for the dynamical
observables once, and then apply that solution for any initial condition.

Footnote 2: Due to an issue involving numerical stability, we start with a uniform sampling of mixed
states and then make a subsequently smaller sample of pure states. See Sec. 5.4.2.

The results of the numerical experiments are given in Fig. 5.2. The estimate
based upon the projection filter nearly achieves the optimum (n+ 1)/(n+ 2) fidelity
bound, averaged over ν = 1000 trials for n = (25, 40, 55, 70, 85, 100) qubits. The
difference between the optimum bound and the numerical averages never exceeded
0.21%, a deviation that is likely statistically significant but not attributable to any
fundamental Monte Carlo sampling errors. In comparison the backaction-free estimator
performed significantly worse, especially for higher qubit numbers. This
suggests that including the conditional dynamics is indeed important in this idealized
scenario.
The cause of the discrepancy between the projection filter estimate and the
backaction-free estimate is likely due to a bias that develops when all measurement
effects are ignored. This can be seen in Figures 5.3 and 5.4. These figures plot the average
reconstruction fidelity as a function of the measurement duration. The filtering
based estimate shows a monotonic rise in the average fidelity which then saturates at
a level only slightly below the optimum bound. As n increases this saturation occurs
at earlier times. In contrast, the estimate based only on Hamiltonian evolution does
not show a monotonic increase in reconstruction fidelity. For n = 55, 70, 85, 100 the
average fidelity reaches a maximum and then decreases significantly as more data
are collected. For n = 25, 40 it is possible that a decrease might also have occurred
if the simulation continued for longer times.
When backaction is ignored, the assumption of pure unitary evolution implies that
no coherence is lost during the course of the measurement. If at time t the random-
ized controls managed to return to the original orientation then the backaction-free
estimator would “weight” the data received at that time just as much as the data
obtained at time t = 0. In comparison, the filtering based estimate knows that while
the rotations may have canceled, the expected signal at time t is not what it was
at time t = 0, precisely because of the conditional effects. By not including this
information the unitary estimate is biased away from the optimum estimate.
Chapter 2
Quantum Optics and Quantum
Stochastic Differential Equations
The objective of this chapter is to identify how the formalism of quantum stochas-
tic differential equations is implemented in the context of quantum optics. This is
done by first showing how the second quantization of classical quasi-monochromatic
traveling wave packets gives the natural structure necessary for defining the quan-
tum Ito integral. We then show under what conditions a wave packet operator can
be treated as generating a localized field operator, which is necessary for a Markov
approximation. From that localized structure, we then review how this defines a
quantum Wiener process and relates to the quantum white noise formalism usually
presented in quantum optics. With a wave packet description of quantum white
noise, we then review how a system coupling to these operators generates a quantum
stochastic differential equation for the propagator. Generating this equation is inti-
mately related to the operator ordering of the field operators, which is also related
to defining different kinds of stochastic equations. Here we review this fact and how
the propagator is derived. Finally, we apply this result to the Faraday Hamiltonian
and discuss the interaction in the limits of both strong and weak number coupling.
2.1 Quantum Stochastic Process in Optical Fields
The classical stochastic process is most generally defined as a family of random variables
{x_t}_{t≥0} indexed by time t. A quantum stochastic process is then a family of
operators {X_t}_{t≥0} also indexed by time. This definition allows for a slightly more
general structure than simply an operator that depends upon time. A common example
of a time-dependent quantum operator is a Heisenberg picture operator, X(t),
acting on some Hilbert space H with its dynamics given by a unitary transformation.
Conversely, a quantum stochastic process implies something more general, where the
spectrum of Xt could be time-dependent and even the Hilbert space upon which it
acts nontrivially could be continuously changing in time.
A concrete and pertinent example is a continuous wave laser that is switched on
at time t0 = 0 and then switched off at some later time t > 0. Consider a stationary
observer located a distance d = c τ away from the laser who begins counting photons
with a perfectly efficient detector at time t0. By a time 0 < s < τ it is clear that
this observer will not have observed any of the laser light and so, in the absence of any
corrupting background, the probability for observing anything must be zero. The
point of this example is that in the time interval [0, s] any observations must be
modeled as projectors acting on a part of Fock space that is independent of the part
displaced by the laser.
It is perfectly reasonable to use a Schrödinger picture description where the observer
makes projective measurements on a volume of the electromagnetic field and
the free-field Hamiltonian acts in such a way as to propagate the state of the field
fixed from the laser position to the detector. In Sec. 3.2, a mapping between a set
of commuting observables and a classical probability space is developed, where there
is a one-to-one correspondence between classical random variables and commuting
operators. To utilize these tools of classical probability theory, it is most natural to
work in a Heisenberg picture where the states remain fixed and the unitary evolution
is applied to the observables of interest. To develop a Heisenberg picture formula-
tion for a continuous measurement, we require a mathematical structure that can
cope with the fact that as time progresses a stationary observer will measure an ever
increasing set of operators, and these operators act upon different parts of the field’s
Hilbert space. This family of operators is the quintessential definition of a quantum
stochastic process.
In classical stochastic calculus, the Wiener process is the fundamental random
process from which the Ito integral is constructed and from there other processes are
defined. In the quantum setting, we require equally fundamental operator-valued
noise processes from which we will construct other processes. But as we are seek-
ing a description of a continuous optical measurement, those processes should arise
from the quantized electromagnetic field. The next section reviews the canonical
quantization of the free electromagnetic field to identify the operator nature of the
quantized field.
Throughout this chapter we will be discussing both classical and quantized elements
of the electromagnetic field. In order to make this distinction, the classical
vector fields for the vector potential, electric field, etc. will be denoted in calligraphic type as 𝒜(x, t),
ℰ(x, t) and their quantized operator expressions in upright type as A(x, t), E(x, t). We will quantize
the free space electromagnetic field following the classic text by Cohen-Tannoudji
et al., and use SI units [44]. For reference, the spatial Fourier transform of a function
f(x, t) is defined as
\[
f(\mathbf{k},t) \equiv \int_{\mathbb{R}^3} \frac{d^3x}{\sqrt{(2\pi)^3}}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, f(\mathbf{x},t) \tag{2.1}
\]
and the inverse transform is
\[
f(\mathbf{x},t) \equiv \int_{\mathbb{R}^3} \frac{d^3k}{\sqrt{(2\pi)^3}}\, e^{+i\mathbf{k}\cdot\mathbf{x}}\, f(\mathbf{k},t). \tag{2.2}
\]
2.1.1 Free space quantization
When rigorously quantizing the free space electromagnetic field, one begins by defin-
ing a scalar Lagrangian functional, with respect to variations in the vector potential,
A(x, t), whose minimization reproduces Maxwell’s equations [44]. The field Hamil-
tonian is
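\[
H_f = \int_{\mathbb{R}^3} d^3x \left( \frac{|\boldsymbol{\Pi}|^2}{2\varepsilon_0} + \frac{1}{2}\varepsilon_0 c^2\, |\nabla \times \mathcal{A}|^2 \right) \tag{2.3}
\]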
with the conjugate variable to the vector potential Π being
\[
\boldsymbol{\Pi} = \varepsilon_0 \frac{\partial}{\partial t}\mathcal{A} = -\varepsilon_0\, \mathcal{E}. \tag{2.4}
\]
(The final equality with the electric field is made by assuming that there are no free
charges.) From this classical Hamiltonian, the connection with quantum mechanics
is made by noting this is the Hamiltonian of a continuous set of harmonic oscillators
with canonical variables A,Π. Quantization then promotes these variables to
canonically commuting operators.
By choosing to work in the Coulomb gauge, it is easily shown that if A(k, t) is
the Fourier transform of A then
k · A = 0. (2.5)
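This constraint results in two free polarization components (labeled by q ∈ {1, 2})
for each Fourier component, defining the vectors e_q(k) which satisfy the properties
\begin{align}
\mathbf{k} \cdot \mathbf{e}_q(\mathbf{k}) &= 0, \tag{2.6a} \\
\mathbf{e}^*_q(\mathbf{k}) \cdot \mathbf{e}_{q'}(\mathbf{k}) &= \delta_{qq'}, \quad \text{and} \tag{2.6b} \\
\sum_q e^*_{q\,i}(\mathbf{k})\, e_{q\,j}(\mathbf{k}) &= \delta_{ij} - k_i k_j / |\mathbf{k}|^2 \tag{2.6c}
\end{align}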
where i, j refer to the Cartesian components. In terms of the real space operators the
components of the quantized fields A(x) and E(x) satisfy the commutation relations,
\begin{align}
[A_i(\mathbf{x}), A_j(\mathbf{x}')] &= [E_i(\mathbf{x}), E_j(\mathbf{x}')] = 0 \tag{2.7} \\
[A_i(\mathbf{x}), -\varepsilon_0 E_j(\mathbf{x}')] &= i\hbar\, \delta^{T}_{ij}(\mathbf{x} - \mathbf{x}') \tag{2.8}
\end{align}
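where δ^T_ij(x − x′) is the transverse delta function, defined as
\[
\delta^{T}_{ij}(\mathbf{x} - \mathbf{x}') \equiv \int \frac{d^3k}{(2\pi)^3}\, e^{+i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')} \left( \delta_{ij} - \frac{k_i k_j}{|\mathbf{k}|^2} \right). \tag{2.9}
\]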
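In reciprocal space, the vector potential is most suitably expressed in terms of the
annihilation (a_q(k)) and creation (a_q†(k)) operators associated with the polarization
vectors e_q(k). They obey the commutation relations,
\begin{align}
[a_q(\mathbf{k}), a_{q'}(\mathbf{k}')] &= [a^\dagger_q(\mathbf{k}), a^\dagger_{q'}(\mathbf{k}')] = 0 \quad \text{and} \tag{2.10} \\
[a_q(\mathbf{k}), a^\dagger_{q'}(\mathbf{k}')] &= \delta_{qq'}\, \delta(\mathbf{k} - \mathbf{k}'). \tag{2.11}
\end{align}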
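The Schrödinger picture operators for the vector potential, electric field and magnetic field are then
given by
\begin{align}
\mathbf{A}(\mathbf{x}) &= \sum_q \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{\hbar}{2\varepsilon_0 c |\mathbf{k}|}}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, \mathbf{e}^*_q(\mathbf{k})\, a_q(\mathbf{k}) + \text{h.c.}, \tag{2.12} \\
\mathbf{E}(\mathbf{x}) &= i \sum_q \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{\hbar c |\mathbf{k}|}{2\varepsilon_0}}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, \mathbf{e}^*_q(\mathbf{k})\, a_q(\mathbf{k}) + \text{h.c.}, \quad \text{and} \tag{2.13} \\
\mathbf{B}(\mathbf{x}) &= i \sum_q \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{\hbar}{2\varepsilon_0 c |\mathbf{k}|}}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, \mathbf{k} \times \mathbf{e}^*_q(\mathbf{k})\, a_q(\mathbf{k}) + \text{h.c.} \tag{2.14}
\end{align}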
Substituting these expressions into the Hamiltonian results in the simplified form
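\[
H_f = \tfrac{1}{2} \hbar c \sum_q \int d^3k\, |\mathbf{k}| \left( a_q(\mathbf{k})\, a^\dagger_q(\mathbf{k}) + a^\dagger_q(\mathbf{k})\, a_q(\mathbf{k}) \right). \tag{2.15}
\]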
We will follow the standard practice of discarding any vacuum energy contributions
and write
\[
H_f = \hbar c \sum_q \int d^3k\, |\mathbf{k}|\, a^\dagger_q(\mathbf{k})\, a_q(\mathbf{k}). \tag{2.16}
\]
2.2 Wave Packets, Fock Space and Stochastic
Processes
For a single simple harmonic oscillator, the state space is spanned by a complete
basis of states labeling the number of quanta in the oscillator. In the free-space EM
field, we have instead a continuous distribution of oscillators, each representing a
plane wave Fourier component with one of two orthogonal polarization states. This
continuous nature means that the field operators are unbounded in two ways: for a
given k we can have a countably infinite number of quanta and a single quantum
can have an unbounded amount of energy if we allow pure plane wave states with
arbitrarily large wave numbers |k|. The solution to these problems is to consider only
states of light for which these operators return finite quantities.
Notice that in Eqs. (2.12 - 2.16) the plane wave operators aq(k) and a†q(k) act
as operator-valued integral kernels where they are combined with various weighting
functions to form the physically relevant operators. The fact that they have the
singular commutation relation [aq(k), a†q′(k′)] = δqq′ δ(k − k′) implies that they are
only well defined in the context of an integral, where the Dirac delta function is well
behaved. The point is that the domain of the operators E, B and H_f that return finite
eigenvalues should not be considered as a set of distinct plane wave oscillators, but
instead in terms of continuous functions defined over ranges of Fourier components.
We will refer to these distributions as wave packets, in that by constructing a properly
weighted distribution over plane waves one arrives at a localized pulse or packet
of waves that propagates in some direction. Rather than initially discussing wave
packets in terms of single quanta, it is easier to first define wave packet states in
terms of semiclassical states of light that generalize the coherent state of a single
mode harmonic oscillator.
2.2.1 Wave packets
A semiclassical wave packet identifies those states of light that reproduce coherent
classical radiation when one takes expectation values of the quantized operators
A(x), E(x), etc. This relationship has been identified by many authors, e.g. Deutsch
[40], Garrison and Chiao [41], Smith and Raymer [42]. We review these results here,
focusing on the physical interpretation for the wave packet distributions.
The single-mode coherent state, ψ = |α〉, is characterized by the complex amplitude
α. Its mean photon number is |α|² and it is an eigenstate of the
annihilation operator, a |α〉 = α |α〉. We have seen that in the canonical quantization
of the free field, each plane wave and transverse polarization vector has its own annihilation
operator a_q(k), and so a corresponding coherent state of light requires a
complex vector-valued function g(k). The coherent state ψ[g] satisfies the equation
\[
a_q(\mathbf{k})\, \psi[\mathbf{g}] = g_q(\mathbf{k})\, \psi[\mathbf{g}]. \tag{2.17}
\]
By hypothesizing the existence of the states ψ[g] we would like to see how the
coherent amplitude function g(k) relates to physical quantities in expectation. By
taking the expectation value of the (vacuum energy removed) Hamiltonian Eq. (2.16)
we can easily see that
\[
\langle H_f \rangle_{\psi[\mathbf{g}]} = \hbar c \sum_q \int d^3k\, |\mathbf{k}|\, |g_q(\mathbf{k})|^2. \tag{2.18}
\]
An equally trivial calculation results in
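\[
\langle \mathbf{A}(\mathbf{x}) \rangle_{\psi[\mathbf{g}]} = \sum_q \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{\hbar}{2\varepsilon_0 c |\mathbf{k}|}}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, g_q(\mathbf{k})\, \mathbf{e}^*_q(\mathbf{k}) + \text{c.c.} \tag{2.19}
\]
and
\[
\langle \mathbf{E}(\mathbf{x}) \rangle_{\psi[\mathbf{g}]} = i \sum_q \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{\hbar c |\mathbf{k}|}{2\varepsilon_0}}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, g_q(\mathbf{k})\, \mathbf{e}^*_q(\mathbf{k}) + \text{c.c.} \tag{2.20}
\]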
It is possible to invert these two equations and so express g(k) in terms of the spatial
Fourier transform of a classical vector potential, A(k, t). Performing this inversion
we find that,
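\[
\mathbf{g}(\mathbf{k}) = \sqrt{\frac{\varepsilon_0}{2 \hbar c |\mathbf{k}|}} \left( c |\mathbf{k}|\, \mathcal{A}(\mathbf{k}, 0) + i \left. \frac{\partial}{\partial t} \mathcal{A}(\mathbf{k}, t) \right|_{t=0} \right). \tag{2.21}
\]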
When quantizing the field, Eq. (2.21) and its adjoint are nothing more than the
“normal variables” that are in classical correspondence to aq(k) and a†q(k) [44].
It is common in optics to relate the physical classical fields A, E , and B to a
unitless mode function. In terms of the vector potential this results in the ansatz,
\[
\mathcal{A}(\mathbf{x}, t) = A_0 \left( \mathbf{u}^{(+)}(\mathbf{x}, t) + \mathbf{u}^{(-)}(\mathbf{x}, t) \right) \tag{2.22}
\]
where A0 is a real constant and u(+)(x, t) is a complex unitless mode function. Because
the vector potential is required to be real, we have the relation
\[
\mathbf{u}^{(-)}(\mathbf{x}, t) = \mathbf{u}^{(+)\,*}(\mathbf{x}, t). \tag{2.23}
\]
As u(+)(x, t) is unitless, the integral of its squared modulus
\[
v = \int d^3x\, |\mathbf{u}^{(+)}(\mathbf{x}, t)|^2 \tag{2.24}
\]
has units of volume and is referred to as the mode volume of the field. By taking
the spatial Fourier transform of Eq. (2.22) we have
\[
\mathcal{A}(\mathbf{k}, t) = A_0 \left( \mathbf{u}^{(+)}(\mathbf{k}, t) + \mathbf{u}^{(-)}(\mathbf{k}, t) \right) \tag{2.25}
\]
The purpose of separating between u(+)(x, t) and u(−)(x, t) is to allow for the sepa-
ration between positive and negative frequency components, respectively. For a free
field then,
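\[
\mathbf{u}^{(+)}(\mathbf{k}, t) = \mathbf{u}^{(+)}(\mathbf{k}, 0)\, e^{-i c |\mathbf{k}| t}. \tag{2.26}
\]
The Fourier space version of Eq. (2.23) is
\[
\mathbf{u}^{(-)}(\mathbf{k}, t) = \mathbf{u}^{(+)\,*}(-\mathbf{k}, t). \tag{2.27}
\]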
To simplify the expression for g(k) as given in Eq. (2.21), we need to compute the
time derivative of A(k, t). Substituting Eq. (2.26) into Eq. (2.25) and computing
the derivative we have
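\[
\frac{\partial}{\partial t} \mathcal{A}(\mathbf{k}, t) = -i c |\mathbf{k}|\, A_0 \left( \mathbf{u}^{(+)}(\mathbf{k}, t) - \mathbf{u}^{(+)\,*}(-\mathbf{k}, t) \right). \tag{2.28}
\]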
Substituting this expression into Eq. (2.21) leads to
\[
\mathbf{g}(\mathbf{k}, t) = A_0 \sqrt{\frac{2 \varepsilon_0 c |\mathbf{k}|}{\hbar}}\, \mathbf{u}^{(+)}(\mathbf{k}, t). \tag{2.29}
\]
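
For reference, the algebra connecting Eqs. (2.21) and (2.29) can be written out explicitly. The following is only a restatement of the substitution described above, using Eqs. (2.25), (2.27) and (2.28) and evaluating the bracket of Eq. (2.21) at a general time t rather than at t = 0; the calligraphic symbol denotes the classical vector potential.

```latex
\begin{align*}
c|\mathbf{k}|\,\mathcal{A}(\mathbf{k},t) + i\,\frac{\partial}{\partial t}\mathcal{A}(\mathbf{k},t)
  &= c|\mathbf{k}|\,A_0\left(\mathbf{u}^{(+)}(\mathbf{k},t) + \mathbf{u}^{(+)\,*}(-\mathbf{k},t)\right)
   + c|\mathbf{k}|\,A_0\left(\mathbf{u}^{(+)}(\mathbf{k},t) - \mathbf{u}^{(+)\,*}(-\mathbf{k},t)\right) \\
  &= 2\,c|\mathbf{k}|\,A_0\,\mathbf{u}^{(+)}(\mathbf{k},t), \\[4pt]
\mathbf{g}(\mathbf{k},t)
  &= \sqrt{\frac{\varepsilon_0}{2\hbar c|\mathbf{k}|}}\;2\,c|\mathbf{k}|\,A_0\,\mathbf{u}^{(+)}(\mathbf{k},t)
   = A_0\sqrt{\frac{2\varepsilon_0 c|\mathbf{k}|}{\hbar}}\;\mathbf{u}^{(+)}(\mathbf{k},t).
\end{align*}
```

The negative-frequency parts cancel exactly, which is why only the positive-frequency mode function survives in the coherent amplitude.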
Rather than including the vector potential constant A0, which usually contains in-
formation about the overall intensity of the field, it is useful to relate it back to the
magnitude of g. We first define the characteristic wave number k1 as
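\[
k_1 \equiv \int d^3k\, |\mathbf{k}|\, \frac{\left| \mathbf{u}^{(+)}(\mathbf{k}, 0) \right|^2}{v}. \tag{2.30}
\]
If we consider v^{-1} |u(+)(k, 0)|² to be a normalized distribution in reciprocal space,
then k1 is the average magnitude of the wave vector. With this definition
\[
\| \mathbf{g} \|^2 = A_0^2\, \frac{2\, \varepsilon_0\, c\, k_1\, v}{\hbar}. \tag{2.31}
\]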
Inverting this relationship results in
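\[
\mathbf{g}(\mathbf{k}, t) = \| \mathbf{g} \| \sqrt{\frac{|\mathbf{k}|}{k_1}}\, \frac{\mathbf{u}^{(+)}(\mathbf{k}, 0)}{\sqrt{v}}\, e^{-i c |\mathbf{k}| t}. \tag{2.32}
\]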
It is worth noting that Eq. (2.32) has units of root volume and that ‖g‖
now acts as a unitless scaling factor. This final formula shows the fundamental
relationship between a distribution over coherent state amplitudes g(k) and the
positive frequency Fourier components of the mode function u(+)(k, 0). While in one
sense this has simply been an algebraic exercise (expressing one distribution over
spatial frequencies in terms of another) the real utility of this expression is that the
mode function u(+)(x, t) has practical implications as it describes the spatial and
temporal properties of a propagating laser beam.
Finally, we express the expected energy in a wave packet state in terms of the
envelope function. Simply substituting Eq. (2.32) into Eq. (2.18) results in,
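\[
\langle H_f \rangle_{\psi[\mathbf{g}]} = \hbar c\, \| \mathbf{g} \|^2 \int d^3k\, \frac{|\mathbf{k}|^2}{k_1}\, \frac{\left| \mathbf{u}^{(+)}(\mathbf{k}, 0) \right|^2}{v}. \tag{2.33}
\]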
Similarly to defining the mean wave vector k1 we can define a two-norm wave vector
k2,
\[
k_2 = \left( \int d^3k\, |\mathbf{k}|^2\, \frac{\left| \mathbf{u}^{(+)}(\mathbf{k}, 0) \right|^2}{v} \right)^{1/2} \tag{2.34}
\]
so that
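\[
\langle H_f \rangle_{\psi[\mathbf{g}]} = \hbar c\, \| \mathbf{g} \|^2\, \frac{(k_2)^2}{k_1}. \tag{2.35}
\]
If, however, |u(+)(k, 0)|² is a sharply peaked function centered at some large vector
k0, then we have that |k0| ≈ k1 ≈ k2. In this case the average energy is then
\[
\langle H_f \rangle_{\psi[\mathbf{g}]} \approx \hbar \omega_0\, \| \mathbf{g} \|^2 \tag{2.36}
\]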
where ω0 = c |k0|.
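
The narrowband approximation behind Eq. (2.36) can be checked numerically in a simplified setting. The sketch below assumes a one-dimensional Gaussian weight in place of v^{-1}|u(+)(k, 0)|², which is a choice made purely for illustration, and computes the ratio (k2)²/(k1 k0) by which Eq. (2.35) differs from ħω0‖g‖². The function names and grid parameters are hypothetical.

```python
import math


def moments(k0, sigma, n=20001, width=8.0):
    """Return (k1, k2) for a 1D Gaussian weight |u(k)|^2 centred at k0."""
    lo, hi = k0 - width * sigma, k0 + width * sigma
    dk = (hi - lo) / (n - 1)
    ks = [lo + i * dk for i in range(n)]
    w = [math.exp(-((k - k0) ** 2) / (2.0 * sigma ** 2)) for k in ks]
    norm = sum(w)  # plays the role of the mode volume v
    k1 = sum(abs(k) * wi for k, wi in zip(ks, w)) / norm
    k2 = math.sqrt(sum(k * k * wi for k, wi in zip(ks, w)) / norm)
    return k1, k2


def energy_ratio(k0, sigma):
    """(k2^2 / k1) / k0: ratio of Eq. (2.35) to the estimate in Eq. (2.36)."""
    k1, k2 = moments(k0, sigma)
    return k2 * k2 / (k1 * k0)


if __name__ == "__main__":
    for k0 in (2.0, 10.0, 1000.0):
        print(k0, energy_ratio(k0, 1.0))
```

For a carrier wave number much larger than the spectral width the ratio approaches one, while for a broadband packet it is noticeably larger than one.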
2.2.2 Weyl operators
Assuming the existence of the semiclassical states is only a first step; the real utility
comes from finding the family of operators that generate these states. In the context
of the simple harmonic oscillator, the coherent state with amplitude α is generated
by the unitary displacement operator
\[
D_{\mathrm{sho}}(\alpha) = \exp\left( \alpha\, a^\dagger - \alpha^* a \right) \quad \text{with} \quad |\alpha\rangle = D_{\mathrm{sho}}(\alpha)\, |0\rangle. \tag{2.37}
\]
Writing (2.37) in terms of its generator Υ(α),
\[
D_{\mathrm{sho}}(\alpha) = \exp\left( -i \Upsilon(\alpha) \right), \tag{2.38}
\]
we find that
\[
\Upsilon(\alpha) = i \left( \alpha\, a^\dagger - \alpha^* a \right). \tag{2.39}
\]
Note that as gq(k) is a distribution of coherent amplitudes over all plane wave modes,
we make the correspondence
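\[
\alpha^* a \rightarrow g^*_q(\mathbf{k})\, a_q(\mathbf{k}). \tag{2.40}
\]
But as this is a pointwise weighting over each plane wave, we define the total field
operators a[g] and a†[g] to be
\[
a[\mathbf{g}] \equiv \sum_q \int d^3k\, g^*_q(\mathbf{k})\, a_q(\mathbf{k}) \tag{2.41}
\]
and
\[
a^\dagger[\mathbf{g}] \equiv \sum_q \int d^3k\, g_q(\mathbf{k})\, a^\dagger_q(\mathbf{k}). \tag{2.42}
\]
These are sometimes called smeared creation and annihilation operators as they have
been spread over a range of k values. By applying the commutation relations (2.10) and (2.11),
it is easy to see that
\[
\left[ a[\mathbf{f}],\, a^\dagger[\mathbf{g}] \right] = \int d^3k\, \mathbf{f}^*(\mathbf{k}) \cdot \mathbf{g}(\mathbf{k}). \tag{2.43}
\]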
An important property that we will use is that by the linearity of the integral over
d3k we have that, for complex coefficients c1 and c2
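\[
c_1\, a^\dagger[\mathbf{f}] + c_2\, a^\dagger[\mathbf{g}] = a^\dagger[c_1 \mathbf{f} + c_2 \mathbf{g}] \tag{2.44}
\]
and
\[
c_1\, a[\mathbf{f}] + c_2\, a[\mathbf{g}] = a[c_1^* \mathbf{f} + c_2^* \mathbf{g}]. \tag{2.45}
\]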
In other words a†[·] is linear in its argument and a[·] is anti-linear. The continuous
analog of the displacement operator, called a Weyl operator, is
\[
W[\mathbf{g}] \equiv \exp\left( a^\dagger[\mathbf{g}] - a[\mathbf{g}] \right) \tag{2.46}
\]
and the coherent state ψ[g] is given by
ψ[g] = W[g] |∅〉. (2.47)
Applying the Zassenhaus formula to the Weyl operator shows that
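\[
W[\mathbf{g}] = \exp\left( a^\dagger[\mathbf{g}] \right) \exp\left( -a[\mathbf{g}] \right) \exp\left( -\frac{1}{2} \int d^3k\, |\mathbf{g}(\mathbf{k})|^2 \right). \tag{2.48}
\]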
Note that because a[g] |∅〉 = 0 for any g, this implies that
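\[
\psi[\mathbf{g}] = \exp\left( -\frac{1}{2} \int d^3k\, |\mathbf{g}(\mathbf{k})|^2 \right) \exp\left( a^\dagger[\mathbf{g}] \right) |\emptyset\rangle. \tag{2.49}
\]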
When proving limits involving sequences of coherent states, it is often more conve-
nient to work with unnormalized state vectors. Therefore, it is common to define an
exponential vector
e[g] ≡ exp(a†[g]) |∅〉. (2.50)
One particularly useful relationship that we will end up applying repeatedly is that
\[
a[\mathbf{f}]\, \psi[\mathbf{g}] = \int d^3k\, \mathbf{f}^*(\mathbf{k}) \cdot \mathbf{g}(\mathbf{k})\; \psi[\mathbf{g}], \tag{2.51}
\]
i.e., ψ[g] is an eigenstate of any smeared annihilation operator a[f ], regardless of the
smearing function f . If, however, the functions f and g are orthogonal, then that
eigenvalue could very well be zero.
2.2.3 Fock space
A number of useful relations can be derived involving the Weyl displacement opera-
tors and the exponential vectors. Before doing so it is necessary to introduce some
of the formal and algebraic properties of second quantization and Fock spaces. Note
that if we take f and g to be any square integrable complex functions, then the right
hand side of (2.43) forms an inner product on a Hilbert space of wave packets. We
will denote the inner product as,
\[
\langle \mathbf{g}, \mathbf{f} \rangle \equiv \sum_q \int d^3k\, g^*_q(\mathbf{k})\, f_q(\mathbf{k}) \tag{2.52}
\]
and the Hilbert space of wave packets as
\[
h \equiv \left\{ \mathbf{g}(\mathbf{k}) : \langle \mathbf{g}, \mathbf{g} \rangle < \infty \right\}. \tag{2.53}
\]
A Fock space F is a total Hilbert space describing an unknown and possibly
unbounded number of particles that are each represented by states in a single particle
Hilbert space h. If we are given a single particle from h, then the full Hilbert space
of two such particles is the tensor product of two such Hilbert spaces. Likewise for
three particles, there would be a three-fold product. We will denote the joint space of
n particles as h⊗n = h ⊗ h ⊗ · · · ⊗ h where there are n such factors. In terms of a
total space with an indeterminate number of particles, the subspaces containing
different numbers of particles are mutually orthogonal. Thus the total space is the direct sum over
each subspace. If we take the space for zero particles to be the complex numbers,
(h⊗0 = C), then the full Fock space is given by
\[
F_{\mathrm{full}}(h) = \bigoplus_{n=0}^{\infty} h^{\otimes n}. \tag{2.54}
\]
The reason for the notation Ffull(h) is that if h is a Hilbert space of bosonic particles
then only states that are symmetric under particle exchange will apply. We denote
the symmetric subspace of h⊗n by h⊗s n. So the symmetric Fock space is given by
\[
F_{\mathrm{sym}}(h) = \bigoplus_{n=0}^{\infty} h^{\otimes_s n}. \tag{2.55}
\]
We are strictly interested in bosonic particles, so throughout this document when we
refer to F (·) we are referring to the symmetric Fock space.
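
To make the difference between the full and symmetric Fock spaces concrete, the sketch below counts dimensions for a toy case. It assumes a finite, d-dimensional single particle space as a stand-in for the infinite-dimensional space of wave packets h, and truncates the direct sum at a maximum particle number; both are simplifications introduced only for this illustration. The n-particle sector of the full space then has dimension d^n, while the symmetric sector has dimension C(n + d − 1, n).

```python
from math import comb


def full_sector_dim(d, n):
    """Dimension of the n-fold tensor product of a d-dimensional space."""
    return d ** n


def symmetric_sector_dim(d, n):
    """Dimension of the symmetric subspace of the n-fold tensor product."""
    return comb(n + d - 1, n)


def truncated_fock_dim(d, n_max, symmetric=True):
    """Dimension of the direct sum over sectors n = 0, ..., n_max."""
    dim = symmetric_sector_dim if symmetric else full_sector_dim
    return sum(dim(d, n) for n in range(n_max + 1))


if __name__ == "__main__":
    d = 3
    for n in range(5):
        print(n, full_sector_dim(d, n), symmetric_sector_dim(d, n))
```

The n = 0 sector has dimension one in both cases, consistent with taking the zero-particle space to be the complex numbers.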
For a single simple harmonic oscillator, the coherent state |α〉 is expressed in
terms of the number states |n〉 as
\[
|\alpha\rangle = \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, e^{-|\alpha|^2/2}\, |n\rangle. \tag{2.56}
\]
The equivalent expression for the wave packet state ψ[f ] is
\[
\psi[\mathbf{f}] = e^{-\| \mathbf{f} \|^2/2} \bigoplus_{n=0}^{\infty} \frac{\mathbf{f}^{\otimes n}}{\sqrt{n!}}. \tag{2.57}
\]
From the relation that 〈f⊗n, g⊗n〉 = 〈f , g〉n, we have
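\[
\langle \psi[\mathbf{f}] | \psi[\mathbf{g}] \rangle = \exp\left( -\tfrac{1}{2} \left( \| \mathbf{f} \|^2 + \| \mathbf{g} \|^2 \right) + \langle \mathbf{f}, \mathbf{g} \rangle \right) \tag{2.58}
\]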
or equivalently
\[
\langle e[\mathbf{f}] | e[\mathbf{g}] \rangle = e^{\langle \mathbf{f}, \mathbf{g} \rangle}. \tag{2.59}
\]
A number of useful properties involving the Weyl displacement operators and the
exponential vectors are the following:
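• The Weyl operators obey the composition law
\[
W[\mathbf{g}]\, W[\mathbf{f}] = \exp\left( -\tfrac{1}{2} \left( \langle \mathbf{g}, \mathbf{f} \rangle - \langle \mathbf{f}, \mathbf{g} \rangle \right) \right) W[\mathbf{g} + \mathbf{f}]. \tag{2.60}
\]
• The action of the Weyl operator on an exponential vector is
\[
W[\mathbf{g}]\, e[\mathbf{f}] = e^{-\langle \mathbf{g}, \mathbf{f} \rangle - \| \mathbf{g} \|^2/2}\, e[\mathbf{f} + \mathbf{g}]. \tag{2.61}
\]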
• The linear span of all the exponential vectors (and equivalently the coherent
states) is dense in the symmetric Fock space F (h), meaning that any state
in F (h) can be represented by a limiting sequence of a linear combination of
exponential vectors [45].
• Written in terms of the single particle inner product, the exponential vector
e[g] is an eigenvector of the annihilation operator a[f ] with,
a[f ] e[g] = 〈f , g〉 e[g]. (2.62)
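
The overlap formula of Eq. (2.58) can be checked numerically in its single-mode form, where the wave packets f and g are replaced by complex numbers α and β and the inner product 〈f, g〉 becomes α*β. The sketch below builds the truncated number-state expansion of Eq. (2.56) and compares the resulting overlap with the closed form. The truncation level and function names are assumptions made for this illustration.

```python
import cmath
import math


def coherent_amplitudes(alpha, n_max):
    """Number-state amplitudes of |alpha> from Eq. (2.56), truncated at n_max."""
    amps, term = [], complex(1.0)
    pref = math.exp(-abs(alpha) ** 2 / 2.0)
    for n in range(n_max + 1):
        amps.append(pref * term)
        term = term * alpha / math.sqrt(n + 1)  # now alpha^(n+1) / sqrt((n+1)!)
    return amps


def overlap_numeric(alpha, beta, n_max=60):
    """<alpha|beta> computed from the truncated number-state expansions."""
    a = coherent_amplitudes(alpha, n_max)
    b = coherent_amplitudes(beta, n_max)
    return sum(x.conjugate() * y for x, y in zip(a, b))


def overlap_closed_form(alpha, beta):
    """Single-mode analog of Eq. (2.58), with <f, g> replaced by conj(alpha) * beta."""
    alpha, beta = complex(alpha), complex(beta)
    return cmath.exp(-0.5 * (abs(alpha) ** 2 + abs(beta) ** 2)
                     + alpha.conjugate() * beta)


if __name__ == "__main__":
    alpha, beta = 0.7 + 0.3j, -0.4 + 1.1j
    print(overlap_numeric(alpha, beta))
    print(overlap_closed_form(alpha, beta))
```

For amplitudes of order one the truncation error at 60 quanta is far below double precision, so the two values should agree to rounding error.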
2.2.4 A basis independent expression for the wave packet
inner product
An alternative to expressing ‖g(k, t)‖2 in terms of the characteristic parameters,
v, k1, etc. is to relate it to a basis independent expression involving the physical
(classical) fields E and A. From Eq. (2.29) we can see that
\[
\mathbf{g}(\mathbf{k}, t) = \sqrt{\frac{2 \varepsilon_0 c |\mathbf{k}|}{\hbar}}\, \mathcal{A}^{(+)}(\mathbf{k}, t) \tag{2.63}
\]
and that
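\[
\| \mathbf{g}(\mathbf{k}, t) \|^2 = \frac{2 \varepsilon_0}{\hbar} \int d^3k\, c |\mathbf{k}|\, \mathcal{A}^{(+)\,*}(\mathbf{k}, t) \cdot \mathcal{A}^{(+)}(\mathbf{k}, t). \tag{2.64}
\]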
The presence of the factor c |k| makes this expression inherently tied to the k basis
and not immediately expressible in terms of real space quantities. However, by recognizing
that in the Coulomb gauge ℰ = −∂𝒜/∂t and 𝒜(+)(k, t) = 𝒜(+)(k, 0) e^{−ic|k|t}, we
have the equality
\[
c |\mathbf{k}|\, \mathcal{A}^{(+)}(\mathbf{k}, t) = -i\, \mathcal{E}^{(+)}(\mathbf{k}, t). \tag{2.65}
\]
Substituting this relation into Eq. (2.64),
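\[
\| \mathbf{g}(\mathbf{k}, t) \|^2 = \frac{i 2 \varepsilon_0}{\hbar} \int d^3k\, \mathcal{E}^{(+)\,*}(\mathbf{k}, t) \cdot \mathcal{A}^{(+)}(\mathbf{k}, t). \tag{2.66}
\]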
This expression is basis independent, in the sense that we can take the inverse trans-
forms to arrive at
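\[
\| \mathbf{g} \|^2 = \frac{i 2 \varepsilon_0}{\hbar} \int d^3x\, \mathcal{E}^{(+)\,*}(\mathbf{x}, t) \cdot \mathcal{A}^{(+)}(\mathbf{x}, t). \tag{2.67}
\]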
In [42], Smith and Raymer derive a Dirac quantization scheme for a photon wave
function, equivalent to the more standard expressions reviewed in Sec. 2.1.1. In that
work they assume that for each polarization vector q there exists a countable set
of complete scalar orthonormal wave packets {g_{j q}(k)}, which therefore satisfy the
properties
\[
\sum_j g_{j q}(\mathbf{k})^*\, g_{j q}(\mathbf{k}') = \delta(\mathbf{k} - \mathbf{k}') \quad \text{and} \quad \int d^3k\, g_{j q}(\mathbf{k})^*\, g_{j' q}(\mathbf{k}) = \delta_{j j'}. \tag{2.68}
\]
They then observe that the classical electric fields {ℰ(+)_{j q}(x, t)} in correspondence to
these wave packets, via Eq. (2.20), are no longer orthogonal in a real space overlap
integral precisely because of the weighting factor of √|k|,
\[
\int d^3x\, \mathcal{E}^{(+)\,*}_{j q}(\mathbf{x}, t) \cdot \mathcal{E}^{(+)}_{j' q'}(\mathbf{x}, t) \neq \delta_{j j'}\, \delta_{q, q'}. \tag{2.69}
\]
They also observe that if instead one considers the overlap with the vector potential
then the orthogonality is preserved, due to the cancelation of the factors of √|k|.
This is precisely the statement that if 〈gj(k, t), gj′(k, t)〉 = δj j′ , then
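\begin{align}
\langle \mathbf{g}_j(\mathbf{k}, t), \mathbf{g}_{j'}(\mathbf{k}, t) \rangle
  &= \frac{i 2 \varepsilon_0}{\hbar} \int d^3x\, \mathcal{E}^{(+)\,*}_{j}(\mathbf{x}, t) \cdot \mathcal{A}^{(+)}_{j'}(\mathbf{x}, t) \nonumber \\
  &= \frac{-i 2 \varepsilon_0}{\hbar} \int d^3x\, \mathcal{A}^{(+)\,*}_{j}(\mathbf{x}, t) \cdot \mathcal{E}^{(+)}_{j'}(\mathbf{x}, t) = \delta_{j, j'}. \tag{2.70}
\end{align}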
In quantizing a photon’s wave function Smith and Raymer consider E (+)(x, t) to be
the fundamental single particle wave functions. Secondly they observe that in order
to preserve orthogonality in the real space inner product the dual vectors are
not E (+) ∗(x, t) but are instead proportional to A(+) ∗(x, t).
In this work we will continue to view the single particle vectors to be the wave
packet functions g and not the associated classical electric field. This is for two
reasons. Firstly, it is mathematically convenient that the vector dual to the wave
packet g is simply its complex conjugate. The second reason is that we continue
to treat g as an analogy with the simple harmonic oscillator’s coherent state and
the vector potential A and the electric field E are in correspondence with X and P
quadratures.
2.2.5 Fock space and stochastic processes
By structuring the Hilbert space of the free EM field as a Fock space defined over
a single particle Hilbert space h, we can now define a quantum stochastic process
and a quantum stochastic calculus. Consider again the example of a coherent laser
pulse propagating towards a photon counter. A quantum description of a traveling
wave laser pulse is a coherent wave packet state ψ[g] where g(k, t) is related to the
classical field by Eq. (2.29). Imagine a perfect space-fixed detector that is capable
of returning a voltage directly proportional to the total energy in a given classical
wave packet g(k, t). Furthermore, imagine that this detector is activated between
the times [t0, t1], and after this interval the voltage is read. If the “entirety” of an
incident wave packet g could be absorbed in that time, then the detector should be
modeled as making a projective measurement on the part of Fock space containing
ψ[g]. Depending upon the details of the detector, it could likely have recorded
pulses that were similar enough to g, either in magnitude, temporal profile or carrier
frequency. For instance, a 100% efficient detector with a linear response should be
able to measure pulses with 2 µW of average laser power just as well as pulses with
200 mW of power. By modeling a physical measurement as a Hermitian operator
acting upon some Fock space, we need to define the set of possible wave packets
the detector could have completely measured. Fig. 2.1 shows a schematic where a
paraxial laser pulse is focused upon a gated photo-detector.
In the second quantization formalism we can give a mathematical chain from
classical wave packets to field operators. In the time interval [t0, t1] a fixed detector
could projectively measure some set {g} of incident wave packets and the linear
span of these wave packets forms a subspace h[t0,t1] ⊂ h. In turn h[t0,t1] defines
a subspace F (h[t0,t1]) ⊂ F (h). Furthermore there exist operators O[t0,t1] that act
nontrivially on coherent states ψ[g] ∈ F (h[t0,t1]) but as the identity for any ψ[g⊥] with
g⊥ ∉ h[t0,t1]. An operator X[t0,t1] identified by this procedure then defines a quantum
stochastic process, by considering the family of operators {X[t0,t] : t0 ≤ t < ∞}. The
requirement that X[t0,t] acts trivially on coherent states ψ[g⊥] defines a process that is
time-adapted, in direct analogy with a time-adapted classical stochastic process (see
Sec. 3.1.1 for the definition of a time-adapted classical stochastic process). Turning
this qualitative procedure into a mathematically sound object requires explicitly
constructing h[t0,t1], which means defining what it means to measure the entirety
of a wave packet in a finite time interval. While naïvely this may seem trivial, in
practice it intersects with the problem of defining a localizable photon in quantum
field theory. We will illustrate why this is an issue next.

[Figure 2.1 schematic, titled "Paraxial Measurement Model". Labels: Detector, Color Filter, Collection Lens, Temporal Pulse, Paraxial Mode, h[t0, t1]. Inset: Pulse Amplitude of the Incident Envelope versus Time, with t0 and t1 marked.]

Figure 2.1: A Model of a Paraxial Measurement. A photo-detector is positioned
relative to an optical system which defines a paraxial beam with a characteristic
wavelength and mode profile. Here the optical system is defined simply by a
focusing lens and color filter with the beam schematically indicated by lines
of constant intensity. The detector is activated between times t0 and t1 which
corresponds, at time t = 0, to a pulse localized in space in the region ∆z = c(t1 − t0).
For a perfect detector with a linear response, the integrated output current
will be proportional to the total pulse energy and can be modeled as making a
projective measurement on the sub-Fock space of pulses localized between these
times.
2.2.6 Localized wave packets and stochastic processes
While the canonical quantization of the free field is most easily performed in the
Fourier domain, the mathematical structure of the second quantized Fock space
F (h) is generally basis independent. The operators a[g] and a†[f ] can be related to
the coherent states ψ[h] without any reference to the fact that the wave packets are
originally defined with respect to $\mathbf{k}$. Any unitary transformation of $g$ is an equally
valid expression of the wave packet state in that the Hilbert space of wave packets
$\mathfrak{h} = \{g : \langle g, g\rangle < \infty\}$ is basis independent. The only element that depends upon
$g$ being defined in the Fourier domain is its relationship to the spatial profile of the
mode function $u^{(+)}(\mathbf{x}, t)$. But as we have defined the wave packets in the Fourier
domain, it is not immediately apparent what effect the constraint $\|g\|^2 < \infty$ has on
$u^{(+)}(\mathbf{x}, t)$. One drastic result of this constraint is that it prohibits one from defining
fields that are strictly localized in space [41, 46].
To see why this is true, consider a one-dimensional case where we wish to define a
square wave pulse of length $L$, with a carrier frequency $\omega_0 = c\,k_0$. The mode
function for such a pulse is
\[
u^{(+)}_{\mathrm{loc}}(z, t) = \chi_{[0,L]}(z - ct)\, \exp\big(+i k_0 (z - ct)\big). \tag{2.71}
\]
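Taking the spatial Fourier transform shows that
\[
u^{(+)}_{\mathrm{loc}}(k, t) = \frac{L}{\sqrt{2\pi}}\, \mathrm{sinc}\big(\tfrac{1}{2} L (k - k_0)\big)\, \exp\big(-i \tfrac{1}{2}(k - k_0) L - i c k\, t\big) \tag{2.72}
\]
and from Eq. (2.29), we then have $g_{\mathrm{loc}}(k, t) \propto \sqrt{k}\, u^{(+)}_{\mathrm{loc}}(k, t)$. The $\sqrt{k}$ factor makes
all the difference, because if we try to calculate the norm we find
\[
\| g_{\mathrm{loc}}(k, t) \|^2 \propto \int_{-\infty}^{\infty} dk\, |k|\, \mathrm{sinc}^2\big(\tfrac{1}{2} L (k - k_0)\big) = \infty. \tag{2.73}
\]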
The failure of this calculation stems from the fact that the indicator function χ[a,b] is
a discontinuous function and this discontinuity presents itself in the Fourier domain
by this divergence. This example implies that there is a nonlocalizable property of
photon wave packets. This nonlocal property of a photon wave function has been
studied by many authors, with various definitions for a photon’s wave function, see
[42] for a pertinent discussion. While this example does not show that any localized
wave packet suffers from this or a similar problem, this is indeed the case. In [46],
Bialynicki-Birula proves that the energy density of a photon can be localized no
better than an exponential function, exp(−f(r)), where f(r) grows slightly slower
than a linear function in r.
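The divergence in Eq. (2.73) can be checked numerically. The following is a minimal sketch, assuming illustrative values $L = 1$ and $k_0 = 5$ in arbitrary units and a simple midpoint rule; the function names are hypothetical and not part of the original derivation. Because the sinc-squared tails fall off only as $1/k^2$, the integrand averages to roughly $2/(L^2 |k|)$ at large $|k|$, so each decade added to the cutoff should add about $(4/L^2)\ln 10$ to the truncated norm, i.e. the growth is logarithmic and never saturates.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def truncated_norm(cutoff, length=1.0, k0=5.0, dk=0.02):
    """Midpoint-rule estimate of the integral of |k| sinc^2(L (k - k0) / 2) over [-cutoff, cutoff]."""
    n = int(round(2.0 * cutoff / dk))
    total = 0.0
    for i in range(n):
        k = -cutoff + (i + 0.5) * dk
        total += abs(k) * sinc(0.5 * length * (k - k0)) ** 2
    return total * dk


def growth_per_decade(cutoffs=(100.0, 1000.0, 10000.0)):
    """Increase of the truncated norm each time the cutoff grows by a factor of ten."""
    values = [truncated_norm(c) for c in cutoffs]
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    # Each increment should sit close to 4 ln(10) for L = 1 (an assumption of this sketch).
    for increment in growth_per_decade():
        print(increment, 4.0 * math.log(10.0))
```

A smooth envelope removes this behavior because its Fourier transform decays faster than $1/k$, which is the motivation for the smoothed indicator functions introduced later in this chapter.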
At this point, we must make some approximation, thereby admitting localized
states of light. Ultimately this means that we will relax the measurement window
to include functions localized to within exponentially damped tails. However, we gain
physical insight by considering a temporal rescaling so that on a timescale that is
long compared to an optical period, a smoothly varying function can appear to be
a localized discontinuous function. We will also show that through this rescaling,
one obtains all the familiar approximations in quantum optics, namely the Markov,
quasi-monochromatic, and rotating wave approximations. It also provides a gateway
for defining quantum white noise and, equivalently, the necessary conditions for applying
a quantum Wong-Zakai theorem to arrive at a physical realization of quantum
stochastic calculus.
2.3 Paraxial Envelopes and Measurable Pulses
Before constructing temporally localized wave packets we must first address the spatial/temporal
decomposition indicated in Fig. 2.1. An excellent mathematical description
of the focused collection of light by a series of thin lenses is to model
the system paraxially, where a coherent plane wave propagates along the optical
axis but is spatially and temporally modulated by a slowly varying envelope function.
This envelope function describes how the plane wave is localized to the optical
axis as well as how the phase fronts are distorted by the optical system [47, 48].
Appendix A reviews the derivation of the paraxial wave equation, which describes
the propagation of a slowly varying envelope, and computes its spatial Fourier
transform.
A paraxial and quasi-monochromatic wave is characterized by the complex function
\[
U^{(+)}(\mathbf{x}, t) = f(t_r)\, u_T^{(+)}(\mathbf{x}_T, z)\, e^{-i \omega_0 t_r}. \tag{2.74}
\]
Here $\mathbf{e}_z$ is the axis of propagation, $\mathbf{x}_T$ are the remaining transverse coordinates and
$t_r = t - z/c$ is the retarded time. The paraxial mode function $u_T^{(+)}(\mathbf{x}_T, z)$ describes
how the carrier wave (with angular frequency $\omega_0$ and wave number $k_0$) is modulated
as it propagates along the optical axis. Note that this is a time-independent quantity.
Any nontrivial time dependence is given by the temporal envelope function $f(t_r)$,
which decouples from the paraxial mode $u_T^{(+)}$. The only requirement is that $f(t_r)$
be slowly varying, $\left| \frac{d}{dt} f \right| \ll \omega_0 |f|$, ensuring the full solution is quasi-monochromatic.
The problem of finding the space of measurable wave packets now translates into
finding the temporal envelopes $f(t_r)$ that “fit” in the measurement window $[t_0, t_1]$.
Identifying the appropriate spatial mode function $u_T^{(+)}(\mathbf{x}_T, z)$ for a given optical
system is simply a problem of classical optics and is well modeled by a Hermite-Gaussian
mode function [48]. Here we are only concerned with the fact that such
a function exists, is well defined, and has a given “transverse area”. In free
space conservation of energy requires that the total power passing through a plane
transverse to the optical axis be conserved, which manifests through the property
that for any paraxial mode we have
\[
\int_{\mathbb{R}^2} d^2x_T\, \left| u_T^{(+)}(\mathbf{x}_T, z) \right|^2 \equiv \sigma_T \tag{2.75}
\]
and that $\sigma_T$ is independent of $z$ for any finite $z$. While the distribution of energy
in the transverse plane can vary due to diffraction, the total power passing through
an infinite transverse plane will be conserved. This transverse area can be combined
with the square-integrated temporal duration
\[
\tau \equiv \int dt\, |f(t)|^2, \tag{2.76}
\]
to construct the total mode volume
\[
v = c\, \tau\, \sigma_T. \tag{2.77}
\]
We have already shown how a wave packet $g$ is related to the spatial Fourier
transform of a classical vector potential, $A^{(+)}(\mathbf{k}, t)$, and how that can be expressed
in terms of a unitless mode function $u^{(+)}(\mathbf{x}, t)$. The spatial Fourier transform of the
paraxial mode function is given by
\[
U^{(+)}(\mathbf{k}, t) = c\, f(\omega(\mathbf{k}) - \omega_0)\, u_T^{(+)}(\mathbf{k}_T, 0)\, e^{-i \omega(\mathbf{k}) t}, \tag{2.78}
\]
where $f(\omega(\mathbf{k}) - \omega_0)$ is the temporal Fourier transform of the pulse envelope, $u_T^{(+)}(\mathbf{k}_T, 0)$ is the
spatial transform of the mode function with respect to the transverse coordinates $\mathbf{x}_T$
(evaluated at $z = 0$) and $\omega(\mathbf{k})$ is the approximate frequency
\[
\omega(\mathbf{k}) \equiv c\, |\mathbf{k}| \approx c \left( \frac{|\mathbf{k}_T|^2}{2 k_0} + k_z \right). \tag{2.79}
\]
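As a quick sanity check on the approximation in Eq. (2.79), the sketch below compares $c|\mathbf{k}|$ with its paraxial form for small transverse wave numbers. It assumes $k_z = k_0$ and units with $c = k_0 = 1$; the function names and the chosen ratios $|\mathbf{k}_T|/k_0$ are illustrative only.

```python
import math


def exact_frequency(k_transverse, k_z, c=1.0):
    """Exact free-space dispersion, omega = c |k|."""
    return c * math.hypot(k_transverse, k_z)


def paraxial_frequency(k_transverse, k_z, k0, c=1.0):
    """Paraxial form of Eq. (2.79): omega ~ c (|k_T|^2 / (2 k0) + k_z)."""
    return c * (k_transverse ** 2 / (2.0 * k0) + k_z)


def relative_error(ratio, k0=1.0):
    """Relative error of the paraxial form for |k_T| = ratio * k0 and k_z = k0 (assumed)."""
    exact = exact_frequency(ratio * k0, k0)
    return abs(paraxial_frequency(ratio * k0, k0, k0) - exact) / exact


if __name__ == "__main__":
    for ratio in (0.01, 0.03, 0.1):
        print(ratio, relative_error(ratio))
```

The error scales as the fourth power of $|\mathbf{k}_T|/k_0$, which is why the approximation is reliable for well-collimated beams.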
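Eq. (2.29) relates $g(\mathbf{k}, t)$ to $A^{(+)}(\mathbf{k}, t)$ and so
\[
g(\mathbf{k}, t) = A_0 \sqrt{\frac{2 \varepsilon_0\, \omega(\mathbf{k})}{\hbar}}\; c\, f(\omega(\mathbf{k}) - \omega_0)\, u_T^{(+)}(\mathbf{k}_T)\, e^{-i \omega(\mathbf{k}) t}. \tag{2.80}
\]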
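In Sec. 2.2.1, we eliminated the constant $A_0$ in favor of an expression in terms
of characteristic parameters, namely the mode volume $v$ and the characteristic wave
number $k_1$. Here we abandon $k_1$ in favor of a frequency $\omega_1 = c k_1$, which is equal to
\[
\omega_1 = \frac{1}{v} \int d^3k\; \omega(\mathbf{k}) \left| c\, f(\omega(\mathbf{k}) - \omega_0)\, u_T^{(+)}(\mathbf{k}_T) \right|^2. \tag{2.81}
\]
Calculating this integral is easiest with the change of variables
$\{\mathbf{k}_T, k_z\} \to \{\mathbf{k}_T, \omega(\mathbf{k})\}$, from which we find that
\[
\omega_1 = \int \frac{d^2k_T}{\sigma_T} \left| u_T^{(+)}(\mathbf{k}_T) \right|^2 \int \frac{d\omega(\mathbf{k})}{\tau}\, \omega(\mathbf{k}) \left| f(\omega(\mathbf{k}) - \omega_0) \right|^2. \tag{2.82}
\]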
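Implicit in the paraxial approximation is the requirement that $\left| u_T^{(+)}(\mathbf{k}_T) \right|^2 \to 0$ as
$|\mathbf{k}_T|^2 \to \infty$. This fall off implies that we can treat the factor $c\, |\mathbf{k}_T|^2 / 2 k_0$ in $\omega(\mathbf{k})$
as a finite and independent offset to the $d\omega(\mathbf{k})$ integral and not consider how $\omega(\mathbf{k})$
converges as $k_z \to -\infty$ with $|\mathbf{k}_T|^2 \to \infty$. We can then make another change of
variables $\omega(\mathbf{k}) \to \nu + \omega_0$ so that
\[
\omega_1 = \int \frac{d^2k_T}{\sigma_T} \left| u_T^{(+)}(\mathbf{k}_T) \right|^2 \int \frac{d\nu}{\tau}\, (\omega_0 + \nu) \left| f(\nu) \right|^2. \tag{2.83}
\]
When $f(t)$ is real-valued, it is simple to show that $|f(\nu)|^2$ is an even function of $\nu$ and
therefore has zero mean. In that case we have
\[
\omega_1 = \omega_0 \int \frac{d\nu}{\tau} \left| f(\nu) \right|^2 + \int \frac{d\nu}{\tau}\, \nu \left| f(\nu) \right|^2 = \omega_0 \tag{2.84}
\]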
as one would intuitively expect. In the case of a complex-valued $f(t)$, $|f(\nu)|^2$ will
not in general have zero mean. In this more general case,
\[
\omega_1 = \omega_0 \left( 1 + \int \frac{d\nu}{\tau}\, \frac{\nu}{\omega_0} \left| f(\nu) \right|^2 \right). \tag{2.85}
\]
For the one-dimensional localized pulse, Sec. 2.2.6 demonstrated that the second
integral is infinite. By writing out the integral expressions for the Fourier transforms and then
integrating by parts, we can show that
\[
\int d\nu\, \frac{\nu}{\omega_0} \left| f(\nu) \right|^2 = -i \int dt\, \frac{1}{\omega_0} \left( \frac{d}{dt} f^*(t) \right) f(t). \tag{2.86}
\]
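By the slowly varying envelope approximation, we require that $\left| \frac{d}{dt} f(t) \right| \ll \omega_0 |f(t)|$,
and in order for the quasi-monochromatic regime to hold this integral must be a
small correction factor. Therefore, we find that
\[
g(\mathbf{k}, t) \approx \|g\| \sqrt{\frac{\omega(\mathbf{k})\, c}{\omega_0\, \tau\, \sigma_T}}\; f(\omega(\mathbf{k}) - \omega_0)\, u_T^{(+)}(\mathbf{k}_T, 0)\, e^{-i \omega(\mathbf{k}) t}. \tag{2.87}
\]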
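The size of the correction in Eqs. (2.85) and (2.86) can be illustrated with a concrete envelope. The sketch below assumes a hypothetical test envelope $f(t) = e^{-t^2/2w^2} e^{-i\Delta t}$ and the Fourier sign convention of Eq. (2.104), and evaluates the time-domain side of Eq. (2.86) divided by $\tau$, using central differences for the derivative. Under these assumptions the result should equal the detuning $\Delta$, so that $\omega_1 = \omega_0 + \Delta$, while a real envelope ($\Delta = 0$) gives no shift.

```python
import cmath
import math


def envelope(t, width, detuning):
    """Hypothetical test envelope: a Gaussian of the given width with a linear phase ramp."""
    return math.exp(-t * t / (2.0 * width * width)) * cmath.exp(-1j * detuning * t)


def carrier_shift(width=1.0, detuning=0.3, steps=20000, span=10.0):
    """Evaluate (1/tau) * (-i) * integral of (d f*/dt) f dt, the time-domain form of Eq. (2.86)."""
    dt = 2.0 * span * width / steps
    tau = 0.0
    moment = 0.0j
    for i in range(steps):
        t = -span * width + (i + 0.5) * dt
        f = envelope(t, width, detuning)
        # Central difference for df/dt; conjugating it gives d f*/dt.
        df = (envelope(t + 0.5 * dt, width, detuning)
              - envelope(t - 0.5 * dt, width, detuning)) / dt
        tau += abs(f) ** 2 * dt
        moment += -1j * df.conjugate() * f * dt
    return moment / tau


if __name__ == "__main__":
    print(carrier_shift(detuning=0.3))  # real part should be close to the detuning
    print(carrier_shift(detuning=0.0))  # a real envelope should give no shift
```

In the quasi-monochromatic regime $\Delta \ll \omega_0$, which is the sense in which the integral is a small correction.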
2.3.1 Paraxial wave packets in the time domain
Even in a quasi-monochromatic regime, the wave packet $g$ is still tied to the Fourier
basis due to the factor of $\sqrt{\omega(\mathbf{k})}$. We just showed that when $f(t)$ is real-valued or
very slowly varying then this factor plays no role in calculating $\|g\|^2$. Here we would
like to express the inner product between wave packets that share the same spatial mode
in terms of real space coordinates in order to observe what effect this “nonlocal” factor
has on their temporal distinguishability. If we are able to make the approximation
that $\omega(\mathbf{k}) \approx \omega_0$ for a family of wave packets then there will be a simple unitary
relationship between wave packets in the real and Fourier domains. The goal of this
section is to identify this family.
Consider the two wave packets $g_1(\mathbf{k}, t)$ and $g_2(\mathbf{k}, t)$ that share the same paraxial
mode function $u_T^{(+)}(\mathbf{x}_T, z)$ and carrier frequencies, but have differing temporal profiles
$f_1(t_r)$ and $f_2(t_r)$. For simplicity we will assume that the corrective factor of Eq. (2.86)
is small and so Eq. (2.87) is valid for each wave packet. If we then calculate the
unequal time inner product, we have that
\[
\langle g_1(\mathbf{k}, t_1), g_2(\mathbf{k}, t_2) \rangle = \frac{\|g_1\|\, \|g_2\|}{\sqrt{\tau_1 \tau_2}} \int d^3k\, \frac{\omega(\mathbf{k})\, c}{\omega_0\, \sigma_T} \left| u_T^{(+)}(\mathbf{k}_T, 0) \right|^2 f_1^*(\omega(\mathbf{k}) - \omega_0)\, f_2(\omega(\mathbf{k}) - \omega_0)\, e^{-i \omega(\mathbf{k})(t_2 - t_1)}. \tag{2.88}
\]
By again making the change of variables $\{\mathbf{k}_T, k_z\} \to \{\mathbf{k}_T, \nu\}$ with $\nu = \omega(\mathbf{k}) - \omega_0$ we
are able to integrate out the transverse degrees of freedom to arrive at
\[
\langle g_1(\mathbf{k}, t_1), g_2(\mathbf{k}, t_2) \rangle = \frac{\|g_1\|\, \|g_2\|}{\sqrt{\tau_1 \tau_2}} \int d\nu\, \frac{(\omega_0 + \nu)}{\omega_0}\, f_1^*(\nu)\, f_2(\nu)\, e^{-i(\omega_0 + \nu)(t_2 - t_1)}. \tag{2.89}
\]
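In analogy with the convolution theorem, it is easy to show that
\[
\int d\nu\, f_1^*(\nu)\, f_2(\nu)\, e^{-i \nu t} = \int ds\, f_1^*(s)\, f_2(s + t) = f_1 \star f_2\,(t) \tag{2.90}
\]
where $f_1 \star f_2\,(t)$ is the cross-correlation function between $f_1$ and $f_2$ evaluated at time
$t$. Furthermore by repeating the integration by parts transformation from Eq. (2.86)
we have that
\[
\int d\nu\, \nu\, f_1^*(\nu)\, f_2(\nu)\, e^{-i \nu t} = -i\, \frac{df_1}{dt} \star f_2\,(t). \tag{2.91}
\]
Combining these two facts,
\[
\langle g_1(\mathbf{k}, t_1), g_2(\mathbf{k}, t_2) \rangle = \frac{\|g_1\|\, \|g_2\|}{\sqrt{\tau_1 \tau_2}}\, e^{-i \omega_0 (t_2 - t_1)} \left( f_1 \star f_2\,(t_2 - t_1) - i\, \frac{1}{\omega_0}\, \frac{df_1}{dt} \star f_2\,(t_2 - t_1) \right). \tag{2.92}
\]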
While previously we were able to show that for zero delay and a real-valued envelope
the second term would be identically zero, this is clearly not the case for different
wave packets. However, due to the slowly varying envelope approximation we know
that this must be a small correction. Ignoring this correction results in
\[
\langle g_1(\mathbf{k}, t_1), g_2(\mathbf{k}, t_2) \rangle \approx \frac{\|g_1\|\, \|g_2\|}{\sqrt{\tau_1 \tau_2}}\, e^{-i \omega_0 (t_2 - t_1)}\, f_1 \star f_2\,(t_2 - t_1). \tag{2.93}
\]
Eq. (2.93) shows that the overlap between the two wave packets is proportional to the
cross-correlation function of the temporal envelopes. Physically this is an extremely
satisfying result, because if we have the two (paraxial) field operators $a[g_1(\mathbf{k}, t_1)]$ and
$a^\dagger[g_2(\mathbf{k}, t_2)]$ then
\[
\left[ a[g_1(\mathbf{k}, t_1)],\, a^\dagger[g_2(\mathbf{k}, t_2)] \right] \propto f_1 \star f_2\,(t_2 - t_1), \tag{2.94}
\]
meaning that field operators for uncorrelated temporal envelopes commute! Furthermore,
if we can construct a wave packet $\varphi(\mathbf{k}, t)$ whose temporal envelope $\varphi(t_r)$
is (approximately) delta correlated in time then
\[
\left[ a[\varphi(\mathbf{k}, t)],\, a^\dagger[\varphi(\mathbf{k}, t')] \right] \propto \delta(t' - t). \tag{2.95}
\]
This is significant because this is the defining feature of quantum white noise, which
is discussed in Sec. 2.6. Before doing so, we will apply the results of this section to
defining $\mathfrak{h}_{[t_0,t_1]}$.
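To make the role of the cross-correlation in Eqs. (2.93) and (2.94) concrete, the sketch below evaluates the normalized overlap $f_1 \star f_2\,(t) / \sqrt{\tau_1 \tau_2}$ numerically. It assumes two identical Gaussian envelopes $e^{-t^2/2w^2}$, chosen only for illustration, for which the overlap has the closed form $e^{-t^2/4w^2}$; the function names and integration window are hypothetical.

```python
import math


def gaussian_envelope(width):
    """Return a real Gaussian envelope exp(-t^2 / (2 width^2)) as a callable."""
    return lambda t: math.exp(-t * t / (2.0 * width * width))


def cross_correlation(f1, f2, delay, t_min=-20.0, t_max=20.0, dt=0.01):
    """Midpoint estimate of (f1 * f2)(delay) = integral ds conj(f1(s)) f2(s + delay), as in Eq. (2.90)."""
    n = int(round((t_max - t_min) / dt))
    total = 0.0j
    for i in range(n):
        s = t_min + (i + 0.5) * dt
        total += complex(f1(s)).conjugate() * f2(s + delay) * dt
    return total


def normalized_overlap(f1, f2, delay):
    """Cross-correlation divided by sqrt(tau1 tau2), with tau defined as in Eq. (2.76)."""
    tau1 = cross_correlation(f1, f1, 0.0).real
    tau2 = cross_correlation(f2, f2, 0.0).real
    return cross_correlation(f1, f2, delay) / math.sqrt(tau1 * tau2)


if __name__ == "__main__":
    f = gaussian_envelope(1.0)
    for delay in (0.0, 1.0, 3.0, 6.0):
        # Compare against the closed form exp(-delay^2 / 4) for unit width.
        print(delay, abs(normalized_overlap(f, f, delay)), math.exp(-delay ** 2 / 4.0))
```

Once the delay exceeds a few envelope widths the overlap is negligible, which is the sense in which the corresponding field operators approximately commute.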
2.3.2 The measurable subspace
In the ideal situation, the measurable wave packets are the wave packets defined on
the paraxial mode $u_T^{(+)}(\mathbf{x}_T, z)$ with envelope functions $f(t)$ such that $f(s) = 0$ for
all $s \notin [t_0, t_1]$. Unfortunately, because of the problem of localization no such physical
wave packets exist. If we allow for discontinuous functions, then for any function
$g(t)$,
\[
g_{[t_0,t_1]}(t) \equiv \chi_{[t_0,t_1]}(t)\, g(t) \tag{2.96}
\]
is clearly zero for any $t \notin [t_0, t_1]$ and therefore would be an element of $\mathfrak{h}_{[t_0,t_1]}$. In
order to define approximately localized temporal envelopes we need an approximate
form of the indicator function $\chi_{[t_0,t_1]}(t)$, i.e. a smooth cutoff function. A common
choice for such a function is to convolve $\chi_{[t_0,t_1]}(t)$ with a smooth positive normalized
distribution function $\varphi^{(\sigma)}(t)$, where $\sigma$ represents the degree of localization. For a
concrete example, if $\varphi^{(\sigma)}(t)$ is a mean-zero normalized Gaussian with variance $\sigma^2$
then
\[
\begin{aligned}
\chi^{(\sigma)}_{[t_0,t_1]}(t) &\equiv \varphi^{(\sigma)} * \chi_{[t_0,t_1]}\,(t) \\
&= \int ds\, \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(t - s)^2}{2 \sigma^2} \right) \chi_{[t_0,t_1]}(s) \\
&= \frac{1}{2} \left( \mathrm{erf}\left( \frac{t - t_0}{\sqrt{2}\, \sigma} \right) - \mathrm{erf}\left( \frac{t - t_1}{\sqrt{2}\, \sigma} \right) \right).
\end{aligned} \tag{2.97}
\]
Note that because $\lim_{\sigma \to 0} \varphi^{(\sigma)}(t) = \delta(t)$, we also have $\lim_{\sigma \to 0} \chi^{(\sigma)}_{[t_0,t_1]}(t) = \chi_{[t_0,t_1]}(t)$.
Again, to maintain a quasi-monochromatic field $f(t)$ must be slowly varying. This
statement is quantified by the relation $\frac{1}{\omega_0} \left| \frac{\partial f}{\partial t} \right| \ll |f|$, and in terms of $\sigma$ this means
\[
\frac{1}{\sigma\, \omega_0} \ll 1. \tag{2.98}
\]
In quantum optical systems a carrier frequency of $\omega_0 = 2\pi \times 370$ THz is not
uncommon, and has a corresponding wavelength of $\lambda_0 = 810$ nm. A common use for a
laser at this wavelength is to generate optical pulses as short as 5 fs in duration [49]. If
we take this to be a minimum but physically realizable timescale then we would have
$(\sigma\, \omega_0)^{-1} \sim 0.08$. While a wave packet of this duration is still relatively slowly varying,
it is likely that the correction to $\omega_1$ in Eq. (2.86) could be a nonnegligible contribution,
as could other higher order effects. A convenient limit would be to set $\sigma$ such that
$(\sigma\, \omega_0)^{-1} \sim 10^{-3}$, meaning that for near-infrared wavelengths $\sigma \sim 0.1$ ps. While
this sets a physically realistic smoothing width, it does not say when $\chi^{(\sigma)}_{[t_0,t_1]}(t)$ is
a good approximation to an actual indicator function, as this requires a comparison
between the smoothing width and the overall duration. Fig. 2.2 illustrates this distinction by
plotting $\chi^{(\sigma)}_{[0,\tau]}(t)$ with $\sigma = 0.1$ ps and a series of durations, $0.5\ \mathrm{ps} \le \tau \le 100\ \mathrm{ps}$. For
visual comparison each indicator is plotted in scaled units of $\tau$. Simple inspection
shows that for intervals on the scale of $\tau \gtrsim 10$ ps a smoothing width of 0.1 ps
makes an excellent approximation to the truly discontinuous function. Note that as
$t$ extends beyond the interval $[0, \tau]$, $\chi^{(\sigma)}_{[0,\tau]}(t)$ decays like an error function scaled by
$\sigma$ and this extent is independent of $\tau$ for $\tau \gg \sigma$. So for $\tau = 10$ ps and $\tau = 0.1$
ps, $\chi^{(\sigma)}_{[0,\tau]}(\tau + 3\sigma) \approx 10^{-3}$.
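The order-of-magnitude estimates in this paragraph are easy to reproduce. The sketch below recomputes the wavelength for a 370 THz carrier, the slowly-varying parameter $(\sigma \omega_0)^{-1}$ for the two timescales quoted, and the residual value of the smoothed indicator three widths past the edge of the window. The helper names are hypothetical, and the value of the speed of light is the only physical constant assumed.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def wavelength_nm(frequency_thz):
    """Vacuum wavelength in nanometres for a carrier frequency given in THz."""
    return SPEED_OF_LIGHT / (frequency_thz * 1e12) * 1e9


def slowness_parameter(sigma_seconds, frequency_thz):
    """The ratio 1 / (sigma * omega_0) that appears in Eq. (2.98)."""
    omega_0 = 2.0 * math.pi * frequency_thz * 1e12
    return 1.0 / (sigma_seconds * omega_0)


def tail_value(tau, sigma, n_sigma=3.0):
    """Smoothed indicator of [0, tau], from Eq. (2.97), evaluated at t = tau + n_sigma * sigma."""
    t = tau + n_sigma * sigma
    scale = math.sqrt(2.0) * sigma
    return 0.5 * (math.erf(t / scale) - math.erf((t - tau) / scale))


if __name__ == "__main__":
    print(wavelength_nm(370.0))                # close to 810 nm
    print(slowness_parameter(5e-15, 370.0))    # 5 fs smoothing
    print(slowness_parameter(0.1e-12, 370.0))  # 0.1 ps smoothing
    print(tail_value(10.0, 0.1))               # tau = 10 ps, sigma = 0.1 ps
```

With $\sigma = 0.1$ ps the ratio evaluates to a few times $10^{-3}$, consistent with the order-of-magnitude statement $(\sigma \omega_0)^{-1} \sim 10^{-3}$.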
With this function in hand we can now identify a space of wave packets that can be
projectively measured, approximately, in the time window $[t_0, t_1]$. For
any valid envelope function $f(t)$, we can define a localized version
\[
f^{(\sigma)}_{[t_0,t_1]}(t) \equiv \chi^{(\sigma)}_{[t_0,t_1]}(t)\, f(t). \tag{2.99}
\]
Clearly these functions are good temporal envelopes and approximately fit in $\mathfrak{h}_{[t_0,t_1]}$.

Figure 2.2: Approximations for a localized pulse. Shown here are a series of slowly varying temporal envelopes. Each envelope is a unit pulse centered at zero, with a variable duration $\tau$, convolved with a Gaussian smoother with $\sigma = 0.1$ ps. The pulse duration $\tau$ ranges from 0.5 ps to 100 ps. Each pulse is plotted versus $t$ in units of $\tau$.

Note that we can actually increase the space of valid wave packets by observing that
with an appropriate $\sigma$, $\chi^{(\sigma)}_{[t_0,t_1]}(t)$ is itself a valid temporal envelope. Therefore for any
function $f(t) < \infty$ whose support is contained in the interval $[t_0, t_1]$, we can define a
smooth version via convolution
\[
f^{(\sigma)}(t) \equiv \varphi^{(\sigma)} * f\,(t). \tag{2.100}
\]
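And so any wave packet whose temporal envelope is defined in this way is approximately
measurable in the time interval $[t_0, t_1]$. This can be formalized by defining
the set of functions,
\[
\mathcal{S}^{(\sigma)}_{[t_0,t_1]} \equiv \left\{ \varphi^{(\sigma)} * f\,(t) : \int dt\, f(t) < \infty,\ \mathrm{supp}(f) \subseteq [t_0, t_1] \right\} \tag{2.101}
\]
and the set of wave packets
\[
\mathfrak{s}^{(\sigma)}_{[t_0,t_1]} = \mathrm{span}\left\{ g = \sqrt{\frac{\omega(\mathbf{k})\, c}{\omega_0}}\; \frac{f(\omega(\mathbf{k}) - \omega_0)}{\sqrt{|t_1 - t_0|}}\; \frac{u_T^{(+)}(\mathbf{k}_T, 0)}{\sqrt{\sigma_T}}\; e^{-i \omega(\mathbf{k}) t} : f \in \mathcal{S}^{(\sigma)}_{[t_0,t_1]} \right\}. \tag{2.102}
\]
We then have the limit
\[
\mathfrak{h}_{[t_0,t_1]} = \lim_{\substack{\sigma \to 0 \\ \omega_0 \sigma \to \infty}} \mathfrak{s}^{(\sigma)}_{[t_0,t_1]}. \tag{2.103}
\]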
2.4 The one-dimensional limit
In the previous sections we showed that if there are two quasi-monochromatic wave
packets $g_1$ and $g_2$ defined on the same paraxial mode but with differing temporal
envelopes $f_1$ and $f_2$, then the commutator between $a[g_1]$ and $a^\dagger[g_2]$ is proportional to
the cross-correlation function $f_1 \star f_2$. Furthermore the proportionality is independent
of the details of the paraxial mode. This suggests that moving to a simplified,
one-dimensional model is both appropriate and fruitful. This section makes this
connection and relates it to the standard representations of quantum white noise.
The end of Sec. 2.3.1 suggested defining a wave packet $\varphi(\mathbf{k}, t)$ whose temporal
envelope, $\varphi(t_r)$, is delta correlated in time. In defining the smoothed set of functions
$\mathfrak{s}^{(\sigma)}_{[t_0,t_1]}$, Sec. 2.3.2 took any integrable function defined on the interval $[t_0, t_1]$, $f_{[t_0,t_1]}(t)$,
and convolved it with a Gaussian distribution $\varphi^{(\sigma)}(t)$ to obtain an envelope consistent
with the quasi-monochromatic approximation. To move to a one-dimensional model,
we will factor out $f_{[t_0,t_1]}(t)$ from the field operator and define an operator-valued density
$a[\varphi^{(\sigma)}(t)]$. We will shortly show that this is approximately delta commuting in time.
Deriving this factorization is not difficult, and begins by first noting that as the
Fourier transform of a convolution is proportional to the product of the Fourier
transforms, we have that
\[
f^{(\sigma)}(\nu) = \sqrt{2\pi}\, f(\nu)\, \varphi^{(\sigma)}(\nu) = \left( \int ds\, f(s)\, e^{+i \nu s} \right) \varphi^{(\sigma)}(\nu). \tag{2.104}
\]
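Substituting this expression into the definition of $g^{(\sigma)}(\mathbf{k}, t)$, as written in Eq. (2.87),
we have
\[
g^{(\sigma)}(\mathbf{k}, t) = \frac{\|g\|}{\sqrt{\tau}} \int ds\, f(s)\, e^{-i \omega_0 s}\, \varphi^{(\sigma)}(\mathbf{k}, t - s) \tag{2.105}
\]
where
\[
\varphi^{(\sigma)}(\mathbf{k}, t) \equiv \sqrt{\frac{\omega(\mathbf{k})\, c}{\omega_0\, \sigma_T}}\; \varphi^{(\sigma)}(\omega(\mathbf{k}) - \omega_0)\, u_T^{(+)}(\mathbf{k}_T, 0)\, e^{-i \omega(\mathbf{k}) t}. \tag{2.106}
\]
By the anti-linear nature of the annihilation operator $a[g]$ we are able to bring the integral
over $s$ out of the operator to write
\[
a[g^{(\sigma)}(\mathbf{k}, t)] = \frac{\|g\|}{\sqrt{\tau}} \int ds\, f^*(s)\, e^{+i \omega_0 s}\, a[\varphi^{(\sigma)}(t - s)]. \tag{2.107}
\]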
Note that $a[g^{(\sigma)}(\mathbf{k}, t)]$ and $a[\varphi^{(\sigma)}(t - s)]$ have different units, as the former is unitless
while the latter has units of $1/\sqrt{\text{time}}$, ultimately arising from the fact that
$\varphi^{(\sigma)}(t)$ is a density over time and therefore has units. Eq. (2.107) has the following
physical implications. The first is that the integral is the point-wise weighting of an
annihilation operator by a complex amplitude, completely akin to the original analogy
with the one-dimensional simple harmonic oscillator, $\alpha^* a \leftrightarrow f^*(t)\, a(t)$. The second implication
is that the complex weighting function in general matches the phase of the
carrier wave, resulting in the explicit appearance of the $e^{+i \omega_0 s}$ factor. The third implication
is that both the function $f(s)/\sqrt{\tau}$ and the operator $a[\varphi^{(\sigma)}(t - s)]$ have the
same units, which are the same as those of white noise. The final implication comes from the
fact that because we have the commutator
\[
\left[ a[\varphi^{(\sigma)}(t)],\, a^\dagger[\varphi^{(\sigma)}(t')] \right] = e^{-i \omega_0 (t' - t)}\, \varphi^{(\sigma)} \star \varphi^{(\sigma)}(t' - t) + O\big( (\sigma\, \omega_0)^{-1} \big) \tag{2.108}
\]
and the limit $\lim_{\sigma \to 0} \varphi^{(\sigma)} \star \varphi^{(\sigma)}(t' - t) = \delta(t' - t)$, then
\[
\lim_{\substack{\sigma \to 0 \\ \omega_0 \sigma \to \infty}} \left[ a[g_1^{(\sigma)}(\mathbf{k}, t)],\, a^\dagger[g_2^{(\sigma)}(\mathbf{k}, t)] \right] = \frac{\|g_1\|\, \|g_2\|}{\sqrt{\tau_1 \tau_2}} \int ds\, f_1^*(s)\, f_2(s). \tag{2.109}
\]
This implies that if we have two square integrable functions, $h_1(t)$ and $h_2(t)$, then
these functions can count as members of a single particle Hilbert space, $\mathfrak{h}' = L^2(\mathbb{R})$.
Furthermore we can define a Fock space $\mathcal{F}(\mathfrak{h}')$ and ultimately the field operators
$a[h_1]$ and $a[h_2]$. If
\[
h_1(t) \cong \frac{\|g\|}{\sqrt{\tau}}\, f_1(t), \tag{2.110}
\]
then we can draw the formal equivalence
\[
a[h_1] \cong \lim_{\substack{\sigma \to 0 \\ \omega_0 \sigma \to \infty}} a[g_1^{(\sigma)}]. \tag{2.111}
\]
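The limit $\lim_{\sigma \to 0} \varphi^{(\sigma)} \star \varphi^{(\sigma)}(t) = \delta(t)$ used to pass from Eq. (2.108) to Eq. (2.109) can be illustrated numerically. The sketch below assumes the Gaussian $\varphi^{(\sigma)}$ of Eq. (2.97), whose autocorrelation is a normalized Gaussian of variance $2\sigma^2$, and integrates it against a smooth test function $h$; the choice $h = \cos$ is arbitrary and has the closed-form answer $e^{-\sigma^2}$, which tends to $h(0) = 1$.

```python
import math


def gaussian_autocorrelation(t, sigma):
    """(phi * phi)(t) for a normalized Gaussian density of variance sigma^2: a Gaussian of variance 2 sigma^2."""
    variance = 2.0 * sigma * sigma
    return math.exp(-t * t / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def smeared_value(h, sigma, steps=4000, span=10.0):
    """Midpoint estimate of integral dt (phi * phi)(t) h(t), which should tend to h(0) as sigma -> 0."""
    half_width = span * math.sqrt(2.0) * sigma
    dt = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        t = -half_width + (i + 0.5) * dt
        total += gaussian_autocorrelation(t, sigma) * h(t) * dt
    return total


if __name__ == "__main__":
    for sigma in (1.0, 0.3, 0.1, 0.01):
        # For h = cos the exact smeared value is exp(-sigma^2).
        print(sigma, smeared_value(math.cos, sigma), math.exp(-sigma * sigma))
```

The same mechanism collapses the double time integral in the commutator of two smoothed wave packets onto the single integral $\int ds\, f_1^*(s) f_2(s)$.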
While discussing the statistical aspects of quantum light can be interesting in its
own right, the real fun is when that light is coupled to another quantum system. Sec.
2.6 shows how, when a system couples through an interaction Hamiltonian to operators
similar to $a[\varphi^{(\sigma)}(t)]$ and $a^\dagger[\varphi^{(\sigma)}(t)]$, the limiting object can be written in terms of a
quantum stochastic process on the joint Hilbert space $\mathcal{H}_{\mathrm{sys}} \otimes \mathcal{F}(\mathfrak{h}')$. Furthermore it
discusses how this relates to the standard expressions in quantum optics involving
simpler models of quantum white noise. Before including the system, however, we
will show what is gained by taking this discontinuous limit and how it is useful for
defining quantum stochastic processes.
2.5 Quantum Wiener processes and the continuous-time decomposition
Quantum stochastic integrals were first defined mathematically by Hudson and Parthasarathy
in 1984. There they formulated a quantum version of an Ito-type stochastic
integral where the fundamental differentials, in correspondence to the classical
Wiener process and jump processes, are operators acting on a bosonic Fock
space [23]. Independently, Gardiner and Collett formulated a physical description of
quantum white noise operators where creation and annihilation operators are associated
with excitations in a bosonic heat bath, which are then used as driving noise
sources in a quantum Langevin equation [50]. This second formulation is the most
well known in the quantum optics community (see, e.g., the well written reference
[51]) but is less amenable to directly applying the filtering techniques of classical
probability theory. The picture of a heat bath does not immediately induce a picture
of a traveling flow of information from a probe system to a detector. Rather it
instills a picture of a system immersed in a stationary and chaotic environment, and it
is unclear what it means quantum mechanically to “measure the bath”. While one
certainly could, and often does, construct a large scale flow in the bath running from
the system to an independent observer, such a construction ultimately resembles a
wave packet description.
If one instead explicitly includes time in the description of the environment, as
Hudson and Parthasarathy do, then the statistical properties necessary for defining a
quantum Wiener process and a quantum Ito integral, namely the ability to construct
time-adapted processes, are a direct consequence. We will shortly review how this is
done, but first note that in contrast to relying on the system to dictate how the
bath is modeled, this represents a more axiomatic approach in that the statistical
properties of the bath are postulated independently from the system. Now clearly the
physics of the entire system-probe-measurement combination will dictate whether or
not this is an appropriate model. The purpose of this section is to show what is
gained in this formulation.
Like the Gardiner and Collett formulation, this formulation begins by assuming
a bosonic Fock space; however, here it assumes that it is the second quantization of a
single particle Hilbert space,
\[
\mathfrak{h} \equiv L^2(\mathbb{R}_+) \otimes \mathfrak{h}' \tag{2.112}
\]
where $L^2(\mathbb{R}_+)$ represents the Hilbert space of square integrable functions defined on
the positive real line (representing time) and $\mathfrak{h}'$ is an auxiliary Hilbert space. Almost
all formulations immediately assume that $\mathfrak{h}'$ is a finite $d$-dimensional system and so
every $g(t)$ is effectively a complex-vector-valued function, i.e. $g : \mathbb{R}_+ \to \mathbb{C}^d$. In this
case the inner product between two single particle vectors is
\[
\langle f, g \rangle = \int_0^\infty dt\, f^*(t) \cdot g(t) < \infty. \tag{2.113}
\]
From this single particle Hilbert space the symmetric Fock space $\mathcal{F}(\mathfrak{h})$, exponential
vectors $e[g]$, and Weyl displacement operators $W[g]$ are defined exactly as
in Sec. 2.2.3. Most importantly, we can define the annihilation and creation
operators
\[
A^i_t \equiv a[\chi_{[0,t]}\, \mathbf{e}_i] \tag{2.114}
\]
and
\[
A^{i\,\dagger}_t \equiv a^\dagger[\chi_{[0,t]}\, \mathbf{e}_i] \tag{2.115}
\]
that have the commutation relation
\[
\left[ A^i_s,\, A^{j\,\dagger}_t \right] = \int_0^\infty ds'\, \chi_{[0,s]}(s')\, \chi_{[0,t]}(s')\; \mathbf{e}_i \cdot \mathbf{e}_j = \delta_{i,j}\, \min(s, t). \tag{2.116}
\]
These processes serve as two of the building blocks of quantum stochastic calculus
and are in analogy with a $d$-dimensional Wiener process. To show this last analogy,
we must first discuss how we specifically include time in $\mathfrak{h}$.
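The $\min(s, t)$ structure of Eq. (2.116) is the same one that appears in the covariance of a classical Wiener process. The sketch below illustrates both sides of that analogy for a single component ($i = j$): a midpoint-rule evaluation of the indicator overlap integral, and a seeded Monte Carlo estimate of $\mathbb{E}[w_s w_t]$ built from independent Gaussian increments. The function names, sample count, and seed are assumptions of this illustration.

```python
import math
import random


def indicator_overlap(s, t, ds=1e-3):
    """Midpoint estimate of the integral of chi_[0,s](u) chi_[0,t](u) du over u >= 0."""
    upper = max(s, t)
    n = int(round(upper / ds))
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * ds
        if u <= s and u <= t:
            total += ds
    return total


def wiener_covariance(s, t, samples=100_000, seed=0):
    """Monte Carlo estimate of E[w_s w_t] for a standard classical Wiener process."""
    rng = random.Random(seed)
    early, late = min(s, t), max(s, t)
    accumulator = 0.0
    for _ in range(samples):
        w_early = math.sqrt(early) * rng.gauss(0.0, 1.0)
        # The increment over (early, late] is independent of w_early.
        w_late = w_early + math.sqrt(late - early) * rng.gauss(0.0, 1.0)
        accumulator += w_early * w_late
    return accumulator / samples


if __name__ == "__main__":
    print(indicator_overlap(1.0, 2.0), wiener_covariance(1.0, 2.0), min(1.0, 2.0))
```

Both numbers should be close to $\min(s, t)$, the overlap exactly up to discretization and the Monte Carlo estimate up to sampling noise.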
2.5.1 The continuous-time tensor decomposition
In basic quantum mechanics there is an intimate connection between statistical independence
and a tensor product structure. When a complete system is described
by the tensor product of two Hilbert spaces and the total state is a product state, $\rho = \rho_1 \otimes \rho_2$,
then both systems can be considered statistically independent. Specifically, in this
case two operators $X_1$ and $X_2$ from the individual Hilbert spaces are statistically
independent in the sense that
\[
\langle X_1 \otimes X_2 \rangle_\rho = \langle X_1 \rangle_{\rho_1} \langle X_2 \rangle_{\rho_2}. \tag{2.117}
\]
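Eq. (2.117) can be verified on the smallest nontrivial example. The sketch below builds a product state of two qubits and compares $\mathrm{Tr}[(\rho_1 \otimes \rho_2)(X_1 \otimes X_2)]$ with the product of the single-system expectation values. The particular density matrices and observables are arbitrary choices for illustration, and the helper functions are plain nested-list implementations rather than any library API.

```python
def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def expectation(rho, x):
    """Expectation value Tr(rho x)."""
    product = matmul(rho, x)
    return sum(product[i][i] for i in range(len(product)))


# Illustrative single-qubit states (Hermitian, unit trace, positive) and observables.
RHO_1 = [[0.75, 0.25j], [-0.25j, 0.25]]
RHO_2 = [[0.5, 0.5], [0.5, 0.5]]
X_1 = [[0, -1j], [1j, 0]]  # Pauli y
X_2 = [[0, 1], [1, 0]]     # Pauli x

joint = expectation(kron(RHO_1, RHO_2), kron(X_1, X_2))
separate = expectation(RHO_1, X_1) * expectation(RHO_2, X_2)

if __name__ == "__main__":
    print(joint, separate)
```

The factorization fails for entangled states, which is why both the operators and the state must respect the tensor product structure.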
In Sec. 2.2.5 we introduced the notion of a time-adapted quantum stochastic process,
where a quantum operator $O_{[0,t]}$ was time-adapted if it acted as the identity on any
coherent state $\psi[g_\perp]$ where $g_\perp$ is excluded from the time interval $[0, t]$: $g_\perp(s) = 0$ for
$0 \le s \le t$. We defined two mutually orthogonal spaces of wave packets $\mathfrak{h}_{[0,t]}$ and $\mathfrak{h}^\perp_{[0,t]}$
which in turn have their associated Fock spaces $\mathcal{F}(\mathfrak{h}_{[0,t]})$ and $\mathcal{F}(\mathfrak{h}^\perp_{[0,t]})$. The classical
definition of a time-adapted stochastic process is that the process is statistically
independent of all events in the future. In light of the connection between statistical
independence on the one hand and a tensor product structure on the other, it seems
reasonable to have
\[
\mathcal{F}(\mathfrak{h}) \cong \mathcal{F}(\mathfrak{h}_{[0,t]}) \otimes \mathcal{F}(\mathfrak{h}^\perp_{[0,t]}) \tag{2.118}
\]
where $\cong$ indicates a unitary equivalence. But if $\mathfrak{h}_{[0,t]}$ represents all wave packets
localized to $[0, t]$ then it also seems reasonable to conclude that $\mathfrak{h}^\perp_{[0,t]} = \mathfrak{h}_{(t,\infty)}$.
In this section we will show that this tensor product decomposition is not only
possible for any single time $t$ but it is also possible for any sequence of $n$ ordered
times $\{t_i : 0 < t_1 < \cdots < t_n < \infty\}$. This is called the continuous-time tensor decomposition
and is the relation that
\[
\mathcal{F}(\mathfrak{h}) \cong \mathcal{F}(\mathfrak{h}_{[0,t_1)}) \otimes \mathcal{F}(\mathfrak{h}_{[t_1,t_2)}) \otimes \cdots \otimes \mathcal{F}(\mathfrak{h}_{[t_n,\infty)}). \tag{2.119}
\]
The proof of this statement is outlined in Lemma 2.1 (which is essentially Proposition
19.6 of [52]).
Now if Eq. (2.119) is true, then for any partitioning of time, no matter how fine,
this Fock space decomposes into a tensor product between the various partitions.
Furthermore if we have operators $\{O_{[t_i,t_{i+1})}\}$ which are each adapted to the interval
$[t_i, t_{i+1})$ then the expectation values factorize,
\[
\left\langle \prod_i O_{[t_i,t_{i+1})} \right\rangle_{\psi[g]} = \prod_i \left\langle O_{[t_i,t_{i+1})} \right\rangle_{\psi[g_{[t_i,t_{i+1})}]}. \tag{2.120}
\]
In other words, if both the operators and the state respect the continuous-time
decomposition then those operators will be statistically independent for independent
times.
Why is this important? Well, Sec. 3.1.4 reviews the basic properties of a Wiener
process and shows how its defining feature is that its restrictions to independent
time increments are statistically independent and that each is a mean-zero Gaussian
random variable of variance $t_{i+1} - t_i$. Therefore, any quantum analog of a Wiener
process must also respect the continuous-time decomposition. The fact that the
classical Wiener process satisfies the Markov and martingale properties is a direct
consequence of this independence [53]. Sec. 3.1.1 reviews the definition of these two
properties and how they relate to taking conditional expectation values. Additionally,
the Ito definition of a stochastic integral (see Appendix B) is defined in such a way
that the integral $\int x_t\, dw_t$ is also a martingale and that if $x_t$ is Markovian then so is
the integral. For a quantum stochastic integral to also have these desirable properties,
a necessary criterion is that a process $\{X_t\}_{t \ge 0}$ must be statistically independent of all
future events. In the next section we return to the operators $A^i_t$ and $A^{i\,\dagger}_t$ and show
how they can be used to construct a quantum Wiener process, but first we include
a proof of the continuous-time decomposition.
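Lemma 2.1. Given the single particle Hilbert space $\mathfrak{h} = L^2(\mathbb{R}_+) \otimes \mathbb{C}^d$ and an ordered
sequence of times $\{t_i \in \mathbb{R}_+ : 0 < t_1 < \cdots < t_n < \infty\}$, the symmetric Fock
space satisfies the unitary equivalence $\mathcal{F}(\mathfrak{h}) \cong \mathcal{F}(\mathfrak{h}_{[0,t_1)}) \otimes \mathcal{F}(\mathfrak{h}_{[t_1,t_2)}) \otimes \cdots \otimes \mathcal{F}(\mathfrak{h}_{[t_n,\infty)})$
where $\mathfrak{h}_{[t_i,t_{i+1})} = L^2([t_i, t_{i+1})) \otimes \mathbb{C}^d$.
Sketch of Proof. Throughout we adopt the conventions $t_0 \equiv 0$ and $t_{n+1} \equiv \infty$. For any vector $g \in \mathfrak{h}$ we can define the projection of $g$ onto a time
interval via
\[
g_{[t_i,t_{i+1})}(t) \equiv \chi_{[t_i,t_{i+1})}(t)\, g(t) \tag{2.121}
\]
so that for all $n$ times we have
\[
g(t) = g_{[0,t_1)}(t) + g_{[t_1,t_2)}(t) + \cdots + g_{[t_n,\infty)}(t). \tag{2.122}
\]
As this is true for any element of $\mathfrak{h}$, we have the natural decomposition
\[
\mathfrak{h} \cong \mathfrak{h}_{[0,t_1)} \oplus \mathfrak{h}_{[t_1,t_2)} \oplus \cdots \oplus \mathfrak{h}_{[t_n,\infty)}, \tag{2.123}
\]
where $\mathfrak{h}_{[t_i,t_{i+1})}$ is the space of square integrable vector-valued functions of dimension
$d$ defined on the interval $[t_i, t_{i+1})$. Because of this decomposition, proving that $\mathcal{F}(\mathfrak{h})$
satisfies the tensor decomposition now means proving the unitary equivalence
\[
\mathcal{F}\Big( \bigoplus_{i=0}^{n} \mathfrak{h}_{[t_i,t_{i+1})} \Big) \cong \bigotimes_{i=0}^{n} \mathcal{F}(\mathfrak{h}_{[t_i,t_{i+1})}). \tag{2.124}
\]
This is easily shown by first noting that
\[
\left\langle g_{[t_i,t_{i+1})},\, f_{[t_j,t_{j+1})} \right\rangle = \delta_{i,j} \left\langle g_{[t_i,t_{i+1})},\, f_{[t_i,t_{i+1})} \right\rangle. \tag{2.125}
\]
This implies that for the exponential vectors $e[g]$ and $e[f]$
\[
\langle e[g] | e[f] \rangle = \exp(\langle g, f \rangle) = \prod_{i=0}^{n} \exp\left( \left\langle g_{[t_i,t_{i+1})},\, f_{[t_i,t_{i+1})} \right\rangle \right). \tag{2.126}
\]
If we define the transformation $V : \mathcal{F}(\mathfrak{h}) \to \mathcal{F}(\mathfrak{h}_{[0,t_1)}) \otimes \cdots \otimes \mathcal{F}(\mathfrak{h}_{[t_n,\infty)})$ such
that
\[
V e[g] = e[g_{[0,t_1)}] \otimes \cdots \otimes e[g_{[t_n,\infty)}], \tag{2.127}
\]
then the inner product between two transformed vectors is
\[
\langle V e[g] | V e[f] \rangle = \prod_{i=0}^{n} e^{\langle g_{[t_i,t_{i+1})},\, f_{[t_i,t_{i+1})} \rangle} = e^{\langle g, f \rangle}. \tag{2.128}
\]
This shows that $V$ is a unitary transformation between the exponential vectors.
Moreover, because the span of the exponential vectors is dense in the symmetric Fock space, $V$
linearly extends to any vector in $\mathcal{F}(\mathfrak{h})$.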
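The factorization in Eqs. (2.125) and (2.126), on which the proof sketch rests, can be checked numerically for scalar-valued ($d = 1$) wave packets. The sketch below assumes two illustrative, rapidly decaying test functions and truncates the final interval $[t_n, \infty)$ at a finite time where both are negligible; the partition times and helper names are hypothetical.

```python
import cmath
import math


def inner(g, f, start, stop, dt=1e-3):
    """Midpoint estimate of <g, f> = integral of conj(g(t)) f(t) dt restricted to [start, stop)."""
    n = int(round((stop - start) / dt))
    total = 0.0j
    for i in range(n):
        t = start + (i + 0.5) * dt
        total += complex(g(t)).conjugate() * f(t) * dt
    return total


def g(t):
    """Illustrative wave packet: decaying exponential with a phase ramp."""
    return math.exp(-t) * cmath.exp(2j * t)


def f(t):
    """Illustrative wave packet: t exp(-t / 2)."""
    return t * math.exp(-0.5 * t)


# Partition 0 < t_1 < t_2 < t_3, with the last interval truncated at t = 12 (an assumption).
EDGES = [0.0, 0.7, 1.5, 4.0, 12.0]

# Left side of Eq. (2.126): overlap of the exponential vectors on the whole time axis.
whole = cmath.exp(inner(g, f, EDGES[0], EDGES[-1]))

# Right side: product of the overlaps of the exponential vectors on each interval.
pieces = 1.0 + 0.0j
for start, stop in zip(EDGES, EDGES[1:]):
    pieces *= cmath.exp(inner(g, f, start, stop))

if __name__ == "__main__":
    print(whole, pieces)
```

The agreement is exact up to rounding because the interval projections are orthogonal, so the inner product is additive over the partition and its exponential is multiplicative.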
2.5.2 The quantum Wiener process
In [25], Bouten et al. give an elegant derivation of how the quadratures
\[
Q^i_t \equiv A^i_t + A^{i\,\dagger}_t \quad \text{and} \quad P^i_t \equiv i \left( A^{i\,\dagger}_t - A^i_t \right), \tag{2.129}
\]
have the statistics of Wiener processes when the field is in the vacuum state. For
the sake of completeness we reproduce this derivation here.
In Sec. 2.2.2, we introduced $a[g]$ and $a^\dagger[g]$ as the generators of the coherent
state $\psi[g]$ through the relation $\psi[g] = W[g]\, |\emptyset\rangle = \exp(a^\dagger[g] - a[g])\, |\emptyset\rangle$. The
argument of this displacement operator defines a Hermitian generator
\[
\Upsilon[g] \equiv i \left( a^\dagger[g] - a[g] \right) \tag{2.130}
\]
so that $\exp(a^\dagger[g] - a[g]) = \exp(-i \Upsilon[g])$.
For a classical random variable x,
$$\varphi_x(\kappa) \equiv \mathbb{E}\big(\exp(i\kappa x)\big) \tag{2.131}$$
is the characteristic function for that random variable and therefore characterizes
its statistics. The Weyl operator exp(−iΥ[g]) is nearly equivalent to the
characteristic function, up to the constant κ and a minus sign. Through the
anti-linear property of a[g] we have a[λg] = λ∗a[g], so if λ = −κ (real κ) then
exp(−iΥ[−κg]) = exp(+iκΥ[g]). Converting this operator into a true characteristic
function simply means taking an expectation value with respect to the field state. If
the field is in a coherent state, ψ[f ], then
$$\varphi_{\Upsilon[g]}(\kappa) \equiv \big\langle \exp(+i\kappa\Upsilon[g]) \big\rangle_{\psi[f]}, \tag{2.132}$$
which characterizes the statistics of the operator Υ[g]. In terms of the Weyl
displacement operators this means that
$$\varphi_{\Upsilon[g]}(\kappa) = \langle\psi[f]|\, W[-\kappa g]\, |\psi[f]\rangle = e^{-\|f\|^2}\, \langle e[f]|\, W[-\kappa g]\, |e[f]\rangle. \tag{2.133}$$
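Eq. (2.61) relates the action of the Weyl operator to the exponential vector, showing
that this simplifies to
$$\begin{aligned}
\varphi_{\Upsilon[g]}(\kappa) &= \exp\big(-\|f\|^2 + \kappa\langle g, f\rangle - \kappa^2\|g\|^2/2\big)\, \langle e[f]|e[-\kappa g + f]\rangle \\
&= \exp\big(-\|f\|^2 + \kappa\langle g, f\rangle - \kappa^2\|g\|^2/2 - \kappa\langle f, g\rangle + \|f\|^2\big) \\
&= \exp\big(i\kappa\, 2\,\mathrm{Im}\langle g, f\rangle - \kappa^2\|g\|^2/2\big).
\end{aligned} \tag{2.134}$$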
The final line is recognizable as the characteristic function of a Gaussian random
variable of mean $2\,\mathrm{Im}\langle g, f\rangle$ and variance $\|g\|^2$. Note that when $g = \chi_{[0,t)}(t)\,\mathbf{e}_i$,
$$\Upsilon[\chi_{[0,t)}(t)\,\mathbf{e}_i] = i\big(A^{i\,\dagger}_t - A^i_t\big) = P^i_t \tag{2.135}$$
and
$$\Upsilon[-i\,\chi_{[0,t)}(t)\,\mathbf{e}_i] = A^{i\,\dagger}_t + A^i_t = Q^i_t. \tag{2.136}$$
In either case $\|g\|^2 = t$, so both operators have variance t, regardless of the
coherent amplitude f of the underlying state.
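The Gaussian mean and variance read off from Eq. (2.134) can be illustrated in a toy model. The sketch below assumes a single field mode, so that g and f are just complex numbers with a[g] = g* a and a†[g] = g a†, and truncates the Fock space at `n_max` quanta; the function name and the truncation are illustrative assumptions, not part of the derivation above.

```python
import numpy as np

def coherent_quadrature_stats(g, f, n_max=60):
    """Mean and variance of Upsilon[g] = i(a^dag[g] - a[g]) in the coherent state psi[f].

    Single-mode toy model: a[g] = conj(g) * a and a^dag[g] = g * a^dag, with the
    Fock space truncated at n_max quanta (an approximation).
    """
    n = np.arange(n_max)
    a = np.diag(np.sqrt(n[1:]), k=1).astype(complex)   # annihilation operator
    ad = a.conj().T
    upsilon = 1j * (g * ad - np.conj(g) * a)           # Hermitian generator

    # Coherent-state amplitudes c_n = exp(-|f|^2/2) f^n / sqrt(n!), built recursively.
    c = np.zeros(n_max, dtype=complex)
    c[0] = np.exp(-abs(f) ** 2 / 2)
    for k in range(1, n_max):
        c[k] = c[k - 1] * f / np.sqrt(k)

    mean = np.vdot(c, upsilon @ c).real
    second = np.vdot(c, upsilon @ (upsilon @ c)).real
    return mean, second - mean ** 2
```

For modest amplitudes the returned mean should agree with 2 Im(g* f) and the variance with |g|², independent of f, which mirrors the statement that the coherent amplitude shifts the mean but not the fluctuations.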
When the field is in vacuum (f = 0), both quadratures are mean-zero
Gaussian random variables whose variance is given by t. Lemma 2.1 also
shows that the operator $\Upsilon[\chi_{[s,t)}(t)\,\mathbf{e}_i] = P^i_t - P^i_s$ is a generator of displacements in
the Fock space $\mathcal{F}(\mathfrak{h}_{[s,t)})$ and therefore commutes with any generator for states in
$\mathcal{F}(\mathfrak{h}_{[0,s)})$. Clearly the vacuum respects the continuous tensor product decomposition
and therefore the quantum stochastic processes $\{Q^i_t\}_{t\ge 0}$ and $\{P^i_t\}_{t\ge 0}$ have, in vacuum
expectation, the statistics of Wiener processes.
2.5.3 The units of quantum noise
An extremely observant reader might be concerned about the quadrature definitions
of Qt and Pt given in Eq. (2.129). The issue lies in the units and physical
interpretation of the single particle wave functions. In the strictest sense of second
quantization, the normalized vectors $f \in L^2(\mathbb{R}_+) \otimes \mathbb{C}^d$ are single particle wave functions
whose modulus squared represents the probability density for observing the particle at some
point in its domain. In order for f to be a square-normalized density, $\int dt\, |f|^2 = 1$,
f must have units of $1/\sqrt{\text{time}}$. Consequently the operators a[f ] and a†[f ]
must be unitless, as their commutator is unitless. However, as written,
the quadratures $Q^j_t$ and $P^j_t$ have the commutation relation
$$\big[Q^j_t, P^j_t\big] = 2i\,t, \tag{2.137}$$
which clearly has units of time. The resolution of this discrepancy is to realize that when
defining At we should really be considering the field operators relative to some
characteristic rate γ. Through the linearity of a[f ] under real scaling, we clearly have
$$a[\sqrt{\gamma}\,\chi_{[0,t]}] = \sqrt{\gamma}\,A_t. \tag{2.138}$$
The whole point of defining At in this way is that, regardless of the magnitude of
γ t, the scaled quadrature √γ Qt will still have the statistics of a Brownian motion,
simply with the diffusion rate γ.
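The unit counting above can be made explicit with a short worked calculation. It assumes only the canonical commutation relation $[a[f], a^\dagger[g]] = \langle f, g\rangle$ for a single mode index, which is the standard relation for smeared field operators; the mode index is suppressed for brevity.

```latex
\begin{aligned}
[A_t, A_t^\dagger] &= \langle \chi_{[0,t)}, \chi_{[0,t)} \rangle = \int_0^t ds = t, \\
[Q_t, P_t] &= \big[A_t + A_t^\dagger,\; i(A_t^\dagger - A_t)\big]
            = i[A_t, A_t^\dagger] - i[A_t^\dagger, A_t] = 2i\,t, \\
[\sqrt{\gamma}\,Q_t, \sqrt{\gamma}\,P_t] &= 2i\,\gamma t .
\end{aligned}
```

The last line is dimensionless, since γ carries units of inverse time, which is the sense in which the rescaled quadratures resolve the unit mismatch.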
The objective of Sec. 2.3 was to identify on what scales we could treat a
quasi-monochromatic field as statistically independent for independent increments of
time. Fig. 2.2 showed the scaling of a smoothed characteristic function $\chi^{(\sigma)}_{[0,\tau]}(t)$ for a
fixed smoothing variance and a variable duration τ. The act of smoothing limited the
derivative to be at most on the order of 1/σ, and for $\tau \sim 10^3\,\sigma$ this had little effect on
the visual appearance of the smoothed function. Note that the actual correction term
to the inner product in Eq. (2.92) compared the rate of change of the temporal envelope
to the carrier frequency ω0, and the purpose of the smoothing distribution is
simply to limit this derivative. Assuming that γ represents the rate of diffusion, then
for times τ ∼ 1/γ any pulse should appear to have a discontinuous derivative on
this scale. In other words, in order to treat an optical field as generating a quantum
Wiener process we must have
$$\gamma \ll \sigma^{-1} \ll \omega_0. \tag{2.139}$$
In any realistic application, the physics of the system typically sets values for γ and
ω0. In an atomic physics context the carrier frequency is usually a dipole-allowed
optical transition, leading to ω0 ∼ 2π × 100 THz. For such a transition the
measurement timescale is on the order of the lifetime of the excited state, tdecay ∼ 10 ns,
meaning that typically γ ∼ 2π × 10 MHz. This leaves 7 orders of magnitude
between these two scales. In many atomic systems this is actually an upper bound
on the measurement rate. Typically one will consider off-resonant light, leading to
a significantly slower diffusion rate. In terms of this dissertation, Chap. 5 applies
a quantum stochastic treatment to an idealized model of the Faraday interaction,
where γ is reduced by a factor of one over this frequency difference. This will be
discussed in detail in Sec. 2.7. Before considering this specific model we will review
how to transition from a smooth deterministic Schrödinger equation to one involving
quantum stochastic integrals with respect to At and A†t.
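The separation of scales quoted above is simple arithmetic, sketched here using the two rates given in the text. The smoothing time `sigma` is a hypothetical choice, placed at the geometric mean of the two scales purely to illustrate how much room the hierarchy γ ≪ 1/σ ≪ ω0 leaves.

```python
import math

omega_0 = 2 * math.pi * 100e12   # carrier frequency quoted in the text, rad/s
gamma = 2 * math.pi * 10e6       # diffusion (measurement) rate quoted in the text, rad/s

orders = math.log10(omega_0 / gamma)   # separation between the two scales

# Hypothetical smoothing time chosen at the geometric mean of the two scales,
# so that gamma << 1/sigma << omega_0 holds with equal margin on both sides.
sigma = 1.0 / math.sqrt(gamma * omega_0)
margins = (1.0 / (gamma * sigma), omega_0 * sigma)
```

With this choice both margins equal the square root of ω0/γ, a little over three orders of magnitude each.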
2.6 Systems Interacting with Quantum Noise
Much of this chapter has alluded to coupling a quantum system of interest to a
traveling wave field and describing the resulting evolution in terms of a quantum
stochastic process. In the quantum optics literature a system coupled to
delta-commuting field operators has been discussed since the work of Gardiner and Collett, if
not before [50]. The field operators are typically defined as the Fourier transform of
one-dimensional operators quantizing a continuous spectrum of harmonic oscillators.
That is, from the operators a(ω) and a†(ω), with [a(ω), a†(ω′)] = δ(ω − ω′),
$$a(t) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\omega\, e^{+i\omega t}\, a(\omega). \tag{2.140}$$
It then follows that [a(t), a†(t′)] = δ(t′ − t). From these operators one typically
formulates the noncommuting quadratures, q(t) = a†(t) + a(t), and p(t) = i a†(t) −
i a(t). If the state of the field is specified to be in vacuum, then 〈q(t)〉∅ = 0 and
〈q(t)q(t′)〉∅ = δ(t− t′). In other words, in vacuum expectation q(t) has the statistics
of white noise and the same is true for p(t) and any rotated combination of the two.
This is why the field operator a(t) is typically given the designation as quantum
white noise.
A system is typically introduced to the problem through a linear interaction
Hamiltonian where, after making a couple of approximations very much in line with
our assumed separation of timescales, the interaction Hamiltonian reads [51]
$$H_{\mathrm{int}}(\lambda, t) = i\hbar\sqrt{\gamma}\,\big(a^\dagger(\lambda, t)\, c - a(\lambda, t)\, c^\dagger\big) \tag{2.141}$$
where c is a generic system operator and a(λ, t) is an operator that limits to a(t)
as λ → 0.¹ To arrive at Hint(λ, t), a transformation to an interaction picture
was made and all time dependence was associated with either a(λ, t) or its adjoint.
In this interaction picture, the joint state of the system and field evolves under a
unitary propagator U(λ, t), which satisfies the equation
$$\frac{d}{dt} U(\lambda, t) = -\frac{i}{\hbar}\, H_{\mathrm{int}}(\lambda, t)\, U(\lambda, t). \tag{2.142}$$
It is well known that for a general time-dependent Hamiltonian the resulting
propagator is given by a time-ordered exponential
$$U(\lambda, t) = \vec{T} \exp\Big(-\frac{i}{\hbar} \int_0^t ds\, H_{\mathrm{int}}(\lambda, s)\Big). \tag{2.143}$$
¹ Actually Gardiner and Collett consider the limit in the frequency domain and so they consider a bandwidth θ that approaches infinity. For our purposes it is more convenient to consider λ → 0, but the spirit is the same.
The ultimate goal is to interpret the operator limλ→0 U(λ, t) ≡ Ut as a solution
to an equivalent quantum stochastic differential equation. Appendix C reviews the
mathematical background needed to fully discuss these kinds of equations in the
language of quantum stochastic processes acting in terms of a second quantized Fock
space.
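For a finite-dimensional stand-in, the time-ordered exponential of Eq. (2.143) can be approximated by a product of short-time propagators. The sketch below assumes ħ = 1 and a hypothetical two-level Hamiltonian `h_demo`; it is meant only to illustrate what the time-ordering symbol does, not the field-theoretic limit discussed in the text.

```python
import numpy as np

def step_propagator(h, dt):
    """exp(-i h dt) for a Hermitian matrix h (units with hbar = 1)."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def time_ordered_exp(h_of_t, t, n_steps=2000):
    """Approximate the time-ordered exponential by a product of short steps.

    Later times act to the left, which is what the time-ordering symbol encodes.
    """
    dt = t / n_steps
    u = np.eye(h_of_t(0.0).shape[0], dtype=complex)
    for k in range(n_steps):
        u = step_propagator(h_of_t((k + 0.5) * dt), dt) @ u
    return u

# Hypothetical two-level Hamiltonian that does not commute with itself at different times.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
h_demo = lambda s: 0.5 * sz + np.cos(3.0 * s) * sx

u_ordered = time_ordered_exp(h_demo, 2.0)
# Naive exponential of the integrated Hamiltonian, ignoring time ordering.
h_integrated = 0.5 * sz * 2.0 + (np.sin(6.0) / 3.0) * sx
u_naive = step_propagator(h_integrated, 1.0)
```

The ordered product stays unitary, while `u_naive` differs from it because the Hamiltonian at different times does not commute; for a commuting family the two constructions agree.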
In the textbook formulation of quantum noise, Gardiner and Zoller define a form
of quantum Ito calculus that is explicitly tied to the statistical properties of the state
of the field and considers only a small family of Gaussian states [51]. Conversely, the
quantum Ito calculus defined by Hudson and Parthasarathy has an Ito rule that is
independent of the field state and the proof of convergence holds over a large domain
of possibly correlated system-field states [23]. The chief distinction between the two
formulations is that the quantum optics derivation is based on a specific model, one
that does not initially assume the structure necessary for the more abstract version.
From the point of view of a physicist trying to model a quantum system starting
from an interaction Hamiltonian and the Schrödinger equation, it is not at all clear
how and when that fits into the abstract quantum Ito calculus. While Hudson
and Parthasarathy gave criteria for when a quantum stochastic differential equation
describes a unitary process, they did not specify how one should arrive at such a
process from a Schrödinger equation and an approximating principle. This is exactly
what the quantum optics derivation provides, however without making the final
connection to the state-independent Ito calculus. In 1990, Accardi et al. proved this
connection, showing how a unitary propagator generated from a linear
Hamiltonian similar to Eq. (2.141) converges to an Ito integral as specified by
Hudson and Parthasarathy. Additionally they showed that the proof holds in “the
weak sense of matrix elements.” What this means is the following.
Suppose we are given two coherent states with smoothed wave packets $\psi[g^{(\sigma)}_1]$ and
$\psi[g^{(\sigma)}_2]$ and that when σ → 0 we have the equivalent discontinuous coherent states
ψ[g′1] and ψ[g′2]. Furthermore suppose we are given a stochastic process $\{X^{(\sigma)}_t\}_{t\ge 0}$
that contains both system and field operators. Then $X^{(\sigma)}_t \to X_t$ in the weak sense
of matrix elements if, for arbitrary system state vectors φ1 and φ2,
$$\lim_{\substack{\sigma\to 0 \\ \sigma\omega_0\to\infty}} \big\langle \phi_1 \otimes \psi[g^{(\sigma)}_1] \big|\, X^{(\sigma)}_t \,\big| \phi_2 \otimes \psi[g^{(\sigma)}_2] \big\rangle = \langle \phi_1 \otimes \psi[g'_1] |\, X_t \,| \phi_2 \otimes \psi[g'_2] \rangle. \tag{2.144}$$
It is worth noting that Accardi et al. show that the limit holds for the bare
propagator U(λ, t) as well as for the Heisenberg evolution of a system operator, i.e.
U†(λ, t)XU(λ, t).
We mention this here because as long as the total system-field state ρtot can be
represented in terms of the matrix elements
$$\lim_{\substack{\sigma\to 0 \\ \sigma\omega_0\to\infty}} \big\langle \phi_1 \otimes \psi[g^{(\sigma)}_1] \big|\, \rho_{\mathrm{tot}} \,\big| \phi_2 \otimes \psi[g^{(\sigma)}_2] \big\rangle,$$
then the quantum stochastic representation is appropriate. Clearly these states are
much more complex than simply mean-zero Gaussian states, as they can represent
entangled states between the system and the quasi-monochromatic field as well as
nonclassical field states such as superpositions between modes and even single photon
states. However not every field state is included, as we must allow for
discontinuous yet quasi-monochromatic matrix elements.² In other words, the total state ρtot
must be compatible with the approximations that make the stochastic representation
possible to begin with.
² For defining a reasonable calculus it is also required that the wave packet amplitudes take on a large but finite maximum value.
Here we are also interested in moving beyond an interaction Hamiltonian that is
linear in a†(λ, t) and a(λ, t). This is because the Faraday interaction is
fundamentally quadratic in the field operators, as it describes the scattering of light from one
polarization state to another; see Sec. 2.7. Fortunately, in 2006 Gough extended the
results of Accardi et al. to include scattering or conservation interactions. In classical
stochastic calculus the conversion between a smooth ordinary differential equation
and a stochastic differential equation is given by the Wong-Zakai theorem. Not surprisingly,
the conversion between a smooth Schrödinger equation and a quantum stochastic
differential equation for the propagator is called the quantum Wong-Zakai theorem in
[43].
The specific Hamiltonian that the quantum Wong-Zakai theorem considers, and
the one that we will use here, is the interaction Hamiltonian (sum over repeated
indices with i, j = 1 . . . d)
$$H_{\mathrm{int}}(\lambda, t) = \hbar\big(E_{ij}\, a^\dagger_i(\lambda, t)\, a_j(\lambda, t) + E_{i0}\, a^\dagger_i(\lambda, t) + E_{0j}\, a_j(\lambda, t) + E_{00}\big) \tag{2.145}$$
where $\{E_{\alpha\beta} : \alpha, \beta = 0, \dots, d\}$ are bounded operators acting on a system Hilbert
space $\mathcal{H}_{\mathrm{sys}}$. The terms in Hint(λ, t) physically represent the following:
• E00 is an operator acting solely on system degrees of freedom, independent of
the bosonic modes, with units of frequency, e.g. it could be what remains of
the free system Hamiltonian after transforming to an interaction picture.
• Ei0 is a system operator that accompanies the creation of an excitation in the
ith bosonic mode centered at time t. A canonical example would be an operator
proportional to an atomic lowering operator, with units of 1/√time.
• E0j is a complementary process where, at time t, an excitation in the jth mode
is removed.
• Eij is a unitless system operator weighting an instantaneous scattering of
quanta from the jth mode to the ith. When i = j this can be interpreted
as a system coupling to the number of quanta in that mode at time t.
Note that as this Hamiltonian is required to be self-adjoint, the system operators
must satisfy the constraint $E_{\alpha\beta} = E^\dagger_{\beta\alpha}$.
Adding the quadratic term has an interesting and slightly unexpected effect on
the physics. Again the term $E_{ij}\, a^\dagger_i(\lambda, t)\, a_j(\lambda, t)$ represents the instantaneous transfer
of a photon from mode j to mode i. However for λ > 0, $a^\dagger_i(\lambda, t)$ has temporal
extent, meaning that it is possible for the system to interact again with the scattered
quanta. If the magnitude of Eij is relatively small then the possibility of re-interaction
may be relatively small, but we have yet to impose any such constraint. Fortunately
the whole problem of converting from the equation $\dot{U}(\lambda, t) = -\frac{i}{\hbar} H_{\mathrm{int}}(\lambda, t)\, U(\lambda, t)$
to a quantum stochastic differential equation, including the possibility of multiple
scattering events, was solved by Gough. Some of the details of this derivation are
reviewed in Appendix D.
Intimately related to this conversion is how the formal quantum Ito integral is
related to the operator ordering of the constituent field operators. There exists a
fundamental connection between the rules of quantum Ito calculus (see Appendix
C) and whether an iterated integral containing a sequence of field operators is
either time or normally ordered. This connection was also formalized by Gough in
[43], which Appendix D reviews. The bottom line is that in order for the limit of
the time ordered exponential in Eq. (2.143) to be interpreted as a solution to an
equivalent quantum Ito stochastic integral it must be put into normal order with
all of the annihilation operators to the right of the creation operators. The effects
of the multiple scattering events become mathematically apparent when converting
from the time ordered solution to a normally ordered form.
Before we are able to fully write down the resulting propagator we must address
two important issues. The first is to concretely link the operator ai(λ, t) to the wave
packet theory introduced in this chapter. The second is to consider a kind of field operator
wholly different from what we have considered up to this point.
2.6.1 Quantum white noise in paraxial wave packets
There are several different derivations that lead to bosonic operators a(λ, t) and
a†(λ, t) that result in calling the object limλ→0 a(λ, t) quantum white noise. For
instance Gardiner and Zoller use a wide-bandwidth limit where they assume that the
interaction Hamiltonian Hint is initially specified in the frequency domain and that
the system couples preferentially to frequencies centered at a large transition frequency
ω0. They then assume that the coupling between the system and the operators a(ω)
is nearly flat in a frequency band centered at ω0. When this flat coupling band
is sufficiently wide, the effect is for the system to be interacting with an operator
representing a white spectrum, which is therefore delta correlated [51]. Accardi et al.
take a different approach by considering a weak-coupling/long-time limit. The weak
coupling implies that on short “optical” timescales the system-field interaction can be
considered perturbatively but that on longer “mesoscopic” times the aggregate effect
is nontrivial and the field fluctuations develop a diffusive characteristic. Through a
subtle re-scaling of time, a field operator a(λ, t) emerges and is delta commuting
as λ → 0 [55]. Rather than fixing ourselves to a specific system-field interaction,
this work has focused instead on integrating the language of second quantization
and stochastic processes with a realistic description of classical optics, meaning that
neither model fully fits our needs. Instead we hope to find a description that is not
tied to a specific model but captures the spirit of each.
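Consider the time-ordered exponential
$$U(\lambda, t) = \vec{T} \exp\Big(-\frac{i}{\hbar}\int_0^t ds\, H_{\mathrm{int}}(\lambda, s)\Big) \tag{2.146}$$
where Hint(λ, t) is given in Eq. (2.145). Expanding the exponential to just two
terms shows us that
$$\begin{aligned}
U(\lambda, t) &= 1 - \frac{i}{\hbar}\int_0^t ds\, H_{\mathrm{int}}(\lambda, s) + \dots \\
&= 1 - iE_{ij}\int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s) - iE_{i0}\int_0^t ds\, a^\dagger_i(\lambda, s) \\
&\quad - iE_{0j}\int_0^t ds\, a_j(\lambda, s) - iE_{00}\int_0^t ds + \dots.
\end{aligned} \tag{2.147}$$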
Now consider the creation term, $E_{i0}\, a^\dagger_i(\lambda, s)$. Physically, this means: localized
in a time interval near time s, create an excitation in the ith field mode and, while
you’re at it, apply the system operator Ei0. Suppose the joint system was in a pure
product state |ψi〉|∅〉 and that |ψi〉 happens to be an eigenstate of Ei0 with eigenvalue
hi. Then the time integral over this term acting on this state gives
$$-iE_{i0}\int_0^t ds\, a^\dagger_i(\lambda, s)\,|\psi_i\rangle|\emptyset\rangle = -ih_i\int_0^t ds\, a^\dagger_i(\lambda, s)\,|\psi_i\rangle|\emptyset\rangle. \tag{2.148}$$
This integral looks almost like a smoothed creation operator for a wave packet with
temporal envelope function $h_i(s) = -i\,h_i\,\chi_{[0,t]}(s)$ acting on vacuum. Specifically, Eq.
(2.107) gives the expression for the one-dimensional smooth wave packet $a[g^{(\sigma)}(k, t)]$.
Taking the adjoint of that equation results in
$$a^\dagger[g^{(\sigma)}(k, t)] = \frac{\|g\|}{\sqrt{\tau}} \int_0^\infty ds\, f(s)\, e^{-i\omega_0 s}\, a^\dagger[\varphi^{(\sigma)}(t - s)]. \tag{2.149}$$
The major distinction between this operator and the integral $-ih_i\int_0^t ds\, a^\dagger_i(\lambda, s)$ is
the existence of the carrier phase $e^{-i\omega_0 s}$. This means that in order to interpret the
integral $-iE_{i0}\int_0^t ds\, a^\dagger_i(\lambda, s)$ as creating a system-dependent extended single photon
state, the interaction must be inherently phase modulated at the carrier frequency
ω0. But this is exactly the same statement as that of Gardiner and Collett when they assume
that the system interacts with the bath at a large characteristic frequency. Therefore
without specifying a detailed interaction model we can say that
$$a^\dagger_i(\lambda, s) \cong e^{-i\omega_0 s}\, a^\dagger[\varphi^{(\sigma)}_i(-s)] \tag{2.150}$$
and
$$a_j(\lambda, s) \cong e^{+i\omega_0 s}\, a[\varphi^{(\sigma)}_j(-s)] \tag{2.151}$$
for some system characteristic frequency ω0 and smoothing wave packets $\varphi^{(\sigma)}_i(k, s)$
and $\varphi^{(\sigma)}_j(k, s)$. The presence of the time reversal might be a little puzzling at first,
but this is simply due to the fact that this wave packet was defined with respect to
a convolution, which always time reverses one of the two functions. One possible
mapping to include a set of d distinct modes is to assume the model considers a
set of d paraxial spatial mode functions $u^{(+)}_i(x_T, z)$, which satisfy the orthogonality
relation $\int d^2x_T\, u^*_i(x_T, z) \cdot u_j(x_T, z) = \delta_{ij}\,\sigma_T$.
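To complete this discussion we should identify what the parameter λ means in the
wave packet context. We are able to explicitly compute the unequal time commutator
from Eq. (2.92) by remembering that for the smoothing kernel $\|g\|/\sqrt{\tau} = 1$. This
results in
$$\begin{aligned}
\big[a_i(\lambda, t), a^\dagger_j(\lambda, t')\big] &= e^{+i\omega_0(t - t')} \big\langle \varphi^{(\sigma)}_i(-t),\, \varphi^{(\sigma)}_j(-t') \big\rangle \\
&= \delta_{ij}\Big(\varphi^{(\sigma)} \star \varphi^{(\sigma)}(t - t') - i\,\frac{1}{\omega_0}\, \frac{d\varphi^{(\sigma)}}{dt} \star \varphi^{(\sigma)}(t - t')\Big).
\end{aligned} \tag{2.152}$$
λ is simply a parameter representing the formal limit that as λ → 0, σ → 0 and
$(\sigma\,\omega_0)^{-1} \to 0$.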
2.6.2 The scattering process
Up until now, the only kind of field operator we have considered is a creation operator
associated with a given single particle state g. While these operators are vitally
important, this leaves out the possibility of a whole other class of field operators.
Eq. (2.147) expanded the time-ordered exponential for Ut to first order, which included
the integral $\int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s)$. We will now show that this operator is quite
different from the product of two smeared wave packet operators, particularly in the
limit λ → 0.
Consider the exponential vectors e[f(λ)] and e[h(λ)], with $f, h \in L^2(\mathbb{R}_+) \otimes \mathbb{C}^d$, defined
as
$$|e[f(\lambda)]\rangle \equiv \exp\Big(\int_0^\infty dt\, f_i(t)\, a^\dagger_i(\lambda, t)\Big)|\emptyset\rangle. \tag{2.153}$$
If $\int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s)$ were somehow equivalent to the product a†[g]a[g] as λ → 0
for some wave packet g, we would have the eigenvalue relationship
$$\lim_{\lambda\to 0} \big\langle e[f(\lambda)] \big|\, a^\dagger[g(\lambda)]\, a[g(\lambda)] \,\big| e[h(\lambda)] \big\rangle = \langle f, g\rangle\, \langle g, h\rangle\, \langle e[f]|e[h]\rangle. \tag{2.154}$$
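We can explicitly show that this is not the case. To simplify the notation, we’ll define
the function
$$c_{ij}(\lambda, t - s) \equiv \big[a_i(\lambda, t), a^\dagger_j(\lambda, s)\big], \tag{2.155}$$
which has the property that
$$\lim_{\lambda\to 0} c_{ij}(\lambda, t - s) = \delta_{ij}\,\delta(t - s). \tag{2.156}$$
Explicit calculation then shows that
$$\begin{aligned}
\Big\langle e[f(\lambda)] \Big| \int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s) \Big| e[h(\lambda)] \Big\rangle = \int_0^t ds & \int_0^\infty ds_1\, f^*_\ell(s_1)\, c^*_{i\ell}(\lambda, s - s_1) \\
&\times \int_0^\infty ds_2\, c_{jk}(\lambda, s - s_2)\, h_k(s_2)\, \langle e[f(\lambda)]|e[h(\lambda)]\rangle.
\end{aligned} \tag{2.157}$$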
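Then in the discontinuous limit,
$$\lim_{\lambda\to 0} \Big\langle e[f(\lambda)] \Big| \int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s) \Big| e[h(\lambda)] \Big\rangle = \int_0^t ds\, f^*_i(s)\, h_j(s)\, \langle e[f]|e[h]\rangle. \tag{2.158}$$
In terms of an inner product on single particle wave vectors this is actually equal to
$$\lim_{\lambda\to 0} \Big\langle e[f(\lambda)] \Big| \int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s) \Big| e[h(\lambda)] \Big\rangle = \big\langle f,\, \chi_{[0,t]}\, \mathbf{e}_i \mathbf{e}_j \cdot h \big\rangle\, \langle e[f]|e[h]\rangle. \tag{2.159}$$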
This is clearly not an eigenvalue relationship involving a product of wave packets.
In actuality it is a second quantization of an operator acting on the wave packets
themselves [45, 52]. The specific operator here is multiplication by the indicator
function $\chi_{[0,t]}(s)$ and the dot product into the dyad of basis vectors $\mathbf{e}_i\mathbf{e}_j$. In the
language of the Hudson and Parthasarathy formulation of QSDEs, this operator is a
scattering or conservation process, and is denoted $\Lambda^{ij}_t$. While these processes can
be derived without reference to a limiting integral, for our purposes we simply take this
to be a definition,
$$\Lambda^{ij}_t \equiv \lim_{\lambda\to 0} \int_0^t ds\, a^\dagger_i(\lambda, s)\, a_j(\lambda, s). \tag{2.160}$$
Appendix C reviews the notation and manipulation of QSDEs in terms of the
quantum Ito differentials $dA^i_t$, $dA^{j\,\dagger}_t$ and $d\Lambda^{ij}_t$.
2.6.3 The limiting stochastic propagator
With a firm connection between the interaction Hamiltonian with scattering terms
and the wave packet theory, we are now able to express the limiting Hamiltonian in
terms of an Ito form quantum stochastic differential equation. Appendix C shows
that the most general quantum stochastic integral usually considered defines a
process,
$$U_t = U_0 + \int_0^t d\Lambda^{ij}_s\, F^{ij}_s + \int_0^t dA^{i\,\dagger}_s\, F^{i0}_s + \int_0^t dA^j_s\, F^{0j}_s + \int_0^t ds\, F^{00}_s. \tag{2.161}$$
Sec. C.1 also shows what constraints must be placed on the operators $F^{\alpha\beta}_s$ in
order for Ut to be a unitary process. As any unitary can be written as exp(−iAt) for
some generator At, instead of working with $F^{\alpha\beta}_s$ it is more convenient to define the
operators $G^{\alpha\beta}_s$ so that
$$F^{\alpha\beta}_s = G^{\alpha\beta}_s\, U_s. \tag{2.162}$$
The unitarity constraints written in terms of $G^{\alpha\beta}_s$ are given in Eq. (C.19).
The bottom-line result of the quantum Wong-Zakai theorem is that the operators
$G^{\alpha\beta}_s$ are expressible in terms of the system operators Eαβ defining Hint(λ, t) in Eq.
(2.145) and a matrix of constants κij. As the Eαβ are assumed to be time independent this
results in $G^{\alpha\beta}_s = G^{\alpha\beta}_0$, and so we will omit the time index and demote the superscripts
to subscripts. While the constants κij are in general complex, the simplest of all cases
is when $\kappa_{ij} = \tfrac{1}{2}\delta_{ij}$. Not only is this the simplest of cases, it is also well motivated for
our problem and so we will use it here; see Appendix D.2.2. With these
simplifications, the propagator Ut is expressible as a quantum stochastic Ito integral, which
solves the recursive QSDE,
$$dU_t = G_{ij}\, U_t\, d\Lambda^{ij}_t + G_{i0}\, U_t\, dA^{i\,\dagger}_t + G_{0j}\, U_t\, dA^j_t + G_{00}\, U_t\, dt. \tag{2.163}$$
The limiting coefficients Gαβ are
$$G_{\alpha\beta} = -iE_{\alpha\beta} - \tfrac{1}{2}\, E_{\alpha i} \Big(\frac{1}{1 + i\tfrac{1}{2}E}\Big)_{ij} E_{j\beta}, \tag{2.164}$$
where i and j start from 1 and we defined E as the matrix of operators Eij. The
appearance of this matrix in the denominator is precisely due to the possibility of
having multiple scattering events. A Neumann series is the operator-valued
generalization of a geometric series, so that for an operator A, $\sum_{n=0}^{\infty} A^n = (1 - A)^{-1}$;
the series converges when ‖A‖ < 1, and the right-hand side is well defined whenever
1 − A is invertible. The equivalent Ito coefficients
generate a Neumann series of operators where in this case $A = -i\tfrac{1}{2}E$ and $A^n$
represents a quantum scattering between the modes n times. The limiting coefficient
then involves the i, j component of the operator/matrix inverse $(1 + i\tfrac{1}{2}E)^{-1}$. For
an intuitive physical picture, the coefficients Gαβ can be interpreted in the following
way.
Each coefficient Gαβ can be roughly thought of as a right-to-left acting
transformation occurring on the system, dependent upon how it couples through the field.
The original, direct couplings Eαβ are still present as shown in the first term in Eq.
(2.164). In addition to the direct coupling, there are the effects of coupling through
the various modes. As an example, the second part of the Gi0 coefficient shows that a
photon can be created in the ith mode not just by direct excitation, represented
by the −iEi0 term, but also by first exciting the jth mode, then scattering any
number of times, and then finally being emitted into the ith. The same goes for G00 and
Gij, except these either leave the field unchanged or transfer a quantum from mode
j to mode i.
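For finite-dimensional system operators, Eq. (2.164) can be evaluated directly with block matrices. The following is a minimal sketch: the function name `ito_coefficients` and the array layout `E[alpha, beta]` are assumptions made for illustration, and the choice κij = δij/2 is built in.

```python
import numpy as np

def ito_coefficients(E):
    """Limiting coefficients G_{alpha beta} of Eq. (2.164) with kappa_ij = delta_ij / 2.

    E is an array of shape (d+1, d+1, n, n): E[alpha, beta] is the n x n matrix
    representing the system operator E_{alpha beta}; index 0 is the non-field slot.
    """
    d, n = E.shape[0] - 1, E.shape[2]
    # Assemble the d x d block of scattering operators into one (d n) x (d n) matrix.
    scat = E[1:, 1:].transpose(0, 2, 1, 3).reshape(d * n, d * n)
    inv = np.linalg.inv(np.eye(d * n) + 0.5j * scat)
    inv = inv.reshape(d, n, d, n).transpose(0, 2, 1, 3)   # back to inv[i, j] blocks

    G = np.empty_like(E, dtype=complex)
    for a in range(d + 1):
        for b in range(d + 1):
            multi = sum(E[a, i + 1] @ inv[i, j] @ E[j + 1, b]
                        for i in range(d) for j in range(d))
            G[a, b] = -1j * E[a, b] - 0.5 * multi
    return G
```

Inverting the full block matrix, rather than summing the Neumann series term by term, keeps the sketch valid even when the scattering operators are not small. For a self-adjoint scattering block the combination 1 + G restricted to the field indices works out to a Cayley transform, so it should come out unitary.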
2.6.4 A simple 1D example
Nearly the simplest of all nontrivial examples of the quantum Wong-Zakai theorem
is when d = 1, and
$$E_{11} = 0,\quad E_{10} = i\sqrt{\gamma}\,D,\quad E_{01} = -i\sqrt{\gamma}\,D^\dagger,\quad\text{and}\quad E_{00} = H_{\mathrm{sys}} \tag{2.165}$$
where D and Hsys are system operators. In other words, these are the coefficients
for a total Hamiltonian
$$H_{\mathrm{int}}(\lambda, t) = \hbar\big(i\sqrt{\gamma}\,D\,a^\dagger(\lambda, t) - i\sqrt{\gamma}\,D^\dagger a(\lambda, t) + H_{\mathrm{sys}}\big). \tag{2.166}$$
This is an extremely common model in quantum optics where an atomic dipole
operator D couples with the rate γ to a quantized quasi-monochromatic electric field,
with “white noise” creation operator a†(λ, t). Hsys is the remaining system operator
which includes any residual detuning of the field mode from the system transition
frequency or any externally applied controls. The quantum Wong-Zakai theorem
states that this pre-limit Hamiltonian generates a propagator with the coefficients
$$G_{11} = 0,\quad G_{10} = \sqrt{\gamma}\,D,\quad G_{01} = -\sqrt{\gamma}\,D^\dagger,\quad\text{and}\quad G_{00} = -iH_{\mathrm{sys}} - \tfrac{1}{2}\gamma\,D^\dagger D. \tag{2.167}$$
This results in the propagator Ut satisfying the QSDE
$$dU_t = \big(\sqrt{\gamma}\,D\,dA^\dagger_t - \sqrt{\gamma}\,D^\dagger dA_t - iH_{\mathrm{sys}}\,dt - \tfrac{1}{2}\gamma\,D^\dagger D\,dt\big)\, U_t. \tag{2.168}$$
In the next section we will apply the quantum Wong-Zakai theorem to the much more
interesting case of the Faraday interaction, where d = 2 and $E_{ij} \neq 0$.
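As a worked check, the coefficients in Eq. (2.167) follow from Eq. (2.164) in a few lines. With d = 1 and $E_{11} = 0$ the operator inverse reduces to the identity, so only the direct term and a single product survive:

```latex
\begin{aligned}
G_{\alpha\beta} &= -iE_{\alpha\beta} - \tfrac{1}{2}\,E_{\alpha 1}E_{1\beta}
   \qquad (E_{11} = 0 \;\Rightarrow\; (1 + i\tfrac{1}{2}E)^{-1} = 1), \\
G_{11} &= 0, \\
G_{10} &= -i\,(i\sqrt{\gamma}\,D) - \tfrac{1}{2}\,E_{11}E_{10} = \sqrt{\gamma}\,D, \\
G_{01} &= -i\,(-i\sqrt{\gamma}\,D^\dagger) - \tfrac{1}{2}\,E_{01}E_{11} = -\sqrt{\gamma}\,D^\dagger, \\
G_{00} &= -iH_{\mathrm{sys}} - \tfrac{1}{2}\,(-i\sqrt{\gamma}\,D^\dagger)(i\sqrt{\gamma}\,D)
        = -iH_{\mathrm{sys}} - \tfrac{1}{2}\gamma\,D^\dagger D .
\end{aligned}
```

The damping term $-\tfrac{1}{2}\gamma D^\dagger D$ therefore arises entirely from the second term of Eq. (2.164), with no contribution from multiple scattering in this example.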
2.7 The Faraday Interaction
The Faraday interaction is physically based on an optical field propagating in a
polarizable medium. In classical optics it is used as a magneto-optical effect, where
the polarization of a linearly polarized probe is rotated by an amount proportional
to the component of the magnetic field parallel to the direction of propagation.
At a macroscopic level, it is modeled in terms of the energy shift of a polarizable
particle induced by an oscillating electric field. Such an energy shift can be easily
implemented and precisely controlled by applying a quasi-monochromatic laser to
a rarefied monatomic gas. The laser’s effect on a single atom can be considered a
perturbation to its atomic ground states, if the laser drive is in a “low saturation”
regime. Specifically the laser’s carrier frequency must be close to, but significantly
off resonance from, a ground state transition. Furthermore the intensity must be
small enough so that the total number of excited atoms will be negligibly small.
Not surprisingly, the details of the ground state atomic structure affect both the
magnitude and direction of the induced polarization and often lead to effects beyond
a simple linear rotation of the probe polarization. The derivation of this kind of
interaction is elegantly presented by Deutsch and Jessen in [5]. Here we will consider
the simplest of all settings where the atomic ground state is given by a spin 1/2
particle. Such a ground state is experimentally realizable if the atom has a single
valence electron and negligible hyperfine structure, or if two valence electrons form
a spin singlet ground state and the nucleus has a total spin I = 1/2.
In either the classical or quantum mechanical setting, the polarizability
Hamiltonian for an atom located at position $\mathbf{r}_a$ is given by
$$H_{\mathrm{polar}} = -\mathbf{E}^{(-)}(\mathbf{r}_a) \cdot \overleftrightarrow{\alpha} \cdot \mathbf{E}^{(+)}(\mathbf{r}_a) \tag{2.169}$$
where $\overleftrightarrow{\alpha}$ is the polarizability tensor. Quantum mechanically, $\overleftrightarrow{\alpha}$ is an operator
acting solely on the atomic ground states. It is worth noting that in any real system
there will be additional decoherence due to spontaneous emission, which in this
treatment we will ignore. Shortly, we will note that the strength of the coherent
interaction is proportional to Γ/∆ where Γ is the excited state decay rate and ∆ is
the probe detuning from the atomic resonance. One can additionally show that the
incoherent photon scattering is proportional to Γ/∆². When Γ/∆ is small, Γ/∆² is
smaller and so the decoherence is often ignored.
The connection to the quantum Wong-Zakai theorem is that Hpolar is quadratic
in the field operators. As $\mathbf{E}^{(+)}(\mathbf{r}_a) \propto a^\dagger$ and $\mathbf{E}^{(-)}(\mathbf{r}_a) \propto a$, the components of $\overleftrightarrow{\alpha}$
will be identifiable with the operators Eij. By ignoring spontaneous emission we are
also able to restrict our attention to a single quasi-monochromatic paraxial mode.
While in principle the atom couples to any electric field at its location, including the
quantized vacuum, we will be applying a coherent displacement in a definite mode
$u^{(+)}(x_T, z)$, with an envelope function f(t). As we have seen, this envelope function
is expressible in terms of a convolution with the smoothing function $\varphi^{(\sigma)}(t)$, see Sec.
2.6.1, and so the relevant field operators we will be considering are a(λ, t) and its
adjoint. In other words, we are simply using this as an example for applying all of
the theoretical machinery developed in this chapter.
Before finally writing down Hint(λ, t), we will make one more extremely useful
but only marginally justifiable approximation. Here we assume that the spatial
distribution of the atoms will be irrelevant and all atoms in the ensemble can be
treated as existing at the same location in space. From the point of view of the
slowly varying envelope, this is a reasonable assumption if the dimension of the gas
along the direction of propagation is on the order of c σ. If the longitudinal extent of
the atoms becomes significant then Hint would have to treat atoms at the beginning
of the gas differently from the atoms at the end. In addition to having a spatial-
temporal dependence, any realistic paraxial beam will have some intensity variation
in both xT and z. If we take u(+)(xT , z) to be a standard Hermite-Gaussian beam,
the transverse and longitudinal intensity can be treated as approximately constant if
the gas has a transverse area that is small when compared to the beam’s characteristic
area σT . However, if a beam of a fixed input power has a large transverse area then
it will have a relatively low intensity at any given point relative to a beam with a
smaller σT . Ultimately this means that the more uniform the probe is, the weaker
the overall interaction will be.
With all of the above caveats and assumptions the Faraday interaction Hamiltonian is
$$H_{\rm int}(\lambda,t) = \hbar\,\frac{\chi_0}{3}\,J_z\left(a_r^\dagger(\lambda,t)\,a_r(\lambda,t) - a_l^\dagger(\lambda,t)\,a_l(\lambda,t)\right) \tag{2.170}$$
with the operators and constants defined through the following. For a spin-1/2 ground
state, the single-atom polarizability $\overleftrightarrow{\alpha}$ is diagonal in the circular polarization basis,
$e_r$ and $e_l$. Up to an irrelevant global energy shift, $\overleftrightarrow{\alpha} \propto \sigma_z\,(e_r^* e_r - e_l^* e_l)$, where $\sigma_z$
is the Pauli z operator and the proportionality constant depends upon the specifics
of the atomic physics. By assuming that all of the atoms exist at the same location
in space, computing the Hamiltonian for the whole ensemble reduces to computing
a sum over the individual polarizabilities, which further reduces to summing over all
of the $\sigma_z$ operators. It is well known that with N spin-1/2 particles, a collective
pseudo-spin J can be defined whose components (i = x, y, z) are
$$J_i = \sum_{n=1}^{N} \frac{1}{2}\,\sigma_i^{(n)}. \tag{2.171}$$
The details of finding the proper proportionality constant to express the total interaction as
Eq. (2.170) are given in [5], but the dimensionless constant χ0 has an exceedingly
simple form,
$$\chi_0 = \frac{\sigma_0}{\sigma_T}\,\frac{\Gamma}{2\Delta} \tag{2.172}$$
where σ0 is the resonant scattering cross-section for the given transition. This constant
can be viewed as giving the probability that a single atom absorbs and re-emits a
photon into the paraxial beam. For all of the above approximations to be valid
(e.g. assuming that the intensity is nearly constant and that spontaneous emission is
negligible), we require both $\sigma_0 \ll \sigma_T$ and $\Gamma \ll \Delta$, meaning that $\chi_0 \ll 1$.
When proving the convergence of U(λ, t) in the sense of the matrix elements
$\lim_{\lambda\to 0}\langle e[f(\lambda)]|\,U(\lambda,t)\,|e[h(\lambda)]\rangle$, Gough used the useful relationship
$$\lim_{\lambda\to 0}\langle e[f(\lambda)]|\,U(\lambda,t)\,|e[h(\lambda)]\rangle = \lim_{\lambda\to 0}\langle \emptyset|\,\tilde{U}(\lambda,t)\,|\emptyset\rangle \tag{2.173}$$
where $\tilde{U}$ is the propagator, except with the replacements
$$a_i(\lambda,t) \to a_i(\lambda,t) + h_i(t) \quad\text{and}\quad a_i^\dagger(\lambda,t) \to a_i^\dagger(\lambda,t) + f_i^*(t). \tag{2.174}$$
As we are applying this limit when the field has a coherent displacement, it is
sufficient to work with these displaced versions and pretend that the field is in the
vacuum. The heart of the Faraday interaction is the rotation of a linearly polarized
input, and so we will assume that our displacement is linearly polarized and that
the equivalent amplitude function $f(t) \in L^2 \otimes \mathbb{C}^2$ represents the displacement in the
electric field. With these two constraints we have that
$$f_r(t) = f_l^*(t) = \frac{i}{\sqrt{2}}\,f_0(t) \tag{2.175}$$
where $f_0(t)$ is real-valued. Typically, in experiments involving the Faraday interaction,
the driving displacement is switched on to a constant value for the
duration of the experiment. In this case $\max(f_0) = \sqrt{N_L/\tau}$, where $N_L$ is the average
number of photons in a pulse of duration τ .
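In the case of this displacement, we have the effective vacuum Hamiltonian
$$H_{\rm int}(\lambda,t) = \hbar\,\frac{\chi_0}{3}\,J_z\Big(\big(a_r^\dagger(\lambda,t) + f_r^*(t)\big)\big(a_r(\lambda,t) + f_r(t)\big) - \big(a_l^\dagger(\lambda,t) + f_l^*(t)\big)\big(a_l(\lambda,t) + f_l(t)\big)\Big). \tag{2.176}$$
It is useful to define the function
$$\kappa(t) = \left(\frac{\chi_0}{3}\,f_0(t)\right)^2. \tag{2.177}$$
When $f_0(t)$ is held at a constant level we have a characteristic rate $\kappa = (\chi_0/3)^2 N_L/\tau$.
From this definition we can expand out the interaction
$$H_{\rm int}(\lambda,t) = \hbar\,\frac{\chi_0}{3}\,J_z\left(a_r^\dagger(\lambda,t)\,a_r(\lambda,t) - a_l^\dagger(\lambda,t)\,a_l(\lambda,t)\right) + i\hbar\sqrt{\frac{\kappa(t)}{2}}\,J_z\left(a_r^\dagger(\lambda,t) + a_l^\dagger(\lambda,t) - a_r(\lambda,t) - a_l(\lambda,t)\right). \tag{2.178}$$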
Sec. 2.6 discusses the operators Eij as representing the scattering of field quanta
in a system-dependent way. In addition, the limiting coefficients show that when
Eij takes on relatively large values, there is a possibility for multiple scattering
events. In the practical approximations that resulted in $\chi_0 \ll 1$, the probability
for multiple scattering is relatively small, unless Jz takes on obscenely large values.
Note that when $f_0(t) \gg 1$, there can still be a significant interaction, as this
means that the second, linear term dominates the interaction. Also note that we have
$\frac{1}{\sqrt{2}}\big(a_r(\lambda,t) + a_l(\lambda,t)\big) = a_h(\lambda,t)$, i.e. an annihilation operator for horizontal
polarization. Dropping the quadratic terms in favor of the terms with a large displacement,
we now have an approximately linear interaction in a single horizontally polarized
mode,
$$H_{\rm int}(\lambda,t) \approx i\hbar\sqrt{\kappa(t)}\,J_z\left(a_h^\dagger(\lambda,t) - a_h(\lambda,t)\right). \tag{2.179}$$
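The algebra leading from Eq. (2.176) to Eqs. (2.178) and (2.179) can be checked numerically. The following is a minimal sketch, assuming truncated Fock-space matrices for the two circular modes and hypothetical values for χ0 and f0; the factors of ħ and Jz are common to every term and are left out. The helper `annihilation` and all variable names are illustrative and do not come from the text.

```python
import numpy as np

def annihilation(dim):
    """Truncated annihilation operator on a Fock space of dimension `dim`."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def dag(op):
    """Hermitian conjugate."""
    return op.conj().T

dim = 6                              # Fock truncation (illustrative choice)
a = annihilation(dim)
one = np.eye(dim)
a_r = np.kron(a, one)                # right-circular mode
a_l = np.kron(one, a)                # left-circular mode
Id = np.eye(dim * dim)

chi0, f0 = 0.05, 40.0                # hypothetical values, not from the text
f_r = 1j * f0 / np.sqrt(2)           # Eq. (2.175)
f_l = np.conj(f_r)                   # f_r = f_l^*  implies  f_l = f_r^*
kappa = (chi0 * f0 / 3) ** 2         # Eq. (2.177)

# Field part of Eq. (2.176), with hbar and J_z factored out
displaced = (chi0 / 3) * (
    (dag(a_r) + np.conj(f_r) * Id) @ (a_r + f_r * Id)
    - (dag(a_l) + np.conj(f_l) * Id) @ (a_l + f_l * Id)
)

# Field part of Eq. (2.178): quadratic (scattering) term plus linear term
quadratic = (chi0 / 3) * (dag(a_r) @ a_r - dag(a_l) @ a_l)
linear = 1j * np.sqrt(kappa / 2) * (dag(a_r) + dag(a_l) - a_r - a_l)
expanded = quadratic + linear

# Linear term rewritten with the horizontal mode, as in Eq. (2.179)
a_h = (a_r + a_l) / np.sqrt(2)
linear_h = 1j * np.sqrt(kappa) * (dag(a_h) - a_h)

print(np.allclose(displaced, expanded), np.allclose(linear, linear_h))
```

The constant terms |f_r|² and |f_l|² cancel because the two circular components carry equal intensity, which is why no c-number offset appears in Eq. (2.178).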
From the 1-D example this means that we have $G_{10} = \sqrt{\kappa(t)}\,J_z$, and so if we add
a time-dependent system control Hamiltonian $H_c(t)$ to this expression we have a
propagator
$$dU_t = \left(\sqrt{\kappa(t)}\,J_z\,dA_t^{h\,\dagger} - \sqrt{\kappa(t)}\,J_z\,dA_t^{h} - \tfrac{1}{2}\kappa(t)\,J_z^2\,dt - i H_c(t)\,dt\right)U_t. \tag{2.180}$$
This is the propagator that we will be considering in Chaps. 4 and 5.
It is worth noting that the Faraday interaction has been applied to several differ-
ent continuous measurement models in the QSDE formalism with varying levels of
initial assumptions [56, 57]. In [56], the free field was assumed to be well modeled
by a QSDE, and a Faraday-like interaction was derived by adiabatically eliminating
an excited state as well as an artificial cavity mode. This left many questions
unanswered, as the derivation was made strictly through a single-mode picture and
did not address the fundamental two-mode structure of the Faraday interaction.
In contrast, [57] considered a scattering interaction; however, there they simply took
the fundamental scattering interaction before the displacement and substituted the
scattering processes $d\Lambda_t^{rr}$ and $d\Lambda_t^{ll}$ for the white noise operators. What that model
failed to consider was the effect of normally ordering the field operators in obtaining
the proper Ito correction. In the language of the quantum Wong-Zakai theorem, the
propagator initially considered in [57] should have been interpreted as a quantum
Stratonovich equation and not a quantum Ito equation.
2.7.1 The quadratic Faraday interaction
It would be a shame to discuss the full solution to the quantum Wong-Zakai theorem
and not give an example that retains the scattering interaction. The Faraday interaction
is a prime candidate for this, in a regime where the drive is weak, $f_0(t) \sim 1$,
but it is still possible to see some kind of effect. From Eq. (2.178) we can identify
$$\begin{aligned}
E_{rr} &= -E_{ll} = \frac{\chi_0}{3}\,J_z, \\
E_{r0} &= E_{l0} = i\sqrt{\frac{\kappa(t)}{2}}\,J_z, \\
E_{0r} &= E_{0l} = -i\sqrt{\frac{\kappa(t)}{2}}\,J_z, \quad\text{and} \\
E_{00} &= 0.
\end{aligned} \tag{2.181}$$
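We can substitute these operators into the coefficients $G_{\alpha\beta}$ in Eq. (2.164). After
some algebraic simplifications we find that
$$\begin{aligned}
G_{rr} &= G_{ll}^\dagger = \frac{-i\frac{\chi_0}{3}J_z}{1 + i\frac{\chi_0}{6}J_z}, \\
G_{r0} &= -G_{0r} = \frac{\sqrt{\frac{\kappa(t)}{2}}\,J_z}{1 + i\frac{\chi_0}{6}J_z}, \\
G_{l0} &= -G_{0l} = \frac{\sqrt{\frac{\kappa(t)}{2}}\,J_z}{1 - i\frac{\chi_0}{6}J_z}, \quad\text{and} \\
G_{00} &= -\frac{\frac{\kappa(t)}{2}\,J_z^2}{1 + \left(\frac{\chi_0}{6}J_z\right)^2}.
\end{aligned} \tag{2.182}$$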
App. C reviews the usual formulation of the propagator $dU_t$ in terms of the operators
$S_{ij}$, $L_i$ and $H$. Some more simple algebra shows that
$$\begin{aligned}
S_{rr} &= S_{ll}^\dagger = \frac{1 - i\frac{\chi_0}{6}J_z}{1 + i\frac{\chi_0}{6}J_z}, \\
S_{rl} &= S_{lr} = 0, \\
L_r &= L_l^\dagger = \frac{\sqrt{\frac{\kappa(t)}{2}}\,J_z}{1 + i\frac{\chi_0}{6}J_z}, \quad\text{and} \\
H &= 0.
\end{aligned} \tag{2.183}$$
In order for $dU_t$ to be a unitary process, $S_{ij}$ must form a unitary matrix of operators,
i.e. $S_{ij}^\dagger S_{jk} = S_{ij} S_{jk}^\dagger = \delta_{ik}$ (with an implied sum over j), which is clearly satisfied in this case. So finally the
Faraday interaction generates the propagator $U_t$, which solves the QSDE
$$dU_t = \Big((S_{rr} - 1)\,d\Lambda_t^{rr} + (S_{rr}^\dagger - 1)\,d\Lambda_t^{ll} + L_r\,(dA_t^{r\,\dagger} - dA_t^{r}) + L_r^\dagger\,(dA_t^{l\,\dagger} - dA_t^{l}) - L_r^\dagger L_r\,dt\Big)U_t \tag{2.184}$$
with the initial value $U_0 = 1$. In the small χ0 limit we were able to write the
propagator in terms of the linearly polarized field operators $A_t^h$ and $A_t^{h\,\dagger}$. This is not
the case here, as the right and left polarization states have different atomic coupling
operators. This might result in some system-dependent ellipticity in the
probe laser; however, more analysis is clearly needed.
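The statements about Eqs. (2.182) and (2.183) are easy to confirm numerically because every operator is a function of Jz alone. The sketch below assumes a small ensemble restricted to the symmetric (maximal J) subspace, in which Jz is diagonal, and uses hypothetical values for χ0 and κ; the variable names are illustrative only.

```python
import numpy as np

N = 4                                     # number of spin-1/2 atoms (illustrative)
J = N / 2
Jz = np.diag(np.arange(J, -J - 1, -1))    # J_z in the symmetric subspace (assumption)
one = np.eye(N + 1)
chi0, kappa = 0.3, 2.0                    # hypothetical values, not from the text

inv_plus = np.linalg.inv(one + 1j * chi0 / 6 * Jz)

# Eq. (2.183)
S_rr = (one - 1j * chi0 / 6 * Jz) @ inv_plus
S_ll = S_rr.conj().T
L_r = np.sqrt(kappa / 2) * Jz @ inv_plus
L_l = L_r.conj().T

# Eq. (2.182)
G_rr = -1j * chi0 / 3 * Jz @ inv_plus
G_00 = -(kappa / 2) * Jz @ Jz @ np.linalg.inv(one + (chi0 / 6) ** 2 * Jz @ Jz)

unitary = np.allclose(S_rr @ S_rr.conj().T, one)
g_matches_s = np.allclose(G_rr, S_rr - one)
drift_matches = np.allclose(G_00, -0.5 * (L_r.conj().T @ L_r + L_l.conj().T @ L_l))
print(unitary, g_matches_s, drift_matches)
```

Since S_rl = S_lr = 0, unitarity of the full 2×2 operator matrix reduces to unitarity of S_rr (and hence of S_ll), which is what the first check tests.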
The power of writing the propagator in terms of the (S, L,H) parameters is
that a large number of results have already been computed for general coefficients
which can simply be applied here. As an example, suppose we wish to compute the
unconditioned master equation of the atomic system, assuming that, other than the
coherent drive laser, the field is in vacuum. Then the master equation
is given in Lindblad form with jump operators $L_r$ and $L_l$. Specifically, the system
density operator ρ(t) is the solution to
$$\begin{aligned}
\frac{d\rho}{dt} &= \mathcal{D}[L_r](\rho) + \mathcal{D}[L_l](\rho) \\
&= L_r\,\rho\,L_r^\dagger - \tfrac{1}{2}L_r^\dagger L_r\,\rho - \tfrac{1}{2}\rho\,L_r^\dagger L_r + L_l\,\rho\,L_l^\dagger - \tfrac{1}{2}L_l^\dagger L_l\,\rho - \tfrac{1}{2}\rho\,L_l^\dagger L_l \\
&= \tfrac{1}{2}\kappa(t)\left(\frac{J_z}{1 + i\frac{\chi_0}{6}J_z}\,\rho\,\frac{J_z}{1 - i\frac{\chi_0}{6}J_z} + \frac{J_z}{1 - i\frac{\chi_0}{6}J_z}\,\rho\,\frac{J_z}{1 + i\frac{\chi_0}{6}J_z} - \frac{J_z^2}{1 + \left(\frac{\chi_0}{6}J_z\right)^2}\,\rho - \rho\,\frac{J_z^2}{1 + \left(\frac{\chi_0}{6}J_z\right)^2}\right).
\end{aligned} \tag{2.185}$$
Here we can see yet again that when $\chi_0 \to 0$ we recover the standard dissipative
master equation with measurement operator $\sqrt{\kappa(t)}\,J_z$. Also note that when the system
is prepared in either an eigenstate or a mixture of eigenstates of Jz then it does not
evolve in time.
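Both closing remarks about Eq. (2.185) can be checked with a few lines of linear algebra. The following sketch assumes a three-level Jz (a J = 1 pseudo-spin) and hypothetical values of κ and χ0; the function names `lindblad_D` and `faraday_rhs` are illustrative and are not taken from the text.

```python
import numpy as np

def lindblad_D(L, rho):
    """Lindblad dissipator D[L](rho) = L rho L^dag - (1/2){L^dag L, rho}."""
    Ld = L.conj().T
    return L @ rho @ Ld - 0.5 * (Ld @ L @ rho + rho @ Ld @ L)

def faraday_rhs(rho, Jz, kappa, chi0):
    """Right-hand side of Eq. (2.185) with L_r, L_l taken from Eq. (2.183)."""
    one = np.eye(Jz.shape[0])
    L_r = np.sqrt(kappa / 2) * Jz @ np.linalg.inv(one + 1j * chi0 / 6 * Jz)
    L_l = L_r.conj().T
    return lindblad_D(L_r, rho) + lindblad_D(L_l, rho)

Jz = np.diag([1.0, 0.0, -1.0])                  # J = 1 example (assumption)
kappa, chi0 = 1.5, 0.4                          # hypothetical values

mixture = np.diag([0.5, 0.3, 0.2]).astype(complex)    # mixture of J_z eigenstates
plus = np.ones((3, 1)) / np.sqrt(3)
superposition = (plus @ plus.T).astype(complex)       # coherent superposition

drho_mix = faraday_rhs(mixture, Jz, kappa, chi0)
drho_sup = faraday_rhs(superposition, Jz, kappa, chi0)
drho_chi_zero = faraday_rhs(superposition, Jz, kappa, 0.0)
standard = lindblad_D(np.sqrt(kappa) * Jz, superposition)

print(np.allclose(drho_mix, 0), np.allclose(drho_chi_zero, standard))
```

The mixture is stationary because every operator in the equation commutes with Jz, while the superposition loses its coherences; the trace of the right-hand side vanishes in both cases.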
Chapter 3
Classical and Quantum Probability
Theory
This chapter serves two purposes. The first and primary intention is to present a
number of known results from classical and quantum probability theory, which will
serve as a foundation for the novel work in later chapters. Those results rely on a
detailed knowledge of the (classical) statistical properties of the Wiener process and
so we review them here. Additionally, we need to know how to extract a classical
Wiener process from a fundamentally quantum system. The procedure of identifying
a stochastic process embedded in a quantum system is one useful application of a
more general mapping between quantum systems and classical probability theory.
The second purpose of this chapter is to emphasize the power of this technique and to
discuss how the language of classical probability theory can be used to identify certain
symmetries that might exist in a quantum system. In order to do this coherently,
we also review some of the basic elements of classical probability theory.
By working in the language of classical probability theory, the tools of nearly 80
years of classical mathematical analysis can be applied to quantum problems, with
one important example being a continuous-time quantum filter. We will not rederive
it here, merely discuss its origins, limitations and various formulations. A particularly
important form is the conditional master equation, an equation that is in some
sense semiclassical and can be viewed as being generated by the quantum-to-classical
mapping. Here we use the term semiclassical in the sense that the measurement
record is modeled as a real valued classical stochastic process whose statistics are
given by a quantum system expectation (see Sec. 3.4). Chaps. 4 and 5 work
exclusively with this equation, albeit in three different variations. The final topic of
this chapter is to show how these various forms are derived.
3.1 Classical Probability Theory
A physics Ph.D. program does not generally include a course in measure theory or
axiomatic probability theory. Most physics problems only consider a handful of dis-
crete or real-valued random variables and so applying a full measure theoretic context
is unnecessary. However in some instances, working only with a probability density
function becomes either intractable or conceptually problematic. One example is
when one is attempting to understand the behavior of a random function defined
over continuous time. In principle, this requires describing an uncountable number
of random variables, one for each possible time, where the density function at a given
time could be highly correlated with past (and maybe even future) times.
Furthermore, when adding the possibility of statistical inference to the picture,
defining individual density functions becomes even more convoluted. Consider trying
to estimate the history of the random variable xt based upon a continuous observation
of a nonlinear function of x, e.g. f(xt) = sin(xt). Writing down joint and marginal
density functions for xt and f(xt) is not particularly straightforward, as they are
clearly distinct objects but are hardly independent. In the long run, a much more
efficient way of doing business is to decouple the notions of random events and their
associated probabilities from the specifics of any one random variable. By finding a
way to associate xt, f(xt), and maybe even a third random variable y to the same
underlying structure of events, we can then calculate the probability associated with
those events, independent of the specifics of x, y or f(x). The way this decoupling
is made is by invoking some of the structure found in measure theory.
An axiomatized probability model contains three elements, usually written as the
triple (Ω,F ,P), with Ω being a sample space, F a σ-algebra of events, and P a
probability measure over those events [22, 53, 58]. We will now discuss each element
including specific examples. Ultimately, we are interested in describing diffusive
measurements and so we will focus on the example of Brownian motion. Brownian
motion is the canonical example for a system experiencing unforced diffusion and the
Wiener process is the most widely used mathematical model for such a system. Chap.
2 already encountered an instance of a Wiener process, in the vacuum statistics of
the quadrature At + A†t .
The first element of a probability space, Ω, is called the sample space and describes
the set of all possible outcomes of the model. In a system with a discrete number of
outcomes, such as a flip of a coin or a roll of a die, Ω is simply the set of all possible
outcomes. For the coin Ω = {heads, tails} and for the die Ω = {1, 2, 3, 4, 5, 6}. In
addition to these discrete examples, the sample space could also be uncountably
infinite. For a Brownian particle moving in d dimensions, the sample space is the
space of all possible trajectories. As a particle's trajectory must be a continuous
real-valued function, Ω is then the set of all continuous functions of time [58],
$$\Omega = \left\{\omega(t) : \mathbb{R}^+ \to \mathbb{R}^d,\ \omega\ \text{continuous}\right\}. \tag{3.1}$$
The next element of the probability model, F , is a σ-algebra over the sample
space. This represents any “sensible” question we can ask about the various out-
comes. Each object in the algebra represents such a question and is called an event.
In this formalism, probabilities are computed not from the individual outcomes in
Ω, but instead from the events in F . The reason for this distinction is to exclude
pathological cases that arise when working with uncountable sets and is the same
reason measure theory was developed. When the sample space Ω is uncountably
infinite, one can find highly pathological sets that can be used to obtain paradoxical
results. For instance, by choosing just 5 disjoint subsets from the unit ball, one can
construct, simply through translations and rotations, two independent and identical
copies of that ball [59]. It would be problematic for a probability model to consider
these kinds of sets, as one could then double the probability for picking a point in
the unit ball simply by doubling the ball. Identifying the elements of F with sensible
questions means that we are excluding these kinds of pathologies.
In the discrete case the sensible questions are things like, “Did the die land
with an even number?”, “Did it land showing the number 6?”, or even “Did it
land showing any number 1 through 6?”. Mathematically, these questions represent
sets of the underlying outcomes. These correspond to the sets {2, 4, 6}, {6}, and
{1, 2, 3, 4, 5, 6} = Ω respectively. The σ-algebra F is the set of these sets, representing
any possible question (event) we can ask about the system. For a finite and discrete
number of outcomes, F is usually the power set, in that it is the set of all possible
sets one can make out of Ω. Operationally speaking, a σ-algebra has the following
definition [53]. A σ-algebra F is a collection of subsets of Ω satisfying the following
three properties¹:
1. If a countable number of sets $\{A_n\}_{n\in\mathbb{N}}$ are in F then $\cup_n A_n \in$ F .
2. If A is a set in F then its complement, $A^c$, is also in F .
3. F must contain the space Ω, and therefore by the second property, its complement
the empty set ∅.
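For a finite sample space the three properties above can be checked by brute force. The sketch below does this for the die example; because Ω is finite, closure under pairwise unions is enough to stand in for countable unions. The helper names `power_set` and `is_sigma_algebra` are illustrative, not from the text.

```python
from itertools import chain, combinations

omega = frozenset(range(1, 7))          # sample space of a six-sided die

def power_set(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}

def is_sigma_algebra(F, omega):
    """Check the three defining properties for a finite collection F."""
    has_omega = omega in F
    closed_complement = all(omega - A in F for A in F)
    closed_union = all(A | B in F for A in F for B in F)
    return has_omega and closed_complement and closed_union

F = power_set(omega)                                   # the usual choice for a die
even = frozenset({2, 4, 6})
coarse = {frozenset(), even, omega - even, omega}      # generated by "is it even?"
not_closed = {frozenset(), even, omega}                # lacks the complement of `even`

print(len(F), is_sigma_algebra(F, omega),
      is_sigma_algebra(coarse, omega), is_sigma_algebra(not_closed, omega))
```

The `coarse` collection shows that the power set is not the only valid σ-algebra on Ω: it contains only the events that can be decided by knowing whether the roll was even.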
For the case of Brownian motion, F is the σ-algebra of all “cylinder sets”, which
are defined in the following way [58]. For real-valued random variables, probabilities
are given in terms of intervals. The probability that a random variable x, with
probability density p(x), has a value in the interval [a, b] is given by the integral
$\int_a^b dx\,p(x)$. Here the event is the interval [a, b] and is an element of the Borel
σ-algebra, B. This is essentially the collection of sets generated by all intervals, open
and closed, over the real line.

¹ The “σ” in σ-algebra is meant to signify “countable” [58].
However, at any given time, a d-dimensional Brownian motion will take on values
in Rd. In order to ask if a trajectory landed in some interval, I, we must also specify
an associated time, t, for that measurement. A basic cylinder set is then specified
by both a time and an interval. The actual set, C(t; I), is the set of all Brownian
trajectories that are in I at time t,
$$C(t; I) = \{\omega \in \Omega : \omega(t) \in I\}. \tag{3.2}$$
A trivial example is the set $C(t; \mathbb{R}^d) = \Omega$, i.e. all continuous trajectories will have a
value in $\mathbb{R}^d$ at any time t. A nontrivial example in one dimension is to ask for the
set of all trajectories that are between a = −10 µm and b = 5 µm at time t = 5 ms.
In addition, questions that involve multiple times are also sensible. It is per-
fectly reasonable to ask, “what one-dimensional trajectories are in I1 = (a1, b1)
at time t1 and in I2 = [a2, b2) at time t2 > t1?” This is also a cylinder set,
C(t1, t2; I1, I2). An image that might be helpful is to imagine that the cylinder set
C(t1, t2, . . . tn; I1, I2, . . . In) defines the set of trajectories that successfully navigates
the “slalom” defined by these intervals and these times.
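The “slalom” picture can be made concrete by sampling trajectories and testing membership. The sketch below is only illustrative: it assumes a discretized Gaussian random walk as a stand-in for continuous paths (no measure has been specified yet at this point in the text), treats intervals as half-open [a, b), and uses the nearest grid time for each requested time. The function `in_cylinder` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, dt = 2000, 100, 0.01
times = dt * np.arange(n_steps + 1)

# Stand-in sample paths: cumulative sums of Gaussian increments (assumption)
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
paths = np.concatenate(
    [np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1
)

def in_cylinder(paths, times, t_list, intervals):
    """Boolean mask of sampled paths lying in C(t_1,...,t_n; I_1,...,I_n)."""
    mask = np.ones(len(paths), dtype=bool)
    for t, (a, b) in zip(t_list, intervals):
        k = int(np.argmin(np.abs(times - t)))       # nearest grid time
        mask &= (paths[:, k] >= a) & (paths[:, k] < b)
    return mask

I1, I2 = (-0.5, 0.5), (0.0, np.inf)
C1 = in_cylinder(paths, times, [0.5], [I1])
C2 = in_cylinder(paths, times, [1.0], [I2])
C12 = in_cylinder(paths, times, [0.5, 1.0], [I1, I2])
trivial = in_cylinder(paths, times, [0.3], [(-np.inf, np.inf)])

print(np.array_equal(C12, C1 & C2), trivial.all())
```

A multi-time cylinder set is the intersection of the single-time cylinder sets, and the cylinder set over the whole real line contains every path.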
The σ-algebra we will use for analyzing Brownian motion is the σ-algebra gener-
ated by the cylinder sets defined by all countable sequences of times and all open sets
of $\mathbb{R}^d$ at those times [58]. Note that the issue of discussing an uncountably infinite
number of random variables is avoided by defining the cylinder sets for a countable
number of times. In fact, asking questions about an uncountable number of events
is ultimately identified as “unreasonable” as it allows for the introduction of
pathological possibilities. Here we are only interested in describing the continuous sample
paths of a Brownian particle, which means that we can safely consider a countable
number of events, e.g. times defined by a sequence of rational numbers.
The final element of a probability space is the probability measure, P. It defines
the probability for observing the events in F . Mathematically P is a function that
takes sets of Ω, (elements of F) and maps them to real numbers between zero and
one, P : F → [0, 1]. In order for a valid measure to be a probability measure we
must have:
1. The probability of something happening must be one, P(Ω) = 1, and the probability
of nothing happening must be zero, P(∅) = 0.
2. The probability of the union of a countable number of disjoint events in F
must be additive,
$$\mathbb{P}\left(\cup_n A_n\right) = \sum_n \mathbb{P}(A_n) \quad\text{if } A_n \cap A_m = \emptyset \text{ for } A_n, A_m \in \mathcal{F} \text{ and } n \neq m. \tag{3.3}$$
The requirement that a probability measure be countably additive is simply a statement
that if A and B are mutually exclusive then the probability to observe A or B is the
sum of the two probabilities.
Shortly we will discuss the probability of observing a given cylinder set when
the trajectories in that set represent unforced Brownian motion.
3.1.1 Stochastic processes and random variables
From a well constructed probability space we now need to see how random variables
fit into the measure theoretic context. Much more can be said on this topic than we
can include here, so an interested reader is encouraged to consult [22, 53, 58, 60].
Chap. 5 requires a reasonable understanding of the statistical properties of a one-
dimensional Brownian motion, the Wiener process, and so we will focus on that
example here.
Abstractly, a random variable f is a function that maps elements of Ω to another
space, usually the real numbers. Placing a $50 bet that a coin toss will land heads
is an example of a random variable. Another example of a random variable is the
indicator function χA(ω) for any event A ∈ F . Chap. 2 already found many uses for
an indicator function, which in a probabilistic context is a random variable defined
as
$$\chi_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A. \end{cases} \tag{3.4}$$
Such a random variable is deceptively simple but is also extremely useful. One of
its primary uses is that it relates set operations in F to algebraic operations on
random variables. It is easy to show that, for disjoint events A and B, the random
variable x(ω) = χA(ω) + χB(ω) is equal to 1 whenever ω ∈ A ∪ B. Also the random
variable y(ω) = χA(ω) χB(ω) is equal to 1 only when ω ∈ A ∩ B, meaning
$$\chi_{A\cup B}(\omega) = \chi_A(\omega) + \chi_B(\omega) \quad (A \cap B = \emptyset) \tag{3.5}$$
$$\chi_{A\cap B}(\omega) = \chi_A(\omega)\,\chi_B(\omega). \tag{3.6}$$
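The indicator identities can be checked directly on the die example. In this short sketch the events A and B are hypothetical and deliberately overlap: the product of indicators reproduces the indicator of the intersection exactly, while the plain sum counts outcomes in both events twice, so that for overlapping events the union requires subtracting the product.

```python
omega = range(1, 7)           # six-sided die
A = {2, 4, 6}                 # hypothetical events, chosen to overlap
B = {4, 5, 6}

def chi(event):
    """Indicator function of an event, as a function of the outcome w."""
    return lambda w: 1 if w in event else 0

chi_A, chi_B = chi(A), chi(B)

product = [chi_A(w) * chi_B(w) for w in omega]
plain_sum = [chi_A(w) + chi_B(w) for w in omega]
inclusion_exclusion = [
    chi_A(w) + chi_B(w) - chi_A(w) * chi_B(w) for w in omega
]

intersection = [chi(A & B)(w) for w in omega]
union = [chi(A | B)(w) for w in omega]

print(product == intersection, inclusion_exclusion == union, max(plain_sum))
```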
For the case of 1D diffusion, one of the most important random variables is
parameterized by time and simply returns the value of the trajectory at that time. For all
times t ≥ 0, we define the function xt : Ω → R such that
$$x_t(\omega) = \omega(t). \tag{3.7}$$
This definition might seem a bit pedantic, but note that the trivial random variable
y(ω) = ω is not real-valued: ω describes the entire trajectory, not just its value at any one
specific time. From xt(ω) a whole host of other random variables can be defined
through functional composition. One could place a $50 bet on whether or not a
diffusive particle will be greater than +5 µm from its starting point by time t = 1
ms. That bet is the composition b(x_{1ms}(ω)), where b maps real numbers to ±50.
The random variable xt in Eq. (3.7) only gives a snapshot of a trajectory at that
time. In order to describe the trajectory dynamically in time as a random variable,
there is the notion of a stochastic process. Most generally, a stochastic process is
a family of random variables $\{x_t\}_{t\in I}$ indexed by some parameter t, almost always
representing time. Typically we will take time to start at 0 and either let it continue
off towards infinity or, when convenient, stop at some finite time. When discussing
the concept of the process we will use the notation $\{x_t\}_{t\geq 0}$, while $x_t$ is the random
variable given at time t. Before discussing a couple of important types of processes,
we should know how to compute the probability for a random variable to evaluate
to a range of values.
The previous section showed that a probability measure P acts on elements of F
and returns probabilities. To compute the probability of a $50 bet b to win,
we need to identify the set of outcomes on which the function b : Ω → {±50} evaluates to
50. Because b is a function acting on Ω, we can also consider its inverse map b^{-1}.
If a given random variable can take on a continuum of values, we can still run into
pesky problems of having uncountable numbers of things. The solution to this is
to again only consider sensible sets of outcomes for any given random variable. For
the random variable x_t from Eq. (3.7) we will have to take its inverse map x_t^{-1} to
act only on elements of the Borel σ-algebra, B. When we ask for probabilities of
observing certain values of a random variable x, we must ask for the probability of
observing sets or intervals in the range of x.
For every “reasonable” interval that x maps to, there must be a corresponding
element A ∈ F in order for us to be able to calculate the probability of that under-
lying event. Such a random variable is called measurable. If a random variable x is
not measurable, then there is little we can say about it when its outcomes lead to
unreasonable questions. In other words, a nonmeasurable random variable has an
inverse that generates sets not in F . Faced with this possibility we can either ignore
such questions and pray they never occur or redefine the probability space in order
to make these sets measurable. As a nontrivial example of this problem, suppose
we had a random variable $y_t$ that returned the value 1 whenever the sample path
exhibited a discontinuous jump in the time interval [0, t) and zero otherwise. The
question, “what is the probability of $y_t$ returning 1?”, corresponds to $\mathbb{P}\big(y_t^{-1}(1)\big)$. If
our probability space is constructed only of continuous functions, then we technically
can't answer this question, as the pre-image $y_t^{-1}(1)$ asks for the set of functions that
have a discontinuity for times 0 ≤ s < t, which is not an element of F .
By defining random variables as measurable functions, we can easily relate the
statistics of multiple random variables to each other through their inverse maps.
Consider the stochastic process $\{x_t\}_{t\geq 0}$ defined by Eq. (3.7). Then $x_{t_1}$ and $x_{t_2}$ are
two random variables taking on values in the real number line. Suppose we wish to
calculate the probability of observing $x_{t_1}$ in the interval (a, b) and $x_{t_2}$ in the interval
(c, d]. Individually, we have
$$x_{t_1}^{-1}\big((a, b)\big) = C\big(t_1; (a, b)\big) \in \mathcal{F} \tag{3.8}$$
and
$$x_{t_2}^{-1}\big((c, d]\big) = C\big(t_2; (c, d]\big) \in \mathcal{F}. \tag{3.9}$$
The joint probability of these two events is simply the probability of the intersection
of these two sets,
$$\mathbb{P}\Big(C\big(t_1; (a, b)\big) \cap C\big(t_2; (c, d]\big)\Big) = \mathbb{P}\Big(C\big(t_1, t_2; (a, b), (c, d]\big)\Big). \tag{3.10}$$
3.1.2 Expectation values, the conditional expectation, and
measurability
The most fundamental operation one performs with random variables is computing
their expectation values. If the random variable z(ω) takes on a finite number of
values, $\{z^{(i)} : i = 1, \ldots, n\}$, then calculating the expectation value for z is no different
than in the non-measure-theoretic context,
$$\mathbb{E}(z) \equiv \sum_{i=1}^{n} z^{(i)}\,\mathbb{P}\big(z = z^{(i)}\big). \tag{3.11}$$
The expectation value of z is the average of all its outcomes, weighted by how likely
they are to occur. Note that writing $\mathbb{P}\big(z = z^{(i)}\big)$ is shorthand for finding the event
$A_i \equiv z^{-1}\big(z^{(i)}\big)$, with
$$\mathbb{P}\big(z = z^{(i)}\big) \equiv \mathbb{P}(A_i) = \mathbb{P}\big(\{\omega \in \Omega : z(\omega) = z^{(i)}\}\big). \tag{3.12}$$
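Equations (3.11) and (3.12) translate almost literally into code for a finite sample space. This minimal sketch assumes a fair die and a hypothetical $50 bet on the roll being even; the helpers `P` and `preimage` are illustrative names. Exact rational arithmetic is used so the results carry no rounding error.

```python
from fractions import Fraction

omega = range(1, 7)
P_outcome = {w: Fraction(1, 6) for w in omega}     # fair die (assumption)

def P(event):
    """Probability measure: sum the weights of the outcomes in the event."""
    return sum(P_outcome[w] for w in event)

def preimage(z, value):
    """The event z^{-1}(value) = {w in Omega : z(w) = value}, Eq. (3.12)."""
    return {w for w in omega if z(w) == value}

def z(w):
    """Hypothetical bet: win 50 on an even roll, lose 50 otherwise."""
    return 50 if w % 2 == 0 else -50

values = sorted({z(w) for w in omega})
expectation = sum(v * P(preimage(z, v)) for v in values)   # Eq. (3.11)

print(preimage(z, 50), P(preimage(z, 50)), expectation)
```

The probability is always evaluated on an event (a set of outcomes) and never on a value of z directly, which is the decoupling the text describes.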
In addition to these simple random variables, we need to formulate expectation
values for random variables that can take a continuum of values. This is done
by defining a measure theoretic version of a standard Riemann integral, called the
Lebesgue integral. One path for this construction is to make an approximation for
xt that takes on a finite number of values. The expectation value of such a discrete
approximation is easily computed through Eq. (3.11). Then by taking a suitable
limit where the number of values become continuous we can calculate the proper
expectation value. At this point this procedure is a bit vague, but as we have not
specified a measure for Brownian motion, it is difficult to be more specific. When
discussing the Wiener process, we will be able to be more clear.
Probability theory gets a lot more lively when instead of considering simple expectation
values we consider conditional quantities. When working with simple random
variables, finding conditional expectation values is no harder in a probability space
than in a standard context. No matter how sophisticated the framework, Bayes' rule
still applies, in that for two events $A_1$ and $A_2$ we have
$$\mathbb{P}(A_1 \cap A_2) = \mathbb{P}(A_1|A_2)\,\mathbb{P}(A_2). \tag{3.13}$$
Whenever $\mathbb{P}(A_2) \neq 0$ we can invert to find the conditional probability of $A_1$ given
$A_2$,
$$\mathbb{P}(A_1|A_2) = \frac{\mathbb{P}(A_1 \cap A_2)}{\mathbb{P}(A_2)}. \tag{3.14}$$
We emphasize that we are calculating the probability of an event $A_1$ occurring,
conditional on the event $A_2$. While $A_2$ could correspond to the pre-image of a
single value of a simple random variable, it could also correspond to another random
variable taking on a range of values or even the intersection between the outcomes
of two random variables. All of these different possibilities correspond to the same
underlying event and thus carry the same information.
Turning this conditional probability into a conditional expectation value is simply
a matter of weighting the outcomes of one random variable by the conditional
probabilities. We will illustrate this in a quick example. Consider the two simple random
variables z(ω) and y(ω) with values $\{z^{(1)}, \ldots, z^{(n)}\}$ and $\{y^{(1)}, \ldots, y^{(m)}\}$. For each variable
and each outcome find the corresponding events, $A_i = z^{-1}(z^{(i)})$ and $B_j = y^{-1}(y^{(j)})$.
Then the conditional expectation value of z given that $y = y^{(j)}$ is
$$\mathbb{E}\big(z\,\big|\,y = y^{(j)}\big) = \sum_{i=1}^{n} z^{(i)}\,\mathbb{P}\big(z = z^{(i)}\,\big|\,y = y^{(j)}\big) = \sum_{i=1}^{n} z^{(i)}\,\frac{\mathbb{P}(A_i \cap B_j)}{\mathbb{P}(B_j)}. \tag{3.15}$$
If one has to compute this conditional expectation value by hand, hopefully the
events $A_i$ and $B_j$ are relatively simple and computing the probability of their
intersection is relatively straightforward. But even if this is not the case, by adding the
structure that events are sets and probabilities are measures on events, computing
conditional quantities does not require defining a new probability structure like a
joint density function for each random variable we want to consider.
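Eq. (3.15) can be evaluated by enumerating events on a small sample space. The sketch below assumes two fair dice, with z the total and y the value of the first die; these choices, and the helper names `event` and `cond_expectation_value`, are illustrative only.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))     # two fair dice (assumption)

def P(ev):
    """Uniform probability measure on omega."""
    return Fraction(len(ev), len(omega))

def z(w):
    return w[0] + w[1]        # total shown by both dice

def y(w):
    return w[0]               # value of the first die

def event(rv, value):
    """Pre-image of a single value: {w : rv(w) = value}."""
    return {w for w in omega if rv(w) == value}

def cond_expectation_value(z, y, y_value):
    """E(z | y = y_value) via Eq. (3.15), using intersections of events."""
    B = event(y, y_value)
    z_values = {z(w) for w in omega}
    return sum(v * P(event(z, v) & B) / P(B) for v in z_values)

print(cond_expectation_value(z, y, 3))
```

Knowing that the first die shows 3 leaves the second die uniformly distributed, so the conditional expectation of the total is 3 + 7/2 = 13/2.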
Sometimes it is convenient to write the conditional expectation not directly in
terms of the events $A_i$ and $B_j$, but instead in terms of a regular expectation value and
an indicator function. This method will be useful when z and y are not simple
random variables, and also when we want to abstract away y and instead think about
conditioning on just some abstract event B. A nice property of indicator functions
is that because they only take the value 1 on a single event, we can write
$$\mathbb{E}(\chi_{A_i}) = \mathbb{P}(A_i) \tag{3.16}$$
and in particular,
$$\mathbb{P}(A_i \cap B_j) = \mathbb{E}(\chi_{A_i \cap B_j}) = \mathbb{E}(\chi_{A_i}\,\chi_{B_j}). \tag{3.17}$$
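As the expectation value is a linear operation we have that
$$\mathbb{E}(z|B_j) = \sum_{i=1}^{n} z^{(i)}\,\frac{\mathbb{E}\big(\chi_{A_i}\,\chi_{B_j}\big)}{\mathbb{E}\big(\chi_{B_j}\big)} = \frac{\mathbb{E}\Big(\big(\sum_i z^{(i)}\,\chi_{A_i}\big)\,\chi_{B_j}\Big)}{\mathbb{E}\big(\chi_{B_j}\big)} = \frac{\mathbb{E}\big(z\,\chi_{B_j}\big)}{\mathbb{E}\big(\chi_{B_j}\big)}. \tag{3.18}$$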
The last equality here is particularly useful as it holds even when z isn't a simple
random variable. When continuous-time and continuous-valued random variables are
involved, it should not be surprising that explicitly computing conditional quantities
by finding the underlying sets and computing their intersection is often impracticable.
While working with the indicator function $\chi_{B_j}$ makes some substantial simplifications,
often an explicit computation is still impractical. Instead, taking an indirect
route is often fruitful, and one of the prime tools for doing so is what is called the
conditional expectation.
Moving from a conditional expectation value to a conditional expectation is
only one step more complicated. If one is prepared to compute the conditional
expectation value for every outcome of y, i.e. having computed all of the sets $B_j$ and
knowing which have zero probability, one can write down a random variable that in some
sense computes all of the conditional expectation values at once. The conditional
expectation of z on y, written as $\mathbb{E}(z|y)$, is a random variable that takes on the
value $\mathbb{E}\big(z\,\big|\,y = y^{(j)}\big)$ whenever $y = y^{(j)}$. This works as follows. For every
outcome $y^{(j)}$ we have an underlying event $B_j$. Whenever $\mathbb{P}(B_j) \neq 0$ we can define
the random variable
$$\mathbb{E}(z|y) \equiv \sum_j \mathbb{E}\big(z\,\big|\,y = y^{(j)}\big)\,\chi_{B_j}(\omega). \tag{3.19}$$
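The construction of E(z|y) as a random variable can be sketched for a finite sample space by combining Eqs. (3.18) and (3.19). The example below again assumes two fair dice, with hypothetical choices z = product of the dice and y = first die modulo 3; `conditional_expectation` is an illustrative helper that returns a function on Ω.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))     # two fair dice (assumption)

def E(rv):
    """Expectation under the uniform measure on omega."""
    return sum(Fraction(rv(w)) for w in omega) / len(omega)

def z(w):
    return w[0] * w[1]        # hypothetical random variable: product of the dice

def y(w):
    return w[0] % 3           # hypothetical coarse observation of the first die

def conditional_expectation(z, y):
    """Return E(z|y) as a function on omega, following Eqs. (3.18) and (3.19)."""
    table = {}
    for y_value in {y(w) for w in omega}:
        chi_B = lambda w, v=y_value: 1 if y(w) == v else 0   # indicator of B_j
        table[y_value] = E(lambda w: z(w) * chi_B(w)) / E(chi_B)
    return lambda w: table[y(w)]

z_given_y = conditional_expectation(z, y)
print(z_given_y((3, 5)), z_given_y((6, 1)), E(z_given_y), E(z))
```

The result depends on ω only through y(ω), so outcomes (3, 5) and (6, 1), which share y = 0, give the same value. Averaging E(z|y) over all outcomes reproduces E(z).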
Whenever P(Bj) = 0 we can give the conditional expectation value any value we
wish, safe in the knowledge that the probability for obtaining such an arbitrary values
is zero. A lot can be said for this object, but one of the most important is that it can
be viewed as a reasonable estimate for z given information about y. More specifically
suppose you wanted to find a least-mean-squared estimate for what value z would
return when the outcome you receive, ω, results in y(ω) = y(i). Generally speaking,
Chapter 3. Classical and Quantum Probability Theory 101
there are many different ω for any value $y^{(i)}$, and the set of ω that gives this value may return multiple values for z. It turns out that if you want to make any estimate for a random variable z conditioned on the events given by another, y, then the estimate for z must have the form
\[ z(\omega) = \sum_{j=1}^{m} a_j\, \chi_{B_j}(\omega) \tag{3.20} \]
where $B_j$ are the events generated by the various outcomes for y and $a_j$ are constants. It should not be too surprising that the constants corresponding to the least-mean-square estimate for z given y turn out to be the conditional expectation values $a_j = \mathbb{E}(z \mid y = y^{(j)})$.
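As a concrete check of Eqs. (3.19) and (3.20), the following minimal Python sketch builds the conditional expectation on a small finite sample space and compares its mean-square error against another y-measurable estimate. The fair six-sided die, the choice z = ω², the parity observable y, and all function names are illustrative assumptions, not part of the original development.

```python
from fractions import Fraction

# Hypothetical toy model: a fair six-sided die, omega in {1, ..., 6}.
omega_space = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega_space}

def z(w):
    """Quantity to be estimated (illustrative choice)."""
    return w * w

def y(w):
    """Observed random variable: parity of the roll."""
    return w % 2

def expectation(f):
    return sum(f(w) * prob[w] for w in omega_space)

def conditional_expectation(f, g):
    """Return E(f|g) as a function of omega, following Eq. (3.19)."""
    table = {}
    for value in {g(w) for w in omega_space}:
        chi = lambda w, v=value: 1 if g(w) == v else 0   # indicator of B_j
        p_b = expectation(chi)                           # nonzero for every outcome here
        table[value] = expectation(lambda w: f(w) * chi(w)) / p_b
    return lambda w: table[g(w)]

z_given_y = conditional_expectation(z, y)

def mean_square_error(estimate):
    return expectation(lambda w: (z(w) - estimate(w)) ** 2)

best = mean_square_error(z_given_y)
# Another estimate of the form of Eq. (3.20): shift the constants a_j on each event.
perturbed = mean_square_error(lambda w: z_given_y(w) + (1 if y(w) == 0 else -2))
```

Because the perturbation is itself y-measurable, the cross term vanishes and the error of the perturbed estimate exceeds the optimum by exactly the mean square of the perturbation (5/2 in this toy model).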
Any random variable that can be written as in Eq. (3.20) is called a y-measurable random variable. Note that y itself and almost any function of y can be written like this. The sets $B_j$ can be used to construct a σ-algebra by taking all (countable) unions and complements; the random variable z of Eq. (3.20) is then measurable with respect to that σ-algebra. The σ-algebra formed by these sets, written as $\mathcal{Y}$ or sometimes $\sigma\{y\}$, is called the σ-algebra generated by y. It is not difficult to show that if you have multiple random variables x, y, etc., you can form a σ-algebra, $\sigma\{x, y, \dots\}$, generated by all of them, simply by taking unions and complements of all of their various events.
All of the above concepts (joint probabilities, the conditional expectation, generating σ-algebras from random variables, etc.) can be extended to continuous random variables by taking appropriate limits. More often than not, one tries to avoid taking an actual limit, and instead looks for a random variable that is y-measurable and satisfies the basic properties of the conditional expectation. For general reference, here are some of those properties:
1. The conditional expectation is linear,
E (αx+ β z|y) = αE (x|y) + βE (z|y) (3.21)
for constants α and β.
2. The conditional expectation is consistent with usual expectation values,
\[ \mathbb{E}\big(\mathbb{E}(x|y)\big) = \mathbb{E}(x). \tag{3.22} \]
3. The conditional expectation of any y-measurable random variable, x, is itself,
\[ \mathbb{E}(x|y) = x. \tag{3.23} \]
4. If z and y are independent then the conditional expectation is just the expectation value for z (times the “identity”),
\[ \mathbb{E}(z|y) = \mathbb{E}(z)\, \chi_{\Omega}(\omega). \tag{3.24} \]
5. For any y-measurable random variable y′, the conditional expectation satisfies the property that
\[ \mathbb{E}\big(\mathbb{E}(z|y)\, y'\big) = \mathbb{E}(z\, y'). \tag{3.25} \]
While this last property could be inferred from the second and third, we include it here because it is often what is used to “guess” what the conditional expectation is, without going through a bare-bones construction.
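The five properties above can be verified exactly on a finite sample space. The sketch below is only an illustration under assumed inputs: the two independent biased coins, the particular random variables x, z and y′, and the helper names `E`, `cond` and `same` are all hypothetical choices made for this check.

```python
from fractions import Fraction
from itertools import product

# Hypothetical model: two independent biased coins, omega = (a, b) with a, b in {0, 1}.
p_a = {0: Fraction(1, 3), 1: Fraction(2, 3)}
p_b = {0: Fraction(1, 4), 1: Fraction(3, 4)}
omegas = list(product((0, 1), repeat=2))
prob = {(a, b): p_a[a] * p_b[b] for a, b in omegas}

def E(f):
    return sum(f(w) * prob[w] for w in omegas)

def cond(f, g):
    """E(f|g) as a function of omega, built from Eq. (3.19)."""
    table = {}
    for v in {g(w) for w in omegas}:
        p = E(lambda w: int(g(w) == v))
        table[v] = E(lambda w: f(w) * int(g(w) == v)) / p
    return lambda w: table[g(w)]

def same(f, g):
    """Two random variables are equal if they agree on every outcome."""
    return all(f(w) == g(w) for w in omegas)

y = lambda w: w[0]                       # observe only the first coin
z = lambda w: w[1]                       # second coin, independent of y
x = lambda w: w[0] + 2 * w[0] * w[1]     # depends on both coins
y_prime = lambda w: 5 * y(w) - 1         # a y-measurable random variable

cx, cz = cond(x, y), cond(z, y)
checks = {
    "3.21 linearity": same(cond(lambda w: 3 * x(w) - 2 * z(w), y),
                           lambda w: 3 * cx(w) - 2 * cz(w)),
    "3.22 consistency": E(cx) == E(x),
    "3.23 y-measurable": same(cond(y_prime, y), y_prime),
    "3.24 independence": same(cz, lambda w: E(z)),
    "3.25 guess and check": E(lambda w: cx(w) * y_prime(w)) == E(lambda w: x(w) * y_prime(w)),
}
```

Exact rational arithmetic is used so that each identity holds as an equality rather than up to rounding.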
Much, much more can be said about the conditional expectation. The derivation of classical and quantum filtering theory is based upon computing a conditional expectation of some unobserved process, $x_t$, based upon measurements of a correlated process $y_t$. Unfortunately, we will be unable to go into the details here, but the interested reader is encouraged to seek out a number of good references on the subject, some of which are [22, 25, 37, 53].
3.1.3 Special processes - time-adaption and martingales
Having discussed random variables and stochastic processes in terms of a classical probability space (Ω, F, P), we now need to introduce a couple of important and useful processes. Chap. 2 already introduced the concept of a time-adapted process. A stochastic process $\{x_t\}_{t\ge 0}$ is time-adapted when it depends only on events defined in the present or past and not on the future. Having introduced the concept of a measurable function and a σ-algebra over the cylindrical sets $C(t_1, t_2, \dots; I_1, I_2, \dots)$, we can easily give a precise meaning to a time-adapted process. A stochastic process $\{x_t\}_{t\ge 0}$ is time-adapted when each random variable $x_t$ is measurable with respect to the σ-algebra $\mathcal{F}_t$ generated from the cylindrical sets $C(t_1, \dots, t_n \le t; I_1, \dots, I_n)$. (Sometimes a more general definition avoids using these specific cylinders and simply uses an indexed family of σ-algebras $\mathcal{F}_0 \subset \mathcal{F}_s \subset \mathcal{F}_t \subset \mathcal{F}$ called a filtration [53]. A process adapted to this filtration is called $\mathcal{F}_t$-adapted.) In the context of statistical estimation, working with time-adapted processes is essential, as these are the processes that are independent of any future events. Among the time-adapted processes there is an additional class of stochastic processes, called martingales, that have special and simplifying characteristics in terms of their conditional statistics.
A martingale is an important kind of stochastic process which plays a crucial role in classical probability theory. More importantly for our purposes, martingales will play a crucial role in significantly simplifying a quantum conditional master equation, as we will see in Sec. 3.4.1. They are used to represent fair betting games where no amount of past information is helpful in predicting future events. The defining property is that the conditional expectation of any future value of the process is simply given by its current value. You expect to leave a fair casino with the same amount of money as you had when you entered (footnote 2). In essence, a martingale is a random process where the conditional mean of any future increment is zero [22, 53, 60]. To illustrate this property, consider taking a fair coin and flipping it N times. A typical sequence ω may be something like
\[ \omega = \{H, T, H, H, H, H, T, T, H, \dots\}. \tag{3.26} \]
Footnote 2: In contrast to a real casino.
For any sequence ω we can create a random variable $x_n$ which is equal to the number of heads minus the number of tails seen in the first n flips, so that for this sequence
\[ x = \{1, 0, 1, 2, 3, 4, 3, 2, 3, \dots\}. \tag{3.27} \]
In the case of a fair coin, $x_n$ is a martingale.
To see why, note that in each flip there is equal probability of the coin landing heads or landing tails, so that for any n we have the expectation value
\[ \mathbb{E}(x_n) = 0. \tag{3.28} \]
In this specific realization, after the first four flips $x_4$ is not 0, but is in fact 2. However, because any future flips are independent of the past, we should not expect to see any more heads than tails from then on. This means that, conditioned upon the first four outcomes, we should not expect $x_n$ for $n \ge 4$ to be 0; instead it should average around 2. In other words,
\[ \mathbb{E}(x_m - x_n \mid x_1, x_2, \dots, x_n) = 0 \quad \text{for all } m \ge n. \tag{3.29} \]
This is the fundamental property of a martingale, which is usually written as
\[ \mathbb{E}(x_m \mid x_1, x_2, \dots, x_n) = x_n \quad \text{for all } m \ge n. \tag{3.30} \]
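The martingale property of Eq. (3.30) can be checked by brute force for the coin-flip example. The following sketch enumerates every continuation of the first four flips of Eq. (3.26) and averages $x_m$ over them; the function names `walk` and `conditional_mean` are hypothetical, and conditioning on the first n flips is used as a stand-in for conditioning on $x_1, \dots, x_n$ (the two carry the same information).

```python
from fractions import Fraction
from itertools import product

def walk(flips):
    """x_n = (#heads - #tails) after each flip, as in Eq. (3.27)."""
    x, path = 0, []
    for f in flips:
        x += 1 if f == "H" else -1
        path.append(x)
    return path

def conditional_mean(prefix, m, p_heads=Fraction(1, 2)):
    """E(x_m | first n flips = prefix), by enumerating the remaining m - n flips."""
    n = len(prefix)
    total = Fraction(0)
    for tail in product("HT", repeat=m - n):
        weight = Fraction(1)
        for f in tail:
            weight *= p_heads if f == "H" else 1 - p_heads
        total += weight * walk(list(prefix) + list(tail))[-1]
    return total

prefix = ["H", "T", "H", "H"]     # first four flips of Eq. (3.26)
x4 = walk(prefix)[-1]             # current value of the process
future_means = [conditional_mean(prefix, m) for m in range(4, 10)]
```

For the fair coin every entry of `future_means` equals `x4`, which is the statement of Eq. (3.30) for this particular history.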
Now imagine that the coin was not in fact fair. Say that the probability for heads was $P_H = 2/5$ and the probability for tails was $P_T = 3/5$. Then in this case $x_n$ is not a martingale, as we would instead expect $x_n$ to trend negative. In other words, $\mathbb{E}(x_m \mid x_1, x_2, \dots, x_n) < x_n$ for m > n. But while $x_n$ may not be a martingale, it is sometimes possible to construct from x another random variable that is a martingale. Here, for example, each increment has mean $P_H - P_T = -1/5$, so the shifted process $x_n + n/5$ is a martingale. This kind of process is called a semi-martingale and defines the class of processes that are capable of being used in an Ito integral. The fact that this is possible will be a crucial step in finding a maximum likelihood estimate in Chap. 5.
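For the biased coin, one way to see both the downward trend and how a martingale can be recovered is to subtract the accumulated mean increment. The sketch below is an illustration only: the function names are hypothetical, and the drift-subtracted process $x_n - n(P_H - P_T)$ is one possible construction, chosen here because its martingale property follows directly from the independence of the flips.

```python
from fractions import Fraction
from math import comb

P_H, P_T = Fraction(2, 5), Fraction(3, 5)
DRIFT = P_H - P_T                     # mean of a single increment, -1/5

def conditional_mean_x(x_n, n, m):
    """E(x_m | x_1..x_n): sum over the number k of heads in the next m - n flips."""
    r = m - n
    return sum(comb(r, k) * P_H**k * P_T**(r - k) * (x_n + 2 * k - r)
               for k in range(r + 1))

def conditional_mean_compensated(x_n, n, m):
    """E(x_m - m*DRIFT | x_1..x_n) for the drift-subtracted process."""
    return conditional_mean_x(x_n, n, m) - m * DRIFT

# Starting from x_4 = 2: the raw process trends downward ...
raw = [conditional_mean_x(2, 4, m) for m in range(4, 10)]
# ... while the compensated process stays at its current value 2 - 4*DRIFT.
compensated = [conditional_mean_compensated(2, 4, m) for m in range(4, 10)]
```

The list `raw` decreases by 1/5 per step, whereas every entry of `compensated` equals the current compensated value 14/5.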
3.1.4 The Wiener process
One of the most important processes one can consider is the Wiener process, a mathematical model for Brownian motion named for Norbert Wiener. In addition to elegantly describing diffusive motion, the Wiener process is used to model nearly any system interacting with white noise. Chap. 2 already found an application outside of diffusive motion, namely the statistics of the quadratures $Q_t$ and $P_t$ under vacuum expectation. This section reviews the properties of the Wiener process, including how it relates to the classical probability space introduced in Sec. 3.1.
The defining characteristics of a Wiener process are twofold:
1. A Wiener process makes a continuous trajectory in time, with probability one.
2. A Wiener process has increments that are independent, mean zero, Gaussian
distributed random variables with a variance given by the increment’s time
duration.
The first property is obvious, while the second is a bit more involved and requires some explanation. Consider the stochastic process $\{w_t\}_{t\ge 0}$. For times 0 < s < t, we can define the random variables $a = w_s - w_0$ and $b = w_t - w_s$. If $\{w_t\}_{t\ge 0}$ is a Wiener process then a and b are statistically independent, a is a mean zero Gaussian random variable with variance s, and b is also a mean zero Gaussian, with variance t − s. Sec. 3.1.3 found that if a random process has statistically independent increments and each increment is mean zero then it is a martingale, and so the Wiener process is also a martingale. There are many more interesting and sometimes nonintuitive properties that one can calculate for a Wiener process, see [58, Chap. 2]. Some interesting facts are (i) a Wiener process is nondifferentiable with probability one and (ii) if at some time t a Wiener process takes on the value w(t) = a then it will take on that value an infinite number of times in every interval [t, t + ∆t].
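The second defining property also gives a direct recipe for sampling a Wiener process on a grid of times: add independent Gaussian increments whose variances equal the time steps. The sketch below does this and estimates the statistics of the increments a and b defined above. It is a numerical illustration only; the function names, the sample size and the seed are assumptions, and the check is consistent by construction rather than an independent test of the theory.

```python
import math
import random

def sample_wiener(times, rng):
    """One sample path on an increasing grid of times, starting from w_0 = 0."""
    w, t_prev, path = 0.0, 0.0, []
    for t in times:
        w += rng.gauss(0.0, math.sqrt(t - t_prev))   # independent N(0, dt) increment
        path.append(w)
        t_prev = t
    return path

def increment_statistics(s, t, n_paths=20000, seed=0):
    """Sample means, variances and covariance of a = w_s - w_0 and b = w_t - w_s."""
    rng = random.Random(seed)
    a_vals, b_vals = [], []
    for _ in range(n_paths):
        w_s, w_t = sample_wiener([s, t], rng)
        a_vals.append(w_s)
        b_vals.append(w_t - w_s)
    mean = lambda v: sum(v) / len(v)
    ma, mb = mean(a_vals), mean(b_vals)
    var_a = mean([(a - ma) ** 2 for a in a_vals])
    var_b = mean([(b - mb) ** 2 for b in b_vals])
    cov = mean([(a - ma) * (b - mb) for a, b in zip(a_vals, b_vals)])
    return ma, mb, var_a, var_b, cov

# Expect means near 0, variances near s and t - s, and covariance near 0.
stats = increment_statistics(0.5, 2.0)
```

With s = 0.5 and t = 2.0 the sample variances should come out close to 0.5 and 1.5, up to statistical fluctuations of order one percent at this sample size.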
The statistical properties of a Wiener process are deceptively simple, and yet exceedingly rich. The second defining property allows us to find a connection between this simple statement and a nontrivial probability measure over the space of continuous functions. While we just used the times 0 < s < t to demonstrate what this property means, there is nothing stopping us from using a countable sequence of times $0 < t_1 < t_2 < \dots$. We then know that for a Wiener process the random variables $\Delta w_i \equiv w_{t_i} - w_{t_{i-1}}$ are all independent mean zero Gaussian random variables with variances $\Delta t_i \equiv t_i - t_{i-1}$ (with $t_0 = 0$ and $w_0 = 0$). For each time $t_i$ we can calculate the probability that the Wiener process lies in the interval $I_i = (a_i, b_i)$. This ultimately turns out to be
\[ \mathbb{P}(\{w_{t_i} \in I_i\}) = \int_{a_1}^{b_1} dw_1 \int_{a_2}^{b_2} dw_2 \cdots \prod_i \left( \frac{1}{\sqrt{2\pi \Delta t_i}}\, e^{-\frac{(w_i - w_{i-1})^2}{2 \Delta t_i}} \right), \tag{3.31} \]
which is known as Wiener’s discrete path integral [58]. Notice that by picking the sequence of times $0 < t_1 < t_2 < \dots$ and the intervals $I_i = (a_i, b_i)$, we just defined a cylindrical set, $C(t_1, t_2, \dots; I_1, I_2, \dots)$. Eq. (3.31) is the probability for observing a continuous trajectory to lie within this set, and therefore we can use these integrals to define a probability measure on the space of continuous functions $\omega : \mathbb{R}^+ \to \mathbb{R}$. Not surprisingly, this is called the Wiener measure,
\[ \mathbb{P}\big(C(t_1, t_2, \dots; I_1, I_2, \dots)\big) = \mathbb{P}(\{w_{t_i} \in I_i\}). \tag{3.32} \]
It is worth noting that under this measure, all trajectories which do not have ω(0) = 0 are given zero probability, i.e.
\[ \mathbb{P}\big(\{\omega \in \Omega : \omega(0) = 0\}\big) = 1. \tag{3.33} \]
This brings us back to the probability space (Ω, F, P) defined over the continuous functions ω(t) with the σ-algebra F over the cylindrical sets. If the probability measure P is the Wiener measure as defined above, then the stochastic process $\{w_t(\omega) = \omega(t) : 0 \le t < \infty\}$ is a Wiener process.
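To make the Wiener measure of a cylindrical set concrete, the sketch below evaluates the discrete path integral of Eq. (3.31) for one or two times using a simple midpoint rule, taking $w_0 = 0$ at $t_0 = 0$. The function names, the grid size and the restriction to at most two times are assumptions made to keep the example short; it is a numerical illustration, not an efficient method.

```python
import math

def gaussian_kernel(w, w_prev, dt):
    """One factor of Eq. (3.31): Gaussian in w - w_prev with variance dt."""
    return math.exp(-(w - w_prev) ** 2 / (2 * dt)) / math.sqrt(2 * math.pi * dt)

def cylinder_probability(times, intervals, n_grid=400):
    """Midpoint-rule evaluation of Eq. (3.31) for one or two times, with w_0 = 0."""
    def grid(a, b):
        h = (b - a) / n_grid
        return [a + (k + 0.5) * h for k in range(n_grid)], h

    (a1, b1), t1 = intervals[0], times[0]
    w1_pts, h1 = grid(a1, b1)
    if len(times) == 1:
        return sum(gaussian_kernel(w1, 0.0, t1) for w1 in w1_pts) * h1
    (a2, b2), t2 = intervals[1], times[1]
    w2_pts, h2 = grid(a2, b2)
    total = 0.0
    for w1 in w1_pts:
        inner = sum(gaussian_kernel(w2, w1, t2 - t1) for w2 in w2_pts) * h2
        total += gaussian_kernel(w1, 0.0, t1) * inner
    return total * h1

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Single-time cylinder: should match the Gaussian probability of (-1, 1) at t = 1.
p_single = cylinder_probability([1.0], [(-1.0, 1.0)])
# Two-time cylinder with an effectively unconstrained second interval.
p_wide = cylinder_probability([1.0, 2.0], [(-1.0, 1.0), (-9.0, 9.0)])
# Two-time cylinder that also requires w at t = 2 to be positive.
p_half = cylinder_probability([1.0, 2.0], [(-1.0, 1.0), (0.0, 9.0)])
```

Leaving the second interval effectively unconstrained reproduces the single-time probability, which is the consistency one expects between cylinder sets of different length; restricting the second time to positive values halves it by the reflection symmetry of the measure.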
3.2 Quantum Probability Theory
These same concepts can also be applied to quantum theory either directly or with
some modification. The mathematics of quantum stochastic calculus and noncom-
mutative probability theory is a broad and detailed subject, one that is beyond
our scope. Reasonable introductions with an emphasis on filtering can be found in [25, 61], with more detailed treatments in [23, 45, 52]. However, a certain amount of review is necessary in order to address the physical implications of the formalism.
Before discussing the truly quantum nature of noncommutative probability theory,
we will discuss its similarities with the classical theory.
3.2.1 Embedding the quantum into the classical
This section reviews how to construct a classical probability space from a set of mutually commuting quantum observables. The purpose of this review is twofold. First, the quantum filtering problem relies upon this kind of mapping. The continuous measurements we will be making are described by a set of mutually commuting operators which is increasing in time. The eigenvalues that we receive will be viewed as (read: mapped to) classical random variables on a classical probability space. The second reason for this review is to emphasize its limitations. In Chap. 5 the quantum filter is used to estimate an unknown initial state of a qubit. A natural tool in classical systems is the smoother, which is an estimate for an unobserved system at some past time, given measurements up to some current time. However, naïvely applying this classical technique violates a necessary condition that allows for the classical mapping.
In classical probability theory we found that random variables could be viewed
as functions mapping elements of the sample space to real numbers. At its most
practical level, quantum theory is used to predict the outcomes of experiments where
the measured observables are represented as Hermitian operators acting upon some
underlying Hilbert space. The first step in bringing classical probability theory to
the quantum is to formulate an analogy between classical random variables and
Hermitian operators.
classical ↔ quantum
x(ω) ↔ X
This is a natural analogy, as the basic operation in classical probability is to calculate
the expectation values of random variables.
The next association is that in the classical theory we have the probability measure P to calculate expectation values, while in the quantum theory we have the system state ρ. This analogy is best illustrated in a discrete example where the classical sample space is a finite set of d realizations, $\Omega = \{\omega_1, \dots, \omega_d\}$. The σ-algebra F for this space is then the power set of Ω, and the probability measure P is completely described by the probabilities of the singleton events, $p_i = \mathbb{P}(\{\omega_i\})$. The classical expectation value of a simple random variable x(ω) in this space is then
\[ \mathbb{E}(x) = \sum_{i=1}^{d} x(\omega_i)\, \mathbb{P}(\{\omega_i\}). \tag{3.34} \]
In the quantum case a system described by a Hilbert space H of dimension d is
equipped with a positive trace one density matrix ρ. Expectation values of operators
X acting on H are of course calculated as
E(X) = Tr(ρX) (3.35)
which will sometimes also be notated as 〈X〉. Thus we have the correspondence
classical ↔ quantum
E(x) ↔ Tr(ρX).
Rather than this loose analogy, a formal equivalence is possible where certain aspects
of quantum theory can be embedded into a classical probability space.
While the classical probability space has a fixed, albeit sometimes abstract, set of realizations Ω, identifying such a set in quantum mechanics is problematic. In the spirit of deterministic classical physics, the sample space Ω most often represents locally realistic fates of the system. The probability of observing certain events is given by the probability measure P, which acts on subsets of Ω. The utility of probability theory is that an event $A = \{\omega \in \Omega \mid a \le x(\omega) \le b\}$ has a concrete meaning for multiple random variables, not just x but also f(x). In addition, there could certainly be another random variable y that takes on values $c \le y \le d$ whenever x is observed in the interval [a, b], so that both correspond to the same underlying event A. It is clear from Bell’s theorem that this locally realistic interpretation of Ω is not consistent with quantum mechanics.
A less ambitious task is to find a classical probability representation that is capable of describing the joint statistics of compatible observations. Compatibility of two observables X and Y means that [X, Y] = 0 and, more importantly, that they share a set of eigenvectors $\{|e_i\rangle\}$. For each eigenvector we have the projector $P_i = |e_i\rangle\langle e_i|$, and the operators X and Y have the spectral decompositions
\[ X = \sum_i x_i\, |e_i\rangle\langle e_i| \quad \text{and} \quad Y = \sum_i y_i\, |e_i\rangle\langle e_i|. \tag{3.36} \]
Note that the eigenvalues $x_i$ and $y_i$ need not be distinct, as the operators could have degenerate subspaces.
In a d-dimensional system there are at most d distinct, mutually orthogonal projectors associated with a set of commuting operators $C = \{X, Y, Z, \dots\}$. If we label these projectors $P_\lambda = |\lambda\rangle\langle\lambda|$, then
\[ X = \sum_{\lambda=1}^{d} x_\lambda P_\lambda \tag{3.37} \]
and
\[ \mathbb{E}(X) = \sum_{\lambda=1}^{d} x_\lambda\, \mathrm{Tr}(\rho P_\lambda). \tag{3.38} \]
The mapping between discrete quantum mechanics and classical probability is to associate the set of labels for the projectors with the sample space of a classical probability space. Then we have $\Omega = \{\lambda_i : i = 1, \dots, d\}$, and F is simply the power set of Ω. From this assignment, the probability measure is simply the quantum expectation value of the associated projectors. For example, the probability for the event $\{\lambda_1, \lambda_2\}$ is
\[ \mathbb{P}(\{\lambda_1, \lambda_2\}) = \mathrm{Tr}(\rho P_{\{\lambda_1, \lambda_2\}}) = \mathrm{Tr}(\rho\, |\lambda_1\rangle\langle\lambda_1|) + \mathrm{Tr}(\rho\, |\lambda_2\rangle\langle\lambda_2|). \tag{3.39} \]
In classical probability the simplest of simple random variables are the indicator functions $\chi_F$ (for every set $F \in \mathcal{F}$), which correspond to the projectors in the $X \mapsto x(\omega)$ mapping. This procedure is formalized in Theorem 2.4 of [25] and is summarized in Table 3.1.
Classical        Quantum
Ω                {λ1, λ2, . . .}
F                {{λ1}, {λ1, λ2}, . . .}
P({λi})          Tr(ρ Pλi)
x(ω)             X
Table 3.1: The spectral mapping between a set of commuting observables and a classical probability space.
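The dictionary of Table 3.1 can be exercised numerically. The sketch below is an illustration under assumed inputs: a three-level system, two commuting observables built from a randomly chosen common eigenbasis, and a random mixed state, none of which come from the text. It forms the classical measure $\mathbb{P}(\{\lambda\}) = \mathrm{Tr}(\rho P_\lambda)$ and compares classical and quantum expectation values.

```python
import numpy as np

# Hypothetical example: X and Y are diagonal in a common (rotated) basis, so [X, Y] = 0.
rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
x_vals = np.array([1.0, 1.0, -2.0])      # degenerate eigenvalues are allowed
y_vals = np.array([0.5, -0.5, 3.0])
projectors = [np.outer(basis[:, k], basis[:, k].conj()) for k in range(3)]   # P_lambda
X = sum(x * P for x, P in zip(x_vals, projectors))   # Eq. (3.37)
Y = sum(y * P for y, P in zip(y_vals, projectors))

# A density matrix: positive and trace one (here a random mixed state).
G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = G @ G.conj().T
rho /= np.trace(rho).real

# Classical model: sample space = labels {0, 1, 2}, measure P({lambda}) = Tr(rho P_lambda).
prob = np.array([np.trace(rho @ P).real for P in projectors])
classical_E_x = float(prob @ x_vals)               # Eq. (3.38)
classical_E_xy = float(prob @ (x_vals * y_vals))   # joint statistics of X and Y
quantum_E_x = np.trace(rho @ X).real               # Eq. (3.35)
quantum_E_xy = np.trace(rho @ X @ Y).real
```

The agreement of the joint moment, and not just the individual means, is what makes the classical model a faithful description of the commuting pair.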
The above discussion extends also to the case of infinite dimensional Hilbert spaces and operators with continuous spectra [25]. For any “normal” operator (footnote 3) A, which may take on all values in $\mathbb{R}$, there exists a spectral decomposition for A such that
\[ A = \int_{\mathbb{R}} a\, P(da) \tag{3.40} \]
where P(da) is the spectral measure, also called a projection valued measure, associated with A taking on values in the interval da. Often explicitly constructing P(da) is a little tricky, especially if one must first identify any vectors ψ in Hilbert space for which φ ≡ Aψ is not well behaved. (Hermitian operators whose eigenvalues span the entire real line, so-called unbounded operators, exhibit these kinds of problems. A trick for dealing with this case is to compute the spectral measure for the bounded operator $T = (A + i\mathbb{1})^{-1}$. This works because any function f(A) commutes with A, and therefore they share the same projectors. When T takes on the complex value λ there is then the corresponding value for A, $a = \lambda^{-1} - i$.) Once armed with a spectral measure for A we can then find an equivalent classical probability model, whose sample space is the set of labels for the possible values of A.
Footnote 3: A normal operator is one that commutes with its adjoint and so has a spectral decomposition. It can be written in terms of commuting Hermitian and anti-Hermitian parts.
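The resolvent trick can be illustrated in finite dimensions, where nothing is actually unbounded but the algebra is the same. In the sketch below a random 4 × 4 Hermitian matrix stands in for A (an assumption made purely for illustration); the eigenvalues of A are then recovered from those of $T = (A + i\mathbb{1})^{-1}$ through $a = \lambda^{-1} - i$.

```python
import numpy as np

# Hypothetical stand-in for A: a finite Hermitian matrix (the real use case is unbounded).
rng = np.random.default_rng(0)
G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (G + G.conj().T) / 2

T = np.linalg.inv(A + 1j * np.eye(4))   # bounded operator T = (A + i 1)^{-1}
lam, vecs = np.linalg.eig(T)            # T is normal but not Hermitian, so use eig
a_recovered = 1.0 / lam - 1j            # a = lambda^{-1} - i

# T is a function of A, so the two commute and share eigenvectors.
commutator_norm = np.linalg.norm(A @ T - T @ A)
residuals = [np.linalg.norm(A @ vecs[:, k] - a_recovered[k] * vecs[:, k])
             for k in range(4)]
```

Since $|a + i| \ge 1$ for real a, every eigenvalue of T has modulus at most one, which is the sense in which T is bounded no matter how large the spectrum of A becomes.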
Regardless of whether or not the associated operators are unbounded, we emphasize that this spectral mapping is only applicable to a subspace of operators which all commute with the underlying projectors. While this may seem to indicate that the mapping is severely limited, in practice it is extremely useful for describing ancilla-assisted measurements. If one is interested in computing conditional expectation values for operators that commute with the projectors defining the classical space, then the quantum-to-classical mapping is still applicable.
3.2.2 Quantum probability
The spectral mapping to a classical probability space lacked a representation that is
independent of the specific choice of projectors. Furthermore, Bell’s theorem shows
that there are no locally realistic sample spaces consistent with quantum mechanics.
The first step in discussing quantum mechanics in a probability theoretic framework
is to omit the sample space Ω [62]. At the level of making practical calculations, the
sample space provided an underlying structure for associating random variables with
the probability measure. By observing a random outcome, x(ω) = a we were able to
see what event this corresponds to and then calculate the probability for that event
given the measure P. In other words, we identify the set of possible realizations that
are compatible with this observation then evaluate the probability for this event.
In quantum theory the underlying Hilbert space of the system, H, provides this necessary structure. By making the association between Hermitian operators and the results of experiments we already have the necessary mapping between random variables and probabilities. In the above spectral mapping from quantum to classical, we associated events with collections of possible eigenvalues, and so even in the infinite dimensional case the probability of observing an event is given by the expectation value of the corresponding union of projectors. In the fully quantum case, we need to consider all possible projections, not just those projections that commute. The mathematical object that is guaranteed to contain all possible projections is a ∗-algebra (read “star”-algebra) of operators (footnote 4). Therefore the quantum counterpart of the σ-algebra F in a classical probability space is a ∗-algebra $\mathcal{A}$ of operators on H.
A ∗-algebra of operators acting on a Hilbert space H is defined as a set of operators $\mathcal{A}$ such that
1. $\mathcal{A}$ contains all complex linear combinations of its elements. For all $A, B \in \mathcal{A}$ we also have $C = c_1 A + c_2 B \in \mathcal{A}$ for any complex coefficients $c_1$ and $c_2$.
2. $\mathcal{A}$ contains all adjoints of its elements. $A \in \mathcal{A}$ implies that $A^\dagger \in \mathcal{A}$.
3. $\mathcal{A}$ contains all products of its elements. $A, B \in \mathcal{A}$ implies $AB \in \mathcal{A}$.
4. $\mathcal{A}$ contains the identity $\mathbb{1}$.
Footnote 4: The ∗ in the name comes from the mathematical convention of using ∗ to represent an operator adjoint rather than †.
In the finite dimensional case where $H = \mathbb{C}^n$, the largest ∗-algebra acting on H is simply $M_n$, the space of all complex n × n matrices. However, the reason for introducing this algebraic structure is not just a love of mathematical formalism. In the same way that a set of classical random variables generates a σ-algebra, $\sigma\{x, y, z, \dots\}$ (see Sec. 3.1.2), a set of operators generates a ∗-algebra. For example, the above spectral mapping means that there is a ∗-algebra of operators generated by the set of commuting projectors $\{P_\lambda\}$. The fact that the elements of a ∗-algebra generated from a set of commuting operators still commute means that a commutative ∗-algebra is the set of operators that define the events in a classical probability space.
When H is infinite dimensional, defining a suitable ∗-algebra (or sub-∗-algebra) becomes much trickier. This is particularly true when trying to show that any limiting sequence of operators in the ∗-algebra is still in the algebra. As one might imagine, taking limits of unbounded operators becomes problematic, as a sequence of operators might converge when acting on one set of vectors but diverge when acting on another. The details of how one solves these issues are beyond our scope. Suffice it to say, the solution is to first consider only bounded operators (while keeping the $T = (A + i\mathbb{1})^{-1}$ trick in mind) and then include the limits of all the sequences of operators in the algebra that converge on a class of well defined states. The technical name for such an algebra is a von Neumann algebra [25]. Generally though, one hardly ever needs to apply this kind of construction directly.
One additional concept that is useful, particularly for discussing a quantum conditional expectation, is that of a commutant. Suppose you are given a set of operators S and you want to know which operators commute with every element of S. That set is called the commutant and is denoted S′. To see why this idea is important, consider the following. Assume that you are able to measure the operators $Y_1, Y_2, \dots, Y_n$ and that all of these operators commute with each other. We just showed how to form a commutative von Neumann algebra from this set, but one might wonder if that’s the whole story in a quantum-to-classical mapping. The answer turns out to be no. The wiggle room comes from the fact that the number of distinct joint eigenspaces of the operators $Y_1, Y_2, \dots, Y_n$ may be smaller than the dimension of the underlying Hilbert space. In that case, you can find operators A and B where $[A, B] \neq 0$ and still have $[A, Y_i] = [B, Y_i] = 0$.
One example is a two-qubit system. Suppose you only measure $\sigma_z$ on one qubit but leave the other one alone. Then the projectors $|{+1}\rangle\langle{+1}| \otimes \mathbb{1}$ and $|{-1}\rangle\langle{-1}| \otimes \mathbb{1}$ form the singleton events in the classical probability model. But clearly any operator on the second qubit commutes with these projectors, and so there is more in this system than is wholly representable in a classical probability model. Because operators on the second qubit do commute with these projectors, it is possible to form a quantum conditional expectation of the second system conditioned upon the first. The commutant gives you the set of all possible operators that can be mapped onto a classical probability space through a quantum conditional expectation. We will briefly discuss this mapping next.
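The two-qubit example can be checked directly with 4 × 4 matrices. The following sketch is an illustration only; the helper names `commutes` and `in_commutant` and the particular test operators are hypothetical choices. It confirms that operators on the unmeasured qubit lie in the commutant of the measured projectors even though they do not commute with each other.

```python
import numpy as np

I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Singleton events of the classical model: sigma_z = +1 or -1 on the first qubit.
P_plus = np.kron((I2 + sz) / 2, I2)
P_minus = np.kron((I2 - sz) / 2, I2)

def commutes(A, B):
    return np.allclose(A @ B, B @ A)

def in_commutant(A):
    """True when A commutes with every measured projector."""
    return commutes(A, P_plus) and commutes(A, P_minus)

A = np.kron(I2, sx)   # acts only on the second qubit
B = np.kron(I2, sy)   # acts only on the second qubit
C = np.kron(sx, I2)   # flips the measured qubit, so it is not in the commutant
D = np.kron(sz, sx)   # correlated operator that still commutes with the projectors
```

Here A and B are both in the commutant yet $[A, B] \neq 0$, which is the “wiggle room” described above, while C is excluded because it does not commute with the measured projectors.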
3.2.3 The quantum conditional expectation
In a classical probability space, if we are given a random variable y, or more generally a set of random variables $\{y_s : 0 \le s \le t\}$, the distinct outcomes of those variables form a set of events. From these events we are able to take unions and complements to make a σ-algebra, $\mathcal{Y} = \sigma\{y_s : 0 \le s \le t\}$, representing the questions we can answer about the model given the observation of the random variables $\{y_s : 0 \le s \le t\}$. Then for every event $A_i \in \mathcal{Y}$ (assuming $\mathbb{P}(A_i) \neq 0$) we are able to compute the conditional expectation value of any random variable x via Eq. (3.18),
\[ \mathbb{E}(x|A_i) = \frac{\mathbb{E}(x\, \chi_{A_i})}{\mathbb{E}(\chi_{A_i})}. \tag{3.41} \]
In mapping quantum mechanics onto classical probability theory everything still applies, as long as the operators $\{Y_s : 0 \le s \le t\}$ all mutually commute. If we have a projection $P_\lambda$ generated from these operators, we can define a quantum conditional expectation value,
\[ \mathbb{E}(X|\lambda) \equiv \frac{\langle X P_\lambda \rangle}{\langle P_\lambda \rangle} = \frac{\mathrm{Tr}(\rho X P_\lambda)}{\mathrm{Tr}(\rho P_\lambda)}. \tag{3.42} \]
This equation shows how crucial it is that X and $P_\lambda$ commute in order for it to make sense as a classical analogy. Not only do we need $[X, P_\lambda] = 0$ in order for X to be block diagonalized via the labels λ, but in order for this expression to be interpretable as a conditional expectation value we also need $\langle X P_\lambda \rangle = \langle P_\lambda X \rangle$ for any λ. Classically, we took a conditional expectation value to a conditional expectation by multiplying by the “projector” onto the event $A_i$, see Eq. (3.19). The same is true in the quantum case, as can be seen in a finite dimensional example.
From a set of mutually commuting observables $Y_1, Y_2, \dots, Y_n$, we can form a commutative ∗-algebra $\mathcal{Y}$ that is spanned by the orthogonal projectors $\{P_\lambda\}$. These projectors form a resolution of the identity, so that $\sum_\lambda P_\lambda = \mathbb{1}$. For any operator X in the commutant of $\mathcal{Y}$, the quantum conditional expectation is defined as [25]
\[ \mathbb{E}(X|\mathcal{Y}) \equiv \sum_\lambda \frac{\langle X P_\lambda \rangle}{\langle P_\lambda \rangle}\, P_\lambda. \tag{3.43} \]
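The quantum conditional expectation has a number of properties in common with the classical conditional expectation. Specifically,
\begin{align}
\mathbb{E}\big(\mathbb{E}(X|\mathcal{Y})\big) &= \mathbb{E}(X) \tag{3.44} \\
\mathbb{E}(Y_1 X Y_2|\mathcal{Y}) &= Y_1\, \mathbb{E}(X|\mathcal{Y})\, Y_2 \quad \forall\; Y_1, Y_2 \in \mathcal{Y} \tag{3.45} \\
\mathbb{E}(X|\mathcal{Y}) &= \mathbb{E}(X)\, \mathbb{1} \quad \forall\; X \text{ independent of } \mathcal{Y}. \tag{3.46}
\end{align}
The quantum conditional expectation also has some operator-specific properties,
\begin{align}
\mathbb{E}(\mathbb{1}|\mathcal{Y}) &= \mathbb{1} \tag{3.47} \\
\mathbb{E}(X^\dagger|\mathcal{Y}) &= \mathbb{E}(X|\mathcal{Y})^\dagger \tag{3.48} \\
\mathbb{E}(X^\dagger X|\mathcal{Y}) &\ge 0. \tag{3.49}
\end{align}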
Finally, an extremely important property that also carries over from the classical conditional expectation is that for all $Y \in \mathcal{Y}$ and $X \in \mathcal{Y}'$
\[ \mathbb{E}\big(\mathbb{E}(X|\mathcal{Y})\, Y\big) = \mathbb{E}(XY). \tag{3.50} \]
This property is as important here as it is in the classical case, because it is often used to identify (footnote 5) what the conditional expectation should be when it is intractable to find an explicit representation for $P_\lambda$. This is particularly true in the infinite dimensional case, where Eq. (3.50) is taken as the defining characteristic of the conditional expectation. In other words, if you can find an operator in $\mathcal{Y}$ that satisfies this equation in place of $\mathbb{E}(X|\mathcal{Y})$ for every $Y \in \mathcal{Y}$, then you have the conditional expectation of X given $\mathcal{Y}$ [25].
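A small numerical example may help fix the definition in Eq. (3.43) and the properties in Eqs. (3.44), (3.47) and (3.50). The sketch below assumes a two-qubit system in which $\sigma_z$ on the first qubit is measured and the observable of interest is $\sigma_z$ on the second; the entangled state, the angle, and the function names are illustrative assumptions rather than anything taken from [25].

```python
import numpy as np

I2 = np.eye(2)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Measured algebra: generated by sigma_z on qubit 1, spanned by these two projectors.
projectors = [np.kron((I2 + s * sz) / 2, I2) for s in (+1, -1)]

def expect(rho, A):
    return np.trace(rho @ A)

def conditional_expectation(rho, X):
    """Eq. (3.43): sum over lambda of <X P>/<P> P, skipping zero-probability events."""
    out = np.zeros_like(X, dtype=complex)
    for P in projectors:
        p = expect(rho, P).real
        if p > 1e-12:
            out += expect(rho, X @ P) / p * P
    return out

# Hypothetical entangled state cos(t)|00> + sin(t)|11>, and X = sigma_z on qubit 2.
t = 0.3
psi = np.array([np.cos(t), 0, 0, np.sin(t)], dtype=complex)
rho = np.outer(psi, psi.conj())
X = np.kron(I2, sz)
EX = conditional_expectation(rho, X)
Y = 2.0 * projectors[0] - 0.7 * projectors[1]   # an arbitrary element of the measured algebra
```

For this perfectly correlated state the conditional expectation of $\sigma_z$ on the second qubit comes out as $\sigma_z$ on the first, as one would guess: the measurement result on qubit 1 fixes the value on qubit 2.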
3.2.4 The conditional expectation and generalized measurements
Before considering a specific example of a continuous-time quantum conditional expectation, we briefly pause to discuss the connection between the quantum conditional expectation and generalized measurements, as traditionally formulated in quantum mechanics. We argue here that these two ideas are essentially equivalent and will specifically show that any measurement given in terms of a countable set of distinct Kraus operators $\{M_m\}$ is equally well represented in terms of a quantum system mapped to a classical probability model. In particular, the posterior state $\rho_{|m}$ is a Schrödinger picture version of a quantum conditional expectation value.
A general quantum measurement on a Hilbert space H is specified by a set of Kraus measurement operators $\{M_m\}$, where the indices m label the outcomes of the measurement [63]. The measurement operators are required to satisfy a completeness relation
\[ \sum_m M_m^\dagger M_m = \mathbb{1}. \tag{3.51} \]
Footnote 5: i.e. guess and check.
The completeness relation means that $\{M_m^\dagger M_m\}$ is a valid POVM, and in particular that under the state ρ the expectation values $\{\mathrm{Tr}(\rho M_m^\dagger M_m)\}$ define a probability measure for a sample space $\Omega = \{1, \dots, m, \dots\}$. Upon receiving the outcome m, a mixed state ρ updates to the posterior state $\rho_{|m}$ via the map
\[ \rho_{|m} \equiv \frac{M_m \rho M_m^\dagger}{\mathrm{Tr}(\rho M_m^\dagger M_m)}. \tag{3.52} \]
Our claim is that there exists a Heisenberg picture formulation where the use of
this posterior state is replaced by a conditional expectation. Proving this is not diffi-
cult, by using the fact that any generalized measurement is equivalent to a projective
measurement performed on an ancillary system after an entangling unitary opera-
tion [63]. The equivalent Heisenberg/quantum probability picture is then to evolve
all of the operators with the entangling unitary and then calculate a conditional
expectation value for a post interaction projector.
The specific relation is that every measurement outcome can be modeled as a state in an ancillary system with Hilbert space $H_A$, where there are basis vectors $\{|m\rangle\}$ that correspond to the outcomes of the measurement. Clearly for this to make sense, the dimension of $H_A$ must be at least as big as the number of measurement outcomes. The entangling unitary operator U then maps a fiducial pure state $|0\rangle$ to the basis vectors $|m\rangle$ and when doing so applies the operator $M_m$ to the system. Thus, there always exists a unitary U such that for every system state vector $|\psi\rangle$
\[ U|\psi\rangle|0\rangle = \sum_{m'} M_{m'}|\psi\rangle|m'\rangle. \tag{3.53} \]
Applying the projector $\mathbb{1} \otimes |m\rangle\langle m|$ to the post-unitary state results in
\[ (\mathbb{1} \otimes |m\rangle\langle m|)\, U|\psi\rangle|0\rangle = \sum_{m'} M_{m'}|\psi\rangle|m\rangle \langle m|m'\rangle = M_m|\psi\rangle|m\rangle. \tag{3.54} \]
In other words, by applying the projector $|m\rangle\langle m|$ to the post-interaction state, we have applied the measurement operator $M_m$ to the system and projected the ancilla into the measurement eigenstate $|m\rangle$. For a general system state ρ, the probability for obtaining the outcome m is then given by
\[ \mathbb{P}(m) = \mathrm{Tr}\Big( \big(U\, \rho \otimes |0\rangle\langle 0|\, U^\dagger\big)\, \mathbb{1} \otimes |m\rangle\langle m| \Big) = \mathrm{Tr}\Big( \rho \otimes |0\rangle\langle 0|\, \big(U^\dagger\, \mathbb{1} \otimes |m\rangle\langle m|\, U\big) \Big). \tag{3.55} \]
Applying a unitary transformation to an operator does not change its spectrum and
so a unitary evolved projector is still a projector and in this case, one that is no
longer acting solely on the ancilla.
The quantum probability description between the generalized measurement with
operators Mm is to use a Heisenberg picture version of the “purification” of that
measurement. Specifically the commuting set of operators that we are conditioning
on is simply the unitarily evolved projectors
Pm ≡ U † 1⊗ |m 〉〈m |U
.
It is not difficult to show that the partial trace of the posterior state ρ|m with the
system operator X is equivalent to the conditional expectation value of the operator
U †(X ⊗ 1)U conditioned on the projector Pm, under the joint state ρ⊗ | 0 〉〈 0 |. In
other words we have the equality
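\[ \mathrm{tr}_{\mathrm{sys}}(\rho|_m X) = \frac{\mathrm{Tr}\big( \rho \otimes |0\rangle\langle 0|\; U^\dagger (X \otimes 1) U\, P_m \big)}{\mathrm{Tr}\big( \rho \otimes |0\rangle\langle 0|\; P_m \big)} = \mathbb{E}\big( U^\dagger (X \otimes 1) U \,\big|\, P_m \big). \tag{3.56} \]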
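The ancilla construction of Eqs. (3.53) and (3.54) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the amplitude-damping operators M_0 and M_1, the damping parameter p, and the helper names are hypothetical choices that do not appear in the text. It builds the joint vector on the right-hand side of Eq. (3.53) and reads the outcome probabilities off its ancilla blocks.

```python
import math

# Hypothetical example: amplitude-damping operators for one qubit,
# M_0 = diag(1, sqrt(1-p)) and M_1 = sqrt(p)|0><1|, with sum_m M_m^† M_m = 1.
p = 0.3
M = [
    [[1.0, 0.0], [0.0, math.sqrt(1.0 - p)]],
    [[0.0, math.sqrt(p)], [0.0, 0.0]],
]

def apply(op, vec):
    """Apply a 2x2 matrix to a length-2 vector."""
    return [sum(op[i][j] * vec[j] for j in range(2)) for i in range(2)]

def dilate(psi):
    """Right-hand side of Eq. (3.53): block m is the system vector M_m|psi>
    attached to the ancilla basis state |m>."""
    return [apply(M[m], psi) for m in range(len(M))]

def norm_sq(vec):
    return sum(abs(c) ** 2 for c in vec)

def conditional_sigma_z(block):
    """<sigma_z> in the normalized post-measurement state M_m|psi>."""
    return (abs(block[0]) ** 2 - abs(block[1]) ** 2) / norm_sq(block)

psi = [0.6, 0.8j]                       # normalized system state
joint = dilate(psi)

# Projecting the ancilla onto |m> leaves M_m|psi>|m> (Eq. 3.54), so the
# probability of outcome m is the squared norm of block m.
probs = [norm_sq(block) for block in joint]
total = sum(probs)                      # the joint norm is preserved
z_given_1 = conditional_sigma_z(joint[1])
```

Here `probs` plays the role of P(m) in Eq. (3.55) for a pure state, and `z_given_1` is the conditional expectation of σ_z given the outcome m = 1, which equals one because M_1 maps every state to |0〉.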
3.3 Quantum Filtering Theory
Quantum filtering theory has a particularly grandiose title but in actuality it is not
much more than what we have already developed here. Bouten et al. wrote an award-winning
introduction to the problem of quantum filtering and quantum stochastic
calculus [25]. This section does little more than quote their final results. The quantum
filter is in essence nothing more than the conditional expectation for a system
observable X, based upon a light observable, e.g. $Q^i_t$, after both have interacted through
a unitary $U_t$. The two light measurements that are typically considered are that of
measuring an output quadrature, e.g. $U_t^\dagger Q^i_t U_t$, or a direct photon number
measurement, e.g. $U_t^\dagger \Lambda^{ii}_t U_t$. Here we have focused on classical and quantum diffusion, and
so we will assume that we are measuring the quadrature $Q^i_t$. In addition, to simplify
the notation, we assume that we are considering a single field mode and will drop
the label i. More general expressions are not difficult to derive once the formalism
is in place; for examples see [64].
The quantum filter for a time-independent system observable X is written as a
time-indexed map $\pi_t(X)$ and is the conditional expectation of the unitarily evolved
operator $U_t^\dagger X U_t$, conditioned on the (continuous) set of measurements of an output
process $\{Y_t\}_{t \ge 0}$. In the diffusive case
\[ Y_t \equiv U_t^\dagger Q_t U_t = U_t^\dagger (A_t + A_t^\dagger) U_t. \tag{3.57} \]
For a general single-mode interaction, $U_t$ is the solution to the QSDE given in
Appendix C, Eq. (C.27). In the 1D case with no scattering interactions we are able
to calculate that
\[ Y_t = Q_t + \int_0^t U_s^\dagger (L + L^\dagger) U_s \, ds. \tag{3.58} \]
The general expressions for the unitary evolution of any system operator X and the
fundamental field processes $A^j_t$, $A^{i\dagger}_t$ and $\Lambda^{ij}_t$ are given in Appendix C, Sec. C.1.1.
Sec. 2.5.2 showed that in vacuum expectation, Qt has statistics of a Wiener
process. Because of this one may be tempted to interpret Eq. (3.58) as the time
integral of a system operator plus quantum white noise. We urge the reader to
avoid this temptation because, as Sec. C.1.1 shows, $U_t^\dagger (L + L^\dagger) U_t$ is generally a very
complicated expression involving integrals with respect to $d\Lambda_t$, $dA_t$, and $dA_t^\dagger$. $Y_t$ is a
fully coherent operator acting on the joint Hilbert space $\mathcal{H} \otimes \mathcal{F}(h_{[0,t]})$ and does not
generally commute with $Q_t$.
It is, however, not difficult to show that Yt commutes with itself at different
times, i.e. $[Y_t, Y_s] = 0$ for any times t and s. Therefore a continuous observation of Y
between the times 0 and t produces a set of commuting observables $\{Y_s : 0 \le s \le t\}$.
This set of observations can then be used to form a commutative von Neumann
algebra $\mathcal{Y}_t$. The quantum filter $\pi_t(X)$ is then given by the conditional expectation
\[ \pi_t(X) \equiv \mathbb{E}\big( U_t^\dagger X U_t \,\big|\, \mathcal{Y}_t \big). \tag{3.59} \]
Finding an expression for πt(X) requires implementing the conditional expectation
in the form given in Eq. (3.50). Note that in general, the conditional expectation
depends upon the properties of the joint system-field state, and so you will arrive at
different filtering equations if the field is in vacuum [25], a coherent state [65], or
a state with nonclassical photon statistics [66]. The quadrature measurement of a
single mode in vacuum expectation is arguably the simplest of all cases, and is what
we will use exclusively here. The bottom-line result is that the quantum filter for
any system operator X is given by the recursive QSDE
\[ d\pi_t(X) = \pi_t(\mathcal{L}_{00}(X))\, dt + \big( \pi_t(L^\dagger X + X L) - \pi_t(L^\dagger + L)\, \pi_t(X) \big) \big( dY_t - \pi_t(L + L^\dagger)\, dt \big) \tag{3.60} \]
with the initial condition π0(X) = E(X). This is closely analogous to the classical
Kushner-Stratonovich equation of nonlinear filtering [25]. The operator map
\[ \mathcal{L}_{00}(X) = +i[H, X] + L^\dagger X L - \tfrac{1}{2} L^\dagger L X - \tfrac{1}{2} X L^\dagger L \tag{3.61} \]
is the 00 Evans-Hudson map (see Sec. C.1.1) and is essentially the Heisenberg picture
version of the Lindblad master equation. A serious drawback to the quantum filter is
that, because it is recursive, it will very rarely close. In order to propagate Eq. (3.60)
for the operator X, we also need to calculate in parallel the filter for the operators
A = L† + L, B = L†X + XL, and C = L00(X). It is also highly likely that the space
of possible system operators is not generated by these four operators alone. To
calculate πt(A), we also need to know πt(L00(A)), which itself will
likely generate more complicated operators. Fortunately, a saving grace is that we
can invert this equation to find an effective “noisy” system operator ρt. The equation
of motion for ρt is the conditional master equation, which we will discuss in Sec. 3.4.
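A short worked case may make the closure problem concrete. The following is a sketch under two assumptions made only for illustration: the measurement operator is L = √κ Jz (the example used in Sec. 3.4.3) and the Hamiltonian is set to zero. Substituting X = Jz and X = Jz² into Eq. (3.60) gives:

```latex
% Assumptions for illustration only: L = \sqrt{\kappa} J_z and H = 0.
\begin{align*}
\mathcal{L}_{00}(J_z^k) &= \kappa\left( J_z J_z^k J_z - \tfrac12 J_z^2 J_z^k - \tfrac12 J_z^k J_z^2 \right) = 0, \\
d\pi_t(J_z) &= 2\sqrt{\kappa}\,\big( \pi_t(J_z^2) - \pi_t(J_z)^2 \big)
               \big( dY_t - 2\sqrt{\kappa}\,\pi_t(J_z)\,dt \big), \\
d\pi_t(J_z^2) &= 2\sqrt{\kappa}\,\big( \pi_t(J_z^3) - \pi_t(J_z)\,\pi_t(J_z^2) \big)
               \big( dY_t - 2\sqrt{\kappa}\,\pi_t(J_z)\,dt \big).
\end{align*}
```

Each moment of Jz couples to the next higher one, so the chain of equations ends only once a power of Jz can be rewritten in terms of lower powers.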
Before doing so, we would like to highlight one important issue that makes a
strong distinction between quantum and classical filtering. In the classical case the
filter is one of a couple of operations that one is interested in computing condi-
tioned on an observation process yt. Another process that one is interested in is a
smoother, which is defined classically as
\[ \pi_{s,t}(x) \equiv \mathbb{E}\big( x_s \,\big|\, \{ y_{t'} : 0 \le t' \le t \} \big) \quad \text{for } s \le t. \tag{3.62} \]
Classically this is a perfectly well defined thing to do, as long as xs is measurable
with respect to the σ-algebra defining the global probability space (Ω,F ,P). One
would generally then be tempted to define a quantum mechanical smoother,
\[ \pi_{s,t}(X) \equiv \mathbb{E}\big( U_s^\dagger X U_s \,\big|\, \mathcal{Y}_t \big) \quad \text{for } s < t. \tag{3.63} \]
Unfortunately this object is not well defined for every system operator X, because
$U_s^\dagger X U_s$ is not in the commutant of $\mathcal{Y}_t$. To see why, consider that
$Y_t = Q_t + \int_0^t U_r^\dagger (L + L^\dagger) U_r \, dr$, which certainly has support upon the system Hilbert space via the time
integral of $U_r^\dagger (L + L^\dagger) U_r$. There is no guarantee that $[U_s^\dagger X U_s, U_t^\dagger (L + L^\dagger) U_t] = 0$ for any
X and times t, s. The reason that $U_t^\dagger X U_t$ is in the commutant of $\mathcal{Y}_t$ is that we
can show that $U_s^\dagger Q_s U_s = U_t^\dagger Q_s U_t$ for $s \le t$. This property then shows us that
\[ [U_t^\dagger X U_t, Y_s] = [U_t^\dagger X U_t, U_t^\dagger Q_s U_t] = U_t^\dagger [X, Q_s] U_t = 0 \tag{3.64} \]
for s ≤ t. This means that the post interaction system operator at time t is able
to be conditioned on past measurements. However the same “advancement” trick is
not possible for the system observable, and therefore there is no guarantee that Eq.
(3.63) is well defined. If you simply threw caution to the wind and went through
a smoothing calculation, even though $U_s^\dagger X U_s$ is not in the commutant of $\mathcal{Y}_t$, then
it is quite possible that by conditioning you could take positive operators to negative
ones or even Hermitian observables into non-Hermitian operators [25].
Tsang proposed a time-symmetric quantum smoother where one calculates a
smoothing operation for a classical signal imprinted on a quantum system [67]. In
this case, the smoother is calculating a conditional estimate for the classical signal
and therefore commutes with both the system and field operators. In Chap. 5
we wish to form an estimate for the system state at the initial time t = 0 given
measurements up to time t. One might be tempted to try and formulate a quantum
smoothing equation E(X|Yt), but as we just showed such an object is not in general
well defined. Therefore we have to resort to different methods.
3.4 The Conditional Master Equation
Sec. 3.3 just showed how one could form a conditional estimate for system observables
based upon a measurement of an output light quadrature via the Heisenberg
picture formalism of quantum probability. A serious drawback is that the filtering
equations are recursive and hardly ever close. The saving grace of this is to convert
to a randomized Schrodinger picture and work with a Conditional Master Equation
(CME).
We know from Sec. 3.3, that every commutative space of operators is mappable to
a classical probability space. We also know that from the definition of the conditional
expectation, the filter $\pi_t(X) = \mathbb{E}(U_t^\dagger X U_t \,|\, \mathcal{Y}_t)$ is an operator in $\mathcal{Y}_t$. And so if we
generate a classical probability space (Ω,F ,P) for Yt then the filter πt(X) should
be representable in that space. Furthermore in a given experiment, the eigenvalues
we receive from measuring Y form a realization of a classical stochastic process yt
defined on that probability space.
What this means in practice is that we will now focus our attention solely on
system variables and treat the measurement record yt as a classical stochastic process. It
is in this sense that we call the conditional master equation a semiclassical equation.
Specifically, it treats the output measurements $\{Y_t\}_{t \ge 0}$ as a classical random variable
while the system undergoes a noisy quantum evolution. In our opinion, it cannot
be overemphasized that this process has its origin as a quantum object, and so not
every operator will commute with Yt, particularly past system observables.
With that warning to tread lightly, finding a semiclassical equation for a noisy
system state ρt is remarkably easy. Such an equation begins by enforcing that for
every system operator X, we must have (see footnote 6)
Tr(ρtX) ∼= πt(X). (3.65)
To find an SDE for ρt, we simply notice two things. In every term of Eq. (3.60),
there is a coefficient πt(Y ) of some operator Y which is in turn relatable to Tr(ρtY ).
The second is that the only quantum stochastic differential in Eq. (3.60) is dYt,
which from Eq. (3.58), satisfies the quantum Ito rule,
dYt dYt = dt. (3.66)
Therefore in the semiclassical mapping dyt also has the Ito rule
dytdyt = dt. (3.67)
With these two observations we have,
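\[ \mathrm{Tr}(d\rho_t X) = \mathrm{Tr}(\rho_t\, \mathcal{L}_{00}(X))\, dt + \Big( \mathrm{Tr}\big( \rho_t (L^\dagger X + X L) \big) - \mathrm{Tr}\big( \rho_t (L^\dagger + L) \big)\, \mathrm{Tr}(\rho_t X) \Big) \Big( dy_t - \mathrm{Tr}\big( \rho_t (L^\dagger + L) \big)\, dt \Big). \tag{3.68} \]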
We can use the cyclic property of the trace to decompose L00(X) into an adjoint
map acting on ρt,
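\[ \mathrm{Tr}(\rho_t\, \mathcal{L}_{00}(X)) = \mathrm{Tr}\Big( \big( -i[H_t, \rho_t] + L \rho_t L^\dagger - \tfrac{1}{2} L^\dagger L \rho_t - \tfrac{1}{2} \rho_t L^\dagger L \big)\, X \Big). \tag{3.69} \]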
6 Mathematically, this equivalence may seem strange, as the left-hand side is a scalar-valued random variable while the right-hand side is an operator in $\mathcal{Y}_t$. The equivalence is made through the classical outcome ω that labels the set of eigenvalues we receive from the measurement.
By making the same kind of transformation of the remaining terms and noting that
it is true for any system operator X, we arrive at the conditional master equation
(CME)
dρt = −i[Ht, ρt] dt+D[L](ρt) dt+H[L](ρt) dvt, (3.70)
with the initial condition ρ0 = ρ(0), where we have made the following definitions. D[L](ρt)
is the Lindblad operator map commonly found in open quantum systems and is
defined as
\[ \mathcal{D}[L](\rho_t) \equiv L \rho_t L^\dagger - \tfrac{1}{2} L^\dagger L \rho_t - \tfrac{1}{2} \rho_t L^\dagger L. \tag{3.71} \]
H[L](ρt) is the state update map defined as
H[L](ρt) ≡ Lρt + ρt L† − Tr((L+ L†) ρt) ρt. (3.72)
This map shows how the state updates, weighted by the strength of the stochastic
process,
dvt = dyt − Tr((L+ L†) ρt) dt. (3.73)
The random process vt, called the innovation process, plays an important role as
it is the only random contribution to the CME. In the next section we will review
the proof that when everything about the measurement yt is properly specified,
vt is a realization of a Wiener process.
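To make the structure of Eqs. (3.70) to (3.73) concrete, here is a minimal simulation sketch for a single qubit. Several things are assumptions made for illustration and are not taken from the text: the Euler-Maruyama discretization, the choice L = √κ σz/2, the drive H = σx/2, the step size, and all helper names. The measurement record is generated from the filter's own state, so the innovation reduces to the simulated Wiener increment.

```python
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def cme_step(rho, H, L, dt, dW):
    """One Euler-Maruyama step of the Ito CME, Eq. (3.70)."""
    Ld = dagger(L)
    LdL = matmul(Ld, L)
    mean = trace(matmul(add(L, Ld), rho)).real            # Tr((L + L^†) rho)
    dy = mean * dt + dW            # simulated record increment (consistent filter)
    dv = dy - mean * dt            # innovation, Eq. (3.73)
    comm = add(matmul(H, rho), scale(-1, matmul(rho, H)))
    D = add(matmul(matmul(L, rho), Ld),
            scale(-0.5, matmul(LdL, rho)),
            scale(-0.5, matmul(rho, LdL)))                 # Eq. (3.71)
    Hmap = add(matmul(L, rho), matmul(rho, Ld), scale(-mean, rho))  # Eq. (3.72)
    return add(rho, scale(-1j * dt, comm), scale(dt, D), scale(dv, Hmap))

random.seed(0)
kappa, dt, steps = 1.0, 1e-3, 2000
L = [[0.5 * math.sqrt(kappa), 0], [0, -0.5 * math.sqrt(kappa)]]   # sqrt(kappa) sigma_z / 2
H = [[0, 0.5], [0.5, 0]]                                          # hypothetical drive
rho = [[0.5, 0.5], [0.5, 0.5]]                                    # initial state |+><+|
for _ in range(steps):
    rho = cme_step(rho, H, L, dt, random.gauss(0.0, math.sqrt(dt)))
final_trace = trace(rho)
```

Every term added in `cme_step` is traceless and Hermitian, so the trace and Hermiticity of ρt are preserved up to rounding. A plain Euler-Maruyama step does not guarantee positivity, so a more careful integrator would be needed for quantitative work.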
3.4.1 The innovation process
Here we will show that the innovation process vt transforms yt into a Wiener process
by subtracting off the conditional expected mean. In classical probability, Levy’s
theorem is an important result because it gives necessary and sufficient conditions
for showing that a given process is in fact a Wiener process. Roughly stated, if a
stochastic process mt is a “local martingale” and obeys the Ito rule $(dm_t)^2 = dt$,
then it must be a Wiener process [25]. Martingales are an important kind of stochastic
process that play a crucial role in classical probability theory (see Sec. 3.1.1). In
essence, a martingale is a random process where the conditional mean of any future increment
is zero [22, 53].
The proof that vt is a Wiener process is given in theorem 7.1 of reference [25]
and relies on some fundamental properties of the conditional expectation. We quote
this result in Lemma 3.1, for two reasons. The first is simply because it is easily
shown and is a rather elegant result. The second is that Chap. 5 uses the fact
that vt is a Wiener process only when ρt is “consistent” with the actual statistics of
$\{y_t\}_{t \ge 0}$. Here consistency means that the correspondence Tr(ρtX) ∼= πt(X) holds in
the sense that πt(X) is a conditional expectation of X with respect to Yt, under the
true quantum state. If ρt does not exactly match πt(·) because its initial condition
is wrong or any number of other approximations, then vt will not generally have the
statistics of a Wiener process. See Secs. 5.4 and 5.5 for further discussion.
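The role of consistency can be illustrated with a purely classical toy model. This sketch is an assumption-laden stand-in, not the quantum filter itself: the record has a constant drift `true_mean` in place of the state-dependent Tr((L + L†)ρt), and the function names and parameter values are hypothetical. It shows that subtracting the correct mean leaves a zero-mean innovation, while a mismatched mean leaves a residual drift.

```python
import math
import random

def innovation_endpoint(true_mean, assumed_mean, rng, T=1.0, steps=100):
    """Integrate dv = dy - assumed_mean*dt for a record dy = true_mean*dt + dW."""
    dt = T / steps
    v = 0.0
    for _ in range(steps):
        dy = true_mean * dt + rng.gauss(0.0, math.sqrt(dt))
        v += dy - assumed_mean * dt
    return v

def sample_mean(true_mean, assumed_mean, runs=1000, seed=1):
    """Average of v_T over many independent records."""
    rng = random.Random(seed)
    total = sum(innovation_endpoint(true_mean, assumed_mean, rng) for _ in range(runs))
    return total / runs

consistent = sample_mean(true_mean=2.0, assumed_mean=2.0)   # close to 0
mismatched = sample_mean(true_mean=2.0, assumed_mean=1.5)   # close to (2.0 - 1.5) * T
```

In the mismatched case the innovation is no longer a martingale, which is the classical counterpart of the statement that vt loses its Wiener statistics when ρt does not match πt(·).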
Lemma 3.1. In vacuum expectation, the quantum stochastic process $M_t \equiv Y_t - \int_0^t \pi_s(L + L^\dagger)\, ds$
is an instance of a quantum Wiener process in that its finite-dimensional
statistics are independent mean zero Gaussian random variables with variances
equal to the time differences.
Proof. In Sec. 3.1.1, we review that the classical definition of a martingale is that
it satisfies the property E(mt|Fs) = ms for s ≤ t. In the quantum case this is
equivalent to showing that E((Mt − Ms)|Ys) = 0. This is because Ms ∈ Ys and the
conditional expectation obeys the property that for every K ∈ Ys, E(K|Ys) = K.
By the definition of the conditional expectation, we have that for every K ∈ Ys
E (E(Mt −Ms|Ys)K ) = E ((Mt −Ms)K ) . (3.74)
Substituting the definition of Mt,
\[ \mathbb{E}\big( (M_t - M_s) K \big) = \mathbb{E}\big( (Y_t - Y_s) K \big) - \mathbb{E}\Big( \int_s^t ds'\, \pi_{s'}(L + L^\dagger)\, K \Big). \tag{3.75} \]
Notice, however, that since $\pi_{s'}(X) = \mathbb{E}(U_{s'}^\dagger X U_{s'} \,|\, \mathcal{Y}_{s'})$, we can again use the definition of
the conditional expectation to convert the second term into an expectation of an
integral of $U_{s'}^\dagger (L + L^\dagger) U_{s'}$. In Eq. (3.58) we solved for Yt, and found that
$Y_t = Q_t + \int_0^t ds'\, U_{s'}^\dagger (L + L^\dagger) U_{s'}$. After substituting that solution, Eq. (3.75) simplifies to
\[ \mathbb{E}\big( (M_t - M_s) K \big) = \mathbb{E}\big( (Y_t - Y_s) K \big) - \mathbb{E}\Big( \int_s^t ds'\, U_{s'}^\dagger (L + L^\dagger) U_{s'}\, K \Big) = \mathbb{E}\big( (Q_t - Q_s) K \big). \tag{3.76} \]
Any operator K ∈ Ys is an operator which acts on the system Hilbert space and the
Fock space associated with light operators defined for times s′ ∈ [0, s). The operator
Qt −Qs acts on light field states defined on the time interval [s, t]. This means that
this final expectation value factorizes to show,
E ((Mt −Ms)K ) = E(Qt −Qs)E(K) = 0. (3.77)
This is zero because the quadrature operator Qt is mean zero in vacuum, and so Mt
is indeed a martingale when we condition on Ys. The proof is finished by simply
observing that dMtdMt = dt and so Mt is a quantum Wiener process by Levy’s
theorem.
3.4.2 The Ito correction in the conditional master equation
The quantum filter πt(·) is given by an Ito form quantum stochastic differential
equation and therefore the conditional master equation is a semiclassical Ito equation.
In addition to an Ito integral, there is also a Stratonovich integral where the rules
of standard calculus still apply, but the statistical properties are more subtle (see
Appendix B for their respective definitions). While the two forms of integration are
distinct, they are related by a conversion formula, resulting in the “Ito correction”,
derived in Appendix B.1.1. In Chap. 4, we are required to work with a conditional
master equation written as a Stratonovich integral and so we derive this conversion
here.
For a general measurement operator L and Hamiltonian H, the conditional master
equation is
\[ d\rho_t = -i[H, \rho_t]\, dt + \mathcal{D}[L](\rho_t)\, dt + \mathcal{H}[L](\rho_t)\, dv_t. \tag{3.78} \]
The first two terms are simple deterministic integrals, are unaffected by the
choice of stochastic integral, and can be ignored. The integrand in the Ito integral is the
conditioning map, which for reference is
\[ \mathcal{H}[L](\rho_t) = L \rho_t + \rho_t L^\dagger - \mathrm{Tr}(L \rho_t + \rho_t L^\dagger)\, \rho_t. \tag{3.79} \]
For the remainder of this section we will suppress the parameterizing argument and
simply write H(ρt).
A one-dimensional Ito integral that is typically considered in an Ito-Stratonovich
conversion has a differential
dxt = b(xt)dwt, (3.80)
for a smooth integrand b(x). When written as a Stratonovich equation, this differential
is notated as
\[ dx_t = b(x_t) \circ dw_t. \tag{3.81} \]
The Ito correction is what results when you enforce that both integrals must give
the same process xt, and the final result is that
\[ b(x_t) \circ dw_t = b(x_t)\, dw_t + \tfrac{1}{2} \frac{db}{dx}(x_t)\, b(x_t)\, dt. \tag{3.82} \]
The additional term is known as the Ito correction.
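The conversion in Eq. (3.82) can be checked numerically. The sketch below is an illustration under stated assumptions: the integrand b(x) = x, the two discretization schemes, the step count, and the seed are choices made here and not in the text. For b(x) = x the Stratonovich equation obeys ordinary calculus, so its solution from x0 = 1 is exp(wt); the Ito form with the correction term should follow the same path.

```python
import math
import random

def simulate(T=1.0, steps=10000, seed=2):
    """Drive both forms of dx = b(x) ∘ dw with the same noise, for b(x) = x."""
    rng = random.Random(seed)
    dt = T / steps
    b = lambda x: x            # integrand b(x)
    db = lambda x: 1.0         # its derivative db/dx
    w = 0.0
    x_ito = 1.0                # Euler-Maruyama on the Ito form with the correction term
    x_str = 1.0                # Heun predictor-corrector, a Stratonovich scheme
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        x_ito += b(x_ito) * dw + 0.5 * db(x_ito) * b(x_ito) * dt
        x_pred = x_str + b(x_str) * dw
        x_str += 0.5 * (b(x_str) + b(x_pred)) * dw
        w += dw
    return x_ito, x_str, math.exp(w)

x_ito, x_str, x_exact = simulate()
```

Dropping the `0.5 * db * b * dt` term from the Ito update would instead produce a path that drifts away from `x_exact`, which is exactly the discrepancy the Ito correction accounts for.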
To immediately apply this result to the conditional master equation would involve
defining what it means to take the derivative of the map H(ρ) with respect to ρ.
Rather than defining a calculus of superoperators, we will return to the roots of the
relation (see Appendix B.1.1) and write the correction as
\[ dI = \mathcal{H}(\rho_t) \circ dv_t - \mathcal{H}(\rho_t)\, dv_t = \big( \mathcal{H}(\rho_t + \tfrac{1}{2} d\rho_t) - \mathcal{H}(\rho_t) \big)\, dv_t. \tag{3.83} \]
The map H is unfortunately not linear in ρt, and so the integrand on the
right-hand side is not simply $\mathcal{H}(\tfrac{1}{2} d\rho_t)$. After a little algebra we find that
\[ dI = \tfrac{1}{2} \Big( L\, d\rho_t + d\rho_t\, L^\dagger - \mathrm{Tr}(L \rho_t + \rho_t L^\dagger)\, d\rho_t - \mathrm{Tr}(L\, d\rho_t + d\rho_t\, L^\dagger)\, (\rho_t + \tfrac{1}{2} d\rho_t) \Big)\, dv_t. \tag{3.84} \]
To simplify this expression into a final form, we will substitute the Ito equation
expression for dρt and apply the Ito rule that dvtdvt = dt with all other differential
products being zero. This means that when substituting dρt we need to only use
the stochastic term as any deterministic term will result in a product dtdvt = 0.
Furthermore any term with two powers of dρt will also be zero as that will result in
three powers of dvt. The simplified expression is then,
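\[ dI = \tfrac{1}{2} \Big( L \mathcal{H}(\rho_t) + \mathcal{H}(\rho_t) L^\dagger - \mathrm{Tr}\big( (L + L^\dagger) \mathcal{H}(\rho_t) \big)\, \rho_t - \mathrm{Tr}\big( (L + L^\dagger) \rho_t \big)\, \mathcal{H}(\rho_t) \Big)\, dt \equiv \mathcal{I}_c[L](\rho_t)\, dt. \tag{3.85} \]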
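Substituting the definition of H[L](ρt), the Ito correction map, Ic[L](ρt), simplifies
to
\[ \begin{aligned}
\mathcal{I}_c[L](\rho_t) ={}& L \rho_t L^\dagger + \tfrac{1}{2} L^2 \rho_t + \tfrac{1}{2} \rho_t L^{\dagger 2} \\
& - \big( \langle L^\dagger L \rangle + \tfrac{1}{2} \langle L^2 \rangle + \tfrac{1}{2} \langle L^{\dagger 2} \rangle \big)\, \rho_t \\
& - \langle L + L^\dagger \rangle \big( L \rho_t + \rho_t L^\dagger - \langle L + L^\dagger \rangle\, \rho_t \big)
\end{aligned} \tag{3.86} \]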
where 〈X〉 = Tr(Xρt).
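As a worked special case, the correction map simplifies noticeably when the measurement operator is Hermitian. The block below is a sketch that simply sets L = L† in Eq. (3.86) and then inserts L = √κ Jz, the example used in Sec. 3.4.3; it introduces no new assumptions beyond that substitution.

```latex
% Specialization of Eq. (3.86) to a Hermitian measurement operator, L = L^\dagger.
\begin{align*}
\mathcal{I}_c[L](\rho_t)
  &= L\rho_t L + \tfrac12\left( L^2\rho_t + \rho_t L^2 \right)
     - 2\langle L^2\rangle\,\rho_t
     - 2\langle L\rangle\left( L\rho_t + \rho_t L - 2\langle L\rangle\,\rho_t \right), \\
% With L = \sqrt{\kappa} J_z:
\mathcal{I}_c[\sqrt{\kappa}J_z](\rho_t)
  &= \kappa\Big( J_z\rho_t J_z + \tfrac12\left( J_z^2\rho_t + \rho_t J_z^2 \right)
     - 2\langle J_z^2\rangle\,\rho_t
     - 2\langle J_z\rangle\left( J_z\rho_t + \rho_t J_z - 2\langle J_z\rangle\,\rho_t \right) \Big).
\end{align*}
```

A quick consistency check is that the trace of the right-hand side vanishes when Tr(ρt) = 1, so the correction does not change the normalization of ρt.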
Ultimately the Stratonovich form of the conditional master equation is then given
by
\[ d\rho_t = -i[H_t, \rho_t]\, dt + \mathcal{D}[L](\rho_t)\, dt - \mathcal{I}_c[L](\rho_t)\, dt + \mathcal{H}[L](\rho_t) \circ dv_t. \tag{3.87} \]
3.4.3 The conditional Schrodinger equation
In this chapter we have focused solely on the interpretation of quantum mechanics
in terms of probability spaces. That description led to a quantum conditional
expectation and a quantum filter, which is described in a Heisenberg picture. From
that Heisenberg picture description we found a conditional master equation (CME).
When the state of the system is pure and the dynamics are such that it will re-
main pure, then propagating a full density matrix is unnecessary and a conditional
Schrodinger equation (CSE) is sufficient. Chap. 5 uses this fact for computational
efficiency and therefore we include the general expression for a CSE based upon the
CME in Eq. (3.70). The details of the conversion can be found in [37].
The CME gives the evolution for ρt in terms of an Ito differential dρt, Eq. (3.70).
Any density matrix whose purity is 1 can be represented by an outer product of a
normalized state vector in Hilbert space [63],
\[ \rho_t = |\psi_t\rangle\langle\psi_t| \quad \text{if and only if} \quad \mathrm{Tr}(\rho_t^2) = 1. \tag{3.88} \]
Furthermore |ψt〉 is unique up to an arbitrary constant phase. While we have worked
quite hard to derive the CME and give it physical meaning, practically speaking it
is “nothing” more than a matrix valued stochastic differential equation defined on
a classical probability space. Therefore a method for moving from a CME to a
CSE is to hypothesize the existence of a random state vector |ψt〉 satisfying some
vector-valued SDE for d|ψt〉 and then solve for the differential that makes
d(|ψt 〉〈ψt |) equal to the CME.
We note that this is not a standard derivation. Typically in quantum optics, one
first derives a stochastic Schrodinger equation via an unraveling of a master equation
that considers photon counting and then takes a diffusive limit [51]. Having already
developed the CME from the quantum filter, it is much simpler to perform the
above calculation, rather than including an independent derivation. The resulting
equations are identical.
The derivation of d|ψt〉 is not difficult, as we can see that the only random process
that enters Eq. (3.70) is the innovation vt, and it enters linearly. We
also know that vt satisfies the Ito rule, dvtdvt = dt. Therefore a reasonable form for
d|ψt〉 is
d|ψt〉 = At|ψt〉 dt+Bt |ψt〉 dvt. (3.89)
for some time-adapted but possibly state dependent operators At and Bt. The adjoint
of this equation is then
d〈ψt| = 〈ψt|A†t dt+ 〈ψt|B†t dvt. (3.90)
And so we need to solve for At and Bt subject to the constraint,
dρt = d|ψt〉 〈ψt|+ |ψt〉 d〈ψt|+ d|ψt〉 d〈ψt|. (3.91)
Doing so is not too difficult and the operators turn out to be
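\[ A_t = -i H_t - \tfrac{1}{2} \big( L^\dagger L - 2 \langle L^\dagger \rangle L + \langle L^\dagger \rangle \langle L \rangle \big) \tag{3.92} \]
and
\[ B_t = L - \langle L \rangle. \tag{3.93} \]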
Traditionally the operator L is Hermitian, and so if we choose our favorite example
of L =√κ Jz then
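\[ d|\psi_t\rangle = \Big( -i H_t - \tfrac{1}{2} \kappa\, (J_z - \langle J_z \rangle)^2 \Big) |\psi_t\rangle\, dt + \sqrt{\kappa}\, \big( J_z - \langle J_z \rangle \big) |\psi_t\rangle\, dv_t \tag{3.94} \]
with
\[ dv_t = dy_t - 2 \sqrt{\kappa}\, \langle J_z \rangle\, dt. \tag{3.95} \]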
This is the equation used in Chaps. 4 and 5.
Chapter 4
Projection Filtering for Qubit
Ensembles
This chapter derives an approximate form for the conditional dynamics of an ensem-
ble of n qubits under the assumption that the state will remain nearly an identical
separable state. We assume that the system is undergoing a diffusive measurement
of the collective angular momentum operator Jz while simultaneously experiencing
strong global rotations. The approximation is made by formulating a projection fil-
ter from the exact conditional master equation. The projection is made through the
technique of orthogonal projections in differential geometry. Here we identify the
space of identical separable states as a Riemannian manifold and then project the
conditional master equation into its tangent space. We also review the elements of
differential geometry that make such a mapping possible. Finally we test the accuracy
of the projection filter numerically by comparing it to simulations of a stochastic
Schrodinger equation. We find that it matches the conditional mean spin projections
to within a 5% RMS error.
4.1 Introduction
Numerical integration of a conditional master equation is generally a resource-intensive
exercise. Specifying a general mixed state for a d-dimensional quantum system
requires $d^2 - 1$ real parameters. Furthermore the total dimension of a many-body
system grows exponentially. A system of n qubits generates a $2^n$-dimensional Hilbert
space, requiring $2^{2n} - 1$ parameters. This “curse of dimensionality” is true even in
an unconditioned system and so physicists often search for symmetries that allow for
a more efficient description. The nonlinearity in the conditional master equation
means that a number of symmetries that are often preserved in an unconditioned map
are no longer exploitable.
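The growth described above is easy to tabulate. This small sketch only evaluates the parameter count d² − 1 with d = 2^n; the function name and the chosen values of n are illustrative.

```python
def general_state_parameters(n_qubits):
    """Real parameters of a general mixed state of n qubits: d^2 - 1 with d = 2^n."""
    d = 2 ** n_qubits
    return d * d - 1

# Parameter counts for a few hypothetical ensemble sizes.
table = {n: general_state_parameters(n) for n in (1, 2, 10, 20)}
```

For n = 1 the count is 3, the components of a single Bloch vector, while n = 20 already requires about 10^12 real numbers.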
A projection filter is a tool that was developed in the context of classical filtering
theory and provides a general method for constraining nonlinear estimators to remain
in a lower dimensional space [31, 32]. Within the past decade these tools have
also been applied to quantum systems, specifically for cavity QED systems [33–
36], collective spin systems [37], and low rank approximations for general master
equations [38]. The flexibility of the projection method is provided by its formulation
in the language of differential geometry. In the quantum framework we have a high,
possibly infinite, dimensional manifold representing the space of possible states. It
is often the case that the system is initialized in a state with a large amount of
symmetry thereby initially allowing an efficient, lower dimensional representation.
The projection filter modifies the exact evolution in such a way as to constrain the
system to remain in the lower dimensional submanifold. It does so by projecting the
differential into the lower dimensional tangent space.
Here we focus on an ensemble of n qubits initially prepared in an identical tensor
product state. In other words, the total state of the system ρtot is initialized as an
n-fold tensor product of a single qubit state ρ,
\[ \rho_{\mathrm{tot}} = \rho^{\otimes n}. \tag{4.1} \]
Clearly this is a highly symmetric and easily represented state, as a single qubit
state requires only 3 parameters to be specified uniquely. If the master equation
acts on each qubit individually then the total system will remain in an identical
separable state for all future times. However for a joint qubit system undergoing
a weak, diffusive measurement of the collective angular momentum variable Jz, the
conditional master equation is generally entangling. In the long time limit, this kind
of measurement most often results in the system projecting into a nonseparable Dicke
state.
In this chapter we demonstrate, through numerical simulation, that if the system
also undergoes strong, randomized rotations in addition to the collective measurement,
then the system will remain nearly separable. Under the assumption that this
is the case, we apply the technique of projection filtering to the conditional master
equation so that it maps identical separable states to identical separable states.
4.1.1 An introduction to differential projections
The general technique of differential projections can be understood through the following
example. Consider an ordinary scalar function defined on three dimensions,
f(x, y, z). The chain rule shows that the differential for f is
\[ df = \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy + \frac{\partial f}{\partial z}\, dz. \tag{4.2} \]
Suppose that we have a particle with position vector x(t), and at each time t we
evaluate f(x(t), y(t), z(t)). In order to have a complete description for f we clearly
need to keep track of all three components because a change in x, y or z induces a
change in f . Now suppose that keeping track of z is too much of a hassle and we
are only interested in tracking x(t) and y(t). The question posed by the projection
filter is, “how should we modify f so that we only need to track x and y?” The
answer comes from the fact that if $\partial f / \partial z = 0$ everywhere then f doesn’t change with z
and ultimately z can be ignored. Therefore the modification we should make is to set
the gradient of f to point only in the xy-plane, i.e. set $\partial f / \partial z = 0$. This modification is
the differential projection of f. Therefore we have a modified function $f|_{x,y}$, whose
differential is simply
\[ df|_{x,y} = \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy + 0\, dz. \tag{4.3} \]
The difficulty in forming a projection filter is that f is not usually written in terms
of x, y, z, but instead some other set of parameters, x′, y′, z′, or even just t.
Furthermore the desired subspace might be some complicated 2D surface with parameters
v and w. It is very likely that v and w may not even be orthogonal, at least
not in the same sense x, y and z are orthogonal. The first challenge in developing a
projection filter is to give the desired objective a geometric interpretation.
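The point about non-orthogonal parameters can be illustrated with ordinary vectors. The sketch below is an assumption made for illustration, not the construction used later in the chapter: it projects a gradient onto the plane spanned by two direction vectors using their 2×2 Gram matrix under the Euclidean inner product, and the function names and sample vectors are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_onto_span(vec, v, w):
    """Orthogonal projection of vec onto span{v, w} for linearly independent,
    possibly non-orthogonal v and w, via the 2x2 Gram matrix."""
    g_vv, g_vw, g_ww = dot(v, v), dot(v, w), dot(w, w)
    det = g_vv * g_ww - g_vw * g_vw
    b_v, b_w = dot(vec, v), dot(vec, w)
    c_v = (g_ww * b_v - g_vw * b_w) / det      # coefficient along v
    c_w = (g_vv * b_w - g_vw * b_v) / det      # coefficient along w
    return [c_v * vi + c_w * wi for vi, wi in zip(v, w)]

grad = [1.0, 2.0, 3.0]   # hypothetical gradient (df/dx, df/dy, df/dz) at some point
flat = project_onto_span(grad, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])   # orthogonal basis
skew = project_onto_span(grad, [1.0, 0.0, 0.0], [1.0, 1.0, 0.0])   # skewed basis, same plane
```

Both calls return the same vector, with the z component removed as in Eq. (4.3), because the projection depends only on the plane and the inner product and not on the basis chosen to describe the plane.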
4.1.2 The conditional master equation
Before embarking on a description of the geometry of quantum states, we will first
collect all of the necessary equations from previous chapters here for a single point
of reference. Sec. 3.4 found that the state of an atomic system conditioned on
a continuous diffusive measurement is easily represented by the conditional master
equation (CME) given by the Ito differential,
dρt = −i[H, ρt]dt+D[L](ρt)dt+H[L](ρt)dwt. (4.4)
(See Appendix B for a review of classical stochastic differential equations.) The
dissipation and conditioning maps, D[L](·) and H[L](·), are parameterized by the
measurement operator L and are defined as
D[L](ρ) = LρL† − 12L†Lρ− 1
2ρL†L (4.5)
and
H[L](ρt) = Lρt + ρtL† − Tr((L+ L†)ρt) ρt. (4.6)
Often we will omit the parameterizing argument and simply write D(ρt) and H(ρt).
Also note that ħ has been set equal to one, so that the Hamiltonian operator H has
units of frequency and the measurement operator L has units of root frequency.
Note that Sec. 3.4 used a slightly different notation, referring to the innovation as
dvt rather than dwt. Sec. 3.4.1 showed that the innovation computed from the measurement
record yt has the statistics of a Wiener process when the initial condition ρ0
coincides with the “true” initial state. In Chap. 5 this will not be the case; however,
here we are assuming that the initial condition is known and, in particular, that it
can be written as $\rho^{\otimes n}$. Therefore, throughout this chapter we will consider the innovation
to be a Wiener process and write it as dwt. Sec. 3.1.4 reviews the statistical
and defining properties of the Wiener process.
The physical system that we have in mind is the idealized linear Faraday inter-
action in Sec. 2.7, meaning the measurement operator is
L =√κ Jz (4.7)
where κ is a constant rate. In addition to this measurement, we consider applying a
uniform but time varying magnetic field, leading to the Hamiltonian
\[ H = f^x(t) J_x + f^y(t) J_y + f^z(t) J_z. \tag{4.8} \]
The control fields $f^i(t)$ are assumed to be real-valued, deterministic functions of
time (see footnote 1).
For reasons made apparent in Sec. 4.2.4, we will also need to work with the
Stratonovich form of the CME,
\[ d\rho_t = -i[H, \rho_t]\, dt + \mathcal{D}[L](\rho_t)\, dt - \mathcal{I}_c[L](\rho_t)\, dt + \mathcal{H}[L](\rho_t) \circ dw_t. \tag{4.9} \]
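1 In Chap. 5 the control fields are written as b(t); however, in this chapter the coordinates $b^i$ indicate the projected coefficients for the stochastic terms, so here we use $f^i(t)$ instead.
The conversion from the Ito form generated the Ito correction map, derived in Sec.
3.4.2, which is
\[ \begin{aligned}
\mathcal{I}_c[L](\rho_t) ={}& L \rho_t L^\dagger + \tfrac{1}{2} L^2 \rho_t + \tfrac{1}{2} \rho_t L^{\dagger 2} \\
& - \big( \langle L^\dagger L \rangle + \tfrac{1}{2} \langle L^2 \rangle + \tfrac{1}{2} \langle L^{\dagger 2} \rangle \big)\, \rho_t \\
& - \langle L + L^\dagger \rangle \big( L \rho_t + \rho_t L^\dagger - \langle L + L^\dagger \rangle\, \rho_t \big).
\end{aligned} \tag{4.10} \]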
Here the expectation value of the operator X has been written as 〈X〉 ≡ Tr(ρtX).
4.2 Differential Manifolds
A manifold M is most generally a continuous set of points that can be locally mapped
to a d-dimensional Euclidean space. In a neighborhood of any point in M we can
define a smooth mapping from points in that neighborhood to a flat space of dimension
d. How smooth this mapping needs to be often depends upon the author and
the context; generally it must be smooth enough that the tools of differential
calculus can be applied. The concept of smoothness is quite at odds with the random
nature of Brownian motion, as the Wiener process is provably nondifferentiable with
probability one. Here we will ultimately be considering random trajectories on a
differential manifold. The resolution between these two conflicting notions is that
while a diffusive trajectory is nondifferentiable, it is a trajectory in a smooth space.
For example, a two-dimensional Brownian motion is not a differentiable curve, but it is defined
on a 2D plane, which is smooth.
The specific manifold we need is the space of all valid density operators for n
qubits. For a single qubit, the Bloch vector defines a perfectly respectable one-to-one
mapping between a quantum state and the 3-dimensional Euclidean ball. The
conditional master equation then has a representation as a diffusive trajectory within
the Bloch ball. Defining an equivalent representation for a d-dimensional quantum
state is nontrivial and is still the subject of current research. While there does exist
an equivalent mapping to a ball living in a $(d^2 - 1)$-dimensional space, the boundary and
smoothness of this mapping are quite complex and not well understood [68]. Here we
will only be interested in a geometric representation of states that can be written as
n copies of a single qubit state. Ultimately a Bloch vector representation is sufficient
for our purposes.
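For reference, the single-qubit mapping mentioned above can be written out explicitly. This is a minimal sketch assuming the usual convention ρ = (1 + x σx + y σy + z σz)/2; the function names and the sample vector are illustrative and not taken from the text.

```python
def bloch_vector(rho):
    """Bloch vector (x, y, z) of a 2x2 density matrix,
    using rho = (1 + x*sx + y*sy + z*sz) / 2."""
    x = 2.0 * rho[1][0].real
    y = 2.0 * rho[1][0].imag
    z = (rho[0][0] - rho[1][1]).real
    return (x, y, z)

def density_matrix(x, y, z):
    """Inverse map: 2x2 density matrix from a Bloch vector."""
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]

plus = bloch_vector([[0.5, 0.5], [0.5, 0.5]])            # the state |+><+|
roundtrip = bloch_vector(density_matrix(0.3, -0.4, 0.5))
```

An identical separable state ρ⊗n is then fixed by the same three numbers, whatever the value of n.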
4.2.1 Tangent spaces
The differential projection we ultimately want to perform requires a deeper understanding
of how to define a gradient in a more abstract setting. A key conceptual
point is that we make an association between basis vectors in a d-dimensional space and
the partial derivatives we can take of a function defined on the manifold. Specifically,
a point p in the manifold M is representable by the coordinates $\{x^1, x^2, \ldots, x^d\}$. Any
smooth function f on the manifold, evaluated at this point p, can therefore also be
represented as a function of these coordinates, $f(x^1, x^2, \ldots, x^d)$. At the point p the
partial derivative of the function f with respect to the coordinate $x^i$ defines the rate
of change of f as $x^i$ is varied, i.e. it defines a line tangent to f pointing in the
direction of $x^i$.
The relation between partial derivatives and vectors can be formed by associating
the basis element $e_i$ with the partial derivative operator $\partial/\partial x^i$. Differential geometry
is concerned with defining structures that are independent of any given coordinate
system. Calculating a partial derivative with respect to a different coordinate system,
$y^i$, is easily accomplished by applying the usual chain rule (with an implied sum over
the repeated index j),
$$\frac{\partial}{\partial y^i} = \frac{\partial x^j}{\partial y^i}\,\frac{\partial}{\partial x^j}. \tag{4.11}$$
The coordinate independent quantity here is the space of all possible partial deriva-
tives we could take at this point. At first glance this may seem like a rather large
object, however the chain rule just showed that a partial derivative in one basis is
simply a linear combination of partial derivatives in another basis. Therefore the
space of all possible partial derivatives is simply the linear span of partials taken
with respect to some basis. This space is called the tangent space of M at point p,
denoted by,
$$T_p\mathcal{M} = \mathrm{span}\left\{ \left.\frac{\partial}{\partial x^i}\right|_p : i = 1 \ldots d \right\}. \tag{4.12}$$
Note that the tangent space is a d-dimensional vector space, as we are taking linear
combinations of d basis vectors. Often we will discuss a directional derivative,
meaning that we will be taking a derivative in the direction of another point in the
manifold. As this direction could have any relation to a given coordinate system, the
directional derivative defines a general vector in the tangent space. Another useful bit
of jargon is that a tangent vector defined for every point in the manifold
defines a vector field.
4.2.2 Riemannian Metrics and orthogonal projections
The tangent space $T_p\mathcal{M}$ defines the set of all possible partial derivatives one could
take at the point p. However, it does not describe how those derivatives are related.
While in a Cartesian basis we have a sense that $e_x$ is orthogonal to $e_y$, in general it is
hard to tell how the arbitrary vector $e_u$ is related to $e_v$. The missing element is a
metric, $\langle\cdot,\cdot\rangle_p$, describing a positive definite inner product between any two tangent
vectors at p. At each point p we can take the dot product of $e_u$ and $e_v$ to see how
they are related. If the space is Euclidean, then the metric will report the fact that
$e_x$ is orthogonal to $e_y$, which is not true in general.
A Riemannian manifold is a manifold M that is equipped with a metric that
varies continuously between different points. While in a Euclidean space the inner
product between two vectors doesn’t change between different points, this is not
true in a general space leading to much richer geometries. For a basis of vectors ej
spanning the tangent space TpM the metric at that point can be written as a d× d
matrix with components,
$$g_{ij}(p) \equiv \langle e_i, e_j \rangle_p. \tag{4.13}$$
In addition to being positive definite, a metric is also symmetric in that
$\langle e_i, e_j \rangle_p = \langle e_j, e_i \rangle_p$.
A metric gives a notion of two vectors being orthogonal and from that we are
able to make an orthogonal projection. This is crucially important as we wish to
project the conditional master equation into the tangent space of states that are n
copies of a single qubit state. For a Euclidean space, the orthogonal projection of
the vector v = v1e1 + v2e2 + v3e3 onto the XY plane is trivial to compute, as it
simply discards the e3 component. Given a metric and a general manifold we can
make a similar formulation.
Suppose for a Riemannian manifold $\mathcal{M}$ we have a submanifold $\mathcal{N} \subseteq \mathcal{M}$ of dimension
$n \le d$. Without explicitly constructing an orthogonal basis for every tangent
space $T_p\mathcal{M}$, we would like to find a map that discards the vector components orthogonal
to $T_p\mathcal{N}$. This can easily be done, given a basis of vectors $\{v_i : i = 1, \ldots, n\}$
that span $T_p\mathcal{N}$. The metric $\langle\cdot,\cdot\rangle_p$ taken from $\mathcal{M}$ can equally well be applied to
$\mathcal{N}$, as $T_p\mathcal{N}$ is a subspace of $T_p\mathcal{M}$. Applying this metric to $\{v_i\}$ we have the $n \times n$
matrix with elements
$$g_{ij}(p) = \langle v_i, v_j \rangle_p. \tag{4.14}$$
As the metric is positive definite, this matrix is invertible, and the entries of its inverse
are often written as $g^{ij}(p) \equiv \left(g(p)^{-1}\right)_{ij}$. We can now show that the map
$\Pi_\mathcal{N} : T_p\mathcal{M} \to T_p\mathcal{N}$,
$$\Pi_\mathcal{N}(\,\cdot\,) = g^{ij}(p)\, \langle v_j, \,\cdot\, \rangle_p\, v_i \tag{4.15}$$
is equivalent to discarding the component of its argument orthogonal to $T_p\mathcal{N}$.
The projection map should operate as the identity for any vector $u \in T_p\mathcal{N}$.
To check that this is true, note that $T_p\mathcal{N} = \mathrm{span}\{v_i : i = 1 \ldots n\}$, and so u can be
written as $u = u^k v_k$ for some coefficients $u^k$. Then we have
$$\begin{aligned}
\Pi_\mathcal{N}(u) &= g^{ij}(p)\, \langle v_j, u^k v_k \rangle_p\, v_i \\
&= g^{ij}(p) \left( u^k g_{jk}(p) \right) v_i \\
&= u^k\, g^{ij}(p)\, g_{jk}(p)\, v_i \\
&= u^k \delta^i_k\, v_i = u.
\end{aligned} \tag{4.16}$$
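$\Pi_\mathcal{N}$ should also return zero for every vector orthogonal to $T_p\mathcal{N}$. This is also easy to
check, as for every $v^\perp$ in the orthogonal complement of $T_p\mathcal{N}$, we have
$\langle v, v^\perp \rangle_p = 0$ if $v \in T_p\mathcal{N}$. Therefore,
$$\Pi_\mathcal{N}(v^\perp) = g^{ij}(p)\, \langle v_j, v^\perp \rangle_p\, v_i = 0. \tag{4.17}$$
But as every vector in $T_p\mathcal{M}$ decomposes into a component in $T_p\mathcal{N}$ and a component in its
orthogonal complement, $T_p\mathcal{M} = T_p\mathcal{N} \oplus (T_p\mathcal{N})^\perp$, and $\Pi_\mathcal{N}$ is linear, $\Pi_\mathcal{N}$ is the correct mapping.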
Note that Eq. (4.15) required only specifying the metric $g_{ij}(p)$ on the submanifold
$\mathcal{N}$ and does not require an explicit representation for tangent vectors outside of this
subspace. This is the reason why it is not necessary to find an explicit mapping
from the space of n-qubit density matrices to a $(d^2-1)$-dimensional Euclidean
space in order to use the projection filtering methods. All we need is a valid metric
for density matrices and a spanning set of tangent vectors in the submanifold we
wish to project onto.
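As a concrete illustration of Eq. (4.15), the following is a minimal numerical sketch, assuming an ordinary Euclidean ambient space and a deliberately non-orthogonal basis for the subspace. The function name `project_onto_span` and the example vectors are hypothetical and chosen only for illustration; the point is that the inverse Gram matrix $g^{ij}$ compensates for the non-orthogonality of the $v_i$.

```python
import numpy as np

def project_onto_span(basis, w, metric=None):
    """Evaluate Pi_N(w) = g^{ij} <v_j, w> v_i for the rows v_i of `basis`."""
    basis = np.asarray(basis, dtype=float)
    w = np.asarray(w, dtype=float)
    dim = basis.shape[1]
    # Metric of the ambient space; the identity is an illustrative Euclidean choice.
    m = np.eye(dim) if metric is None else np.asarray(metric, dtype=float)
    gram = basis @ m @ basis.T                 # g_ij = <v_i, v_j>
    overlaps = basis @ m @ w                   # <v_j, w>
    coeffs = np.linalg.solve(gram, overlaps)   # g^{ij} <v_j, w> without forming the inverse
    return coeffs @ basis

# A deliberately non-orthogonal basis for the XY plane in R^3.
basis = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0]])
u_tangent = 2.0 * basis[0] - 3.0 * basis[1]    # lies in the span
u_normal = np.array([0.0, 0.0, 5.0])           # orthogonal to the span

proj_tangent = project_onto_span(basis, u_tangent)
proj_normal = project_onto_span(basis, u_normal)
proj_mixed = project_onto_span(basis, u_tangent + u_normal)
```

The three results mirror the two checks in Eqs. (4.16) and (4.17): a vector in the span is returned unchanged, an orthogonal vector is sent to zero, and a mixed vector keeps only its tangential part.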
4.2.3 Differentials on abstract manifolds
As the conditional master equation is written in terms of stochastic differentials,
we must see how a differential operates in a geometric context. In multivariable
calculus the fundamental object is the differential of the coordinates, e.g. dx, dy,
etc. In a more abstract space, it is difficult to intuit what the differential means.
For instance, how would one define a differential of a matrix, say the Pauli matrix
$\sigma_x$? Would it be the differential of its entries, the differential of its eigenvalues, or
maybe even a differential of both the eigenvalues and eigenvectors? The solution
to this problem is to consider not the differential of the individual points themselves,
but the differential after the application of a smooth map to the Euclidean space.
The differential in the abstract space is then inferred from the Euclidean differential.
This process of inference is called the pullback, in that you are pulling back from the
original mapping. Our ultimate goal is to interpret the conditional master equation
$d\rho_t$ in the language of differential geometry, and so we need to understand how it
relates to a Euclidean mapping.
Basic multivariable calculus shows that the total differential of the scalar function
f is given by,
$$df = \frac{\partial f}{\partial x^i}\, dx^i. \tag{4.18}$$
A differential can also be viewed as a linear map acting on tangent vectors. The action
of df on the tangent vector $\partial/\partial x^i$ is defined to be
$$df\left(\frac{\partial}{\partial x^i}\right) \equiv \frac{\partial f}{\partial x^i}. \tag{4.19}$$
While this may seem a bit abstruse at first, it is actually a very useful concept. To
see why, consider the most basic function we can consider, namely the coordinate
function $x^i$. The differential $dx^i$ has an action on the basis vector $\partial/\partial x^j$,
$$dx^i\left(\frac{\partial}{\partial x^j}\right) = \frac{\partial x^i}{\partial x^j} = \delta^i_j. \tag{4.20}$$
This shows that the coordinate differential is biorthogonal to $\partial/\partial x^j$, and therefore
can be thought of as a dual basis vector. When defining the tangent space in Sec.
4.2.1 we found that the partial derivatives spanned that space and the coordinate
transformation coefficients were simply linear expansion coefficients. The same is
true for the differential df, in that the $\partial f/\partial x^i$ are the expansion coefficients in a dual space
spanned by the basis vectors $dx^i$. The dual space is often called the cotangent space.
A differential of a function between two spaces can also be defined. While we just
considered the differential of a scalar valued function, f, we can also consider the
differential of a vector, matrix, or operator valued function. When df acted on a basis
vector $\partial/\partial x^i$ it returned a scalar value $\partial f/\partial x^i$, but with a more general mapping function
the returned value should be something other than a scalar. It turns out that when
you have a function $\varphi : \mathbb{R}^3 \to \mathcal{M}$, the differential of this is a function
$d\varphi : T_x\mathbb{R}^3 \to T_{\varphi(x)}\mathcal{M}$. The point being that when a function maps one space into another, the
differential maps tangent vectors to tangent vectors. This is best illustrated through
a concrete example, which we will give in Sec. 4.3.1, after formulating the Bloch
vector representation as a Riemannian manifold.
4.2.4 Stochastic calculus on differential manifolds
There seems to be a fundamental inconsistency between a smooth, infinitely differ-
entiable manifold and the nowhere differentiable path of a Wiener process. From
the Wong-Zakai theorem (see Appendix D), we know that if there exists a smooth,
ordinary differential equation that limits to a stochastic differential equation, then
the limit should be interpreted as a Stratonovich SDE. When trying to incorporate
stochastic differential equations into the language of differential forms, one approach
would be to enforce a smooth approximation, apply the differential technique and
then take a stochastic limit at the end. However a skeptical mathematician might
wonder if such a result could be believed as the end result might depend heavily on
how the smooth approximation was made.
At a practical level, the second order nature of the Ito rule is difficult to reconcile
with the notion of constrained motion on a submanifold. A simple example of this is
made in [32], which we will reproduce here. Consider the ordinary differentials,
$$\begin{aligned}
dx &= dt \\
dy &= 2t\, dt.
\end{aligned} \tag{4.21}$$
We can easily see that this describes the parabola $y = x^2$, which can also be
considered an immersion of a one-dimensional manifold into $\mathbb{R}^2$. (The parameterizing
function is $\varphi(t) = (t, t^2)$.) Furthermore we can see that the coefficients of these
equations describe a vector, $(1, 2x)$, tangent to the parabola. Were these equations
used to describe the evolution of a system whose initial condition is on the parabola,
we expect the system to remain on this submanifold.
In contrast, consider an equivalent system of Ito stochastic differential equations
$$\begin{aligned}
dx_t &= dw_t \\
dy_t &= 2x_t\, dw_t.
\end{aligned} \tag{4.22}$$
The coefficients still describe a vector, $(1, 2x_t)$, tangent to the parabola
$\varphi(t) = (x_t, x_t^2)$. However, a simple application of the Ito rule shows that these SDEs have
the solution,
$$\begin{aligned}
x_t &= x_0 + w_t \\
y_t &= y_0 + 2x_0 w_t + w_t^2 - t.
\end{aligned} \tag{4.23}$$
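So even if $(x_0, y_0) = (0, 0)$, the solution clearly does not remain on the parabola, since
then $y_t = x_t^2 - t$, even though the equations are described by a vector field in its
tangent space. Conversely, the same pair of equations interpreted as Stratonovich SDEs,
$$\begin{aligned}
dx_t &= dw_t \\
dy_t &= 2x_t\, dw_t,
\end{aligned} \tag{4.24}$$
have the solution
$$\begin{aligned}
x_t &= x_0 + w_t \\
y_t &= y_0 + 2x_0 w_t + w_t^2,
\end{aligned} \tag{4.25}$$
which properly describes diffusion on the parabolic manifold.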
Our ultimate goal is to take a system of stochastic differential equations and
modify their coefficients so that they remain constrained to a particular submanifold.
This example demonstrates that in order for the tangent space projection to be
effective, we must first express the Ito equation in a Stratonovich form.
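The parabola example can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it starts at the origin, uses a left-point (Euler-Maruyama) sum for the Ito interpretation and a midpoint sum for the Stratonovich interpretation, and the function name, step count and seed are arbitrary choices.

```python
import numpy as np

def simulate_parabola(total_time=1.0, steps=20000, seed=0):
    """Integrate dx = dw, dy = 2 x dw from (0, 0) under both interpretations."""
    rng = np.random.default_rng(seed)
    dt = total_time / steps
    dw = rng.normal(0.0, np.sqrt(dt), size=steps)   # Wiener increments
    x = np.concatenate(([0.0], np.cumsum(dw)))      # x_t = w_t on the time grid
    # Ito: the integrand is evaluated at the start of each step (Euler-Maruyama).
    y_ito = np.sum(2.0 * x[:-1] * dw)
    # Stratonovich: the integrand is evaluated at the midpoint of each step.
    y_strat = np.sum((x[:-1] + x[1:]) * dw)
    return x[-1], y_ito, y_strat

x_end, y_ito, y_strat = simulate_parabola()
ito_gap = x_end**2 - y_ito        # close to total_time: the path leaves y = x^2
strat_gap = x_end**2 - y_strat    # zero up to rounding: the path stays on y = x^2
print(f"Ito gap: {ito_gap:.4f}, Stratonovich gap: {strat_gap:.2e}")
```

The midpoint sum telescopes to $x_T^2$, so the Stratonovich path stays on the parabola for every noise realization, while the Ito path drifts below it by the accumulated quadratic variation, which approaches the elapsed time.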
4.3 The Bloch Sphere as a Riemannian Manifold
In order to describe the space of density matrices in geometric terms, we need to
choose a metric. There are an infinite number we can choose from and it is likely
that any results we arrive at will depend upon this choice. In the classical projection
filtering problem, Brigo et al. choose the Fisher information as the metric, as it
endows information theory with a nontrivial geometry [31]. van Handel and Mabuchi
follow this example and use a quantum version of the Fisher information [33]. Later
authors choose a different metric, namely the trace inner product [34–37]. While the
trace inner product does not have an immediate connection to quantum information
theory, it is significantly simpler to work with and, as we will shortly show, under
this metric the Bloch sphere for a single qubit is Euclidean. In showing this, we will
also formally construct the state space for a single qubit as a Riemannian manifold.
For a Hilbert space of dimension d, we will follow [68] and refer to the set of all
valid density operators as S(d). In the case of a qubit with d = 2, we already know
that the Bloch sphere is an incredibly useful parametrization of this set. Formally,
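we define this as the map $\rho : B \subset \mathbb{R}^3 \to S(2)$ so that
$$\rho(\mathbf{x}) = \tfrac{1}{2}\left( \mathbb{1} + \mathbf{x} \cdot \boldsymbol{\sigma} \right). \tag{4.26}$$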
As every valid quantum state is required to be trace 1 and positive semi-definite, we
have the constraint that |x| ≤ 1, implying that B is the unit ball.
Through the Bloch sphere mapping, we can construct a tangent space for S(2).
This is first done by defining a directional derivative for S(2). Consider the Bloch
vectors $\mathbf{x}$ and $\mathbf{y}$ ($|\mathbf{x}|, |\mathbf{y}| \in (0, 1)$). The derivative of $\rho(\mathbf{x})$ in the direction of $\mathbf{y}$ is
defined to be
$$D_\mathbf{y} \equiv \lim_{\lambda \to 0} \frac{\rho(\mathbf{x} + \lambda \mathbf{y}) - \rho(\mathbf{x})}{\lambda} = \frac{\mathbf{y} \cdot \boldsymbol{\sigma}}{2}. \tag{4.27}$$
Then assuming the standard Cartesian coordinate system $(x^1, x^2, x^3)$, we have the
basis of tangent vectors
$$D_i \equiv \tfrac{1}{2}\sigma_i. \tag{4.28}$$
The tangent space at the point $\rho(\mathbf{x}) \in S(2)$ is then
$$T_{\rho(\mathbf{x})}S(2) = \mathrm{span}\{ D_i : i = 1, 2, 3 \}. \tag{4.29}$$
Armed with these tangent vectors we will choose, with some foresight, the trace
inner product as a metric. For two tangent vectors $D_i$ and $D_j$ we have the metric
$$g_{ij} = \langle D_i, D_j \rangle_\rho \equiv \mathrm{Tr}(D_i^\dagger D_j). \tag{4.30}$$
While this could result in a complex metric, we can see that for the qubit the basis
vectors are Hermitian and therefore the metric is real. Also note that due to the cyclic
property of the trace, it is also symmetric. Then for the qubit, simply calculating
shows
$$\langle D_i, D_j \rangle_\rho = \tfrac{1}{4}\mathrm{Tr}(\sigma_i \sigma_j) = \tfrac{1}{2}\delta_{ij}. \tag{4.31}$$
Up to a factor of a half, the Bloch sphere is Euclidean under this metric.
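The single-qubit construction in Eqs. (4.26)-(4.31) is easy to verify numerically. The following is a minimal sketch; the helper names `bloch_state` and `directional_derivative` are hypothetical, and the finite-difference step is an arbitrary illustrative value (the map is linear, so any small step reproduces the limit up to rounding).

```python
import numpy as np

# Pauli matrices stacked as SIGMA[i] for i = x, y, z.
SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def bloch_state(x):
    """rho(x) = (1 + x . sigma) / 2, Eq. (4.26)."""
    x = np.asarray(x, dtype=complex)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", x, SIGMA))

def directional_derivative(x, y, lam=1e-6):
    """Finite-difference version of the limit in Eq. (4.27)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (bloch_state(x + lam * y) - bloch_state(x)) / lam

# Tangent basis D_i = sigma_i / 2 and the trace-inner-product metric g_ij.
tangent_basis = 0.5 * SIGMA
metric = np.array([[np.trace(Di.conj().T @ Dj).real for Dj in tangent_basis]
                   for Di in tangent_basis])

x0 = np.array([0.2, -0.3, 0.4])
y0 = np.array([0.5, 0.1, -0.2])
D_y = directional_derivative(x0, y0)
D_y_exact = 0.5 * np.einsum("i,ijk->jk", y0.astype(complex), SIGMA)
```

Here `metric` reproduces $g_{ij} = \tfrac{1}{2}\delta_{ij}$, and `D_y` agrees with $\mathbf{y}\cdot\boldsymbol{\sigma}/2$ independently of the base point `x0`.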
4.3.1 Projecting the unconditional master equation
In this section we work through an example of explicitly expressing an unconditional
master equation for a single qubit in terms of a differential form $d\rho$. We will also
do so generally, without assuming a Euclidean metric. Most generally $\rho(t)$ is a
map $\rho : \mathbb{R}^+ \to S(2)$. For any time, t, $\rho(t)$ returns a valid density matrix. Then
as a differential, the master equation is the map $d\rho : T_t\mathbb{R}^+ \to T_{\rho(t)}S(2)$, which is
specifically
$$d\rho = -i[H, \rho]\,dt + \mathcal{D}[L](\rho)\,dt, \tag{4.32}$$
for a general Hamiltonian H and jump operator L. Instead of the direct mapping
between time and density matrices, we would like to consider this in terms of the
Bloch sphere mapping of Eq. (4.26). This can be done if we consider the time
component as a kind of functional composition, so that ρ(t) = ρ(x(t)) for a map
$\mathbf{x}(t)$ between time and Bloch vectors. From Eq. (4.26), the general expression for
$d\rho : T_\mathbf{x}\mathbb{R}^3 \to T_{\rho(\mathbf{x})}S(2)$ is
$$d\rho = \tfrac{1}{2} a_i(\mathbf{x})\, \sigma_i\, dx^i. \tag{4.33}$$
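To see that this is indeed a map between the two tangent spaces we can simply calculate its
action on the basis vector $\partial/\partial x^j$,
$$d\rho\left(\frac{\partial}{\partial x^j}\right) = \sum_i \tfrac{1}{2} a_i(\mathbf{x})\, \sigma_i\, dx^i\left(\frac{\partial}{\partial x^j}\right) = \sum_i \tfrac{1}{2} a_i(\mathbf{x})\, \sigma_i\, \delta^i_j = \tfrac{1}{2} a_j(\mathbf{x})\, \sigma_j \tag{4.34}$$
where there is no sum in the final expression. This is clearly in the tangent space
$T_{\rho(\mathbf{x})}S(2)$ as it is proportional to $D_j = \tfrac{1}{2}\sigma_j$. Our ultimate goal is then to solve for
the coefficients $a_i(\mathbf{x})$.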
Any traceless $2 \times 2$ matrix can be written as a linear combination of Pauli
matrices. As both the commutator $[H, \rho]$ and the map $\mathcal{D}[L](\rho)$ are traceless, both
of these operations have an expansion in terms of the Pauli matrices.
Sec. 4.3 found that the tangent space $T_\rho S(2)$ is also spanned by the Pauli matrices,
meaning that $-i[H, \rho]$ and $\mathcal{D}[L](\rho)$ are vectors in this space. Thus finding the
coefficients $a_i(\mathbf{x})$ simply comes down to projecting these maps onto the basis vectors $D_i$.
Sec. 4.2.2 established that the general projection map $\Pi_\mathcal{N}$ can be written as Eq.
(4.15), in terms of the metric and its inverse $g^{ij}(\mathbf{x})$. We are able to write $d\rho(t)$ as
$$d\rho(\mathbf{x}(t)) = -i\, g^{ij}(\mathbf{x})\, \langle D_i, [H, \rho] \rangle_\rho\, D_j\, dt + g^{ij}(\mathbf{x})\, \langle D_i, \mathcal{D}[L](\rho) \rangle_\rho\, D_j\, dt. \tag{4.35}$$
But as this is a differential with respect to dt and not $dx^j$, we can define the
differentials for the time-dependent coordinates $x^j(t)$
$$dx^j = -i\, g^{ij}(\mathbf{x})\, \langle D_i, [H, \rho] \rangle_\rho\, dt + g^{ij}(\mathbf{x})\, \langle D_i, \mathcal{D}[L](\rho) \rangle_\rho\, dt, \tag{4.36}$$
meaning that
$$d\rho = D_j\, dx^j. \tag{4.37}$$
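Equation (4.36) can be turned directly into a short numerical routine for the single-qubit Bloch velocity. The sketch below is illustrative only: the function names are hypothetical, the Hamiltonian form $H = \tfrac{1}{2}\mathbf{f}\cdot\boldsymbol{\sigma}$ and the dephasing operator $L = \sqrt{\kappa}\,\sigma_z/2$ are assumed test cases rather than the specific operators used elsewhere in the text, and the metric is solved for generally instead of inserting $g^{ij} = 2\delta^{ij}$ by hand.

```python
import numpy as np

SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def bloch_state(x):
    x = np.asarray(x, dtype=complex)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", x, SIGMA))

def dissipator(L, rho):
    """D[L](rho) = L rho L^dag - (L^dag L rho + rho L^dag L) / 2."""
    Ld = L.conj().T
    return L @ rho @ Ld - 0.5 * (Ld @ L @ rho + rho @ Ld @ L)

def bloch_velocity(x, H, L):
    """dx^j/dt = g^{ij} <D_i, -i[H, rho] + D[L](rho)>, following Eq. (4.36)."""
    rho = bloch_state(x)
    drho_dt = -1j * (H @ rho - rho @ H) + dissipator(L, rho)
    D = 0.5 * SIGMA
    g = np.array([[np.trace(Di.conj().T @ Dj).real for Dj in D] for Di in D])
    overlaps = np.array([np.trace(Di.conj().T @ drho_dt).real for Di in D])
    return np.linalg.solve(g, overlaps)

x0 = np.array([0.2, -0.3, 0.4])
f = np.array([0.3, -0.2, 0.7])          # assumed control field (illustrative values)
kappa = 0.8                             # assumed dephasing rate (illustrative value)
H = 0.5 * np.einsum("i,ijk->jk", f.astype(complex), SIGMA)
zero = np.zeros((2, 2), dtype=complex)

v_rotation = bloch_velocity(x0, H, zero)
v_dephasing = bloch_velocity(x0, zero, np.sqrt(kappa) * 0.5 * SIGMA[2])
```

Under these assumptions the Hamiltonian part reproduces the familiar precession $\dot{\mathbf{x}} = \mathbf{f} \times \mathbf{x}$, and the dephasing part damps the transverse components at rate $\kappa/2$ while leaving the z component unchanged.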
4.4 Projections in the tensor product submanifold
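Our ultimate goal is to form a projection from a general state over n qubits, to the
closest n-fold tensor product of a single qubit state. We will define $\mathcal{P}$ to be the
submanifold of $S(2^n)$ which describes the space of all states of the form $\rho(\mathbf{x})^{\otimes n}$. This
space has the simple parameterization $\varrho : B \subset \mathbb{R}^3 \to \mathcal{P} \subset S(2^n)$ such that
$$\varrho(\mathbf{x}) \equiv \rho(\mathbf{x})^{\otimes n} = \frac{1}{2^n}\left( \mathbb{1} + \mathbf{x} \cdot \boldsymbol{\sigma} \right)^{\otimes n}. \tag{4.38}$$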
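We also need to identify the tangent spaces for each point in the submanifold.
Because of the linear nature of the one qubit map $\rho$, the directional derivative of $\varrho(\mathbf{x})$
with respect to $\mathbf{y}$ is simply
$$D_\mathbf{y} = \left. \frac{\partial}{\partial \lambda}\, \varrho(\mathbf{x} + \lambda \mathbf{y}) \right|_{\lambda = 0}. \tag{4.39}$$
A derivative acting on a tensor product must obey the Leibniz rule. The directional
derivative of $\rho(\mathbf{x})^{\otimes n}$ in the direction $\mathbf{y}$ must then be equal to
$$D_\mathbf{y}(\varrho(\mathbf{x})) = \left. \frac{\partial}{\partial \lambda}\, \rho(\mathbf{x} + \lambda \mathbf{y})^{\otimes n} \right|_{\lambda = 0} = \sum_{i=1}^{n} \rho(\mathbf{x})^{\otimes i-1} \otimes \tfrac{1}{2}\, \mathbf{y} \cdot \boldsymbol{\sigma} \otimes \rho(\mathbf{x})^{\otimes n-i}. \tag{4.40}$$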
For the single qubit, the directional derivative was uniform over the manifold,
which implied the Euclidean geometry for our simple metric. For multiple qubits,
this is no longer the case, which implies that $\mathcal{P}$ has a richer geometry. With a slight
abuse of notation, the basis vector associated with the coordinate $x^i$, evaluated at the
state $\rho(\mathbf{x})^{\otimes n}$, will be denoted $D_i(\mathbf{x})$ and is given by
$$D_i(\mathbf{x}) = \sum_{j=1}^{n} \rho(\mathbf{x})^{\otimes j-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho(\mathbf{x})^{\otimes n-j}. \tag{4.41}$$
The tangent space at $\varrho(\mathbf{x})$ is then
$$T_{\varrho(\mathbf{x})}\mathcal{P} = \mathrm{span}\{ D_i(\mathbf{x}) : i = x, y, z \}. \tag{4.42}$$
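The metric on $\mathcal{P}$ induced from the trace inner product is now easily calculated.
The product of the two basis vectors $D_i$ and $D_j$ is equal to
$$\begin{aligned}
D_i D_j &= \sum_{p,q=1}^{n} \left( \rho^{\otimes p-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho^{\otimes n-p} \right)\left( \rho^{\otimes q-1} \otimes \tfrac{1}{2}\sigma_j \otimes \rho^{\otimes n-q} \right) \\
&= \sum_{p=q=1}^{n} (\rho^2)^{\otimes p-1} \otimes \tfrac{1}{4}\sigma_i \sigma_j \otimes (\rho^2)^{\otimes n-p} \\
&\quad + \sum_{q>p=1}^{n} (\rho^2)^{\otimes p-1} \otimes \tfrac{1}{2}\sigma_i \rho \otimes (\rho^2)^{\otimes q-p-1} \otimes \tfrac{1}{2}\rho\, \sigma_j \otimes (\rho^2)^{\otimes n-q} \\
&\quad + \sum_{q<p=1}^{n} (\rho^2)^{\otimes q-1} \otimes \tfrac{1}{2}\rho\, \sigma_j \otimes (\rho^2)^{\otimes p-q-1} \otimes \tfrac{1}{2}\sigma_i \rho \otimes (\rho^2)^{\otimes n-p}.
\end{aligned} \tag{4.43}$$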
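The metric coefficient is then
$$\begin{aligned}
\langle D_i, D_j \rangle_{\varrho(\mathbf{x})} &= \mathrm{Tr}(D_i D_j) \\
&= \frac{n}{4}\, \mathrm{Tr}(\rho^2)^{n-1}\, \mathrm{Tr}(\sigma_i \sigma_j) + \frac{n(n-1)}{4}\, \mathrm{Tr}(\rho^2)^{n-2}\, \mathrm{Tr}(\rho\, \sigma_i)\, \mathrm{Tr}(\rho\, \sigma_j) \\
&= \frac{n}{2^n}\left( 1 + |\mathbf{x}|^2 \right)^{n-1} \delta_{ij} + \frac{n(n-1)}{2^n}\left( 1 + |\mathbf{x}|^2 \right)^{n-2} x^k x^\ell\, \delta_{ki}\, \delta_{\ell j}.
\end{aligned} \tag{4.44}$$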
We will often need to calculate the product between several collective operators
and then take the trace. While Eq. (4.43) has a distinct ordering to the tensor
products, resulting in the two sums $p < q$ and $q < p$, upon taking the trace this
order becomes irrelevant. Thus, there are only two relevant terms: $p = q$ and $p \neq q$.
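The closed form in Eq. (4.44) can be compared against a brute-force construction of the tangent vectors in Eq. (4.41). The following sketch builds $D_i(\mathbf{x})$ with explicit Kronecker products for small n; the helper names are hypothetical and the Bloch vector is an arbitrary test value.

```python
import numpy as np
from functools import reduce

SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def bloch_state(x):
    x = np.asarray(x, dtype=complex)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", x, SIGMA))

def tangent_vector(x, i, n):
    """D_i(x) = sum_j rho^{(j-1)} (x) sigma_i/2 (x) rho^{(n-j)}, Eq. (4.41)."""
    rho = bloch_state(x)
    total = 0
    for site in range(n):
        factors = [rho] * n
        factors[site] = 0.5 * SIGMA[i]
        total = total + reduce(np.kron, factors)
    return total

def metric_numeric(x, n):
    D = [tangent_vector(x, i, n) for i in range(3)]
    return np.array([[np.trace(Di.conj().T @ Dj).real for Dj in D] for Di in D])

def metric_closed_form(x, n):
    """Right-hand side of Eq. (4.44) in Cartesian coordinates."""
    x = np.asarray(x, dtype=float)
    s = 1.0 + x @ x
    return (n / 2**n) * s**(n - 1) * np.eye(3) \
        + (n * (n - 1) / 2**n) * s**(n - 2) * np.outer(x, x)

x0 = np.array([0.2, -0.3, 0.4])
metrics = {n: (metric_numeric(x0, n), metric_closed_form(x0, n)) for n in (1, 2, 3)}
```

For each n the two matrices in `metrics[n]` agree; the outer-product term is what makes the geometry of $\mathcal{P}$ non-Euclidean for $n > 1$.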
4.4.1 The metric in spherical coordinates
The metric as given by Eq. (4.44) has a simple form when written in spherical
coordinates. In terms of the Cartesian basis vectors $e_x, e_y, e_z$ the standard spherical
basis vectors are defined as,
$$\begin{aligned}
e_r &= \sin\theta \cos\phi\; e_x + \sin\theta \sin\phi\; e_y + \cos\theta\; e_z \\
e_\theta &= \cos\theta \cos\phi\; e_x + \cos\theta \sin\phi\; e_y - \sin\theta\; e_z \\
e_\phi &= -\sin\phi\; e_x + \cos\phi\; e_y.
\end{aligned} \tag{4.45}$$
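In analogy, we will define the associated tangent vectors,
$$\begin{aligned}
D_r(\mathbf{x}) &= \sin\theta \cos\phi\; D_x(\mathbf{x}) + \sin\theta \sin\phi\; D_y(\mathbf{x}) + \cos\theta\; D_z(\mathbf{x}) \\
D_\theta(\mathbf{x}) &= \cos\theta \cos\phi\; D_x(\mathbf{x}) + \cos\theta \sin\phi\; D_y(\mathbf{x}) - \sin\theta\; D_z(\mathbf{x}) \\
D_\phi(\mathbf{x}) &= -\sin\phi\; D_x(\mathbf{x}) + \cos\phi\; D_y(\mathbf{x}).
\end{aligned} \tag{4.46}$$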
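When $\mathbf{x}$ is in the subset $\{\mathbf{x} \in B : 0 < r < 1,\ 0 < \theta < \pi,\ 0 < \phi < 2\pi\}$, these vector
fields form a perfectly valid basis for each tangent space $T_{\varrho(\mathbf{x})}\mathcal{P}$.
It will also be convenient to define “spherical” Pauli matrices,
$$\begin{aligned}
\sigma_r &\equiv \sin\theta \cos\phi\; \sigma_x + \sin\theta \sin\phi\; \sigma_y + \cos\theta\; \sigma_z \\
\sigma_\theta &\equiv \cos\theta \cos\phi\; \sigma_x + \cos\theta \sin\phi\; \sigma_y - \sin\theta\; \sigma_z \\
\sigma_\phi &\equiv -\sin\phi\; \sigma_x + \cos\phi\; \sigma_y.
\end{aligned} \tag{4.47}$$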
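These operators obey the usual properties associated with Pauli matrices, in that for
$i, j, k \in \{r, \theta, \phi\}$
$$\mathrm{Tr}(\sigma_i) = 0 \tag{4.48a}$$
$$\mathrm{Tr}(\sigma_i \sigma_j) = 2\delta_{ij} \tag{4.48b}$$
$$[\sigma_i, \sigma_j] = 2i\, \epsilon_{ijk}\, \sigma_k \tag{4.48c}$$
$$\sigma_i \sigma_j + \sigma_j \sigma_i = 2\delta_{ij}\, \mathbb{1}. \tag{4.48d}$$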
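Furthermore, we have that
$$D_i(\mathbf{x}) = \sum_{j} \rho(\mathbf{x})^{\otimes j-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho(\mathbf{x})^{\otimes n-j} \tag{4.49}$$
for both Cartesian and spherical bases. And the state $\rho(\mathbf{x})$ can now be written as
$$\rho(\mathbf{x}) = \tfrac{1}{2}\left( \mathbb{1} + r\, \sigma_r \right). \tag{4.50}$$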
We can now use the fact that the Pauli matrices are orthogonal, and the fact that
the state ρ is now orthogonal to σθ and σφ to evaluate the inner product between
the spherical tangent vectors, and thus write the metric as a matrix in spherical
coordinates. From the general expression
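$$\langle D_i, D_j \rangle = \tfrac{1}{4} n\, \mathrm{Tr}(\rho^2)^{n-1}\, \mathrm{Tr}(\sigma_i \sigma_j) + \tfrac{1}{4} n(n-1)\, \mathrm{Tr}(\rho^2)^{n-2}\, \mathrm{Tr}(\sigma_i \rho)\, \mathrm{Tr}(\sigma_j \rho), \tag{4.51}$$
we have
$$\begin{aligned}
\langle D_r, D_r \rangle &= \frac{n}{2^n}\left( 1 + r^2 \right)^{n-1} \frac{1 + n r^2}{1 + r^2}, \\
\langle D_\theta, D_\theta \rangle &= \frac{n}{2^n}\left( 1 + r^2 \right)^{n-1}, \\
\langle D_\phi, D_\phi \rangle &= \frac{n}{2^n}\left( 1 + r^2 \right)^{n-1},
\end{aligned} \tag{4.52}$$
and
$$\langle D_r, D_\theta \rangle = \langle D_r, D_\phi \rangle = \langle D_\theta, D_\phi \rangle = 0. \tag{4.53}$$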
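As a matrix, the metric in spherical coordinates is given by
$$G(\mathbf{x}) = \frac{n}{2^n}\left( 1 + r^2 \right)^{n-1} \begin{pmatrix} \frac{1 + n r^2}{1 + r^2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{4.54}$$
and its inverse is
$$G^{-1}(\mathbf{x}) = \frac{2^n}{n}\left( 1 + r^2 \right)^{-(n-1)} \begin{pmatrix} \frac{1 + r^2}{1 + n r^2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.55}$$
Notice that when $n = 1$ we recover the simple Euclidean metric of $g_{ij} = \tfrac{1}{2}\delta_{ij}$.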
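Because the spherical tangent vectors in Eq. (4.46) are an orthogonal rotation of the Cartesian ones, the diagonal form in Eq. (4.54) can be checked by rotating the Cartesian metric of Eq. (4.44). The sketch below assumes that closed form as its starting point; the function names and the sample point are hypothetical.

```python
import numpy as np

def spherical_frame(theta, phi):
    """Rows are e_r, e_theta, e_phi expressed in the Cartesian basis, Eq. (4.45)."""
    st, ct = np.sin(theta), np.cos(theta)
    sp, cp = np.sin(phi), np.cos(phi)
    return np.array([[st * cp, st * sp, ct],
                     [ct * cp, ct * sp, -st],
                     [-sp, cp, 0.0]])

def cartesian_metric(r, theta, phi, n):
    """Eq. (4.44): g_ij = c1 delta_ij + c2 x_i x_j."""
    x = r * spherical_frame(theta, phi)[0]
    s = 1.0 + r**2
    return (n / 2**n) * s**(n - 1) * np.eye(3) \
        + (n * (n - 1) / 2**n) * s**(n - 2) * np.outer(x, x)

def spherical_metric(r, theta, phi, n):
    """<D_a, D_b> = R_ai g_ij R_bj for a, b in (r, theta, phi)."""
    R = spherical_frame(theta, phi)
    return R @ cartesian_metric(r, theta, phi, n) @ R.T

def spherical_metric_closed_form(r, n):
    """Eq. (4.54)."""
    return (n / 2**n) * (1 + r**2)**(n - 1) * np.diag([(1 + n * r**2) / (1 + r**2), 1.0, 1.0])

def spherical_inverse_closed_form(r, n):
    """Eq. (4.55)."""
    return (2**n / n) * (1 + r**2)**(-(n - 1)) * np.diag([(1 + r**2) / (1 + n * r**2), 1.0, 1.0])

r0, theta0, phi0, n0 = 0.6, 0.9, 2.1, 4
G = spherical_metric(r0, theta0, phi0, n0)
G_exact = spherical_metric_closed_form(r0, n0)
G_inv_exact = spherical_inverse_closed_form(r0, n0)
```

The rotated metric is diagonal because the only anisotropic contribution, $x_i x_j$, points purely along $e_r$.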
4.4.2 Calculating collective operator inner products
This section contains the detailed calculations necessary for projecting the various
components of the conditional and unconditional master equations onto the space of
identical separable states. We will first derive the projection for a general conditional
master equation of the form
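$$d\rho_{\rm tot} = -i[H_{\rm tot}, \rho_{\rm tot}]\,dt + \mathcal{D}[L_{\rm tot}](\rho_{\rm tot})\,dt + \mathcal{I}_c[L_{\rm tot}](\rho_{\rm tot})\,dt + \mathcal{H}[L_{\rm tot}](\rho_{\rm tot})\,dw_t. \tag{4.56}$$
The subscript tot is used to specify that these operators are operators on the total
Hilbert space consisting of N particles. Any single particle operator A, acting on the
nth particle of the ensemble, is denoted by $A^{(n)}$ and is given by the tensor product,
$$A^{(n)} \equiv \mathbb{1}^{\otimes n-1} \otimes A \otimes \mathbb{1}^{\otimes N-n}. \tag{4.57}$$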
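The fundamental assumption for this derivation is that the operators $H_{\rm tot}$ and
$L_{\rm tot}$ act independently and identically on each qubit and may be written as
$$H_{\rm tot} = \sum_{n=1}^{N} H^{(n)} = \sum_{n=1}^{N} \mathbb{1}^{\otimes n-1} \otimes H \otimes \mathbb{1}^{\otimes N-n} \tag{4.58a}$$
$$L_{\rm tot} = \sum_{n=1}^{N} L^{(n)} = \sum_{n=1}^{N} \mathbb{1}^{\otimes n-1} \otimes L \otimes \mathbb{1}^{\otimes N-n}. \tag{4.58b}$$
Furthermore, the tangent vectors $D_i$, for the single particle state $\rho$, are
$$D_i = \sum_{n=1}^{N} \rho^{\otimes n-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho^{\otimes N-n}. \tag{4.59}$$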
In projecting the collective master equation onto the identical product states,
we will need to calculate the product of up to three collective operators and then
take the trace. Each collective operator is composed of a sum over single particle
operators, each acting on the nth member of the ensemble. When taking the product of two
such sums there will be N terms where both single particle operators act on the same
subsystem, as well as $N(N-1)$ terms where the constituent operators act on different
subsystems.
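The simplest case is when there are no collective operators, i.e., simply calculating
the overlap between $D_i$ and the state $\varrho$. This is equal to
$$\langle D_i, \varrho \rangle = \sum_{n=1}^{N} \mathrm{Tr}\left( \left( \rho^{\otimes n-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho^{\otimes N-n} \right) \varrho \right) = N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho). \tag{4.60}$$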
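The next step up in complexity is to include a single collective operator, $A_{\rm tot}$. This
requires two sums, the sum from $D_i$ and the sum from $A_{\rm tot}$. We then have
$$\begin{aligned}
\langle D_i, A_{\rm tot}\, \varrho \rangle &= \sum_{n,m=1}^{N} \mathrm{Tr}\left( \left( \rho^{\otimes m-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho^{\otimes N-m} \right) A^{(n)} \rho^{\otimes N} \right) \\
&= \sum_{n=m=1}^{N} \mathrm{Tr}\left( (\rho^2)^{\otimes m-1} \otimes \tfrac{1}{2}\sigma_i A \rho \otimes (\rho^2)^{\otimes N-m} \right) \\
&\quad + \sum_{n<m=1}^{N} \mathrm{Tr}\left( (\rho^2)^{\otimes n-1} \otimes \rho A \rho \otimes (\rho^2)^{\otimes m-n-1} \otimes \tfrac{1}{2}\sigma_i\, \rho \otimes (\rho^2)^{\otimes N-m} \right) \\
&\quad + \sum_{n>m=1}^{N} \mathrm{Tr}\left( (\rho^2)^{\otimes m-1} \otimes \tfrac{1}{2}\sigma_i\, \rho \otimes (\rho^2)^{\otimes n-m-1} \otimes \rho A \rho \otimes (\rho^2)^{\otimes N-n} \right) \\
&= N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i A \rho) + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(A \rho^2).
\end{aligned} \tag{4.61}$$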
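The cyclic property of the trace shows us that upon switching the order of $A_{\rm tot}$ and
$\varrho$ (i.e. to instead calculate $\langle D_i, \varrho\, A_{\rm tot} \rangle$) the second term will be left unchanged, so
$$\langle D_i, \varrho\, A_{\rm tot} \rangle = N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho A) + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(A \rho^2). \tag{4.62}$$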
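When calculating the projection of the dissipator terms, we need to calculate the
product of two collective operators, which will have a triple sum. For two collective
operators $A_{\rm tot}$ and $B_{\rm tot}$ we have,
$$\langle D_i, A_{\rm tot} B_{\rm tot}\, \varrho \rangle = \sum_{n,m,l=1}^{N} \mathrm{Tr}\left( \left( \rho^{\otimes n-1} \otimes \tfrac{1}{2}\sigma_i \otimes \rho^{\otimes N-n} \right) A^{(m)} B^{(l)} \rho^{\otimes N} \right). \tag{4.63}$$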
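The previous result shows us that there will be five distinct terms, for cases where
$n = m = l$, $n \neq m = l$, $n = m \neq l$, $n = l \neq m$, and $n \neq m \neq l$ (all three distinct). This
expression then simplifies to
$$\begin{aligned}
\langle D_i, A_{\rm tot} B_{\rm tot}\, \varrho \rangle = {} & N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i A B \rho) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(\rho A B \rho) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i A \rho)\, \mathrm{Tr}(\rho B \rho) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i B \rho)\, \mathrm{Tr}(\rho A \rho) \\
& + N(N-1)(N-2)\, \mathrm{Tr}(\rho^2)^{N-3}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(\rho A \rho)\, \mathrm{Tr}(\rho B \rho).
\end{aligned} \tag{4.64}$$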
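The terms $\mathrm{Tr}(\rho X \rho)$ can be simplified to $\mathrm{Tr}(X \rho^2)$ but were left unsimplified to make
their origin more explicit. The order of the collective operators can be exchanged, but this
won't affect the five term structure. Thus we calculate that
$$\begin{aligned}
\langle D_i, A_{\rm tot}\, \varrho\, B_{\rm tot} \rangle = {} & N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i A \rho B) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(\rho A \rho B) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i A \rho)\, \mathrm{Tr}(B \rho^2) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho B)\, \mathrm{Tr}(A \rho^2) \\
& + N(N-1)(N-2)\, \mathrm{Tr}(\rho^2)^{N-3}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(A \rho^2)\, \mathrm{Tr}(B \rho^2)
\end{aligned} \tag{4.65}$$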
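and
$$\begin{aligned}
\langle D_i, \varrho\, A_{\rm tot} B_{\rm tot} \rangle = {} & N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho A B) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(A B \rho^2) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho A)\, \mathrm{Tr}(B \rho^2) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho B)\, \mathrm{Tr}(A \rho^2) \\
& + N(N-1)(N-2)\, \mathrm{Tr}(\rho^2)^{N-3}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(A \rho^2)\, \mathrm{Tr}(B \rho^2).
\end{aligned} \tag{4.66}$$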
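The final two calculations involving collective operators are the expectation
values, $\langle A_{\rm tot} \rangle$ and $\langle A_{\rm tot} B_{\rm tot} \rangle$. They are
$$\langle A_{\rm tot} \rangle = \mathrm{Tr}(A_{\rm tot}\, \varrho) = N\, \mathrm{Tr}(A \rho) \tag{4.67}$$
and
$$\langle A_{\rm tot} B_{\rm tot} \rangle = \mathrm{Tr}(A_{\rm tot} B_{\rm tot}\, \varrho) = N\, \mathrm{Tr}(A B \rho) + N(N-1)\, \mathrm{Tr}(A \rho)\, \mathrm{Tr}(B \rho). \tag{4.68}$$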
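These counting arguments are easy to get wrong by a factor, so a brute-force check on a small ensemble is useful. The sketch below verifies Eqs. (4.60), (4.61) and (4.68) for N = 3 using random single-particle operators; the helper names, the random seed and the particular Bloch vector are illustrative assumptions.

```python
import numpy as np
from functools import reduce

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
IDENTITY = np.eye(2, dtype=complex)

def place(op, site, filler, N):
    """Tensor product with `op` at position `site` and `filler` everywhere else."""
    factors = [filler] * N
    factors[site] = op
    return reduce(np.kron, factors)

def collective(op, N):
    """A_tot = sum_n A^(n), Eq. (4.58)."""
    return sum(place(op, site, IDENTITY, N) for site in range(N))

N = 3
rng = np.random.default_rng(1)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
# Single-qubit state with Bloch vector (0.3, 0.1, 0.5).
rho = 0.5 * np.array([[1.5, 0.3 - 0.1j], [0.3 + 0.1j, 0.5]])

varrho = reduce(np.kron, [rho] * N)
D_x = sum(place(0.5 * SIGMA_X, site, rho, N) for site in range(N))   # Eq. (4.59)
A_tot, B_tot = collective(A, N), collective(B, N)
tr2 = np.trace(rho @ rho)

lhs_460 = np.trace(D_x @ varrho)
rhs_460 = N * tr2**(N - 1) * np.trace(0.5 * SIGMA_X @ rho)

lhs_461 = np.trace(D_x @ A_tot @ varrho)
rhs_461 = (N * tr2**(N - 1) * np.trace(0.5 * SIGMA_X @ A @ rho)
           + N * (N - 1) * tr2**(N - 2) * np.trace(0.5 * SIGMA_X @ rho) * np.trace(A @ rho @ rho))

lhs_468 = np.trace(A_tot @ B_tot @ varrho)
rhs_468 = N * np.trace(A @ B @ rho) + N * (N - 1) * np.trace(A @ rho) * np.trace(B @ rho)
```

Since $D_i$ is Hermitian, $\langle D_i, X \rangle = \mathrm{Tr}(D_i X)$, which is what the left-hand sides compute directly.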
The general expressions in Eqs. (4.60-4.68) are all we need in order to calculate
the inner products between all of the terms in the conditional master equation and a
tangent vector $D_i$. Starting with the Hamiltonian commutator, Eq. (4.61) and its
permuted version, Eq. (4.62), give
$$\langle D_i, [H_{\rm tot}, \varrho] \rangle = N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, [H, \rho]). \tag{4.69}$$
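For the dissipator term, substituting into Eqs. (4.64-4.66) with the appropriate
collective operator $L_{\rm tot}$ or $L_{\rm tot}^\dagger$,
$$\begin{aligned}
\langle D_i, \mathcal{D}[L_{\rm tot}](\varrho) \rangle &= \mathrm{Tr}\left( D_i \left( L_{\rm tot}\, \varrho\, L_{\rm tot}^\dagger - \tfrac{1}{2} L_{\rm tot}^\dagger L_{\rm tot}\, \varrho - \tfrac{1}{2} \varrho\, L_{\rm tot}^\dagger L_{\rm tot} \right) \right) \\
&= N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \mathcal{D}[L](\rho)) \\
&\quad + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}(\rho\, \mathcal{D}[L](\rho)) \\
&\quad + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{4}\sigma_i\, [L, \rho])\, \mathrm{Tr}(L^\dagger \rho^2) \\
&\quad + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{4}\sigma_i\, [\rho, L^\dagger])\, \mathrm{Tr}(L \rho^2).
\end{aligned} \tag{4.70}$$
Note that the final lines in Eqs. (4.64-4.66) are all equal and hence cancel in the dissipator
term.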
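For the conditioning map $\mathcal{H}[L_{\rm tot}]$ we again need Eq. (4.61), with the addition of
Eq. (4.60) and the single operator expectation value Eq. (4.67). Its overlap then
reduces to
$$\begin{aligned}
\langle D_i, \mathcal{H}[L_{\rm tot}](\varrho) \rangle &= \mathrm{Tr}\left( D_i \left( L_{\rm tot}\, \varrho + \varrho\, L_{\rm tot}^\dagger - \langle L_{\rm tot} + L_{\rm tot}^\dagger \rangle\, \varrho \right) \right) \\
&= N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i (L \rho + \rho L^\dagger)) \\
&\quad + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}((L + L^\dagger)\, \rho^2)\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho) \\
&\quad - N^2\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}((L + L^\dagger) \rho)\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho).
\end{aligned} \tag{4.71}$$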
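Finally for the general Ito correction map Eq. (3.86),
$$\begin{aligned}
\langle D_i, \mathcal{I}_c[L_{\rm tot}](\varrho) \rangle = {} & N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}\left( \tfrac{1}{2}\sigma_i \left( L \rho L^\dagger + \tfrac{1}{2} L^2 \rho + \tfrac{1}{2} \rho L^{\dagger 2} \right) \right) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho)\, \mathrm{Tr}\left( \rho \left( L \rho L^\dagger + \tfrac{1}{2} L^2 \rho + \tfrac{1}{2} \rho L^{\dagger 2} \right) \right) \\
& + N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\tfrac{1}{2}\sigma_i (L \rho + \rho L^\dagger))\, \mathrm{Tr}((L + L^\dagger) \rho^2) \\
& - N^2\, \mathrm{Tr}(\rho^2)^{N-1} \left( \left\langle L^\dagger L + \tfrac{1}{2} L^2 + \tfrac{1}{2} L^{\dagger 2} \right\rangle + \tfrac{1}{2}(N-1) \left\langle L + L^\dagger \right\rangle^2 \right) \mathrm{Tr}(\tfrac{1}{2}\sigma_i\, \rho) \\
& - N \left\langle L + L^\dagger \right\rangle \langle D_i, \mathcal{H}[L_{\rm tot}](\varrho) \rangle.
\end{aligned} \tag{4.72}$$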
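These expressions simplify in the case of a Hermitian operator L acting on a
single qubit state. Specifically when $L = J_z$ and H is the control Hamiltonian in Eq.
(4.8), Eqs. (4.69-4.72) simplify to
$$\langle D_i, [H_{\rm tot}, \varrho] \rangle = \tfrac{1}{4} N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}\left( \sigma_i\, [f_j(t)\, \sigma_j, \rho] \right), \tag{4.73}$$
$$\begin{aligned}
\langle D_i, \mathcal{D}[J_z](\varrho) \rangle = {} & \tfrac{1}{8} N\, \mathrm{Tr}(\rho^2)^{N-1} \left( \mathrm{Tr}(\sigma_i\, \sigma_z \rho\, \sigma_z) - N\, \mathrm{Tr}(\sigma_i\, \rho) \right) \\
& + \tfrac{1}{8} N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\rho\, \sigma_z \rho\, \sigma_z)\, \mathrm{Tr}(\sigma_i\, \rho),
\end{aligned} \tag{4.74}$$
$$\begin{aligned}
\langle D_i, \mathcal{H}[J_z](\varrho) \rangle = {} & \tfrac{1}{4} N\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\sigma_i (\sigma_z \rho + \rho\, \sigma_z)) \\
& - \tfrac{1}{2} N^2\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\sigma_z \rho)\, \mathrm{Tr}(\sigma_i\, \rho) \\
& + \tfrac{1}{2} N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\sigma_z \rho^2)\, \mathrm{Tr}(\sigma_i\, \rho),
\end{aligned} \tag{4.75}$$
$$\begin{aligned}
\langle D_i, \mathcal{I}_c[J_z](\varrho) \rangle = {} & \langle D_i, \mathcal{D}[J_z](\varrho) \rangle \\
& + \tfrac{1}{4} N(N-1)\, \mathrm{Tr}(\rho^2)^{N-2}\, \mathrm{Tr}(\sigma_z \rho^2)\, \mathrm{Tr}(\sigma_i (\sigma_z \rho + \rho\, \sigma_z)) \\
& - \tfrac{1}{4} N^2 (N-1)\, \mathrm{Tr}(\rho^2)^{N-1}\, \mathrm{Tr}(\sigma_z \rho)^2\, \mathrm{Tr}(\sigma_i\, \rho) \\
& + \tfrac{1}{4} N(N-1)(N-2)\, \mathrm{Tr}(\rho^2)^{N-3}\, \mathrm{Tr}(\sigma_z \rho^2)^2\, \mathrm{Tr}(\sigma_i\, \rho) \\
& - N\, \mathrm{Tr}(\sigma_z \rho)\, \langle D_i, \mathcal{H}[J_z](\varrho) \rangle.
\end{aligned} \tag{4.76}$$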
It is worth noting that in Eqs. (4.74-4.76), a majority of the terms are proportional
to Tr(σi ρ). If the state has zero expectation along the σi axis, then the overlap
with that tangent vector will be greatly simplified. However, any qubit state (which
is not completely mixed) has a Bloch vector pointing along some axis, leaving the
orthogonal axes with zero expectation. When the state happens to align with the
$D_i$, i.e. the Bloch vector is $\mathbf{x} = r\, e_i$, we have $\mathrm{Tr}(\sigma_i\, \rho) = r$. This suggests that these terms
may simplify for any state, if we choose to work in spherical coordinates.
4.4.3 The spherical projection of the CME
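With the metric inverse in Eq. (4.55), and the inner product expressions in Eqs.
(4.73-4.76), we can finally calculate the projection coefficients $h^i$, $l^i$, $c^i$, $b^i$. To fully
simplify the inner products we need the following relations, which are easily
calculated:
$$\mathrm{Tr}(\sigma_i\, \rho) = \begin{cases} r & \text{for } i = r \\ 0 & \text{for } i = \theta \\ 0 & \text{for } i = \phi \end{cases}, \tag{4.77a}$$
$$\mathrm{Tr}(\sigma_i (\sigma_z \rho + \rho\, \sigma_z)) = \begin{cases} 2\cos(\theta) & \text{for } i = r \\ -2\sin(\theta) & \text{for } i = \theta \\ 0 & \text{for } i = \phi \end{cases}, \tag{4.77b}$$
$$\mathrm{Tr}(\sigma_i\, \sigma_z\, \rho\, \sigma_z) = \begin{cases} r\cos(2\theta) & \text{for } i = r \\ -r\sin(2\theta) & \text{for } i = \theta \\ 0 & \text{for } i = \phi \end{cases}, \tag{4.77c}$$
$$\mathrm{Tr}(\sigma_z\, \rho) = \mathrm{Tr}(\sigma_z\, \rho^2) = r\cos(\theta), \tag{4.77d}$$
and
$$\mathrm{Tr}(\rho\, \sigma_z\, \rho\, \sigma_z) = \tfrac{1}{2}\left( 1 + r^2\cos(2\theta) \right). \tag{4.77e}$$
The azimuthal symmetry of the problem is directly apparent in these expressions, as
the $J_z$ projection carries no information about $\phi$.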
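The relations in Eqs. (4.77a-e) can be confirmed numerically by building the spherical Pauli matrices of Eq. (4.47) at an arbitrary point. The sketch below is a simple spot check; the sample values of r, θ and φ and the helper names are arbitrary illustrative choices.

```python
import numpy as np

SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def spherical_paulis(theta, phi):
    """Return (sigma_r, sigma_theta, sigma_phi) as defined in Eq. (4.47)."""
    st, ct = np.sin(theta), np.cos(theta)
    sp, cp = np.sin(phi), np.cos(phi)
    frame = np.array([[st * cp, st * sp, ct],
                      [ct * cp, ct * sp, -st],
                      [-sp, cp, 0.0]])
    return np.einsum("ai,ijk->ajk", frame, SIGMA)

def tr(matrix):
    return np.trace(matrix).real

r, theta, phi = 0.6, 0.9, 2.1
sph = spherical_paulis(theta, phi)
rho = 0.5 * (np.eye(2) + r * sph[0])          # Eq. (4.50)
sz = SIGMA[2]

rel_a = np.array([tr(s @ rho) for s in sph])
rel_b = np.array([tr(s @ (sz @ rho + rho @ sz)) for s in sph])
rel_c = np.array([tr(s @ sz @ rho @ sz) for s in sph])
rel_d = (tr(sz @ rho), tr(sz @ rho @ rho))
rel_e = tr(rho @ sz @ rho @ sz)
```

Each array is ordered as (r, θ, φ); the vanishing φ entries reflect the azimuthal symmetry noted above.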
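Substituting the spherical Pauli matrices into Eq. (4.73), the Hamiltonian inner
products simplify to
$$\begin{aligned}
\langle D_r, -i[H, \varrho] \rangle &= 0 \\
\langle D_\theta, -i[H, \varrho] \rangle &= \tfrac{1}{2} n\, \mathrm{Tr}(\rho^2)^{n-1}\, r \left( f_2(t)\cos\phi - f_1(t)\sin\phi \right) \\
\langle D_\phi, -i[H, \varrho] \rangle &= \tfrac{1}{2} n\, \mathrm{Tr}(\rho^2)^{n-1}\, r \left( f_3(t)\sin\theta - f_1(t)\cos\theta\cos\phi - f_2(t)\cos\theta\sin\phi \right).
\end{aligned} \tag{4.78}$$
Physically, applying magnetic fields to a spin ensemble cannot change the total
magnetization, a fact that is confirmed by having the $D_r$ projection be zero.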
The remaining projections we need to calculate are all based on the Jz mea-
surement operator and so contain no information about the φ coordinate. This can
be verified by substituting the Dφ tangent vector into Eqs. (4.74-4.76), which all
evaluate to zero.
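The $D_\theta$ projections are greatly simplified by the fact that $\mathrm{Tr}(\sigma_\theta\, \rho) = 0$. Using
Eq. (4.77c) we find that
$$\langle D_\theta, \mathcal{D}[J_z](\varrho) \rangle = -\tfrac{1}{8} n\, \mathrm{Tr}(\rho^2)^{n-1}\, r \sin(2\theta). \tag{4.79}$$
From Eq. (4.77b) the conditioning product reduces to
$$\langle D_\theta, \mathcal{H}[J_z](\varrho) \rangle = -\tfrac{1}{2} n\, \mathrm{Tr}(\rho^2)^{n-1} \sin(\theta). \tag{4.80}$$
And combining these two results into Eq. (4.76) we have
$$\begin{aligned}
\langle D_\theta, \mathcal{I}_c[J_z](\varrho) \rangle = {} & -\tfrac{1}{8} n\, \mathrm{Tr}(\rho^2)^{n-1}\, r \sin(2\theta) \\
& - \tfrac{1}{4} n(n-1)\, \mathrm{Tr}(\rho^2)^{n-2}\, r \sin(2\theta) \\
& + \tfrac{1}{4} n^2\, \mathrm{Tr}(\rho^2)^{n-1}\, r \sin(2\theta).
\end{aligned} \tag{4.81}$$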
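Simplifying the r projections is obviously a more complicated task. However, the
dissipator product with $D_r$ is not particularly more difficult than the $D_\theta$ product.
By including the fact that $\mathrm{Tr}(\sigma_r \rho) = r$ and substituting Eqs. (4.77c) and (4.77e) into
Eq. (4.74) we find
$$\langle D_r, \mathcal{D}[J_z](\varrho) \rangle = -\tfrac{1}{8} n\, \mathrm{Tr}(\rho^2)^{n-2} (1 + n r^2)\, r \sin^2(\theta). \tag{4.82}$$
Evaluating the conditioning map requires Eqs. (4.77b) and (4.77d), which reduce Eq.
(4.75) to
$$\langle D_r, \mathcal{H}[J_z](\varrho) \rangle = -\tfrac{1}{4} n\, \mathrm{Tr}(\rho^2)^{n-2} (1 + n r^2)(r^2 - 1) \cos(\theta). \tag{4.83}$$
Finally, when we combine these past results into the Ito correction product, Eq.
(4.76) simplifies to
$$\begin{aligned}
\langle D_r, \mathcal{I}_c[J_z](\varrho) \rangle = {} & -\tfrac{1}{8} n\, \mathrm{Tr}(\rho^2)^{n-2} (1 + n r^2)\, r \sin^2(\theta) \\
& + \tfrac{1}{2} n(n-1)\, \mathrm{Tr}(\rho^2)^{n-2}\, r \cos^2(\theta) \\
& - \tfrac{1}{4} n^2 (n-1)\, \mathrm{Tr}(\rho^2)^{n-1}\, r^3 \cos^2(\theta) \\
& + \tfrac{1}{4} n(n-1)(n-2)\, \mathrm{Tr}(\rho^2)^{n-3}\, r^3 \cos^2(\theta) \\
& + \tfrac{1}{4} n^2\, \mathrm{Tr}(\rho^2)^{n-2} (1 + n r^2)(r^2 - 1)\, r \cos^2(\theta).
\end{aligned} \tag{4.84}$$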
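Having now simplified the projections, we are able to include the inverse metric
components in Eq. (4.55) to arrive at the proper projections. The Hamiltonian
projections are
$$\begin{aligned}
h^r(\mathbf{x}, t) &= 0 \\
h^\theta(\mathbf{x}, t) &= g^{\theta\theta}\, \langle D_\theta, -i[H, \varrho] \rangle = f_2(t)\, r \cos\phi - f_1(t)\, r \sin\phi \\
h^\phi(\mathbf{x}, t) &= g^{\phi\phi}\, \langle D_\phi, -i[H, \varrho] \rangle = f_3(t)\, r \sin\theta - f_1(t)\, r \cos\theta \cos\phi - f_2(t)\, r \cos\theta \sin\phi.
\end{aligned} \tag{4.85}$$
The dissipator projections are
$$\begin{aligned}
l^r(\mathbf{x}) &= g^{rr}\, \langle D_r, \kappa\, \mathcal{D}[J_z](\varrho) \rangle = -\tfrac{1}{2} \kappa\, r \sin^2\theta \\
l^\theta(\mathbf{x}) &= g^{\theta\theta}\, \langle D_\theta, \kappa\, \mathcal{D}[J_z](\varrho) \rangle = -\tfrac{1}{4} \kappa\, r \sin 2\theta \\
l^\phi(\mathbf{x}) &= 0.
\end{aligned} \tag{4.86}$$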
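The conditioning projections are
$$\begin{aligned}
b^r(\mathbf{x}) &= g^{rr} \left\langle D_r, \sqrt{\kappa}\, \mathcal{H}[J_z](\varrho) \right\rangle = \sqrt{\kappa}\, (1 - r^2) \cos\theta \\
b^\theta(\mathbf{x}) &= g^{\theta\theta} \left\langle D_\theta, \sqrt{\kappa}\, \mathcal{H}[J_z](\varrho) \right\rangle = -\sqrt{\kappa}\, \sin\theta \\
b^\phi(\mathbf{x}) &= 0.
\end{aligned} \tag{4.87}$$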
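The Ito correction projections are
\[
\begin{aligned}
c^r(x) &= g^{rr} \langle D_r, \kappa\, \mathcal{I}_c[J_z](\varrho) \rangle = l^r(x) + \kappa\, \alpha(r)\, r \cos^2\theta \\
c^\theta(x) &= g^{\theta\theta} \langle D_\theta, \kappa\, \mathcal{I}_c[J_z](\varrho) \rangle = l^\theta(x) + \tfrac{1}{2} \kappa\, \beta(r)\, r \sin 2\theta \\
c^\phi(x) &= 0
\end{aligned}
\tag{4.88}
\]
where we defined the coefficients
\[ \alpha(r) \equiv n(r^2 - 1) + \frac{2(n-1)}{1 + n r^2} - \frac{n(n-1)(1 + r^2)}{2(1 + n r^2)}\, r^2 + \frac{2(n-2)(n-1)}{(1 + r^2)(1 + n r^2)}\, r^2 \tag{4.89} \]
and
\[ \beta(r) \equiv n - \frac{2(n-1)}{1 + r^2}. \tag{4.90} \]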
To bring all of this together, recall that the projected conditional
master equation is
\[ d\rho_t = \left( h^i + l^i - c^i \right) D_i\, dt + b^i D_i\, dw_t. \tag{4.91} \]
Furthermore, a general Stratonovich SDE is traditionally written as dxt = A(xt) dt +
B(xt) dwt. To conform to this convention we define the coefficients
\[
\begin{aligned}
a^r(x, t) &\equiv -\kappa\, \alpha(r)\, r \cos^2\theta \\
a^\theta(x, t) &\equiv h^\theta(x, t) - \tfrac{1}{2} \kappa\, \beta(r)\, r \sin 2\theta \\
a^\phi(x, t) &\equiv h^\phi(x, t),
\end{aligned}
\tag{4.92}
\]
with the Hamiltonian projections h^i(x, t) defined in Eq. (4.85). The projected
conditional master equation is finally given by the Stratonovich SDE
\[ d\rho_t = a^i(x, t)\, D_i(x)\, dt + b^i(x)\, D_i(x)\, dw_t \tag{4.93} \]
with the spherical tangent vectors defined in Eq. (4.46).
4.5 The Projection Filter
From the projected conditional master equation in Eq. (4.93), we would like to
find a closed set of easily simulated stochastic differential equations. We now have
a nonlinear matrix-valued SDE that propagates the closest identical product state
to the exact conditional state. A single qubit state is completely characterized by
its Bloch vector, and so to simulate ρt we need only find SDEs for the three Bloch
components.
In the notation of quantum filtering theory [25], the filter is a QSDE that prop-
agates the conditional expectation of a given observable,
\[ \pi_t(X) \cong \mathrm{Tr}(\rho_t X). \tag{4.94} \]
The filter is generally expressed as a differential, so that for the observable X (on n
qubits), we have
\[ d\pi_t(X) = \mathrm{Tr}(d\rho_t\, X) = a^i(x, t)\, \mathrm{Tr}(D_i(x) X)\, dt + b^i(x)\, \mathrm{Tr}(D_i(x) X)\, dw_t. \tag{4.95} \]
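The Bloch vector components can of course be identified with the expectation
values of the Pauli operators. For n qubits, we also have the relation
$\mathrm{Tr}(J_i \varrho) = \tfrac{n}{2} \mathrm{Tr}(\sigma_i \rho) = \tfrac{n}{2} x^i$,
so to extract SDEs for the Cartesian Bloch components x^i from
dρt, we simply need to calculate the filtering equations for the operators 2Ji/n. Thus,
\[ dx^i_t = \frac{2}{n} \mathrm{Tr}(d\rho_t\, J_i) = \frac{2}{n} \Big( a^\alpha(x_t, t)\, \mathrm{Tr}(D_\alpha(x_t) J_i)\, dt + b^\alpha(x_t)\, \mathrm{Tr}(D_\alpha(x_t) J_i)\, dw_t \Big) \tag{4.96} \]
for α ∈ {r, θ, φ} and i ∈ {1, 2, 3}. Explicit calculation shows that
\[
\begin{aligned}
\mathrm{Tr}(D_r(x) J_i) &= \frac{n}{2} \left( \sin\theta \cos\phi\, \delta_{i1} + \sin\theta \sin\phi\, \delta_{i2} + \cos\theta\, \delta_{i3} \right) \\
\mathrm{Tr}(D_\theta(x) J_i) &= \frac{n}{2} \left( \cos\theta \cos\phi\, \delta_{i1} + \cos\theta \sin\phi\, \delta_{i2} - \sin\theta\, \delta_{i3} \right) \\
\mathrm{Tr}(D_\phi(x) J_i) &= \frac{n}{2} \left( -\sin\phi\, \delta_{i1} + \cos\phi\, \delta_{i2} \right).
\end{aligned}
\tag{4.97}
\]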
These equations show how to find a mixed coordinate expression for the projection
filter, i.e. they express dx in terms of the variables r, θ, φ. We would like to combine
these results with the expressions for a^α and b^α, Eqs. (4.92) and (4.87), to obtain
deterministic and stochastic coefficients expressed in Cartesian coordinates. We seek
the functions a^i(x, t) and b^i(x) such that
\[ dx^i_t = a^i(x_t, t)\, dt + b^i(x_t)\, dw_t. \tag{4.98} \]
Using the standard conversion between spherical and Cartesian coordinates together
with Eqs. (4.85), (4.92), (4.87), and (4.97), we can easily find these coefficients.
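In Cartesian coordinates the deterministic Stratonovich coefficients are
\[
\begin{aligned}
a^1(x, t) &= f^2(t)\, x^3 - f^3(t)\, x^2 - \kappa \left( \alpha(r) + \beta(r) \right) \frac{x^1 (x^3)^2}{r^2}, \\
a^2(x, t) &= f^3(t)\, x^1 - f^1(t)\, x^3 - \kappa \left( \alpha(r) + \beta(r) \right) \frac{x^2 (x^3)^2}{r^2}, \\
a^3(x, t) &= f^1(t)\, x^2 - f^2(t)\, x^1 + \kappa\, \beta(r)\, x^3 - \kappa \left( \alpha(r) + \beta(r) \right) \frac{x^3 (x^3)^2}{r^2},
\end{aligned}
\tag{4.99}
\]
and the stochastic coefficients are
\[
\begin{aligned}
b^1(x) &= -\sqrt{\kappa}\, x^1 x^3, \\
b^2(x) &= -\sqrt{\kappa}\, x^2 x^3, \\
b^3(x) &= \sqrt{\kappa} \left( 1 - (x^3)^2 \right).
\end{aligned}
\tag{4.100}
\]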
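Note that the functions α(r) and β(r) actually depend only upon r² = ‖x‖².
To complete the derivation we convert these Stratonovich equations back to
the Ito form. For the multivariable b^i coefficients, the Ito correction, i.e. the Ito
drift minus the Stratonovich drift, is given by
\[ \Delta a^i(x) \equiv a^i_{\mathrm{Ito}}(x, t) - a^i_{\mathrm{Strat}}(x, t) = \frac{1}{2}\, b^j(x)\, \frac{\partial b^i(x)}{\partial x^j}. \tag{4.101} \]
Substituting b^i(x) into this formula, the Ito corrections simplify to
\[
\begin{aligned}
\Delta a^1(x) &= \kappa\, x^1 \left( (x^3)^2 - \tfrac{1}{2} \right) \\
\Delta a^2(x) &= \kappa\, x^2 \left( (x^3)^2 - \tfrac{1}{2} \right) \\
\Delta a^3(x) &= \kappa\, x^3 \left( (x^3)^2 - 1 \right).
\end{aligned}
\tag{4.102}
\]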
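By adding these terms to the Stratonovich deterministic coefficients in Eq. (4.99) we
find the Ito drift coefficients
\[
\begin{aligned}
a^1(x, t) &= f^2(t)\, x^3 - f^3(t)\, x^2 - \tfrac{1}{2} \kappa\, x^1 + \kappa\, \gamma(r)\, x^1 (x^3)^2, \\
a^2(x, t) &= f^3(t)\, x^1 - f^1(t)\, x^3 - \tfrac{1}{2} \kappa\, x^2 + \kappa\, \gamma(r)\, x^2 (x^3)^2, \\
a^3(x, t) &= f^1(t)\, x^2 - f^2(t)\, x^1 + \kappa \left( \beta(r) - 1 \right) x^3 + \kappa\, \gamma(r)\, (x^3)^3,
\end{aligned}
\tag{4.103}
\]
where in each component the cubic terms combine as −κ(α(r) + β(r))/r² + κ = κ γ(r), and
where we defined the new coefficient function γ(r) as
\[ \gamma(r) \equiv 1 - \frac{\alpha(r) + \beta(r)}{r^2} = (1 - r^2) \left( \frac{n(n+1)}{2(1 + n r^2)} - \frac{1}{1 + r^2} \right) \tag{4.104} \]
and the β(r) coefficient is given in Eq. (4.90). Note that for any n ≥ 1 and 0 ≤ r ≤ 1,
the coefficient γ(r) is nonnegative. The zeros of this function, however, are
two special cases, which we will discuss next.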
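As a sanity check on the coefficient functions, the sketch below evaluates α(r), β(r) and γ(r) from Eqs. (4.89), (4.90) and (4.104) and compares the two expressions for γ(r). It is an illustrative check only; the function names are assumptions, not code from the original work.

```python
def alpha(r, n):
    """alpha(r) of Eq. (4.89) for n qubits."""
    r2 = r * r
    d = 1.0 + n * r2
    return (n * (r2 - 1.0)
            + 2.0 * (n - 1) / d
            - n * (n - 1) * (1.0 + r2) * r2 / (2.0 * d)
            + 2.0 * (n - 2) * (n - 1) * r2 / ((1.0 + r2) * d))

def beta(r, n):
    """beta(r) of Eq. (4.90)."""
    return n - 2.0 * (n - 1) / (1.0 + r * r)

def gamma_from_definition(r, n):
    """gamma(r) = 1 - (alpha + beta) / r^2, the first form in Eq. (4.104)."""
    return 1.0 - (alpha(r, n) + beta(r, n)) / (r * r)

def gamma_closed(r, n):
    """The factored form of gamma(r) in Eq. (4.104)."""
    r2 = r * r
    return (1.0 - r2) * (n * (n + 1) / (2.0 * (1.0 + n * r2)) - 1.0 / (1.0 + r2))
```

Evaluating these at r = 1 or n = 1 gives α + β = r², hence γ = 0, which are the two special cases discussed in the text.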
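Substituting these drift coefficients and the stochastic coefficients b^i(x) of
Eq. (4.100) into Eq. (4.98), the projection filtering equations are
\[
\begin{aligned}
dx^1_t &= a^1(x_t, t)\, dt - \sqrt{\kappa}\, x^1_t x^3_t\, dw_t \\
dx^2_t &= a^2(x_t, t)\, dt - \sqrt{\kappa}\, x^2_t x^3_t\, dw_t \\
dx^3_t &= a^3(x_t, t)\, dt + \sqrt{\kappa} \left( 1 - (x^3_t)^2 \right) dw_t.
\end{aligned}
\tag{4.105}
\]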
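To make the structure of the projection filter concrete, here is a minimal Euler-Maruyama sketch of one time step. It is an assumption-laden illustration rather than the integrator used for the reported simulations: the Ito drift is assembled as the Stratonovich drift of Eq. (4.99) plus the correction of Eq. (4.102), the function names are hypothetical, and `dw` is supplied by the caller (an ideal Wiener increment, or an innovation increment computed from a measurement record).

```python
import math

def alpha(r2, n):
    """alpha of Eq. (4.89), written as a function of r^2."""
    d = 1.0 + n * r2
    return (n * (r2 - 1.0) + 2.0 * (n - 1) / d
            - n * (n - 1) * (1.0 + r2) * r2 / (2.0 * d)
            + 2.0 * (n - 2) * (n - 1) * r2 / ((1.0 + r2) * d))

def beta(r2, n):
    """beta of Eq. (4.90), written as a function of r^2."""
    return n - 2.0 * (n - 1) / (1.0 + r2)

def ito_drift(x, f, kappa, n):
    """Stratonovich drift, Eq. (4.99), plus Ito correction, Eq. (4.102)."""
    x1, x2, x3 = x
    r2 = x1 * x1 + x2 * x2 + x3 * x3          # assumed nonzero
    q = (alpha(r2, n) + beta(r2, n)) * x3 * x3 / r2
    strat = [f[1] * x3 - f[2] * x2 - kappa * q * x1,
             f[2] * x1 - f[0] * x3 - kappa * q * x2,
             f[0] * x2 - f[1] * x1 + kappa * beta(r2, n) * x3 - kappa * q * x3]
    corr = [kappa * x1 * (x3 * x3 - 0.5),
            kappa * x2 * (x3 * x3 - 0.5),
            kappa * x3 * (x3 * x3 - 1.0)]
    return [s + c for s, c in zip(strat, corr)]

def diffusion(x, kappa):
    """Stochastic coefficients of Eq. (4.100)."""
    s = math.sqrt(kappa)
    return [-s * x[0] * x[2], -s * x[1] * x[2], s * (1.0 - x[2] * x[2])]

def euler_maruyama_step(x, f, kappa, n, dt, dw):
    """One explicit Ito step of dx = a dt + b dw."""
    a, b = ito_drift(x, f, kappa, n), diffusion(x, kappa)
    return [xi + ai * dt + bi * dw for xi, ai, bi in zip(x, a, b)]
```

A simple explicit scheme like this does not preserve ‖x‖ exactly, so in practice a small time step (or a higher-order scheme) would be needed.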
4.5.1 Special cases for the projection filter
There exist two very interesting special cases for the projection filter. The first is
when we only have a single qubit, and the second is when the state is pure. We
have already shown how, when n = 1, the metric is simply a Euclidean metric (up
to a factor of one half). Therefore, we expect the projection filtering equations to
simplify dramatically for a single qubit. The first thing to notice is that when n = 1
the β(r) coefficient Eq. (4.90) simplifies to
β(r) = 1 for n = 1. (4.106)
Furthermore, for n = 1 the γ(r) coefficient Eq. (4.104) simplifies to
γ(r) = 0 for n = 1. (4.107)
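When we evaluate the projection filter for pure states, we arrive at remarkably similar
results. In other words, for any n = 1, 2, . . . we also have
\[ \beta(r = 1) = 1 \quad \text{and} \quad \gamma(r = 1) = 0. \]
These fantastic simplifications mean that for a (possibly mixed) single qubit or
any pure multi-qubit state, the projection filtering equations are simply
\[
\begin{aligned}
dx^1_t &= \left( f^2(t)\, x^3_t - f^3(t)\, x^2_t - \tfrac{1}{2} \kappa\, x^1_t \right) dt - \sqrt{\kappa}\, x^1_t x^3_t\, dw_t \\
dx^2_t &= \left( f^3(t)\, x^1_t - f^1(t)\, x^3_t - \tfrac{1}{2} \kappa\, x^2_t \right) dt - \sqrt{\kappa}\, x^2_t x^3_t\, dw_t \\
dx^3_t &= \left( f^1(t)\, x^2_t - f^2(t)\, x^1_t \right) dt + \sqrt{\kappa} \left( 1 - (x^3_t)^2 \right) dw_t.
\end{aligned}
\tag{4.108}
\]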
It is an interesting exercise to see that if one simply computes the Heisenberg picture
filtering equations for the Pauli operators, πt(σi), one arrives at these very same three
coupled SDEs.
What the pure state evaluation says is that when r = 1, the dynamics of the
separable system no longer depend on the total number of qubits, and the system
evolves simply as n identical copies of a single qubit state. The only remaining
dependence upon n is in the innovations process, where we have the differential
$dw_t \to dv_t = dy_t - \sqrt{\kappa}\, n\, x^3_t\, dt$, where dyt is the integrated
measurement current in the time interval [t, t + dt].
One final question about the pure state projection filter is, “Will the filter remain
pure, once it becomes pure?” In other words, is r = 1 a trap for the system, so that if
r_s = 1 for some time s, then r_t = 1 for all t ≥ s? The physics of the situation dictates
that it must be, as with no sources of decoherence the state will purify and stay pure.
However, it is difficult to see that this is indeed the case simply by inspecting Eq.
(4.108). Were this not the case, it would likely indicate an error in the
model. Thankfully the answer is decidedly yes and can be seen in the spherical basis
representation of dρt, with the Stratonovich coefficients a(x, t) and b(x, t) given in
Eqs. (4.92) and (4.87). From these two equations we see that
\[ a^r(x, t) = -\kappa\, \alpha(r)\, r \cos^2\theta \tag{4.109} \]
and that
\[ b^r(x, t) = \sqrt{\kappa}\, (1 - r^2) \cos\theta. \tag{4.110} \]
Substituting r = 1 into Eq. (4.89), we find that α(r = 1) = 0. But this means
that $a^r(x, t)|_{r=1} = 0$ and $b^r(x, t)|_{r=1} = 0$. Following this logic to its
conclusion we find that
\[
\begin{aligned}
d\rho|_{r=1} = {} & a^\theta(x, t)|_{r=1}\, D_\theta\, dt + b^\theta(x, t)|_{r=1}\, D_\theta\, dw_t \\
& + a^\phi(x, t)|_{r=1}\, D_\phi\, dt + b^\phi(x, t)|_{r=1}\, D_\phi\, dw_t.
\end{aligned}
\tag{4.111}
\]
In other words, by evaluating dρ at r = 1 we see that it is independent of the
tangent vector Dr and thus will remain on the submanifold of identical separable
states defined by r = 1.
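As a complementary check in Cartesian coordinates (a sketch, not the argument used in the text), one can apply Ito's rule to $r^2 = x^i x^i$ using the coefficients of Eq. (4.108), which coincide with the general Ito coefficients at r = 1. The Hamiltonian terms drop out because $x \cdot (f \times x) = 0$:

```latex
\begin{aligned}
d(r^2) &= \left( 2 x^i a^i + b^i b^i \right) dt + 2 x^i b^i\, dw_t, \\
2 x^i a^i &= -\kappa \left( r^2 - (x^3)^2 \right), \\
b^i b^i &= \kappa \left( (x^3)^2 r^2 + 1 - 2 (x^3)^2 \right), \\
x^i b^i &= \sqrt{\kappa}\, x^3 \left( 1 - r^2 \right), \\
\Rightarrow \quad d(r^2) &= \kappa \left( 1 - r^2 \right) \left( 1 - (x^3)^2 \right) dt
  + 2 \sqrt{\kappa}\, x^3 \left( 1 - r^2 \right) dw_t .
\end{aligned}
```

Both the drift and the noise coefficient of $d(r^2)$ vanish at r = 1, consistent with the conclusion drawn from the spherical representation. This only verifies that the coefficients vanish on the r = 1 surface; it is not a full invariance proof.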
4.6 Simulations and Performance
The primary purpose for deriving the projection filter was to find a reduced-dimensional
description of a system of qubits undergoing a continuous measurement of
the collective angular momentum Jz. However, we know that the end result of a
continuous measurement of Jz, absent any other influence, is an eigenstate of Jz,
a so-called Dicke state. With the exception of the stretched states, Dicke states
are not separable. Even after a relatively short time, the reduction in the uncertainty
of the Jz spin component results in a nonclassical spin squeezed state, a phenomenon
observed in several experiments [8–10].
In this section we test through numerical simulation how well the projection fil-
ter reproduces certain properties of the exact conditional state. Here we show that
by adding strong randomized external control fields during the collective measure-
ment the joint state remains highly separable and the approximate description of the
projection filter reproduces collective expectation values much more faithfully.
Chap. 5 uses the projection filter in an algorithm to reconstruct the initial condition
of a SCS from a continuous measurement of Jz, characterized by the rate κ.
In order to obtain information about observables other than Jz, an external control
Hamiltonian must be applied. For reasons discussed in Sec. 5.3.1, this takes the form
of a sequence of global π/2 rotations, where each rotation is about an axis n that
is independently sampled from a uniform distribution. Therefore, we will test the
performance of the projection filter with this control law in mind.
Fully characterizing the control amplitude f(t) requires specifying the amplitude
and duration of each pulse, as a larger Larmor frequency is needed to enact the
same rotation in a shorter time. For simplicity, we fix f(t) to have a constant
magnitude and vary only its direction. This constrains the π/2 rotations to be
square-wave pulses, each of duration τ,
\[ \mathbf{f}(t) = \frac{\pi}{2\tau} \sum_{m=1} \chi_{[m-1,m)}(t/\tau)\, \mathbf{n}_m \tag{4.112} \]
where $\chi_{[a,b)}(t)$ is the indicator function for the interval [a, b) and the
$\mathbf{n}_m$ are i.i.d. unit vectors drawn from an isotropic distribution.
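The control law of Eq. (4.112) can be sketched as a piecewise-constant function of time. The code below is an illustrative assumption, not the original MATLAB implementation: the names `random_unit_vector` and `make_control` are hypothetical, and isotropic axes are generated by normalizing three independent Gaussian samples.

```python
import math
import random

def random_unit_vector(rng):
    """Isotropic unit vector from three normalized standard normal samples."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]

def make_control(tau, n_pulses, rng):
    """Return f(t) of Eq. (4.112) for n_pulses square pulses of duration tau."""
    axes = [random_unit_vector(rng) for _ in range(n_pulses)]
    amplitude = math.pi / (2.0 * tau)   # pulse area amplitude * tau = pi / 2

    def f(t):
        m = int(math.floor(t / tau))    # index of the active pulse
        if m < 0 or m >= n_pulses:
            return [0.0, 0.0, 0.0]      # no field outside the pulse train
        return [amplitude * c for c in axes[m]]

    return f, axes
```

With τ = 5 × 10⁻³ κ⁻¹ and a final time of 0.2 κ⁻¹ (the values quoted later in this chapter), the pulse train contains 40 pulses.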
To efficiently simulate the exact dynamics we utilize two conserved quantities.
The first is that because the system Hamiltonian Ht and measurement operator
L commute with $\mathbf{J}^2$, the total angular momentum of the atomic system is
conserved. Furthermore, the states we ultimately use are all initialized with
a maximum projection of angular momentum along some direction, thereby always
possessing n/2 units of angular momentum. This allows us to restrict the simulations
to a (d = n + 1)-dimensional space. In other words, we simulate a single J = n/2 spin
system.
The second conservation property we will use is the fact that without any additional
sources of decoherence, the conditional master equation maps pure quantum
states to pure quantum states. Sec. 3.4.3 discusses the conditional Schrödinger equation
(CSE) and how it can be derived from a conditional master equation (CME).
Using a CSE generates significant computational savings, as each time step propagates
a single complex vector rather than a complex matrix. The general form of
the CSE is given in Eq. (3.94). In our case L = √κ Jz with κ real and positive, and the
CSE simplifies to
\[ d|\psi_t\rangle = \left( -i H_t - \tfrac{1}{2} \kappa \left( J_z - \langle J_z \rangle \right)^2 \right) |\psi_t\rangle\, dt + \sqrt{\kappa} \left( J_z - \langle J_z \rangle \right) |\psi_t\rangle\, dw_t. \tag{4.113} \]
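To illustrate why the CSE is cheap to propagate, the sketch below takes one explicit Euler-Maruyama step of Eq. (4.113) in the Jz eigenbasis with the control Hamiltonian switched off (Ht = 0), so that every operator is diagonal. This is a simplified illustration under those assumptions, not the weak second-order predictor-corrector scheme used for the reported simulations; the renormalization at the end is added here to control discretization error.

```python
import math

def scs_plus_x(n):
    """Jz-basis amplitudes of the spin coherent state along +x, J = n/2.
    Index k corresponds to the eigenvalue m = k - n/2."""
    return [math.sqrt(math.comb(n, k)) / 2.0 ** (n / 2.0) for k in range(n + 1)]

def expect_jz(psi):
    """<Jz> for a normalized state given in the Jz basis."""
    n = len(psi) - 1
    return sum(abs(c) ** 2 * (k - n / 2.0) for k, c in enumerate(psi))

def cse_measurement_step(psi, kappa, dt, dw):
    """One Euler-Maruyama step of Eq. (4.113) with H_t = 0, then renormalize."""
    n = len(psi) - 1
    mean = expect_jz(psi)
    root_kappa = math.sqrt(kappa)
    new = []
    for k, c in enumerate(psi):
        dev = (k - n / 2.0) - mean                 # (Jz - <Jz>) on this component
        new.append(c * (1.0 - 0.5 * kappa * dev * dev * dt + root_kappa * dev * dw))
    norm = math.sqrt(sum(abs(c) ** 2 for c in new))
    return [c / norm for c in new]
```

Each step costs O(n) operations on a vector of length n + 1, compared with O(n²) entries for a density matrix on the same spin-J subspace.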
4.6.1 Simulation parameters
In the absence of the control Hamiltonian Ht, the one universal timescale in the CSE
is set by the measurement strength, i.e. the characteristic time $\kappa^{-1}$. Therefore
these simulations are all reported in units of this characteristic time. The
total qubit numbers we test range from 25 to 100, meaning that
the simulations are of collective spin values 12.5 ≤ J ≤ 50. In addition to
these collective spins, we compare the projection filtering equations to the exact
simulations for a single qubit, confirming numerically that they generate the same dynamics.
The remaining parameters, namely the gate duration τ and the fixed terminal
simulation time tf, are chosen to correspond to the parameters that will ultimately
be used in Chap. 5. Specifically, $\tau = 5 \times 10^{-3}\, \kappa^{-1}$ and $t_f = 0.2\, \kappa^{-1}$.
The actual simulations are implemented in the MATLAB computing environment
using a hand-coded, weak second-order predictor-corrector stochastic differential
equation integrator. The algorithm is described by Kloeden et al. [69, page 200] and
was implemented in MATLAB by Brad Chase for his PhD dissertation [37].
4.6.2 Spin squeezing comparisons
Spin squeezing is a much sought-after and well-studied effect in atomic spin ensembles.
The general phenomenon is the reduction in uncertainty of
a spin component transverse to the mean spin direction. Due to a Heisenberg
uncertainty relationship, this reduction in uncertainty is accompanied by an increase
in uncertainty in the orthogonal quadrature.
The standard example is to consider a collective spin system composed of n
qubits, initialized in a SCS pointing along the +ex direction. This state is clearly
an eigenstate of Jx with eigenvalue mx = J = n/2. It is also easy to show that this
state is a minimum uncertainty state, in that it saturates the Heisenberg uncertainty
relation
\[ \langle \Delta J_z^2 \rangle \langle \Delta J_y^2 \rangle \geq \tfrac{1}{4} \langle J_x \rangle^2 \tag{4.114} \]
with equal uncertainties $\langle \Delta J_z^2 \rangle = \langle \Delta J_y^2 \rangle = \tfrac{1}{2} J$.
An example of a spin squeezed state
is a state that is still mostly polarized along the ex axis but also contains quantum
correlations so that $\langle \Delta J_z^2 \rangle < \langle \Delta J_y^2 \rangle$, while
still maintaining the equality in Eq. (4.114) [70].
A spin squeezed state is one of the immediate consequences of a continuous measurement
of a collective angular momentum variable, such as Jz [71], for a SCS
prepared transverse to the measurement axis. A number of papers have investigated
the relation between spin squeezing and various measures of entanglement (see e.g.
[72] and references therein). One particular measure of spin squeezing, $\xi_T^2$, has been
shown by Yin et al. to be directly related to the concurrence, a measure of pairwise
entanglement [72]. They show that when the concurrence C is greater than zero,
indicating entanglement, then $\xi_T^2 < 1$, and that when $\xi_T^2 \geq 1$, C = 0 and the
state is unentangled. $\xi_T^2$ is defined as follows.
For each pair of angular momentum components we can compose the symmetrized
correlation and covariance matrices (i, j = x, y, z)
\[ \mathrm{Corr}_{i,j} = \tfrac{1}{2} \langle J_i J_j + J_j J_i \rangle \tag{4.115} \]
and
\[ \mathrm{Covar}_{i,j} = \mathrm{Corr}_{i,j} - \langle J_i \rangle \langle J_j \rangle. \tag{4.116} \]
From these matrices we can also form the Hermitian matrix
\[ \Gamma = (n - 1)\, \mathrm{Covar} + \mathrm{Corr}. \tag{4.117} \]
The squeezing parameter $\xi_T^2$ is then defined as
\[ \xi_T^2 \equiv \frac{\lambda_{\min}}{\langle \mathbf{J}^2 \rangle - \frac{n}{2}} \tag{4.118} \]
where λmin is the minimum eigenvalue of the matrix Γ.
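The definition in Eqs. (4.115)-(4.118) can be turned into a short calculation once the first and second moments of the collective spin are known. The sketch below is illustrative only: the function names are assumptions, the matrices are taken to be real symmetric (as the symmetrized moments are), and the smallest eigenvalue is found with the standard closed-form trigonometric solution for 3 × 3 symmetric matrices.

```python
import math

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def min_eig_sym3(a):
    """Smallest eigenvalue of a real symmetric 3x3 matrix."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:                                   # already diagonal
        return min(a[0][0], a[1][1], a[2][2])
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))          # clamp against rounding
    phi = math.acos(r) / 3.0
    return q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)

def xi_t_squared(mean, corr, n, j_squared):
    """Eq. (4.118): mean[i] = <J_i>, corr = Eq. (4.115), j_squared = <J^2>."""
    gamma = [[(n - 1) * (corr[i][j] - mean[i] * mean[j]) + corr[i][j]
              for j in range(3)] for i in range(3)]  # Eqs. (4.116)-(4.117)
    return min_eig_sym3(gamma) / (j_squared - n / 2.0)
```

For a spin coherent state along +ex one has ⟨Jx⟩ = n/2, Corr = diag(n²/4, n/4, n/4) and ⟨J²⟩ = (n/2)(n/2 + 1), which gives Γ = (n²/4) I and hence a squeezing parameter of exactly 1 (0 dB).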
4.6.3 Squeezing simulations
This section presents simulations that benchmark the typical effect the measurement
has upon the states of interest, as well as how much the control law mitigates these
effects. We test here five classes of states, each composed of n = 1, 25, 50, 75, 100
qubits. The one qubit case is included as a control, testing that the numerics produce
reasonable results.
As discussed previously, a continuous measurement of Jz is a standard protocol
for producing a spin squeezed state. However, a spin coherent state (SCS) prepared
along the ez axis will not squeeze at all, while states prepared in the equatorial
plane squeeze the most. Therefore, to demonstrate the maximum amount of quantum
correlations a typical measurement realization can produce, a natural choice is a SCS
prepared along the ex axis.
Fig. 4.1 characterizes the typical results of a CSE simulation for a +ex SCS
initial state containing n = 50 qubits (spin J = 25). Figs. 4.1.a - 4.1.c show the
conditional expectation values of Jx, Jy and Jz as a function of time. Also plotted are
1σ regions of confidence indicating the expected deviations from these mean values.
Figure 4.1: Typical Uncontrolled Evolution. CSE evolution for a +ex SCS initial state containing n = 50 qubits (J = 25). a-c: The conditional expectation values for Jx, Jy, and Jz vs time, including a 1σ region of confidence. d: The squeezing parameter $\xi_T^2$ in dB vs time. e: The Husimi Q-function $q_\psi(\theta, \phi)$ for the conditional state $|\psi_{t_f}\rangle$ at the final time $t_f = 0.2\, \kappa^{-1}$.
In other words, the grey regions are bounded by the values
$\langle J_i \rangle \pm \sqrt{\langle \Delta J_i^2 \rangle}$. Fig.
4.1.a shows how ⟨Jx⟩ tends to decrease as the state squeezes around the sphere and
how its uncertainty grows. Fig. 4.1.b shows how ⟨Jy⟩ remains zero throughout the
measurement, while its variance increases due to the characteristic anti-squeezing.
Conversely, Fig. 4.1.c shows the decrease in $\langle \Delta J_z^2 \rangle$ due to the squeezing as well as
the deviation of ⟨Jz⟩ from zero as the system evolves towards an eigenstate of Jz.
Fig. 4.1.d plots the evolution of the squeezing parameter $\xi_T^2$, in dB, as a function of
time (where $\xi_T^2$ in decibels is $10 \log_{10}(\xi_T^2)$). Fig. 4.1.e shows a single 3D plot of
the Husimi Q-function quasi-probability distribution for the state at the final time
tf. The Q-function for a pure spin state ψ with total angular momentum J is defined
as
\[ q_\psi(\theta, \phi) \equiv \frac{2J + 1}{4\pi}\, |\langle \theta, \phi | \psi \rangle|^2 \tag{4.119} \]
where |θ, φ⟩ is a SCS parameterized by the polar angles θ and φ. The constant
factor ensures normalization. The color scale of Fig. 4.1.e has been normalized to
the maximum value of $\frac{2J+1}{4\pi}$ (≈ 4.06 for J = 25). The fact that the Q-function has
a maximum value of ∼ 2.2 shows that this squeezed state has poor overlap with SCSs.
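Eq. (4.119) is straightforward to evaluate for a state expressed in the Jz basis. The sketch below is an illustration under a stated convention, which is an assumption here rather than taken from the text: the single-qubit state pointing along (θ, φ) is written as cos(θ/2)|↑⟩ + e^{iφ} sin(θ/2)|↓⟩, and index k of the amplitude list counts the number of |↑⟩ qubits (m = k − n/2).

```python
import cmath
import math

def scs(n, theta, phi):
    """Jz-basis amplitudes of the spin coherent state |theta, phi>, J = n/2."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [math.sqrt(math.comb(n, k)) * c ** k * s ** (n - k)
            * cmath.exp(1j * (n - k) * phi) for k in range(n + 1)]

def husimi_q(psi, theta, phi):
    """Husimi Q-function of Eq. (4.119) for a pure state psi of spin J = n/2."""
    n = len(psi) - 1                      # 2J
    ref = scs(n, theta, phi)
    overlap = sum(r.conjugate() * a for r, a in zip(ref, psi))
    return (n + 1) / (4.0 * math.pi) * abs(overlap) ** 2
```

Evaluating `husimi_q` for a spin coherent state at its own angles returns the maximum value (2J + 1)/(4π), which is the normalization used for the color scale in the figures.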
Figure 4.2: Typical Controlled Evolution. CSE evolution for a +ex SCS initial state, containing n = 50 qubits, with 40 randomized π/2 rotations. a-c: The conditional expectation values for Jx, Jy, and Jz vs time, including a 1σ region of confidence. d: The squeezing parameter $\xi_T^2$ in dB vs time. e: The Husimi Q-function $q_\psi(\theta, \phi)$ for the conditional state $|\psi_{t_f}\rangle$ at the final time $t_f = 0.2\, \kappa^{-1}$.
Fig. 4.2 presents a typical realization for the same initial condition in the presence
of the randomized π/2 rotations, each with a duration of $\tau = 5 \times 10^{-3}\, \kappa^{-1}$. By the time
$t_f = 0.2\, \kappa^{-1}$, this period leads to a total of 40 rotations, or one for every horizontal
tick mark in Figs. 4.2.a - 4.2.d. Figs. 4.2.a - 4.2.c show the conditional expectation
values ⟨Jx⟩, ⟨Jy⟩ and ⟨Jz⟩ as a function of time. In contrast to the example lacking
the randomized controls, these figures show that there is little qualitative difference
between the three expectation values. Fig. 4.2.d indicates that there is a significant
reduction in the amount of squeezing produced during measurement compared to the
uncontrolled system: $\xi^2_{T\,\min} = -3.21$ dB in the presence of controls, while
$\xi^2_{T\,\min} = -10.1$ dB without them. The 1σ confidence regions indicate that this squeezing is
not with respect to a fixed coordinate axis but is rotated between all three, and
so there is a substantial averaging effect that leads to far less squeezing than in the
uncontrolled case. In fact there is nearly a factor of 5 decrease in the maximum
amount of squeezing, and therefore the controlled state is kept much more separable.
This separability is indicated in the Q-function of the final state, shown in Fig. 4.2.e,
with its more spherical appearance and near perfect overlap with a spin coherent
state, indicated by the maximum value ∼ 3.7.
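The quoted factor can be recovered from the two decibel values by converting their difference back to a linear ratio (a quick arithmetic check using only the numbers stated above):

```latex
\frac{\xi^2_{T\,\min}\big|_{\text{control}}}{\xi^2_{T\,\min}\big|_{\text{no control}}}
  = 10^{\left[ -3.21 - (-10.1) \right] / 10}
  = 10^{0.689}
  \approx 4.9 .
```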
The amount of squeezing is significantly reduced because the randomized controls
tend to mix the squeezed and anti-squeezed components, leading to a near-zero
average. Not only does the mean spin rotate, but the orientation of the squeezing
ellipse also rotates. As the rotation axes are chosen from a uniform distribution, the
squeezed component is just as likely as the anti-squeezed component to be oriented
along the measurement axis. At any given time, the uncertainty in the Jz component
is equally likely to be above or below the uncertainty of an equivalent spin coherent
state. Therefore it is difficult for any significant squeezing to develop.
4.6.4 Projection filter simulations
Ultimately we need to compare how well the projection filter performs when it
calculates an innovation from a measurement record yt that is not generated by
a separable state. In this case dwt is actually given by the innovation process,
$dv_t = dy_t - \sqrt{\kappa}\, n\, x^3_t\, dt$, which is not a Wiener process in all cases. We make this
comparison through two measures. The first is to see how well the projection filter is
able to reproduce the expectation values ⟨Jx⟩, ⟨Jy⟩ and ⟨Jz⟩ compared to the exact
conditional state. The second is through the fidelity (squared overlap) between the
exact and approximate states.

Figure 4.3: Projection Filter Tracking example. The comparison between the projection filter predictions and the exact conditional expectation values for the simulations shown in Figs. 4.1 and 4.2. a-c: The conditional expectation values ⟨Ji⟩ are shown for the exact uncontrolled state (blue) with the 1σ regions of confidence (grey). Also shown are the projection filtered values $\tfrac{n}{2} x^i_t$ (red). d: The squeezing in the exact uncontrolled state (blue) and the squeezing reported by the projection filter (red). e-h: Same as a-d but with the controls now applied.

Fig. 4.3 makes the comparison between the projection
filter and the expectation values shown in Figs. 4.1 and 4.2. Figs. 4.3.a - 4.3.c and
4.3.e - 4.3.g re-plot the true conditional expectation values, ⟨Ji⟩, as well as the projection
filter values, given simply as $\tfrac{n}{2} x^i_t$. Fig. 4.3.a shows that as the uncontrolled
state becomes significantly squeezed, the ⟨Jx⟩ value reduces accordingly. The projection
filter is unable to account for this and therefore has a noticeable error. Fig. 4.3.c
shows that after a time $t \sim 0.05\, \kappa^{-1}$, there is an increase in the difference between
the conditional expectation value ⟨Jz⟩ and the value calculated from the projection
filter. These differences are in stark contrast to the tracking results in Figs. 4.3.e -
4.3.g, where the differences in all three expectation values are almost all within
the line thicknesses. Figs. 4.3.d and 4.3.h emphasise the fact that the projection
filter reports separable states and so the projected squeezing parameter $\xi_T^2$ remains
fixed at 0 dB.
Beyond these two sample trajectories, we also test the quality of the projection
filter for a variety of initial states and qubit numbers. This comparison is made
in Fig. 4.4, showing a trial-averaged RMS error between the projection filter and
the exact conditional expectation values. Here we test five different qubit numbers,
n ∈ {1, 25, 50, 75, 100}. For each n, the average is made over ν = 100 input SCSs
chosen at random with a uniform distribution over the Bloch sphere. The same
Bloch angles are used for each n. For each input state we run a single simulation to
compute the three exact conditional expectation values, ⟨Ji⟩, as well as the projection
filter Bloch components $x^i_t$. We then compute the RMS errors
\[ \mathrm{Err}\, J_i \equiv \sqrt{ \left\langle \left( \tfrac{1}{J} \langle \psi_t | J_i | \psi_t \rangle - x^i_t \right)^2 \right\rangle_\nu }, \tag{4.120} \]
as a function of time. The expectation value ⟨·⟩ν represents the arithmetic mean
over the ν trials. The normalization of the exact expectation value means that
0 ≤ Err Ji ≤ 1 or, in other words, that the error is in units of the total spin length J. With the
exception of the single qubit case (showing only numerical integration error), the
scaled RMS errors are relatively independent of the number of qubits. This is likely due
to the fact that when the system is in a pure state, the projection filtering equations
are independent of n (see Sec. 4.5.1).
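At a fixed time, the RMS error of Eq. (4.120) is a plain root-mean-square over trials. A minimal sketch, with hypothetical argument names:

```python
import math

def rms_error(exact_scaled, filter_x):
    """Eq. (4.120) at one instant of time.

    exact_scaled[k] is <psi_t| J_i |psi_t> / J for trial k and
    filter_x[k] is the projection filter component x^i_t for the same trial.
    """
    if len(exact_scaled) != len(filter_x) or not exact_scaled:
        raise ValueError("need two non-empty lists of equal length")
    nu = len(exact_scaled)
    total = sum((e - x) ** 2 for e, x in zip(exact_scaled, filter_x))
    return math.sqrt(total / nu)
```

Applying this function independently at every stored time step and for each component i = x, y, z would produce curves of the kind plotted against κt.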
[Figure 4.4 plot. Left column, titled "RMS error − No Control": Err Jx, Err Jy and Err Jz, in units of J, vs κt. Right column, titled "RMS error − Control": the same quantities. Horizontal axes span κt from 0 to 0.2; vertical axes span 0 to 0.1.]
Figure 4.4: Average RMS Tracking error vs time. The left column shows Err Ji, with i = x, y, z (in descending order) in the case of no control fields. The right column shows the same but in the presence of control fields. The average is over ν = 100 uniformly random Bloch angles with a single noise realization per state. For each Bloch vector, the RMS error is computed for n = 1 (blue), n = 25 (green), n = 50 (red), n = 75 (cyan) and n = 100 (purple) qubits.

However, the presence of the strong randomized controls has a significant effect.
Sans the controls, Fig. 4.4 shows a near linear increase in Err Jx and Err Jy. In Fig.
4.3.a, the projection filter was unable to track the decrease in the ⟨Jx⟩ component as
the uncontrolled state became squeezed and developed significant curvature on the
sphere. Were the system initialized in a +ey spin coherent state, the roles of Jx and
Jy would be reversed but the behavior would be the same. We attribute the increase
in Err Jx and Err Jy to this effect. In contrast, when the randomized controls are
applied, the RMS error is equally distributed across all expectation values and remains
≲ 5% of the total spin length. Additionally, the Err Jz value is significantly worse
in the uncontrolled case. The ∼ 1% error at time t = 0 in the n = 100 simulations
is attributed to using Stirling’s approximation to calculate the Jz basis coefficients
in the initial SCS.
[Figure 4.5 plot. Left panel, titled "Projection Filter Fidelity − No Control", and right panel, titled "Projection Filter Fidelity − Control": projection filter fidelity (vertical axis, 0.3 to 1) vs κt (horizontal axis, 0 to 0.2), with legend entries n = 1, 25, 50, 75, 100.]
Figure 4.5: Average Projection Filter Fidelities. These plots show the average fidelity between the exact CSE simulation and the SCS given by the projection filter. The left plot shows the fidelity in the uncontrolled case with the right adding the 40 π/2 gates per simulation. The average is over ν = 100 uniformly random Bloch angles and a single noise realization per state. For each Bloch vector, the average fidelity is computed for n = 1 (blue), n = 25 (green), n = 50 (red), n = 75 (cyan) and n = 100 (purple) qubits.
While these results indicate that the projection filter performs well in the presence
of rapid, randomized rotations, an arbitrary spin state with J > 1/2 contains
more information than simply three expectation values. To characterize the general
performance, we turn to the second comparison and calculate the average fidelity
between the exact state and a SCS given by the projection filter. Fig. 4.5 makes
this comparison, averaged over ν = 100 uniformly sampled states. This is made
both with and without controls and again for n = 1, 25, 50, 75, 100 qubits. The
overall state fidelity for SCSs with n > 1 degrades as the number
of qubits increases, indicating an increase in the squeezing produced during the fixed
measurement duration. In the worst case, with n = 100 and no controls applied, the
average fidelity reaches a minimum value ∼ 0.47. However, with the controls, the
fidelity is > 0.80 for any n. The non-monotonic decrease in the controlled fidelity
suggests that the specifics of the control law impact this fidelity, and so it might be
possible to optimize the control law so that the average fidelity is maximized.
Chapter 5
Qubit State Reconstruction
This chapter describes how to use the quantum filtering formalism to construct a
tomographic estimate for an unknown initial quantum state from an ensemble of
identical copies experiencing a joint continuous measurement. We make a maximum
likelihood estimate of this state, based upon the statistics of a continuous measure-
ment of an output field quadrature. The purpose of this work is to extend previous
results [11–13] into a regime where the quantum backaction significantly affects the
measurement statistics.
We consider here the case of an ensemble of n qubits coupled to a single traveling
wave quantum light field. The qubit ensemble is assumed to be in a pure spin coherent
state characterized by the unknown polar angles (θ, φ). The quantum state estima-
tion problem is mapped to a parameter estimation problem, which is then approxi-
mated by a Monte Carlo sampling algorithm. Numerical experiments show that the
ultimate performance of the estimate approaches an optimum fidelity bound, found
by Massar and Popescu [39]. The deficit in the reconstruction fidelity is attributed
to a separability approximation in the Monte Carlo algorithm. This algorithm is
compared to, and significantly outperforms, an equivalent “Schrödinger” estimate
that ignores the backaction of the measurement. At long times the Schrödinger
estimate is shown to be biased away from the true state, indicating the significance of
the conditional dynamics and the utility of the quantum filtering framework.
5.1 Previous reconstruction results
A fundamental task in quantum information processing is the ability to both reliably
prepare an arbitrary quantum state and to experimentally verify its production.
Traditional quantum state estimation relies on an exhaustive tomographic procedure
where the target state is repeatedly prepared and then destructively measured in
an informationally complete number of measurement settings. Such a procedure is
often extremely time intensive, requiring both a tremendous amount of data as well
as significant post processing time [2, 3].
In an alternative protocol proposed by Silberfarb et al., these inefficiencies can
be largely side-stepped through a weak continuous measurement of an identically
prepared ensemble in conjunction with a well-chosen dynamical control [11]. In
particular, an atomic ensemble is prepared in an identical tensor product state
$\rho_{\mathrm{tot}} = \rho_0^{\otimes n}$ and experiences a known Hamiltonian while simultaneously
coupled to a traveling wave probe via a collective degree of freedom. A continuous
measurement of this probe then generates a measurement record that is strongly
correlated with the evolution of the system. If the dynamics drive the system in such
a way as to make the measurement informationally complete, then a statistical estimate
of an unknown initial system state should have a high fidelity with the true initial condition.
Such a system naturally arises in the field of laser-cooled atoms, where an ensemble
of n atoms is easily assembled and then weakly coupled to an off-resonant probe
laser. One can then measure a collective spin state of the ensemble via the amount
of polarization rotation induced by the Faraday effect. This protocol has been
implemented in several experiments, ultimately reconstructing the full 16-dimensional
hyperfine ground state manifold [12, 13]. However, these experiments were performed
in a parameter regime where the intrinsically quantum nature of the continuous measurement
could be ignored. The amount of state disturbance caused by the nonlinear
measurement process, the so-called backaction, was negligibly small when compared to
the decoherence induced by diffuse light scattering as well as inhomogeneous effects
in the control fields.
This work investigates, through theoretical analysis and numerical simulation,
the fundamental limits of this protocol. We do so in an idealized model, where
the effects of decoherence are absent and thus the backaction becomes a significant
effect. To avoid unnecessary complications, we will also reduce the dimensionality
of our fundamental system and consider only pure qubits, initialized in an identical
tensor product state $|\psi_{\rm tot}\rangle = |\psi_0\rangle^{\otimes n}$. With a
fully quantum model of the atom-light interaction, we formulate a maximum likelihood
(ML) estimate of the single particle initial state, which we will denote as
$|\psi_0\rangle$.
5.2 The Estimation Procedure
When ignoring backaction, the linearity of an unconditioned master equation means
that the measurement signal can be considered as a linear function of the initial
state of the atomic system, $\rho_0$. In an additive white noise model, the
instantaneous polarimetry signal $y(t)$ can be modeled as

$$ y(t) = g\,\mathrm{Tr}\big(\mathcal{V}(t, O_0)\,\rho_0\big) + \text{“white noise”}, \tag{5.1} $$

where $g$ is a measurement gain relating to the signal-to-noise ratio,
$\mathcal{V}(t, \cdot)$ is the Heisenberg picture equivalent to a dissipative master
equation and $O_0$ is the initial system coupling observable [13]. The problem of
state reconstruction in this model then becomes a constrained linear estimation
problem.
In a generalized measurement model, the set of possible outcomes is described
by a positive operator-valued measure (POVM), with elements $E_\alpha$ indexed by a
discrete outcome $\alpha$. In a given model of this measurement there exists a
(possibly non-unique) decomposition of the POVM into a set of Kraus operators
$A_\alpha$, which satisfy the relation $E_\alpha = A_\alpha^\dagger A_\alpha$ for
every outcome $\alpha$. Then upon obtaining the outcome $\alpha$, a pure state
$|\psi\rangle$ updates via the transformation

$$ |\psi\rangle \rightarrow \frac{1}{\sqrt{\langle\psi|E_\alpha|\psi\rangle}}\, A_\alpha |\psi\rangle. \tag{5.2} $$
Due to the renormalization factor, this update map is inherently nonlinear in the state
vector. Any generalized measurement scheme can be decomposed into a continuous
measurement process [73]. Conversely, a continuous measurement process can be
modeled as a limiting sequence of weak generalized measurements. Then in general,
the nonlinearity of a repeated application of a time-dependent update map means
that a measurement sequence is no longer a linear functional of the initial state ρ0.
Much is known about the fundamental quantum limits of reconstructing pure
qubit states from a finite number of measurements. Massar and Popescu showed that
given n copies of a pure qubit state, it is possible to find a generalized measurement
that optimizes the average fidelity $\langle F\rangle$ between the state estimate
and the true state, averaged over all possible input states [39]. The fidelity for
the pure states $\psi_1$ and $\psi_2$ is

$$ F \equiv |\langle\psi_1|\psi_2\rangle|^2. \tag{5.3} $$

(For mixed states this corresponds to the Uhlmann fidelity, but here we will only
be concerned with pure states.) With this definition the optimum average fidelity
bound is simply given by

$$ \langle F\rangle_{\rm opt} = \frac{n+1}{n+2}. \tag{5.4} $$
They also showed that such a generalized measurement is necessarily a joint
measurement involving all n qubits, and no single measurement applied in series to
each qubit can achieve this bound. Later, Bagan et al. found that a generalized
measurement scheme that achieves this bound is a measurement that is uniform over all
possible spin coherent states (SCS) composed from n qubits [74]. While Varbanov
and Brun gave a constructive proof for a continuous time stochastic process that
reproduces a given generalized measurement, it is often quite difficult to obtain a
closed form expression for the POVM that the entirety of a given continuous
measurement implements. Rather than pursuing this track, we turn to a Monte Carlo
sampling framework.
At its most basic level, the initial state estimation problem is a parameter
estimation problem, in that we observe a time varying signal whose statistics
parametrically depend upon the initial state of the atomic system. The simplest of
all initial state estimation problems is binary state discrimination. In this
problem, the initial condition is known to be one of two possibilities, $\psi_a$ or
$\psi_b$. Then based upon a sequence of measurements, $\{y_t\}$, we wish to identify
which state was most likely to generate these data.
In our more general problem, we have a data set $\{y_t\}$ and a detailed model of
the dynamical system that generated the data, but only with the knowledge that the
initial state is a SCS. To deal with the continuous nature of this parameter estimation
problem we resort to Monte Carlo sampling. We randomly generate a collection of
m sample SCS, $\{\psi_j : j = 1, \ldots, m\}$, picked from some prior distribution.
In Sec. 5.4 we describe how we choose the prior distribution through a two step
resampling procedure, seeded from a uniform distribution over spherical angles.
Because the space of qubit SCS is isomorphic to the surface of the sphere, with just
a few hundred samples we can easily cover that space so that any discretization error
is well below the infidelity implied by the optimum bound $\langle F\rangle_{\rm opt}$.
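The size of the discretization error can be checked with a short numerical experiment. The sketch below is illustrative only: it assumes the samples are drawn uniformly over the sphere (the procedure in the text resamples within a solid angle instead), it uses the pure-qubit identity $F = (1 + r_1 \cdot r_2)/2$ for Bloch vectors, and the function names are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a point uniformly from the unit sphere (uniform in z and azimuth)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def fidelity(r1, r2):
    """Fidelity |<psi1|psi2>|^2 of two pure qubits with Bloch vectors r1, r2."""
    return 0.5 * (1.0 + sum(a * b for a, b in zip(r1, r2)))


def mean_discretization_infidelity(m, trials, seed=0):
    """Average infidelity between a random target and the best of m samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        target = random_unit_vector(rng)
        best = max(fidelity(target, random_unit_vector(rng)) for _ in range(m))
        total += 1.0 - best
    return total / trials


def optimal_infidelity(n):
    """Infidelity 1 - (n + 1)/(n + 2) implied by the bound in Eq. (5.4)."""
    return 1.0 - (n + 1) / (n + 2)


if __name__ == "__main__":
    print("discretization:", mean_discretization_infidelity(300, 200))
    for n in (25, 100):
        print("n =", n, "optimal infidelity:", optimal_infidelity(n))
```

Because the fidelity between a fixed pure qubit and a uniformly random one is itself uniform on [0, 1], the expected infidelity to the best of m samples is 1/(m + 1), which for a few hundred samples sits below the value 1/(n + 2) at n = 100.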
Irrespective of how the candidate states are chosen, we have reduced the continuous
parameter estimation problem to a much simpler state discrimination problem.
We will choose the state $|\psi_{m'}\rangle \in \{|\psi_m\rangle\}$ that maximizes the
likelihood function $p(\{y_t\}\,|\,\psi_m)$. In other words, the ML state
$|\hat\psi\rangle$ is defined as

$$ |\hat\psi\rangle = \arg\max_{|\psi_m\rangle}\; p(\{y_t\}\,|\,\psi_m). \tag{5.5} $$
In order to evaluate the likelihood function, we are still left with the problem of
solving the recursive POVM expression or finding an equivalent method for calculating
it.
Here we choose to formulate an equivalent expression. Because we are working
with a finite set of hypothesis states, we find that it is more efficient to propagate m
(approximate) conditional states from their initial values and calculate the likelihood
for seeing the next increment, given the current estimates. This method is discussed
in detail in Sec. 5.4.
5.3 The Model
Sec. 2.7 reviews how the Faraday interaction can be modeled as a collective angular
momentum J coupled to a single P quadrature in vacuum. We align our coordinates
so that we couple to the Jz projection of angular momentum, with the collective
angular momentum operators

$$ J_i \equiv \frac{1}{2} \sum_{j=1}^{n} \sigma_i^{(j)}, \tag{5.6} $$

where $\sigma_i^{(j)}$ is the ith Pauli operator for the jth qubit and we have set
$\hbar = 1$. The
coupling rate κ between Jz and P is proportional to the local power in the drive
laser field, which in general could be a time varying quantity. For simplicity, we will
assume that the laser is operated in a switched mode, where at time t = 0 it achieves
a constant value and that the measurement record ends before it is turned off.
In order to make the measurement record informationally complete, (or in the
language of filter stability, make the system observable), we need to add an external
control Hamiltonian Ht, acting solely on the collective spin system. The exact form
for Ht to make it observable will be discussed in Sec. 5.3.1. Under these parameters
the system field interaction is given by the unitary propagator Ut, which is the
solution to the QSDE
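$$ dU_t = \Big( \sqrt{\kappa}\, J_z\, dA_t^\dagger - \sqrt{\kappa}\, J_z\, dA_t - \tfrac{1}{2}\kappa\, J_z^2\, dt - i H_t\, dt \Big)\, U_t, \qquad U_0 = \mathbb{1}. \tag{5.7} $$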
From this stochastic propagator we are able to apply the results of Sec. 3.3 and work
with a conditional master equation (CME). For reference, upon the receipt of the
measurement realization $\{y_t\}_{t\ge 0}$, the CME for this model is given by the SDE

$$ d\rho_t = -i[H_t, \rho_t]\, dt + \kappa\, \mathcal{D}[J_z](\rho_t)\, dt + \sqrt{\kappa}\, \mathcal{H}[J_z](\rho_t)\, dv_t \tag{5.8} $$

with the initial condition $\rho_0 = \rho(0)$, where we have the following definitions.
$\mathcal{D}[J_z](\rho_t)$ is the Lindblad map commonly found in open quantum systems
and is defined as

$$ \mathcal{D}[J_z](\rho_t) \equiv J_z\, \rho_t\, J_z - \tfrac{1}{2} J_z^2\, \rho_t - \tfrac{1}{2} \rho_t\, J_z^2. \tag{5.9} $$

$\mathcal{H}[J_z](\rho_t)$ is the state update map defined as

$$ \mathcal{H}[J_z](\rho_t) \equiv J_z\, \rho_t + \rho_t\, J_z - 2\, \mathrm{Tr}(J_z\, \rho_t)\, \rho_t. \tag{5.10} $$

This map shows how the state updates, weighted by the strength of the innovation
process,

$$ dv_t = dy_t - 2\sqrt{\kappa}\, \mathrm{Tr}(J_z\, \rho_t)\, dt. \tag{5.11} $$
5.3.1 Observability and randomized controls.
In reconstructing the full Cs ground state manifold, Riofrío et al. used a randomized
control policy to generate an informationally complete measurement record [13].
Merkel et al. showed that by combining transverse RF magnetic fields and microwave
radiation, with fixed magnitudes and time varying phases, the 16-dimensional ground
state manifold is controllable [75]. In other words, through these fundamental
operations it is possible to generate any ground state operation and thereby map any
state to any other state.
The connection between controllability and observability is a natural one. Imagine
that at time t = 0 the probe couples to the operator Jz. In order for the
measurement statistics of this probe to depend upon the Jy Bloch component, an
external control must at some point rotate the system so that the field now couples to
the part of Hilbert space spanned by the projectors of Jy. If the controls are unable
to affect some hidden subspace, then the only other way to know about that part
of Hilbert space is to apply an additional probe. Not every observable system needs
to be controllable, however. One can certainly observe a system completely without
being able to affect it in an arbitrary way.
The strictest definition for a system to be observable is that if there are two
quantum states $\rho_A$ and $\rho_B$ with $\rho_A \neq \rho_B$, then there must exist
a projector $P$ in the von Neumann algebra generated by the observation process
$\{Y_t\}_{t\ge 0}$ such that $\mathrm{Tr}(\rho_A P) \neq \mathrm{Tr}(\rho_B P)$ [76].
(See Sec. 3.2.2 for a discussion of von Neumann algebras and quantum stochastic
processes.) This definition guarantees that after many trials, one will always be
able to distinguish $\rho_A$ from $\rho_B$ by looking at the statistics of Y.
However, even if a given system is observable, this does not guarantee that it
is well observed in a given measurement realization. In order for the statistics of a
single realization to give a high fidelity estimate, the space of possible initial states,
e.g. the space of all spin coherent states, should be well represented throughout the
measurement record. If the goal were to measure Jz to a high degree of accuracy, the
optimum control policy would be to apply no control at all. However, our objective is
to measure every spin coherent state with equal weight, thereby hopefully achieving
the optimum POVM fidelity bound.
Riofrío et al. found that high fidelity reconstructions were possible by choosing
random, piecewise constant phase angles, thereby randomly cycling through a
controllable set of operations. Here we choose to implement a control policy that is
randomized between a set of generators that rapidly spans the space of spin coherent
states. This policy then guarantees that these states will be well represented in the
measurement statistics. To achieve this, the control Hamiltonian Ht is chosen to
have the form
Ht = b(t) · J = bx(t)Jx + by(t)Jy + bz(t)Jz, (5.12)
where the control field components bi(t) are drawn from a random distribution but
are predetermined before the start of the measurement, i.e. are without measurement
feedback.
For simplicity, we further emulate the control policy of the Cs experiments and
fix the magnitude of the control field while varying its direction in a randomized
but piecewise constant way. Furthermore we will constrain the magnitude so that
for each direction, the Bloch vector will rotate by π/2. Switching the field direction
with a period of τ then requires ‖b(t)‖ = π/(2τ). With this constraint, the control
law is fully defined.
To generate a control waveform with m randomized π/2 gates with a period τ,
we first generate a set of m unit vectors $\{e_i\}$ so that each vector $e_i$ is
drawn from a uniform distribution across the unit sphere. The control field is then

$$ b(t) = \frac{\pi}{2\tau} \sum_{i=1}^{m} \chi_{[i-1,\,i)}(t/\tau)\; e_i, \tag{5.13} $$

where $\chi_{[i-1,\,i)}$ is the indicator function of the interval $[i-1, i)$.
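As an illustration, the waveform of Eq. (5.13) can be generated in a few lines. This is a minimal sketch rather than the code used for the simulations; the function names and the choice of random number generator are assumptions made for the example.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a point uniformly from the unit sphere (uniform in z and azimuth)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def make_control_field(num_gates, tau, seed=0):
    """Return b(t) of Eq. (5.13) as a function of time, plus the gate axes."""
    rng = random.Random(seed)
    axes = [random_unit_vector(rng) for _ in range(num_gates)]
    magnitude = math.pi / (2.0 * tau)  # gives a pi/2 rotation per period tau

    def b(t):
        i = math.floor(t / tau)  # 0-based index of the active gate
        if i < 0 or i >= num_gates:
            return (0.0, 0.0, 0.0)  # the indicator functions vanish here
        return tuple(magnitude * c for c in axes[i])

    return b, axes


if __name__ == "__main__":
    b, axes = make_control_field(num_gates=40, tau=5e-3)
    print(b(0.0), b(0.0123))
```

The axes are fixed before any data are taken, so the same `b` can be reused for every trial, in keeping with the requirement that the controls involve no measurement feedback.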
5.4 The Likelihood Function
In a discrete setting where the space of all possible outcomes (the entire measurement
record $\{y_t\}_{t\ge 0}$) contains only a finite number of elements, the likelihood
function is simply the probability of receiving the observed values, given a
parameter value. The maximum likelihood estimate is then the parameter value that
maximizes the probability for obtaining the observed data. When the measurement takes
on a continuum of outcomes, the probability for receiving a specific outcome is in
fact zero. However, we can still formulate a likelihood function by instead considering
the probability density for the observed value.
Things become a bit more complicated when considering stochastic processes in
continuous time. In Chap. 3, we found that the probability measure for a Wiener
process was defined by Wiener’s discrete path integral. This means that for a
sequence of n times $0 = t_0 < \cdots < t_i < \cdots < t_n = t_f$ we can ask for the
probability that the Wiener process evaluated at time $t_i$ will be within the
interval $I_i = (a_i, b_i)$. The resulting probability is given by the integral

$$ P(w_{t_i} \in I_i) = \int_{a_1}^{b_1} dw_1 \int_{a_2}^{b_2} dw_2 \cdots \prod_i \left( \frac{1}{\sqrt{2\pi \Delta t_i}} \exp\left( -\frac{(w_i - w_{i-1})^2}{2 \Delta t_i} \right) \right). \tag{5.14} $$
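If one attempts to take a continuous limit of this expression, one finds something
rather peculiar [77]. Focusing on just the product of exponentials gives

$$ \lim_{n\to\infty} \prod_{i=1}^{n} \exp\left( -\frac{(w_i - w_{i-1})^2}{2 \Delta t_i} \right) = \lim_{n\to\infty} \exp\left( -\frac{1}{2} \sum_{i=1}^{n} \Delta t_i \left( \frac{w_i - w_{i-1}}{\Delta t_i} \right)^2 \right) = \exp\left( -\frac{1}{2} \int_0^{t_f} ds \left( \frac{dw_s}{ds} \right)^2 \right). \tag{5.15} $$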
Were the Wiener process in any way differentiable, this expression might be
exceedingly useful. However with our limited knowledge of stochastic analysis, it
merely indicates the subtleties in working with densities of continuous time,
nondifferentiable processes. Attempting to make sense of these kinds of objects led
to the formulation of a stochastic calculus of variations, which has proved
exceedingly useful for extending the Ito integral to anticipative integrands [78] as
well as for a theory of white noise stochastic partial differential equations [21].
We will not follow this path here.
There is an additional consideration, as we know that $\{y_t\}_{t\ge 0}$ is decidedly
not a Wiener process. Even making a discrete approximation, we still need to find an
expression for the discrete density and how it depends upon the initial system state.
In this problem, we can make some progress. Sec. 3.4.1 showed that the innovation
process, $v_t = y_t - 2\sqrt{\kappa} \int_0^t ds\, \mathrm{Tr}(J_z \rho_s)$, is an
instance of a Wiener process. More specifically, $v_t$ is a Wiener process when the
filtered state $\rho_s$ accurately represents the conditional state of the system. In
our Monte Carlo setting we do not have just a single conditional state $\rho_t$; we
in fact have a set of m conditional states $\{\rho^m_t\}$, as the proper initial
condition is unknown. It is possible that a candidate state $\rho^m_t$ will differ in
some aspects from the conditional state we would calculate, had we known the true
initial condition. For each hypothetical state we will have a set of possible
innovations $\{v^m_t\}$, each a function of the measurement record $\{y_t\}$ and the
filtered state $\rho^m_t$. It should be clear that not every $v^m_t$ will be an
instance of a Wiener process. In fact the maximum likelihood estimate that we will
construct hinges upon the fact that not every $v^m_t$ will be a Wiener process. This
is because rather than computing the entire unknown and highly complicated statistics
of $y_t$, we will work with the known and simple statistics of the Wiener process
$v_t$. We then seek the candidate initial condition that makes the statistics of
$v^m_t$ most resemble a Wiener process.
Stepping back from the mathematics for a moment, converting Eq. (5.14) into
an expression for $p(\{y_t\}\,|\,\rho^m)$ via the innovation is deceptively simple.
We can write $v_t$ as

$$ v_t = v_s + y_t - y_s - 2\sqrt{\kappa} \int_s^t ds'\, \mathrm{Tr}(J_z\, \rho_{s'}), \tag{5.16} $$

or in other words,

$$ \Delta v_i \equiv y_{t_i} - y_{t_{i-1}} - 2\sqrt{\kappa} \int_{t_{i-1}}^{t_i} ds\, \mathrm{Tr}(J_z\, \rho_s). \tag{5.17} $$

From the nonanticipative construction of the Ito integral, we have, for the smallest
of possible time differences,

$$ \Delta v_i \approx \Delta y_i - 2\sqrt{\kappa}\, \Delta t_i\, \mathrm{Tr}(J_z\, \rho_{t_{i-1}}). \tag{5.18} $$
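The density for $\{y_t\}_{t\ge 0}$ is then made by simply substituting $\Delta v_i$
into Eq. (5.14). In other words, the likelihood for the increment variables
$y_i \equiv \Delta y_i$ is then given by

$$ p_n(y_1, y_2, \ldots, y_n\,|\,\rho^m) \approx \left( \prod_{i=1}^{n} (2\pi \Delta t_i)^{-\frac{1}{2}} \right) \exp\left( -\sum_{i=1}^{n} \frac{\big( y_i - 2\sqrt{\kappa}\, \Delta t_i\, \mathrm{Tr}(J_z\, \rho^m_{t_{i-1}}) \big)^2}{2 \Delta t_i} \right). \tag{5.19} $$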
The only way to maximize Eq. (5.19) with respect to the initial condition
$\rho^m$ is to minimize the sum appearing in the exponent. If we set all of the time
increments to be equal, $\Delta t_i = \Delta t$, we can even factor out the
denominator and so the maximum likelihood estimate then becomes a problem of
minimizing the sum

$$ \mathrm{QV}(v^m_t) \equiv \sum_{i=1}^{n} (\Delta v^m_i)^2 = \sum_{i=1}^{n} \Big( \Delta y_i - 2\sqrt{\kappa}\, \Delta t\, \mathrm{Tr}(J_z\, \rho^m_{t_{i-1}}) \Big)^2. \tag{5.20} $$
This kind of object is called the quadratic variation¹ and Appendix B showed that
it is ultimately what gives rise to the rules of Ito calculus. So while it is a
relatively delicate mathematical object, it is well defined in the infinitesimal
limit. Furthermore, in proving the Ito rules, one shows that $\mathrm{QV}(w_t) = t$
with probability one, so that we expect, and observe numerically, that
$\mathrm{QV}(v^m_t) \sim t$ for most candidate initial conditions. It is often the
case that in Gaussian problems such as ours, a maximum likelihood estimate over a
Gaussian probability density simply becomes the least squares estimate. So in our
Monte Carlo search we have

$$ \arg\max_m\; p(\{y_t\}\,|\,\rho^m) = \arg\min_m\; \mathrm{QV}(v^m_t). \tag{5.21} $$
5.4.1 The reconstruction procedure
The Monte Carlo sampling estimator we have outlined follows this rough procedure:
1. Sample m pure Bloch vectors uniformly from the unit sphere.
2. For each sample state compute the forward time evolution conditional on the
measurement record $\{y_t\}$.
3. Compute the quadratic variations of the innovation processes for each
conditional state.
4. Select as the estimate the sample state that minimizes the quadratic variation
at the final time.

¹ Technically the quadratic variation is given in the infinitesimal limit [22].
In practice we need to modify this procedure in two respects. The first is that, due
to issues involving the stability of Markov filters [76], the above procedure suffers from
poor numerical stability when the hypothesis initial condition has very little overlap
with the true initial condition. To rectify this problem, we implement a two step
procedure, by first sampling m mixed initial conditions and then resampling, within
some solid angle, pure states about the direction of the most probable mixed state.
This issue will be discussed in detail in Sec. 5.4.2.
The second modification stems from the fact that propagating the full conditional
Schrodinger equation for a sufficient number of samples requires a large amount of
computer time. Fully propagating a spin J pure state requires 2J + 1 complex
numbers. The stochastic integrator we choose to use implements a weak second-order
predictor-corrector method (Kloeden et al. [69, page 200]) and empirically
requires a time step $\Delta t \sim 10^{-6}\, \kappa^{-1}$ to produce reliable
expectation values. When considering ensembles of mixed qubits it is not sufficient
to consider the maximum projection of the collective angular momentum; one must
instead consider all possible total angular momentum values one could construct with
n spin-1/2 particles. This requires a total density matrix of order $n^2 \times n^2$
in size [79].
In Chap. 4 we developed a projection filter that projected the conditional master
equation for the collection of n qubits onto the manifold of identical separable states,
which greatly reduces the computational demand. We also showed that in the presence
of strong randomized control, the projection filter tends to track the exact
expectation values with a RMS error of less than 5% of the total spin length. For
mixed initial conditions, rather than propagating matrices of dimension
$\sim n^2 \times n^2$ for each sample state, the projection filter allows us to
reduce this to tracking a single mixed Bloch vector, i.e. three real numbers. With
these modifications, the Monte Carlo separable least squares estimate is computed
through the pseudocode in algorithm 5.1.
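Algorithm 5.1 A Monte Carlo Separable Least Squares Estimate
  $\{r_m\} \leftarrow$ m uniformly random Bloch vectors with $|r| = r_{\rm mixed} < 1$
  for all $r_m \in \{r_m\}$ do
    $r^m_t \leftarrow$ Integrate Eq. (4.105) with record $\{y_t\}$ and initial value $r^m_0 = r_m$.
    $\mathrm{QV}(v^m_t) \leftarrow \sum_i \big( \Delta y_i - \sqrt{\kappa}\, n\, x^{3,m}_{t_{i-1}}\, \Delta t_i \big)^2$
  end for
  $r_{\min} \leftarrow \{ r_{m'} \in \{r_m\} : \mathrm{QV}(v^{(m')}_t) = \min_m \mathrm{QV}(v^m_t) \}$
  $\{r'_m\} \leftarrow$ m random Bloch vectors with $|r'| = 1$ and $r'_m \cdot r_{\min} / r_{\rm mixed} \ge \cos(\Theta_{\max})$
  for all $r'_m \in \{r'_m\}$ do
    $r^m_t \leftarrow$ Integrate Eq. (4.108) with record $\{y_t\}$ and initial value $r^m_0 = r'_m$.
    $\mathrm{QV}(v^m_t) \leftarrow \sum_i \big( \Delta y_i - \sqrt{\kappa}\, n\, x^{3,m}_{t_{i-1}}\, \Delta t_i \big)^2$
  end for
  $r'_{\min} \leftarrow \{ r'_{m'} \in \{r'_m\} : \mathrm{QV}(v^{(m')}_t) = \min_m \mathrm{QV}(v^m_t) \}$
  return $\rho(r'_{\min})$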
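The resampling step of the algorithm, drawing pure Bloch vectors within a cone about the direction of the best mixed candidate, can be sketched as follows. This is an illustrative implementation under stated assumptions: the samples are taken uniform in solid angle inside the cone, the value of the half-angle passed as `theta_max` is a free parameter of the example, and the function names are hypothetical.

```python
import math
import random


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Scale a nonzero 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def sample_in_cone(axis, theta_max, rng):
    """Unit vector drawn uniformly in solid angle within theta_max of `axis`."""
    ez = normalize(axis)
    # Any helper vector not parallel to ez works for building the frame.
    helper = (1.0, 0.0, 0.0) if abs(ez[0]) < 0.9 else (0.0, 1.0, 0.0)
    ex = normalize(cross(helper, ez))
    ey = cross(ez, ex)
    cos_t = rng.uniform(math.cos(theta_max), 1.0)  # uniform in cos(theta)
    sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return tuple(sin_t * math.cos(phi) * ex[k]
                 + sin_t * math.sin(phi) * ey[k]
                 + cos_t * ez[k] for k in range(3))


if __name__ == "__main__":
    rng = random.Random(0)
    r_min = (0.3, -0.2, 0.65)  # example mixed Bloch vector from the first pass
    print([sample_in_cone(r_min, 0.4, rng) for _ in range(3)])
```

Only the direction of the first-pass winner enters, so the same routine applies whatever mixed radius was used in the first stage.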
5.4.2 Coupled CMEs and filter stability
For a given hypothesis state $\rho^m$, there is a hidden “true” state $\rho^\star$
that generated the measurement record $\{y_t\}$. Using this measurement record to
propagate the hypothesis state ends up coupling $\rho^m$ to $\rho^\star$. This leads
to a coupled set of stochastic differential equations. By definition, the conditional
state $\rho^\star_t$ results in an innovations process that is a true Wiener process.
Explicitly, $\rho^\star_t$ evolves according to the stochastic differential equations

$$ d\rho^\star_t = -i[H_t, \rho^\star_t]\, dt + \kappa\, \mathcal{D}[J_z](\rho^\star_t)\, dt + \sqrt{\kappa}\, \mathcal{H}[J_z](\rho^\star_t)\, dw_t, \tag{5.22a} $$

$$ dy_t = dw_t + 2\sqrt{\kappa}\, \mathrm{Tr}(J_z\, \rho^\star_t)\, dt, \tag{5.22b} $$

where $w_t$ is the correct innovation process and is an unobserved Wiener process.
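The candidate initial condition $\rho^m_0$ and the measurement record $\{y_t\}$ are
the elements necessary for propagating $\rho^m_t$ in time via

$$ d\rho^m_t = -i[H_t, \rho^m_t]\, dt + \kappa\, \mathcal{D}[J_z](\rho^m_t)\, dt + \sqrt{\kappa}\, \mathcal{H}[J_z](\rho^m_t)\, dv^m_t, \tag{5.23a} $$

$$ dv^m_t = dy_t - \mathrm{Tr}\big( (L + L^\dagger)\, \rho^m_t \big)\, dt, \tag{5.23b} $$

where $L = \sqrt{\kappa}\, J_z$, so that
$\mathrm{Tr}((L + L^\dagger) \rho^m_t) = 2\sqrt{\kappa}\, \mathrm{Tr}(J_z \rho^m_t)$
as in Eq. (5.11). In terms of the unobserved Wiener process $w_t$, $\rho^m_t$ evolves as

$$ d\rho^m_t = -i[H_t, \rho^m_t]\, dt + \kappa\, \mathcal{D}[J_z](\rho^m_t)\, dt + \sqrt{\kappa}\, \mathcal{H}[J_z](\rho^m_t)\, dw_t - 2\kappa\, \mathcal{H}[J_z](\rho^m_t)\, \mathrm{Tr}\big( J_z (\rho^m_t - \rho^\star_t) \big)\, dt. \tag{5.24a} $$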
Note that if at some time t we happen to have the equality
$\rho^\star_t = \rho^m_t$, then we also have $d\rho^\star_t = d\rho^m_t$. But the
general state $\rho_\tau$ at some time τ > t can be written as
$\rho_\tau = \rho_t + \int_t^\tau d\rho_{t'}$. This implies that, because the
differentials are equal whenever the states are equal, we will have
$\rho^\star_\tau = \rho^m_\tau$ for every τ > t if $\rho^\star_t = \rho^m_t$.
One possible result of this coupling is that it acts as an attractor, always
decreasing the “distance” between the Jz projections of $\rho^\star_t$ and
$\rho^m_t$. This correction effect is known as filter stability. If the filter is
able to correct for certain modeling errors, it is stable. The differences in the
two initial states $\rho^\star_0$ and $\rho^m_0$ can be viewed as a modeling error
and the convergence $\rho^m_t \to \rho^\star_t$ is a correction of this error. This
is a well studied effect in both the quantum and classical settings; see [76] and
references therein.
In [76], Van Handel gave explicit criteria for when a quantum filter is stable for
an incorrect initial condition. For our purposes these criteria boil down to the
following two issues. The first is that the system must be observable, in that the
measurement record must be informationally complete. If we did not have a transverse
magnetic field, then the measurement statistics would only include information
about the eigenstates of Jz and so the system would not be observable. The second
issue is that the probability density for $y_t$ calculated with the true state
$\rho^\star$ must be absolutely continuous with respect to the density calculated
under the guessed state $\rho^m$. This is a term borrowed from classical probability
theory and embodies the concept that a probability measure $P_B$ is compatible with
observations that are actually governed by $P_A$. The quantum version is given by
the following definition. In order for $\rho_A$ to be absolutely continuous with
respect to $\rho_B$, for any projector $P$ in the von Neumann algebra generated by
$\{Y_t\}_{t\ge 0}$, $\mathrm{Tr}(P \rho_B) = 0$ must imply that
$\mathrm{Tr}(P \rho_A) = 0$. This need not be a two sided relationship, so that
$\rho_B$ need not be absolutely continuous with respect to $\rho_A$. These
requirements are not just important to the question of filter stability but also
apply to the Monte Carlo sampling procedure. As discussed previously, the
observability condition is vital in order to obtain a high fidelity estimate.
However absolute continuity is also quite important.
In a Kraus operator formulation of a continuous measurement, the state after the
measurement outcome i is updated as

$$ \rho \mapsto \rho|_i = \frac{A_i\, \rho\, A_i^\dagger}{\mathrm{Tr}(A_i^\dagger A_i\, \rho)}. \tag{5.25} $$
If the denominator $\mathrm{Tr}(A_i^\dagger A_i \rho) = 0$, then the update cannot be
made as it requires dividing by 0. However $\mathrm{Tr}(A_i^\dagger A_i \rho)$ is
also the probability for obtaining the outcome i, as calculated according to the
state ρ. Therefore, if the event i occurs with this probability, dividing by zero is
not an issue as it will never happen. Suppose we obtain the outcome i that occurred
with probability $\mathrm{Tr}(A_i^\dagger A_i \rho^\star) = p^\star_i$. Furthermore,
suppose we tried to update a state $\rho^m$ that had the audacity to assert
$p^m_i = \mathrm{Tr}(A_i^\dagger A_i \rho^m) = 0$. This results in a crisis of
conscience, as there is no way to incorporate this incompatible information into our
world view. The condition that $\rho^\star$ must be absolutely continuous with
respect to $\rho^m$ means that $p^m_i$ will never be zero without $p^\star_i$ also
being equal to zero.
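The failure mode described above can be made concrete with a single qubit. The sketch below applies the update of Eq. (5.25) and reports when a candidate state assigns zero probability to the observed outcome. It is a deliberately extreme illustration: the Kraus operator is taken to be a projector rather than a weak measurement operator, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def kraus_update(rho, A, tol=1e-12):
    """Apply Eq. (5.25); return (None, p) if the outcome is impossible under rho."""
    unnormalized = matmul(matmul(A, rho), dagger(A))
    # Tr(A rho A^dagger) equals Tr(A^dagger A rho), the outcome probability.
    p = (unnormalized[0][0] + unnormalized[1][1]).real
    if p <= tol:
        return None, p  # the candidate cannot incorporate this outcome
    return [[x / p for x in row] for row in unnormalized], p


if __name__ == "__main__":
    P0 = [[1, 0], [0, 0]]                    # outcome "spin up"
    pure_down = [[0, 0], [0, 1]]             # candidate asserting p = 0
    mixed_down = [[0.125, 0], [0, 0.875]]    # Bloch length 3/4 along -z
    print(kraus_update(pure_down, P0))
    print(kraus_update(mixed_down, P0))
```

The pure candidate has no valid update, while the mixed candidate along the same axis retains enough support on the orthogonal state to be updated, which is the motivation for seeding the search with mixed states.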
In principle any valid initial spin state could generate a given diffusive
measurement record. This can be easily seen by noting that the “true” innovations
process is given by

$$ v^\star_t = y_t - 2\sqrt{\kappa} \int_0^t ds\, \mathrm{Tr}(J_z\, \rho^\star_s) \tag{5.26} $$

and is a Brownian motion. Because $J_z$ takes on eigenvalues in the range
$-n/2 \le m_z \le n/2$, any candidate innovation $v^m_t$ will be within the range

$$ y_t - n\sqrt{\kappa}\, t \le v^m_t \le y_t + n\sqrt{\kappa}\, t. \tag{5.27} $$

For finite n, κ, and t, it is perfectly possible for a Brownian motion to obtain any
of these values; it is just increasingly unlikely. Therefore if the measurement record
is observable, the conditional master equation is in principle stable.
In practice, the numerical stability of states conditioned on highly improbable
measurements becomes an issue. Preliminary results that did not account for the
possibility of unstable trajectories showed nearly a 1% drop in the average
reconstruction fidelity from what we ultimately achieve. Investigating the cause of
this sub-optimal performance showed that the average fidelity was significantly
biased by outlier trajectories that gave estimated states that were nearly
orthogonal to the true state. The cause of these outliers was the numerical
instability of Monte Carlo sample points with very poor overlap with the true state.
The two step sampling procedure in algorithm 5.1 addresses this problem. Every
initial mixed single qubit state can be viewed as a convex combination of pure states
pointing along opposite directions. That is, if we have the possibly mixed single
qubit Bloch vector r with length 0 ≤ r ≤ 1 we have

$$ \rho(r) = \frac{1+r}{2}\, \rho(e_r) + \frac{1-r}{2}\, \rho(e_{-r}), \tag{5.28} $$

where $\rho(e_r)$ is a projector onto the pure SCS pointing in the $e_r$ direction.
This implies that by using initial mixed vectors each initial state has some support
over the orthogonal spin coherent state $\psi(e_{-r})^{\otimes n}$. In the numerical
simulations presented in Sec. 5.5 an initial mixed state vector of radius
$r_{\rm mixed} = 3/4$ provides enough of a signal to choose an appropriate direction
for the pure state resample as well as enough orthogonal support for the trajectories
to remain stable.
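The decomposition in Eq. (5.28) and the resulting support on the orthogonal SCS can be verified directly. The following is a small sketch using the standard single-qubit parametrization $\rho(r) = (1 + r \cdot \sigma)/2$; the function names, including `orthogonal_weight`, are hypothetical and chosen for this example.

```python
import math


def rho_from_bloch(r):
    """2x2 density matrix (1 + r.sigma)/2 as nested lists of complex numbers."""
    x, y, z = r
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]


def convex_decomposition(r):
    """Weights and pure Bloch vectors of Eq. (5.28) for a nonzero vector r."""
    length = math.sqrt(sum(c * c for c in r))
    e = tuple(c / length for c in r)
    minus_e = tuple(-c for c in e)
    return [((1 + length) / 2, e), ((1 - length) / 2, minus_e)]


def orthogonal_weight(length, n):
    """Weight of the orthogonal SCS term when rho(r) is tensored n times."""
    return ((1 - length) / 2) ** n


if __name__ == "__main__":
    r = (0.0, 0.45, 0.6)  # Bloch vector of length 3/4
    for weight, e in convex_decomposition(r):
        print(weight, e)
    print(orthogonal_weight(0.75, 25))
```

For a radius of 3/4 the single-qubit weights are 7/8 and 1/8, so the n-qubit product state keeps a small but strictly nonzero weight on the orthogonal spin coherent state.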
5.4.3 Backaction in continuous quantum measurement
In order to identify what impact the backaction has on the reconstruction fidelity, we
need to construct a similar but backaction-free estimator. A figure of merit commonly
used to consider the importance of backaction is the ratio of the “projection noise” to
the “shot-noise”. The projection noise is a description of the fluctuations (i.e. noise)
in a given observable if a projective measurement is made. As we are considering a
continuous measurement of $J_z$ with respect to a SCS, the relevant projection noise
is

$$ \langle \Delta J_z^2 \rangle_{\psi^{\otimes n}} = \frac{n}{4} \langle \Delta \sigma_z^2 \rangle_{\psi} = n\, p_{+1} (1 - p_{+1}), \tag{5.29} $$

where $p_{+1} = |\langle +1|\psi\rangle|^2$ is the probability to observe the
individual spin state to be in the +1 eigenstate of $\sigma_z$ [4].
The shot-noise describes the noise added by making a continuous measurement
over a finite time. To identify the order of magnitude of this additive noise, note
that from Eq. (5.26) we have

$$ y_t = v^\star_t + 2\sqrt{\kappa} \int_0^t ds\, \mathrm{Tr}(\rho^\star_s\, J_z) \tag{5.30} $$

and that $v^\star_t$ is a realization of Brownian motion. We would like to invert
this formula to arrive at a random variable whose statistics allow for an estimate
of $\mathrm{Tr}(\rho^\star_0 J_z)$. Suppose that we wished to model the system
completely ignoring the theory of continuous quantum measurement and that for times
0 ≤ s ≤ t, $\rho^\star_s$ evolves according to the Schrodinger equation. If we
further assume that $H_t = 0$, we then have that
$\mathrm{Tr}(\rho^\star_s J_z) = \mathrm{Tr}(\rho^\star_0 J_z)$ and so the classical
random variable

$$ j_z \equiv \frac{y_t}{2\sqrt{\kappa}\, t} = \mathrm{Tr}(J_z\, \rho^\star_0) + \frac{1}{2\sqrt{\kappa}\, t}\, v^\star_t \tag{5.31} $$

is Gaussian distributed with mean $\mathrm{Tr}(J_z \rho^\star_0)$ and
$\mathrm{Var}(j_z) = (4\kappa t)^{-1}$. It is this variance that is referred to as
the shot-noise added by the probe. Looking at the ratio of these two fluctuations we
have

$$ \zeta \equiv \frac{\langle \Delta J_z^2 \rangle_{\psi^{\otimes n}}}{\mathrm{Var}(j_z)} = 4 n \kappa t\, p_{+1} (1 - p_{+1}). \tag{5.32} $$
If the system is prepared in a SCS with $p_{+1} = 1/2$, then Sec. 4.6.3 showed that
we expect a maximum amount of spin squeezing or equivalently a large amount of
bipartite entanglement. In this case $\langle \Delta J_z^2 \rangle_{\psi^{\otimes n}}$
takes on its maximum value of n/4 and so ζ = nκt. When ζ ≫ 1 one expects a
significant contribution of quantum backaction in the system and therefore the
measurement effects must be accounted for [5]. In the uncontrolled spin squeezing
simulations of Sec. 4.6.3, we found $\xi^2_T \sim 10$ dB for n = 100 and κt = 0.2,
so that in the absence of strong
Hamiltonian controls, ζ = 20 indeed leads to a strongly nonclassical state.
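The figure of merit is simple enough to tabulate directly. The snippet below evaluates the projection noise, the shot-noise, and their ratio from the three preceding formulas; the function names are chosen for this example only.

```python
def projection_noise(n, p_plus):
    """Variance of J_z in a spin coherent state, Eq. (5.29)."""
    return n * p_plus * (1.0 - p_plus)


def shot_noise(kappa, t):
    """Variance of the estimator j_z in Eq. (5.31)."""
    return 1.0 / (4.0 * kappa * t)


def backaction_ratio(n, kappa, t, p_plus=0.5):
    """Ratio zeta of projection noise to shot-noise, Eq. (5.32)."""
    return projection_noise(n, p_plus) / shot_noise(kappa, t)


if __name__ == "__main__":
    # kappa is set to one, so t is expressed in units of 1/kappa.
    for n in (25, 40, 55, 70, 85, 100):
        print(n, backaction_ratio(n, kappa=1.0, t=0.2))
```

With κt = 0.2 and $p_{+1} = 1/2$ the ratio equals 0.2 n, so it already exceeds one for a handful of qubits and reaches 20 at n = 100.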
However, the above discussion assumed no controls. It is possible that with
the randomized controls, considering only the Hamiltonian evolution is sufficient to
obtain a high fidelity estimate. To make this comparison we formulate a
backaction-free estimator, one that only includes the Hamiltonian in the model for
the forward time dynamics. Rather than considering a measurement record where $y_t$
is given by Eq. (5.30), we instead propose a model where

$$ y_t \approx w_t + 2\sqrt{\kappa} \int_0^t ds\, \mathrm{Tr}(J_z\, \rho^\star(s)) \tag{5.33} $$

and $\rho^\star(t)$ is the solution to the Schrodinger equation

$$ \frac{d}{dt} \rho^\star(t) = -i[H_t, \rho^\star(t)] \tag{5.34} $$

and $w_t$ is a Wiener process.
To make a fair comparison, this backaction-free estimator will also be implemented
through a Monte Carlo sampling procedure. We use an algorithm similar to
algorithm 5.1, but with two modifications. The first is that the two step sampling
procedure is unnecessary because there are no conditional dynamics to cause numerical
instability. The second is that because the dynamics are linear, the Schrodinger
evolution in Eq. (5.34) is most efficiently computed in the Heisenberg picture. In
the Heisenberg picture, we simply need to integrate the time evolution of the Jz
observable once and then compute its expectation value with each candidate state.
Furthermore in this decoherence free model the system state will always remain in a
separable state and so we need only consider the Heisenberg evolution for the single
qubit Pauli operator, $\sigma_z$. In other words,

$$ 2\sqrt{\kappa}\, \mathrm{Tr}(J_z\, \rho^m(t)) = \sqrt{\kappa}\, n\, \langle \psi_m | \sigma_z(t) | \psi_m \rangle, \tag{5.35} $$

where $\sigma_z(t)$ is the solution to the Heisenberg equation of motion

$$ \frac{d}{dt} \sigma_z(t) = +i[H_t, \sigma_z(t)], \quad \text{with } \sigma_z(0) = \sigma_z. \tag{5.36} $$
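The pseudocode for the backaction-free estimator is given in algorithm 5.2.
Algorithm 5.2 A Monte Carlo Backaction-free Estimate
  $\sigma_z(t_i) \leftarrow$ Integrate Eq. (5.36) and evaluate at times $t_i$.
  $\{r_m\} \leftarrow$ m uniformly random Bloch vectors with $|r| = 1$
  for all $r_m \in \{r_m\}$ do
    $\mathrm{QV}(v^m_t) \leftarrow \sum_i \big( \Delta y_i - \sqrt{\kappa}\, n\, \mathrm{Tr}(\sigma_z(t_{i-1})\, \rho(r_m))\, \Delta t_i \big)^2$
  end for
  $r_{\min} \leftarrow \{ r_{m'} \in \{r_m\} : \mathrm{QV}(v^{(m')}_t) = \min_m \mathrm{QV}(v^m_t) \}$
  return $\rho(r_{\min})$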
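A compact version of the backaction-free estimator can be written with nothing more than three-dimensional rotations. The sketch below is illustrative, not the simulation code behind the reported results, and it rests on several assumptions: the record is generated from the backaction-free model of Eq. (5.33) rather than from the conditional dynamics, the Heisenberg-picture operator is tracked through the cumulative Schrodinger-picture rotation R(t) of the Bloch vector (so that $\sigma_z(t) = a(t) \cdot \sigma$ with a(t) the third row of R(t)), and the step counts and function names are choices made for the example.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a point uniformly from the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def rotation_matrix(axis, angle):
    """Rodrigues formula: rotation by `angle` about the unit vector `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    C = 1.0 - c
    return [[c + x * x * C, x * y * C - z * s, x * z * C + y * s],
            [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
            [z * x * C - y * s, z * y * C + x * s, c + z * z * C]]


def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def heisenberg_sigma_z(axes, steps_per_gate):
    """Coefficients a(t_{i-1}) with sigma_z(t) = a(t).sigma, one per time step."""
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    dtheta = (math.pi / 2.0) / steps_per_gate
    rows = []
    for axis in axes:
        step = rotation_matrix(axis, dtheta)
        for _ in range(steps_per_gate):
            rows.append(tuple(R[2]))  # <sigma_z>(t) = (third row of R) . r(0)
            R = matmul3(step, R)
    return rows


def simulate_record(rows, r_true, n, kappa, dt, rng):
    """Increments dy_i = dw_i + sqrt(kappa) n a(t_{i-1}).r dt (no backaction)."""
    gain = math.sqrt(kappa) * n
    return [rng.gauss(0.0, math.sqrt(dt))
            + gain * sum(a * r for a, r in zip(row, r_true)) * dt
            for row in rows]


def estimate(dy, rows, candidates, n, kappa, dt):
    """Return the candidate Bloch vector minimizing the quadratic variation."""
    gain = math.sqrt(kappa) * n

    def qv(r):
        return sum((y - gain * sum(a * c for a, c in zip(row, r)) * dt) ** 2
                   for y, row in zip(dy, rows))

    return min(candidates, key=qv)


if __name__ == "__main__":
    rng = random.Random(1)
    n, kappa, tau, gates, steps = 100, 1.0, 5e-3, 40, 10
    dt = tau / steps
    axes = [random_unit_vector(rng) for _ in range(gates)]
    rows = heisenberg_sigma_z(axes, steps)
    r_true = random_unit_vector(rng)
    dy = simulate_record(rows, r_true, n, kappa, dt, rng)
    candidates = [random_unit_vector(rng) for _ in range(300)]
    r_hat = estimate(dy, rows, candidates, n, kappa, dt)
    print("fidelity:", 0.5 * (1.0 + sum(a * b for a, b in zip(r_hat, r_true))))
```

The operator evolution is computed once and shared by all candidates, so each additional sample costs only a sum of inner products, which is the efficiency argument made for working in the Heisenberg picture.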
5.5 Numeric Simulations
This section presents the results of numerical simulations, comparing algorithms
5.1 and 5.2 to the optimum POVM bound in Eq. (5.4). The bound 〈F〉opt =
(n + 1)/(n + 2) gives the average fidelity of a single POVM where the average is
taken over measurement outcomes as well as an average over possible input SCS.
Therefore, the results of these simulations are reported as an average over an ensemble of
ν trials. All results in this section use ν = 1000 trials.
5.5.1 Simulation parameters
For each trial, we choose a single qubit Bloch vector from a distribution that is
uniform over the surface of the unit sphere. We then use this vector to generate
SCSs composed of n qubits. These simulations use the qubit numbers n =
25, 40, 55, 70, 85, and 100. Then for each initial state and each number of qubits,
we generate a single measurement realization yt and use this record to then estimate
the initial Bloch vector. In total 6000 measurement records were generated.
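The trial layout described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the original simulation code. The sampler uses the standard fact that a uniform point on the sphere has a uniformly distributed z coordinate and azimuthal angle; the names `uniform_bloch_vector` and `trial_plan` are hypothetical.

```python
import math
import random

QUBIT_NUMBERS = (25, 40, 55, 70, 85, 100)
NUM_TRIALS = 1000  # nu in the text


def uniform_bloch_vector(rng):
    """Sample a point uniformly on the unit sphere (z and phi both uniform)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def trial_plan(rng):
    """List one (n, r0) pair per measurement record to be generated."""
    plan = []
    for _ in range(NUM_TRIALS):
        r0 = uniform_bloch_vector(rng)  # one initial Bloch vector per trial
        for n in QUBIT_NUMBERS:         # reused for every qubit number
            plan.append((n, r0))
    return plan
```

With six qubit numbers and 1000 trials, `len(trial_plan(rng))` is 6000, matching the number of measurement records quoted in the text.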
Every simulation uses the same control Hamiltonian, where the randomized piece-
wise constant control vector b(t) was generated at the start of the simulation. The
directions of rotation are again distributed uniformly across the unit sphere and no
attempt was made to select an optimum realization. The parameters that fully constrain
the simulation are the measurement strength κ, the final measurement time
$t_f$, and the control gate period τ. With no other scales in the problem we choose to
essentially set κ to one and discuss the remaining two parameters in units of $\kappa^{-1}$.
In Chap. 4 we found that the separable approximation is valid in the regime where
the randomizing magnetic field strength satisfies $\kappa \ll b_0$. Fixing the strength to generate
a π/2 rotation in one gate period τ means that $b_0 = \pi/(2\tau)$, implying that
$\kappa\tau \ll 1$. We also found that a gate period $\tau = 5 \times 10^{-3}\,\kappa^{-1}$ gave less than a 5%
RMS tracking error for the separable projection filter, (see Sec. 4.6), and places b0
two orders of magnitude greater than κ.
For n = 25 to 100 qubits we find that the reconstruction fidelities have saturated
by a time $t \sim t_f = 0.2\,\kappa^{-1}$, which we fix as the final time for every simulation run. With
this final time and gate period, each simulation has 40 randomized π/2 rotations.
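As a quick check of the quoted numbers, using only the parameter values stated above:

```latex
\frac{t_f}{\tau} = \frac{0.2\,\kappa^{-1}}{5\times 10^{-3}\,\kappa^{-1}} = 40,
\qquad
\frac{b_0}{\kappa} = \frac{\pi}{2\kappa\tau} = \frac{\pi}{2\,(5\times 10^{-3})} \approx 314 .
```

The first ratio is the number of gate periods per record, and the second confirms that $b_0$ sits roughly two orders of magnitude above κ.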
To efficiently implement these simulations we exploit two conservation properties
of the system. The first is that because the total angular momentum operator $J^2$
commutes with the stochastic unitary of Eq. (5.7), the total angular momentum of
the system is conserved. This means that by initializing the system in a state of
maximum projection of angular momentum (i.e. in a pure SCS) we are initializing
the system in the eigenspace with total angular momentum J = n/2. Rather than
considering the entire $d = 2^n$ dimensional Hilbert space we only need to simulate
a spin J = n/2 particle and work in its d = 2J + 1 = n + 1 dimensional Hilbert
space. The second property is that the conditional master equation we consider here
maps pure states to other pure states, because it has no additional loss channel. This
means that we can in fact integrate a conditional Schrodinger equation rather than
a conditional master equation. This yields a substantial savings in computational
overhead, as we will be propagating a d = n + 1 complex vector in time, rather than a
d × d complex matrix. These two properties make it computationally feasible
to generate 1000 measurement records for a system containing 100 qubits.
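To make the size reduction concrete, the sketch below builds the nonzero matrix elements of the spin J = n/2 operators in the (n + 1)-dimensional symmetric subspace, using the standard ladder coefficients ⟨J, m+1|J₊|J, m⟩ = √(J(J+1) − m(m+1)). It is an illustrative construction in plain Python, not the thesis code; the basis ordering and function names are assumptions made here.

```python
import math


def hilbert_space_dimensions(n):
    """Return (full dimension 2^n, symmetric-subspace dimension n + 1)."""
    return 2 ** n, n + 1


def spin_operators(n):
    """Nonzero elements of J_z and J_+ for J = n/2.

    Basis ordering (an assumption of this sketch): m = J, J-1, ..., -J.
    Returns (dim, jz_diag, jplus), where jplus[k] = <m_k| J_+ |m_{k+1}>.
    """
    J = n / 2.0
    dim = n + 1
    ms = [J - k for k in range(dim)]
    # J_+ |J, m> = sqrt(J(J+1) - m(m+1)) |J, m+1>, for m = J-1, ..., -J
    jplus = [math.sqrt(J * (J + 1) - m * (m + 1)) for m in ms[1:]]
    return dim, ms, jplus
```

For n = 100 this gives a 101-component state vector, compared with 2^100 components in the full tensor-product space.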
The actual simulations are implemented in the MATLAB computing environ-
ment using a hand coded weak second-order predictor-corrector stochastic differen-
tial equation integrator. The algorithm is described in Kloeden et al. [69, page 200]
and was implemented in MATLAB by Brad Chase for his PhD dissertation [37].
Monte Carlo Parameters
The Monte Carlo separable estimator used 250 sample states for each part of the
two-step estimation. In the initial step, the 250 mixed states produce a sparse but
uniform covering of all possible SCS directions. A typical sampling has an average
angular separation between adjacent points of ∼ 6° and a maximum separation of
∼ 20°. As mentioned above, the mixed state radius of the Bloch vector used in these
simulations is $r_{\rm mixed} = 0.75$. In the second step, we sample 250 pure states that
are constrained to be no more than 45° from the most likely mixed state direction.
Example first and second step sampling distributions are shown in Fig. 5.1.
Figure 5.1: An Example Monte Carlo State Sampling Distribution. (left) An example of the initial mixed state sampling for the Monte Carlo algorithm, with m = 250 and $r_{\rm mixed} = 0.75$. (right) An example of resampling m = 250 states about the $+\mathbf{e}_x$ axis with $\Theta_{\max} = 45°$.
For the backaction-free comparison, we use a number of samples matching the
density of points in the second resample step. The resampled solid angle covers
approximately 15% of the Bloch sphere, meaning that m = 1700 samples cover the whole
sphere with roughly the same density of states. This number of samples leads to an
average fidelity between nearest neighbors of $\langle F \rangle_{\rm sample} = 0.9994$, meaning that if the
true ML estimate falls between two sample points, on average, the infidelity caused
by the Monte Carlo sampling will be on the order of $10^{-4}$. This is well below the
optimum POVM bound for the simulated qubit number and so any loss in fidelity
should not be attributable to sampling errors.
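The sample count can be checked from the solid angle of a spherical cap with half-angle $\Theta_{\max} = 45°$, assuming the resampled region is exactly such a cap:

```latex
\frac{\Omega_{\rm cap}}{4\pi} = \frac{1 - \cos 45^\circ}{2} \approx 0.146,
\qquad
m \approx \frac{250}{0.146} \approx 1700 .
```

Here 250 is the number of states drawn in the second step, so 1700 samples over the full sphere reproduce roughly the same point density.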
5.5.2 Results and discussions
[Figure 5.2 plot, titled "Average Fidelity of Reconstruction": horizontal axis n (25 to 100), vertical axis average fidelity (0.96 to 0.99); legend: The Bound, Separable, Schrodinger.]
Figure 5.2: A comparison of numerical reconstructions to the optimum bound (Color Online.) Data points show the average fidelity of single shot reconstruction as a function of the number of qubits n, averaged over ν = 1000 randomly chosen pure initial states. Blue circles show the separable estimator. Green diamonds show the backaction-free Schrodinger equation estimator. The optimum POVM bound is shown as a dotted line. Error bars show a standard error of $\pm\sqrt{\mathrm{Var}[F]/\nu}$.
Fig. 5.2 shows the results of the numerical simulations. The trial-averaged re-
construction fidelity is plotted as a function of the number of qubits in the system
for both the separable estimate (i.e. with backaction) and the Schrodinger evolved,
backaction-free estimator. The fidelity is computed by taking the squared overlap
between the single qubit state for that measurement record with the single qubit
state estimate. In other words, if the true qubit state is given by the Bloch vector r0
and the estimate reports the Bloch vector rm then the fidelity of that reconstruction
is given by $F = \tfrac{1}{2}(1 + \mathbf{r}_0 \cdot \mathbf{r}_m)$.
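The fidelity measure, the optimum bound (n + 1)/(n + 2), and the standard error used for the error bars are simple enough to sketch directly. The snippet below is an illustrative Python version. The function names are hypothetical, and the use of the unbiased (ν − 1) sample variance is an assumption, since the text does not say which variance estimator was used.

```python
import math


def fidelity(r0, rm):
    """Pure-qubit fidelity F = (1 + r0 . rm) / 2 between two Bloch vectors."""
    return 0.5 * (1.0 + sum(a * b for a, b in zip(r0, rm)))


def povm_bound(n):
    """Optimum average fidelity (n + 1) / (n + 2) for n copies of a pure qubit."""
    return (n + 1.0) / (n + 2.0)


def mean_and_standard_error(values):
    """Trial average and standard error sqrt(Var[F] / nu).

    Assumption: Var[F] is estimated with the unbiased (nu - 1) normalization.
    """
    nu = len(values)
    mean = sum(values) / nu
    var = sum((v - mean) ** 2 for v in values) / (nu - 1)
    return mean, math.sqrt(var / nu)
```

For the six qubit numbers simulated, `povm_bound` ranges from 26/27 ≈ 0.963 at n = 25 to 101/102 ≈ 0.990 at n = 100.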
In these numerical experiments, the separable Monte Carlo estimator shows a
significant improvement over a simple backaction-free estimator that considers only
the unitary evolution of the state due to the control fields. The discrepancy increases
as the number of qubits increases, keeping the duration of the measurement fixed.
Furthermore, the separable estimator almost achieves the optimum bound. The
deficit between the bound and the numerical averages never exceeds 0.21% with an
average of 0.16%, which is still above the expected error caused by the discrete Monte
Carlo sampling. A possible source for this deficit could be the separability assumption
in the projection filtering method, which is known to have a non-negligible tracking
error in the Jz expectation value (see Sec. 4.6).
The performance of the backaction-free Schrodinger estimator is best understood
by considering not just the estimate for the initial state given the entire measurement
record, but also the family of estimates created by taking only part
of the measurement record.
Estimator Bias
The Monte Carlo estimators take as input a measurement record y containing data
for times $t \in [0, t_f]$ and return an estimate for the initial state $\rho_0$. It is just as easy to
consider a whole family of estimates computed with only part of the total measurement
record, i.e., instead of using the entirety of y we use $y_s$ for $0 < s \le t_f$ in computing
the estimate. Ideally, having more data should only improve the estimate. However,
in order to use the data at times t > s we are required to compute an estimate for
the state of the system at time s. If this estimate is in fact inaccurate, then any
modeling errors might bias the conclusions drawn from future measurements.
Moreover, both of the estimators considered here have modeling errors. The separable
estimator uses the projection filtering equations, which explicitly remove any
entangling dynamics. The estimator based simply upon the unitary Schrodinger dynamics
commits a much greater sin: it completely ignores any effect the
measurement has on the system of qubits. Figures 5.3 and 5.4 indicate what effect
these modeling errors have on the average reconstruction fidelity.
[Figure 5.3 plot, titled "Separable Estimator": horizontal axis κt (0 to 0.2), vertical axis average fidelity (0.950 to 0.990) with grid lines at the POVM bound for each n (0.963, 0.976, 0.982, 0.986, 0.988, 0.990); traces labeled 25, 40, 55, 70, 85, 100.]
Figure 5.3: Reconstruction fidelity vs measurement duration. This plot shows the average reconstruction fidelities for the separable filter estimate, as a function of the length of the measurement record. Shown are traces for the 6 qubit numbers considered, which are (in order of decreasing reconstruction fidelity) 100, 85, 70, 55, 40, and 25 qubits respectively. The vertical axis is a linear scale, with grid lines indicating the optimum fidelity bound for these same numbers of qubits. The averaging was over ν = 1000 randomly chosen pure initial states.
Fig. 5.3 shows for the separable filter, the trial averaged reconstruction fidelities
for all 6 qubit numbers plotted against the duration of the measurement record.
It is clear from this figure that having a larger signal composed of more qubits
improves the final fidelity. It also shows how, as the number of qubits increases, the
fidelity improves at a faster rate. Furthermore, the modeling error introduced by the
separable approximation does not seem to significantly bias the estimate away from
an optimum sample state.
[Figure 5.4 plot, titled "Schrodinger Estimator": horizontal axis κt (0 to 0.2), vertical axis average fidelity (0.950 to 0.990) with grid lines at the POVM bound for each n (0.963, 0.976, 0.982, 0.986, 0.988, 0.990); traces labeled 25, 40, 55, 70, 85, 100.]
Figure 5.4: Reconstruction fidelity vs measurement duration. This plot shows the average reconstruction fidelities for the backaction-free Schrodinger estimate, as a function of the length of the measurement record. Shown are traces for the 6 qubit numbers considered, which are (in order of decreasing reconstruction fidelity) 100, 85, 70, 55, 40, and 25 qubits respectively. The vertical axis is a linear scale, with grid lines indicating the optimum fidelity bound for these same numbers of qubits. The averaging was over ν = 1000 randomly chosen pure initial states.
Fig. 5.4 shows an identical plot for the backaction-free Schrodinger estimate,
showing that a larger number of qubits improves the reconstruction fidelity with a
higher fidelity estimate at shorter measurement times. However, it also shows that
not including the backaction in the model significantly decreases the reconstruction
fidelity at longer measurement times. This bias tends to be more pronounced as
the number of qubits increases.
While the peak reconstruction fidelities between the two methods seem to be
comparable (certainly within error bars), the fact that the Schrodinger evolution is
biased away from an optimum state shows the importance of including the backaction
in the dynamical model.
Chapter 6
Summary and Outlook
We conclude with a summary of each research chapter and a discussion of the possible
avenues this research might take in the future.
6.1 Quantum optics and quantum stochastic differential equations
Chap. 2 derived a relation between quasi-monochromatic traveling wave packets
and the bosonic Fock space necessary for defining a formal quantum Ito stochastic
calculus. We identified a limit where the continuous-time tensor product decom-
position is consistent with a quasi-monochromatic approximation. The limit was
ultimately enforced by convolving any bounded, square-integrable complex function
with a smoothing kernel, constraining the resulting object to be slowly-varying in
time. A suitable white noise limit was identified when the kernel approached a
delta function in conjunction with a limit where its derivative remained
infinitesimal when compared to an optical period. This produced a separation of
three timescales. The resulting quantum stochastic integral is on the slowest scale,
the delta correlated smoothing kernel is in the middle, and the fastest is an optical
period.
This version of a quantum white noise limit is new and distinct from two existing
explanations. The first is a static picture of a bosonic heat bath lacking any dynam-
ical flow of information, which does not capture the fundamental propagation of a
traveling wave field [50, 51]. In an alternative description, Accardi et al. derive a
quantum white noise limit through a rescaling argument [54, 55]. If the system-field
coupling Hamiltonian had a fundamental interaction strength of λ, then by rescaling
time as $t \to t/\lambda^2$ and taking the limit λ → 0, a quantum white noise operator
appears. While this is no doubt mathematically correct, it is in our opinion ad
hoc, as requiring time to be not just big, but specifically $1/\lambda^2$-big is an artificial
constraint. Here we do require a white noise limit in a “middle timescale”, meaning
that the smoothing envelope approaches a delta function (σ → 0) while the timescale
of one carrier oscillation tends to zero faster ($(\sigma\omega_0)^{-1} \to 0$). We do not require any
fixed or delicate scaling law. Additionally, in our model the QSDE treatment stands
independently from any system-field interaction, as long as that coupling respects
the above approximations.
After formulating this quantum white noise approximation, we rederived a
QSDE description of the propagator. This derivation applies the recently derived
quantum Wong-Zakai theorem [43]. This result describes how a general interaction
involving scattering operations converges to a valid QSDE. This result allows for
a stochastic description of a dispersive Faraday interaction in a regime of a weak
drive but high optical density. A fundamental characteristic of any scattering based
propagator is that it admits the possibility of having multiple scattering events in
the intermediate timescale. It effectively renormalizes over this effect.
The scattering propagator we derive is similar to a propagator derived by Bouten
and Silberfarb [80]. They derive a polarizability interaction for a 4-level atom starting
from a QSDE expression for a quantum field coupling two ground states, via two excited
states. They then adiabatically eliminated the excited atomic states under the
usual approximations of weak excitation. The resulting propagator contained similar
but not identical scattering processes. The difference is that the resulting integrands
for the $d\Lambda^{rr}_t$ and $d\Lambda^{ll}_t$ terms only contain single scattering events. By starting from
a QSDE for a dipole interaction and then eliminating an atomic manifold, the atoms
are only capable of making a single scattering transition in one intermediate time
increment. It is unclear at this point which model is more applicable for describing
the underlying physics or if both are equally “wrong”, just in different ways. It
should be noted that both models agree in the limit of a weak forward scattering
rate and negligible spontaneous emission. More analysis is clearly needed to settle
any debate.
6.2 Classical and quantum probability theory
Chap. 3 served as a review of both classical and quantum probability theory. It
ultimately focused on the mapping, via the spectral theorem, between sets of commuting
observables and a classical probability space. When a family of operators
$\{Y_i\}_{i \in I}$ pairwise commute, the underlying projectors, i.e., the spectral measure $P(d\lambda)$,
define the classical probability model $(\Omega, \mathcal{F}, \mathbb{P})$. The sample space Ω is the set of labels
λ, $\mathcal{F}$ is the smallest σ-algebra over Ω, and the probability measure is defined by
the quantum expectation value for a projector associated with any element in $\mathcal{F}$,
$\mathbb{P}(d\lambda) = \mathrm{Tr}(\rho\, P(d\lambda))$.
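A minimal worked instance of this mapping, offered only as an illustration, is a single qubit observed through $\sigma_z$. With $\rho = \tfrac{1}{2}(I + \mathbf{r} \cdot \boldsymbol{\sigma})$ and spectral projectors $P_{\pm}$, the induced classical model has sample space $\Omega = \{+1, -1\}$ and

```latex
P_{\pm} = \tfrac{1}{2}\,(I \pm \sigma_z),
\qquad
\mathbb{P}(\pm 1) = \mathrm{Tr}(\rho\, P_{\pm}) = \tfrac{1}{2}\,(1 \pm r_z),
\qquad
\mathbb{P}(+1) + \mathbb{P}(-1) = 1 .
```

The Bloch components $r_x$ and $r_y$ do not enter, reflecting that only operators commuting with $\sigma_z$ are described by this classical model.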
The utility of this mapping is that it allows us to define a classical stochastic
process that is in some sense equivalent to a QND observable. When restricting
our attention to operators that commute with the projectors defining the classical
space, we are able to treat any operator $A = \int_\Omega a(\lambda)\, P(d\lambda)$ in terms of the classical
random variable a. When the eigenvectors of A do not form a complete basis in
the underlying Hilbert space, there exist operators X that commute with A but
still must be treated quantum mechanically. Rather than being a liability, this is
in fact a feature, as it allows us to simplify a system-probe interaction by treating
the system quantum mechanically and the probe as a classical random variable.
We call the resulting description semiclassical, but it still captures the physics of a
system-probe-measurement model, when $\{Y_i\}_{i \in I}$ are the measured probe observables.
However a careful physicist should always check that any operator X considered in
this model commutes with every Yi.
We then apply this formalism to rederive the conditional master equation, when
measurements are generated from the observation process $\{Y_t = U_t^\dagger (A_t + A_t^\dagger) U_t\}_{t \ge 0}$ in
vacuum expectation. This derivation is initially performed in the Heisenberg picture
and is then referred to as a quantum filter. In the quantum filtering language the filtered
observables $U_t^\dagger X U_t$ are still operators, written as $\pi_t(X)$, and are linear combinations
of the projectors $P(d\lambda)$. When converted to a conditional master equation, we
compute a system state $\rho_t$ that is defined on the system Hilbert space only. When
taking an expectation value of a system operator X, we have the equality
\[ \pi_t(X)\big|_{Y_t = y_t} = \mathrm{Tr}(\rho_t X) \tag{6.1} \]
for every system operator X and time t. We also described how the quantum filter
is equivalent to a purification of any generalized measurement scheme.
When the conditional master equation has complete information, and the ini-
tial condition is a pure state, then it is sufficient to use an equivalent Schrodinger
equation and calculate the random system state vector |ψt〉. Given a correct and
complete description of the system, a conditional Schrodinger equation is identi-
cal to the stochastic Schrodinger equation derived in the quantum optics literature
and the choice of the measurement observables defines the specific unraveling of the
unconditioned master equation.
In a recent paper, Tsang and Caves define a quantum-mechanics-free
subsystem as a set of time-dependent operators subject to the constraint that the
operators commute for any time [81]. Given the set of Heisenberg picture operators
$\{O_i(t) : i = 1, \dots, n\}$, if $[O_i(t), O_j(t')] = 0$ for all i, j, t, and t′, then they are free of
the laws of quantum mechanics. This is exactly the same idea as the quantum to
classical mapping via the spectral theorem and relies on exactly the same principle.
We mention this result here as it is an example showing active research utilizing the
mapping between quantum and classical structures. The tools of classical probability
theory should have an important role to play in this line of research. In particular,
the concept of a commutant should be invaluable as it describes the set of operators
that are compatible with this subsystem.
The results of Chap. 5 would not have been possible without noting that the
fundamental noise process driving the conditional master equation is not simply a
Wiener process, but is instead the measurement realization itself. With different
initializations, the conditional master equation produces different innovations and
only a few of them will have the statistics of a Wiener process. From a statistical
perspective, the conditional master equation is an estimator that allows us to predict
the outcomes of measurements performed on the system, given an ancilla coupled
measurement record. Furthermore, in classical estimation theory, the concepts of
robustness and stability play important roles, as it is desirable for an estimator
to be robust to modeling imperfections and incorrect initializations. The stability of
the quantum filter shows that in most cases the conditional master equation is able to
correct itself given bad initial information. This supports the case that a conditional
quantum state should be viewed as a quantum analog to a classical estimator. Using
this perspective, it is possible that one might be able to formulate a variant of the
quantum filter, that is more robust to modeling errors or corrupting noise in the
measurement signal.
For real optical beams, the continuous-time tensor product decomposition was
only an approximation, and for short enough timescales the operators did not
strictly commute. Outside of this approximation, it is not immediately clear if one
could formulate a truly commutative space of operators for the purpose of defining
a conditional expectation. As the technology of ultrafast lasers progresses, it might
be possible to experimentally test a regime where subsequent optical measurements
almost commute and the conditional dynamics might then reveal surprising quantum
effects. Doing so would likely require formulating a conditional expectation on a
noncommutative von Neumann algebra, which in some cases is possible [76], but the
classical probabilistic interpretation is lost [25].
Von Neumann argued that a commuting approximation to almost commuting
observables was always possible [82]; however, this has been shown not to be the
case [83]. In the context of spin chains, Ogata recently established the existence
of commuting approximations for “macroscopic” observations [84]. It is likely that
these mathematically rigorous results will be invaluable in identifying the consistent
information embedded in a sequence of noncommuting observations.
6.3 Projection filtering for qudit ensembles
The methods of differential geometry are a set of powerful and flexible tools, readily
applied to a wide variety of problems. Orthogonal projections are fundamental to
quantum theory. Chap. 4 combines both of these tools to derive an approximation
to the conditional master equation for an ensemble of n qubits, given a diffusive
measurement of the collective angular momentum projection Jz, and in the presence
of strong global rotations. The approximation was based on the ansatz that if the
system was initialized in an identical tensor product state, $\varrho = \rho^{\otimes n}$, then it should
remain close to a state ${\rho'}^{\otimes n}$ of some single qubit state $\rho'$. The approximation was
made to find a modified evolution that preserved this symmetry. It was formulated
by projecting the conditional master equation, acting on the state $\varrho$, into the tangent
space of the manifold of states
\[ \mathcal{P} \equiv \{ \rho^{\otimes n} : \rho \text{ is a valid qubit state} \}. \tag{6.2} \]
We worked in a parametrization where the single qubit state is mapped to a vector
defined within the unit Bloch ball. We were able to derive an analytic expression for
a projection filter that describes the diffusion of the Bloch vector in a non-Euclidean
but isotropic space.
We subsequently tested the quality of the resulting approximation numerically for
pure spin coherent states. These simulations were performed for systems composed
of 25 ≤ n ≤ 100 qubits under a variety of conditions. We made this comparison
first without an external Hamiltonian, allowing the system to evolve only under
the action of the measurement. In an exact description this model would produce
a significant amount of spin squeezing, which is confirmed in the simulations. The
projection filter tracked the mean expectation values with ∼ 90% accuracy but, by
design, did not describe the correlations induced by the squeezing. We
then performed the same analysis in the presence of a Hamiltonian driving strong
randomized rotations. In this case, spin squeezing failed to significantly accumulate,
leading to a ≳ 95% agreement between the exact and projected mean expectation
values and an average fidelity > 80% between the projected and exact states for all
qubit numbers tested.
A natural extension of the projection filter is to move beyond qubits and consider
higher spin systems. Unfortunately, the simplicity of the Bloch sphere is lost for
d > 2. There certainly exist $d^2 - 1$ traceless, orthogonal, Hermitian matrices for
decomposing a d-dimensional quantum state. The problem is that in attempting to
formulate a mapping between valid quantum states and a $(d^2 - 1)$-dimensional ball,
you find that not every point inside the ball, or its surface, corresponds to a valid
quantum state [68]. The problem is that while the orthonormal matrices have a
number of useful features, they do not share the same spectra, and so the boundary
between valid and invalid states is not isotropic. For qubits, we happily ignored any
issues involving the boundary between valid and invalid states. It is likely that for
qudits willful ignorance may lead to disaster. The best course of action may be to
seek a more abstract representation of the state.
In addition to moving to a higher spin system, we can also consider correlated
states. One family of correlated states of general interest are spin squeezed states.
Finding a smooth parametrization for pure spin squeezed states is not difficult as the
canonical example of spin squeezing is generated by a specific Hamiltonian [70]. By
composing the one parameter group of squeezers with the group of SU(d) rotations,
it is likely that one can describe the space of pure spin-d squeezed states as a
$d^2$-dimensional manifold, barring any issues with linear independence. Whether or not
there is a wieldy metric induced on this space is a whole other question entirely.
An additional complication is the unavoidable fact that for a model to be at
all experimentally useful, it must be able to handle mixed states and decoherence.
Adding single qubit decoherence to the separable projection filter is a trivial
undertaking. Any map that acts identically and independently on each qubit is, by
definition, in the tangent space of identical separable states. The reason why our
simulations only considered pure state dynamics is because generating exact simulations
for n ∼ 50 qubits is quite challenging when the total angular momentum is
not a conserved quantity. Chase and Geremia derive a simulation technique that
requires only order $n^2$ parameters for exactly propagating n qubits under symmetry
preserving local decoherence [79]. By applying this or a similar method, we expect
to be able to extend our numerical tests to include some decoherence in the model.
While the algorithmic and “optimal” nature of the projection filter is appealing,
control and system engineers confronted by nonlinear problems have derived a number
of suboptimal but highly successful estimation techniques. Among these are
linearized extended Kalman filters [85], “unscented” Kalman filters [86], Monte
Carlo “particle” filters [87], symmetry preserving filters [88], and so on. Some or
all of these techniques may prove useful for partially observed quantum systems.
However, without a general mapping between quantum observables and classical
statistics, none of these tools would be applicable.
6.4 Qubit State Reconstruction
Chap. 5 applied the quantum filtering formalism to construct a tomographic estimate
for an unknown initial quantum state from an ensemble of identical copies experi-
encing a joint continuous measurement. We found a maximum likelihood estimate
of the initial state, based upon the statistics of a single continuous measurement
realization. The purpose of this work was to extend previous results using a continuous
measurement for quantum state tomography into a regime where the quantum
backaction significantly affects the measurement statistics. In a numerical study with
ideal conditions, we found that our estimate nearly saturates an optimum bound.
Derived by Massar and Popescu, this bound states that, given n copies of a pure
qubit state and no other prior information, the best average reconstruction
fidelity is $\langle F \rangle_{\rm opt} = (n + 1)/(n + 2)$.
The problem of identifying a tomographic estimate was mapped to a parameter
estimation problem, where the statistics of the measurement record parametrically
depended upon the initial qubit state. We then found that the likelihood function
for the measurement record was ultimately Gaussian, leading to an equivalence be-
tween a maximum likelihood estimate and a least-squares estimate. Maximizing the
likelihood function then ultimately reduced to minimizing the quadratic variation
of an innovation process, computed from the measurement record and a conditional
state estimate. When the conditional state corresponds to a “correct” description,
then the innovation is a Wiener process, setting the minimum value to be ∼ t.
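The statement that a correctly modeled innovation has quadratic variation close to t can be illustrated numerically. The sketch below is not taken from the thesis: it samples discrete Wiener increments, sums their squares, and compares the result with an "innovation" carrying an unmodeled constant drift. The drift value in the usage note and the function names are purely illustrative assumptions.

```python
import math
import random


def wiener_increments(t_final, steps, rng):
    """Independent Gaussian increments with variance dt = t_final / steps."""
    dt = t_final / steps
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(steps)]


def quadratic_variation(increments):
    """Discrete quadratic variation: the sum of squared increments."""
    return sum(dv * dv for dv in increments)


def add_unmodeled_drift(increments, drift, dt):
    """Innovation increments when a drift term `drift * dt` is left unmodeled."""
    return [dv + drift * dt for dv in increments]
```

For example, with `t_final = 0.2` and 20000 steps the quadratic variation of the pure Wiener increments is close to 0.2, while adding an unmodeled drift of 100 raises it by roughly `drift**2 * t_final * dt` = 0.02, which is the kind of excess the least-squares search penalizes.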
In order to make a numerical implementation computationally feasible, we ap-
proximated an exact innovation by one computed with the projection filter. As this
reconstruction procedure is tied to the quality of the projection filter, any improve-
ments in its accuracy will almost surely improve the reconstruction fidelity. We
expect that by extending the projection filter to include squeezed states, there will
be near perfect agreement between an innovation computed from the projection filter
and an innovation computed from the exact conditional master equation.
The natural extensions of the projection filter carry over to the case of state
reconstruction. The principle of finding a least-squares estimate is system independent
as long as the evolution is still described by a diffusive conditional master equation.
However, when moving to qudits, the number of parameters we need to estimate grows
unfavorably with d. It is likely that in the general case, it will no longer be feasible
to simply sample from the compact parameter space and select the most likely candi-
date. Parameter estimation in nonlinear statistical models is a well studied problem
for classical systems and we believe that a classical solution will be adaptable to the
quantum case. One possible avenue to investigate is a statistical importance and
resampling technique, a “particle filter” [87], which has already been adapted to a
quantum parameter estimation problem [89].
From our perspective, it is an open question as to whether or not minimizing the
innovation’s quadratic variation is the optimum statistical test to use. In hypothesis
testing, comparing the ratio of two likelihood functions has been shown to have the
most predictive power out of all statistical tests. From that fact, one possible method
for improving the reconstruction procedure is to compute the likelihood ratio between
each candidate state and a master equation initialized in the completely mixed state.
In hypothesis testing, the ratio is made by comparing the likelihood of the data being
generated from your model compared to a null hypothesis. For a quantum system
the most logical null hypothesis is the completely mixed state. It is possible that
by computing this likelihood ratio we will be able to better discriminate the signal
arising from the initial state from the signal caused by the quantum backaction.
It is still an open question as to why this continuous measurement scheme ap-
proaches the optimum bound computed by Massar and Popescu. Because the numer-
ical results perform so well, it is likely that the randomized controls are mapping the
continuous measurement to a uniform measure over all spin coherent states, as this
is known to achieve the optimum bound [74]. Understanding what effective POVM
a given controlled-continuous measurement implements will likely be a powerful result
in itself. Armed with that knowledge, a clever experimentalist could engineer
any number of complicated measurement protocols, not the least of which is
unambiguous state discrimination [63].
Appendix A
Paraxial Optics
This appendix reviews the paraxial wave equation and ultimately calculates the Fourier
transform of a paraxial mode function. The paraxial wave equation begins by assuming
a quasi-monochromatic solution to the wave equation that takes the form of a
rapidly oscillating plane wave, $\exp(+i(k_0 z - \omega_0 t))$, modulated by an envelope function
that changes slowly in both space and time. If there is a vector valued function
$U(x, t)$ satisfying the wave equation
$$\nabla^2 U(x, t) - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} U(x, t) = 0, \tag{A.1}$$
then we hypothesize a real-valued solution of the form U(x, t) = U (+)(x, t) + c.c.
where
$$U^{(+)}(x, t) = u^{(+)}(x, t)\, e^{+i(k_0 z - \omega_0 t)} \tag{A.2}$$
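and $u^{(+)}(x, t)$ is a slowly varying function. Slowly varying is characterized by the
inequalities
$$\left|\frac{\partial^2 u^{(+)}}{\partial z^2}\right| \ll k_0 \left|\frac{\partial u^{(+)}}{\partial z}\right| \ll k_0^2 \left|u^{(+)}\right| \tag{A.3a}$$
and
$$\left|\frac{\partial^2 u^{(+)}}{\partial t^2}\right| \ll \omega_0 \left|\frac{\partial u^{(+)}}{\partial t}\right| \ll \omega_0^2 \left|u^{(+)}\right|. \tag{A.3b}$$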
Based upon this assumption one can then neglect terms in the wave equation that are
second derivatives with respect to $z$ and $t$. The result is the paraxial wave equation,
$$\frac{1}{2k_0} \nabla_T^2 u^{(+)} + i \left( \frac{\partial u^{(+)}}{\partial z} + \frac{1}{c} \frac{\partial u^{(+)}}{\partial t} \right) = 0. \tag{A.4}$$
$\nabla_T^2$ denotes the Laplacian with respect to the remaining transverse coordinates
$x$, $y$, and we will denote the transverse direction as $x_T$. It is often said that
the paraxial approximation is valid under the assumption that the envelope function
u(+) varies slowly when compared to an optical period, 2π/ω0.
Under the change of variables (x; t)→ (xT , z; t− z/c) the paraxial wave equation
becomes independent of retarded time tr = t − z/c. Without a loss of generality it
can be assumed that
$$U^{(+)}(x, t) = f(t_r)\, u_T^{(+)}(x_T, z)\, e^{-i \omega_0 t_r} \tag{A.5}$$
for any slowly varying $f$. In this case we have,
$$\frac{1}{2k_0} \nabla_T^2 u_T^{(+)}(x_T, z) = -i \frac{\partial}{\partial z} u_T^{(+)}(x_T, z). \tag{A.6}$$
If we make the replacement $z \to t$ and $k_0 \to m/\hbar$ then Eq. (A.4) is identical to
a two-dimensional Schrodinger equation in free space. Furthermore in the Fourier
domain, each plane wave component is an eigenstate of the “Hamiltonian” $\frac{1}{2k_0} \nabla_T^2$,
ultimately implying,
$$u_T^{(+)}(k_T, z) = u_T^{(+)}(k_T, 0)\, e^{-i \frac{|k_T|^2}{2k_0} z}, \tag{A.7}$$
where $k_T = k_x e_x + k_y e_y$. Note that this Fourier transform is with respect to the
transverse components only and is still a function of the longitudinal component $z$.
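The phase factor in Eq. (A.7) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a Gaussian envelope $\exp(-|x_T|^2/w_0^2)$ at $z = 0$; the function names `propagate_component` and `gaussian_width` and all numerical values are hypothetical and do not appear in the text.

```python
import cmath
import math


def propagate_component(u0, kT_sq, z, k0):
    """Apply Eq. (A.7): multiply one transverse Fourier amplitude by
    exp(-i |kT|^2 z / (2 k0))."""
    return u0 * cmath.exp(-1j * kT_sq * z / (2.0 * k0))


def gaussian_width(w0, z, k0):
    """Width w(z) of a beam whose z = 0 envelope is exp(-|xT|^2 / w0^2).

    Its transverse Fourier transform is proportional to
    exp(-|kT|^2 w0^2 / 4). Multiplying by the phase in Eq. (A.7) replaces
    w0^2 by Q = w0^2 + 2i z / k0, so the magnitude of the envelope becomes
    exp(-|xT|^2 Re(1/Q)).
    """
    Q = complex(w0 ** 2, 2.0 * z / k0)
    return 1.0 / math.sqrt((1.0 / Q).real)


if __name__ == "__main__":
    k0 = 2.0 * math.pi / 780e-9   # hypothetical carrier wavenumber
    w0 = 50e-6                    # hypothetical beam waist
    z_rayleigh = k0 * w0 ** 2 / 2.0
    # Ratio of the width at one Rayleigh range to the waist.
    print(gaussian_width(w0, z_rayleigh, k0) / w0)
```

Because the propagator is a pure phase in the transverse Fourier domain, it leaves the modulus of every component unchanged; the spreading of the beam comes entirely from the relative phases between components.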
Taking the Fourier transform of the full solution $U(x, t) \to U(k, t)$,
$$U^{(+)}(k, t) = u_T^{(+)}(k_T, z = 0) \int \frac{dz}{\sqrt{2\pi}}\, e^{-i k_z z}\, e^{-i \frac{|k_T|^2}{2k_0} z}\, f(t - z/c)\, e^{i k_0 (z - ct)}. \tag{A.8}$$
If we change variables from taking the spatial transform with respect to $z$ to
transforming with respect to the retarded time $t_r$ we find that
$$U^{(+)}(k, t) = c\, \tilde{f}\!\left( \frac{c |k_T|^2}{2k_0} + c k_z - \omega_0 \right) u_T^{(+)}(k_T, 0)\, e^{-i c \left( \frac{|k_T|^2}{2k_0} + k_z \right) t} \tag{A.9}$$
where $\tilde{f}(\omega)$ is the temporal Fourier transform of $f$.
We know that regardless of any approximations the positive frequency component
of a traveling wave solution evolves according to $U^{(+)}(k, t) = U^{(+)}(k, 0)\, e^{-i c |k| t}$.
Evidently, the paraxial approximation is an approximation that
$$\omega(k) = c\, |k| \approx c \left( \frac{|k_T|^2}{2k_0} + k_z \right). \tag{A.10}$$
This approximation seems slightly at odds with the fact that |k| is strictly nonneg-
ative, while the right-hand side of Eq. (A.10) extends to negative frequencies. This
issue is resolved by the assumption that $u^{(+)}(x, t)$ is a slowly varying function or
equivalently $u^{(+)}(k, t)$ is a sharply peaked function about $k = 0$. The resolution is
that because of the carrier plane wave, $U^{(+)}(k, t)$ is a sharply peaked function about
$\mathbf{k}_0 = k_0 \mathbf{e}_z$. While in principle $\omega(k)$ can be negative, these negative components
never contribute as long as the paraxial approximation holds. Finally we find that
$$U^{(+)}(k, t) = c\, \tilde{f}\left( \omega(k) - \omega_0 \right) u_T^{(+)}(k_T, 0)\, e^{-i \omega(k) t}. \tag{A.11}$$
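The approximation in Eq. (A.10) can be recovered by a short expansion. The following is a sketch under the assumptions $k_z > 0$, $|k_T| \ll k_z$ and $|k_z - k_0| \ll k_0$, which is how the sharply peaked condition described above is read here:

```latex
\begin{aligned}
c\,|k| &= c \sqrt{k_z^2 + |k_T|^2}
        = c\, k_z \sqrt{1 + \frac{|k_T|^2}{k_z^2}} \\
       &\approx c \left( k_z + \frac{|k_T|^2}{2 k_z} \right)
        \approx c \left( k_z + \frac{|k_T|^2}{2 k_0} \right).
\end{aligned}
```

The first approximation drops terms of order $|k_T|^4 / k_z^3$, and the second replaces $k_z$ by $k_0$ only inside the already small transverse correction.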
Appendix B
Classical Stochastic Calculus
This appendix reviews the derivation of the Stratonovich and Ito integrals as well as
the rules of Ito calculus. We attempt to describe the salient points found in standard
texts while leaving out the proofs.
Stochastic integration, in either the quantum or classical sense begins from a form
of functional integration. The traditional Riemann integral takes a function f(t)
and integrates with respect to a small difference in its argument ∆t. In a functional
integral, as defined by Stieltjes, the function $f(t)$ is integrated with respect to a small
difference in another function $g(t)$, i.e.
$$\int_0^t f(s)\, dg(s) \equiv \lim_{n \to \infty} \sum_{i=1}^{n} f(t_i^*) \big( g(t_i) - g(t_{i-1}) \big) \tag{B.1}$$
where $t_{i-1} \le t_i^* \le t_i$. This limit can be shown to make sense if $f$ and $g$ are reasonably
well behaved. One limitation is that $g$ can’t vary “too much” over a time interval $\Delta t$
(the total variation of $g$ must be finite) [53]. Like the usual definition of a Riemann
integral, the convergence of this integral does not depend upon where $t_i^*$ lies in the
interval $[t_{i-1}, t_i]$.
A stochastic integral, often called a stochastic differential equation (SDE), re-
places both functions f and g by stochastic processes. However in this replacement
the problems of working with nondifferentiable functions lead to a more delicate
situation. The fact that Brownian motion has a nowhere smooth trajectory means
that its total variation is infinite, leading to a divergence in a Riemann-Stieltjes
limit [22]. Not only does this mean that we are forced to consider a different kind
of limit, but the choice of $t_i^*$ makes a dramatic difference to its mathematical and
statistical properties. Specifically, if $t_i^* = t_{i-1}$, one arrives at an Ito integral, which
has several desirable statistical properties but does not obey the chain rule as seen
in ordinary calculus. If, however, $t_i^*$ is taken at the midpoint of the interval, giving
a Stratonovich integral, then the rules of calculus are preserved but the statistical
properties are more involved. Fortunately there exists a simple conversion between
the two integral definitions. We will discuss all of these issues in greater detail in the
following section.
Concretely, the Ito and Stratonovich integrals begin with the following definitions.
Consider the partitioning of the time interval $[0, \tau]$ into an increasingly dense mesh
of $n$ ordered times $\{t_i : 0 < t_1 < \cdots < t_n = \tau\}$, $n \in \mathbb{N}$. The Ito integral takes the
(time-adapted) process $\{x_t\}_{t \ge 0}$ and defines the integral of the well-behaved function
$b(x_t)$, with respect to the Wiener process $w_t$, to be
$$\int_0^\tau b(x_t)\, dw_t \equiv \lim_{n \to \infty} \sum_{i=1}^{n} b(x_{t_{i-1}})\, (w_{t_i} - w_{t_{i-1}}). \tag{B.2}$$
This limit is then shown to converge to an almost unique object, with probability
one.¹ The Stratonovich integral takes a different definition,
$$\int_0^\tau b(x_t) \circ dw_t \equiv \lim_{n \to \infty} \sum_{i=1}^{n} b\!\left( \frac{x_{t_i} + x_{t_{i-1}}}{2} \right) (w_{t_i} - w_{t_{i-1}}). \tag{B.3}$$
These two definitions arrive at fundamentally different, but not unrelated, integrals.
One integral can be converted to another by using a simple trick, derived in Sec.
B.1.1.
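The difference between the two definitions can be seen numerically. The sketch below discretizes Eqs. (B.2) and (B.3) for the illustrative choice $b(x) = x$ and $x_t = w_t$, which is an assumption made here and not an example taken from the text; the function name and its parameters are likewise hypothetical.

```python
import math
import random


def ito_and_stratonovich_sums(n_steps, tau=1.0, seed=0):
    """Discretize the integral of w_t against dw_t on [0, tau] in both
    conventions for one sampled Wiener path.

    Returns (ito_sum, stratonovich_sum, w_tau).
    """
    rng = random.Random(seed)
    dt = tau / n_steps
    w = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment, variance dt
        w_next = w + dw
        ito += w * dw                       # left endpoint, as in Eq. (B.2)
        strat += 0.5 * (w + w_next) * dw    # averaged endpoints, as in Eq. (B.3)
        w = w_next
    return ito, strat, w
```

For this integrand the Stratonovich sum telescopes to $w_\tau^2 / 2$, the answer ordinary calculus would give, while the Ito sum differs from it by half the sum of the squared increments, which approaches $\tau / 2$ as the mesh is refined.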
¹ There is a slight caveat where one could add another random process which happens to have zero probability of ever occurring.
Before moving on to discussing the operational and statistical properties of these
integrals, it is worth noting that in attempting to model a classical physical system
with SDEs the choice of calculus is crucial. Fortunately, the question as to
which calculus to use is answered by the Wong-Zakai theorem [90] (see the introduction
to Appendix D). In that paper Wong and Zakai showed that an ordinary differential
equation containing a piecewise smooth approximation to Brownian motion limits
to a Stratonovich equation and not an Ito equation. If one derives an equation of motion
for a system including an approximation to Brownian motion then the solution
to that equation must be interpreted in the Stratonovich sense.
B.1 Ito Calculus
The rules of Ito calculus can be derived with varying levels of detail and sophistica-
tion. Their practical purpose is to give a method for manipulating and combining
multiple Ito integrals into new and different expressions. The bottom line result is
that the standard differential product rule, $d(fg) = f\, dg + g\, df$, must be extended to
include a second order correction, see Eqs. (B.14 - B.15). An often cited reference
for the derivation of the Ito integral and Ito calculus is the book by Oksendal [22].
There he shows how the limits in Eqs. (B.2) and (B.3) may or may not converge.
Specifically, the standard techniques for defining an integral with respect to a Riemann
sum fail, because in doing so you ultimately consider the quantity, called the
total variation,
$$\lim_{\Delta t \to 0} \sum_{i=1}^{n} |w_{t_i} - w_{t_{i-1}}| \tag{B.4}$$
for the partition of times $a = t_0 < \cdots < t_n = b$. What you can show is that with
probability one, this is infinite for the Wiener process. However you can also show
that if, instead of summing the absolute value of each increment, $|w_{t_i} - w_{t_{i-1}}|$, you sum
the square of each increment, $(w_{t_i} - w_{t_{i-1}})^2$, then you obtain a finite quantity. This
is called the quadratic variation, and with probability one,
$$\lim_{\Delta t \to 0} \sum_{i=1}^{n} (w_{t_i} - w_{t_{i-1}})^2 = b - a. \tag{B.5}$$
What this is saying is that while the Wiener process travels an infinite absolute
distance in any finite time, its RMS displacement only grows like the square root of
time. Using the fact that the Wiener process has a well behaved quadratic variation,
the limits of the Ito and Stratonovich integrals make sense if you consider their
squared expectation value, the so called L2(P) limit.
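The contrast between Eqs. (B.4) and (B.5) can be checked with a small simulation. This is a minimal sketch: the function name `variations` is hypothetical, and each call samples a fresh path on a uniform mesh rather than refining a single path, which is a simplification made here.

```python
import math
import random


def variations(n_steps, a=0.0, b=1.0, seed=0):
    """Sample Wiener increments on a uniform mesh of [a, b] and return
    (sum of |increments|, sum of squared increments)."""
    rng = random.Random(seed)
    dt = (b - a) / n_steps
    total = 0.0
    quadratic = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        total += abs(dw)        # summand of Eq. (B.4)
        quadratic += dw * dw    # summand of Eq. (B.5)
    return total, quadratic
```

Since each increment has magnitude of order $\sqrt{\Delta t}$, the first sum scales like $\sqrt{n}$ and keeps growing as the mesh is refined, while the second stays close to $b - a$.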
For a flavor of how a squared expectation value might make sense for a Wiener
process, observe that as it is constructed to have Gaussian statistics for any finite
interval,
$$\lim_{n \to \infty} E\left( \Big( \sum_{i=1}^{n} (w_{t_i} - w_{t_{i-1}}) \Big)^2 \right) = E\left( (w_t - w_s)^2 \right) = t - s. \tag{B.6}$$
The way this is turned into an integral is that if one has the time-adapted stochastic
process $b_t$, i.e. it is assumed to be independent of $w_{t'}$ for times $t' > t$, then it can be
shown that
$$\lim_{n \to \infty} E\left( \Big( \sum_{i=1}^{n} b_{t_{i-1}} (w_{t_i} - w_{t_{i-1}}) \Big)^2 \right) = \lim_{n \to \infty} E\left( \sum_{i=1}^{n} b_{t_{i-1}}^2\, (t_i - t_{i-1}) \right) \tag{B.7}$$
as long as $E\left( \int_s^t b_t^2\, dt \right) < \infty$ [22]. A fair amount of analysis goes into showing for
what kinds of processes a piecewise constant approximation is appropriate. More work is needed
to extend the proof to hold with probability one. We mention this here only because
it indicates the line of reasoning that relates the product of two Ito integrals to an
integral over time of the product of the integrands.
Moving from a consistent definition of an integral to a mature calculus involves
placing a constraint on what kind of integrands we are able to use in a stochastic
integral. Consider the example of the recursively defined Ito process
$$x_t = x_0 + \int_0^t a(s, x_s)\, ds + \int_0^t b(s, x_s)\, dw_s, \tag{B.8}$$
where $a$ and $b$ are continuous functions that are once differentiable in time and twice
in $x$. An extremely common and useful notational device is to write an Ito integral
in a differential form,
$$dx_t = a(t, x_t)\, dt + b(t, x_t)\, dw_t \tag{B.9}$$
to represent that integral. The utility of this notation is apparent in that if $x_t$ were
an ordinary deterministic function (setting $b(t, x_t) = 0$), then $x_t$ would be the solution to
the equation
$$\frac{dx}{dt} = a(t, x). \tag{B.10}$$
This is why these kinds of equations are called stochastic differential equations.
A typical application is to consider two such equations, so in addition to xt we
have the differential
$$dy_t = c(t, y_t)\, dt + d(t, y_t)\, dw_t. \tag{B.11}$$
We wish to find a differential for the product $x_t\, y_t$ or even some other function
$f(x_t, y_t)$. The answer to the first example is that
$$\begin{aligned}
d(x_t\, y_t) = {} & a(t, x_t)\, y_t\, dt + b(t, x_t)\, y_t\, dw_t \\
& + x_t\, c(t, y_t)\, dt + x_t\, d(t, y_t)\, dw_t \\
& + b(t, x_t)\, d(t, y_t)\, dt.
\end{aligned} \tag{B.12}$$
This expression can be derived by computing the L2(P), ∆ti → 0 limit of the def-
inition of the Ito integral. It is important to emphasize that the right-hand side of
this equation is itself another Ito integral. By multiplying, adding, and subtracting
Ito integrals one still finds “just” another Ito integral, with the whole closing upon
itself to form an algebra. Compare this example to a second order Taylor expansion
of the product xt yt,
$$d(x_t\, y_t) = dx_t\, y_t + x_t\, dy_t + dx_t\, dy_t. \tag{B.13}$$
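If we apply the Ito rules to the second order product $dx_t\, dy_t$, the only surviving
term is $(b\, dw_t)(d\, dw_t) = b\, d\, dt$. This means that there is a consistency between Eqs.
(B.12) and (B.13). The general case for some function $f(x_t, y_t)$ is given by the second
order Taylor expansion,
$$\begin{aligned}
df(x_t, y_t) = {} & \frac{\partial f(x_t, y_t)}{\partial x_t}\, dx_t + \frac{\partial f(x_t, y_t)}{\partial y_t}\, dy_t \\
& + \frac{1}{2} \frac{\partial^2 f(x_t, y_t)}{\partial x_t^2}\, dx_t\, dx_t + \frac{1}{2} \frac{\partial^2 f(x_t, y_t)}{\partial y_t^2}\, dy_t\, dy_t + \frac{\partial^2 f(x_t, y_t)}{\partial x_t\, \partial y_t}\, dx_t\, dy_t
\end{aligned} \tag{B.14}$$
where
$$dx_t\, dx_t = b^2(t, x_t)\, dt, \tag{B.15}$$
$$dy_t\, dy_t = d^2(t, y_t)\, dt, \text{ and} \tag{B.16}$$
$$dx_t\, dy_t = b(t, x_t)\, d(t, y_t)\, dt. \tag{B.17}$$
In order for this general Ito expansion to be well defined, $f(x, y)$ must have finite
first and second derivatives.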
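As a worked instance of the expansion in Eq. (B.14), consider a single process $x_t = w_t$ with $w_0 = 0$ (that is, $a = 0$ and $b = 1$) and the function $f(x) = x^2$. This choice is made here purely for illustration and is not an example from the text.

```latex
\begin{aligned}
d(w_t^2) &= 2 w_t\, dw_t + \tfrac{1}{2} \cdot 2\, dw_t\, dw_t
          = 2 w_t\, dw_t + dt, \\
\int_0^\tau w_t\, dw_t &= \tfrac{1}{2} \left( w_\tau^2 - \tau \right).
\end{aligned}
```

The extra $-\tau/2$ relative to the ordinary calculus result $w_\tau^2 / 2$ is exactly the second order correction supplied by $dw_t\, dw_t = dt$.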
B.1.1 The Ito conversion
Suppose we have the recursive Ito form SDE such that
$$x_\tau = \int_0^\tau a(x_t)\, dt + \int_0^\tau b(x_t)\, dw_t. \tag{B.18}$$
We then assert that this same process has a corresponding Stratonovich form,
$$x_\tau = \int_0^\tau \tilde{a}(x_t)\, dt + \int_0^\tau b(x_t) \circ dw_t. \tag{B.19}$$
($\tilde{a}$ has no relation to the Fourier transform.) Our goal is to then find a relation
between the functions $a(x)$ and $\tilde{a}(x)$ that makes this assertion true. Subtracting
these two expressions leads to the equality,
$$I_c \equiv \int_0^\tau b(x_t) \circ dw_t - \int_0^\tau b(x_t)\, dw_t = \int_0^\tau a(x_t)\, dt - \int_0^\tau \tilde{a}(x_t)\, dt. \tag{B.20}$$
This difference is known as the Ito correction term.
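To lighten the notation we will define the intervals $\Delta x_i \equiv x_{t_i} - x_{t_{i-1}}$,
$\Delta w_i \equiv w_{t_i} - w_{t_{i-1}}$ and $\Delta t_i \equiv t_i - t_{i-1}$. The Ito correction can then be written as
$$\begin{aligned}
I_c &= \lim_{n \to \infty} \sum_{i=1}^{n} \left( b\!\left( \frac{x_{t_i} + x_{t_{i-1}}}{2} \right) - b(x_{t_{i-1}}) \right) \Delta w_i \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \left( b\!\left( x_{t_{i-1}} + \tfrac{1}{2} \Delta x_i \right) - b(x_{t_{i-1}}) \right) \Delta w_i.
\end{aligned} \tag{B.21}$$
We can write the integrand very suggestively in terms of a prelimit form of the
derivative of $b(x)$. By defining
$$\frac{\Delta b}{\Delta x}(x) \equiv \frac{b(x + \Delta x) - b(x)}{\Delta x} \tag{B.22}$$
we then have
$$b\!\left( x_{t_{i-1}} + \tfrac{1}{2} \Delta x_i \right) - b(x_{t_{i-1}}) = \frac{1}{2} \frac{\Delta b}{\Delta x}(x_{t_{i-1}})\, \Delta x_i. \tag{B.23}$$
With a recursive definition for $x_t$, we can substitute $\Delta x_i$ to find
$$I_c = \lim_{n \to \infty} \sum_{i=1}^{n} \frac{1}{2} \frac{\Delta b}{\Delta x}(x_{t_{i-1}}) \left( a(x_{t_{i-1}})\, \Delta t_i + b(x_{t_{i-1}})\, \Delta w_i \right) \Delta w_i. \tag{B.24}$$
However, by the rules of Ito calculus (i.e. $\Delta t_i\, \Delta w_i \to 0$ and $\Delta w_i\, \Delta w_i \to \Delta t_i$ with
probability 1) the whole expression converges to
$$I_c = \frac{1}{2} \int_0^\tau \frac{db}{dx}(x_t)\, b(x_t)\, dt. \tag{B.25}$$
And so we ultimately find that
$$\tilde{a}(x) = a(x) - \frac{1}{2} \frac{db}{dx}(x)\, b(x). \tag{B.26}$$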
This conversion between Ito and Stratonovich equations is vitally important in Chap.
4, where the first order rules of ordinary calculus allow for the application of differential
geometry to a stochastic system. Sec. 4.2.4 gives an example of why using
Stratonovich calculus is necessary.
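The conversion can be tested on a simple case. The sketch below assumes, as an illustration not drawn from the text, a driftless Stratonovich equation with $b(x) = \sigma x$, whose solution by ordinary calculus is $x_0 e^{\sigma w_t}$. Eq. (B.26) then gives the Ito drift $a(x) = \tfrac{1}{2} \sigma^2 x$. The function name and parameter values are hypothetical, and Euler-Maruyama stepping is a choice made here.

```python
import math
import random


def ito_steps_for_stratonovich(sigma=0.5, x0=1.0, tau=1.0, n_steps=20000, seed=0):
    """Integrate dx = sigma * x o dw (Stratonovich, zero drift) with Ito-type
    Euler-Maruyama steps, with and without the correction from Eq. (B.26).

    Returns (x_with_correction, x_without_correction, x0 * exp(sigma * w_tau)).
    """
    rng = random.Random(seed)
    dt = tau / n_steps
    x_corr = x0
    x_naive = x0
    w = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Ito drift a = a_tilde + (1/2) b'(x) b(x) = (1/2) sigma^2 x
        x_corr += 0.5 * sigma ** 2 * x_corr * dt + sigma * x_corr * dw
        # Same noise term, but with the correction left out
        x_naive += sigma * x_naive * dw
        w += dw
    return x_corr, x_naive, x0 * math.exp(sigma * w)
```

With the correction the Ito steps track $x_0 e^{\sigma w_\tau}$; without it they approach $x_0 e^{\sigma w_\tau - \sigma^2 \tau / 2}$ instead, which is the solution of a different equation.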
Appendix C
Quantum Stochastic Calculus
This appendix reviews the basic notation and properties of an Ito form quantum
stochastic differential equation (QSDE). Sec. 2.5 discusses at length how a bosonic
Fock space $\mathcal{F}(h)$, defined over the single particle Hilbert space $h = L^2(\mathbb{R}_+) \otimes \mathbb{C}^d$,
has the operators $Q_t^i$ and $P_t^i$. Each of these operators has the statistics of a Wiener
process when taken in vacuum expectation. They are constructed through linear
combinations of the annihilation and creation operators, $A_t^i = a[\chi_{[0,t]}\, e_i]$ and
$A_t^{i\dagger} = a^\dagger[\chi_{[0,t]}\, e_i]$, which are also used to form non-Hermitian quantum Ito integrals. In
addition to these processes, Sec. 2.6.2 encountered a different kind of operator, the
scattering or conservation processes $\Lambda_t^{ij}$. Here we will gloss over how $\Lambda_t^{ij}$ can be defined
in terms of an operation acting on the single particle Hilbert space (see [45]). Note
that $(\Lambda_t^{ij})^\dagger = \Lambda_t^{ji}$. To define an integral with respect to $A_t^i$, $A_t^{i\dagger}$, and $\Lambda_t^{ij}$, it is
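sufficient to know the matrix elements,¹
$$\langle e[f] |\, A_t^i\, | e[h] \rangle = \int_0^t ds\, h_i(s)\, \langle e[f] | e[h] \rangle, \tag{C.1a}$$
$$\langle e[f] |\, A_t^{j\dagger}\, | e[h] \rangle = \int_0^t ds\, f_j^*(s)\, \langle e[f] | e[h] \rangle, \text{ and} \tag{C.1b}$$
$$\langle e[f] |\, \Lambda_t^{ij}\, | e[h] \rangle = \int_0^t ds\, f_i^*(s)\, h_j(s)\, \langle e[f] | e[h] \rangle. \tag{C.1c}$$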
The derivation of the quantum Ito integral and the equivalent stochastic calculus
begins in much the same way as the classical Ito integral. Eq. (B.2) gives the limiting
sequence used in defining the Ito integral, with its characteristic nonanticipative
integrand. The quantum Ito integral is given by a similar form, where the integral
with respect to $dA_t^j$ is given by the limiting sum (with the mesh of times $t_0 = 0 <
t_1 < \dots t_n < t$)
$$Y_t = \int_0^t X_s\, dA_s^j \equiv \lim_{n \to \infty} \sum_{i=1}^{n} X_{t_{i-1}} \left( A_{t_i}^j - A_{t_{i-1}}^j \right). \tag{C.2}$$
Similar definitions are made for integrals with respect to $A_t^{j\dagger}$, $\Lambda_t^{ij}$, and time. The
integrand $X_s$ is required to be a time-adapted process, meaning that it is required
to act as the identity on the Fock space $\mathcal{F}(h_{(s,\infty)})$. Note that in general, $X_s$ is not
required to act solely on the Fock space $\mathcal{F}(h_{[0,s]})$. It could be an operator defined over
a joint space $\mathcal{H}_{sys} \otimes \mathcal{F}(h_{[0,s]})$. Because the integrand is required to be time-adapted
it commutes with the differential and so we can write,
$$Y_t = \int_0^t X_s\, dA_s^j = \int_0^t dA_s^j\, X_s. \tag{C.3}$$
We will use both forms, whichever is more convenient. It should not be that surprising
that proving that the above integrals exist and are finite is more difficult than
in a classical setting. We will not reproduce the full result here, see [23, 52] for the
proof.
¹ For technical reasons the amplitudes of these states are assumed to be square integrable and have a large but finite upper bound [25, 45].
However to get a sense of where the quantum Ito rule comes from, we will discuss
a few key points. The proof of convergence of the infinitesimal limit is shown by
taking a piecewise constant approximation, and then proving convergence of the
matrix elements. The set of vectors chosen for those matrix elements are the tensor
product of any system pure state and an exponential vector. By showing that it
holds for these matrix elements, you can then extend the result to hold in expectation
with any state composed of convex linear combinations of these vectors. For example,
consider the integral $Y_t$ in Eq. (C.2),
$$\langle \psi \otimes e[f] |\, Y_t\, | \psi \otimes e[h] \rangle = \lim_{n \to \infty} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} ds\, h_j(s)\, \langle \psi \otimes e[f] |\, X_{t_{i-1}}\, | \psi \otimes e[h] \rangle. \tag{C.4}$$
For a reasonably large class of integrands the piecewise constant approximation is
appropriate and converges to
$$\langle \psi \otimes e[f] |\, Y_t\, | \psi \otimes e[h] \rangle = \int_0^t ds\, h_j(s)\, \langle \psi \otimes e[f] |\, X_s\, | \psi \otimes e[h] \rangle. \tag{C.5}$$
Equivalent expressions occur for integrals with respect to $dA_t^{j\dagger}$ and $d\Lambda_t^{ij}$, where $h_j(s)$
is replaced by the correct amplitudes as given in Eq. (C.1).
In order to define a proper calculus, one must also consider how products of the
integrals behave. The elementary step is to consider the matrix elements of two
operator combinations of $A_t^i$, $A_t^{j\dagger}$ and $\Lambda_t^{ij}$. The matrix elements $\langle e[f] |\, A_t^i A_t^j\, | e[h] \rangle$,
$\langle e[f] |\, A_t^{i\dagger} A_t^{j\dagger}\, | e[h] \rangle$, and $\langle e[f] |\, \Lambda_t^{ij} A_t^k\, | e[h] \rangle$ are easily calculated, as they involve
an eigenvalue relationship of $A_t$ and the matrix elements given in Eq. (C.1). The
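nonobvious two operator matrix elements are [52, Proposition 20.13]
$$\langle e[f] |\, A_t^i A_t^{j\dagger}\, | e[h] \rangle = \left( \int_0^t ds\, f_j^*(s) \int_0^t ds\, h_i(s) + \delta_{ij} \int_0^t ds \right) \langle e[f] | e[h] \rangle, \tag{C.6a}$$
$$\langle e[f] |\, A_t^i \Lambda_t^{jk}\, | e[h] \rangle = \left( \int_0^t ds\, f_j^*(s)\, h_k(s) \int_0^t ds\, h_i(s) + \delta_{ij} \int_0^t ds\, h_k(s) \right) \langle e[f] | e[h] \rangle, \tag{C.6b}$$
$$\langle e[f] |\, \Lambda_t^{ij} A_t^{k\dagger}\, | e[h] \rangle = \left( \int_0^t ds\, f_i^*(s)\, h_j(s) \int_0^t ds\, f_k^*(s) + \delta_{jk} \int_0^t ds\, f_i^*(s) \right) \langle e[f] | e[h] \rangle, \text{ and} \tag{C.6c}$$
$$\langle e[f] |\, \Lambda_t^{ij} \Lambda_t^{k\ell}\, | e[h] \rangle = \left( \int_0^t ds\, f_i^*(s)\, h_j(s) \int_0^t ds\, f_k^*(s)\, h_\ell(s) + \delta_{jk} \int_0^t ds\, f_i^*(s)\, h_\ell(s) \right) \langle e[f] | e[h] \rangle. \tag{C.6d}$$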
Calculating these expressions requires knowledge of the commutation relations between
all of the noises and/or their relation to the exponential vectors. Without
discussing the origin of $\Lambda_t$, writing down the commutation relations is no more
intuitive, and less useful, than the above matrix elements. Inspecting these four equations
shows that each has two parts. The first is essentially the product of the single operator
matrix elements, up to a factor of $\langle e[f] | e[h] \rangle$. The second part is an additional
term due to the noncommuting structure of the processes. If we express these four
matrix elements in terms of differentials,
$$\langle e[f] |\, dA_t^i\, dA_t^{j\dagger}\, | e[h] \rangle = \left( O(dt^2) + \delta_{ij}\, dt \right) \langle e[f] | e[h] \rangle, \tag{C.7a}$$
$$\langle e[f] |\, dA_t^i\, d\Lambda_t^{jk}\, | e[h] \rangle = \left( O(dt^2) + \delta_{ij}\, dt\, h_k(t) \right) \langle e[f] | e[h] \rangle, \tag{C.7b}$$
$$\langle e[f] |\, d\Lambda_t^{ij}\, dA_t^{k\dagger}\, | e[h] \rangle = \left( O(dt^2) + \delta_{jk}\, dt\, f_i^*(t) \right) \langle e[f] | e[h] \rangle, \text{ and} \tag{C.7c}$$
$$\langle e[f] |\, d\Lambda_t^{ij}\, d\Lambda_t^{k\ell}\, | e[h] \rangle = \left( O(dt^2) + \delta_{jk}\, dt\, f_i^*(t)\, h_\ell(t) \right) \langle e[f] | e[h] \rangle. \tag{C.7d}$$
Notice that each term on the order of $dt$ is expressible in terms of the matrix element
of a differential, $dt$, $dA_t^k$, etc. Taking the terms $O(dt^2) \to 0$, and asserting that
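knowing these matrix elements is sufficient, we have the quantum Ito rules
$$dA_t^i\, dA_t^{j\dagger} = \delta_{ij}\, dt, \tag{C.8a}$$
$$dA_t^i\, d\Lambda_t^{jk} = \delta_{ij}\, dA_t^k, \tag{C.8b}$$
$$d\Lambda_t^{ij}\, dA_t^{k\dagger} = \delta_{jk}\, dA_t^{i\dagger}, \tag{C.8c}$$
$$d\Lambda_t^{ij}\, d\Lambda_t^{k\ell} = \delta_{jk}\, d\Lambda_t^{i\ell}. \tag{C.8d}$$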
All other differential products are zero.
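The multiplication table in Eq. (C.8) is small enough to encode directly. The sketch below is one hypothetical encoding in which differentials are represented as tuples; the function name `ito_product` and the tuple labels are inventions for illustration, with $dA^i\, d\Lambda^{jk}$ taken to be $\delta_{ij}\, dA^k$ as implied by the matrix element in Eq. (C.7b).

```python
def ito_product(left, right):
    """Product of two fundamental differentials according to Eq. (C.8).

    Differentials are tuples: ("dt",), ("dA", i), ("dAdag", i) or
    ("dLambda", i, j). Returns the resulting differential, or None when
    the product vanishes.
    """
    kind_l, kind_r = left[0], right[0]
    if kind_l == "dA" and kind_r == "dAdag":
        # dA^i dA^{j dagger} = delta_ij dt
        return ("dt",) if left[1] == right[1] else None
    if kind_l == "dA" and kind_r == "dLambda":
        # dA^i dLambda^{jk} = delta_ij dA^k
        return ("dA", right[2]) if left[1] == right[1] else None
    if kind_l == "dLambda" and kind_r == "dAdag":
        # dLambda^{ij} dA^{k dagger} = delta_jk dA^{i dagger}
        return ("dAdag", left[1]) if left[2] == right[1] else None
    if kind_l == "dLambda" and kind_r == "dLambda":
        # dLambda^{ij} dLambda^{kl} = delta_jk dLambda^{il}
        return ("dLambda", left[1], right[2]) if left[2] == right[1] else None
    return None  # all other products are zero
```

The table is not symmetric: `ito_product(("dA", 1), ("dAdag", 1))` gives `("dt",)` while the reversed order gives `None`, which reflects the noncommuting structure noted above.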
The remainder of the construction of the quantum Ito calculus is technical, and
involves showing how the product of two piecewise approximations converges as $\Delta t_i \to
0$, as well as dealing with the equality
$$\langle X_t\, \psi \otimes e[f] \,|\, Y_t\, \psi \otimes e[h] \rangle = \langle \psi \otimes e[f] \,|\, X_t^\dagger Y_t\, \psi \otimes e[h] \rangle \tag{C.9}$$
for two integrals $X_t$ and $Y_t$. These issues are well beyond our scope, except that we
will note that when $X_t$ and $Y_t$ are bounded in a suitable sense, all things work out
nicely [25, 45].
Before moving on to the specifics of a QSDE for a unitary propagator, we will give
the following general example for using the quantum Ito rule. (See [45, Proposition
2.4] for all of the necessary qualifiers and assumptions.) Consider the quantum
stochastic process $X_t$, given by the Ito integral (implied sums on repeated indices),
$$X_t = X_0 + \int_0^t d\Lambda_s^{ij}\, F_s^{ij} + \int_0^t dA_s^{i\dagger}\, F_s^{i0} + \int_0^t dA_s^j\, F_s^{0j} + \int_0^t ds\, F_s^{00}. \tag{C.10}$$
The processes $F_s^{\alpha\beta}$ are assumed to act nontrivially only on the joint Hilbert space
$\mathcal{H}_{sys} \otimes \mathcal{F}(h_{[0,s]})$ and are integrable (without really defining what that means). The
initial value $X_0$ is assumed to be a bounded operator that acts as the identity on
$\mathcal{F}(h)$. This integral can also be notated differentially as the QSDE
$$dX_t = F_t^{ij}\, d\Lambda_t^{ij} + F_t^{i0}\, dA_t^{i\dagger} + F_t^{0j}\, dA_t^j + F_t^{00}\, dt. \tag{C.11}$$
Given another process $Y_t$ whose differential is
$$dY_t = K_t^{ij}\, d\Lambda_t^{ij} + K_t^{i0}\, dA_t^{i\dagger} + K_t^{0j}\, dA_t^j + K_t^{00}\, dt, \tag{C.12}$$
the product $X_t Y_t$ can also be expressed in terms of a process $Z_t = X_t Y_t$ whose
differential is given by
$$dZ_t = M_t^{ij}\, d\Lambda_t^{ij} + M_t^{i0}\, dA_t^{i\dagger} + M_t^{0j}\, dA_t^j + M_t^{00}\, dt, \tag{C.13}$$
where the resulting integrands $M_t^{\alpha\beta}$ are
$$M_t^{ij} = X_t K_t^{ij} + F_t^{ij} Y_t + F_t^{i\ell} K_t^{\ell j}, \tag{C.14a}$$
$$M_t^{i0} = X_t K_t^{i0} + F_t^{i0} Y_t + F_t^{ij} K_t^{j0}, \tag{C.14b}$$
$$M_t^{0j} = X_t K_t^{0j} + F_t^{0j} Y_t + F_t^{0i} K_t^{ij}, \text{ and} \tag{C.14c}$$
$$M_t^{00} = X_t K_t^{00} + F_t^{00} Y_t + F_t^{0i} K_t^{i0}. \tag{C.14d}$$
This result can be viewed as an application of the quantum Ito product rule
$$d(X_t Y_t) = X_t\, dY_t + dX_t\, Y_t + dX_t\, dY_t, \tag{C.15}$$
where $dX_t\, dY_t$ is multiplied out and the second order differentials are evaluated according
to the quantum Ito rules in Eq. (C.8).
C.1 The Quantum Stochastic Unitary
With the general quantum Ito rule firmly in hand, we would like to apply it to
find a universal expression for a unitary process $U_t$. This is actually quite
straightforward by first noting that because it is unitary, $U_t^\dagger U_t = \mathbb{1}$. We also know that if
$U_t$ is independent of the fundamental processes $A_t$, $A_t^\dagger$ and $\Lambda_t$, then $U_t$ is the
solution to the ordinary differential equation $dU_t = -i H_t U_t\, dt$. When including
the fundamental processes, the objective is to write $U_t$ as a general QSDE and find
how unitarity constrains the various integrands. Taking the “noise free” solution as a
starting point we hypothesize the coefficients $G_t^{\alpha\beta}$ so that $U_t$ is given by the QSDE
$$dU_t = G_t^{ij} U_t\, d\Lambda_t^{ij} + G_t^{i0} U_t\, dA_t^{i\dagger} + G_t^{0j} U_t\, dA_t^j + G_t^{00} U_t\, dt \tag{C.16}$$
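and its adjoint is
$$dU_t^\dagger = U_t^\dagger G_t^{ij\dagger}\, d\Lambda_t^{ji} + U_t^\dagger G_t^{i0\dagger}\, dA_t^i + U_t^\dagger G_t^{0j\dagger}\, dA_t^{j\dagger} + U_t^\dagger G_t^{00\dagger}\, dt. \tag{C.17}$$
The unitary constraint’s impact on the differential is that $d(U_t^\dagger U_t) = d(U_t U_t^\dagger) = 0$.
The general Ito product coefficients in Eq. (C.14) then say that in order for this to
be unitary,
$$\begin{aligned}
U_t^\dagger G_t^{\alpha\beta} U_t + U_t^\dagger G_t^{\beta\alpha\dagger} U_t + U_t^\dagger G_t^{\ell\alpha\dagger} G_t^{\ell\beta} U_t &= 0 \\
U_t U_t^\dagger G_t^{\beta\alpha\dagger} + G_t^{\alpha\beta} U_t U_t^\dagger + G_t^{\alpha\ell} U_t U_t^\dagger G_t^{\beta\ell\dagger} &= 0
\end{aligned} \tag{C.18}$$
for $\alpha, \beta$ starting at zero and the implied sum over $\ell$ starting from 1. Eliminating $U_t$
and $U_t^\dagger$ from the constraints, they simplify to
$$G_t^{\alpha\beta} + G_t^{\beta\alpha\dagger} + G_t^{\ell\alpha\dagger} G_t^{\ell\beta} = G_t^{\alpha\beta} + G_t^{\beta\alpha\dagger} + G_t^{\alpha\ell} G_t^{\beta\ell\dagger} = 0. \tag{C.19}$$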
The coefficients $G^{\alpha\beta}$ are typically written in terms of a different set of operators,
$S_t^{ij}$, $L_t^i$ and $H_t$. The reason for this transformation is that $(S_t^{ij}, L_t^i, H_t)$ have more
desirable and physically relevant properties than $G^{\alpha\beta}$. Immediately we can see that
some part of $G_t^{00}$ should be $-i H_t$, as the general QSDE solution contains the case
where $U(t) = \exp(-i H t)$ for a time independent Hamiltonian $H$. Also if $G^{0i} = G^{i0} = 0$
then Eq. (C.19) reads as $G^{00} = -G^{00\dagger}$, implying that $G_t^{00} = -i H_t$ for Hermitian
$H_t$.
To identify how $S_t^{ij}$ fits into the picture, consider for the moment the case where
each $G_t^{ij} = g_{ij} \mathbb{1}$ for some complex coefficients $g_{ij}$. Then the constraints for $G_t^{ij}$ are
$$g_{ij} + g_{ji}^* + g_{\ell i}^*\, g_{\ell j} = g_{ij} + g_{ji}^* + g_{i\ell}\, g_{j\ell}^* = 0. \tag{C.20}$$
Writing the constants in terms of a matrix $\mathbf{G}$ we have
$$\mathbf{G} + \mathbf{G}^\dagger + \mathbf{G}^\dagger \mathbf{G} = 0. \tag{C.21}$$
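If we define a matrix $\mathbf{S} \equiv \mathbf{G} + \mathbb{1}$ then this constraint reads,
$$\begin{aligned}
0 &= \mathbf{S} - \mathbb{1} + \mathbf{S}^\dagger - \mathbb{1} + (\mathbf{S}^\dagger - \mathbb{1})(\mathbf{S} - \mathbb{1}) \\
0 &= \mathbf{S} - \mathbb{1} + \mathbf{S}^\dagger - \mathbb{1} + (\mathbf{S}^\dagger \mathbf{S} - \mathbf{S}^\dagger - \mathbf{S} + \mathbb{1}) \\
\mathbb{1} &= \mathbf{S}^\dagger \mathbf{S}.
\end{aligned} \tag{C.22}$$
In other words $\mathbf{S}$ is a unitary matrix. Returning to the general case, we can still
define the operators
$$S_t^{ij} \equiv G_t^{ij} + \delta_{ij}. \tag{C.23}$$
Then Eq. (C.19) transforms the constraint for $G_t^{ij}$ into the constraint,
$$S_t^{ij\dagger} S_t^{jk} = S_t^{ij} S_t^{kj\dagger} = \delta_{ik}. \tag{C.24}$$
In other words, $S_t^{ij}$ is a unitary matrix of operators.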
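By introducing $S_t^{ij}$, the constraint for $G_t^{0i}$ is also significantly simpler. Specifically,
$$\begin{aligned}
0 &= G_t^{0i} + G_t^{i0\dagger} + G_t^{\ell 0\dagger} G_t^{\ell i} \\
0 &= G_t^{0i} + G_t^{i0\dagger} + G_t^{\ell 0\dagger} (S_t^{\ell i} - \delta_{\ell i}) \\
G_t^{0i} &= -G_t^{\ell 0\dagger} S_t^{\ell i}.
\end{aligned} \tag{C.25}$$
The remaining coefficients $G_t^{i0}$ are essentially arbitrary, and are relabeled as the
operators $L_t^i$. Writing the constraint for $G_t^{00}$ in terms of $L^i$ means that
$G_t^{00} + G_t^{00\dagger} = -L_t^{i\dagger} L_t^i$.
Bringing all of these results together we can reexpress $G_t^{\alpha\beta}$ in terms of $(S_t^{ij}, L_t^i, H_t)$,
$$\begin{aligned}
G_t^{ij} &= S_t^{ij} - \delta_{ij}, \\
G_t^{i0} &= L_t^i, \\
G_t^{0j} &= -L_t^{i\dagger} S_t^{ij}, \\
G_t^{00} &= -i H_t - \tfrac{1}{2} L_t^{i\dagger} L_t^i.
\end{aligned} \tag{C.26}$$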
In other words,
$$dU_t = \left( \left( S_t^{ij} - \delta_{ij} \right) d\Lambda_t^{ij} + L_t^i\, dA_t^{i\dagger} - L_t^{i\dagger} S_t^{ij}\, dA_t^j - \tfrac{1}{2} L_t^{i\dagger} L_t^i\, dt - i H_t\, dt \right) U_t. \tag{C.27}$$
This is the standard form for the QSDE for a general propagator $U_t$. While in
principle, the initial value for $U_0$ could be any unitary operator acting on a system
$\mathcal{H}_{sys}$, typically $U_t$ describes an interaction picture representation of the system-field
dynamic, and then $U_0 = \mathbb{1}$.
One final remark is that if we take each coefficient to be its own stochastic process,
$\{G_t^{\alpha\beta} U_t\}_{t \ge 0}$, then they are required to still be time-adapted. This results in the
constraint that the initial values $S_0^{ij}$, $L_0^i$ and $H_0$ must all be system operators only,
as they must act as the identity on the field at that time. Furthermore if $S_t^{ij}$, $L_t^i$ and
$H_t$ are known to be time independent, then they must be system operators only.
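The parameterization in Eq. (C.26) can be checked against the constraints in Eq. (C.19) numerically. The sketch below assumes a single field mode ($d = 1$) and a two-level system; the specific matrices returned by `example_operators` are arbitrary hypothetical values, and the small matrix helpers are written out so that only the standard library is needed.

```python
import cmath


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def add(*terms):
    """Entrywise sum of several matrices of equal shape."""
    rows, cols = len(terms[0]), len(terms[0][0])
    return [[sum(m[i][j] for m in terms) for j in range(cols)]
            for i in range(rows)]


def scale(c, a):
    """Multiply every entry of a matrix by the scalar c."""
    return [[c * x for x in row] for row in a]


def unitarity_residual(S, L, H):
    """Largest entry of the left-hand sides of Eq. (C.19) for d = 1, with
    the coefficients built from (S, L, H) as in Eq. (C.26)."""
    n = len(S)
    one = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    LdL = matmul(dagger(L), L)
    G = {
        (1, 1): add(S, scale(-1.0, one)),               # S - 1
        (1, 0): L,                                      # L
        (0, 1): scale(-1.0, matmul(dagger(L), S)),      # -L^dagger S
        (0, 0): add(scale(-1j, H), scale(-0.5, LdL)),   # -iH - L^dagger L / 2
    }
    worst = 0.0
    for alpha in (0, 1):
        for beta in (0, 1):
            first = add(G[alpha, beta], dagger(G[beta, alpha]),
                        matmul(dagger(G[1, alpha]), G[1, beta]))
            second = add(G[alpha, beta], dagger(G[beta, alpha]),
                         matmul(G[alpha, 1], dagger(G[beta, 1])))
            for m in (first, second):
                worst = max(worst, max(abs(x) for row in m for x in row))
    return worst


def example_operators(phi=0.4, theta=1.1):
    """Hypothetical qubit operators: unitary S, arbitrary L, Hermitian H."""
    S = [[cmath.exp(1j * theta) * cmath.cos(phi), -cmath.sin(phi)],
         [cmath.exp(1j * theta) * cmath.sin(phi), cmath.cos(phi)]]
    L = [[0.2, 0.7 - 0.1j], [0.0, -0.3j]]
    H = [[0.3, 0.1 - 0.2j], [0.1 + 0.2j, -0.5]]
    return S, L, H
```

Calling `unitarity_residual(*example_operators())` should return a value at the level of floating point rounding, whereas replacing `S` by a non-unitary matrix leaves a residual of order one.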
C.1.1 Unitary evolution
One final calculation we will include is the unitary evolution of a time independent
system operator $X$. In the quantum stochastic literature this unitary evolution is
written in terms of a map $j_t(\cdot)$ generating the “flow” or current of the operator,
$$j_t(X) \equiv U_t^\dagger X U_t. \tag{C.28}$$
This map can be written as a solution to a QSDE,
$$j_t(X) = U_0^\dagger X U_0 + \int_0^t dj_s(X), \tag{C.29}$$
with a differential
$$dj_t(X) = dU_t^\dagger\, X U_t + U_t^\dagger X\, dU_t + dU_t^\dagger\, X\, dU_t. \tag{C.30}$$
After an exercise in quantum stochastic calculus, one finds the recursive QSDE
$$dj_t(X) = j_t(\mathcal{L}_t^{ij}(X))\, d\Lambda_t^{ij} + j_t(\mathcal{L}_t^{i0}(X))\, dA_t^{i\dagger} + j_t(\mathcal{L}_t^{0j}(X))\, dA_t^j + j_t(\mathcal{L}_t^{00}(X))\, dt \tag{C.31}$$
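where $\mathcal{L}_t^{\alpha\beta}(\cdot)$ are known as the Evans-Hudson maps and in terms of $G_t^{\alpha\beta}$ are
$$\mathcal{L}_t^{\alpha\beta}(X) = G_t^{\beta\alpha\dagger} X + X G_t^{\alpha\beta} + G_t^{k\alpha\dagger} X G_t^{k\beta}. \tag{C.32}$$
When written in terms of $(S_t^{ij}, L_t^i, H_t)$ these maps are,
$$\mathcal{L}_t^{ij}(X) = S_t^{ki\dagger} X S_t^{kj} - \delta_{ij} X, \tag{C.33a}$$
$$\mathcal{L}_t^{i0}(X) = S_t^{ki\dagger} \left[ X, L_t^k \right], \tag{C.33b}$$
$$\mathcal{L}_t^{0j}(X) = -\left[ X, L_t^{k\dagger} \right] S_t^{kj}, \tag{C.33c}$$
$$\mathcal{L}_t^{00}(X) = +i [H_t, X] + L_t^{i\dagger} X L_t^i - \tfrac{1}{2} L_t^{i\dagger} L_t^i X - \tfrac{1}{2} X L_t^{i\dagger} L_t^i. \tag{C.33d}$$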
For an arbitrary model the unitary evolution becomes exceedingly complicated very
rapidly. Each coefficient in Eq. (C.31) is itself given by the unitary flow of the
operator Zt ≡ Lαβt (X). Systems lacking any kind of fundamental symmetry will
rarely close on a useful subspace of operators meaning that after repeated applications
of Lαβt (·) more complicated operators will be generated, spanning a larger and larger
space of operators.
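A case where the maps do close on a small set of operators can be sketched numerically. The example below evaluates Eq. (C.33d) for a hypothetical two-level system with a single decay operator $L = \sqrt{\gamma}\, \sigma_-$ and $H = 0$, a model chosen here for illustration. The names `evans_hudson_00` and `decay_example` are inventions, and the matrix helpers are written out to keep the block self-contained.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def evans_hudson_00(H, L, X):
    """Eq. (C.33d) for one field mode:
    i[H, X] + L^dag X L - (L^dag L X + X L^dag L) / 2."""
    Ld = dagger(L)
    LdL = matmul(Ld, L)
    HX, XH = matmul(H, X), matmul(X, H)
    LdXL = matmul(Ld, matmul(X, L))
    LdLX, XLdL = matmul(LdL, X), matmul(X, LdL)
    n = len(X)
    return [[1j * (HX[i][j] - XH[i][j]) + LdXL[i][j]
             - 0.5 * (LdLX[i][j] + XLdL[i][j])
             for j in range(n)] for i in range(n)]


def decay_example(gamma=0.8):
    """Hypothetical two-level decay: basis (excited, ground), H = 0,
    L = sqrt(gamma) * sigma_minus, X = excited-state projector."""
    H = [[0.0, 0.0], [0.0, 0.0]]
    L = [[0.0, 0.0], [math.sqrt(gamma), 0.0]]
    X = [[1.0, 0.0], [0.0, 0.0]]
    return H, L, X
```

For this model the map returns $-\gamma X$ when applied to the excited-state projector, so the coefficient of $dt$ closes on a single operator, and it returns zero when applied to the identity for any choice of $H$ and $L$.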
In addition to calculating the unitary output of system operators, it is also useful
to calculate the output for the fundamental field operators $A_t^{j\dagger}$, $A_t^i$, $\Lambda_t^{ij}$. In a rather
tedious exercise in manipulating the quantum Ito rules it can be shown that
$$dj_t(A_t^j) = j_t(S_t^{jk})\, dA_t^k + j_t(L_t^j)\, dt, \tag{C.34a}$$
$$dj_t(A_t^{i\dagger}) = j_t(S_t^{ik\dagger})\, dA_t^{k\dagger} + j_t(L_t^{i\dagger})\, dt, \tag{C.34b}$$
$$dj_t(\Lambda_t^{ij}) = j_t(S_t^{ik\dagger} S_t^{j\ell})\, d\Lambda_t^{k\ell} + j_t(S_t^{ik\dagger} L_t^j)\, dA_t^{k\dagger} + j_t(L_t^{i\dagger} S_t^{j\ell})\, dA_t^\ell + j_t(L_t^{i\dagger} L_t^j)\, dt. \tag{C.34c}$$
Appendix D
The Quantum Wong-Zakai
Theorem
In a classical system, the convergence of an ordinary differential equation to a stochastic
one was treated in the work of Wong and Zakai [90]. There they show that an
ODE containing a piecewise-smooth approximation to white noise, $\xi_t^{(\lambda)}$, converges to
a Stratonovich integral as $\xi_t^{(\lambda)} \to \xi_t$ with $\lambda \to 0$. For instance, consider the ODE
$$\frac{\partial x^{(\lambda)}(t)}{\partial t} = f(t, x^{(\lambda)}(t)) + g(t, x^{(\lambda)}(t))\, \xi_t^{(\lambda)}. \tag{D.1}$$
The Wong-Zakai theorem states that the integrated solution
$$x^{(\lambda)}(t) = x^{(\lambda)}(0) + \int_0^t ds\, f(s, x^{(\lambda)}(s)) + \int_0^t ds\, g(s, x^{(\lambda)}(s))\, \xi_s^{(\lambda)} \tag{D.2}$$
converges to the Stratonovich integral
$$x_t = x_0 + \int_0^t f(s, x_s)\, ds + \int_0^t g(s, x_s) \circ dw_s. \tag{D.3}$$
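A numerical sketch of this statement, under assumptions chosen here for illustration: take $f = 0$ and $g(t, x) = x$ in Eq. (D.1), and let the smoothed noise be the derivative of a piecewise-linear interpolation of a sampled Wiener path with knot spacing $\lambda$. The function name and step counts are hypothetical.

```python
import math
import random


def smoothed_noise_ode(n_intervals=200, substeps=200, tau=1.0, seed=0):
    """Integrate dx/dt = x * xi(t), where xi is piecewise constant on
    intervals of length lam = tau / n_intervals.

    Returns (x_tau, exp(w_tau), exp(w_tau - tau / 2)): the ODE result, the
    Stratonovich solution of dx = x o dw, and the Ito solution of dx = x dw.
    """
    rng = random.Random(seed)
    lam = tau / n_intervals      # correlation time of the smoothed noise
    h = lam / substeps           # ODE step, much finer than lam
    x, w = 1.0, 0.0
    for _ in range(n_intervals):
        dw = rng.gauss(0.0, math.sqrt(lam))
        xi = dw / lam            # slope of the interpolated path
        for _ in range(substeps):
            x += x * xi * h      # ordinary Euler step of the smooth ODE
        w += dw
    return x, math.exp(w), math.exp(w - 0.5 * tau)
```

The ODE result tracks the Stratonovich value $e^{w_\tau}$ rather than the Ito value $e^{w_\tau - \tau/2}$. The order of limits matters in this sketch: setting `substeps=1` makes the ODE step as coarse as the noise itself, which turns the scheme into Euler-Maruyama and pulls the result toward the Ito value instead.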
Appendix B reviews the distinctions between the two most common forms of
classical stochastic integration, the Ito integral and the Stratonovich integral and
Appendix C discusses their quantum analogs. This appendix reviews the quantum
analog of this result where the specific ODE is for a propagator whose Hamiltonian
contains field operators that are limiting to quantum white noise.
In 2006, Gough derived a quantum limit, equivalent to the Wong-Zakai theorem
[43]. Specifically, he investigated the convergence of the Schrodinger equation, writ-
ten in terms of the time evolution operator U(t), as field operators in the Hamiltonian
converge to singular, delta-commuting operators. The specific Hamiltonian considered
is given in Eq. (D.9), but before discussing it, we will first describe the quantum
version of $\xi_t^{(\lambda)}$ and how it can be interpreted as quantum white noise.
D.1 Quantum white noise
In order to make a connection with a quantum Ito integral as formulated by Hudson
and Parthasarathy, the limiting field operators clearly must be referenced to a Fock
space $\mathcal{F}(h')$, $h' = L^2(\mathbb{R}_+) \otimes \mathbb{C}^d$. The delta commuting limit is introduced by
considering the differentiable functions $\{\xi_i^{(\lambda)}(t) \in h' : i = 1, \dots, d\}$ parameterized by
$\lambda > 0$ so that we have the field operators $a[\xi_i^{(\lambda)}(t)]$ and $a^\dagger[\xi_j^{(\lambda)}(t')]$, with
$$\left[ a[\xi_i^{(\lambda)}(t)],\, a^\dagger[\xi_j^{(\lambda)}(t')] \right] = \left\langle \xi_i^{(\lambda)}(t),\, \xi_j^{(\lambda)}(t') \right\rangle \equiv c_{ij}(\lambda, t - t'). \tag{D.4}$$
This inner product is assumed to satisfy the properties:
$$\int_{-\infty}^{\infty} dt\, c_{ij}(\lambda, t) < \infty, \tag{D.5a}$$
$$c_{ij}(\lambda, t) = c_{ji}^*(\lambda, -t), \text{ and} \tag{D.5b}$$
$$\lim_{\lambda \to 0} c_{ij}(\lambda, t) = \delta_{ij}\, \delta(t). \tag{D.5c}$$
To simplify the notation, it is convenient to write $a_i(\lambda, t) \equiv a[\xi_i^{(\lambda)}(t)]$.
These operators end up serving two purposes in the quantum Wong-Zakai theo-
rem. The first is of course to act in the limiting Hamiltonian and the second is to
generate the smoothed exponential vectors,
$$e[g^{(\lambda)}] \equiv \exp\left( \int_0^\infty dt\, g_i(t)\, a_i^\dagger(\lambda, t) \right) |\emptyset\rangle. \tag{D.6}$$
Note that in relation to the one-dimensional representation of Sec. 2.4, $a_i(\lambda, t)$
is almost equivalent to $a[\varphi^{(\sigma)}(t)]$. The difference lies in how the smoothed
wave packets are defined. One possible mapping between paraxial optics and the abstract
operators $a_i(\lambda, t)$ is to identify $d$ paraxial spatial mode functions
$u^{(+)}_i(x_T, z)$, which satisfy the orthogonality relation
$\int d^2x_T\, u^{*}_i(x_T, z) \cdot u_j(x_T, z) = \delta_{ij}\, \sigma_T$. For each
paraxial mode there are $d$ independent complex wave packet envelopes, inducing the
smoothing operators $a[\varphi^{(\sigma)}_i(t)]$. In the case of a single paraxial mode,
Eq. (2.107) gave the expression for a smoothed paraxial wave packet $g^{(\sigma)}(k, t)$.
In order for this expression to be equivalent to the argument of Eq. (D.6), we require that
\[
a_i(\lambda, t) \cong e^{+i\omega_0 t}\, a[\varphi^{(\sigma)}_i(-t)]. \tag{D.7}
\]
The fact that we require the time-reversed version of $\varphi^{(\sigma)}_i$ is an artifact
of defining the smoothing with respect to a convolution. The inclusion of the carrier
phase is both mathematically and physically interesting. Physically it is a reminder that
the elements of $h'$ represent the part of the light existing on the measurement timescale,
which is much slower than the carrier oscillation. In fact the appearance of this
phase is intimately related to the rotating wave approximation. In the rotating
frame, atomic transition operators develop explicit time dependence at the carrier
frequency. The mathematical relevance of this carrier phase is that it cancels any
rapidly oscillating phases in the inner product
$\langle \varphi^{(\sigma)}_i(-t), \varphi^{(\sigma)}_i(-t') \rangle$. This cancellation is
explicitly apparent by computing that
\begin{align}
\left[ a_i(\lambda, t),\, a_j^\dagger(\lambda, t') \right]
  &= c_{ij}(\lambda, t - t') \notag \\
  &= e^{+i\omega_0 (t - t')} \left\langle \varphi^{(\sigma)}_i(-t),\, \varphi^{(\sigma)}_j(-t') \right\rangle \notag \\
  &= \delta_{ij} \left( \varphi^{(\sigma)} \star \varphi^{(\sigma)}(t - t')
     - i\, \frac{1}{\omega_0}\, \frac{d\varphi^{(\sigma)}}{dt} \star \varphi^{(\sigma)}(t - t') \right), \tag{D.8}
\end{align}
where in the final line we inserted the unequal-time inner product of Eq. (2.92) and
recalled that for the smoother $\varphi^{(\sigma)}_i$, $\|g\| / \sqrt{\tau} \to 1$. The
parameter $\lambda$ simply represents the formal limit in which, as $\lambda \to 0$, both
$\sigma \to 0$ and $(\sigma\, \omega_0)^{-1} \to 0$.
D.2 The quantum Wong-Zakai theorem
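The Hamiltonian that Gough ultimately considers is
\[
H_{\rm int}(\lambda, t) = \hbar \left( \sum_{i,j=1}^{d} E_{ij}\, a_i^\dagger(\lambda, t)\, a_j(\lambda, t)
  + \sum_{i=1}^{d} E_{i0}\, a_i^\dagger(\lambda, t)
  + \sum_{j=1}^{d} E_{0j}\, a_j(\lambda, t) + E_{00} \right). \tag{D.9}
\]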
The quantum Wong-Zakai theorem then takes the solution to the equation
\[
\frac{d}{dt} U(\lambda, t) = -\frac{i}{\hbar}\, H_{\rm int}(\lambda, t)\, U(\lambda, t), \qquad U(\lambda, 0) = 1 \tag{D.10}
\]
and proves that the limit $\lim_{\lambda \to 0} U(\lambda, t) \equiv U_t$ is a quantum
stochastic unitary process, solving a quantum Stratonovich differential equation.
Fortunately, a quantum Stratonovich integral is also expressible in terms of a quantum
Ito integral. Gough provides a well-constructed conversion between both forms, which we
review shortly.
More specifically, the limit is shown to hold “weakly”, in that
\[
\lim_{\lambda \to 0} \langle \psi \otimes e[f(\lambda)] |\, U(\lambda, t)\, | \phi \otimes e[g(\lambda)] \rangle
  = \langle \psi \otimes e[f] |\, U_t\, | \phi \otimes e[g] \rangle, \tag{D.11}
\]
for any system state vectors $\psi$ and $\phi$ and the exponential vectors $e[f(\lambda)]$
and $e[g(\lambda)]$ as defined in Eq. (D.6). In addition to the matrix elements of the
propagator, the limit is also shown to hold weakly for the Heisenberg picture evolution of
an operator $X$, so that
\[
\lim_{\lambda \to 0} U^\dagger(\lambda, t)\, X\, U(\lambda, t) = U_t^\dagger\, X\, U_t. \tag{D.12}
\]
Before describing the resulting unitary process, it is worth stressing several
advantages of the quantum Wong-Zakai theorem. So long as $H_{\rm int}(\lambda, t)$ and the
$\lambda \to 0$ limit are physically justifiable, the specifics of the total state
$\rho_{\rm tot}$ are almost irrelevant. The only constraint is that the total state must be
expressible in terms of a convergent sequence of the matrix elements
$\lim_{\lambda \to 0} \langle \psi_i \otimes e[f_i(\lambda)] |\, \rho_{\rm tot}\, | \phi_j \otimes e[g_j(\lambda)] \rangle$.
This means that we allow for the possibility of nonclassical superpositions. A second
advantage is that the presence of the scattering interaction $E_{ij}$ allows for a much
broader class of interactions than the linear interactions typically considered in
quantum optics. While this dissertation will ultimately be considering a linear
Hamiltonian ($E_{ij}$ will be negligibly small), the fact that the theory can treat a
system coupled to an instantaneous number operator $a_i^\dagger(\lambda, t)\, a_i(\lambda, t)$
in a “white noise” limit is no small feat. One possible example is to engineer a
quantum-optical router, where for some set of modes the operators $E_{ij}$ coherently
scatter quanta in a system-dependent way.
Finally, much can be said for the fact that the limiting interaction is still described
by a unitary operation. While we have constructed the limiting field operators in
terms of a measurement timescale, we have not specified what kind of measurement
we will be performing. Formulating a conditional estimate for the system given a
measurement of the field is one of the main purposes of Chap. 3, but at the level of
the system-field interaction, everything remains fully coherent.
D.2.1 Quantum stochastic calculus and operator ordering
In proving the quantum Wong-Zakai theorem, Gough also found an intuitive corre-
spondence between operator orderings and quantum stochastic differential equations.
In order to describe the limiting propagator Ut as a solution to a standard Ito form
QSDE we need to review this correspondence. Ultimately, the correspondence is
between quantum Stratonovich equations and time-ordered solutions to a given re-
cursive differential equation. Conversely, a quantum Ito equation is identified with
a normally-ordered solution [43].
The solution to the Schrödinger equation with a time-dependent Hamiltonian is
given by a time-ordered exponential,
\[
U(\lambda, t) = \vec{T} \exp\left( -\frac{i}{\hbar} \int_0^t ds\, H_{\rm int}(\lambda, s) \right). \tag{D.13}
\]
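The time-ordered exponential is a compact shorthand for the iterated integrals,
\[
\vec{T} \exp\left( -\frac{i}{\hbar} \int_0^t ds\, H_{\rm int}(\lambda, s) \right)
  = \sum_{n=0}^{\infty} \left( \frac{-i}{\hbar} \right)^{n} \int_0^t dt_n \cdots \int_0^{t_2} dt_1\,
    H_{\rm int}(\lambda, t_n) \cdots H_{\rm int}(\lambda, t_1). \tag{D.14}
\]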
Note that Hint(λ, t) need not commute with Hint(λ, s) and thus the operator ordering
is critical in this expression. There is nothing in the λ → 0 limit that changes the
operator ordering and so identifying a Stratonovich equation with a time-ordered
equation is simply a statement of this fact. The heart of the quantum Wong-Zakai
theorem is relating this time-ordered exponential to a quantum Ito integral. As the
Wong-Zakai theorem considers the limit of matrix elements between two exponen-
tial vectors, this relation is made by comparing the matrix elements between this
expression and the matrix element of an iterated Ito integral.
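As a small illustration of why the ordering in Eq. (D.14) matters, the following sketch compares a time-ordered propagator with the naive exponential of the integrated Hamiltonian. The two-level, piecewise-constant Hamiltonian (Pauli matrices, with hbar set to 1) is a toy assumption chosen only so that both propagators have closed forms; it is not the field Hamiltonian of Eq. (D.9).

```python
import math

# Pauli matrices and the identity as nested lists (2x2 matrices).
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
ID = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_pauli(theta, n):
    """exp(-i * theta * n) for a 2x2 matrix n that squares to the identity."""
    return [[math.cos(theta) * ID[i][j] - 1j * math.sin(theta) * n[i][j]
             for j in range(2)] for i in range(2)]


# Toy Hamiltonian (hbar = 1): H(t) = SX for 0 <= t < 1, then SZ for 1 <= t < 2.
# Time ordering places the later propagator on the left.
u_ordered = matmul(exp_pauli(1.0, SZ), exp_pauli(1.0, SX))

# Naive exponential of the integrated Hamiltonian, exp(-i (SX + SZ)), which
# ignores the ordering. (SX + SZ) / sqrt(2) squares to the identity.
n_sum = [[(SX[i][j] + SZ[i][j]) / math.sqrt(2) for j in range(2)]
         for i in range(2)]
u_naive = exp_pauli(math.sqrt(2), n_sum)

max_diff = max(abs(u_ordered[i][j] - u_naive[i][j])
               for i in range(2) for j in range(2))
print(f"largest entrywise difference: {max_diff:.3f}")
```

Both matrices are unitary, yet they differ because SX and SZ do not commute; this is the finite-dimensional counterpart of the statement that the ordering in Eq. (D.14) is critical.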
Appendix B discusses the classical Ito integral and shows how an integral
$x_t = \int_0^t b_s\, dw_s$ for a time-adapted process $b_t$ leads to the Ito rules of
calculus and how it crucially depends upon the statistical independence of $b_t$ from
$dw_t$. Not surprisingly, the quantum Ito integral also relies on a similar independence
of the integrand from the differential. As Sec. 2.5.1 discussed, the continuous-time
tensor product decomposition allows for defining field operators
$a[\chi_{(t, t+dt]}\, e_j] = A^j_{t+dt} - A^j_t$, which commute with any operator adapted to
the time interval $[0, t]$. The quantum Ito integral with respect to $dA^j_t$ is defined as
\[
Y_t = \int_0^t X_s\, dA^j_s \equiv \lim_{n \to \infty} \sum_{i=1}^{n} X_{t_{i-1}} \left( A^j_{t_i} - A^j_{t_{i-1}} \right) \tag{D.15}
\]
and likewise an integral with respect to $dA^{k\dagger}_t$ is
\[
Z_t = \int_0^t Y_s\, dA^{k\dagger}_s \equiv \lim_{n \to \infty} \sum_{i=1}^{n} Y_{t_{i-1}} \left( A^{k\dagger}_{t_i} - A^{k\dagger}_{t_{i-1}} \right). \tag{D.16}
\]
Taking the matrix element of $Y_t$ between two exponential vectors results in
\[
\left\langle e[f] \left|\, \int_0^t X_s\, dA^j_s \,\right| e[h] \right\rangle
  = \int_0^t ds\, \langle e[f] |\, X_s\, | e[h] \rangle\, h_j(s) \tag{D.17}
\]
and equivalently
\[
\langle e[f] |\, Z_t\, | e[h] \rangle = \int_0^t ds\, f_k^{*}(s)\, \langle e[f] |\, Y_s\, | e[h] \rangle. \tag{D.18}
\]
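Eq. (D.18) is valid because $Y_s$ is time-adapted and therefore commutes with
$dA^{k\dagger}_s$. This commuting property carries over to the iterated integral, so by
substituting Eq. (D.17) for the matrix element of $Y_s$,
\[
\langle e[f] |\, Z_t\, | e[h] \rangle
  = \int_0^t ds \int_0^s ds'\, f_k^{*}(s)\, \langle e[f] |\, X_{s'}\, | e[h] \rangle\, h_j(s'). \tag{D.19}
\]
If we hypothesize the existence of an operator $a_j(t)$ defined by the eigenvalue relationship
\[
a_j(t)\, | e[f] \rangle = f_j(t)\, | e[f] \rangle \tag{D.20}
\]
we then have
\[
\langle e[f] |\, Z_t\, | e[h] \rangle
  \cong \int_0^t ds \int_0^s ds'\, \left\langle e[f] \left|\, a_k^\dagger(s)\, X_{s'}\, a_j(s') \,\right| e[h] \right\rangle. \tag{D.21}
\]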
This relation holds for any iterated Ito integral as long as the integrand is expressed in
normal order, with all of the creation operators on the left and all of the annihilation
operators on the right. But as we have
\[
\lim_{\lambda \to 0} a_j(\lambda, t)\, | e[f] \rangle
  = \lim_{\lambda \to 0} \int_0^\infty ds\, c_{jk}(\lambda, t - s)\, f_k(s)\, | e[f] \rangle
  = f_j(t)\, | e[f] \rangle, \tag{D.22}
\]
we will ultimately find an equivalence between the operator $a_j(t)$ and the limiting
form of $a_j(\lambda, t)$.
The proof of the quantum Wong-Zakai theorem follows the procedure of con-
verting the time-ordered exponential into normal order, showing that the matrix
elements converge to a finite quantity and then proving a correspondence with an
equivalent Ito form QSDE.
D.2.2 Gauge freedom in the Ito correction
The difference between a Stratonovich equation and an Ito equation is often called
the Ito correction term. As we have identified an Ito equation with the normally ordered
version of the iterated integral, the Ito correction term is intimately related to this
conversion. The conversion of any product of field operators into normal order is given by
Wick’s theorem [91]. It states that any product of creation and annihilation operators
can be written as the sum over the normal ordering of all possible contractions
between all pairs of operators. A contraction between the operators $a$ and $b$ is
defined as
\[
a^{\bullet} b^{\bullet} \equiv ab - {:}ab{:} \tag{D.23}
\]
where ${:}ab{:}$ is the normal ordering of the two operators. By Wick’s theorem we can
write the product
\[
abc = {:}abc{:} + {:}a^{\bullet} b^{\bullet} c{:} + {:}a^{\bullet} b\, c^{\bullet}{:} + {:}a\, b^{\bullet} c^{\bullet}{:}. \tag{D.24}
\]
For the boson operators considered here, the only nonzero contraction is
\[
a_i^{\bullet}(\lambda, t)\, a_j^{\dagger\bullet}(\lambda, s) = \left[ a_i(\lambda, t),\, a_j^\dagger(\lambda, s) \right] = c_{ij}(\lambda, t - s). \tag{D.25}
\]
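Eq. (D.24) can be checked numerically in the simplest setting. The sketch below assumes a single discrete mode with $[a, a^\dagger] = 1$ (so the contraction is just the number 1, not the smoothed $c_{ij}$ of this appendix) and a truncated Fock space of hypothetical dimension N. For the product $a\, a^\dagger a^\dagger$, two of the three contractions in Eq. (D.24) are nonzero, giving $a\, a^\dagger a^\dagger = a^\dagger a^\dagger a + 2 a^\dagger$.

```python
import math

N = 10  # Fock-space truncation (an artifact of the numerics, not of the theory)

# Matrix elements <m| a |n> = sqrt(n) * delta_{m, n-1} in the number basis.
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(N)] for m in range(N)]
ad = [[a[n][m] for n in range(N)] for m in range(N)]  # a^dagger (real transpose)


def matmul(x, y):
    """Product of two N x N matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


# Left-hand side: the unordered product a a^dagger a^dagger.
lhs = matmul(a, matmul(ad, ad))

# Right-hand side from Wick's theorem: the fully normal-ordered term
# a^dagger a^dagger a, plus two single contractions that each leave a^dagger.
normal = matmul(ad, matmul(ad, a))
rhs = [[normal[i][j] + 2.0 * ad[i][j] for j in range(N)] for i in range(N)]

# Compare only columns n <= N - 3, where the truncation has no effect.
max_err = max(abs(lhs[i][j] - rhs[i][j])
              for i in range(N) for j in range(N - 2))
print(f"largest deviation on the untruncated columns: {max_err:.2e}")
```

The last two columns are excluded because the truncated creation operator annihilates the highest retained number state, which has no counterpart in the infinite-dimensional theory.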
The heart of finding the equivalent Ito QSDE from the time-ordered exponential
is to first apply Wick’s theorem to each term in the time-ordered exponential, then
take the $\lambda \to 0$ limit, and finally re-sum the series. We will not reproduce this
result here; the details of the limit can be found in the following references.
In the absence of the scattering terms the proof is detailed in the book by Accardi
et al. [55]. The scattering terms were subsequently added by Gough [92]. However,
one important aspect of the limit must be discussed, as it affects the final limiting
QSDE and is rooted in the physical origin of $c_{ij}(\lambda, t - s)$.
In each term of the time-ordered exponential, the operators on the right are always
constrained to be at an earlier time than the operators on the left. Therefore, when
applying Wick’s theorem, the contraction $a_i^{\bullet}(\lambda, t)\, a_j^{\dagger\bullet}(\lambda, s)$
will always be constrained to have $t \ge s$. This constraint means that when
$\lambda \to 0$, only half of the $c_{ij}(\lambda, t - s) \to \delta(t - s)$ limit will
apply. It is often the case that when a causal constraint is applied to a delta function
limit, an additional complex term appears involving a Cauchy principal value. For each
$c_{ij}(\lambda, \tau)$ the extra complex term is called a gauge freedom and generates,
among other things, an effective level shift in the $E_{00}$ term.
As a concrete example, consider the second order term in the time-ordered expansion
\[
\left( \frac{-i}{\hbar} \right)^{2} \int_0^\tau dt \int_0^t ds\, H_{\rm int}(\lambda, t)\, H_{\rm int}(\lambda, s). \tag{D.26}
\]
The operator product $H_{\rm int}(\lambda, t)\, H_{\rm int}(\lambda, s)$ contains 16 terms
with at most 4 field operators (from the scattering terms) and, in the case of
$E_{00}(t)\, E_{00}(s)$, no field operators. One part of this expression is the integral
\[
-\sum_{ij} \int_0^\tau dt \int_0^t ds\, E_{0i}\, a_i(\lambda, t)\, E_{j0}\, a_j^\dagger(\lambda, s). \tag{D.27}
\]
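Applying Wick’s theorem means that
\begin{align}
-\sum_{ij} \int_0^\tau dt \int_0^t ds\, & E_{0i}\, a_i(\lambda, t)\, E_{j0}\, a_j^\dagger(\lambda, s) = \notag \\
  & -\sum_{ij} \int_0^\tau dt \int_0^t ds\, E_{0i}\, E_{j0}
    \left( {:}a_i(\lambda, t)\, a_j^\dagger(\lambda, s){:} + c_{ij}(\lambda, t - s) \right). \tag{D.28}
\end{align}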
This commutator term on the right-hand side will ultimately contribute to the Ito
correction term and generate the gauge shift, as long as it survives the $\lambda \to 0$
limit. Therefore the most basic contribution is made by the limit
\[
\lim_{\lambda \to 0} \int_0^\tau dt \int_0^t ds\, c_{ij}(\lambda, t - s). \tag{D.29}
\]
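By substituting Eq. (D.8) for $c_{ij}(\lambda, t - s)$ and dropping the term proportional
to $1/\omega_0$,
\[
\lim_{\lambda \to 0} \int_0^\tau dt \int_0^t ds\, c_{ij}(\lambda, t - s)
  = \delta_{ij} \lim_{\sigma \to 0} \int_0^\tau dt \int_0^t ds\, \varphi^{(\sigma)} \star \varphi^{(\sigma)}(t - s). \tag{D.30}
\]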
In Sec. 2.3.2 we used the example of $\varphi^{(\sigma)}$ as a real-valued Gaussian with
mean zero and standard deviation $\sigma$. In this case, it is easy to show that
\[
\varphi^{(\sigma)} \star \varphi^{(\sigma)}(t - s) = \frac{1}{\sigma}\, \varphi^{(1)} \star \varphi^{(1)}\big( (t - s)/\sigma \big) \tag{D.31}
\]
where $\varphi^{(1)}$ is a mean zero Gaussian with unit variance. In fact this is a general
property often used in distribution theory, where if $\int_{\mathbb{R}} dt\, \varphi^{(1)}(t) = 1$ then
\[
\lim_{\sigma \to 0} \frac{1}{\sigma}\, \varphi^{(1)}(t/\sigma) = \delta(t). \tag{D.32}
\]
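With this replacement Eq. (D.31) is also satisfied. Using this relation in the right-hand
side of Eq. (D.30),
\[
\lim_{\lambda \to 0} \int_0^\tau dt \int_0^t ds\, c_{ij}(\lambda, t - s)
  = \delta_{ij} \lim_{\sigma \to 0} \int_0^\tau dt \int_0^t ds\, \frac{1}{\sigma}\,
    \varphi^{(1)} \star \varphi^{(1)}\big( (t - s)/\sigma \big). \tag{D.33}
\]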
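This limit is easily evaluated by making the change of variables $\tilde{t} \equiv (t - s)/\sigma$,
\[
\delta_{ij} \lim_{\sigma \to 0} \int_0^\tau dt \int_0^{t/\sigma} d\tilde{t}\; \varphi^{(1)} \star \varphi^{(1)}(\tilde{t})
  = \delta_{ij}\, \tau \int_0^\infty d\tilde{t}\; \varphi^{(1)} \star \varphi^{(1)}(\tilde{t}). \tag{D.34}
\]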
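If $\varphi^{(1)}$ is a normalized real-valued distribution then
$\varphi^{(1)} \star \varphi^{(1)}(t) = \varphi^{(1)} \star \varphi^{(1)}(-t)$ and so
\begin{align}
\int_0^\infty dt\, \varphi^{(1)} \star \varphi^{(1)}(t)
  &= \frac{1}{2} \int_{-\infty}^{\infty} dt\, \varphi^{(1)} \star \varphi^{(1)}(t) \notag \\
  &= \frac{1}{2} \left( \int_{-\infty}^{\infty} dt\, \varphi^{(1)}(t) \right)^{2} \notag \\
  &= \frac{1}{2}. \tag{D.35}
\end{align}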
The change of variables in Eq. (D.34) is the delta correlation limit, but because
of the time-ordered integration we obtain the factor of $\tfrac{1}{2}$. However, in the
general case, $\varphi^{(1)}$ need not be real-valued. For our example involving
quasi-monochromatic fields, it is sufficient for $\varphi$ to be a real-valued function, as
it is simply a mathematical tool representing a limit on the rate of change of the
arbitrary complex functions $f \in L^2(\mathbb{R}_+) \otimes \mathbb{C}^d$.
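The factor of one half in Eqs. (D.33) to (D.35) can be checked numerically. The sketch below assumes the Gaussian smoother of the text, for which the autocorrelation is itself a Gaussian of variance $2\sigma^2$ and the inner integral over $s$ has a closed form in terms of the error function. The function name, grid size and sample widths are illustrative choices.

```python
import math


def time_ordered_integral(sigma, tau=1.0, n=200_000):
    """Trapezoid estimate of int_0^tau dt int_0^t ds c(t - s).

    c is the autocorrelation of a unit-area Gaussian with standard deviation
    sigma, i.e. a Gaussian of variance 2 * sigma**2, so the inner integral
    equals 0.5 * erf(t / (2 * sigma)).
    """
    dt = tau / n

    def inner(t):
        return 0.5 * math.erf(t / (2.0 * sigma))

    total = 0.5 * (inner(0.0) + inner(tau))
    total += sum(inner(k * dt) for k in range(1, n))
    return total * dt


values = {sigma: time_ordered_integral(sigma) for sigma in (1e-1, 1e-2, 1e-3)}
for sigma, value in values.items():
    print(f"sigma = {sigma:g}: integral = {value:.6f}")
```

For $\tau \gg \sigma$ the result approaches $\tau/2$ from below, with a deficit of approximately $\sigma/\sqrt{\pi}$, so only half of the delta-function weight survives the time ordering.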
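In the general Wong-Zakai limit $c_{ij}(\lambda, t)$ is not assumed to be real; it is only
assumed that the criteria of Eq. (D.5) are satisfied. From Eq. (D.5c) we have that
\[
\lim_{\lambda \to 0} \int_{-\infty}^{\infty} dt\, c_{ij}(\lambda, t) = \delta_{ij}, \tag{D.36}
\]
meaning that we can define the potentially complex constants
\begin{align}
\kappa_{ij} &\equiv \lim_{\lambda \to 0} \int_0^\infty dt\, c_{ji}(\lambda, t), \notag \\
\kappa^{*}_{ji} &= \lim_{\lambda \to 0} \int_0^\infty dt\, c^{*}_{ij}(\lambda, t)
  = \lim_{\lambda \to 0} \int_{-\infty}^{0} dt\, c_{ji}(\lambda, t), \tag{D.37}
\end{align}
where we used Eq. (D.5b) in the form $c^{*}_{ij}(\lambda, t) = c_{ji}(\lambda, -t)$.
Combining these two results means
\[
\delta_{ij} = \kappa_{ij} + \kappa^{*}_{ji}. \tag{D.38}
\]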
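In the case of a real-valued smoother $\kappa_{ij} = \tfrac{1}{2}\delta_{ij}$, but the
general case allows for a complex coefficient. In complex analysis there is a general
relation that
\[
\int_0^\infty dt\, e^{-i\omega t} = \pi\, \delta(\omega) - i\, \mathrm{P.V.}\left[ \frac{1}{\omega} \right] \tag{D.39}
\]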
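where P.V. denotes taking the Cauchy principal value. Expressing
$\varphi^{(1)} \star \varphi^{(1)}(t)$ in the frequency domain means that
\begin{align}
\kappa_{ij} &= \delta_{ij} \int_0^\infty dt \int_{-\infty}^{\infty} d\omega\, \left| \varphi^{(1)}(\omega) \right|^{2} e^{-i\omega t} \notag \\
  &= \delta_{ij} \left( \pi \int_{-\infty}^{\infty} d\omega\, \left| \varphi^{(1)}(\omega) \right|^{2} \delta(\omega)
     - i\, \mathrm{P.V.} \int_{-\infty}^{\infty} d\omega\, \frac{\left| \varphi^{(1)}(\omega) \right|^{2}}{\omega} \right) \notag \\
  &= \delta_{ij} \left( \pi \left| \varphi^{(1)}(0) \right|^{2}
     - i\, \mathrm{P.V.} \int_{-\infty}^{\infty} d\omega\, \frac{\left| \varphi^{(1)}(\omega) \right|^{2}}{\omega} \right). \tag{D.40}
\end{align}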
The requirement of Eq. (D.38) implies that $\pi \left| \varphi^{(1)}(0) \right|^{2} = \tfrac{1}{2}$.
The principal value will be zero for a symmetric power distribution (i.e. for real
$\varphi^{(1)}$) but it is nonzero in general. The remaining complex coefficient is what
Gough refers to as a gauge freedom, as it depends upon the nature of $\varphi^{(1)}$. Here
we will assume that $\varphi^{(1)}$ is real-valued and so
$\kappa_{ij} = \kappa^{*}_{ij} = \tfrac{1}{2}\delta_{ij}$.
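To illustrate how an asymmetric power spectrum produces a complex $\kappa$, the following sketch evaluates the half-line integral for a toy kernel. The kernel is an assumption made for illustration only: a Gaussian power spectrum displaced from the carrier by a hypothetical detuning `delta` and rescaled so that the full-line integral of $c$ is one, which in the time domain reads $c(t) = e^{\delta^2/2}\, G(t)\, e^{-i\delta t}$ with $G$ the standard normal density. It satisfies $c(t) = c^{*}(-t)$, as Eq. (D.5b) requires for a single mode.

```python
import cmath
import math


def kappa(delta, t_max=12.0, n=120_000):
    """Trapezoid estimate of int_0^infinity dt c(t) for the toy kernel
    c(t) = exp(delta**2 / 2) * G(t) * exp(-1j * delta * t),
    where G is the standard normal density and delta is a detuning.
    The upper limit t_max truncates a negligible Gaussian tail.
    """
    dt = t_max / n
    scale = math.exp(0.5 * delta * delta) / math.sqrt(2.0 * math.pi)

    def c(t):
        return scale * math.exp(-0.5 * t * t) * cmath.exp(-1j * delta * t)

    total = 0.5 * (c(0.0) + c(t_max))
    for k in range(1, n):
        total += c(k * dt)
    return total * dt


k_symmetric = kappa(0.0)   # symmetric spectrum: purely real
k_detuned = kappa(1.0)     # displaced spectrum: acquires an imaginary part
print("symmetric:", k_symmetric)
print("detuned:  ", k_detuned)
```

In both cases the real part stays at one half, consistent with Eq. (D.38), while the displaced spectrum yields a nonzero imaginary part corresponding to the principal-value term of Eq. (D.40).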
D.3 The Limiting Propagator
We just considered one part of the second order term in the time-ordered exponential
and how through normal ordering it develops a nonzero correction. As this is just
one part of the total time-ordered exponential, more complicated expressions are
generated, involving iterated integrals of the form
\[
\int_0^t dt_n \cdots \int_0^{t_2} dt_1\, c_{ij}(\lambda, t_n - t_{n-3}) \cdots c_{kl}(\lambda, t_5 - t_1). \tag{D.41}
\]
Depending upon the relative order of the times, these integrals may or may not
converge to zero as $\lambda \to 0$. It turns out that the only terms that are nonzero
have time-consecutive integrals, meaning that for each contraction we must have
$c_{ij}(\lambda, \tau_n)$ evaluated at $\tau_n = t_n - t_{n-1}$ [92]. In the above example,
$\tau_n = t_n - t_{n-3}$ and $\tau_5 = t_5 - t_1$ are not time-consecutive intervals and so
Eq. (D.41) converges to zero. Upon identifying the class of nonzero integrals, it is
possible to then re-sum the expansion, which in general generates a Neumann series [43].
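The simplest case of this rule can be worked out by hand. The sketch below drops the mode indices and assumes, as in the Gaussian example and not in general, the scaling form $c(\lambda, t) = \lambda^{-1} k(t/\lambda)$ with $\int_0^\infty dv\, v\, |k(v)|$ finite. It compares a third-order integral containing one contraction across non-consecutive times with one containing a contraction across consecutive times.

```latex
\begin{align*}
\text{non-consecutive:}\quad
\int_0^\tau dt_3 \int_0^{t_3} dt_2 \int_0^{t_2} dt_1\; c(\lambda, t_3 - t_1)
  &= \int_0^\tau dt_3 \int_0^{t_3} du\; u\, c(\lambda, u) \\
  &= \lambda \int_0^\tau dt_3 \int_0^{t_3/\lambda} dv\; v\, k(v)
  \;\xrightarrow[\lambda \to 0]{}\; 0, \\[4pt]
\text{consecutive:}\quad
\int_0^\tau dt_3 \int_0^{t_3} dt_2 \int_0^{t_2} dt_1\; c(\lambda, t_2 - t_1)
  &\;\xrightarrow[\lambda \to 0]{}\; \kappa \int_0^\tau dt_3 \int_0^{t_3} dt_2
  = \frac{\kappa\, \tau^{2}}{2}.
\end{align*}
```

In the first line the free intermediate time $t_2$ contributes a factor $u = t_3 - t_1$, which is of order $\lambda$ wherever $c$ has support, so the integral vanishes. In the second line the innermost integral tends to $\kappa \equiv \lim_{\lambda \to 0} \int_0^\infty dt\, c(\lambda, t)$ and a finite contribution survives.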
Once the $\lambda \to 0$ limit is taken, the normally ordered propagator can be identified
and its equivalent Ito form QSDE written down. A general QSDE for the unitary
propagator $U_t$ can be written as
\[
dU_t = \left( G_{ij}\, d\Lambda^{ij}_t + G_{i0}\, dA^{i\dagger}_t + G_{0j}\, dA^{j}_t + G_{00}\, dt \right) U_t. \tag{D.42}
\]
The coefficients $G_{\alpha\beta}$ are constrained to ensure unitarity. Appendix C.1 discusses
these constraints at length and shows how they can be re-expressed in terms of the
operators $S$, $L$ and $H$. Eq. (C.26) gives this conversion. We note here that $H$ is the
part of the unitary that is completely uncoupled from the field operators and so we will
have the correspondence $H = E_{00}$.
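In terms of the system operators $E_{\alpha\beta}$ ($\alpha, \beta = 0, 1, \dots, d$)
defining $H_{\rm int}(\lambda, t)$ in Eq. (2.145), the general limiting coefficients are
\[
G_{\alpha\beta} = -i E_{\alpha\beta} - E_{\alpha i} \left( \frac{1}{1 + iKE} \right)_{ij} \kappa_{jk}\, E_{k\beta}, \tag{D.43}
\]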
where the Latin indices $i$, $j$ and $k$ start from 1 and we introduced the following
notation. In the most general case, we have a $d \times d$ matrix of constants
$K = (\kappa_{ij})$, as well as a matrix of operators $E = (E_{ij})$. A Neumann series is
the operator-valued generalization of a geometric series, so that for an operator $T$,
\[
\sum_{n=0}^{\infty} T^{n} = (1 - T)^{-1}. \tag{D.44}
\]
The series converges whenever $\|T\| < 1$, and the right-hand side is well defined
whenever $1 - T$ is invertible. The time-consecutive contractions ultimately generate a
Neumann series where $T$ is the matrix of operators $-iKE = (-i \kappa_{ij} E_{jk})$. The
limiting coefficient then involves the $i, j$ component of the operator/matrix inverse
$1/(1 + iKE)$.
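The structure of Eqs. (D.43) and (D.44) can be illustrated with ordinary matrices. The sketch below replaces the operator-valued scattering block $E_{ij}$ by a hypothetical Hermitian $2 \times 2$ matrix of numbers (a stand-in, since in the text the entries are system operators), takes $\kappa_{ij} = \tfrac{1}{2}\delta_{ij}$, sums the Neumann series and compares it with the matrix inverse. It then forms the scattering block of Eq. (D.43) and checks unitarity of $S = 1 + G$, assuming the usual Hudson-Parthasarathy identification of the $d\Lambda$ coefficient with $S - 1$; Eq. (C.26) is not reproduced here.

```python
def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(s, a):
    return [[s * a[i][j] for j in range(2)] for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


IDENT = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Hypothetical number-valued stand-in for the scattering block (Hermitian, d = 2).
E = [[0.4 + 0j, 0.3 - 0.2j], [0.3 + 0.2j, -0.5 + 0j]]
K = scale(0.5, IDENT)            # kappa_ij = delta_ij / 2 (real-valued smoother)
KE = matmul(K, E)
T = scale(-1j, KE)               # T = -i K E

# Partial sum of the Neumann series sum_n T^n (the norm of T is below 1 here).
series, power = IDENT, IDENT
for _ in range(60):
    power = matmul(power, T)
    series = add(series, power)

closed_form = inverse(add(IDENT, scale(1j, KE)))   # (1 + i K E)^{-1}

# Scattering block of Eq. (D.43): G = -i E - E (1 + i K E)^{-1} K E.
G = add(scale(-1j, E), scale(-1, matmul(E, matmul(closed_form, KE))))
S = add(IDENT, G)                # assumed identification S = 1 + G
S_dag_S = matmul(dagger(S), S)
print("S^dagger S =", S_dag_S)
```

With this choice of $K$ the result reduces to the Cayley transform $S = (1 - iE/2)(1 + iE/2)^{-1}$, which is unitary for any Hermitian $E$.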
References
[1] E. M. Rasel, M. K. Oberthaler, H. Batelaan, J. Schmiedmayer, and A. Zeilinger,
Physical Review Letters 75, 2633 (1995).
[2] H. Häffner, W. Hänsel, C. F. Roos, J. Benhelm, D. Chek-al-kar, M. Chwalla,
T. Körber, U. D. Rapol, M. Riebe, P. O. Schmidt, et al., Nature 438, 643 (2005).
[3] D. Leibfried, E. Knill, S. Seidelin, J. Britton, R. B. Blakestad, J. Chiaverini,
D. B. Hume, W. M. Itano, J. D. Jost, C. Langer, et al., Nature 438, 639 (2005).
[4] W. M. Itano, J. C. Bergquist, J. J. Bollinger, J. M. Gilligan, D. J. Heinzen,
F. L. Moore, M. G. Raizen, and D. J. Wineland, Physical Review A 47, 3554
(1993).
[5] I. H. Deutsch and P. S. Jessen, Optics Communications 283, 681 (2010).
[6] A. P. VanDevender, Y. Colombe, J. Amini, D. Leibfried, and D. J. Wineland,
Physical Review Letters 105, 023001 (2010).
[7] E. W. Streed, B. G. Norton, A. Jechow, T. J. Weinhold, and D. Kielpinski,
Physical Review Letters 106, 010502 (2011).
[8] A. Kuzmich, L. Mandel, and N. P. Bigelow, Physical Review Letters 85, 1594
(2000).
[9] J. Hald, J. L. Sørensen, C. Schori, and E. S. Polzik, Physical Review Letters 83,
1319 (1999).
[10] M. H. Schleier-Smith, I. D. Leroux, and V. Vuletić, Physical Review Letters 104,
073604 (2010).
[11] A. Silberfarb, P. S. Jessen, and I. H. Deutsch, Physical Review Letters 95,
030402 (2005).
[12] G. A. Smith, A. Silberfarb, I. H. Deutsch, and P. S. Jessen, Physical Review
Letters 97, 180403 (2006).
[13] C. A. Riofrío, P. S. Jessen, and I. H. Deutsch, Journal of Physics B: Atomic,
Molecular and Optical Physics 44, 154007 (2011).
[14] N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Se-
ries: With Engineering Applications (The MIT Press, 1949).
[15] R. E. Kalman and R. S. Bucy, Journal of Basic Engineering 83, 95 (1961).
[16] R. L. Stratonovich, Theory of Probability & Its Applications 5, 156 (1960).
[17] H. J. Kushner, Journal of the Society for Industrial and Applied Mathematics
Series A Control 2, 106 (1964).
[18] G. Kallianpur and C. Striebel, The Annals of Mathematical Statistics 39, 785
(1968).
[19] M. Zakai, Probability Theory and Related Fields 11, 230 (1969).
[20] R. Van Handel, Doctoral dissertation, California Institute of Technology,
Pasadena, California (2006).
[21] H. Holden, B. Øksendal, J. Ubøe, and T. Zhang,
Stochastic Partial Differential Equations: A Modeling, White Noise Functional
Approach, Universitext (Springer New York, New York, NY, 2010).
[22] B. K. Øksendal, Stochastic Differential Equations: An Introduction with
Applications (Springer, 2002), 5th ed.
[23] R. L. Hudson and K. R. Parthasarathy, Communications in Mathematical
Physics 93, 301 (1984).
[24] V. P. Belavkin, Journal of Multivariate Analysis 42, 171 (1992).
[25] L. Bouten, R. van Handel, and M. R. James, SIAM Journal on Control and
Optimization 46, 2199 (2007).
[26] J. E. Gough, Philosophical Transactions of the Royal Society A: Mathematical,
Physical and Engineering Sciences 370, 5241 (2012).
[27] V. P. Belavkin, Radiotekhnika i Elektronika 25, 1445–1453 (1980).
[28] V. Belavkin, in XXIV Karpacz Winter School on Theoretical Physics, edited by
R. Gielerak and W. Karwowski (World Scientific, 1988), Stochastic methods in
mathematics and physics, pp. 310 – 324.
[29] V. P. Belavkin, Communications in Mathematical Physics 146, 611 (1992).
[30] P. Grangier, J. A. Levenson, and J.-P. Poizat, Nature 396, 537 (1998).
[31] D. Brigo, B. Hanzon, and F. LeGland, IEEE Transactions on Automatic Control
43, 247 (1998).
[32] D. Brigo, B. Hanzon, and F. LeGland, Bernoulli 5, 495 (1999).
[33] R. van Handel and H. Mabuchi, Journal of Optics B: Quantum and Semiclassical
Optics 7, S226 (2005).
[34] H. Mabuchi, Physical Review A 78, 015801 (2008).
[35] A. S. Hopkins, Thesis, California Institute of Technology (2009).
[36] A. E. B. Nielsen, A. S. Hopkins, and H. Mabuchi, New Journal of Physics 11,
105043 (2009).
[37] B. A. Chase, Ph.D. thesis, University of New Mexico (2009).
[38] C. Le Bris and P. Rouchon, arXiv:1207.4580 (2012).
[39] S. Massar and S. Popescu, Physical Review Letters 74, 1259 (1995).
[40] I. H. Deutsch, American Journal of Physics 59, 834 (1991).
[41] J. C. Garrison and R. Chiao, Quantum Optics (Oxford University Press, USA,
2008).
[42] B. J. Smith and M. G. Raymer, New Journal of Physics 9, 414 (2007).
[43] J. Gough, Journal of Mathematical Physics 47, 113509 (2006).
[44] C. Cohen-Tannoudji, J. Dupont-Roc, and G. Grynberg, Photons and Atoms:
Introduction to Quantum Electrodynamics (Wiley-Interscience, 1989).
[45] A. Barchielli, in Open Quantum Systems III, edited by S. Attal, A. Joye, and
C. Pillet (Springer, 2006), vol. 1882 of Lecture Notes in Mathematics, pp. 207–
292.
[46] I. Bialynicki-Birula, Physical Review Letters 80, 5247 (1998).
[47] A. E. Siegman, Lasers (University Science Books, 1986).
[48] E. Hecht, Optics (Addison-Wesley, 2002).
[49] J.-C. Diels and W. Rudolph, Ultrashort Laser Pulse Phenomena (Academic
Press, 2006).
[50] C. W. Gardiner and M. J. Collett, Physical Review A 31, 3761 (1985).
[51] C. W. Gardiner and P. Zoller, Quantum noise (Springer, 2004).
[52] K. R. Parthasarathy, An introduction to quantum stochastic calculus (Birkhäuser,
1992).
[53] R. van Handel, Stochastic calculus, filtering, and stochastic control (2007), URL
http://www.princeton.edu/~rvan/.
[54] L. Accardi, A. Frigerio, and Y. G. Lu, Communications in Mathematical Physics
131, 537 (1990).
[55] L. Accardi, Y. G. Lu, and I. Volovich, Quantum Theory and Its Stochastic Limit
(Springer-Verlag, 2002).
[56] R. van Handel, J. K. Stockton, and H. Mabuchi, Journal of Optics B: Quantum
and Semiclassical Optics 7, S179 (2005).
[57] L. Bouten, J. Stockton, G. Sarma, and H. Mabuchi, Physical Review A (Atomic,
Molecular, and Optical Physics) 75, 052111 (2007).
[58] Z. Schuss, Theory and Applications of Stochastic Processes: An Analytical Ap-
proach (Springer, 2009), 1st ed.
[59] T. Tao, An Introduction to Measure Theory (American Mathematical Society,
2011).
[60] D. Williams, Probability with Martingales (Cambridge University Press, 1991).
[61] R. van Handel, J. Stockton, and H. Mabuchi, Automatic Control, IEEE Trans-
actions on 50, 768 (2005).
[62] H. Maassen, in Quantum Information, Computation and Cryptography, edited
by F. Benatti, M. Fannes, R. Floreanini, and D. Petritis (Springer, 2010), vol.
808 of Lecture Notes in Physics, pp. 65–108.
[63] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Infor-
mation (Cambridge University Press, 2000).
[64] L. Bouten and R. van Handel (2005), arXiv:math-ph/0508006.
[65] J. Gough and C. Köstler, Communications in Stochastic Analysis 4, 505 (2010).
[66] J. E. Gough, M. R. James, and H. I. Nurdin, Quantum Information Processing
(2012).
[67] M. Tsang, Physical Review Letters 102, 250403 (2009).
[68] G. Kimura and A. Kossakowski, Open Systems and Information Dynamics 12,
207 (2005).
[69] P. E. Kloeden, E. Platen, and H. Schurz, Numerical Solution of SDE Through
Computer Experiments (Springer, 1994).
[70] M. Kitagawa and M. Ueda, Physical Review A 47, 5138 (1993).
[71] A. Kuzmich, N. P. Bigelow, and L. Mandel, Europhysics Letters (EPL) 42, 481
(1998).
[72] X. Yin, X. Wang, J. Ma, and X. Wang, Journal of Physics B: Atomic, Molecular
and Optical Physics 44, 015501 (2011).
[73] M. Varbanov and T. A. Brun, Physical Review A 76, 032104 (2007).
[74] E. Bagan, A. Monras, and R. Muñoz-Tapia, Physical Review A 71, 062318 (2005).
[75] S. T. Merkel, P. S. Jessen, and I. H. Deutsch, Physical Review A 78, 023404
(2008).
[76] R. Van Handel, Infinite Dimensional Analysis, Quantum Probability and Re-
lated Topics 12, 153 (2009).
[77] J. Hunter, Lecture notes on applied mathematics – methods and models (2009),
URL http://www.math.ucdavis.edu/~hunter/.
[78] D. Nualart, The Malliavin Calculus and Related Topics (Springer, 1995), 1st ed.
[79] B. A. Chase and J. M. Geremia, Physical Review A 78, 052101 (2008).
[80] L. Bouten and A. Silberfarb, Communications in Mathematical Physics 283,
491 (2008).
[81] M. Tsang and C. M. Caves, Physical Review X 2, 031016 (2012).
[82] J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton
University Press, 1996).
[83] K. R. Davidson and S. J. Szarek, in Handbook of the Geometry of Banach Spaces,
edited by W. Johnson and J. Lindenstrauss (Elsevier Science B.V., 2001), vol. 1,
pp. 317–366.
[84] Y. Ogata, arXiv:1111.5933 (2011).
[85] A. H. Jazwinski, Stochastic Processes and Filtering Theory (Dover Publications,
2007).
[86] S. Julier and J. Uhlmann, Proceedings of the IEEE 92, 401 (2004).
[87] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, IEEE Transactions
on Signal Processing 50, 174 (2002).
[88] S. Bonnabel, P. Martin, and P. Rouchon, IEEE Transactions on Automatic
Control 53, 2514 (2008).
[89] B. A. Chase and J. M. Geremia, Physical Review A 79, 022314 (2009).
[90] E. Wong and M. Zakai, The Annals of Mathematical Statistics 36, 1560 (1965).
[91] G. C. Wick, Physical Review 80, 268 (1950).
[92] J. Gough, Communications in Mathematical Physics 254, 489 (2005).