469
Financial Derivatives in Theory and Practice Revised Edition P. J. HUNT WestLB AG, London, UK J. E. KENNEDY University of Warwick, UK

Financial derivatives in theory and practice

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Financial derivatives in theory and practice

Financial Derivativesin Theory and PracticeRevised Edition

P. J. HUNTWestLB AG, London, UK

J. E. KENNEDYUniversity of Warwick, UK

Page 2: Financial derivatives in theory and practice
Page 3: Financial derivatives in theory and practice

Financial Derivativesin Theory and Practice

Page 4: Financial derivatives in theory and practice

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

Editors: David J. Balding, Peter Bloomfield, Noel A. C. Cressie,Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, Geert Molenberghs,Louise M. Ryan, David W. Scott, Adrian F. M. Smith, Jozef L. Teugels;Editors Emeriti: Vic Barnett, J. Stuart Hunter, David G. Kendall

A complete list of the titles in this series appears at the end of this volume.

Page 5: Financial derivatives in theory and practice

Financial Derivativesin Theory and PracticeRevised Edition

P. J. HUNTWestLB AG, London, UK

J. E. KENNEDYUniversity of Warwick, UK

Page 6: Financial derivatives in theory and practice

Copyright c© 2004 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester,West Sussex PO19 8SQ, England

Telephone (+44) 1243 779777

Email (for orders and customer service enquiries): [email protected] our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrievalsystem or transmitted in any form or by any means, electronic, mechanical, photocopying,recording, scanning or otherwise, except under the terms of the Copyright, Designs and PatentsAct 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of thePublisher. Requests to the Publisher should be addressed to the Permissions Department, JohnWiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England,or emailed to [email protected], or faxed to (+44) 1243 770620.

This publication is designed to provide accurate and authoritative information in regard to thesubject matter covered. It is sold on the understanding that the Publisher is not engaged inrendering professional services. If professional advice or other expert assistance is required, theservices of a competent professional should be sought.

Other Wiley Editorial Offices

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA

Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA

Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany

John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore129809

John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appearsin print may not be available in electronic books.

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

ISBN 0-470-86358-7 (Cloth)ISBN 0-470-86359-5 (Paper)

Typeset in 10/12pt Computer Modern by Laserwords Private Limited, Chennai, IndiaPrinted and bound in Great Britain by Antony Rowe Ltd, Chippenham, WiltshireThis book is printed on acid-free paper responsibly manufactured from sustainable forestry inwhich at least two trees are planted for each one used for paper production.

Page 7: Financial derivatives in theory and practice

To Patrick.Choose well.

Page 8: Financial derivatives in theory and practice
Page 9: Financial derivatives in theory and practice

ContentsPreface to revised edition xvPreface xviiAcknowledgements xxi

Part I: Theory 1

1 Single-Period Option Pricing 31.1 Option pricing in a nutshell 31.2 The simplest setting 41.3 General one-period economy 5

1.3.1 Pricing 61.3.2 Conditions for no arbitrage: existence of Z 71.3.3 Completeness: uniqueness of Z 91.3.4 Probabilistic formulation 121.3.5 Units and numeraires 15

1.4 A two-period example 15

2 Brownian Motion 192.1 Introduction 192.2 Definition and existence 202.3 Basic properties of Brownian motion 21

2.3.1 Limit of a random walk 212.3.2 Deterministic transformations of Brownian motion 232.3.3 Some basic sample path properties 24

2.4 Strong Markov property 262.4.1 Reflection principle 28

3 Martingales 313.1 Definition and basic properties 323.2 Classes of martingales 35

3.2.1 Martingales bounded in L1 353.2.2 Uniformly integrable martingales 363.2.3 Square-integrable martingales 39

3.3 Stopping times and the optional sampling theorem 413.3.1 Stopping times 413.3.2 Optional sampling theorem 45

3.4 Variation, quadratic variation and integration 493.4.1 Total variation and Stieltjes integration 493.4.2 Quadratic variation 513.4.3 Quadratic covariation 55

Page 10: Financial derivatives in theory and practice

viii Contents

3.5 Local martingales and semimartingales 563.5.1 The space cMloc 563.5.2 Semimartingales 59

3.6 Supermartingales and the Doob—Meyer decomposition 61

4 Stochastic Integration 634.1 Outline 634.2 Predictable processes 654.3 Stochastic integrals: the L2 theory 67

4.3.1 The simplest integral 684.3.2 The Hilbert space L2(M) 694.3.3 The L2 integral 704.3.4 Modes of convergence to H •M 72

4.4 Properties of the stochastic integral 744.5 Extensions via localization 77

4.5.1 Continuous local martingales as integrators 774.5.2 Semimartingales as integrators 784.5.3 The end of the road! 80

4.6 Stochastic calculus: Ito’s formula 814.6.1 Integration by parts and Ito’s formula 814.6.2 Differential notation 834.6.3 Multidimensional version of Ito’s formula 854.6.4 Levy’s theorem 88

5 Girsanov and Martingale Representation 915.1 Equivalent probability measures and the Radon—Nikodym derivative 91

5.1.1 Basic results and properties 915.1.2 Equivalent and locally equivalent measures on a filtered space 955.1.3 Novikov’s condition 97

5.2 Girsanov’s theorem 995.2.1 Girsanov’s theorem for continuous semimartingales 995.2.2 Girsanov’s theorem for Brownian motion 101

5.3 Martingale representation theorem 1055.3.1 The space I2(M) and its orthogonal complement 1065.3.2 Martingale measures and the martingale representation

theorem 1105.3.3 Extensions and the Brownian case 111

6 Stochastic Differential Equations 1156.1 Introduction 1156.2 Formal definition of an SDE 1166.3 An aside on the canonical set-up 1176.4 Weak and strong solutions 119

6.4.1 Weak solutions 119

Page 11: Financial derivatives in theory and practice

Contents ix

6.4.2 Strong solutions 1216.4.3 Tying together strong and weak 124

6.5 Establishing existence and uniqueness: Ito theory 1256.5.1 Picard—Lindelof iteration and ODEs 1266.5.2 A technical lemma 1276.5.3 Existence and uniqueness for Lipschitz coefficients 130

6.6 Strong Markov property 1346.7 Martingale representation revisited 139

7 Option Pricing in Continuous Time 1417.1 Asset price processes and trading strategies 142

7.1.1 A model for asset prices 1427.1.2 Self-financing trading strategies 144

7.2 Pricing European options 1467.2.1 Option value as a solution to a PDE 1477.2.2 Option pricing via an equivalent martingale measure 149

7.3 Continuous time theory 1517.3.1 Information within the economy 1527.3.2 Units, numeraires and martingale measures 1537.3.3 Arbitrage and admissible strategies 1587.3.4 Derivative pricing in an arbitrage-free economy 1637.3.5 Completeness 1647.3.6 Pricing kernels 173

7.4 Extensions 1767.4.1 General payout schedules 1767.4.2 Controlled derivative payouts 1787.4.3 More general asset price processes 1797.4.4 Infinite trading horizon 180

8 Dynamic Term Structure Models 1838.1 Introduction 1838.2 An economy of pure discount bonds 1838.3 Modelling the term structure 187

8.3.1 Pure discount bond models 1918.3.2 Pricing kernel approach 1918.3.3 Numeraire models 1928.3.4 Finite variation kernel models 1948.3.5 Absolutely continuous (FVK) models 1978.3.6 Short-rate models 1978.3.7 Heath—Jarrow—Morton models 2008.3.8 Flesaker—Hughston models 206

Page 12: Financial derivatives in theory and practice

x Contents

Part II: Practice 213

9 Modelling in Practice 2159.1 Introduction 2159.2 The real world is not a martingale measure 215

9.2.1 Modelling via infinitesimals 2169.2.2 Modelling via macro information 217

9.3 Product-based modelling 2189.3.1 A warning on dimension reduction 2199.3.2 Limit cap valuation 221

9.4 Local versus global calibration 223

10 Basic Instruments and Terminology 22710.1 Introduction 22710.2 Deposits 227

10.2.1 Accrual factors and LIBOR 22810.3 Forward rate agreements 22910.4 Interest rate swaps 23010.5 Zero coupon bonds 23210.6 Discount factors and valuation 233

10.6.1 Discount factors 23310.6.2 Deposit valuation 23310.6.3 FRA valuation 23410.6.4 Swap valuation 234

11 Pricing Standard Market Derivatives 23711.1 Introduction 23711.2 Forward rate agreements and swaps 23711.3 Caps and floors 238

11.3.1 Valuation 24011.3.2 Put—call parity 241

11.4 Vanilla swaptions 24211.5 Digital options 244

11.5.1 Digital caps and floors 24411.5.2 Digital swaptions 245

12 Futures Contracts 24712.1 Introduction 24712.2 Futures contract definition 247

12.2.1 Contract specification 24812.2.2 Market risk without credit risk 24912.2.3 Mathematical formulation 251

12.3 Characterizing the futures price process 25212.3.1 Discrete resettlement 252

Page 13: Financial derivatives in theory and practice

Contents xi

12.3.2 Continuous resettlement 25312.4 Recovering the futures price process 25512.5 Relationship between forwards and futures 256

Orientation: Pricing Exotic European Derivatives 259

13 Terminal Swap-Rate Models 26313.1 Introduction 26313.2 Terminal time modelling 263

13.2.1 Model requirements 26313.2.2 Terminal swap-rate models 265

13.3 Example terminal swap-rate models 26613.3.1 The exponential swap-rate model 26613.3.2 The geometric swap-rate model 26713.3.3 The linear swap-rate model 268

13.4 Arbitrage-free property of terminal swap-rate models 26913.4.1 Existence of calibrating parameters 27013.4.2 Extension of model to [0,∞) 27113.4.3 Arbitrage and the linear swap-rate model 273

13.5 Zero coupon swaptions 273

14 Convexity Corrections 27714.1 Introduction 27714.2 Valuation of ‘convexity-related’ products 278

14.2.1 Affine decomposition of convexity products 27814.2.2 Convexity corrections using the linear swap-rate model 280

14.3 Examples and extensions 28214.3.1 Constant maturity swaps 28314.3.2 Options on constant maturity swaps 28414.3.3 LIBOR-in-arrears swaps 285

15 Implied Interest Rate Pricing Models 28715.1 Introduction 28715.2 Implying the functional form DTS 28815.3 Numerical implementation 29215.4 Irregular swaptions 29315.5 Numerical comparison of exponential and implied swap-rate models 299

16 Multi-Currency Terminal Swap-Rate Models 30316.1 Introduction 30316.2 Model construction 304

16.2.1 Log-normal case 30516.2.2 General case: volatility smiles 307

16.3 Examples 308

Page 14: Financial derivatives in theory and practice

xii Contents

16.3.1 Spread options 30816.3.2 Cross-currency swaptions 311

Orientation: Pricing Exotic American and Path-DependentDerivatives 315

17 Short-Rate Models 31917.1 Introduction 31917.2 Well-known short-rate models 320

17.2.1 Vasicek—Hull—White model 32017.2.2 Log-normal short-rate models 32217.2.3 Cox—Ingersoll—Ross model 32317.2.4 Multidimensional short-rate models 324

17.3 Parameter fitting within the Vasicek—Hull—White model 32517.3.1 Derivation of φ, ψ and B·T 32617.3.2 Derivation of ξ, ζ and η 32717.3.3 Derivation of µ, λ and A·T 328

17.4 Bermudan swaptions via Vasicek—Hull—White 32917.4.1 Model calibration 33017.4.2 Specifying the ‘tree’ 33017.4.3 Valuation through the tree 33217.4.4 Evaluation of expected future value 33217.4.5 Error analysis 334

18 Market Models 33718.1 Introduction 33718.2 LIBOR market models 338

18.2.1 Determining the drift 33918.2.2 Existence of a consistent arbitrage-free term structure model 34118.2.3 Example application 343

18.3 Regular swap-market models 34318.3.1 Determining the drift 34418.3.2 Existence of a consistent arbitrage-free term structure model 34618.3.3 Example application 346

18.4 Reverse swap-market models 34718.4.1 Determining the drift 34818.4.2 Existence of a consistent arbitrage-free term structure model 34918.4.3 Example application 350

19 Markov-Functional Modelling 35119.1 Introduction 35119.2 Markov-functional models 35119.3 Fitting a one-dimensional Markov-functional model to swaption

prices 354

Page 15: Financial derivatives in theory and practice

Contents xiii

19.3.1 Deriving the numeraire on a grid 35519.3.2 Existence of a consistent arbitrage-free term structure model 358

19.4 Example models 35919.4.1 LIBOR model 35919.4.2 Swap model 361

19.5 Multidimensional Markov-functional models 36319.5.1 Log-normally driven Markov-functional models 364

19.6 Relationship to market models 36519.7 Mean reversion, forward volatilities and correlation 367

19.7.1 Mean reversion and correlation 36719.7.2 Mean reversion and forward volatilities 36819.7.3 Mean reversion within the Markov-functional LIBOR model 369

19.8 Some numerical results 370

20 Exercises and Solutions 373

Appendix 1 The Usual Conditions 417

Appendix 2 L2 Spaces 419

Appendix 3 Gaussian Calculations 421

References 423

Index 427

Page 16: Financial derivatives in theory and practice
Page 17: Financial derivatives in theory and practice

Preface to revised edition

Since this book first appeared in 2000, it has been adopted by a number ofuniversities as a standard text for a graduate course in finance. As a result wehave produced this revised edition. The only differences in content betweenthis text and its predecessor are the inclusion of an additional chapter ofexercises with solutions, and the corrections of a number of errors.

Many of the exercise are variants of ones given to students of the M.Sc.in Mathematical Finance at the University of Warwick, so they have beentried and tested. Many provide drill in routine calculations in the interest-ratesetting.

Since the book first appeared there has been further development in themodelling of interest rate derivatives. The modelling and approximation ofmarket models has progressed further, as has that of Markov-functionalmodels that are now used, in their multi-factor form, in a number of banks. Anotable exclusion from this revised edition is any coverage of these advances.Those interested in this area can find some of this in Rebonato (2002) andBennett and Kennedy (2004).

A few extra acknowledgements are due in this revised edition: to NoelVaillant and Jørgen Aase Nielsen for pointing out a number of errors, andto Stuart Price for typing the exercise chapter and solutions.

Page 18: Financial derivatives in theory and practice
Page 19: Financial derivatives in theory and practice

Preface

The growth in the financial derivatives market over the last thirty years hasbeen quite extraordinary. From virtually nothing in 1973, when Black, Mertonand Scholes did their seminal work in the area, the total outstanding notionalvalue of derivatives contracts today has grown to several trillion dollars. Thisphenomenal growth can be attributed to two factors.The first, and most important, is the natural need that the products fulfil.

Any organization or individual with sizeable assets is exposed to moves in theworld markets. Manufacturers are susceptible to moves in commodity prices;multinationals are exposed to moves in exchange rates; pension funds areexposed to high inflation rates and low interest rates. Financial derivativesare products which allow all these entities to reduce their exposure to marketmoves which are beyond their control.The second factor is the parallel development of the financial mathematics

needed for banks to be able to price and hedge the products demanded bytheir customers. The breakthrough idea of Black, Merton and Scholes, thatof pricing by arbitrage and replication arguments, was the start. But it isonly because of work in the field of pure probability in the previous twentyyears that the theory was able to advance so rapidly. Stochastic calculus andmartingale theory were the perfect mathematical tools for the development offinancial derivatives, and models based on Brownian motion turned out to behighly tractable and usable in practice.Where this leaves us today is with a massive industry that is highly

dependent on mathematics and mathematicians. These mathematicians needto be familiar with the underlying theory of mathematical finance andthey need to know how to apply that theory in practice. The need for atext that addressed both these needs was the original motivation for thisbook. It is aimed at both the mathematical practitioner and the academicmathematician with an interest in the real-world problems associated withfinancial derivatives. That is to say, we have written the book that we wouldboth like to have read when we first started to work in the area.This book is divided into two distinct parts which, apart from the need to

Page 20: Financial derivatives in theory and practice

xviii Preface

cross-reference for notation, can be read independently. Part I is devoted to thetheory of mathematical finance. It is not exhaustive, notable omissions being atreatment of asset price processes which can exhibit jumps, equilibrium theoryand the theory of optimal control (which underpins the pricing of Americanoptions). What we have included is the basic theory for continuous asset priceprocesses with a particular emphasis on the martingale approach to arbitragepricing.

The primary development of the finance theory is carried out in Chapters1 and 7. The reader who is not already familiar with the subject but who hasa solid grounding in stochastic calculus could learn most of the theory fromthese two chapters alone. The fundamental ideas are laid out in Chapter 1, inthe simple setting of a single-period economy with only finitely many states.The full (and more technical) continuous time theory, is developed in Chapter7. The treatment here is slightly non-standard in the emphasis it places onnumeraires (which have recently become extremely important as a modellingtool) and the (consequential) development of the L1 theory rather than themore common L2 version. We also choose to work with the filtration generatedby the assets in the economy rather than the more usual Brownian filtration.This approach is certainly more natural but, surprisingly, did not appear inthe literature until the work of Babbs and Selby (1998).

An understanding of the continuous theory of Chapter 7 requires aknowledge of stochastic calculus, and we present the necessary backgroundmaterial in Chapters 2–6. We have gathered together in one place the resultsfrom this area which are relevant to financial mathematics. We have tried togive a full and yet readable account, and hope that these chapters will standalone as an accessible introduction to this technical area. Our presentationhas been very much influenced by the work of David Williams, who was adriving force in Cambridge when we were students. We also found the booksof Chung and Williams (1990), Durrett (1996), Karatzas and Shreve (1991),Protter (1990), Revuz and Yor (1991) and Rogers and Williams (1987) to bevery illuminating, and the informed reader will recognize their influence.

The reader who has read and absorbed the first seven chapters of the bookwill be well placed to read the financial literature. The last part of theory wepresent, in Chapter 8, is more specialized, to the field of interest rate models.The exposition here is partly novel but borrows heavily from the papers byBaxter (1997) and Jin and Glasserman (1997). We describe several differentways present in the literature for specifying an interest rate model and showhow they relate to one another from a mathematical perspective. This includesa discussion of the celebrated work of Heath, Jarrow and Morton (1992) (andthe less celebrated independent work of Babbs (1990)) which, for the firsttime, defined interest rate models directly in terms of forward rates.

Part II is very much about the practical side of building models for pricingderivatives. It, too, is far from exhaustive, covering only topics which directlyreflect our experiences through our involvement in product development

Page 21: Financial derivatives in theory and practice

Preface xix

within the London interest rate derivative market. What we have tried todo in this part of the book, through the particular problems and productsthat we discuss, is to alert the reader to the issues involved in derivativepricing in practice and to give him a framework within which to make his ownjudgements and a platform from which to develop further models.

Chapter 9 sets the scene for the remainder of the book by identifying somebasic issues a practitioner should be aware of when choosing and applyingmodels. Chapters 10 and 11 then introduce the reader to the basic instrumentsand terminology and to the pricing of standard vanilla instruments usingswaption measure. This pricing approach comes from fairly recent papers inthe area which focus on the use of various assets as numeraires when definingmodels and doing calculations. This is actually an old idea dating back at leastto Harrison and Pliska (1981) and which, starting with the work of Gemanet al. (1995), has come to the fore over the past decade. Chapter 12 is onfutures contracts. The treatment here draws largely on the work of Duffieand Stanton (1992), though we have attempted a cleaner presentation of thisstandard topic than those we have encountered elsewhere.

The remainder of Part II tackles pricing problems of increasing levels ofcomplexity, beginning with single-currency European products and finishingwith various Bermudan callable products. Chapters 13–16 are devoted toEuropean derivatives. Chapters 13 and 15 present a new approach to thismuch-studied problem. These products are theoretically straightforward butthe challenge for the practitioner is to ensure the model he employs in practiceis well calibrated to market-implied distributions and is easy to implement.Chapter 14 discusses convexity corrections and the pricing of ‘convexity-related’ products using the ideas of earlier chapters. These are importantproducts which have previously been studied by, amongst others, Coleman(1995) and Doust (1995). We provide explicit formulae for some commonlymet products and then, in Chapter 16, generalize these to multi-currencyproducts.

The last three chapters focus on the pricing of path-dependent and Amer-ican derivatives. Short-rate models have traditionally been those favouredfor this task because of their ease of implementation and because they arerelatively easy to understand. There is a vast literature on these models,and Chapter 17 provides merely a brief introduction. The Vasicek-Hull-Whitemodel is singled out for more detailed discussion and an algorithm for itsimplementation is given which is based on a semi-analytic approach (takenfrom Gandhi and Hunt (1997)).

More recently attention has turned to the so-called market models,pioneered by Brace et al. (1997) and Miltersen et al. (1997), and extendedby Jamshidian (1997). These models provided a breakthrough in tackling theissue of model calibration and, in the few years since they first appeared, a vastliterature has developed around them. They, or variants of and approximationsto them, are now starting to replace previous (short-rate) models within most

Page 22: Financial derivatives in theory and practice

xx Preface

major investment banks. Chapter 18 provides a basic description of both theLIBOR- and swap-based market models. We have not attempted to surveythe literature in this extremely important area and for this we refer the readerinstead to the article of Rutkowski (1999).

The book concludes, in Chapter 19, with a description of some of our ownrecent work (jointly with Antoon Pelsser), work which builds on Hunt andKennedy (1998). We describe a class of models which can fit the observedprices of liquid instruments in a similar fashion to market models but whichhave the advantage that they can be efficiently implemented. These models,which we call Markov-functional models, are especially useful for the pricingof products with multi-callable exercise features, such as Bermudan swaptionsor Bermudan callable constant maturity swaps. The exposition here is similarto the original working paper which was first circulated in late 1997 andappeared on the Social Sciences Research Network in January 1998. A precisof the main ideas appeared in RISK Magazine in March 1998. The final paper,Hunt, Kennedy and Pelsser (2000), is to appear soon. Similar ideas have morerecently been presented by Balland and Hughston (2000).

We hope you enjoy reading the finished book. If you learn from this bookeven one-tenth of what we learnt from writing it, then we will have succeededin our objectives.

Phil HuntJoanne Kennedy

31 December 1999

Page 23: Financial derivatives in theory and practice

Acknowledgements

There is a long list of people to whom we are indebted and who, eitherdirectly or indirectly, have contributed to making this book what it is.Some have shaped our thoughts and ideas; others have provided support andencouragement; others have commented on drafts of the text. We shall notattempt to produce this list in its entirety but our sincere thanks go to all ofyou.Our understanding of financial mathematics and our own ideas in this area

owe much to two practitioners, Sunil Gandhi and Antoon Pelsser. PH had thegood fortune to work with Sunil when they were both learning the subject atNatWest Markets, and with Antoon at ABN AMRO Bank. Both have had amarked impact on this book.The area of statistics has much to teach about the art of modelling. Brian

Ripley is a master of this art who showed us how to look at the subject throughnew eyes. Warmest and sincere thanks are also due to another colleague fromJK’s Oxford days, Peter Clifford, for the role he played when this book wastaking shape.We both learnt our probability while doing PhDs at the University of

Cambridge. We were there at a particularly exciting time for probability,both pure and applied, largely due to the combined influences of Frank Kellyand David Williams. Their contributions to the field are already well known.Our thanks for teaching us some of what you know, and for teaching us howto use it.Finally, our thanks go to our families, who provided us with the support,

guidance and encouragement that allowed us to pursue our chosen careers. ToShirley Collins, Harry Collins and Elizabeth Kennedy: thank you for beingthere when it mattered. But most of all, to our parents: for the opportunitiesgiven to us at sacrifice to yourselves.

Page 24: Financial derivatives in theory and practice
Page 25: Financial derivatives in theory and practice

Part I

Theory

Page 26: Financial derivatives in theory and practice
Page 27: Financial derivatives in theory and practice

1

Single-Period Option Pricing

1.1 OPTION PRICING IN A NUTSHELL

To introduce the main ideas of option pricing we first develop this theoryin the case when asset prices can take on only a finite number of valuesand when there is only one time-step. The continuous time theory which weintroduce in Chapter 7 is little more than a generalization of the basic ideasintroduced here.

The two key concepts in option pricing are replication and arbitrage. Optionpricing theory centres around the idea of replication. To price any derivativewe must find a portfolio of assets in the economy, or more generally a tradingstrategy, which is guaranteed to pay out in all circumstances an amountidentical to the payout of the derivative product. If we can do this we haveexactly replicated the derivative.

This idea will be developed in more detail later. It is often obscured inpractice when calculations are performed ‘to a formula’ rather than from firstprinciples. It is important, however, to ensure that the derivative being pricedcan be reproduced by trading in other assets in the economy.

An arbitrage is a trading strategy which generates profits from nothingwith no risk involved. Any economy we postulate must not allow arbitrageopportunities to exist. This seems natural enough but it is essential for ourpurposes, and Section 1.3.2 is devoted to establishing necessary and sufficientconditions for this to hold.

Given the absence of arbitrage in the economy it follows immediately thatthe value of a derivative is the value of a portfolio that replicates it. Tosee this, suppose to the contrary that the derivative costs more than thereplicating portfolio (the converse can be treated similarly). Then we can sellthe derivative, use the proceeds to buy the replicating portfolio and still beleft with some free cash to do with as we wish. Then all we need do is usethe income from the portfolio to meet our obligations under the derivativecontract. This is an arbitrage opportunity – and we know they do not exist.

All that follows in this book is built on these simple ideas!

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 28: Financial derivatives in theory and practice

4 Single-period option pricing

1.2 THE SIMPLEST SETTING

Throughout this section we consider the following simple situation. There

are two assets, A(1) and A(2), with prices at time zero A(1)0 and A

(2)0 . At time

1 the economy is in one of two possible states which we denote by ω1 and

ω2. We denote by A(i)1 (ωj) the price of asset i at time 1 if the economy is

in state ωj . Figure 1.1 shows figuratively the possibilities. We shall denote a

portfolio of assets by φ = (φ(1),φ(2)) where φ(i), the holding of asset i, couldbe negative.

A0

A1( 1)ω1ω

2ωA2( 2)ω

t = 1t = 0

Figure 1.1 Possible states of the economy

We wish to price a derivative (of A(1) and A(2)) which pays at time 1 theamount X1(ωj) if the economy is in state ωj . We will do this by constructinga replicating portfolio. Once again, this is the only way that a derivative canbe valued, although, as we shall see later, explicit knowledge of the replicatingportfolio is not always necessary.The first step is to check that the economy we have constructed is arbitrage-

free, meaning we cannot find a way to generate money from nothing. Moreprecisely, we must show that there is no portfolio φ = (φ(1),φ(2)) with one ofthe following (equivalent) conditions holding:

(i) φ(1)A(1)0 + φ(2)A

(2)0 < 0, φ(1)A

(1)1 (ωj) + φ(2)A

(2)1 (ωj) ≥ 0, j = 1, 2,

(ii) φ(1)A(1)0 + φ(2)A

(2)0 ≤ 0, φ(1)A

(1)1 (ωj) + φ(2)A

(2)1 (ωj) ≥ 0, j = 1, 2,

where the second inequality is strict for some j.

Suppose there exist φ(1) and φ(2) such that

φ(1)A(1)1 (ωj) + φ(2)A

(2)1 (ωj) = X1(ωj), j = 1, 2. (1.1)

We say that φ is a replicating portfolio for X . If we hold this portfolio at timezero, at time 1 the value of the portfolio will be exactly the value of X , nomatter which of the two states the economy is in. It therefore follows that a

Page 29: Financial derivatives in theory and practice

General one-period economy 5

fair price for the derivative X (of A(1) and A(2)) is precisely the cost of thisportfolio, namely

X0 = φ(1)A(1)0 + φ(2)A

(2)0 .

Subtleties

The above approach is exactly the one we use in the more general situationslater. There are, however, three potential problems:

(i) Equation (1.1) may have no solution.(ii) Equation (1.1) may have (infinitely) many solutions yielding the same

values for X0.(iii) Equation (1.1) may have (infinitely) many solutions yielding different

values of X0.

Each of these has its own consequences for the problem of derivative valuation.

(i) If (1.1) has no solution, we say the economy is incomplete, meaning thatit is possible to introduce further assets into the economy which are notredundant and cannot be replicated. Any such asset is not a derivativeof existing assets and cannot be priced by these methods.

(ii) If (1.1) has many solutions all yielding the same value X0 then thereexists a portfolio ψ = 0 such that

ψ ·A0 = 0, ψ · A1(ωj) = 0, j = 1, 2.

This is not a problem and means our original assets are not allindependent — one of these is a derivative of the others, so any furtherderivative can be replicated in many ways.

(iii) If (1.1) has many solutions yielding different values for X0 we have aproblem. There exist portfolios φ and ψ such that

(φ− ψ) ·A0 < 0(φ− ψ) ·A1(ωj) = X1(ωj)−X1(ωj) = 0, j = 1, 2.

This is an arbitrage, a portfolio with strictly negative value at time zeroand zero value at time 1.

In this final case our initial economy was poorly defined. Such situations canoccur in practice but are not sustainable. From a derivative pricing viewpointsuch situations must be excluded from our model. In the presence of arbitragethere is no unique fair price for a derivative, and in the absence of arbitragethe derivative value is given by the initial value of any replicating portfolio.

1.3 GENERAL ONE-PERIOD ECONOMY

We now develop in more detail all the concepts and ideas raised in the previoussection. We restrict attention once again to a single-period economy for clarity,

Page 30: Financial derivatives in theory and practice

6 Single-period option pricing

but introduce many assets and many states so that the essential structure andtechniques used in the continuous time setting can emerge and be discussed.We now consider an economy E comprising n assets with m possible states

at time 1. Let Ω be the set of all possible states. We denote, as before, the

individual states by ωj, j = 1, 2, . . . ,m, and the asset prices by A(i)0 and

A(i)1 (ωj). We begin with some definitions.

Definition 1.1 The economy E admits arbitrage if there exists a portfolioφ such that one of the following conditions (which are actually equivalent inthis discrete setting) holds:

(i) φ ·A0 < 0 and φ · A1(ωj) ≥ 0 for all j,(ii) φ ·A0 ≤ 0 and φ · A1(ωj) ≥ 0 for all j, with strict inequality for some j.If there is no such φ then the economy is said to be arbitrage-free.

Definition 1.2 A derivative X is said to be attainable if there exists someφ such that

X1(ωj) = φ · A1(ωj) for all j.

We have seen that an arbitrage-free economy is essential for derivativepricing theory. We will later derive conditions to check whether a giveneconomy is indeed arbitrage-free. Here we will quickly move on to show, in theabsence of arbitrage, how products can be priced. First we need one furtherdefinition.

Definition 1.3 A pricing kernel Z is any strictly positive vector with theproperty that

A0 =j

ZjA1(ωj). (1.2)

The reason for this name is the role Z plays in derivative pricing assummarized by Theorem 1.4. It also plays an important role in determiningwhether or not an economy admits arbitrage, as described in Theorem 1.7.

1.3.1 Pricing

One way to price a derivative in an arbitrage-free economy is to construct areplicating portfolio and calculate its value. This is summarized in the firstpart of the following theorem. The second part of the theorem enables us toprice a derivative without explicitly constructing a replicating portfolio, aslong as we know one exists.

Theorem 1.4 Suppose that the economy E is arbitrage-free and let X bean attainable contingent claim, i.e. a derivative which can be replicated withother assets. Then the fair value of X is given by

X0 = φ ·A0 (1.3)

Page 31: Financial derivatives in theory and practice

General one-period economy 7

where φ solvesX1(ωj) = φ · A1(ωj) for all j.

Furthermore, if Z is some pricing kernel for the economy then X0 can also berepresented as

X0 =j

ZjX1(ωj).

Proof: The first part of the result follows by the arbitrage arguments usedpreviously. Moving on to the second statement, substituting the definingequation (1.2) for Z into (1.3) yields

X0 = φ ·A0 =j

Zj φ ·A1(ωj) =j

ZjX1(ωj).

Remark 1.5: Theorem 1.4 shows how the pricing of a derivative can bereduced to calculating a pricing kernel Z. This is of limited value as it standssince we still need to show that the economy is arbitrage-free and that thederivative in question is attainable. This latter step can be done by eitherexplicitly constructing the replicating portfolio or proving beforehand thatall derivatives (or at least all within some suitable class) are attainable. If allderivatives are attainable then the economy is said to be complete. As we shallshortly see, both the problems of establishing no arbitrage and completenessfor an economy are in themselves intimately related to the pricing kernel Z.

Remark 1.6: Theorem 1.4 is essentially a geometric result. It merely statesthat if X0 and X1(ωj) are the projections of A0 and A1(ωj) in some directionφ, and if A0 is an affine combination of the vectors A1(ωj), then X0 is thesame affine combination of the vectors X1(ωj). Figure 1.2 illustrates this forthe case n = 2, a two-asset economy.

1.3.2 Conditions for no arbitrage: existence of Z

The following result gives necessary and sufficient conditions for an economyto be arbitrage-free. The result is essentially a geometric one. However, itsreal importance as a tool for calculation becomes clear in the continuous timesetting of Chapter 7 where we work in the language of the probabilist. Wewill see the result rephrased probabilistically later in this chapter.

Theorem 1.7 The economy E is arbitrage-free if and only if there exists apricing kernel, i.e. a strictly positive Z such that

A0 =j

ZjA1(ωj).

Page 32: Financial derivatives in theory and practice

8 Single-period option pricing

A0

X0

A1( 2)ω

φ

X1( 2)ω

X1( 1)ω

A1( 1)ω

Figure 1.2 Geometry of option pricing

Proof: Suppose such a Z exists. Then for any portfolio φ,

φ ·A0 = φ ·j

ZjA1(ωj) =j

Zj φ · A1(ωj) . (1.4)

If φ is an arbitrage portfolio then the left-hand side of (1.4) is non-positive,and the right-hand side is non-negative. Hence (1.4) is identically zero. Thiscontradicts φ being an arbitrage portfolio and so E is arbitrage-free.Conversely, suppose no such Z exists. We will in this case construct an

arbitrage portfolio. Let C be the convex cone constructed from A1(·),

C = a : a =j

ZjA1(ωj), Z 0 .

The set C is a non-empty convex set not containing A0. Here Z 0 meansthat all components of Z are greater than zero. Hence, by the separatinghyperplane theorem there exists a hyperplane H = x : φ · x = β thatseparates A0 and C,

φ ·A0 ≤ β ≤ φ · a for all a ∈ C.

Page 33: Financial derivatives in theory and practice

General one-period economy 9

The vector φ represents an arbitrage portfolio, as we now demonstrate.First, observe that if a ∈ C (the closure of C) then µa ∈ C, µ ≥ 0, and

β ≤ µ(φ · a).

Taking µ = 0 yields β ≤ 0, φ · A0 ≤ 0. Letting µ ↑ ∞ shows that φ · a ≥ 0for all a ∈ C, in particular φ ·A1(ωj) ≥ 0 for all j. So we have that

φ · A0 ≤ 0 ≤ φ · A1(ωj) for all j, (1.5)

and it only remains to show that (1.5) is not always identically zero. But inthis case φ ·A0 = φ · C = 0 which violates the separating property for H .Remark 1.8: Theorem 1.7 states that A0 must be in the interior of the convexcone created by the vectors A1(ωj) for there to be no arbitrage. If this is notthe case an arbitrage portfolio exists, as in Figure 1.3.

A1( 1)ω

A1( 2)ω

A0

A 1( 2)ω

A 1( 1)ω

Arbitrage region

A0

Figure 1.3 An arbitrage portfolio

1.3.3 Completeness: uniqueness of Z

If we are in a complete market all contingent claims can be replicated. As wehave seen in Theorem 1.4, this enables us to price derivatives without having

Page 34: Financial derivatives in theory and practice

10 Single-period option pricing

to explicitly calculate the replicating portfolio. This can be useful, particularlyfor the continuous time models introduced later. In our finite economy it isclear what conditions are required for completeness. There must be at leastas many assets as states of the economy, i.e. m ≤ n, and these must span anappropriate space.

Theorem 1.9 The economy E is complete if and only if there exists a(generalized) left inverse A−1 for the matrix A where

Aij = A(i)1 (ωj).

Equivalently, E is complete if and only if there exists no non-zero solution tothe equation Ax = 0.

Proof: For the economy to be complete, given any contingent claim X , wemust be able to solve

X1(ωj) = φ · A1(ωj) for all j,

which can be written as

X1 = ATφ. (1.6)

The existence of a solution to (1.6) for all X1 is exactly the statement thatAT has a right inverse (some matrix B such that ATB : Rm → Rm is theidentity matrix) and the replicating portfolio is then given by

φ = (AT )−1X1.

This is equivalent to the statement that A has full rank m and there being nonon-zero solution to Ax = 0.

Remark 1.10: If there are more assets than states of the economy the hedgeportfolio will not in general be uniquely specified. In this case one or more ofthe underlying assets is already completely specified by the others and couldbe regarded as a derivative of the others. When the number of assets matchesthe number of states of the economy A−1 is the usual matrix inverse and thehedge is unique.

Remark 1.11: Theorem 1.9 demonstrates clearly that completeness is astatement about the rank of the matrix A. Noting this, we see that thereexist economies that admit arbitrage yet are complete. Consider A havingrank m < n. Since dim(Im(A)⊥) = n −m we can choose A0 ∈ Im(A)⊥, inwhich case there does not exist Z ∈ Rm such that AZ = A0. By Theorem 1.7the economy admits arbitrage, yet having rank m it is complete by Theorem1.9.

Page 35: Financial derivatives in theory and practice

General one-period economy 11

In practice we will always require an economy to be arbitrage-free. Underthis assumption we can state the condition for completeness in terms of thepricing kernel Z.

Theorem 1.12 Suppose the economy E is arbitrage-free. Then it is completeif and only if there exists a unique pricing kernel Z.

Proof: By Theorem 1.7, there exists some vector Z satisfying

A0 =j

ZjA1(ωj). (1.7)

By Theorem 1.9, it suffices to show that Z being unique is equivalent to therebeing no solution to Ax = 0. If Z also solves (1.7) then x = Z − Z solves

Ax = 0. Conversely, if Z solves (1.7) and Ax = 0 then Z = Z + εx solves(1.7) for all ε > 0. Since Zj > 0 for all j we can choose ε sufficiently small

that Zj > 0 for all j, yielding a second pricing kernel.The combination of Theorems 1.7 and 1.12 gives the well-known result that

an economy is complete and arbitrage-free if and only if there exists a uniquepricing kernel Z. This strong notion of completeness is not the primary onethat we shall consider in the continuous time context of Chapter 7, where weshall say that an economy E is complete if it is FA

T -complete.

Definition 1.13 Let FA1 be the smallest σ-algebra with respect to which

the map A1 : Ω→ Rm is measurable. We say the economy E is FA1 -complete

if every FA1 -measurable contingent claim is attainable.

The reason why FA1 completeness is more natural to consider is as follows.

Suppose there are two states in the economy, ωi and ωj, for which A1(ωi) =A1(ωj). Then it is impossible, by observing only the prices A, to distinguishwhich state of the economy we are in, and in practice all we are interestedin is derivative payoffs that can be determined by observing the process A.There is an analogue of Theorem 1.12 which covers this case.

Theorem 1.14 Suppose the economy E is arbitrage-free. Then it is F A1 -

complete if and only if all pricing kernels agree on FA1 , i.e. if Z

(1) and Z(2)

are two pricing kernels, then for every F ∈ FA1 ,

E Z(1)1F = E Z(2)1F .

Proof: For a discrete economy, the statement that X1 is FA1 -measurable isprecisely the statement that X1(ωi) = X1(ωj) whenever A1(ωi) = A1(ωj). Ifthis is the case we can identify any states ωi and ωj for which A1(ωi) = A1(ωj).The question of FA1 -completeness becomes one of proving that this reducedeconomy is complete in the full sense. It is clearly arbitrage-free, a propertyit inherits from the original economy.

Page 36: Financial derivatives in theory and practice

12 Single-period option pricing

Suppose Z is a pricing kernel for the original economy. Then, as is easilyverified,

Z := E[Z|FA1 ] (1.8)

is a pricing kernel for the reduced economy. If the reduced economy is(arbitrage-free and) complete it has a unique pricing kernel Z by Theorem1.12. Thus all pricing kernels for the original economy agree on F A

1 by (1.8).Conversely, if all pricing kernels agree on F A

1 then the reduced economy hasa unique pricing kernel, by (1.8), thus is complete by Theorem 1.12.

1.3.4 Probabilistic formulation

We have seen that pricing derivatives is about replication and the results wehave met so far are essentially geometric in nature. However, it is standardin the modern finance literature to work in a probabilistic framework. Thisapproach has two main motivations. The first is that probability gives anatural and convenient language for stating the results and ideas required,and it is also the most natural way to formulate models that will be a goodreflection of reality. The second is that many of the sophisticated techniquesneeded to develop the theory of derivative pricing in continuous time are welldeveloped in a probabilistic setting.With this in mind we now reformulate our earlier results in a probabilistic

context. In what follows let P be a probability measure on Ω such thatP(ωj) > 0 for all j. We begin with a preliminary restatement of Theorem1.7.

Theorem 1.15 The economy E is arbitrage-free if and only if there exists astrictly positive random variable Z such that

A0 = E[ZA1]. (1.9)

Extending Definition 1.3, we call Z a pricing kernel for the economy E .Suppose, further, that P(A(i)1 0) = 1, A

(i)0 > 0 for some i. Then the

economy E is arbitrage-free if and only if there exists a strictly positive randomvariable κ with E[κ] = 1 such that

E κA1

A(i)1

=A0

A(i)0

. (1.10)

Proof: The first result follows immediately from Theorem 1.7 by settingZ(ωj) = Zj/P(ωj). To prove the second part of the theorem we show that(1.9) and (1.10) are equivalent. This follows since, given either of Z or κ, wecan define the other via

Z(ωj) = κ(ωj)A(i)0

A(i)1 (ωj)

.

Page 37: Financial derivatives in theory and practice

General one-period economy 13

Remark 1.16: Note that the random variable Z is simply a weighted version ofthe pricing kernel in Theorem 1.7, the weights being given by the probabilitymeasure P. Although the measure P assigns probabilities to ‘outcomes’ ω j,these probabilities are arbitrary and the role played by P here is to summarizewhich states may occur through the assignment of positive mass.

Remark 1.17: The random variable κ is also a reweighted version of thepricing kernel of Theorem 1.7 (and indeed of the one here). In addition tobeing positive, κ has expectation one, and so κj := κ(ωj)/P(ωj) defines aprobability measure. We see the importance of this shortly.

Remark 1.18: Equation (1.10) can be interpreted geometrically, as shown inFigure 1.4 for the two-asset case. No arbitrage is equivalent to A0 ∈ C where

C = a : a =j

ZjA1(ωj), Z 0 .

Rescaling A0 and each A1(ωj) does not change the convex cone C and whetheror not A0 is in C, it only changes the weights required to generate A0 fromthe A1(ωj). Rescaling so that A0 and the A1(ωj) all have the same (unit)component in the direction i ensures that the weights κj satisfy j κj = 1.

A1( 1)ω

A1( 2)ω

A0

A1( 1) /ω A1 ( 1)ω(2)(2)A0 /A0A1( 2) /ω A1 ( 2)ω(2)

Figure 1.4 Rescaling asset prices

Page 38: Financial derivatives in theory and practice

14 Single-period option pricing

We have one final step to take to cast the results in the standardprobabilistic format. This format replaces the problem of finding Z or κ by oneof finding a probability measure with respect to which the process A, suitablyrebased, is a martingale. Theorem 1.20 below is the precise statement of whatwe mean by this (equation (1.11)). We meet martingales again in Chapter 3.

Definition 1.19 Two probability measures P and Q on the finite samplespace Ω are said to be equivalent, written P ∼ Q, if

P(F ) = 0 ⇔ Q(F ) = 0,

for all F ⊆ Ω. If P ∼ Q we can define the Radon—Nikodym derivative of Pwith respect to Q, dPdQ , by

dPdQ(S) =

P(S)Q(S)

.

Theorem 1.20 Suppose P(A(i)1 0) = 1, A(i)0 > 0 for some i. Then the

economy E is arbitrage-free if and only if there exists a probability measureQi equivalent to P such that

EQi [A1/A(i)1 ] = A0/A

(i)0 . (1.11)

The measure Qi is said to be an equivalent martingale measure for A(i).

Proof: By Theorem 1.15 we must show that (1.11) is equivalent to theexistence of a strictly positive random variable κ with E[κ] = 1 such that

E κA1

A(i)1

=A0

A(i)0

.

Suppose E is arbitrage-free and such a κ exists. Define Qi ∼ P by Qi(ωj) =κ(ωj)P(ωj). Then

EQi [A1/A(i)1 ] = E[κA1/A

(i)1 ]

= A0/A(i)0

as required.Conversely, if (1.11) holds define κ(ωj) = Qi(ωj)/P(ωj). It follows

easily that κ has unit expectation. Furthermore,

E κA1

A(i)1

= EdQidP

A1

A(i)1

= EQi [A1/A(i)1 ]

= A0/A(i)0 ,

and there is no arbitrage by Theorem 1.15.We are now able to restate our other results on completeness and pricing

in this same probabilistic framework. In our new framework the condition forcompleteness can be stated as follows.

Page 39: Financial derivatives in theory and practice

A two-period example 15

Theorem 1.21 Suppose that no arbitrage exists and that P(A(i)1 0) =

1, A(i)0 > 0 for some i. Then the economy E is complete if and only if there

exists a unique equivalent martingale measure for the ‘unit’ A(i).

Proof: Observe that in the proof of Theorem 1.20 we established a one-to-onecorrespondence between pricing kernels and equivalent martingale measures.The result now follows from Theorem 1.12 which establishes the equivalenceof completeness to the existence of a unique pricing kernel.Our final result, concerning the pricing of a derivative, is left as an exercise

for the reader.

Theorem 1.22 Suppose E is arbitrage-free, P(A(i)1 0) = 1, A(i)0 > 0 for

some i, and that X is an attainable contingent claim. Then the fair value ofX is given by

X0 = A(i)0 EQi [X1/A

(i)1 ],

where Qi is an equivalent martingale measure for the unit A(i).

1.3.5 Units and numeraires

Throughout Section 1.3.4 we assumed that the price of one of the assets, asseti, was positive with probability one. This allowed us to use this asset price asa unit; the operation of dividing all other asset prices by the price of asset ican be viewed as merely recasting the economy in terms of this new unit.There is no reason why, throughout Section 1.3.4, we need to restrict the

unit to be one of the assets. All the results hold if the unit A(i) is replacedby some other unit U which is strictly positive with probability one and forwhich

U0 = EP(ZU1) (1.12)

where Z is some pricing kernel for the economy. Note, in particular, that wecan always take U0 = 1, Q(ωj) = 1/n and U1(ωj) = 1/(nZjP(ωj)).Observe that (1.12) automatically holds (assuming a pricing kernel exists)

when U is a derivative and thus is of the form U = φ · A. In this case wesay that U is a numeraire and then we usually denote it by the symbol N inpreference to U . In general there are more units than numeraires.The ideas of numeraires, martingales and change of measure are central to

the further development of derivative pricing theory.

1.4 A TWO-PERIOD EXAMPLE

We now briefly consider an example of a two-period economy. Inclusion ofthe extra time-step allows us to develop new ideas whilst still in a relatively

Page 40: Financial derivatives in theory and practice

16 Single-period option pricing

simple framework. In particular, to price a derivative in this richer setting weshall need to define what is meant by a trading strategy which is self-financing.For our two-period example we build on the simple set-up of Section 1.2.

Suppose that at the new time, time 2, the economy can be in one of fourstates which we denote by ωj , j = 1, . . . , 4, with the restriction that states

ω1 and ω2 can be reached only from ω1 and states ω3 and ω4 can be reachedonly from ω2. Figure 1.5 summarizes the possibilities.

A0

A1( 1)ω

A1( 2)ω

t = 1t = 0

1ω'

2ω'

3ω'

4ω'

A2( 1)ω

A2( 2)ω

A2( 3)ω

A2( 4)ω

t = 2

Figure 1.5 Possible states of a two-period economy

Let Ω now denote the set of all possible paths. That is, Ω = ωk, k =1, . . . , 4 where ωk = (ω1,ωk) for k = 1, 2 and ωk = (ω2,ωk) for k = 3, 4. Theasset prices follow one of four paths with A

(i)t (ωk) denoting the price of asset

i at time t = 1, 2.Consider the problem of pricing a derivative X which at time 2 pays an

amountX2(ωk). In order to price the derivativeX we must be able to replicateover all paths. We cannot do this by holding a static portfolio. Instead, theportfolio we hold at time zero will in general need to be changed at time 1according to which state ωj the economy is in at this time. Thus, replicatingX in a two-period economy amounts to specifying a process φ which is non-anticipative, i.e. a process which does not depend on knowledge of a futurestate at any time. Such a process is referred to as a trading strategy. To findthe fair value of X we must be able to find a trading strategy which is self-financing; that is, a strategy for which, apart from the initial capital at timezero, no additional influx of funds is required in order to replicate X . Such atrading strategy will be called admissible.We calculate a suitable φ in two stages, working backwards in time. Suppose

we know the economy is in state ω1 at time 1. Then we know from the one-period example of Section 1.2 that we should hold a portfolio φ1(ω1) of assets

Page 41: Financial derivatives in theory and practice

A two-period example 17

satisfyingφ1(ω1) ·A2(ωj) = X2(ωj), j = 1, 2.

Similarly, if the economy is in state ω2 at time 1 we should hold a portfolioφ1(ω2) satisfying

φ1(ω2) ·A2(ωj) = X2(ωj), j = 3, 4.

Conditional on knowing the economy is in state ωj at time 1, the fair valueof the derivative X at time 1 is then X1(ωj) = φ1(ωj) · A1(ωj), the value ofthe replicating portfolio.Once we have calculated φ1 the problem of finding the fair value of X at

time zero has been reduced to the one-period case, i.e. that of finding the fairvalue of an option paying X1(ωj) at time 1. If we can find φ0 such that

φ0 ·A1(ωj) = X1(ωj), j = 1, 2,

then the fair price of X at time zero is

X0 = φ0 ·A0.

Note that for the φ satisfying the above we have

φ0 · A1(ωj) = φ1(ωj) ·A1(ωj) , j = 1, 2,

as must be the case for the strategy to be self-financing; the portfolio is merelyrebalanced at time 1.Though of mathematical interest, the multi-period case is not important in

practice and when we again take up the story of derivative pricing in Chapter7 we will work entirely in the continuous time setting. For a full treatment ofthe multi-period problem the reader is referred to Duffie (1996).

Page 42: Financial derivatives in theory and practice
Page 43: Financial derivatives in theory and practice

2

Brownian Motion

2.1 INTRODUCTION

Our objective in this book is to develop a theory of derivative pricing incontinuous time, and before we can do this we must first have developedmodels for the underlying assets in the economy. In reality, asset prices arepiecewise constant and undergo discrete jumps but it is convenient and areasonable approximation to assume that asset prices follow a continuousprocess. This approach is often adopted in mathematical modelling andjustified, at least in part, by results such as those in Section 2.3.1. Havingmade the decision to use continuous processes for asset prices, we must nowprovide a way to generate such processes. This we will do in Chapter 6 whenwe study stochastic differential equations. In this chapter we study the mostfundamental and most important of all continuous stochastic processes, theprocess from which we can build all the other continuous time processes thatwe will consider, Brownian motion.

The physical process of Brownian motion was first observed in 1828 by thebotanist Robert Brown for pollen particles suspended in a liquid. It was in1900 that the mathematical process of Brownian motion was introduced byBachelier as a model for the movement of stock prices, but this was not reallytaken up. It was only in 1905, when Einstein produced his work in the area,that the study of Brownian motion started in earnest, and not until 1923 thatthe existence of Brownian motion was actually established by Wiener. It waswith the work of Samuelson in 1969 that Brownian motion reappeared andbecame firmly established as a modelling tool for finance.

In this chapter we introduce Brownian motion and derive a few of its moreimmediate and important properties. In so doing we hope to give the readersome insight into and intuition for how it behaves.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 44: Financial derivatives in theory and practice

20 Brownian motion

2.2 DEFINITION AND EXISTENCE

We begin with the definition.

Definition 2.1 Brownian motion is a real-valued stochastic process with thefollowing properties:

(BM.i) Given any t0 < t1 < . . . < tn the random variables (Wti − Wti−1),i = 1, 2, . . . , n are independent.

(BM.ii) For any 0 ≤ s ≤ t, Wt − Ws ∼ N(0, t − s).(BM.iii) Wt is continuous in t almost surely (a.s.).(BM.iv) W0 = 0 a.s.

Property (BM.iv) is sometimes omitted to allow arbitrary initial distribu-tions, although it is usually included. We include it for definiteness and notethat extending to the more general case is a trivial matter.

Implicit in the above definition is the fact that a process W exists withthe stated properties and that it is unique. Uniqueness is a straightforwardmatter – any two processes satisfying (BM.i)–(BM.iv) have the same finite-dimensional distributions, and the finite-dimensional distributions determinethe law of a process. In fact, Brownian motion can be characterized by aslightly weaker condition than (BM.ii):

(BM.ii′) For any t ≥ 0, h ≥ 0, the distribution of Wt+h − Wt is independentof t, E[Wt] = 0 and var[Wt] = t.

It should not be too surprising that (BM.ii′) in place of (BM.ii) gives anequivalent definition. Given any 0 ≤ s ≤ t, the interval [s, t] can be subdividedinto n equal intervals of length (t − s)/n and

Wt − Ws =n∑

i=1

Ws+(t−s)i/n − Ws+(t−s)(i−1)/n,

i.e. as a sum of n independent, identically distributed random variables. Alittle care needs to be exercised, but it effectively follows from the central limittheorem and the continuity of W that Wt − Ws must be Gaussian. Breiman(1992) provides all the details. The moment conditions in (BM.ii′) now forcethe process to be a standard Brownian motion (without them the conditionsdefine the more general process µt + σWt).

The existence of Brownian motion is more difficult to prove. There are manyexcellent references which deal with this question, including those by Breiman(1992) and Durrett (1984). We will merely state this result. To do so, we mustexplicitly define a probability space (Ω,F , P) and a process W on this spacewhich is Brownian motion. Let Ω = C ≡ C(R+, R) be the set of continuousfunctions from [0,∞) to R. Endow C with the metric of uniform convergence,

Page 45: Financial derivatives in theory and practice

Basic properties 21

i.e. for x, y ∈ C

ρ(x, y) =∞∑

n=1

(12

)n ρn(x, y)1 + ρn(x, y)

whereρn(x, y) = sup

0≤t≤n|x(t) − y(t)|.

Now define F = C ≡ B(C), the Borel σ-algebra on C induced by the metric ρ.

Theorem 2.2 For each ω ∈ C, t ≥ 0, define Wt(ω) = ωt. There exists aunique probability measure W (Wiener measure) on (C, C) such that thestochastic process Wt, t ≥ 0 is Brownian motion (i.e. satisfies conditions(BM.i)–(BM.iv)).

Remark 2.3: The set-up described above, in which the sample space Ω is theset of sample paths, is the canonical set-up. There are others, but the canonicalset-up is the most direct and intuitive to consider. Note, in particular, thatfor this set-up every sample path is continuous.

In the above we have defined Brownian motion without reference to afiltration. Adding a filtration is a straightforward matter: it will often be takento be FW

t := σ(Ws, s ≤ t), but it could also be more general. Brownianmotion relative to a filtered probability space is defined as follows.

Definition 2.4 The process W is Brownian motion with respect to thefiltration Ft if:(i) it is adapted to Ft;(ii) for all 0 ≤ s ≤ t, Wt − Ws is independent of Fs;(iii) it is a Brownian motion as defined in Definition 2.1.

2.3 BASIC PROPERTIES OF BROWNIAN MOTION

2.3.1 Limit of a random walk

If you have not encountered Brownian motion before it is important to developan intuition for its behaviour. A good place to start is to compare it witha simple symmetric random walk on the integers, Sn. The following resultroughly states that if we speed up a random walk and look at it from adistance (see Figure 2.1) it appears very much like Brownian motion.

Theorem 2.5 Let Xi, i ≥ 1 be independent and identically distributedrandom variables with P(Xi = 1) = P(Xi = −1) = 1

2 . Define the simple

Page 46: Financial derivatives in theory and practice

22 Brownian motion

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

0 0.2 0.4 0.6 0.8 1

Time

Symmetric random walk

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1Time

Figure 2.1. Brownian motion and a random walk

Page 47: Financial derivatives in theory and practice

Basic properties 23

symmetric random walk Sn, n ≥ 1 as Sn =∑n

i=1 Xi, and the rescaledrandom walk Zn(t) := 1√

nSnt where x is the integer part of x. As n → ∞,

Zn ⇒ W.

In the above, W is a Brownian motion and the convergence ⇒ denotes weakconvergence. In this setting this is equivalent to convergence of finite-dimen-sional distributions, meaning, here, that for any t1, t2, . . . , tk and any z ∈ R

k,

P(Zn(ti) ≤ zi, i = 1, . . . , k) → P(Wti ≤ zi, i = 1, . . . , k)

as n → ∞.

Proof: This follows immediately from the (multivariate) central limit theo-rem for the simple symmetric random walk.

2.3.2 Deterministic transformations of Brownian motion

There is an extensive theory which studies what happens to a Brownianmotion under various (random) transformations. One question people ask iswhat is the law of the process X defined by

Xt = Wτ (t),

where the random clock τ is specified by

τ (t) = infs ≥ 0 : Ys > t,

for some process Y. In the case when Y, and consequently τ , is non-random thestudy is straightforward because the Gaussian structure of Brownian motionis retained. In particular, there are a well-known set of transformations ofBrownian motion which produce another Brownian motion, and these resultsprove especially useful when studying properties of the Brownian samplepath.

Theorem 2.6 Suppose W is Brownian motion. Then the following trans-formations also produce Brownian motions:

(i) Wt := cWt/c2 for any c ∈ R\0; (scaling)(ii) Wt := tW1/t for t > 0, W0 := 0; (time inversion)(iii) Wt := Wt+s − Ws : t ≥ 0 for any s ∈ R

+. (time homogeneity)

Proof: We must prove that (BM.i)–(BM.iv) hold. The proof in each casefollows similar lines so we will only establish the time inversion result whichis slightly more involved than the others.

Page 48: Financial derivatives in theory and practice

24 Brownian motion

Given any fixed t0 < t1 < . . . < tn it follows from the Gaussian

structure of W that (Wt0 , . . . ,Wtn) also has a Gaussian distribution. AGaussian distribution is completely determined by its mean and covariance

and thus (BM.i) and (BM.ii) now follow if we can show that E[Wt] = 0 and

E[WsWt] = s ∧ t for all s, t. But this follows easily:

E[Wt] = E[tW1/t] = tE[W1/t] = 0,

E[WsWt] = E[stW1/sW1/t] = st (1/t) ∧ (1/s) = s ∧ t .

Condition (BM.iv), that W0 = 0, is part of the definition and, for any t > 0,

the continuity of W at t follows from the continuity of W . It therefore only

remains to prove that W is continuous at zero or, equivalently, that for all

n > 0 there exists some m > 0 such that sup0<t≤1/m |Wt| < 1/n. The

continuity of W for t > 0 implies that

sup0<t≤1/m

|Wt| = supq∈Q∩(0,1/m]

|Wq| ,

and so

P limt→0

Wt = 0 = Pn m

supq∈Q∩(0,1/m]

|Wq| < 1/n

= Pn m q∈Q∩(0,1/m]

|Wq| < 1/n . (2.1)

There are a countable number of events in the last expression of equation

(2.1) and each of these involves the process W at a single strictly positive

time. The distributions of W and W agree for t > 0 and therefore this last

probability is unaltered if the process W is replaced by the original Brownianmotion W . But W is a.s. continuous at zero and so the probability in (2.1) is

one and W is also a.s. continuous at zero.

2.3.3 Some basic sample path properties

The transformations just introduced have many useful applications. Two ofthem are given below.

Theorem 2.7 If W is a Brownian motion then, as t→∞,Wt

t→ 0 a.s.

Proof: Writing Wt for the time inversion of W as defined in Theorem 2.6,

P(limt→∞ Wt

t = 0) = P(lims→0Ws = 0) and this last probability is one since

W is a Brownian motion, continuous at zero.

Page 49: Financial derivatives in theory and practice

Basic properties 25

Theorem 2.8 Given a Brownian motion W,

P

(supt≥0

Wt = +∞, inft≥0

Wt = −∞)

= 1.

Proof: Consider first supt≥0 Wt. For any a > 0, the scaling property impliesthat

P

(supt≥0

Wt > a

)= P

(supt≥0

cWt/c2 > a

)

= P

(sups≥0

Ws > a/c

).

Hence the probability is independent of a and thus almost every sample pathhas a supremum of either 0 or ∞. In particular, for almost every samplepath, supt≥0 Wt = 0 if and only if W1 ≤ 0 and supt≥1(Wt − W1) = 0. But then,defining Wt = W1+t − W1,

p := P

(supt≥0

Wt = 0)

= P

(W1 ≤ 0, sup

t≥1(Wt − W1) = 0

)

= P(W1 ≤ 0)P(

supt≥0

Wt = 0)

=12p.

We conclude that p = 0. A symmetric argument shows that P(inf t≥0 Wt =−∞) = 1 and this completes the proof.

This result shows that the Brownian path will keep oscillating between pos-itive and negative values over extended time intervals. The path also oscillateswildly over small time intervals. In particular, it is nowhere differentiable.

Theorem 2.9 Brownian motion is nowhere differentiable with probabilityone.

Proof: It suffices to prove the result on [0, 1]. Suppose W.(ω) is differentiableat some t ∈ [0, 1] with derivative bounded in absolute value by some constantN/2. Then there exists some n > 0 such that

h ≤ 4/n ⇒ |Wt+h(ω) − Wt(ω)| ≤ Nh. (2.2)

Denote the event that (2.2) holds for some t ∈ [0, 1] by An,N. The event thatBrownian motion is differentiable for some t ∈ [0, 1] is a subset of the event⋃

N

⋃n An,N and the result is proven if we can show that P(An,N) = 0 for all

n, N .

Page 50: Financial derivatives in theory and practice

26 Brownian motion

Fix n and N. Let ∆n(k) = W(k+1)/n − Wk/n and define knt = infk : k/n ≥

t. If (2.2) holds at t then it follows from the triangle inequality that|∆n(kn

t + j)| ≤ 7N/n, for j = 0, 1, 2. Therefore

An,N ⊆n⋃

k=0

2⋂j=0

|∆n(k + j)| ≤ 7N/n,

P(An,N) ≤ P

( n⋃k=0

2⋂j=0

|∆n(k + j)| ≤ 7N/n)

≤ (n + 1)P( 2⋂

j=0

|∆n(j)| ≤ 7N/n)

= (n + 1)P(|∆n(0)| ≤ 7N/n

)3

≤ (n + 1)(

14N√2πn

)3

→ 0 as n → ∞,

the last equality following from the independence of the Brownian increments.If we now note that An,N is increasing in n we can conclude that P(An,N) = 0for all N, n and we are done.

The lack of differentiability of Brownian motion illustrates how irregularthe Brownian path can be and implies immediately that the Brownian pathis not of finite variation. This means it is not possible to use the Brownianpath as an integrator in the classical sense. The following result is central tostochastic integration. A proof is provided in Chapter 3, Corollary 3.81.

Theorem 2.10 For a Brownian motion W, define the doubly infinitesequence of (stopping) times Tn

k via

Tn0 ≡ 0, T n

k+1 = inft > Tnk : |Wt − WT n

k| > 2−n,

and let[W ]t(ω) := lim

n→∞

∑k≥1

[Wt∧T nk(ω) − Wt∧T n

k−1(ω)]2.

The process [W] is called the quadratic variation of W, and [W ]t = t a.s.

2.4 STRONG MARKOV PROPERTY

Markov processes and strong Markov processes are of fundamentalimportance to mathematical modelling. Roughly speaking, a Markov process

Page 51: Financial derivatives in theory and practice

Strong Markov property 27

is one for which ‘the future is independent of the past, given the present’; forany fixed t ≥ 0, X is Markovian if the law of Xs −Xt : s ≥ t given Xt isindependent of the law of Xs : s ≤ t. This property obviously simplifies thestudy of Markov processes and often allows problems to be solved (perhapsnumerically) which are not tractable for more general processes. We shall,in Chapter 6, give a definition of a (strong) Markov process which makesthe above more precise. That Brownian motion is Markovian follows from itsdefinition.

Theorem 2.11 Given any t ≥ 0, Ws − Wt : s ≥ t is independent ofWs : s ≤ t (and indeed Ft if the Brownian motion is defined on a filteredprobability space).

Proof: This is immediate from (BM.i).The strong Markov property is a more stringent requirement of a stochastic

process and is correspondingly more powerful and useful. To understandthis idea we must here introduce the concept of a stopping time. We willreintroduce this in Chapter 3 when we discuss stopping times in more detail.

Definition 2.12 The random variable T , taking values in [0,∞], is an F tstopping time if

T ≤ t = ω : T (ω) ≤ t ∈ Ft ,for all t ≤ ∞.

Definition 2.13 For any Ft stopping time T , the pre-T σ-algebra FT isdefined via

FT = F : for every t ≤∞, F ∩ T ≤ t ∈ Ft .Definition 2.12 makes precise the intuitive notion that a stopping time is

one for which we know that it has occurred at the moment it occurs. Definition2.13 formalizes the idea that the σ-algebra FT contains all the informationthat is available up to and including the time T .A strong Markov process, defined precisely in Chapter 6, is a process which

possesses the Markov property, as described at the beginning of this section,but with the constant time t generalized to be an arbitrary stopping timeT . Theorem 2.15 below shows that Brownian motion is also a strong Markovprocess. First we establish a weaker preliminary result.

Proposition 2.14 Let W be a Brownian motion on the filtered probabilityspace (Ω, Ft,F ,P) and suppose that T is an a.s. finite stopping time takingon one of a countable number of possible values. Then

Wt :=Wt+T −WT

is a Brownian motion and Wt : t ≥ 0 is independent of FT (in particular,it is independent of Ws : s ≤ T).

Page 52: Financial derivatives in theory and practice

28 Brownian motion

Proof: Fix t1, . . . , tn ≥ 0, F1, . . . , Fn ∈ B(R), F ∈ FT and let τj, j ≥ 1 denotethe values that T can take. We have that

P(Wti ∈ Fi, i = 1, . . . , n; F ) =∞∑

j=1

P(Wti ∈ Fi, i = 1, . . . , n; F ; T = τj)

=∞∑j=1

P(Wti+τj−Wτj

∈Fi, i=1, . . . , n)P(F ; T =τj)

= P(Wti ∈ Fi, i = 1, . . . , n)P(F ),

the last equality following from the Brownian shifting property and byperforming the summation. This proves the result.

Theorem 2.15 Let W be a Brownian motion on the filtered probability space(Ω, Ft,F , P) and suppose that T is an a.s. finite stopping time. Then

Wt := Wt+T − WT

is a Brownian motion and Wt : t ≥ 0 is independent of FT (in particular, itis independent of Ws : s ≤ T).

Proof: As in the proof of Proposition 2.14, and adopting the notationintroduced there, it suffices to prove that

P(Wti ∈ Fi, i = 1, . . . , n; F ) = P(Wti ∈ Fi, i = 1, . . . , n)P(F ).

Define the stopping times Tk via

Tk =q

2kif

q − 12k

≤ T <q

2k.

Each Tk can only take countably many values so, noting that Tk ≥ T and thusF ∈ FTk

, Proposition 2.14 applies to give

P(W kti ∈ Fi, i = 1, . . . , n; F ) = P(Wk

ti ∈ Fi, i = 1, . . . , n)P(F ), 2.3

where W kt = Wt+Tk

− WTk. As k → ∞, Tk(ω) → T (ω) and thus, by the almost

sure continuity of Brownian motion, W kt → Wt for all t, a.s. The result now

follows from (2.3) by dominated convergence.

2.4.1 Reflection principle

An immediate and powerful consequence of Theorem 2.15 is as follows.

Page 53: Financial derivatives in theory and practice

Strong Markov property 29

Theorem 2.16 If W is a Brownian motion on the space (Ω, Ft,F ,P) andT = inft > 0 :Wt ≥ a, for some a, then the process

Wt :=Wt, t < T

2a−Wt, t ≥ T

is also Brownian motion.

Time

W

W~

Figure 2.2 Reflected Brownian path

Proof: It is easy to see that T is a stopping time (look ahead to Theorem3.40 if you want more details here), thus W ∗t :=Wt+T −WT is, by the strongMarkov property, a Brownian motion independent of FT , as is −W ∗t (by thescaling property). Hence the following have the same distribution:

(i) Wt1t<T + a+W ∗t−T 1t≥T(ii) Wt1t<T + a−W ∗t−T 1t≥T.

The first of these is just the original Brownian motion, the second is W , sothe result follows.

A ‘typical’ sample path for Brownian motion W and the reflected Brownian

motion W is shown in Figure 2.2.

Example 2.17 The reflection principle can be used to derive the jointdistribution of Brownian motion and its supremum, a result which is used for

the analytic valuation of barrier options. LetW be a Brownian motion and Wbe the Brownian motion reflected about some a > 0. DefiningMt = sups≤tWs

Page 54: Financial derivatives in theory and practice

30 Brownian motion

(and M similarly), it is clear that, for x ≤ a,

ω : Wt(ω) ≤ x, Mt(ω) ≥ a = ω : Wt(ω) ≥ 2a − x, Mt(ω) ≥ a,

and thus, denoting by Φ the normal distribution function,

P(Wt ≤ x, Mt ≥ a) = P(Wt ≥ 2a − x, Mt ≥ a)

= P(Wt ≥ 2a − x)

= 1 − Φ(

2a − x√t

),

the second equality holding since 2a − x ≥ a. For x > a,

P(Wt ≥ x, Mt ≥ a) = P(Wt ≥ x)

= 1 − Φ(

x√t

).

From these we can find the density with respect to x :

P(Wt ∈ dx, Mt ≥ a) =

exp(−(2a − x)2/2t)√2πt

, x ≤ a

exp(−x2/2t)√2πt

, x > a.

The density with respect to a follows similarly.

Page 55: Financial derivatives in theory and practice

3

Martingales

Martingales are amongst the most important tools in modern probability.They are also central to modern finance theory. The simplicity of themartingale definition, a process which has mean value at any future time,conditional on the present, equal to its present value, belies the range andpower of the results which can be established. In this chapter we will introducecontinuous time martingales and develop several important results. The valueof some of these will be self-evident. Others may at first seem somewhatesoteric and removed from immediate application. Be assured, however, thatwe have included nothing which does not, in our view, aid understanding orwhich is not directly relevant to the development of the stochastic integral inChapter 4, or the theory of continuous time finance as developed in Chapter 7.

A brief overview of this chapter and the relevance of each set of resultsis as follows. Section 3.1 below contains the definition, examples and basicproperties of martingales. Throughout we will consider general continuoustime martingales, although for the financial applications in this book we onlyneed martingales that have continuous paths. In Section 3.2 we introduceand discuss three increasingly restrictive classes of martingales. As we imposemore restrictions, so stronger results can be proved. The most important class,which is of particular relevance to finance, is the class of uniformly integrablemartingales. Roughly speaking, any uniformly integrable martingale M, astochastic process, can be summarized by M∞, a random variable: givenM we know M∞, given M∞ we can recover M. This reduces the studyof these martingales to the study of random variables. Furthermore, thisclass is precisely the one for which the powerful and extremely importantoptional sampling theorem applies, as we show in Section 3.3. A secondimportant space of martingales is also discussed in Section 3.2, square-integrable martingales. It is not obvious in this chapter why these areimportant and the results may seem a little abstract. If you want motivation,glance ahead to Chapter 4, otherwise take our word for it that theyare important.

Section 3.3 contains a discussion of stopping times and, as mentioned

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 56: Financial derivatives in theory and practice

32 Martingales

above, the optional sampling theorem (‘if a martingale is uniformly integrablethen the mean value of the martingale at any stopping time, conditional onthe present, is its present value’). The quadratic variation for continuousmartingales is introduced and studied in Section 3.4. The results which wedevelop are interesting in their own right, but the motivation for the discussionof quadratic variation is its vital role in the development of the stochasticintegral. The chapter concludes with two sections on processes more generalthan martingales. Section 3.5 introduces the idea of localization and defineslocal martingales and semimartingales. Then, in Section 3.6, supermartingalesare considered and the important Doob—Meyer decomposition theorem ispresented. We will call on the latter result in Chapter 8 when we study termstructure models.

3.1 DEFINITION AND BASIC PROPERTIES

Definition 3.1 Let (Ω,F ,P) be a probability triple and Ft be a filtrationon F . A stochastic processM is an Ft martingale (or just martingale whenthe filtration is clear) if:

(M.i) M is adapted to Ft;(M.ii) E |Mt| <∞ for all t ≥ 0;(M.iii) E[Mt|Fs] =Ms a.s., for all 0 ≤ s ≤ t.

Remark 3.2: Property (M.iii) can also be written as E[(Mt −Ms)1F ] = 0 forall s ≤ t, for all F ∈ Fs. We shall often use this representation in proofs.

Example 3.3 Brownian motion is a rich source of example martingales. LetW be a Brownian motion and Ft be the filtration generated by W . Then itis easy to verify directly that each of the following is a martingale:

(i) Wt, t ≥ 0;(ii) W 2

t − t, t ≥ 0;(iii) exp(λWt − 1

2λ2t), t ≥ 0 for any λ ∈ R (Wald’s martingale).

Taking Wald’s martingale for illustration, property (M.i) follows from thedefinition of Ft, and property (M.ii) follows from property (M.iii) by settings = 0. To prove property (M.iii), note that for t ≥ 0,

E[Mt] =∞

−∞exp(λu− 1

2λ2t)exp(−u2/2t)√

2πtdu

=∞

−∞

exp(−(u− λt)2/2t)√2πt

du

= 1 .

Page 57: Financial derivatives in theory and practice

Definition and basic properties 33

Thus, appealing also to the independence of Brownian increments, E[Mt −Ms|Fs] = E[Mt − Ms] = 0, which establishes property (M.iii).

We will meet these martingales again later.

Example 3.4 Most martingales explicitly encountered in practice are Markovprocesses, but they need not be, as this example demonstrates. First define adiscrete time martingale M via

M0 = 0

Mn = Mn−1 + αn, n ≥ 1,

where the αn are independent, identically distributed random variables takingthe values ±1 with probability 1/2. Viewed as a discrete time process, M isMarkovian. Now define Mc

t = Mt where t is the integer part of t. This isa continuous time martingale which is not Markovian. The process (Mt, t) isMarkovian but it is not a martingale.

Example 3.5 Not all martingales behave as one might expect. Consider thefollowing process M. Let T be a random exponentially distributed time,P(T > t) = exp(−t), and define M via

Mt =

1 if t − T ∈ Q+,0 otherwise,

Q+ being the positive rationals. Conditions (M.i) and (M.ii) of Definition 3.1are clearly satisfied for the filtration Ft generated by the process M. Further,for any t ≥ s and any F ∈ Fs,

E[Mt1F] ≤ E[1t−T∈Q+] = 0 = E[Ms1F].

Thus E[Mt|Fs] = Ms a.s., which is condition (M.iii), and M is a martingale.

Example 3.5 seems counter-intuitive and we would like to eliminate fromconsideration martingales with behaviour such as this. We will do this byimposing a (right-) continuity constraint on the paths of martingales that wewill consider henceforth. We shall see, in Theorem 3.8, that this restriction isnot unnecessarily restrictive.

Definition 3.6 A function x is said to be cadlag (continu a droite, limitesa gauche) if it is right-continuous with left limits,

limh↓0

xt+h = xt

limh↓0

xt−h exists.

In this case we define xt− := limh↓0 xt−h (which is left-continuous with rightlimits).

We say that a stochastic process X is cadlag if, for almost every ω, X. (ω)is a cadlag function. If X. (ω) is cadlag for every ω we say that X is entirelycadlag.

Page 58: Financial derivatives in theory and practice

34 Martingales

The martingale in Example 3.5 is clearly not cadlag, but it has a cadlagmodification (the process M ≡ 0).Definition 3.7 Two stochastic processesX and Y are modifications (of eachother) if, for all t,

P(Xt = Yt) = 1 .

We say X and Y are indistinguishable if

P(Xt = Yt, for all t) = 1 .

Theorem 3.8 Let M be a martingale with respect to the right-continuous and complete filtration Ft. Then there exists a unique (up toindistinguishability) modification M ∗ of M which is cadlag and adapted toFt (hence is an Ft martingale).

Remark 3.9: Note that it is not necessarily the case that every sample pathis cadlag, but there will be a set (having probability one) of sample paths,in which set every sample path will be cadlag. This is important to bear inmind when, for example, proving several results about stopping times whichare results about filtrations and not about probabilities (see Theorems 3.37and 3.40).

Remark 3.10: The restriction that Ft be right-continuous and complete isrequired to ensure that the modification is adapted to Ft. It is for thisreason that the concept of completeness of a filtration is required in the studyof stochastic processes.

A proof of Theorem 3.8 can be found, for example, in Karatzas and Shreve(1991). Cadlag processes (or at least piecewise continuous ones) are naturalones to consider in practice and Theorem 3.8 shows that this restriction isexactly what it says and no more — the cadlag restriction does not in anyway limit the finite-dimensional distributions that a martingale can exhibit.The ‘up to indistinguishability’ qualifier is, of course, necessary since we canalways modify any process on a null set. Henceforth we will restrict attentionto cadlag processes. The following result will often prove useful.

Theorem 3.11 Let X and Y be two cadlag stochastic processes such that,for all t,

P(Xt = Yt) = 1 ,

i.e. they are modifications. Then X and Y are indistinguishable.

Proof: By right-continuity,

Xt = Yt, some t =q∈QXq = Yq ,

Page 59: Financial derivatives in theory and practice

Classes of martingales 35

and thus

P(Xt = Yt, some t) ≤q∈Q

P(Xq = Yq) = 0 .

So we can safely restrict to cadlag processes, and henceforth wheneverwe talk about a martingale we will mean its cadlag version. The finite-dimensional distributions of a martingale (indeed, of any stochastic process)determine its law, and this in turn uniquely determines the cadlag versionof the martingale. Thus questions about the sample paths of a martingaleare now (implicitly) reduced to questions about the finite-dimensionaldistributions. Consequently it now makes sense to ask questions about thesample paths of continuous time martingales as these will be measurableevents on the understanding that any such question is about the cadlag versionof the martingale.

3.2 CLASSES OF MARTINGALES

We now introduce three subcategories of martingales which are progressivelymore restrictive. These are L1-bounded martingales, uniformly integrablemartingales and square-integrable martingales. The concept of uniformintegrability is closely tied to conditional expectation and hence to themartingale property (see Remark 3.23). Consequently the set of uniformlyintegrable martingales is often the largest and most natural class for whichmany important results hold. Note that Brownian motion is not included inany of the classes below.

3.2.1 Martingales bounded in L1

First a more general definition which includes L1.Definition 3.12 We define Lp to be the space of random variables Y suchthat

E[|Y |p] <∞ .

A stochastic process X is said to be bounded in Lp (or in Lp) if

supt≥0

E |Xt|p <∞ .

If the process X is bounded in L2 we say X is square-integrable.

We will often also use Lp to denote the space of stochastic processes boundedin Lp.

Page 60: Financial derivatives in theory and practice

36 Martingales

Here we are interested in L1-bounded martingales. Note that if M is amartingale then E |Mt| <∞ for all t. Further, it follows from the conditionalform of Jensen’s inequality that, for s ≤ t,

E |Mt| = EE |Mt||Fs ≥ E E[Mt|Fs] = E |Ms| .

Hence supt≥0 E[|Mt|] = limt→∞ E[|Mt|] and a martingale being bounded inL1 is a statement about the martingale as t→∞.The restriction of L1-boundedness is sufficient to prove one important

convergence theorem. We omit the proof which, although not difficult, wouldrequire us to introduce a few extra ideas. A proof can be found, for example,in Revuz and Yor (1991).

Theorem 3.13 (Doob’s martingale convergence theorem) LetM be acadlag martingale bounded in L1. Then M∞(ω) := limt→∞Mt(ω) exists andis finite almost surely.

Remark 3.14: It follows from Fatou’s lemma that E |M∞| ≤ lim inf E |Mt| <∞. One might be tempted to conclude that(i) Mt →M∞ in L1, meaning E |Mt −M∞| → 0 as t→∞,and(ii) Mt = E[M∞|Ft].Neither of these is true in general. If they are to hold we need to work withthe more restrictive class of uniformly integrable martingales.

3.2.2 Uniformly integrable martingales

The two properties described in Remark 3.14 are very useful when they hold.We shall use the representationMt = E[M∞|Ft] in Chapter 4 when we developthe stochastic integral. The point is that this representation allows us toidentify a martingaleM which is a stochastic process, with a random variable,M∞. To this end we introduce the idea of a uniformly integrable martingale.

Definition 3.15 A family C of (real-valued) random variables is uniformlyintegrable (UI) if, given any ε > 0, there exists some Kε <∞ such that

E |X |1|X|>Kε < ε

for all X ∈ C.A martingale M is UI if the family of random variables Mt, t ≥ 0 is UI.

Remark 3.16: The effect of imposing the uniform integrability assumptionis to control the tail behaviour of the martingale. It is just sufficient forTheorem 3.18, the convergence result described in Remark 3.14, to hold.

Page 61: Financial derivatives in theory and practice

Classes of martingales 37

Clearly supt≥0 E |Mt| < 1+K1 for a UI martingale, so uniform integrability

is indeed a stronger requirement than L1-boundedness. Note the followingresult, however, which shows that, in one sense at least, uniform integrabilityis only just stronger than L1-boundedness.

Theorem 3.17 Suppose M is a martingale bounded in Lp for some p > 1.Then M is uniformly integrable.

Proof: Let A = supt≥0 E |Mt|p . Defining Kε =p−1 A/ε, we have

supt≥0

E[|Mt|1|Mt|>Kε] ≤ supt≥0

E |Mt| Mt

p−11|Mt|>Kε

≤ supt≥0

E|Mt|pKp−1ε

=A

Kp−1ε

= ε .

And now for the theorem that shows UI martingales to be exactly the rightconcept.

Theorem 3.18 For a cadlag martingale M the following are equivalent:

(i) Mt →M∞ in L1 (for some random variable M∞).(ii) There exists some M∞ ∈ L1 such that Mt = E[M∞|Ft] a.s. for all t.(iii) M is uniformly integrable.

Proof: (i) ⇒ (ii): Given t ≥ 0 and F ∈ Ft, for all s ≥ t,

E[(M∞ −Mt)1F ] = E (Ms −Mt)1F + E (M∞ −Ms)1F

= E (M∞ −Ms)1F

≤ E |M∞ −Ms| → 0

as s → ∞. This holds for all F ∈ Ft, and Mt is Ft-measurable, thus weconclude that E[M∞|Ft] =Mt.

(ii) ⇒ (iii): Note the following three results.(a) E |Mt|1|Mt|>K ≤ E |M∞|1|Mt|>K (by Jensen’s inequality and the

tower property).(b) KP(|Mt| > K) ≤ E |Mt| ≤ E |M∞| (by truncating the expectation

and Jensen’s inequality, respectively).(c) If X ∈ L1, then for any ε > 0, there exists some δ > 0 such that

P(F ) < δ ⇒ E |X |1F < ε .

Page 62: Financial derivatives in theory and practice

38 Martingales

Given any ε > 0, choose δ such that (c) holds for X =M∞, and choose Kε

in (b) so that P(|Mt| > Kε) < δ. It follows from (a) and (c) that

E |Mt|1|Mt|>K < ε

for all t, hence M is uniformly integrable.

(iii) ⇒ (i): Since M is uniformly integrable, it is bounded in L1 (Remark3.16), hence Mt →M∞ a.s. for some M∞ ∈ L1 (Theorem 3.13). Almost sureconvergence of a UI process implies L1 convergence, so (i) holds.Exercise 3.19: Establish condition (c) in the above proof: if X ∈ L1, then forany ε > 0, there exists some δ > 0 such that

P(F ) < δ ⇒ E |X |1F < ε .

Exercise 3.20: Establish the convergence result used to prove (iii)⇒ (i) above:suppose Xt ∈ L1 for each t ≥ 0 and X ∈ L1. Then Xt → X in L1 if and onlyif the following hold:(i) Xt → X in probability;(ii) the family Xt, t ≥ 0 is UI.

Exercise 3.21: Prove the following which we shall need later: if Xn, n ≥ 1is a UI family of random variables such that Xn → X a.s. as n → ∞, thenX ∈ L1 (and thus Xn → X in L1 by Exercise 3.20).A proof of these standard results can be found in Rogers and Williams

(1994).

Remark 3.22: Given any uniformly integrable martingale, we can identify itwith an L1 random variable X =M∞. Conversely, given any random variableX ∈ L1 we can define a UI martingale via

Mt := E[X |Ft] . (3.1)

For each t, the random variable Mt in (3.1) is not unique, but if Mt andM∗t both satisfy (3.1) then Mt = M∗t a.s. That is, if M and M ∗ are twomartingales satisfying (3.1) for all t, they are modifications of each other. ByTheorem 3.8, if Ft is right-continuous and complete, and we will insist onthis in practice, then there is a unique (up to indistinguishability) martingaleM satisfying (3.1) that is cadlag.

Remark 3.23: A slight modification of the above proof of (ii)⇒(iii) establishesthe following result: for X ∈ L1 the class

E[X |G] : G a sub-σ-algebra of F

is uniformly integrable.

Page 63: Financial derivatives in theory and practice

Classes of martingales 39

Two important examples of UI martingales are bounded martingales

(obvious from the definition) and martingales of the form Mt :=Mt∧n whereM is a martingale and n is some constant (using property (ii) of Theorem3.18).

3.2.3 Square-integrable martingales

We are working towards the ultimate goal, in Chapter 4, of defining thestochastic integral, and it is for this reason that we introduce the spaceof square-integrable martingales. The motivation for this is as follows. Inconstructing the stochastic integral we will need to consider the limit M of asequence of martingalesM (n). We require this limit also to be a martingale, sowe need to introduce a space of martingales that is complete. If we restrict toa subclass of UI integrable martingales we are able to identify each martingaleM with its limit random variable M∞ and the problem of finding a complete(normed) space of martingales reduces to finding a complete (normed) spaceof random variables. Such a space is well known, the space L2 of (equivalenceclasses of) square-integrable random variables. Those not familiar with square-integrable random variables and their properties can find all the importantresults in Appendix 2.

Definition 3.24 The martingale spacesM2,M20, cM2 and cM2

0 are definedas follows.

M2 = martingales M :M∞ ∈ L2M2

0 = M ∈M2 :M0 = 0 a.s.cM2 = M ∈M2 :M is continuous a.s.cM2

0 = M ∈M2 :M0 = 0, M is continuous a.s. .

We will need the following result in the special case p = 2. It can be foundin any standard text on martingales.

Theorem 3.25 (Doob’s Lp inequality) Let M be a cadlag martingale,bounded in Lp, for some p > 1, and define

M∗ := supt≥0

|Mt| .

Then M∗ ∈ Lp and

E (M∗)p ≤ p

p− 1p

supt≥0

E (Mt)p .

Page 64: Financial derivatives in theory and practice

40 Martingales

Corollary 3.26 Suppose M is a cadlag martingale bounded in Lp, for somep > 1. Then

Mt → M∞ a.s. and in Lp

andsupt≥0

E [(Mt)p] = limt→∞

E [(Mt)p] = E [(M∞)p] .

Proof: It is immediate that M ∈ L1, hence Mt → M∞ a.s. by Theorem 3.13.Further, applying Doob’s Lp inequality,

|Mt − M∞|p ≤ (2M ∗)p ∈ L1

so Mt → M∞ in Lp by the dominated convergence theorem. Now, for all s ≤ t,the conditional form of Jensen’s inequality implies that

E [|Ms|p] = E [|E[Mt|Fs]|p] ≤ E [|Mt|p]

so E [|Mt|p] is increasing in t. This establishes the first equality. The secondequality follows immediately from Lp convergence.

Theorem 3.27 The spaces M2,M20, cM2 and cM2

0 are Hilbert spaces forthe norm

||M ||2 :=(E[M 2

∞]) 1

2 .

(So the corresponding inner product on this space is 〈M, N〉 = E[M∞N∞].)

Remark 3.28: Recall that a Hilbert space is a complete inner product space.We shall use the inner product structure in Chapter 5.

Remark 3.29: To be strictly correct we need to be a little more careful inour definitions before Theorem 3.27 can hold. Let || · || denote the classicalL2 norm ||X|| = (E[X2])1/2. The problem is that if X and Y are two randomvariables which are a.s. equal, then ||X − Y || = 0 but, in general, X = Y .Hence || · || is not a norm on the space L2. For this to be true we must identifyany two random variables which are almost surely equal. Similarly here, wemust identify any two martingales which are modifications of each other. Incommon with most other authors, we will do this without further commentwhenever we need to. Appendix 2 contains a fuller discussion of the space L2.

Proof: In each case the trick is to identify a martingale with its limit randomvariable M∞ ∈ L2, which we can do by Theorems 3.17 and 3.18. All theproperties of a Hilbert space are inherited trivially from those of L2, withthe exception of completeness. We prove this for cM2

0.

Page 65: Financial derivatives in theory and practice

Stopping times and the optional sampling theorem 41

To prove cM20 is complete we must show that any Cauchy sequence M (n)

converges to some limit M ∈ cM20. So let M (n) be some Cauchy sequence with

respect to || · ||2 in cM20 and let (recalling Remark 3.29),

M∞ := limn→∞

M (n)∞ in L2.

Mt := E[M∞|Ft],

where the limit exists since L2 is complete and each M(n)∞ ∈ L2. It is immediate

from the definition of the norm || · || on L2 that limn→∞ ||M (n) − M ||2 = 0, soit remains only to prove that M ∈ cM2

0. That M is a martingale is immediatefrom its definition, so it suffices to prove that it is continuous and null at zero.But, for each n, M (n) − M is bounded in L2, so Corollary 3.26 implies that

supt≥0

E[(M (n)t − Mt)2] = ||M (n)

∞ − M∞||2 (3.2)

which converges to zero as n → ∞, thus M0 = 0. Further, it follows from (3.2)and Doob’s L2 inequality that there exists some subsequence nk on which

supt≥0

|M (nk)t (ω) − Mt(ω)| → 0

for almost all ω. But the uniform limit of a continuous function is continuous,hence M is continuous almost surely.

Exercise 3.30: Convince yourself of the final step in this proof; if Xn → Xin Lp then Xn → X in probability, and if Xn → X in probability then thereexists some subsequence Xnk

such that Xnk→ X a.s.

The space cM20 will be of particular relevance for the development of

the stochastic integral. It is convenient in that development to start withmartingale integrators which start at zero, hence the reason for not workingin the more general space cM2.

3.3 STOPPING TIMES AND THE OPTIONALSAMPLING THEOREM

3.3.1 Stopping times

Stopping times perform a very important role in both the theory of stochasticprocesses and finance. The theory of stopping times is a detailed and deep one,but for our purposes we can avoid most of it. Roughly speaking, a stoppingtime is one for which ‘you know when you have got there’. Here is the formaldefinition of a stopping time.

Page 66: Financial derivatives in theory and practice

42 Martingales

Definition 3.31 The random variable T, taking values in [0,∞], is an Ftstopping time if

T ≤ t = ω : T (ω) ≤ t ∈ Ft

for all t ≤ ∞.

Definition 3.32 For any Ft stopping time T, the pre-T σ-algebra FT isdefined via

FT = F ∈ F : for every t ≤ ∞, F ∩ T ≤ t ∈ Ft.

Technical Remark 3.33: Definition 3.31 is as we would expect. Note that theinequalities are not strict in either definition; if they were the time T wouldbe an optional time. If the filtration Ft is right-continuous, optional andstopping times agree. See Revuz and Yor (1991) for more on this.

Remark 3.34: The pre-T σ-algebra is the information that we will haveavailable at the random time T. In particular, if T is the constant time t,FT = Ft.

Exercise 3.35: Show that the following are stopping times:(i) constant times;(ii) S ∧ T, S ∨ T and S + T for stopping times S and T ;(iii) supnTn for a sequence of stopping times Tn;(iv) infnTn when Ft is right-continuous.

Exercise 3.36: Show that if S and T are stopping times, then FS∧T =FS ∩ FT. Also, if Ft is right-continuous and Tn ↓ T are stopping times thenFT =

⋂n FTn

.

The most important stopping times for the development of the stochasticintegral will be hitting times of sets by a stochastic process X, that is stoppingtimes of the form

HΓ(ω) := inft > 0 : Xt(ω) ∈ Γ. (3.3)

Note the strict inequality in this definition. The story is different if theinequality is softened (see Rogers and Williams (1987), for example). Timesof the form (3.3) are not, in general, stopping times. The following resultsare sufficient for our purposes and give conditions on either the process Xor the filtration Ft under which HΓ is a stopping time. Indeed, for our

Page 67: Financial derivatives in theory and practice

Stopping times and the optional sampling theorem 43

applications X will always be a.s. continuous and the filtration will always becomplete.

Theorem 3.37 Let Γ be an open set and suppose X is Ft adapted; ifX is entirely cadlag, or if X is cadlag and Ft is complete, then HΓ is anFt+ stopping time.

Proof: It suffices to prove that HΓ < t ∈ Ft for all t since then HΓ ≤ t =⋂q∈(t,t+1/n]∩QHΓ < q ∈ Ft+1/n for all n > 0. Taking the intersection over all

n > 0 shows that HΓ ≤ t ∈ Ft+.Denote by Ω∗ the set of ω for which X(ω) is cadlag and by At the set

At :=⋃

q∈(0,t)∩Q

ω : Xq(ω) ∈ Γ

=⋃

q∈(0,t)∩Q

X−1q (Γ).

Note that At ∈ Ft since Γ is open, and on Ω∗ the events HΓ < t and At

coincide. Thus we can write

HΓ < t =(

Ω∗ ∩ HΓ < t)∪

((Ω∗)c ∩ HΓ < t

)

=(

Ω∗ ∩ At

)∪

((Ω∗)c ∩ HΓ < t

). (3.4)

When X is entirely cadlag, Ω∗ = Ω and HΓ < t = At ∈ Ft. When Ft iscomplete both Ω∗ and the second event in (3.4), which has probability zero,are in Ft, as is At, hence HΓ < t ∈ Ft as required.

Corollary 3.38 Suppose, in addition to the hypotheses of Theorem 3.37,that Ft is right-continuous. Then HΓ is an Ft stopping time.

Exercise 3.39: Let Γ be an open set and let X be adapted to Ft. Supposethat X is entirely continuous, or that X is a.s. continuous and Ft iscomplete. Show that HΓ need not be an Ft stopping time.

Theorem 3.40 Let Γ be a closed set and suppose X is Ft adapted. If Xis entirely cadlag, or if X is cadlag and Ft is complete, then

H∗Γ := inft > 0 : Xt(ω) ∈ Γ or Xt−(ω) ∈ Γ

is an Ft stopping time.

Page 68: Financial derivatives in theory and practice

44 Martingales

Proof: We prove the entirely cadlag version of the result. The simple argumentused in Theorem 3.37 extends this result to the cadlag case. Let ρ(x, Γ) :=inf|x − y| : y ∈ Γ. Defining the open set Γn = x : ρ(x, Γ) < n−1, thosepoints within distance n−1 of Γ, we have Γ =

⋂n Γn, and so

H∗Γ ≤ t = Xt ∈ Γ ∪ Xt− ∈ Γ ∪

( ⋂n

⋃q∈(0,t)∩Q

Xq ∈ Γn)

∈ Ft.

Corollary 3.41 If X is entirely continuous (or if X is continuous and Ftis complete), Ft adapted, Γ a closed set, then HΓ is an Ft stopping time.

Example 3.42 Consider a Brownian motion W on its canonical set-up(Theorem 2.2), and let

T = inft > 0 : Wt + t ∈ [−a, b]

for some a, b > 0. The set (−∞, a) ∪ (b,∞) is an open subset of R andW is entirely continuous, so T is an Ft stopping time by Corollary 3.39above.

Example 3.43 Now let W be a two-dimensional Brownian motion on acomplete space (Ω, Ft,F , P) and let

T = inft > 0 : Wt ∈ D

whereD = (x, y) : x2 + y2 < K.

Now, since Brownian motion is a.s. continuous and Ft is complete, we canapply Corollary 3.41 to deduce that T is an Ft stopping time.

Given the intuitive interpretation of FT as the amount of informationavailable at the stopping time T, we would also hope that XT ∈ FT if X is aprocess adapted to Ft. It is true in certain circumstances.

Theorem 3.44 Let X be adapted to Ft and let T be some Ft stoppingtime. If X is entirely cadlag, or if X is cadlag and Ft is complete, then XT

is FT-measurable.

Remark 3.45: Theorem 3.44 holds more generally when X is progressivelymeasurable, as defined later in Theorem 3.60. This is a joint measurabilitycondition which holds when X is cadlag and adapted and Ft is complete.

Page 69: Financial derivatives in theory and practice

Stopping times and the optional sampling theorem 45

Remark 3.46: If T (ω) =∞ and X∞(ω) exists, we define XT (ω) = X∞(ω). IfT (ω) =∞ but X∞ does not exist, we take XT (ω) = 0 by convention.

An important idea for the results to come is that of a stopped process.

Definition 3.47 Given a stochastic process X and a stopping time T , thestopped process XT is defined by

XTt (ω) := Xt∧T (ω) .

There is a natural and immediate extension of Theorem 3.44 to processesas follows.

Corollary 3.48 If T is an Ft stopping time and if X is Ft-adaptedand entirely cadlag (or cadlag and F t is complete), then the process XT isFt∧T adapted (and therefore also Ft-adapted).

3.3.2 Optional sampling theorem

Now that we have met the concept of a martingale and a stopping time, itis time to put the two together and ask the question: does the martingaleproperty

Mt = E[MT |Ft]

hold if T is a stopping time? In general the answer is no, as can be seen bytaking M to be Brownian motion and T = inft > 0 : Mt ≥ 1. There is animportant situation when it is true, however — when M is a UI martingale.Indeed, this property is a characterizing property of UI martingales, asTheorems 3.50 and 3.53 demonstrate.We must first prove a restricted version of the result.

Theorem 3.49 Suppose S ≤ T are two stopping times taking at most afinite number of values (possibly including ∞), and let M be some UI, cadlagmartingale. Then

MS = E[MT |FS ] = E[M∞|FS ] a.s.

Proof: It suffices to prove the first equality since the second then follows byapplying it to the stopping time T =∞. Let 0 = t0 ≤ . . . ≤ tn ≤ ∞ be a setof values containing all those that S and T can take. For any F ∈ FS we have

(MT −MS)1F =

n

j=1

(Mtj −Mtj−1)1S<tj≤T 1F

Page 70: Financial derivatives in theory and practice

46 Martingales

and thus

E (MT −MS)1F =

n

j=1

E (Mtj −Mtj−1)1(S<tj≤T )1F

=

n

j=1

EE (Mtj −Mtj−1)1S<tj≤T1F |Ftj−1

=

n

j=1

E 1S<tj≤T1FE (Mtj −Mtj−1)|Ftj−1

= 0 .

This holds for all F ∈ FS, and MS is FS-measurable; henceE[MT |FS ] = E[MS |FS ] =MS a.s.

It is now a short step to prove the more general result.

Theorem 3.50 IfM is a uniformly integrable, cadlag martingale and S ≤ Tare two stopping times, then

MS = E[MT |FS ] = E[M∞|FS ] a.s.

Proof: If suffices to prove that, for any stopping time S,

MS = E[M∞|FS], (3.5)

since then

E[MT |FS] = E E[M∞|FT ]|FS = E[M∞|FS ] =MS .

To establish (3.5), we need to show that, for all F ∈ FS, E[MS1F ] = E[M∞1F ].Define the sequence of Ft stopping times Sk ↓ S as k →∞ via

Sk =+∞ if S ≥ k,q2k if q−1

2k ≤ S < q2k < k .

Since M is UI, M∞ exists and, for each k, the pair of stopping times Sk,∞satisfy the conditions of Theorem 3.49. Thus,

MSk = E[M∞|FSk ] a.s. (3.6)

Conditioning on FS (⊆ FSk), we deduce thatE[MSk |FS ] = E[M∞|FS ],

i.e. E[MSk1F ] = E[M∞1F ] for all F ∈ FS . The result now follows if wecan show that, for all F ∈ FS, E[MSk1F ] → E[MS1F ] as k → ∞, i.e. thatMSk →MS in an L1 sense. But the representation in (3.6) and Remark 3.23show that the family MSk is UI, and the right-continuity ofM ensures thatMSk → MS a.s. These two conditions combined ensure the L1 convergencethat we require (Exercise 3.21).The following result was established as part of this last proof and will be

needed shortly.

Page 71: Financial derivatives in theory and practice

Stopping times and the optional sampling theorem 47

Corollary 3.51 If M is a uniformly integrable, cadlag martingale and S isa stopping time, then E[|MS |] <∞.

Remark 3.52: There are several special cases of the optional sampling theoremwhich are often stated in slightly different form. For example, we can dropthe UI requirement on M if the stopping time T is bounded by some n sinceMn is a UI martingale. Also, if M is bounded up until the stopping time Tthe result also holds since M T is UI (and we shall shortly prove that it is alsoa martingale).

There is a very useful partial converse to the optional sampling theoremwhich characterizes UI martingales as precisely those for which the optionalsampling theorem holds.

Theorem 3.53 Let M be some cadlag and adapted process such that, forevery stopping time T ,

E |MT | <∞ and E[MT ] = 0.

Then M is a uniformly integrable martingale.

Proof: As per our earlier convention, define M∞(ω) = limt→∞Mt(ω) if thelimit exists,M∞(ω) = 0 otherwise. Note thatM∞ thus defined is in L1 (takingT =∞ in the hypothesis of the theorem). By Theorem 3.18, it suffices to showthat Mt = E[M∞|Ft], i.e. for all t ≥ 0, all F ∈ Ft,

E[Mt1F ] = E[M∞1F ]. (3.7)

Define the stopping time T via

T (ω) =t if ω ∈ F∞ if ω ∈ F c.

Applying the hypotheses to stopping times T and ∞ yields

E[M∞] = E[M∞1F ] + E[M∞1F c ] = 0,

E[MT ] = E[Mt1F ] + E[M∞1F c ] = 0.

Combining these two yields (3.7) as required.This provides the final result needed to prove the following important

proposition.

Proposition 3.54 LetM be an Ft martingale and T be an Ft stoppingtime. IfM is entirely cadlag, or ifM is cadlag and F t is complete, thenMT

is also an Ft martingale. That is, the class of martingales is stable understopping.

Page 72: Financial derivatives in theory and practice

48 Martingales

Proof: We must verify properties (M.i)—(M.iii) of Definition 3.1. The first, thatMT is adapted, is an immediate consequence of Corollary 3.48. To establishproperty (M.ii), note that M t is a UI martingale (since M t

s = E[M t∞|Fs]

for all s), and thus Corollary 3.51 implies that E[|M tT |] = E[|MT

t |] < ∞ asrequired. It remains to show that M T has the martingale property, (M.iii).This is clearly equivalent to showing that, for all 0 ≤ s ≤ t,

E[MT∧tt |Fs] =MT∧t

s ,

and we do this by showing that MT∧t is a UI martingale. We appeal toTheorem 3.53. Since M t is UI, the optional sampling theorem implies that,for all stopping times S,

M0 = E[M tT∧S ] = E[M

T∧tS ] ,

and Corollary 3.51 implies that E[|M T∧tS |] = E[|M t

T∧S |] <∞. We are done.Example 3.55 Let W be a Brownian motion and define T = inft > 0 :Wt /∈[−a, b], a, b > 0. We will show that:(i) T <∞ a.s.;(ii) P(WT = b) = a/(a+ b);(iii) E[T ] = ab.

To do this we will consider stopped versions of each of the martingalesintroduced in Example 3.3. Consider first the martingaleM λ

t = exp λWt∧T −12λ

2(t ∧ T ) which is bounded by zero below and by exp(λ(a + b)) above, sois certainly UI. Applying the optional sampling theorem at the stopping timeT yields

1 = E[Mλ0 ] = E[exp(λWT − 1

2λ2T )]

= E[exp(λWT − 12λ

2T )1T<∞]

→ E[1T<∞]

as λ→ 0 by the dominated convergence theorem. This proves (i) (which alsofollows from Theorem 2.8). Now applying the optional sampling theorem tothe UI martingale W T yields

0 = E[W0] = E[W TT ]

= E[b1WT=b − a1WT=−a]

= bP(WT = b)− a(1− P(WT = b)),

which is (ii). Finally, apply the optional sampling theorem to the UImartingale Mn

t :=W2t∧n − (t ∧ n) at the stopping time T to obtain

0 = E[W 2T∧n − (T ∧ n)] .

Page 73: Financial derivatives in theory and practice

Variation, quadratic variation and integration 49

As n → ∞, T ∧ n ↑ T a.s., and W 2T∧n ≤ (a + b)2, so applications of the

monotone and dominated convergence theorems yield

E[T ] = E[W 2T ] =

ab2 + ba2

a+ b= ab .

3.4 VARIATION, QUADRATIC VARIATION ANDINTEGRATION

We are approaching the point where we can start to define the stochasticintegral. Before we can do so, there is one more idea which we must introduce,that of quadratic variation. First we will introduce the more classical conceptof total variation and review its role in integration.

3.4.1 Total variation and Stieltjes integration

Definition 3.56 Let Xt, t ≥ 0 be some stochastic process. The totalvariation process VX of X is defined, for each t ≥ 0, as

VX(t,ω) = sup∆

n(∆)

i=1

|X(ti,ω)−X(ti−1,ω)|,

where ∆ = t0 = 0 < t1 < . . . < tn(∆) = t is a partition of [0, t].

Remark 3.57: Clearly this is a sample path definition and, for each ω, it agreeswith the classical definition of total variation for functions.

Proposition 3.58 If X is Ft-adapted and entirely cadlag (or cadlag andFt is complete) then VX is also entirely cadlag (respectively cadlag) andFt-adapted.Proof: We prove the result when X is entirely cadlag, the cadlag result thenfollowing along the lines of the proof of Theorem 3.37. Given any ω and t wehave, since X(ω) is cadlag, that

VX(t,ω) = supn

2nt +1

i=1

X(t ∧ (i/2n),ω)−X(t ∧ ((i− 1)/2n),ω) .

This is straightforward to verify and is left as an exercise for the reader. Foreach n the approximating sum is clearly Ft-measurable, and the supremumof a countable number of measurable functions is itself measurable, thus VXis Ft-adapted. The cadlag property is immediate.

Page 74: Financial derivatives in theory and practice

50 Martingales

Definition 3.59 The process X is said to be of finite variation if thefollowing conditions hold:

(i) VX(t,ω) < +∞ for all t, a.s.;(ii) X is cadlag;(iii) X is Ft-adapted.The first property is the main defining characteristic of a finite variation

process. The other two are conditions we have imposed and will continue toimpose on all the processes that will interest us.The advantage of a process being of finite variation is that it is easy to

define a stochastic integral with respect to a finite variation process. This isdone as for Stieltjes integration, as summarized by the following theorem.

Theorem 3.60 Let X be some finite variation process. Then there existincreasing, adapted and cadlag processes X+ and X− such that

X = X+ −X− a.s.

Furthermore, since both X+ and X− are cadlag, we can define the stochasticintegral H •X for any bounded B(R+)⊗ F-measurable integrand H to be

(H •X)t(ω) =t

0

Hu(ω)dX+u (ω)−

t

0

Hu(ω)dX−u (ω) (3.8)

(set it to zero if X(ω) is not cadlag) where the integrals in (3.8) are Lebesgue—Stieltjes integrals. The stochastic integral (H •X) satisfies conditions (i) and(ii) of a finite variation process.A process H is called progressively measurable with respect to Ft if, for

each t, the map (s,ω)→ Hs(ω) from [0, t]×Ω→ (R,B(R)) is B([0, t])⊗Ft-measurable. If H is bounded and progressively measurable then H •X is alsoFt-adapted, hence H •X is of finite variation.

Proof: The functions X+ = 12 (X + VX), X

− = 12 (X − VX) have the desired

properties, as is easily verified. It is then standard integration theory that(3.8) is well defined for any bounded H . That VH •X(t) <∞ follows from theboundedness of H over [0, t] and the finite total variation of X+ and X− over[0, t]. That H •X is cadlag and (when H is progressively measurable) adaptedis straightforward.

Remark 3.61: Note in the above proof that the definition of the integral is notdependent on the particular decomposition of X (which is not unique).

What we have succeeded in doing is to define the stochastic integral of abounded process with respect to a finite variation integrator, and we havedone this on a pathwise basis using no more than standard integration theory.The boundedness condition on the integrand is easily weakened, but we shalldefer this until Chapter 4. The question now is, which processes are of finitevariation? Does this technique allow us to define H •W for a Brownian motionW ?

Page 75: Financial derivatives in theory and practice

Variation, quadratic variation and integration 51

Theorem 3.62 The only continuous martingales of finite variation are a.s.constant.

Proof: Let M be some continuous finite variation martingale and, withoutloss, supposeM0 = 0. We further suppose, by augmentation if necessary, thatthe filtration is complete (note that M is still a martingale for this larger

filtration). We show that M ≡ 0 where M =MT and

T = inft > 0 : VM (t) > K ,for some arbitrary K. This is clearly sufficient.

Since M is continuous, to prove that M ≡ 0 a.s. it suffices to prove thatMq = 0 a.s. for all q ∈ Q+. This follows if we can establish that E[M2

t ] = 0for all t ≥ 0. But for any partition ∆ of [0, t] we have

M2t =

n(∆)

j=1

(M2tj −M2

tj−1)

=

n(∆)

j=1

(Mtj −Mtj−1)2 + 2

n(∆)

j=1

(Mtj −Mtj−1)Mtj−1 . (3.9)

Since T is a stopping time (Corollary 3.39), M T is a martingale (Proposition3.54) and so the second term in (3.9) has expectation zero. Thus

E[M2t ] = E

n(∆)

j=1

(Mtj −Mtj−1)2

≤ E VM(t) sup

j≤n(∆)|Mtj −Mtj−1 |

≤ KE supj≤n(∆)

|Mtj −Mtj−1 | .

Taking a successively finer submesh, this supremum decreases a.s. to zero by

the continuity of M , hence, by dominated convergence, the expectation alsodecreases to zero as required.So if we want to define a stochastic integral relative to a martingale

integrator we will have to work harder. All non-trivial continuous martingalesare of infinite variation. However, they are all of finite quadratic variation,and it is this property that will allow us to define the stochastic integral forcontinuous martingale integrators in Chapter 4.

3.4.2 Quadratic variation

The fact that the only continuous martingales of finite variation are constantmeans we will not be able to define the stochastic integral relative tomartingales as in the previous section. We will also need to replace thevariation process with something, and this is quadratic variation.

Page 76: Financial derivatives in theory and practice

52 Martingales

Definition 3.63 Let X be some continuous process null at zero. Define thedoubly infinite sequence of times T nk (or T

nk (X) when we need to be explicit)

viaT n0 ≡ 0, T nk+1 = inft > T nk : |Xt −XTnk | > 2−n .

The quadratic variation of the process X , written [X ], is defined to be

[X ]t(ω) := limn→∞

k≥1Xt∧Tn

k(ω)−Xt∧Tn

k−1(ω)

2,

or +∞ when the limit does not exist.

Remark 3.64: We have in this definition been very specific about the sequenceof meshes to be used to define the quadratic variation. In particular, they areall nested. It is also possible to define quadratic variation using other nestedmeshes and similar results hold to the ones that we prove in this and the nextchapter. The reason for our particular choice is that we will be able to proveseveral almost sure results, whereas when using other definitions the resultsare only true in probability. It makes no essential difference to the theory, butalmost sure results are often easier to appreciate.

Remark 3.65: The restriction in the definition of quadratic variation thatthe meshes be nested differs from the classical definition (for a function)under which the quadratic variation of any non-trivial continuous martingaleis infinity. This contrasts with our next result.

Theorem 3.66 For any continuous martingale M the quadratic variationprocess [M ] is adapted and a.s. finite, continuous and increasing.

Remark 3.67: As usual, continuous here means entirely continuous or a.s.continuous on a complete filtered space. We need this to ensure that theT nk (M) are stopping times in the proof below.

Proof: We will for now only prove the result for the case when M is bounded(and thus, in particular, M ∈ cM2

0). The general result, and more, followseasily once we have met the idea of localization in Section 3.5, by the inclusionof the phrase ‘By localization we may assume M is bounded’ at the start ofthis proof.We make the following definitions:

tnk = t ∧ T nk (M),snk = sup

jtn−1j : tn−1j ≤ tnk,

∆nk =Mtnk−Mtn

k−1,

Hnk =Msn

k−Mtn

k,

Page 77: Financial derivatives in theory and practice

Variation, quadratic variation and integration 53

and note that

|Hnk | ≤ 2−(n−1) . (3.10)

The times snk and tnk are illustrated in Figure 3.1.

42

t1n

M

t10n

/

32/

22/

12/

12/

22/

32/

42/

t0n= t = s = s0

n 10n

1n

t2n= t = s = ... = s1

n 12n

5n

t6n= t = s = ... = s2

n 16n

9n

= t = s = s3n 1

10n

11n

n

n

n

n

n

n

n

n

Figure 3.1 The times snk and tnk

Defining

Ant =k≥1

Mtnk−Mtn

k−12,

it is readily seen that

Ant =M2t − 2Nn

t (3.11)

where

Nnt =

k≥1Mtn

k−1 Mtnk−Mtn

k−1

=k≥1

Msn+1k−1

Mtn+1k−Mtn+1

k−1.

Page 78: Financial derivatives in theory and practice

54 Martingales

Invoking the notation above, Nnt − Nn+1

t =∑

k≥1 Hn+1k−1 ∆n+1

k , whence

E[(Nnt − Nn+1

t )2] = E[∑j,k≥1

Hn+1j−1 Hn+1

k−1 ∆n+1j ∆n+1

k ]

=∑j,k≥1

EE[Hn+1j−1 Hn+1

k−1 ∆n+1j ∆n+1

k |Ftn+1j−1 ∨tn+1

k−1]

=∑k≥1

E[(Hn+1k−1 )2(∆n+1

k )2]

≤ 4−n∑k≥1

E[(∆n+1k )2]

≤ 4−nE[M 2t ].

Taking limits as t → ∞, and applying Corollary 3.26 to both sides of theabove, yields

E[(Nn∞ − Nn+1

∞ )2] ≤ 4−nE[M 2∞],

and thus, for all m ≥ n,

E[(Nn∞ − Nm

∞)2] ≤ 14n−1 E[M 2

∞]. (3.12)

So Nn is a Cauchy sequence in cM20 and converges in L2 to some continuous

limit N ∈ cM20. But it follows from the bound in (3.12) and the first

Borel–Cantelli lemma that the convergence is also almost sure. Hence, from(3.11), An converges a.s., uniformly in t, to some adapted and continuousprocess [M ].

It remains toprove that [M ] is increasing.For eachω andanyfixeds ≤ t, definek(m, ω) = maxk : Tm

k (ω) ≤ s, k(m, ω) = mink : Tmk (ω) ≥ t. Note that, as

m → ∞, Tmk(m,ω) ↑ s∗ ≤ s, Tm

k(m,ω)↓ t∗ ≥ t, and that

[M ]t(ω) − [M ]s(ω) = [M ]t∗(ω) − [M ]s∗(ω)

= limn→∞

limm→∞

An(Tmk(m,ω)) − An(Tm

k(m,ω)) ≥ 0.

We will need the following corollary in the next chapter.

Corollary 3.68 Let M be some continuous martingale null at zero. If Snk

is any doubly infinite sequence of stopping times, increasing in k, such that,for each n,(i) Tn

k , k ≥ 0 ⊆ Snk , k ≥ 0,

(ii) Snk , k ≥ 0 ⊆ Sn+1

k , k ≥ 0,where Tn

k is as in Definition 3.63, then, for almost every ω,

[M ]t(ω) := limn→∞

∑k≥1

[Mt∧Snk(ω) − Mt∧Sn

k−1(ω)]2. (3.13)

Page 79: Financial derivatives in theory and practice

Variation, quadratic variation and integration 55

Proof: The proof of Theorem 3.66 goes through, with Snk replacing Tn

k ,unaltered except for (3.10), which becomes

|Hnk | ≤ 2−n,

and subsequent consequential changes to future inequalities in the proof. Wethus conclude that the right-hand side of (3.13) has the properties of [M ] asstated in Theorem 3.66. But [M ] is the only process with these properties, aswe show in Meyer’s theorem below. Remark 3.69: Note that quadratic variation is a sample path property. Ifwe change the probability measure on our space from P to some equivalentprobability measure Q (so P(F ) = 0 ⇔ Q(F ) = 0), the quadratic variation ofM will not change and remains a.s. finite.

We now give a very important result which follows almost immediately fromwhat we have done above. This is a special case of the Doob–Meyer theorem,the full version of which we meet in Section 3.6.

Theorem 3.70 (Meyer) The quadratic variation [M ] of a continuoussquare-integrable martingale M is the unique increasing, continuous processsuch that M 2 − [M ] is a UI martingale.

Proof: We saw in the proof above that for M ∈ cM20, [M ] has the required

properties, so it remains only to prove the uniqueness. Suppose A is anotherprocess with these properties. Both [M ] and A are increasing, hence A −[M ] is of finite variation. Further, A − [M ] = (M 2 − [M ]) − (M 2 − A) is acontinuous martingale, null at zero. By Theorem 3.62 it must therefore beidentically zero. 3.4.3 Quadratic covariation

The definition of the quadratic covariation of two processes is an obviousextension of the quadratic variation for a single process.

Definition 3.71 Let X and Y be continuous processes null at zero andlet the doubly infinite sequence of times Tn

k (X) and Tnk (Y ) be as defined in

Definition 3.63. Now define the doubly infinite sequence Snk to be the ordered

union of the Tnk (X) and Tn

k (Y ),

Sn0 = 0,

Snk = inf

t ∈ Tn

k (X) : k ≥ 0 ∪ Tnk (Y ) : k ≥ 0 : t > Sn

k−1

, k > 0.

The quadratic covariation of the processes X and Y, written [X, Y ], is definedto be

[X, Y ]t(ω) := limn→∞

∑k≥1

[Xt∧Snk(ω) − Xt∧Sn

k−1(ω)][Yt∧Sn

k(ω) − Yt∧Sn

k−1(ω)],

or +∞ when the limit does not exist.

Page 80: Financial derivatives in theory and practice

56 Martingales

Theorem 3.72 (Polarization identity) For almost every ω,

[M,N ](ω) = 12 [M +N ]− [M ]− [N ] (ω) .

Proof: This is an easy consequence of Corollary 3.68 and is left as an exercise.

The polarization identity reduces most results about quadratic covariationto the analogous results for the quadratic variation. An important exampleis the following generalization of Theorem 3.70. The proof is left as an exercise.

Corollary 3.73 LetM and N be continuous square-integrable martingales.The quadratic covariation [M,N ] of M and N is the unique finite variation,continuous process such that MN − [M,N ] is a UI martingale. Thus, theprocess MN is a UI martingale if and only if [M,N ] = 0.

3.5 LOCAL MARTINGALES AND SEMIMARTINGALES

Many of the processes that we meet in finance are martingales, but this classof processes is certainly too restrictive. In this section we extend this class,eventually to semimartingales. All the processes that we shall encounter willbe semimartingales. First we shall meet local martingales and the idea oflocalization, a simple yet, as we shall see repeatedly, very powerful technique.

3.5.1 The space cMloc

We have seen that the class of (continuous) martingales is stable understopping (Proposition 3.54). We now define the class of processes which arereduced toM0, the class of martingales null at zero, by an increasing sequenceof stopping times. This technique, of taking a stochastic process and using anincreasing sequence of stopping times to generate a sequence of processes withsome desirable property, is what we mean by localization.

Definition 3.74 Let M be an adapted process null at 0. Then M is calleda local martingale null at 0, and we write M ∈ M0,loc, if there exists anincreasing sequence Tn of stopping times with Tn ↑ ∞ a.s. such that eachstopped process MTn is a martingale (null at 0). If M is also continuous wewrite M ∈ cM0,loc. The sequence Tn is referred to as a reducing sequencefor M (intoM0).A process M is called a (continuous) local martingale, written M ∈Mloc

(respectively M ∈ cMloc), if M0 is F0-measurable and M −M0 ∈ M0,loc

(respectively M −M0 ∈ cM0,loc).

Remark 3.75: Observe that no integrability condition is imposed on therandom variable M0.

Page 81: Financial derivatives in theory and practice

Local martingales and semimartingales 57

Remark 3.76: A martingale is a local martingale but the converse is not true.It is a common misconception to believe that if we have integrability (i.e.E |Mt| < ∞ for all t ≥ 0) then this will ensure that a local martingale is amartingale. Indeed a UI local martingale need not be a martingale. A counter-example to this and other related examples can be found in Revuz and Yor(1991, p. 182).

Remark 3.77: In contrast to Remark 3.76, uniform integrability can sometimeshelp establishing that a local martingale is a true martingale. If M is a localmartingale and for each t > 0 the family M Tn

t is UI then the dominatedconvergence theorem can be applied to show that E[Mt|Fs] =Ms, so M is amartingale.

In Chapter 4, we shall define the stochastic integral for integrators in cM20

and then use localization to extend the theory to more general continuousintegrators. Should we not, therefore, consider processes which localize to thissmaller class? The answer lies in the following simple lemma. This lemmais an example of the simplification that comes from restricting to continuousintegrators. Let cM2

0,loc be the space of local martingales which can be reduced

to martingales in cM20.

Lemma 3.78 Let M ∈ cM0,loc and, for each n ∈ N, define

Sn := inft > 0 : |Mt| > n .

Then for each n, MSn is an a.s. bounded martingale, whence

cM0,loc = cM20,loc .

Proof: Suppose Tk is a reducing sequence for M and that we stop themartingale MTk at the stopping time Sn. Recalling that the space cM0 isstable under stopping, the stopped process (M Tk)Sn = MTk∧Sn is also amartingale (null at 0) and for s < t we have

MTk∧Sn∧s = E MTk∧Sn∧t|Fs .

As k ↑ ∞, MTk∧Sn∧t →MSn∧t a.s. and, since M is continuous,

|MTk∧Sn∧t| ≤ n

almost surely for all k. It now follows from the dominated convergencetheorem that

MSn∧s = E[MSn∧t|Fs],and we are done.Noting that Sn ↑ ∞, it is clear from the lemma that we can always use the

sequence Sn to check if a given continuous adapted process is in cM0,loc.

Page 82: Financial derivatives in theory and practice

58 Martingales

This sequence can also be used to complete the proofs of Theorems 3.62and 3.66. Indeed, these results can both be extended to local martingales,as summarized in the following two theorems.

Theorem 3.79 For any M ∈ cM0,loc, the quadratic variation process [M ]is adapted and a.s. finite, continuous and increasing. Furthermore, [M ] is theunique adapted, continuous, increasing process such that

M 2 − [M ] ∈ cM0,loc. (3.14)

Proof: For each n ∈ N, define Sn as in Lemma 3.78. Each MSn is bounded,hence Theorem 3.66 holds and shows that [MSn ] is finite, adapted, continuousand increasing. That [M ] is adapted now follows since, given any t ≥ 0,[M ]t(ω) = supn[MSn ]t(ω), and the supremum of a countable measurable familyis itself measurable.

Let Et be the event that [M ] is continuous and increasing over the interval[0, t ], and let En

t be the corresponding event for [MSn ]. Theorem 3.66 showsthat P(En

t ) = 1 for all n and t. Since Mt(ω) = limn↑∞ MSnt (ω) for almost all

ω,1Ent↓ 1Et

and thusP(Et) = lim

n↑∞P(En

t ) = 1,

as required.Property (3.14) follows by a similar argument. The uniqueness claim for

[M ] is immediate from the next result, which extends Theorem 3.62 (andcompletes that proof).

Theorem 3.80 Suppose M ∈ cM0,loc. If M has paths of finite variationthen M ≡ 0 a.s.

Proof: Let VM denote the total variation process of M. Stopping M at thestopping time Tn = inft > 0 : VM(t) > n reduces the claim here to the casecovered by Theorem 3.66, i.e. when M is a bounded continuous martingalewith paths of bounded variation.

A simple corollary identifies the quadratic variation of Brownian motion.

Corollary 3.81 If W is a Brownian motion then [W ]t = t.

Proof: Example 3.3 established the fact that W 2t − t is a martingale. The

result thus follows from Theorem 3.79. We conclude this section with the extension of Theorem 3.79 to the

corresponding result for the quadratic variation of two local martingales. Theproof follows from the polarization identity.

Corollary 3.82 Let M, N ∈ cM0,loc. The quadratic covariation [M, N ] ofM and N is the unique finite variation, continuous process such that MN −[M, N ] ∈ cM0,loc. Thus, the process MN is a continuous local martingale ifand only if [M, N ] = 0.

Page 83: Financial derivatives in theory and practice

Local martingales and semimartingales 59

3.5.2 Semimartingales

The class of semimartingales is a relatively minor extension of the class oflocal martingales but it is, in fact, of fundamental importance. It is a verygeneral class of processes and it is, in a sense, the largest class of integratorsfor which the stochastic integral can be defined (Theorem 4.27).

Definition 3.83 A process X is called a semimartingale (relative to theset-up (Ω, Ft,F ,P)) if X is an adapted process which can be written in theform

X = X0 +M +A , (3.15)

where X0 is an F0-measurable random variable, M is a local martingale nullat zero and A is an adapted cadlag process, also null at zero, having paths offinite variation.We denote by S the space of semimartingales and by cS the subspace of

continuous semimartingales.

We shall usually be considering only continuous semimartingales (certainlywhen we define the stochastic integral). The continuity assumption simplifiesthe theory. It turns out that when X is continuous, the processes M andA in the decomposition (3.15) can also be taken to be continuous. It thenfollows from Theorem 3.80 that this continuous decomposition is unique upto indistinguishability (but see Remark 3.85). This allows us to make thefollowing definition.

Definition 3.84 If X is a continuous semimartingale and the decomposition(3.15) is taken to be such that M and A are continuous, we call M thecontinuous local martingale part of X and denote it by M = X loc.

Remark 3.85: Note that the decomposition (3.15) is not unique (even whenX is continuous) if we allow M and A to have discontinuities since there arenon-trivial discontinuous (local) martingales which are of finite variation.

To complete our study of semimartingales we now consider the quadraticvariation of a continuous semimartingale which is, of course, defined viaDefinition 3.63. The following lemma allows us to show that, if X is acontinuous semimartingale then [X ] = [X loc], so we already know all about[X ]. We omit the proof of the lemma. The result is an obvious extension of theequivalent result for a martingale given in Corollary 3.68 once one has seenthe proof of the integration by parts result given in Section 4.6. The readerwho wants to prove this now should look ahead to equation (4.22).

Lemma 3.86 Let X be a continuous semimartingale and let T nk be asin Definition 3.63. If Snk is any doubly infinite sequence of stopping times,increasing in k, such that, for each n,(i) T nk : k ≥ 0 ⊆ Snk : k ≥ 0,

Page 84: Financial derivatives in theory and practice

60 Martingales

(ii) Snk : k ≥ 0 ⊆ Sn+1k : k ≥ 0,then, for almost all ω,

[X ]t(ω) = limn→∞

k≥1Xt∧Sn

k(ω)−Xt∧Sn

k−1(ω)2. (3.16)

Theorem 3.87 For any continuous semimartingale X ∈ cS,[X ] = [X loc]

and, if Y is another continuous semimartingale,

[X,Y ] = [X loc, Y loc] .

Proof: The second equality follows by applying polarization to the first.By stopping X at the time T = inft > 0 : VA(t) > K, where A is the

finite variation part of X , we may assume that VA ≤ K; the general resultthen follows by localization. SupposeX has decomposition (3.15) and considerthe doubly infinite sequence of times Snk defined, for each n, via

Sn0 = 0,

Snk = inf t ∈ T nk (X) : k ≥ 0 ∪ T nk (M) : k ≥ 0 : t > Snk−1 , k > 0,

where T nk (X) (respectively T nk (M)) are the defining stopping times inthe definition of [X ] (respectively [M ]). Letting snk := S

nk ∧ t, we now have by

Lemma 3.86,

[X ]t(ω) = limn→∞

k≥1Xsn

k(ω)−Xsn

k−1(ω)

2

= limn→∞

k≥1Msn

k(ω)−Msn

k−1(ω)2

+k≥1

2 Msnk(ω)−Msn

k−1(ω) + Asn

k(ω)−Asn

k−1(ω)

× Asnk(ω)−Asn

k−1(ω) ,

for almost all ω. For each ω, the first term on the right-hand side convergesto [M ]t by Corollary 3.68. The remaining term is, for each n, bounded by3× 2−nVA(t) ≤ 3× 2−nK → 0 as n→∞. This completes the proof.Remark 3.88 (Change of filtrations and measure): The definition of quadraticvariation is a sample path definition and is in no way dependent on theprobability measure P or the filtration Ft. Clearly the statement that thequadratic variation is a.s. finite and increasing does depend on the probabilitymeasure, but if [X ] is a.s. finite and increasing under P it will also be a.s. finiteand increasing for any Q ∼ P.

Page 85: Financial derivatives in theory and practice

Supermartingales and the Doob—Meyer decomposition 61

Remark 3.89: Suppose X is a continuous semimartingale on (Ω, Ft,F ,P).What happens to X if we change Ft or P? We shall see in Chapter 5 thatif Q ∼ P then X is also a semimartingale for Q, but the decomposition (3.15)will, in general, change.Clearly X will not necessarily remain a continuous semimartingale if we

change the filtration Ft, to a finer filtration for example.

3.6 SUPERMARTINGALES AND THE DOOB—MEYERDECOMPOSITION

Supermartingales are an obvious generalization of martingales and will playan important role in the development of term structure models in Chapter 8.

Definition 3.90 Let (Ω,F ,P) be a probability triple and Ft be a filtra-tion on F . A stochastic process X is an Ft supermartingale if:(i) X is adapted to Ft;(ii) E |Xt| <∞ for all t ≥ 0;(iii) E[Xt|Fs] ≤ Xs a.s., for all 0 ≤ s ≤ t.If −X is a supermartingale then we say that X is a submartingale.

Remark 3.91: A supermartingale is a process which ‘drifts down on average’.Clearly any martingale is also a supermartingale (indeed a process is amartingale if and only if it is both a supermartingale and a submartingale).It is also an immediate consequence of Fatou’s lemma that if X is a localmartingale bounded below by some constant then X is also a supermartingale.

Many of the results stated in this chapter for martingales also hold moregenerally for supermartingales. We shall not explicitly need most of thesemore general results. Note, in particular, however, that Theorem 3.8 holds fora supermartingale X (it has a cadlag modification if F t is right-continuousand complete), as does Doob’s (super)martingale convergence theorem.We finish this chapter with a result of great importance within probability

theory, the Doob—Meyer decomposition theorem (we have already met aspecial case of this decomposition in Theorem 3.70). There are many differentstatements of this result and the one given here is particularly suited to anapplication in Chapter 8. A proof can be found in Elliott (1982, p. 83).

Theorem 3.92 (Doob—Meyer) Suppose Xt, t ∈ [0,∞], is a cadlagsupermartingale. Then X has a unique decomposition of the form

Xt =Mt −At. (3.17)

Page 86: Financial derivatives in theory and practice

62 Martingales

Here A is an increasing predictable process null at zero a.s., and there is anincreasing sequence Tn of stopping times such that Tn ↑ ∞ a.s. and eachstopped process MTn is a uniformly integrable martingale.

Remark 3.93: We will define the concept of a predictable process in Defini-tion 4.2. In particular, ‘predictable’ implies ‘adapted’ which is enough for ouruses later.

Corollary 3.94 If X is an a.s. positive cadlag local martingale integrableat zero then the conclusions of Theorem 3.92 hold.

Proof: Being a positive local martingale integrable at zero, X is a super-martingale. Therefore the hypotheses of Theorem 3.92, and consequently theconclusions, hold.

Remark 3.95: Note that equation (3.17) ensures that the supermartingale Xis also a semimartingale.

Page 87: Financial derivatives in theory and practice

4

Stochastic Integration

In Chapter 3 we often needed a filtration to be complete and right-continuous.We shall throughout the remainder of this book, unless otherwise stated,assume this to be the case. We note that a filtration can always be augmentedso that these properties hold as described in Appendix 1.

4.1 OUTLINE

It is easy to get lost in the details of any account of stochastic integration,so before we begin we will take this opportunity to step back and put allthe results to follow into perspective. At the end of this chapter we will haveassigned a meaning to expressions of the form

It = (H •X)t ≡∫ t

0Hu dXu (4.1)

for a broad class of integrand processes H and integrators X. All the ideaspresented in this outline are introduced with more detailed discussion inlater sections.

The integrators: continuous semimartingales

Throughout we will restrict attention to integrators which are continuous. Adeeper theory is required if we wish to drop this restriction and it is totallyunnecessary to do so for our purposes – remember the integrators will be assetprice processes and we will insist that they are continuous.

Recall that by a semimartingale we mean a process of the form X = X0 +M + A where X0 is a random variable, M is a continuous local martingale (nullat zero) and A is a finite variation process (also null at zero). It turns out thatthe class of semimartingales is, in a sense made precise in Theorem 4.27, thelargest class of integrators for which the stochastic integral can be defined.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 88: Financial derivatives in theory and practice

64 Stochastic integration

The integrands: predictable processes

Suppose we wish to interpret the stochastic integral (4.1) as the gain fromtrading in the asset X , and that Ht is the holding at t of that asset. At thevery least we want (4.1) to have a meaning when H is a simple process, onefor which there are only a finite number of time points at which we changethe amount of asset X that we hold. Clearly we must also insist that Hdoes not look into the future otherwise there will be arbitrage, so H must atthe very least be adapted. In continuous time, even being adapted is not (ingeneral) enough to give (4.1) a reasonable interpretation. The problem is thatin continuous time the future and the past meet at the present (see Example4.1) so we must be more careful.Consider for a moment the simple integrands H . If we insist that H is

left-continuous we have already decided just before t what the integrand willbe at t. An immediate consequence of this is that, if H is simple and left-continuous and if X is a martingale, then H •X is also a martingale. Thisis exactly what we want because it rules out arbitrage in the economy. Weshall in Section 4.3.1 give a more precise definition of a simple process andshall include left-continuity as a requirement. Of course, simple integrands arenot enough for a complete theory of stochastic integration so we must extendthe class. If we do so, in a natural way, we end up with exactly the classof predictable processes. This class is, in a sense we make explicit in Section4.2, the smallest reasonable class of integrands that contains all simple left-continuous integrands. All left-continuous processes are predictable. We shallsee (from the martingale representation theorem in Chapter 5) that this classof integrands is large enough to encompass all the processes that might be ofinterest.The predictability restriction is what prevents the integrand from

anticipating the future. We must also ensure that the stochastic integraldoes not explode, just as we must when defining a classical Lebesgue—Stieltjesintegral. We will discuss two ways of doing this. The first is to make H locallybounded, meaning there is a sequence of stopping times Tn ↑ ∞ such that HTn

is bounded. The second, which only works for a local martingale integratorM , is to insist that H ∈ Π(M) (see Section 4.5.1). More on this later.

Constructing the integral

The construction of the stochastic integral is done in two stages. The first stepis to define the integral for the case when both H and X are bounded. This isthe majority of the work, and we will outline it shortly. The second step, whichis more straightforward, is to extend from bounded H and X to the generalcase. This is achieved using the idea of localization which was introduced inSection 3.5. The argument goes as follows. Suppose H and X = X0 +M +Aare general and let Tn be an increasing sequence of stopping times, Tn ↑ ∞,

Page 89: Financial derivatives in theory and practice

Predictable processes 65

such that H1(0, Tn] and MTn are bounded. We define the general stochastic

integral H •X via

t

0

Hu(ω) dXu(ω) =t

0

Hu(ω) dMu(ω) +t

0

Hu(ω) dAu(ω)

:= limn↑∞

t∧Tn(ω)

0

Hu(ω) dMu(ω) +t∧Tn(ω)

0

Hu(ω) dAu(ω)

= limn↑∞

t

0

Hu(ω)1(0, Tn(ω)] dMTn(ω)u (ω)

+t

0

Hu(ω)1(0, Tn(ω)] dAu(ω) .

The final term on the right-hand side is a classical Lebesgue—Stieltjesintegral, so we know what that means (Theorem 3.60 has already covered this).All that remains to complete the definition is to define H •M for boundedpredictable H and bounded local martingales M , i.e. bounded martingalesM . This is the crux of the construction and the topic of Section 4.3.

Construction of the stochastic integral for integrators in cM20

We need only construct the integral for boundedM ∈ cM20, but the technique

we use works for any M ∈ cM20.

The construction, which has many features in common with theconstruction of the classical integral, is completed as follows.(1) Define the integral H •M in an ‘obvious’ way for a class of simple

integrands which are piecewise constant (between a finite number ofstopping times). We denote by U the space of all such integrands.

(2) Define the integral for a general H ∈ U (the closure of U) by continuity,

H •M := limn→∞

Hn •M , (4.2)

where Hn ∈ U . What we mean by the closure of U and the limit in (4.2) ismade precise in Section 4.3. The construction is complete once we observethat U contains all bounded predictable integrands.

4.2 PREDICTABLE PROCESSES

A theory of stochastic integration must include integrands which are constantuntil some stopping time T and are then zero. But should we allow integrandsH of the form

H = 1(0, T ] (4.3)

Page 90: Financial derivatives in theory and practice

66 Stochastic integration

or of the formH = 1(0, T ) ,

or both? Both these processes are adapted, but the following example, takenfrom Durrett (1996), shows that (for a general theory) we should restrict tothe former.

Example 4.1 Let (Ω,F ,P) be a probability space supporting two randomvariables T and ξ defined by

P(T ≤ t) = t ∧ 1 ,P(ξ = 1) = P(ξ = −1) = 1

2 .

Define the cadlag process X via

X = ξ1[T,∞] ,

and define the filtration Ft by Ft = σ(Xu, u ≤ t). Note that X is an Ftmartingale and consider the ‘stochastic integral’ It =

t

0Xu dXu. The integral

I can, in this example, be defined pathwise as a Lebesgue—Stieltjes integral,so the stochastic integral must (surely!) be defined to agree with this,

It(ω) =t

0

Xu(ω) dXu(ω) = ξ21t≥T = 1t≥T .

The integral I is not a martingale and the ‘trading strategy’ H = X hasgenerated a guaranteed unit profit by time one.

The problem in this example is that the integrand is right-continuous andthus able to take advantage of the jump in the integrator. Of course, thisproblem cannot occur when the integrator is continuous, which is the casefor which we present the theory, but the point is well made. See Remark 4.4below for more on this point.We will develop a theory that excludes right-continuous integrands and

contains integrands of the type (4.3) and finite linear combinations thereof.To do this we view a general integrand as a map

H : (0,∞)× Ω→ R

(t,ω)→ Ht(ω).

We will define a σ-algebra on the product space (0,∞) × Ω and insist thatall integrands are jointly measurable with respect to this σ-algebra. In sodoing we are reducing a stochastic process (a collection of measurable mapsindexed by t) to a single measurable map from a larger space, and thus weare able to view the integrands as members of a classical (L2) Hilbert space.

Page 91: Financial derivatives in theory and practice

Stochastic integrals: the L2 theory 67

This identification is important in extending the definition of the integral fromsimple integrands to more general bounded integrands.The above joint measurability is strictly stronger than the requirement that

H be adapted. There is no unique choice for the σ-algebra to impose on theproduct space. The one we shall work with, as given by the following definition,is the smallest σ-algebra which is sufficient for the development of the generaltheory. See Remark 4.4 for further comment.

Definition 4.2 The predictable σ-algebra P on (0,∞) × Ω is the smallestσ-algebra on (0,∞)×Ω such that every adapted process having paths whichare left-continuous with limits from the right is P-measurable.A process H with time parameter set (0,∞) is called predictable if it is

P-measurable as a map from (0,∞) × Ω to (R,B(R)), and we write H ∈ P .We say H ∈ bP if H is also bounded.

Technical Remark 4.3: The astute reader will notice that we have not defined apredictable process at zero. We do not need the integrand there and excludingit avoids troublesome technicalities.

In fact, in the definition of the predictable σ-algebra we can replace left-continuous processes by bounded linear combinations of processes of thesimple form (4.3). We will build up the definition of the stochastic integralstarting with this most basic type of integrands.

Technical Remark 4.4: For continuous integrators it is possible to define atheory of stochastic integration for a wider class of integrands. In the extreme,the integrand can be merely adapted if the integrator satisfies extra regularityconditions (Karatzas and Shreve (1991)), and this includes the case whenBrownian motion is the integrator. Without these restrictions, continuity ofX is sufficient to develop a theory for integrands which are progressivelymeasurable as defined in Theorem 3.60. This is the approach of Revuz andYor (1991). However, as they point out, extending the class of integrands inthis way is of no real advantage since the stochastic integral of a progressivelymeasurable integrand is indistinguishable from the stochastic integral of acorresponding predictable integrand.

4.3 STOCHASTIC INTEGRALS: THE L2 THEORY

We are now in a position to develop the stochastic integral with respect toan integrator M ∈ cM2

0. This is the heart of the definition of the generalstochastic integral; after this section the rest is plain sailing.Throughout we will restrict attention to a fixed integratorM ∈ cM2

0 (recallcM2

0 is the space of continuous square-integrable martingales null at zero). Asnoted in Section 4.1, at this stage we need to define the stochastic integral with

Page 92: Financial derivatives in theory and practice

68 Stochastic integration

respect to M at least for all integrands H which are bounded and predictable.In so doing we shall actually define the integral for more general integrands.

The first stage in the construction is to define the stochastic integral for aclass U of simple integrands for which the desired definition is obvious and canbe done pathwise. The next step is to extend this definition by a continuityargument to U , the ‘closure’ of U . However, before we can talk about theclosure of U we need to define a norm (or at least a topology) on the space ofintegrands, and to talk about continuity we further need a norm (topology)on the space of resultant stochastic integrals.

For the space of stochastic integrals the norm is easily defined. We willshow that for any H ∈ U , the square-integrability property of M ensuresthat H •M ∈ cM2

0, and for cM20 we have already defined the norm || · ||2 in

Section 3.2.3. The existence of the increasing process [M ] is what allows us todefine a norm || · ||M for H ∈ P, and this we do in Section 4.3.2. With thesedefinitions we shall see that the continuous linear map

I : U → cM20

H → H •M

is an isometry and thus can be extended uniquely to a map I : U → cM20.

Clearly the space U depends on M via the norm || · ||M. As Theorem 4.10states, this is precisely the Hilbert space L2(M) which we introduce shortly.

4.3.1 The simplest integral

Fix M ∈ cM20. For H of the form (4.3) we make the obvious pathwise

definition for the stochastic integral H •M :=∫

HdM as

(H •M)t(ω) =(∫ t

0HdM

)(ω) := (Mt∧T − M0)(ω).

That is,H •M = MT (since M0 = 0).

Note that clearly H •M ∈ cM20.

We now extend the definition of the stochastic integral by linearity to thespace of finite linear combinations of 1(0, T ]-type processes.

Definition 4.5 For two stopping times S and T, denote by 1(S, T ] theprocess

1(S, T ](t, ω) =

1 if S(ω) < t ≤ T (ω)0 otherwise,

and let U be the space of simple processes H of the form

H(t, ω) =n∑

i=1

ci−11(Ti−1, Ti](t, ω)

Page 93: Financial derivatives in theory and practice

Stochastic integrals: the L2 theory 69

where Tk is a finite sequence of stopping times with 0 ≤ T0 ≤ T1 ≤ . . . ≤ Tnand ck is a finite sequence of constants.For any H ∈ U , we define H •M (by the obvious extension of (4.4)) as

(H •M)t(ω) =n

i=1

ci−1 MTi∧t(ω)−MTi−1∧t(ω) . (4.5)

It is easy to check that the space U is indeed the space of linear combinationsof 1(0, T ]-type processes, and furthermore the definition (4.5) is independentof the particular representation chosen for any H ∈ U . The following obviousresult will prove useful shortly.

Lemma 4.6 Fix M ∈ cM20 and let H ∈ U . Then H •M ∈ cM2

0 and

H •M 22 = E (H •M)2∞ = E

n

i=1

c2i−1(M2Ti −M 2

Ti−1) .

Remark 4.7: Note that it is vital at this point that Ft be complete forH •M to be adapted to Ft (Corollary 3.48). This is the only point inthe construction of the stochastic integral at which we need to appeal tocompleteness.

4.3.2 The Hilbert space L2(M)

We proved in Theorem 3.66 that the quadratic variation [M ] is an increasingprocess, and this property gives us a candidate norm · M on the space ofpredictable processes P , namely

H M = E∞

0

H2sd[M ]s

12

.

There is one problem with this, however, one we have met before, which is thefact that any H which is indistinguishable from zero, yet is not strictly zero,will have H M = 0. Thus · M is not strictly a norm on P . The problemalso arises more generally if [M ] is not strictly increasing. To get around thiswe must identify any processes H and K for which H −K M = 0. We willdo this whenever necessary without further comment. The only concession weshall make is to stick to the convention, as best we can, that we refer to spacesof equivalence classes using Roman font and actual processes using script font.We make the following definitions.

Definition 4.8 The spaces L2(M) and L2(M) are defined to be

L2(M) := H ∈ P such that H M <∞ ,L2(M) := equivalence classes of H ∈ P such that H M <∞ .

Page 94: Financial derivatives in theory and practice

70 Stochastic integration

Technical Remark 4.9: With this definition || · ||M is a true norm for the spaceL2(M), and under this norm L2(M) is actually a Hilbert space. To see this,we view L2(M) as the space L2((0,∞) × Ω,P, µm) where µm is the uniquemeasure on ((0,∞) × Ω,P) such that

µm([s, t] × F ) = E[([M ]t − [M ]s)1F

]for s < t, F ∈ Fs. See Appendix 2 for more detail on L2.

Equipped with a norm we can now identify the closure of U , the setof integrands to which we can hope to extend the stochastic integral byconsidering sequences of simple integrands. Standard results from analysisallow us to state the following result. A proof can be found in Rogers andWilliams (1987).

Theorem 4.10 Viewed as a subspace of L2(M), the closure of U under thenorm || · ||M is precisely L2(M).

Remark 4.11: Note that we have the relationships U ⊂ bP ⊂ U = L2(M) (in-terpreting U as a set of processes again). Thus L2(M) is the smallest ‘closed’space, for the norm || · ||M, containing all bounded predictable processes. Note,however, that if we change M then L2(M) also changes.

4.3.3 The L2 integral

Given M ∈ cM20, how do we extend the definition of the stochastic integral

H •M to include as integrands all H in L2(M)? The following extension ofLemma 4.6 is crucial.

Lemma 4.12 Fix M ∈ cM20. For any H ∈ U , H •M ∈ cM2

0 and

||H •M ||2 = ||H||M, (4.6)

i.e. the map IM : (U , || · ||M) → (cM20, || · ||2) is an isometry. In particular, if

Hn is a Cauchy sequence in U , then Hn •M is Cauchy in cM20.

Proof: From Lemma 4.6 we have that H •M ∈ cM20 and that

||H •M ||22 = E

[n∑

i=1

c2i−1(M

2Ti− M 2

Ti−1)

].

But M 2 − [M ] is a uniformly integrable martingale (Theorem 3.70) and so foreach i ∈ 1, . . . , n we have

E[M 2

Ti− M 2

Ti−1|FTi−1

]= E

[[M ]Ti

− [M ]Ti−1 |FTi−1

].

Page 95: Financial derivatives in theory and practice

Stochastic integrals: the L2 theory 71

Thus

H •M 22 = E

n

i=1

c2i−1 [M ]Ti − [M ]Ti−1 = E∞

0

H2u d[M ]u

as required.The Cauchy property is immediate from (4.6).

We will shortly apply this result to complete the definition of the L2 sto-chastic integral. Before we do so we take a brief pause for thought. To definethe integral in general, ideally we would like to do this on a pathwise basis asfollows:

(i) For each H ∈ L2(M) identify an approximating sequence Hn ⊂ U forwhich Hn −H M → 0 as n→∞.

(ii) Show that the corresponding sequence of stochastic integrals H n •Mconverges a.s. to some limit H •M . On the null set where the sequencedoes not converge, define H •M ≡ 0.Given any Cauchy sequence Hn ∈ L2(M) it follows from Lemma 4.12

that the sequence Hn •M converges in L2, hence converges a.s. along a fastsubsequence, so step (ii) presents no problem. Furthermore, the existence ofa suitable sequence for step (i) is not in doubt — it follows from the factthat L2(M) is complete (Appendix 2). However, there are many possibleapproximating sequences which could be used in step (i) and the remainingproblem is to identify one in particular which will be used for the definition.In the special case when H is left-continuous there is an obvious choice ofapproximating sequence, as we show in the next section, and this completesthe construction. However, for general H ∈ L2(M) there is no constructiveway to define the approximating sequence Hn. This would not be a problemif the limit were independent of the approximating sequence, but givenany two approximating sequences Hn and Kn, the limits (along a fastsubsequence) H •M and K •M will in general disagree on a null set. Thusour inability to explicitly and uniquely identify an approximating sequenceof integrands procludes us from defining the stochastic integral as a pathwiselimit (but see Section 4.3.4 for what is true pathwise).These observations force us to be less ambitious in constructing the

stochastic integral. The following result essentially completes that definition.It defines the stochastic integral but only up to an equivalence class. When weneed to consider the stochastic integral as a true stochastic process and notjust as an equivalence class, as we do when considering sample path properties,we choose some (appropriate) member of the equivalence class.Recall that we can identify cM2

0 with a classical L2 space by imposing the

norm · 2 and taking equivalence classes. For the remainder of this sectionwe denote this space of equivalence classes by L2(cM2

0).

Page 96: Financial derivatives in theory and practice

72 Stochastic integration

Theorem 4.13 Given M ∈ cM20 and H ∈ L2(M), there exists a sequence

Hn ⊂ U and N ∈ cM20 such that

Hn −H M → 0 (4.7)

Hn •M −N 2 → 0 (4.8)

as n→∞. Furthermore, if M and N lie in the same equivalence class of cM20

as M and N respectively, and if H and H lie in the same equivalence class ofL2(M), then (4.7) and (4.8) hold for M , N and H .Viewing M and H as elements of L2(cM2

0) and L2(M) respectively, we

define the stochastic integral H •M ∈ L2(cM20) to be the unique N ∈

L2(cM20) for which (4.7) and (4.8) hold. Viewing M and H as elements of

cM20 and L2(M), we define the stochastic integral H •M ∈ cM2

0 to be someN ∈ cM2

0 for which (4.7) and (4.8) hold.

Proof: By Theorem 4.10, given any H ∈ L2(M) ≡ U , we can find a sequenceHn ⊂ U such thatHn → H in L2(M). The sequence Hn is Cauchy hence,by Lemma 4.12, Hn •M is a Cauchy sequence in L2(cM2

0) which, sinceL2(cM2

0) is complete, converges to a limit in L2(cM2

0). We define H •M to bethis limit. ThatH •M is well defined (as an equivalence class), i.e. independentof the approximating sequence Hn, follows from (4.6) for Hn ∈ U . Thiscompletes the construction at the equivalence class level. The remainder ofthe theorem is obvious.

Corollary 4.14 For any M ∈ cM20 and H ∈ L2(M), H •M ∈ cM2

0 and

E (H •M)2∞ := H •M 22 = H 2

M =: E∞

0

H2u d[M ]u .

4.3.4 Modes of convergence to H •M

It is worth spending a little more time understanding in more detail the wayin which the integrals Hn •M, Hn ∈ U , converge to H •M . Theorem 4.13states that, given any H ∈ L2(M), there exists some continuous martingaleH •M ∈ cM2

0 such that

E[(Hn •M −H •M)2∞]→ 0 (4.9)

wheneverHn −H M → 0.

Suppose we consider all the equivalence classes of cM20 and select one element

within each class. We define this process to be the stochastic integral H •Mof all H for which (4.7) and (4.8) hold. The convergence in (4.9) is L2

Page 97: Financial derivatives in theory and practice

Stochastic integrals: the L2 theory 73

convergence. An immediate consequence of this is convergence in probability,so we can replace (4.9) by

P supt≥0

|(Hn •M −H •M)t| > ε → 0

as n→∞ for any ε > 0.As mentioned above, what one would like ideally is almost sure convergence.

But convergence in probability implies almost sure convergence along asubsequence, so we can conclude, given any H ∈ L2(M), that there existsa sequence Hn ⊂ U such that, for almost every ω,

(Hn •M)t(ω)→ (H •M)t(ω) .

Thus, at least along a fast subsequence, once we have defined the stochasticintegral, any approximating sequence converges a.s. to this stochastic integral.However, any two convergent subsequences will misbehave on different nullsets, and it is this that prevents us from defining the stochastic integral as apathwise limit.There is a special case when the stochastic integral can be defined pathwise.

When H is bounded and left-continuous we have the following result whichexplicitly identifies the stochastic integral, on a pathwise basis, as the limit ofa Riemann sum.

Theorem 4.15 Let M ∈ cM20 and H ∈ bP be left-continuous. Define the

doubly infinite sequence of stopping times T nk via

T n0 ≡ 0, T nk+1 = inft > T nk : |Ht −HTnk | > 2−n .

Then, for almost every ω,

t

0

Hu dMu (ω) = limn→∞

k≥1Ht∧Tn

k−1(ω) Mt∧Tn

k(ω)−Mt∧Tn

k−1(ω) .

Proof: The method of proof is similar to that used in Theorem 3.66. We leavethis as an exercise for the reader.

Remark 4.16:We shall shortly define the stochastic integral for a wider class ofintegrands and integrators; all the comments in this section hold in this greatergenerality. In particular, Theorem 4.15 holds if M is a general continuoussemimartingale and H is locally bounded and left-continuous.

Page 98: Financial derivatives in theory and practice

74 Stochastic integration

4.4 PROPERTIES OF THE STOCHASTIC INTEGRAL

We now review some basic properties of the stochastic integral. These extendto the more general integral which we define shortly, and indeed some of themare needed to be able to carry out that extension.The first result shows how the stochastic integral behaves with respect

to stopping and plays a key role in its extension by localization in thenext section. The second identifies the quadratic variation of the L2

stochastic integral. One property we expect is that integration be associative,H • (K •M) = (HK) •M . This is the fourth result we prove, and to do sowe need first to establish the Kunita—Watanabe characterization, the thirdtheorem.The first result proves that stopping the stochastic integral is the same as

stopping either the integrand or the integrator.

Theorem 4.17 Fix M ∈ cM20. Then for any H ∈ L2(M) and any stopping

time T ,(H •M)T = H1(0, T ] •M = H •MT .

Proof: For general H ∈ L2(M), let Hn ⊂ U be such that Hn−H M → 0as n→∞. For H ∈ U the result is straightforward, thus in particular

(Hn •M)T = Hn1(0, T ] •M = Hn •MT .

The general result is now immediate from the following observations,

(i) Using the optional sampling theorem and Jensen’s inequality, it is clearthat cM2

0 is stable under stopping (that is, if M ∈ cM20 and T is a

stopping time, then MT ∈ cM20). Thus,

(Hn •M)T − (H •M)T 22 = E (Hn •M −H •M)2T

= E E[(Hn •M −H •M)∞|FT ]2

≤ E (Hn •M −H •M)2∞= Hn •M −H •M 2

2 → 0

(the second equality is the optional sampling theorem, the inequalityis Jensen and the convergence is from the definition of the stochasticintegral);

(ii) Hn1(0, T ] − H1(0, T ] M ≤ Hn − H M → 0, thus Hn1(0, T ] •M −H1(0, T ] •M 2 → 0;

(iii) MT ∈ cM20, H

n −H M → 0 thus Hn •MT −H •MT2 → 0.

For integrands H ∈ L2(M) we have seen that H •M ∈ cM20. The following

result identifies the quadratic variation process of H •M and the covariationof any two stochastic integrals in terms of certain Lebesgue—Stieltjes integrals.

Page 99: Financial derivatives in theory and practice

Properties of the stochastic integral 75

Theorem 4.18 (Quadratic variation, quadratic covariation) FixM ∈ cM2

0. For any H ∈ L2(M),

(H •M)2t −t

0

H2u d[M ]u

is a continuous UI martingale and thus

[H •M ]t =t

0

H2u d[M ]u . (4.10)

Further, if N ∈ cM20 and K ∈ L2(N), then

[H •M,K •N ]t =t

0

HuKu d[M,N ]u . (4.11)

Proof: Let

Xt := (H •M)2t −t

0

H2u d[M ]u .

Clearly X is continuous. To prove X is a UI martingale we use Theorem 3.53.Let T be a stopping time. Since H •M is UI (being in cM2

0), the optionalsampling theorem combined with Jensen’s inequality shows that

E (H •M)2T ≤ E (H •M)2∞ .

Further,T

0H2u d[M ]u ≤ ∞

0H2u d[M ]u which is integrable by Corollary 4.14,

and thus we can conclude that the random variableT

0 H2u d[M ]u is integrable

and so

E |XT | ≤ E (H •M)2T + ET

0

H2u d[M ]u <∞ .

Next observe, by Corollary 4.14 and Theorem 4.17, that

ET

0

H2u d[M ]u = E

0

(H1(0, T ])2u d[M ]u

= E (H1(0, T ] •M)2∞= E [(H •M)T∞]

2

= E (H •M)2T .

Thus E[XT ] = 0 and X is indeed a UI martingale by Theorem 3.53.The identity (4.10) follows from Theorem 3.70. To establish (4.11) it is

sufficient to show

[H •M,N ]t =t

0

Hu d[M,N ]u . (4.12)

Page 100: Financial derivatives in theory and practice

76 Stochastic integration

Equation (4.12) can be proved by again using Theorem 3.53 to show that

(H •M)N −H • [M,N ]

is a continuous UI martingale. The details are left to the reader as an exercise.

As our next result indicates, Lebesgue—Stieltjes integration can be used toform the basis of an alternative characterization of the stochastic integral. Thisis, in fact, how the L2 stochastic integral was originally defined by Kunita andWatanabe (1967).

Theorem 4.19 Let M ∈ cM20 and H ∈ L2(M). The stochastic integral

H •M is the unique (up to indistinguishability) martingale in cM20 such that

[H •M,N ] = H • [M,N ] (4.13)

for every N ∈ cM20.

Proof: We already know from (4.12) above that H •M satisfies (4.13). Toshow uniqueness, suppose X is some other martingale in cM2

0 such that

[X,N ] = H • [M,N ] (4.14)

for every N ∈ cM20. Subtracting (4.13) from (4.14) we have, for each t ≥ 0,

that

[X −H •M,N ]t = 0 a.s.

In particular, takingN = X−H •M , we have that [X−H •M,X−H •M ] = 0and so E[(X −H •M)2] = 0 by Theorem 3.70. Hence X = H •M a.s.

Corollary 4.20 (Associativity) Fix M ∈ cM20. If K ∈ L2(M) and

H ∈ L2(K •M), then HK ∈ L2(M) and

(HK) •M = H • (K •M) .

Proof: Since, by Theorem 4.18, [K •M ] = t

0 K2u d[M ]u for all t ≥ 0, we have

Et

0

H2uK

2u d[M ]u = E

t

0

H2u d[K •M ]u

and so HK ∈ L2(M).By Theorem 4.19, for any N ∈ cM2

0 we know that

[K •M,N ] =t

0

Ku d[M,N ]u

Page 101: Financial derivatives in theory and practice

Extensions via localization 77

and so, appealing to the associativity of Stieltjes integrals,

[(HK) •M, N ] =∫ t

0HuKud[M, N ]u

=∫ t

0Hud[K •M, N ]u

= [H • (K •M), N ].

The result now follows from the uniqueness in Theorem 4.19.

4.5 EXTENSIONS VIA LOCALIZATION

The theory we have developed so far for the stochastic integral, though elegantand self-contained, is too restrictive. For example, the class of integratorscM2

0 does not include Brownian motion and we certainly need to extend thetheory to include this for our applications. Fortunately localization allows usto extend the stochastic integral without much difficulty.

Our first use of localization is to extend the class of integrators to continuouslocal martingales null at zero, cM0,loc. This is not the final class of integratorsfor our theory; for that we must wait until Section 4.5.2.

4.5.1 Continuous local martingales as integrators

We are now ready to extend the definition of the stochastic integral to theclass of integrators cM0,loc. The class of integrands we shall use again dependson the integrator M and we denote it by Π(M). This class is defined as follows:

Π(M) :=

H ∈ P such that∫ t

0H2

u d[M ]u < ∞ a.s., for all t

. (4.15)

Observe that if M ∈ cM20, in which case L2(M) is defined, we have

L2(M) ⊆ Π(M).

Thus the definition of the stochastic integral given below broadens the classof integrands as well as the class of integrators.

Fix M ∈ cM0,loc and let H ∈ Π(M). For n ∈ N, let

Sn := inft > 0 : |Mt| > n,

Rn := inft > 0 :∫ t

0H2

u d[M ]u > n, (4.16)

Tn := Rn ∧ Sn.

Page 102: Financial derivatives in theory and practice

78 Stochastic integration

For each n, Tn is a stopping time and Tn ↑ ∞ a.s. Further, note that

MTn ∈ cM20,

H1(0, Tn] ∈ L2(MTn),

and so, using Theorem 4.13, for each n we may define the stochastic integral

(H1(0, Tn]) •MTn ∈ cM20.

It is immediate from Theorem 4.17 that, for 1 ≤ n ≤ m, we have((H1(0, Tm]) •MTm

)Tn

= (H1(0, Tn]) •MTn. (4.17)

Equation (4.17) tells us we can use localization to provide a consistentdefinition of the stochastic integral H •M .

Definition 4.21 Let M ∈ cM0,loc and H ∈ Π(M). We define the stochasticintegral H •M to be the process in cM0,loc defined by

H •M(ω) = limn→∞

(H1(0, Tn])(ω) •MTn(ω),

where Tn is defined by (4.16).

Remark 4.22: Note the statement included in this theorem that the stochasticintegral with respect to a local martingale integrator is itself a localmartingale. This fact is used repeatedly in calculations and applications (seeExample 4.37 for one example).

Given M ∈ cM0,loc, Π(M) is in fact the largest possible class of integrandswe can take. The interested reader is referred to Rogers and Williams (1987)or Karatzas and Shreve (1991) for an explanation of why this is the case.

With the obvious modifications, the properties of the L2 stochastic integralestablished in Section 4.4 carry over to this more general case.

4.5.2 Semimartingales as integrators

The final step in our construction of the stochastic integral is to extendthe class of integrators to continuous semimartingales. In order to discussintegration with respect to a semimartingale we replace the obvious choice,Π(X) (as defined in (4.15)), by a slightly smaller class of integrands that doesnot depend on X loc, the space of locally bounded predictable processes. Weneed to do this because Π(X) does not depend on the finite variation part ofX and so, in general, denoting the finite variation part of X by A, the integralH •A will not exist for H ∈ Π(X). As will be clear from the definition, thenew space we introduce is independent of X and as such can be used as partof a more general stochastic calculus.

Page 103: Financial derivatives in theory and practice

Extensions via localization 79

Definition 4.23 A process H is called locally bounded and predictable, andwe write H ∈ bP, if it is predictable and if there exists an increasing sequenceTn of stopping times with Tn ↑ ∞ a.s. and a sequence cn of constantssuch that, for all ω,

|H1(0, Tn]| < cn . (4.18)

The sequence Tn is referred to as a reducing sequence as H1(0, Tn] ∈ bP.

Remark 4.24: Since Ft is assumed to be complete it is sufficient that (4.18)hold for almost all ω. To see this, let Ω∗ be the set of paths for which (4.18)holds. Now define the stopping times Tn by

Tn(ω) =Tn(ω) ω ∈ Ω∗0 otherwise.

This is a reducing sequence of stopping times for H .

All cadlag processes H such that lim supt↓0 |Ht| <∞ a.s. are seen to be inbP by taking

Tn = inft > 0 : |Ht| > n , (4.19)

which ensures that (4.18) holds a.s. Furthermore, we have the followingimportant result alluded to above.

Lemma 4.25 For any M ∈ cM0,loc (or more generally X ∈ cS),

bP ⊂ Π(M) .

Proof: Given H ∈ bP, take the reducing sequence (4.19) and observe thatTn

0

H2u d[M ]u ≤ n2[M ]Tn <∞ a.s.

The lemma means that, for H ∈ bP, M ∈ cM0,loc, we can define H •M asin Section 4.5.1. Clearly, for H ∈ bP and A a continuous adapted finitevariation process, we can define H •A as the pathwise Lebesgue—Stieltjesintegral (as we did in Chapter 3 for bounded H)

(H •A)t(ω) =t

0

Hu(ω) dAu(ω) .

Noting that H ∈ P implies H is progressively measurable, Theorem 3.60ensures that H •A is again continuous adapted and of finite variation. Thuswe have the following extension of the stochastic integral.

Page 104: Financial derivatives in theory and practice

80 Stochastic integration

Definition 4.26 For X a continuous semimartingale as at (3.15) and forH ∈ bP we define the stochastic integral H •X to be the continuoussemimartingale (null at zero) given by

Hu dXu ≡ H •X := H •M +H •A .

4.5.3 The end of the road!

The concept of a semimartingale was originally arrived at by ad hoc means.However, it can be argued that any reasonable integrator is a semimartingaleas follows. Recall that we began the development of the stochastic integral bytaking as integrand the class U of simple predictable processes of the form

H =

n

i=1

ci−11(Ti−1, Ti] .

For any continuous adapted process X and any H ∈ U , we could define a‘stochastic integral’ H •X in the obvious way via

(H •X)t =n

i=1

ci−1(XTi∧t −XTi−1∧t) .

Let H and the sequence Hn be in U . For a general X , even if we haveuniform convergence of Hn to H (a very strict form of convergence) thisis not enough to ensure the weak requirement that we have convergence inprobability of (Hn •X)t to (H •X)t. The following result tells us that it isprecisely the class of semimartingales for which the map H → H •X satisfiesthis mildest continuity condition, and thus for which a sensible extension ofthis map is possible. A proof of this result can be found in Dellacherie andMeyer (1980).

Theorem 4.27 Suppose X is a continuous adapted process such that,whenever Hn is a sequence in U with

supt,ω|Hn(t,ω)|→ 0 ,

as n→∞, then for each t(Hn •X)t → 0

in probability. Then X is a continuous semimartingale.The converse result is also true.

Page 105: Financial derivatives in theory and practice

Stochastic calculus: Ito’s formula 81

4.6 STOCHASTIC CALCULUS: ITO’S FORMULA

4.6.1 Integration by parts and Ito’s formula

Having defined the stochastic integral we now need to develop the rules bywhich it can be manipulated, i.e. stochastic calculus. The main tool is a changeof variable formula for stochastic integrals known as Ito’s formula. We beginby proving a special case of the result, the integration by parts formula, whichshows that the equivalent classical result must be amended when dealing withstochastic integrals to include a correction term.

Theorem 4.28 (Integration by parts formula) Let X and Y becontinuous semimartingales (with respect to (Ω, Ft,F ,P)). Then, a.s., forall t ≥ 0,

XtYt = X0Y0 +t

0

Xu dYu +t

0

Yu dXu + [X,Y ]t . (4.20)

In particular,

X2t = X

20 + 2

t

0

Xu dXu + [X ]t . (4.21)

Proof: We first prove (4.21) from which we can deduce (4.20) by polarization,

XtYt =12 (Xt + Yt)

2 −X2t − Y 2t .

Suppose X has decomposition (3.15). It suffices, by use of the reducingsequence Sn = inft > 0 : |Xt| + |Mt| > n, n ≥ 1, to assume that bothX and M are bounded. Using the notation of Theorem 3.66, we can write

X2t −X2

0 =k≥1(X2

tnk−X2

tnk−1) + 2

k≥1Xtn

k−1(Xtn

k−Xtn

k−1)

=k≥1(X2

tnk−X2

tnk−1) + 2

k≥1Xtn

k−1(Mtn

k−Mtn

k−1)

+ 2k≥1

Xtnk−1(At

nk−Atn

k−1) . (4.22)

For each ω, the first term on the right-hand side converges to [X ]t bydefinition, and the second converges a.s. to X •M by Theorem 4.15 andRemark 4.16. The final term is, for each ω, merely a discrete approximationto the Lebesgue—Stieltjes integral Xu dAu (which is also continuous), thusconverges toX •A by the classical (Lebesgue) dominated convergence theorem(recall X is bounded).

Remark 4.29: Observe that if X or Y is of finite variation then (4.20) reducesto the ordinary integration by parts formula for Lebesgue-Stieltjes integralssince in that case [X,Y ]t ≡ 0.

Page 106: Financial derivatives in theory and practice

82 Stochastic integration

Remark 4.30: From the particular case (4.21), if M is a local martingale wenow have an expression for the local martingaleM 2− [M ] (see Theorem 3.79)in terms of a stochastic integral,

M2t − [M ]t =M 2

0 + 2t

0

Mu dMu .

Equation (4.21) shows that if Xt is any continuous semimartingale andf(x) = x2 then f(Xt) is also a continuous semimartingale. Ito’s formula tellsus that this is the case for any function f which is C2 and provides thedecomposition for the continuous semimartingale f(Xt).

Theorem 4.31 (Ito’s formula) Let f : R → R be C2 and let X be acontinuous semimartingale. Then, a.s., for all t ≥ 0,

f(Xt) = f(X0) +t

0

f (Xu) dXu +12

t

0

f (Xu) d[X ]u . (4.23)

In particular, if X has the decomposition X = X0 +M + A then f(Xt) hasthe decomposition

f(Xt) = f(X0) +t

0

f (Xu) dMu +t

0

f (Xu) dAu +12

t

0

f (Xu) d[M ]u ,

(4.24)and is thus a continuous semimartingale.

Remark 4.32: Note that f (Xt) is a continuous adapted process in bP and sothe stochastic integral is well defined. The second integral in (4.23) is to beunderstood in the Lebesgue—Stieltjes sense.

Proof: The second result is a simple consequence of the first. To prove thefirst, note that (4.23) trivially holds for f(x) ≡ 1 and f(x) = x. Further, if(4.23) holds for f and g, then it also holds for f + g and, by Theorem 4.28,for fg. We can conclude that the result holds for all polynomial functions.We now prove the result for general f . By localization we may assume that

X, [M ] and VA are bounded, by K say. By the continuity of (4.24) in t, weneed only prove that, for all t ≥ 0 and all ε > 0,

P |(Gf)t| > ε = 0 ,

where (Gf)t is the difference between the left- and the right-hand sides of(4.23). We prove this by polynomial approximation.Fix ε > 0, and let pn be a polynomial such that

sup|x|≤K

|pn(x) − f(x)|+ |pn(x)− f (x)| + |pn(x)− f (x)| < n−1 . (4.25)

Page 107: Financial derivatives in theory and practice

Stochastic calculus: Ito’s formula 83

We have just shown that (Gpn)t = 0, a.s., so (Gf)t = (G(f − pn))t a.s. Itfollows from (4.25) and the bounds on M and VA that

|(G(f − pn))t| ≤2 + 3

2K

n+

t

0

(pn − f ) dMu .

For n sufficiently large (> 2(2 + 32K)/ε),

P |(Gf)t| > ε = P G(f − pn)t > ε

≤ Pt

0

(pn − f ) dMu > ε/2

≤4E | t

0(pn − f ) dMu|2

ε2(Chebyshev)

=4E | t

0(pn − f )2 d[M ]u|2

ε2

≤ 4K

n2ε2,

the last equality following from the isometry property, Corollary 4.14. Takingn sufficiently large, we may make this as small as we wish.

4.6.2 Differential notation

A shorthand for Ito’s formula is to write it in differential notation as

df(X) = f (X)dX + 12f (X)d[X ] .

To facilitate computations using Ito’s formula (especially in the multidimen-sional case to follow) we make the following definition. For continuous semi-martingales X and Y, let

(i) dXdY := d[X,Y ] (= dY dX) .

Since

(ii) dXdA = 0 if A is of finite variation,

if Z is also a continuous semimartingale, we have

(iii) (dXdY )dZ = dX(dY dZ) = 0 ,

and similarly for higher-order multiples. If, in addition, H,K ∈ bP, recallingfrom Theorem 4.18 that

d[H •X,K •Y ] = HKd[X,Y ] ,

Page 108: Financial derivatives in theory and practice

84 Stochastic integration

we have

(iv) (HdX)(KdY ) = HKdXdY .

Properties (ii)—(iv) allow us to perform calculations easily, treating eachterm ‘as if it were a scalar’, but cancelling terms involving three or moredifferentials, or two differentials if at least one is of finite variation.A simple example relates the quadratic variation of a continuous

semimartingale to that of its local martingale part. If X has the decompositionX = X0 +M +A, we have

(dX)2 = (dM + dA)2 = (dM)2 + 2dMdA+ (dA)2 = (dM)2 .

We shall make free use of this type of differential notation in our calculations.Using this formalism we can rewrite Ito’s formula as

df(X) = f (X)dX + 12f (X)(dX)

2 .

This suggests one could base an alternative direct proof for Ito’s formula onTaylor’s theorem. The interested reader is referred to Karatzas and Shreve(1991) for a treatment along these lines.

Example 4.33 Let W be a standard Brownian motion relative to(Ω, Ft,F ,P) and let H ∈ Π(W ). Then, for t ≥ 0, we can define

ζt :=t

0

Hu dWu − 12

t

0

H2u ds .

Consider the processZt := exp(ζt) .

Applying Ito’s formula to the semimartingale ζ, we have, for f ∈ C 2,

df(ζt) = f (ζt)dζt +12f (ζt)(dζt)

2

where(dζt)

2 = (HtdWt − 12H

2t dt)

2 = H2t (dWt)

2 = H2t dt .

Taking f(x) = exp(x), we obtain

dZt = Zt(HtdWt − 12H

2t dt) +

12ZtH

2t dt

= ZtHtdWt .

Noting Z0 = 1 and writing this in integral form, we have shown that, for allt ≥ 0, Zt satisfies

Zt = 1 +t

0

ZuHu dWu . (4.26)

Page 109: Financial derivatives in theory and practice

Stochastic calculus: Ito’s formula 85

In fact Z is the unique solution to (4.26). We prove this as follows. SupposeX is some other solution to (4.26) and define Yt := Z

−1t = exp(−ζt). Apply

Ito’s formula to obtain

dYt = −YtHtdWt + YtH2t dt

and consider the process XY . From the integration by parts formula we have

d(XtYt) = XtdYt + YtdXt + d[X,Y ]t = XtYtH2t dt+ d[X,Y ]t .

Further,d[X,Y ]t = (XtHtdWt)(−YtHtdWt) = −XtYtH2

t dt ,

and so d(XtYt) = 0,XtYt = X0Y0 = 1 .

Thus, Xt = Zt a.s. for all t ≥ 0.Finally, observe from (4.26) that Zt = exp(ζt) is a local martingale. A

natural question to ask is for which processes H ∈ Π(W ) is Z a truemartingale? We will return to this question in connection with Girsanov’sTheorem in the next chapter.

4.6.3 Multidimensional version of Ito’s formula

We now state a version of Ito’s formula for functions of several semimartingles.We will need this multidimensional version for later applications.

Definition 4.34 A process X = (X (1), . . . , X(n)) defined relative to(Ω, Ft,F ,P) with values in Rn is called a continuous semimartingale if eachcoordinate process X (i) is a continuous semimartingale.

Theorem 4.35 (Ito’s formula) Let f : Rn → R be C2(Rn) and letX = (X(1), . . . , X(n)) be a continuous semimartingale in Rn. Then, a.s., f(Xt)is a continuous semimartingale and

f(Xt) = f(X0) +

n

i=1

t

0

∂f

∂xi(Xu) dX

(i)u

+ 12

n

i,j=1

t

0

∂2f

∂xi∂xj(Xu) d[X

(i), X(j)]u .

The proof goes through as for the one-dimensional case with only thenotation being more difficult. We mention two extensions of Ito’s formulawhich are not difficult to prove.

Page 110: Financial derivatives in theory and practice

86 Stochastic integration

Extension 1: If some of the X(i) are of finite variation then the function fneed only be of class C1 in the corresponding coordinates. In particular, ifX is a continuous semimartingale in R1, if A is a continuous process of finitevariation, also in R1, and f is C2,1(R,R) then

f(Xt, At) = f(X0, A0) +t

0

∂f

∂x(Xu, Au) dXu

+t

0

∂f

∂a(Xu, Au) dAu +

12

t

0

∂2f

∂x2(Xu, Au) d[X ]u .

Example 4.36 (Dolean’s exponential) Let X be a continuous semimartingalewith X0 = 0 and define

E(X)t = exp(Xt − 12 [X ]t) .

Then E(X)t is the unique (continuous) semimartingale Z such that

Zt = 1 +t

0

Zu dXu . (4.27)

Note that we proved a special case of this result in Example 4.33 where wetook Xt = (H •W )t. This time, apply Ito’s formula to the two-dimensionalsemimartingale (X, [X ]) with f(x, y) = exp(x− 1

2y) to obtain

dE(X)t = E(X)tdXt − 12E(X)td[X ]t + 1

2E(X)td[X ]t= E(X)tdXt .

The uniqueness can be proved as in Example 4.33. The process E(X) iscalled the exponential of X and (4.27) is known as an exponential stochasticdifferential equation.

Extension 2: Ito’s formula still holds if the function f is defined and in C2

only on an open set and if X takes its values a.s. in this set. This means thatwe can apply Ito’s formula to log(Xt), for example, if Xt is a strictly positiveprocess.

Example 4.37 Let (X,Y ) be a two-dimensional Brownian motion started at(X0, Y0) = (0, 0), and set

Vt = X2t + Y

2t .

By Ito’s formula

dVt = 2(XtdXt + YtdYt) + 2dt

Page 111: Financial derivatives in theory and practice

Stochastic calculus: Ito’s formula 87

and so

d[V ]t = 4(X2t + Y

2t )dt = 4Vtdt .

Next define Tn := inft > 0 : Vt < n−2 where we assume V0 > n−2, andconsider the stopped process Mt = log(V Tnt ). For the stopped process V Tn ,which is a.s. strictly positive, we have

dV Tnt = 1(0, Tn]dVt

and so, by Ito’s formula applied to M ,

dMt = 1(0, Tn] M−1t dVt − 1

2M−2t d[V ]t

= 1(0, Tn]2V−1t (XtdXt + YtdYt) . (4.28)

From (4.28) we see that M = log(V Tn) defines a local martingale (Remark4.22). We can use this to establish the following facts:

(i) two-dimensional Brownian motion hits any open set containing the originwith probability one;

(ii) two-dimensional Brownian motion hits the origin itself with probabilityzero.

To show this, define

SN = inft > 0 : Vt > N2

where N is chosen large enough so that V0 < N2. NowMSN is a bounded local

martingale, hence a UI martingale, and so by the optional sampling theoremwe conclude

E log(VTn∧SN ) = log(V0) . (4.29)

We know from Theorem 2.8 that SN <∞ a.s., so (4.29) yields

log(N 2)pn,N + log(n−2)(1− pn,N ) = log(V0) ,

where pn,N = P(SN < Tn), and thus

pn,N =log(V0) + 2 log(n)

2 log(N) + 2 log(n).

Page 112: Financial derivatives in theory and practice

88 Stochastic integration

Given any open set A containing the origin, the open ball (x, y) : x2+y2 <n−2 lies strictly within A for some n, hence by the continuity of Brownianmotion the probability of hitting A is greater than pn,N . Further,

Tn <∞ =N

Tn < N ,

so the monotone convergence theorem implies that

P(Tn <∞) = limN↑∞

pn,N = 1 ,

which establishes (i).To prove (ii), note that, for N >

√V0, n > 1/

√V0,

P(H0 <∞) = limN↑∞

P(H0 < SN )

= limN↑∞

limn↑∞

P(Tn < SN ) = 0 .

4.6.4 Levy’s theorem

Levy’s theorem is a powerful and extremely useful result which shows thatif X is a continuous local martingale with quadratic variation [X ] t = t,then X must be Brownian motion. Here we present the elegant proof of thisresult provided by Kunita and Watanabe (1967), a simple application of Ito’sformula.

Theorem 4.38 Let X be some continuous d-dimensional local martingaleadapted to the filtration Ft. Then X is an Ft Brownian motion if andonly if

[X(i), X(j)]t = δijt a.s.

for all i, j and t.

Proof: If X is a Brownian motion the result is merely a d-dimensional versionof Corollary 3.81. The converse is a little more involved. Given an arbitraryλ ∈ Rd, define the (complex-valued) continuous semimartingale f(Xt, t) to be

f(Xt, t) := exp(iλ ·Xt + 12 |λ|2t)

= exp( 12 |λ|2t) cos(λ ·Xt) + i sin(λ ·Xt) .

Page 113: Financial derivatives in theory and practice

Stochastic calculus: Ito’s formula 89

By Ito’s formula,

df(Xt, t) =

d

k=1

∂f(Xt, t)

∂X(k)t

dX(k)t +

∂f(Xt, t)

∂tdt

+ 12

d

k, =1

∂2f(Xt, t)

∂X(k)t ∂X

( )t

dX(k)t dX

( )t

=

d

k=1

iλkf(Xt, t)dX(k)t + 1

2 |λ|2f(Xt, t) dt

− 12

d

k, =1

λkλ δk f(Xt, t) dt

=

d

k=1

iλkf(Xt, t)dX(k)t .

Thus f is a local martingale. But, given any T ≥ 0, f is bounded (in absolutevalue and thus in each component) on [0, T ] by exp( 12 |λ|2T ) (which is itsmodulus at time T ) so f is, in fact, a true martingale. Therefore, given anys ≤ t,

E[f(Xt, t)|Fs] = f(Xs, s) a.s.

which can be rewritten as

E[exp iλ · (Xt −Xs) |Fs] = exp − 12 |λ|2(t− s) a.s. (4.30)

Equation (4.30), which holds for all λ ∈ Rd, shows that Xt−Xs is independentof Fs. Taking expectations in (4.30) yields

E[exp iλ · (Xt −Xs) ] = exp − 12 |λ|2(t− s) ,

which is the characteristic function of a N(0, t− s) random variable. Thus wehave established the remaining requirement of Definition 2.4 for X to be aBrownian motion.

Page 114: Financial derivatives in theory and practice
Page 115: Financial derivatives in theory and practice

5

Girsanov and MartingaleRepresentation

This chapter is devoted to two important results in the study of financialmodels, Girsanov’s theorem and the martingale representation theorem. Thecommon theme to these results is the study of change of probability measure.

Girsanov’s theorem shows how the law of a semimartingale changes whenthe original probability measure P is replaced by some equivalent probabilitymeasure Q. It turns out that a semimartingale remains a semimartingale andGirsanov’s theorem relates the semimartingale decomposition under each ofthe measures P and Q to the Radon –Nikodym derivative of Q with respectto P. This is useful when pricing derivatives because, as we saw in Chapter 1,it is convenient to work in some probability measure where the assets of theeconomy are martingales.

The martingale representation theorem gives necessary and sufficientconditions for a family of (local) martingales to be representable as stochasticintegrals with respect to some other (finite) set of (local) martingales. Thisresult is important in a financial context because it provides conditions underwhich a model is complete.

5.1 EQUIVALENT PROBABILITY MEASURES ANDTHE RADON-NIKODYM DERIVATIVE

In this section we collect together some basic results concerning equivalentprobability measures which will be used extensively in later sections, partic-ularly the discussion of Girsanov’s theorem. Recall that we met the idea ofequivalent measures in the discrete setting of Chapter 1.

5.1.1 Basic results and properties

Let us begin our discussion with a formal definition.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 116: Financial derivatives in theory and practice

92 Girsanov and martingale representation

Definition 5.1 Let P and Q be two probability measures on the measurablespace (Ω,F). We say that Q is absolutely continuous with respect to P (withrespect to the σ-algebra F), written Q P, if for all F ∈ F ,

P(F ) = 0 ⇒ Q(F ) = 0.

If both P Q and Q P then P and Q are said to be equivalent (withrespect to F), written P ∼ Q.

Remark 5.2: The measures P and Q could, and often will, be defined on someσ-algebra G strictly larger than F . Equivalence of P and Q with respect toF ⊂ G does not imply equivalence with respect to G, but the converse istrivially true.

If two measures P and Q are equivalent we can freely talk about resultsholding almost surely without specifying which of the two measures we areworking with, and this we shall do throughout. Many of the results whichwe are about to present for equivalent measures also hold when the measuresare not equivalent but when one is absolutely continuous with respect to theother. In this latter case it is important to qualify an almost sure statementto include the measure under which the result is true. You are referred toProtter (1990) and Revuz and Yor (1991) for the more general statement ofthose results.Equivalent probability measures can be related to each other via a positive

random variable, and this is the content of the Radon—Nikodym theorem.This is a classical result in measure theory and we state it here without proof(which can be found, for example, in Billingsley (1986)).

Theorem 5.3 Suppose P and Q are equivalent probability measures onthe space (Ω,F). Then there exists a strictly positive random variable ρ ∈L1(Ω,F ,P) which is a.s. unique and is such that, for all F ∈ F ,

Q(F ) =F

ρdP = EP[ρ1F ]. (5.1)

Further, ρ−1 ∈ L1(Ω,F ,Q) and

P(F ) =F

ρ−1dQ = EQ[ρ−11F ]. (5.2)

Conversely, given any strictly positive random variable ρ ∈ L1(Ω,F ,P) suchthat EP[ρ] = 1, there exists some unique measure Q ∼ P for which equations(5.1) and (5.2) hold.

This result allows us to make the following definition.

Page 117: Financial derivatives in theory and practice

Equivalent probability measures and the Radon—Nikodym derivative 93

Definition 5.4 The random variable ρ of Theorem 5.3 is referred to as (aversion of) the Radon—Nikodym derivative of Q relative to P on (Ω,F). Notingthat any two versions agree a.s., we write

dQdP F

= ρ a.s.

Similarly, we havedPdQ F

= ρ−1 a.s.

Remark 5.5: These last two results take on a very concrete form in the casewhen Ω is a finite set, ωi : i = 1, 2, . . . , n. In this case we may as well assume(the general case being a trivial extension) that F comprises all subsets of Ω.Whenever P(ωi) > 0 the random variable ρ is now just

ρ(ωi) =Q(ωi)P(ωi)

,

which is finite and strictly positive. For all other ωi we can set ρ(ωi) to be anyvalue, zero for example. These latter ωi have probability zero (under both Pand Q) so the ambiguity in this case is immaterial.

Remark 5.6: Suppose we only know that Q is absolutely continuous withrespect to P. In this case the Radon—Nikodym theorem states that there existsa positive random variable ρ ∈ L1(Ω,F ,P) such that (5.1) holds (P-a.s.).Since we now only know that ρ ≥ 0 (P-a.s.) we cannot necessarily recoverP from Q and ρ, i.e. (5.2) does not hold in general. To see this, considerF = ω ∈ Ω : ρ(ω) = 0. From (5.1)

Q(F ) = EP[ρ1F ] = 0 ,

and so the random variable ρ is strictly positive Q-a.s. and ρ−1 is well-definedunder Q. But if Q is absolutely continuous with respect to P but not equivalentthen P(F ) > 0. If, in addition, (5.2) were to hold then we would have

P(F ) = EQ[ρ−11F ] = 0,

a contradiction.

A straightforward application of (the usual) monotone class arguments(see, for example, Rogers and Williams (1994)) extends (5.1) and (5.2) tocorresponding statements for random variables.

Page 118: Financial derivatives in theory and practice

94 Girsanov and martingale representation

Proposition 5.7 Let X be some random variable defined on (Ω,F) andsuppose that P ∼ Q with respect to F . Then, in each case provided theexpectation exists, expectations under P and Q are related via

EQ[X ] = EP[ρX ]

andEP[X ] = EQ[ρ−1X ] ,

where

ρ =dQdP F

.

Our intention to study equivalent measures on filtered probability spacesleads us to ask what happens when we start with two measures P ∼ Q withrespect to some σ-algebra F and consider the σ-algebra G ⊂ F . We find thefollowing result which extends Theorem 5.3 and Proposition 5.7.

Theorem 5.8 Let P and Q be two probability measures on the space (Ω,F)and suppose G ⊂ F is some σ-algebra. If P ∼ Q with respect to F then P ∼ Qwith respect to G and

dQdP G

= EP[ρ|G] a.s.

dPdQ G

= EQ[ρ−1|G] a.s.

where

ρ =dQdP F

a.s.

Furthermore, given any X ∈ L1(Ω,G,Q), Y ∈ L1(Ω,G,P),EQ[X ] = EP[ρX ]

EP[Y ] = EQ[ρ−1Y ]

where

ρ =dQdP G

.

Proof: That P and Q are equivalent with respect to G is immediate. From(5.1) and the tower property we have, for all G ∈ G ⊂ F ,

Q(G) = EP[ρ1G] = EP EP[ρ | G]1G .

Since EP[ρ|G] is a G-measurable random variable we conclude from Theorem5.3 (applied to G) that

dQdP G

= EP[ρ|G] a.s.

The proof of the rest of the theorem follows similarly.

Page 119: Financial derivatives in theory and practice

Equivalent probability measures and the Radon—Nikodym derivative 95

Corollary 5.9 Let P,Q and ρ be as in Theorem 5.8. Given any X ∈L1(Ω,F ,Q), and a σ-algebra G ⊂ F ,

EQ[X |G] = EP[ρX |G]EP[ρ|G] a.s. (5.3)

Proof: Given any G ∈ G, since the right-hand side of (5.3) is G-measurable,it follows from Theorem 5.8 that

EQ 1GEP[ρX |G]EP[ρ|G] = EP 1G

EP[ρX |G]EP[ρ|G]

dQdP G

= EP 1GEP[ρX |G]= EP 1GρX

= EQ 1GX .

Thus, by the definition of conditional expectation, the right-hand side of (5.3)is indeed EQ[X |G].

5.1.2 Equivalent and locally equivalent measures on a filteredspace

It is interesting and, as we shall see shortly, very important to consider howTheorem 5.3 and Proposition 5.7 can be refined when the space (Ω,F) isfurther endowed with a filtration Ft. We have already provided the basisfor this extension with Theorem 5.8 and Corollary 5.9. Our first result extendsTheorem 5.3 and motivates the rest of this section.

Theorem 5.10 Let P and Q be probability measures on the filtered space(Ω, Ft,F), and suppose that Q ∼ P with respect to F . Suppose further thatthe filtration Ft satisfies the usual conditions. Then, for all t ∈ [0,∞],

ρt :=dQdP Ft

defines an a.s. strictly positive UI Ft martingale under P.Proof: Since P and Q are equivalent with respect to F , there exists a strictlypositive random variable ρ ∈ L1(Ω,F ,P) such that ρ = dQ

dP F . Define theuniformly integrable martingale ρt := EP[ρ|Ft], t ∈ [0,∞]. It follows fromTheorem 5.8 that ρt =

dQdP Ft a.s., so all that remains is to prove that the

martingale ρ is a.s. strictly positive.Since Ft satisfies the usual conditions ρ has a cadlag version (Theorem

3.8). Given any fixed t ≥ 0 it is clear that ρt is strictly positive, beingthe conditional expectation of a strictly positive random variable. Combiningthese two observations, we see that

P(ρt > 0, all t ≥ 0) = Pq∈Q+

ρq > 0 = 1 .

Page 120: Financial derivatives in theory and practice

96 Girsanov and martingale representation

Remark 5.11: It is clear from the above proof that when the usual conditionsdo not apply, ρt is a.s. strictly positive for any given t ≥ 0. However, ρ(ω) willnot necessarily be a.s. strictly positive for all t simultaneously.

The martingale ρ plays a crucial role in Girsanov’s theorem, Theorem5.20, a result which extends Proposition 5.7 to processes and explains howsemimartingales behave under a change of probability measure. This will bediscussed in Section 5.2.The remainder of this section will be devoted to proving various converse

statements to Theorem 5.10. This is important because it will enable us tostart with some probability measure P and construct some other equivalentmeasure Q by using a P martingale.The converse to Theorem 5.10 is not as straightforward as one might at

first hope. Indeed it is immediate that if F is richer than F∞ it is not possibleto recover dQ

dP F from the martingale ρ. The nearest thing to a converse forTheorem 5.10 is the following result, a consequence of Theorems 5.3 and 5.10.

Theorem 5.12 Let P be a probability measure on (Ω, Ft,F). Given anya.s. strictly positive UI (Ft,P) martingale ρ with EP[ρ∞] = 1,

dQdP F∞

:= ρ∞

defines (via equation (5.1)) a measure Q ∼ P with respect to F∞.So given a strictly positive UI (Ft,P) martingale it is possible to use this

to define some other measure Q ∼ P with respect to F∞. But what if ρ is astrictly positive martingale but not a UI martingale (which is a much morerestrictive requirement)? In this case the situation is somewhat more subtle,as the following results demonstrate. Note, however, that if we are workingonly on some finite time horizon [0, T ] then any cadlag martingale is UI sothe complications below (with regard to completing filtrations in particular)do not arise.

Definition 5.13 Two probability measures P and Q defined on the filteredspace (Ω, Ft,F) are said to be locally equivalent (with respect to thefiltration Ft) if, for all t ≥ 0, P ∼ Q with respect to Ft, i.e. for allF ∈ t∈[0,∞)Ft,

P(F ) = 0 ⇔ Q(F ) = 0 .

Theorem 5.14 Let P and Q be probability measures on (Ω, Ft,F), andsuppose that Q is locally equivalent to P with respect to Ft. Then

ρt :=dQdP Ft

Page 121: Financial derivatives in theory and practice

Equivalent probability measures and the Radon—Nikodym derivative 97

defines an Ft martingale under P which is a.s. strictly positive for eacht ≥ 0. If, in addition, Ft satisfies the usual conditions under P then ρ hasan a.s. strictly positive version.Conversely, if ρ is a strictly positive martingale on the probability space

(Ω, Ft,F ,P), then for each T ≥ 0,

dQT

dP FT= ρT

defines a probability measure QT ∼ P on (Ω,FT ). Furthermore, the familyQT : T ≥ 0 is consistent, meaning that if T ≥ S and F ∈ FS , thenQT (F ) = QS(F ).

Proof: This result is an immediate consequence of Theorems 5.10 and 5.3.The natural question to ask now, and the final one we shall address in

this section, is: given a strictly positive (Ft,P) martingale, as in Theorem5.14, does there exist some Q defined on F∞ such that Q(F ) = QT (F )for all F ∈ FT , all T > 0, i.e. does ρ define some probability measureQ on F∞ which is locally equivalent to P? In general the answer to thisquestion is no, as Example 5.28 below demonstrates. It does, however, holdin one important special case. This result follows from Theorem 5.14 and theDaniel—Kolmogorov consistency theorem. For a detailed proof see Karatzasand Shreve (1991, p. 192).

Theorem 5.15 Suppose ρ is a strictly positive martingale on the probabilityspace (Ω, Ft,F ,P) and that Ft = σ(Xu : u ≤ t) for some process X(note crucially that the filtration is not complete). Then there exists someprobability measure Q defined on F ∞ which is locally equivalent to P withrespect to Ft and is such that

dQdP Ft

= ρt a.s.

5.1.3 Novikov’s condition

We will see, in Section 5.2, how the law of a P-continuous semimartingalechanges under a locally equivalent measure Q. Often in practice we findourselves in some measure P with a given continuous semimartingale andwe want to find some other measure Q under which this continuoussemimartingale has particular properties. For example, in a financial contextit is important to find a measure under which a continuous semimartingalebecomes a (local) martingale. In such situations Girsanov’s theorem usuallygives enough information to identify a candidate for the Radon—Nikodymprocess ρ which connects P and Q. It is then obvious that the candidate ρ is a

Page 122: Financial derivatives in theory and practice

98 Girsanov and martingale representation

local martingale under P but not that it is a true martingale. A non-negativelocal martingale is always a supermartingale (a consequence of Fatou’s lemma)and thus it is immediate that it is a martingale if and only if

EP[ρt] = 1 , (5.4)

for all t ≥ 0. It is useful, therefore, to have criteria which are sufficient toguarantee that (5.4) holds. One such sufficient condition is known as Novikov’scondition. For a proof and a discussion of an alternative criterion the readeris referred to Chapter VIII of Revuz and Yor (1991).

Theorem 5.16 (Novikov) Let (Ω, Ft,F ,P) be a filtered probability spacesupporting a continuous local martingale X and a strictly positive randomvariable ρ0 ∈ L1(Ω,F0,P), and define ρ = ρ0E(X). If, for all t ≥ 0,

E exp( 12 [X ]t) <∞ ,

then E[ρt] = E[ρ0] for all t ≥ 0 and thus ρ is a martingale.This result only applies to martingales ρ of the form ρ = ρ0E(X), for some

X . But this class includes all strictly positive continuous martingales, as thefollowing lemma (to which we also appeal later) summarizes.

Lemma 5.17 Suppose Q is locally equivalent to P with respect to F t andthat the strictly positive P martingale ρt = dQ

dP Ft has a continuous version.Then

Xt :=t

0

ρ−1u dρu

is the unique continuous local martingale such that

ρt = ρ0E(Xt) := ρ0exp(Xt − 12 [X ]t). (5.5)

Moreover,dPdQ Ft

= ρ−10 E(−Xt)

Proof: The result follows by applying Ito’s formula to (5.5).

Example 5.18 SupposeW is a d-dimensional Brownian motion on the filteredprobability space (Ω, Ft,F ,P) which satisfies the usual conditions andlet C ∈ Π(W ) (the d-dimensional version of Π(W ) introduced in Chapter4). Then X = C •W defines an (Ft,P) local martingale. If, in addition,EP[

12

T

0 |Cu|2 du] <∞ then

ρt := Et

0

Cu · dWu

defines an (Ftt≤T ,P) martingale.

Page 123: Financial derivatives in theory and practice

Girsanov’s theorem 99

5.2 GIRSANOV’S THEOREM

We have previously stated that a semimartingale remains a semimartingaleunder an equivalent change of probability measure. This result is known asGirsanov’s theorem, a result which also establishes explicitly the link betweenthe decomposition of an arbitrary semimartingale X under the two equivalentmeasures P and Q and the Radon—Nikodym derivative process ρ t =

dQdP Ft .

Girsanov’s theorem and its proof are the subject of this section.We have in this book only developed results for continuous semimartingales

and so here we study Girsanov’s theorem in a restricted context. In particular,we will need to assume that the Radon—Nikodym process connecting the twomeasures is itself continuous. Nevertheless the results we obtain cover manyof the most important applications and certainly everything needed for thisbook. As in Section 5.1, we will state all the results here for (locally) equivalentmeasures, although they do generalize to the case when we only have (local)absolute continuity of one measure with respect to the other.The technique of studying problems using change of measure is unique

to stochastic calculus, having no counterpart in the classical theory ofintegration, and plays a surprisingly important role in further developmentsof the theory. The reader will briefly be introduced to one significant use ofthe idea when we discuss stochastic differential equations driven by Brownianmotion in the Chapter 6.

5.2.1 Girsanov’s theorem for continuous semimartingales

Let Q be a probability measure on (Ω,F) locally equivalent to P with respectto Ft. Clearly a process of finite variation under P is also a processof finite variation under Q. However, local martingales may not have themartingale property under a change of measure and the first part of Girsanov’stheorem gives the decomposition of a continuous local martingale under P as acontinuous semimartingale under Q. The general decomposition then follows.We shall need the following lemma which is a step in the right direction.

Lemma 5.19 Suppose Q is locally equivalent to P with respect to thefiltration Ft and let

ρt =dQdP Ft

.

Then:

(i) M is an (Ft,Q) martingale ⇔ ρM is an (Ft,P) martingale;(ii) M is an (Ft,Q) local martingale⇔ ρM is an (Ft,P) local martingale.Proof: To establish (i), we must check that conditions (M.i)—(M.iii) ofDefinition 3.1 are satisfied. The adaptedness property (M.i) is immediate.Conditions (M.ii) and (M.iii) follow from Corollary 5.9.

Page 124: Financial derivatives in theory and practice

100 Girsanov and martingale representation

Part (ii) is now an exercise in localization of part (i) and is left to the reader.

Theorem 5.20 (Girsanov’s theorem) Let Q be locally equivalent to Pwith respect to Ft and suppose the martingale ρ satisfying ρt := dQ

dP Ft hasa continuous version. If M is a continuous local martingale under P, then

Mt :=Mt −t

0

ρ−1u d[ρ,M ]u

is a continuous local martingale under Q and

[M ]t = [M ]t.

More generally, if Y is a continuous semimartingale under P with canonicaldecomposition

Yt = Y0 +Mt +At

where M is a P-local martingale and the process A is of finite variation, then

Yt = Y0 + Mt −t

0

ρ−1u d[ρ,M ]u +t

0

ρ−1u d[ρ,M ]u + At

is the canonical decomposition of Y under Q.

Proof: The second statement is an immediate consequence of the first, whichwe now prove. Under Q and, since P is locally equivalent to Q, also underP, ρ is a.s. strictly positive and continuous (the strict positivity follows asin the proof of Theorem 5.10). Thus the process ρ−1 is locally boundedand predictable and the integral

t

0ρ−1u d[ρ,M ]u exists and is a well-defined,

continuous finite variation process under both P and Q. Clearly Mt is acontinuous semimartingale under P.Appealing to Lemma 5.19, to show Mt is a Q local martingale we need only

prove that ρtMt is a P local martingale. First note that, sinceM and M differonly by a finite variation process, the quadratic variation processes forM andM agree. Further, since the covariation of a process of bounded variation witha semimartingale is identically zero, we have

[ρ,M ]t = [ρ, M ]t.

Since ρtMt is the product of two continuous P-semimartingales, we can applyintegration by parts (Theorem 4.28) to obtain

ρtMt = ρ0M0 +t

0

Mudρu +t

0

ρudMu + [ρ, M ]t

= ρ0M0 +t

0

Mudρu +t

0

ρudMu − [ρ,M ]t + [ρ, M ]t

= ρ0M0 +t

0

Mudρu +t

0

ρudMu.

Page 125: Financial derivatives in theory and practice

Girsanov’s theorem 101

Thus ρM is a sum of terms each of which is a stochastic integral with respectto a P local martingale and so itself is a P local martingale, and we are done.

Remark 5.21: The hypothesis that Q is locally equivalent to P in Girsanov’stheorem can be relaxed. If we replace it by the requirement that Q be locallyabsolutely continuous with respect to P, the above results remain true asstated.

Because the Radon—Nikodym process is assumed to be continuous, Lemma5.17 shows that it can be written in exponential form. This yields the followingalternative presentation of Girsanov’s theorem in this case.

Corollary 5.22 Suppose Q is locally equivalent to P with respect to F tand that the strictly positive P martingale ρt = dQ

dP Ft has a continuousversion. Let

Xt :=t

0

ρ−1u dρu

as in Lemma 5.17. If Y is a continuous semimartingale under P with canonicaldecomposition Y = Y0 +M +A, then Y has the canonical decomposition

Yt = Y0 + Mt − [M,X ]t + [M,X ]t +At

under Q.

5.2.2 Girsanov’s theorem for Brownian motion

The building block for most continuous state-space models, and all those inthis book, is Brownian motion. It is valuable to study Girsanov’s theorem inthe context of a Brownian motion where the extra structure means that muchmore can be said, beginning with the following simple corollary of Corollary5.22.

Corollary 5.23 Under the assumptions of Corollary 5.22, if W is an(Ft,P) Brownian motion then

W :=W − [W,X ]

is an (Ft,Q) Brownian motion.Proof: From Corollary 5.22, W is a local martingale under Q. That W isa Brownian motion now follows from Levy’s theorem (Theorem 4.38) since

[W ]t = [W ]t = t.An essential ingredient of the above corollary is the assumption that the

Radon—Nikodym process ρ be continuous. Subject to this assumption, we

Page 126: Financial derivatives in theory and practice

102 Girsanov and martingale representation

have identified a simple but significant fact and one we shall make repeateduse of in later chapters: under a locally equivalent change of measure aBrownian motion remains a Brownian motion apart from the addition of afinite variation process.

In the context of Brownian motion, the assumption that ρ is continuousis not as restrictive as it may at first appear. This a consequence ofTheorem 5.49, the martingale representation theorem for Brownian motion,which is discussed in Section 5.3.2. An abridged version of that result is asfollows.

Theorem Let (Ω, Ft,F , P) be a probability space supporting a d-dimen-sional Brownian motion W and let FW

t be the augmented natural filtrationgenerated by W. Then any local martingale N adapted to FW

t can be writtenin the form

Nt = N0 +∫ t

0Hu · dWu,

for some FWt -predictable H ∈ Π(W ).

As a consequence of this theorem, if Ft = FWt then any local mar-

tingale, in particular the Radon–Nikodym process, automatically has a con-tinuous version (since the stochastic integral above is continuous) and thussatisfies the hypothesis of Corollary 5.23.

In the case when we have local equivalence between P and Q and we areagain working with the augmented Brownian filtration, we can apply themartingale representation theorem to be more explicit about the form ofthe Radon–Nikodym process and the finite variation process appearing inCorollary 5.23.

Theorem 5.24 Let W be a d-dimensional (Ft, P) Brownian motion andlet FW

t be the filtration generated by W, augmented to satisfy the usualconditions. Suppose Q is a probability measure locally equivalent to P withrespect to FW

t . Then there exists an FWt predictable Rd-valued process C

such that

ρt :=dQ

dP

∣∣∣∣FW

t

= exp(∫ t

0Cu · dWu − 1

2

∫ t

0|Cu|2du

). (5.6)

Conversely, if ρ is a strictly positive FWt : 0 ≤ t ≤ T martingale, for some

T ∈ [0,∞], with EP[ρT] = 1, then ρ has the representation in (5.6) and definesa measure Q ≡ QT ∼ P with respect to FW

T .In either of the above cases, under Q,

Wt := Wt −∫ t

0Cudu

is an (FWt , Q) Brownian motion (with the time horizon being restricted to

[0, T ] in the latter case).

Page 127: Financial derivatives in theory and practice

Girsanov’s theorem 103

Remark 5.25: In the converse statement above we have imposed the usualconditions on FWt in order to ensure that the martingale representationtheorem holds. It is then not true, in general, that there exists some measureQ defined on FW∞ which is locally equivalent to P with respect to FW

t , asExample 5.28 below demonstrates. For this reason the converse statement inTheorem 5.24 requires either a finite time horizon (T <∞) or that ρ be a UImartingale (T =∞).Proof: Theorem 5.14 implies that the Radon—Nikodym process ρ is astrictly positive (FWt ,P) martingale and it follows from the martingalerepresentation theorem that it has a continuous version. Further, since FW

0 istrivial and E[ρ0] = 1 we conclude that ρ0 ≡ 1 P-a.s. By Lemma 5.17,

Xt :=t

0

ρ−1u dρu

is the unique continuous (FWt ,P) local martingale such that ρ = E(X).Again appealing to the martingale representation theorem, X has an integralrepresentation

Xt =t

0

Cu · dWu

for some FWt predictable C ∈ Π(W ). This proves (5.6).Conversely, the fact that a strictly positive (FW

t ,P) martingale defines ameasureQT which is equivalent to P with respect to FW

T follows from Theorem5.14. The exponential representation of ρ in Equation (5.6) follows exactly asabove.Noting that [W,X ]t =

t

0 Cudu, the final statement of the theorem followsfrom Corollary 5.23.This leads to the following corollary which pulls together several of the

above results in a form useful for application. The proof is left as a simpleexercise.

Corollary 5.26 Under the assumptions and using the notation of thefirst part of Theorem 5.24, any (FW

t ,P) semimartingale Y is also an(FWt ,Q) semimartingale and has the canonical decompositions under P andQ respectively

Yt = Y0 +t

0

σu · dWu +At

= Y0 +t

0

σu · dWu +t

0

Cu · σudu+At

for some σ ∈ Π(W ) and some finite variation process A.

Page 128: Financial derivatives in theory and practice

104 Girsanov and martingale representation

Remark 5.27: Note that the extra finite variation process introduced aboveas a result of the change of measure is absolutely continuous with respect toLebesgue measure.

Example 5.28 Now the promised example (see comments preceding Theorem5.15) which illustrates the difficulties caused by the usual conditions when thetime horizon is infinite. Let (Ω, Ft,F ,P) be a probability space supportinga one-dimensional Brownian motion W, and let µ be a non-zero constant.Then it is easily checked (for example using Novikov’s condition, or by directcalculation as we did in Example 3.3) that

ρt := exp(µWt − 12µ

2t), t ≥ 0,

is an ((FWt ),P) martingale and so can be used to define a measure Q on(Ω, (FW∞ )) which is locally equivalent to P with respect to (FW

t ) (Theorem

5.15). By Corollary 5.23, the process Wt := Wt − µt is an ((FWt ),Q)Brownian motion.Although the measures P and Q are locally equivalent with respect to

(FWt ) they are not equivalent with respect to (FW∞ ). Nor are they

locally equivalent with respect to FWt , the P-augmentation of (FW

t )

with respect to (FW∞ ). To see this, consider the event

Λ := ω ∈ Ω : limt→∞

t−1Wt = µ ∈ (FW∞ ).

Clearly Q(Λ) = 1 but P(Λ) = 0 (Theorem 2.7) and so Q is not absolutelycontinuous with respect to P on the σ-algebra (FW

∞ ). Furthermore, Λ ∈ FWt

for all t ≥ 0 since it has probability zero under P, and so Q is not locallyequivalent to P with respect to FW

t .We can say more. The process W is an ((FW

t ),P) Brownian motion,

and it is also an ((FWt ),P) Brownian motion since P-augmentation has

no material effect. Girsanov’s theorem proves that W is an ((FWt ),Q)Brownian motion but, given any event A ∈ FW∞ ,

P(A ∩ Λ) = 0 ,

so A ∩ Λ ∈ FW0 . Under the measure P this has little effect, adding only nullsets, but under Q,

Q(A ∩ Λ) = Q(A) = 0 ,in general. Thus, given any ‘interesting’ event under the measure Q it is

effectively included in FW0 . ThusW cannot be an (FWt ,Q) Brownian motionbecause Wt+s −Wt is not independent of FWt for any t, s.A consequence of this example is that it is not possible to work with the

completed Brownian filtration over an infinite horizon when constructing a

Page 129: Financial derivatives in theory and practice

Martingale representation theorem 105

locally equivalent measure from a martingale (recall that there is no problemif the martingale is UI). It is possible to derive an analogue of Theorem 5.24when the filtration is not complete, but to do so it is necessary to work onthe canonical set-up (as introduced in Section 2.2 and discussed further inSection 6.3). This additional restriction is required because it is not, in general,possible to define the stochastic integral if the filtration is not complete, exceptwhen working on the canonical set-up. For more on this see, for example,Karatzas and Shreve (1991).Of course, none of these complications arise when considering only a finite

time horizon T and completing, for example, with respect to (FWT ).

5.3 MARTINGALE REPRESENTATION THEOREM

An important question in finance is: which contingent claims (i.e. payoffs)can be replicated by trading in the assets of an economy? It is only derivativeswith these payoffs that can be priced using arbitrage ideas. A related andequally important question in probability theory is the following: given a(vector) martingale M , under what conditions can all (local) martingales Nbe represented in the form

Nt = N0 +t

0

Hu · dMu,

for some H ∈ Π(M).An answer to this second question is provided by the martingale

representation theorem, and this section is devoted to providing a proof.In keeping with the rest of this book, we shall only prove the martingalerepresentation theorem in the case when the martingale M is continuous. Amore general treatment can be found in Jacod (1979).If your objective in reading this section is only to obtain a precise statement

of the results that are needed in Chapter 7 where we discuss completeness,you need only understand the definition of an equivalent martingale measure(Definition 5.40) and the martingale representation theorem itself, Corollary5.47 and Theorem 5.50. Combine these with Theorem 6.38 of Chapter 6 andyou have all you need. If, on the other hand, you wish to understand why themartingale representation theorem holds, then read on.The proof of the martingale representation theorem follows a familiar path.

First we prove a version of the result for the representation of martingales inM2

0 and then we extend to all local martingales by localization arguments.As usual, all the hard work is done in proving the former result. We dothis in two steps. First, we show that any N ∈ M2

0 can be written asthe sum of a term we can represent as a stochastic integral plus a termorthogonal to it. The second step is to provide necessary and sufficient

Page 130: Financial derivatives in theory and practice

106 Girsanov and martingale representation

conditions for the orthogonal term to be zero. For this we shall introducethe idea of an equivalent martingale measure, a measure equivalent to theoriginal probability measure P and under which M remains a martingale.The connection between equivalent martingale measures and martingalerepresentation may seem surprising at first, but remember that the Radon—Nikodym process connecting two equivalent probability measures is a UImartingale, and if the martingale representation theorem holds then we mustbe able to represent this martingale as a stochastic integral. The connectionis made precise in the proofs to follow.Throughout this section we will be working on a filtered probability

space (Ω, Ft,F ,P) satisfying the usual conditions. The usual conditionsare needed for Proposition 5.45 and all dependent results to hold.

5.3.1 The space I2(M) and its orthogonal complementWe begin with the definition of I2(M) and one of its important properties,that of stability.

Definition 5.29 For a continuous (vector) martingaleM we define the spaceI2(M) to be

I2(M) = N ∈ cM20 : N = H •M, for some H ∈ Π(M) .

That is, I2(M) is the set of square-integrable martingales which can berepresented as a stochastic integral with respect to M .

Remark 5.30: When M is a vector martingale we will use the notation H •Mto mean iH

(i) •M (i).

Remark 5.31: Of course, in Definition 5.29 H •M is automatically continuous,being a stochastic integral for a continuous integrator M .

Proposition 5.32 The space I2(M) is a stable subspace of M20, meaning

it satisfies the following two conditions:

(S.i) It is closed (under the norm · 2).(S.ii) It is stable under stopping, meaning that if N ∈ I2(M) and T is astopping time, then NT ∈ I2(M).

Remark 5.33: Notice in Proposition 5.32 that we are considering the spaceM20

rather than the smaller space cM20 in which all martingales are continuous.

Here, and repeatedly in what follows, we could have chosen to work with cM20.

The reason why the precise choice is irrelevant is that when the martingalerepresentation theorem holds (with respect to some continuous martingaleM)all martingales are continuous and soM2

0 = cM20.

Page 131: Financial derivatives in theory and practice

Martingale representation theorem 107

Remark 5.34: We remind the reader once more of a point made repeatedly inChapter 4. Condition (S.i) makes no sense when the set I2(M) is consideredas a set of stochastic processes since then · 2 is not a norm. It is only strictlycorrect when we identify any two processes which are indistinguishable andwork with equivalence classes of processes. As we did in Chapter 4, we shallmake this identification whenever required in what follows without furthercomment. The same is also true for the various other L2 spaces that weencounter.

Proof: If N = H •M ∈ I2(M) then

NT = (H •M)T = H1(0, T ] •M ∈ I2(M)

by Theorem 4.17, so (S.ii) holds.The proof of (S.i) is an adaptation of the usual isometry argument. Let

Hn •M be a Cauchy sequence (using the norm · 2) in cM20 with limit

N ∈ cM20 (cM2

0 is complete). By the stochastic integral isometry (Corollary4.14),

Hn −HmM = Hn •M −Hm •M 2 ,

so Hn is also Cauchy in the space L2(M), which is also complete. It thereforehas a limit H in L2(M) and N = H •M ∈ I2(M). Thus I2(M) is closed.The space I2(M) is precisely those martingales in M2

0 which can berepresented as a stochastic integral with respect toM . We shall see in Theorem5.37 that being able to represent any element ofM2

0 as a stochastic integralis equivalent to the orthogonal complement of I 2(M) inM2

0 being zero.

Definition 5.35 Two square-integrable martingales M and N are said tobe orthogonal, written M ⊥ N , if E[M∞N∞] = 0. Given any N ⊂ M2

0, wedefine the orthogonal complement of N , N⊥, to be

N⊥ = L ∈M20 : L ⊥ N for all N ∈ N .

Remark 5.36: Recall from Chapter 3 that we consider the space M20 to be

a Hilbert space with norm M 22 = E[M2

∞]. The inner product associatedwith the norm · 2 is then just M,N = E[M∞N∞], so orthogonality ofmartingales inM2

0 as defined above is the usual concept of orthogonality fora Hilbert space.

The following result is now an immediate consequence of standard linearanalysis and Remark 5.36.

Theorem 5.37 M20 = I2(M) ⊕ I2(M) ⊥. That is, any M ∈ M2

0 has a

unique decomposition M = N + L where N ∈ I 2(M) and L ∈ I2(M) ⊥.

Page 132: Financial derivatives in theory and practice

108 Girsanov and martingale representation

In the results to follow we will be working with elements of I2(M) andI2(M) ⊥. We will need to consider elements of each space stopped at somestopping time T . We have seen (Proposition 5.32) that I 2(M) is stable, sostopping an element in I2(M) produces another element of I2(M). The sameis true of I2(M) ⊥, a fact we shall also need.Proposition 5.38 The space I2(M) ⊥ is a stable subspace ofM2

0.

Proof: It is a standard result from linear analysis that I2(M) ⊥, theorthogonal complement of a closed space, is closed, so condition (S.i) is

straightforward. To prove that I2(M) ⊥ is closed under stopping (condition(S.ii)), let N ∈ I2(M), L ∈ I2(M) ⊥ and let T be some stopping time.It follows from the optional sampling theorem and the tower property,respectively, that

E[NTLT ] = E NTE[L∞|FT ] = E[NTL∞] = E[NT∞L∞] , (5.7)

andE[NTLT ] = E E[N∞|FT ]LT = E[N∞LT ] = E[N∞LT∞] . (5.8)

But I2(M) is stable, so NT is orthogonal to L and thus (5.7) equates to zero.This is identically equal to (5.8), thus N is orthogonal to LT . Finally, since N

was arbitrary we conclude that LT ∈ I2(M) ⊥ and so I2(M) ⊥ is stableunder stopping.This proposition allows us to establish the following characterization of the

continuous martingales within the space I2(M) ⊥, one we use below whenestablishing the martingale representation theorem.

Lemma 5.39 Let M be a continuous martingale.(i) Suppose that N ∈ cM2

0. Then N ∈ (I2(M))⊥ if and only if NM is alocal martingale.

(ii) Suppose that N is a bounded martingale. Then N ∈ (I2(M))⊥ if andonly if NM is a martingale.

Proof: Throughout this proof we will, without further comment, makerepeated use of Corollary 3.82 which states that if N and M are continuouslocal martingales then NM is a local martingale if and only if [N,M ] = 0.

Proof of (i): Suppose N ∈ I2(M) ⊥ and let T be some arbitrary stoppingtime. For each n, define Sn = inft > 0 : |Mt −M0| > n and note thatMSn −M0 = 1(0, Sn] •M ∈ I2(M). By the stability of I2(M) and I2(M) ⊥respectively, NT ⊥ (MSn −M0)

T and so

E[NT (MSnT −M0)] = 0 .

Furthermore, E[|NT (MSnT −M0)|] ≤ nE[|NT |] <∞ since N is a UI martingale,

and N(MSn−M0) is continuous, so Theorem 3.53 implies that N(MSn−M0)

Page 133: Financial derivatives in theory and practice

Martingale representation theorem 109

is a UI martingale. It now follows that [N,M Sn ] = 0 and, letting n → ∞,[N,M ] = 0 so NM is a local martingale.Conversely, suppose that NM is a local martingale, hence [N,M ] = 0. Let

L = H •M ∈ I2(M). By (the local martingale extension of) Theorem 4.18,[N,L] = [N,H •M ] = H • [N,M ] = 0, so NL is a local martingale. But then

NL = NL− [N,L]

= 12 (N + L)2 − [N + L] − N2 − [N ] − L2 − [L]

expresses NL as a sum of UI martingales, hence it is itself a UI martingaleand E[N∞L∞] = 0. This proves the result.Proof of (ii): As above, we can conclude that if N ∈ (I2(M))⊥ then NM isa local martingale. Note that the continuous process NM is a martingale ifand only if the stopped process (NM)t is a UI martingale for all t ≥ 0. Thus,by Theorem 3.53, to prove that NM is a martingale it suffices to prove thatE[|(NM)tT |] <∞ and E[(NM)tT ] = (NM)0 for all stopping times T .To prove that E[|(NM)tT |] < ∞, note that M t is a UI martingale so it

follows from Corollary 3.51 that E[|(NM)tT |] ≤ KE[|M tT |] <∞,K being some

constant which bounds N . It remains to prove that E[(NM)tT ] = (NM)0. Tosee this, let Xn := N t∧SnM t where Sn is as defined above. Then

E[XnT ] = EE[X

nT |FT∧Sn ]

= EE[N tT∧SnM

tT |FT∧Sn ]

= E[N tT∧SnM

tT∧Sn ]

= (NM)0 ,

the third equality following from the optional sampling theorem applied tothe UI martingaleM t, the final equality following since (NM)Sn is a boundedlocal martingale, thus a UI martingale. Now let n → ∞. In so doing, we seethat Xn

T → (NM)T a.s., and, since |XnT | ≤ K|M t

T | which is bounded in L1,the dominated convergence theorem implies that

E[(NM)tT ] = E[ limn→∞XnT ]

= limn→∞

E[XnT ]

= (NM)t0 ,

which completes the proof.The converse result follows exactly as in (i).

Page 134: Financial derivatives in theory and practice

110 Girsanov and martingale representation

5.3.2 Martingale measures and the martingale representationtheorem

The martingale representation property is concerned with the ability, onsome given probability space (Ω, Ft,F ,P), to represent an arbitrarylocal martingale as a stochastic integral with respect to some other givenmartingale. The martingale representation theorem relates this property tothe extremality of P within the set of equivalent martingale measures for M .To procede we therefore need the following definitions.

Definition 5.40 Let (Ω, Ft,F ,P) be a probability space satisfying theusual conditions and let M be a continuous (possibly vector) martingale un-der P. The set of equivalent martingale measures forM , written M(M), is theset of probability measures Q satisfying each of the following conditions:

(i) Q ∼ P with respect to F ;(ii) Q and P agree on F0;(iii) M is an Ft martingale under Q.Remark 5.41: Although it is not explicit in the notation, note that M(M) isdependent on the whole probability space (Ω, Ft,F ,P).Remark 5.42: We have insisted in the definition that (Ω, Ft,F ,P) satisfiesthe usual conditions. This is essential for Proposition 5.45 and all subsequentresults to hold. Note that when this is true condition (i) is redundant since itfollows from condition (ii).

The following lemma is a trivial consequence of the above definition.

Lemma 5.43 The set M(M) is convex.

Definition 5.44 The measure P is said to be extremal forM(M) if, wheneverQ,R ∈ M(M) and P = λQ+ (1− λ)R, 0 < λ < 1, then P = Q = R.

We shall need the following result which we state without proof. Aprobabilistic proof can be found in Protter (1990), an analytical one inRevuz and Yor (1991). This is a vital step in the proof of the martingalerepresentation theorem but we would have to include several new ideas toprovide a proof.

Proposition 5.45 If P is extremal in M(M) then every (Ft,P) localmartingale is continuous.

This proposition enables us to state and prove the main result of this section,the martingale representation theorem for square-integrable martingale.

Theorem 5.46 Let (Ω, Ft,F ,P) satisfy the usual conditions and let Mbe a continuous martingale under P. The following are equivalent:

(i) P is extremal for M(M);(ii) I2(M) =M2

0.

Page 135: Financial derivatives in theory and practice

Martingale representation theorem 111

Proof: (i)⇒ (ii): We already know thatM20 = I2(M)⊕I2(M)⊥ by Theorem

5.37. It suffices to prove that I2(M)⊥ = 0.Let N ∈ I2(M)⊥. Since P is extremal, N is continuous by Proposition 5.45

whence, defining T = inft > 0 : |Nt| > 12, NT is bounded by 1

2 . Thus wecan use NT to define two new measures Q and R via

dQdP F

:= (1−NT∞) = (1−NT ),

dRdP F

= (1 +NT ) .

Clearly P = 12Q +

12R, P ∼ Q ∼ R, and they agree on F0 (since N0 = 0).

Furthermore, noting that NT ∈ I2(M) ⊥ and is bounded, we have that

NTM is a martingale under P by Lemma 5.39. Thus, subtracting this fromthe (Ft,P) martingaleM we see thatM(1−N T ) is also a martingale underP, and it then follows from Lemma 5.19 that M is an (Ft,Q) martingale.But P is extremal by hypothesis, thus P = Q = R and N ≡ 0.(ii) ⇒ (i): Let Q, R ∈M(M) and 0 < λ < 1 be such that P = λQ+ (1− λ)R.Define the UI martingale ρ via

ρt = EdQdP

Ft , t ∈ [0,∞] .

Observe that ρ is bounded by λ−1, a fact which follows since

1 = EPdPdP

F∞ = λρ∞ + (1− λ)EP dRdP

F∞ ≥ λρ∞ ,

andMρ is an (Ft,P) martingale by Lemma 5.19. Now, define the martingaleN := ρ − ρ0. The martingale property of M and ρM under P now impliesthat NM is a martingale under P. Furthermore, the measures P and Q agreeon F0, so ρ0 = 1 a.s., and N = ρ−1 is thus bounded by λ−1+1. Lemma 5.39now implies that N ∈ I2(M)⊥ and thus, from (ii), N ≡ 0 and so P = Q = Rand P is extremal.

5.3.3 Extensions and the Brownian case

Theorem 5.46 is the core result in martingale representation theory forcontinuous martingale integrators. However, as it stands it is restrictive andit is not yet obvious that the result is of any practical significance. In thissection we begin to rectify these shortcomings.First, for a given continuous martingaleM , we extend the class of processes

we can represent to include all local martingales. This follows easily bylocalization.

Corollary 5.47 Let (Ω, Ft,F ,P) satisfy the usual conditions and let Mbe a continuous martingale under P. Define

I(M) = N ∈M0,loc : N = H •M, for some H ∈ Π(M) .

Page 136: Financial derivatives in theory and practice

112 Girsanov and martingale representation

The following are equivalent:

(i) P is extremal for M(M);(ii) I(M) =M0,loc.

Proof: (i)⇒ (ii): If P is extremal thenN ∈M0,loc is continuous by Proposition5.45 and so can be localized using Sn = inft > 0 : |Mt| > n. Theorem 5.46now implies that there exists some Hn ∈ Π(M) such that

NSnt =

t

0

Hnu · dMu .

Defining H = ∞n=1 1(Sn−1, Sn]H

n ∈ Π(M) yields N = H •M .

(ii) ⇒ (i): If I(M) = M0,loc then cM20 = I2(M) so the result follows from

Theorem 5.46.Given any probability space (Ω, Ft,F ,P) one often wants to be able

to represent (local) martingales which are adapted to a smaller filtrationGt ⊂ Ft, G ⊂ F . But, of course, P induces a probability measure on G and wecould throughout have chosen to work on the smaller space (Ω, Gt,G,P). Aslong as M is Gt-adapted and we modify Definition 5.40 for an equivalentmartingale measure (replacing (Ft,F) with (Gt,G) throughout), all theresults above remain valid.Replication of derivative payoffs is more a question of representing random

variables than of representing (local) martingales. However, the two ideasare intimately related and it is possible to move from one to the other. Thefollowing result shows how random variable representation follows from ourresults so far.

Corollary 5.48 Let (Ω, Ft,F ,P) be a probability space supporting acontinuous martingale M ∈ cM0 and let FMt be the filtration generatedby M , augmented to satisfy the usual conditions. If P is extremal for M(M),defined relative to (Ω, FMt ,FM∞ ,P), then for any Y ∈ L1(FM∞ ) there existssome FMt -predictable H ∈ Π(M) such that

Y = E[Y ] +∞

0

Hu · dMu .

Proof: Defining Yt = E[Y |FMt ], Corollary 5.47 yields

Yt = Y0 +t

0

Hu · dMu

for some FMt -predictable H ∈ Π(M). Since M0 = 0, FM0 is trivial, Y0 is aconstant a.s., and Y0 := E[Y |FM0 ] = E[Y ]. The result follows.As an example of this corollary we consider the case when the martingaleM

is actually a Brownian motion. In this special case it is possible to establish theextremality of the measure P in the set M(M) by showing that M(M) = P.

Page 137: Financial derivatives in theory and practice

Martingale representation theorem 113

Theorem 5.49 Let (Ω, Ft,F ,P) be a probability space supporting aBrownian motion W and let FW

t be the augmented natural filtrationgenerated by W . Then any local martingale N adapted to FW

t can bewritten in the form

Nt = N0 +t

0

Hu · dWu ,

for some FWt -predictable H ∈ Π(W ). In particular, any Y ∈ L1(FW∞ ) hasthe form

Y = E[Y ] +∞

0

Hu · dWu .

Proof: We work on the space (Ω, FWt ,FW∞ ,P). The results follow from

Corollaries 5.47 and 5.48 if we can show that P is extremal for M(W ) (onthis set-up). We show that P is the unique element of M(W ).Suppose Q ∈ M(W ). The quadratic variation [W ] is independent of

probability measure and, since Q ∼ P (with respect to FWt ), [W ]t = t

a.s. under both P and Q. It follows from Levy’s theorem that W is also aBrownian motion under Q. Therefore all finite-dimensional distributions forW agree under P and Q, and thus P = Q on (FW

t ). Completing with

respect to P (or Q which is equivalent) gives agreement on FWt as required.

Our main interest in these results is to be able to represent a contingentclaim (the random variable Y ) as a stochastic integral with respect to assetprices (the martingale M). Asset price processes will later be specified asthe solution to some stochastic differential equation (SDE). We will see(Theorem 6.38) that the martingale representation property can be related tothe SDE which defines the asset price processes. We will also provide sufficientconditions on the coefficients of the SDE to ensure that the martingalerepresentation theorem holds.In order to establish Theorem 6.38 we will need the following extension of

Corollaries 5.47 and 5.48. The extension replaces the martingale integratorMwith a more general local martingale. Of course, in this context the relevantset of measures to consider is not M(M) but L(M), the set of equivalent localmartingale measures for M . The representation results are then identical.A proof, which is effectively a localization argument based on the resultsestablished so far, can be found in Revuz and Yor (1991).

Theorem 5.50 Let L(M) be the set of equivalent local martingale measuresfor M (defined exactly as in Definition 5.40 with ‘martingale’ replaced by‘local martingale’ throughout). If the martingale M in Corollaries 5.47 and5.48 is replaced by a local martingale and the set M(M) is replaced by L(M)then the conclusions of the corollaries remain valid.

Page 138: Financial derivatives in theory and practice
Page 139: Financial derivatives in theory and practice

6

Stochastic Differential Equations

6.1 INTRODUCTION

An ordinary differential equation (ODE) is a way to define (and generate) afunction x by specifying its local behaviour in terms of its present value (andmore generally past evolution):

dx

dt= b(xt, t),

x0 = ξ.

In general, the functional form x(t) will not be known explicitly but,under suitable regularity conditions on the function b, various existence anduniqueness results can be established.

A stochastic differential equation (SDE) is the stochastic analogue of theODE and its use is similar (although more wide-ranging). In general an SDEmay take the form

dXt = σt dZt

for suitable σ and some semimartingale Z. Similar questions of existenceand uniqueness then arise as in the case of ODEs. We have already metone important example of an SDE such as this, the exponential SDE ofExample 4.36,

dXt = Xt dZt,

for a general semimartingale Z, and we established there that the solution wasunique (more precisely we established pathwise uniqueness, a concept that wewill meet shortly).

In this chapter we shall discuss the special case of SDEs of the form

dXt = b(Xt)dt + σ(Xt)dWt (6.1)

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 140: Financial derivatives in theory and practice

116 Stochastic differential equations

where W is a d-dimensional Brownian motion. Here the process X is n-dimensional and t may be one of its components (so (6.1) automaticallyincludes the time-inhomogeneous case). This is by far the most studied caseand is more than adequate for our purposes. For more general treatmentssee Revuz and Yor (1991), Protter (1990) and Jacod (1979) (in increasinggenerality).By driving the SDE with Brownian motion we are building up more general

processes from a process we understand (a little!); by making (σ, b) dependonly on Xt we create Markovian models. It is an easy exercise to show that ifσ and b are bounded and X satisfies (6.1) then

limh→0

1

hE[Xt+h −Xt|FXt ] = b(Xt)

limh→0

1

hE[(Xt+h −Xt)(Xt+h −Xt)T |FXt ] = σ(Xt)σ(Xt)

T := a(Xt) .

Consequently b is often refered to as the drift of the process X , and the matrixa is its diffusion matrix. Processes such as this are often known collectivelyas diffusion processes.In Section 6.2 we shall discuss precisely what is meant by an SDE, and

in Section 6.4 (after an aside on the canonical set-up in Section 6.3) what ismeant by a solution to an SDE such as (6.1). Section 6.4 also discusses theideas of existence and uniqueness of a solution to an SDE. There are two typesof solution to an SDE, strong and weak, and corresponding to each there arenotions of existence and uniqueness. These are each discussed, along with therelationships between them. All these result are of a general nature. In Section6.5 we establish Ito’s result regarding (strong) existence and uniqueness whenthe SDE (σ, b) obeys (local) Lipschitz conditions. These are the first resultswhich allow us to decide, for a specific SDE, whether existence and uniquenesshold and as such they are of considerable importance in practice. More generalresults can be established in this area but we shall not have need of them.The chapter concludes with a proof of the strong Markov property for asolution to a (locally) Lipschitz SDE in Section 6.6, and a result relating(weak) uniqueness for an SDE to the martingale representation theorem inSection 6.7. It is the latter result that we shall appeal to when discussingcompleteness in Chapter 7.

6.2 FORMAL DEFINITION OF AN SDE

By the SDE (σ, b) we shall mean an equation of the form

dXt = b(Xt)dt+ σ(Xt)dWt,

Page 141: Financial derivatives in theory and practice

An aside on the canonical set-up 117

or, in integral form,

Xt = X0 +t

0

b(Xu)du+t

0

σ(Xu)dWu, (6.2)

where W is a d-dimensional Brownian motion and

the maps b : Rn → Rn and σ : Rn → Rn × Rd are Borel measurable.

The SDE is defined by the pair (σ, b) so we shall refer to it by this label. Themeasurability restrictions on (σ, b) ensure that for any continuous adaptedX the processes b(X·) and σ(X·) are predictable so, subject to boundednessrestrictions discussed in Remark 6.6 below, the right-hand side of (6.2) is astochastic integral as defined in Chapter 4.Note that the underlying probability space (Ω, Ft,F ,P) is not specified as

part of the SDE, nor is the distribution of X0 — they are part of the solution.For future reference we shall denote by | · | the Euclidean metric on the

spaces Rn and Rn × Rd as appropriate.

6.3 AN ASIDE ON THE CANONICAL SET-UP

We met the canonical set-up for a continuous process in Chapter 2 whendiscussing the existence of Brownian motion. It also arises in the study ofSDEs, so we now give a reminder and extension of the discussion there.Let C = C(R+,R) be the set of continuous functions ω : [0,∞) → R and

let us endow it with the metric of uniform convergence, i.e. for ω, ω ∈ C

ρ(ω, ω) =

j=1

1

2

jρj(ω, ω)

1 + ρj(ω, ω)

where

ρj(ω, ω) = sup0≤t≤j

|ω(t)− ω(t)|.

Now we define C := B(C), the Borel σ-algebra on C induced by the metric ρ.Furthermore, for all t ≥ 0 we can define the projection map πt : C→ R via

πt(ω) = ωt ,

for all ω ∈ C. Then the family Ct defined by

Ct := σ(πs : s ≤ t)

defines a filtration on C. (In fact C∞ = C, as shown in Williams (1991).) Welet (Cn, Cnt , Cn) denote the (obvious) n-dimensional version of this space,

Page 142: Financial derivatives in theory and practice

118 Stochastic differential equations

n-dimensional path space. The (. . . ) is used to signify the fact that this spacehas not been augmented.Now suppose that X is some a.s. continuous n-dimensional process on

the complete probability space (Ω, Ft,F ,P). Then X induces a probability

measure, P, on (Cn, Cnt , Cn) given by, for all A ∈ Cn,

P(A) = P(X· ∈ A) .

The stochastic process X defined on the probability space (Cn, Cn, Cnt , P)via

Xt(ω) := ωt

has the same law as X . We denote by (Cn, Cnt , Cn, P) the probability space(Cn, Cnt , Cn) augmented with respect to (C, P), and refer to it as thecanonical set-up corresponding to X .In a similar fashion to the above, if ξ is a random variable defined on

(Ω,F ,P) and taking values in Rm, the canonical set-up corresponding to ξ isthe space (Rm,B,P∗) where, for all A ∈ B(Rm),

P∗(A) = P(ξ ∈ A),

and B is B(Rm) completed with respect to P∗.Combining these two ideas we obtain the following generalization.

Definition 6.1 Let (Ω, Ft,F ,P) be a complete filtered probability spacesupporting a random variable ξ ∈ Rm and some continuous process X takingvalues in Rn. Define the filtration Gt and the σ-algebra G on Rm×Cn via

Gt := B(Rm)⊗ CntG := G∞ .

Then the canonical set-up corresponding to (ξ, X) is the probability space

(Rm × Cn, Gt,G, P), where(i) the measure P is given, for each A ∈ G, by

P(A) = P((ξ, X·) ∈ A) ,

(ii) Gt and G are, respectively, Gt and G augmented with respect to (G, P).Furthermore, (ξ, X), defined, for each (r, x) ∈ Rm × Cn, by ξ(r, x) =

r, X(r, x) = x, has the same law under P as (ξ, X) does under P.

Page 143: Financial derivatives in theory and practice

Weak and strong solutions 119

6.4 WEAK AND STRONG SOLUTIONS

In this section we discuss what is meant by a solution to an SDE. There aretwo basic types of solution, weak and strong, with strong being a special caseof weak. We describe each of these, the associated concepts of existence anduniqueness for an SDE and the relationships between them.

6.4.1 Weak solutions

The broadest notion of a solution is defined as follows. The (redundant)qualifier weak is always included to contrast this general situation with thespecial case of a strong solution.

Definition 6.2 A weak solution of the SDE (σ, b) with initial distribution µis a sextuple (Ω, Ft,F ,P,W,X) such that:(i) (Ω, Ft,F ,P) is a filtered probability space satisfying the usual

conditions;(ii) W is a d-dimensional Brownian motion with respect to the filtration

Ft, and X is a continuous n-dimensional process adapted to F t;(iii) X0 has distribution µ;(iv) for all t ≥ 0,

t

0

|b(Xu)|+ |σ(Xu)|2du <∞ a.s.; (6.3)

(v) for all t ≥ 0,

Xt = X0 +t

0

b(Xu)du+t

0

σ(Xu)dWu a.s.

For reasons which shall become clear later (related to strong solutions) thesextuple (Ω, Ft,F ,P,W,X0) is referred to as the initial data of the solution.The process X is refered to as the solution process.

Remark 6.3: Note that an SDE is specified only by the pair (σ, b), so a solutioncomprises the probability space as well as the processes W and X on thatspace.

Remark 6.4: The filtration Ft need not be the augmentation of σ(X0) ∨FWt . A general filtration Ft can always be replaced by F (W,X)

t but notin general by σ(X0) ∨ FWt . Solutions satisfying the latter property are thesubject of the next section.

Remark 6.5: Note that, because W is a Brownian motion relative to thefiltration Ft and X0 is F0-measurable, it follows immediately that X0 mustbe independent of W .

Page 144: Financial derivatives in theory and practice

120 Stochastic differential equations

Remark 6.6: The finiteness condition (6.3) is required to ensure the integrals(6.2) exist. The condition on σ ensures that σ(X) ∈ Π(W ) (the obviousgeneralization of Π(W ) defined in Section 4.5.1 for the case d = 1) and thecondition on b ensures that the first integral in (6.2) exists as a (pathwise)classical integral.

Remark 6.7: Note that the continuity of X means that conditions (iv) and (v),which as specified hold separately at each t, also hold for all t simultaneously.

What we mean by weak existence for an SDE is the following.

Definition 6.8 We say that weak existence holds for the SDE (σ, b) if, forevery probability measure µ on (Rn,B(Rn)), there exists a weak solution to(σ, b) with initial law µ.

There are two natural notions of uniqueness for weak solutions to an SDE.

Definition 6.9 We say that weak uniqueness, or uniqueness in law, holdsfor the SDE (σ, b) if, given any two weak solutions (Ω, Ft,F ,P,W,X) and(Ω, Ft, F , P, W , X) with the same initial distribution µ, the laws of X and

X agree.

Definition 6.10 Pathwise uniqueness holds for the SDE (σ, b) if, given

any two (weak) solutions (Ω, Ft,F ,P,W,X) and (Ω, Ft,F ,P,W, X), withthe same initial data (Ω,F , Ft,P,W,X0), the processes X and X areindistinguishable,

P(Xt = Xt, for all t) = 1.

Remark 6.11: If the filtrations for the two solutions in the above definition areallowed to differ, we say that strict pathwise uniqueness holds. It can be shownthat this reduces to the apparently weaker notion of pathwise uniqueness asdefined above.

Uniqueness in law is the most obvious notion of uniqueness for weaksolutions of an SDE because the law is the only thing which can be compared ingeneral, since the probability spaces may be different. It is the relevant notionof uniqueness we shall need when we revisit the martingale representationtheorem in Section 6.7. Pathwise uniqueness is only a statement about twosolutions on the same set-up, but is then a more stringent requirement.The relationship between these two notions of uniqueness is not immediately

obvious. However, the following is true.

Theorem 6.12 Pathwise uniqueness implies weak uniqueness.

This fact was first proved by Yamada and Watanabe (1971). Given any twoweak solutions it is clear from the discussion in Section 6.3 that, for either

Page 145: Financial derivatives in theory and practice

Weak and strong solutions 121

of the two solutions, a corresponding weak solution can be defined on thecanonical set-up (for the Brownian motion and the solution process jointly)

(Rd+n,Cd+n, Cd+nt , Cn+d). The proof of Yamada and Watanabe rests onthe fact that any two weak solutions can be put on the same canonical set-up(Cd+2n, Cd+2nt , Cd+2n) by using regular conditional probabilities. Roughlyspeaking, the idea is as follows. Starting from the two weak solution processesX and X , and the corresponding Brownian motions W and W , constructthe regular conditional probabilities P(X·|W·) and P(X·|W·). Now definea probability measure on the space Cd+2n so that the first d componentshave the law of a Brownian motion (using Wiener measure), the next ncomponents have the law of X (using the probabilities P(X·|W·)) and thefinal n components have the law of X (using the probabilities P(X·|W·)). Inso doing we have succeeded in putting both solutions on the same space and sopathwise uniqueness now implies that they must be indistinguishable, hencemust have the same law.

Remark 6.13: Note in the above discussion that when discussing a weaksolution the process (W,X) can be transferred to the canonical space

(Cd+n, Cd+nt , Cd+n) but not necessarily (Cd, Cdt , Cd) which is canonical forthe Brownian motion W . This contrasts with the discussion below for strongsolutions to an SDE.

6.4.2 Strong solutions

For any weak solution to the SDE (σ, b), the process (W,X) is adapted to

the filtration F (W,X)t . In many situations an SDE is used as a way tospecify a model for some physical quantity which is subject to uncertainty,in our case asset prices, and for this a general weak solution is sufficient. Forsome applications, however, both the driving Brownian motion W and theprocess X have independent significance. In signal processing, for example,the Brownian motion will be an input noise and X will be the output. In such

situations it is important that, at the very least, X be adapted to F (X0,W )t

since knowledge of W and X0 should be all that is required to have completeknowledge of X . This leads to the concept of a strong solution introducedbelow. An even more direct link between the input W and the output Xwould be the existence of a pathwise relationship between them,

X. = F (X0,W.) (6.4)

for some mapping F . We shall soon provide conditions under which such amap exists.

Definition 6.14 A strong solution of the SDE (σ, b) is a (weak) solution(Ω, Ft,F ,P,W,X) with the additional property that the solution processX

Page 146: Financial derivatives in theory and practice

122 Stochastic differential equations

is adapted to the filtration F (X0,W)t , the filtration σ(X0) ∨ FW

t augmentedto include all (F , P) null sets.

Remark 6.15: Our definition of strong is identical to that of Revuz and Yor(1991) but different from that of Rogers and Williams (1987) who requiremore of their strong solutions. Where we differ from Revuz and Yor (1991) isin the definition of weak; they define a weak solution to be one which is weakin our sense but not strong.

It may at first appear surprising that situations arise where the solutionprocess X is not adapted to the filtration F (X0,W). The following famousexample of Tanaka shows that they do.

Example 6.16 (Tanaka) Consider the SDE

dXt = sgn(Xt)dWt. (6.5)

To see that weak existence holds for this SDE take any probability space(Ω, Ft,F , P) supporting a Brownian motion B and an independent randomvariable ξ having law µ. Now define

Xt = ξ + Bt,

Wt =∫ t

0sgn(Xt)dXt. (6.6)

Equation (6.6) defines a local martingale with (dWt)2 = (dBt)2 = dt, so W is aBrownian motion by Levy’s theorem (Theorem 4.38). Furthermore, X clearlysatisfies (6.5) so weak existence holds.

Weak uniqueness also holds since any (weak) solution to (6.5) is Brownianmotion, again by Levy’s theorem, but pathwise uniqueness does not hold sinceXt = ξ − Bt is another weak solution. Finally, note the plausible fact (from(6.6)), proved in Rogers and Williams (1987), that FW

t = F|X|t , so X is not

adapted to FWt and the solution is not strong. (To prove this we would need

to study the local time of X, which is beyond the scope of this book.)Motivated by the fact that for a strong solution the ‘input’ process W is

often specified, as is the probability space, the notion of strong existence foran SDE is stricter than the obvious counterpart of Definition 6.8.

Definition 6.17 We say strong existence holds for the SDE (σ, b) if,for all initial data (Ω, Ft,F , P, W, ξ), there exists some X such that(Ω, Ft,F , P, W, X) is a strong solution to (σ, b) (with X0 = ξ a.s. of course).

Remark 6.18: It is, of course, the use of (Ω, Ft,F , P, W, ξ) as an input thatmotivated the label ‘initial data’, one which is less natural in the context of aweak solution.

Page 147: Financial derivatives in theory and practice

Weak and strong solutions 123

Definition 6.19 Strong uniqueness holds for (σ, b) if, given any initialdata (Ω, Ft,F ,P,W, ξ) and any two strong solutions with this initial data,(Ω, Ft,F ,P,W,X) and (Ω, Ft,F ,P,W, X), the processes X and X areindistinguishable,

P(Xt = Xt, for all t) = 1.

Returning to the question of existence or otherwise of the function F in(6.4), this is not true in general. However, when strong existence holds thestory is altogether clearer, essentially because a solution must exist on thecanonical set-up corresponding to the input Brownian motion and the initialcondition.

Theorem 6.20 Suppose strong existence holds for the SDE (σ, b).Fix some probability distribution µ on Rn and consider the initial data(Ω, Gt,G, P, W , ξ) where Gt and G are as in Definition 6.1 and

Ω = Rn × Cd,P = µ×Qd,

ξ(ω) := ξ(r, x) = r,

W (ω) := W (r, x) = w,

where Qd is Wiener measure on (Cd, Cd). By strong existence, for this initialdata there exists some solution process X such that

Xt = ξ +t

0

b(s, X·) ds+t

0

σ(s, X·) dWs

satisfying the finiteness condition (6.3).It now follows that there exists a function

Fµ : Rn × Cd → Cn

such that, for all A ∈ Cnt ,F−1µ (A) ∈ Gt

and such thatX·(ω) = Fµ(ξ(ω), W·(ω))

for almost all ω.Furthermore, given any initial data (Ω, Ft,F ,P,W, ξ) with ξ having

distribution µ, the process

X·(ω) = Fµ(ξ(ω),W·(ω))

satisfies

Xt = ξ +t

0

b(s,X·) ds+t

0

σ(s,X·) dWs .

Page 148: Financial derivatives in theory and practice

124 Stochastic differential equations

Remark 6.21: It is the existence of a solution on the canonical set-up for(ξ, W ) (compare with Remark 6.13) which ensures existence of the map Fµ,and this in turn ensures there exists a solution on any set-up. Thus strongexistence holds if and only if there is a solution on the canonical set-up for(ξ, W ) for all random variables ξ.

A proof of this can be found in Rogers and Williams (1987). Note, of course,that the function Fµ is not uniquely defined because changing its definitionon a P-null set makes no difference. If strong uniqueness holds for the SDEthen any two such functions satisfying the conditions of the theorem will haveP

(F

(1)µ (X0(ω), W (ω)) = F

(2)µ (X0(ω), W (ω))

)= 1.

Remark 6.22: In general the function Fµ depends on the initial distributionµ. Rogers and Williams (1987) discuss this point further in the context ofstochastic flows.

6.4.3 Tying together strong and weak

We have introduced two types of solution to an SDE, two notions of existenceand three of uniqueness. In this section we catalogue the relationship betweenthem.

Clearly any strong solution is also a weak solution and strong existencecertainly implies weak existence. On the uniqueness question we have alreadyseen that pathwise uniqueness implies uniqueness in law and it is immediatethat pathwise uniqueness also implies strong uniqueness. There are no othergeneral relationships between any two of the concepts already introduced.However, when these are combined we get the following.

Theorem 6.23 (Yamada and Watanabe) For an SDE (σ, b), weakexistence plus pathwise uniqueness implies strong existence plus stronguniqueness.

This is a powerful and surprising result. Weak existence states only thatfor each initial distribution µ there is some probability space (Ω, Ft,F , P)on which (W, X) can be defined, whereas strong existence states that thereexists a solution for all (Ω, Ft,F , P) which support a Brownian motion Wand a random variable ξ with distribution µ.

Remark 6.24: We pointed out in Remark 6.15 that our concept of a strongsolution is different from that of Rogers and Williams (1987). In fact, theirdefinition is slightly stronger and with their definition Theorem 6.23 can bestrengthened to an equivalence, i.e. strong existence and strong uniquenessalso implies weak existence and pathwise uniqueness. Rogers and Williams(1987) contains the details.

Page 149: Financial derivatives in theory and practice

Establishing existence and uniqueness: Ito theory 125

6.5 ESTABLISHING EXISTENCE AND UNIQUENESS:ITO THEORY

The general theory in the previous section defines and relates the differentconcepts of existence and uniqueness for solutions to an SDE but gives nomethod for establishing whether or not these properties hold for a givenSDE. In this section we reproduce Ito’s theory for strong solutions to (locally)Lipschitz SDEs. This theory shows that if the SDE (σ, b) is (locally) Lipschitzand satisfies a linear growth constraint then strong existence and pathwiseuniqueness hold. This was the first work done on SDEs and is still the mostimportant technique for generating strong solutions.

In finance applications the existence of a strong solution is not necessarysince in this context the use of an SDE is to generate a model for the assetprices and there is no interpretation given to the driving Brownian motionas an input. It is convenient, however, to work with a locally Lipschitz SDEbecause existence and uniqueness results are relatively easy to prove, as is thestrong Markov property. In this case, since strong existence and pathwiseuniqueness hold, all the results of Section 6.4 apply. In particular, thesolution is also unique in law (Theorem 6.12) and this yields the martingalerepresentation result (Theorem 6.38) which we shall need when we discusscontinuous time finance theory in Chapter 7.

Two topics that we shall not discuss in this book are the questions ofwhen weak existence and the strong Markov property hold for an SDE. Asmentioned above, both hold for (locally) Lipschitz SDEs, but this is a veryparticular and restrictive case.

The primary tool for the study of weak existence is Girsanov’s theorem.The basic idea of this approach is to start with some (relatively simple) SDE(σ, b) for which we know there is a weak solution (Ω, Ft,F , P, W, X). If wenow change measure to some other measure Q ∼ P then under Q the sextuple(Ω, Ft,F , Q, W , X), W here being the Brownian motion of Corollary 5.23,will be a solution to a different SDE (σ, b). In practice, of course, the idea isused in reverse. Given some SDE (σ, b), it is reduced to a simpler one usinga change of measure and the solution to this simpler SDE then provides asolution to (σ, b).

Another technique for establishing weak existence, one which is alsoimportant when studying the strong Markov property, is the martingaleproblem. This is a very important topic but one which we shall not considerfurther. The interested reader is referred to Revuz and Yor (1991), Rogersand Williams (1987) or Strook and Varadhan (1979).

Page 150: Financial derivatives in theory and practice

126 Stochastic differential equations

6.5.1 Picard—Lindelof iteration and ODEs

In the theory of ODEs it is known that, for any constant ξ ∈ Rn, the equation

dxt = b(xt)dt,

x0 = ξ,(6.7)

has a unique solution if the function b is globally Lipschitz, i.e. if there existssome K > 0 such that, for all x, y ∈ Rn,

|b(x)− b(y)| < K|x− y|. (6.8)

Without this condition (or some relaxation of it) it is difficult to say much ingeneral. The way this result is proved is by use of a Picard—Lindelof iterationas follows. Define a sequence of functions x(k) by

x(0)t = ξ, for all t

x(k+1) = Gx(k) ,where

(Gx)t = ξ +t

0

b(xu)du. (6.9)

We fix some T and prove existence and uniqueness over t ∈ [0, T ] which isclearly enough. We begin by establishing an inequality which we shall use toprove that the sequence x(k) converges to a unique limit satisfying (6.7).The inequality we establish is an L2 inequality, whereas it is more usualfor ODEs to use a simpler L1 inequality. We use the L2 result because thiscan be extended to a stochastic equivalent which is needed when consideringSDEs. Defining z∗t := sup0≤u≤t |zu|, it follows from (6.9), the Cauchy—Schwarzinequality and (6.8) respectively that, for all 0 ≤ t ≤ T ,

(Gx − Gy)∗2t = sup0≤s≤t

s

0

(b(xu)− b(yu))du2

≤t

0

|b(xu)− b(yu)|du2

≤ tt

0

|b(xu)− b(yu)|2du

≤ K2Tt

0

(x− y)∗2u du .

By induction, therefore, for bounded, continuous x, y,

(Gkx−Gky)∗2t ≤ (K2T )ktk−1

(k − 1)!T

0

(x−y)∗2u du ≤(KT )2k

(k − 1)! (x−y)∗2T . (6.10)

Page 151: Financial derivatives in theory and practice

Establishing existence and uniqueness: Ito theory 127

We now establish existence of a solution to (6.7). Given any m, k > 0,

(x(m+k) − x(k))∗2T =

m−1

j=0

(x(k+j+1) − x(k+j))∗2T

=m−1

j=0

(Gk+jx(1) − Gk+jx(0))∗2T

≤m−1

j=0

(KT )2(k+j)

(k + j − 1)! (x(1) − x(0))∗2T

≤∞

j=0

(KT )2(k+j)

(−1)!j! (x(1) − x(0))∗2T

≤ exp (KT )2 (KT )2k

(k − 1)! (x(1) − x(0))∗2T

→ 0,

as k → ∞, so the sequence of functions x(k) is Cauchy for the supremumnorm on [0, T ] and thus converges to some continuous limit function x whichsatisfies

limk→∞

(x− x(k))∗2T = 0. (6.11)

To show that this limit function x solves (6.7), observe that (6.10) impliesthat

limk→∞

(Gx − x(k))∗2T = limk→∞

(Gx − Gx(k−1))∗2T≤ (KT )2(x− x(k−1))∗2T → 0,

so x = Gx as required.To establish uniqueness, note that any solution to (6.7) satisfies x = G kx

for all k > 0. Thus it follows from (6.10) that if x and y are two solutions,(x− y)∗2T = 0, i.e. x ≡ y and any solution to (6.7) is unique.

6.5.2 A technical lemma

Ito’s idea was to apply a similar Picard—Lindelof procedure to an SDE toobtain a pathwise unique solution. The key to the proof for the ODE wasinequality (6.10) for which a stochastic analogue must be found. This is thecontent of this section.

Lemma 6.25 Let (Ω, Ft,F ,P) be a filtered probability space supportinga d-dimensional Brownian motion W and suppose the n-dimensional processZ is defined by

Zt =t

0

budu+t

0

σudWu

Page 152: Financial derivatives in theory and practice

128 Stochastic differential equations

where b and σ are Ft-predictable processes. Then, for all t ≥ 0,

E[Z∗2t ] ≤ 2(4 + t)Et

0

(|bu|2 + |σu|2)du . (6.12)

Now let ξ be an F0-measurable random variable and define the operator G(or Gξ when we wish to emphasize the role of ξ) by

(GX)t = ξ +t

0

b(Xu)du+t

0

σ(Xu)dWu,

where now σ and b are Borel measurable functions on Rn satisfying theLipschitz conditions

|b(x)− b(y)| ≤ K|x− y|,|σ(x) − σ(y)| ≤ K|x− y| , (6.13)

and X is a continuous adapted process taking values in Rn. Then, given anyfixed T > 0 and for all 0 ≤ t ≤ T, k > 0 and any continuous adapted processesX and Y ,

E[(GkX − GkY )∗2t ] ≤ CkTtk−1

(k − 1)!ET

0

(X − Y )∗2u du

≤ CkTT k

(k − 1)!E (X − Y )∗2T , (6.14)

where CT = 4K2(4 + T ).

Proof: Write Zt =Mt +At where

Mt =t

0

σudWu,

At =t

0

budu.

Trivially, for each t ≥ 0,

|Zt|2 ≤ 2(|Mt|2 + |At|2),

and further

Z∗2t ≤ 2(M ∗2t +A∗2t ),

whence

E[Z∗2t ] ≤ 2E[M ∗2t ] + 2E[A∗2t ].

Page 153: Financial derivatives in theory and practice

Establishing existence and uniqueness: Ito theory 129

From Doob’s L2 inequality for a continuous local martingale in n dimensions(Exercise 6.26) we have

E[M∗2t ] ≤ 4Et

0

σudWu

2

= 4Et

0

|σu|2du ,

and by Jensen’s inequality,

E[A∗2t ] ≤ Et

0

|bu|du2

≤ E tt

0

|bu|2du .

Inequality (6.12) now follows immediately.We now prove (6.14). The second inequality is trivial. We prove the first by

induction on k as follows. Apply (6.12) to Zt = (GU − GV )t, where U and Vare continuous adapted processes, to obtain

E (GU − GV )∗2t ≤ 2(4 + t) Et

0

|b(Uu)− b(Vu)|2du

+ Et

0

|σ(Uu)− σ(Vu)|2du

≤ CTEt

0

|Uu − Vu|2du

= CT

t

0

E |Uu − Vu|2 du, (6.15)

the last inequality following from the Lipschitz conditions (6.13). SettingU = X , V = Y in (6.15) yields (6.14) when k = 1. For general k, set U = G kX ,V = GkY and use the inductive hypothesis to obtain

E (Gk+1X − Gk+1Y )∗2t ≤ CTt

0

E |GkXu − GkYu|2 du

≤ CTt

0

E (GkX − GkY )∗2u du

≤ CTt

0

CkTvk−1

(k − 1)!ET

0

(X − Y )∗2u du dv

= Ck+1T

tk

k!E

T

0

(X − Y )∗2u du ,

and we are done.

Exercise 6.26: Prove that Doob’s L2 inequality for one-dimensional cadlagmartingales generalizes to continuous n-dimensional local martingales.

Page 154: Financial derivatives in theory and practice

130 Stochastic differential equations

6.5.3 Existence and uniqueness for Lipschitz coefficients

Armed with Lemma 6.25, we are now ready to prove Ito’s existence anduniqueness results for SDEs. The first theorem requires global Lipschitzrestrictions on the coefficients of the SDE. We relax these later to localLipschitz conditions in the presence of a linear growth constraint.

Theorem 6.27 Suppose the (Borel measurable) functions b and σ satisfythe global Lipschitz growth conditions

|b(x)− b(y)| ≤ K|x− y||σ(x) − σ(y)| ≤ K|x− y|

for all x, y ∈ Rn and some K > 0. Then strong existence and pathwiseuniqueness hold for the SDE (σ, b).

Proof: Let (Ω, Ft,F ,P) be a probability space supporting a d-dimensionalBrownian motion W and some F0-measurable random variable ξ. We must

show there is a strong solution to (σ, b) (adapted to the filtration F (ξ,W )t ) with

initial condition ξ. We construct such a solution in the case when ξ is boundedby using a stochastic anologue of Picard—Lindelof iteration. This is sufficientsince the general result then follows from this special case by truncation.Let G be the operator of Lemma 6.25 and define

X(0)t = ξ, for all t

X(k+1) = GX(k) = Gk+1X(0).

We show that for any fixed T > 0, X (k) converges a.s., uniformly on the

interval [0, T ], to a continuous F (ξ,W )t -adapted process X which satisfies

X = GX . Since T is arbitrary this is sufficient to establish that X is a strongsolution to (σ, b).So fix T > 0 and note that, for t ∈ [0, T ],

(X(1) −X(0))t =t

0

b(ξ)du+t

0

σ(ξ)dWu = tb(ξ) + σ(ξ)(Wt −W0) ,

and thus, noting that ξ and W are independent,

E (X(1) −X(0))∗2t ≤ 2t2E[b2(ξ)] + 2E[σ2(ξ)]E[W ∗2t ]≤ 2T 2E[b2(ξ)] + 8TE[σ2(ξ)]=: B <∞ .

Now apply Lemma 6.25 to obtain

E (X(k+1) −X(k))∗2T = E (GkX(1) − GkX(0))∗2T

≤ CkTT k

(k − 1)!B ,

Page 155: Financial derivatives in theory and practice

Establishing existence and uniqueness: Ito theory 131

and thus, by Chebyshev’s inequality,

P sup0≤t≤T

(X(k+1)t −X(k)

t ) > 2−k ≤ 4kCkTT k

(k − 1)!B. (6.16)

The first Borel—Cantelli lemma can now be applied to (6.16) to conclude that

P sup0≤t≤T

(X(k+1)t −X(k)

t ) > 2−k infinitely often = 0 .

Let Ω∗ be the set, having probability one, given by

Ω∗ := ω : sup0≤t≤T

|X(k+1)t (ω)−X(k)

t (ω)| > 2−k finitely often .

For ω ∈ Ω∗, we see that X (k)· (ω) is a Cauchy sequence in the space

C([0, T ],Rn) (continuous functions from [0, T ] to Rn) endowed with thesupremum norm, x∗T := sup0≤t≤T |xt|, and hence converges to a limit X·(ω).That is, X (k) converges a.s. uniformly on [0, T ] to the continuous limit

process X . This limit process is F (ξ,W )t -adapted since each X (k)

t is F (ξ,W )t -

measurable and the a.s. limit of a measurable random variable is measurable.It can now also be concluded, using (6.14) and an argument similar to the oneused to establish (6.11), that

limk→∞

E (X −X(k))∗2T = 0 .

That X solves X = GX follows since, again from Lemma 6.25,

E (X − GX)∗2T ≤ 2E (X −X (k))∗2T + 2E (X(k) − GX)∗2T≤ 2E (X −X (k))∗2T + 2CTE (X(k−1) −X)∗2T→ 0

as k →∞, so X (k) also converges to GX .We conclude by establishing pathwise uniqueness. Let X and Y be two

solution processes having the same initial data. Clearly, for any stopping timeτ , (GkXτ )T∧τ = (GkX)T∧τ , and similarly for Y , so Lemma 6.25 applied toXτ and Y τ yields

E (Xτ − Y τ )∗2T = E (X − Y )∗2T∧τ= E (GkX − GkY )∗2T∧τ= E (GkXτ − GkY τ )∗2T∧τ≤ E (GkXτ − GkY τ )∗2T

≤ CkTT k

(k − 1)!E (Xτ − Y τ )∗2T . (6.17)

Taking τ = Tk := inft > 0 : |Xt| + |Yt| > k in (6.17), the right-hand sideof (6.17) is bounded, hence converges to zero as k → ∞. Thus X Tk = Y Tk

a.s. on [0, T ]. Letting k → ∞, and noting that Tk ↑ T (since both X and Yare a.s. finite), now establishes that X = Y a.s. on [0, T ]. Since T is arbitrary,pathwise uniqueness holds.

Page 156: Financial derivatives in theory and practice

132 Stochastic differential equations

Corollary 6.28 Suppose the (Borel measurable) functions b and σ are locallyLipschitz, meaning that for each N > 0 there exists some constant KN > 0such that

|b(x)− b(y)| ≤ KN |x− y|,|σ(x) − σ(y)| ≤ KN |x− y|,

(6.18)

whenever |x|, |y| ≤ N . Suppose further that, for some K > 0, the followinglinear growth conditions also hold:

|b(x)| ≤ K(1 + |x|),|σ(x)| ≤ K(1 + |x|). (6.19)

Then strong existence and pathwise uniqueness hold for the SDE (σ, b).

Remark 6.29: As is the case also for ODEs, it is not possible to relax thestatement of this theorem by removing the linear growth constraints.

Proof: Let

bN(x) = b(x)1|x|≤N + b Nx

|x| 1|x|>N,

σN (x) = σ(x)1|x|≤N + σ Nx

|x| 1|x|>N .

For eachN the SDE (σN , bN) is globally Lipschitz so admits a pathwise uniquestrong solution XN by Theorem 6.27. Now define

τN := inft > 0 : |XNt | > N,

Xt := limN→∞

XNt∧τN . (6.20)

Note that, for all M ≥ N , (XMt∧τN −XN

t∧τN ) = 0 since (σN , bN ) and (σM , bM )

agree for |x| ≤ N . Hence the limit in (6.20) will exist a.s. if supN τN = +∞a.s., and will satisfy the SDE (σ, b). It will also be the unique such processsince each XN is unique.We suppose that |X0| is bounded by some constant, A say, and consider

only a finite interval [0, T ]. The general case of unbounded X0 follows bythe usual truncation argument. Defining ENt := E[(XN − X0)∗2t ], it is clearthat EN is increasing and finite for all t ∈ [0, T ], thus differentiable Lebesguealmost everywhere. Furthermore, applying (6.12) and (6.19),

ENt ≤ 4(4 + t)E K2t

0

(1 + |XNu |)2du

≤ 8(4 + t)E K2t

0

(1 + |X0|)2 + (XN −X0)∗2u du

≤ 8(4 + T )K2 (A+ 1)2T +t

0

ENu du

=: AT +BT

t

0

ENu du . (6.21)

Page 157: Financial derivatives in theory and practice

Establishing existence and uniqueness: Ito theory 133

Gronwall’s lemma (Lemma 6.30 below) applied to (6.21) now implies thatENt ≤ AT eBT t, thus

P(τN < t) ≤ ENt

N2≤ AT e

BTT

N2→ 0

as N →∞.Gronwall’s lemma, just used, is a simple but very useful result which we

call on again later.

Lemma 6.30 Let f be some Lebesgue integrable real-valued function andsuppose there exist some constants A and C such that

f(t) ≤ A+ Ct

0

f(u) du (6.22)

for all t ∈ [0, T ]. Then, for Lebesgue almost all t ∈ [0, T ],f(t) ≤ AeCt .

Proof: The function

h(t) := e−Ctt

0

f(u)du, t ≥ 0,

starts at h(0) = 0 and satisfies h (t) ≤ Ae−Ct by (6.22). This implies thatt

0

f(u)du ≤ A

CeCt − 1 .

Substitution back into (6.22) establishes the result.Note that the SDE we have considered is a diffusion SDE but that time

t could be one of the components so we already have results for the time-inhomogeneous case. The following theorem, one we shall appeal to later,explicitly identifies and separates out the time component.

Theorem 6.31 Suppose the (Borel measurable) functions

b : Rn × R+ → Rn,

σ : Rn × R+ → Rn × Rd

satisfy:

(i) the local Lipschitz growth conditions, i.e. for each N > 0,

|b(x, t) − b(y, t)| ≤ KN |x− y|,|σ(x, t) − σ(y, t)| ≤ KN |x− y|,|b(x, s)− b(x, t)| ≤ KN |t− s|,|σ(x, s) − σ(x, t)| ≤ KN |t− s|,

(6.23)

Page 158: Financial derivatives in theory and practice

134 Stochastic differential equations

whenever |x|, |y| ≤ N , s ≤ t ≤ N , and some KN > 0;(ii) the linear growth conditions

|b(x, t)| ≤ K(1 + |x|),|σ(x, t)| ≤ K(1 + |x|), (6.24)

for some K > 0.

Then strong existence and pathwise uniqueness hold for the SDE (σ, b).

6.6 STRONG MARKOV PROPERTY

We discussed the strong Markov property in Section 2.4, where we proved thatBrownian motion is strong Markov. Here we show that the solution process ofa locally Lipschitz SDE is also strong Markov, a property essentially inheritedfrom the strong Markov property of the driving Brownian motion. Recall thatthe basic idea behind the strong Markov property is that a process X is strongMarkov if, given any finite stopping time τ , the evolution of X beyond τ isdependent on the evolution prior to τ only through its value at τ . The formaldefinition is as follows.

Definition 6.32 Let (Ω, Ft,F ,P) be a filtered probability spacesupporting an Rn-valued process adapted to Ft. We say that X is a strongMarkov process if, given any a.s. finite Ft stopping time τ , any Γ ∈ B(Rn),and any t ≥ 0,

P(Xτ+t ∈ Γ|Fτ ) = P(Xτ+t ∈ Γ|Xτ ) a.s. (6.25)

From (6.25) we can immediately conclude, using the tower property, thatgiven any t1 < . . . < tk, Γi ∈ B(Rn), i = 1, . . . , k,

P(Xτ+ti ∈ Γi, i = 1, . . . , k|Fτ ) = P(Xτ+ti ∈ Γi, i = 1, . . . , k|Xτ ) a.s.

Further, as shown in Karatzas and Shreve (1991, p. 82),

P(Xτ+· ∈ Γ|Fτ ) = P(Xτ+· ∈ Γ|Xτ ) a.s.

for all Γ ∈ B((Rn)[0,∞)) (the Borel σ-algebra on the space of functions fromR+ to Rn. So (6.25) is a much stronger result than it may at first appear and isprecisely the property that we described earlier. An equivalent formulation to(6.25), the one we shall appeal to when verifying the strong Markov property,is the following standard result (see, for example, Karatzas and Shreve (1991)).

Page 159: Financial derivatives in theory and practice

Strong Markov property 135

Theorem 6.33 The process X is strong Markov if and only if, for all a.s.finite Ft stopping times τ and all t ≥ 0,

E[f(Xτ+t)|Fτ ] = E[f(Xτ+t)|Xτ ] (6.26)

for all bounded continuous f .

As mentioned earlier, the strong Markov property of the solution process fora locally Lipschitz SDE (σ, b) is essentially inherited from the strong Markovproperty of the driving Brownian motion. Theorem 6.36, the main result ofthis section, establishes this fact. In order to prove this result we first needtwo lemmas which show that the solution process for a globally Lipschitz SDEis well behaved as a ‘function’ of the initial distribution.

Lemma 6.34 Let X and Y be two solution processes, on some probabilityspace (Ω, Ft,F ,P), for the SDE (σ, b) and suppose that σ and b are globallyLipschitz (i.e. satisfy (6.13)). Then, for all T > 0 and all t ∈ [0, T ],

E (Xt − Yt)∗2 ≤ DTE (X0 − Y0)2

whereDT = 2 exp 8K

2(4 + T )T .

Proof: DefineZt = Xt − Yt,Zt = (Xt −X0)− (Yt − Y0),

and note thatZ∗2t ≤ 2|X0 − Y0|2 + 2Z∗2t . (6.27)

Since X and Y are solution processes it follows that X = GX0X, Y = GY0Y ,so applying (6.12) and the global Lipschitz conditions yields

E[Z∗2t ] ≤ 2(4 + t)Et

0

|b(Xu)− b(Yu)|2 + |σ(Xu)− σ(Yu)|2 du

≤ 4(4 + t)K2t

0

E[Z∗2u ]du . (6.28)

Defining ht := E[Z∗2t ], it follows from (6.27) and (6.28) that

ht ≤ 2E |X0 − Y0|2 + 2E[Z∗2t ]

≤ 2E |X0 − Y0|2 + 8(4 + t)K2t

0

hudu .

The result now follows from Gronwall’s lemma (Lemma 6.30).

Page 160: Financial derivatives in theory and practice

136 Stochastic differential equations

Lemma 6.35 Let ξ be some F0-measurable random variable and suppose the(Borel measurable) functions σ and b are globally Lipschitz, i.e. satisfy (6.13).Now suppose that (Ω, Ft,F ,P,W,Xξ) is a (strong) solution to the SDE(σ, b) with X0 = ξ a.s. Then, for all bounded continuous functions f : Rn → R,

E[f(Xξt )|F0] = v(ξ, t)

where, for all x ∈ Rn, the function v (which is continuous in x) is given by

v(x, t) := E[f(Xxt )] .

Proof: The solution process X ξ solves

Xξt = ξ +

t

0

b(Xu) du+t

0

σ(Xu) dWu . (6.29)

Let f : Rn → R be some bounded continuous function. If ξ = x a.s., for somex ∈ Rn, it suffices to prove that

E[f(Xxt )|F0] = E[f(Xx

t )] a.s. (6.30)

This is immediate since x is a constant and the Brownian motion in (6.29) isindependent of F0.If ξ can take only countably many values ξj : j ≥ 1 the result follows

from (6.30) by summation (note that ξ is F0-measurable),

E[f(Xξt )|F0] =

j=1

1ξ=ξjE[f(Xξjt )|F0]

=

j=1

1ξ=ξjv(ξj , t)

= v(ξ, t) .

The general case now follows by approximation of ξ by a random variabletaking at most countably many values. Given a general ξ, let ξk be somesequence of random variables converging to ξ in L2 (for example, each ξk couldbe the random variable ξ rounded in each component to the nearest 1/k). It

follows from Lemma 6.34 that, for each t ≥ 0, X ξkt → Xξ

t in L2, and this inturn implies that f(Xξk

t )→ f(Xξt ) in L2. Consequently,

E E[f(Xξkt )|F0]− E[f(Xξ

t )|F0]2

= E E[f(Xξkt )− f(Xξ

t )|F0]2

≤ E f(Xξkt )− f(Xξ

t )2 → 0

Page 161: Financial derivatives in theory and practice

Strong Markov property 137

as k →∞, so E[f(Xξkt )|F0]→ E[f(Xξ

t )|F0] in L2. But L2 convergence impliesa.s. convergence along a fast subsequence so, given any ξk such that ξk → ξin L2, there is some subsequence ξk along which

E[f(Xξkt )|F0]→ E[f(Xξ

t )|F0] a.s. (6.31)

An immediate consequence of this last observation, taking ξ to be somearbitrary constant and ξk to be an arbitrary sequence of constantsconverging to ξ, is that the function v(x, t) is continuous in x. This continuityalong with a second application of (6.31) now completes the proof, since givenan arbitrary random variable ξ there is a fast subsequence ξk along which(6.31) holds and thus, for almost every ω,

E[f(Xξt )|F0](ω) = lim

k →∞E[f(Xξk

t )|F0](ω)= lim

k →∞v(ξk (ω), t) = v(ξ(ω), t) .

Armed with the regularity result established in Lemma 6.35, we are nowable to establish the strong Markov property.

Theorem 6.36 Suppose the SDE (σ, b) is globally Lipschitz and let(Ω, Ft,F ,P,W,X) be some solution. Then the solution process X is strongMarkov, i.e. (6.26) holds for all bounded continuous f and all a.s. finite F tstopping times τ .

Proof: Given any Ft stopping time τ < ∞ a.s., we have, for almost everyω and all t ≥ 0,

Xτ+t(ω) = X0(ω) +τ(ω)+t

0

b(Xu(ω)) du+τ(ω)+t

0

σ(Xu(ω)) dWu(ω)

= Xτ(ω)(ω) +τ(ω)+t

τ(ω)

b(Xu(ω)) du+τ(ω)+t

τ(ω)

σ(Xu(ω)) dWu(ω)

= Xτ +t

0

b(Xτ+u) du+t

0

σ(Xτ+u) dWu (6.32)

where Wt = Wτ+t −Wτ and in the last line we have suppressed ω in thenotation. Note, by the strong Markov property of Brownian motion (Theorem

2.15), that W is an Fτ+t : t ≥ 0 Brownian motion independent of Fτ . Thus(6.32) shows that the process X· := Xτ+· is a strong solution of the SDE (σ, b)for the initial data (Ω, Ft,F ,P, W,Xτ ) where Ft = Fτ+t. It follows fromLemma 6.35 that

E[f(Xτ+t)|Fτ ] = E[f(Xt)|F0] = v(X0, t) = v(Xτ , t) . (6.33)

Page 162: Financial derivatives in theory and practice

138 Stochastic differential equations

Finally, observe that the right-hand side of (6.33) is σ(Xτ )-measurable, henceso is the left. It follows that

E[f(Xτ+t)|Fτ ] = E[f(Xτ+t)|Xτ ] ,

which establishes (6.26) as required.Our final result in this section extends Theorem 6.36 to the case of locally

Lipschitz coefficients subject to a linear growth constraint. This localizationargument follows familiar lines.

Corollary 6.37 Theorem 6.36 remains valid if the global Lipschitzrequirement on the SDE is relaxed to the local Lipschitz conditions (6.18)combined with the linear growth condition (6.19).

Proof: Let X be a solution process for the SDE (σ, b) and define SN = inft >0 : |Xt| > N. Denote by XN the solution to the globally Lipschitz SDE(σN , bN) defined in the proof of Corollary 6.28. By Theorem 6.36, XN isstrong Markov and thus, given any a.s. finite stopping time τ and boundedcontinuous f ,

E[f(XNτ+t)|Fτ ] = vN (XN

τ , t) a.s.

wherevN (x, t) := E[f(X(N,x)

t )] ,

X(N,x) being the process with initial condition XN0 = x a.s. Noting that X(ω)

and XN (ω) agree for t ≤ SN (ω), and defining v(x, t) as in Lemma 6.35, wefind

E[f(Xτ+t)|Fτ ]− v(Xτ , t) = E f(Xτ+t)− f(XNτ+t) |Fτ

− v(Xτ , t)− vN (XNτ , t)

≤ E f(Xτ+t)− f(XNτ+t) |Fτ

+ v(Xτ , t)− vN (Xτ , t)

+ vN (Xτ , t)− vN (XNτ , t)

≤ 2BE[1τ+t>SN|Fτ ] + v(Xτ , t)− vN (Xτ , t)

+ 2B1τ>SN , (6.34)

where B is some constant bounding f . Letting N →∞, the third term on theright-hand side of (6.34) converges to zero a.s., as does the first (applying thedominated convergence theorem). Given any x ∈ Rn,

|v(x, t) − vN (x, t)| = E[f(Xxt )− f(X(N,x)

t )] ≤ 2BP(SN < t)→ 0

Page 163: Financial derivatives in theory and practice

Martingale representation revisited 139

as N → ∞. Hence the second term on the right-hand side of (6.34) alsoconverges to zero a.s. and this establishes the result.

6.7 MARTINGALE REPRESENTATION REVISITED

We established in Corollary 5.47 and Theorem 5.50 that uniqueness of a (local)martingale measure implied martingale representation. This in turn can berelated to the uniqueness of the solution to an SDE which the underlyingmartingale satisfies. Theorem 5.49, the martingale representation theorem forBrownian motion, was a special case of this result.

Theorem 6.38 Suppose that weak uniqueness holds for the SDE (σ, b).Then, given any solution (Ω, Ft,F ,P,W,X) of (σ, b), any FXt localmartingale N can be represented in the form

Nt = N0 +t

0

φu · dXloc,P

u ,

for some FXt -predictable φ.Proof: By Theorem 5.50, it suffices to prove that, if Q ∼ P and X loc,P is an(FXt ,Q) local martingale, then P = Q. Note that

X loc,Pt = Xt −X0 −

t

0

b(Xu)du

and thus, since X solves (σ, b), X loc,P satisfies the SDE

dX loc,Pt = σ(Xt)dWt .

By Girsanov’s theorem (Corollary 5.26), since X loc,P is also an (FXt ,Q)

local martingale,

dX loc,Pt = σ(Xt)dWt

and sodXt = b(Xt)dt+ σ(Xt)dWt

for some Q Brownian motion W . But (σ, b) admits a unique weak solution,and X satisfies the same SDE under both P and Q, so X has the same lawunder P and Q. This, in turn, implies that P = Q on (FX

t ), and indeed on

FXt since P ∼ Q.

Page 164: Financial derivatives in theory and practice
Page 165: Financial derivatives in theory and practice

7

Option Pricing in ContinuousTimeAt last we are able to develop the theory of option pricing in continuoustime, and the probabilistic tools which we have developed in Chapters 2–6are just what we need to do it. This theory, when developed carefully, is quitetechnical but the basic ideas at the centre of the whole theory are the sameas for discrete time, replication and arbitrage. If you have not already doneso you should now read Chapter 1 which contains all the important conceptsand techniques which will be developed in this chapter, but in a much simplerand mathematically more straightforward setting. We will assume throughoutthis chapter that you are familiar with Chapter 1.

There is a conflict when writing about a technical theory. If you includeall the technical conditions as and when they arise the development of thetheory can be very dry and the main intuitive ideas can be missed. On theother hand, if the technicalities are omitted, or deferred until later, it can bedifficult for the reader to know exactly what conditions apply and, for example,to decide whether any particular model is one to which the theory applies.If working through this chapter you feel you are missing some importantintuition, Chapter 1 should help. If there are details which confuse you atfirst, read on and spend more time on them second time through.

This chapter is organized as follows. In Section 7.1 we define a continuoustime economy. An economy consists of a set of assets and a set of admissibletrading strategies. For technical reasons we will not be able to completelyspecify exactly which strategies are allowed until we have developed the theorya little further. We have therefore given a partial description of admissiblestrategies in Section 7.1.2 and leave until Section 7.3.3 the final definition.With the exception of this one detail, Section 7.1 defines the probabilistic set-up for our economy. The conditions we impose at this stage will be assumedin all results later in the chapter.

Section 7.2 contains most of the intuition behind the results of continuoustime option pricing theory. We derive the standard partial differential

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 166: Financial derivatives in theory and practice

142 Option pricing in continuous time

equations for the price of a European derivative and an expression for thevalue as an expectation. The development in this section is as free of technicaldetails as we can manage without sacrificing accuracy. The price we pay isthat we have to make various assumptions which we do not fully justify untilSection 7.3. The restriction to European derivatives is for clarity and variousgeneralizations are possible which we cover in Section 7.4.

In Section 7.3 we give a rigorous development of the theory. To do thiswe must introduce numeraires, martingale measures and admissible tradingstrategies. We develop here results which explain when an economy will becomplete and hence when we will be able to price all the derivatives ofinterest to us. The treatment we provide is different to most others that wehave encountered. We develop an L1 theory and highlight the importance ofnumeraires (which have recently become an increasingly important modellingtool). We also introduce a different (and more natural) notion of completenessthan that usually considered. Finally, in Section 7.4, we consider severalgeneralizations of the framework treated in Section 7.3. It turns out that thetheory of Section 7.3 extends easily to derivatives in which an option holder isable to exercise controls (American options, for example), to assets which paydividends, and to an economy over the infinite time horizon. The case wherethe number of underlying assets is infinite is discussed in Chapter 8.

7.1 ASSET PRICE PROCESSES AND TRADINGSTRATEGIES

Before we can start pricing derivatives we must first define the underlyingeconomy in which we are working. An economy, which we will denote by E ,is defined by two components. The first, which we define in Section 7.1.1, is amodel for the evolution of the prices of the assets in the economy. The second,the subject of Section 7.1.2, is the set of trading strategies that we will allowwithin the economy. Throughout we shall restrict attention to some finite timeinterval [0, T ]. We discuss the case of an infinite horizon in Section 7.4.

7.1.1 A model for asset prices

Let W = (W (1), W (2), . . . , W (d)) be a standard d-dimensional Brownian motionon some filtered probability space (Ω, Ft,F , P) satisfying the usual condi-tions (recall that we need these to ensure that stochastic integrals are welldefined). In addition, we suppose that F0 is the completion of the trivialσ-algebra, ∅, Ω.

The economy E consists of n assets with price process given by A =(A(1), A(2), . . . , A(n)). The process A is assumed to be continuous and almostsurely finite. We will insist that A0 is measurable with respect F0, and thus

Page 167: Financial derivatives in theory and practice

Asset price processes and trading strategies 143

some constant known at time zero, and that A satisfies an SDE of the form

dAt = µ(At, t) dt+ σ(At, t) dWt , (7.1)

for some Borel measurable µ and σ. Recall from Chapter 6 that it is implicitin (7.1) that for any solution process A to (7.1) the finiteness condition

t

0

|µ(Au, u)|+ |σ(Au, u)|2 du <∞ a.s.

holds.Chapter 6 provided an extensive study of SDEs of the form (7.1). If we are

using such an SDE to specify a model we need to ensure that the given SDEhas a solution, at least a weak solution, and that some kind of uniquenessresult holds so that there is no ambiguity about the asset price model underconsideration. Throughout this chapter we shall keep the discussion generalbut usually in practice the SDE (σ, µ) will be taken to be locally Lipschitzand subject to a linear growth condition. Recall from Theorem 6.31 that thisensures that a strong solution exists which is pathwise unique and unique inlaw, and that the process (At, t) is strong Markov. All the processes we shallencounter satisfy these local Lipschitz and linear growth conditions, (6.23)and (6.24) (although, in practice, we will not always explicitly present anSDE satisfied by the asset price processes).

Example 7.1 This example is the model first considered by Black and Scholes(1973) in their original paper on the subject of option pricing. It is one that weshall develop throughout this chapter. Start with a probability space (Ω,F ,P)which satisfies the usual conditions and which supports a one-dimensionalBrownian motion W . Define two assets in the economy, the stock and thebond, whose price processes S and D satisfy the SDE

dSt = µStdt+ σStdWt ,

dDt = rDtdt .

It is easy to check that the conditions (6.23) and (6.24) are satisfied so SDE(7.1) has a pathwise unique strong solution which is (apply Ito’s formula tocheck)

St = S0exp (µ− 12σ

2)t+ σWt ,

Dt = D0exp(rt) .

We suppose that S0 is some given constant and that the bond has unit valueat the terminal time T , which then yields

Dt = exp −r(T − t) .

Page 168: Financial derivatives in theory and practice

144 Option pricing in continuous time

7.1.2 Self-financing trading strategies

In this section we discuss trading strategies for a continuous time economy andsome of their basic properties, these being natural generalizations of those wemet in Chapter 1 for a discrete time economy. We will later, in Section 7.3.3,impose further technical constraints on those strategies which we will allow,admissible strategies. Without these technical constraints most economies ofpractical interest would admit arbitrage.The first step in defining a trading strategy is to decide what information

will be available to make the trading decision at any time t. The flow ofinformation available in the economy is represented by the asset filtrationFAt which we define formally in Section 7.3.1. Intuitively, the informationavailable in this filtration at any time t is the past realization of the priceprocess.A trading strategy will be specified by a stochastic process φ =

(φ(1), . . . ,φ(n)) where φ(i)t is interpreted as the number of units of asset i

held at time t, before time t trading. Consequently, the value at time t of theportfolio is given by Vt = φt ·At. A decision to rebalance the portfolio at timet must be based on information available the instant before time t, thus werequire φ to be predictable with respect to the filtration FAt .Associated with any trading strategy is its capital gain. This is just the

change in value of the portfolio due to trading in the assets. Consider firsta simple trading strategy, one which is piecewise constant and bounded, andthus of the form

φt = φ 1(τ , τ +1], = 0, 1, . . . , p,

where φ ∈ Rn and 0 = τ0 < . . . < τp+1 are FAt stopping times. For such astrategy we define the capital gain Gφ

t (which can be negative) in the obviousway:

Gφt =

p

=0

φ · (At∧τ +1−At∧τ ) . (7.2)

Note that φ is predictable — decisions about holdings of assets are made theinstant before subsequent price movements occur.This class of simple strategies is too restrictive for modelling purposes as it

does not enable us to replicate and thus price all derivatives of interest. Weneed to define the capital gain for other more general trading strategies andwe choose to do this via the Ito integral.

Definition 7.2 The capital gain Gφ associated with any (FAt -predictable)

trading strategy φ for the economy E is defined to be the Ito integral

Gφt =t

0

φu · dAu .

Page 169: Financial derivatives in theory and practice

Asset price processes and trading strategies 145

Remark 7.3: Note that this definition of the capital gain is indeed a modellingassumption. For trading strategies where trading occurs infinitesimally onecould in principle make an alternative definition which is consistent with(7.2), such as the Stratonovich integral. However this alternative definitionwould lack certain important properties which the capital gain process shouldpossess (see equation (7.27) and Remark 7.5).

Remark 7.4:We saw how to define the Ito integral in Chapter 4 and discoveredsome of its properties. Recall the fact proved there, Theorem 4.15 and Remark4.16, that if the integrand (trading strategy) φ is left-continuous then, foralmost every ω,

t

0

φu · dAu (ω) = limn→∞

k≥1φt∧Tn

k−1(ω) · At∧Tn

k(ω)−At∧Tn

k−1(ω) ,

where, for each n,

T n0 ≡ 0, T nk+1 = inft > T nk : |φt − φTnk | > 2−n .

Remark 7.5: The Ito integral has a second important property which makesthis choice the appropriate one for financial modelling: Ito integrals preservelocal martingales (Remark 4.22). We shall see in Section 7.3.3 that it isprecisely this property which enables us to construct an arbitrage-free model.

There is one further restriction that we impose on the trading strategies thatwe allow in the economy E . We will insist that no money is either injected intoor removed from a trading portfolio other than the initial amount at time zero.Consequently, the value of the portfolio at time t, φt · At, will be the initialamount invested plus the capital gain from trading. Such a strategy is calledself-financing. In particular, note that we have assumed that the market isfrictionless, meaning there is no cost associated with rebalancing a portfolio.This is an essential assumption which we continue to make throughout thisbook. Without it the ideas of replication and no arbitrage cannot be appliedto price derivatives.

Definition 7.6 A self-financing trading strategy φ for the economy E is astochastic process φ = (φ(1), . . . ,φ(n)) such that:

(i) φ is FAt -predictable;(ii) φ has the self-financing property,

φt · At = φ0 · A0 +t

0

φu · dAu .

Page 170: Financial derivatives in theory and practice

146 Option pricing in continuous time

7.2 PRICING EUROPEAN OPTIONS

A European option is a derivative which at some terminal time T has a valuegiven by some known function of the asset prices at time T . Our aim in thissection is to convey the essential ideas behind pricing and hedging such anoption without getting bogged down in technicalities. In later sections we fillin the gaps and generalize the derivatives that we consider.Let E be an economy which is arbitrage-free (we will define precisely what

we mean by this in Section 7.3.3) and consider a European option having valueVT = F (AT ) at time T , where VT is some FAT -measurable random variable.We assume that there exists some admissible self-financing trading strategywhich replicates the option, and thus its price is well defined for all t ≤ T .Our aim is to calculate both this price and the replicating strategy for theoption.We begin by assuming that the value at time t of this derivative is given

by some as yet unknown function V (At, t) which is C2,1(Rn, [0, T )). This

assumption must be justified later, after we have deduced the form of V .By the assumption of no arbitrage the value of the option at time t mustbe the time-t value of the replicating strategy (Section 7.3.4 formalizes thisargument) and so

V (At, t) = V (A0, 0) +n

i=1

t

0

φ(i)u dA(i)u . (7.3)

On the other hand, applying Ito’s formula to V yields

V (At, t) = V (A0, 0) +

n

i=1

t

0

∂V (Au, u)

∂A(i)u

dA(i)u +t

0

GV (Au, u) du (7.4)

where

GV (At, t) := ∂V (At, t)

∂t+ 1

2

i,j

aij(At, t)∂2V (At, t)

∂A(i)t ∂A

(j)t

and

aij(At, t) =d

k=1

σik(At, t)σjk(At, t).

Now consider equations (7.3) and (7.4) together. Both will hold a.s. if forall t ∈ [0, T ) with probability one we have

φ(i)(At, t) =∂V (At, t)

∂A(i)t

(7.5)

andGV (At, t) = 0. (7.6)

Page 171: Financial derivatives in theory and practice

Pricing European options 147

Then in addition, if (7.5) holds, the self-financing property of V means thatthe following equality holds a.s. for all t ∈ [0, T ):

V (At, t) =

n

i=1

∂V (At, t)

∂A(i)t

A(i)t . (7.7)

7.2.1 Option value as a solution to a PDE

A C2,1(Rn, [0, T )) function V satisfying equations (7.5)—(7.7) is a candidatefor the value of the European option. Suppose possible values of the process Alie in a region U ⊂ Rn. In order for (7.6) and (7.7) to hold it is necessary thatV satisfies the partial differential equation (PDE): for all (x, t) ∈ U × [0, T ),

GV (x, t) = 0 , (7.8)n

i=1

x(i)∂V (x, t)

∂x(i)= V (x, t) . (7.9)

The requirement that VT = F (AT ) gives the boundary condition

V (x, T ) = F (x), x ∈ U.Solving this PDE is one way of pricing the option. If we can find a solutionV to the above then it will be C2,1(Rn, [0, T )) and indeed (7.5) then definesa self-financing trading strategy which replicates the option.Usually there will at this stage be many solutions V (x, t) to (7.8) and

(7.9). Unless we eliminate all but at most one of these the economy admitsarbitrage (consider the difference between two replicating strategies). As notedin Section 7.1.2, it is therefore necessary to restrict the strategies that we allowin order to ensure no arbitrage for the economy, and this requirement formsan extra restriction on the solutions to the above set of equations.The requirement that the replicating strategy be admissible, as we define in

Section 7.3.3, is sufficient to ensure no arbitrage. Consequently, there will be atmost a unique solution V (x, t) to (7.8)—(7.9) corresponding to an admissiblestrategy. Therefore, if we can show that one of our solutions to (7.8)—(7.9)corresponds to an admissible strategy then this is indeed the price of ouroption. Note that at this stage we do not know that any such solution willexist which is C2,1(Rn, [0, T )); we assumed only that the option was attainable.The PDE approach to option pricing is commonly used but will not be the

primary focus of this book. The reader is referred to Duffie (1996), Lambertonand Lapeyre (1996) and Willmot et al. (1993) for more on this approach. Theequations satisfied by the option price, (7.8)—(7.9) above, are of linear second-order parabolic form and these have been extensively studied and are wellunderstood. Mild regularity conditions ensure the existence and uniquenessof the solution for suitable boundary conditions. In practice they are oftensolved by numerical techniques.

Page 172: Financial derivatives in theory and practice

148 Option pricing in continuous time

Example 7.7 Consider the Black—Scholes economy introduced in Example 7.1and an option which pays

F (ST , DT ) = (ST −K)+at time T . If we denote by V (St, Dt, t) the value of this option at time t, thePDEs (7.8) and (7.9) for the value of the option become

∂tV (St, Dt, t) +

12σ

2S2t∂2

∂S2tV (St, Dt, t) = 0 (7.10)

St∂

∂StV (St, Dt, t) +Dt

∂DtV (St, Dt, t) = V (St, Dt, t) , (7.11)

all other terms in GV being zero because the bond is a finite variation process.Equations (7.10) and (7.11) are equivalent to the celebrated Black—ScholesPDE and can be solved explicitly subject to the boundary condition

V (ST , DT , T ) = (ST −K)+ .

Anyone already familiar with deriving the Black—Scholes PDE may at thispoint be a little confused because (7.10)—(7.11) are not in the standard form.The reason for this is that the value function V is here parameterized interms of St, Dt and t. In the more usual development the value function isparameterized in terms only of St and t. Since Dt is a known function oftime this can be done, but in doing it some of the underlying symmetry andstructure of the problem is lost (although it is more straightforward to solve aPDE in two variables than in three). To recover the more usual Black—Scholes

PDE, let V (St, t) = V (St, Dt, t). Then

∂V

∂t=∂V

∂t+

∂V

∂Dt

∂Dt∂t

=∂V

∂t+ rDt

∂V

∂Dt. (7.12)

Further, since V is self-financing,

V (St, t) = V (St, Dt, t) = St∂V

∂St+Dt

∂V

∂Dt

= St∂V

∂St+Dt

∂V

∂Dt. (7.13)

Substituting (7.13) into (7.12) now gives

∂V

∂t=∂V

∂t− r V − St ∂V

∂St

Page 173: Financial derivatives in theory and practice

Pricing European options 149

and so (7.10) becomes

12σ

2S2t∂2

∂S2t+ rSt

∂St+

∂tV (St, t) = rV (St, t)

which is the Black—Scholes PDE in its more familiar form. There are manyways to solve a PDE such as this. We leave this until the next section, wherewe derive the solution using the martingale approach.

7.2.2 Option pricing via an equivalent martingale measure

Central to our probabilistic formulation of the discrete theory of Chapter 1was the existence of a probability measure with respect to which the processA,suitably rebased, was a martingale. No arbitrage was shown to be equivalentto the existence of an equivalent martingale measure (EMM) and (under theassumption of no arbitrage) completeness held if and only if this measure wasunique. Analogous results, modulo technicalities, hold for continuous time.We return to a discussion of the general picture later. For now we focus ourattention on how to use the ideas of Chapter 1 to price a European option,assuming that the economy is arbitrage-free.When we write down a model such as (7.1) we are describing the law of

the asset price process A in some arbitrarily chosen units such as pounds ordollars. These are in themselves not assets in the economy — they are merelyunits. As such we are free to redefine these units in any way we like, includingchoosing units that may evolve randomly through time, as long as we introduceno real economic effects.For reasons which will become clear later, we will restrict attention to units

which are also numeraires. For now it is enough to think of these just asunits; later, in Section 7.3.2, we will introduce numeraires formally and thereasons for this restriction will become clear. So suppose we change units tothose given by the numeraire N , where N is a strictly positive continuoussemimartingale adapted to FA

t . Let AN be the price process A quoted inthe new units, so ANt = N

−1t At, which is also a continuous semimartingale.

Now consider a replicating strategy φ having the self-financing property,

Vt = φt ·At = V0 +t

0

φu · dAu .

Since the numeraire represents only a change of units and since φ is self-financing, it is intuitively clear that

V Nt := N−1t Vt = VN0 +

t

0

φu · dANu (7.14)

Page 174: Financial derivatives in theory and practice

150 Option pricing in continuous time

and the reader can check that this is indeed the case using Ito’s formula (thisis the unit invariance theorem which we present formally in Section 7.3.2).Immediately from (7.14), any self-financing strategy remains self-financingunder a change of units.The motivation for this change of units is, as was the case in Chapter 1, to

enable us to change to an EMM corresponding to the numeraire N . That is,suppose there exists some measure N ∼ P such that AN is an FAt martingaleunder N. If such a measure exists, it follows from the representation (7.14) thatV N is an FAt local martingale under N. If V N is in fact a true martingale,and our definition of an admissible in Section 7.3.3 will ensure that it is, wecan now use the martingale property to find the option value,

V Nt = EN V NT |FAt ,

and thus

Vt = NtENF (AT )

NTFAt . (7.15)

In general, to calculate the option value we need to know the law of theprocess (A,N) under the measure N, and Girsanov’s theorem is the resultwhich gives us this law. The problem is often simplified in practice by ajudicious choice of numeraire which reduces the dimensionality of the problemand leaves us only needing to calculate the law of AN .A review of this section should convince you that we have relied neither

on the Markovian form of A nor on any C 2,1 regularity conditions for thevalue function V . Consequently, this martingale approach is more generalthan the PDE approach and can be applied to other more general models andto other derivatives. In this book we will not extend the class of models weconsider beyond those of Section 7.1 but we will later make considerable useof being able to price more general derivatives, in particular path-dependentand American products.

Example 7.8 Consider once more the Black—Scholes model of Example 7.1

and take D as numeraire. Then by Ito’s formula, writing S(D)t = D−1t St, we

havedS

(D)t = (µ− r)S(D)t dt+ σS

(D)t dWt.

Suppose an EMM Q exists corresponding to this numeraire. We know fromGirsanov’s theorem (Theorem 5.20) that a change of measure only alters thedrift of a continuous semimartingale. This, combined with the fact that S (D)

is a martingale under Q, implies that

dS(D)t = σS

(D)t dWt,

where W is a standard Brownian motion under Q. The solution to this SDEis given by

S(D)t = S

(D)0 exp(σWt − 1

2σ2t) (7.16)

Page 175: Financial derivatives in theory and practice

Continuous time theory 151

and soSt = S0 exp(σWt + rt − 1

2σ2t).

A direct calculation confirms that S(D) is indeed a true martingale, and theprocess D(D) ≡ 1 is trivially also a martingale under Q.

We now return to the valuation problem of Example 7.7. Note that, withrespect to the measure Q, the law of S no longer depends on µ and henceneither will the option value V. This should not surprise you. In Chapter 1we saw that the role of the probability measure P was purely as a marker forthose outcomes (sample paths) which are possible, and the exact probabilitiesare not important. The drift of a continuous semimartingale is dependent onwhich equivalent measure we are working in and so cannot be relevant. Thevolatility σ, on the other hand, is a sample path property and is invariantunder a change of probability measure. Consequently, it can and does havean effect on the option value.

Writing A = (S, D) clearly here FAt = FW

t . From (7.15) and (7.16) we nowhave

Vt = DtEQ

[(ST − K)+D−1

T |FWt

]

= DtEQ

[S

(D)T 1S(D)

T>K|FW

t

] − DtKEQ

[1S(D)

T>K|FW

t

](7.17)

= StN(d1) − KDtN(d2)

whered1 =

log(St/KDt)σ√

T − t+ 1

2σ√

T − t,

d2 =log(St/KDt)

σ√

T − t− 1

2σ√

T − t,

and (7.17) has been evaluated by explicit calculation.

7.3 CONTINUOUS TIME THEORY

We are now familiar with the basic underlying ideas of derivative pricing.First we must define the economy along with those trading strategies whichare admissible; then we must prove that the economy is arbitrage-free. Finally,given any derivative, we must prove that there exists some admissible tradingstrategy which replicates it, and the value of this derivative is then the valueof its replicating portfolio. Showing the existence of a suitable replicatingstrategy is usually done by proving a completeness result, i.e. proving that allderivatives in some set (which includes the one under consideration) can bereplicated by some admissible trading strategy.

What we have done so far in this chapter is provide techniques to find thevalue and replicating strategy when it exists, which is the most important thingfor a practitioner to know. Now we fill in the remaining parts of the theory.

Page 176: Financial derivatives in theory and practice

152 Option pricing in continuous time

We begin in Section 7.3.1 with a brief discussion of the informationobservable within the economy. Then, in Section 7.3.2, we provide precisedefinitions for numeraires and EMMs, and derive some of their properties. Wealready know that these are important tools in discrete time, but in continuoustime they are even more central to the theory. The first instance of this addedimportance comes in Section 7.3.3 where we define admissible strategies andarbitrage. In continuous time we must eliminate certain troublesome strategiesfrom the economy, otherwise virtually any model admits arbitrage and is thusrendered useless for the purposes of option pricing. With our careful choiceof admissible strategies the arbitrage-free property follows quickly, so we nowknow that the replication arguments can be applied to the economy.Section 7.3.4 is a brief summary of the argument that shows that the value

of a derivative in an arbitrage-free economy is the value of any replicatingportfolio. This section pulls together in one place several ideas that we havealready met earlier in the chapter. The last part of the general theory neededis a discussion of completeness, and this is the topic of Section 7.3.5. It is onlyhere that we can fully appreciate the reasons for our carefully chosen definitionof a numeraire. As we shall see, completeness (in a precise sense to be defined)for an economy is equivalent to the existence of a unique EMM (for any givennumeraire). When weak uniqueness holds for the SDE defining the asset priceprocess, equation (7.1), completeness is then equivalent to the existence of afinite variation numeraire. The weak uniqueness property of the SDE can bechecked using standard techniques such as those we met in Chapter 6. Theexistence or otherwise of a finite variation numeraire is usually easily verifiedin practice.The presentation of the theory given here places much more emphasis on

numeraires than is usual. This is justified by the role they play in complete-ness, but more importantly by the need in much recent work to be able toswitch freely between different numeraires and know that we are allowed todo so.

7.3.1 Information within the economy

Part of the definition of a trading strategy is the amount of information avail-able at any time t on which to base the strategy φ. Recall that we insistedthat φ be FAt -predictable. Here we formally define this filtration and twoothers that are relevant to the economy.

Definition 7.9 For an economy E defined on the probability space(Ω, Ft,F ,P), the natural filtration, FEt , the asset filtration, FAt , andthe Brownian filtration, FWt , are defined to be the P-augmentations of thefollowing:

(i) (FEt ) = σ(A(i)u /A

(j)u : u ≤ t, 1 ≤ i, j ≤ n);

(ii) (FAt ) = σ(A(i)u : u ≤ t, 1 ≤ i ≤ n);

(iii) (FWt ) = σ(W(i)u : u ≤ t, 1 ≤ i ≤ d).

Page 177: Financial derivatives in theory and practice

Continuous time theory 153

It is important to appreciate the significance of each of these to derivativepricing. The Brownian filtration contains all the information about theBrownian motions which underlie the economy. This filtration is importantonly at the start of the modelling process when we specify the asset priceprocesses (Section 7.1.1). The asset filtration contains all the information thatis available to us if we watch the asset prices evolve through time. This is whatcan be observed in practice and so all trading strategies should be based oninformation contained in the asset filtration and no more. The third of thefiltrations, the natural filtration, contains all the information in the economyabout relative asset prices. This is arguably the most fundamental of the threesince it contains only information intrinsic to the economy. The asset filtration,by contrast, will depend on the units initially chosen to set up the economy.

7.3.2 Units, numeraires and martingale measures

We met the idea of a unit in Section 7.2. We shall formally reintroduce unitsin this section along with the more refined notion of a numeraire. Numerairesplay an important role in the theory of option pricing, but that role has notbeen fully appreciated and exploited until very recently.

Definition 7.10 A unit U for the economy E defined on the probabilityspace (Ω, Ft,F ,P) is any a.s. positive continuous semimartingale adaptedto the filtration Ft.

Remark 7.11: What is important in the definition of a unit is that it bedefined on the probability space on which we set up the economy and that itbe positive. That is, it must exist and we need to be able to divide throughby it.

Example 7.12 Define an economy E on the probability space (Ω, Ft,F ,P)which supports d-dimensional Brownian motionW , and suppose that the assetprice process solves the SDE

dAt = µ(At, t)dt+ σ(At, t)dWt ,

whereW is (d−1)-dimensional Brownian motion, the first (d−1) componentsofW . We can now define the unit Ut = exp W

(d)t which is independent of all

the assets in the economy.We have been able to do this because the probabilityspace we have used to set up the economy is richer than we need to specifythe asset price process.

If we change the units in which we describe the prices of the assets in theeconomy this should have no economic effect — it is just a different way ofpresenting price information. This is the content of the following simple yetvery important result.

Page 178: Financial derivatives in theory and practice

154 Option pricing in continuous time

Theorem 7.13 (Unit invariance) For any unit U, a trading strategy isself-financing with respect to A if and only if it is self-financing with respectto AU.

Proof: A process U is a.s. positive and finite if and only if U−1 is also a.s.positive and finite, so it suffices to prove the implication in one direction. Sosuppose φ is self-financing with respect to A. Writing Vt = φt · At, we have

Vt = V0 +∫ t

0φu · dAu.

We must show that V Ut := φt · AU

t satisfies

V Ut = V U

0 +∫ t

0φu · dAU

u .

This follows from an application of the integration by parts formula:

dV Ut = U−1

t dVt + Vtd(U−1t ) + dVtd(U−1

t )

= φt · [U−1t dAt + Atd(U−1

t ) + dAtd(U−1t )]

= φt · dAUt .

So changing units makes no material difference to our economy. However,

introducing a new unit may change the information in the economy (i.e.FAU

t = FAt ) and this may complicate calculations. There is, however, a

class of units which introduce no additional information over and above thatalready available through the asset price filtration FA

t , numeraires. Indeed,using a numeraire unit may reduce the amount of information.

Definition 7.14 A numeraire N for the economy E is any a.s. strictlypositive FA

t -adapted process of the form

Nt = N0 +∫ t

0αu · dAu

= αt · At,

where α is FAt -predictable.

Remark 7.15: Nt is the value at t of a portfolio which starts with value N0 andis traded according to the self-financing strategy α. Note, however, that wehave not yet defined which strategies will be admissible within our economyand so there is no assumption that α be admissible.

Page 179: Financial derivatives in theory and practice

Continuous time theory 155

Remark 7.16: Nt is adapted to FAt and so FAt = F (A,N)t .

We now prove a result which is, as was the case for the unit invariancetheorem, simple but of great importance. It is central to deciding when aneconomy is complete. What this theorem shows is that if we can represent arandom variable, working in units of some numeraire N , with a strategy thatis not necessarily self-financing, then we can find some strategy which alsoreplicates the random variable but is self-financing. That is, the theorem is thelink between the martingale representation theorem of Chapter 6, which showshow to represent a local martingale as a stochastic integral, and completeness,which additionally requires that the replicating strategy be self-financing.It is the fact that the unit is a numeraire that makes this result hold andthus this result demonstrates the importance of numeraires to questions ofcompleteness.

Theorem 7.17 Let N be some numeraire process and, for 0 ≤ t ≤ T , let

Zt = Z0 +t

0

φu · dANu ,

for some FAt -predictable φ. Then there exists some FAt -predictable φ such

that

Zt = Z0 +t

0

φu · dANu= φt ·ANt .

That is, φ is self-financing.

Proof: Since N is a numeraire process, we can write

Nt = N0 +t

0

αu · dAu ,for some self-financing α = 0. In particular, it follows from the unit invariancetheorem that

αt · ANt = 1 , αt · dANt = 0 .This immediately yields a suitable strategy,

φt = φt + (Zt − φt ·ANt )αt .

The next important concept is that of martingale measures for the economyE . We saw in Section 7.2 that martingale measures are useful for calculation. InChapter 1 we saw their role in deciding whether an economy admits arbitrageand is complete. We will in the sections that follow provide analogous resultsfor a continuous time economy. The results are basically the same as those wehave met already but with extra complexities which are present in the richercontinuous time setting. In particular, we shall also use martingale measuresas a tool to specify exactly which trading strategies are admissible (the detailwe omitted in Section 7.1.2). We begin with the definition.

Page 180: Financial derivatives in theory and practice

156 Option pricing in continuous time

Definition 7.18 The measure N defined on the probability space(Ω, Ft,F ,P) is a martingale measure (or equivalent martingale measure)for the economy E if N ∼ P and there exists some numeraire N such that AN

is an FAt martingale under the measure N. The pair (N,N) is then referredto as a numeraire pair.

Remark 7.19: This definition of an EMM should be compared with Definition5.40 of Chapter 5. In Definition 7.18 above, the process AN must be amartingale under N but not necessarily under P. Note, however, that if NandM are two EMMs corresponding to the same numeraire, N , then they areEMMs for the martingale AN in the sense of Definition 5.40 (note that FA0 istrivial, which ensures that condition (ii) of the definition holds).

Recall from our discussion of change of measure in Chapter 6 that if Q ∼ Pare two equivalent measures with respect to FT then, for any A ∈ FT ,

Q(A) = 0 if and only if P(A) = 0 .

That is, a change of measure does not alter the sets of paths which are possible,it just changes the probabilities of these sets (from one strictly positive value toanother strictly positive value). This is an important observation for optionpricing which, as we saw in Chapter 1, is concerned with replication on asample path by sample path basis. We do not want a change of probabilitymeasure to alter the set of paths which the economy can exhibit.The question of whether or not a numeraire pair exists for any given

economy E can be tricky to decide. In practice this is not a problem. Weusually specify a model for an economy already in some martingale measureN. It is, however, important to understand how the evolution of asset priceswill vary between different martingale measures and between a martingalemeasure and any other equivalent measure. Girsanov’s theorem, which wemet in Chapter 5, is the tool that allows us to do this.Suppose (N,N) is a numeraire pair for the economy E , where Nt = αt ·At.

Under the measure P, A satisfies the SDE (7.1) and this can be rewritten inthe form

dAN = µN (At,αt, t)dt+ σN (At,αt, t)dWt, (7.18)

dNt = αt · µ(At, t)dt+ αt · σ(At, t)dWt, (7.19)

where µN and σN can be derived from (7.1) using Ito’s formula. By Girsanov’stheorem, assuming the Radon—Nikodym derivative dN

dP can be written in theform

dNdP Ft

= expt

0

Cu · dWu − 12

t

0

|Cu|2du (7.20)

Page 181: Financial derivatives in theory and practice

Continuous time theory 157

for some Ft-predictable C, which is certainly true if the filtration Ft isthe augmented natural filtration for some Brownian motion (Theorem 5.24),then

Wt = Wt −∫ t

0Cudu

is an Ft Brownian motion under N. Rewriting (7.18) and (7.19) in terms ofWt, then

dANt =

(µN(At, αt, t) + σN(At, αt, t)Ct

)dt + σN(At, αt, t)dWt, (7.21)

dNt = αt ·(µ(At, t) + σ(At, t)Ct

)dt + αt · σ(At, t)dWt. (7.22)

Since (N, N) is a numeraire pair, we can conclude from (7.21) that

dANt = σN(At, αt, t)dWt, (7.23)

otherwise AN is not even a local martingale under N. (We could haveanticipated the form of (7.23) from the martingale property of AN and thefact that quadratic variation is a sample path property, not a property of theprobability measure.) We do not, however, know the SDE which N satisfiesunless we know C.

In the case where there exists some finite variation numeraire Nft = αf

t · At

we can find the SDE for N. To see this, note that since αf is self-financingand Nf is finite variation,

αft · dAN

t = d

(Nf

t

Nt

)=

(Nf

t

Nt

) (dNf

t

Nft

+d[N ]tN 2

t

− dNt

Nt

). (7.24)

The left-hand side of (7.24) is a local martingale under N, and Nf, being afinite variation process, satisfies the same SDE under P and N (substituteσ ≡ 0 into (7.22)). It thus follows by rearranging (7.24) that

dNt =Nt

Nft

dNft +

d[N ]tNt

+ d(local martingale),

under N and thus, from (7.22),

dNt =Nt

Nft

dNft +

d[N ]tNt

+ αt · σ(At, t)dWt.

This, together with (7.23), completely specifies the SDE for the assets underN.

In practice we tend to specify a model by giving equations of the form (7.21)and (7.22). We assume the original measure P was such that we could change

Page 182: Financial derivatives in theory and practice

158 Option pricing in continuous time

to an appropriate measure N. If the reader is ever faced with the problem ofproving, starting from some P, that an EMM exists for a particular numeraireN, Girsanov’s theorem points in the right direction. The Radon–Nikodymderivative will be of the form (7.20) for some C satisfying (from (7.21) and(7.23))

µN(At, αt, t) + σN(At, αt, t)Ct = 0.

This is a necessary condition. In addition we must prove that (7.20) definesa true Ft martingale under P, not just a local martingale, and that (7.23)defines a true FA

t martingale under N. Novikov’s condition (Theorem 5.16)is one way to establish this in some cases. Example 7.20 shows things arestraightforward for the Black–Scholes model.

Example 7.20 We return to the Black–Scholes example and take D as ournumeraire as before. Define a measure Q ∼ P on FW

T via

dQ

dP

∣∣∣∣FW

t

= ρt

where

ρt = exp(φWt − 12φ

2t),

φ =r − µ

σ.

Note that ρ is an FWt martingale under P, ρ0 = 1, so dQ/dP is well defined.

Under the measure Q, D(D)t ≡ 1 so is trivially a martingale. Furthermore,

by Girsanov’s theorem,Wt := Wt − φt

is an FWt Brownian motion under Q, and we have

dSDt = σSD

t dWt.

The solution to this SDE (see equation (7.16)) is an FWt martingale. Thus

Q is an EMM for numeraire D.

7.3.3 Arbitrage and admissible strategies

A prerequisite for being able to price derivatives using replication arguments isthat the economy does not allow arbitrage. We saw this in Chapter 1 and willsee it again in Section 7.3.4. An arbitrage is a self-financing trading strategywhich generates a profit with positive probability but which cannot generatea loss. This is summarized in the following definition.

Page 183: Financial derivatives in theory and practice

Continuous time theory 159

Definition 7.21 An arbitrage for the economy E is an admissible tradingstrategy φ such that one of the following conditions holds:

(i) φ0 · A0 < 0 and φT ·AT ≥ 0 a.s.;(ii) φ0 · A0 ≤ 0 and φT · AT ≥ 0 a.s. with φT · AT > 0 with strictly positive

probability.

If there is no such φ then the economy is said to be arbitrage-free.

The first step in derivative pricing is to ensure that the model for theeconomy is arbitrage-free. This imposes conditions both on the law for theevolution of the asset prices and the set of trading strategies which are allowed.The following two examples illustrate each of these points.

Example 7.22 Consider an economy E consisting of two assets satisfying theSDE

dA(i)t = µ(i)A

(i)t dt+ σA

(i)t dWt, i = 1, 2 ,

where µ(1) > µ(2) and A(1)0 = A

(2)0 . Let φ be the ‘buy and hold’ strategy

φ(1)t = −φ(2)t = 1, 0 ≤ t ≤ T . At time zero the portfolio value is A(1)0 −A(2)0 = 0,whereas at any subsequent time t,

Vt = φ(1)t A

(1)t + φ

(2)t A

(2)t

= (1− e(µ(2)−µ(1))t)A(1)t> 0 a.s.

In particular VT > 0 a.s., hence φ is an arbitrage.

Example 7.23 The following example of Karatzas (1993) is taken fromDuffie (1996). Consider the special case of the Black—Scholes economy whenS0 = 1, r = µ = 0 and σ = 1. Then

dSt = StdWt ,

Dt = 1 ,

whereW is a standard one-dimensional Brownian motion. We will construct aself-financing trading strategy φ which is an arbitrage for this economy. First

define a trading strategy φt = (φ(S)t ,φ

(D)t ), 0 < t < T , via

φ(S)t =

1

St√T − t , 0 < t < T,

and choose φ(D) to make φ self-financing,

φ(D)t = −φ(S)t St +

t

0

φ(S)u dSu.

Page 184: Financial derivatives in theory and practice

160 Option pricing in continuous time

The initial portfolio value, φ(S)0 S0 + φ

(D)0 D0, is zero and the gain process for

this strategy is given by

Gφt =t

0

φ(S)u dSu =t

0

1√T − udWu . (7.25)

Now define an FAt stopping time τ by

τ := inft > 0 : Gφt = α, (7.26)

the first time the gain hits some positive level α. By Exercise 7.24, 0 < τ < Ta.s. and thus the gain will at some point on (0, T ) hit any positive level α.We can use this observation to modify φ to construct an arbitrage as follows.

Define the self-financing trading strategy φ via

φ(S)t =

φ(S)t , t ≤ τ

0, otherwise,

φ(D)t =

φ(D)t , t ≤ τ

α, otherwise.

This strategy starts with zero wealth and guarantees a final portfolio value atT of α > 0, hence it is an arbitrage.

Exercise 7.24: Show that the process W defined by

Wt := GφT (1−e−t)

is a Brownian motion, where Gφ is defined by (7.25). Hence show that thestopping time τ defined in (7.26) is strictly less that T with probability one.

In Example 7.22 the arbitrage was a simple strategy, and we certainly wantto include all simple strategies within any model. The problem here is withthe model for the evolution of the asset prices. In Chapter 1 we saw that theexistence of a numeraire pair was sufficient (and necessary) to ensure thatthe asset price process is not one which introduces arbitrage. The model inExample 7.22 does not admit a numeraire pair.The strategy of Example 7.23 is not restricted to the asset price model given

there. The strategy is referred to as a ‘doubling strategy’ because the holdingof one asset keeps on doubling up (indeed in the classical example of such astrategy the holding is precisely doubled at each of a set of stopping times).Such a doubling strategy is not particular to this economy. Consequently wecan see that, unlike in the discrete time setting, it is not sufficient to imposeconditions just on the asset price process but we must also impose further

Page 185: Financial derivatives in theory and practice

Continuous time theory 161

conditions other than FAt -predictability and the self-financing property on

the trading strategies that we allow. Of course, we must also ensure that wedo not remove too many strategies.There is no unique way to eliminate these troublesome trading strategies.

The restrictions that we choose have the advantage of being ‘numeraire-friendly’. We are able to change between different numeraires and EMMs formodelling and calculation purposes without worrying about the implicationsfor trading strategies. We now formally introduce our definition of admissibletrading strategies.

Definition 7.25 If the economy E does not admit any numeraire pair theset of admissible strategies is taken to be the empty set. Otherwise, an F A

t -predictable process φ is admissible for the economy E if it is self-financing,

φt · At = φ0 · A0 +t

0

φu · dAu ,

and if, for all numeraire pairs (N,N), the numeraire-rebased gain process

GφtNt

=t

0

φu · dANu (7.27)

is an FAt martingale under N.

Definition 7.26 If φ is admissible then Vt = φt · At is said to be a priceprocess.

Remark 7.27: The martingale requirement that we have imposed on tradingstrategies is sufficient to remove the doubling strategy arbitrage, as Theorem7.32 demonstrates. There are two different ways commonly employed toremove such strategies. The first is to impose some kind of integrabilitycondition on the set of trading strategies, and our conditions fall into thiscategory. Other integrability conditions are often more restrictive, with ourchoice being precisely enough to ensure that the standard proof of no arbitrage(Theorem 7.32) can be applied. The second method is to impose some absolutelower bound on the value of the portfolio, which clearly removes doubling(this makes the gain process in (7.27) a supermartingale). However, for someeconomies this approach also removes some simple strategies.

Remark 7.28: One reason why we chose, in Section 7.1.2, to model the gainprocess via an Ito integral was that it preserves the local martingale property.

The additional constraint on the gain Gφt of an admissible strategy ensures

that trading in assets which have zero expected change in price over any timeinterval results in zero expected gain. This behaviour is intuitively what onewould expect. If one placed no restrictions on φ this would not be the case ingeneral.

Page 186: Financial derivatives in theory and practice

162 Option pricing in continuous time

Remark 7.29: The set of admissible strategies posseses several importantproperties. First note that it is linear: if φ1 and φ2 are admissible then so isαφ1 + βφ2 for α, β ∈ R. This is important in the replication argument to pricea derivative (see Section 7.3.4) in which we need to consider the differencebetween two strategies. This is why we have chosen the intersection overnumeraire pairs rather than the union in our definition of ‘admissible’.A second property of our definition is that it is independent of the particularmeasure P in which the economy is initially specified. Recall that optionpricing is about replication not probability. The role of the probability measureis as a marker for those sample paths which are allowed and the preciseprobabilities are not important. It is mathematically more pleasing for thestrategies to also be independent of the measure P and in practice this meanswe avoid tricky questions on which strategies are admissible when we changenumeraire, something we do often in applications. This is also important whenwe come to consider completeness in Section 7.3.5.Remark 7.30: We shall see in Section 7.3.5 that, when the economy iscomplete, the martingale condition (7.27) holds for one numeraire pair (N, N)if and only if it holds for all numeraire pairs. Note that the martingale propertyof an admissible strategy ensures in particular that, for all numeraire pairs(N, N), V N

T = φT · ANT satisfies

EN[|V NT |] < ∞. (7.28)

Completeness questions involve deciding which FAT -measurable random vari-

ables VT satisfying (7.28) can be replicated by an admissible trading strategy.Remark 7.31: Note that if there is no numeraire pair then the economy wehave defined is particularly uninteresting because there are no admissibletrading strategies. A natural concern at this point is that we may havedefined an otherwise perfectly reasonable model for an economy and theneliminated all the trading strategies – and we would at the very least like tohave simple strategies included, those which could be implemented in practice.There are two things which alleviate this concern. The first is that whenmodelling in practice we usually specify the initial model in a martingalemeasure, so we know one exists and we have a set of strategies that at leastincludes all the simple strategies. The second reason is the close relationshipbetween the existence of a martingale measure and absence of arbitrage (whichwe proved in the discrete setting of Chapter 1). This topic is much moretechnical in the continuous time setting than in discrete time and we omitany further discussion of this point. Roughly speaking, the continuous timeresult states that if there is no approximate arbitrage (a technical term we shallnot define) then there exists a numeraire pair. Therefore we have only ruledout economies that admit arbitrage. Delbaen and Schachermayer (1999) andreferences therein contain a fuller discussion of this point. With our definitionwe do not need to worry about such matters since the following theoremguarantees that our economy does not admit an arbitrage strategy.

Page 187: Financial derivatives in theory and practice

Continuous time theory 163

Theorem 7.32 The economy E is arbitrage-free.

Proof: Suppose φ is an arbitrage for the economy E and let

Vt = φ0 · A0 +∫ t

0φu · dAu.

Since φ is admissible there exists some numeraire pair (N, N) such that V N isa martingale under N. In particular,

EN[V NT ] = V N

0 .

It is straightforward to show that this contradicts φ being an arbitrage.

7.3.4 Derivative pricing in an arbitrage-free economy

Here we group together the ideas of option pricing via replication which wehave already encountered throughout the text. We summarize this in thefollowing theorem.

Theorem 7.33 Let E be an arbitrage-free economy and let VT be some FAT -

measurable random variable. Consider a derivative (contingent claim) whichhas value VT at the terminal time T. If VT is attainable, i.e. if there existssome admissible φ such that

VT = φ0 · A0 +∫ T

0φu · dAu a.s.,

then the time-t value of this derivative is given by Vt = φt · At.

Proof: The value of the derivative at time t is just the time-t wealth neededto enable us to guarantee, for each ω ∈ Ω (actually, for all ω in some set ofprobability one), an amount VT(ω) at the terminal time. The trading strategyφ generates a portfolio which has value at t given by φt · At and, in particular,value VT = φT · AT at the terminal time. Clearly then φt · At is a candidate forthe time-t price of the derivative. To show that this is indeed the correct pricewe must show that if ψ is any other admissible strategy which replicates theoption at T then, with probability one,

φt · At = ψt · At for all t.

Suppose ψ is some other strategy which replicates VT and let α be someadmissible trading strategy corresponding to a numeraire N. (See the proofof Theorem 7.48 (i) for the construction of one such strategy and numeraire.)For fixed t∗ ≥ 0, consider the admissible trading stategy ϕ defined as follows:

ϕt =

(ψt − φt) + 2

(φt∗ − ψt∗) · At∗

Nt∗αt if t > t∗ and (φt∗ − ψt∗) · At∗ > 0

(φt − ψt) otherwise.

Page 188: Financial derivatives in theory and practice

164 Option pricing in continuous time

Since F0 is trivial we may assume, without loss of generality, that ϕ0 · A0 =c ≤ 0 a.s., for some constant c. Furthermore,

ϕT · AT = 2(φt∗ − ψt∗) · At∗

Nt∗NT1(φt∗−ψt∗ )·At∗>0.

Thus ϕT · AT ≥ 0 a.s., and so the absence of arbitrage implies that ϕ0 · A0 =ϕT · AT = 0 a.s. But this in turn implies that ψt∗ · At∗ ≥ φt∗ · At∗ a.s. A similarargument reversing the roles of φ and ψ shows that φt∗ · At∗ ≥ ψt∗ · At∗ a.s.,thus φt∗ · At∗ = ψt∗ · At∗ a.s., as required. Corollary 7.34 Let E be an economy with a numeraire pair (N, N) and letVT be some attainable contingent claim. Then Vt admits the representation

Vt = NtEN[V NT |FA

t ].

Proof: The existence of an EMM along with our definition of an admissibletrading strategy ensures that the economy is arbitrage-free. The result isnow immediate from the martingale property of an admissible strategy(Definition 7.25).

7.3.5 Completeness

In Chapter 1 we learnt that the concept of completeness is tied in with whatwe mean by an attainable claim. The following definition is broad enough tocover the cases of interest in our current context. Let S be a collection ofrandom variables.

Definition 7.35 The economy E is said to be S-complete if for everyrandom variable X ∈ S there exists some admissible trading strategy φ suchthat

X = φ0 · A0 +∫ T

0φu · dAu.

Typically the set S is specified as a collection of random variables measurablewith respect to one of FA

T or FWT and satisfying some integrability condition.

The choice of S in the following definition is a natural one, and for this specialchoice we will simply refer to the economy E as being complete.

Definition 7.36 Let E be an economy which admits some numeraire pair(N, N). The economy E is said to be complete if for every FA

T -measurablerandom variable X satisfying

EN

[∣∣∣∣ X

NT

∣∣∣∣]

< ∞

there exists some admissible trading strategy φ such that

X = φ0 · A0 +∫ T

0φu · dAu.

Page 189: Financial derivatives in theory and practice

Continuous time theory 165

Remark 7.37:When we wish to distinguish this from any other S-completenesswe will refer to it as FA

T -completeness.

Remark 7.38: This definition of completeness is by no means a standard one.It is more usual to work with the augmented Brownian filtration FW

t andthen take S to be the set of FWT -measurable random variables having finitevariance under P. Admissible strategies are taken to be FW

t -predictable andto satisfy suitable integrability conditions (usually expressed in the originalmeasure P) which ensure no arbitrage. For details of two possible choices ofadmissible strategies the reader is referred to Duffie (1996).

It is implicit, but not immediately obvious, that Definition 7.36 isindependent of the numeraire pair (N,N). This we shall shortly prove inCorollary 7.40. Corollary 7.40 also shows that admissibility of a strategy in acomplete economy, as defined in Definition 7.25, also reduces to a requirementfor a single, arbitrary numeraire pair — if (7.27) defines a martingale for onenumeraire pair then, in a complete economy, it defines a martingale for anynumeraire pair. These results both follow immediately from the stronger resultthat the Radon—Nikodym derivative connecting two EMMs in a completeeconomy is just the ratio of the corresponding numeraires.

Theorem 7.39 Let E be some economy and suppose that the followingpartial completeness result holds: for some numeraire pair (N,N), given anyrandom variable X for which X/NT is FAT -measurable and

ENX

NT<∞ ,

there exists some self-financing FAt -predictable φ such that

X = φ0 · A0 +T

0

φu · dAu ,

and

V Nt := φ0 · AN0 +t

0

φu · dANu

is a (FAt ,N) martingale (recall that full admissibility requires that V N be a(FAt ,N) martingale for all numeraire pairs (N,N)). Then if (M,M) is anyother numeraire pair, the measures N and M are related via

dMdN FA

T

=MT /M0

NT /N0P-a.s. (7.29)

As a consequence, MN is an (FAt ,N) martingale.

Page 190: Financial derivatives in theory and practice

166 Option pricing in continuous time

Conversely, if (N, N) is a numeraire pair and if M a numeraire such thatMN is an (FA

t , N) martingale, then (7.29) defines a martingale measure M

on the σ-algebra FAT with associated numeraire M.

Proof: We prove (7.29) by showing, equivalently, that

dN

dM

∣∣∣∣FA

T

=NT/N0

MT/M0P-a.s.,

and to prove this we show that, for any S ∈ FAT ,

EM

[1S

NT/N0

MT/M0

]= EM

[1S

dN

dM

].

Define a sequence of FAt stopping times Tn via

Tn = inft > 0 : NM

t > n

and letV n = 1S1T<Tn.

By the partial completeness hypothesis (applied to X = V nNT), combinedwith the unit invariance theorem, we can write

V n = φn0 · AN

0 +∫ T

0φn

u · dANu

for some self-financing FAt -predictable φn, where

Y nt := φn

0 · AN0 +

∫ t

0φn

u · dANu

is a (FAt , N) martingale.

A further application of the unit invariance theorem yields

Y nt = Y n

0 +∫ t

0φn

u · dAMu ,

where Y nt = Y n

t NMt . This representation shows Y n to be an FA

t localmartingale under M which, since |Y n

t | = |Y nt ||NM

t | ≤ n, is also an FAt

Page 191: Financial derivatives in theory and practice

Continuous time theory 167

martingale. We can conclude (remembering that FA0 is trivial) that

EM

[Y n

T

dN

dM

]= EN[Y n

T ] = Y n0 =

M0

N0Y n

0

=M0

N0EM[Y n

T ] =M0

N0EM

[Y n

T

NT

MT

]. (7.31)

Now take the limit as n → ∞. Since Y nT ≤ 1 and both dN

dMand NM

T have finiteexpectation under M (the latter being a positive local martingale integrable atzero and hence a supermartingale), we can apply the dominated convergencetheorem to both sides of (7.31) to obtain

EM

[1S

dN

dM

]= EM

[1S

NT/N0

MT/M0

],

which establishes (7.30).An immediate consequence of (7.29) is that EN[(MT/M0)/(NT/N0)] = 1

which, combined with the fact that MN is a positive (FAt , N) local

martingale (hence supermartingale), implies that MN is a true (FAt , N)

martingale.The converse part of the theorem is immediate. This yields the following important corollary.

Corollary 7.40 Suppose the hypotheses of Theorem 7.39 hold. Then thefollowing are true.

(i) If L is a (FAt , M) martingale, then LMN is a (FA

t , N) martingale.In particular, if (7.27) in the definition of an admissible trading strategyholds for one numeraire pair then it holds for all numeraire pairs.

(ii) If (N, N) and (M, M) are two numeraire pairs then

EM

[∣∣∣∣ X

MT

∣∣∣∣]

= EN

[∣∣∣∣ X

NT

∣∣∣∣]

< ∞,

so Definition 7.36 is independent of the numeraire pair (N, N) chosenin that definition.

Theorem 7.39 is half of the continuous time equivalent to Theorem 1.21 forthe discrete economy of Chapter 1. The full result, which is mathematicallyappealing but not the primary result to which we appeal in practice, is asfollows.

Theorem 7.41 Suppose the economy E admits some numeraire pair. ThenE is complete if and only if, given any two numeraire pairs (N, N1) and (N, N2)with common numeraire N, N1 and N2 agree on FA

T .

Page 192: Financial derivatives in theory and practice

168 Option pricing in continuous time

Proof: If the economy E is complete then it follows from Theorem 7.39 thatdN1dN2

∣∣∣FA

T

= 1 a.s., so N1 = N2 on FAT . Conversely, if N1 is the unique measure

on FAT such that AN is an FA

t martingale, it follows from the martingalerepresentation theorem (Corollary 5.47) that, given any (FA

t , N1) localmartingale M, there exists some FA

t -predictable φ such that

Mt = M0 +∫ t

0φu · dAN

u . (7.32)

By Theorem 7.17, φ can be assumed to be self-financing. Let X be anFA

T -measurable random variable satisfying EN1 [|X/NT|] < ∞. Setting Mt =EN1 [X/NT|FA

t ], a true martingale, (7.32) shows that the contingent claim Xcan be replicated by a self-financing trading strategy. It remains to check thatφ is admissible. But the gain process in (7.32) is a martingale under N1 sinceM is a martingale, and so it follows from Corollary 7.40 that φ is admissible.

Whereas Theorem 7.41 gives a precise characterization of when a model

is complete, it is not easy to check in practice. The question of whetherthere exists a unique EMM for the process AN on the filtration FA

t is non-standard, the more usual question being whether the EMM is unique whenworking on the filtration FAN

t , which may be less refined. Indeed, some of theessential intuition and structure is masked in the apparently simple statementof Theorem 7.41.

Theorem 7.43 below addresses the issues above. It gives criteria forcompleteness which are much more easily checked in practice and which arestated in terms of the process A rather than AN. What becomes apparent alsois the extra requirement, the existence of a finite variation numeraire, which isa consequence of the fact that any replicating strategy must be self-financing.

We shall need the following lemma in the proof of Theorem 7.43. It is similarin character to Theorem 7.17 and shows how the self-financing property ofa strategy leads to an alternative form of representation result for localmartingales.

Lemma 7.42 Let E be a complete economy and (N, N) any associatednumeraire pair. Then for any (FA

t , N) local martingale M there exists someself-financing FA

t -predictable φ such that

Mt = M0 +∫ t

0φu · dAN

u .

Proof: Let Tn be a reducing sequence for the local martingale M anddefine Mn := MTn . Since Mn is a true martingale under N, EN

[∣∣∣MnTNT

NT

∣∣∣] =EN [|Mn

T |] < ∞ so, by completeness,

MnT NT = φn

0 · A0 +∫ T

0φn

u · dAu

Page 193: Financial derivatives in theory and practice

Continuous time theory 169

for some admissible φn, and by the unit invariance theorem,

MnT = φn

0 · AN0 +

∫ T

0φn

u · dANu . (7.33)

The integral on the right-hand side of (7.33), with the upper limit beingreplaced by the variable t, is a martingale (since φn is admissible), so takingexpectations on both sides of (7.33), conditional on FA

t , we can conclude that

Mnt = M0 +

∫ t

0φn

u · dANu .

Defining the strategy φ via

φu(ω) = φn∗

u (ω),

wheren∗(ω, u) = minn : u ≤ Tn(ω),

yields

Mnt = M0 +

∫ t∧Tn

0φu · dAN

u .

Letting n ↑ ∞ now yields the result. We now come to the result on completeness to which we most often appeal

in practice.

Theorem 7.43 Let E be an economy admitting some numeraire pair andsuppose that weak uniqueness holds for the SDE (7.1) (i.e. any two solutionprocesses must have the same law). Then the following are equivalent.

(i) E is complete.(ii) There exists some FA

t -predictable φ satisfying:(a) φt · σ(At, t) = 0 for all t a.s.;(b) φt · At > 0 for all t a.s.;(c) −∞ <

∫ t0

φu

φu·Au· dAu < ∞ for all t a.s.

(iii) There exists some finite variation numeraire Nft .

Remark 7.44: When one of the assets in an economy is of finite variation andpositive it is common practice to take this asset as the numeraire. In thesecircumstances the converse part of Theorem 7.39 ensures that if an EMMexists for some numeraire then one will exist for the finite variation numeraire.However, this is not true in general when there is no finite variation asset.Part (iii) of the above theorem does not guarantee the existence of an EMMcorresponding to the finite variation numeraire Nf. In particular, it is notnecessarily the case that Nf is a price process (i.e. that it can be replicatedby an admissible strategy). Example 8.11 provides such an example.

Page 194: Financial derivatives in theory and practice

170 Option pricing in continuous time

Remark 7.45: When working with the Brownian filtration FWt , establishingthat the economy is FWT -complete in the sense of Remark 7.38 reduces tochecking that rank(σt) = d (d being the dimension of the Brownian motionW ) for all t a.s., under the assumption that there is a finite variation assetin the economy. If σ does have full rank d then FAt = FWt and so theeconomy will also be complete in the sense discussed above. Clearly we canhave an economy which is complete working with the FA

t filtration but notwith respect to the Brownian filtration FW

t . But it is the FAt filtrationthat makes sense from a modelling perspective. Many models specified by anSDE for which weak uniqueness holds admit a finite variation numeraire, andthe treatment here shows that these economies are complete in a sense whichis all that matters in practice.

Proof:(i)⇒(ii): Let (N,N) be some numeraire pair for the economy E , and Nt =αt · At for some FAt -predictable process α. Then, by the uniqueness of theDoob—Meyer decomposition, we can write the local martingale part under Nof the numeraire N as

N loc,Nt =

t

0

αu · dAloc,Nu .

Since E is complete it follows from Lemma 7.42 that we can find some FAt -

predictable self-financing ψt such that

N loc,Nt =

t

0

ψu · dANu = ψt ·ANt .

We now show that

φt = αt − ψNt + (ψNt ·ANt )αt (7.34)

has the required properties in part (ii) of the theorem.Predictability is immediate. To check (a), it is clearly sufficient to prove

φt · dAloc,Nt = 0 .

Noting that

dAt = d(NtANt ) = NtdA

Nt +A

Nt dNt + dNtdA

Nt

and appealing to the Doob—Meyer decomposition yields

ANt αt · dAloc,Nt = dAloc,Nt −NtdANt .

Page 195: Financial derivatives in theory and practice

Continuous time theory 171

With φ as in (7.34) we have

φt · dAloc,Nt = αt · dAloc,Nt − ψNt · dAloc,Nt + (ψNt · ANt )αt · dAloc,Nt

= αt · dAloc,Nt − ψNt · dAloc,Nt + ψNt · (dAloc,Nt −NtdANt )= αt · dAloc,Nt − ψt · dANt= 0

as required. Finally, note that α is self-financing, and thus

φt ·At = αt · At − ψNt · At + (ψNt ·ANt )αt · At = Nt .

Property (b) follows from the positivity of Nt.To establish property (c), the reader can check using the self-financing

property of ψ and α that

φtφt ·At · dAt =

φtNt· dAt

=dNtNt− (dNt)

2

N2t

− dNloc,Nt

Nt

= d(logNt)− d (logNt)loc,N .

Property (c) is now immediate from the finiteness and positivity of Nt.

(ii)⇒(iii): Given φt satisfying (a),(b) and (c) of (ii), define

φ∗t =φt

φt · Atand

Nft = exp

t

0

φ∗u · dAu . (7.35)

By Property (a), N f is of finite variation and is a.s. positive for all t. It remainsto check N f has the form of a numeraire. But trivially

Nft = (N

ft φ∗t ) · At ,

and, since φt · dAloc,Nt = 0, we have, by applying Ito’s formula to (7.35),

dNft = N

ft φ∗t · dAt .

We are done.

Page 196: Financial derivatives in theory and practice

172 Option pricing in continuous time

(iii)⇒(i): Let (N,N) be some numeraire pair for the economy E and let X beany FAT -measurable random variable satisfying

ENX

NT<∞ .

Consider the FAt martingale under N,

Mt = ENX

NTFAt .

By Theorem 6.38 there exists some FAt -predictable η such that

Mt = ENX

NT+

t

0

ηu · dAloc,Nu .

In particular,X

NT= EN

X

NT+

T

0

ηu · dAloc,Nu . (7.36)

We now seek an expression for the local martingale part of A under N. Recallthat

dAt = d(NtANt ) = A

Nt dNt +NtdA

Nt + dNtdA

Nt . (7.37)

The second term in (7.37) is a local martingale under N and the last term isof finite variation, so we can deduce that

dAloc,Nt = (ANt dNt)loc,N +NtdA

Nt . (7.38)

To find the local martingale part of ANdN , write

Zt =Nft

Nt

and observe thatANt dNt = A

Nf

t ZtdNt .

By Ito’s formula

dNft = ZtdNt +NtdZt + dZtdNt,

and so(ANt dNt)

loc,N = −ANf

t NtdZt .

Subsituting into (7.38) now gives

dAloc,Nt = NtdANt −AN

f

t NtdZt . (7.39)

Page 197: Financial derivatives in theory and practice

Continuous time theory 173

Since Nf is a numeraire, unit invariance implies that we can find an FAt -

predictable αf such thatdZt = αf

t · dANt

and thus (7.39) becomes

dAloc,Nt = NtdAN

t − ANf

t Ntαft · dAN

t .

Substituting into (7.36), we have

X

NT= EN

[X

NT

]+

∫ T

0Nu[ηu − (ηu · ANf

u )αfu] · dAN

u .

Thus by Theorem 7.17, the martingale property of M and Corollary 7.40, wecan find an admissible strategy replicating X and so we have shown that E iscomplete.

Our results from Chapter 6 for locally Lipschitz SDEs provide sufficientconditions for weak uniqueness to hold for an SDE and thus we have thefollowing very useful corollary.

Corollary 7.46 If the SDE (7.1) satisfies the local Lipschitz conditions(6.23) and the linear growth restriction (6.24), then the economy E is completeif and only if there exists some finite variation numeraire.

Proof: Theorems 6.12 and 6.31 show that weak uniqueness holds for (7.1),so the result follows from Theorem 7.43.

7.3.6 Pricing kernels

In Chapter 1 we introduced the idea of a pricing kernel and only later didwe meet numeraires. We saw there the close relationship between pricingkernels and numeraire pairs (although we did not use this term at that stage).We have, in this chapter, chosen to develop the theory using numeraires inpreference to pricing kernels because of the intuitively appealing interpretationof a numeraire as a unit, because of results such as Theorem 7.43, and becauseit is the most common way models and pricing results are presented in thefinance literature. However, the theory can be developed and presented interms of pricing kernels, and Chapter 8 is an occasion when using pricingkernels is sometimes more convenient. Hence we establish the link betweenthe two approaches.

To begin we must define what we mean by a pricing kernel.

Definition 7.47 Let E be an economy defined on the probability space(Ω, Ft,F , P). We say that the strictly positive, (FA

t , P) continuoussemimartingale Z is a pricing kernel for the economy E if the process ZAis an (FA

t , P) martingale.

Page 198: Financial derivatives in theory and practice

174 Option pricing in continuous time

The following theorem demonstrates the equivalence of, on the one hand,a theory developed using pricing kernel terminology and ideas and, on theother, one developed using numeraires.

Theorem 7.48 Let E be an economy defined on the probability space(Ω, Ft,F ,P). Then:(i) E admits a numeraire pair if and only if there exists a pricing kernel;(ii) for a self-financing trading strategy φ,

·0 φu · dANu is an (FAt ,N)

martingale for all numeraire pairs (N,N) if and only if ·0φu · d(ZA)u

is an (FAt ,P) martingale for all pricing kernels Z;(iii) any two numeraire pairs, (N,N1) and (N,N2), with common numeraire

N agree on FAT if and only if any two pricing kernels, Z (1) and Z(2),

agree up to FAT (i.e. EP[Z(1)T − Z(2)T |FAT ] = 0 a.s.).

Remark 7.49: What Theorem 7.48 shows is that all the results regardingarbitrage, admissible strategies and completeness carry over immediately toa corresponding result for pricing kernels (see Theorems 7.32 and 7.41 for therelevant results stated in numeraire form).

Proof:Proof of (i): Suppose (N,N) is a numeraire pair for E . Define the a.s. positive,FAt -adapted process Z = (Nρ)−1 where

ρt :=dPdN FA

t

= ENdPdN

FAt .

Then

EP (ZA)T |FAt =EN (ZA)T dP

dN |FAtEN dP

dN |FAt

=ENEN (ZA)T dP

dN |FAT |FAtρt

=EN[ANT |FAt ]

ρt

=ANtρt

= (ZA)t .

A similar argument shows that EP[|(ZA)t|] < ∞ and ZA is clearly FAt -

adapted. Thus ZA is an (FAt ,P) martingale and Z is a pricing kernel.Conversely, suppose Z is a pricing kernel for E . Define a numeraire N as

follows. Let

it := min i ∈ 1, . . . , n : |A(i)t | ≥ |A(j)t |, for all 1 ≤ j ≤ n ,

Page 199: Financial derivatives in theory and practice

Continuous time theory 175

the index of the asset which has the price at t with maximum absolute value.Define the sequence of FA

t stopping times τm and the index It via

τ0 = 0, τm = inft > τm−1 : |A(Iτm−1 )t | < 1

2 |Aitt ,

It = iτk(t) where k(t) is such that τk(t) ≤ t < τk(t)+1.

That is, we construct the index process I and the sequence τm inductivelyas follows. Start with I0 being the index of the asset with price of maximumabsolute value and τ0 = 0. Wait until this asset price is less than half the value(in absolute terms) of the new maximum price, then switch the index I to thecurrent maximum priced asset. The time of this switch is τ1. Continue in thisway. Now take the numeraire N to be

Nt :=

∣∣∣∣∣∣A(It)t

k(t)∏j=1

A(Iτj−1 )τj

A(Iτj

)τj

∣∣∣∣∣∣ =(

12

)k(t)

|A(It)t |,

the portfolio produced by starting with a unit amount of the most valuableasset (in absolute terms) and switching to the most valuable asset (again inabsolute terms) at each FA

t stopping time τm.The process ZA is an (FA

t , P) martingale so ZN is an (FAt , P) local

martingale. The sequence τm is a localizing sequence for NZ. Furthermore,for all m > 0, (ZN)τm

t ≤∑n

i=1 |ZtA(i)t | and

∑ni=1 EP[|ZtA

(i)t |] < ∞ so the dom-

inated convergence theorem (applied to ZtNτmt , m ≥ 1) ensures that ZN is a

true (FAt , P) martingale. Defining the measure N via

dN

dP:=

(NZ)T

(NZ)0,

it follows easily that (N, N) is a numeraire pair.

Proofs of (ii) and (iii): The proofs of (ii) and (iii) are relatively straightforwardusing the construction in the proof of (i). We prove (iii) in one direction forillustration. Suppose N1 = N2 on FA

T whenever (N, N1) and (N, N2) are twonumeraire pairs with common numeraire N, and let Z(1) and Z(2) be twopricing kernels. Define the measures N1 and N2 by

dNi

dP:= NTZ

(i)T , i = 1, 2, (7.40)

where the numeraire N is as constructed in the proof of (i). As shownthere, (N, N1) and (N, N2) are both numeraire pairs, thus N1 = N2 on FA

T

by hypothesis. It follows from (7.40) that Z(1) = Z(2) up to FAT , as required.

The equivalence of the pricing kernel and numeraire approaches also leads

to a pricing kernel valuation formula. We shall appeal to this in Chapter 8.The proof follows immediately from Theorem 7.48 and Corollary 7.34, themartingale valuation formula using numeraires.

Page 200: Financial derivatives in theory and practice

176 Option pricing in continuous time

Corollary 7.50 Let E be an economy with a pricing kernel Z and let VT besome attainable contingent claim. Then Vt admits the representation

Vt = Z−1t EP VTZT |FAt .

7.4 EXTENSIONS

The theory developed in Section 7.3 is restrictive in several senses. Inparticular,

(i) derivatives were restricted to making a single payment, VT , at thedeterministic terminal time T ;

(ii) the payment amount VT is an FAT -measurable random variable and nocontrol can be applied by the purchaser of the option;

(iii) the economy comprises only a finite number of assets, each of which is acontinuous process which pays no dividends;

(iv) the economy and trading strategies are defined only over a finite timeinterval [0, T ].

Each of these restrictions can be removed in a more general theory. For theapplications treated in this book we will need some of these generalizationsbut not others. Below we discuss each in turn and show how to carry out theextension or point towards other references which do so.

7.4.1 General payout schedules

The obvious extension of the payout schedule treated in Section 7.3 (a singlepayment at T ) is to replace it by a payout process Ht, 0 < t ≤ T, where His some FAt -adapted, cadlag semimartingale, Ht being the total paymentsmade up to and including time t. To treat a payoff as general as this wewould need to have defined the stochastic integral for a general semimartingaleintegrator. This is done, for example, in Chung and Williams (1990), Elliott(1982) and Protter (1990).We shall not need this generality. For our purposes it will be sufficient to

consider processes H of the form

Ht = Ct +

m

i=1

V (i)1t≥τi

where C is some continuous FAt -adapted process, each τi is an FAt

stopping time and each V (i) is FAτi -measurable. Because option valuationis linear, to treat this type of payoff it is sufficient to separately value acontinuous payment stream C and a single FA

τ -measurable payment V madeat some FAt stopping time τ .

Page 201: Financial derivatives in theory and practice

Continuous time theory 177

The need to treat a continuous payment stream arises from futurescontracts. This is our only application of this and so we defer a discussion untilthen, Chapter 20. This leaves the problem of valuing a single FA

τ -measurablepayment V made at the FA

t stopping time τ ≤ T . Two important examplesof products of this type are barrier rebates, payments of known amount paidat the moment an asset price hits some barrier level, and American options,which we discuss in more detail in Section 7.4.2 below.

Let (N, N) be some numeraire pair for the economy E . Given any (V, τ )with

EN

[∣∣∣∣ V

∣∣∣∣]

< ∞,

we can define

VT := V +∫ T

0

(V

)1u(τ, T ]dNu = V

NT

Nτ. (7.41)

Note that the integrand in (7.41) is FAt -predictable (being left-continuous

with right limits and adapted) so the integral in (7.41) is a standard stochasticintegral. The payoff VT is precisely that which results from investing the payoffV, made at τ , in the numeraire strategy N.

In Section 7.3 we treated payoffs VT for which

EM

[∣∣∣∣ VT

MT

∣∣∣∣]

< ∞,

for all numeraire pairs (M, M). Note that with V and VT defined as above,for any FA

t stopping time τ and any numeraire pair (M, M),

EM

[∣∣∣∣ VT

MT

∣∣∣∣]

= EMEM

[∣∣∣∣ V

NT

MT

∣∣∣∣∣∣∣∣FA

τ

]≤ EM

[∣∣∣∣ V

∣∣∣∣]

, (7.42)

(the inequality following by applying (the conditional form of) Fatou’s lemmato the positive local martingale M/N ). Thus if EM[| V

Mτ|] < ∞ for all numeraire

pairs we can price this option using the theory of Section 7.3. If the economyis complete then the inequality in (7.42) is an equality and (V, τ ) can bereplicated if and only if (VT, T ) can be replicated.

We conclude from the above discussions that, by immediately investingany payment in the numeraire N, the valuation of (V, τ ) is identical to thevaluation of (VT, T ), and this latter problem was the topic of Section 7.3. Themartingale valuation formula becomes, for t ≤ τ ,

Vt = NtEN

[V

∣∣∣∣FAt

].

Page 202: Financial derivatives in theory and practice

178 Option pricing in continuous time

7.4.2 Controlled derivative payouts

A particularly important problem is the valuation of derivatives whose payoutdepends on actions taken by the buyer of the option (the more general casewhere both buyer and seller can apply controls can also be handled, but weshall not need these results). An American put option, for example, allows theoption holder to receive the amount (K − Sτ )+ at time τ , where K is someconstant, S is the price of a stock and the stopping time τ is chosen by theholder of the option.More generally, the control problem can be formulated as follows. Let U be

the set of controls (for the American option this would be the set of all FAt

stopping times). By the techniques of Section 7.4.1 we can assume a singlepayment VT is made at time T . For each u ∈ U let VT (u) denote the paymentmade at T if the option holder follows the strategy u. The set of controlsmust be such that, for each u ∈ U , VT (u) is an attainable claim and this willimpose constraints on the set U . For example, in the case of an Americanoption the time τ must be an FAt stopping time and not a more generalFWt stopping time. Let Vt(u) be the value at t of the payout VT (u) and let

V ∗0 := supu∈U

V0(u) .

Suppose, for some u∗ ∈ U , V0(u∗) = V ∗0 . Then (subject to regularity

conditions on the set U as described below), we claim, VT is attainable, hasinitial value V0 = V

∗0 , and is replicated by the strategy φ(u

∗) which replicatesVT (u

∗). To see this, suppose that we buy the option for V < V ∗0 . We chooseto follow the strategy u∗ and simultaneously, at a cost of V0(u∗), replicate thepayoff VT (u

∗) dynamically. Doing so, we have a time zero profit of V0(u∗)−Vand no other cashflows — an arbitrage. Thus V0 ≥ V (u∗) = V ∗0 .To see that V0 ≤ V (u∗) is a little more involved. Suppose we sell the

option for V > V ∗0 . We must assume that the set of strategies, and theinformation available to us about the strategy which the option holder isfollowing, together are sufficient for us to be able to follow a strategy whichperforms no worse than that of the option holder. This will not be the casein general. One example where it is the case is if the holder must declare thestrategy u at time zero, in which case we can replicate the payout and receivean arbitrage profit of V −V (u) ≥ V −V ∗0 at time zero. A second more realisticand more important example is the American option. In this case, if τ is anarbitrary FA

t stopping time and if τ∗ is the optimal stopping time,

P Vt(τ∗) ≥ Vt(τ) for all t = 1,

so we can make a profit of V − V (τ ∗) at a time zero and an additional profitof Vτ (τ

∗) − Vτ (τ) at the holder’s exercise time τ . Once again this, would bean arbitrage and thus we can conclude that V0 ≥ V ∗0 .

Page 203: Financial derivatives in theory and practice

Extensions 179

There are two issues not addressed in the discussions above. The first is thatthe supremum over u ∈ U may not be attainable. In this case the payoff is notattainable and cannot be valued. The second, one of considerable theoreticaland practical importance, is how we find the optimal strategy u∗ for anyparticular problem. This topic, optimal control theory, is not one we shalltackle, but Karatzas and Shreve (1998) and Oksendal (1989) are suitableplaces to look.

7.4.3 More general asset price processes

There are several ways to generalize the model for the assets in the economy.We have assumed that all price processes are continuous semimartingalesdriven by a finite number of Brownian motions. This can be generalizedto the case of a general semimartingale. Continuous models driven by morethan a finite number of Brownian motions have been considered by, amongstothers, Kennedy (1994) and Jacka et al. (1998). Models which allow jumps areparticularly important when modelling assets which are subject to default riskand thus which can suddenly devalue. References covering this more generalsituation include Jarrow and Turnbull (1995), Jarrow and Madan (1995) andDuffie and Singleton (1997).A second restriction we have imposed on the assets in an economy is that

they do not pay dividends. This is a significant omission as most stocks paydividends. However, these often fall under the results of Section 7.3 as we nowshow.Suppose A is the price process, with jumps, of the dividend-paying assets

of an economy. Suppose the dividends are paid at times T1, . . . , Tm, and thedividend amounts at Ti form an FATi-measurable random vector Gi. Let

At := At +m

i=1

Gi1t≥TiNtNTi

where N is some numeraire. We assume the process A is continuous. We can

interpret A(i)t as the value at t of buying and holding the asset A(i) at time

zero and reinvesting all dividends in the numeraire N . If we treat A as the

‘true’ asset price process of the economy then, noting that FAt = F At , all the

results of Section 7.3 apply if we consider the process A in preference to A.This analysis does not treat the case when A is not continuous (so the asset

prices are discontinuous even after correcting for dividend payments) or whenthe dividends, unrealistically, are a continuous cashflow stream. The formercase is beyond the scope of this book. We do not discuss the latter, but aformal treatment can be developed easily from the ideas of Section 7.4.1 andChapter 20.The final extension we shall consider is to the case of a continuum of assets

Page 204: Financial derivatives in theory and practice

180 Option pricing in continuous time

rather than just finitely many. This is relevant to the modelling of interestrate products and is the subject of Chapter 8.

7.4.4 Infinite trading horizon

There are some situations in which we would like the economy to be definedover an infinite horizon. We shall meet one such situation in the next chapterwhere we discuss term structure models. Doing this introduces a few extraproblems, as the following example illustrates.

Example 7.51 Return to the basic Black—Scholes economy introduced inExample 7.1 and suppose that µ ≥ r. Define the stopping time τ to be

τ := inft > 0 : Stexp(−rt) > 2S0 .It follows from Theorem 2.8 that τ <∞ a.s. and thus the strategy of buyingone unit of stock, S, and selling one unit of bond, D, at time zero yields aguaranteed profit of S0 at τ with no risk. Similarly, if µ < r the strategy ofbuying one unit of bond and selling one unit of stock at time zero and holdingthis portfolio until

η := inft > 0 : Stexp(−rt) < S0/2guarantees a riskless profit of S0/2 at η <∞.A strategy such as the one above is an arbitrage and as such presents

problems when we come to pricing. There are two obvious ways to dealwith this problem. The first is to eliminate the strategy completely from theeconomy by modifying our admissibility definition (Definition 7.25) and thatof a numeraire pair to insist that all gain processes are uniformly integrablemartingales (recall that if M is a UI martingale then it follows from theoptional sampling theorem that E[Mτ ] = M0 for all stopping times τ). If weadopt this approach the theory of this chapter is altered only by replacing thefinite time interval [0, T ] by the (closed) time interval [0,∞].The problem with adopting the approach above is that it would rule out

many otherwise acceptable models, such as the Black—Scholes model, andmay be too restrictive for many applications. We shall adopt an alternativeapproach, one which effectively reduces the general modelling and valuationproblem to the case we have already considered. We allow the model, theeconomy and trading strategies to be defined over the time horizon [0,∞),thus incorporating the yield curve models introduced in Chapter 8, butwill only value derivatives for which the payout is made on or before somepredetermined time T , a time that will depend on the derivative in question.This is sufficiently general to treat all derivatives of practical importance.If we adopt this latter approach we are effectively reduced to the framework

of Section 7.3. One could (we shall not) introduce the terminology T -admissible (admissible when the strategy and economy are considered only on

Page 205: Financial derivatives in theory and practice

Extensions 181

[0, T ]), T -complete (complete when the strategy and economy are consideredonly on [0, T ]) and locally complete (T -complete for every T ). An economysuch as this would admit an arbitrage in the sense of Example 7.51 but thiswould not violate any of the replication arguments used to price a derivativewhich must settle on or before T <∞.

Page 206: Financial derivatives in theory and practice
Page 207: Financial derivatives in theory and practice

8

Dynamic Term Structure Models

8.1 INTRODUCTION

The theory of Chapter 7 applies to models for which there are only finitelymany assets. The main applications treated in this book are pricing problemsfor interest rate derivatives, in which case the underlying assets are purediscount bonds. One can define a pure discount bond for each and everymaturity date, a continuum of assets, and in this context the theory ofChapter 7 may be inadequate. In actual fact, for the vast majority ofapplications it is only necessary to model a (small) finite number of thesebonds and in this sense the finite-asset theory is usually sufficient even forinterest rate products. However, there are occasions, the valuation of futurescontracts being a case in point, when it is convenient to model an infinitenumber of bonds. Furthermore, there is considerable economic and theoreticalunderstanding to be gained from studying models of the whole term structure(of interest rates), and it is important to know even for models involvingfinitely many bonds that there is a (realistic) whole term structure modelconsistent with the model. This is the topic of this chapter.

Section 8.2 briefly describes how, by restricting the class of admissibletrading strategies, a study of derivative valuation for an economy comprising acontinuum of bonds can effectively be reduced to the finite-asset case studiedin Chapter 7. Then, in Section 8.3, term structure models are studied indetail, at the abstract level. Several ways to specify a term structure modelare presented, including the celebrated Heath-Jarrow-Morton framework, andthese are related to each other. Very little is said about particular models,this discussion being deferred until Part II.

8.2 AN ECONOMY OF PURE DISCOUNT BONDS

As was the case with finitely many assets, to specify an economy of purediscount bonds we must define both the joint law of all the assets within

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 208: Financial derivatives in theory and practice

184 Dynamic term structure models

the economy and the set of admissible trading strategies. One could do thisin considerable generality. For example, Kennedy (1994) considers a generalclass of Gaussian term structure models which includes models generated bya Brownian sheet, and Jacka et al. (1998) also consider continuous modelsmore general than those generated by only finitely many Brownian motions.Neither of these, however, discusses admissible strategies and questions ofcompleteness. The latter issues are considered by Bjork et al. (1997a; 1997b)who develop term structure models driven by a finite number of Brownianmotions and a marked point process. A trading strategy is then a predictableprocess L with Lt being a signed finite Borel measure on [t,∞), Lt(A)representing the holding at t of bonds with maturities in the set A ∈ B([t,∞)).This approach is mathematically very appealing, but we shall not have needof such generality.In keeping with the continuous theory of Chapter 7 we shall develop term

structure models under the following assumption (which, through the filtrationrestriction, is stronger than the conditions imposed in Chapter 7):

the underlying ‘real-world’ probability space (Ω, Ft,F ,P) satisfies the usualconditions, supports a d-dimensional (Ft,P) Brownian motion W , andthe filtration Ft is the (augmented) natural filtration generated by W ,Ft = FWt .Now denote by DtT the value at t of a pure discount bond with maturity T ,

an asset which pays a unit amount on its maturity date. We shall, in Section8.3, describe in detail several ways to model the complete term structure ofpure discount bonds, DtT : 0 ≤ t ≤ T <∞. If, in keeping with the notationof Chapter 7, we denote by FA

t the filtration generated by the asset pricesof the economy,

FAt := σ(DuT : 0 ≤ u ≤ t, u ≤ T <∞) ,

then every model we shall develop has the following four properties:

Definition 8.1 A term structure (or (TS)) model is a model of the dynamicsof the discount curve, D·T : 0 ≤ T <∞, with the following properties:(TS1) For all T ≥ 0, the process DtT , t ≤ T, is a continuous semimartingale

with respect to the filtration Ft.(TS2) DtT ≥ 0 for all t ≤ T .(TS3) DTT = 1 for all T .(TS4) The model admits a pricing kernel: there exists some strictly positive

continuous semimartingale Z such that ZtDtT is an (FAt ,P)martingale for all T ≥ 0.

The focus of this book is on models in which asset prices are continuousand which we can integrate against (so that the gain from trading is welldefined), thus (TS1) is a natural assumption for us to make (recall Theorem

Page 209: Financial derivatives in theory and practice

An economy of pure discount bonds 185

4.27). Conditions (TS2) and (TS3) are obviously required. The final condition,(TS4), requires a little more comment. We saw for the discrete economy ofChapter 1 that, in that context, the existence of a pricing kernel is equivalentto the absence of arbitrage (Theorem 1.7). We saw in Chapter 7 that, aslong as we eliminate troublesome ‘doubling strategies’, the existence of anumeraire pair (which, over a finite time horizon, is equivalent to the existenceof a pricing kernel) implies no arbitrage (Theorem 7.32), and we commentedthat the absence of an ‘approximate arbitrage’ implies the existence of anumeraire pair (Remark 7.31). Condition (TS4) is thus a natural conditionto impose and will guarantee the absence of arbitrage (assuming, as before,that doubling strategies are removed). However, there is currently no proofthat no approximate arbitrage implies the existence of a numeraire pair underwhich all bonds are simultaneously martingales.Given a model for the asset price evolution satisfying (TS1)—(TS4), we can

proceed to define an admissible trading strategy for the economy which mirrorsthe one used in Chapter 7. Note the significant restriction to the generalityof an admissible trading strategy imposed by condition (i) of Definition 8.2below. Note also that, whereas in Chapter 7 the definition of an admissiblestrategy was stated in terms of numeraire pairs, here it is presented in termsof pricing kernels. We showed in Section 7.3.6 that the two are equivalent overa finite horizon. Here we prefer to use pricing kernels rather than numerairepairs because of the technical issues which arise when considering a modelover the infinite time horizon [0,∞) (see Theorem 8.9).

Definition 8.2 An admissible trading strategy for the term structure econ-omy E is an FAt -predictable vector process φ = (φ(1),φ(2), . . . ) and a se-quence of times (not necessarily increasing) (T1(φ), T2(φ), . . . ) such that:

(i) for each T > 0, there exists some M(φ, T ) such that φ(i)t = 0 for all

i > M(φ, T ), t ≤ T ;(ii) the strategy is self-financing,

Vt =

i=1

φ(i)t DtTi(φ) =

M(φ,t)

i=1

φ(i)t DtTi(φ)

=

M(φ,t)

i=1

φ(i)0 D0Ti(φ) +

t

0

M(φ,t)

i=1

φ(i)uTi(φ)

dDuTi(φ),

where Vt is value at t of the portfolio corresponding to the strategy φ;(iii) for all pricing kernels Z, the kernel-rescaled gain process

(ZGφ)t =t

0

M(φ,t)

i=1

φ(i)u d(Z·D·Ti(φ))u

is an (FAt ,P) martingale.

Page 210: Financial derivatives in theory and practice

186 Dynamic term structure models

Remark 8.3: We have imposed the condition on the trading strategy φ thatit involves only a predetermined and finite set of bonds up to any time T .This reflects the way these models are used in practice — for most derivativesthere is a known finite set of bonds which is required to price and hedge theproduct. In this context the existence of a continuum of bonds is more of aconsistency condition on the model than an essential requirement.

Remark 8.4: Under the terms of an interest rate futures contract, discussedin Chapter 20, cashflows occur throughout the life of the contract and withat most one cashflow each day. As such, a trading strategy need only involvefinitely many bonds (one per day until the futures expiry date). However,from a modelling perspective it is convenient to treat a futures contract as ifresettlement occurred continuously, just as we allow continuous rebalancing inthe definition of a trading strategy, something which is impossible in practice.To replicate a (theoretical) futures contract such as this would, in a generalterm structure model, require trading in a continuum of bonds, and sucha strategy is not within the scope of the development here. However, thedegeneracy of the models we consider, in which a continuum of bonds is drivenby finitely many Brownian motions, will allow us to overcome this problem.

With the above definition of an admissible strategy the martingale propertyof the gain process precludes any arbitrage within the economy, exactly asit did when the economy only had finitely many assets. Furthermore, themartingale valuation formula for an attainable claim is also valid (Corollaries7.34 and 7.50). The remaining question is: under what conditions is theeconomy complete? The following result is adequate for all the applicationswe have encountered.

Theorem 8.5 Let E be a term structure economy defined on the probabilityspace (Ω, Ft,F ,P), with asset filtration FA

t and admitting a pricingkernel Z. Suppose that, for each T > 0, there exists a finite set of times(T1, . . . , Tn(T )) such that, for all t ≤ T ,

FAt = σ(DtT1 , . . . , DtTn(T)) . (8.1)

Then the economy is FAT -complete if any two pricing kernels Z1 and Z2 forthe finite economy comprising the bonds (DtT1 , . . . , DtTn(T )) agree on FAT (orequivalently, if, whenever (N,N1) and (N,N2) are two numeraire pairs withcommon numeraire N , N1 and N2 agree on FA

T ).Suppose, further, that for each T > 0 the process Dt := (DtT1 , . . . , DtTn(T))

satisfies an SDE of the form

dDt = µ(Dt, t) dt+ σ(Dt, t) dWt (8.2)

for some Borel measurable µ and σ, and that weak uniqueness holds for theSDE (8.2). Then the economy is complete if either of the two following con-ditions holds:

Page 211: Financial derivatives in theory and practice

Modelling the term structure 187

(i) There exists some FAt -predictable φ satisfying:

(a) φt · σ(Dt, t) = 0 for all t a.s.;(b) φt · Dt > 0 for all t a.s.;

(c) −∞ <∫ t

0φu

φu · Du· dDu < ∞ for all t a.s.

(ii) There exists some finite variation numeraire Nft .

Proof: Consider the finite economy comprising the assets (DtT1 , . . . , DtTn(T ) ).That this economy is FA

T -complete follows from the filtration restriction (8.1)and Theorems 7.41 and 7.43 (see Theorem 7.48 and Remark 7.49 for thepricing kernel statement of these numeraire-pair results). This immediatelyimplies that the term structure economy is FA

T -complete since replication canbe performed with this finite set of assets. (Note that Theorem 7.39 ensuresthat the larger set of numeraires in the term structure economy, which couldpotentially have made a strategy which is admissible for the finite economyinadmissible for the term structure economy, presents no problem.)

Remark 8.6: If the conditions of Theorem 8.5 are satisfied the model iscomplete and replication over the interval [0, T ] can be performed by a givenfinite set of bonds. In particular, a unit payoff at any time S < T can bereplicated. Thus the zero coupon bond of maturity S (which is not in the set(T1, T2, . . . , Tn(T )) is redundant. Clearly this is not the case in practice andcare must be taken when using a (necessarily simple) model to ensure thatan inappropriate replicating portfolio is not used (for example, replicatinga one-week bond with a combination of bonds with maturities all over tenyears!).

8.3 MODELLING THE TERM STRUCTURE

The remainder of this chapter is devoted to discussing eight closely relatedways to specify a model for the term structure satisfying (TS1)-(TS4), andit is based primarily on the papers of Baxter (1997) and Jin and Glasserman(2001). Each of these techniques has its place in the development of the theoryor as a tool for applications.

In this introduction we will briefly introduce each technique and summarizethe relationships between them. In the remaining sections we shall go intomore detail on each and establish these connections. We will, in passing,mention some of the most important example models but will not study themin detail until Part II of the book when they will be introduced in the contextof an appropriate application. The eight techniques are as follows.

(PDB) Pure discount bond : The most direct way to specify a term structuremodel is by explicitly stating the law of the pure discount bonds or anSDE that they satisfy. It is also common to give an SDE for DtT/Nt,

Page 212: Financial derivatives in theory and practice

188 Dynamic term structure models

0 ≤ t ≤ T <∞, where N is some numeraire process. In the latter situationthe numeraire is often a given bond, and the model specification is usuallyincomplete in that the law of the numeraire bond is not given and the economyis only modelled until some fixed time T because this is all that is needed forthe application to hand.Clearly all models satisfying (TS1)—(TS4) can be specified in this way. Any

model which satisfies (TS1)—(TS4) which is specified this way is said to be(PDB).

(PK) Pricing kernel: Take any strictly positive continuous semimartingaleZ ∈ L1(Ω, Ft,F ,P) and define

DtT = Z−1t EP[ZT |Ft] . (8.3)

The process Z may be specified either explicitly or implicitly, depending onthe application.One might think of generalizing this class by replacing the ‘real-world’

measure P in (8.3) by some other locally equivalent measure Z, somethingwe shall do frequently in the models below. For a (PK) model, however, this

offers no generalization since, if (Z,Z) is such that

DtT = Z−1t EZ[ZT |Ft]

for all t ≤ T , thenDtT = Z

−1t EP[ZT |Ft] ,

where

Zt := ZtdZdP Ft

.

(N) Numeraire: Define the law of some asset or, more generally, somenumeraire, N , in a martingale measure N which is locally equivalent to P,then extend to a definition for the whole term structure via

DtT = NtEN[N−1T |Ft] .

Note that under this approach, which is similar to (PK), there will beadditional restrictions on N . For example, if we claim that N is the valueof a bond maturing at T then NT = 1.

(FVK) Finite variation kernel: This is the subclass of (PK) for which the(continuous) pricing kernel is also taken to be a finite variation process. Wewill denote such a kernel by ζ rather than Z to distinguish it, and so (8.3)becomes

DtT = ζ−1t EZ[ζT |Ft] .

Page 213: Financial derivatives in theory and practice

Modelling the term structure 189

Observe that the real-world measure P in (8.3) is here replaced by somearbitrary Z which is locally equivalent to P. It was not necessary, for a general(PK) model, to allow this extra freedom in model specification because achange of measure from an arbitrary Z to P left the form (8.3) unaltered,albeit with a different kernel. By contrast, the finite variation property of thekernel is dependent on the measure in question.

(acFVK) Absolutely continuous (FVK): This is the subset of (FVK) modelsfor which the finite variation kernel ζ is absolutely continuous with respectto Lebesgue measure. In practice all models encountered which are of type(FVK) are also of type (acFVK). However, examples do exist which are (FVK)but not (acFVK), as we shall show later.

(SR) Short rate: Let r be some adapted process, not necessarily continuous,such that the process

·0 ru du exists. Let Q be some measure locally equivalent

to P and suppose that

EQ exp −t

0

ru du <∞ ,

for all t ≥ 0. Now define the term structure via

DtT = EQ exp −T

t

ru du Ft .

The measure Q is referred to as the risk-neutralmeasure. Clearly such a modelis also an (acFVK) model. We say that this model is a short-rate model, (SR),if the process r has the additional interpretation as the short-term interestrate, or short rate,

rt = − limh↓0

1

hlog(Dt,t+h) .

(HJM) Heath—Jarrow—Morton: A requirement of any reasonable termstructure model is that the discount curve be almost everywhere differentiablein the maturity parameter T . If it is also absolutely continuous with respectto Lebesgue measure, another reasonable requirement, then we have therepresentation

DtT = exp −T

t

ftS dS ,

where the ftS are the instantaneous forward rates. The (HJM) approach is tomodel these instantaneous forward rates. By definition every (HJM) model isan (SR) model.An (HJM) model is usually specified via an SDE which the instantaneous

forwards satisfy. For the resulting model to satisfy (TS1)—(TS4), in particular(TS4), additional regularity conditions need to be imposed on this SDE. Theseare discussed in detail in Section 8.3.7.

Page 214: Financial derivatives in theory and practice

190 Dynamic term structure models

(FH) Flesaker–Hughston: The Flesaker–Hughston approach is specificallyintended to produce term structure models for which all interest rates arepositive (meaning DtT is non-increasing in T for all t ≥ 0 a.s.). Let Z be locallyequivalent to P and let M·S, 0 ≤ S < ∞, be a family of (jointly measurable)positive (Ft, Z) martingales (automatically continuous given the Brownianrestrictions on the original set-up). Now define the term structure via

DtT =

∫ ∞T MtS dS∫ ∞t MtS dS

.

Note that DtT is decreasing in T so interest rates are always positive, andthat limT→∞ DtT = 0.

Each of the procedures above will generate a model satisfying (TS1)–(TS4). The different approaches do not yield exactly the same set of models,but they are very closely related. Using a superscript + to denote a modelwith positive interest rates (i.e. DtT is decreasing in T for all t a.s.) and asuperscript 0 to denote a model for which the discount curve decreases tozero as the maturity parameter T → ∞, a summary of these relationships isas follows:

(TS) = (PDB) = (PK) =∗ (N) ⊃ (FVK) ⊃ (acFVK) ⊃ (SR) ⊃ (HJM)

(TS)+ = (PDB)+ = (PK)+ =∗ (N)+ ⊃ (FVK)+ ⊃ (acFVK)+ = (SR)+ = (HJM)+

(TS)+0 = (PDB)+0 = (PK)+0 =∗ (N)+0 ⊃

(FVK)+0

(FH)

⊃ (acFVK)+0 = (SR)+0 = (HJM)+0

The starred equalities, (PK) =∗ (N), and consequential identities for therestriction to positive rates, do not strictly hold in general if the economyis considered over the infinite horizon on a probability space which satisfiesthe usual conditions. If the model is set up with the filtration being the(uncompleted) natural filtration for a d-dimensional Brownian motion, or ifthe horizon is finite, this is an equality. We discuss this technical point furtherbelow.

If interest rates are constrained to be positive, then (acFVK), (SR) and(HJM) coincide. In addition, (FH) contains all (acFVK) models for whichinterest rates are positive and limT→∞ DtT = 0 a.s. There are examples ofmodels which are (FH) but not (FVK)+0 and vice versa, so no more canbe said on this matter. However, in Section 8.3.8 we give an alternativerepresentation of an (FH) model, one which readily generalizes to a class wecall (eFH), extended Flesaker–Hughston. This broader class is also containedstrictly within (N)+0 but strictly contains all (FVK)+0 models.

In the remainder of this chapter we will prove all these facts and provideexamples to show where inclusions are strict.

Page 215: Financial derivatives in theory and practice

Modelling the term structure 191

For completeness we note here that, for each model class (M),

(M)+0 ⊂ (M)

+ ⊂ (M). The inclusion is immediate. That this inclusion isstrict is also obvious. Two example discount curve models which establishthis, chosen because they are easy to explain rather than because they arerealistic, are as follows. Each is (HJM), the smallest class we consider, andthus establishes the strict inclusion for all classes. The first, a model whichis (HJM) but not (HJM)+, is the non-stochastic model defined by the initial

discount curveD0T = 1− 12e−T sin(T ), a model in which interest rates oscillate

between positive and negative. The second, a model which is (HJM)+ but not(HJM)+0 is the non-stochastic model defined by D0T = exp − T

1+T . In each

case the curve at any future time is given by DtT = D0T /D0t.On an historical note, one of the major breakthroughs in term structure

models was the work of Heath et al. (1992). However, as the above discussionmakes clear, (HJM) is the smallest class of model that we consider. It turns outthat the manner in which an (HJM) model is specified often makes it difficultto work with. Furthermore, the technical conditions which are usually imposedon (HJM) models (to ensure that they are (HJM)) make it hard to prove thatany given model is of (HJM) type. By contrast, proving that a model is (PK) isusually straightforward. Furthermore, the (HJM) framework (via its technicalregularity conditions) excludes some models of the term structure which areotherwise perfectly acceptable and endowed with all the properties that onewould like for a term structure model. Example 8.11 below illustrates thispoint.

8.3.1 Pure discount bond models

We need say very little about this approach, which is merely a way to specifya model satisfying (TS1)—(TS4), i.e. directly in terms of the bond prices.Using this approach it is not obvious what restrictions must be imposed onthe directly modelled asset prices to ensure that (TS1)—(TS4) are satisfied.The approach is, however, one of the most natural to use for modellingsince bond prices and their dynamics are directly observable. Some of themost important examples defined in this way are the recent ‘market models’developed by Miltersen et al. (1997), Brace et al. (1997) and Jamshidian(1997). These models define the (numeraire-rebased) dynamics of a finite setof bonds. The existence of a model for the whole term structure consistentwith these dynamics then needs to be established. We shall discuss this furtherin Chapter 18.

8.3.2 Pricing kernel approach

Let Z ∈ L1(Ω, Ft,F ,P) be a strictly positive continuous semimartingale.

Page 216: Financial derivatives in theory and practice

192 Dynamic term structure models

DefineDtT = Z

−1t EP[ZT |Ft] . (8.4)

This is a candidate term structure model and it clearly satisfies (TS2)—(TS4).The expectation in (8.4) is an (Ft,P) martingale, hence is continuous,by Theorem 5.49, and Z−1 is a continuous semimartingale, thus DtT is acontinuous semimartingale for all T ≥ 0. This establishes (TS1) and thus weconclude that (PK) ⊆ (TS).That (TS) ⊆ (PK) is immediate since we can take the pricing kernel in

(TS4) to be the process Z in (8.4). We have thus established that (TS) =(PK).

Remark 8.7: Had we allowed the process Z to be more general than acontinuous semimartingale, the processDtT would not have been a continuoussemimartingale. However, the process ZtDtT , which has the interpretation ofthe value at t of a bond paying unity at T but stated in the units U = Z−1, isa continuous martingale (continuous because we are working on a Brownianfiltration). Thus even in this more general case there is a unit in which thebond price process is a continuous semimartingale under P.

Remark 8.8: We met pricing kernels in Chapter 1 and Section 7.3.6. Recallthat the reason, obvious in earlier discussions but less so in this context,for calling Z a pricing kernel arises from the derivative valuation formulaVt = Z−1t EP[VTZT |Ft]. That is (recalling that taking an expectation is thesame as performing an integration), Z takes on the role of a classical kernel.

8.3.3 Numeraire models

An alternative to (PK) is to use a numeraire pair (N,N) to define a termstructure model,

DtT = NtEN[N−1T |Ft] . (8.5)

All numeraire models are (TS) models (i.e. satisfy conditions (TS1)—(TS4)),or equivalently are of class (PK). To see that (N) ⊆ (PK), define

Zt = N−1t

dNdP Ft

. (8.6)

Substitution of (8.6) into (8.5) yields (8.4).We are also able to establish a (partial) converse to this result and show

(modulo technicalities at t =∞) that (PK) ⊆ (N) and thus (PK) = (N). Moreprecisely, we have the following theorem.

Theorem 8.9 Let Z be some pricing kernel defined on the probability space(Ω, Ft,F ,P). As previously, suppose that Ft = FWt , the augmentedfiltration generated by a d-dimensional Brownian motion W . Consider now a

Page 217: Financial derivatives in theory and practice

Modelling the term structure 193

(PK) model defined by equation (8.4). Define the unit-rolling numeraire Nu

via

Nut =

Dt t+1ti=1Di−1,i

( x being the largest integer no greater than x), the portfolio constructedby repeatedly buying the bond expiring one time unit in the future. Then wehave the following:

(i) Given any T ∗ > 0, we have that

DtT = Nut ENT∗ (N

uT )−1|Ft ,

for all t ≤ T ≤ T ∗, where the measure NT∗ ∼ P is defined by

dNT∗

dP Ft:= (ZNu)T

∗t , (8.7)

(ZNu)T∗being the martingale (ZNu) stopped at T ∗. That is, the model

is of class (N) over the time interval [0, T ∗].(ii) If the filtration Ft is taken instead to be the uncompleted filtration

(FWt ) then, for all T ∈ [0,∞),

DtT = Nut EN (N

uT )−1|Ft ,

where the measure N, locally equivalent to P with respect to (FW∞ ), is

defined viadNdP Ft

:= (ZNu)t ,

for all t ≥ 0.

Remark 8.10: The reason for this somewhat involved statement of the converselies in the results and discussion of Section 5.1.2 and, in particular, Theorems5.14 and 5.15. The first part of this converse result is sufficient for all practicalpurposes since in practice we only ever need to consider an economy over afinite horizon.

Proof: The proof of (i) is immediate by carrying out the change of measureprescribed by (8.7). The proof of (ii) follows from Theorem 5.15.The disadvantage of the numeraire approach compared with (PK) is that,

in claiming N to be a numeraire, we must ensure consistency conditions holdwithin the model. For example, if Nt, t ≤ T , is the value of the pure discountbond maturing at T then NT = 1. On the other hand, modelling within amartingale measure is more direct and intuitive. Furthermore, applicationsoften dictate which measure we should work in and it usually corresponds to

Page 218: Financial derivatives in theory and practice

194 Dynamic term structure models

a particular numeraire. All the models we shall cover are developed within amartingale measure N rather than the ‘real-world’ measure P.For our purposes the most important examples of models specified via a

numeraire are Markov-functional models. We shall meet these formally inChapter 19 but, in fact, most models used in practice are Markov-functional.In the examples introduced in Chapter 19 the law of the numeraire is onlycalculated implicitly and no SDE is ever specified which is satisfied by thenumeraire.

8.3.4 Finite variation kernel models

An (FVK) model is one for which the prices of all pure discount bonds canbe expressed in the form

DtT = ζ−1t EZ[ζT |Ft] (8.8)

for some adapted and continuous finite variation process ζ and some measureZ locally equivalent to P.Historically, most models that have been studied are (FVK) models, indeed

short-rate models (although they are increasingly not parameterized in termsof the short rate). Does this restriction eliminate any realistic models of theterm structure, or can all models be cast in this form? The answer to thisquestion is provided by the following example, a model which possesses allthe properties one might realistically demand of an interest rate model butone which is not (FVK).

Example 8.11 Consider the (PK) model generated by taking the pricingkernel Z to be the solution, with Z0 = 1 a.s., to

dZt = Z2t dWt , (8.9)

where W is a one-dimensional Brownian motion. The process Z is thereciprocal of the Bessel(3) process and various of its properties are developedin Exercise 8.14 below. In particular, it is well known as an example of aprocess which is a uniformly integrable local martingale but which is not a truemartingale. This kernel, being a.s. positive, is also a (strict) supermartingaleand, furthermore, EP[ZT |Ft] ↓↓ 0 as T →∞. Consequently, (8.9) can be usedto define a term structure model in which rates are a.s. strictly positive for allt and the discount curve decays to zero as T → ∞, one of the nicest (single-factor) interest rate models one could imagine. Indeed, applying a time-changetechnique which applies equally well to many other pricing kernel models,given any continuous initial discount curve D0T , T ≥ 0, the model can beadapted to fit this initial term structure. To do this, let

E(T ) := EP[ZT ] ,

Page 219: Financial derivatives in theory and practice

Modelling the term structure 195

a decreasing function of T , and define the process Z by

Zt := Z E−1(D0t) .

Then Z is a pricing kernel on the the probability space (Ω, Ft,F ,P) whereFt := FE−1(D0t), and the model is consistent with the initial discount curveD0T , T ≥ 0.We show that this model is not (FVK). To do so, we first show that the

model is FAT -complete for any T > 0. To see this, work on the time intervalt ∈ [0, T ] and take Nt = DtT as numeraire. Under the measure N defined by

dNdP Ft

= ZtDtT ,

all N -rebased assets are Ft martingales. Consider now the process DNtS

for some S > T . This is an (Ft,N) martingale, hence, by the martingalerepresentation theorem for Brownian motion (Theorem 5.49),

DNtS = D

N0S +

t

0

φu dWu (8.10)

for some Ft-predictable process φ, where the (Ft,N) Brownian motionW is defined by

Wt =Wt −t

0

Cu du

for some Ft-predictable process C (defined precisely by Theorem 5.24 withQ = N). Given any FAT -measurable random variable VT with EN[|VT /NT |] <∞ (note, in fact, that NT = 1), the Brownian martingale representationtheorem can be applied again to conclude that

VTNT

= ENVTNT

+T

0

ψu dWu

for some Ft-predictable process ψ. Combining this with (8.10) yields

VTNT

= ENVTNT

+T

0

ψuφ−1u dDN

uS .

It now follows from Theorem 7.17 that this strategy can be made to be self-financing (if we also trade in the numeraire asset DtT ), and from Corollary7.40 that the strategy is admissible, hence the economy is complete.Theorems 7.41 and 7.48 showed that if an economy is complete then any two

pricing kernels agree up to FAT , which in this case is all of FT . This contradicts

the existence of a finite variation pricing kernel. For suppose there does exist

Page 220: Financial derivatives in theory and practice

196 Dynamic term structure models

some measure Z, locally equivalent to P, and a finite variation process ζ whichis a pricing kernel under the measure Z. Defining the process ρ by

ρt :=dZdP Ft

,

it follows that ζρ is a pricing kernel under P. But the economy is complete andas such admits a unique pricing kernel. Since Z is also a pricing kernel underP we can conclude that Z = ζρ. Write Z = Z0E(X) and ρ = ρ0E(X) for localmartingales X and X (as we did in Lemma 5.17). Taking logs of Z and ρζ,and appealing to the uniqueness of the Doob—Meyer decomposition, it followsthat X = X and ζ is the constant 1. Thus Z = ρ which is a contradictionsince ρ is a martingale and yet Z is not.

Remark 8.12: The fact that the discount curve for Example 8.11 is strictlydecreasing and converges to zero as the maturity tends to infinity means thatit also proves that the inclusions (FVK) +0 ⊂ (PK)+0 and (FVK) + ⊂ (PK)+are strict.

Remark 8.13: Note, in particular, that this is an example of a model wherethere is no arbitrage, where the model is complete but where there is nonumeraire pair (N f ,N) for which the numeraire is finite variation. However,it is easy to see that there are many finite variation numeraires. See Remark7.44 for more on the significance of this.

Exercise 8.14: Let (W (1),W (2),W (3)) be a three-dimensional Brownian

motion, but starting at the point (W(1)0 ,W

(2)0 ,W

(3)0 ) = (1, 0, 0). Define

Rt := (W(1)t )2 + (W

(2)t )2 + (W

(3)t )2 .

Show that R, the Bessel(3) process, satisfies the SDE

dRt = −dWt +R−1t dt,

where the Brownian motion W is defined by

Wt :=t

0

3

i=1

W(i)u

RudW (i)

u .

Hence show that the kernel Z of Example 8.11 is given by Z = R−1, andderive an expression for

f(Zt, T − t) := EP[ZT |Ft] .

Page 221: Financial derivatives in theory and practice

Modelling the term structure 197

Establish the following facts which have been used above (the first of whichshows that Z is not a martingale):

(i) f(Z, T − t) is a.s. continuous and strictly decreasing in T , withlimT→∞ f(Z, T − t) = 0;

(ii) f(Z, T − t) is twice differentiable in the variable Z;(iii) for all t < T < S, the diffusion term in the SDE for (DtS/DtT ) is a.s.

non-zero.

8.3.5 Absolutely continuous (FVK) models

Recall that a finite variation process is one which is finite, adapted and withsample paths which are finite variation functions. It is a standard result fromanalysis that any finite variation function can be written as a sum of twocomponents, one which is absolutely continuous (with respect to Lebesguemeasure) and one which is singular with respect to Lebesgue measure. Themost common and important examples of (FVK) models are short-ratemodels, discussed shortly, for which the kernel is absolutely continuous. Thisleaves the question whether or not there are, in theory at least, (FVK) modelsin which the kernel is not absolutely continuous. The answer is yes, as thefollowing (non-stochastic) example shows.

Example 8.15 Let F be the standard Cantor distribution function on [0, 1],as defined, for example, in Chung (1974). This is a function F : [0, 1] →[0, 1] which is everywhere continuous and non-decreasing, almost everywheredifferentiable with F = 0, but singular with respect to Lebesgue measure.Now define the finite variation function (which we take as a deterministicpricing kernel process) ζ by

ζ(t) = e−rt 1− F t

1 + t

for some r > 0. Clearly ζ is strictly decreasing and positive, thus correspondsto a pricing kernel, and the resulting model has all interest rates strictlypositive. However, ζ is singular with respect to Lebesgue measure so the modelis not (acFVK).

8.3.6 Short-rate models

A short-rate model is one for which the prices of all pure discount bonds canbe expressed in the form

DtT = EQ exp −T

t

ru du Ft , (8.11)

Page 222: Financial derivatives in theory and practice

198 Dynamic term structure models

for some measure (the risk-neutral measure) Q locally equivalent to P andsome adapted process r. In addition, it is a requirement that r has theinterpretation as the instantaneous spot interest rate,

rt = − limh→0

1h

log Dt,t+h. (8.12)

It is implicit in this definition that, with probability one,∫ t

0 rudu and theexpectation in (8.11) are finite. Note that any (acFVK) model can be writtenin the form (8.11) but does not necessarily satisfy (8.12).

Short-rate models are very common and have been very popular. Thispopularity is decreasing somewhat amongst (front-office) practitioners withthe advent of ‘market models’, but they are still a very valuable tool andmost banks have various examples implemented in their trading systems.Some of the best-known examples are the Vasicek–Hull–White model andthe Black-Karasinski model, both of which we will describe in Chapter 17.(Actually most if not all models used in practice, even market models, are(SR) but not all models are parameterized as such.) Both of these examplesare driven by a one-dimensional Brownian motion, but this need not be thecase in general.

It is immediate, setting

ζt = exp(−

∫ t

0rudu

), (8.13)

that (SR) ⊆ (acFVK). The question remains as to whether there are any(acFVK) models which are not (SR), i.e. for which the process r in (8.13)does not satisfy (8.12). Sufficient conditions for an (acFVK) model to be (SR)(and (HJM)) will be given in Theorem 8.18 below, conditions which mean thatany ‘reasonable’ (acFVK) model must indeed be (SR). However, models doexist which are (acFVK) but not (SR) as the following example demonstrates.

Example 8.16 The example we are about to present is somewhat pathologicaland unrealistic. Theorem 8.18 below and the remark following it show this hasto be the case. This contrasts with Example 8.11 which presented a reasonableand not unrealistic model that was (PK) but not (FVK).

Work on a filtered probability space (Ω, Ft,F , P) supporting a univariateBrownian motion W. Let αn, n ≥ 1 be a set of probabilities and define thefunctions Rn(t) via

Rn(t) =

α−1n , if 0 ≤ t < 2−n,

−α−1n , if 2−n ≤ t < 2−(n−1),

1, if 2−(n−1) ≤ t ∈ [q2−n, (q + 1)2−n

), q even,

−1, if 2−(n−1) ≤ t ∈ [q2−n, (q + 1)2−n

), q odd.

Page 223: Financial derivatives in theory and practice

Modelling the term structure 199

Now define the stopping time τn via

τn = inft = q2−(n−1), some integer q : |W2q2−n −W(2q−1)2−n | > nwhere n is chosen such that P(τn = 2−(n−1)) = αn. Note that 2

n−1τn is ageometrically distributed random variable. Now let

ζn(t) =

0, t < τn,

t

τn

Rn(u− τn) du, t ≥ τn,

and note that, for each n, ζn is an absolutely continuous finite variation processand ζn(t) ≥ 0 for all t. Furthermore, letting x be the largest integer no largerthan x,

EP ζn(T )|Ft = ζn(T )1t≥τn + 2−nE 2nT − 2 2(n−1)t ,βn(t)/αn 1t<τn

where E(t, x) =t

0 e(u, x)du,

e(t, x) =

0, t < 2,

x, 2 ≤ t < 3,−x, 3 ≤ t < 4,1, 4 ≤ t ∈ [q, q + 1), q even,

−1, 4 ≤ t ∈ [q, q + 1), q odd,

and

βn(t) = P τn =2(n−1)t + 12(n−1)

Ft .

Now defining

ζt = 1 +∞

n=1

ζn(t) ,

it follows from the monotone convergence theorem that EP[ζ(T )] < 2 andthus ζ ∈ L1 and may be used as a pricing kernel (it is also strictly positive).Furthermore, the monotone convergence theorem also implies that

EP∞

n=1

1τn≤t =

n=1

P(τn ≤ t)

=

n=1

P τn ≤ 2(n−1)t2(n−1)

≤∞

n=1

1− (1− αn)2(n−1)t

≤ t∞

n=1

2(n−1)αn .

Page 224: Financial derivatives in theory and practice

200 Dynamic term structure models

Choosing αn such that this sum is finite ensures that only finitely manyτn will have occurred by any time t, and thus ζ is absolutely continuous andof finite variation a.s. But it is not difficult to verify (and this is left as anexercise) that with this pricing kernel DtT is a.s. nowhere differentiable, forany t and T . In particular, it is not differentiable at t = T , thus the short ratedoes not exist.

8.3.7 Heath—Jarrow—Morton models

All term structure models of practical relevance have the property that thediscount curve is almost everywhere differentiable in the maturity parameterT , a property that automatically holds for any model in which interest ratesare positive. Furthermore, if the discount curve is absolutely continuous thisthen yields

DtT = exp −T

t

ftSdS ,

where

ftT = − ∂

∂Tlog(DtT ).

The idea of Heath et al. (1992) was to exploit this fact and to formulate amodel by specifying the processes f·T : 0 ≤ T <∞. This they did by givingSDEs satisfied by the forward rate processes

dftT = µtT dt+ σtT · dWt, (8.14)

for suitable µ and σ.For this approach to be useful it is necessary to have a way to move

from a specification of the instantaneous forwards to the discount bond price

processes, i.e. to derive an SDE forT

t ftSdS from the SDEs for the forwardsftT . To do this systematically requires an application of a stochastic Fubinitheorem, a result which allows the interchange of the order of integrationwhen one integral is a Lebesgue integral and the other is a stochastic integral.For this result to be applicable, regularity conditions need to be imposed onthe SDEs (8.14). If these regularity conditions do hold then the next step inthe development of an (HJM) model is to determine conditions on the SDEs(8.14) to ensure that the model satisfies (TS1)—(TS4).In their original paper, Heath et al. (1992) present a general SDE for the

forwards, such as (8.14), and sufficient conditions for the stochastic Fubiniresult to apply. They then show that (TS1)—(TS4) hold, that the resultingmodel is (SR), and derive the corresponding SDEs for the pure discount bondprices. One could, generally, define an (HJM) model to be one for which thestochastic Fubini result and (TS1)—(TS4) hold. This is precisely the viewpointwe take, defining an (HJM) model to be one for which the conclusions ofTheorem 8.21 are valid.

Page 225: Financial derivatives in theory and practice

Modelling the term structure 201

Definition 8.17 An (HJM) is an (SR) model for which, in the risk-neutralmeasure Z, the instantaneous forward rates obey SDEs of the form

dftT =∂ΣtT∂T

· ΣtT dt− ∂ΣtT∂T

· dWt ,

where each Σ·T is Ft-predictable, and for which

dDtT = DtT rtdt+ ΣtT · dWt .

In this presentation of the (HJM) framework we will approach things froma slightly different perspective than Heath et al. (1992), following closely theapproach of Baxter (1997). We start with the class (SR) of short-rate modelsand provide a sufficient condition on the corresponding pricing kernel ζ forthe stochastic Fubini result to apply and thus for the model to be (HJM).This sufficient condition is not necessary but it is such that any model forwhich it fails to hold is highly unrealistic and thus of no practical relevance.This contrasts with the fact that there do exist realistic (PK) models, such asExample 8.11, which are not (FVK). We then derive the corresponding SDEfor the instantaneous forwards and the interrelationships implied by the factthat (TS1)—(TS4) hold.We begin with a statement of the sufficient condition, inequality (8.15), and

the consequential relationship between the forwards and the finite variationkernel. Note in the statement of this result that we only require the modelto be (acFVK). The (SR) property is then a consequence of the regularitycondition (8.15).

Theorem 8.18 Let ζ be a finite variation pricing kernel associated withan (acFVK) model (not assumed to be (SR)) with the minimal cononicaldecomposition ζ = ζ0+ζ

+−ζ− (i.e. ζ+ and ζ− are the minimal non-decreasingfunctions, null at zero, for which this holds). Then a sufficient condition forthe existence of the instantaneous forwards, f , is

EZ[ζ+T ] + EZ[ζ

−T ] <∞ , (8.15)

for all T > 0, in which case

ftT = −EZ

∂ζT∂T Ft

EZ ζT Ft. (8.16)

In particular, writing ζt = exp(− t

0 ru du), ftt = rt and the model is (SR).

Page 226: Financial derivatives in theory and practice

202 Dynamic term structure models

Remark 8.19: The regularity condition (8.15) can easily be seen to beequivalent to the condition given in Baxter (1997), namely

EZT

0

|ru|expu

0

−rv dv du <∞ .

Proof: Condition (8.15) and the fact that ζ is absolutely continuous ensurethat we can apply (the conditional form of) Fubini’s theorem to (8.8) to obtain

DtT = ζ−1t EZ[ζT |Ft]

= ζ−1t EZ[ζ0|Ft] +T

0

EZ∂ζS∂S

Ft dS .

It follows that

∂DtT∂T

= ζ−1t EZ∂ζT∂T

Ft ,

ftT := −∂ log(DtT )∂T

= −EZ

∂ζT∂T Ft

EZ ζT Ft.

The last statement of the theorem is immediate.

Remark 8.20: It is implicit from the fact that ζ is a pricing kernel that ζ itselfis in L1(Ω, Ft,F ,Z). As a result, the left-hand side of (8.15) is infinite ifand only if both expectations are infinite. But ζ(ω) ≥ 0 for almost all ω andthus any model for which (8.15) does not hold must have a wildly fluctuatingshort rate (if ζ−t (ω) > K then ζ+t (ω) > K − ζ0(ω)) and it is not possible thatsome paths have highly positive short rates and others have very negativeshort rates; if large negative rates occur then large positive rates must alsooccur on the same path. Consequently, any (acFVK) model that is not (SR),or any (SR) model that is not (HJM), must be highly unrealistic and of nopractical relevance. Example 8.16 was one such example which separated theclasses (acFVK) and (SR). Example 8.24 at the end of this section is (SR)but not (HJM).

It remains for us to connect the SDEs that are satisfied by the bond pricesDtT and the instantaneous forwards ftT under the regularity condition (8.15).This is the content of the next two theorems. For the first we work in the ‘risk-neutral’ measure Z corresponding to the kernel ζ; the general case then followsfrom Girsanov’s theorem.

Theorem 8.21 Given a general (acFVK) model with pricing kernel ζt =

exp(− t

0rudu), there exists, for each T > 0, some d-dimensional Ft-

predictable Σ·T such that

Page 227: Financial derivatives in theory and practice

Modelling the term structure 203

(i) EZ[ζT |Ft] = EZ[ζT |F0] +t

0

ζuDuTΣuT · dWu,

(ii) dDtT = DtT (rtdt+ ΣtT · dWt),

where W is a d-dimensional (Ft,Z) Brownian motion.If, in addition, (8.15) holds for each T > 0, then

(iii) EZ∂ζT∂T

Ft = EZ∂ζT∂T

F0 −t

0

ζuDuT σuT + fuTΣuT · dWu,

(iv) dftT = − σtT · ΣtT dt+ σtT · dWt,

where the processes σ and Σ are related by

σtT = −∂ΣtT∂T

,

and the model is (HJM).

Proof: The existence of ΣtT in (i) for each T is just the Brownian martingalerepresentation theorem (Theorem 5.49) applied to the martingale M t =EZ[ζT |Ft]. Equation (ii) now follows from (i) by applying Ito’s formula to(8.8).Turning to the second part of the theorem, the regularity condition (8.15)

establishes the existence of the process ∂ζT∂T , and it then follows by the

Brownian martingale representation theorem again that, for each T , thereexists some predictable process Λ·T such that

EZ∂ζT∂T

Ft = EZ∂ζT∂T

F0 +t

0

ΛuT · dWu . (8.17)

We must prove that Λ has the representation in (iii). We will only sketchthe proof of this part of the theorem. The complete proof requires severaltechnical lemmas which we will not include. The interested reader is referredto Baxter (1997) for details of this part of the argument. The key to the proofis to observe that (i), (8.17) and an application of Fubini’s theorem yield twoalternative representations for EZ[ζT − ζ0|Ft], namely

EZ[ζT − ζ0|Ft] = EZ[ζT − ζ0|F0] +t

0

ζuDuTΣuT · dWu ,

EZ ζT − ζ0|Ft = EZT

0

∂ζS∂S

dS Ft

=T

0

EZ∂ζS∂S

Ft dS

=T

0

EZ∂ζS∂S

F0 +t

0

ΛuS · dWu dS

= EZ[ζT − ζ0|F0] +T

0

t

0

ΛuS · dWu dS .

Page 228: Financial derivatives in theory and practice

204 Dynamic term structure models

Equating these two expressions yields

T

0

t

0

ΛuS · dWu dS =t

0

ζuDuTΣuT · dWu . (8.18)

A fact we shall not prove, which follows from condition (8.15), is that astochastic Fubini theorem can be applied which allows us to interchange theorder of the two integrations on the left-hand side of (8.18). Doing so yields

t

0

T

0

ΛuS dS · dWu =t

0

ζuDuTΣuT · dWu ,

T

0

ΛtS dS = ζtDtTΣtT . (8.19)

Differentiating both sides of (8.19) yields the representation in (iii). The finalidentity (iv) is now just an application of Ito’s formula to (8.16).

Remark 8.22: It is implicit in the last statement that we can indeed integrateσtT over the maturity parameter T (and we made similar assumptions for theinstantaneous forwards ftT ). For this to be possible σ must be measurable inthis parameter. It is indeed possible to establish this fact, that there exists aversion of σ which is jointly measurable as a map

σ : Ω× [0, T ∗]× [0, T ∗]→ R

(ω, t, T )→ σtT (ω) ,

for each T ∗ > 0. That is, σ (and f) is jointly progressively measurable (in tand T ). The details can be found in Baxter (1997).

The relationship in part (iv) of Theorem 8.21 between the drift and diffusionterms of the instantaneous forward rates in the measure Z is a consequenceof insisting, via (TS4), that the model is arbitrage-free. It is now a simplematter to see the form these equations must take in any other measure whichis locally equivalent to Z, in particular the real-world measure P. The result isan immediate consequence of Theorem 8.21 and Girsanov’s theorem (Theorem5.24).

Theorem 8.23 Under any measure P which is locally equivalent to Z, themodel of Theorem 8.21 obeys

(ii ) dDtT = DtT (rt + ΣtT · Ct)dt+ ΣtT · dWt .

Page 229: Financial derivatives in theory and practice

Modelling the term structure 205

If (8.15) also holds then (iv) becomes

(iv ) dftT =∂ΣtT∂T

· (ΣtT − Ct) dt− ∂ΣtT∂T

· dWt.

In the above W is a Brownian motion under P and is given by

Wt =Wt −t

0

Cu du ,

where C is the unique Ft-predictable process such that

dPdZ Ft

= expt

0

Cu · dWu − 12

t

0

|Cu|2 du .

We conclude our discussion of (HJM) models with an example of a modelthat is (SR) but not (HJM), by necessity a model which fails to satisfy (8.15)and which is thus unrealistic.

Example 8.24 Work on a filtered probability space (Ω, Ft,F ,P) supportinga univariate Brownian motion W . For each n ≥ 1 define the function Rn via

Rn(t) =

0, if t < 1,

2n, if t− 1 ∈ [q2−n, (q + 1)2−n), q odd,

−2n, if t− 1 ∈ [q2−n, (q + 1)2−n), q even.

Now define a random variable n by n = j if |W1| ∈ [Lj , Lj+1), where thesequence Lj is defined so that L1 = 0 and 2(N(Lj+1) − N(Lj)) = 2−j. Fixr ≥ 0, α < 1 and define the finite variation kernel ζ via

ζt = exp(−rt) 1 + αt∨1

1

Rn(u) du

and, as usual,DtT = ζ−1t EZ[ζT |Ft]

(note that 0 ≤ ζt ≤ exp(−rt) so ζ is certainly in L1).This is clearly an (acFVK) model. Furthermore, for all T ≥ 0, there exists

some ε(T ) > 0 such that ζt is FT -measurable for all t ∈ [T, T + ε(T )]. Hence,

writing ζt = exp(− t

0rudu), the short rate exists and is exactly rt, so the

model is (SR). However, for t > 1 neither ζ+t nor ζ−t is integrable and (exercise

for the reader) DtT is nowhere differentiable for t < 1 ≤ T . Figure 8.1 showsthe initial discount curve in the case r = 10%,α = 25%.

Page 230: Financial derivatives in theory and practice

206 Dynamic term structure models

1.2

1.0

0.8

0.6

0.4

0.2

00.0 0.5 1.0 1.5 2.0 2.5

Dis

coun

t fac

tor

Time

Figure 8.1 Non-differentiable (SR) discount curve

8.3.8 Flesaker—Hughston models

In 1996 Flesaker and Hughston proposed a general class of models in whichinterest rates are guaranteed to be positive. In their framework, the termstructure is of the form

DtT =

∞T MtSdS∞t MtSdS

, (8.20)

where M·S : 0 ≤ S < ∞ is a family of positive Ft martingales undersome measure Z locally equivalent to P. Implicit in this definition is the factthat the family M is jointly measurable and that both integrals in (8.20) arefinite a.s.The best-known example of a model developed in this framework is the

rational log-normal model defined by

DtT =AT +BTMt

At +BtMt.

Here the process M is a log-normal martingale with initial condition M0 =1 and the deterministic functions A and B are both positive, absolutelycontinuous, and decrease to zero at infinity. For consistency with the initialterm structure the constraint

D0T =AT +BTA0 +B0

Page 231: Financial derivatives in theory and practice

Modelling the term structure 207

also applies. This still leaves freedom to tailor the model for any particularapplication.

Two noteworthy features of this model are as follows. First, it producesclosed-form prices for both caps and swaptions. We will not derive theseresults, but the interested reader is referred to Flesaker and Hughston (1996).Second, given any t and T, DtT is monotone in Mt and is bounded byAT/At and BT/Bt (Mt = 0 and Mt = ∞). Consequently, interest rates are alsobounded both above and below.

To understand how (FH) relates to other ways of specifying the termstructure we give an alternative representation, first presented in 1997 by,amongst others, Rutkowski (1997).

Theorem 8.25 Given a jointly measurable family of positive (Ft, Z)martingales M·S : 0 ≤ S < ∞, let

Zt =∫ ∞

t

MtSdS, (8.21)

At =∫ t

0MSSdS.

Note that A is an absolutely continuous, increasing, finite variation pro-cess with A0 = 0. If Z0 < ∞ then both Z ∈ L1(Ω, Ft,F , Z) and A∞ ∈L1(Ω,F , Z). Thus Z can be used as a pricing kernel and the resulting modelhas the form (8.20), and is thus one for which all interest rates are positiveand, for all t ≥ 0,

limT→∞

DtT = 0, a.s. (8.22)

Furthermore, in this case Z has the representation

Zt = EZ[A∞|Ft] − At. (8.23)

Conversely, given any absolutely continuous, increasing A with A0 = 0 andEZ[A∞] < ∞, (8.23) defines a positive supermartingale Z ∈ L1(Ω, Ft,F , Z)and (8.4) then defines a model in which interest rates are positive and forwhich (8.22) holds. Furthermore, Z has the representation given in (8.21) fora (jointly measurable) family of positive martingales M·S : 0 ≤ S < ∞.

Remark 8.26: What this theorem shows is that (FH) models are preciselythe class of (PK) models with kernels of the form (8.23) for some increasingabsolutely continuous A. If the continuity requirement is removed, (8.23)does in fact define a general potential of class (D), i.e. a right-continuous,positive supermartingale Z with the property that Z∞ = 0 (a potential) andfor which the family Zτ : τ an Ftstopping time is uniformly integrable (ofclass (D)). Note that the kernel Z of Example 8.11 is also a potential, but thefamily Zτ is not in that case uniformly integrable.

Page 232: Financial derivatives in theory and practice

208 Dynamic term structure models

Proof: Given the family of positive martingales M ·S, it is clear that both Zand A exist, albeit possibly taking the value +∞. However, if Z0 < ∞ thenit follows immediately from Fubini’s theorem that Z ∈ L1(Ω, Ft,F ,Z) andA∞ ∈ L1(Ω,F ,Z), since

Z0 =∞

0

M0SdS =∞

0

EZ[MSS ]dS = EZ∞

0

MSSdS = EZ[A∞] ,

and

Z0 >∞

T

M0SdS =∞

T

EZ[MTS]dS = EZ∞

T

MTSdS = EZ[ZT ] ,

the third relation in each case being the application of Fubini. HenceZ ∈ L1(Ω, Ft,F ,Z) and, since it is also a strictly positive continuoussemimartingale, can be used as a pricing kernel. A further application ofFubini, this time the conditional version, now yields

EZ ZT |Ft = EZ∞

T

MTS dS Ft =∞

T

EZ MTS |Ft dS =∞

T

MtS dS

for all 0 ≤ t ≤ T <∞. Therefore, substituting into (8.4),

DtT = Z−1t EZ[ZT |Ft] =

∞T MtS dS∞t MtS dS

,

as required.The representation of Z in terms of A follows, once more, from conditional

Fubini,

Zt :=∞

t

MtS dS =∞

t

EZ[MSS |Ft] dS

= EZ∞

t

MSS dS Ft = EZ A∞ −At Ft .

This completes the first part of the proof.The converse result is immediate with the exception of the last part. But

the fact that A is absolutely continuous, increasing and null at zero ensuresthat

At =t

0

au du

for some positive process a. Defining MtS = EZ[aS |Ft], the fact that A∞ ∈L1(Ω,F ,Z) allows us to apply Fubini once more to obtain the representation(8.21). (That the family M·T : T ≥ 0 is jointly measurable is established inBaxter (1997).)The representation (8.23) suggests an obvious generalization of the (FH)

class of models.

Page 233: Financial derivatives in theory and practice

Modelling the term structure 209

Definition 8.27 A term structure model is said to be extended Flesaker—Hughston, (eFH), if it is a (PK) model with pricing kernel of the form (8.23)for some adapted, increasing and continuous process A with A0 = 0 andA∞ ∈ L1(Ω,F ,Z).

Remark 8.28: The only difference from a standard (FH) model is the relaxationof the absolute continuity requirement on A to one of continuity.

It is immediate from the defining equation (8.20) that the discount curvefor an (FH) model is continuous in the maturity parameter T . The continuityproperty also carries over to the class (eFH). This is not true for a general(PK) model, even when interest rates are positive.

Corollary 8.29 For all t ≥ 0, the discount curve for an (eFH) model is a.s.continuous in the maturity parameter T .

Proof: It follows from (8.23) and (8.4) that, for an (eFH) model,

DtT =EZ[A∞|Ft]− EZ[AT |Ft]

EZ[A∞|Ft]−At . (8.24)

The continuity of A ensures that limS→T AS = AT a.s., for all T ≥ 0. Since0 ≤ AS ≤ A∞ ∈ L1(Ω,F ,Z), the dominated convergence theorem can beapplied to the second expectation in the numerator of (8.24) and yields

limS→T

DtS = DtT ,as required.We conclude our discussions on the theory of term structure models by

proving the relationship between (FH), (eFH) and the other model classesalready considered. First we show that (FH) contains all (acFVK)+0 models.

Theorem 8.30 The class (FH) of Flesaker—Hughston models contains all(acFVK) models for which interest rates are positive and limT→∞DtT = 0for all t ≥ 0 a.s. The class (eFH) contains all of (FVK)+0.Proof: Suppose ζ is the pricing kernel for an (acFVK)+0 model. Then, for allt ≤ T ,

ζt ≥ EZ[ζT |Ft] ↓ 0 , (8.25)

as T → ∞. Thus ζ is a supermartingale and, since ζ is also continuous andof finite variation, it follows from the Doob—Meyer theorem (Corollary 3.94)that ζ is actually decreasing. Furthermore, since ζ > 0 a.s., it follows from(8.25) that ζ∞ = 0.Define A = ζ0 − ζ. Clearly A is increasing, absolutely continuous, and

A0 = 0. Furthermore, A∞ = ζ0 ∈ L1(Ω,F ,Z) so A can be used to define an(FH) pricing kernel Z, given by

Zt := EZ A∞ −At|Ft = ζt .

Page 234: Financial derivatives in theory and practice

210 Dynamic term structure models

Hence the model is (FH) as required.When the model is only (FVK)+0 the argument above carries through with

the exception of the fact that A is now not necessarily absolutely continuous.The resulting model is thus (eFH) but not necessarily (FH).It is clear that (FH) ⊂ (eFH) with the inclusion being strict. We have just

shown that (FVK)+0 ⊆ (eFH) and (acFVK)+0 ⊆ (FH). It is immediate that

(eFH) ⊆ (PK)+0. It remains to prove that these last three inclusions are strict.The next two examples do precisely that.

Example 8.31 For an example of a model that is (PK)+0 but not (eFH) wereturn to Example 8.11. Recall that the underlying kernel Z for this model isdefined by Z0 = 1, a.s., and

dZt = Z2t dWt

where W is a one-dimensional Brownian motion on some probability space(Ω, Ft,F ,P). We showed in Example 8.11 that the model is not (FVK).That it is not (eFH) follows by a similar argument.Suppose there exists some measure Z, locally equivalent to P, and some

increasing process A such that

Zt := EZ[A∞|Ft]−At (8.26)

is a pricing kernel under Z. Since Z is a pricing kernel under P it followsimmediately that ρZ is a pricing kernel under Z, where

ρt :=dPdZ Ft

.

As we saw in Example 8.11, the model is complete and hence the pricing kernelis unique, thus ρZ = Z. But Z is an (Ft,P) local martingale, so ρZ is an(Ft,Z) local martingale by Lemma 5.19. On the other hand, Z is definedin (8.26) as the sum of a martingale plus a finite variation term. Hence thefinite variation term is constant which means the process A is constant andZ ≡ 0, a contradiction. Thus the model is not (eFH).The key to the example above was the fact that the pricing kernel was a

local martingale but not a true martingale. A similar idea is used to generatea model that is (FH) but not (FVK)+0. This thus demonstrates the strict

containment in the relations (FVK)+0 ⊂ (eFH) and (acFVK)+0 ⊂ (eFH).Note that the Cantor example of Section 8.15 is (FVK)+0 but not (FH) (all(FH) models have an absolutely continuous discount curve), so this examplealso shows that there are no additional containment relationships between theclasses (FH) and (FVK)+0.

Page 235: Financial derivatives in theory and practice

Modelling the term structure 211

Example 8.32 Let W be a Brownian motion on the probability space(Ω, Ft,F ,P) and define Lt : 0 ≤ t < 1 by

Lt =t

0

dWu

1− u .

The process L is a continuous local martingale, null at zero, and [L]t =t1−t ,

so Levy’s theorem implies that L∗(t) := L t1+t is a Brownian motion (relative

to the filtration Ft := F( t1+t )

).

Now define the stopping time τ = inft > 0 : Lt < −1 and let L = 2+ Lτ .The fact that L∗(t) is Brownian motion ensures that τ < 1 a.s., and sothe process L can be defined over [0,∞) (there will be no problem with the

explosion of L at t = 1). Note that L is a martingale on any interval [0, T ] whenT < 1, and thus L is a martingale on [0, T ]. Combining these observations,

E[LT |Ft] = Lt1T<1 + 1T≥1 .

Now, for some fixed r > 0, define the continuous and increasing process Gby

Gt =1

1− t1t≤τ +er(t−τ)

1− τ 1t>τ ,

and observe L has the representation

Lt := 2 + Lτt = 2 +

t

0

GudWτu

(note that the behaviour of G on t > τ is not material to thisrepresentation). We now use L and G to construct a model that is (FH)but not (FVK), letting

Zt :=LtGt. (8.27)

Applying Ito’s formula to (8.27),

dZt = G−1t dLt + LtdG

−1t

= dW τt − dAt,

where the absolutely continuous process A is defined by

At := −t

0

LudG−1u ,

and thusZt = Z0 +W

τt −At. (8.28)

Page 236: Financial derivatives in theory and practice

212 Dynamic term structure models

The fact that Z∞ = 0 and τ < 1, a.s., now yields, from (8.28),

A∞ = Z0 +W τ∞,

and, taking expectations,

EP[A∞|Ft] = Z0 + EP[W τ∞|Ft]

= Z0 +Wτt

= Zt +At .

Thus Z has the form of an (FH) pricing kernel under the measure P. We showthat this kernel does not generate an (FVK) model, essentially because L isa local martingale but not a true martingale.Note that for this model the asset filtration FA

t satisfies FAt =

FW τ

t = Ft∧τ. A similar argument to that used in Example 8.11 canbe employed to show that this model is FA

T -complete for all T > 0 (filling inthe details of this argument is left as an exercise). Now suppose there exists astrictly positive, continuous, finite variation process ζ and a measure Z locallyequivalent to P such that

DtT = ζ−1t EZ[ζT |Ft] . (8.29)

Changing measure in (8.29) from Z to P yields

DtT = (ζρ)−1t EP (ζρ)T |Ft ,

where

ρt :=dZdP Ft

,

so ζρ is also a pricing kernel for the model under the measure P. But thecompleteness of the model implies that the pricing kernel is unique, thus

ζρ = Z =L

G

on Ft∧τ. It now follows from the uniqueness of the Doob—Meyerdecomposition that ρ = L and ζ = G−1 up to Ft∧τ (the argument hereis similar to that used in Example 8.11). But ρτ is an (Ft∧τ,P) martingaleand Lτ is not, a contradiction.

Page 237: Financial derivatives in theory and practice

Part II

Practice

Page 238: Financial derivatives in theory and practice
Page 239: Financial derivatives in theory and practice

9

Modelling in Practice

9.1 INTRODUCTION

Part I was devoted to the theory of derivative pricing. The development givenwas very much a theoretical mathematical approach and showed how to movefrom a model for an economy to the price of a derivative within that economy.

When it comes to modelling derivatives in practice there is much more to doand knowing (to some degree) the mathematical theory is only the first step.If we are faced with the problem of pricing a particular derivative it wouldbe unusual to have a model for the economy already given to us. We mustdecide on what that model should be and this will depend on many things,including the particular derivative in question and the need to get numbersout of this model in a reasonable time. How to get numbers out efficiently isanother topic in itself, something we shall cover only in one particular case(see Section 17.4).

The problem of how to select and develop a model of (relevant parts of)the real-world economy is something we shall be considering throughout thispart of the book. In this chapter we aim to give a few pointers as to howwe approach this problem and a few warnings of the pitfalls to avoid. Thereis no recipe for doing this and what works well in one situation may faildismally in another. All you can do is keep your wits about you, work hardto understand the essence of any given product and try to focus on the reallyimportant issues.

9.2 THE REAL WORLD IS NOT A MARTINGALEMEASURE

The mathematical theory of derivative pricing is clear. One starts with amodel of the asset price processes in an economy, usually specified via someSDE in a real-world probability measure P. Then one chooses a numeraire Nand changes probability measure, from the original measure P to an equivalent

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 240: Financial derivatives in theory and practice

216 Modelling in practice

martingale measure N under which all N -rebased assets are martingales.Having done this, the value of any derivative can be calculated by takingexpectations in this measure N.For most products encountered in practice the value is determined by the

joint distribution of a finite number of asset prices on a finite number of datesin a martingale measure. On the other hand, a model for the asset price pro-cesses can only be formulated based on information and intuition availablein the real world — we do not live in a martingale measure. So the questionarises as to how we should use real-world information to formulate a modelwhich, when we have changed to a martingale measure, will give an asset pricedistribution which in turn yields a reasonable price for the derivative underconsideration. To answer this question we need to consider what is commonto both the real-world measure and any EMM. This includes the following:

(i) the quadratic variation of any continuous process;(ii) the prices of traded instruments;(iii) null sets.

9.2.1 Modelling via infinitesimals

Each of the above quantities (i)—(iii) gives useful information about themartingale measure. In theory, for an economy containing a finite variationasset, it is enough to know the quadratic covariation of all the asset priceprocesses and the drift of the finite variation asset in the real-world measure.If we do, and if we assume the market is arbitrage-free, then these completelydetermine all the joint distributions in a martingale measure. To see whythis is, consider the SDE which determines the evolutions of the asset priceprocesses. If we change to an EMM the drifts will change to make numeraire-rebased assets into martingales. This fact and the quadratic covariationsuniquely determine the SDE for numeraire-rebased assets in the EMM. TheSDE satisfied by the finite variation asset is unchanged from the real world.These new SDEs, assuming they have a unique solution, completely determinethe law of all the asset price processes in the EMM, and hence all jointdistributions and all derivative prices.Whereas the logic of the previous paragraph is valid and infinitesimals

determine everything in theory, in practice we do not attempt to determineprice processes via their infinitesimals. There are a number of reasons for this.Even when practitioners use time series to estimate the infinitesimals, the

first step in doing so is to postulate a simple functional form for the diffusionterm of a process. Then various parameters are estimated from the data. Anobvious example of this is using a log-normal model for a stock price process.The price A is assumed to follow an SDE of the form

dAt = µt dt+ σAtdWt

for some general function µ, some constant σ and a Brownian motion W .

Page 241: Financial derivatives in theory and practice

The real world is not a martingale measure 217

The value σ is then estimated from data. The initial choice of a log-normalmodel is one taken at the macro level and determines broad features ofthe model. Thus the primary modelling decision is made with reference tomacro behaviour in the real world and the infinitesimals are merely a wayto reproduce this behaviour. Indeed, whereas the resulting model may givea reasonable reflection of macro behaviour, it certainly gives a very poorreflection at the infinitesimal level — real-world price processes are piecewiseconstant with jumps, not continuous processes.There is one further reason to avoid focusing too much on infinitesimals,

and that is stability. As with ordinary differential equations, a small changein the form of an SDE can lead to large, and sometimes unrealistic, changesat the macro level.

9.2.2 Modelling via macro information

We have just seen why it can be dangerous to rely on infinitesimals whenformulating a model for derivative pricing. An alternative is to use pricesand null sets, the other things common to the real world and any equivalentmeasure.The first of these, prices, are particularly relevant and useful. We shall see

in Section 13.2.2, that if, for all strikes, we know the prices of call options onan asset then this gives us information about marginal distributions of thatasset in an EMM. This macro information can and should be incorporatedwithin the model. Furthermore, any derivative must be hedged with otherinstruments, so using this price information to the full also calibrates themodel to the prices of hedge instruments. This is important because thesehedge prices give the best measure of the real cost of replicating the derivativein practice. We shall rely heavily on this approach in the rest of this book.The third invariant when changing measure is null sets. By ensuring that a

model assigns zero probability in the real world to an event that we wish toexclude, we are also giving that event zero probability in an EMM. In practicewe tend to push this a little further, insisting that an event that is very unlikelyto occur in the real world should also be unlikely to occur in the EMM.This is sensible because we tend to model with continuous distributions anddo not impose hard cut-off levels for distributions, even though we may feelthey occur. Whereas this makes sense in practice, great care must be taken.Intuition about the real world can be misleading about the EMM becausethe change of measure can lead to large changes in distributions. Here is anexample to illustrate.

Example 9.1 Consider an economy with three assets, the first being constant,A(0) ≡ 1. Suppose that the other two are jointly log-normal with, for i = 1, 2,

dA(i)t = µ(i)A

(i)t dt+ σA

(i)t dW

(i)t

Page 242: Financial derivatives in theory and practice

218 Modelling in practice

where µ(1), µ(2) and σ are constants and W = (W (1),W (2)) is a Brownianmotion.Suppose we set µ(1) = 0 and A

(1)0 = A

(2)0 = A. Then, for any K,

P A(1)T > K = N(d2)

where N(·) is the standard Gaussian distribution function and

d2 =log(A/K)− 1

2σ2T

σ√T

.

Fix A, T and σ and choose K such that this probability is small, say 10−6. Asimilar calculation shows that

P A(2)T < K = N −d2 − µ(2)

σ

√T .

For µ(2) sufficiently large this latter probability can be made arbitrarily small,say 10−6 once more.Note that the economy above is arbitrage-free and complete. Consider

pricing a call option on each of the assets A(1) and A(2) which maturesat time T and has strike K. Having changed measure, they both have thesame distribution and so the option prices are identical. Yet the real-worldprobability that the first option pays off is one in a million, whereas the real-world probability that the second does not pay off is also one in a million.Which option would you buy?

Fortunately the situation above, although theoretically possible for anarbitrage-free model, is not sustainable and would not occur in practice. Thereader may wish to think about why. The point of the example is that thechange of measure from the real world to the EMM may lead to very counter-intuitive results unless you are careful.

9.3 PRODUCT-BASED MODELLING

Suppose we wanted to price a derivative with payoff depending on a numberof assets, n say. There are two ways we might first approach this problem. Thefirst would be to look at historical time series for the prices of these assetsand at the prices of other derivatives of these assets to build an understandingof the real market. Then, with this understanding, we might formulate arealistic model for the evolution of the n assets. Because of computationalconsiderations this model would have to be of low dimension if we were touse it for derivative pricing. Finally, having formulated the model, we wouldapply it to price the derivative.

Page 243: Financial derivatives in theory and practice

Product-based modelling 219

Following this approach will lead to a reasonable model for the economy butone which is not necessarily a good model when used to price the particularderivative in question. An alternative, therefore, is to understand the economy,as before, and then try to gain an understanding of what features of thiseconomy will have the most impact on the derivative in question. Only thendo we formulate the precise model.This latter approach is what we call product-based modelling. In the rest of

this section we illustrate its use. First, in Section 9.3.1 we show what can gowrong if we do not follow this approach. Then, in Section 9.3.2 we apply thetechnique for a product which is similar to one encountered in practice.

9.3.1 A warning on dimension reduction

When confronted with a high-dimensional set of data, such as a time seriesof the prices of a set of assets, a common and very powerful technique oftenapplied is that of principal component analysis. Using this standard statisticaltool it is possible to gain considerable understanding of the main structureof the asset price processes, vital knowledge for successful modelling of theeconomy (with a process of lower dimension). Here we consider a ‘toy’ pricingproblem to illustrate some of the issues involved. The main point we are tryingto communicate here is the importance of considering the derivative and theeconomy together: understanding both separately and also how they interact.So consider an economy comprising n+1 assets whose price processes A (0)

and A := (A(1), . . . , A(n)) satisfy, for all t ≥ 0,

A(0)t ≡ 1,dAt = ΣdWt,

in the real-world measure P, where W is an n-dimensional Brownian motionand Σ is some constant matrix of full rank n. Now consider the problem ofpricing a European derivative which at time T has the quadratic payoff

VT = ATTQAT .

We may here, without loss of generality (by considering 12 (Q + QT ) if

necessary), assume that Q is symmetric. For illustrative purposes we shallalso impose some extra structure, although this is not essential for the mainpoints made below to hold. We assume that there exists some unitary matrixP (i.e. one for which P TP = I) such that

PT (ΣΣT )P = D1 = diag(σ2i : i = 1, 2, . . . , n) (9.1)

PTQP = D2 = diag(qi : i = 1, . . . , n).

This assumption corresponds exactly to saying that Q and ΣΣT have the sameeigenvectors, namely the columns pi of P , with corresponding eigenvalues σ

2i

Page 244: Financial derivatives in theory and practice

220 Modelling in practice

and qi respectively. Note that the representation (9.1) can always be achievedfor a positive definite matrix Σ, and this is the basis of a principal componentanalysis. In that context the columns of P are referred to as the principalcomponents and it is usual to order these columns such that σ21 ≥ . . . ≥ σ2n.We shall assume this ordering holds here.The original measure P is in fact a martingale measure corresponding to

the numeraire asset A(0), so the model is arbitrage-free. It further followsfrom the theoretical results of Chapter 7 that the market is complete, so thisquadratic payoff can be replicated, and the derivative can be valued by takingexpectations in the martingale measure P. Hence the value at any time t ≤ Tis given by

Vt = A(0)t EP

ATTQAT

A(0)T

Ft = EP ATTQAT |Ft ,

where Ft is the augmented Brownian filtration. Evaluating this expectationis straightforward. First note thatM := D

−1/21 PTΣ satisfiesMMT = I , hence

Wt :=MWt defines a standard Brownian motion. Thence

dAt = ΣdWt = ΣM−1dWt = PD

1/21 dWt

and soAt = A0 + PD

1/21 Wt .

It follows from this representation and the independence of the Brownianincrements and the filtration that

EP (AT −At)TQ(AT −At)|Ft=EP (AT −At)TQ(AT −At)=EP (WT −Wt)

TD1/21 PTQPD

1/21 (WT −Wt)

=EP (WT −Wt)TD1D2(WT −Wt)

=n

i=1

σ2i qi(T − t) .

This in turn yields the option value at time t,

Vt = EP[ATTQAT | Ft] = ATt QAt + (T − t)n

i=1

σ2i qi .

This valuation formula depends on both the dynamics of the asset pricemovements, as summarized by the weightings σ2i given to each componentof the asset covariance matrix, and on the derivative payoff, as summarizedby the corresponding weightings qi. It is the product of the two which matters.

Page 245: Financial derivatives in theory and practice

Product-based modelling 221

Suppose now that we need to approximate the process governing theasset price movements to reduce the dimension of the problem to m. Themost natural choice of model, not taking into account the particulars of thederivative, would be to set to zero the smallest n−m values of σi. This wouldthen give an approximate value to the option of

Vt = ATt QAt + (T − t)

m

i=1

σ2i qi.

In the situation when q1 = · · · = qm = 0, qi 0, i > n, this would be avery poor approximation. The model value would be just the intrinsic value,ATt QAt, whereas the true value is A

Tt QAt+(T−t) n

i=m+1 σ2i qi. This example

illustrates the need to examine the derivative closely when choosing a modelfor the economy.It appears, from the discussion above, that we could price any quadratic

payoff with a low-dimensional model provided we were judicious in our modelchoice. However, this is not the case. Consider, for example, the situation whenqi = 0 for all i < n and qn = 1. A one-factor model might appear to be enoughto find the right price. However, in practice this is not true as we would notknow the covariance matrix Σ precisely. By data analysis or by examining theprices of other derivatives we would come up with some estimate Σ. Our bestestimate of the derivative price, even using a full n-factor model, would nowbe

Vt = ATt QAt + (T − t)qnpTn Σpn .

However, because the payoff is based on the last principal component of thematrix Σ, the variance of this estimate would typically be very large comparedwith the value itself. In other words, we should not even attempt to price thisproduct.

9.3.2 Limit cap valuation

Here we present a further important example using the idea of choosing amodel specifically for an application. Consider again the Gaussian economy ofSection 9.3.1 but restrict to the case when n = 2. Assume that the covariancematrix Σ is known. Suppose we have some derivative to price whose payoff

depends only on A(1)T1and A

(2)T2for some T1 < T2. An example of this type

of product would be one which pays (A(1)T1−K1)+ if this is strictly positive,

otherwise it pays (A(2)T2−K2)+. This is the analogue for this ‘toy’ economy of

the limit cap introduced in Section 18.2.3.We could clearly value this product exactly using a two-factor model. But

suppose, because of issues with the speed of a numerical implementation, thatwe wish instead to use a one-factor model. How should we do this?

Page 246: Financial derivatives in theory and practice

222 Modelling in practice

One approach would be to use principal component analysis. That is,take the matrix ΣΣT and calculate the first principal component p1 andthe corresponding eigenvalue σ21 . Use this component to define a one-factorapproximation to the economy,

At := A0 + σ1p1Wt

for a one-dimensional Brownian motion W . Then calculate the option pricewithin this model, which is given by

V0 = EP A(1)T1−K1 1A(1)

T1>K1 + A

(2)T2−K2 1A(1)

T1≤K1, A

(2)

T2>K2 . (9.2)

This expectation can be evaluated in any of a number of standard ways. Theaccuracy of the model, and hence the price of the option, will depend on howdominant the first principal component is. See also the warnings in Section9.3.1 above. But if the first component is dominant this model will work wellfor many options, provided the option being priced is not ‘nearly orthogonal’to this first component.An alternative approach involves examining the derivative more closely.

Instead of using principal component analysis, we first note that the valuationformula for this option, equation (9.2), depends only on the joint distribution

of the pair (A(1)T1, A

(2)T2). Given the covariance matrix Σ,

Σ =ξ21 ξ1ξ2ρ

ξ1ξ2ρ ξ22,

the vector (A(1)T1, A

(2)T2) is a Gaussian vector with covariance matrix

Σ =ξ21T1 ξ1ξ2ρT1

ξ1ξ2ρT1 ξ22T2.

Now define the one-factor model for this real economy as

dA(1)t := ξ1dWt ,

dA(2)t :=

ρ1t<T1 +T2 − ρ2T1T2 − T1 1t≥T1

ξ2dWt ,

again for a one-dimensional Brownian motion W . The second model is also

one-factor but the joint distribution (A(1)T1, A

(2)T2) exactly matches that of

(A(1)T1, A

(2)T2). Thus the model will give an exact price for the option.

On this occasion the second approach gave the exact option price. It is,of course, not usually possible to achieve this. The cost of such a tailored

Page 247: Financial derivatives in theory and practice

Local versus global calibration 223

approach to modelling is that the model produced will often be a very poor

model for other derivatives — by fitting the joint distribution of (A(1)T1, A

(2)T2) we

have distorted other distributions in the economy quite severely. It is thereforeimperative that one understands well the product in question before applyingthis technique.

9.4 LOCAL VERSUS GLOBAL CALIBRATION

Suppose, based on a general understanding of the market and a given product,we have selected a model for pricing purposes. Now we need to calibratethe model to market data. There are two distinct approaches to calibratingderivative pricing models to market data, the global and the local approaches.We prefer the latter for derivative pricing and hedging purposes, for reasonsgiven below. However, the local approach is only better if you are very carefulabout how you apply the technique and it requires a deeper understanding ofthe product being priced and its relationship to the market more generally.By the global approach to model calibration we mean one which uses a

(large) number of existing vanilla market instruments’ prices as inputs to amodel and then fits the model to approximately price all these input prices.The model is then used to price the derivative to hand. Indeed it is likely thatthe same model would be used to price a number of different exotic derivatives.By contrast to the above, the local approach fits a model to a much smaller

number of market prices, ones which are highly relevant to the particularproduct being priced. The fit to these prices is exact. Every time a differentproduct is considered a different set of calibrating instruments is chosen.One can draw a simple analogue to each of the above techniques. Suppose

we have a series of data points (xi, yi), i = 1, . . . , n, and we know that

yi = f(xi)

where f is some unknown function. The analogue here of the derivative pricingproblem is to find f(x) for some given value x.A global approach to this problem would be to postulate some simple

functional form for f , perhaps linear, quadratic, log-linear, . . . , and then toperform a least squares fit to the points (xi, yi), as illustrated in Figure 9.1.This would be the right thing to do if we did not know beforehand the value

of x, i.e. in the pricing context, if we did not know the derivative to be priced.The advantage of this approach is that it will give reasonable answers over awhole range of input values x, i.e. in the pricing context it will price manyproducts reasonably well. This approach is what an econometrician interestedin general market structure would use.The analogue of local calibration for this example is to fit the function f

using the points close to the given x as shown in Figure 9.2.

Page 248: Financial derivatives in theory and practice

224 Modelling in practice

f (x)

x

Figure 9.1 Least squares fit to data

f (x)

x

Figure 9.2 Local linear fit to data

f (x)

x

Figure 9.3 Overfitting to data

Page 249: Financial derivatives in theory and practice

Local versus global calibration 225

The main benefit of this second approach, the one that matters forderivative pricing, is that it exactly matches the prices of the most relevantinstruments and those which will be used for the hedge. If relevant marketprices are not the values one would expect based on historical experience andother market prices, to an extent this does not matter to us when trying toprice an exotic. If they are out of line now, so should be the exotic’s price. If,over time, the values come back into line, so will the value of the exotic.One of the major dangers of the local approach is that perfect fitting has a

cost. If we fit too many prices exactly this will give very poor results. In oursimple analogue this would correspond to fitting a polynomial locally of toohigh an order, with obvious consequences as shown in Figure 9.3. An exampleof how this can arise in a derivative pricing context is discussed in Section15.4.

Page 250: Financial derivatives in theory and practice
Page 251: Financial derivatives in theory and practice

10

Basic Instruments andTerminology

10.1 INTRODUCTION

To develop the theory and ideas of option pricing beyond the abstract settingof Part I of this book we need to introduce the basic assets that are traded inthe financial markets. It is only with a thorough understanding of these assets,and equally importantly of how the financial markets view these assets, thatwe can start to formulate appropriate models for their evolution. Once wehave done this we can start pricing derivatives.

The remainder of this book is motivated by examples and problems fromthe interest rate markets. Many of the techniques employed carry over to othermarkets, such as equity markets, and are often adaptations of techniquesfirst employed in those other areas. However, interest rate derivatives havetheir own specific characteristics which make them particularly challengingto understand.

The purpose of this chapter is to introduce some of the basic andfundamental products that define the interest rate market. The productsthat we consider are all readily traded and will be the underlying assets onwhich other derivatives are based. We will also introduce standard marketterminology which is important for the discussions that follow.

10.2 DEPOSITS

We begin with the most fundamental instrument in the market, the deposit.A deposit is an agreement between two parties in which one pays the othera cash amount and in return receives this money back at some pre-agreedfuture date, with a pre-agreed additional payment of interest. This interest is,of course, proportional to the amount of cash initially deposited.

Observe that in the above definition a deposit is made until a fixed date.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 252: Financial derivatives in theory and practice

228 Basic instruments and terminology

This would often in the retail market be referred to as a fixed term deposit.It is quite distinct from investing money in a deposit account such as thoseoffered by UK building societies and from which money can be withdrawn atany time.Deposits are available for a range of maturities, or terms. Only a small

number of maturities, none greater than a year, are quoted as standard.Exactly which are quoted depends upon the currency, with typical values beingovernight, 2 days, 7 days, 1, 2, 3, 6 and 12 months. The amount of interest paidat the end of the period is naturally larger for longer periods. The way thisinterest is quoted is via an accrual factor and an interest rate. It is importantto understand exactly what these are as they play an important role in mostof the derivatives that we shall encounter.

10.2.1 Accrual factors and LIBOR

LIBOR, sometimes called spot LIBOR, is the London Interbank Offer Rate.As the name implies, it is the rate of interest that one London bank will offerto pay on a deposit by another. There will, in general, be a different LIBORfor each of the standard deposit maturities.The total amount of interest that the depositing bank will receive is

calculated by multiplying the LIBOR by the amount of time, as a proportionof a year, for which this money has been on deposit. This amount of time iscalled the accrual factor or daycount fraction.Accrual factors are calculated by dividing the number of days in the period

by the number of days in a year. Different markets use different conventionsto calculate these figures and the name given to the particular method is thebasis. Two very common examples are actual/365 and actual/360. In bothcases the number of days in the period is taken to be the exact number ofcalendar days between the payments, but the number of days in a year iscalculated differently. For the actual/360 basis it is assumed to be 360 days;for the actual/365 basis it is assumed to be 365, or 366 in a leap year.It will not be important for us to have a detailed knowledge of bases. All

that we need to know is that the interest payable is calculated via an accrualfactor which is a known number. Clearly it is also the case that any quotationof a LIBOR is for a specified maturity and basis.In summary, if we denote the accrual factor by α and the corresponding

LIBOR by L, a deposit of notional amount A will be repaid at maturityalong with an interest payment of amount AαL. Usually we will suppress thenotional amount and assume it to be unity. These cashflows are illustrated inFigure 10.1.We will often represent products in this way. The horizontal axis represents

the time of cashflows; those above the axis are ones which we receive, whilethose below the line are ones that we pay.

Page 253: Financial derivatives in theory and practice

Forward rate agreements 229

α

A

A L

T

A

α

Figure 10.1 Deposit cashflows

10.3 FORWARD RATE AGREEMENTS

A forward rate agreement, or FRA, is an agreement between twocounterparties to exchange cash payments at some specified date in the future.Associated with each FRA there is a notional amount and two dates, the resetdate T and the payment date S. These two dates will be such that the period[T, S] corresponds exactly to the accrual period for a standard deposit startingon date T , the most common lengths being 3 and 6 months. The reset datewould usually be some multiple of a month out to two years.Under an FRA, one counterparty agrees to pay to the other an amount

AαK, where A is the notional amount, α is the accrual factor for the period[T, S], and K is the fixed rate. In return the second counterparty agrees to payto the first an amount AαLT [T, S], where LT [T, S] is the LIBOR for the period[T, S] that sets on date T . Figure 10.2 summarizes this, with the conventionthat a ‘wavy’ cashflow is not known on the date that the agreement is made.We call this a floating cashflow.

A LT [T,S]

T

α

A Kα

S

Figure 10.2 Cashflows for the holder of an FRA

When, at time t, two counterparties enter an FRA they do so at zero costto both parties, and this is achieved by choosing an appropriate value for the

Page 254: Financial derivatives in theory and practice

230 Basic instruments and terminology

fixed rateK. It is this breakeven value forK that is quoted in the FRA marketand it is referred as the forward LIBOR, Lt[T, S], for the period [T, S], whichwe will tend to abbreviate to Lt when the context is clear. Note, however, thatwhen an FRA is entered its fixed rate is determined but its value will thenchange through time as rates increase and decrease. We will see soon how tocalculate this value.One main purpose of the FRA is to enable a counterparty to fix today

the amount of interest he will receive on a deposit he intends to make atsome future date. It is worthwhile examining why this is the case. Suppose hewishes to invest a unit amount from time T until S but would like to fix todaythe amount of interest he will receive, rather than wait until T when marketconditions may have changed to his disadvantage. To achieve this he sells, orgoes short, an FRA today (although there is no exchange of cash today) at thecurrent forward LIBOR Lt[T, S]. At time T he then invests his unit capital atthe then spot LIBOR LT [T, S] and in return receives his capital back at timeS along with interest αLT [T, S]. In addition, under the FRA he will receiveα(Lt[T, S] − LT [T, S]), so his net income at time S will be his notional plusexactly αLt[T, S], the interest amount he locked in at the time he entered theFRA.

Remark 10.1: In later chapters we will be using models for the forward LIBORprocess Lt[T, S] : t ≤ T. Note that the times T and S are held fixed as tvaries. The spot LIBOR process, Lt[t, t + ∆] : t ≥ 0, for some fixed ∆,is something quite distinct which has different properties and which will notconcern us. It is important to note the distinction. The same comments holdfor the forward swap rate which we treat below.

Remark 10.2: Those familiar with the FRA market will realize that thedefinition we have just given is not strictly correct. In reality the FRA issettled at time T and not at time S, but the amount paid is exactly the valueat T of what we have described above as an FRA. Consequently this could beinvested at LIBOR to yield the cashflows we have described. We will, for easeof exposition, stick to our slightly altered FRA definition which is also moreclosely related to swaps, which we now introduce.

10.4 INTEREST RATE SWAPS

An interest rate swap, which we will abbreviate to swap, is an agreementbetween two counterparties to exchange a series of cashflows on pre-agreeddates in the future. This is illustrated in Figure 10.3.We refer to all the payments we receive as one leg of the swap, and those

we pay constitute the other leg. To specify the swap we must know its start

Page 255: Financial derivatives in theory and practice

Interest rate swaps 231

T

α

n Kα

S1 Sn

n LSn – 1 [Sn – 1, Sn ]

Figure 10.3 Payers interest rate swap

date (the start of the first accrual period), maturity date (the date of the lastcashflow), and the payment frequency, for each of the legs. Typical maturitiesare 1—10, 12, 15, 20 and 30 years. Each leg can in general have a differentpayment frequency, but here we describe the case where it is the same forboth legs.Suppose there are a total of n cashflows in each leg, cashflow i occurring

at time Si. One of the legs will be a fixed leg, for which all the cashflows areknown at the start of the swap. The amount of cashflow i is given by αiKwhere αi is the accrual factor for the period [Si−1, Si] and K is the fixed rateof the swap. For the swap of Figure 10.3 the fixed leg is represented below theaxis, so this is a pay-fixed, or payers swap; the contract with the cashflowsreversed is a receivers swap.The other leg is the floating leg, so called because the payment amounts

will be set during the life of the swap. The amount of payment i is set at timeSi−1 and is the accrual factor αi multiplied by LSi−1 [Si−1, Si], the LIBORthen quoted for the period [Si−1, Si].

Remark 10.3: In actual fact, the presence of holidays and weekends means itis very unusual for a swap to have an exact match between payment datesand the start of the next accrual period. The market allows for this fact whenvaluing swaps. However, when modelling more complex option-type productsit usually simplifies the analysis to assume there is a perfect match. As longas one is careful, this does not cause any problems. In all the modelling in thisbook we will assume that there is a perfect match. However, for notationalconvenience we shall often refer to the start of the ith accrual period asTi (rather than Si−1). Thus the LIBOR appropriate to the ith period isLTi [Ti, Si].

Swaps are usually entered at zero initial cost to both counterparties. Aswap with this property is called a par swap, and the value of the fixed rateK for which the swap has zero value is called the par swap rate. In the casewhen the swap start date is spot (i.e. the swap starts immediately), this isoften abbreviated to just the swap rate, and it is these par swap rates thatare quoted on trading screens in the financial markets. A swap for which thestart date is not spot is, naturally enough, referred to as a forward start swap,

Page 256: Financial derivatives in theory and practice

232 Basic instruments and terminology

and the corresponding par swap rate is the forward swap rate. Forward startswaps are less common than spot start swaps and forward swap rates are notquoted as standard on market screens.We will denote the (forward) par swap rate, at time t for a swap starting

on date T and making fixed payments on dates given by the vector S =(S1, . . . , Sn), by yt[T, S] (often just yt when the context is clear). Note thatthis definition is independent of the payment frequency in the floating leg.This is indeed the case, as we see when we learn how to value swaps in Section10.6.4.We can see from the above that a swap is just a series of FRAs, each having

the same strike K. Note that in Figure 10.3 we have represented the firstcashflow in the floating leg of the swap as an unknown floating payment, eventhough the amount of this cashflow will be set immediately the swap is agreed,based on spot LIBOR. This is a slight abuse of our previous notation but isconvenient since, like all the other cashflows in the floating leg, its amount isbased on LIBOR at the start of the corresponding accrual period.

10.5 ZERO COUPON BONDS

We have already met these in the theory of Part I. Zero coupon bonds (ZCBs),also known as pure discount bonds, are assets which entitle the holder toreceive a cashflow at some future date T . The amount of this cashflow is partof the contract specification, although we will often assume it to be a unitamount.

T

1

Figure 10.4 Zero coupon bond

ZCBs are not generally liquid market instruments. It is not possible, forexample, to find their prices on standard quotation screens. However thereare many in existence, and indeed a deposit, once the initial capital is handedover, is effectively a ZCB, so they certainly exist for all maturities less than ayear. Furthermore, banks will quote prices for most maturities over a year aswell.The main value of introducing ZCBs comes from the fact that other

products can be built up from them. In this sense they are fundamental.

Page 257: Financial derivatives in theory and practice

Discount factors and valuation 233

As an example, the fixed leg of a swap consists of n known cashflows and socan be thought of as n ZCBs. Thus the value of the fixed leg will be the sumof the values of each of these component ZCBs. We shall shortly see how anFRA can also be related to ZCBs.

10.6 DISCOUNT FACTORS AND VALUATION

In this section we introduce discount factors, the most fundamental tool forsummarizing the value of interest rate products. We will see how discountfactors can be used to express the value of the basic instruments we have metand also how they relate to forward LIBOR and swap rates.

10.6.1 Discount factors

Suppose we are at time t. For any time T ≥ t there is a corresponding discountfactor, denoted as DtT , defined to be the value, at time t, of a ZCB paying aunit amount at time T . Note that, for all t, Dtt = 1 since a unit cashflow nowis worth one unit.We will usually assume that the initial discount curve, D0T : T ≥ 0

if today is time zero, is known. This is not strictly the case since thestandard liquid market instruments (which do not include ZCBs of allmaturities) only give us a limited amount of information. In practice bankstake these liquid market instruments and construct a full discount curvesconsistent with their prices. There is no unique way to do this, and what isdone often depends on the use to which the discount curve is to be put. Weshall not discuss the techniques used to construct discount curves in detailand shall assume that this has been done as the starting point for the rest ofour analyses.

10.6.2 Deposit valuation

A standard unit deposit at time T pays at maturity S an amount 1+αLT [T, S].Since there are no further payments other than the initial deposit and the finalredemption, these two cashflows must have the same value at time T . Thus itfollows that

DTT = (1 + αLT [T, S])DTS ,

DTS = (1 + αLT [T, S])−1,

LT [T, S] =DTT −DTS

αDTS. (10.1)

Page 258: Financial derivatives in theory and practice

234 Basic instruments and terminology

10.6.3 FRA valuation

This our first example of derivative pricing. Recall that the net payment underan FRA is given by α(LT [T, S]−K). From (10.1) it can be seen that the finalpayment of an FRA depends on two ZCBs which mature at times T and S,thus the FRA is a derivative of these bonds. We can thus apply the theorydeveloped in Chapters 7 and 8, but for this simple product we can use a moredirect argument.Suppose at time t we have an amount of cash

Vt = DtT − (1 + αK)DtS . (10.2)

Consider the following investment strategy. At time t we buy one unit of theZCB maturing at time T and sell 1 + αK units of the ZCB maturing at timeS. When, at time T , we receive a unit payment from the maturing ZCB, wedeposit this until time S (equivalently buy (1 + αLT [T, S])

−1 units of theother ZCB). At time S the deposit and ZCB mature, giving a net payment of

(1 + αLT [T, S])− (1 + αK) = α(LT [T, S]−K) .This is precisely the payment received under the FRA, and hence we havereplicated the FRA by trading in the two ZCBs. It follows immediately,assuming there is no arbitrage in the economy, that Vt is the value of theFRA.In general, of course, Vt will be non-zero. Recall that the value of K for

which Vt is zero is defined to be the forward LIBOR for the period [T, S] and,from (10.2), is given by

Lt[T, S] =DtT −DtSαDtS

.

We can use this to obtain an alternative representation for the valuationformula (10.2), namely

Vt = αDtS(Lt[T, S]−K) . (10.3)

Note that in this argument, unusually, we have not needed to specify amodel for the evolution of the assets of the economy. This is because thehedging strategy is static up until the time T at which time all payments areknown. We will see in Chapter 11 how to price the same product using thetechniques of Chapter 7.

10.6.4 Swap valuation

A swap is a series of FRAs so it can be valued by adding together the valuesof the constituent FRAs. However, we will instead value the swap directly,considering each of the two legs in turn.

Page 259: Financial derivatives in theory and practice

Discount factors and valuation 235

The fixed leg consists of a series of payments on dates S1, S2, . . . , Sn, thepayment at time Si being αiK. Its value is therefore given by

V FXDt = K

n

j=1

αjDtSj

= KPt[T, S] ,

where S = (S1, S2, . . . , Sn) and

Pt[T, S] :=

n

j=1

αjDtSj .

We will refer to the expression Pt[T, S] as the present value of a basis point,or PVBP, of the swap. It represents the value of the fixed leg of the swap ifthe fixed rate were unity. This is a slight abuse of notation since Pt[T, S] asdefined above is the present value of unity, but one which is common in themarket and which we will always use. A basis point is actually 0.01%, or 10−4,so Pt is 10, 000 times the true PVBP of the swap. The PVBP will play animportant role in some of the products we examine later. When there can beno possible confusion we will usually abreviate Pt[T, S] to Pt.The floating leg of the swap is more difficult to value, comprising as it does

a series of floating payments. However, just as we did for the FRA, we canfind a simple trading strategy which replicates the swap’s floating payments.Suppose at time t we have an amount of cash

V FLTt = DtT −DtSn . (10.4)

Take this cash and buy one ZCB of maturity T and sell one of maturity Sn. Attime T take the unit paid by the ZCB and deposit it at LIBOR until time S1.At time S1 (= T2) we will receive 1+α1LT [T, S1]. The term α1LT [T, S1] is thefloating payment we need to replicate the swap, the extra unit of principal wedeposit at LIBOR until time S2. We repeat this at each floating payment dateuntil at time Sn we receive 1+αnLTn [Tn, Sn]. The LIBOR part of this paymentis what we need for the last floating payment of the swap, the notional paysthe amount owed on the ZCB we sold at time t.It follows that the net value of a payers swap is exactly

Vt = VFLTt − V FXDt

= DtT −DtSn −KPt[T, S] . (10.5)

The forward swap rate yt[T, S] is the value of K which sets this value to zero.Substituting Vt = 0 into (10.5) then yields

yt[T, S] =DtT −DtSnPt[T, S]

. (10.6)

Page 260: Financial derivatives in theory and practice

236 Basic instruments and terminology

Substituting this back into (10.5), we obtain the more usual expression forthe value of a payers swap,

Vt = Pt[T, S](yt[T, S]−K) . (10.7)

The value of a receivers swap is, of course, just −Vt.As a final point on swaps, recall that we allow the floating and fixed legs

to have different payments dates. It is easy to see from the arguments abovethat the value of the floating leg, as given by (10.4), is independent of thefloating payment frequency. It is therefore only the fixed payment frequencythat matters when defining and valuing a forward starting swap.

Page 261: Financial derivatives in theory and practice

11

Pricing Standard MarketDerivatives11.1 INTRODUCTION

In this chapter we apply the techniques of Part I to price some of the morecommon derivatives seen in the market. The models we present have beenaround for quite some time, but it is only recently that the theory has beenunderstood well enough for those working in the area to realize that the pricesderived are not just approximations but are actually theoretically exact, givena suitable initial model.

Most of the derivatives we price here are sufficiently common in the financialmarkets that models are no longer really used to derive the price from modelinputs. Rather a model is selected and the market price is used to implythe parameters of the pricing model (such as volatility) which would givethe observed market price. The model is then used to calculate the hedgefor the derivative in terms of the underlying instruments. Furthermore, thesederivatives are themselves often used as underlying instruments for other morecomplex derivatives. We shall see several examples of this kind in the chaptersthat follow.

11.2 FORWARD RATE AGREEMENTS AND SWAPS

We have already met the FRA in Chapter 10, where we derived the price via asimple replication argument in terms of zero coupon bonds. Recall that underan FRA one counterparty makes a fixed payment αK at some time S and inreturn receives an amount αLT[T, S] on the same date. We saw that the valueat time t is given by

Vt = DtT − (1 + αK)DtS.

This is an unusual example in that we have not needed to specify a model forthe evolution of asset prices in order to price the derivative. This is because

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 262: Financial derivatives in theory and practice

238 Pricing standard market derivatives

there is in this case a static replicating portfolio. It is instructive, however,to use the tools developed in Chapter 7 to price the FRA; the resulting priceshould be independent of the model we choose. So suppose we have somemodel for the evolution of discount bonds. Following the approach of Chapter7, if (N,N) is some numeraire pair and if Ft is the filtration generated bythe assets in the economy, then the price of the FRA is given by

Vt = NtEN α(LT −K)N−1S |Ft= NtEN α(LT −K)EN N−1S FT Ft= NtEN α(LT −K)DTSN−1T |Ft= DtT − (1 + αK)DtS ,

the last equality following by substituting for LT using equation (10.1). Aswap could be valued in exactly the same way, but as we saw in Chapter 10it is just a linear combination of FRAs and pure discount bonds, and so itsvalue follows immediately by summing the values of these FRAs and ZCBs.

11.3 CAPS AND FLOORS

It is often the case that a customer is either making or receiving a series offloating payments and does not wish to convert them into a series of fixedpayments. This may be because he believes future rate moves will be in hisfavour. However, he is then exposed if rates move against him and would liketo buy some protection against this without removing the benefits of the movein rates that he expects. The solution in this case is for the customer to buya cap or a floor.Caps and floors are similar to swaps in that they are made up of a series

of payments on regularly spaced dates, Sj , j = 1, 2, . . . , n. On date Sj theholder of a cap receives a payment of amount

αj max LTj [Tj, Sj ]−K, 0 ,

where Tj := Sj−1 is the setting time for the LIBOR which pays at time Sj.A floor is similar except the payment amount is given by

αj max K − LTj [Tj , Sj ], 0 .

The constant K in these expressions is part of the contract specification andis known as the strike of the option. These payoff profiles are shown in Fig-ures 11.1 and 11.2. A counterparty who is paying LIBOR and who buys acap has ensured that he will never pay more that αjK at time Sj ; one whois receiving LIBOR and buys a floor will never receive less than α jK. Each

Page 263: Financial derivatives in theory and practice

Caps and floors 239

3.00%

2.50%

2.00%

1.50%

1.00%

0.50%

0.00%4.00% 5.00% 6.00% 7.00% 8.00% 9.00% 10.00%

Caplet value

Caplet payoff

Figure 11.1. Caplet payoff profile and value

3.00%

2.50%

2.00%

1.50%

1.00%

0.50%

0.00%4.00% 5.00% 6.00% 7.00% 8.00% 9.00% 10.00%

Floorlet value

Floorlet payoff

Figure 11.2. Floorlet payoff profile and value

individual payment is usually referred to as a caplet or floorlet. At any timet ≤ T , the amount αj(Lt[Tj, Sj] − K)+ is referred to as the intrinsic value ofthe caplet and, similarly, αj(K − Lt[Tj, Sj])+ is referred to as the intrinsicvalue of the floorlet. In either case, if the intrinsic value is positive then the

Page 264: Financial derivatives in theory and practice

240 Pricing standard market derivatives

option is said to be in the money, whereas if it is zero the option is out of themoney. When Lt[Tj, Sj] = K both options are said to be at the money.

Since caps and floors are linear combinations of caplets and floorlets, itsuffices to price these single payments. Caplets and floorlets are usuallyconsidered as derivatives of FRAs, but can of course also be seen as derivativesof ZCBs. Which is done depends on the application. We will consider theformer here, and will derive the market standard pricing formula.

11.3.1 Valuation

The payoff at some time S from a single caplet setting at time T is just

VS = α(LT[T, S] − K)+

and so its time-t value is given, dropping the [T, S ] from the notation, by

Vt = NtEN[α(LT − K)+N−1S |Ft], (11.1)

for some numeraire pair (N, N). To calculate this price we must choose anumeraire N and a suitable model for L[T, S] in the measure N. Supposewe choose Nt = DtS, the discount bond maturing on the payment date S.The corresponding measure N is usually referred to as the forward measureand we denote it by F. Using this measure has two important consequences.First, NS = DSS = 1 and so this disappears from equation 11.1. Secondly, theforward LIBOR L, which is of the form

Lt =DtT − DtS

αDtS

=DtT − DtS

αNt,

is a ratio of asset prices over the numeraire and so must be a martingale. Aslong as we model L as a martingale (under F) we will have a model that isarbitrage-free and the caplet value will be given by (11.1). Because interestrates are always assumed to be positive, we will model L as a log-normalmartingale,

dLt = σtLtdWt,

for some deterministic σ and a Brownian motion W. This is standard marketpractice and yields a solution

Lt = L0 exp(∫ t

0σudWu −

12

∫ t

0σ2

udu

).

Page 265: Financial derivatives in theory and practice

Caps and floors 241

Substituting into (11.1) now yields

Vt = DtSEF [α(LT −K)+|Ft]

= DtSEF α LtexpT

t

σudWu − 12

T

t

σ2udu −K+

Ft

= αDtS LtN(d1)−KN(d2) (11.2)

where

d1 =log(Lt/K)

σt√T − t +

12 σt√T − t, (11.3)

d2 =log(Lt/K)

σt√T − t −

12 σt√T − t,

σ2t =1

T − tT

t

σ2udu .

This is Black’s formula, and was first published in Black (1976). The derivationat that time used a series of approximations. Criticisms of the assumptionsmade have been that the model admits arbitrage and that inconsistentassumptions are made about the evolution of the discount factor (taken to bedeterministic) and the forward rate (taken to be stochastic). Whereas thesecriticisms may be valid for the thought process at the time, they are not validfor the above model and pricing formula, as the earlier analysis shows.Pricing of floorlets is identical and yields the usual put option formula:

Vt = αDtSEF (K − LT )+|Ft= αDtS KN(−d2)− LtN(−d1) . (11.4)

Figures 11.1 and 11.2 show, for some choice of the model input parameters,the value at t of a caplet and floorlet as a function of the forward LIBOR L t.

11.3.2 Put—call parity

There is a simple relationship between caplets, floorlets and FRAs which mustalways hold, namely

V CAPt − V FLOORt = α(Lt −K)DtS . (11.5)

This can be verified by substituting (11.2) and (11.4) into (11.5), but it mustalso hold more generally, irrespective of the model being used. This followssince

(LT −K)+ − (K − LT )+ = (LT −K)

Page 266: Financial derivatives in theory and practice

242 Pricing standard market derivatives

and thus, recalling the FRA valuation formula (10.3),

NtEN[α(LT −K)+N−1S |Ft]−NtEN[α(K − LT )+N−1S |Ft]= NtEN[α(LT −K)N−1S |Ft]= α(Lt −K)DtS .

11.4 VANILLA SWAPTIONS

Just as a caplet is an option on an FRA, a swaption is an option on a swap.Swaptions are also commonly traded in the market and are priced usingBlack’s formula. There are two types of swaption: receivers swaptions (referredto as receivers) and payers swaptions (referred to as payers). In the first case,upon exercise the option holder enters a swap in which he receives a fixed rateK, the swaption strike, and pays the floating rate; a payers is the reverse. Werefer to this as a vanilla swaption if the underlying swap is a plain vanillainterest rate swap.We now price a payers swaption. Let Ut[T, S] be the value at time t of a

vanilla payers swap which starts on date T and makes fixed payments on thedates S = (S1, . . . , Sn). The effective payoff to the option holder, who has theright but not the obligation to enter the swap, is given at the option expiryT by

VT = max(UT , 0)

since he will only enter the swap if it is to his advantage to do so. The valueat time t of the swaption is, by the usual valuation formula, given by

Vt = NtEN[max(UT , 0)N−1T |Ft]= NtEN[PT (yT −K)+N−1T |Ft] , (11.6)

for some suitable numeraire pair (N,N). The final equality above follows bysubstituting (10.7) for the value of a swap, where PT , yT and K are, as usual,the PVBP and par swap rate at T and the fixed rate for the underlying swap.We usually refer to K as the swaption strike. At any time t ≤ T we use theterminology in the money, out of the money and at the money just as we didfor caplets and floorlets according to the sign of (yt −K).To evaluate (11.6) we follow a procedure similar to that for the FRA. On

this occasion we choose P as the numeraire, in which case the correspondingmartingale measure is called the swaption measure and will be denoted by S.This reduces (11.6) to the form

Vt = PtES[(yT −K)+|Ft] ,

Page 267: Financial derivatives in theory and practice

Vanilla swaptions 243

which should by now be quite familiar. We recall from (10.6) that the forwardswap rate yt is of the form

yt =DtT −DtSn

Pt,

which is the ratio of asset prices over the numeraire, so must be a martingaleunder S. Since we want interest rates to remain positive, we will model y tobe a log-normal martingale,

dyt = σtytdWt , (11.7)

for some deterministic function σ and a Brownian motion W . This puts usin precisely the same framework as when we priced caplets and floorlets andagain yields Black’s formula for the price,

Vt = Pt ytN(d1)−KN(d2) (11.8)

where

d1 =log(yt/K)

σt√T − t +

12 σt√T − t,

d2 =log(yt/K)

σt√T − t −

12 σt√T − t,

σ2t =1

T − tT

t

σ2udu .

The corresponding receivers valuation formula is just

Vt = Pt KN(−d2)− ytN(−d1) ,

and the put—call parity relationship becomes

V PAYt − V RECt = Pt(yt −K) .

Remark 11.1: The choice of P as numeraire means that expression (11.7) isthe only distributional assumption made, despite the fact that the underlyingswap is made up of several ZCBs. For other applications we will need to makeextra modelling assumptions, and this is the topic of the chapters that follow.

Remark 11.2: Precisely the same criticisms have been levelled against Black’sformula for swaption pricing as for caplet and floorlet pricing. Again thesecriticisms have been shown to be invalid (Neuberger 1990; Jamshidian 1997),on this occasion by working in the swaption measure. This is the first examplewe have met where the numeraire is not the price of a single asset in theeconomy. Note that when the swap comprises a single payment date the swapreduces to an FRA and the swaption measure reduces to the forward measure.

Page 268: Financial derivatives in theory and practice

244 Pricing standard market derivatives

11.5 DIGITAL OPTIONS

A digital option is one which pays either one or zero at some future date,depending on the level of some index rate. The two most common examplesare digital caps and floors, and digital swaptions.

11.5.1 Digital caps and floors

Just as caps and floors are made up of a series of caplets and floorlets, sodigital caps and floors are made up of digital caplets and floorlets. A digitalcaplet is an option which pays a unit amount at time S if at T the LIBOR forthe period [T, S] is above some strike level K. A floorlet pays a unit amountif the LIBOR is below the strike.The value at S of a digital caplet is given by VS = 1LT>K, and so it

follows thatVt = NtEN[1LT>KN

−1S |Ft] .

Taking DtS as numeraire, as we did for caps and floors, and using the samelog-normal model, yields

Vt = DtSEF[1LT>K|Ft]= DtSF(LT > K|Ft)= DtSN(d2) , (11.9)

where d2 is as defined in (11.3). The analogous digital floorlet pricing formulais then

Vt = DtSEF[1LT<K]

= DtSN(−d2) , (11.10)

and the put—call parity relationship is now

V DCAPt + V DFLOORt = DtS .

Remark 11.3:What we have just done in (11.9) and (11.10) is derive the priceof a digital caplet and floorlet using the same log-normal model as we did toprice a standard caplet and floorlet. Even when alternative models are usedfor the forward LIBOR there is still a fundamental relationship between theprices of caps and floors on the one hand and digital caps and digital floorson the other. This relationship essentially follows from the identity

d

dK(x−K)+ = −1x>K .

The more general relationship will be exploited further in Chapters 14 and19. The reader may wish to derive it now.

Page 269: Financial derivatives in theory and practice

Digital options 245

11.5.2 Digital swaptions

Digital swaptions are less common than digital caps and floors, and theyare also more difficult to price. There are, once again, two types of digitalswaptions, payers and receivers. In the case of a digital payers swaption theoption holder receives a unit amount at some date M if the index swap rate,yT , is above the strike K on the setting date T . For a digital receivers theoption holder receives the unit payment if the swap rate is below the strikeK.In the case of swaptions there is no obvious choice for the option payment

date M relative to the rate setting date T . Common choices are M = Tor M = S1, the first payment date of the underlying swap. Considering thepayers digital, the value at time M is 1yT>K and so the time-t value of theoption is

Vt = NtEN[1yT>KN−1M |Ft]

= NtEN[1yT>KDTMN−1T |Ft] ,

the last equality following by conditioning on FT . To value the digitalconsistently with standard swaptions, we would like to take the same modelas in Section 11.4. Working in swaption measure, we have

Vt = PtES[1yT>KDTMP−1T |Ft] . (11.11)

Evaluation of the expectation on the right-hand side of (11.11) presents uswith a new problem. The expectation involves three distinct random terms,namely yT , DTM and PT . We know that within our model yT is a log-normalmartingale, but we have not yet specified a suitable model for the other twoterms. To price this product is thus beyond the scope of this current chapter.Chapters 13 and 14 show how this type of product can be treated.

Page 270: Financial derivatives in theory and practice
Page 271: Financial derivatives in theory and practice

12

Futures Contracts12.1 INTRODUCTION

In Chapter 10 we met the forward rate agreement (FRA) and saw how itcould be used to lock in today a rate of interest on a deposit or loan which isscheduled for some future date. This has the effect, therefore, of immunizingthe buyer (or seller) from future changes in interest rates. One drawback withthe FRA, however, is that it is a contract traded over the counter, directlybetween the two counterparties. If, before the settlement date of the contract(the date on which the cashflows occur), one of the counterparties were todefault, the other would again be left exposed to interest rate moves thathad occurred between the date when the FRA was entered and the date ofthe default.

The Eurodollar futures contract is a derivative designed to provide similarprotection from interest rate moves as the FRA but without the accompanyingcounterparty risk. Other futures contracts fulfil a similar role in othermarkets, such as equity markets, and for other interest rate instruments,such as government bonds. Futures contracts generally are amongst the mostimportant and liquidly traded instruments.

In this chapter we define and analyse the futures contract. We shall do thisin some generality since the extra abstraction required to do so makes theanalysis no more difficult but clarifies some of the underlying issues. As weshall see, any futures contract is defined with reference to some other financialinstrument, commodity or derivative. We will derive the relationship betweenthe futures price (which is in fact not a price at all) and the forward price orforward rate of the associated instrument.

12.2 FUTURES CONTRACT DEFINITION

Here, in Section 12.2.1, we describe the futures contract and how it works inpractice. This is followed, in Section 12.2.2, by a brief discussion of how the

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 272: Financial derivatives in theory and practice

248 Futures contracts

futures price is related to the price of the associated reference instrument,something that is not immediately clear from the contract definition. Wethen, in Section 12.2.3, give a mathematical formulation of the product andset notation.

12.2.1 Contract specification

The first important point to note about futures contracts, one whichdistinguishes them from the derivatives discussed so far, is that they are alltraded on an exchange. When one counterparty buys a futures contract thereis a second counterparty who simultaneously sells the same contract at thesame price. However, neither counterparty knows who the other counterpartyis and both counterparties, when buying and selling respectively, enter a legalagreement with the exchange rather than with each other. The exchange is afinancially secure organization, backed by its members. It is very unlikely todefault because of this backing and because the agreements it enters are setup in such a way that it does not become exposed to defaults by any tradecounterparties.Futures contracts are defined relative to some other reference instrument or

commodity, examples being interest rates and stock prices. In the interestrate market this reference instrument is always a standard liquid marketinstrument, the FRA in the case of a Eurodollar futures contract. The aim ofthe future is to give the counterparties roughly the same exposure to marketmoves as if they had done the reference trade, but without the associatedcredit risk. How the contract defined here simultaneously achieves these twoobjectives is discussed in Section 12.2.2 below.Associated with any futures contract there is a settlement date T and a

futures price process, Φs : 0 ≤ s ≤ T. A counterparty who buys thefutures contract at time t, when the futures price is Φt, agrees to pay to theexchange the net amount Φt − ΦT over the time interval [t, T ]. The precisepayment dates and payment amounts, some of which may be positive and somenegative, are determined by the evolution of the futures price process over theinterval [t, T ]. We will say more on this process and on its relationship to theprice of the reference instrument later, but for now it is sufficient to knowthat the futures price is always available on the exchange (during openinghours) and that the settlement price ΦT is derived, according to simple rules,from the price of the reference instrument. For example, in the case of theEurodollar futures contract the final futures price is given by

ΦT := 100 1− LT [T, S]where, as usual, L[T, S] is the forward LIBOR for the period [T, S] (which isalways a three-month period for the Eurodollar future).To complete the definition of the future we need to specify the payment rules

which determine precisely how the (possibly negative) net amount Φt − ΦT

Page 273: Financial derivatives in theory and practice

Definition 249

is paid. When a counterparty buys a futures contract at time t (similar rulesapply to a sold future) he places with the exchange an initial margin. (Thisamount is set with reference to the volatility of the price of the referenceinstrument. The details of this will not concern us.) Over time the futuresprice will change from its initial value Φt, sometimes going up, sometimesgoing down. At the end of every day the exchange checks the closing futuresprice. If the price has gone up during the course of the day the exchangecredits to the counterparty’s margin account the amount of that increase; ifit has gone down the exchange debits the amount of the decrease. Wheneverthe balance exceeds the initial margin level the counterparty has the right to(and therefore should) withdraw any excess credit amount above the initialmargin from the account. On the other hand, whenever the balance dropsbelow the maintenance margin level, a fixed level which is less than the initialmargin amount, he must bring the balance back up to the initial margin levelagain. At the expiry of the contract he may close the account and remove allremaining funds. Thus the net payment made by the counterparty over theinterval [t, T ] is indeed precisely the amount Φt − ΦT .

12.2.2 Market risk without credit risk

As explained above, futures contracts are designed to allow a counterpartyto gain exposure to or protection from various market movements withouthaving to take on the accompanying counterparty risk. Here we explain whythe contract just described has indeed achieved each of these objectives.First consider the issue of counterparty risk. For the exchange to suffer a

loss due to a credit default one of its customers must fail to make a marginpayment when requested to do so. Consider the case when the counterpartybought the future. In the event of default, as soon as it becomes clear, at times, that a counterparty is not going to honour his obligations and make themargin payment, the exchange cancels the contract with the counterparty andsells the same futures contract to some other counterparty at the then futuresprice Φs. By entering this new contract the exchange ensures that it willreceive from this new counterparty an amount Φs−ΦT over the period [s, T ].On the other hand, at that same moment the total amount that the exchangehas received from the defaulting counterparty is Φt − Φs plus the balancein the margin account. Therefore as long as this balance is not negative theexchange has no exposure to the defaulting counterparty. The levels of theinitial and maintenance margins are set in such a way that the likelihood ofthis balance being negative, as the result of a large market move between thedefault on a margin call and the close out of the position, is small.This discussion demonstrates that the exchange has only a very small credit

exposure to its counterparties. But this in turn means these counterpartieshave only a small credit exposure to the exchange. Because the exchange isnot running credit risk and does not itself take market risk, and because it has

Page 274: Financial derivatives in theory and practice

250 Futures contracts

the financial backing of its members, there is very little chance that it wouldever default. And if it did the exposure of a counterparty to the exchange islimited to an amount which is approximately the size of the initial margin.The above discussion demonstrates the effectiveness of the futures contract

in removing credit risk. What is less clear from the original contract definitionis that the future also gives the desired exposure to market moves. Here wewill briefly explain why this is the case. To do this we will consider the specificexample of the Eurodollar future.Recall that the final settlement price of the Eurodollar futures contract is

given by ΦT = 100(1− LT [T, S]) for some given period [T, S]. Here we arguethat the futures price at time t is approximately given by

Φt 100(1− Lt[T, S]) . (12.1)

If this is the case then a counterparty buying the future at time t will payapproximately the net amount

Φt − ΦT 100(Lt[T, S]− LT [T, S])

to the exchange over the period [t, T ]. To see why (12.1) is roughly what onewould expect, consider also the FRA on which the future is based.Suppose that the relationship (12.1) were exact. Had the counterparty,

instead of buying the future, bought a notional amount 100/α of thecorresponding FRA (α being the accrual factor for the period [T, S]) then,at time S, he would have paid the net amount 100(Lt[T, S] − LT [T, S]). Byassumption this is exactly the same amount as for the future. But which isbetter? If the forward LIBOR rises over the interval [t, T ] the counterparty willin each case be making a net positive payment. Under the FRA this paymentis all made at time S > T so this would generally be the better contractto have entered (better to pay later than earlier). On the other hand, if theforward LIBOR dropped over the interval [t, T ] the counterparty will in eachcase receive a net positive payment and so would like to receive it early. Soon this occasion the future would have been better.The preceeding paragraph is intended to give a heuristic argument to justify

why the futures price of the Eurodollar future should be approximately givenby (12.1) and thus why the future gives a market exposure similar to theFRA. If (12.1) were true then both contracts make the same payment, withthe timing sometimes favouring the future, sometimes favouring the FRA. Theexact relationship between the futures price and the forward LIBOR is a littlemore subtle and depends, as one might guess, on the relationship between theprice or rate on which the contract is based and the discount curve over thetime period [t, T ]. For the Eurodollar future, for example, the purchaser ofthe future pays money to the exchange when the reference LIBOR goes up.But when LIBOR goes up discount factors generally go down, which leads

Page 275: Financial derivatives in theory and practice

Definition 251

to an increased advantage to the FRA under which the payment is later.Furthermore, when the reference LIBOR goes down the purchaser receivesmoney from the exchange and discount factors generally go up, which reducesthe disadvantage to the FRA under which the payment is received later. Thesetwo effects combine and lead to a futures price which is actually less thanthat given by (12.1). Quantifying this relationship is the primary topic of theremainder of this chapter.

12.2.3 Mathematical formulation

In this section we give a more mathematical description of a futures contract.Throughout we shall suppose we are working on some filtered probabilityspace (Ω, Ft,F ,P) where Ft is the augmented natural filtration for somed-dimensional Brownian motion.A futures contract is specified by the following:

(i) the settlement date T ;(ii) the settlement amount H , an FT -measurable random variable;(iii) a resettlement arrangement, which determines when payments to and

from the counterparties are to take place.

For a particular futures trade, entered at time t say, it is also important toknow the entry time since this is required to determine the amount of the firstresettlement payment.We will denote by ΦHt : 0 ≤ t ≤ T the futures price process corresponding

to this contract. The contract resettlement times will be denoted by 0 < T1 <. . . < Tn = T .In common with other authors we will make a simplifying assumption at

this stage. We shall suppose that both the initial and variation margins arezero. If we expect movements in the futures price process over the life of thecontract to be large compared with the size of the margin this will not be anunreasonable assumption. Note also that a non-zero margin in an arbitrage-free world is not possible (without a modification to the contract definition)since both the buyer and the seller must place margin with the exchange.With this simplifying assumption regarding the margin, the payment made

at time Ti, 1 ≤ i ≤ n, by a counterparty who buys a futures contract at timet will be for the amount

ΦHTi∨t − ΦHTi−1∨t .

Note that this payment will be zero if Ti < t and that the first non-zeropayment, at Ti∗ say, will be for the amount

ΦHTi∗ − ΦHt .

Page 276: Financial derivatives in theory and practice

252 Futures contracts

12.3 CHARACTERIZING THE FUTURES PRICEPROCESS

Throughout the rest of this chapter we will assume the existence of the risk-neutral measure Q equivalent to P and a short-rate process r such that if Ais any asset in the economy then ζA is an (Ft,Q) martingale where

ζt = exp −t

0

ru du .

Recall this assumption implies that the economy is arbitrage-free, and weshall also assume it to be complete. One consequence of market completenessis that the cash account, ζ−1, is a numeraire. We choose to use the languageof numeraires throughout the remainder of this chapter rather than that ofpricing kernels.Our primary interest is in futures contracts in the interest rate markets,

but under the assumptions above the discussion applies equally well to othermarkets.

12.3.1 Discrete resettlement

The futures contract specification of Section 12.2.3 completely defines thefutures price process ΦH , as we now show.Let (N,N) denote a numeraire pair for the economy. From the theory of

Chapter 7, the total value of these payments at time t is given by

Vt = NtEN

n

j=1

(ΦHTj∨t − ΦHTj−1∨t)N−1Tj Ft

. (12.2)

The property that a futures contract is entered at zero cost corresponds toVt = 0 for all t ∈ [0, T ]. The futures price process can thus be recovered from(12.2) by backwards induction as follows. First recall that

ΦHTn = H.

Now fix i < n and assume that we have found ΦHTk for all i < k ≤ n. Then,setting t = Ti in (12.2) and rearranging yields

ΦHTi =EN ΦHTi+1N

−1Ti+1

|FTi + ENnj=i+2(Φ

HTj− ΦHTj−1)N−1Tj |FTi

EN[N−1Ti+1

|FTi ].

This defines the futures price ΦHTi for i = 0, 1, . . . , n. For general t ∈ (Ti, Ti+1)we can now use (12.2) to obtain

ΦHt =EN ΦHTi+1N

−1Ti+1

|Ft + EN nj=i+2(Φ

HTj− ΦHTj−1)N−1Tj |Ft

EN[N−1Ti+1

|Ft].

Page 277: Financial derivatives in theory and practice

Characterizing the futures price process 253

In principle the above completely characterizes the process Φ. However, aclosed-form solution can only be found in very special cases, for example underGaussian assumptions. A mathematically cleaner and more enlightening resultcan be obtained by considering the limit as the inter-resettlement times aretaken to zero. This we do in the next section.

12.3.2 Continuous resettlement

Typically for a futures contract the settlement time T is large compared to theinter-resettlement time interval. In US dollars, for example, Eurodollar futurescontracts are traded with settlement times up to ten years. From a modellingviewpoint it is therefore reasonable to consider the limiting situation as theinter-resettlement time period decreases to zero, i.e. continuous resettlement.We have just seen in Section 12.3.1 that the futures price process is

characterized by equation (12.2). The summation inside the expectation in(12.2) can be interpreted as the gain, associated with the numeraire N,between times t and T for a contract entered at t. That is, it is the numberof units of the numeraire N which would be held at time T if all cashreceipts between t and T were immediately invested in the numeraire. Fora given numeraire N and an arbitrary partition ∆[t, T ] = t = T ∆0 , T∆1 , . . . ,T∆m∆ = T of the interval [t, T ], define the associated gain to be

G∆[t,T ]t (ΦH , N) =

m∆

j=1

(ΦHT∆j− ΦHT∆

j−1)N−1

T∆j

. (12.3)

Assuming we are given a futures price process Φ, the following resultidentifies the gain process in the limit as the partition spacing converges tozero.

Theorem 12.1 Let Φ and N be continuous semimartingales on(Ω, Ft,F ,P) with N > 0 a.s. and let ∆m[t, T ], m > 0, be a nested sequenceof partitions of the interval [t, T ] with partition spacing converging to zero asm→∞. Then

G∆[t,T ]t (Φ, N)→

T

t

N−1u dΦu +T

t

d[N−1,Φ]u (12.4)

in probability as m→∞.Proof: First note that G

∆[t,T ]t (Φ, N), as defined by (12.3), can be rewritten

as

G∆[t,T ]t (Φ, N) =

m∆

j=1

N−1T∆j−1(ΦT∆

j−ΦT∆

j−1)+

m∆

j=1

(N−1T∆j

−N−1T∆j−1)(ΦT∆

j−ΦT∆

j−1) .

(12.5)

Page 278: Financial derivatives in theory and practice

254 Futures contracts

The first sum on the right-hand side of (12.5) is a Riemann sumapproximation to the first (stochastic) integral on the right-hand side of (12.4).It can be shown that this sum converges in probability to this stochasticintegral. The reader is referred to Proposition IV.2.13 of Revuz and Yor (1991).A proof that the second sum on the right-hand side of (12.5) converges inprobability to the second integral on the right of (12.4) can also be found inRevuz and Yor (1991, p. 115).

Remark 12.2: Had we imposed further restrictions on the partitions allowedin the above result, we could have established almost sure convergence of thesums to the integrals. For the type of restrictions required and the method ofproof, see for example Theorem 4.15 and Definition 3.71.

Remark 12.3: The second integral in (12.4) is significant. It has arisen becausefor a futures contract payment is made at the end of the time interval, once theprice increment is known. This contrasts with the case of a trading strategywhere the portfolio must be selected at the start of the interval, before theprice increment is known. This latter case, of course, led to the stochasticintegral.

Theorem 12.1 shows us how we should define the gain for a given futuresprice process corresponding to a contract with continuous resettlement. Firstdefine the continuous process G(ΦH , N) by

Gt(ΦH , N) :=

t

0

N−1u dΦHu +t

0

d[N−1,ΦH ]u . (12.6)

The gain over the interval [t, T ] from holding the futures contract can be seento be precisely

GT (ΦH , N)−Gt(ΦH , N) =

T

t

N−1u dΦHu +T

t

d[N−1,ΦH ]u .

Equation (12.2), the requirement that a futures contract can at any time t beentered at zero cost, now becomes

0 = Vt = NtEN GT (ΦH , N)−Gt(ΦH , N)|Ft . (12.7)

This is precisely the statement that G(ΦH , N) is a martingale under themeasure N. Equation (12.2), the analogue of (12.7), characterized the futuresprice process in the discrete case. This is no longer true under continuousresettlement. What we do know from (12.6) is that

ΦHt = ΦH0 +

t

0

NudGu(ΦH , N) +

t

0

d[N,G(ΦH , N)]u . (12.8)

We will use this representation in the next section.

Page 279: Financial derivatives in theory and practice

Recovering the futures price process 255

12.4 RECOVERING THE FUTURES PRICE PROCESS

To recover the futures price process from (12.8) we will now specialize totaking the cash account ζ−1 as numeraire. The reason for working in the risk-neutral measure is that ζ is a finite variation process. As such the quadraticvariation term in (12.8) is identically zero and so Φ is a local martingale underQ. This leads us to make the following definition.

Definition 12.4 We say that ΦH is a futures price process with settlementvalueH at time T if the process G(ΦH , N) defined by (12.6) is a Qmartingale.Note, in particular, that ΦH is then a local martingale.If, additionally, ΦH is a true martingale, and thus

ΦHt = EQ[ΦHT |Ft] = EQ[H |Ft] , (12.9)

we say that ΦH is a martingale futures price process.

For a general futures price process, even when working with ζ−1 asnumeraire, it is still unclear how to establish the existence of and how torecover the process ΦH from H and T , the problem being the need to solvefor ΦH and G(ΦH , ζ−1) simultaneously. It is also difficult to check, and indeedneed not be true in general, that ΦH is uniquely specified by (12.7) and (12.8).If we restrict to martingale futures price processes the situation is clearer.

Recovering ΦH from the settlement value H is immediate from the martingaleproperty (12.9). Given any particular H , to establish that (12.9) defines afutures price process it is still necessary to check that the resulting gainprocess G(ΦH , ζ−1) is also a martingale. This can be done on a case by casebasis, but see Theorem 12.6 below. That, for a given H , there is at most onecorresponding martingale futures price process is easily established.

Theorem 12.5 There exists at most one martingale futures price processcorresponding to any settlement value H and settlement date T .

Proof: If ΦH and ΦH are two such processes then M := ΦH − ΦH is amartingale with MT = ΦHT − ΦHT = 0. Hence M is identically zero and

ΦH ≡ ΦH .Note that the above result has not in general ruled out the possibility of

another (local martingale) futures price process.We conclude this section with a result concerning the existence of a

(martingale) futures price process under special restrictions on the randomvariable H and the cash account ζ−1.

Theorem 12.6 Suppose that the random variable H satisfies EQ[H2] <∞and that ζ is bounded above on the time interval [0, T ] (which holds, forexample, if interest rates are positive). Then the martingale ΦH defined by

ΦHt := EQ[H |Ft]is a futures price process corresponding to H .

Page 280: Financial derivatives in theory and practice

256 Futures contracts

Proof: Since EQ (ΦHT )2 = EQ[H2] < ∞ the martingale ΦH is a square-

integrable martingale over the interval [0, T ]. The process ζ, being bounded

both above and below (by zero) lies in L2(Φ). Hence Gt(ΦH , ζ−1) := t

0 ζudΦHu

also defines a (square-integrable) martingale, by Corollary 4.14, and ΦH is afutures price process with corresponding gain G(ΦH , ζ−1).

12.5 RELATIONSHIP BETWEEN FORWARDS ANDFUTURES

A forward contract is a European derivative under which one counterpartypays to the other an unknown amount H at time S and in return receivesa known amount on the same date. This differs from a standard Europeanderivative only in that the ‘premium’ payment is delayed until the time ofthe derivative payoff. For the forward there is no other payment, in particularwhen the trade is done, so the forward contract has zero value on the tradedate.For a given payoff H and payment date S, the fixed payment FHt which

gives the forward contract zero value at time t is called the forward price. Wehave already met an example of a forward contract, the FRA. In this case theforward price is, of course, referred to the forward rate or forward LIBOR.It is important to understand the relationship between the forward price for

the payoffH at time S and the corresponding futures price. Both products areliquidly traded and their relative pricing is therefore important. In general,as is the case for the FRA and Eurodollar futures contract, the future mayexpire at some time T ≤ S and we allow this possibility in the discussion.The forward price is easily determined. For a forward contract agreed at

time t, the net payment at time S is for the amount H − FHt . This has zerovalue at time t and thus, valuing in the risk-neutral measure,

0 = ζ−1t EQ (H − FHt )ζS |Ft ,

which yields

FHt =ζ−1t EQ[HζS |Ft]ζ−1t EQ[ζS |Ft]

=ζ−1t EQ[HζS |Ft]

DtS.

On the other hand, we have just seen in Section 12.4 that the correspondingmartingale futures price is given by

ΦHt = EQ[H |Ft] .

The difference between the two, which is often referred to as the futures

Page 281: Financial derivatives in theory and practice

Relationship between forwards and futures 257

correction, is thus given by

ΦHt − FHt = EQ[H |Ft]− ζ−1t EQ[HζS |Ft]ζ−1t EQ[ζS |Ft]

=ζ−1t EQ[ζS |Ft]EQ[H |Ft]− ζ−1t EQ[HζS |Ft]

ζ−1t EQ[ζS |Ft]

= − ζ−1t covQ(H, ζS |Ft)ζ−1t EQ[ζS |Ft]

= −ζ−1tcovQ(H, ζS |Ft)

DtS.

In the particular case when t = 0 this reduces to

ΦH0 − FH0 =covQ(H, ζS)

D0S. (12.10)

Thus the futures correction is the forward value to the date S of thecovariance of the payoff amount H with ζS . If they are independent there isno correction. For the Eurodollar future H and ζS will be positively correlatedand so the futures price will be lower than the corresponding forward price100(1− L0[T, S]).The futures correction (12.10) can be calculated analytically in special cases.

Of particular interest is when H = LT [T, S] and H = (LT [T, S] −K)+, forsome K > 0, under the assumption of a Gaussian short-rate model such asthe Vasicek—Hull—White model described in Chapter 17. These calculationsare left as exercises for the reader.

Page 282: Financial derivatives in theory and practice
Page 283: Financial derivatives in theory and practice

Orientation: Pricing ExoticEuropean Derivatives

The next four chapters are concerned with the pricing of exotic Europeanderivatives. By an exotic option we mean one which is not vanilla, i.e. notincluded in the set of liquid products discussed in Chapters 10—12. The lackof liquidity means that a bank trading an exotic derivative would have toquote a price based on its own pricing models and not by direct reference tothe prices quoted by other market participants for the same product.What distinguishes a European derivative from other path-dependent and

American-type products is the fact that it makes payments which can bedetermined by observing the market on a single date T . This fact means thatto model an exotic European derivative it is, in principle, only necessary toformulate a model for the economy (of pure discount bonds) on the single dateT . This is clearly a simplification of the general situation but there is a priceto pay. Most products traded are European, and most new exotic Europeanderivatives are generalizations of those already in existence and commonlytraded. It is essential when formulating a model to price a new product toensure that the price which results is reasonable when compared to the pricesof other similar products, and it must, for example, be worth more than anyother product which is guaranteed to pay out less in all circumstances. Thus,although the problem is theoretically easier, more is demanded of the model.In particular, it is essential that the pricing model is calibrated to other similarbut vanilla instruments. Indeed, we shall see in Chapter 14 that there are evenproducts whose prices are completely determined by the prices of other vanillaproducts and, given these vanilla prices, their prices are independent of anymodel chosen.With the above comments in mind, the process of developing a pricing

model for an exotic European derivative can be split into two steps. The firststep is to apply as much high-level insight as possible about the market andits relationship to this particular derivative to decide on general essential anddesirable features of the pricing model. In particular, one should ask whatfactors in the market are going to have the most significant effect on theprice and to which instruments it is most closely related. This determinesthe complexity of the model required (one factor or more?) and which vanillainstruments the model should be calibrated to. We shall see several examplesof this approach in the coming chapters. The alternative approach, of tryingto use one general model for all exotic derivatives, is dangerous and willnot always work. For example, models we introduce later when discussingAmerican and path-dependent products, such as the market models and

Page 284: Financial derivatives in theory and practice

260 Orientation

Markov-functional models, are well suited to those more complex productsbut often give a poor reflection of those features of the market which arerelevant to pricing Europeans.The second step in developing a pricing model, having identified the

necessary properties, is to choose the specific model to be used, determinethe inputs to this model, calibrate the model and finally apply the model toprice the exotic derivative in question.As an example of the above procedure, we will here briefly discuss the

constant maturity swap (CMS). The CMS is described fully in Chapter 14and a suitable pricing model is developed there. Like a vanilla swap, the CMScomprises a number of payments which can be priced independently so weshall discuss only one. Let y[T, S] be a given forward swap rate which sets attime T . Then a CMS payment based on the index y[T, S] is for the amountyT [T, S] and is made at some fixed time M ≥ T . This is clearly a Europeanproduct, its value at T being determined by DTM and yT [T, S],

VT = DTMyT [T, S] .

Thus, in particular, it is only necessary to model the economy at this terminaltime T and, by the usual valuation formula (presented in Corollary 7.34), theproduct’s value at time zero is given by

V0 = N0EN yT [T, S]DTMNT

(∗)

for some numeraire pair (N,N).We need now to select a numeraire N and then formulate a model to

describe the law of yT [T, S]DTM

NTin the martingale measure N corresponding

to this numeraire. The payout amount is determined by the swap rate yT [T, S]so the distribution of this rate will clearly have a major impact on the priceof the CMS. Furthermore, there is a set of vanilla instruments whose payoutalso depends on this rate, namely the set of swaptions on the associated swap.These are obvious and natural calibrating instruments. It is then a naturalchoice to take N = P [T, S] as numeraire, the PVBP for the underlying swap,and the valuation formula (*) becomes

V0 = P0ES yT [T, S]DTMPT

. (∗∗)

It is in fact the case, and this follows from applying Remark 11.3 in the contextof swaptions, that the prices for all strikes of the calibrating vanilla swaptionsdetermine the law of yT [T, S] in the measure S. But the valuation formula(**) also includes the term DTM/PT so this must also be modelled. Thus weneed to develop a realistic model for the discount curve DTS : S ≥ T which

Page 285: Financial derivatives in theory and practice

Pricing exotic European derivatives 261

correctly values swaptions on the swap corresponding to the rate y[T, S]. Wealso need to ensure that the model is arbitrage-free, otherwise it is of littleuse for derivative pricing. Just such a model is developed in Chapters 13 and14.The next four chapters present ways to develop models for pricing European

exotics. Chapter 13 provides a general framework, the class of terminal swap-rate models, and Chapter 14 applies these ideas to convexity-related productssuch as the CMS. The ideas of Chapter 13 are further extended in Chapter 15,and Chapter 16 shows how the ideas carry over to the multi-currency setting.

Page 286: Financial derivatives in theory and practice
Page 287: Financial derivatives in theory and practice

13

Terminal Swap-Rate Models

13.1 INTRODUCTION

The purpose of this chapter is to present a direct approach to modelling andpricing European interest rate derivatives. To do this we focus attention on thederivative product to be priced and develop models which explicitly captureprecisely those properties of the market which are relevant to the product athand. In particular, we insist that a model is calibrated to any existing marketproducts which are related to the exotic derivative in question. The advantageof this approach over other techniques (for example, using a standard short-rate model such as one of those discussed later in Chapter 17) is that it isguaranteed to price the new product accurately relative to existing products;the model will, by construction, have realistic properties (for example, positiveinterest rates) and will be built upon a theoretically sound base. Furthermore,and at least as important, the characteristics of the model will usually behighly transparent and thus we will be able to understand the model’sstrengths and weaknesses as a tool for pricing any specific product.

13.2 TERMINAL TIME MODELLING

13.2.1 Model requirements

Suppose we wish to value a European product which we know at some futuredate T has value VT, where VT is some arbitrary function of the discount factorson date T. Typically (always in practice) the payoff function VT will dependon only a finite number of discount factors, VT = VT (DTM1 , DTM2 , . . . , DTMk

).In particular, this payoff amount is independent of any modelling assumptionswe might choose to make about the evolution of market rates.

The valuation of a European derivative with a payoff such as this wasdiscussed in detail in Chapter 7. Given suitable assumptions about the modelof the economy (that it be complete and arbitrage-free), the time-zero value

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 288: Financial derivatives in theory and practice

264 Terminal swap-rate models

of the derivative is given by

V0 = N0EN[VTN−1T ] , (13.1)

where (N,N) is some numeraire pair for the economy. To evaluate (13.1) weneed a model under the measure N for the pure discount bond prices DTMi ,i = 1, 2, . . . , k, and for the numeraire price NT . We do not need to know anymore about the distributions of these assets at earlier times since these otherdistributions are not relevant to the price of the exotic European product. Wesuppose that there is some (low-dimensional) random variable yT such thatthe numeraire NT and, for each S ≥ T , the pure discount bond price DTS

can be written as a function of yT :

DTS = DTS(yT ),

NT = NT (yT ) .

If this is the case, then the derivative payoff can now also be written as VT (yT )and V0 can be evaluated from the distribution of yT and the functional formsVT (yT ) and NT (yT ).A model such as this, in which the economy can be modelled as a function

of a low-dimensional random variable, is a special case of a Markov-functionalmodel. These models are discussed more fully in Chapter 19.Thus far the discussion has been very general. We have as yet discussed

neither how to choose an appropriate random variable yT , nor the choiceof an appropriate functional form for DTS(yT ). These choices complete thespecification of the model and will depend on the ‘exotic’ derivative that we arevaluing. The first decision to be made is the choice of yT , and it is importantto ensure that this choice (including the dimension of yT ) is appropriate forthe exotic product in question. Usually there will be some standard marketproducts which are closely related to the exotic derivative, and it is essentialthat the model is ‘calibrated’ to these products. To achieve this we oftenchoose yT to be a natural characterizing variable associated with the vanillaproducts, usually some par swap rate. Given the choice of yT there will usuallybe a natural choice of numeraireN such that under the corresponding measureN the distribution of yT is easily inferred from the market prices of the vanillaproducts.All that then remains to evaluate (13.1) is to specify the functional form

DTS(yT ) and hence VT (yT ). This choice is important and different choicesgive different valuations. There are three key properties which are desirablefor the resulting model:

(P1) (martingale property) D·SN−1 must be a martingale under the measureN, otherwise the model will admit arbitrage;

(P2) (consistency property) any relevant relationships which hold in themarket must also hold for the model;

Page 289: Financial derivatives in theory and practice

Terminal time modelling 265

(P3) (realism property) the model must have realistic properties.

(P1) is essential if the model is to be suitable for pricing derivative productssince it ensures that the model does not admit arbitrage and is required forequation (13.1) to be valid. In many applications, such as calculating convexitycorrections (discussed in the next chapter), this property is often not identifiedor acknowledged in practice. From a practical point of view it ensures, forexample, that if one forward rate is today significantly higher than anotherthen this will also be the case at maturity T . (P2) is required to ensure thatany model we propose is internally consistent. An example of (P2), which weshall see shortly, arises when a model is parameterized in terms of a marketinterest rate, such as a swap rate; we have specified the discount factors interms of the swap rate, which in turn is a function of discount factors — thetwo must agree. The need for (P3) is obvious.If we can find a functional form obeying (P1)—(P3) then we have constructed

a model which is well calibrated, realistic and arbitrage-free, and which canbe used to value European derivatives. In practice the valuation will often becarried out by performing a numerical integration.

13.2.2 Terminal swap-rate models

In most situations the vanilla products most closely related to the derivative tobe priced will be swaptions (or caplets, which are just single-period swaptions).This is, for example, the case for constant maturity swaps (discussed more fullyin Chapter 14) and for zero coupon swaptions (discussed in Section 13.5). Wewill assume that this holds throughout this section and show how the ideasabove can be applied.Let V nt (K) be the time-t value of the calibrating (payers) swaption with

strike K, where n denotes the total number of payments in the underlyingswap. Let T = S0 be the swap start date, S = (S1, S2, . . . , Sn) be the paymentdates in the swap, and define αj = Sj − Sj−1, j = 1, 2, . . . , n, to be theassociated accrual factors. As in Chapter 10, the forward par swap rate forthis swap, yt[T, S], is given by

yt[T, S] =DtT −DtSnPt[T, S]

,

where

Pt[T, S] =

n

j=1

αjDtSj .

We will take the forward par swap rate yT [T, S] as the parameterizing randomvariable yT and construct a model which is calibrated to the underlying vanillaswaption prices for all strikes.We work in the measure S corresponding to the numeraire process P . The

first thing we do is derive the law of the random variable yT in this measure.

Page 290: Financial derivatives in theory and practice

266 Terminal swap-rate models

Recall from Chapter 11 that the value of a payers swaption with strike K isgiven by

V n0 (K) = P0ES[(yT −K)+] . (13.2)

Differentiating both sides of (13.2) with respect to the strike parameter Kyields

∂V n0 (K)

∂K= P0ES[−1yT>K] = −P0S(yT > K) ,

and thus the swaption prices have allowed us to recover the implieddistribution for yT under the measure S. In the case when all the swaptionprices are given by Black’s formula this corresponds to yT having a log-normaldistribution, but the approach is more general and also incorporates smileeffects (i.e. non-log-normal distributions).All that it now remains to do is to specify the functional form DTS(yT ) and

hence VT (yT ). Recall the three key properties identified earlier, which in thiscase become the following:

(P1) (martingale property) D·SP−1 must be a martingale under the measureS;

(P2) (consistency property) yT = P−1T DTT −DTSn ;

(P3) (realism — monotonicity and limit properties) for all S > T , DTS(0) =1, DTS(∞) = 0 and DTS(yT ) should be decreasing in both yT and in S.

If we can find a functional form obeying (P1)—(P3) then we have constructeda model which is well calibrated, realistic and arbitrage-free. There are manyways to do this, and we shall discuss three in the next section.

13.3 EXAMPLE TERMINAL SWAP-RATE MODELS

We shall now study three specific models in more detail: the exponential, thegeometric and the linear swap-rate models. In what follows S1, . . . , Sn denotethe cashflow times for the calibrating swap.

13.3.1 The exponential swap-rate model

In practice, discount curves always exhibit an approximately exponentialshape and so this is a natural functional form to consider. We will modeldiscount factors as

DTS = exp(−CSzT )for some random variable zT and some CS , S ≥ T . The random variable zTreflects the overall level of rates at T , whereas CS gives the dependence onthe maturity S. Note that, were the discount curve exactly exponential at Tthen we would have CS = S−T . This relationship will approximately hold in

Page 291: Financial derivatives in theory and practice

Example models 267

our model, but the exact values CS will be chosen to satisfy the martingaleproperty (P1).It remains to specify the distribution of zT under S. To do this note that,

given the function CS , zT is defined implicitly from yT via

yT =DTT (zT )−DTSn(zT )

PT (zT ).

This relationship yields both the distribution of zT (from that of yT ) andalso the (implicit) functional form DTS(yT ). Furthermore, the consistencyproperty (P2) is automatically satisfied.A simple algorithm can be used to solve for the values CS needed for any

given derivative. Suppose for now that CS1 , . . . , CSn are known. Then thefunctional form for PT (zT ) is also known and, for the model to be consistentwith the initial discount curve and the martingale property of P -rebased assetprices, any other CS is the unique solution to the equation

D0SP0

= ESexp(−CSzT )PT (zT )

, (13.3)

which can be found efficiently using any standard algorithm such as Brent’salgorithm (see, for example, Press et al. (1988)). This has assumed knowledgeof the CSi which we do not have. Indeed, we do not yet even know whetheror not CSi exist which will generate an arbitrage-free model. That they do weestablish in Section 13.4, where we also provide an algorithm to find the CSithat is guaranteed to converge.

13.3.2 The geometric swap-rate model

It is common in practice to model discount factors as geometrically decaying,based on some market rate. The corresponding model in the terminal swap-rate framework is

DTSi =

i

j=1

(1 + αjzT )−1, i = 1, 2, . . . , n,

DTS = (1 + αSzT )−1DTSi , Si ≤ S < Si+1,

yT =DTT (zT )−DTSn(zT )

PT (zT ),

for some αi, αS . This model clearly satisfies (P2) and (P3), and (P1) can bemade to hold by an appropriate choice for αi, αS , the fitting of which is donein a manner similar to that for the exponential model, solving first for the α iand then any other αS that are needed.

Page 292: Financial derivatives in theory and practice

268 Terminal swap-rate models

Note that under this model each forward LIBOR at T which correspondsto a payment period of the vanilla swap, i.e. LT [Si−1, Si] for i = 1, . . . , n,is linear in zT . The model could easily be adapted to make this hold for allforward LIBOR with the same accrual period ∆ := Si − Si−1. That is, themodel can be adapted so that, for all T ≤ T = S − ∆, LT [T , S] is linear inzT . Note also that yT and zT are closely related — if the initial curve is suchthat αi = αi, then yT = zT .

13.3.3 The linear swap-rate model

Our third model is analytically the most tractable of the three and this canlead to insight for specific products (see Chapter 14 for a discussion of thispoint in the case of convexity-related products). Suppose, as in the examplesabove, that we have found a suitable (smooth) function DTS(yT ). Letting

DTS = DTSP−1T and expanding DTS(yT ) about y0, we have

DTS(yT ) =

m

i=0

(yT − y0)ii!

D(i)TS(y0) +

(yT − y0)m+1(m+ 1)!

D(m+1)TS (θT ) (13.4)

for some θT ∈ [y0, yT ], where D(i)TS(yT ) is the ith derivative in yT of the

function DTS(yT ). We could truncate this at m terms and take this to be ourmodel

DTS(yT ) =

m

i=0

aiyiT .

The parameters ai must now be chosen/adjusted from (13.4) to give thenecessary martingale property for the numeraire-rebased discount factors.Without this adjustment the model will no longer correctly value zero couponbonds, and the ability to do this is a highly desirable feature of any interestrate model.For many applications a first-order model (m = 1) is adequate, and this

is the one we develop further. (Most treatments of convexity are based on afirst-order Taylor expansion but without parameter adjustments to get thenecessary martingales and thus bond prices.) This yields a model of the form

DTS(yT ) = A+BSyT , (13.5)

for suitable A,BS . We call this the linear swap-rate model. Note that sincey is a martingale in the swaption measure S it follows from (13.5) and the

martingale property of D·S that

DtS(yt) = A+BSyt ,

Page 293: Financial derivatives in theory and practice

Arbitrage-free property 269

for all t ≤ T . Furthermore, again in the swaption measure S, the discountfactors DTS(yT ) have a rational log-normal distribution, but they may becomenegative. We discuss this latter rather undesirable feature further in the nextsection.Parameter fitting for this model is straightforward. Substituting (13.5) into

the definition for PT yields

n

i=1

αiDTSi =

n

i=1

αi A+

n

i=1

αiBSi yT = 1 ,

and thus, since this must hold for all values of the random variable yT ,

A =n

i=1

αi

−1.

Now we apply the martingale property to find BS :

D0S = ES[DTS ] = A+BSy0

and so

BS =D0S −Ay0

.

13.4 ARBITRAGE-FREE PROPERTY OF TERMINALSWAP-RATE MODELS

The three terminal swap-rate models developed in Section 13.3 have beenformulated to be arbitrage-free by ensuring that all PVBP-rebased assets aremartingales in the swaption measure S. We saw in Theorem 7.32 that thisguarantees that the model is arbitrage-free. To complete the argument, how-ever, there are two final points to check:

(i) In the case of the exponential and geometric models we have not yetestablished that there exist parameter values CSi and αi, respectively,such that the martingale property holds for the (PVBP-rebased) bondsin the underlying calibrating swap, DTS0 , . . . , DTSn .

(ii) We have not established the existence of a full term structure economyDtS : 0 ≤ t < S < ∞ which is arbitrage-free and consistent with themodel. That is, we have not extended the economy to exist over the timeinterval [0,∞).

This second step we do not need to carry out explicitly for applications, butwe should check that it can be done to ensure that the model is theoreticallyvalid.

Page 294: Financial derivatives in theory and practice

270 Terminal swap-rate models

13.4.1 Existence of calibrating parameters

To address the first of these questions, we present an algorithm which can beused to find the parameters CSi , i = 1, . . . , n, in the exponential swap-ratemodel of Section 13.3.1. A similar algorithm can also be employed for thegeometric model to find the αi. The algorithm is iterative and guaranteed toconverge. At each iteration the algorithm gets strictly closer to the solution.Note that the argument below also guarantees the existence of a solution toequation (13.3), i.e. the model is consistent with the initial term structure ofinterest rates and is arbitrage-free.Let Cki denote the value of CSi at the kth iteration of the algorithm. For

each k, take Ck0 ≡ 0 and, without loss of generality, Ckn ≡ 1. Define

xi =D0SiP0and

Eki = ESexp(−Cki zkT )PT (zkT )

,

where

PT (zkT ) =

n

j=1

αjexp(−Ckj zkT )

and zkT solves

yT =1− exp(−zkT )

nj=1 αjexp(−Ckj zkT )

. (13.6)

Note that for Ckj ≥ 0, j = 1, . . . , n, (13.6) has a solution zkT ≥ 0 for allyT ≥ 0. The xi are the target values and our aim is to construct a sequenceCk = (Ck0 , . . . , C

kn) such that E

ki → xi, i = 0, 1, . . . , n. The limit value of the

Ck sequence is the solution to (13.3) that we require (for each i).The algorithm which we present below has the following properties:

(i) 0 ≤ Ck+1i ≤ Cki , i = 2, 3, . . . , n− 1, for all k;(ii) Eki ≤ xi, i = 2, 3, . . . , n− 1, for all k, with equality only if C k

i = 0;(iii) Ek0 −Ekn = x0 − xn = y0, for all k;(iv)

nj=1 αjE

kj =

nj=1 αjxj = 1, for all k.

It follows from (i) that the algorithm converges to yield some limit values Cand E. In the limit Ei ≤ xi by (ii), but this will be an equality with Ci > 0 forall i > 0 (which is what we require), as we now show. For suppose E i < xi forsome i. Then property (ii) implies Ci = 0, in which case Ei = E0. Further, by(iv), if (ii) holds and Ei < xi then En > xn. Substituting these relationshipsinto (iii) yields

x0 = E0 −En + xn < E0 = Ei < xi ,which contradicts the property that discount factors are non-increasing inmaturity.

Page 295: Financial derivatives in theory and practice

Arbitrage-free property 271

Algorithm

Step 0: Define C0 = (0, A, . . . , A, 1) for someA sufficiently large that property(ii) holds. That such A exists is clear (consider A → ∞). At this first step,and all subsequent steps, property (iii) follows from (13.6) and the martingaleproperty of y under S. Property (iv) is immediate.

General step: Given Ck, define zkT via (13.6) and let Ck+1i , i = 2, 3, . . . , n−1,

solve

xi = ESexp(−Ck+1i zkT )yT

1− exp(−zkT ). (13.7)

Note that if Ck+1i is replaced by Cki in (13.7) then the right-hand sideof (13.7) is just Eki , which is no greater than xi by property (ii). Hence

Ck+1i ≤ Cki , i = 2, 3, . . . , n − 1. To ensure that our argument as given iscomplete, we must at this point add the step that if Ck+1 < 0 then set it tozero. Finally, note that Ck+1 ≤ Ck implies from (13.6) that zk+1T ≥ zkT , foreach yT , and thus, from (13.7), Ek+1i ≤ xi.Our earlier argument shows that Ck+1i > 0, i = 2, 3, . . . , n−1, and so there

will never be an occasion on which we set any component of C k+1 equal tozero. Further, it is now clear that the inequalities in (i) and (ii) will alwaysbe strict.

13.4.2 Extension of model to [0,∞)As stated earlier, extending terminal swap-rate models to the time domain[0,∞) is not something we want to do in practice — the whole objectivebehind the approach is to model European derivatives for which the onlyrelevant times are 0 and T . However, we would like to know whether or notan extension is possible and thus whether the model is consistent with a fullarbitrage-free term structure model.We carry out such an extension for the exponential model. That this ap-

proach applies equally well to any model for which the discount factors attime T are positive is easy to see. The extension is completed in three steps:

(1) Define a swap-rate process yt : 0 ≤ t ≤ T consistent with thedistribution of the random variable yT .

(2) Define the economy over [0, T ], DtS : 0 ≤ t ≤ T, S.(3) Extend the definition to also define the economy over [T,∞), DtS : T ≤

t ≤ S.Step 1: Consider a probability space (Ω, Ft,F , S) supporting a Brownianmotion W . Let F be the probability distribution function for the randomvariable yT (under the swaption measure S),

F (x) = S(yT ≤ x) ,

Page 296: Financial derivatives in theory and practice

272 Terminal swap-rate models

and let Φ be the standard cumulative normal distribution. Note that F herecan be general and is not restricted to the log-normal distribution. Now definethe stochastic process yt : 0 ≤ t ≤ T by

yt := ES F−1Φ(WT /√T )|Ft . (13.8)

It is immediate from the tower property and equation (13.8) that y is an(Ft, S) martingale on [0, T ]. Furthermore,

S(yT ≤ x) = S F−1Φ(WT /√T ) ≤ x

= S WT /√T ≤ Φ−1F (x)

= F (x),

so yT has the required distribution.Step 2: We are given the functional form DTS = exp(−CSzT ) for S ≥ T .We will use this, combined with the martingale property of y and all PVBP-rebased discount factors, to construct the model over [0, T ]. However, thefunctional form of DTS for S ≥ T tells us nothing about discount factorsmaturing before T . To model these discount factors and thus to be able tocompletely specify the model on [0, T ] we first introduce a set of ‘pseudo-

discount factors’ DTS = exp(−CSzT ) for 0 ≤ S ≤ T . We have yet to fix theparameters CS . Clearly these are not real discount factors since all bonds withmaturities S ≤ T have expired by T , but, if one so wished, one could think ofeach DtS , t > S, as a new asset which is bought with the proceeds of the bondmaturing at S. We tie down the parameters CS from the required martingaleproperty of the PVBP-rebased pseudo-assets,

D0SP0

= ESexp(−CSzT )PT (zT )

,

and the complete model can now be recovered from the martingale propertyof the PVBP-rebased assets,

DtS =DtS/PtDtt/Pt

=ES[DTSP−1T |Ft]ES[DTtP

−1T |Ft]

.

Step 3: For the exponential and geometric models all discount factors at timeT are positive. Extend the economy beyond T in a deterministic manner asfollows: for S ≥ t ≥ T ,

DtS =DTt(zT )

DTS(zT ).

That the resulting model is arbitrage-free follows from the fact that it is anumeraire model, as defined in Chapter 8; here we are taking the numeraireto be the PVBP until T and the unit-rolling numeraire from then on. Thecorresponding martingale measure is the swaption measure S.

Page 297: Financial derivatives in theory and practice

Zero coupon swaptions 273

Remark 13.1: The derivative pricing theory presented in Chapter 7 required allnumeraire-rebased assets to be martingales with respect to the asset filtrationrather than with respect to the original filtration for the probability space.It is not difficult to see that for the models and examples presented in thischapter we can work with either filtration.

13.4.3 Arbitrage and the linear swap-rate model

One would like to apply the techniques of Section 13.4.2 to show that the linearswap-rate model can also be embedded in a full arbitrage-free term structuremodel. This cannot be done, the reason being the failure of step 3 above.In the linear swap-rate model discount factors at time T may, in general, benegative and it is not possible to have an arbitrage-free model with discountfactors which are negative at time T but guaranteed to be positive, indeedunity, at a known future time.This fact is not as inhibiting as it may at first appear. Recall the initial

motivation for the linear swap-rate model as a first-order approximation toa more realistic and arbitrage-free model such as the exponential model — amodel which can be extended. Also note that the martingale property up untiltime T ensures that there is no arbitrage up to this time, and for a Europeanoption it is only over the time interval [0, T ] that we need to consider theeconomy.

13.5 ZERO COUPON SWAPTIONS

Vanilla swaptions, as defined in Chapter 11, are widely traded in the interestrate markets. A related but less common product is the zero coupon swaption.In the case of a vanilla swaption the holder has the right to enter, at time T ,a swap in which the fixed rate is some predetermined value K. The fixed legthen consists of a series of payments made on dates S1, S2, . . . , Sn, paymentj being of amount αjK. A zero coupon swaption gives the holder the right toenter a zero coupon swap at time T . A zero coupon swap is also an agreementbetween two counterparties to exchange cashflows, but in this case there isonly one fixed payment which is made at the swap maturity time Sn. Theamount of this fixed payment is given by (1 +K)(Sn−S0) − 1 where K is the‘zero coupon rate’ of the swap. The floating leg is exactly the same as for avanilla swap.Here we present the results of pricing a zero coupon swaption using each of

the exponential, geometric and linear swap-rate models, defined above, undervarious market scenarios. Three trades were considered, each being optionsto enter into payers zero coupon swaps: a one-year option into a nine-yearswap; a five-year option into a five-year swap; and a seven-year option into athree-year swap. These trades were each valued in three different interest rateenvironments: a steeply upward-sloping forward curve; a flat forward curve;

Page 298: Financial derivatives in theory and practice

274 Terminal swap-rate models

and a downward-sloping forward curve. In each case the model was calibratedto vanilla swaptions with matching option expiry and swap length. Vanillaswaption volatilities of 5%, 12% and 20% were used, and the zero couponswaptions were valued for strikes ranging from 180 basis points (bp) out ofthe money to 180 bp in the money. Tables 13.1—13.3 summarize these results,showing the maximum price differences that were found between the differentmodels.

Table 13.1 Pricing differences for a one-year into nine-year trade

Shape of forward curve

Upward-sloping Flat Downward-sloping

Geometricvs 7.61 bp at 20% 2.97 bp at 20% 3.89 bp at 20%

Exponential

Geometricvs 130.0 bp at 20% 6.34 bp at 20% 6.21 bp at 20%

Linear

Table 13.2 Pricing differences for a five-year into five-year trade

Shape of forward curve

Upward-sloping Flat Downward-sloping

Geometricvs 1.03 bp at 20% 0.48 bp at 20% 0.29 bp at 20%

Exponential

Geometricvs 37.62 bp at 20% 2.90 bp at 20% 1.05 bp at 20%

Linear

Table 13.3 Pricing differences for a seven-year into three-year trade

Shape of forward curve

Upward-sloping Flat Downward-sloping

Geometricvs 1.60 bp at 20% 0.59 bp at 20% 0.24 bp at 20%

Exponential

Geometricvs 8.20 bp at 20% 1.34 bp at 20% 0.42 bp at 20%

Linear

The exponential and geometric models compared well in most situations.The greatest deviation of 7.61 bp was observed when pricing a one-year intonine-year trade deep in the money, at high volatilities and in a steeply upward-sloping curve. The price of the option in this case was 405.51 bp under theexponential model, so this represents a 1.9% relative difference in value. Ingeneral the differences were less than 1 bp.

Page 299: Financial derivatives in theory and practice

Zero coupon swaptions 275

The linear swap-rate model generally compares reasonably well with thegeometric and exponential models, except in high-volatility and steep-curveenvironments. It consistently overpriced the options we considered whencompared with the other two models, the divergence being most noticeablewhen pricing the one-year into nine-year trade, as one may expect, since in thisexample the zero coupon swaption is least similar to the calibrating vanillaswaption. The relative inaccuracy of the linear model is to be expected giventhat it is intended only as a first-order approximation to a more realisticmodel, such as the exponential or geometric.In Figure 13.1 we examine the one-year into nine-year swaption in more

detail. The figure shows the value of the zero coupon swaption for a rangeof strikes as calculated by each of the three models. The volatility of thecalibrating swaption was 20% and the forward curve was upward-sloping, thesituation in which the models differ most. To give the results some perspectivethe figure also shows the intrinsic value of the zero coupon swaption and, foreach strike levelK, the value of a vanilla swaption which has a strike K chosensuch that the underlying vanilla and zero coupon swaps have the same value.Note how consistently the geometric and exponential perform relative to eachother and the time value of the option which they are designed to capture.

1800.00

1600.00

1400.00

1200.00

1000.00

800.00

600.00

400.00

200.00

0.00

Val

ue (

bp)

Exponential

Geometric

Linear

Intrinsic Value

Vanilla Swaption

12.4

0%

12.6

0%

12.8

0%

13.0

0%

13.2

0%

13.4

0%

13.6

0%

13.8

0%

14.0

0%

14.2

0%

14.4

0%

14.6

0%

14.8

0%

15.0

0%

15.2

0%

15.4

0%

15.6

0%

15.8

0%

16.0

0%

16.2

0%

Strike

Figure 13.1 One-year into nine-year zero coupon swaption valuation

Page 300: Financial derivatives in theory and practice
Page 301: Financial derivatives in theory and practice

14

Convexity Corrections14.1 INTRODUCTION

In this chapter we study the problem of how to correctly price products,relative to each other, which have in common the amount of payment madeand differ only through the date of payment. As we shall see, by careful choiceof model and numeraire for pricing we obtain a decomposition of the valuationformulae which has a natural and informative interpretation.

Two products of considerable importance in the interest rate derivativesmarket which fall into this category are the constant maturity swap (CMS) andthe LIBOR-in-arrears basis swap. Each of these products shares the commonproperty that there is some liquid product to which it is very closely related,and consequently each must be priced correctly relative to its more liquidcounterpart.

The pricing of constant maturity and LIBOR-in-arrears swaps is an exampleof a more general pricing problem which is the subject of this chapter andwhich we now describe. We begin with a definition.

Definition 14.1 Two derivative products are said to be convexity-related ifthere exists some time T such that the value at T of product i, i = 1, 2, is ofthe form

V(i)T =

ni∑

j=1

DTS

(i)j

c(i)j F (DTS, S ≥ T ). (14.1)

Here c(i)j are a set of known constants, S

(i)j ≥ T are known times and F is

some function of the discount curve at time T.

Each of a pair of convexity-related products promises a set of cashflowsmade at the times S

(i)j . Modulo the constants c

(i)j , the actual payments made

for each product are for the same amount, hence the close relationship betweenthem. However, where they differ is through the dates of the payments.

The result of this mismatch of payment dates is convexity. The value of bothderivatives will change as the underlying index (curve) moves. However, they

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 302: Financial derivatives in theory and practice

278 Convexity corrections

will change by different amounts because of the differing payment dates. Theword ‘convexity’ is used in this context because if the value at the maturitytime T of one product is linear in some underlying market rate, the other isnot, and in many common applications the value in this latter case is a convexfunction of the underlying market rate.An important point to note about these products is the following. The

value at time T takes the form of a product of some function of the marketrates (the same in each case) with the value of a linear combination of purediscount bonds. In practice the value of the derivative is usually more sensitiveto market rates through the function F than through the second ‘discounting’term, and we assume this to be the case.Mathematically, the key observation which lies at the heart of this chapter is

that any linear combination of pure discount bonds can, assuming positivity,be taken as a numeraire in the valuation of options. This property, as wesee in the next section, leads to an appealing decomposition of either of theconvexity-related derivatives. Careful choice of modelling assumptions thenallows a detailed comparison to be made between the products.The layout of this chapter is as follows. In Section 14.2.1 we carry out the

decomposition described above, appealing to the theory of option pricing vianumeraires as described in Chapter 7. Applying these results in the context ofthe linear swap-rate model, we present some general and illuminating pricingformulae in Section 14.2.2. In Section 14.3 we apply these results to someexplicit examples. We study the CMS and the LIBOR-in-arrears basis swapand present closed-form results. We also consider options on these productsand are able to show why the ad hoc market standard approach works welland is, to first order, theoretically valid.

14.2 VALUATION OF ‘CONVEXITY-RELATED’PRODUCTS

14.2.1 Affine decomposition of convexity products

Recall that the value at time t of a derivative that pays at T an amount VTcan be expressed in the form

Vt = NtEN[VTN−1T |Ft] , (14.2)

where (N,N) is some numeraire pair and Ft is the filtration generated bythe underlying assets (in the application here these assets will be a finitecollection of zero coupon bonds).We can use (14.2) to express the value of convexity-related derivatives in a

number of ways. Consider two European products having time-T value of the

Page 303: Financial derivatives in theory and practice

Valuation of ‘convexity-related’ products 279

form (14.1). For i = 1, 2, let

P(i)t =

ni∑j=1

c(i)j D

tS(i)j

.

If we assume that P(i)t = 0, t ≤ T , a.s. we can take P (i) (or −P (i)) as numeraire

and apply (14.2) to obtain

V(i)t = P

(i)t EP(i) [F |Ft],

where P(i) is a martingale measure corresponding to the numeraire P (i).

This gives us formulae for each of the products independently, each as anexpectation of the same function F, but, as the expectations are taken withrespect to different measures, these formulae give little information about therelationship between the two products. To understand this relationship weextend the ideas above a little further.

Suppose we can find (and we always can) a set of numeraire pairs (N (j), N(j))such that, for each j = 1, . . . , d, N (j) is an a.s. non-zero linear combination ofpure discount bonds and such that

P(i)t =

d∑j=1

a(i)j N

(j)t , (14.3)

for all t and some constants a(i)j . Then, for i = 1, 2, we have the representation

V(i)T =

( d∑j=1

a(i)j N

(j)T

)F (14.4)

and thus

V(i)t =

d∑j=1

a(i)j N

(j)t EN(j) [F |Ft].

This can be rewritten as

V(i)t = P

(i)t

( d∑j=1

ω(i)j (t)EN(j) [F |Ft]

), (14.5)

where

ω(i)j (t) =

a(i)j N

(j)t

P(i)t

.

Page 304: Financial derivatives in theory and practice

280 Convexity corrections

We have thus expressed both V(1)t and V

(2)t as an affine combination of the

same set of expected values of the single function F . The weightings, w(i)j (t),

depend only on the time-t discount factors DtS , S ≥ T , and not on anyassumptions about the model of the economy — that only comes in via theexpectations. Thus the effect of the mismatch of payment dates is manifestedthrough these weighting terms.What is the interpretation of these results, and why have we derived the

valuation formulae in this way? This will become clear in the next section.

14.2.2 Convexity corrections using the linear swap-rate model

To understand better the relative pricing of convexity-related products we willemploy the linear swap-rate model. The key to understanding the convexitylies in Section 14.2.1. In what follows we shall derive an affine decompositionsimilar to (14.4) which leads to pricing formulae that provide insight intohow the values of convexity-related products are affected by the mismatch ofpayment dates.When studing convexity-related products there is usually some market swap

rate y closely associated with both products (as is the case in the examples ofthe next section). We assume this is indeed the case and take this rate as theunderlying swap rate for the linear swap-rate model that we use for pricing.We denote, as usual, by P the PVBP of the fixed leg of the swap associatedwith the parameterizing swap rate y, and by αj and Sj , j = 1, 2, . . . , n, thecorresponding accrual factors and payment dates.Recall from Section 14.2.1 that the affine decomposition of products was

done using suitable numeraire processes. In what follows we will perform adecomposition using the numeraires P and yP . That P is a numeraire is clearfrom the fact that it is the value of the fixed leg of a swap. Furthermore, sinceyt is the fixed rate at t which ensures both fixed and floating legs of the swaphave the same value, it follows that ytPt is the time-t value of the floating legof the same swap and is also suitable as a numeraire process.Associated with the two numeraires P and yP are martingale measures, S

and Y respectively, in which all numeraire-rebased assets are martingales. Wehave already met the swaption measure S in which P -rebased assets, and inparticular y, are martingales. As is common market practice, we shall assumethat, under S, y satisfies an SDE of the form

dyt = σtytdWSt , (14.6)

whereW S is a standard Brownian motion under S and σ is some deterministicfunction of time. Recall that this yields Black’s formula for swaption prices(Section 11.4). The measure Y, which we call the floating measure (associatedwith the swap rate y), is defined from S by

dYdS Ft

=yty0, t ≤ T .

Page 305: Financial derivatives in theory and practice

Valuation of ‘convexity-related’ products 281

It follows from Girsanov’s theorem (Theorem 5.24) that

dyt = σ2t ytdt+ σtytdWYt , (14.7)

where WY is a standard Brownian motion under Y. In particular, solving(14.6) and (14.7) yields

yT = y0expT

0

σudWSu − 1

2

T

0

σ2u du

= y0expT

0

σudWYu +

12

T

0

σ2u du .

(14.8)

Returning to our convexity-related derivatives, recall that their values at T

are given by V(i)T = P

(i)T F (DTS , S ≥ T ). Under the assumptions of the linear

swap-rate model we can decompose each of the terms P(i)T , i = 1, 2, as follows:

P(i)T =

ni

j=1

c(i)j DTS(i)

j

=

ni

j=1

c(i)j (A+BS(i)

j

yT )PT

=

ni

j=1

c(i)j A PT +

ni

j=1

c(i)j BS(i)j

(yTPT ) . (14.9)

This decomposition is of the form (14.3) and so we can apply (14.5), whichyields

V(i)t = P

(i)t w

(i)t ES[F |Ft] + (1− w(i)t )EY[F |Ft] (14.10)

where

w(i)t =

Pt/nj=1 αj

P(i)t /

nij=1 c

(i)j

. (14.11)

Note the particular form of w(i)t , which depends only on the discount curve

at time t and not on the model for the evolution of the pure discount bondsthrough time. Define the random time A(P, t) via

Pt = DtA(P,t)

n

j=1

αj .

We call A(P, t) the PVBP-average time corresponding to P , and DtA(P,t) thePVBP-average discount factor corresponding to P . With these definitions,the numerator in (14.11) is the PVBP-average discount factor for P ; it is

Page 306: Financial derivatives in theory and practice

282 Convexity corrections

the amount by which a single future paymentnj=1 αj would have to be

discounted if it is to have the same value as the actual payments in the PVBPP . Similarly, the denominator of (14.11) is the PVBP-average discount factorfor the term P (i).The decomposition at (14.9), with the coefficients multiplying PT and

(yTPT ) being independent of yT , was only possible because of the particularform of the linear swap-rate model. This decomposition in terms of the floatingand fixed legs is not the only one with this property, but it is the most useful.The reason for this is the fact that yT is log-normally distributed under bothS and Y, as given by (14.8). As a consequence of this, defining

Ft(yt) = ES[F |Ft] ,

we have

EY[F |Ft] = Ft y∗t ,

where

y∗t = ytexpT

t

σ2u du .

Note that the dependence of Ft(yt) only on t and yt follows from the facts thatthe economy at T is summarized by yT and that (yt, t) is a Markov process.The valuation formula (14.10) can now be written as

V(i)t = P

(i)t w

(i)t Ft(yt) + (1− w(i)t )Ft(y∗t )

= P(i)t Ft(yt) + (1− w(i)t ) Ft(y∗t )− Ft(yt) . (14.12)

The first term in (14.12) is what would result from naıvely valuing the productby taking the expected payoff in the swaption measure S and multiplyingthis by the PVBP term P

(i)t . The second term inside the brackets in (14.12)

is referred to as the convexity correction, the amount by which we mustchange the expectation term in the valuation formula in order to obtain thecorrect valuation. Practitioners have various ad hoc methods for obtainingthis correction term but this one is particularly illuminating, as we explain inthe next section.

14.3 EXAMPLES AND EXTENSIONS

The examples of greatest practical importance and interest are constantmaturity swaps (CMS) and options on constant maturity swaps (CMSoptions), and these are the first two examples we discuss. The third examplewe consider is the LIBOR-in-arrears basis swap which is a special case of the

Page 307: Financial derivatives in theory and practice

Examples and extensions 283

CMS. However, the LIBOR-in-arrears swap is simpler to analyse than theCMS and as a result much more can be said. In each of the three exampleswe discuss, one of the two convexity-related products is a standard vanillaproduct, and this is largely why it is so important to quantify this relationshipcorrectly.

14.3.1 Constant maturity swaps

Recall that under the terms of a vanilla interest rate swap fixed and floatingpayments are made on a series of (approximately) regularly spaced datesS1, S2, . . . , Sn. The amount of each floating payment is the (accrual factormultiplied by the) LIBOR that sets at the beginning of the period, whereasfor the fixed leg the payment is the fixed rate K (multiplied by the accrualfactor). A CMS is the same, with the exception that the payments in thefloating leg are not set based on LIBOR, rather they are based on some othermarket swap rate. So, for example, the CMS may have a five-year maturityand make payments every six months, a total of ten payments per leg. Thepayments on the floating leg may be the two-year par swap rate that sets(usually but not always) at the beginning of each accrual period.Valuation of the fixed leg is straightforward and need not concern us. To

value the CMS floating leg we consider each cashflow in turn. A generalfloating payment in a CMS can be defined as follows. On some date T amarket par swap rate is observed, yT , and a payment of the amount yT ismade at time M ≥ T .This single-payment derivative is closely related to the vanilla swap

corresponding to the rate yT . The floating leg of this vanilla swap has value atT given by PT yT , which can be compared with the value of the CMS paymentwhich is DTMyT . We can apply the results of Section 14.2 to value this CMSpayment relative to the floating leg of the vanilla swap.The value of the CMS payment at time zero is given, from (14.11) and

(14.12), byV0 = D0M(wy0 + (1− w)y∗0)

where

w =

nj=1 αjD0Sj/

nj=1 αj

D0M, (14.13)

y∗0 = y0eσ2T ,

σ being the Black—Scholes volatility of a swaption (on the swap rate yT )maturing at T . It follows that the ‘convexity-adjusted forward rate’, the knownamount which, paid at timeM , has the same value today as the CMS payment,is given by

y0 = wy0 + (1− w)y∗0= y0 + (1− w)y0(eσ

2T − 1) . (14.14)

Page 308: Financial derivatives in theory and practice

284 Convexity corrections

From (14.13) and (14.14) we see that if the single CMS payment is madebefore the PVBP-average time for P , the correction term in (14.14) is positive,reflecting the fact that an increase in yT will increase the value of the CMSpayment by proportionately more than the amount by which the value ofthe vanilla swap floating leg will increase. On the other hand, if the CMSpayment is made after the PVBP-average of the swap, the convexity correctionis negative.

14.3.2 Options on constant maturity swaps

Having introduced the CMS, it is natural to consider options thereon, alsocommon products in the market. A CMS cap is to a cap what the CMS isto a swap. It comprises a series of payments, the payment made at Sj beingαj(ySj−1 −K)+ where αj is the accrual factor, K is the strike and ySj−1 issome market swap-rate setting at Sj−1. The CMS floor is identical, but withthe payment being αj(K − ySj−1)+. A general payment is therefore of theform (φ(yT −K))+ and is made at time M , where φ ∈ −1,+1 and M ≥ T .Again following Section 14.2, the value of this payment at time zero is

V0 = D0M wBS(y0,σ,K, T ) + (1− w)BS(y∗0 ,σ,K, T )

where w and y∗0 are as above and BS is the market standard forward Black—Scholes option pricing formula,

BS(y,σ,K, T ) = φyN(φd1)− φKN(φd2),

d1 =log(y/K)

σ√T

+ 12σ√T ,

d2 =log(y/K)

σ√T− 1

2σ√T ,

φ being +1 for a call and −1 for a put. The same comments made for theCMS also apply for the convexity corrections for CMS options.It is common market practice when valuing CMS caps and floors first to

calculate the convexity-adjusted forward swap rate, y0 in (14.14), and thenclaim that the value of the CMS option is given by

V0 = D0MBS(y0,σ,K, T ) .

We can now see that this is justified, at least to first order, which is all anyconvexity correction of this nature achieves. For, given any payoff F (yT ) attime M , we have

V F0 = D0M wF0(y0) + (1− w)F0(y∗0) (14.15)

Page 309: Financial derivatives in theory and practice

Examples and extensions 285

where, recall from (14.12),

F0(y0) = ES[F (yT )] .

Noting that F0 is C∞ we can apply a Taylor expansion to F0 in (14.15) to

obtain

V F0 = D0M F0(y0) + (1− w) F0(y∗0)− F0(y0)

= D0M F0(y0) + (1− w)(y∗0 − y0)F0(y0) +O(y∗0 − y0)2

= D0MF0(y0) +O(y∗0 − y0)2 .

As is often the case, ad hoc market practice turns out to have a theoreticaljustification.

14.3.3 LIBOR-in-arrears swaps

Unlike a vanilla interest rate swap, the LIBOR-in-arrears basis swap hastwo floating legs but no fixed leg. For notational convenience, define Ljt :=Lt[Sj−1, Sj]. The first floating leg is exactly the same as for a vanilla swap— that is, it comprises a sequence of payments made at times S1, S2, . . . , Sn,the amount of the jth payment being given by αjL

jSj−1 . The second floating

leg makes payments on the same dates but the LIBOR is set in arrears, theamount of payment j being αjL

j+1Sj

(with the obvious definition of Sn+1 and

Ln+1).To calculate the value of the LIBOR-in-arrears leg we consider each cashflow

in turn. It is easy to see that this is just a special case of the more general CMSabove, taking, in the case of the jth payment, M = T = Sj and yT = L

j+1T .

However, to analyse this product we do not need even to formulate a modelfor the evolution of rates, as long as we know the prices of an appropriate setof market caplets. The technique we use here was first presented by Breedenand Litzenberger (1978).The value at time Sj of the jth LIBOR-in-arrears payment is just the

payment amount αjLj+1Sj. Observe that we could take this payment and

immediately reinvest it until time Sj+1, at which time it will be worth

V jSj+1(Lj+1Sj) = αjL

j+1Sj

1 + αj+1Lj+1Sj

= αjLj+1Sj

+ αjαj+1 Lj+1Sj

2

. (14.16)

This formula is smooth in the variable Lj+1Sj, and for any smooth function

V (x), x ≥ 0, we have the linear representation

V (x) = V (0) + xV (0) +∞

0

V (K)(x−K)+ dK. (14.17)

Page 310: Financial derivatives in theory and practice

286 Convexity corrections

Equation (14.17) represents the payoff V(x) as a linear combination of capletpayoffs. But the value of a linear combination of derivatives is just the sumof their individual values. Substituting (14.16) into (14.17), we conclude thatV j

0 , the value at time zero of the LIBOR-in-arrears payment, is given by

V j0 = αjD0Sj+1L

j+10 +

∫ ∞

02αjαj+1C

j(K) dK,

where Cj(K) is the value at time zero of a caplet with strike K, setting at Sj

and paying (Lj+1Sj

− K)+ at Sj+1. Note that at no point have we assumed amodel in this analysis.

Page 311: Financial derivatives in theory and practice

15

Implied Interest Rate PricingModels15.1 INTRODUCTION

In Chapter 13 we introduced the general class of terminal swap-rate modelsand studied three examples: the linear, geometric and exponential swap-ratemodels. Each of these examples was constructed by postulating a functionalform for the discount factors, at the terminal time T, and then using thedistribution of one given market swap rate and the martingale property ofnumeraire-rebased asset prices to determine the remaining free parametersof the model. In this chapter we will consider a class of terminal swap-ratemodels constructed in a different way.

One of the major advantages of using the terminal swap-rate approach todevelop a model for pricing European derivatives is the explicit control onecan exercise over the functional form of the discount factors in terms of anunderlying stochastic process. A second is the ability to calibrate this type ofmodel accurately to the market-implied distribution of a given swap rate orLIBOR. A practical difficulty with the approach is that for the more realisticexamples, such as the exponential swap-rate model, the calibration algorithmspecified in Section 13.4.1 can be slow to converge when the calibrating swaphas many cashflows (for example, quarterly for ten years is a problem).The algorithm given there can be adapted to give better convergence butthe performance is still insufficient for a live trading environment. That isnot to say that an efficient algorithm cannot be developed, and indeed thismay be a simple matter, but at this point of time we are not aware ofone. In the absence of a more efficient calibrating algorithm it is useful todevelop an alternative terminal swap-rate model with similar properties to theexponential or geometric model but which calibrates much more efficiently. Aspecial case of the implied pricing models presented in this chapter achievesprecisely that.

The defining property of an implied interest rate pricing model is that it

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 312: Financial derivatives in theory and practice

288 Implied interest rate pricing models

is one which is constructed to correctly price a number of standard marketproducts. The starting point is the assumption that the complete discountcurve at the terminal time T is some (to be determined) function of a randomvariable r. The prices of the calibrating products which must be correctlyvalued are then used to imply what this functional form must be. To developand use such a model in practice it is important to first identify whichinstruments will be used for the calibration (always a collection of caps and/orswaptions) and then to have a method for implying the functional form fromthese prices.The rest of this chapter is as follows. Throughout we will consider the case

when the random variable r is univariate. First, in Section 15.2, we show howthe functional form of the discount curve at time T , DTS : S ≥ T, canbe implied from the prices of the standard market swaptions which expire attime T . To do this we use the prices of swaptions of all strikes and, for theunderlying swaps, of all (regularly spaced) maturities. The resulting modelis, in fact, a terminal swap-rate model. In Section 15.3 we briefly considerthe numerical implementation of an implied model, then in Section 15.4 weconsider in detail the problem of pricing an irregular swaption. The featureof interest about this product is that it is not in general clear which standardmarket swaption is most closely related to this exotic product. The chapterconcludes in Section 15.5 with a discussion of the relationship between theseimplied models and the geometric and exponential models we met in Chapter13. It turns out that if the volatility of all the calibrating swaptions takesthe same value then the implied model is almost indistinguishable from theequivalent exponential or geometric model.

15.2 IMPLYING THE FUNCTIONAL FORM DTS

Consider a payers swap with fixed rate K which starts at time T and makespayments at times S1, S2, . . . , Sn, and let y

nt denote the forward par swap rate

for this swap at time t. We know that the value of this swap can be writtenin the form

Unt = Pnt (y

nt −K)

and that the value at the maturity date T of a swaption written on this swapis then

V nT (K) = PnT (y

nT −K)+ .

We will suppose throughout this section that we are today, that is at timezero, given the prices of these swaptions for all K ≥ 0, i.e. V n

0 (K) : K ≥ 0.For each coupon date of this swap, Si, i = 1, 2, . . . , n − 1, one can alsoconsider the swaps maturing at time Si which make payments at the timesS1, S2, . . . , Si. We will additionally suppose that we know the prices ofswaptions for all strikes written on all these underlying swaps, V i

0 (K) : K ≥

Page 313: Financial derivatives in theory and practice

Implying the functional form DTS 289

0, i = 1, . . . , n− 1. What we shall do is develop a model which is consistentwith all these prices simultaneously.To develop this model we need to make the following assumptions:

(i) There exists some univariate random variable r which summarizes thestate of the economy at time T and thus the discount curve at T can bewritten in the form DTS(r) : S ≥ T.

(ii) All the swap rates yiT (r) (the exact functional form here being derivedfrom the functional form of the underlying discount factors), are monotone(either all increasing or all decreasing) in the random variable r.

The first assumption means that the model is a one-factor model. Thesecond assumption is a reasonable one to make since it means that the singlefactor being used is a measure of the overall level of interest rates. It is unlikelythat one would want to develop a single-factor model for the discount curvein which the stochastic variable does not represent the level of rates.We can now derive the functional form DTSi(r) : i = 1, 2, . . . , n from

these two assumptions and the swaption prices V i0 (K) : K ≥ 0, i = 1, . . . , n

(we shall provide the functional form for other S later in this section). Toachieve this we shall, in Theorem 15.1, derive the functional forms y iT (r) :i = 1, 2, . . . , n. That these are equivalent follows from the relationships

DTS0 = DTT = 1

and

yiT =DTT −DTSi

P iT, i = 1, . . . , n .

To obtain the functional forms yiT (r) we make the following observation.Since we have assumed that there exists a single factor r which governs alldiscount bond prices, and since each yiT is monotone in r, then given K1 thereexist K2, . . . ,Kn such that

r : y1T (r) < K1 = r : yiT (r) < Ki. (15.1)

If we can, in some measure, find the distribution of a single y iT and if we canalso find, for each K1, the values Ki in (15.1) then we have in effect derivedthe functional form that we desire. More precisely, we will have derived thefunctional forms yiT (y

1T ), giving us the result we require in the case r = y

1T .

The result for any other random variable r can now be derived immediatelyfrom this.The following theorem provides the desired link between each of the Ki.

Theorem 15.1 Given any λ ∈ (0, D0T ), there is a unique solution to theequations (taking V 00 = 0)

∂V i−10 (Ki−1)∂Ki−1

= (1 + αiKi)∂V i0 (Ki)

∂Ki− αi(V i0 (Ki) + λ) . (15.2)

Page 314: Financial derivatives in theory and practice

290 Implied interest rate pricing models

Furthermore the solution Ki(λ) satisfies

λ = D0TF(yiT < Ki(λ)) (15.3)

for each i, where F is the forward measure to time T (the measure underwhich all D·T -rebased discount factors are martingales), and

r : yiT < Ki(λ) = r : yjT < Kj(λ)

for all i and j.

Proof: We work in the forward measure F, in which case

V i0 (Ki) = D0TEF[V iT (Ki)]

= D0T

−∞(−DTT +DTSi +KiP

iT )+ dF(r)

= D0T

r∗(Ki)

−∞(−1 +DTSi +KiP

iT ) dF(r) , (15.4)

where r∗(Ki) is the (possibly infinite) supremum value of r such that theintegrand is positive. Differentiating with respect to Ki yields

∂V i0 (Ki)

∂Ki= D0T

r∗(Ki)

−∞P iT dF(r) . (15.5)

Substituting (15.4) and (15.5) into the right-hand side of (15.2) andrearranging gives

D0T

r∗(Ki)

−∞P i−1T dF(r) − αi(λ−D0TF(r < r∗(Ki))) . (15.6)

Observe that this is increasing in Ki so, given λ, (15.2) will have at most onesolution. Choosing K1,K2, . . . ,Kn such that, for all i and j,

D0TF(y1T < K1) = λ,

r : yiT (r) < Ki = r : yjT (r) < Kj,we have

r∗(Ki) = r∗(Kj) = r

∗,

D0TF(r < r∗) = λ .

With this choice, (15.6) now reduces to

D0T

r∗(Ki−1)

−∞P i−1T dF(r)

Page 315: Financial derivatives in theory and practice

Implying the functional form DTS 291

which is precisely the left-hand side of (15.2). It follows that this is our requiredsolution.The proof above is developed in the forward measure. When considering

each swaption in isolation it is more natural to work with swap rates and towork in the appropriate swaption measure, as we did in Chapter 11. It is thisproperty of the market that obscures the relationship between each of theswaptions — the proof is easy once we have the relationship (15.2) given to us.We have derived an implied functional form at the swap payment dates

Si. It remains to specify the functional form at intermediate time points.There are many ways in which this could be done. One approach which hasconsiderable practical advantages is to define the intermediate discount factorsby some simple interpolation algorithm. For example, one could use a log-linear interpolation of discount factors,

log(DTS) =S − SiSi+1 − Si log(DTSi+1) +

Si+1 − SSi+1 − Si log(DTSi) , (15.7)

where Si < S < Si+1. This has the advantage that it is exactly what manybanks do in practice. At the maturity time T the liquid market instrumentsdetermine the discount factors DTSi and banks then calculate intermediatediscount factors according to the formula (15.7), or some other similar simpleinterpolation routine. Theoretically, however, this model will admit arbitrage— numeraire-rebased discount bond prices for intermediate bonds will not bemartingales. This effect is small enough that it is not relevant in practice.If, despite the practical insignificance of the theoretical arbitrage caused by

this assumption, one wants to generate a model that is exactly arbitrage-freethen this can be done quite easily in a number of ways. Perhaps the simplestwould be, for each intermediate time S, to find some constant wS such that

D0S = wSD0Si + (1− wS)DSi+1where Si < S < Si+1. It then immediately follows from the martingaleproperty of the numeraire-rebased gridpoint bonds, DtSi , that the numeraire-rebased bond DtS is also a martingale. If one wanted to mirror more closelythe market practice of interpolating using (15.7), then an alternative wouldbe to use the interpolation

log(DTS) = wS log(DTSi+1) + (1− wS) log(DTSi) ,

or more generally

f(DTS , S) = wSf(DTSi+1 , Si+1) + (1− wS)f(DTSi , Si) ,

for some function f . In these latter cases the values wS would need to becalculated numerically so that the martingale property held. This would,

Page 316: Financial derivatives in theory and practice

292 Implied interest rate pricing models

however, be very efficient and could be done in exactly the same way as wesolved for intermediate discount factors when we discussed the exponentialswap-rate model in Section 13.3.1.Now that we have derived, from market prices, the joint distribution of

the swap rates at T , we can (numerically) price any derivative from marketswaption prices. Of course, this will only be meaningful if a one-factor modelis suitable for the particular product being considered. See Section 15.4 forfurther discussion of this point.

15.3 NUMERICAL IMPLEMENTATION

To implement an implied model in practice we must solve (15.2). For this we

need to know the functional forms V i0 (K) and the first derivatives∂V i

0 (K)∂K .

All this information is available in the market. Note, in particular, that theswaptions market generally prices using Black’s model. In this case the valueat time zero of a receivers swaption of maturity T is given by

V i0 (Ki) = Pi0[KiN(−di2)− yi0N(−di1)],

where

di1 =log(yi0/Ki)

σi√T

+ 12σi√T ,

di2 = di1 − σi

√T ,

N(·) is the standard Gaussian distribution function and σi is the marketvolatility for this swaption.To derive the joint distributions in this case we work in the forward measure

again. Fixing λ ∈ (0, D0T ), equation (15.2) becomes

P i−10 N(−di−12 ) + αiλ = Pi0 [N(−di2) + αiy

i0N(−di1)] . (15.8)

The form of (15.8) and, more generally, (15.2) is very simple and so it canbe solved inductively in i using Newton—Raphson. This calculation is veryquick because of the quadratic convergence of the Newton—Raphson algorithm(see Press et al. (1988)) and the well-behaved form of the functions involved.Furthermore, if we are solving (15.8) for a grid of λ values in (0, D0T ) thenthe solution for one λ will be a very good first guess at a solution for the nextneighbouring λ. As a result convergence only takes one or two iterations ofthe algorithm.One final decision to be made is what spacing should be used when choosing

the values of λ at which to calculate the functional forms. One could spacethe values evenly over the interval (0, D0T ) which, given the interpretationof λ in (15.3) is certainly one reasonable choice. An alternative, the one we

Page 317: Financial derivatives in theory and practice

Irregular swaptions 293

have usually adopted, is to choose instead an equal spacing for the first parswap rate y1. For each value of y1 the corresponding λ can be calculated from(15.8), thus saving one application of the Newton—Raphson routine.

15.4 IRREGULAR SWAPTIONS

We now consider the problem of pricing an irregular swaption and how animplied pricing model could be used to value such a product. We shall seethat the problem is not as straightforward as one might at first expect.Recall that under the terms of a standard swap one counterparty pays to

another a series of floating cashflows and in return receives a series of fixedcashflows. Using the terminology of Chapter 10, the jth floating cashflow,paid at Sj , is for the amount AαjLTj [Tj, Sj ] (A being the notional amount)and the corresponding fixed cashflow is for AαjK, K being the fixed rate. Byan irregular swap we mean an agreement in which, as for a vanilla swap, aseries of floating and fixed cashflows are exchanged but for which the notionalamount of each cashflow pair may differ between pairs. That is, the jth floatingcashflow is for an amount AjαjLTj [Tj , Sj ] and the jth fixed cashflow is forAjαjK, where now the Aj , j = 1, . . . , n, may differ but must all be non-negative.This is a considerable generalization on a standard swaption. The restriction

that all the Aj be non-negative ensures that this product is primarily sensitiveto the overall level of interest rates and not to the shape of the yield curve.Thus a one-factor model will be sufficient. So suppose we now wish to usea terminal swap-rate model to price this product, first calibrating to somerelated vanilla swaption(s). If we adopt the approach of Chapter 13 and fitan exponential model calibrated to a single swaption, the question arises asto which swaption is appropriate. For example, if all the Aj are (very nearly)equal this is clearly a vanilla product and the calibrating swaption is obvious.On the other hand if all but the last Aj are equal and An = ε which issmall, then the correct calibrating swaption is one with an underlying swapof maturity Sn−1. When the Aj are linearly decreasing to zero the problemis altogether more difficult to decide. One alternative in this case is to use animplied model and calibrate to all the underlying swaption prices. This has aclear weakness, as the following theorem demonstrates.

Theorem 15.2 Consider a linearly amortizing receivers swap, namely anirregular swap for which the notional amounts Aj , j = 1, 2, . . . , n, are givenby

Aj =

nl=j αlnl=1 αl

,

Page 318: Financial derivatives in theory and practice

294 Implied interest rate pricing models

with the αj being the usual accrual factors. Now let

Ut =

n

j=0

cjDtSj

be the value at t of this swap (where, in fact, it follows from the FRA analysisof Section 10.6.3 that, taking A0 = An+1 = 0, cj = Aj(1+αjK)−Aj+1). Fori = 1, 2, . . . , n, let

U it (Ki) = −DtS0 +DtSi +Ki

i

j=1

αjDtSj

be the value at t of a set of vanilla (receivers) swaps with strikes K i. Supposethat

Ut =n

i=1

wiUit (Ki) (15.9)

for someKi and some positive wi and consider now the amortizing and vanillaswaptions with payoffs VT = (UT )+ and V iT (Ki) = (U iT (Ki))+ respectively.Let Vt and V

it (Ki) denote the values of these swaptions at time t ≤ T . Then

Vt ≤n

i=1

wiVit (Ki) (15.10)

for all t.Suppose now that we use a one-factor model to price each of these swaptions,

with resultant prices Vt and Vt(Ki). Then, assuming all swap rates areincreasing in the driving factor,

Vt = infF

n

i=1

wiVit (Ki) (15.11)

where

F = (wi,Ki), i = 1, 2, . . . , n : wi ≥ 0, Ut =n

i=1

wiUit (Ki) .

Remark 15.3: This theorem shows how decomposing the linearly amortizingswap into regular swaps, as in (15.9), gives a corresponding overhedge for theamortizing swaption in terms of vanilla swaptions, (15.10). The implicationsof the second part of the theorem for implied interest rate models is clarifiedin Corollary 15.4 below.

Page 319: Financial derivatives in theory and practice

Irregular swaptions 295

Proof: The inequality follows immediately since

VT = max(UT , 0)

= max

n

i=1

wiUiT (Ki), 0

≤n

i=1

wimax(UiT (Ki), 0).

Thus the original option always pays an amount no greater than the sum ofthe vanilla swaptions and so its value at t must obey the same inequality. Notethat this holds independently of any modelling assumptions.The second part of the result relies heavily on the one-factor property

and the monotonicity of each product in the driving random variable r. Theargument is essentially the one-factor decomposition of Jamshidian (1989).Let r∗ ∈ (−∞,∞) be the value of r such that UT (r∗) = 0. We deal withthe degenerate case when no such r∗ exists later. Choose each Ki such thatU iT (Ki, r

∗) = 0, namely

Ki =DTS0(r

∗)−DTSi(r∗)ij=1 αjDTSj (r

∗), (15.12)

and let wi, i = 1, . . . , n, solve (15.9).Telescoping back from cn to c1 shows that such wi do exist and are given

(with the convention wn+1 = cn+1 = 0), by

(1 + αiKi)wiαi=wi+1αi+1

+ciαi− ci+1αi+1

. (15.13)

The final identity that we require for these wi to solve (15.9) is that

c0 = −n

i=1

wi , (15.14)

but this follows since, by construction, (15.9) holds at t = T , r = r∗.We now make two observations. The first is that wi ≥ 0 for each i and

so this solution is indeed feasible for our minimization problem. This followsinductively from (15.13) since the right-hand side is positive, given the formof the cj for a linearly amortizing swaption, and also (15.12) ensures that(1 + αiKi) > 0. The second observation is that UT and U iT (Ki) are bothdecreasing in r, which follows since all the cashflows except the first (whosevalue is independent of r) are positive and discount factors decrease in r.

Page 320: Financial derivatives in theory and practice

296 Implied interest rate pricing models

To complete the proof note that, by construction, all the swaps have zerovalue when r = r∗ and so monotonicity in r implies

VT (r) =

n

i=1

wiViT (r,Ki)

for all r. Hence (15.10) is an equality in this case and the infimum is attained.The argument above started with the existence of a value r∗ for which the

amortizing swap had zero payout. In the absence of such an r∗ the amortisingswap value is either always positive or always negative. We deal with theformer case, the latter being similar.For i = 2, 3, . . . , n, let Ki = suprKi(r) where Ki(r) is as in (15.12), and

define wi, as before, using (15.13). Now choose w1 and K1 such that (15.13)and (15.14) hold. Observe that each wi is again non-negative and every swaphas non-negative value for all r. The proof is complete.

Corollary 15.4 A one-factor implied pricing model calibrated to all themarket prices V i0 (Ki) will always yield an upper bound for the value of alinearly amortizing swaption.

Proof: For the implied interest rate model, the model and market prices ofthe calibrating swaptions are equal by construction, V i0 (Ki) = V

i0 (Ki) for all

i and Ki. It thus follows from (15.11) and (15.10) that

Vt ≤ Vt .Thus the price from the implied model always bounds the true linearlyamortizing swaption value from above.This systematic overpricing of linearly amortizing swaptions is a direct

result of the one-factor assumption and full calibration. It is worth spendinga little time understanding why this is the case because the issue needs to beconsidered carefully when pricing any product with a single-factor model. Theproblem is that in the model all the forward LIBORs are perfectly correlatedwith each other, whereas in the real market they are not. The volatility of eachcalibrating swaption depends on the law of the underlying swap rate, and thisin turn depends on a subset of the forward LIBOR and their interactions. Inparticular, the correlation structure of these forward LIBORs has a significanteffect. On the other hand, in the one-factor model the correlation betweenthe forward LIBORs is unity. In calibrating correctly to the swaption priceswithin a one-factor model, i.e. one with perfect correlations, the modelmust compensate by systematically adjusting the volatilities of the forwardLIBORs. But this compensation, while it is ideal for the products to whichthe model is calibrated, is far from suitable for other products which have adifferent dependency on the underlying LIBORs.To rectify this problem we must remember the purpose of developing and

using a one-factor model and use general higher-level information before

Page 321: Financial derivatives in theory and practice

Irregular swaptions 297

specializing to a specific model. In all the applications so far there has been avery clear market rate, with a known volatility, to which we should calibratethe model. We have then consistently developed a model for the whole termstructure (at the terminal time T ) so that we can capture some pricing effect,such as the convexity of a CMS. The situation for an irregular swaption isdifferent since there is no clear liquid instrument against which to calibratethe model.What we must do in this situation is first decide what it is in the market

that has most effect on the price of the irregular swaption. In fact there isan obvious answer to this, the par swap rate for the underlying irregularswap, which we define shortly. As we shall see, this rate clearly captures theimportant features of the market, but it has the drawback that there is littledirect market information about its volatility. But we can estimate this byvarious means.To expand on this approach, consider a general irregular swap that

comprises n cashflows and has a fixed rate K. Suppose that cashflow j occursat time Sj and that the notional amount on which this cashflow is based isAj ≥ 0. Then the value at time t of the fixed leg is

n

j=1

AjαjKDtSj

and, using the FRA analysis of Chapter 10, the value of the floating leg is

n

j=1

Aj(DtSj−1 −DtSj ).

For this swap structure we can define the corresponding PVBP, P , and parswap rate, y, via

Pt =

n

j=1

AjαjDtSj ,

yt =

nj=1Aj(DtSj−1 −DtSj )

Pt.

We now model and price the swaption very much as we did for vanillaswaptions. Take P as numeraire and y as a log-normal process in thecorresponding swaption measure S. Once more y is a martingale in thismeasure, being a linear combination of numeraire-rebased asset prices, andwe obtain, for the payers swaption,

Vt = PtES[(yT −K)+|Ft]= Pt ytN(d1)−KN(d2) . (15.15)

Page 322: Financial derivatives in theory and practice

298 Implied interest rate pricing models

Equation (15.15) has exactly the same form as (11.8), the price of a vanillaswaption, and the definitions for d1 and d2 are exactly as given there:

d1 =log(yt/K)

σt√T − t +

12 σt√T − t ,

d2 =log(yt/K)

σt√T − t −

12 σt√T − t ,

σ2t =1

T − tT

t

σ2udu .

From the above it is clear that we do not, on this occasion, need to modelthe full discount curve at time T , DTS : S ≥ T — everything we need tomodel is summarized by the irregular swap rate. However, we are left withthe problem of how we should set the volatility for this irregular swap rate y.But it is now very clear that this is what matters and we are able to exploitto the full our knowledge of the true covariance structure between the variousmarket LIBORs and swap rates. To do this one can use various degrees ofsophistication based on the market swaption and cap floor volatilities.The simplest reasonable approach would be the following. Very roughly

speaking, any vanilla par swap rate can be considered to be a weighted averageof the LIBORs which underlie the corresponding swap. To see this, write

yit =i

j=1

βjLt[Tj, Sj ] , (15.16)

where

βj :=αjDtSjP it

.

Of course the βj also depend on the forward LIBOR but, to simplify, weoverlook this fact. If by some means (we shall not go into this, but one way is byhistorical analysis) we can estimate the correlation structure for the LIBORsthen equations (15.16) allow us to derive the (approximate) LIBOR volatilitiesfrom the swaption volatilities. From these volatilities and correlations we cannow calculate the volatility of the irregular swap rate,

yt =

n

j=1

γjLt[Tj , Sj ] ,

where

γj :=AjαjDtSj

Pt.

What we have done here is use full information about the multi-factornature of the real world to choose the ‘best’ one-factor model for pricing theirregular swaption. We have not needed to use a full terminal swap-rate model

Page 323: Financial derivatives in theory and practice

Exponential and implied swap-rate models 299

because a single rate (the irregular par swap rate y) has summarized enoughabout the market to price the product.As a final point on this product, note that if one were to price a CMS

based on this irregular swap rate then it is possible, extending the approach ofChapter 13 in the obvious way, to develop a terminal swap-rate model based onthis irregular swap rate. This product is unlikely to occur in practice becausecounterparties like payments to be determined from a readily available ratesuch as a LIBOR or a vanilla swap rate.Returning to the implied pricing models, the subject of this chapter, the

analysis above might suggest that they are of limited use. The fitting of asingle factor to many swap-rate distributions has created a model with unde-sirable features. The problem was not one caused by the implied fitting ideaitself, which is an efficient technique, rather it was caused by the choice ofdistributions to fit. However, if we use the technique in a slightly differentway it turns out to be very powerful. We will demonstrate this in the nextsection by generating an arbitrage-free terminal swap-rate model with proper-ties very similar to those of the exponential model defined in Chapter 13. Thisalternative model has the distinct advantage that it is much easier to calibratebecause equations (15.2) can be solved sequentially rather than needing to besolved simultaneously as in the algorithm of Section 13.4.

15.5 NUMERICAL COMPARISON OF EXPONENTIALAND IMPLIED SWAP-RATE MODELS

A single-factor implied interest rate model such as those derived in thischapter, or more generally the one-dimensional examples of Markov-functionalmodels introduced in Chapter 19, can have undesirable properties and givemisleading prices if used inappropriately. The problem, as discussed in the lastsection, is the reduction of a multi-factor world to a single-factor model andis one which is relevant to any one-factor model — you cannot get everythingright so be careful to focus on the right quantities for the pricing problem tohand.For the amortizing swaption, calibrating a one-factor implied model to the

prices of all the underlying swaptions was not the right thing to do. So theidea of calibrating to all these underlying swaptions as an end in itself is notsensible. However, by careful choice of input prices it is possible to generate aone-factor implied pricing model which has relevant (to the problem at hand)properties and behaviour. This should not be surprising since, after all, anyone-factor model is determined by its marginal swap-rate distributions and animplied model can be fitted to any desired marginals. We illustrate this ideaand technique by formulating an implied pricing model which is very similarto a predefined exponential swap-rate model. Other behaviour can be imposedin a similar way.

Page 324: Financial derivatives in theory and practice

300 Implied interest rate pricing models

The problem we consider is this. Suppose we are given an initial discountcurve, D0S , 0 ≤ S < ∞, and we wish to generate an exponential swap-rate model for which the swap rate yn is log-normal in its own swaptionmeasure Sn, i.e. a model which prices all swaptions of expiry time T on theswap rate yn according to Black’s formula. Calibration of this algorithm canbe prohibitively slow, so we would like an alternative model which has verysimilar behaviour to the ‘target’ exponential model.To achieve this, the procedure we follow is this. We will develop an implied

model which values all the swaptions V i0 (Ki) : 1 ≤ i ≤ n according to

Black’s formula. Clearly we want the volatility of yn to be σn, but whatshould we choose for the volatility σi of the swap rate y

i? We use the followingheuristic. In the exponential model, we know that, for each i = 1, . . . , n,

DTSi(zT ) = exp(−CSizT ) , (15.17)

yiT =1− exp(−CSizT )ij=1 αjexp(−CSjzT )

. (15.18)

If we were to replace the random variable zT in (15.17) and (15.18) by someconstant z, then we could conclude immediately, remembering that we havedefined CSn = 1, that

z = − log(DTSn) ,

CSi = −log(DTSi)

z, i = 1, . . . , n− 1 ,

and, furthermore,

dyi0dz

= yi0CSiD0Si1−D0Si

+

ij=1 CSjαjD0Sjij=1 αjD0Sj

. (15.19)

The sensitivity of the swap rate yi to the variable z is given by (15.19). If, foreach i, we use this as a proxy for the sensitivity of the swap rate y i to a drivingBrownian motion, then we are led to select the volatilities σi according to therule

σiyiσnyn

:=dyi0/dz

dyn0 /dz.

So now we can build an implied model with these volatilities σi, i =1, . . . , n. The implied model will have each swap rate being precisely log-normally distributed in its own swaption measure. But how close is this tothe ‘target’ exponential model? We can investigate this numerically and do sobelow. Note, however, that the point of this numerical analysis is to show howclose the two models are. Thus we can be assured that the implied model hassimilar properties to the (more explicit) exponential model. However, there

Page 325: Financial derivatives in theory and practice

Exponential and implied swap-rate models 301

is no reason to believe the implied model is inferior to the exponential modelwhich it attempts to emulate. Both are arbitrage-free, both calibrate perfectlyto the swap rate yn, both give rise to discount curves of approximately thesame shape for each value of the random variable zT (as guaranteed by thenumerical comparison below). However, the major benefit of the implied modelis that it can easily be calibrated efficiently, whereas the exponential modelcannot.To compare these two models we have considered a model with expiry

time T = 1 calibrated to a nine-year swap which pays annual fixed coupons.The volatility of the one-year into nine-year swaption is taken to be 20%throughout and the corresponding forward swap rate is taken to be 7.00%. Wethen consider various interest rate curves, upward-sloping, downward-slopingand flat, and see how similar the models are.The three different interest rate curves used are summarized in Table 15.1.

For each of these curves and for each of the two models we calculated thefollowing quantities:

(i) for each swap rate yi and for a range of strikes K, the value of theassociated payers swaption;

(ii) for each swap rate yi and for a range of K, the probability in thecorresponding swaption measure Si that the swap rate is below the levelK, Si(yiT < K).

It turns out that the differences are greatest for the swap rate y1, as onemight expect given that this rate is least similar to the one-year into nine-year swap rate to which both models have been calibrated. The worst resultswere observed for the upward-sloping curve, and Figures 15.1 and 15.2 belowpresent the results in this case. Figure 15.1 shows the difference in the PVBP-forward swaption values (i.e. swaption value/swap PVBP) for both modelsplotted against the strike. The maximum difference was 0.33 basis points(0.0033%). Figure 15.2 shows the difference in the cumulative probabilitiesfor the two models as a function of the level K. In this case the maximumdiscrepancy was 0.00125.

Table 15.1 Interest rate curves

Term/Years

1 2 3 4 5 6 7 8 9

Upward sloping Swap rate 5.31% 5.54% 5.77% 5.99% 6.21% 6.42% 6.62% 6.81% 7.00%

Fwd libor 5.31% 5.79% 6.27% 6.75% 7.23% 7.71% 8.18% 8.66% 9.12%

Flat Swap rate 7.07% 7.06% 7.05% 7.04% 7.03% 7.02% 7.02% 7.01% 7.00%

Fwd libor 7.07% 7.05% 7.03% 7.01% 6.99% 6.97% 6.95% 6.94% 6.92%

Downward sloping Swap rate 8.76% 8.51% 8.28% 8.05% 7.83% 7.62% 7.41% 7.20% 7.00%

Fwd Libor 8.76% 8.25% 7.75% 7.26% 6.78% 6.30% 5.82% 5.35% 4.88%

Page 326: Financial derivatives in theory and practice

302 Implied interest rate pricing models

0.35

0.30

0.25

0.20

0.15

0.10

0.05

0.00

0.05

Strike

Pri

ce d

iffe

ren

ce (

bp

)

2.62

%3.

33%

4.05

%4.

76%

5.48

%6.

19%

6.91

%7.

62%

8.34

%9.

05%

9.77

%

10.4

8%

11.2

0%1.

90%

Figure 15.1 Difference in forward swaption prices

0.0025

0.0020

0.0015

0.0010

0.0005

0.0000

0.0005

0.0010

0.0015

Swap rate

Pro

bab

ility

dif

fere

nce

2.62

%3.

33%

4.05

%4.

76%

5.48

%6.

19%

6.91

%7.

62%

8.34

%9.

05%

9.77

%

10.4

8%

11.2

0%1.

90%

Figure 15.2 Difference in cumulative probabilities

Page 327: Financial derivatives in theory and practice

16

Multi-Currency TerminalSwap-Rate Models

16.1 INTRODUCTION

The majority of exotic interest rate trades, even ones which involve severalcurrencies, can be valued using single-currency models and thus fall withinthe scope of Chapters 13–15. The reason for this is the fact that the tradesusually comprise two or more legs, one in each currency, and each can bevalued independently of the others. However, there are still many trades whichdo genuinely require a model for the interest rate and foreign exchange (FX)rate evolution of several currencies simultaneously. Models of this type, againfor application to European derivatives, are the topic of this chapter.

The terminal swap-rate models developed in Chapter 13 have many desir-able features and are particularly suited to European derivatives. In devel-oping a multi-currency model one would like to retain these properties ofthe models. Indeed, it would be nice to be able to develop a multi-currencyanalogue which, when considering a single currency, reduced to a standardterminal swap-rate model. This we can in fact do, as we shall see. As wasthe case for the single currency, the resulting models are arbitrage-free, havethe advantage that they are parameterized in terms of standard market rates,and can be calibrated to appropriate prices in the interest rate market. Fur-thermore, within the model the resulting (PVBP-weighted) forward FX rate(defined precisely below) is log-normal and reduces to the standard marketFX model when interest rates are taken to be deterministic. Another desirablefeature is that the model is symmetric, in that it has the same general formwhen viewed from any currency.

Section 16.2 below provides the definition and derivation of a multi-currencyterminal swap-rate model. These ideas are then applied, in Section 16.3, tovalue spread options and cross-currency swaptions. To do this we use the linearswap-rate model, and this results in convexity corrections which generalizethose developed in Chapter 14. Of course, the more sophisticated exponential,

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 328: Financial derivatives in theory and practice

304 Multi-currency terminal swap-rate models

geometric or implied models could also be used for other products whichwarranted it.

16.2 MODEL CONSTRUCTION

As stated in Section 16.1, our objective in this chapter is to extend the class ofterminal swap-rate models to the multi-currency setting. We want to do so insuch a way that we retain all the desirable properties of a single-currencyterminal swap-rate model. In particular, we want an arbitrage-free model(or more generally a modelling framework) which allows us to specify thefunctional form of discount factors in each currency. We now develop justsuch a model.Suppose we wish to value some given product and to do so we must model a

total of n currencies. Suppose further that in each of these currencies we haveidentified a market swap rate, the rate y i in currency i, to which we want tocalibrate the model. Extending the notation of Chapter 10 in the obvious way,we denote by P i the PVBP corresponding to the calibrating swap in currencyi and by Di

tS : 0 ≤ t ≤ S <∞ the discount factors in currency i.The specification of the multi-currency model is carried out in two stages

as follows.

Step 1: Consider each currency separately and in that currency define thefunctional form Di

TS(yiT ), S ≥ T .

Step 2: Fix a probability measure in which to work (usually the swaptionmeasure for the calibrating swap in currency 1) and in this measuredefine the joint law of the calibrating swap rates and the spot exchangerates, yi, X i : 1 ≤ i ≤ n, 0 ≤ t ≤ T, X i

t being the value at t incurrency 1 of one unit of currency i.

Of course, in carrying out these two steps we must ensure that the resultingmodel is arbitrage-free and has all the properties intended at the outset of themodelling process.We shall go through each of these steps in turn. For clarity we first consider

the case when the calibrating swap rates y iT are log-normally distributed intheir respective swaption measures (which we denote by Si). This correspondsto all the calibrating swaptions being priced by Black’s formula and is themost common assumption made in practice — it would be unusual, with thepossible exception of yen derivatives, for a bank to price an exotic option usinga model which explicitly models a volatility smile (i.e. non-log-normal swap-rate distributions). It is the case, however, that the techniques presented herecan be adapted to incorporate volatility smiles, and so we shall later showhow this can be done.

Page 329: Financial derivatives in theory and practice

Model construction 305

16.2.1 Log-normal case

We carry out the two steps described above in turn.Step 1: In each of the currencies i = 1, . . . , n, we select the general functionalform desired for the discount factors at the terminal time T as a function of therelevant calibrating swap rates. Usually this will be the same for all currenciesbut it need not be. Thus, for example, currency 1 could use the exponentialswap-rate model whereas currency 2 could use the linear swap-rate model. Wethen follow exactly the procedure of Chapter 13 to determine the parametersof this functional form (the constants CS for the exponential model, theconstants A and BS for the linear model). This step of the procedure requiresthat we know the volatilities of the respective swap rates but we do notactually set up the model on a probability space at this stage.Step 2: To complete the model specification, we now define the law of allthe swap rates (in a single measure) and of the various FX rates and showthat these, together with the functional forms defined in step 1, combine toproduce a full, arbitrage-free, multi-currency terminal swap-rate model.Let (Ω, Ft,F ,P) be a probability space supporting a correlated 2n-

dimensional Brownian motion (W i,W i; 1 ≤ i ≤ n) with

dW it dW

jt = ρijdt, dW i

t dWjt = ρijdt, dW i

t dWjt = ρijdt.

As will become clear, the measure P will in fact be the swaption measurecorresponding to the PVBP numeraire P 1. Define the vector process(yi,M i; 1 ≤ i ≤ n) to be the solution to the SDE

dyit = −σitσitρiiyit dt+ σityitdW

it ,

dM it = σitM

itdW

it ,

for deterministic functions σi and σi. This SDE clearly has a unique strongsolution and thus uniquely defines the process (yi,M i). Indeed it can be solvedcomponentwise and yields explicitly the solution

yit = yi0exp

t

0

σiudWiu −

t

0

ρiiσiuσ

iu +

12 (σ

iu)2 du , (16.1)

M it =M

i0exp

t

0

σiudWu − 12

t

0

(σiu)2 du .

The processes M i will be used to define the FX rates and, accordingly, we

will always set σ1 ≡ 0. Thus the Brownian motion W 1 is redundant and wasonly included to simplify notation (to save running the FX index from 2 to nrather than from 1 to n).Denote by X i

t the value in currency 1 at time t of one unit of currency i. Themulti-currency model is now completed by defining all the discount factors and

Page 330: Financial derivatives in theory and practice

306 Multi-currency terminal swap-rate models

the FX rates in terms of the process (yi,M i : i = 1, . . . , n) as follows. Firstwe use the functional forms derived in step 1 to define the discount factors atT from the swap-rate processes y i via

DiTS := D

iTS(y

iT ), i = 1, 2, . . . , n.

Note that this automatically also defines the various PVBPs, P iT . Secondly,

we define the FX rates at time T by

X iT :=

P 1TMiT

P iT.

This completely specifies the model at time T . To specify the model at anyearlier time we use the martingale property of P 1-rebased assets, these assetsbeing the pure discount bonds quoted in units of currency 1,

X itD

itS

P 1t= EP

X iTD

iTS

P 1TFt ,

for all t ≤ T ≤ S. Note, in particular, that since each M i is a martingale,

M it =

P itXit

P 1t,

for all 0 ≤ t ≤ T .Remark 16.1: Remark 13.1 regarding the filtration Ft also holds here. Thatis, all numeraire-rebased assets should be martingales with respect to the assetfiltration rather than with respect to the original filtration for the probabilityspace. Once again, for the models and examples presented here, we can workwith either filtration.

It is straightforward to check that this completely determines all the assetprice processes and the FX rate processes (although we will only need toconsider them at the terminal time T in applications). This follows, as it didin Section 13.4 for the single-currency case, from the observation that D i

tt ≡ 1for all i and t and the fact that X1 ≡ 1.The steps above clearly uniquely define a multi-currency model. What is

not yet clear is that it is arbitrage-free and that it has the kind of propertiesthat we intended. This we now show.To establish that the model in arbitrage-free, we show that all P 1-rebased

assets are martingales. That is, for all i and S the processDi·SX

i

P 1 is an (Ft,P)martingale (and thus (P 1,P) is a numeraire pair for the economy). To see this,fix i and note first that, for all t and S,

DitSX

it

P 1t=DitS

P itM it .

Page 331: Financial derivatives in theory and practice

Model construction 307

It therefore follows, from Lemma 5.19, thatDi·SX

i

P 1 is an (Ft,P) martingaleif and only if

Di·SP i is an (Ft,Q) martingale where

dQdP Ft

:=M it

M i0

.

But, applying Girsanov’s theorem for Brownian motion (Theorem 5.24) toW i

we see that (16.1) can be written as

yit = yi0exp

t

0

σiudWiu − 1

2

t

0

(σiu)2 du

where W i is an (Ft,Q) Brownian motion. But now, in currency i, we arein precisely the situation under which we set up the original single-currency

terminal swap-rate model and, by construction, the processDi·SP i is an (Ft,Q)

martingale, as required.To conclude this section we make several observations about the properties

of the model we have just defined:

(i) The measureQ just constructed is, in fact, the swaption measure Si underwhich all P i-rebased assets are martingales.

(ii) The PVBP-weighted forward FX rates Mi are log-normal martingales inthe measure P (= S1). One could also define the FX rates X ij , where

X ijt := Xj

t /Xit is the value in currency i of one unit of currency j, and

the corresponding PVBP-weighted forward FX rates, M ijt := M j

t /Mit .

It is left to the reader to check that, for each i and j, the process M ij

is an Ft martingale in the swaption measure Si corresponding to thenumeraire P i. Furthermore, all the forward par swap rates y i take asimilar log-normal form in this measure and thus the model is symmetric— its general properties are not dependent on the particular currency usedto set up the model initially.

(iii) Observe that the dynamics of any of the discount curves, currency i forexample, depends only on the swap rate in that currency, y i, and not onthe FX rates or on the swap rates in other currencies. That is not to saythe currencies are independent of each other: they are not. Furthermore,note that, if only one currency is considered, this multi-currency modelreduces to the underlying single-currency terminal swap-rate model.

16.2.2 General case: volatility smiles

The treatment above was for the case when the swap rates were log-normalprocesses, this being the most common assumption made in practice. However,just as was the case for single-currency terminal swap-rate models, the analysisextends easily to allow for non-log-normal distributions for the swap rates y iT .

Page 332: Financial derivatives in theory and practice

308 Multi-currency terminal swap-rate models

To cope with volatility smiles in the single-currency case, a model was setup, in Section 13.4.2, as follows. First we calculated a functional form yT (WT )which would give yT the required distribution, W here being a Brownianmotion. Then the functional form DTS(yT ) was derived to be consistentwith the martingale property of PVBP-rebased assets. Thus, indirectly, wecalculated a functional form DTS(WT ).In this multi-currency setting we apply these same ideas, along with the

techniques introduced in Section 16.2.1 to extend to several currencies. First,

as in step 1 above, we calculate the functional forms D iTS(W

iT ). We have not

at this point defined the law of W i, but we know that if it were a Brownianmotion in the measure Si then currency i would, in isolation, define a single-currency terminal swap-rate model. We now carry out step 2 along similarlines to the log-normal case above. Take the same probability space, as before,

supporting the correlated Brownian motion (W i,W i; 1 ≤ i ≤ n), and nowdefine the vector process (z i,M i; 1 ≤ i ≤ n) to be the solution to the SDE

dzit = −σitσit ρii dt+ σitdWit ,

dM it = σitM

itdW

it .

The process z = (z1, . . . , zn) is, under the measure S1, a Brownian motionwith drift, the drift being chosen such that, for each i, z i is a (driftless)Brownian motion under the measure Si. Setting Di

TS := DiTS(z

iT ), and

defining X i from M i, P i and P 1 as in Section 16.2.1, now yields a multi-currency model with all the desired properties. The details are left as anexercise for the reader.

16.3 EXAMPLES

We consider two examples of particular practical importance: spread optionsand cross-currency swaptions. Henceforth, to simplify notation, we will dropthe superscript 1 when the context is clear. We consider the case when theswap and FX rates are all log-normal processes with constant volatilitiesand use the linear swap-rate model in each currency. The constant volatilityrestriction is purely for notational convenience.

16.3.1 Spread options

Consider a cross-currency spread option, the payoff for which is made incurrency 1 at time T and is for an amount

VT = αy(2)T − βy(3)T −K

+,

Page 333: Financial derivatives in theory and practice

Examples 309

where α,β and K are positive constants. The index variables y(2)T and y

(3)T are

two predefined swap rates in currencies 2 and 3, and the payment currency,currency 1, may be the same as one of the index currencies.Working in the measure S, the time-t value for this option is given by

Vt = PtES αy(2)T − βy(3)T −K +

P−1T |Ft . (16.2)

We can evaluate this expectation analytically in the case K = 0 or via aone-dimensional numerical integration when K > 0. However, rather thannaıvely evaluating (16.2) by substituting in the appropriate distributions, it isinstructive to examine the structure of the product in more detail. We do thisby following the ideas and techniques introduced in Chapter 14 for treatingconvexity corrections.We introduce some further notation. Let

M(A1, A2,σ1,σ2, ρ,K, T ) = E A1eN1 −A2eN2 −K

+

where (N1, N2) is a bivariate normal random variable such that

Ni ∼ N − 12σ2i T,σ2i T

and cov(N1, N2) = σ1σ2ρT . In the case when K = 0 this reduces toMargrabe’s exchange option formula (Margrabe (1978)) which is given by

M(A1, A2,σ1,σ2, ρ, 0, T ) = A1N(d1)−A2N(d2),

where

d1 =log(A1/A2)

φ√T

+ 12φ√T ,

d2 =log(A1/A2)

φ√T

− 12φ√T ,

φ2 = σ21 + σ22 − 2ρσ1σ2 .In the case K > 0, M must be evaluated numerically, conditioning first onN2 and evaluating the ‘inner’ integral explicitly.Returning to the evaluation of (16.2) and writing

(αu− βv −K)+ = f(u, v) ,

the payoff can be expressed as the sum of two terms

VT = f y(2)T , y

(3)T = f y

(2)T , y

(3)T A+BT yT PT ,

Page 334: Financial derivatives in theory and practice

310 Multi-currency terminal swap-rate models

the parameters A and BT being as defined in Section 13.3.3 for the linearswap-rate model. Define Y to be the measure given by

dYdS Ft

=yty0, t ≤ T . (16.3)

Then Y is a martingale measure corresponding to the numeraire y tPt, and itfollows from Corollary 5.9 that

Vt = APtES f y(2)T , y

(3)T |Ft +BT (ytPt)EY f y(2)T , y

(3)T |Ft

= wDtTES f y(2)T , y

(3)T |Ft + (1− w)DtTEY f y(2)T , y

(3)T |Ft

where

w =APtDtT

.

We must now find the conditional distribution of y(2)T , y

(3)T given Ft in the

measures S and Y. We already know this for S, the original measure underwhich the model was set up and under which the processes y i satisfy (16.1).

Under S the conditional distribution of (log y(2)T , log y(3)T ) given Ft is bivariate

normal. This joint distribution is specified via the mean and covariancestructure which, recalling that we have dropped the time-dependency of thevolatilities, is given by

ES log yiT |Ft = log yit − σiσiρii(T − t) ,

var[log yiT |Ft] = (σi)2(T − t) ,and

corr[log y(2)T , log y

(3)T |Ft] = ρ2,3 .

To find the corresponding law under Y we apply Girsanov’s theorem(Theorem 5.24) to (16.1) (we know the Radon—Nikodym derivative is givenby (16.3)), to establish that yi is given by

yit = yi0exp σiW i

t + (σ1σiρ1i − σiσiρii − 1

2 (σi)2)t ,

where W it := W i

t − σ1σiρ1it is an (Ft,Y) Brownian motion. Thus, underY, the conditional distribution of y(2)T , y

(3)T , given Ft, is again Gaussian, the

covariance structure being as before but now having mean

EY log yiT |Ft = log yit + σiσ1ρ1i(T − t)− σiσiρii(T − t) .

Page 335: Financial derivatives in theory and practice

Examples 311

It now follows that

Vt = wDtTMS + (1− w)DtTMY (16.4)

where

MN =M α y(N;2)t

∗,β y

(N;3)t

∗,σ(2),σ(3), ρ23,K, (T − t)

and, for i = 2, 3,

y(S;i)t

∗= yitexp −σiσiρii(T − t) ,

y(Y;i)t

∗= yitexp σ1σiρ1i − σiσiρii (T − t) .

The form of the valuation formula (16.4) is particularly striking. The value isexpressed as an affine combination of two Margrabe-type terms, the weightingbeing dependent only on the domestic discount curve at time t. The Margrabeterms each have two ‘convexity-adjusted’ forward swap rates as input. In thefirst term the convexity adjustment is caused by the correlation between therespective swap rate and the corresponding FX rates. In the second termthese forwards are additionally adjusted for the correlation between the swaprate and the domestic swap rate. This is a natural extension of the resultspresented in Chapter 14 for convexity in a single currency.

16.3.2 Cross-currency swaptions

A cross-currency swaption is an option to enter a cross-currency swap. Thereare several different types of cross-currency swap and we will consider four ofthem. The risks and pricing formulae for each of these are somewhat different,but all four fall within our framework.

With initial and final exchange on variable notional

This is one of the most common examples. On entering the swap on date Tthe two counterparties make an initial exchange of principal. One receives a

unit amount of domestic currency (currency 1) and in return pays (X(2)T )−1

units of currency 2. This initial transaction has zero value. The counterpartywho receives the unit domestic amount also pays, on dates Sj , j = 1, 2, . . . , n1,fixed amounts αjK, and finally he returns the initial unit payment at time Sn1 .In return for making these payments, this counterparty receives in currency

2 on dates S(2)j , j = 1, . . . , n2, a series of floating LIBOR payments plus a

margin, m, on the notional (X(2)T )−1 (as in the floating leg of a standard

Page 336: Financial derivatives in theory and practice

312 Multi-currency terminal swap-rate models

interest rate swap). He also receives his initial principal (X(2)T )−1 back at

time S(2)n2 .

To value this swaption, note first that in the case when m = 0 the net valueat time T of all the payments in currency 2 is zero. Further note that the time-T value of the currency 1 principal exchange is DTT−DTSn1 , the time-T valueof the fixed payments made in currency 1 is K

n1j=1 αjDTSj and the time-

T value of the marginal payments received is (X(2)T )−1m n2

j=1 α(2)j D

(2)

TS(2)j

in

units of currency 2 or m n2j=1 α

(2)j D

(2)

TS(2)

j

expressed in units of currency 1. It

then follows that for the swaption

VT = DTT −DTSn1 −Kn1

j=1

αjDTSj +m

n2

j=1

α(2)j D

(2)

TS(2)

j +

= PT (yT −K) +mP (2)T+,

and thus

Vt = PtES (yT −K) +mP(2)T

PT +

Ft

= PtES (yT −K) +m A+BT yT

A(2)+B

(2)T y

(2)T +

Ft .

This can be evaluated by first conditioning on y(2)T and then performing the

remaining integral numerically.

Without initial and final exchange on variable notional

This is exactly as above except the initial and final principal payments arenot made. Note that the time-T value of the floating LIBOR payments made

in currency 2 is exactly y(2)T P

(2)T units of currency 1. We now have

VT = y(2)T P

(2)T −KPT +mP (2)T

+and so

Vt = PtES y(2)T +m

P(2)T

PT−K

+

Ft

= PtES (A+BT yT )y(2)T +m

A(2)+B

(2)T y

(2)T

−K+

Ft .

Evaluation is carried out numerically as above.

Page 337: Financial derivatives in theory and practice

Examples 313

With initial and final exchange on pre-agreed notionals

This is a variant of the first example. There the cashflows in currency 2 are all

made based on a notional (X(2)T )−1. In this example they are made on some

pre-agreed amount, C. Now we have

VT = PT (yT −K) + CmX (2)T P

(2)T

+

= PT yT −K + CmM(2)T

+,

and so

Vt = PtES yT −K + CmM(2)T

+Ft .

This expression is similar to those in the spread option example and evaluationis carried out in the same manner.

Without initial and final exchange on pre-agreed notionals

This is a variant of the second example. Earlier the floating payments were

made on a notional amount (X(2)T )−1. In this example they are made on some

pre-agreed amount, C. Now we have

VT = CX(2)T (y

(2)T +m)P

(2)T −KPT

+,

which yields the time-T value in currency 2 as

V(2)T = C(y

(2)T +m)P

(2)T −K(X (2)

T )−1PT+,

and so, working in the measure S(2),

V(2)t = P

(2)t ES(2) C(y

(2)T +m)−K(M (2)

T )−1+Ft .

This expression is again similar to those in the spread option example andevaluation is carried out in the same manner.

Page 338: Financial derivatives in theory and practice
Page 339: Financial derivatives in theory and practice

Orientation: Pricing ExoticAmerican and Path-DependentDerivatives

In the last three chapters of this book we consider the problem of how to priceAmerican and path-dependent products. By this we mean products which arenot European, ones whose payoffs can only be determined by observing themarket on several (more than one) distinct dates. If the product can be dividedinto simpler subproducts, it is assumed that at least one of the subproductsis not European. If the payoff can be determined completely by observing themarket, with no decisions and actions being taken by the counterparties, weuse the terminology path-dependent; if, in addition, the actions of one or bothcounterparties affect the payoff, we say the product is American. This latterclass includes Bermudan swaptions which we introduce shortly.The dependency of these products on the market on several dates means

that to price them we must also model the market on all the relevant dates.This is certainly a generalization of the models considered in Chapters 13—16 where, although we proved that the models introduced extended to alldates, we only concentrated on the model’s properties at a single date. Thisgeneralization clearly makes the modelling problem more difficult.The general approach to modelling multi-temporal products is identical to

that for European products. That is, the first step is to understand, at ahigh level, the relationship between the product in question and the market,in particular which features of the market are relevant to the product andmust be modelled accurately, and which other products are already tradedin the market relative to which this product must be priced. Once this hasbeen done an appropriate model can be developed which reflects the requiredfeatures of the market. Any model will only capture some properties of thereal market and will be a (very) poor reflection of other features of the marketmore generally. Hence the need to understand the product in question well atthe outset and to tailor the model to the product, just as we did for Europeanproducts such as irregular swaptions in Section 15.4.Understanding what in the market and within any given model has most

effect on the price of a multi-temporal derivative is far from easy. Thereis still a shortage of research available on this problem. When modellingthese products people often take the view that adding more factors (drivingBrownian motions) to a model gives a more realistic model of the market andthat this will therefore give a better price for the derivative. To an extent thisis true. However, whenever an extra factor is added various parameters must

Page 340: Financial derivatives in theory and practice

316 Orientation

be fitted. For many models it is often unclear how these should be chosenand what effect they will have on the derivative’s price. And every time anew factor is added the computation problem of calculating a derivative pricenumerically gets much more difficult.Recently, starting in 1995, there have been significant developments in

interest rate modelling which have made the pricing of multi-temporalderivatives much easier. Prior to this date the primary models used for path-dependent and American products were short-rate models. The main benefitsof these models are that they are arbitrage-free and easy to implement.However, they have the major drawback that they are parameterized in termsof a variable, the short rate, which is only distantly related to the productbeing priced. Consequently, it is usually difficult to assess how well a short-rate model captures those features of the market relevant to the pricing ofany given product. We shall discuss models of this type in Chapter 17, inparticular the Vasicek—Hull—White model which is highly tractable and easyto implement.The breakthrough in 1995 was the introduction of the first market models.

These models are formulated directly in terms of the dynamics of market ratessuch as LIBORs and swap rates. This has two major consequences. First, it isnow possible to formulate models which reflect well the (implied) distributionsof precisely those market rates relevant to any particular derivative product,and thus these models have much better calibration properties than short-rate models. The second major consequence is greater clarity and focus in thethought process around the pricing of multi-temporal derivatives. Now thatwe can model directly the law of a finite number of market rates on a finitenumber of dates, now that we realize (as long as we work in an appropriateprobability measure) that it is only the distribution of these particular rateson these particular dates that affects the derivative price and that given thisdistribution the infinitesimal behaviour of any process is irrelevant, we canfocus on what it is about this (high- but finite-dimensional) distribution thatmost affects the derivative price and which must be captured by the model.And now it is also much clearer what information in the real market weshould use when pricing a particular derivative product. We shall discussmarket models in Chapters 18 and a further development, Markov-functionalmodels, in Chapter 19. This latter class of models shares with market modelsthe ability to model market rates directly, but has improved implementationproperties and also moves further away from the idea of specifying derivativepricing models via the infinitesimal behaviour of relevant stochastic processes.An important example of this type of multi-temporal product which we shall

meet repeatedly in the remainder of this book is the Bermudan swaption, orequivalently a cancellable swap. Let T = S0 be the start date of a swap withpayment dates S1 < . . . < Sn and, as usual, let αi, i = 1, . . . , n, denote thecorresponding accrual factors. If two counterparties enter a swap they areeach obliged to make all the cashflows even if on some occasions it is not

Page 341: Financial derivatives in theory and practice

Pricing exotic American and path-dependent derivatives 317

to their advantage. A cancellable swap is one in which one counterparty hasthe right, on any payment date, to cancel all the remaining cashflows. Thishe would clearly do if interest rates had moved against him. A Bermudanswaption is the option part of the cancellable swap. It gives the holder theright, on any of the swap payment dates, to enter the remaining swap. It isclear that a cancellable receive-fixed swap is just the combination of owning aBermudan payers swaption and entering the underlying receive-fixed swap —if you exercise the swaption then the new pay-fixed swap will precisely offsetall remaining obligations on the original receive-fixed swap.What is it that we need to model to value a Bermudan swaption? Recall

that for the vanilla swaption, analysed in Chapter 11, the only thing that weneeded to model, as long as we worked in the relevant swaption measure, wasthe forward par swap rate. It is clear that for the Bermudan we must modelall the discount factors relevant to the underlying swap, D·Si : 0 ≤ i ≤ n.Indeed, working in the martingale measure corresponding to the numeraireD·Sn , it is straightforward to see (just write down the valuation formula) thatall we need to model is the 1

2n(n + 1)-dimensional vector random variable

(DSiSj : 0 ≤ i < j ≤ n). This is equivalent to modelling the 12n(n + 1)-

dimensional swap-rate vector (y ijTi : 1 ≤ i ≤ j ≤ n), where Ti = Si−1 and

yijt =DtTi −DtSj

P ijt,

P ijt =

j

k=i

αkDtSk .

The market provides information, via swaption prices as described in Section13.2.2, on the distribution of each yijTi in its own swaption measure S

ij . Todevelop a model which correctly captures all these marginal distributionswould be overfitting. Even if this were a good idea, it would involve too manyrandom factors to be implementable in practice. However, as we shall see inChapter 19, it is possible to develop a practically implementable model (drivenby a univariate Brownian motion) which correctly captures the market pricesof swaptions corresponding to the par swap rates y inTi , those corresponding tothe swaps which could be entered if the Bermudan were exercised, and withenough free parameters to fit their joint distribution to a degree sufficient forBermudan valuation.

Page 342: Financial derivatives in theory and practice
Page 343: Financial derivatives in theory and practice

17

Short-Rate Models17.1 INTRODUCTION

This chapter is devoted to the study of short-rate models. We single out,in particular, the Vasicek–Hull–White model for more detailed study. At thetheoretical level short-rate models have already been considered in Chapter 8,where they were related to other classes of models. What we actually meanhere is the special case of arbitrage-free models of the terms structure forwhich the short rate rt, t ≥ 0 is a (time-inhomogeneous) Markov processin the risk-neutral measure Q. Usually we will also restrict attention to ashort-rate process driven by a univariate Brownian motion, but we need notmake this restriction in general. This class of models has been of considerableimportance historically, primarily because they are both arbitrage-free and canbe implemented numerically. Recently their popularity has been decreasingamongst practitioners with the advent of market models which have bettercalibrating properties.

All the short-rate models in common use are specified via an SDE,

drt = µ(rt, t)dt + σ(rt, t) dWt,

where W is a Brownian motion in the risk-neutral measure Q. We shall meetseveral examples in the coming sections. The functions µ and σ are chosen togive the model particular behaviour, tractability, and to make it arbitrage-free.For example, the Vasicek–Hull–White model satisfies an SDE of the form

drt = (θt − atrt)dt + σt dWt, (17.1)

where a, θ and σ are deterministic functions of time. The exact form of thesefunctions is chosen to fit the model to initial bond prices (to make it arbitrage-free) and to option prices. We shall discuss this further later.

The Markovian property of the short rate, in the risk-neutral measure Q,is essential for these models to be implementable in practice. To see why this

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 344: Financial derivatives in theory and practice

320 Short-rate models

is the case, recall from Chapter 8 that the value DtT of a pure discount bondat t which matures at T is given by (see equation (8.11))

DtT = EQ exp −T

t

ru du Ft , (17.2)

where Ft is the augmented natural filtration generated by the Brownianmotion W . The Markov property of r ensures that, for all pairs t ≤ T ,(17.2) is a function of the triple (rt, t, T ). Thus the state of the market att is completely summarized by the pair (rt, t). It is this property that allowsone to price a derivative using any of the standard numerical techniques suchas numerical integration, simulation, trees or some finite-difference algorithm,as appropriate. We shall not discuss these standard techniques but refer thereader to Duffie (1996).In this chapter we do the following. In Section 17.2 we give the defining

SDE for a few of the best-known short-rate models. In each case there areparameters which need to be determined by fitting the model to marketprices. This calibration must usually be done numerically and can often beonerous. One model for which the calibration is relatively straightforwardis the Vasicek—Hull—White model which we study in detail in Sections 17.3and 17.4. The form of the defining SDE (17.1) is such that the solutionprocess is a Gaussian process and therefore much can be said about this modelanalytically. In particular, closed-form bond prices result. In Section 17.3 weshow how to recover the Vasicek—Hull—White model parameters efficientlyfrom market prices, and in Section 17.4 we consider the problem of pricinga Bermudan swaption using the Vasicek—Hull—White model. We provide analgorithm which is highly efficient and stable. The algorithm exploits theGaussian property of the Vasicek—Hull—White model and can be applied toother Gaussian processes such as some of the example Markov-functionalmodels introduced in Chapter 19.

17.2 WELL-KNOWN SHORT-RATE MODELS

Here we briefly describe a few of the best-known short-rate models. By farthe most important for practitioners are the first two, the Vasicek—Hull—Whitemodel and the Black—Karasinski model.

17.2.1 Vasicek—Hull—White model

Vasicek (1977) introduced the first short-rate model, not with derivativepricing in mind but from an economic perspective. For his model, under Q,the (mean reverting) short rate satisfies the SDE

drt = (θ − art)dt+ σdWt

Page 345: Financial derivatives in theory and practice

Well-known short-rate models 321

for some constants θ, a and σ. It is immediately clear that the Vasicek modeldoes not have enough free parameters so that it can be calibrated to correctlyprice all pure discount bonds, i.e. we cannot in general choose a,σ and θ tosimultaneously solve

D0T = EQ exp −T

0

ru du

for all maturities T > 0. This led Hull and White (1990) to extend this modelby replacing these constants with deterministic functions,

drt = (θt − atrt)dt+ σtdWt . (17.3)

It is apparent from (17.3) that the short rate r can take negative valuesin the Vasicek—Hull—White model. This is obviously an undesirable feature,one much criticized. However, in our view this criticism is overplayed sinceall the main short-rate models have their own particular weaknesses and, aslong as the model is calibrated and used appropriately, this in itself shouldnot usually present too many problems. Indeed, the Black—Karasinski model,which is often used instead of the Vasicek—Hull—White model precisely becauseit keeps interest rates positive, suffers from giving too high a probability tosample paths of r which spend large amounts of time at large positive values.This is just as problematic, leading, for example, to infinite futures prices!The major advantage of the Vasicek—Hull—White model, and the primary

reason why it is used so widely, is that it is highly tractable. This tractabilityis important in order to be able to calibrate the model efficiently to marketprices. The SDE (17.3) is well known as one which yields a Gaussian processas a solution and it is easily solved. To do this, consider the process y t := φtrtwhere

φt = expt

0

audu .

Applying Ito’s formula to y yields

dyt = φtθtdt+ φtσtdWt

and thus

yt = y0 +t

0

φuθudu+t

0

φuσudWu

= y0 +t

0

φuθudu+W (ξt) , (17.4)

where

ξt :=t

0

φ2uσ2udu

Page 346: Financial derivatives in theory and practice

322 Short-rate models

and W is a Brownian motion (adapted to the filtration Ft = Fξ−1t). To

see this last step just check that the process W , defined by

Wt :=ξ−1(t)

0

φuσudWu,

satisfies the defining properties of a Brownian motion (Definition 2.4).The representation for yt given in (17.4) allows us to make all the necessary

deductions and calculations for the short-rate process rt (= φ−1t yt) that weneed when calibrating the model to bond and option prices. In particular,given rs, both rt, t ≥ s, and integrals of rt against deterministic functionswill be Gaussian. From this we obtain immediately the standard result forthe value at time t of a pure discount bond paying unity at time T , namely,

DtT = EQ exp −T

t

rudu Ft = AtT e−BtT rt (17.5)

for suitable A and B to be determined. These functions can be evaluateddirectly, using the properties of Brownian motion and log-normal randomvariables, to yield A and B in terms of the functions θ, a and σ, and the readermay choose to do so now. However, there is a more direct way to calculatethese parameters, in terms of the initial discount curve D0T : 0 ≤ T < ∞.We shall do this in Section 17.3.

17.2.2 Log-normal short-rate models

Motivated by the log-normality of LIBORs and swap rates as given by Black’soption pricing formula (see Sections 11.3.1 and 11.4), and by a desire to havea model with positive interest rates, one is led to postulate a short-rate modelin which the short rate is a log-normal process,

d log rt = (θt − at log rt) dt+ σtdWt . (17.6)

This model, usually referred to as the Black—Karasinski model (Black andKarasinski (1991)), and the Vasicek—Hull—White model are undoubtedly themost commonly used short-rate models in the markets. The Black—Karasinskimodel has the distinct advantage over the Vasicek—Hull—White model that itretains positive interest rates. However, as mentioned earlier, the cost, due tothe log-normality, is heavy positive tails for the distribution of each rt whichleads to infinite values for futures prices (see Hogan and Weintraub (1993)).This illustrates quite clearly the dangers of a model which is parameterizedin terms of a variable only remotely related to the relevant market rates. Asecond drawback is that fitting the model to market prices is time-consuming.If we knew the function σ, it would be a straightforward and quick matter to

Page 347: Financial derivatives in theory and practice

Well-known short-rate models 323

fit the remaining parameters. The way to proceed is using a tree algorithmand forward induction as described by Jamshidian (1991). (We shall not coverthis topic.) However, when simultaneously fitting the model to option pricesto determine σ one must resort to some search algorithm to solve for thefunction σ; fix σ, solve for θ, change σ, re-solve for θ, and so on.The function a in the SDE (17.6), and also in (17.3), is extremely important.

We shall defer discussion of this until Section 19.7. Some practitionerssimultaneously fit all three functions θ, a and σ to the market, others fixa (as discussed in Section 19.7) and then fit just θ and σ. The latter is, in ourview, the better approach. Note that if we were to set at = −σt/σt then

rt = exp(αt + σtWt)

where

αt = σtlog r0σ0

+t

0

θuσudu .

This special case is known as the Black—Derman—Toy model.

17.2.3 Cox—Ingersoll—Ross model

We include this model because it is extremely well known. However, it is notone that we have ever used for derivative pricing purposes.The original Cox—Ingersoll—Ross model (Cox et al. (1985)) is defined via

the short-rate SDE,

drt = (θ − art) dt+ σ√rtdWt , (17.7)

for constants θ, a and σ. Three features are immediately clear from theSDE (17.7). First, the short rate is mean-reverting, being pulled towards thelevel θ/a. Second, its diffusion term is decreasing with the level of the shortrate according to

√rt. This is intermediate between the Vasicek—Hull—White

model, in which the diffusion term does not depend on rt and for which ratescan go negative, and the log-normal models in which the diffusion term isproportional to rt and rates are strictly positive. In fact, it can be shown thatif θ < 1

2σ2 the Cox—Ingersoll—Ross model will eventually hit zero; if θ ≥ 1

2σ2

it is always strictly positive.The third property to note about the Cox—Ingersoll—Ross model is that

there are not enough free parameters to fit an arbitrary initial term structure.This is easily rectified by modifying to the generalized Cox—Ingersoll—Rossmodel,

drt = (θt − atrt) dt+ σt√rtdWt . (17.8)

The SDE (17.8) is readily solved when θt ≥ 12σ

2t for all t. For this generalized

model there are, as was the case for the log-normal models, no closed-form

Page 348: Financial derivatives in theory and practice

324 Short-rate models

bond pricing formulae and thus any implementation would again be relativelyonerous to calibrate simultaneously to the initial discount curve and optionprices.

17.2.4 Multidimensional short-rate models

This is not a topic we shall cover in much detail. In the financial literaturethere are a great many short-rate models which involve two factors but theseare rarely used for derivative pricing in practice. See, for example, Musiela andRutkowski (1997) and references therein. There are three primary reasons forthis. The first is the difficulty with implementing such a model. For example,calibrating a tree, already onerous for all but the Vasicek—Hull—White model,will now be even more costly since the tree will be (at least) two-dimensionaland there will be more parameters to fit. The second problem is one ofunderstanding how these factors and associated model parameters affect theproperties of the model and consequently option prices. The third reason isthe fact that a one-factor model is usually enough for path-dependent andAmerican-type products, as long as the model is fitted with the product to bepriced in mind, a point discussed in Section 15.4 in the context of Europeanamortizing swaptions.Having given all these caveats, two-factor models can be developed. From

an implementation viewpoint, one needs the resulting model to be of the form

rt = f(Xt, Yt, t) (17.9)

for some (time-inhomogeneous) Markov process (X,Y ). Typically the process(X,Y ) would be driven by a (two-dimensional) Brownian motion. A specialcase of (17.9) is when the process (X,Y ) takes the form

Xt = µXt +

t

0

σXu dWXu ,

Yt = µYt +

t

0

σYu dWYu ,

for deterministic functions µX , µY ,σX and σY and for a Brownian motion(WX ,WY ). In this special case one knows explicitly, for any t ≥ s, thedistribution of (Xt, Yt) (hence of rt) given (Xs, Ys). When the function f at(17.9) is linear in (X,Y ) we are once more in the situation where bond pricesare available in closed form and this considerably simplifies the calibrationof the model to market data (but the effect of each of the two factors is stilldifficult to appreciate).

Page 349: Financial derivatives in theory and practice

Parameter fitting within the Vasicek—Hull—White model 325

17.3 PARAMETER FITTING WITHIN THE VASICEK—HULL—WHITE MODEL

We now detail how all the parameters of the Vasicek—Hull—White model canbe calculated from market prices. Recall that the short rate is driven by theSDE

drt = (θt − atrt)dt+ σtdWt (17.10)

in the risk-neutral measure Q, and furthermore (equation (17.5)), bond pricesin the model are of the form

DtT = AtT e−BtT rt .

We shall show how to derive each of the functions a, θ,σ, A and B. Note,however, that for derivative valuation explicit knowledge of each of theseparameters is not necessary. The reason for this, as we shall shortly see, isthat for the Vasicek—Hull—White model it is more convenient to work not inthe risk-neutral measure but in the forward measure corresponding to takingsome fixed bond D·T as numeraire. If we do this the only parameters we needto explicitly derive are (various integrals of) the functions σ and a.Throughout this section we assume only that the initial discount curve,

D0T : T ≥ 0, is right-differentiable with continuous right derivative almosteverywhere (the usual case in practice). We comment on the implications ofthe discount curve not having a continuous derivative in Section 17.3.3.The market prices to which the model will be calibrated are bond prices

(i.e. the initial discount curve) and option prices, either caps and floors orswaptions as appropriate to the pricing application. We shall assume that themean reversion function a has been prespecified. This means that it is onlypossible to calibrate the model to correctly price at most one option, with agiven strike, per maturity date. One could fit the function a as well, and doingso would allow us to simultaneously fit to the prices of two different optionsper maturity date (such as a swaption and a caplet). However, in Section 15.4we discovered the dangers of fitting a one-factor model to the price of morethan one option per expiry date. Furthermore, as discussed in Section 19.7, themean reversion has such a significant effect on the joint distribution of marketswap rates that we should take care to ensure that any fitting algorithm fixesthis parameter in a way most appropriate to the exotic option being priced.With the two remaining degrees of freedom (θ and σ) one could, in principle(as long as the prices are consistent), calibrate the model to the full initialdiscount curve and one option price for each option maturity date T . However,in applications it is only necessary to calibrate to options with a finite numberof maturity dates and thus it is common to fix a functional form for σ betweenthe relevant dates Ti.We also suppose that the prices of the market options that we use to

calibrate the model are consistent with the initial assumptions about the form

Page 350: Financial derivatives in theory and practice

326 Short-rate models

of a, i.e. given our choice for a there exist θ and σ consistent with the marketprices.For future reference we make the following definitions, all relating to

integrals of the original model parameters.

φt = expt

0

audu , ψt =t

0

φ−1u du,

ξt =t

0

φ2uσ2udu, ζt =

t

0

ψuφ2uσ

2udu, ηt =

t

0

ψ2uφ2uσ

2udu,

µt =t

0

φuθudu, λt =t

0

ψuφuθudu.

It is only actually ψ and ξ that we shall need for the pricing algorithmdescribed in Section 17.4 below.

17.3.1 Derivation of φ, ψ and B·T

Given that a is a predefined user input (for current purposes) the evaluationof φ and ψ is straightforward. To calculate BtT , 0 ≤ t ≤ T, we note that, foreach T ,

MtT := exp −t

0

rudu DtT = exp −t

0

rudu AtT e−BtT rt

defines a martingale M·T , t ≤ T in the risk-neutral measure, i.e. the measurein which (17.10) holds. Applying Ito’s formula to M·T yields

dMtT

MtT=− rtdt+ dDtT

DtT

= −rt + (AtT /AtT )− rtBtT + 12σ

2tB

2tT −BtT (θt − atrt) dt

− σtBtT dWt

(A and B representing the derivative with respect to the first parameter t).Since M·T is a martingale the finite variation term is identically zero for allvalues of rt and t. It thus follows that

−1−BtT + atBtT = 0, (17.11)

(AtT /AtT ) +12σ

2tB

2tT −BtT θt = 0. (17.12)

Solving (17.11) subject to the condition BTT = 0 (which is required sinceDTT ≡ 1) yields

BtT = φt

T

t

φ−1u du = φt(ψT − ψt).

Page 351: Financial derivatives in theory and practice

Parameter fitting within the Vasicek–Hull–White model 327

17.3.2 Derivation of ξ, ζ and η

We already know all the terms in each of the respective integrands exceptσ. It suffices to calculate ξt, t ≥ 0, since this implicitly defines σ and thus ζand η. In practice we will only have a finite set of market option prices tocalibrate to and so will calculate ξt for a finite set of times and assume a(simple) functional form for σ at intermediate points.

Suppose we have available the price V0 of an option expiring at T andpaying ( ∑

j≤J

cjDTSj− K

)+.

This general form includes all standard caps, floors and swaptions. Taking D·Tas numeraire, it follows that

V0 = D0TEF

[( ∑j≤J

cjDTSj− K

)+

],

where the measure F is equivalent to the original risk-neutral measure and issuch that D·S

D·Tis a martingale for all S > 0. Indeed, F and Q are related by

dF

dQ

∣∣∣∣Ft

= exp(−

∫ t

0rudu

) DtT

D0T.

Applying Ito’s formula to (the martingale)D·Sj

D·Tand changing measure to F

yields

d

(DtSj

DtT

)=

DtSj

DtT(BtT − BtSj

)σtdWt

=DtSj

DtT(ψT − ψSj

)φtσtdWt,

for an (Ft, F) Brownian motion W . This SDE can be solved explicitly andyields the unique solution (see Chapter 6)

DtSj

DtT=

D0Sj

D0Texp

((ψT − ψSj

)∫ t

0φuσudWu −

12(ψT − ψSj

)2ξt

)

=D0Sj

D0Texp

((ψT − ψSj

)W (ξt) −12(ψT − ψSj

)2ξt

)(17.13)

where W is an (Fξ−1t, F) Brownian motion, and thus

V0 = EF

[( ∑j≤J

cjD0Sje(ψT−ψSj

)W(ξT)−12

(ψT−ψSj)2ξT − KD0T

)+

]. (17.14)

Page 352: Financial derivatives in theory and practice

328 Short-rate models

Suppose for now that we already know the value of ξT. Here we show howthen to calculate V0(ξT), the option value for this particular ξT. Solving forξT can then be completed by iterating over ξT with either Brent or Newton-Raphson to match the value V0(ξT) to the true market option value V0.

Given a value for ξT, evaluation of (17.14) is performed using the decompo-sition first introduced by Jamshidian (1989). A requirement for this approachto (be guaranteed to) work is that the payoff must be monotone in the realizedvalue of W . This is always the case for standard options, including those withamortizing and forward start features.

The first step in this approach is to find the value ω for W (ξT) which setsthe term inside the expectation (17.14) to be exactly zero. A safe Newton-Raphson algorithm (Press et al. (1988)) will do this quickly. We can thenrewrite (17.14) as

V0 = EF

[∑j≤J

cjD0Sj

(e(ψT−ψSj

)W(ξT)− 12 (ψT−ψSj

)2ξT − Kj

],

whereKj = exp

((ψT − ψSj

)ω − 12 (ψT − ψSj

)2ξT

)

and the ± in the expectation is + if the payoff is increasing in W and − ifit is decreasing. With this decomposition the expectation can now be easilyevaluated.

Before moving on we note that should an alternative model be requiredin which a is not chosen a priori but is instead a function of σ then thestarting point for the algorithm would be this section, solving for a and σsimultaneously.

17.3.3 Derivation of µ, λ and A·T

Substituting for BtT in (17.12) and integrating yields

log(AtT) = log(A0T) + ψTµt − λt − 12 [ψ

2Tξt − 2ψT ζt + ηt]. (17.15)

We know A0T from the values D0T, namely A0T = D0T e

B0T r0 . Substituting t = T

into (17.15), differentiating with respect to T and noting that ATT ≡ 1 yields

µT = ψTξT − ζT − φT∂

∂Tlog(A0T) (17.16)

and thence

λT = log(A0T) − ψT φT∂

∂Tlog(A0T) + 1

2 (ψ2T ξT − ηT).

Substituting back into (17.15) now gives the function AtT.

Page 353: Financial derivatives in theory and practice

Bermudan swaptions via Vasicek—Hull—White 329

Remark 17.1: Observe from equations (17.16) that if the initial discount curvedoes not have a continuous first derivative everywhere then there will be timepoints at which the function µ is also discontinuous. This corresponds to thefunction θ having a delta function component at these time points. As a resultthe underlying short-rate process is continuous whenever the initial discountcurve has a continuous derivative but undergoes jumps at the time pointswhere the discount curve has a discontinuous derivative.In the theory of SDEs as discussed in Chapter 6 we only considered SDEs for

continuous processes. However, the jump in the function µ is of a deterministicamount at a deterministic time and this corresponds to a deterministic jumpin the short-rate process r at a deterministic time. Thus, although we have notstrictly covered such processes, it is easy to see that the theory developed issufficient by inductively applying the continuous theory between jump times,explicitly incorporating the jumps, and piecing together the solutions.

17.4 BERMUDAN SWAPTIONS VIA VASICEK—HULL—WHITE

We defined the Bermudan swaption in the preceding Orientation sectionon multi-temporal products. Here we consider it again and describe anefficient algorithm to value Bermudan swaptions within the Vasicek—Hull—White model.The Bermudan we consider here is more general than that described in

the Orientation. There we restricted to a Bermudan which could be exercisedonly on payment dates of a vanilla swap. Furthermore, the swaps on whichthe constituent options were based were all themselves vanilla swaps. Herewe allow the exercise dates, on which the holder has the right but not theobligation to enter a swap, to be an arbitrary but finite sequence of timesT1, . . . , TK . If the option is exercised the underlying swap which the holderenters may be irregular, with irregular payment dates and irregular couponamounts. For the swap which can be entered at the exercise time Ti, wedenote by Si0 the swap start date and by Sij , 1 ≤ j ≤ Ji, the payment dates.We saw in Chapter 10 that the value of any swap can be written as a linearcombination of discount factors. Thus if the option is exercised at Ti the valueof the swap entered is

Ii := (j≤Ji

cijDTiSij )+ (17.17)

for some known constants cij . Where there is no risk of confusion in whatfollows we will tend to drop some of the subscripts.In the following sections we describe in detail an algorithm for pricing a

Bermudan swaption such as this. Bermudan pricing is usually done by build-ing a tree to carry out a dynamic programming calculation via backward

Page 354: Financial derivatives in theory and practice

330 Short-rate models

induction and is standard. The algorithm described below also uses backwardinduction but exploits the Gaussian structure to gain extra efficiencies. Werefer to the set of (rt, t) pairs used in the algorithm as a tree but we do notexplicitly assign probabilities to transitions within the tree as is usual for stan-dard binomial and trinomial pricing trees. The steps we need to go throughto implement the algorithm are the following:

(i) Choose a probability measure within which to work and use a set ofmarket instruments to calibrate the Vasicek—Hull—White model. If wewere to work in the risk-neutral measure Q this would determine (variousintegrals of) the functions a, θ and σ in (17.10). We shall see that not allof these are needed explicitly.

(ii) Specify the locations of the nodes in the tree and how the intrinsic valueof the Bermudan will be calculated at each of these nodes.

(iii) Calculate at each node the discounted future value of the Bermudan, andthus calculate the Bermudan value at this node by taking the maximumof this with the intrinsic value at the node.

We address each of these in turn.

17.4.1 Model calibration

The model needs to be calibrated so that it correctly prices a set of marketinstruments. These instruments will typically be (pure discount) bonds andeither swaptions or caps and floors, as appropriate.Calibrating a Vasicek—Hull—White model to these products is, because of

the one-factor Gaussian structure, a straightforward task. We will for ourcurrent purposes assume that the (critical) mean reversion function a hasbeen chosen and fixed beforehand, although these techniques also apply tothe more general situation when a depends on σ.For our algorithm we shall work in the forward measure corresponding to

taking as numeraire the bond D·TK , that is the bond which matures on theexercise date of the last option in the Bermudan swaption. We are given theinitial discount curve, D0T , 0 ≤ T < ∞, and a set of option prices, onematuring on each date Ti. The only parameters we shall need to imply arethe underlying bond prices D0Sij , which can be read off from the discountcurve, the values ψTi and ψSij , a user input since a is predefined, and thevalues ξTi . Recall that we saw how to derive the function ξ, using a safeNewton—Raphson or Brent algorithm (Press et al. (1988)), in Section 17.3.2.

17.4.2 Specifying the ‘tree’

The first step is to identify the random process that will be modelled by the‘tree’ and then allocate the positions of the tree nodes.We saw in Section 17.3.2 that, if we work in the forward measure F

Page 355: Financial derivatives in theory and practice

Bermudan swaptions via Vasicek—Hull—White 331

corresponding to the numeraire bond D·TK , then all discount factors couldbe written in the form

DtSDtTK

=D0SD0TK

exp (ψTK − ψS)Xt − 12 (ψTK − ψS)2ξt , (17.18)

where X is a (deterministically) time-changed Brownian motion,

Xt := W (ξt),

and W is as defined in (17.13). Note that (17.18), an equation for ratios ofdiscount factors, defines all discount factors since we can recover D tTK bysubstituting S = t into (17.18). We shall take the Gaussian process X as theone we model with the tree.The most striking property of the tree is that we only position nodes at

times when decisions must be made, namely T1, . . . , TK . At intermediate timesno decisions need to be made and the value function V is well behaved. Weare thus able to calculate any intermediate properties explicitly.At each grid-time Ti we position nodes to be equally spaced in the variable

XTi . This ‘vertical’ spacing between nodes is a user input, depending on theaccuracy required. We will describe shortly the location of the top and bottomnodes in any column.The importance of being able to choose the times of nodes should not be

overlooked. We have positioned the nodes at precisely those times at whichwe need to know the option’s value and so have not lost accuracy caused bynodes missing decision points.The choice of the locations of the extreme nodes in each column depends

on the required accuracy. To understand the effect of this choice, note thatV0(X0), the Bermudan value at zero, can be written as

V0(X0) = D0TKEF Vτ (Xτ )D−1τTK

= D0TKEF Vτ (Xτ )D−1τTK

| XTi < mi P(XTi < mi)

+D0TKEF Vτ (Xτ )D−1τTK

| XTi ≥ mi P(XTi ≥ mi) (17.19)

where τ is the random time at which the option is exercised and Vt(Xt) isits value at time t. By truncating column i of the tree at mi we are ignoringthe second term in (17.19) and also the probability in the first term. Giventhat the process X is Gaussian, the first probability differs from one by aterm O(exp(−m2

i /2ξTi)). The second term decays to zero at the same ratesince the payoff is at most O(exp(cmi)) for some constant c. This argumentextends to include all the exercise times Ti and shows that errors resultingfrom truncation are below polynomial in order of magnitude.

Page 356: Financial derivatives in theory and practice

332 Short-rate models

Should we wish to choose the limits to guarantee a certain level of accuracywe can use the above argument to do so. What is important is the numberof standard deviations of the Gaussian distribution for XTi covered by thenodes. The limits should thus be set at a fixed number of standard deviationsabove and below zero, the mean for XTi . In practice, because the error decaysso quickly, the exact level is not important and a conservative value can bechosen without adversely affecting the algorithm’s speed.The precise algorithm for placing nodes is as follows. First select your node

spread, m. Locate the top node in each column at the value m var[XTi ] =

m ξTi . Then locate nodes at a regular spacing ∆, chosen depending onthe accuracy required (as described in Section 17.4.5), down from this node

until the first node lower than −m ξTi . We do not specify any further treestructure at this point. Strictly speaking, the algorithm is not a standard treealgorithm, and we do not assign arcs and probabilities as in a tree.

17.4.3 Valuation through the tree

To value the Bermudan swaption we work back in time from the final exercisedate assigning option values at each node in the tree. On the final exercisedate, if we have not already exercised, the option value is its intrinsic valuewhich can be calculated as a closed-form expression from (17.17) and (17.18).To calculate the value of the option at any other node in the tree, suppose

first that we have calculated the value at all later nodes. Let t be the timeof this node and T be the time of the next set of nodes. Let Vt(Xt) be thevalue at this node, Et(Xt) be the ‘discounted future value’ of the option atthis node (if we choose not to exercise), and It(Xt) be the intrinsic value atthe node. Then

Vt(Xt) = max Et(Xt), It(Xt) (17.20)

Et(Xt)

DtTK= EF VT (XT )D

−1TTK

| Xt

=D0TD0TK

e−12 (ψTK−ψT )2ξTEF VT (XT )exp (ψTK − ψT )XT | Xt .

(17.21)

All that remains is to evaluate (17.21), and thus Vt(Xt), efficiently.

17.4.4 Evaluation of expected future value

There are many ways to evaluate an expectation for a univariate randomvariable. There are two important considerations for us. The first is that the

integrand in (17.21), VT (XT )exp (ψTK − ψT )XT , is piecewise C∞, but not

in itself C∞. The problem is (see equation (17.20)) that V takes two distinct

Page 357: Financial derivatives in theory and practice

Bermudan swaptions via Vasicek–Hull–White 333

functional forms, one below an exercise boundary X∗T, the other above it. We

must first estimate the level X∗T from the values of ET (XT) and IT (XT) at

the nodes. This can be done by fitting a polynomial to ET and IT about theexercise point and solving, as illustrated in Figure 17.1.

XT

XT

^

E(XT)

I(XT)

Figure 17.1. Changing functional form in the integration

Having calculated XT, our estimate of X∗T, we now perform the integration

on each side of XT, separately. To do this efficiently, over each interval [x, x +∆], approximate VT(XT) by a polynomial P, of order n, using neighbouringpoints. Then approximate the integral of VT over this interval by insteadcalculating the integral of P analytically (we know the integral of thepolynomial against a Gaussian explicitly; see Appendix 3). This procedureis very closely related to that of Gaussian quadrature (Press et al. (1988)).

One further point on efficiency should be mentioned. This issue does notaffect the asymptotic order of convergence, rather it affects convergence onlythrough the constant term. The point to consider is that we do not justwish to integrate for one value of Xt but for a whole set of values. Also oneof the most expensive calculations in the whole algorithm is to calculate theexponential function which is needed both as part of the integrand and also inthe evaluation of the normal density function which appears in the expressionsfor the integrals of a Gaussian against a polynomial. We do not go into thedetails here, but it is a straightforward matter to exploit the fact that thevalues Xt and XT are equally spaced to considerably reduce the number ofcalls to the exponential function. Effectively as much of the calculation as issimilar for differing nodes is carried out only once and cached.

Page 358: Financial derivatives in theory and practice

334 Short-rate models

17.4.5 Error analysis

There are three (and only three) distinct sources of error within the algorithm.The first, which we have already discussed in Section 17.4.2, is that caused by‘clipping’ the tree. This is of negligible order compared with the other two.All the errors come about as part of the integration routine. Let t < T be

two successive exercise times. Within the algorithm we are aiming to calculatethe function

Et(Xt)

DtTK=D0TD0TK

e−12 (ψTK−ψT )2ξTEF F (XT ) | Xt (17.22)

where F (XT ) = VT (XT )exp((ψTK − ψT )XT ). The expectation in (17.22) canbe expressed as

EF F (XT ) | Xt = EF F1(XT )1XT>X∗T | Xt + EF F2(XT )1XT≤X∗T | Xt

where X∗T is the exercise point at which F is non-smooth and F1 and F2 aresmooth functions (ET and IT in fact). What we actually calculate (exactly)is

EF G1(XT )1XT>XT | Xt + EF G2(XT )1XT≤XT | Xt

where G1 and G2 are our polynomial approximations to F1 and F2 and XTis our estimate of X∗T . It follows that the error, the difference between thesetwo, is

EF (F1 −G1)(XT )1XT>XT | Xt + EF (F2 −G2)(XT )1XT≤XT | Xt+ EF (F1 − F2)(XT )(1XT>X∗T − 1XT>XT ) | Xt .

Since G is a local polynomial approximation of order n to F , it follows thatthe magnitude of the first two terms is O(∆n+1), where ∆ is, as before, thespacing between neighbouring nodes at time T . The remaining term is theexpectation of a function taking the value zero outside an interval of size|XT −X∗T | which is O(∆n+1) if we calculate XT with a polynomial of ordern. Over this interval the integrand F1 − F2 is also O(∆n+1), so the net errorcontribution is O(∆2(n+1)). Finally, the function F is not dependent on ∆and thus the net overall error is O(∆n+1).In terms of the time to run the algorithm, there are K columns of nodes

corresponding to the K decision times, each containing O(m/∆) nodes, wherem is the range covered by the columns. Each node performs a calculationinvolving all the nodes at the next column, so the total number of calculationsis O(m2/∆2), which is thus the order of the time to run the algorithm. Nowwe must increase m as ∆ decreases to ensure that the clipping error is not

Page 359: Financial derivatives in theory and practice

Bermudan swaptions via Vasicek—Hull—White 335

significant. Suppose we increasem like ∆−ε for some ε > 0. The clipping errorwill then decrease faster than any polynomial order in ∆. Thus the run timeis O(∆−2(1−ε)), and the error is o(∆n+1−η), for any η. This finally yields anerror which is o(T−

12 (n+1)+ε) for any ε > 0. This compares with a standard

Vasicek—Hull—White tree where the error is O(T −12 ).

Page 360: Financial derivatives in theory and practice
Page 361: Financial derivatives in theory and practice

18

Market Models18.1 INTRODUCTION

The short-rate models of Chapter 17 have been around for many years. Theyhave the virtue of being easy to understand (although they are not usuallytransparent in their properties) and, in the case of the Vasicek–Hull–Whitemodel, can be very efficient to implement. Where they are less ideal is in theircalibration properties. For example, when pricing a Bermudan swaption it isonly possible to calibrate a Vasicek–Hull–White model, for each swaptionmaturity date, to correctly price one swaption of a given strike (two if wealso fit the mean reversion parameter). But which one should it be? Thatis not at all clear, and given that the Vasicek–Hull–White model yields aGaussian short-rate process whereas the market does not, the precise choicewill in general be important.

Recently, to overcome this calibration problem, a new class of models hasbeen developed, the class of market models. The approach was pioneered byMiltersen et al. (1997) and Brace et al. (1997) in the context of LIBOR-based models and later extended to the swap-rate context by Jamshidian(1997). The essential new and defining feature of these models is that theyare parameterized in terms of standard market rates such as LIBORs andswap rates. Furthermore, it is easy, in the case when option prices are givenby Black’s formula and thus the link between the SDE governing the evolutionof the appropriate market interest rates and the terminal distributions of theserates is clear, to define these models so that they exactly match market prices.

The major breakthrough that came with the introduction of market modelswas a new way of thinking, one which focuses directly on the market interestrates to be modelled. In this chapter we shall present these models. We providethe necessary defining equations and prove that models do exist with theseproperties. Importantly, we do not prove that they fall within the (HJM) class,defined fully in Chapter 8, rather we show that they are numeraire models(N). The extra burden of proving the models to be (HJM) is unnecessary,as discussed in Chapter 8. We shall also prove that the models are strong

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 362: Financial derivatives in theory and practice

338 Market models

Markov, an important property if they are to be used for pricing in practice.Despite all their benefits, the market models defined in this chapter have a

significant drawback. Although the models are strong Markov, the dimensionof the Markov process is the same as the number of LIBORs or swap ratesbeing modelled. So, for example, to price a Bermudan swaption which givesthe right to cancel a ten-year swap paying quarterly coupons, the underlyingMarkov process would be 39-dimensional (there are 39 unknown cashflowsin the swap). It is infeasible, in practice, to implement such a model exceptby simulation or by approximation. Certainly the techniques introduced inChapter 17 could not be employed. We shall meet an alternative class ofmodels in the next chapter that overcomes this problem.

Since they were first introduced in 1995, market models have generatedconsiderable interest and there is already an extensive literature on thetopic. Of particular note are the original articles, mentioned above, anda comprehensive survey article by Rutkowski (2001). For a discussion ofimplementation issues and market models with smiles, see Glasserman andZhao (2000) and Andersen and Andreasen (2000) respectively.

In this chapter we shall describe the two basic models, one for LIBOR,the other for swap rates. These models are ideal for pricing path-dependentLIBOR and swap derivatives for which the price can be obtained bysimulation. Because of the implementation problems, Bermudan swaptionsare not easily treated using market models. We also introduce an alternativeswap-rate model, the reverse swap-rate model. This latter model is intendedfor pricing European swap and LIBOR derivatives, but again prices must beobtained via simulation. We develop each of these models in turn in the nextthree sections, first deriving the form of the SDE they satisfy (which is whatwe need in practice), then establishing that the SDE does admit a (strongMarkov) solution and thus that such a model does indeed exist.

18.2 LIBOR MARKET MODELS

The first market models studied were LIBOR market models. The idea wasto develop a model directly in terms of a set of (contiguous) LIBORs, andin particular to develop a model under which the caplets corresponding tothese LIBORs were all priced using the market standard Black formula. Moregenerally, however, the objective was to (partially) define a term structuremodel via an SDE for the LIBORs.

There are several steps required to define a model in this way, and thesesteps are common to the swap-rate models developed in Sections 18.3 and 18.4below. These are as follows.

(i) Specify an SDE for the forward LIBOR (swap rates in Sections 18.3and 18.4) and determine the necessary relationship between the drift

Page 363: Financial derivatives in theory and practice

LIBOR market models 339

and diffusion terms for any corresponding term structure model to bearbitrage-free.

(ii) Prove that a solution exists to the SDE defined in step (i).(iii) Check that, for this solution, there exists a numeraire pair (N, N) for

which all the numeraire-rebased bonds defined by the SDE (only afinite number at this stage) are martingales under the measure N.

(iv) Demonstrate that this partially defined model can be extended to amodel for the whole term structure in such a way that the extendedmodel also admits a numeraire pair.

Here we shall carry out steps (i)–(iv) for the LIBOR market model. So, letS0 < S1 < . . . < Sn be a sequence of dates and, for i = 1, . . . , n, we define thecorresponding forward LIBORs,

Lit :=

DtSi−1 − DtSi

αiDtSi

. (18.1)

We will develop the model working in the forward measure F correspondingto taking the bond D.Sn

as numeraire, and with this in mind we also make thedefinitions, for 0 ≤ i ≤ n,

Dit :=

DtSi

DtSn

, (18.2)

Πit :=

i∏j=1

(1 + αjL

jt

),

where in the second equation the product over the empty set is unity, i.e.Π0 ≡ 1. It will also be convenient to define Dn+1 ≡ 1 and Ln+1 ≡ 0.

Note immediately from (18.1) and (18.2) that, for i = 0, . . . , n − 1,

Dit = (1 + αi+1L

i+1t )Di+1

t , (18.3)

a relationship that will be used inductively to derive the form of the SDE forthe forward LIBORs, and further that

Dit =

n∏j=i+1

(1 + αjL

jt

)=

Πnt

Πit

. (18.4)

18.2.1 Determining the drift

Suppose that we wish to develop a model for the term structure of interestrates for which, for each i = 1, . . . , n, the forward LIBORs Li solve an SDEof the form

dLit = µi

t(Lt)Litdt + σi

t(Lt)LitdW i

t (18.5)

Page 364: Financial derivatives in theory and practice

340 Market models

where W is a (correlated) Brownian motion with dW it dW

jt = ρijdt and

L = (L1, . . . , Ln). As stated earlier, we work in the forward measure Fcorresponding to taking D·Sn as numeraire. Then, if this model is to be

arbitrage-free, each of the Di must be a martingale under this measure. Thisimposes a relationship between the diffusion coefficients σi and the drifts µi

in the SDE (18.5). Here we derive that relationship. Note that, in this section,we only provide a necessary relationship between the drift and diffusion terms.We do not yet show that a suitable process exists which satisfies (18.5), i.e.we only carry out step (i) described above. Steps (ii)—(iv) are treated in thenext section.So suppose each of the Di is a martingale under the measure F. Then,

applying Ito’s formula to (18.3), we obtain

dDit = (1 + αi+1L

i+1t )dDi+1

t + αi+1Di+1t dLi+1t + αi+1dL

i+1t dDi+1

t . (18.6)

Substituting from (18.5) and equating local martingale parts in (18.6) yields

dDit = (1 + αi+1L

i+1t )dDi+1

t + Di+1t αi+1L

i+1t σi+1t dW i+1

t ,

whence

ΠitdDit = Π

i+1t dDi+1

t +Πi+1t Di+1t

αi+1Li+1t

1 + αi+1Li+1t

σi+1t dW i+1t .

(We have here suppressed the dependence of σt on Lt. We shall do this lateralso, for both σt and µt.) It follows by backward induction, down from i = n,that for i = 0, . . . , n− 1,

ΠitdDit =

n

j=i+1

Πjt Djt

αjLjt

1 + αjLjt

σjt dWjt ,

and thus

dDit = D

it

n

j=i+1

ΠjtDjt

ΠitDit

αjLjt

1 + αjLjt

σjt dWjt

= Dit

n

j=i+1

αjLjt

1 + αjLjt

σjtdWjt . (18.7)

Now we can also equate the finite variation terms in (18.6) to obtain, fori = 0, . . . , n− 1,

αi+1Di+1t µi+1t Li+1t dt+ αi+1σ

i+1t Li+1t dW i+1

t dDi+1t = 0 ,

Page 365: Financial derivatives in theory and practice

LIBOR market models 341

and thus, again by backward induction, it follows that

µit(Lt) = −n

j=i+1

αjLjt

1 + αjLjt

σit(Lt)σjt (Lt)ρij .

Finally, the original SDE (18.5) becomes

dLit = −n

j=i+1

αjLjt

1 + αjLjt

σit(Lt)σjt (Lt)ρij Lit dt+ σit(Lt)L

it dW

it . (18.8)

18.2.2 Existence of a consistent arbitrage-free term structuremodel

The results of Section 18.2.1 are all that we need in practice when usinga market model. However, we need to check that the model given there isconsistent with a full arbitrage-free term structure model (steps (ii)—(iv) asdefined above), and this we now do.To begin, we must show that the SDE (18.8) for the forward LIBORs has a

solution. In general it will not since the SDE given there is of a very generalform. However, in the important case when σi is bounded over any timeinterval [0, t] a solution does exist. This includes the special case of mostpractical relevance (and which leads to Black’s formula for option prices)when σit(Lt) = σit, a locally bounded function of time which does not dependon the LIBORs Li, i = 1, . . . , n.Establishing existence of a solution to the SDE in the above cases follows

from the general theory for SDEs as developed in Chapter 6.

Theorem 18.1 Suppose that, for i = 1, . . . , n, the functions σ i : Rn×R+ →R are bounded on any time interval [0, t]. Then strong existence and pathwiseuniqueness hold for the SDE (18.8). Furthermore, the solution process L is astrong Markov process.

Proof: First consider the modified SDE

dLit = µit(Lt)L

it dt+ σit(Lt)L

it dW

it (18.9)

whereLi = Li1Li>0,

µit(L) = µit(L),

σit(L) = σit(L)

and L = (L1, . . . , Ln). For any fixed t > 0, this modified SDE is globallyLipschitz over the time interval [0, t]. This follows since both µ and σ are

Page 366: Financial derivatives in theory and practice

342 Market models

bounded on [0, t]. It thus follows from Theorem 6.27 that strong existence andpathwise uniqueness hold for the SDE (σ, µ). (Note that all the components ofthe driving d-dimensional Brownian motion in Theorem 6.27 are independent,whereas W in (18.9) is a correlated Brownian motion. Casting (18.9) in thatform is a straightforward matter.)

Let L now be a solution process for the SDE (σ, µ). By applying Ito’s

formula to obtain an SDE for log L it is clear that the process L can bewritten in the form

Lit = Li0 exp

t

0

σiu(Lu) dWiu +

t

0

µiu(Lu)− 12 (σ

iu(Lu))

2 du .

Since σ and µ are bounded, the process L is a.s. strictly positive and thus isalso a solution for the original SDE (σ, µ) (which only differs from (σ, µ) whenthe solution process takes negative values).

The strong Markov property follows immediately (for L and thus for L)from Theorem 6.36.We have just shown that there is indeed a process L satisfying the SDE

(18.8). This was necessary for the model to be arbitrage-free. Of course itis not sufficient. We must additionally check that all D·Sn-rebased assets aremartingales in the measure F (step (iii)). But, from (18.7), Di can be writtenas a Dolean exponential,

Dit = D

i0E

t

0

n

j=i+1

αjLju

1 + αjLju

σjudWju .

Observing that the exponentiated term has bounded quadratic variation overany time interval [0, t], it follows from Novikov’s condition (Theorem 5.16)

that Di is indeed a true martingale.It remains to show that the economy can be extended in such a way that

all numeraire-rebased assets for the whole economy are martingales. This isstraightforward. Up until time Sn we have taken D·Sn as numeraire but havenot defined its law fully as yet. What we have done is define the law of theprocesses Li. However, since DSiSi ≡ 1, we can recover the functional formDSiSn(LSi) from (18.4). Extend the model as follows. For any Si ≤ t ≤ Si+1,define

DtSn(Lt) :=Si+1 − tSi+1 − SiDSiSn(Lt) +

t− SiSi+1 − SiDSi+1Sn(Lt) .

This is a continuous semimartingale and its law can be found from that of theprocess L and the defining functional forms. We use this to define a numerairemodel (as defined in Chapter 8): for 0 ≤ t ≤ S ≤ Sn, define

DtS := DtSn(Lt)EF D−1SSn

(LS)|Ft .

Page 367: Financial derivatives in theory and practice

Regular swap-market models 343

It is readily seen that this is consistent with the assumptions made above. Tocomplete the model, for t ≤ Sn < S take

DtS = DtSnD0SD0Sn

,

and for Sn < t < S

DtS =D0SD0t

.

For the corresponding numeraire, take D ·Sn until time Sn and then the unitrolling numeraire from then on.

18.2.3 Example application

One important exotic product which is relatively popular is the limit cap.This product is so closely related to the underlying vanilla cap that accuratecalibration is essential, and thus the LIBOR market model is ideally suited tovaluing this product.A limit cap is identical to a vanilla cap except there is a limit imposed on

the total number of caplets that can be exercised. As per a cap, the limitcap thus comprises a number of payment dates Si, i = 1, 2, . . . , n, and a setof strikes Ki, i = 1, 2, . . . , n. There is also a limit number m. The holderof a limit cap receives the first m caplets that set in the money. Note thatthe option holder does not have the right to refuse to exercise any particularcaplet that sets in the money (perhaps because it is only just in the money)and thus the product is path-dependent but not American. This means thatpricing can be carried out via simulation of the LIBOR process and optionpayout, and the high dimensionality of the Markov LIBOR process does notcause major difficulties.One could choose to implement a multi-factor LIBOR market model to

price this product. Because for each reset date there is only one market rateused to determine the option payoff, in practice we ourselves have alwaysrestricted to a single-factor model (see the discussion in Chapter 9). Onecan, by introducing a mean reversion parameter, as discussed in Section 19.7,reflect the relevant market distributions even within the one-factor setting.

18.3 REGULAR SWAP-MARKET MODELS

For path-dependent products based on swaps and swap rates, such as barrierswaptions (discussed in Section 18.3.3), we would like to develop swap-ratemodels analogous to the LIBOR models just introduced. The procedure fordoing this is identical to that for LIBOR models, with only the notation andalgebra being more difficult.

Page 368: Financial derivatives in theory and practice

344 Market models

We make the following definitions relating to a series of swaps all maturingon the date Sn. Taking for convenience α0 = 1, we define, for i = 0, . . . , n,

P it :=n

j=i

αjDtSj ,

P it :=P itDtSn

=

n

j=i

αjDtSjDtSn

, (18.10)

and, for 1 ≤ i ≤ n,

yit :=DtSi−1 −DtSn

P it, (18.11)

Ψit :=

i

j=1

1 + αjyj+1t .

We also take Pn+1 ≡ yn+1 ≡ 0 and Ψ0 ≡ 1, Ψ−1 ≡ (Ψ1)−1. It follows from(18.10) and (18.11) that, for i = 0, . . . , n,

P it = αi + (1 + αiyi+1t )P i+1t , (18.12)

whence, for i = 0, . . . , n,

Ψi−1t P it = αiΨi−1t +ΨitP

i+1t .

It then follows inductively that

P it =

nj=i αjΨ

j−1t

Ψi−1t

. (18.13)

We follow the same four steps in developing the swap-market model as wedid in Section 18.2 for the LIBOR market model, starting with the derivationof the relationship between the diffusion and drift of the market swap rates.

18.3.1 Determining the drift

We want to develop a model for the term structure of interest rates for which,for each i = 1, . . . , n, the forward par swap rates y i solve an SDE of the form

dyit = µit(yt)y

it dt+ σit(yt)y

it dW

it , (18.14)

where W is a (correlated) Brownian motion with dW it dW

jt = ρijdt and

y = (y1, . . . , yn). We again work in the forward measure F corresponding

Page 369: Financial derivatives in theory and practice

Regular swap-market models 345

to taking D·Sn as numeraire. Then, if the model is to be arbitrage-free, eachof the P i, i = 0, . . . , n, must be a martingale under this measure. This againimposes a relationship between the diffusion coefficients σi and the drifts µi

in the SDE (18.14), which we now derive.

Suppose each of the P i is a martingale under the measure F and apply Ito’sformula to (18.12) to obtain

dP it = (1 + αiyi+1t )dP i+1t + αiP

i+1t dyi+1t + αidy

i+1t dP i+1t . (18.15)

Suppressing the dependence of σi (and later of µi) on the vector y, equatinglocal martingale parts yields

dP it = (1 + αiyi+1t )dP i+1t + P i+1t αiy

i+1t σi+1t dW i+1

t ,

whence

Ψi−1t dP it = ΨitdP

i+1t +ΨitP

i+1t

αiyi+1t

1 + αiyi+1t

σi+1t dW i+1t .

It now follows by backward induction, down from i = n, that for i =0, . . . , n− 1,

Ψi−1t dP it =

n

j=i+1

Ψj−1t P jtαj−1y

jt

1 + αj−1yjt

σjtdWjt ,

and so

dP it = Pit

n

j=i+1

Ψj−1t P jt

Ψi−1t P it

αj−1yjt

1 + αj−1yjt

σjtdWjt . (18.16)

Now we can also equate the finite variation terms in (18.15) to obtain, fori = 0, . . . , n− 1,

αiPi+1t µi+1t yi+1t dt+ αiσ

i+1t yi+1t dW i+1

t dP i+1t = 0 ,

and thus, for i = 1, . . . , n,

µit(yt) = −n

j=i+1

Ψj−1t P jt

Ψi−1t P it

αj−1yjt

1 + αj−1yjt

σit(yt)σjt (yt)ρij .

Finally, the original SDE (18.14) becomes

dyit = −n

j=i+1

Ψj−1t P jt

Ψi−1t P it

αj−1yjt

1 + αj−1yjt

σit(yt)σjt (yt)ρij yitdt+ σit(yt)y

it dW

it .

(18.17)

Page 370: Financial derivatives in theory and practice

346 Market models

18.3.2 Existence of a consistent arbitrage-free term structuremodel

It remains for us to prove that the SDE (18.17) has a solution and that thereis a full arbitrage-free term structure model consistent with this SDE. Thisis precisely steps (ii)—(iv) of Section 18.2. Step (ii) is carried out as for theLIBOR market model and leads to the following theorem. Note that thisagain includes the important special case when σ it(yt) = σit, a locally boundeddeterministic function of time.

Theorem 18.2 Suppose that, for i = 1, . . . , n, the functions σ i : Rn×R+ →R are bounded on any time interval [0, t]. Then strong existence and pathwiseuniqueness hold for the SDE (18.17). Furthermore, the solution process y is astrong Markov process.

Proof: The proof is identical to that for Theorem 18.1 once we have establishedthat the functions µi are bounded over the time interval [0, t] when y i ≥ 0for all i. This is immediate once we observe, from (18.13), that Ψ i−1

t Pt isdecreasing in i.Now that we know that (18.17) admits a solution for suitable σ i, we must

carry out step (iii) and show that for this solution and for each i = 0, . . . , n,the processes D·Si/D·Sn are martingales in the measure F. Equivalently, weneed to show that the processes P i are martingales. That they are localmartingales follows from the form of (18.16). But, from (18.16), P i can bewritten as a Dolean exponential,

P it = Pi0E

t

0

n

j=i+1

Ψj−1u P ju

Ψi−1u P iu

αj−1yju1 + αj−1y

ju

σju dWju .

The integrand is bounded over any finite time interval [0, t] and thus P i is amartingale by Novikov’s condition. The model is completed by carrying outstep (iv) and extending to all bonds and to the time interval [0,∞). This laststep is precisely the same as for the LIBOR model of Section 18.2.

18.3.3 Example application

The most important swap-based multi-temporal product in the market isundoubtedly the Bermudan swaption. Whereas the models described inthis section are theoretically well suited to pricing this product, the highdimensionality of the underlying Markov process, combined with the Americannature of the Bermudan swaption, means it is very difficult to use marketswaption models for the Bermudan.One product for which a market swap-rate model is suited is the discrete

barrier swaption. This product is similar to a vanilla swap with onemodification. If, on any of the swap reset dates Ti = Si−1, the par swap

Page 371: Financial derivatives in theory and practice

Reverse swap-market models 347

rate yi is above a given barrier level Hi then the rest of the swap is cancelledand no further cashflows take place. This is an up and out barrier swaption.There are also the obvious variants, the up and in, down and out, and downand in swaps.The discrete barrier swaption is path-dependent and can be valued within

a swap-market model using simulation. One factor is a natural choice and amean reversion parameter (see Section 19.7) should be included to create thedesired joint distributions for the underlying swap rates.

18.4 REVERSE SWAP-MARKET MODELS

The model described in Section 18.3 above is suitable for path-dependentproducts such as barrier swaptions where the exotic product to be priced isrelated to a collection of swap rates which set on different dates but which havea common maturity date. There are other situations where several swaps needto be modelled and they have in common not their maturity date but theirstart date. One example of this was the spread option which was discussed inSection 16.3.1. Here we return to the problem of pricing European derivativesand present a variant of the model described in Section 18.3 which is suitablefor such products. As will become clear, most of the procedure for developingthe model is similar to that in Section 18.3, with the exception of step (iv)in the construction which uses techniques developed for terminal swap-ratemodels.Now we make the following definitions relating to a series of swaps all

starting on date T = S0 but with maturity dates S1, . . . , Sn. Note that thesediffer from those of Section 18.3. We define, for i = 1, . . . , n,

P it :=i

j=1

αjDtSj ,

yit :=DtT −DtSi

P it, (18.18)

P it :=P itDtT

=

i

j=1

αjDtSjDtT

(18.19)

and

Πit :=i

j=1

1 + αjyjt .

It follows from (18.18) and (18.19) that, for i = 1, . . . , n,

(1 + αiyit)P

it = αi + P

i−1t , (18.20)

Page 372: Financial derivatives in theory and practice

348 Market models

ΠitPit = αiΠ

i−1t +Πi−1t P i−1t ,

where P 0 ≡ 0 and Π0 ≡ 1. Induction now yields

P it =

ij=1 αjΠ

jt

Πit. (18.21)

18.4.1 Determining the drift

Once more, we wish to develop a model for the term structure of interest ratesfor which, for each i = 1, . . . , n, the forward par swap rates y i solve an SDEof the form

dyit = µit(yt)y

it dt+ σit(yt)y

it dW

it (18.22)

where, as before,W is a (correlated) Brownian motion with dW it dW

jt = ρijdt.

For this model we work in the measure F corresponding to taking D ·T asnumeraire. Applying Ito’s formula to (18.20) yields, for i = 1, . . . , n,

dP i−1t = (1 + αiyit)dP

it + αiP

it dy

it + αidy

itdP

it . (18.23)

Since we require each P i to be a martingale under the measure F, we cansubstitute (18.22) into (18.23) and equate local martingale parts to obtain

dP i−1t = (1 + αiyit)dP

it + P

itαiy

itσit dW

it ,

Πi−1t dP i−1t = ΠitdPit +Π

itP

it

αiyit

1 + αiyitσitdW

it ,

and thus, inductively,

ΠitdPit = −

i

j=1

Πjt Pjt

αjyjt

1 + αjyjt

σjtdWjt ,

dP it = −P iti

j=1

Πjt Pjt

ΠitPit

αjyjt

1 + αjyjt

σjt dWjt .

Equating the finite variation terms in (18.23) now yields

P itµit(yt) dt+ σit(yt) dW

it dP

it = 0 ,

and thus

µit(yt) = −i

j=1

Πjt Pjt

ΠitPit

αjyjt

1 + αjyjt

σit(yt)σjt (yt)ρij .

Page 373: Financial derivatives in theory and practice

Reverse swap-market models 349

Thus the SDE (18.22) can finally be written in the form

dyit = −i

j=1

Πjt Pjt

ΠitPit

αjyjt

1 + αjyjt

σit(yt)σjt (yt)ρij yit dt+ σit(yt)y

it dW

it .

(18.24)

18.4.2 Existence of a consistent arbitrage-free term structuremodel

Again we must complete the model by carrying out steps (ii)—(iv) of Section

18.2. Observing from (18.21) that ΠitPit is increasing in i, exactly the same

proof as used for Theorem 18.2 yields the analogous result here.

Theorem 18.3 Suppose that, for i = 1, . . . , n, the functions σ i : Rn×R+ →R are bounded on any time interval [0, t]. Then strong existence and pathwiseuniqueness hold for the SDE (18.24). Furthermore, the solution process y is astrong Markov process.

The proof of this result also establishes that the solution process y hasall its components strictly positive. It then follows from the definition of y it(equation (18.18)) that, for each i, DtSi < DtT and thus P

it <

ij=1 αj . So

the continuous local martingale P i is bounded, hence a true martingale andwe have completed step (iii).Thus far we have only defined the dynamics of a finite number of bonds

and so we must extend the model. There is, as was the case for the terminalswap-rate models of Chapter 13, no unique way to extend the economy, andin practice we do not carry out this extension, we just need to know that itcan be done. We can do this in a very similar way to that used in Section13.4.2 to extend a terminal swap-rate model.To carry out this extension, observe that at time T the discount factors

DTSi can all be written as a function of the strong Markov process y, indeedas a function of yT (see equation (18.21)). For intermediate maturities S,define DTS by

DTS =Si+1 − SSi+1 − SiDTSi +

S − SiSi+1 − SiDTSi+1

where Si ≤ S ≤ Si+1, and for S < T define the ‘pseudo’-discount factors

DTS = exp(−CSy1T ) for CS being the unique solution toD0SD0T

= EF exp(−CSy1T ) .

We then define discount factors at any earlier time t ≤ T from the (required)martingale property,

DtS =DtS/DtTDtt/DtT

=EF[DTS |Ft]EF[DTt|Ft] .

Page 374: Financial derivatives in theory and practice

350 Market models

Finally, we extend the model beyond T in a deterministic manner by setting

DtS =DTt(yT )

DTS(yT ),

for T ≤ t ≤ S.

18.4.3 Example application

As mentioned above, this model is designed for pricing European swap-ratederivatives. One important application is the single-currency spread optionwhich pays an amount (αy1T − βy2T −K)+ for two par swap rates y1 and y2and for constants α,β and K.This product needs a two-factor model because there are two market rates

setting on the same date required to determine the payoff. Recall that inChapter 16 we priced spread options based on the spread between two marketrates in different currencies but could not work in one currency since there weonly developed terminal swap-rate models with one factor per currency.The path-dependent nature of the SDE (18.24) means that pricing must be

done by simulation, even though the underlying product is European.

Page 375: Financial derivatives in theory and practice

19

Markov-Functional Modelling

19.1 INTRODUCTION

In this final chapter we consider models that can fit the observed prices ofliquid instruments in a similar fashion to the market models, but which alsohave the advantage that derivative prices can be calculated just as efficientlyas in the Vasicek–Hull–White model, the most tractable short-rate model. Toachieve this we consider the general class of Markov-functional interest ratemodels, the defining characteristic of which is that pure discount bond pricesare at any time a function of some low-dimensional process which is Markovianin some martingale measure. This ensures that implementation is efficientsince it is only necessary to track the driving Markov process, something thatis particularly important for American products such as Bermudan swaptions.Market models do not possess this property (for a low-dimensional Markovprocess) and this is the impediment to their efficient implementation. Thefreedom to choose the functional form is what permits accurate calibration ofMarkov-functional models to relevant market prices, a property not possessedby short-rate models. The remaining freedom to specify the law of the drivingMarkov process means the model can be formulated to capture well thosefeatures of the real market relevant to a given exotic product.

19.2 MARKOV-FUNCTIONAL MODELS

In Chapters 13–16 we developed a range of models to price exotic Europeanderivatives. We did this by focusing on the specific product to be pricedand on those features of the market relevant to that product. Becausewe were studying European products we were able to focus on marketdistributions at one particular time. We explicitly modelled the distributionof one appropriate market rate and then fitted a suitable functional form tothe full discount curve on that date. What resulted was a model with thefollowing important properties:

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 376: Financial derivatives in theory and practice

352 Markov-functional modelling

(a) it was arbitrage-free;(b) it was well calibrated, correctly pricing relevant liquid instruments

without being overfitted;(c) it was realistic and transparent in its properties;(d) it allowed an efficient implementation.

Here we develop models which also have the four properties above but whichare specifically designed for the pricing of multi-temporal derivatives such asBermudan swaptions. In particular, we shall develop models which satisfy thefollowing definition.

Definition 19.1 An interest rate model is said to be Markov-functional ifthere exists some numeraire pair (N,N) and some process x such that:

(P.i) the process x is a (time-inhomogeneous) Markov process under themeasure N;

(P.ii) the pure discount bond prices are of the form

DtS = DtS(xt), 0 ≤ t ≤ ∂S ≤ S ,

for some boundary curve ∂S : [0, ∂∗]→ [0, ∂∗] and some constant ∂∗;

(P.iii) the numeraire N , itself a price process, is of the form

Nt = Nt(xt) 0 ≤ t ≤ ∂∗ .

Remark 19.2: The constant ∂∗ and the boundary curve ∂S : [0, ∂∗] → [0, ∂∗]are introduced so that the model does not need to be defined over the wholetime domain 0 ≤ t ≤ S <∞. In the applications we have encountered it hasalways been of the form

∂S =S, if S ≤ T,T, if S > T,

(19.1)

for some constant T .

All the models discussed in Chapters 13—16 were Markov-functional. Infact all models of practical use are Markov-functional, including the short-rate models of Chapter 17 and the market models of Chapter 18. However,each of these fails to some degree to satisfy properties (a)—(d). By focusingon the Markov-functional property we will be able to develop models whichsatisfy all four criteria and can be used to price multi-temporal products.To satisfy property (d) and to be able to implement a Markov-functional

model in practice the driving Markov process must be of dimension one or atmost two. This is an ambitious requirement given that the product to whichthe model is being applied is multi-temporal. The payoff, and thus also the

Page 377: Financial derivatives in theory and practice

Markov-functional models 353

valuation, of any multi-temporal derivative depends on (the joint distributionof) a set of forward swap rates or forward LIBORs. The model must capture(certain aspects of) the joint distribution of these rates.Suppose we are aiming to price a product whose payoff depends (primarily)

on the swap rates (or LIBORs) y1, . . . , yn on their respective setting datesT1, . . . , Tn. We assume, for now, that for each distinct setting date Ti thereis only one associated swap rate y i relevant to the product. (This, therefore,explicitly excludes products such as Bermudan spread options for which a two-dimensional driving Markov process would be appropriate. We discuss how todevelop a Markov-functional model for products such as this in Section 19.5.)The question arises as to whether it is possible to specify a Markov-functionalmodel for which the driving Markov process is only one-dimensional but whichnonetheless accurately captures the important properties of the joint distri-bution of the yi, i = 1, . . . , n. In particular:

(Q1) Given a one-dimensional Markov process x and a set of market swaprates (or LIBORs), can we find functional forms DtS(xt) such that themodel accurately reflects the prices of other vanilla market instruments(i.e. swaptions) with payoffs dependent on each of the yiTi? Equivalently(as we saw in Section 13.2.2), can we find functional forms such that themarginal distributions of the yiTi in the associated swaption measure S

i

agree with those implied from the market?(Q2) If we can do this and capture the relevant marginal distributions, is there

still enough freedom in the model so that the full joint distribution ofthe vector yiTi : 1 ≤ i ≤ n is adequately captured, or will satisfying(Q1) lead to a model that is overfitted and thus a poor model forthe application to hand? We saw a case of the latter problem whendiscussing irregular swaptions in Section 15.4.

The answer to the first question is yes, and Section 19.3 is devoted toshowing how to do this.Turning to the second question, for all the applications that we shall

consider it turns out that, given the Markov process x, the functional formsyiTi(xTi ) are uniquely determined by the associated swaption prices. Thus theonly way to affect the joint distribution of these rates is then through thechoice of the Markov process x. But x is arbitrary and so there is considerablefreedom remaining in the selection of x to affect this joint distribution. Thereis enough flexibility remaining even if we restrict x to be a Gaussian process,one for which numerical implementation is particularly efficient (as we sawwhen pricing Bermudans with the Vasicek—Hull—White model in Section 17.4).In order to be able to choose x to give a good model we must have someunderstanding of how the dynamics of x affects the relevant joint distributions.Much insight can be gained by a close examination of the Vasicek—Hull—Whitemodel because of its tractability. We do this in Section 19.7.

Page 378: Financial derivatives in theory and practice

354 Markov-functional modelling

19.3 FITTING A ONE-DIMENSIONAL MARKOV-FUNCTIONAL MODEL TO SWAPTION PRICES

Throughout this section and the next we shall restrict attention to developingand understanding Markov-functional models for which the driving Markovprocess is one-dimensional. We extend to the more general case in Section19.5. We shall suppose that we have already selected the martingale measureN in which to work and have specified the driving Markov process x. Weconsider the problem of how to choose the functional forms for the numeraireand discount factors so that the model calibrates accurately to the prices of aset of swaptions, those corresponding to the rates y1, . . . , yn setting on datesT1 < . . . < Tn.Defining the payment dates for the swap (FRA in the case when yi is a

LIBOR) associated with the rate yi by Sij , j = 1, 2, . . . ,mi, we assume in the

discussions below that, for all i, j, either S ij > Tn or Sij = Tk, for some k > i.

The assumption makes the ideas below easier to develop but is not strictlynecessary for the results that follow. It does hold for the most importantproducts encountered in practice, such as a Bermudan swaption associatedwith a cancellable swap. If the assumption does not hold one can alwaysmake it hold by introducing auxillary swap rates yk as necessary.To specify the model fully we must define the functional forms DtS(xt) for

all 0 ≤ t ≤ S ≤ Snmn, 0 ≤ t ≤ Tn, and also the functional form Nt(xt).

Note from Definition 19.1 that, given the law of the process x under N, tocompletely define a general Markov-functional model it is sufficient to specify:

(P.ii ) the functional form of the discount factors on the boundary ∂S , i.e.D∂SS(x∂S ) for S ∈ [0, ∂∗];

(P.iii ) the functional form of the numeraire Nt(xt) for 0 ≤ t ≤ ∂∗.

That is, we do not need to explicitly specify the functional form of discountfactors on the interior of the region bounded by ∂S . This follows becausethese interior discount factors can be recovered via the martingale propertyfor numeraire-rebased assets under N,

DtS(xt) = Nt(xt)END∂SS(x∂S )

N∂S (x∂S )Ft . (19.2)

We shall always consider Markov-functional models for which the boundaryis as given by (19.1). In this case the terminal time is the last swap start dateTn. For t < Tn we know the form of the discount factors on the boundary,Dtt(xt) ≡ 1. Thus, all we need to know is the functional form of discountfactors at Tn, DTnS(xTn), S ≥ Tn, and the functional form of the numerairefor t ≤ Tn, Nt(xt).Before deriving these functional forms, we make one further important

observation. For the multi-temporal product in question the value depends

Page 379: Financial derivatives in theory and practice

Fitting a one-dimensional model to swaption prices 355

only on discount factors at the times T1, . . . , Tn and for maturities Sij , i =

1, . . . , n, j = 1, . . . ,mi. Thus we are interested in the functional forms onlyon the grid shown in Figure 19.1.

T1 T2 Tn

(Tn – 1, S1 )n (Tn , S1 )

n

(Tn – 1, S1 )n –1

Figure 19.1 Model gridpoints

Since we need only specify the model on the boundary, it follows that allwe need to define to recover the grid are the functional forms DTnSnj

(xTn),

j = 1, . . . ,mn, and the functional form NTi(xTi), 1 ≤ i ≤ n.

19.3.1 Deriving the numeraire on a grid

We now show how to construct these functional forms inductively in such away that the resulting model correctly prices options on the swaps associatedwith the rates yi. We work backwards from Tn to T1 constructing thefunctional forms on the grid of Figure 19.1. Completing the specification ofthe model at other times is not necessary for the application. However, justas was the case for our discussion of terminal swap-rate models and marketmodels, we need to know that it can be done in such a way that the resultingmodel is arbitrage-free. We do this in Section 19.3.2.As a first step we must specify the functional forms DTnSnj

(xTn), j =

1, . . . ,mn, to be consistent with swaptions associated with the swap rate

Page 380: Financial derivatives in theory and practice

356 Markov-functional modelling

yn. We have seen how to do this in Chapter 13 in our discussion of terminalswap-rate models so we shall assume we have already carried out this step.It remains to specify the functional form of the numeraire at the times

T1, . . . , Tn. We will make two further assumptions:

(A.i) the choice of numeraire is such that NTn(xTn) can be inferred from thefunctional forms of the discount factors at Tn;

(A.ii) the ith forward par swap rate at time Ti, yiTi, is a monotonic increasing

function of the variable xTi .

The latter assumption is natural if x is intended to capture the overall levelof interest rates.In the remainder of this subsection we show how market prices of the

calibrating vanilla swaptions can be used to imply, numerically at least, thefunctional forms NTi(xTi ) for i = 1, . . . , n − 1. Equivalent to calibrating tovanilla swaption prices is calibrating to the inferred market prices of PVBP-digital swaptions. A PVBP-digital swaption corresponding to yi having strikeK has payoff at time Ti of

V iTi(K) = PiTi1yiTi>K

,

where P i is the PVBP of the swap corresponding to the swap rate y i. Its valueat time zero, applying the usual valuation formula (Corollary 7.34), is givenby

V i0 (K) = N0(x0)EN PiTi(xTi)1yiTi (xTi )>K

(19.3)

where

P iTi(xTi) =P iTi(xTi)

NTi(xTi ).

That knowledge of V i0 (K) for all K is equivalent to knowlege of the vanillaswaption prices V i0 (K), again for all K, follows as in Section 13.2.2 from the

identity V i0 (K) = − ∂∂KV

i0 (K).

To determine the functional forms of the numeraire NTi(xTi) we work backiteratively from the terminal time Tn. Consider the ith step in this procedure.Assume that NTk(xTk), k = i + 1, . . . , n, have already been determined. Wecan also assume that

DTiS(xTi ) :=DTiS(xTi)

NTi(xTi)

is known for relevant grid-times S > Ti, having been determined using (19.2)and the known (conditional) distributions of xTk , k = i, . . . , n. Note that this

implies that P iTi is also known.

Page 381: Financial derivatives in theory and practice

Fitting a one-dimensional model to swaption prices 357

Now consider yiTi , which can be written as

yiTi =DTiTi −DTiSimi

P iTi

=N−1Ti −DTiSimi

N−1TiP iTiN

−1Ti

. (19.4)

Rearranging equation (19.4), we have that

NTi(xTi ) =1

P iTi(xTi )yiTi(xTi ) + DTiSimi

(xTi ). (19.5)

Thus to determine NTi(xTi ) it is sufficient to find the functional form yiTi(xTi).

By assumption (A.ii) there exists a unique value of K, say K i(x∗), suchthat the set identity

xTi > x∗ = yiTi > Ki(x∗) (19.6)

holds. Now define

J i0(x∗) = N0(x0)EN P iTi(xTi)1xTi>x∗ . (19.7)

For any given x∗ we can calculate the value of J i0(x∗) using the known

distribution of xTi under N. Furthermore, using market prices we can thenfind the value of K such that

J i0(x∗) = V i0 (K) . (19.8)

Comparing (19.3) and (19.7), we see that the value of K satisfying (19.8) ispreciselyK i(x∗). Clearly, from (19.6), knowingK i(x∗) for any x∗ is equivalentto knowing the functional form yiTi(xTi ), and we are done.It is common market practice to use Black’s formula to determine the

swaption prices V i0 (K). Observe, however, that the techniques here apply moregenerally. In particular, if the currency concerned is one for which there is alarge volatility skew, meaning the volatility used as input to Black’s formulais highly dependent on the level of the strike K, these techniques can stillbe applied. This is particularly important in currencies such as yen in whichit is not reasonable to model rates via a log-normal process. The ability ofMarkov-functional models to ‘adapt’ to all different markets is one of theirmajor strengths.

Page 382: Financial derivatives in theory and practice

358 Markov-functional modelling

19.3.2 Existence of a consistent arbitrage-free term structuremodel

As we did for the terminal swap-rate models in Chapter 13, and for the marketmodels in Chapter 18, we need to ensure that there exists a full arbitrage-free term structure model consistent with this Markov-functional model. Theapproach we adopt here is very similar to those used earlier.

What we have already done in Section 19.3.1 is to derive a functional formfor the numeraire N at the grid-times Ti. By definition each of the bonds D·Ti

when divided by the numeraire is a martingale,

DtTi

Nt:= EN

[DTi

Ti

NTi

∣∣∣∣Ft

].

It remains to show that we can develop a model such that this also holds formaturities T /∈ T1, . . . , Tn. We extend to other maturities as follows. For anyt ≤ Tn, define

Nt(xt) :=Ti+1 − t

Ti+1 − TiNTi

(xt) +t − Ti

Ti+1 − TiNTi+1(xt), (19.9)

where T0 := 0 and Ti ≤ t ≤ Ti+1. This is a continuous semimartingale and itslaw can be found from that of the process x and the defining functional formsNTi

(x). Use this to define a numeraire model (as introduced in Chapter 8) asfollows. For 0 ≤ t ≤ S ≤ Tn define

DtS(xt) := Nt(xt)EN[N−1S (xS)|Ft],

and for 0 ≤ t ≤ Tn ≤ S define

DtS(xt) := Nt(xt)EN

[DTn

S(xTn)

NTn(xTn

)

∣∣∣∣Ft

].

It is readily seen that this is consistent with the assumptions made above.To complete the model, extend the model in a deterministic fashion from

time Tn:

DtS =DTn

t(xTn)

DTnS(xTn

),

for Tn < t ≤ S.For the corresponding numeraire, take N until time Sn and then the unit

rolling numeraire from then on.

Remark 19.3: Note we could also have taken

N−1t (xt) :=

D0T − D0Ti+1

D0Ti− D0Ti+1

N−1Ti

(xt) +D0Ti

− D0T

D0Ti− D0Ti+1

N−1Ti+1

(xt)

instead of (19.9). If we wanted to actually build such a general model thislatter interpolation is more convenient since it is N−1 that appears in valu-ation formulae, and it is also consistent with a general initial discount curveD0t : t ≥ 0.

Page 383: Financial derivatives in theory and practice

Example models 359

19.4 EXAMPLE MODELS

In this section we introduce two example models, both one-dimensional,which can be used to price LIBOR- and swap-based interest rate derivativesrespectively.

19.4.1 LIBOR model

The model we now describe, the LIBOR Markov-functional model, is designedto price path-dependent and American products which depend explicitly ona contiguous set of forward LIBORs. We denote this set of forward LIBORsby Li for i = 1, 2, . . . , n. Denote by Ti the start of the ith LIBOR period,by Si the end, and let αi = Si − Ti be the corresponding accrual factor.So, in particular, Si = Ti+1 for i = 1, . . . , n − 1 and, using the notation ofChapter 10, Li

t = Lt[Ti, Si]. Further, write Tn+1 := Sn.An example of the type of product we might wish to price with this model is

the flexible cap. A flexible cap is defined by the sequence Li, a correspondingset of strikes Ki, and a limit number m. The holder of a flexible cap has theright to exercise (or not, as he may decide) up to a maximum of m caplets.This contrasts with the limit cap, introduced in Chapter 18, for which theoption holder was forced to exercise any caplet setting in the money, up toa maximum of m. It is the added American feature of the flexible cap whichmakes it hard to price using the market models of Chapter 18.

As described in Sections 19.2 and 19.3, a Markov-functional model can bespecified by properties (P.i), (P.ii′) and (P.iii′). We work in the swaptionmeasure S

n corresponding to the numeraire D·Sn. Here we will be consistent

with Black’s formula for caplets on Ln if we assume that Ln is a log-normalmartingale under S

n, i.e.dLn

t = σnt Ln

t dWt (19.10)

where W is a standard Brownian motion under Sn and σn is some determin-

istic function. We will often take σnt = σeat, for some σ > 0 and some mean

reversion parameter a, for reasons explained in Section 19.7. Note, however,that this form is too simplistic to use in practice as it does not dynamicallyfit an appropriate implied market correlation structure.

It follows from (19.10) that we may write

Lnt = Ln

0 exp(−1

2

∫ t

0(σn

u)2du + xt

),

where x, a deterministic time-change of a Brownian motion, satisfies

dxt = σnt dWt. (19.11)

We take x as the driving Markov process for our model, and this completesthe specification of (P.i).

Page 384: Financial derivatives in theory and practice

360 Markov-functional modelling

The boundary curve ∂S for this problem is exactly that of (19.1). For thisapplication the only functional forms needed on the boundary are DTiTi(xTi)for i = 1, 2, . . . , n, trivially the unit map, and DTnSn(xTn). This latter formfollows from the relationship

DTnSn =1

1 + αnLnTn,

which yields

DTnSn =1

1 + αnLn0 exp(− 12 Tn0(σnu )

2 du+ xTn).

This completes (P.ii ).It remains to find the functional form NTi(xTi), in our case DTiSn(xTi),

and for this we apply the techniques of Section 19.3.1. Take as calibratinginstruments the caplets on the forward LIBORs Li. The value of the ithcorresponding digital caplet of strike K is given by

V i0 (K) = D0Sn(x0)ESnDTiTi+1(xTi)

DTiSn(xTi )1Li

Ti(xTi)>K .

If we assume the market value is given by Black’s formula with volatility σ i,the price at time zero for this digital caplet is

V i0 (K) = D0Ti+1(x0)Φ(di2), (19.12)

where

di2 =log(Li0/K)

σi√Ti

− 12 σ

i Ti

and Φ denotes the cumulative normal distribution function.To determine the functional form DTiSn(xTi) for i < n, we proceed as

in Section 19.3.1. Suppose we choose some x∗ ∈ R. Evaluate by numericalintegration

J i0(x∗) = D0Sn(x0)ESn

DTiTi+1(xTi )

DTiSn(xTi)1xTi>x∗

= D0Sn(x0)ESn ESnDTi+1Ti+1(xTi+1 )

DTi+1Sn(xTi+1)FTi 1xTi>x∗

= D0Sn(x0)∞

x∗

−∞

1

DTi+1Sn(u)φxTi+1 |xTi (u) du φxTi (v) dv

(19.13)where φxTi denotes the transition density function of xTi and φxTi+1 |xTi thedensity of xTi+1 given xTi . Note from (19.11) that φxTi+1 |xTi is a normal

Page 385: Financial derivatives in theory and practice

Example models 361

density function with mean xTi and varianceTi+1Ti

(σnu )2 du. Finally, note the

integrand in (19.13) only depends on DTi+1Sn(xTi+1) which has already beendetermined in the previous iteration at Ti+1.The value of DTiSn(x

∗) can now be determined as follows. Recall from(19.5) that to determine DTiSn(x

∗) it is sufficient to find the functional formLiTi(x

∗). From (19.6) and (19.8),

LiTi(x∗) = Ki(x∗)

where K i(x∗) solvesJ i0(x

∗) = V i0 (Ki(x∗)) . (19.14)

We have just evaluated the left-hand side of (19.14) numerically and K i(x∗)can thus be found from (19.12) using some simple algorithm. Formally,

LiTi(x∗) = Li0exp − 12 (σi)2Ti − σi TiΦ

−1 J i0(x∗)

D0Ti+1(x0).

Finally, to obtain the value of DTiSn(x∗) we use (19.5).

19.4.2 Swap model

Here we construct a swap Markov-functional model suitable for pricing swap-based products such as the Bermudan swaption. The model described here issuitable for the special case of pricing a Bermudan swaption which correspondsto a cancellable swap. We restrict to this case to keep notation simple.Generalizations are straightforward.We have already met Bermudans and cancellable swaps on several occasions,

in particular in the preceding Orientation on multi-temporal products andin the discussion of the Vasicek—Hull—White model in Chapter 17. Here wesuppose that the cancellable swap starts on date T1 and has payment datesS1, . . . , Sn. As previously, we refer to the exercise times of a Bermudanswaption as T1, . . . , Tn and denote by αi the accrual factor for the period[Ti, Si].For this model the ith forward par swap rate y i, which sets on date Ti, has

coupons precisely at the dates Si, . . . , Sn, and is given by

yit =DtTi −DtSn

P it,

whereP it :=

n

j=i

αjDtSj .

Note that the last par swap rate yn is just the forward LIBOR, Ln,for the period [Tn, Sn]. As in the LIBOR example above, we take D·Snas our numeraire and work in the swaption measure Sn. Furthermore, we

Page 386: Financial derivatives in theory and practice

362 Markov-functional modelling

specify properties (P.i), (P.ii ) and DTnSn(xTn) exactly as for the LIBORmodel. However, (P.iii ), the functional form for the numeraire D·Sn at timesTi, i = 1, . . . , n− 1, will need to be determined.For this new model the value of the calibrating PVBP-digital swaption (as

defined in Section 19.3.1) having strike K and corresponding to y i is given by

V i0 (K) = D0Sn(x0)ESnP iTi(xTi)

DTiSn(xTi )1yi

Ti(xTi )>K .

If we assume that the market value is given by Black’s formula then the priceat time zero of this PVBP-digital swaption has the form

V i0 (K) = Pi0(x0)Φ(d

i2) (19.15)

where

di2 =log(yi0/K)

σi√Ti

− 12 σ

i Ti .

Here Black’s formula implies that the marginal distribution of the y iTi arelog-normally distributed in their respective swaption measures.Next suppose we choose some x∗ ∈ R and, for i < n, evaluate by numerical

integration

J i0(x∗) = D0Sn(x0)ESn

P iTi(xTi )

DTiSn(xTi)1xTi>x∗

= D0Sn(x0)ESn ESnP iTi+1(xTi+1)

DTi+1Sn(xTi+1)FTi 1xTi>x∗

= D0Sn(x0)∞

x∗

−∞

P iTi+1(u)

DTi+1Sn(u)φxTi+1 |xTi (u) du φxTi (v) dv

.

Note that to calculate a value for J i0(x∗) we need to know DTi+1Tj (xTi+1), j >

i. These will have already been determined in the previous iteration.Now, as in Section 19.3.1,

yiTi(x∗) = Ki(x∗)

where K i(x∗) solvesJ i0(x

∗) = V i0 (Ki(x∗)) . (19.16)

Having evaluated the left-hand side of (19.16) numerically, K i(x∗) can berecovered from (19.15). Formally, we have

yiTi(x∗) = yi0exp − 12 (σi)2Ti − σi TiΦ

−1 J i0(x∗)

P i0(x0).

The value of DTiSn(x∗) can now be calculated using (19.5).

Page 387: Financial derivatives in theory and practice

Multidimensional models 363

19.5 MULTIDIMENSIONAL MARKOV-FUNCTIONALMODELS

All the discussion so far has focused on Markov-functional models for whichthe driving Markov process is one-dimensional. This is sufficient for manypractical applications, but not all. One product not covered is the Bermudancallable spread option. Another important example is as follows. Consider aswap for which one leg is a standard floating leg, with payments being basedon LIBOR, and the other is a CMS leg, with the amount of each paymentbeing based on some swap rate. Now suppose we wish to value this swap when,additionally, the counterparty paying LIBOR has the right, on any paymentdate, to cancel the rest of the swap. We call this product a Bermudan callableCMS. A two-factor model is needed to price this product accurately. Thereason for this is the fact that the value of the underlying swap is insensitiveto the overall level of interest rates (both legs increase in value as interestrates go up), but it is highly sensitive to the shape of the forward interest ratecurve, i.e. to the relative level of different forward rates.The techniques developed in Sections 19.3 and 19.4 relied heavily on the

one-dimensional property of the Markov process x. In particular, we assumedin (A.ii) that the swap rates to which we calibrate are monotone functionsof the process x. This and the one-dimensional property of x resulted in theset identity (19.6) which was crucial for the efficient implementation. Thisproperty would be lost if x were taken to be of higher dimension.The solution to this problem, of retaining the set identity (19.6) while

increasing the dimensionality of the Markov process, is to generalize x tobe a one-dimensional process which is not itself Markovian but which is adeterministic function of some higher-dimensional process which is Markovian:

xt = f(zt, t),

where z is a (time-inhomogeneous) Markov process of dimension d andf : Rd × R → R. As in the one-dimensional case, in practice we usuallytake z to be a time-changed Brownian motion.If we adopt this approach, all the techniques developed in Sections 19.3

and 19.4 still apply. Of course, the algorithm takes longer to run since one-dimensional integrals have been replaced by d-dimensional integrals, but thatis always the case for a genuine d-dimensional model. What remains is tochoose the Markov process z and the function f . By careful choice of theseit is possible to generate a wide range of models which calibrate to selectedmarginal and joint distributions. Work in this area is at a very early stage,but here we outline one of the simplest choices that can be made.

Page 388: Financial derivatives in theory and practice

364 Markov-functional models

19.5.1 Log-normally driven Markov-functional models

The high dimensionality of the (log-normal) market models in Chapter 18 is asevere impediment to their use for pricing and hedging American-type prod-ucts. A considerable literature has developed which attempts to overcomethis, sometimes by clever use of simulation, sometimes by approximating themarket model by some process of lower dimension. In this latter case themost common approach used is to replace the state-dependent drift in themarket model by some simpler drift. For example, we could replace the state-dependent drift µi

t(Lt) in equation (18.5) by the corresponding drift with timet replaced by time zero throughout. The resultant LIBOR process is thenjointly log-normal and Markov-functional, with the dimension of the drivingMarkov process being equal to that of the underlying driving Brownian motionin equation (18.5) (which will usually be much less than the total number ofLIBORs being modelled).

More sophisticated approximations along similar lines, allowing for (deter-ministic) time-dependent drifts and accurate convexity adjustments, are alsoavailable. All admit arbitrage but they do give yield curve distributions similarto those of the market model being approximated.

A simple log-normal approximation to a log-normal market model makes agood basis for selecting the function f in the Markov-functional approach. Forexample, what follows is, in outline, a model designed for pricing a Bermu-dan callable CMS. Suppose we have in mind a two-factor Markov-functionalmodel which prices exactly all swaptions based on the rates yi

Ti, i = 1, . . . , n,

on which the constant maturity leg payments are based. Start by defining anapproximate LIBOR market model using equation (18.5) but with the driftterm µi

t(Lt) replaced by µi0(L0). (We use a LIBOR market model in preference

to a swap market model because LIBOR is the other relevant index for theBermudan callable CMS.) It follows from equation (18.8) that the approxi-mate LIBOR process L for this approximating model now satisfies the SDE

dLit = −

n∑

j=i+1

(αjL

j0

1 + αjLj0

)σi

tσjtρij

Li

tdt + σitL

itdW i

t ,

which in turn admits the solution

Lit = Li

0 exp(

µit − 12

∫ t

0(σi

u)2du +

∫ t

0σi

udWiu

), (19.17)

where

µi = −n∑

j=i+1

(αjL

j0

1 + αjLj0

)σi

tσjtρij.

In the simplest case one might choose the volatilities σi to be constant intime and consistent with caplet prices and choose the correlations ρij so that,within this approximate LIBOR market model, the swap rates yi

Ti(which are

Page 389: Financial derivatives in theory and practice

Relationship to market models 365

a known function of the LIBORs) have the correct market-implied variance.The time-independent volatility structure can be generalized to incorporatemean reversion, but we omit that possibility here for ease of exposition.

Now take the driving Brownian motion W to be of dimension 2, meaningit can be written in the form Wt = Mzt for some n × 2 matrix M and astandard two-dimensional Brownian motion z. This and the assumption oftime-independent volatility mean that (19.17) can be rewritten as

Lit = Li

0 exp(µit − 12 (σ

i)2t + Mi1z1t + Mi2z

2t ),

which is of Markov-functional form,

Lit ≡ Li

t(zt).

In particularLi

Ti≡ Li

Ti(zTi

)

Suppose now that we are constructing a Markov-functional model and havedone so at times Ti+1, . . . , Tn. Applying the martingale property of numeraire-rebased assets, we can recover the functional forms Lj

Ti(zTi

), j > i. We nowwrite the swap rate yi as

yiTi

= yiTi

(LiTi

, LjTi

, j > i) = fi(zTi, Ti).

Using the above approximation to a market model as a guide, we take xas any one-dimensional process such that, for each i = 1, . . . , n, x(zTi

, Ti) =fi(zTi

, Ti). There are many ways to define x at intermediate times but thishas no effect on the model so we shall not specify it further. Having chosen x,we now proceed as in Section 19.3.

Following the above procedure leads to a model which is arbitrage-free,calibrated exactly to the swap rates yi

Ti, i = 1, . . . , n, and which has the desired

correlation structure for each pair (LiTi

, yiTi

). The intertemporal distributionsof these rates can be controlled by the introduction of, and appropriatechoice of, mean reversion parameters. Section 19.7 discusses this in the one-dimensional case.

19.6 RELATIONSHIP TO MARKET MODELS

An obvious question which arises is how the Markov-functional models of thischapter relate to the market models introduced in Chapter 18. The class ofMarkov-functional models is strictly larger. That it contains all the marketmodels described in Chapter 18 follows from Theorems 18.1, 18.2 and 18.3which showed that the market models given there were all Markovian. Thatthe Markov-functional class is strictly larger follows since the driving Markovprocess could be taken to have dimension greater than the number of swaprates being modelled. In practice, of course, one would not choose to do this – aprimary motivation for the Markov-functional formulation is to reduce thedimensionality.

Page 390: Financial derivatives in theory and practice

366 Markov-functional modelling

One could, as is common, define market models more generally than we didin Chapter 18 by allowing the coefficients of the driving SDE to depend onthe whole sample path and not just the current value of the market rates.Clearly for this more general case the resultant model will not be Markovianand, therefore, not Markov-functional. Indeed, in this generality the class ofmarket models is equivalent to the class (PDB) described in Chapter 8 andthus is the most general class of continuous interest rate models that we haveconsidered. In particular, it is more general than the (HJM) class.Often when people refer to market models they mean the special case when

the forward par swap rates are assumed to be log-normal martingales in theirassociated swaption measures. This yields Black’s formula for the prices of theassociated swaptions. A further question, therefore, is whether it is possiblein this case to write a market model as a function of some low-dimensionalMarkov process. In fact it is not. The reason for this is that the assumptionof log-normal processes in a market model is much stronger and more restric-tive than the assumptions made when fitting a Markov-functional model toswaption prices given by Black’s formula. In the latter case we only assumethat the swap rates, in their associated swaption measures, satisfy the mar-tingale property and have a log-normal distribution on their respective fixingdates. The additional restriction for log-normal market models is preciselywhat makes those models difficult to use in practice because they cannot beusefully characterized by a low-dimensional Markov process. The followingresult, which can be extended to include the swap-market models, makes thisstatement precise for the log-normal LIBOR-based market models.

Theorem 19.4 Let L = (L1, . . . , Ln), n > 1, be a non-trivial log-normalmarket model, where the Li are a set of contiguous forward LIBORs. Thenthere exists no one-dimensional process x such that:

(i) Lit = Lit(xt) ∈ C2,1(R,R+) for i = 1, 2, . . . , n;

(ii) each Lit(xt) is strictly monotone in xt.

That is, L is not a one-dimensional Markov-functional model.

Remark: We believe this result extends to hold for any process x of dimensionless than n and any functional forms Lit(xt).

Proof: Suppose such a process x exists. Then it follows from the invertibilityof the map xt → Lit(xt) that we can write

Lit = Lit(L

nt ), i = 1, 2, . . . , n . (19.18)

Since L is a log-normal market model, it follows from (18.8) that it satisfiesthe SDE

dLit = −n

j=i+1

αjLjt

1 + αjLjt

σitσjt ρij Lit dt+ σitL

it dW

it , (19.19)

Page 391: Financial derivatives in theory and practice

Mean reversion, forward volatilities and correlation 367

under the measure Sn (corresponding to taking D·Sn as numeraire). HereW = (W 1, . . . ,Wn) is a (correlated) n-dimensional Brownian motion with

dW it dW

jt = ρijdt, σ

1, . . . ,σn are deterministic functions of time and the αjare, as usual, accrual factors. On the other hand, if we apply Ito’s formula to(19.18), we obtain

dLit =∂Lit∂t

+ 12 (σ

nt L

nt )2 ∂2Lit∂(Lnt )

2dt+

∂Lit∂Lnt

dLnt

=∂Lit∂t

+ 12 (σ

nt L

nt )2 ∂2Lit∂(Lnt )

2dt+

∂Lit∂Lnt

σnt Lnt dW

nt . (19.20)

Equating the local martingale terms in (19.19) and (19.20) yields W i ≡Wn,for all i, and

∂Lit∂Lnt

=σitL

it

σnt Lnt

. (19.21)

Solving (19.21), we find

Lit = ci(t)(Lnt )βi(t) (19.22)

where ci(t) is some function of t and βi(t) = σit/σnt .

Having concluded that W i =Wn, for all i, (19.19) reduces to the form

dLit = −n

j=i+1

αjLjt

1 + αjLjt

σitσjt Lit dt+ σitL

it dW

nt . (19.23)

Substituting (19.22) back into (19.20) and equating finite variation terms in(19.20) and (19.23) now gives (after some rearrangement)

1

σit

ci(t)

ci(t)+ βi(t) logL

it + 1

2σit(βi(t)− 1) = −

n

j=i+1

αjσjtL

jt

1 + αjLjt

.

In particular, taking i = n−1 yields σnt = 0 and thus the model is degenerate.

19.7 MEAN REVERSION, FORWARD VOLATILITIESAND CORRELATION

19.7.1 Mean reversion and correlation

Mean reversion of interest rates is considered a desirable property of a modelbecause it is perceived that interest rates tend to trade within a fairly tightlydefined range. This is indeed true, but when pricing exotic derivatives it is the

Page 392: Financial derivatives in theory and practice

368 Markov-functional modelling

effect of mean reversion on the correlation of interest rates at different timesthat is more important. We illustrate this for the Vasicek—Hull—White modelbecause it is particularly tractable. The insights carry over to the marketmodels of Chapter 18 and the Markov-functional models of this chapter.In the standard Vasicek—Hull—White model the short-rate process r solves

the SDEdrt = (θt − atrt)dt+ σtdWt , (19.24)

under the risk-neutral measure Q, where a, θ and σ are deterministic functionsand W is a standard Brownian motion. For the purposes of this discussion wetake at ≡ a, for some constant value.Suppose we fit the Vasicek—Hull—White model to market bond and caplet

prices. The best we can do is to fit the model so that it correctly valuesone caplet for each date Ti, the one having strike Ki say. It turns out, andthe reader can verify this from equations (17.4) and (17.14), that, given theinitial discount curve D0S, S > 0 and the mean reversion parameter a, thecorrelation structure for the rTi is given by

corr(rTi , rTj ) = e−a(Tj−Ti) vi

vj,

where vi = var(rTi). Furthermore, given the market cap prices, the ratiovi/vj is independent of a. Thus increasing a has the effect of reducingthe correlation between the short rate at different times, hence also thecovariance of spot LIBOR at different times. This is important for pricingpath-dependent and American options whose values depend on the jointdistribution (rTi , i = 1, 2, . . . , n).For other models, including Markov-functional models, the analytic

formulae above will not hold, of course, but the general principal does. Giventhe marginal distributions for a set of spot interest rates y iTi , i = 1, 2, . . . , n,a higher mean reversion for spot interest rates leads to a lower correlationbetween the yiTi, i = 1, 2, . . . , n.

19.7.2 Mean reversion and forward volatilities

In the models of Section 19.4 we have not explicitly presented an SDE forspot LIBOR or spot par swap rates. We therefore need to work a little harderto understand how to introduce mean reversion within these models. To dothis we consider the Vasicek—Hull—White model once again.We have in (19.10) parameterized our Markov-functional example in terms

of a forward LIBOR process Ln. If we can understand the effect of meanreversion within a model such as Vasicek—Hull—White on Ln we can apply thesame principles to a more general Markov-functional model.An analysis of the Vasicek—Hull—White model defined via (19.24) shows

thatLnt = Xt − α−1n

Page 393: Financial derivatives in theory and practice

Mean reversion, forward volatilities and correlation 369

where

dXt =(

e−aTn − e−aSn

a

)σeatXtdWt,

W being a standard Brownian motion under the measure Sn. The forward

LIBOR is a log-normal martingale minus a constant. Notice the dependenceon time t of the diffusion coefficient of the martingale term: the volatilityis of the form constant × Xt × eat. This is the motivation for the volatilitystructure we chose for Ln in the Markov-functional model definition in (19.10).But notice also that, for a more realistic correlation structure, one would at aminimum need to make σ a function of time and calibrate (19.24) to marketprices. This is not a topic we shall cover in any detail here. What is importantis not so much the exact functional form as the fact that the volatility increasesthrough time. The faster the increase, the lower the correlation between spotinterest rates which set at different times.

19.7.3 Mean reversion within the Markov-functional LIBOR model

To conclude this discussion of mean reversion we return to the Markov-functional LIBOR model of Section 19.4.1 and show how taking σn

t = σeat

in (19.10) leads to mean reversion of spot LIBOR. Suppose the market capprices are such that the implied volatilities for L1 and Ln are the same, σ say,and all initial forward values are the same, Li

0 = L0, for all i.

LIBOR

L0

T1 Tn

L1T1

LnT1

]|LnTn

FT1ESn [

Figure 19.2. Mean reversion

Figure 19.2 shows the evolution of L1 and Ln in the situation whenthe driving Markov process x has increased (significantly), xT1 > x0. BothL1 and Ln have increased over the interval [0, T1] but L1 has increasedby more. The reason for this is as follows. Over [0, T1], L1 has (root meansquare) volatility σ. By comparison, Ln has (root mean square) vola-tility σ over the whole interval [0, Tn], but its volatility is increasingexponentially (in the case when σt ≡ σ, a constant) and thus its

Page 394: Financial derivatives in theory and practice

370 Markov-functional modelling

(root mean square) volatility over [0, T1] is less than σ. Since Ln is a martingale

under Sn, it follows that

ESn LnTn |FT1 = LnT1 < L1T1 .

That is, in Figure 19.2 when spot LIBOR has moved up from its initial valueL0 at time zero to its value at T1, L

1T1, the expected value of spot LIBOR at

Tn is less than L1T1. Conversely, when spot LIBOR moves down by time T1

(xT1 < x0) the expected value of spot LIBOR at Tn is greater than its valueat T1. This is mean reversion.

19.8 SOME NUMERICAL RESULTS

We conclude with some numerical results comparing the prices of Bermudanswaptions derived using some of the models discussed in this book. Fora 30-year Deutschmark Bermudan, which is exercisable every five years,we have compared in Table 19.1, for different levels of mean reversion,the prices calculated by three different models: Black—Karasinski, one-dimensional Markov-functional and the Vasicek—Hull—White model. Note thatour implementation of the Vasicek—Hull—White model prohibits negative meanreversions so we have not been able to include these results.For every level of mean reversion, we have given the prices of the embedded

European swaptions (5 × 25, 10 × 20, 15 × 15, 20 × 10, 25 × 5). Since allthree models are calibrated to these prices, all models should agree exactlyon them. The differences reported in the table are due to numerical errors.We adopted the following approach. First we chose a level for the Black—Karasinski mean reversion parameter and reasonable levels for the Black—Karasinski volatilities. We then used these parameters to generate the prices,using the Black—Karasinski model, of the underlying European swaptions.We then calibrated the other, Vasicek—Hull—White and Markov-functional,models to these prices. The resultant implied volatilities used for each caseare reported in the second column. At the bottom of each block in the table,we report the value of the Bermudan swaption as calculated by each of thethree models.From the table we see that the mean reversion parameter has a significant

impact on the price of the Bermudan swaption. It is intuitively clear whythis is the case. The reason a Bermudan option has more value than themaximum of the embedded European option prices is the freedom it offersto delay or advance the exercise decision of the underlying swap during thelife of the contract to a date when it is most profitable. The relative valuebetween exercising ‘now’ or ‘later’ depends very much on the correlationof the underlying swap rates between different time points. This correlationstructure is exactly what is being controlled by the mean reversion parameter.

Page 395: Financial derivatives in theory and practice

Some numerical results 371

By contrast, the effect of changing the model is considerably less, andwhat difference there is will be due in part to the fact that the meanreversion parameter has a slightly different meaning for each model. Theprecise (marginal) distributional assumptions made have a secondary role indetermining prices for exotic options (relative to the joint distributions ascaptured by the mean reversion parameter).

Table 19.1 Value of 30-year Bermudan, exercisable every five years

Strike: 6.24%, Currency: DEM, Valuation Date: 11-feb-98

Mean Reversion = −0.05European Receivers Payers

Mat ImVol BK MF HW BK MF HW

05× 25 8.17 447.7 445.9 — 456.6 456.6 —10× 20 7.74 365.2 363.8 — 448.2 448.6 —15× 15 7.90 285.8 284.7 — 350.8 351.4 —20× 10 8.29 191.8 190.8 — 241.6 242.1 —25× 05 8.68 91.0 90.9 — 126.2 126.1 —

Bermudan 510.0 502.7 — 572.2 566.9 —

Mean Reversion = 0.06

European Receivers Payers

Mat ImVol BK MF HW BK MF HW

05× 25 8.46 463.7 462.8 464.2 472.6 472.0 473.310× 20 7.91 374.0 373.5 374.2 457.0 456.9 457.515× 15 7.80 281.8 281.5 281.9 346.8 346.8 347.120× 10 7.81 179.5 179.4 179.6 229.4 229.3 229.525× 05 8.16 84.8 84.7 84.7 119.9 120.0 120.1

Bermudan 606.1 602.9 608.2 743.6 717.7 727.3

Mean Reversion = 0.20

European Receivers Payers

Mat ImVol BK MF HW BK MF HW

05× 25 8.37 458.7 457.6 459.3 467.5 466.8 468.010× 20 6.99 326.2 325.4 326.2 409.1 408.8 409.315× 15 6.67 237.1 236.4 237.3 302.1 301.7 302.420× 10 7.32 166.7 166.4 166.6 216.6 216.4 216.625× 05 9.40 99.8 99.5 99.6 134.9 134.8 134.8

Bermudan 647.4 665.8 673.9 885.5 814.7 813.6

Mat denotes maturity and tenor of embedded European option.ImVol denotes implied volatility of embedded European option.BK are prices calculated with Black—Karasinksi model.MF are prices calculated with Markov-functional model.HW are prices calculated with Hull—White model.

Page 396: Financial derivatives in theory and practice
Page 397: Financial derivatives in theory and practice

20

Exercises and SolutionsExercises

Exercise 1 Let M be a non-negative local martingale defined on the filteredspace (Ω, Ft,F , P) such that E[M0] < ∞. Show that M is a supermartingale.

Exercise 2 Let (Ω, Ft,F , P) be a filtered probability space supporting aone-dimensional Brownian motion W. Let X denote a local martingale of theform

Xt :=∫ t

0Hu dWu ,

where H ∈ Π(W ) is a deterministic function of time. Suppose that [X]t ↑ ∞as t ↑ ∞ and, for t ≥ 0, define

τt := infu > 0 : [X]u > t

.

Show that X can be written in the form

Xt = W([X]t

),

where W is a Brownian motion adapted to Ft := Fτt.

Remark: The result above is sufficient for the application in Section 17.2.1.For the more general result due to Dubins and Schwarz, where X is anycontinuous local martingale, see Section IV.34 of Rogers and Williams (1987).

Exercise 3 Consider the SDE

dXt = −aXtdt + σdWt,

where W is a one-dimensional Brownian motion and a and σ are constants.By applying Ito’s formula to the process Yt := eatXt, show that the uniquesolution to this SDE is given by

Xt = e−at

(X0 +

∫ t

0easσ dWs

).

Assuming X0 = x0, a constant, specify the distribution of Xt.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 398: Financial derivatives in theory and practice

374 Exercises and solutions

Further show that for t1 < t2,

corr(Xt1 , Xt2 ) = e−a(t2−t1)√

v1

v2,

where vi = var(Xti).

Exercise 4 (i) Let P and Q be equivalent probability measures with respectto F on the filtered space (Ω, Ft,F) and let X ∈ L1(Ω,Ft∗ , Q) for some fixedt∗. For s ≤ t∗, express the conditional expectation

EQ

[X | Fs

]in terms of the Radon–Nikodym process

ρs :=dQ

dP

∣∣∣∣Fs

, s ≤ t∗.

(ii) Show that M is an (Ft, Q) martingale if and only if ρM is an (Ft, P)martingale. (This is just Lemma 5.19(i).)

In Exercises 5 to 9 and 11, (Ω,F , P) is a probability space supporting a one-dimensional Brownian motion W and FW

t denotes the augmented naturalfiltration generated by W. We will also use the notation A′ to denote thetranspose of the vector A.

The next exercise can be attempted after reading Chapter 5.

Exercise 5 Let µ, r > 0 and σ > 0 be constants and consider a processA′ = (A(1), A(2)) satisfying

dAt =

(rA

(1)t

µA(2)t

)dt +

(0

σA(2)t

)dWt

A′0 = (e−rT, 1).

For fixed T < ∞, define a measure Q ∼ P on FWT via

dQ

dP

∣∣∣∣FW

T

= exp(φWT − 1

2φ2T),

where φ = σ−1(r − µ). Further, for some VT ∈ L1(Ω,FW

T , Q), define

V0 := e−rTEQ[VT].

Page 399: Financial derivatives in theory and practice

Exercises and solutions 375

Suppose that there exists an FWt semimartingale X and FW

t -predictableprocesses α(1) and α(2) such that

dXt = α(1)t dA

(1)t + α

(2)t dA

(2)t ,

Xt = α(1)t A

(1)t + α

(2)t A

(2)t ,

Xt ≥ 0 for t ∈ [0, T ],

XT = VT.

(i) Show that X0 ≥ V0.

Hint: Consider the process X := X/A(1) under Q.

(ii) Further show that we can find processes X, α(1) and α(2) with the propertiesdescribed above and such that X0 = V0.

Hint: Define Mt := EQ

[VT | FW

t

]and show we can take Xt = A

(1)t Mt.

Exercise 6 Black–Scholes economyConsider the Black–Scholes economy as defined in Example 7.1: for the finitetime interval 0 ≤ t ≤ T < ∞, the price process A′ = (D, S) satisfies

dDt = rDt dt (DT = 1, r > 0),

dSt = µSt dt + σSt dWt (S0 a constant).

(i) Writing SDt = D−1

t St, explain how we could use Girsanov’s theorem to finda solution to the SDE

dSDt = (µ − r)SD

t dt + σSDt dWt.

Remark: The easiest way to solve this equation is, of course, to consider log SDt .

(ii) Set φ = σ−1(r − µ). Use the solution you found in (i) to show that

SDt exp

(φWt − 1

2φ2t)

is an FWt martingale under P.

(iii) Let Q be a measure equivalent to P on FWT defined via

dQ

dP

∣∣∣∣FW

T

:= exp(φWT − 1

2φ2T).

Deduce from (ii) that SD is an(FW

t , Q)

martingale.

Page 400: Financial derivatives in theory and practice

376 Exercises and solutions

Next define a measure Q ∼ Q on FWT via

dQ

dQ

∣∣∣∣FW

T

=e−rTST

S0.

Show that, for 0 ≤ u ≤ t ≤ T ,

EQ

[Dt

St

∣∣∣∣FWu

]=

Du

Su.

(iv) Show that the time-zero value of a derivative paying (ST − K)+ at timeT is given by

V0 = S0 Q(ST > K) − e−rtK Q(ST > K) .

Exercise 7 For the Black–Scholes economy described in Exercise 6, let N

denote an EMM corresponding to the numeraire N = D + S.

(i) Find the SDE satisfied by(AN, N

)under the measure N.

(ii) Use your working in (i) and Lemma 5.17 to show that

dN

dP

∣∣∣∣FW

t

=Nt/N0

Dt/D0exp

(φWt − 1

2φ2t),

where φ = σ−1(r − µ).

Exercise 8 Let σt, t ≥ 0 be an FWt -predictable process bounded above

and below by positive constants. Consider the Black–Scholes economydescribed in Exercise 6 but where now

dSt = µSt dt + σtSt dWt (S0 = 1).

(i) Find the SDE satisfied by DS := S−1D under P.

(ii) Show that there exists a unique EMM QS corresponding to the numeraireS.

(iii) Find the solution to the SDE in (i) using Girsanov’s theorem.

Exercise 9 Example of a path-dependent option

(i) Consider the Black–Scholes economy as defined in Exercise 6. Define

mSt := min

u∈[0,t]Su

andMX

t,T := maxu∈[t,T]

(Xu − Xt),

Page 401: Financial derivatives in theory and practice

Exercises and solutions 377

whereXt := −σWt +

( 12σ

2 − µ)t

and W is a one-dimensional Brownian motion with respect to(Ω, FW

t ,F , P).

Show thatmS

T = min

mSt , Ste

−MXt,T

.

(ii) A lookback call option has payoff at time T given by

VT =(ST − mS

T

)+ = ST − mS

T.

Use Girsanov’s theorem and the result of Example 2.17 to show that the valueat time t ∈ [0, T ] of this option is given by

Vt = sN(h1) − me−rτN(h2)

− sσ2

2rN(−h1) + e−rτ sσ2

2r

(m

s

)2rσ−2

N(− h2 + 2

r2

σ

√τ),

where

h1 :=log(s/m)

σ√

τ+

r1

σ

√τ ,

h2 :=log(s/m)

σ√

τ+

r2

σ

√τ ,

and s := St, m := mSt , τ = T − t, r1 = r + 1

2σ2 and r2 = r − 1

2σ2.

Exercise 10 Let W be a two-dimensional Brownian motion defined on theprobability space (Ω,F , P) and let FW

t be the augmented natural filtrationgenerated by W.

Consider an economy defined for the finite time interval 0 ≤ t ≤ T < ∞ andcomposed of three assets having price process A′ =

(D, S(1), S(2)

)satisfying

dDt = rDt dt (DT = 1, r > 0),

dS(i)t = S

(i)t

(µi dt + dM

(i)t

), i = 1, 2,

where M (1) and M (2) are continuous FWt martingales such that[

M (i), M (j)]t= ξij dt, i = 1, 2,

for constants ξij with ξii > 0, and

ρ :=ξ12√

ξ11√

ξ22∈ (−1, 1).

Page 402: Financial derivatives in theory and practice

378 Exercises and solutions

(i) Use Girsanov’s theorem to show that there exists an EMM Q correspondingto the numeraire S(2).

Hence, setting Xt := S(1)t /S

(2)t , show that

dXt = Xt dYt,

whereYt =

(√ξ11 −

√ξ22ρ

)W

(1)t −

√ξ22(1 − ρ2) W

(2)t

and W ′ =(W (1), W (2)

)is a Brownian motion under Q.

(ii) Show that the value at time zero of the attainable derivative paying(S

(1)T − S

(2)T

)+ at time T is given by

V0 = S(1)0 N(d1) − S

(2)0 N(d2),

where

d1 =log(X0)σ0√

T+ 1

2σ0√

T ,

d2 =log(X0)σ0√

T− 1

2σ0√

T

andσ2

0 = ξ11 − 2ξ12 + ξ22.

Exercise 11 Completeness for the Black–Scholes economy

Remark: Completeness of the Black–Scholes economy follows immediatelyfrom Theorem 7.45. This exercise works through a direct approach.

Consider the Black–Scholes economy as described in Exercise 6. Let QS

denote the unique EMM corresponding to the numeraire S. (That QS is uniqueis a special case of Exercise 8(ii).)

(i) Apply the martingale representation theorem for Brownian motion (The-orem 5.49) to show that, for any FW

T -measurable random variable X satisfying

EQS

[|X/ST|

]< ∞,

there exists an FWt -predictable self-financing trading strategy φ such that

X = φ0 · A0 +∫ T

0φu · dAu.

Page 403: Financial derivatives in theory and practice

Exercises and solutions 379

(ii) Suppose (N, N) is some other numeraire pair for the economy. Show that,for 0 ≤ t ≤ T ,

dQS

dN

∣∣∣∣FW

t

=St/S0

Nt/N0.

(iii) Show that the trading strategy φ of part (i) is admissible and hence thatthe Black–Scholes economy is complete for the set

S =X : X is an FW

T -measurable random variable such that EQ

[|X|]

< ∞,

where Q is the unique EMM corresponding to numeraire D.

Exercise 12 (i) Let E be an economy defined on the probability space(Ω, Ft,F , P). Suppose (N, N) is a numeraire pair for E and let VT be someattainable contingent claim. Starting from the martingale valuation formula,equation (7.15), show that

Vt = Z−1t EP

[VTZT | FA

t

]for some pricing kernel Z.

(ii) Find a pricing kernel for the economy described in Exercise 10. Is thiseconomy complete? Write down the Radon–Nikodym derivative

dQD

dP

∣∣∣∣FA

t

, 0 ≤ t ≤ T,

where QD denotes an EMM corresponding to the numeraire D.

Exercise 13 Recall that a rational log-normal model is defined by

DtT =AT + BTMt

At + BtMt,

where M is a log-normal martingale under Z, a measure locally equivalent tothe ‘real world’ measure P, M0 = 1, and A and B are deterministic positiveabsolutely continuous functions decreasing to zero at infinity.

(i) Show that a rational log-normal model is in the class (FH). (See Chapter 8,page 190, for a definition of the class (FH).)

(ii) In the definition of an (FH) model suppose that

Mt = exp(σWt − 1

2σ2t),

Page 404: Financial derivatives in theory and practice

380 Exercises and solutions

where W is a one-dimensional Brownian motion under Z. Use this rational log-normal model to find the time-zero value of a payers swaption having cashflowsat times S1 < S2 < . . . < Sn, start date T = S0 < S1, and fixed rate K.

Exercise 14 Let (Ω, Ft,F , P) be a probability space supporting a one-dimensional Brownian motion W.

(i) For a short-rate model having short-rate process r and risk-neutral measureQ, show that the time-t value of any attainable derivative with value VT ∈ FA

T

at maturity time T can be expressed as

Vt = EQ

[exp

(−∫ T

t

ru du)VT

∣∣∣∣FAt

].

Further, show that we can write

Vt = DtT EF

[VT | FA

t

],

where F is a measure equivalent to Q on FAT which you should specify.

(ii) Ho–Lee modelSuppose that under Q the short-rate process r satisfies the SDE

drt = θt dt + σ dWt,

where θt is a deterministic function of time and σ is a constant.Assuming that the model is calibrated to the initial discount curve, show that

θt = −∂2 log D0t

∂t2+ σ2t.

Further, show that under F the short-rate process r satisfies

drt =(θt − σ2(T − t)

)dt + σ dWt,

where W is a standard Brownian motion under F.

Exercise 15 Consider an economy involving three currencies. For i = 1, 2, 3,let Di

tT denote the value at time t of a zero coupon bond that pays a unitamount at T in currency i. Further, for i, j = 1, 2, 3, i = j, let Xij

t denote thevalue at time t in currency i of one unit of currency j, and let Mij

tT denote thecorresponding forward FX rate at time t for maturity at T. Recall that

MijtT =

DjtT

DitT

Xijt .

Page 405: Financial derivatives in theory and practice

Exercises and solutions 381

Suppose that a complete arbitrage-free model has been specified for theeconomy for the finite time interval 0 ≤ t ≤ T < ∞ and let Q1 denote theEMM corresponding to numeraire D1

·T.

(i) Show that the time-zero value, stated in currency 2, of a call option thatpays (X23

T − K)+ in currency 2 at time T can be expressed as

V 230 (K) = X21

0 D10T EQ1

[(X13

T − X12T K

)+

].

(ii) Suppose we are given the prices for all strikes of call options on X21T and

X31T . If the model is calibrated to these prices explain, using the expression

in (i), what these prices tell us about the distributional information neededto calculate call prices on X23

T .

(iii) Let W ∗ be a two-dimensional Brownian motion on the probability space(Ω,F , Q1) and let Ft be the augmented natural filtration generated by W ∗.Now suppose that under Q1 the processes M 12

·T and M 13·T satisfy

dM 12tT = σ(1)eatM 12

tT dW(1)t ,

dM 13tT = σ(2)eatM 13

tT dW(2)t ,

where σ(1), σ(2) and a are positive constants and

W(1)t =

√1 − ρ2 W

∗(1)t + ρW

∗(2)t , W

(2)t = W

∗(2)t .

If Q2 denotes the EMM corresponding to numeraire D2·T show that, under Q2,

dM 23tT = σ

(3)t M 23

tT dWt,

where W is an (Ft, Q2) Brownian motion and

σ(3)t := eat

√(σ(1))2 − 2ρσ(1)σ(2) + (σ(2))2.

For this model derive an expression for V 230 (K), the time-zero value of the

vanilla FX call option described in (i).

Exercise 16 (i) Consider a Hull–White model in which the short-rateprocess r satisfies the SDE

drt = (θt − art)dt + σdWt

under the risk-neutral measure Q, where θ is a deterministic function of timeand σ and a are constants.

Page 406: Financial derivatives in theory and practice

382 Exercises and solutions

Let 0 < S0 < S1 < . . . < Sn−1 < Sn be a sequence of regularly spaced timesand set Ti = Si−1, αi = Si − Ti. Show that the forward LIBOR Li

t = Lt[Ti, Si]satisfies the SDE

dLit =

(e−aTi − e−aSi

a

)eat(Li

t + α−1i

)σdW i

t ,

W i being a standard Brownian motion under the EMM Fi corresponding totaking D·Si

as numeraire.

(ii) Let V it (K) denote the value at time t of a European put option on a

discount bond having payoff (K − DTiSi)+ at time Ti (not the more usual

payment time Si).

Show that the value at time t of a cap having payment dates S1, . . . , Sn, andstrike K can be written in the form

V capt (K) =

n∑i=1

(1 + αiK)V it

(1

1 + αiK

)

and find a closed-form formula for V it (K) for the Hull–White model described

in (i) above.

Exercise 17 (i) Consider a payers swaption on a swap with cashflows attimes S1 < S2 < . . . < Sn, start date T = S0 < S1 and fixed rate K. Accordingto Black’s formula, the value at time zero of this swaption is given by

V0 = P0[y0N(d1) − KN(d2)

],

where

d1 =log(y0/K)

σ0√

T+ 1

2 σ0√

T ,

d2 =log(y0/K)

σ0√

T− 1

2 σ0√

T ,

σ0 is a positive constant and N(·) denotes the standard cumulative normaldistribution.Suppose an arbitrage-free term structure model has been defined for whichthe price of this swaption is consistent with Black’s formula as stated abovewith volatility σ0 for all strikes. Show that for this model the distribution ofyT is log-normal under an EMM corresponding to numeraire P =

∑ni=1 αiD·Si

where αi = Si − Si−1.

Page 407: Financial derivatives in theory and practice

Exercises and solutions 383

(ii) Let (Ω,F , P) be a probability space supporting a one-dimensionalBrownian motion W. Let σt be a deterministic function of time t boundedabove and below by positive constants and let µ be a constant.

Now suppose an arbitrage-free term structure model has been defined for thefinite time interval 0 ≤ t ≤ T < ∞ such that, under P, the forward par swaprate y satisfies the SDE

dyt = µyt dt + σtyt dWt.

Show that the model will be consistent with Black’s formula for the swaptionas given in (i), provided

σ20 =

1T

∫ T

0σ2

u du.

In Exercises 18 to 21, 0 < S0 < S1 < . . . < Sn−1 < Sn denotes a sequenceof regularly spaced times with Ti := Si−1 for i = 1, . . . , n + 1. Further, we letLi

t = Lt[Ti, Si] denote the forward LIBOR at time t for the period [Ti, Si].

Exercise 18 (i) For each i = 1, . . . , n, consider a caplet with start date Ti,cashflow at time Si and strike K. According to Black’s formula, the value attime t = 0 of the ith caplet is given by

V i0 = αiD0Si

(Li

0N(di1) − KN(di

2)),

where αi = Si − Ti, σi is a positive constant and

di1 =

log(Li0/K)

σi√

Ti

+ 12σ

i√

Ti,

di2 =

log(Li0/K)

σi√

Ti

− 12σ

i√

Ti.

Suppose an arbitrage-free term structure model has been defined which isconsistent with Black’s formula at time t = 0 for all strikes for each of thesecaplets. Find the distribution of Li

Tiunder an EMM corresponding to the

numeraire D·Si.

(ii) In a one-factor LIBOR market model, working in the measure F

corresponding to the numeraire D·Sn, suppose L′ := (L1, . . . , Ln) satisfies an

SDE of the form

dLit = µi

t dt + σiLit dWt, i = 1, . . . , n,

for constants σi, W a one-dimensional Brownian motion, µnt = 0 for all t ≥ 0

and, for 1 ≤ i < n,

µit = −

n∑j=i+1

αjLjt

1 + αjLjt

σjσiLit.

Page 408: Financial derivatives in theory and practice

384 Exercises and solutions

Show that this model is consistent with Black’s formula as given in (i) foreach of the caplets.

Exercise 19 An alternative model to the LIBOR market model given inExercise 18(ii) is proposed as follows.

Suppose now that, under F, for i = 1, . . . , n,

Lit = Li

0 exp(σiWt + ηi

t

),

where W and σi are as in Exercise 18(ii) and ηit is a deterministic function of

t ≤ Ti chosen to satisfyEF[D(i−1)

t ] = D(i−1)0 ,

whereD

(i−1)t :=

DtTi

DtSn

.

Show that, for i = 1, . . . , n,

ηit = log Di

0 − log EF[exp(σiWt)Dit].

Is this model arbitrage-free? Would this model be easy to implement inpractice?

Exercise 20 Fix k, where 1 < k < n. Working in the measure Fk corre-sponding to the numeraire D·Sk

, suppose that, in a LIBOR market model,

L′ := (L1, . . . , Ln)

satisfies an SDE of the form

dLit = µi

t dt + σiLit dW i

t , i = 1, . . . , n.

Here each σi is a constant, each µit is some general process, to be determined,

and W ′ = (W 1, . . . , Wn) is an n-dimensional Brownian motion having

dWit dWj

t = ρij dt

for constants ρij with ρii = 1 for all i.

(i) Why is µkt = 0?

(ii) Show that, for 1 ≤ i < k,

µit = −

k∑j=i+1

αjLjt

1 + αjLjt

σjσiρijLit,

where αj = Sj − Tj for each j.

Page 409: Financial derivatives in theory and practice

Exercises and solutions 385

(iii) Show that, for k < i ≤ n,

µit =

i∑j=k+1

αjLjt

1 + αjLjt

σjσiρijLit.

Exercise 21 Consider a LIBOR Markov-functional model specified usingthe numeraire pair (D·Sn

, Sn) and having a driving Markov process x satisfying

dxt = σeatdWt,

where W is a standard Brownian motion under Sn and σ > 0 and a areconstants. Show that if this model is calibrated to caplet prices as given bythe Hull–White model discussed in Exercise 16 then this Markov-functionalmodel coincides with the Hull–White model on the ‘grid’, i.e. the functionalforms

DTiSj(xTi

) , 1 ≤ i ≤ j ≤ n

are as for the Hull–White model.

Solutions

Solution 1 We need to check that conditions (i)–(iii) of Definition 3.90hold. Clearly M is adapted by the definition of a local martingale and thus (i)is satisfied. Condition (iii) requires us to show that Ms ≥ E[Mt | Fs] for s < t.Letting Tn be a reducing sequence for M − M0, we have

E[MTn

t | Fs

]= MTn

s ,

where MTnt = Mt∧Tn

. Now

Ms = limn→∞

MTns = lim

n→∞E[MTn

t | Fs

]≥ E

[Mt | Fs

]. (20.1)

This last inequality follows from the conditional form of Fatou’s lemma: ifYn is a sequence of random variables in L1(Ω,F , P) with Yn ≥ 0, and if G isa sub-σ-algebra of F , then

E[lim inf Yn | G

]≤ lim inf E

[Yn | G

].

Finally, to establish (ii), take expectations of both sides of inequality (20.1)and set s = 0. This yields, for all t ≥ 0,

E[|Mt|

]= E[Mt] ≤ E[M0] < ∞

and we are done.

Page 410: Financial derivatives in theory and practice

386 Exercises and solutions

Solution 2 Recall, from Section 4.5.1, that we have

[X]t =∫ t

0H2

u du

and, since H is deterministic, τt is an increasing deterministic function of time.Setting

Wt := Xτt=∫ τt

0Hu dWu ,

we have thatW([X]t

)= Xτ

[X]t= Xt

and so we will have established the result if we can prove that W is a Brownianmotion adapted to

Fτt

. We do this by an application of Levy’s theorem.

Thus we must show that W is a continuous local martingale adapted toFτt

and that

[W]t= t.

That W is adapted follows from the adaptedness of the stochastic integral.Further, since X is constant on any interval of constancy of [X], it follows bythe continuity of the stochastic integral that W is continuous. Next we find areducing sequence for W in order to show that it is a local martingale.

DefineSn := inf

t > 0 : |Xt| > n

and

Tn := τ−1Sn

= [X]Sn.

Since τ is increasing it follows, for each fixed t, that

τt∧Tn= Sn ∧ τt

and

Tn ≤ t = Sn ≤ τt ∈ Fτt.

ThusW Tn

t := Wt∧Tn= XSn

τt

and, since Sn is a reducing sequence for X, it follows that W Tnt is a martingale

with respect to Fτt. We thus see that Tn is a reducing sequence for the

continuous local martingale W .

Page 411: Financial derivatives in theory and practice

Exercises and solutions 387

To apply Levy’s theorem and conclude that W is an Fτt Brownian

motion, it remains only to show that[W]t= t. But it is easy to check using

Theorem 4.18 and localization that[W]t=∫ τt

0H2

u du = [X]τt= t .

Solution 3 Strong existence and pathwise uniqueness for the SDE followimmediately from Theorem 6.27. By Ito’s formula,

dYt = d(eatXt)

= aeatXt dt + eat dXt

= aYt dt − aYt dt + σeat dWt

= σeat dWt ,

whence

Yt = Y0 +∫ t

0σeas dWs

and the required equation for Xt follows. An application of Levy’s theorem,as in Solution 2 above, establishes that

Wt :=∫ ξ−1

t

0σeas dWs

is a Brownian motion where ξt :=∫ t

0 σ2e2au du, and we can write

Xt = e−at(X0 + W (ξt)

).

Thus, when X0 = x0, Xt has a normal distribution with mean eatx0 andvariance

e−2atξt = e−2atσ2∫ t

0e2au du = σ2 1 − e−2at

2a.

Finally, observe that

cov(Xt1 , Xt2 ) = e−a(t1+t2) cov(W (ξt1), W (ξt2)

)= e−a(t1+t2)ξt1

and

corr(Xt1 , Xt2 ) =e−a(t1+t2)ξt1√

e−2at1ξt1

√e−2at2ξt2

= e−a(t2−t1)

√e−2at1ξt1√e−2at2ξt2

as required.

Page 412: Financial derivatives in theory and practice

388 Exercises and solutions

Solution 4 (i) If P ∼ Q with respect to F , then P ∼ Q with respect to Ft.By Corollary 5.9 applied to the σ-algebras Fs ⊆ Ft we have

EQ[X | Fs] =EP[ρtX | Fs]EP[ρt | Fs]

=EP[ρtX | Fs]

ρs.

(ii) Suppose ρM is an (Ft, P) martingale. To show that M is an (Ft, Q)martingale we must check that conditions (M.i)–(M.iii) of Definition 3.1 aresatisfied. Adaptedness of M follows from that of ρ and ρM . Since ρM isintegrable with respect to P (property (M.i) applied to the martingale ρM)we have

EQ

[|Mt|

]= EP

[|ρtMt|

]< ∞ .

Finally, using part (i) and the martingale property of ρM , we have

EQ[Mt | Fs] = ρ−1s EP[ρtMt | Fs] = Ms .

The converse implication can be proved similarly.

Solution 5 (i) Write Xt = Xt/A(1)t and A

(2)t = A

(2)t /A

(1)t .

An application of Ito’s formula shows that

dA(2)t = (µ − r)A(2)

t dt + σA(2)t dWt (20.2)

and that

Xt = α(1)t + α

(2)t A

(2)t = X0 +

∫ t

0α(2)

u dA(2)u . (20.3)

By Girsanov’s theoremWt = Wt − φt

is an(FW

t , Q)

Brownian motion. Substituting for dWt in (20.2), we have

dA(2)t = σA

(2)t dWt

and so

Xt = X0 +∫ t

0α(2)

u σA(2)u dWu

and thus X is a (positive) local martingale under Q. Since FW0 is trivial we

can use the result of Exercise 1 to conclude that X is a supermartingale. Inparticular, noting that A

(1)0 = e−rT, we have

X0 ≥ EQ[XT]

and so X0 ≥ V0 as required.

Page 413: Financial derivatives in theory and practice

Exercises and solutions 389

(ii) Consider the(FW

t , Q)

martingale

Mt = EQ

[VT | FW

t

].

It follows from the martingale representation theorem for Brownian motion(Theorem 5.49) that there exists some FW

t predictable H ∈ Π(W ) such that

Mt := EQ[VT] +∫ t

0Hu dWu .

From (i), dWt = σ−1(A

(2)t

)−1dA

(2)t and so

Mt = EQ[VT] +∫ t

0Huσ

−1(A(2)u

)−1dA(2)

u . (20.4)

Suppose we set Xt = A(1)t Mt. We can now use (20.3) and (20.4) to form the

required(α

(1)t , α

(2)t

). Take

α(2)t = σ−1(A(2)

t

)−1Ht.

Then by (20.3) we see that we need to choose

α(1)t = Xt − α

(2)t A

(2)t = Mt − α

(2)t A

(2)t .

It follows by Ito’s formula that dXt = α(1)t dA

(1)t + α

(2)t dA

(2)t and further, since

VT is FWT -measurable and A

(1)T = 1, we have XT = A

(1)T MT = VT, as required.

Since Mt is the conditional expectation of a positive random variable wehave Mt ≥ 0. Noting that A

(1)t = e−r(T−t), it follows that Xt ≥ 0 for t ∈ [0, T ].

Finally, observe X0 = A(1)0 M0 = e−rT EQ[VT] = V0.

Solution 6 (i) Using Wald’s martingale (Example 3.3) define

dQ

dP

∣∣∣∣FW

t

:= exp(φWt − 1

2φ2t), t ≤ T,

where φ = σ−1(r − µ). By Girsanov’s theorem

dSDt = σSD

t dWt,

where Wt := Wt − φt is an(FW

t , Q)

Brownian motion. Using Dolean’sexponential (Example 4.36) we have

SDt = SD

0 exp(σWt − 1

2σ2t)

= SD0 exp

(σWt − 1

2σ2t − (r − µ)t

).

Page 414: Financial derivatives in theory and practice

390 Exercises and solutions

(ii) Substituting for SDt , we have

SDt exp

(φWt − 1

2φ2t)

= SD0 exp

((σ + φ)Wt − 1

2 (σ + φ)2t)

which is of the form of Wald’s martingale with λ = φ + σ.

(iii) Observe that

ρt :=dQ

dP

∣∣∣∣FW

t

= EP

[(dQ

dP

∣∣∣∣FW

T

)∣∣∣∣FWt

]= exp

(φWt − 1

2φ2t).

By Lemma 5.19(i), since ρSD is an(FW

t , P)

martingale, SD is an(FW

t , Q)

martingale. Note this result is also immediate from the working for (i).

Finally, using the fact that SD is an (FWt , Q) martingale, define

ρt :=dQ

dQ

∣∣∣∣FW

t

:= S−10 EQ

[e−rTST | FW

t

]=

e−rTSDt

S0.

Again we can apply Lemma 5.19(i) to deduce that, since DSt ρt = S−1

0 e−rT is aconstant (hence an

(FW

t , Q)

martingale), it follows that DSt is an

(FW

t , Q)

martingale and we are done.

Alternatively, a direct proof of the identity follows easily using Exercise 4(i).

(iv) WriteVT = (ST − K)+ = ST1ST>K − K1ST>K.

Then, applying the martingale valuation formula, equation (7.15), usingnumeraire pairs (N 1, N1) and (N 2, N2), we have

V0 = N 10 EN1

[ST1ST>K

N 1T

]− N 2

0 EN2

[K1ST>K

N 2T

].

From our working in (iii) it is clear we can take (N 1, N1) = (S, Q) and(N 2, N2) = (D, Q). Thus

V0 = S0EQ

[1ST>K

]− e−rTKEQ

[1ST>K

]as required.

Solution 7 (i) Define Yt := Nt/Dt and observe that

d

(Nt

Dt

)= d

(Dt + St

Dt

)= dSD

t = (µ − r)SDt + σSD

t dWt.

Page 415: Financial derivatives in theory and practice

Exercises and solutions 391

Thus

dDNt = dY −1

t = −Y −2t dYt + Y −3

t (dYt)2

= −(DN

t

)2((µ − r)SDt dt + σSD

t dWt

)+(DN

t

)3σ2(SD

t

)2dt

=(−(µ − r) + σ2SN

t

)DN

t SNt dt − DN

t SNt σdWt

= −σDNt SN

t

(dWt −

−(µ − r) + σ2SNt

σdt

). (20.5)

Noting that SNt = 1 − DN

t , we now also have dSNt = −dDN

t .

Since N is equivalent to P with respect to FWT , by Girsanov’s theorem

(Theorem 5.24) there exists an FWt -predictable process C such that

dN

dP

∣∣∣∣FW

t

= exp(∫ t

0CudWu − 1

2

∫ t

0C2

u du

)(20.6)

and

Wt = Wt −∫ t

0Cu du

is an FWt Brownian motion under N.

Further, since (N, N) is a numeraire pair, we must have, from (20.5), that

Ct =

(−(µ − r) + σ2SN

t

)DN

t SNt

σDNt SN

t

= σSNt − σ−1(µ − r).

It then follows that

dANt = d

(DN

t

SNt

)=

(−σDN

t SNt

σDNt SN

t

)dWt.

Finally, we have

dNt = dSt + dDt

= (µSt + rDt)dt + σStdWt

= (µSt + rDt + σStCt)dt + σStdWt

=(

rNt + σ2 S2t

Nt

)dt + σStdWt.

Page 416: Financial derivatives in theory and practice

392 Exercises and solutions

(ii) Substituting for Ct in equation (20.6), we have

dN

dP

∣∣∣∣FW

t

= exp(∫ t

0σSN

u dWu − 12

∫ t

0

(σ2(SN

u

)2 + 2σφSNu

)du

)exp

(φWt − 1

2φ2t).

It remains to show that this first exponential is justNt/N0

Dt/D0.

Let Q be the equivalent martingale measure corresponding to numeraire D.Recall that

dQ

dP

∣∣∣∣FW

t

= exp(φWt − 1

2φ2t),

where φ = σ−1(r − µ), and that, under Q, ρt :=Nt/N0

Dt/D0is a martingale with

dρt = d

(Nt/N0

Dt/D0

)=

D0

N0dSD

t =D0

N0σSD

t dW ∗t ,

where W ∗t = Wt − φt is a Brownian motion under Q.

Using ρ as a Radon–Nikodym derivative, it follows from Lemma 5.17 that

Nt/N0

Dt/D0= exp

(∫ t

0

Du

NuσSD

u dW ∗u − 1

2

∫ t

0

(Du

NuσSD

u

)2

du

)= exp

(∫ t

0σSN

u dWu −∫ t

0σφSN

u du − 12

∫ t

0σ2(SN

u

)2du

)as required.

Remark: The identity in part (ii) of the question also follows by solving theSDE for DN

t found in part (i).

Solution 8 (i) Observe

dDSt = S−1

t dDt + DtdS−1t + dDtdS−1

t

and

dS−1t = −S−2

t dSt + S−3t (dSt)2

=(σ2

t − µ)S−1

t dt − σtS−1t dWt,

whencedDS

t =(σ2

t + (r − µ))DS

t dt − σtDSt dWt . (20.7)

Page 417: Financial derivatives in theory and practice

Exercises and solutions 393

(ii) If QS is an EMM corresponding to numeraire S then, under QS, DS is amartingale. Using the SDE for DS from part (i), Girsanov’s theorem (Theorem5.24) suggests we take

ρt :=dQS

dP

∣∣∣∣FW

t

= exp(∫ t

0

σ2u + (r − µ)

σudWu − 1

2

∫ t

0

(σ2

u + (r − µ)σu

)2

du

).

(20.8)Since σt is bounded above and below by a positive constant, clearly

E

[exp

(12

∫ t

0

(σ2u + (r − µ)

σu

)2du

)]< ect < ∞

for some constant c, and so, by Novikov (Theorem 5.16), ρ is an (FWt , P)

martingale with ρ0 = 1. Thus equation (20.8) with t = T does indeed definea measure QS ∼ P on FW

T .

Girsanov’s theorem tells us that

Wt = Wt −∫ t

0

(σ2

u + (r − µ)σu

)du (20.9)

is an(FW

t , QS)

Brownian motion and, substituting (20.9) into (20.7),

dDSt = −σtD

St dWt .

Using Dolean’s exponential, this SDE has the unique solution

DSt = e−rT exp

(−∫ t

0σudWu − 1

2

∫ t

0σ2

udu

)(20.10)

and it follows by Novikov that DS is an(FW

t , QS)

martingale. Since SS ≡ 1is trivially a martingale we have established that QS is an EMM correspondingto the numeraire S.

To show that QS is unique, let Q∗ be some other EMM for the numeraireS. Then by Girsanov’s theorem for Brownian motion (Theorem 5.24) thereexists an FW

t -predictable process γ such that

W ∗t = Wt −

∫ t

0γu du

is an(FW

t , Q∗) Brownian motion. But then

dDSt = −σtD

St (dW ∗

t + γt dt).

Page 418: Financial derivatives in theory and practice

394 Exercises and solutions

For DS to be a martingale under Q∗ we must have∫ t

0 σuDSuγu du = 0 for all

0 ≤ t ≤ T . Noting that σt > 0 and DSt > 0 for 0 ≤ t ≤ T , it follows that γ ≡ 0.

Thus by Girsanov’s theorem,

dQ∗

dQS

∣∣∣∣FW

T

= 1

and hence Q∗ = QS on FWT .

(iii) Substituting for W in (20.10) yields

DSt = e−rT exp

(−∫ t

0σu dWu + 1

2

∫ t

0σ2

u du + (r − µ)t)

.

Solution 9 (i) Observe that

mST = min

mS

t , minu∈[t,T]

Su

.

From Solution 6(i) we have that

St = S0 exp(σWt + (µ − 1

2σ2)t)

and so, for u > t, we may write

Su = St exp(−(Xu − Xt)

).

Noting thatminu∈[t,T]

Su = Ste−M X

t,T ,

the result follows.

(ii) Working with the numeraire pair (D, Q), we have

Vt = DtEQ

[ST − mS

T | FWt

]= DtS

Dt − DtEQ

[min

mS

t , Ste−MX

t,T

∣∣∣FWt

].

Now set τ = T − t and MXτ := maxu∈[0,τ] Xu. Noting that both mS

t and St

are FWt -measurable random variables and that the random variable MX

t,T isindependent of FW

t , we have

EQ

[min

mS

t , Ste−M X

t,T

∣∣∣FWt

]= L

(St, m

St

),

Page 419: Financial derivatives in theory and practice

Exercises and solutions 395

whereL(s, m) := EQ

[min

m, se−MX

τ

]for s ≥ m > 0. It remains to calculate a closed-form expression for L(s, m).Recall that, under Q, Xt = −σWt +

( 12σ

2 − r)t where W is a one-dimensional

Brownian motion with respect to(FW

t , Q). Define a measure Q∗ ∼ Q on

FWT via

dQ∗

dQ

∣∣∣∣FW

T

:= exp(− r2

σWT −

12

r22

σ2 T

),

where r2 = r − 12σ

2. Then, by Girsanov’s theorem,

W ∗t := Wt +

r2

σt

is an(FW

t , Q∗) Brownian motion and Xt = −σW ∗t .

By symmetry

Q∗(− W ∗τ ∈ dx, MX/σ

τ ∈ da)

= Q∗(W ∗τ ∈ dx, MW∗

τ ∈ da)

= f(x, a) dx da

where, using Example 2.17,

f(x, a) =

√2(2a − x)√

πτ 3exp

(− (2a − x)2

)dx da if max0, x ≤ a,

0 otherwise.

Thus

L(s, m) − m = EQ

[(se−M X

τ − m)1M X

τ ≥− log(ms )

]= EQ∗

[dQ

dQ∗

∣∣∣∣FW

τ

(se−MX

τ − m)1MX

τ ≥− log(ms )

]= exp

(− 1

2r22

σ2 τ)EQ∗

[e

r2σ W∗

τ(se−σM

X/στ − m

)1

MX/στ ≥− 1

σ (log ms )]

= exp(− 1

2r22

σ2 τ)

×∫ ∞

0

∫ a

−∞e−

r2σ x(se−σa − m)1

a≥− 1σ log(m

s )f(x, a) dx da.

Evaluating the double integral (which is a little tedious but straightforward)and substituting the resulting expression for L(s, m) leads to the requiredresult.

Page 420: Financial derivatives in theory and practice

396 Exercises and solutions

Solution 10 (i) In order to be able to apply Girsanov’s theorem, we firstdefine a Brownian SDE for the S(2)-rebased assets. An application of Ito’sformula yields

dXt = S(1)t d

(S

(2)t

)−1 +(S

(2)t

)−1dS

(1)t + dS

(1)t d

(S

(2)t

)−1

= Xt

(dM

(1)t − dM

(2)t

)+ Xt(µ1 − µ2 + ξ22 − ξ12) dt.

Similarly, setting Ut := Dt/S(2)t , we find that

dUt := −UtdM(2)t + Ut(r − µ2 + ξ22) dt.

By Levy’s theorem we have(M (1)

M (2)

)=

( √ξ11 0

ρ√

ξ22√

ξ22(1 − ρ)2

)(W

(1)t

W(2)t

),

where W ′ = (W (1), W (2)) is an FWt Brownian motion, and it then follows

that

d

(Ut

Xt

)=

(Ut 00 Xt

)(r − µ2 + ξ22

µ1 − µ2 + ξ22 − ξ12

)dt

+

(Ut 00 Xt

)(0 −11 −1

)( √ξ11 0

ρ√

ξ22√

ξ22(1 − ρ2)

)(dW

(1)t

dW(2)t

),

which is our desired SDE.

Observe that FAt = FW

t , where FWt is the natural augmented filtration

generated by W . In order to define an EMM Q corresponding to numeraireS(2), Girsanov’s theorem (Theorem 5.24) suggests we take

Ct := −( √

ξ11 0

ρ√

ξ22√

ξ22(1 − ρ2)

)−1(0 −11 −1

)−1(

r − µ2 + ξ22

µ1 − µ2 + ξ22 − ξ12

)

=:(

c1c2

).

We can define Q via

dQ

dP

∣∣∣∣FW

t

= exp(c1W

(1)t + c2W

(2)t − 1

2

(c21 + c2

2)t),

which is clearly a martingale by Novikov’s condition.

Page 421: Financial derivatives in theory and practice

Exercises and solutions 397

It then follows by Girsanov’s theorem that

d

(Ut

Xt

)=

(Ut 00 Xt

)(−ρ

√ξ22 −

√ξ22(1 − ρ2)

√ξ11 − ρ

√ξ22 −

√ξ22(1 − ρ2)

)d

(W

(1)t

W(2)t

),

where W ′ =(W (1), W (2)

)is a Brownian motion under Q.

To establish that Q is an EMM it remains to check that the solution to theabove SDE is a martingale. Solving for X, we have that

Xt = X0 exp(Yt − 1

2 [Y ]t),

where Yt is defined as in the question and

[Y ]t = (ξ11 − 2ξ12 + ξ22)t =: σ20t.

Again it follows by Novikov’s condition that X is a martingale. That U is amartingale follows similarly.

(ii) Taking (N, N) = (S(2), Q), we have that

V0 = S(2)0 EQ

[(XT − 1)+

].

Further, from (i),

XT ∼ X0e− 1

2σ2

0t+Z

where Z ∼ N(0, σ20t) under Q, and thus the required formula follows immedi-

ately from the Black–Scholes formula.

Solution 11 (i) By numeraire invariance (Theorem 7.13) the trading strat-egy φ′ = (φ(1), φ(2)) is self-financing with respect to A if and only if it isself-financing with respect to AS. Thus we seek φ satisfying

X

ST= φ

(1)0 DS

0 + φ(2)0 +

∫ T

0φ(1)

u dDSu = φ

(1)T DS

T + φ(2)T .

By Solution 8(ii), equation (20.9), it follows that Wt := Wt − σ−1(σ2 + r − µ)tis a Brownian motion under QS. Further, note that FW

t = FWt .

Define

Mt := EQS

[X

ST

∣∣∣∣FWt

], 0 ≤ t ≤ T.

Page 422: Financial derivatives in theory and practice

398 Exercises and solutions

Then M is an(FW

t , QS)

martingale and by Theorem 5.49 we can find someFW

t -predictable H such that

Mt = M0 +∫ t

0Hu dWu .

In particular, taking t = T ,

X

ST= EQS

[X

ST

]+∫ T

0Hu dWu .

Noting that under QS (see Exercise 8(ii))

dWt = −σ−1(DSt

)−1dDS

t ,

we have

Mt = EQS

[X

ST

]−∫ T

0Huσ

−1(DSu

)−1dDS

u. (20.11)

Using (20.11), we find φ satisfying

Mt = EQS

[X

ST

]+∫ t

0φ(1)

u dDSu +

∫ t

0φ(2)

u dSSu

= φ(1)t DS

t + φ(2)t .

Takeφ

(1)t = −Htσ

−1(DSt

)−1

andφ

(2)t = Mt − φ

(1)t DS

t .

Note that for replication it does not matter what φ(2)t is taken to be. We must

make this choice for the self-financing property to hold.

(ii) Let (N, N) be some other numeraire pair. Then, in particular, St/Nt is amartingale under N. Use this fact to form an equivalent measure Q via

dQ

dN

∣∣∣∣FW

t

=St/S0

Nt/N0=: ρt.

We show that Q = QS, which establishes the result. Noting that Dt/Nt is an(FW

t , N)

martingale, it follows from part (i) of Lemma 5.19 that

Dt

Nt· Nt

St=

Dt

St

Page 423: Financial derivatives in theory and practice

Exercises and solutions 399

is an(FW

t , Q)

martingale. Trivially SS is also a(FW

t , Q)

martingale andso Q is an EMM for the numeraire S. But the measure for which DS is amartingale with respect to FW

t is unique. Thus Q = Q on FWT and so

dQS

dN

∣∣∣∣FW

t

= ρt

as required.

(iii) Recall that an FWt -predictable process φ is admissible for the

Black–Scholes economy if it is self-financing and if, for all numeraire pairs(N, N), the numeraire-rebased gain process Gφ/N is an

(FW

t , N)

martingale(see Definition 7.25).

Using the result of Exercise 4(i), observe that, for any numeraire pair (N, N),for 0 ≤ u ≤ t we have

EN

[φt · AN

t

∣∣FWu

]= EQS

[φt ·

At

Nt· Nt

St

∣∣∣∣FWu

]Su

Nu

=Su

Nuφu · AS

u

= φu · ANu .

The second equality above follows since, by part (i), Mt = φt · ASt is a

martingale. That φt · ANt is adapted is clear, and integrability follows from

that of φt · ASt under QS. Thus Gφ

t /Nt = φt · ANt − φ0 · AN

0 is an(FW

t , N)

martingale as required.

We have shown that the Black–Scholes economy is complete for the set

S ′ =

X : X is an FWT -measurable random variable with EQS

[∣∣∣∣XST

∣∣∣∣] < ∞

.

Observe by (ii) that

EQ

[|X|]

= EQS

[|X| dQ

dQS

∣∣∣∣FW

T

]

=S0

D0EQS

[|X|DT

ST

]

and so S = S ′ and we are done.

Page 424: Financial derivatives in theory and practice

400 Exercises and solutions

Solution 12 (i) Set Zt = N−1t

dNdP

∣∣∣FA

t

. Using the result of Exercise 4(i), it is

straightforward to check that Z is a pricing kernel (see the proof of part (i)of Theorem 7.48). By equation (7.15) and Exercise 4(i) we have

Vt = NtEN

[VT/NT | FA

t

]= Nt

EP

[VT/NT

dNdP

∣∣FA

T

∣∣FAt

](dNdP

∣∣FA

t

)= Z−1

t EP

[VTZT | FA

t

]as required.

(ii) Using part (i) above, we can set

Zt =(S

(2)t

)−1 dQ

dP

∣∣∣∣FA

t

.

The working in Solution 10 leads to an explicit expression for this pricingkernel, as we now show.

Observe that the SDE for S(i) can be written in exponential form

dS(i)t = S

(i)t dX

(i)t

with X(i)t = µit + M

(i)t and, by Example 4.36, this has the unique solution

S(i)0 E

(X(i)

)t. In particular, we have

S(2)t = S

(2)0 exp

(M

(2)t + (µ2 − 1

2ξ22)t). (20.12)

Set

R :=( √

ξ11 0ρ√

ξ22√

ξ22(1 − ρ2)

)and recall from Solution 10 that

M = RW , (20.13)

where M ′ =(M (1), M (2)

)and W ′ =

(W (1), W (2)

). Let Q be an EMM corre-

sponding to numeraire S(2). Then, again using Solution 10, we have that

dQ

dP

∣∣∣∣FA

t

= exp(CTR−1Mt − 1

2 |C|2t)

(20.14)

Page 425: Financial derivatives in theory and practice

Exercises and solutions 401

where C ′ = (c1, c2), the components of the vector C being defined as inSolution 10.

Finally, combining (20.12) and (20.14), we have

Zt =(S

(2)0

)−1 exp(CR−1Mt − M(2)t −

( 12 |C|2 + µ2 − 1

2ξ22)t).

To establish completeness, first observe that the SDE for A′ = (D, S(1), S(2))can, using (20.13), be written in the form (7.1). That pathwise uniquenessand hence weak uniqueness holds for this SDE follows from the uniqueness ofthe solution stated above. Since the economy has a finite variation asset D, itthen follows immediately from Theorem 7.43 that the economy is complete.

By Theorem 7.41 and part (iii) of Theorem 7.48, the pricing kernel Z isunique, and so if (M, M) is some other numeraire pair we have

M−1t

dM

dP

∣∣∣∣FA

t

= Zt.

Thus, taking (M, M) = (D, QD), we have

dQD

dP

∣∣∣∣FA

t

= DtZt

where Zt is as given above.

Solution 13 (i) Since A and B are absolutely continuous and decreasing tozero, we may write

At =∫ t

0au du + A0, Bt =

∫ t

0bu du + B0

where A∞ = B∞ = 0.

To see that a rational log-normal model is in (FH), define

MtS := −(aS + bSMt).

Then ∫ ∞

t

MtSdS = −∫ ∞

t

(aS + bSMt) dS

= At − A∞ + (Bt − B∞)Mt

= At + BtMT

Page 426: Financial derivatives in theory and practice

402 Exercises and solutions

and

DtT =

∫∞T MtS dS∫∞t MtS dS

as required.

(ii) From Theorem 8.25 it follows that for a model in (FH) we can take aspricing kernel

Zt :=∫ ∞

t

MtS dS.

Thus for a rational log-normal model we have the valuation formula

V0 = Z−10 EZ[ZTVT]

= (A0 + B0)−1EZ

[(AT + BTMT)VT

].

Noting that the value of the swaption at T is

VT =(

1 − DTSn− K

n∑j=1

αjDTSj

)+

where αj = Sj − Sj−1, and that under the log-normal model

DTSj=

ASj+ BSj

MT

AT + BTMT,

we have that the time-zero value of the swaption under the log-normal modelis given by

V0 = (A0 + B0)−1EZ

[[(AT −

(ASn

+ Kn∑

j=1

αjASj

))

+(

BT −(BSn

+ K

n∑j=1

αjBSj

))MT

]+

].

Assuming MT is of the form given, this reduces to

V0 = (A0 + B0)−1[(

BT −(BSn

+ K

n∑j=1

αjBSj

))N(d1)

+(

AT −(ASn

+ K

n∑j=1

αjASj

))N(d2)

],

Page 427: Financial derivatives in theory and practice

Exercises and solutions 403

where

d1 =

log

(BT − (BSn

+ K∑n

j=1 αjBSj)

AT − (ASn+ K

∑nj=1 αjASj

)

)σ√

T+ 1

2σ√

T

and

d2 =

log

(BT − (BSn

+ K∑n

j=1 αjBSj)

AT − (ASn+ K

∑nj=1 αjASj

)

)σ√

T− 1

2σ√

T .

Solution 14 (i) From the definition of an (SR) model it follows immediatelythat the risk-neutral measure Q is an EMM corresponding to the numeraireexp(

∫ t

0 ru du). The first valuation formula is just the martingale valuationformula, equation (7.15), using this numeraire.For 0 ≤ t ≤ T define a measure F equivalent to Q via

dF

dQ

∣∣∣∣FA

t

= exp(−∫ t

0rudu

)DtT

D0T= D−1

0T EQ

[exp

(−∫ T

0rudu

) ∣∣∣∣FAt

].

(20.15)Setting ρt := dQ

dF

∣∣∣FA

t

and using the result of Exercise 4, we have

Vt = ρ−1t EF

[exp

(−∫ T

t

rudu

)VTρT

∣∣∣∣FAt

]= exp

(−∫ t

0rudu

)DtT EF

[VT exp

(∫ t

0rudu

)∣∣∣∣FAt

]= DtT EF

[VT | FA

t

].

Note that D·S/D·T is a martingale under F and thus we have formed theRadon–Nikodym derivative from the ratio of numeraires.

(ii) For the Ho–Lee model observe that

∫ T

0rudu = r0T +

∫ T

0

∫ u

0θs ds du +

∫ T

0σWu du. (20.16)

Applying Ito’s formula to tWt and integrating, we obtain

TWT =∫ T

0Wu du +

∫ T

0udWu.

Page 428: Financial derivatives in theory and practice

404 Exercises and solutions

Substituting into (20.16) and interchanging the order of integration, we have∫ T

0ru du = r0T +

∫ T

0θs(T − s) ds +

∫ T

0(T − u)σdWu. (20.17)

Now, from Exercise 2 it follows that, under Q,∫ T

0 (T − u)σ dWu has a normaldistribution with mean zero and variance

∫ T

0 σ2(T − u)2du = 13σ

2T 3. Thence,if the model is calibrated to the initial discount curve, we have

D0T = EQ

[exp

(−∫ T

0rudu

)]= exp

(− r0T −

∫ T

0θs(T − s)ds + 1

6σ2T 3

). (20.18)

Taking logarithms and differentiating twice with respect to T yields therequired equation for θt.

Next observe from (20.15), (20.17) and (20.18) that

dF

dQ

∣∣∣∣FA

T

= exp(− 1

6σ2T 3 −

∫ T

0(T − u)σ dWu

).

It then follows from Girsanov’s theorem (check!) that, under F,

Wt := Wt −∫ t

0−(T − u)σ du (20.19)

is a Brownian motion. Substituting (20.19) into the SDE for r given in thequestion yields the required result.

Solution 15 (i) Using the numeraire D2·T, the corresponding EMM Q2 and

noting thatdQ2

dQ1

∣∣∣∣Ft

=M 12

tT

M 120T

,

we have

V 230 (K) = D2

0TEQ2

[(X23

T − K)+]

= D20TEQ1

[X12

T D10T

X120 D2

0T

(X23

T − K)+

]= X21

0 D10TEQ1

[X12

T

(X23

T − K)+

].

Noting that X12T = X13

T /X23T yields the required result.

Page 429: Financial derivatives in theory and practice

Exercises and solutions 405

(ii) Observe that

V 210 = D2

0TEQ2

[(X21

T − K)+

]= X21

0 D10TEQ1

[X12

T

(X21

T − K)+

]= KX21

0 D10TEQ1

[(K−1 − X12

T

)+

].

Thus from the prices of call options on X21T we can recover the prices of put

options on X12T and hence, via put–call parity, the prices of call options on X12

T .This means we can recover the implied marginal distribution of X12

T under Q1.Similarly, we can recover the implied marginal distribution of X13

T under Q1.However, no information concerning the dependence structure between X12

T

and X13T can be inferred from these prices.

(iii) Noting that the unique solution to the SDE for M 12tT is

M 12tT = M 12

0T exp(∫ t

0σ(1)eaudW (1)

u − 12

(σ(1))2 ∫ t

0e2audu

),

we havedQ2

dQ1

∣∣∣∣Ft

= exp(∫ t

0Cu · dW ∗

u − 12

∫ t

0|Cu|2 du

),

where

Ct = σ(1)eat

(√1 − ρ2

ρ

).

It then follows by Girsanov’s theorem that

W ∗t :=

(W

∗(1)t

W∗(2)t

)= W ∗

t −(∫ t

0σ(1)eau du

)(√1 − ρ2

ρ

)(20.20)

is an Ft Brownian motion under Q2.

We now find the SDE for M 23·T under Q2. By integration by parts,

dM 23tT = d

(M 13

tT

M 12tT

)= M 13

tT d(M 12

tT

)−1 +(M 12

tT

)−1dM 13

tT + dM 13tT d(M 12

tT

)−1.

By Ito’s formula,

d(M 12

tT

)−1 = −σ(1)eat(M 12

tT

)−1dW

(1)t +

(σ(1))2e2at

(M 12

tT

)−1dt

Page 430: Financial derivatives in theory and practice

406 Exercises and solutions

and thus

dM 23tT = M 23

tT

(e2at((

σ(1))2 − σ(1)σ(2)ρ)

dt + eat(σ(2)dW

(2)t − σ(1)dW

(1)t

)).

Substituting for dW(1)t and dW

(2)t and using (20.20), we obtain

dM 23tT = M 23

tT dZt,

wheredZt :=

(σ(2) − σ(1)ρ

)eatdW

∗(2)t − eatσ(1)

√1 − ρ2 dW

∗(1)t .

Observe thatd[Z]t =

(3)t

)2dt.

SettingdWt :=

(3)t

)−1dZt,

it follows by Levy’s theorem that W is an(Ft, Q2

)Brownian motion. It can

now easily be seen that M 23·T satisfies the given SDE.

Finally, noting that

X23T = M 23

TT = M 230T exp

(∫ T

(3)t dWt − 1

2

∫ T

0

(3)t

)2dt

),

routine calculation yields

V 230 (K) = X23

0 D30TN(d1) − KD2

0TN(d2),

where

d1 :=log(D3

0TX230 /D2

0TK)σ0

+ 12 σ0,

d2 :=log(D3

0TX230 /D2

0TK)σ0

− 12 σ0,

and

σ20 :=

e2at − 12a

((σ(1))2 − 2ρσ(1)σ(2) +

(σ(2))2).

Solution 16 (i) Observe that

Lit := Lt[Ti, Si] = α−1

i

DtTi

DtSi

− α−1i

Page 431: Financial derivatives in theory and practice

Exercises and solutions 407

will be a martingale under Fi. From (17.5), for the Hull–White model we havethat the value at time t of a pure discount bond is of the form

DtT = AtTe−BtTrt ,

where here

BtT = eat

∫ T

t

e−audu =1a

(1 − e−a(T−t)).

ThencedLi

t = α−1i d

[AtTi

AtSi

exp((BtSi

− BtTi)rt

)].

Applying Ito’s formula to the C1,2 function Li(t, rt) yields

dLit =

∂tLi(t, rt) dt +

∂rtLi(t, rt) drt + 1

2∂2

∂r2t

Li(t, rt) (drt)2.

Working in the measure Fi, we can ignore finite variation terms (which mustbe zero since Li is a martingale) and so

dLit = α−1

i

AtTi

AtSi

(BtSi− BtTi

)e(BtSi−BtTi

)rtσdW it

= α−1i (BtSi

− BtTi)DtTi

DtSi

σdW it .

The result now follows by noting that

BtSi− BtTi

=1a

[e−a(Ti−t) − e−a(Si−t)] =

eat

a

(e−aTi − e−aSi

).

(ii) The caplet which pays at Si has value V iSi

(K) = αi(LiTi− K)+ at Si and

value V iTi

(K) = DTiSiV i

Si(K) at Ti. Recalling

DTiSi=

11 + αiLi

Ti

,

it easily follows that

V iTi

(K) = αiDTiSi(Li

Ti− K)+

= (1 − DTiSi− αiKDTiSi

)+

= (1 + αiK)(

11 + αiK

− DTiSi

)+.

Page 432: Financial derivatives in theory and practice

408 Exercises and solutions

Thus a caplet having payment date Si has the same value at time Ti as(1 + αiK) times the value of a put option on the discount bond DTiSi

withstrike (1 + αiK)−1. The required result now follows by summing over thevalues of the individual caplets at time t < T1.

From the discussion and the martingale valuation formula, equation (7.15), italso follows that

V it (K) = KV i

t

(α−1

i K−1 − α−1i

)= αiDtSi

KEFi

[(Xi

Ti− (αiK)−1)

+

∣∣∣Ft

],

where Xit = Li

t + α−1i and Ft is the augmented natural filtration generated

by W. (Note that here FAt = Ft.)

For the Hull–White model in (i) we have

dXit = σi

tXitdW i

t ,

where

σit = σeat

(e−aTi − e−aSi

a

)and W is a Brownian motion under Fi. Thus

Xit = Xi

0 exp(∫ t

0σi

udW iu − 1

2

∫ t

0(σi

u)2du

)and so, for t < Ti,

log XiTi

= log Xit − 1

2

∫ Ti

t

(σiu)

2du +∫ Ti

t

σiudW i

u.

Set

Σt :=(∫ Ti

t

(σiu)

2du

)1/2

=1 − e−a(Si−Ti)

a

√σ2

2a

(1 − e−2a(Ti−t)

).

Noting that, conditional on Ft, log XiTi

has a N(log Xit − 1

2Σ2t , Σ2

t ) distribution,it follows by explicit calculation that

V it (K) = DtTi

KN(d1) − DtSiN(d2),

where

d1 =log(DtTi

K/DtSi)

Σt+ 1

2Σt

and

d2 =log(DtTi

K/DtSi)

Σt− 1

2Σt .

Page 433: Financial derivatives in theory and practice

Exercises and solutions 409

Solution 17 (i) Recall from Chapter 11 that the time-zero value of a payersswaption with strike K is given by

V0(K) = P0ES

[(yT − K)+

],

where S denotes an EMM corresponding to numeraire P.

Differentiating both sides with respect to K and appealing to Fubini’s theoremyields

∂V0(K)∂K

= P0∂

∂KES

[(yT − K)+

]= P0ES

[− 1yT>K

]= −P0S(yT > K).

Assuming V0(K) is given by Black’s formula, we also have

∂V0(K)∂K

= P0

(−y0

σ0√

TKφ(d1) − N(d2) +

1σ0√

Tφ(d2)

)= −P0N(d2),

where φ(x) denotes the standard normal density.

Thus S(yT > K) = N(d2) and so

S(yT ≤ K) = N(−d2) = N

(log(K/y0)

σ0√

T+ 1

2 σ0√

T

).

Hence, under S, log yT ∼ N(log y0 − 12 σ

20T, σ2

0T ).

(ii) Let S denote the EMM corresponding to the numeraire P. Recall that

yt =DtT − DtSn

Pt

and so yt must be a martingale under S. We can write

dyt = σtytdWt, (20.21)

where

Wt = Wt −∫ t

0

µ

σudu.

Let Ft denote the augmented natural filtration generated by W. In orderfor W to be a Brownian motion under S, Girsanov’s theorem suggests we take

dS

dP

∣∣∣∣Ft

= exp(∫ t

0

µ

σudWu − 1

2

∫ t

0

µ2

σ2u

du

). (20.22)

Page 434: Financial derivatives in theory and practice

410 Exercises and solutions

This is an (Ft, P) martingale by Novikov since, for all t ≥ 0,

E

[exp

(12

∫ t

0

µ2

σ2u

du

)]< ∞,

and so, by Girsanov’s theorem, equation (20.22) defines a measure with therequired properties.

The SDE (20.21) has a unique solution

yt = y0 exp(∫ t

0σudWu − 1

2

∫ t

0σ2

udu

).

To establish that the model is consistent with Black’s formula given in (i), itis sufficient to prove that ∫ T

0σudWu ∼ N(0, σ2

0T ).

This follows immediately by Exercise 2 and we are done.

Solution 18 (i) Let Fi denote an EMM corresponding to numeraire D·Si.

Then the value of the ith caplet having strike K at time zero is given by

V i0 (K) = D0Si

EFi

[αi(Li

Ti− K)+

].

Differentiating both sides with respect to K yields

∂V i0

∂K= αiD0Si

EFi

[− 1Li

Ti>K

]= −αiD0Si

Fi(LiTi

> K).

Assuming V i0 (K) is given by Black’s formula, as given in the question, we

obtain∂V i

0

∂K= −αiD0Si

N

(log(Li

0/K)σi√

Ti

− 12σ

i√

Ti

).

Equating terms yields

Fi(LiTi≤ K) = N

(log(K/Li

0)σi√

Ti

− 12σ

i√

Ti

)and thus, under Fi,

log LiTi∼ N

(log Li

0 − 12 (σ

i)2Ti, (σi)2Ti

).

Page 435: Financial derivatives in theory and practice

Exercises and solutions 411

(ii) We know from Section 18.2 that the µit have been chosen so that the model

is arbitrage-free. If Fi denotes an EMM corresponding to numeraire D·Sithen,

recalling

Lit =

DtTi− DtSi

αiDtSi

,

we see that Li must be a martingale under Fi. Proceeding as in Exercise 15(ii),we can show that

dLit = σiLi

tdWt,

where Wt is a Brownian motion under Fi. The solution to this SDE is

Lit = Li

0 exp(σiWt − 1

2 (σi)2t).

Thus, for each i, under Fi the random variable LiTi

has the same log-normaldistribution as found in (i) and so the model is consistent with Black’s formula.

Remark 1: The LIBOR market model in (ii) above will of course be consistentwith Black’s formula for all 0 < t < Sn.

Remark 2: Setting Dit := DtTi+1/DtSn

, the Radon–Nikodym derivative dFi

dF

∣∣∣Ft

is

given bydFi

dF

∣∣∣∣Ft

=Di

t

Di0

= E(∫ T

0

n∑j=i+1

αjLju

1 + αjLju

σjdWu

),

where Ft denotes the augmented natural filtration generated by W.

Solution 19 Observe that, for i = 1, . . . , n,

D(i−1)t :=

DtTi

DtSn

=n∏

j=i

(1 + αjLjt)

= Dit + αiL

itD

it.

Taking expectations and using the model assumptions, we have

Di0 + αiL

i0D

i0 = Di−1

0

= EF

[D

(i−1)t

]= EF

[Di

t + αiLitD

it

]= Di

0 + αiEF

[Li

tDit

].

Page 436: Financial derivatives in theory and practice

412 Exercises and solutions

For this to hold we require

EF

[Li

tDit

]= Li

0Di0,

which is equivalent to

exp(ηit)EF

[exp(σiWt)Di

t

]= Di

0.

Taking logarithms yields the required result.For the model to be arbitrage-free we require D(i−1) to be a martingale

under F. Noting that, under F,

dLit = σiLi

tdWt + Lit(dηi

t + 12(σ

i)2dt),

we would needLi

t

(dηi

t + 12(σ

i)2dt)

= µitdt,

where µit is as in Exercise 18(ii). But then ηi

t would not be deterministic. Thusthe model described is not arbitrage-free.

The model is easy to implement in practice: write

vi = σi/σn

and observe thatLi

t = (Lnt )

vi

βit ,

where

βit =

Li0 exp(ηi

t)(Ln

0 exp(ηnt ))vi .

In practice we need to find LiTk

, 1 ≤ k ≤ i ≤ n. Observe that

ηnt = − 1

2 (σn)2t.

For the other η’s, work back from i = n inductively: suppose we know ηjTk

,j > i, then we know D

(j−1)Tk

, j > i, so we can find ηiTk

.

Solution 20 (i) Under Fk each D·Ti

D·Sk

is a martingale. Noting that

Lkt =

DtTk− DtSk

αkDtSk

,

we see that Lk is a martingale under Fk and so the drift in the SDE for Lk

must be zero, i.e. µk ≡ 0.

Page 437: Financial derivatives in theory and practice

Exercises and solutions 413

(ii) For 1 ≤ i < k,

D(i−1)t :=

DtTi

DtSk

=k∏

j=i

(1 + αjLjt) (20.23)

and soD

(i−1)t = Di

t + αiLitD

it.

Since D(i−1) and Di are both martingales, so is LiDi.

Nowd(Li

tDit) = Li

tdDit + Di

tdLit + dDi

tdLit

and equating finite variation terms to zero yields

Ditµ

itdt + σiLi

tdDitdW i

t = 0. (20.24)

From (20.23), ignoring finite variation terms which must combine to give zero,

dDit =

k∑j=i+1

∂Dit

∂Ljt

σjLjtdWj

t

= Dit

k∑j=i+1

αjLjt

1 + αjLjt

σjdWjt .

Substituting into (20.24) yields

Ditµ

itdt + Di

t

k∑j=i+1

αjLjt

1 + αjLjt

σjσiLitρijdt = 0

and thus the formula for µit, for 1 ≤ i < k, follows as required.

(iii) For i = k + 2, . . . , n + 1,

D(i−1)t :=

DtTi

DtSk

=i−1∏

j=k+1

DtSj

DtTj

=i−1∏

j=k+1

(1 + αjLjt)

−1 (20.25)

and so

D(i−1)t =

i∏j=k+1

(1 + αjLjt)

−1(1 + αiLit),

i.e. D(i−1)t = Di

t + αiLitD

it as before.

Page 438: Financial derivatives in theory and practice

414 Exercises and solutions

Equation (20.24) follows as before for i = k + 1, . . . , n, but now, from (20.25),we have

dDit =

i−1∑j=k+1

∂Dit

∂Ljt

σjLjtdWj

t

= −Dit

i∑j=k+1

αjLjt

1 + αjLjt

σjdWjt .

The formula for µit, i = k + 1, . . . , n, follows by substituting into (20.24).

Solution 21 Calibrating to vanilla cap prices is equivalent to calibrating tothe inferred prices of digital caplets. For the Hull–White model specified inExercise 16, the value of the ith digital caplet of strike K is given, after alittle calculation, by

V i0 (K) = D0Si

ES i

[1Li

Ti>K

]= D0Si

Φ(d2)

where Φ denotes the standard cumulative normal distribution,

d2 =log(

D0Ti

D0Si

(1 + αiK)−1)

Σ0− 1

2Σ0

and

Σ0 =(

e−aTi − e−aSi

a

)√

varxTi.

Consider the problem of how to specify the Markov-functional model whichcalibrates to these digital caplet prices. The boundary curve for this problemis exactly that of (19.1). The only functional forms needed on the boundaryare DTiTi

(xTi), i = 1, 2, . . . , n, which are trivially the unit map, and DTnSn

(xTn).

To complete the specification of the Markov-functional model it is sufficient tofind the functional forms for the numeraire bond DTiSn

(xTi), i = 1, . . . , n − 1.

(Compare (P.ii′) and (P.iii′) in Section 19.3.)Thus to establish that the Markov-functional model coincides with the

Hull–White model on the grid it is sufficient to show that the functionalforms DTiSn

(xTi), i = 1, . . . , n, agree for the two models.

For the Hull–White model, it follows from equation (17.13) that, fori = 1, . . . , n,

DTiSn(xTi

) =D0Sn

D0Ti

exp(− cixTi

+ 12c

2i var(xTi

)), (20.26)

Page 439: Financial derivatives in theory and practice

Exercises and solutions 415

where

ci :=e−aTi − e−aSn

a.

We now show that the same expression is obtained for the Markov-functionalmodel.

The form DTnSncan be recovered directly from the closed-form expression

available for LnTn

for the Hull–White model. From the working in Solution 16it follows that

LnTn

= (Ln0 + α−1

n ) exp(cnxTn

− 12c

2n var(xTn

))− α−1

n

and so

DTnSn=

11 + αnL

nTn

=D0Sn

D0Tn

exp(− cnxTn

+ 12c

2n var(xTn

))

as required.To determine the remaining functional forms we work back iteratively from

the terminal time Tn. Consider the ith step in this procedure. Assume thatDTjSn

, j = i + 1, . . . , n, have already been determined. We can also assumethat

DTiSi(xTi

)DTiSn

(xTi)

=D0Si

D0Sn

exp(ci+1xTi

− 12c

2i+1 var(xTi

)),

having been determined via (19.2) and the known (conditional) distributionof xTi+1 given xTi

.For some x∗ ∈ R, define

Ji0(x

∗) := D0SnESn

[DTiSi

(xTi)

DTiSn(xTi

)1xTi

>x∗

]= D0Si

exp( 1

2c2i+1 var(xTi

))ESn

[exp(ci+1xTi

)1xTi>x∗

]= D0Si

Φ(ci+1 var(xTi

) − x∗√varxTi

).

From (19.5) we have

DTiSn=(

DTiSi

DTiSn

)−1

(1 + αiLiTi

)−1,

and so to determine DTiSnit is sufficient to find Li

Ti(x∗). From (19.6) and

(19.8),Li

Ti(x∗) = Ki(x∗)

Page 440: Financial derivatives in theory and practice

416 Exercises and solutions

where Ki(x∗) solvesJi

0(x∗) = V i

0(Ki(x∗)

).

It follows that

(1 + αiL

iTi

(x∗))−1 =

D0Si

D0Ti

exp(

Σ0

(ci+1 var(xTi) − x∗√

var(xTi)

)+ 1

2Σ20

)and thus, after some algebra, we obtain equation (20.26) and we are done.

Page 441: Financial derivatives in theory and practice

Appendix 1

The Usual Conditions

When working with continuous time processes, it is common to work on afiltered probability space (Ω, Ft,F , P) which satisfies the usual conditions.We do this throughout most of this book and it is needed, for example, toensure that the stochastic integral (defined in Chapter 4) is adapted. Here webriefly describe these conditions and how to augment a probability space tomake them hold.

A filtered probability space (Ω, Ft,F , P) is said to satisfy the usualconditions if the following three conditions hold:(i) F is P-complete: if B ⊂ A ∈ F and P(A) = 0 then B ∈ F .(ii) F0 contains all P-null sets.(iii) Ft : t ≥ 0 is right-continuous: for all t ≥ 0,

Ft = Ft+ :=⋂

s>t

Fs.

Let (Ω,Fo, P) be a probability triple. The P-completion (Ω,F , P) of(Ω,Fo, P) is the probability triple defined by

F = σ(Fo ∪ N ),

whereN := A ⊂ Ω : A ⊆ B for some B ∈ F with P(B) = 0.

The usual P-augmentation (Ω, Ft,F , P) of the filtered probability space(Ω, Fo

t ,Fo, P) is produced by taking F to be the P-completion of Fo andby defining

Ft := σ(Fot+ ∪N ) =

s>t

σ(Fos ∪ N )

for all t ≥ 0. Clearly the usual P-augmentation satisfies the usual conditions.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 442: Financial derivatives in theory and practice
Page 443: Financial derivatives in theory and practice

Appendix 2

L2 Spaces

The study of L2 spaces, and more generally Lp spaces for p ≥ 1, is a standardpart of functional analysis covered by many books. See, for example, thetreatments of Priestley (1997) and Rudin (1987). Here we present a few of thebasic facts used in this book. We omit all proofs, which can be found in theaforementioned references.

Definition A.1 Let (S, Σ) be a measurable space and let µ be a finitemeasure on (S, Σ). Then, for p ≥ 1,Lp(S, Σ, µ) (or just Lp when the contextis clear) is the space of Σ-measurable functions f : S → R such that

∫S

|f | dµ < ∞.

Define an equivalence relation on Lp by

f ∼ g if and only if f = g, µ almost everywhere,

and denote by [f ] the equivalence class containing the function f. The spaceLp(S, Σ, µ) (abbreviated to Lp) is defined to be the space of equivalence classesof Lp,

Lp(S, Σ, µ) = [f ] : f ∈ Lp(S, Σ, µ).

Notationally it is common practice to drop the [·] notation to denote anequivalence class and write, for example, f ∈ Lp rather than [f ] ∈ Lp. Weshall adopt this convention.

That the relation ∼ is an equivalence relation is easily verified, as are thefollowing simple lemmas.

Lemma A.2 For a finite measure µ and any p ≥ q,

Lp ⊆ Lq, Lp ⊆ Lq.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 444: Financial derivatives in theory and practice

420 Appendix 2

Lemma A.3 For any f in either Lp or Lp, define

||f ||p :=(∫

S

|f |pdµ

)1/p

.

Then || · ||p defines a semi-norm on the space Lp and a norm on the space Lp.

It is the identification of functions which are almost everywhere equal thatturns || · ||p from a semi-norm into a norm: for all f, g ∈ Lp,

f ∼ g ⇔ ||f − g||p = 0.

The normed space (Lp, || · ||p) has one very important property, completeness.

Theorem A.4 For p ≥ 1, the space Lp is a complete normed space, i.e. aBanach space.

This is all we shall say about general Lp spaces.The space L2 has extra structure not present in a general Lp space, namely

it supports an inner product. The existence of an inner product means we cantalk about the idea of orthogonality, and this is crucial for the developmentof the stochastic integral.

Theorem A.5 For f, g ∈ L2(S,∑

, µ), let

〈f, g〉 :=(∫

s

fgdµ

).

Then 〈·, ·〉 defines an inner product on L2(S,∑

, µ) (with corresponding norm||f ||22 = 〈f, f〉). Thus (L2, 〈·, ·〉) is a Hilbert space (i.e. an inner product spacewhich is also a Banach space).

Remark A.6: Note that in the main text we denote the L2 norm describedabove by || · || and reserve the notation || · ||2 for the corresponding (semi-)normon the space of square-integrable martingales.

Page 445: Financial derivatives in theory and practice

Appendix 3

Gaussian CalculationsHere we gather together some simple Gaussian integrals that arise frequentlyin applications. Throughout X ∼ N(µ, σ2), φ(x, µ, σ2) is the correspondingGaussian density, φ(x) is the standard Gaussian density function and Φ is thecorresponding distribution function.

Polynomial integrals

We derive expressions for

In(l, h, µ, σ2) = E[Xn1l≤X≤h]

=∫ h

l

xnφ(x, µ, σ2) dx.

We will abbreviate this to In(l, h) when no confusion can arise.For m > 0,

d

dx(xmφ(x, µ, σ2)) = mxm−1φ(x, µ, σ2) − x − µ

σ2 xmφ(x, µ, σ2).

Integrating this over the interval [l, h] gives

hmφ(h, µ, σ2) − lmφ(l, µ, σ2) = mIm−1(l, h) +µ

σ2 Im(l, h) − 1σ2 Im+1(l, h).

Rearranging and substituting m = n − 1 yields

In(l, h) = σ2(n − 1)In−2(l, h) + µIn−1(l, h)

+ σ2(ln−1φ(l, µ, σ2) − hn−1φ(h, µ, σ2))

= σ2(n − 1)In−2(l, h) + µIn−1(l, h)

+ σ

(ln−1φ

(l − µ

σ

)− hn−1φ

(h − µ

σ

)).

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 446: Financial derivatives in theory and practice

422 Appendix 3

It remains to calculate I0 and I1. I0 is obtained by direct evaluation:

I0(l, h) =1

σ√

∫ h

l

exp

(−1

2

(x − µ

σ

)2)

dx

=1√2π

∫ h − µ

σl − µ

σ

exp(−1

2y2

)dy

= Φ(

h − µ

σ

)− Φ

(l − µ

σ

).

To calculate I1(l, h) note that

d

dxφ(x, µ, σ2) = −x − µ

σ2 φ(x, µ, σ2)

and so

I1(l, h) = µI0(l, h) + σ2(φ(l, µ, σ2) − φ(h, µ, σ2))

= µI0(l, h) + σ

(l − µ

σ

)− φ

(h − µ

σ

))

which gives

I1(l, h) = µ

(h − µ

σ

)− Φ

(l − µ

σ

))+ σ2(φ(l, µ, σ2) − φ(h, µ, σ2))

= µ

(h − µ

σ

)− Φ

(l − µ

σ

))+ σ

(l − µ

σ

)− φ

(h − µ

σ

))

Exponential integrals

These are commonly used in Black-Scholes-type calculations.

E[eαX1l≤X≤h] =

∫ h

l

exp

(αx − 1

2

(x − µ

σ

)2)

dx

= eαµ+ 12 α2σ2

∫ h

l

exp

(−1

2

(x − (µ + ασ2)

σ

)2)

dx

= eαµ+ 12 α2σ2

I0(l, h, µ, +ασ2, σ2)

= eαµ+ 12 α2σ2

(h − (µ + ασ2)

σ

)− Φ

(l − (µ + ασ2)

σ

)).

Page 447: Financial derivatives in theory and practice

ReferencesAndersen L. and Andreasen J. (2000) Volatility skews and extensions of the Libor

market model. Applied Mathematical Finance, 7(1), 1–32.Babbs S.H. (1990) The term structure of interest rates: stochastic processes and

contingent claims. PhD thesis, Imperial College, London.Babbs S.H. and Selby M.J.P. (1998) Pricing by arbitrage under arbitrary information.

Mathematical Finance, 8(2), 163–168.Balland P. and Hughston L.P. (2000) Markov market model consistent with caplet

smile. International Journal of Theoretical and Applied Finance, 3(2), 161–181Baxter M.W. (1997) General interest rate models and the universality of HJM. In

S. Pliska and M. Dempster (eds), Mathematics of Derivative Securities. CambridgeUniversity Press, Cambridge.

Baxter M.W. and Rennie A. (1996) Financial Calculus. Cambridge University Press,Cambridge.

Bennett M.N. and Kennedy, J.E. (2004) A comparison of Markov-functional andmarket models: the one-dimensional case. Preprint, University of Warwick.

Billingsley P. (1986) Probability and Measure, 2nd edition. John Wiley & Sons,Chichester.

Bjork T., Di Masi G., Kabanov Y. and Runggaldier W. (1997a) Towards a generaltheory of bond markets. Finance and Stochastics, 1, 141–174.

Bjork T., Kabanov Y. and Runggaldier W. (1997b) Bond market structure in thepresence of marked point processes. Mathematical Finance, 7, 211–239.

Black F. (1976) The pricing of commodity contracts. Journal of Financial Economics,3, 167–179.

Black F., Derman E. and Toy W. (1990) A one-factor model of interest rates and itsapplication to Treasury bond options. Financial Analysts Journal, Jan.-Feb., 33–39.

Black F. and Karasinski P. (1991) Bond and option pricing when short rates arelognormal. Financial Analysts Journal, July-Aug., 52–59.

Black F. and Scholes M. (1973) The pricing of options and corporate liabilities. Journalof Political Economy, 81, 637–659.

Brace A., Gatarek D. and Musiela M. (1997) The market model of interest ratedynamics. Mathematical Finance, 7, 127–154.

Breeden M.J. and Litzenberger R. (1978) Prices of state-contingent claims implicit inoption prices. Journal of Business, 51, 621–651.

Breiman L. (1992) Probability, classic edition. Society of Industrial and AppliedMathematics, Philadelphia.

Financial Derivatives in Theory and Practice Revised Edition. P. J. Hunt and J. E. Kennedyc© 2004 John Wiley & Sons, Ltd ISBNs: 0-470-86358-7 (HB); 0-470-86359-5 (PB)

Page 448: Financial derivatives in theory and practice

424 References

Chung K.L. (1974) A Course in Probability Theory, 2nd edition. Academic Press, NewYork.

Chung K.L. and Williams R.J. (1990) Introduction to Stochastic Integration. Birk-hauser, Boston.

Coleman T.S. (1995) Convexity adjustment for constant maturity swaps and Libor-in-arrears basis swaps. Derivatives Quarterly, Winter, 19–27.

Cox J.C., Ingersoll J.E. and Ross, S.A. (1985) A theory of the term structure of interestrates. Econometrica, 53, 385–408.

Delbaen F. and Schachermayer W. (1999) Non-arbitrage and the fundamentaltheorem of asset pricing: Summary of main results. In D.C. Heath and G. Swindle(eds), Introduction to Mathematical Finance. Proceedings of Symposia in AppliedMathematics of the AMS, 57, 49–58.

Dellacherie C. and Meyer P.-A. (1980) Probabilite et Potentiel. Theorie des Martin-gales. Hermann, Paris.

Doust P. (1995) Relative pricing techniques in the swaps and options markets. Journalof Financial Engineering, 4(1), 11–46.

Duffie D. and Stanton R. (1992) Pricing continuously resettled contingent claims. J.Economic Dynamics and Control, 16, 561–573.

Duffie D. (1996) Dynamic Asset Pricing Theory, 2nd edition. Princeton UniversityPress, Princeton, NJ.

Duffie D. and Singleton K.J. (1999) Modeling term structures of defaultable bonds.Review of Financial Studies, 12(4), 687–720.

Dupire B. (1994) Pricing with a smile. RISK Magazine, 7(1), 18–20.Durrett R. (1984) Brownian Motion and Martingales in Analysis. Wadsworth,

Belmont, CA.Durrett R. (1996) Stochastic Calculus: A Practical Introduction. CRC Press, Boca

Raton, FL.Elliott R.J. (1982) Stochastic Calculus and Applications. Springer-Verlag, Berlin,

Heidelberg and New York.Flesaker B. and Hughston L.P. (1996) Positive interest. RISK Magazine, 9(1), 46–49.Flesaker B. and Hughston L.P. (1997) International models for interest rates and

foreign exchange. Net Exposure, 3, 55–79.Gandhi S.K. and Hunt P.J. (1997) Numerical option pricing using conditioned

diffusions. In S. Pliska and M. Dempster (eds), Mathematics of Derivative Securities.Cambridge University Press, Cambridge.

Geman H., El Karoui N. and Rochet J. (1995) Changes of numeraire, changes ofprobability measure and option pricing. Journal of Applied Probability, 32, 443–458.

Glasserman P. and Zhao X. (2000) Arbitrage-free discretization of lognormal forwardLibor and swap rate models. Finance and Stochastics, 4(1).

Harrison J.M. and Kreps D. (1979) Martingales and arbitrage in multiperiod securitiesmarkets. Journal of Economic Theory, 20, 381–408.

Harrison, J.M. and Pliska S. (1981) Martingales and stochastic integrals in the theoryof continuous trading. Stochastic Processes and Their Applications, 11, 215–260.

Heath D., Jarrow R. and Morton A. (1992) Bond pricing and the term structure ofinterest rates: a new methodology for contingent claims valuation. Econometrica,61(1), 77–105.

Hogan M. and Weintraub K. (1993) The log-normal interest rate model and Eurodollarfutures. Working Paper, Citibank, New York.

Hull J. and White A. (1990) Pricing interest rate derivative securities. Review ofFinancial Studies, 3(4), 573–592.

Page 449: Financial derivatives in theory and practice

References 425

Hull J. and White A. (1994) Numerical procedures for implementing term structuremodels I: Single-factor models. Journal of Derivatives, Fall, 7–16.

Hunt P.J. and Kennedy J.E. (1996) On multi-currency interest rate models. WorkingPaper, University of Warwick.

Hunt P.J. and Kennedy J.E. (1997) On convexity corrections. Working Paper,University of Warwick.

Hunt P.J. and Kennedy J.E. (1998) Implied interest rate pricing models. Finance andStochastics, 3, 275–293.

Hunt P.J., Kennedy J.E. and Pelsser A.A.J. (2000) Markov-functional interest ratemodels. Finance and Stochastics, 4(4).

Hunt P.J., Kennedy J.E. and Scott E.M. (1996) Terminal swap-rate models. WorkingPaper, University of Warwick.

Hunt P.J. and Pelsser A.A.J. (1996) Arbitrage-free pricing of quanto-swaptions.Journal of Financial Engineering, 7(1), 25–33.

Jacka S.D., Hamza K. and Klebaner F. (1998) Arbitrage-free term structure models.Research Report No. 303, University of Warwick Statistics Department.

Jacod J. (1979) Calcul Stochastique et Problemes de Martingales, Lecture Notes inMath. 714. Springer-Verlag, Berlin, Heidelberg and New York.

Jamshidian F. (1989) An exact bond option formula. Journal of Finance, 44(1),205–209.

Jamshidian F. (1991) Forward induction and construction of yield curve diffusionmodels. Journal of Fixed Income, 1, 62–74.

Jamshidian F. (1996a) Sorting out swaptions. RISK Magazine, 9(3), 59–60.Jamshidian F. (1996b) Libor and swap market models and measures, II. Working

Paper, Sakura Global Capital, London.Jamshidian F. (1997) Libor and swap market models and measures. Finance and

Stochastics, 1, 261–291.Jarrow R.A. and Madan D. (1995) Option pricing using the term structure of interest

rates to hedge systematic discontinuities in asset returns. Mathematical Finance, 5,311–336.

Jarrow R.A. and Turnbull S.M. (1995) Pricing derivatives on financial securitiessubject to credit risk. Journal of Finance, 50, 53–85.

Jin Y. and Glasserman P. (2001) Equilibrium positive interest rates: a unified view.Review of Financial Studies, 14, 187–214.

Karatzas I. (1993) IMA Tutorial Lectures 1–3: Minneapolis. Department of Statistics,Columbia University.

Karatzas I. and Shreve S.E. (1991) Brownian Motion and Stochastic Calculus, 2ndedition. Springer-Verlag, Berlin, Heidelberg and New York.

Karatzas I. and Shreve S.E. (1998) Methods of Mathematical Finance. Springer-Verlag,New York.

Kennedy D.P. (1994) The term structure of interest rates as a Gaussian random field.Mathematical Finance, 4, 247–258.

Kunita H. and Watanabe S. (1967) On square integrable martingales. NagoyaMathematical Journal, 30, 209–245.

Lamberton D. and Lapeyre B. (1996) Introduction to Stochastic Calculus Applied toFinance, English edition, translated by N. Rabeau and F. Mantion. Chapman &Hall, London.

Margrabe W. (1978) The value of an option to exchange one asset for another. Journalof Finance, 33, 177–186.

Merton R. (1973) The theory of rational option pricing. Bell Journal of Economicsand Management Science, 4, 141–183.

Page 450: Financial derivatives in theory and practice

426 References

Miltersen K., Sandmann K. and Sondermann D. (1997) Closed form solutions forterm structure derivatives with log-normal interest rates. Journal of Finance, 52,409–430.

Musiela M. and Rutkowski M. (1997) Martingale Methods in Financial Modelling.Springer-Verlag, Berlin, Heidelberg and New York.

Neuberger A. (1990) Pricing swap options using the forward swap market. WorkingPaper, London Business School.

Oksendal B. (1989) Stochastic Differential Equations, 2nd edition. Springer-Verlag,Berlin, Heidelberg and New York.

Press W.H., Teukolsky S.A., Vetterling W.T. and Flannery B.P. (1988) NumericalRecipes in C, 2nd edition. Cambridge University Press, Cambridge.

Priestley H.A. (1997) Introduction to Integration. Clarendon Press, Oxford.Protter P. (1990) Stochastic Integration and Differential Equations. Springer-Verlag,

Berlin, Heidelberg and New York.Rebonato, R. (2002) Modern Pricing of Interest-Rate Derivatives. Princeton University

Press, Princeton, NJ.Revuz D. and Yor M. (1991) Continuous Martingales and Brownian Motion. Springer-

Verlag, Berlin, Heidelberg and New York.Ritchken P. and Sankarasubramanian L. (1995) Volatility structures of forward rates

and the dynamics of the term structure. Mathematical Finance, 5, 55–72.Rogers L.C.G. and Williams D. (1987) Diffusions, Markov Processes, and Martingales.

Vol. 2: Ito Calculus. John Wiley & Sons, Chichester.Rogers L.C.G. and Williams D. (1994) Diffusions, Markov Processes, and Martingales.

Vol. 1: Foundations, 2nd edition. John Wiley & Sons, Chichester.Rudin W. (1987) Real and Complex Analysis, 3rd edition. McGraw-Hill, Singapore.Rutkowski M. (1997) A note on the Flesaker–Hughston model of term structure of

interest rates. Applied Mathematical Finance, 4(3), 151–163.Rutkowski M. (2001) Modelling of forward LIBOR and swap rates. In E. Jouini,

J. Cvitanic and M. Musiela (eds), Option Pricing, Interest Rates and RiskManagement, 336–395. Cambridge University Press, Cambridge.

Stroock D.W. and Varadhan S.R.S. (1979) Multidimensional Diffusion Processes.Springer-Verlag, Berlin, Heidelberg and New York.

Vasicek O.A. (1977) An equilibrium characterisation of the term structure. Journal ofFinancial Economics, 5, 177–188.

Williams D. (1991) Probability with Martingales. Cambridge University Press, Cam-bridge.

Willmot P., Dewynne J.N. and Howison S. (1993) Option Pricing: Numerical Modelsand Computation. Oxford Financial Press, Oxford.

Yamada T. and Watanabe S. (1971) On the uniqueness of solutions of stochasticdifferential equations. Journal of Mathematics of Kyoto University, 11, 155–167.

Page 451: Financial derivatives in theory and practice
Page 452: Financial derivatives in theory and practice

428 Index

Page 453: Financial derivatives in theory and practice

Index 429

Cox – Ingersoll – Ross model 323–4credit risks

see also risk. . .futures contracts 249–51

cross-currency swaptions, multi-currencyterminal swap-rate models 311–13

currency see foreign exchange

Daniel – Kolmogorov consistency theorem 97daycount fraction, concept 228decomposition

affine decomposition 278–82convexity products 277–82irregular swaptions 293–9

Delbaen, F. 162deposits

cashflows 228–9, 233–4concept 227–9valuation 233–4

Derman, E. 323deterministic transformations, Brownian motion

23–6differentiability, Brownian motion 25–6differential equations 83–5, 125–34

see also partial. . .; stochastic. . .concept 115ODEs 115, 126–30

diffusion 116, 205, 340, 345, 366digital caplets

concept 244valuation 244, 360

digital caps, concept 244digital floorlets

concept 244valuation 244

digital floors, concept 244digital options, concept 244–5digital swaptions, concept 245, 356dimensions, data sets 219–21discount bonds 183–8, 191–200, 238, 320, 352

see also zero coupon bondsconvexity corrections 278–86

discount curves 233, 260–1, 266–9, 277–8,288–92, 298–300, 307

discount factors 354–5Bermudan swaptions 317, 329–32concept 233–6multi-currency terminal swap-rate models

305–8terminal swap-rate models 266–9, 272,

287–302, 305–8, 349discrete barrier swaptions 346–7discrete resettlement, futures contracts 252–3

discrete time 33, 149, 152dividends 179Dolean’s exponential 86, 342Doob – Meyer theorem 55, 61–2, 170, 196,

212Doob’s martingale convergence theorem 36,

39–40, 61–2, 129, 196doubling strategy, concept 160, 185drift 116, 205, 308, 339–41, 344–5, 348–9Duffie, D. 17, 147, 159, 165, 320Durrett, R. 20, 66dynamic term structures see term structures

economiessee also complete. . .arbitrage-free properties 6–15, 64, 145–52,

158–63, 185, 269–73components 141, 142–5, 151, 183–4incomplete economies 5infinite trading horizon 180–1, 183information availability 144, 152–3, 216one-period economy 5–17pure discount bonds 183–7real-world models 215–25simplest setting 4–5, 16

Einstein, Albert 19Elliot, R.J. 61, 176EMM see equivalent martingale measureequity markets 227equivalent martingale measure (EMM)

concept 14–15, 105–6, 110–13, 155–8,169

martingale representation theorem 110–13option pricing 149–51, 155–8real-world measure 216–18

equivalent probability measuresconcept 91–8filtration 94, 95–8Radon – Nikodym derivative 14, 91–9,

101–3, 106, 156–8, 165, 310Eurodollar futures contracts 247–8, 250–1,

253, 257European derivatives 141–2, 219–21, 256–7,

259–313, 368–9concept 146–7, 259pricing 146–51, 219–21, 287–302, 368–9reverse swap-market models 347–50

exchanges, futures contracts 248–51exotic American derivatives 315–71exotic derivatives 223–5, 259–313, 315–71

concept 259exotic European derivatives 259–313, 351–2,

370–1

Page 454: Financial derivatives in theory and practice

430 Index

Page 455: Financial derivatives in theory and practice

Index 431

Page 456: Financial derivatives in theory and practice

432 Index

Page 457: Financial derivatives in theory and practice

Index 433

Page 458: Financial derivatives in theory and practice

434 Index

payers swaptions, concept 242, 245, 266,297–8, 368

paymentdates 229–30futures contracts rules 248–9payouts 176–9

PDEs see partial differential equationsPicard-Lindelof iteration 126–30plain vanilla interest rate swaps 242polarization identity, concept 56, 81polynomial integrals, concept 373–4pre-agreed deposits 227–9predictable processes 128, 205

concept 62, 64, 65–7stochastic integration 64, 65–7, 69–70,

79–80present value of a basis point (PVBP) 235,

280–4, 356prices

see also option pricing; valuationassets 142–6, 161continuous process assumption 19convexity corrections 277–86exotic American derivatives 315–71exotic European derivatives 259–313,

347–50, 351–2, 370–1futures contracts 247–9, 252–6implied interest rate pricing models 287–302irregular swaptions 288, 293–9macro information modelling 217–18model comparisons 370–1path-dependent derivatives 315–69real-world models 215–25zero coupon swaptions 273–5

pricing kernelsee also numerairesarbitrage-free properties 6–15, 185complete economies 9–12, 185–7concept 6–15, 173–6equivalent martingale measure 14–15futures contracts 252term structure models 188–9, 191, 192, 194,

201–3, 207–12Priestley, H.A. 371principal component analysis 219–22product-based models 218–23progressively measurable concept 50, 67Protter, P. 92, 110, 116, 176pure discount bonds 183–8, 191–2, 194–6,

198–200, 320, 352see also zero coupon bondsconvexity corrections 278–86

put options 241

put-call parity 241–3PVBP see present value of a basis point

quadratic covariationconcept 55–6, 216stochastic integration 74–6

quadratic variationBrownian motion 26, 58, 88–9, 113concept 51–6, 69, 216filtration changes 60–1, 88–9Levy’s theorem 88–9, 113martingales 32, 51–6, 58, 88–9, 100stochastic integration 74–6, 84, 88–9

Radon–Nikodym derivative 14, 91–9, 101–3,106, 156–8, 165, 310

random walkssee also Brownian motionsymmetric random walks 21–3

rational log-normal models 207real-world models 215–25

see also modelsEMM 216–18martingales 215–18terminal swap-rate models 263–75, 280–2,

287–313, 347, 349receivers swaptions, concept 242–3, 245,

293–5, 371reflection principle, Brownian motion 28–30regular swap-market models 343–7replication 3–17, 147, 149–51, 163–4

see also trading strategiesconcept 3, 6–7, 9–10, 112, 234FRAs 234, 237–8futures contracts 186simplest setting 4–5swaps 235

representation theorem, martingales 64, 91,103, 105–13, 139, 203

reset dates, FRAs 229–30resettlement, futures contracts 251, 252–5reverse swap-market models 338, 347–50Revuz, D. 36, 42, 57, 67, 92, 110, 113, 116,

122, 125, 254Riemann sums 73, 254risk-neutral measures 252–7

concept 189, 201–3short-rate models 319–20, 325–30

riskscredit risks 249–51futures contracts 247–57market risks 249–51

Rogers, L.C.G. 38, 70, 78, 122–5

Page 459: Financial derivatives in theory and practice

Index 435

Page 460: Financial derivatives in theory and practice

436 Index

Page 461: Financial derivatives in theory and practice

Index 437

Page 462: Financial derivatives in theory and practice

WILEY SERIES IN PROBABILITY AND STATISTICS

ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS

Editors: David J. Balding, Noel A. C. Cressie, Nicholas I. Fisher,Iain M. Johnstone, J. B. Kadane, Geert Molenberghs, Louise M. Ryan,David W. Scott, Adrian F. M. Smith, Jozef L. TeugelsEditors Emeriti: Vic Barnett, J. Stuart Hunter David G. Kendall

The Wiley Series in Probability and Statistics is well established and authoritative. Itcovers many topics of current research interest in both pure and applied statistics and probabilitytheory. Written by leading statisticians and institutions, the titles span both state-of-the-artdevelopments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied,methodological and theoretical statistics, ranging from applications and new techniques madepossible by advances in computerized practice to rigorous treatment of theoretical approaches.

This series provides essential and invaluable reading for all statisticians, whether in academia,industry, government, or research.

ABRAHAM and LEDOLTER · Statistical Methods for ForecastingAGRESTI · Analysis of Ordinal Categorical DataAGRESTI · An Introduction to Categorical Data AnalysisAGRESTI · Categorical Data Analysis, Second EditionALTMAN, GILL, and McDONALD · Numerical Issues in Statistical Computing

for the Social ScientistAMARATUNGA and CABRERA · Exploration and Analysis of DNA Microarray and Protein

Array DataANDEL · Mathematics of ChanceANDERSON · An Introduction to Multivariate Statistical Analysis, Third Edition

∗ANDERSON · The Statistical Analysis of Time SeriesANDERSON, AUQUIER, HAUCK, OAKES, VANDAELE, and WEISBERG ·

Statistical Methods for Comparative StudiesANDERSON and LOYNES · The Teaching of Practical StatisticsARMITAGE and DAVID (editors) · Advances in BiometryARNOLD, BALAKRISHNAN, and NAGARAJA · Records

∗ARTHANARI and DODGE · Mathematical Programming in Statistics∗BAILEY · The Elements of Stochastic Processes with Applications to the Natural

SciencesBALAKRISHNAN and KOUTRAS · Runs and Scans with ApplicationsBARNETT · Comparative Statistical Inference, Third EditionBARNETT · Environmental Statistics: Methods & ApplicationsBARNETT and LEWIS · Outliers in Statistical Data, Third EditionBARTOSZYNSKI and NIEWIADOMSKA-BUGAJ · Probability and Statistical InferenceBASILEVSKY · Statistical Factor Analysis and Related Methods: Theory and

ApplicationsBASU and RIGDON · Statistical Methods for the Reliability of Repairable SystemsBATES and WATTS · Nonlinear Regression Analysis and Its ApplicationsBECHHOFER, SANTNER, and GOLDSMAN · Design and Analysis of Experiments for

Statistical Selection, Screening, and Multiple ComparisonsBELSLEY · Conditioning Diagnostics: Collinearity and Weak Data in Regression

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 463: Financial derivatives in theory and practice

BELSLEY, KUH, and WELSCH · Regression Diagnostics: Identifying InfluentialData and Sources of Collinearity

BENDAT and PIERSOL · Random Data: Analysis and Measurement Procedures,Third Edition

BERRY, CHALONER, and GEWEKE · Bayesian Analysis in Statistics andEconometrics: Essays in Honor of Arnold Zellner

BERNARDO and SMITH · Bayesian TheoryBHAT and MILLER · Elements of Applied Stochastic Processes, Third EditionBHATTACHARYA and JOHNSON · Statistical Concepts and MethodsBHATTACHARYA and WAYMIRE · Stochastic Processes with ApplicationsBILLINGSLEY · Convergence of Probability Measures, Second EditionBILLINGSLEY · Probability and Measure, Third EditionBIRKES and DODGE · Alternative Methods of RegressionBLISCHKE AND MURTHY (editors) · Case Studies in Reliability and MaintenanceBLISCHKE AND MURTHY · Reliability: Modeling, Prediction, and OptimizationBLOOMFIELD · Fourier Analysis of Time Series: An Introduction, Second EditionBOLLEN · Structural Equations with Latent VariablesBOROVKOV · Ergodicity and Stability of Stochastic ProcessesBOULEAU · Numerical Methods for Stochastic ProcessesBOX · Bayesian Inference in Statistical AnalysisBOX · R. A. Fisher, the Life of a ScientistBOX and DRAPER · Empirical Model-Building and Response Surfaces

∗BOX and DRAPER · Evolutionary Operation: A Statistical Method for ProcessImprovement

BOX, HUNTER, and HUNTER · Statistics for Experimenters: An Introduction toDesign, Data Analysis, and Model Building

BOX and LUCENO · Statistical Control by Monitoring and Feedback AdjustmentBRANDIMARTE · Numerical Methods in Finance: A MATLAB-Based IntroductionBROWN and HOLLANDER · Statistics: A Biomedical IntroductionBRUNNER, DOMHOF, and LANGER · Nonparametric Analysis of Longitudinal

Data in Factorial ExperimentsBUCKLEW · Large Deviation Techniques in Decision, Simulation, and EstimationCAIROLI and DALANG · Sequential Stochastic OptimizationCHAN · Time Series: Applications to FinanceCHATTERJEE and HADI · Sensitivity Analysis in Linear RegressionCHATTERJEE and PRICE · Regression Analysis by Example, Third EditionCHERNICK · Bootstrap Methods: A Practitioner’s GuideCHERNICK and FRIIS · Introductory Biostatistics for the Health SciencesCHILES and DELFINER · Geostatistics: Modeling Spatial UncertaintyCHOW and LIU · Design and Analysis of Clinical Trials: Concepts and Methodologies, Second

EditionCLARKE and DISNEY · Probability and Random Processes: A First Course with

Applications, Second Edition∗COCHRAN and COX · Experimental Designs, Second EditionCONGDON · Applied Bayesian ModellingCONGDON · Bayesian Statistical ModellingCONOVER · Practical Nonparametric Statistics, Second EditionCOOK · Regression GraphicsCOOK and WEISBERG · Applied Regression Including Computing and GraphicsCOOK and WEISBERG · An Introduction to Regression GraphicsCORNELL · Experiments with Mixtures, Designs, Models, and the Analysis of Mixture

Data, Third EditionCOVER and THOMAS · Elements of Information Theory

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 464: Financial derivatives in theory and practice

COX · A Handbook of Introductory Statistical Methods∗COX · Planning of ExperimentsCRESSIE · Statistics for Spatial Data, Revised EditionCSORGO and HORVATH · Limit Theorems in Change Point AnalysisDANIEL · Applications of Statistics to Industrial ExperimentationDANIEL · Biostatistics: A Foundation for Analysis in the Health Sciences, Sixth Edition

∗DANIEL · Fitting Equations to Data: Computer Analysis of Multifactor Data,Second Edition

DASU and JOHNSON · Exploratory Data Mining and Data CleaningDAVID and NAGARAJA · Order Statistics, Third Edition

∗DEGROOT, FIENBERG, and KADANE · Statistics and the LawDEL CASTILLO · Statistical Process Adjustment for Quality ControlDENISON, HOLMES, MALLICK and SMITH · Bayesian Methods for Nonlinear Classification

and RegressionDETTE and STUDDEN · The Theory of Canonical Moments with Applications in

Statistics, Probability, and AnalysisDEY and MUKERJEE · Fractional Factorial PlansDILLON and GOLDSTEIN · Multivariate Analysis: Methods and ApplicationsDODGE · Alternative Methods of Regression

∗DODGE and ROMIG · Sampling Inspection Tables, Second Edition∗DOOB · Stochastic ProcessesDOWDY, WEARDEN and CHILKO · Statistics for Research, Third EditionDRAPER and SMITH · Applied Regression Analysis, Third EditionDRYDEN and MARDIA · Statistical Shape AnalysisDUDEWICZ and MISHRA · Modern Mathematical StatisticsDUNN and CLARK · Applied Statistics: Analysis of Variance and Regression, Second EditionDUNN and CLARK · Basic Statistics: A Primer for the Biomedical Sciences,

Third EditionDUPUIS and ELLIS · A Weak Convergence Approach to the Theory of Large Deviations

∗ELANDT-JOHNSON and JOHNSON · Survival Models and Data AnalysisENDERS · Applied Econometric Time SeriesETHIER and KURTZ · Markov Processes: Characterization and ConvergenceEVANS, HASTINGS, and PEACOCK · Statistical Distributions, Third EditionFELLER · An Introduction to Probability Theory and Its Applications, Volume I,

Third Edition, Revised; Volume II, Second EditionFISHER and VAN BELLE · Biostatistics: A Methodology for the Health Sciences

∗FLEISS · The Design and Analysis of Clinical ExperimentsFLEISS · Statistical Methods for Rates and Proportions, Third EditionFLEMING and HARRINGTON · Counting Processes and Survival AnalysisFULLER · Introduction to Statistical Time Series, Second EditionFULLER · Measurement Error ModelsGALLANT · Nonlinear Statistical ModelsGIESBRECHT and GUMPERTZ · Planning, Construction, and Statistical Analysis of Compar-

ative ExperimentsGIFI · Nonlinear Multivariate AnalysisGHOSH, MUKHOPADHYAY, and SEN · Sequential EstimationGLASSERMAN and YAO · Monotone Structure in Discrete-Event SystemsGNANADESIKAN · Methods for Statistical Data Analysis of Multivariate Observations,

Second EditionGOLDSTEIN and LEWIS · Assessment: Problems, Development, and Statistical IssuesGREENWOOD and NIKULIN · A Guide to Chi-Squared TestingGROSS and HARRIS · Fundamentals of Queueing Theory, Third Edition

∗HAHN and SHAPIRO · Statistical Models in Engineering

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 465: Financial derivatives in theory and practice

HAHN and MEEKER · Statistical Intervals: A Guide for PractitionersHALD · A History of Probability and Statistics and their Applications Before 1750HALD · A History of Mathematical Statistics from 1750 to 1930HAMPEL · Robust Statistics: The Approach Based on Influence FunctionsHANNAN and DEISTLER · The Statistical Theory of Linear SystemsHEIBERGER · Computation for the Analysis of Designed ExperimentsHEDAYAT and SINHA · Design and Inference in Finite Population SamplingHELLER · MACSYMA for StatisticiansHINKELMAN and KEMPTHORNE: · Design and Analysis of Experiments, Volume 1:

Introduction to Experimental DesignHOAGLIN, MOSTELLER, and TUKEY · Exploratory Approach to Analysis of VarianceHOAGLIN, MOSTELLER, and TUKEY · Exploring Data Tables, Trends and Shapes

∗HOAGLIN, MOSTELLER, and TUKEY · Understanding Robust and Exploratory Data AnalysisHOCHBERG and TAMHANE · Multiple Comparison ProceduresHOCKING · Methods and Applications of Linear Models: Regression and the Analysis

of Variance, Second EditionHOEL · Introduction to Mathematical Statistics, Fifth EditionHOGG and KLUGMAN · Loss DistributionsHOLLANDER and WOLFE · Nonparametric Statistical Methods, Second EditionHOSMER and LEMESHOW · Applied Logistic Regression, Second EditionHOSMER and LEMESHOW · Applied Survival Analysis: Regression Modeling of

Time to Event DataHØYLAND and RAUSAND · System Reliability Theory: Models and Statistical MethodsHUBER · Robust StatisticsHUBERTY · Applied Discriminant AnalysisHUNT and KENNEDY · Financial Derivatives in Theory and Practice, Revised EditionHUSKOVA, BERAN, and DUPAC · Collected Works of Jaroslav Hajek –

with CommentaryIMAN and CONOVER · A Modern Approach to StatisticsJACKSON · A User’s Guide to Principle ComponentsJOHN · Statistical Methods in Engineering and Quality AssuranceJOHNSON · Multivariate Statistical SimulationJOHNSON and BALAKRISHNAN · Advances in the Theory and Practice of Statistics: A

Volume in Honor of Samuel KotzJUDGE, GRIFFITHS, HILL, LUTKEPOHL, and LEE · The Theory and Practice of

Econometrics, Second EditionJOHNSON and KOTZ · Distributions in StatisticsJOHNSON and KOTZ (editors) · Leading Personalities in Statistical Sciences: From the

Seventeenth Century to the PresentJOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions,

Volume 1, Second EditionJOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions,

Volume 2, Second EditionJOHNSON, KOTZ, and BALAKRISHNAN · Discrete Multivariate DistributionsJOHNSON, KOTZ, and KEMP · Univariate Discrete Distributions, Second EditionJURECKOVA and SEN · Robust Statistical Procedures: Asymptotics and InterrelationsJUREK and MASON · Operator-Limit Distributions in Probability TheoryKADANE · Bayesian Methods and Ethics in a Clinical Trial DesignKADANE and SCHUM · A Probabilistic Analysis of the Sacco and Vanzetti EvidenceKALBFLEISCH and PRENTICE · The Statistical Analysis of Failure Time Data Second

EditionKARIYA and KURATA · Generalized Least SquaresKASS and VOS · Geometrical Foundations of Asymptotic Inference

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 466: Financial derivatives in theory and practice

KAUFMAN and ROUSSEEUW · Finding Groups in Data: An Introduction to ClusterAnalysis

KEDEM and FOKIANOS · Regression Models for Time Series AnalysisKENDALL, BARDEN, CARNE, and LE · Shape and Shape TheoryKHURI · Advanced Calculus with Applications in Statistics, Second EditionKHURI, MATHEW, and SINHA · Statistical Tests for Mixed Linear ModelsKLEIBER and KOTZ · Statistical Size Distributions in Economics and Actuarial SciencesKLUGMAN, PANJER, and WILLMOT · Loss Models: From Data to DecisionsKLUGMAN, PANJER, and WILLMOT · Solutions Manual to Accompany Loss Models:

From Data to DecisionsKOTZ, BALAKRISHNAN, and JOHNSON · Continuous Multivariate Distributions,

Volume 1, Second EditionKOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Volumes 1 to 9

with IndexKOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Supplement

VolumeKOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update

Volume 1KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update

Volume 2KOVALENKO, KUZNETZOV, and PEGG · Mathematical Theory of Reliability of

Time-Dependent Systems with Practical ApplicationsLACHIN · Biostatistical Methods: The Assessment of Relative RisksLAD · Operational Subjective Statistical Methods: A Mathematical, Philosophical, and

Historical IntroductionLAMPERTI · Probability: A Survey of the Mathematical Theory, Second EditionLANGE, RYAN, BILLARD, BRILLINGER, CONQUEST, and GREENHOUSE ·

Case Studies in BiometryLARSON · Introduction to Probability Theory and Statistical Inference, Third EditionLAWLESS · Statistical Models and Methods for Lifetime Data, Second EditionLAWSON · Statistical Methods in Spatial EpidemiologyLE · Applied Categorical Data AnalysisLE · Applied Survival AnalysisLEE and WANG · Statistical Methods for Survival Data Analysis, Third EditionLEPAGE and BILLARD · Exploring the Limits of BootstrapLEYLAND and GOLDSTEIN (editors) · Multilevel Modelling of Health StatisticsLIAO · Statistical Group ComparisonLINDVALL · Lectures on the Coupling MethodLINHART and ZUCCHINI · Model SelectionLITTLE and RUBIN · Statistical Analysis with Missing Data, Second EditionLLOYD · The Statistical Analysis of Categorical DataMAGNUS and NEUDECKER · Matrix Differential Calculus with Applications in

Statistics and Econometrics, Revised EditionMALLER and ZHOU · Survival Analysis with Long Term SurvivorsMALLOWS · Design, Data, and Analysis by Some Friends of Cuthbert DanielMANN, SCHAFER, and SINGPURWALLA · Methods for Statistical Analysis of

Reliability and Life DataMANTON, WOODBURY, and TOLLEY · Statistical Applications Using Fuzzy SetsMARDIA and JUPP · Directional StatisticsMASON, GUNST, and HESS · Statistical Design and Analysis of Experiments with

Applications to Engineering and Science, Second EditionMcCULLOCH and SEARLE · Generalized, Linear, and Mixed ModelsMcFADDEN · Management of Data in Clinical Trials

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 467: Financial derivatives in theory and practice

McLACHLAN · Discriminant Analysis and Statistical Pattern RecognitionMcLACHLAN and KRISHNAN · The EM Algorithm and ExtensionsMcLACHLAN and PEEL · Finite Mixture ModelsMcNEIL · Epidemiological Research MethodsMEEKER and ESCOBAR · Statistical Methods for Reliability DataMEERSCHAERT and SCHEFFLER · Limit Distributions for Sums of Independent

Random Vectors: Heavy Tails in Theory and Practice∗MILLER · Survival Analysis, Second EditionMONTGOMERY, PECK, and VINING · Introduction to Linear Regression Analysis,

Third EditionMORGENTHALER and TUKEY · Configural Polysampling: A Route to Practical

RobustnessMUIRHEAD · Aspects of Multivariate Statistical TheoryMURRAY · X-STAT 2.0 Statistical Experimentation, Design Data Analysis, and

Nonlinear OptimizationMURTHY, XIE, and JIANG · Weibull ModelsMYERS and MONTGOMERY · Response Surface Methodology: Process and Product

Optimization Using Designed Experiments, Second EditionMYERS, MONTGOMERY, and VINING · Generalized Linear Models. With

Applications in Engineering and the SciencesNELSON · Accelerated Testing, Statistical Models, Test Plans, and Data AnalysesNELSON · Applied Life Data AnalysisNEWMAN · Biostatistical Methods in EpidemiologyOCHI · Applied Probability and Stochastic Processes in Engineering and Physical

SciencesOKABE, BOOTS, SUGIHARA, and CHIU · Spatial Tesselations: Concepts and

Applications of Voronoi Diagrams, Second EditionOLIVER and SMITH · Influence Diagrams, Belief Nets and Decision AnalysisPALTA · Quantitative Methods in Population Health: Extensions of Ordinary

RegressionsPANKRATZ · Forecasting with Dynamic Regression ModelsPANKRATZ · Forecasting with Univariate Box-Jenkins Models: Concepts and Cases

∗PARZEN · Modern Probability Theory and It’s ApplicationsPENA, TIAO, and TSAY · A Course in Time Series AnalysisPIANTADOSI · Clinical Trials: A Methodologic PerspectivePORT · Theoretical Probability for ApplicationsPOURAHMADI · Foundations of Time Series Analysis and Prediction TheoryPRESS · Bayesian Statistics: Principles, Models, and ApplicationsPRESS · Subjective and Objective Bayesian Statistics, Second EditionPRESS and TANUR · The Subjectivity of Scientists and the Bayesian ApproachPUKELSHEIM · Optimal Experimental DesignPURI, VILAPLANA, and WERTZ · New Perspectives in Theoretical and Applied

StatisticsPUTERMAN · Markov Decision Processes: Discrete Stochastic Dynamic Programming

∗RAO · Linear Statistical Inference and Its Applications, Second EditionRENCHER · Linear Models in StatisticsRENCHER · Methods of Multivariate Analysis, Second EditionRENCHER · Multivariate Statistical Inference with ApplicationsRIPLEY · Spatial StatisticsRIPLEY · Stochastic SimulationROBINSON · Practical Strategies for ExperimentingROHATGI and SALEH · An Introduction to Probability and Statistics, Second Edition

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 468: Financial derivatives in theory and practice

ROLSKI, SCHMIDLI, SCHMIDT, and TEUGELS · Stochastic Processes for Insuranceand Finance

ROSENBERGER and LACHIN · Randomization in Clinical Trials: Theory and PracticeROSS · Introduction to Probability and Statistics for Engineers and ScientistsROUSSEEUW and LEROY · Robust Regression and Outlier DetectionRUBIN · Multiple Imputation for Nonresponse in SurveysRUBINSTEIN · Simulation and the Monte Carlo MethodRUBINSTEIN and MELAMED · Modern Simulation and ModelingRYAN · Modern Regression MethodsRYAN · Statistical Methods for Quality Improvement, Second EditionSALTELLI, CHAN, and SCOTT (editors) · Sensitivity Analysis

∗SCHEFFE · The Analysis of VarianceSCHIMEK · Smoothing and Regression: Approaches, Computation, and

ApplicationSCHOTT · Matrix Analysis for StatisticsSCHOUTENS · Levy Processes in Finance: Pricing Financial DerivativesSCHUSS · Theory and Applications of Stochastic Differential EquationsSCOTT · Multivariate Density Estimation: Theory, Practice, and Visualization

∗SEARLE · Linear ModelsSEARLE · Linear Models for Unbalanced DataSEARLE · Matrix Algebra Useful for StatisticsSEARLE, CASELLA, and McCULLOCH · Variance ComponentsSEARLE and WILLETT · Matrix Algebra for Applied EconomicsSEBER and LEE · Linear Regression Analysis, Second EditionSEBER · Multivariate ObservationsSEBER and WILD · Nonlinear RegressionSENNOTT · Stochastic Dynamic Programming and the Control of Queueing Systems

∗SERFLING · Approximation Theorems of Mathematical StatisticsSHAFER and VOVK · Probability and Finance: Its Only a Game!SMALL and MCLEISH · Hilbert Space Methods in Probability and Statistical InferenceSRIVASTAVA · Methods of Multivariate StatisticsSTAPLETON · Linear Statistical ModelsSTAUDTE and SHEATHER · Robust Estimation and TestingSTOYAN, KENDALL, and MECKE · Stochastic Geometry and Its Applications, Second

EditionSTOYAN and STOYAN · Fractals, Random Shapes and Point Fields: Methods of

Geometrical StatisticsSTYAN · The Collected Papers of T. W. Anderson: 1943–1985SUTTON, ABRAMS, JONES, SHELDON, and SONG · Methods for Meta-

Analysis in Medical ResearchTANAKA · Time Series Analysis: Nonstationary and Noninvertible Distribution TheoryTHOMPSON · Empirical Model BuildingTHOMPSON · Sampling, Second EditionTHOMPSON · Simulation: A Modeler’s ApproachTHOMPSON and SEBER · Adaptive SamplingTHOMPSON, WILLIAMS, and FINDLAY · Models for Investors in Real World MarketsTIAO, BISGAARD, HILL, PENA, and STIGLER (editors) · Box on Quality and

Discovery: with Design, Control, and RobustnessTIERNEY · LISP-STAT: An Object-Oriented Environment for Statistical

Computing and Dynamic GraphicsTSAY · Analysis of Financial Time SeriesUPTON and FINGLETON · Spatial Data Analysis by Example, Volume II:

Categorical and Directional Data

∗Now available in a lower priced paperback edition in the Wiley Classics Library.

Page 469: Financial derivatives in theory and practice

VAN BELLE · Statistical Rules of ThumbVESTRUP · The Theory of Measures and IntegrationVIDAKOVIC · Statistical Modeling by WaveletsWEISBERG · Applied Linear Regression, Second EditionWELSH · Aspects of Statistical InferenceWESTFALL and YOUNG · Resampling-Based Multiple Testing: Examples and

Methods for p-Value AdjustmentWHITTAKER · Graphical Models in Applied Multivariate StatisticsWINKER · Optimization Heuristics in Economics: Applications of Threshold

AcceptingWONNACOTT and WONNACOTT · Econometrics, Second EditionWOODING · Planning Pharmaceutical Clinical Trials: Basic Statistical PrinciplesWOOLSON and CLARKE · Statistical Methods for the Analysis of Biomedical Data,

Second EditionWU and HAMADA · Experiments: Planning, Analysis, and Parameter Design

OptimizationYANG · The Construction Theory of Denumerable Markov Processes

∗ZELLNER · An Introduction to Bayesian Inference in EconometricsZELTERMAN · Discrete Distributions: Applications in the Health SciencesZHOU, OBUCHOWSKI, and McCLISH · Statistical Methods in Diagnostic

Medicine

∗Now available in a lower priced paperback edition in the Wiley Classics Library.