17
The carry trade and fundamentals: Nothing to fear but FEER itself Òscar Jordà a, b , Alan M. Taylor c, d, e, a Federal Reserve Bank of San Francisco, United States b University of California, Davis, United States c University of Virginia, Department of Economics, Monroe Hall, Charlottesville VA 22903, United States d National Bureau of Economic Research, United States e Center for Economic Policy Research, United Kingdom abstract article info Article history: Received 30 March 2010 Received in revised form 21 February 2012 Accepted 3 March 2012 Available online 12 March 2012 JEL classication: C44 F31 F37 G15 G17 Keywords: Uncovered interest parity Efcient markets Exchange rates Receiver operating characteristic curve Correct classication frontier Risky arbitraging based on interest rate differentials between two countries is typically referred to as a carry trade. Up until the recent global nancial crisis, these trades generated years of persistent positive returns, which were hard to reconcile with standard pricing kernels. In 2008 these trades blew up, which seemed to weaken the case for a puzzle relating to predictable currency returns. But the rise and fall of this puzzle in the academic literature has only been concerned with naïve carry trades based on yield signals alone. We show, however, that some simple and more realistic fundamentals-augmented trading strategies would have generated strong and sustained positive prots that endured through the turmoil. © 2012 Elsevier B.V. All rights reserved. 1. Introduction Once a relatively obscure corner of international nance, the Naïve carry trade a crude bilateral interest rate arbitrage has drawn in- creasing attention in recent years, within and beyond academe, and especially as the volume of carry trade activity multiplied astronomi- cally: daily turnover in the global foreign exchange (FX) market is US $4 trillion, which dwarfs that of the main stock exchanges combined. Inevitably, press attention has often focused on the personal angle. In 2007, The New York Times reported on the disastrous losses suffered by the FX Beautiesclub and other Japanese retail investors during an episode of yen appreciation; one highly-leveraged housewife lost her family's entire life savings within a week. 1 By the fall of 2008 attention was grabbed by an even more brutal squeeze on carry traders from big- ger yen movese.g., up 60% against the AUD over 2 months, and up 30% against GBP (including 10% moves against both in 5 hours on the morn- ing of October 24). Money managers on the wrong side saw their funds blowing up, supporting the June 2007 prediction of Jim O'Neill, chief global economist at Goldman Sachs, who had said of the carry trade that there are going to be dead bodies around when this is over.2 Economists have also been paying renewed attention to the carry trade, revisiting old questions: Are there returns to currency specula- tion? Are the returns predictable? Is there a failure of market efciency? We are still far from a consensus, but the present state of the literature, how it got there, and how it relates to the interpretations of the current market turmoil, can be summarized in a just few moments, so to speak. 3 Journal of International Economics 88 (2012) 7490 Taylor has been supported by the Center for the Evolution of the Global Economy at UC Davis and Jordà by DGCYT Grant (SEJ2007-63098-econ); part of this work was completed while Taylor was a Houblon-Norman/George Fellow at the Bank of England; all of this research support is gratefully acknowledged. We thank Travis Berge and Yanping Chong for excellent research assistance. We acknowledge helpful comments from the editor and two anonymous referees, as well as from Vineer Bhansali, Richard Clarida, Robert Hodrick, Richard Meese, Michael Melvin, Stefan Nagel, Michael Sager, Mark Taylor, and seminar participants at The Bank of England, Barclays Global Inves- tors, London Business School, London School of Economics, PIMCO, the 3rd annual JIMF-SCCIE conference, and the NBER IFM program meeting. All errors are ours. Corresponding author. E-mail addresses: [email protected], [email protected] (Ò. Jordà), [email protected] (A.M. Taylor). 1 Martin Fackler, Japanese Housewives Sweat in Secret as Markets Reel,The New York Times, September 16, 2007. 2 Ambrose Evans-Pritchard, Goldman Sachs Warns of Dead Bodiesafter Market Turmoil,The Daily Telegraph, June 3, 2007. 3 For a full survey of foreign exchange market efciency see Chapter 2 in Sarno and Taylor (2002), on which we draw here. 0022-1996/$ see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.jinteco.2012.03.001 Contents lists available at SciVerse ScienceDirect Journal of International Economics journal homepage: www.elsevier.com/locate/jie

The carry trade and fundamentals: Nothing to fear but FEER itself

Embed Size (px)

Citation preview

Page 1: The carry trade and fundamentals: Nothing to fear but FEER itself

Journal of International Economics 88 (2012) 74–90

Contents lists available at SciVerse ScienceDirect

Journal of International Economics

j ourna l homepage: www.e lsev ie r .com/ locate / j i e

The carry trade and fundamentals: Nothing to fear but FEER itself☆

Òscar Jordà a,b, Alan M. Taylor c,d,e,⁎a Federal Reserve Bank of San Francisco, United Statesb University of California, Davis, United Statesc University of Virginia, Department of Economics, Monroe Hall, Charlottesville VA 22903, United Statesd National Bureau of Economic Research, United Statese Center for Economic Policy Research, United Kingdom

☆ Taylor has been supported by the Center for the Evat UC Davis and Jordà by DGCYT Grant (SEJ2007-63098completed while Taylor was a Houblon-Norman/Georgeall of this research support is gratefully acknowledgedYanping Chong for excellent research assistance. We acfrom the editor and two anonymous referees, as well asClarida, Robert Hodrick, Richard Meese, Michael MelvinMark Taylor, and seminar participants at The Bank of Etors, London Business School, London School of EconoJIMF-SCCIE conference, and the NBER IFM program mee⁎ Corresponding author.

E-mail addresses: [email protected], [email protected]@virginia.edu (A.M. Taylor).

0022-1996/$ – see front matter © 2012 Elsevier B.V. Alldoi:10.1016/j.jinteco.2012.03.001

a b s t r a c t

a r t i c l e i n f o

Article history:Received 30 March 2010Received in revised form 21 February 2012Accepted 3 March 2012Available online 12 March 2012

JEL classification:C44F31F37G15G17

Keywords:Uncovered interest parityEfficient marketsExchange ratesReceiver operating characteristic curveCorrect classification frontier

Risky arbitraging based on interest rate differentials between two countries is typically referred to as a carrytrade. Up until the recent global financial crisis, these trades generated years of persistent positive returns,which were hard to reconcile with standard pricing kernels. In 2008 these trades blew up, which seemedto weaken the case for a puzzle relating to predictable currency returns. But the rise and fall of this puzzlein the academic literature has only been concerned with naïve carry trades based on yield signals alone.We show, however, that some simple and more realistic fundamentals-augmented trading strategieswould have generated strong and sustained positive profits that endured through the turmoil.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Once a relatively obscure corner of international finance, the Naïvecarry trade – a crude bilateral interest rate arbitrage – has drawn in-creasing attention in recent years, within and beyond academe, andespecially as the volume of carry trade activity multiplied astronomi-cally: daily turnover in the global foreign exchange (FX) market is US$4 trillion, which dwarfs that of the main stock exchanges combined.

Inevitably, press attention has often focused on the personal angle.In 2007, The New York Times reported on the disastrous losses suffered

olution of the Global Economy-econ); part of this work wasFellow at the Bank of England;. We thank Travis Berge andknowledge helpful commentsfrom Vineer Bhansali, Richard, Stefan Nagel, Michael Sager,ngland, Barclays Global Inves-mics, PIMCO, the 3rd annualting. All errors are ours.

avis.edu (Ò. Jordà),

rights reserved.

by the “FX Beauties” club and other Japanese retail investors during anepisode of yen appreciation; one highly-leveraged housewife lost herfamily's entire life savings within a week.1 By the fall of 2008 attentionwas grabbed by an evenmore brutal squeeze on carry traders from big-ger yenmoves—e.g., up 60% against the AUD over 2months, and up 30%against GBP (including 10%moves against both in 5 hours on themorn-ing of October 24). Money managers on the wrong side saw their fundsblowing up, supporting the June 2007 prediction of Jim O'Neill, chiefglobal economist at Goldman Sachs, who had said of the carry tradethat “there are going to be dead bodies around when this is over.”2

Economists have also been paying renewed attention to the carrytrade, revisiting old questions: Are there returns to currency specula-tion?Are the returns predictable? Is there a failure ofmarket efficiency?We are still far from a consensus, but the present state of the literature,how it got there, and how it relates to the interpretations of the currentmarket turmoil, can be summarized in a just fewmoments, so to speak.3

1 Martin Fackler, “Japanese Housewives Sweat in Secret as Markets Reel,” The NewYork Times, September 16, 2007.

2 Ambrose Evans-Pritchard, “Goldman Sachs Warns of ‘Dead Bodies’ after MarketTurmoil,” The Daily Telegraph, June 3, 2007.

3 For a full survey of foreign exchange market efficiency see Chapter 2 in Sarno andTaylor (2002), on which we draw here.

Page 2: The carry trade and fundamentals: Nothing to fear but FEER itself

75Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

The first point concerns the first moment of returns. There havebeen on average positive returns to Naïve carry trade strategies forlong periods.4 Put another way, the standard finding of a “forwarddiscount bias” in the short run means that exchange rate losses willnot immediately offset the interest differential gains of the Naïvecarry strategy.5 This finding is often misinterpreted as a failure of un-covered interest parity (UIP). But UIP is an ex-ante, not an ex-post,condition. And the evidence here, albeit limited, shows that averageex-ante exchange rate expectations (e.g., from surveys of traders)are not so far out of line with interest differentials. Rather, expecta-tions themselves seem to be systematically wrong.6 In this view, ex-post profits appeared to be both predictable and profitable, seeminglycontradicting the risk-neutral efficient markets hypothesis.7 However,over periods of a decade or two it is quite difficult to reject ex-postUIP, suggesting that interest arbitrage holds in the long run, andthat the possible profit opportunities are a matter of timing.8

The second point concerns the secondmoment of returns. If positiveex-ante returns are not arbitraged away, one way to rationalize themmight be that they are simply too risky (volatile) to attract additional in-vestors who might bid them away. This explanation has much in com-mon with other finance puzzles, like the equity premium puzzle. Theannual reward-to-variability ratio or Sharpe ratio of the S&P 500, rough-ly 0.4, can be taken as a benchmark level of risk adjusted return. It is anopen question whether carry trade strategies can reliably surmountthat hurdle, even with diversification.9,10

4 We prefer to focus on the period of unfettered arbitrage in the current era financialglobalization—that is, from the mid 1980s on, for major currencies. Staring in the 1960sthe growth of the Eurodollar markets had permitted offshore currency arbitrage to de-velop. Given the increasingly porous nature of the Bretton Woods era capital controls,and the tidal wave of financial flow building up, the dams started to leak, setting thestage for the trilemma to bite in the crisis of the Bretton Woods regime in 1970–73.Floating would permit capital account liberalization, but the process was fitful, andnot until 1990s was the transition complete in Europe (Bakker and Chapple, 2002).Empirical evidence suggests significant barriers of 100 bps or more even to riskless ar-bitrage in the 1970s and even into the early 1980s (Frenkel and Levich, 1975; Clinton,1998; Obstfeld and Taylor, 2004). Taking these frictions as evidence of imperfect cap-ital mobility we prefer to exclude the 1970s and the early 1980s from our empiricalwork. Other work, e.g. Burnside et al. (2008b,c) includes data back to the the 1970s.

5 Seminal studies of the forward discount puzzle include Frankel (1980), Fama(1984), Froot and Thaler (1990), and Bekaert and Hodrick (1993).

6 Seminal works on survey expectations include Dominguez (1986), Frankel andFroot (1987) and Froot and Frankel (1989). For an update see Chinn and Frankel(2002).

7 A standard test of thepredictability of returns (the “semi-strong” formofmarket efficien-cy) is to regress returns on an ex ante information set. The seminal work is Hansen andHodrick (1980). Other researchers deploy simple profitable trading rules as evidence ofmar-ket inefficiency (Dooley and Shafer, 1984; Levich and Thomas, 1993; Engel and Hamilton,1990).

8 Support for long-run UIP was found using long bonds by Fujii and Chinn (2000)and Alexius (2001), and using short rates by Sinclair (2005).

9 It is also an open question whether expanding the currency set to include minorcurrencies, exotics, and emerging markets can add further diversification and so en-hance the Sharpe ratio, or whether these benefits will be offset by illiquidity, tradingcosts, and high correlations/skew. Burnside et al. (2006) compute first, second, andthird moments for individual currencies and portfolio-based strategies for major cur-rencies and some minor currencies. In major currencies the transaction costs associat-ed with bid-ask spreads are usually small, although 5–10 bps per trade can add up if anentire portfolio is churned every month (i.e., 60–120 bps annual). Burnside et al.(2007) explore emerging markets with adjustments for transaction costs. Both studiesconclude that there are profit opportunities net of such costs, but the Sharpe ratios arelow. Moreover, Sager and Taylor (2008) argue that the Burnside et al. returns andSharpe ratios may be overstated. Pursuing a different strategy, however, using infor-mation in the term structure, Sager and Taylor are able to attain Sharpe ratios of 0.88for a diversified basket of currencies in back testing.10 In this paper we proceed in a risk-neutral framework, a widely-used benchmark, incontrast to the more recent and contentious use of consumption-based asset pricingmodels in the carry trade context. Lustig and Verdelhan (2007) claimed that the rela-tively low Naïve carry trade returns may be explicable in a consumption-based modelwith reasonable parameters. Burnside (2011) challenged the usefulness of this ap-proach. Returns to our systematic trading rules have very different moments, however,with mean returns twice as large as the typical Naïve strategy, leaving a considerablepremium to be explained in either framework.

Finally, we turn to the third point of near consensus, on the thirdmoment of returns. Naïve carry trade returns are negatively skewed.The lower-tail risk means that trades are subject to a risk of pro-nounced periodic crashes—sometimes referred to as a peso problem.Such a description applies to the aforementioned disaster events suf-fered by yen carry traders in 2007 and 2008 (and their predecessorsin 1998). However, many competing investments, like the S&P500,are also negatively skewed, so the question is once more relative.And while individual currency pairs may exhibit high skew, weknow that diversification across currency pairs can be expected todrive down skewness in a portfolio. But for Naïve strategies somenegative skew still remains under diversification, and it is thenunclear whether excess returns are a puzzle, or required compensa-tion for skew (and/or volatility).11

In a carry trade, the only unknown is the behavior of exchangerates and at least since Meese and Rogoff (1983), the profession hasmostly settled into a belief that fundamental explanations of the ex-change rate have no predictive value in the short-run. In this paperwe show that misalignments in frictionless equilibrium conditionsin financial and goods markets, like deviations from UIP or from thefundamental equilibrium exchange rate, or FEER, have significant di-rectional classification ability: simple FX trading strategies that ex-ploit this information largely avoid crashes, attenuate lower-tailrisk, modulate volatility, boost returns, and hence improve the Sharpeand gain-loss (Bernardo and Ledoit, 2000) ratios.

The contributions of our paper are as follows.We start from commonground and confirm all prior findings for our data (a panel of G-10 cur-rencies): Naïve carry trades had been profitable over long stretches oftime.However, awindowedout-of-sample analysis indicates that returnsquickly deteriorated in 2007, and by 2008 four-year average returnsturned negative. The global financial crisis brought many investmentstrategies to their knees and the carry trade was no exception. Fromthis perspective, it looked like the carry trade puzzle had been at leastattenuated.

But a Naïve carry trade is a crude investment strategy as it makesno attempt to hedge exchange rate risk. Marginally more sophisticat-ed strategies extended by a FEER factor are a natural alternative.Moreover, by now simple nonlinear threshold models that weigh sig-nals from departures of UIP and PPP equilibrium are common place(e.g. Taylor et al., 2001; Sarno et al., 2006). Might these simple exten-sions rescue the carry trade from the roiling waters of 2008? Thisseems to be the case, indeed. FEER control not only improves statisti-cal performance, it also enhances the financial performance of tradingrules beyond that achieved by diversification alone. For simplicity andcomparability, we apply these extensions to equal-weighted portfoliosof G-10 currencies to serve as a backtesting laboratory. These improvedcarry trade strategies generate positive returnswithmoderate volatilityand positive skewand for the four year average ending in 2008 eke out a3.64% annual rate of return.

We also compare our approach to rival crash protection strategies.One suggestion is that carry trade skew is driven by market stress andliquidity events (Brunnermeier et al., 2009); butwefind that VIX signalsprovide no additional explanatory or predictive power in our preferredforecasting framework. Another approach suggests that options provideaway to hedge downside risk (Burnside et al., 2008b,c); but these strat-egies are very costly to implement compared to our trading rule. Inter-estingly, real-world trading strategies for the years up to and includingthe 2008 financial crisis offer another window into our results. TheNaïve models and crude ETFs crashed horribly in 2008, wiping outyears of gains. More sophisticated ETFs bearing some resemblance to

11 Brunnermeier et al. (2009) focus on the “rare event” of crash risk as an explanationfor carry trade profits. Burnside et al. (2008b) argue that peso problems cannot explaincarry trade profits, although in a different version of the same paper (2008b), they ar-gue to the contrary.

Page 3: The carry trade and fundamentals: Nothing to fear but FEER itself

76 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

our preferred model weathered the storm remarkably well with barelyany drawdown. 12

Absent perfect arbitrage – the ability tomake a gainwithout the pos-sibility of loss in a zero-cost investment – one can always find a pricingkernel that explains the returns of any asset. We provide formal out-of-sample predictive ability testing results (Giacomini and White, 2006)using innovative returns-, Sharpe- and skew-based loss functionsgrounded on trading performance rather than the more conventionalquadratic-loss approach. Still, it is hard to gain enough texture to tellwhether or not the observed returns are commensurate compensationfor risk.

Our work is sympathetic to a well-established view that assetprices can deviate from their fundamental value for some time, beforea “snap back” occurs (Poterba and Summers, 1986; Plantin and Shin,2007). Recent theory has focused on what might permit the devia-tion, and what then triggers the snap. In FX markets, recent modelsinclude noise traders, heterogeneous beliefs, rational inattention, li-quidity constraints, herding, habits, “behavioral” effects, and otherfactors that may limit arbitrage (e.g., Backus et al., 1993; Shleiferand Vishny, 1997; Jeanne and Rose, 2002; Bacchetta and vanWincoop, 2006; Fisher, 2006; Brunnermeier et al., 2009; Ilut, 2010).To begin a conversation on these topics, we combine notions of ap-proximate arbitrage (Bernardo and Ledoit, 1999, 2000) with classifi-cation ability using tools popular in biostatistics, machine learning,and signal processing but rarely used in economics.13 In addition toproviding formal statistics on returns-weighted directional ability,the gain-loss ratios that we report form a basis of comparison withequity gain-loss ratios. Thus, our work also provides a starting pointfor describing the set of allowable pricing kernels that would explainthe data.

2. Currency strategy design

Strict absence of arbitrage – defined as zero-cost investments thatoffer the possibility of gain with no possibility of loss – is a fundamen-tal condition for equilibrium in asset markets. In this section we knittogether basic no-arbitrage conditions in international economics toderive a predictive framework that nests commonly used carrytrade strategies with alternatives that we propose. Ultimately thegoal will be to assess which of these strategies generates approximatearbitrage opportunities (in the sense of Bernardo and Ledoit, 1999).Such opportunities are possibly incompatible with well-functioningcapital markets. Since any strict non-arbitrage portfolio will be fairlypriced for some set of investor preferences, it will be important forus to design a statistical metric that provides meaningful bounds forthe set of pricing kernels that correctly price all portfolios. This wedo in Section 4 below.

Ex-post nominal excess returns to a carry trade are:

xtþ1 ¼ Δetþ1 þ i�t−it� �

; ð1Þ

where et+1 is the log nominal exchange rate in U.S. dollars per foreigncurrency, and it and it* are nominal interest rates home (U.S.) and abroadfor a riskless deposit with a one-periodmaturity. In a frictionless world,the carry trade is a zero-cost investment; thus, in the absence of arbi-trage the expected excess return to a risk-neutral investor should bezero, that is Etxt+1=EtΔet+1+(it*− it)=0. This is the well known un-covered interest rate parity (UIP) condition.

12 Pojarliev and Levich (2007, 2008) argue that active managers deliver less alphathan claimed, and that much of their returns can be described as quasi-beta with re-spect to style factors like carry, momentum, and value that are now captured in passiveindices and funds.13 The credit risk literature is an exception (see, e.g., Khandani et al., 2010).

Excess returns can also be expressed in real terms. Let pt+1 andpt*+1 denote the logarithm of the price level home and abroad sothat πt+1=Δpt+1 is the domestic inflation rate (and similarly forthe inflation rate abroad, πt*+1). Define the real exchange rate asqtþ1 ¼ q þ etþ1 þ p�tþ1−ptþ1

� �, then it is clear that Eq. (1) can be re-

written as:

xtþ1 ¼ Δqtþ1 þ r�t−rt� �

; ð2Þ

where Etrt= it−Etπt+1 denotes the real interest rate (and similarlyfor Etrt*). Under the assumption of (weak) purchasing power parity,q is the mean FEER to which qt reverts in the long-run implying thatqt−q is a stationary variable and hence a cointegrating vector. In amore general setting, q could include time varying elements (e.g. pro-ductivity differentials as in the Harrod–Balassa–Samuelson frame-work, see Chong et al., 2012), which we do not explore directly inthis paper due to space considerations.

Expressions (1) and (2) are naturally nested in the stochastic pro-cess of the stationary system14:

Δytþ1 ¼Δetþ1

π�tþ1−πtþ1i�t−it

24

35; ð3Þ

where deviations of the real exchange from FEER are a cointegratingvector. Further assuming linearity, a vector error correction model(VECM) is a natural benchmark representation for the stochastic pro-cess that describes the evolution of Δyt+1.

The last row of expression (3) may look peculiar for two reasons:(1) there is no natural integrand for it*− it, and (2) this variable isdated at time t. However, neither of these two characteristics poseany difficulty. Broadly speaking, underlying the cointegrating vectoris the law of one price applied to goods trading across borderswhere it*− it does not naturally appear. The timing convention is sim-ply due to the fact that the returns from holding a riskless one-periodbond are known with certainty when the transaction is entered.

Absence of arbitrage under risk-neutrality is directly linked to the pre-dictability of thefirst row ofΔyt+1. Thus a natural predictor of Δet+1 isbased on a linear combination of lagged information on Δet, (πt*−πt),it*− it and qt−q. Our analysis is based on a panel of nine countries:Australia, Canada, Germany, Japan, Norway, New Zealand, Sweden,Switzerland, and theUnitedKingdomset against theU.S. over the periodJanuary 1986 to December, 2008, observed at a monthly frequency (i.e.,the “G10” sample).15 The dataset includes the end-of-month exchangerates and the one-month London interbank offered rates (LIBOR) andconsumer price indices for the U.S. and any of the nine counterpartycountries.16

A natural impulse is to determine the stationarity properties of theconstituent elements of expression (3) so as to verify that qt+1 is aproper cointegrating vector. Such steps are common when the objec-tive of the analysis is to directly examine whether purchasing powerparity holds in the data; when one is interested in determining thespeed of adjustment to long-run equilibrium; and hence to ensurethat the estimators and inference are constructed with the appropri-ate non-standard asymptotic machinery. However interesting it is toinvestigate these issues, they are of second order importance for ouranalysis given our stated focus on classification ability and derivation

14 It would be straightforward to augment this model with an order-flow element, asin the VAR system of Froot and Ramadorai (2008).15 Data for the Germany after 1999 are constructed with the EUR/USD exchange rateand the fixed conversion rate of 1.95583 DEM/EUR.16 Data on exchange rates and the consumer price index are obtained from the IFS da-tabase whereas data on LIBOR are obtained from the British Banker's Association(www.bba.org.uk) and Global Financial Data (globalfinacialdata.com).

Page 4: The carry trade and fundamentals: Nothing to fear but FEER itself

77Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

of profitable investment strategies—the Beveridge–Nelson represen-tation of the first equation in expression (3) is valid regardless ofwhether or not there is cointegration. For this reason, we provide afar less extensive analysis of these issues for our panel of data thanis customary and refer the reader to the extensive literature onpanel PPP testing (e.g., see Taylor and Taylor, 2004; and our relatedwork Chong et al., 2012).

Instead, we investigate the properties of qt+1 directly with a bat-tery of cointegration tests that pool the data to improve the power ofthe tests. These tests also account for different forms of heterogeneityin cross-section units and are based on the work of Pedroni (1999,2004). The results of these tests are summarized in Table 1, withthe particulars of each test explained therein. Broadly speaking,while not overwhelming, it seems reasonable to conclude that thedata support our thinking of qt+1 as a valid cointegrating vector.

While the results of Table 1 are informative regarding the relativestrength of PPP, we will show momentarily that there is valuable pre-dictive information contained in real exchange rate fundamentals foruse in one period-ahead directional predictions, which might form thebasis of a trading strategy. Hence, we try various forecasting models,from Naïve models (e.g., simple carry trades) to more sophisticatedmodels (e.g., threshold models of UIP and PPP dynamics). The modelsare then evaluated in two ways: first from a statistical standpoint, bylooking at the in-sample fit and out-of-sample predictive power; andsecond from a trading standpoint, by examining how implementingthese strategies would have affected a currency manager's returns, vol-atility, Sharpe ratios, gain-loss ratios and skewness for the sampleperiod.

3. A trading laboratory

The previous section sets the stage for the design of carry trade strat-egies. This section calculates the equal-weighted currency portfolioreturns of alternative designs out-of-sample. At first blush, it appearsthat a little sophistication can generate returns, and that these returnswould have survived the recent global financial crash. Whether thisperformance holds up to statistical scrutiny and whether it threatensstandard notions of equilibrium pricing are questions that are exploredin the following sections.

The most basic design consists in arbitraging bilateral interest-ratedifferentials and hence does not require a model per se: the popularNaïve carry trade assumes that exchange rates are an unpredictable ran-domwalk. But such a view rests solely on notions about UIP so it is nat-ural to pit the Naïve carry against a model that also entertains a PPP

Table 1Panel cointegration tests.

Test Constant Constant+Trend

Raw Demeaned Raw Demeaned

stats p-value stats p-value stats p-value stats p-value

Panelν 2.50** 0.017 3.82** 0.000 0.14 0.395 2.04* 0.050ρ −1.68* 0.097 −2.22** 0.034 −0.56 0.340 −1.78* 0.079PP −1.59 0.113 1.68* 0.097 −1.09 0.220 −1.71* 0.093

ADF 0.99 0.245 −0.54 0.345 1.27 0.179 −0.39 0.370Group

ρ −0.51 0.350 −1.71* 0.092 0.48 0.356 −2.12** 0.042PP −0.98 0.246 −1.65 0.103 −0.40 0.368 −2.06** 0.048ADF 7.31** 0.000 0.59 0.335 7.16** 0.0009 0.74 0.303

Notes: We use the same definitions of each test as in Pedroni (1999). The tests areresidual-based. We consider tests with constant term and no trend, and with constantand time trend. The first 4 rows describe tests that pool along the within-dimensionwhereas the last 3 rows describe tests that pool along the between-dimension. The testsallow for heterogeneity in the cointegrating vectors, the dynamics of the error processand across cross-sectional units. ** (*) denotes rejection of the null of no-cointegrationat the 95% (90%) confidence level. The sample is January 1986 to December 2008.

factor. We call this the Naïve ECM strategy. A more sophisticated inves-torwould then consider including lagged information, a strategywe callVECM if it includes a PPP factor, or simply VAR if it does not. Finally,Coakley and Fuertes (2001), Sarno et al. (2006) and others have foundthat exchange rates behave like a random walk as long as deviationsfrom UIP and PPP are not large, in which case reversion to paritytends to happen rather quickly. Hence, the last strategy we entertainis a simple threshold model based on these precepts and which wecall TECM. There are undoubtedly numerous other alternatives (espe-cially when the nonlinear door is opened) but our objective is not tosearch for the “best” strategy. Rather, we want to show that a little so-phistication can have an unexpectedly big payoff.

Currency funds invest in a portfolio of currencies. This allowsfunds to take advantage of gains from diversification: standard statis-tical intuition (cf. the Central Limit Theorem) tells us that varianceand skewness can be curtailed by taking an average of draws thatare imperfectly correlated. 17 As a practical matter for many tradingstrategies studied in the academic literature, and some used in prac-tice, the portfolio choice is restricted—for reasons of simplicity, com-parability, diversification and transparency. The simplest restriction isto employ simple equal-weight portfolios where a uniform bet of size$1/N is placed on each of N FX-USD pairs, so that the only forecast-based decision is the binary long-short choice for each pair in eachperiod. Granted, lifting this restriction to allow adjustable portfolioweights must, ipso facto, allow even better trading performance, atleast on backtests. But from our standpoint of presenting a puzzle –

showing a strategy with high returns, Sharpes, gain-loss ratios, andlow skewness – meaningful forecasting success even when restrictedto an equal-weight portfolio is evidence enough to support ourstory.18

A typical investor prefers to assess the performance of alternativestrategies through out-of-sample analysis since any in-sample resultscould be the result of over-fitting. However, the recent globalfinancial cri-sis adds a newwrinkle to this assessment: How sensitive are the conclu-sions to the inclusion/exclusion of this once-in-a-lifetime event? Toinvestigate this issue we consider three windows of fixed length (so asto make the statistics comparable across samples) over which out-of-sample returns are calculated: 2002–2006; 2003–2007; and 2004–2008.That is, one-step ahead forecasts are generated, one at a time, with a roll-ing window that begins with the January 1986 observation and finisheswith the December 2001 observation (say, for the first of the windows).The evolution of trading performance over time turns out to be quite re-vealing, cementing the consequence of the puzzles we uncover.

The basic format of the analysis below is as follows. We report in-sample estimates of the models along with basic statistics to get asense of the contribution of each explanatory factor. Next we com-pute out-of-sample returns using the three windows just described.For each window we report the average returns of the strategy overthe period, their standard deviation and hence their annualizedSharpe ratio, their skewness and their annualized gain-loss ratio(Bernardo and Ledoit, 2000). The Sharpe ratio is a common measureof performance which has been found unreliable in some circum-stances (see, e.g. Cochrane and Saá-Requejo, 2000, and Černý,2003). The gain-loss ratio of Bernardo and Ledoit (2000) overcomes

17 A numerical illustration is provided by Burnside et al. (2008a).18 For example, we have experimented with linear-weight portfolios based on inter-est differential weights, or on forecast return weights, and these can sometimes attainhigher in-sample returns and Sharpe ratios. These results are not shown. One could al-so go further and attempt to construct “optimized” portfolios based on mean-varianceoptimization. However, the value of this approach remains in question as recent em-pirical work shows that in many contexts the supposed performance gains of mean-variance optimization may be minimal or illusory (DeMiguel et al., 2009). In additionthe covariance matrices so estimated can produce extreme weights. As with anyunequal-weight approach this can lead, in practice, to recurring trades where portfoliomanagers bump into position limits for single currencies imposed either by manage-ment or by mandate. Thus, like other work in this literature, we prefer to stick to thecleaner and simpler equal-weight approach in this paper.

Page 5: The carry trade and fundamentals: Nothing to fear but FEER itself

78 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

these shortcomings and as we will see, has a natural connection to thenew statistical metrics that we introduce below.

3.1. Two Naïve models

Table 2 presents the one-month ahead forecasts of the change inthe log nominal exchange rate based on two Naïve models. Model 1,denoted “Naïve,” implicitly imposes the assumption that the ex-change rate follows a random walk. All regressors are omitted sothere is, in fact, no estimated model. This is simply the benchmarkor null model.

However crude it may be, this is still a model that one might use as abasis for trading – long high yield, short low yield – so panel (b) exploresthe out-of-sample trading performance of an equal-weighted currencyportfolio with all nine currencies against the U.S. dollar based on sucha strategy. This is done over the three out-of-sample windows describedabove. In this table, and in what follows, we report the profits the inves-tor would have actually made based on an equal-weighted portfolio oflong-short trades in each currency, in each period, with direction givenby the sign of the model's return forecast.

Over the initial window ending in 2006, realized profits are a healthy54 basis points (bps) per month, with low volatility and skew, thus gen-erating an impressive 1.59 annualized Sharpe ratio and a remarkable2.98 gain-loss ratio (expected gains are about three times expectedlosses). As a reference, Bernardo and Ledoit (2000) calculate the gain-loss ratio for the stock market index over the period 1959 to 1997 tobe 2.6 assuming logarithmic utility. However, as the sample windowrolls over 2007 and then 2008, when the financial crisis is in full swing,the picture changes quite dramatically. The same currency portfolio

Table 2Two Naïve models.

Model (1) (2)

Naïve Naïve ECM

(a) Model parametersDependent variable Δeit+1 Δeit+1

log(qit−qi) — −0.0235**(0.0047)

R2, within — .0119F — F(1,2447)=25.39Fixed effects — YesPeriods 276 276Currencies (relative to US$) 9 9Observations 2484 2484

(b) Out-of-sample trading performance †

Realized profits 2002–2006Mean 0.0054 0.0062Std. dev. 0.0117 0.0145Skewness −0.2181 0.5792Sharpe ratio (annualized) 1.5884 1.4870Gain/loss ratio 2.9819 3.3678

Realized profits 2003–2007Mean 0.0030 0.0036Std. dev. 0.0120 0.0131Skewness −0.0630 0.8712Sharpe ratio (annualized) 0.8697 0.9466Gain/loss ratio 1.8511 2.2078

Realized profits 2004–2008Mean −0.0024 −0.0012Std. dev. 0.0179 0.0143Skewness −2.9748 −1.6695Sharpe ratio (annualized) −0.4715 −0.2868Gain/loss ratio 0.6438 0.7773

†Statistics for trading profits are net, monthly, from a equally weighted portfolio.Notes: Standard errors in parentheses; ** (*) denotes significant at the 95% (90%)confidence level. Estimation is by ordinary least squares. In column (1), no model isestimated as model coefficients are assumed to be zero and e follows a random walk.Estimation in column (2) is by ordinary least squares. The sample is January 1986 toDecember 2008.

would now be losing money at a rate of 24 bps per month, with consid-erable increases in volatility and negative skew, resulting in a −0.47Sharpe ratio and a gain-loss ratio of 0.64, below the strict no-arbitragevalue of 1.

Model 2, denoted “Naïve ECM,” augments the Naïve model: weembrace the idea that real exchange rate fundamentals may have pre-dictive power, in that currencies overvalued (undervalued) relative toFEER might be expected to face stronger depreciation (appreciation)pressure. Now in panel (a) the one-month-ahead forecast equationhas just a single, time-varying regressor, the lagged real exchangerate deviation, plus currency-specific fixed effects which are notshown. The coefficient on the real exchange rate term of−0.02 is sta-tistically significant and indicates that reversion to FEER occurs at aconvergence speed of about 2% per month on average, which impliesthat deviations have a half-life of 36 months—within the consensusrange of the literature.

In panel (b) the potential virtues of a FEER based trading approachstart to emerge, but only weakly. Performance in the window endingin 2006 is even better than with the Naïve model at 62 bps per month,albeit with slightly higher volatility, explaining the lower Sharpe ratio,now at 1.49. However, the gain-loss ratio is instead 3.37, even more im-pressive. But just as before as thewindows roll over to include the finan-cial crisis, gains turn to losses (12 bps per month), positive skew tonegative skewwith a Sharpe ratio of−0.29 and a gain-loss ratio of 0.78.

By focusing mainly on unconditional Naïve carry trade returns,some influential academic papers have concluded that crash riskand peso problems are important features of the FX market. Table 2may be extremely Naïve, but it suffices to launch the start of a coun-terargument. It shows that an allowance for real exchange rate funda-mentals can limit adverse skewness from the carry trade, and it isindeed well known that successful strategies do just that, as weshall see in a moment. However, we do not stop here, since wehave the tools to explore superior but simple model specificationsthat deliver trading strategies with even higher ex-post actualreturns, lower volatilities, and negligible (or positive) skew.

3.2. Two linear models

Table 3 shows one-month-ahead forecasts from the richerflexible-form exchange rate models discussed in the previous section.Model 3 is based on a regression of the change in the nominal ex-change rate on the first lag of the change in the nominal exchangerate, inflation, and the interest differential (this model is labeledVAR as it corresponds to the first equation of a vector autoregressivemodel for Δyt+1 in expression 3). Model 4 extends the specificationwith the real exchange rate level, and is labeled VECM by analogywith vector error-correction.

As with the Naïve ECM (Model 2) a plausible convergence speedclose to 3% per month is estimated. Other coefficients conform to stan-dard results and folk wisdom. The positive coefficient on the laggedchange in the nominal exchange rate of 0.12 suggests a modest but sta-tistically significant momentum effect. The positive coefficient on thelagged interest differential, closer to +1 than −1, conforms to thewell known forward discount puzzle: currencies with high interestrates tend to appreciate all else equal. The positive coefficient on the in-flation differential conforms to recent research findings that “bad newsabout inflation is good news for the exchange rate” (Clarida andWaldman, 2008): under Taylor rules, high inflation might be expectedto lead to monetary tightening and future appreciation.

We can see that these models offer some improvement as comparedto the Naïve ECM (Model 2). The R2 is higher and the explanatory vari-ables are statistically significant. They alsoweather thefinancial crisis bet-ter than the Naïvemodels in Table 2. During the window ending in 2006,returns are 45 and 57 bps per month respectively for the VAR and VECMmodels, which are slightly lower than the 54 and 62 bps permonth of theNaïve and Naïve ECM models, resulting in a lower Sharpe and gain-loss

Page 6: The carry trade and fundamentals: Nothing to fear but FEER itself

Table 3Two linear models.

Model (3) (4)

VAR VECM

(a) Model parametersDependent variable Δeit+1 Δeit+1

Δeit 0.1218** 0.1333**(0.0267) (0.0268)

iit∗ − iit 0.2889 0.7555**

(0.3529) (0.3506)log(πit∗ −πit) 0.2325 0.1690**

(0.1603) (0.0047)log(qit−qi) – −0.0277**

(0.0047)R2, within 0.0164 0.0319F F(3,2427)=8.66 F(4,2426)=14.36Fixed effects Yes YesPeriods 276 276Currencies (relative to US$) 9 9Observations 2484 2484

(b) Trading performance†Realized profits 2002–2006

Mean 0.0045 0.0057Std. dev. 0.0145 0.0144Skewness 0.1062 −0.1141Sharpe ratio (annualized) 1.0706 1.3641Gain/loss ratio 2.2303 2.7335

Realized profits 2003–2007Mean 0.0022 0.0021Std. dev. 0.0145 0.0142Skewness 0.0706 −0.5490Sharpe Ratio (annualized) 0.5239 0.5161Gain/loss ratio 1.4734 1.4547

Realized profits 2004–2008Mean 0.0004 0.0027Std. dev. 0.0153 0.0161Skewness 0.0210 0.3148Sharpe ratio (annualized) 0.0834 0.5779Gain/loss ratio 1.0644 1.5646

†Statistics for trading profits are net, monthly, from a equally weighted portfolio.Notes: Standard errors in parentheses; ** (*) denotes significant at the 95% (90%)confidence level. Estimation is by ordinary least squares. The sample is January 1986to December 2008.

79Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

ratios as well. This remains true as we move on to the early stages of thefinancial crisis with thewindow ending in 2007 but by the time the crisiserupts, the advantages of these more sophisticated strategies begin toemerge. The VAR model ekes out barely positive returns of 4 bps permonth which deliver a pyrrhic Sharpe ratio and a gain-loss ratio indistin-guishable from the strict no-arbitrage normof one. In contrast, adding theFEER term seems tomake a big difference—returns over this period are 27bps per month, skew is slightly positive and thus the Sharpe ratio is 0.58and the gain-loss ratio a respectable 1.56.

Statistically the VECM is preferred with R2 and F-statistic twice aslarge, and the coefficient on the lagged real exchange rate highly signif-icant. However, the key finding is that when trades are conditional ondeviations from FEER, actual returns have a skew that is zero or mildlypositive.

Table 4Nonlinearity tests.

Hypothesis tests Test statistic

H0: Nonlinear in |iit* − iit| and |log(qit−qi)| versus:(a) H1: Nonlinear in |log(qit−qi)| and F(44,2337)=4.28linear in |iit* − iit| p=0.0000(b) H2: Nonlinear in |iit* − iit| and F(52,2337)=1.42linear in|log(qit−qi)| p=0.0262(c) H3: Linear in |iit* − iit| and F(96,2337)=28.04linear in |log(qit−qi)| p=0.0000

Notes: p refers to the p-value for the LM test of the null of linearity against the alterna-tive of nonlinearity based on a polynomial expansion of the conditional mean with theselected threshold arguments. See Granger and Teräsvirta (1993).

3.3. A threshold model

Are these the best models we can find? We think not, due to thelimitations of linear models. Two arguments force us to reckon withpotential nonlinearities in exchange rate dynamics. The first possiblereason for still disappointing performance in the VECM (Model 4) isthat the model places excessively high weight on the raw carry signalemanating from the interest differential in some risky “high carry”circumstances. This may lead to occasional large losses, even if it isnot enough to cause negative skew. Recall that Naïve Model (2)placed zero weight on this signal as part of an exchange rate forecast,

whereas here the weight is 0.76. We conjecture that the true relation-ship may be nonlinear. Here we follow an emerging literature whichsuggests that deviations from UIP are corrected in a nonlinear fash-ion: that is, when interest differentials are large the exchange rate ismore likely to move against the Naïve carry trade, and in line withthe unbiasedness hypothesis (e.g., Coakley and Fuertes, 2001; Sarnoet al., 2006).

The second possible reason for disappointing performance mightbe related not to nonlinear UIP dynamics, but rather nonlinear PPPdynamics. Here it is possible that VECM places too little weight onthe real exchange rate signal in some circumstances, e.g. when cur-rencies are heavily over/undervalued and thus more prone to fall/rise in value. Here again, we build on an influential literature whichpoints out that reversion to Relative PPP, largely driven by nominalexchange rate adjustment, is likely to be more rapid when deviationsfrom FEER are large (e.g., Obstfeld and Taylor, 1997; Michael et al.,1997; Taylor et al., 2001).

To sum up, past empirical findings suggest that the parameters ofthe Naïve ECMModel (2) may differ between regimes with small andlarge carry trade incentives, as measured by the lagged interest ratedifferential; and also may differ between regimes with small andlarge FEER deviations, as measured by the lagged real exchangerate. In addition to these arguments for a nonlinear model, the impor-tant role of the third moment in carry trade analyses also bolsters thecase, given the deeper point made in the statistical literature thatskewness cannot be adequately addressed outside of a nonlinearmodeling framework (Silvennoinen et al., 2008).

To explore these possibilities we extend the framework further toallow for nonlinear dynamics. We first perform a test against nonli-nearity of unknown form and find strong evidence of nonlinearity de-rived from the absolute magnitudes of interest differentials and realexchange rate deviations. We then implement a simple nonlinearthreshold error correction model (TECM). For illustration, we employan arbitrarily partitioned dataset with thresholds based on medianabsolute magnitudes.

Table 4 presents the nonlinearity tests using Model (4) in Table 3,the VECMmodel, where the candidate threshold variables are the ab-solute magnitudes of the lagged interest differential |iit*∗− iit| and thelagged real exchange rate deviation qit−�qij j. Although more specificnonlinearity tests tailored to particular alternatives would be morepowerful, we are already able to forcefully reject the null with genericforms of neglected nonlinearity by simply taking a polynomial expan-sion of the linear terms based on the threshold variables. This type ofapproach is often used in testing against linearity when a smoothtransition regression (STAR) model is specified but it is clearly notlimited to this alternative hypothesis (see Granger and Teräsvirta,1993). The test takes the form of an auxiliary regression involvingpolynomials of each variable, up to the fourth power.

The results in Table 4 show that the hypothesis of linearity can berejected at standard significance levels for specifications based onwheth-er the threshold variables are considered individually or simultaneously.Thus our VECM Model (4) in Table 3 appears to be misspecified, and so

Page 7: The carry trade and fundamentals: Nothing to fear but FEER itself

80 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

we are moved to develop a more sophisticated yet parsimonious thresh-oldmodel. For illustration, we use a simple four-regime TECM, where theregimes are delineated by threshold values taken to be the median levelsof the absolute values of the regressors Iit=|iit*− iit| and Qit ¼ qit−�qij j,which for simplicity will be denoted by θi and θq respectively.

Estimates of this model, denoted Model (5) in Table 5, show thatfrom both an econometric and a trading standpoint, the performanceof the nonlinear model is outstanding. It also accords with manywidely recognized state-contingent FX market phenomena.

We see from panel (a) in Table 5 that the estimated model coeffi-cients differ sharply across the four regimes. The null of no differencesacross regimes is easily rejected. In column 1 of Table 5 the modelperforms rather poorly in the regime where both interest differentials

Table 5Threshold model.

Model (5)

NECM

(a) Model parameters Qitbθq Qitbθq Qit>θq Qit>θq(by regime) Iitbθi Iit>θi Iitbθi Iit>θiDependent variable Δeit+1 Δeit+1 Δeit+1 Δeit+1

Δeit 0.0637 0.1228** 0.1029* 0.2027**(0.0417) (0.0494) (0.0574) (0.0567)

iit∗ − iit 2.5033** 1.7790** 0.8707 −0.5734

(1.1577) (0.6379) (1.3585) (0.7811)log(πit* −πit) 0.6731 0.3836 −0.5269 0.1331

(0.4101) (0.3880) (0.3945) (0.2173)log(qit−qi) −0.0572** −0.0564** −0.0146** −0.0364**

(0.0011) (0.0265) (0.0072) (0.0077)R2, within 0.0277 0.0419 0.0176 0.0749F F(4,642) F(4,549) F(4,554) F(4,641)

=4.63 =4.29 =1.90 =8.60Fixed effects Yes Yes Yes YesPeriods 276 276 276 276Currencies (relative to US$) 9 9 9 9Observations (by regime) 655 562 567 654

(b) Trading performance †

Realized profits 2002–2006Mean

0.0058Std. dev.

0.0128Skewness

0.2242Sharpe ratio (annualized)

1.5776Gain/loss ratio

3.3385Realized profits 2003–2007

Mean0.0033

Std. dev.0.0134

Skewness−0.0290

Sharpe ratio (annualized)0.8524

Gain/loss ratio1.9007

Realized profits 2004–2008Mean

0.0030Std. dev.

0.0129Skewness

0.0664Sharpe ratio (annualized)

0.8024Gain/loss ratio

1.8658

†Statistics for trading profits are net, monthly, from a equally weighted portfolio.Notes: Standard errors in parentheses; ** (*) denotes significant at the 95% (90%)confidence level. Estimation is by ordinary least squares.

and real exchange rate deviations are small. Still, there is weak evi-dence for a self-exciting dynamic with a very strong forward bias:the coefficient on the interest differential is large (+2.5) consistentwith traders attracted by carry incentives bidding up higher yield cur-rencies. However, for small interest differentials, we see from column3 in Table 5 that this effect disappears once large real exchange ratedeviations emerge. Still, the fit of the model remains quite poor in col-umn 3 also.

In column 2 of Table 5, where the interest differential is above medi-an, this self-exciting property is also manifest clearly, with a coefficientof+1.8 that is statistically significant at the 5% level. The results stronglysuggest that wide interest differentials take exchange rates “up thestairs” on a gradual appreciation away from fundamentals. Of course,for given price levels or inflation rates, such dynamics quickly cumulatein ever-larger real exchange rate deviations, moving themodel's regimeto column 4 in Table 5, where the fit of the model is best, as shown by ahigh R2 and F-statistic. And here the dynamics dramatically change suchthat the exchange rate can rapidly descend “down the elevator”: thelagged real exchange rate carries a highly significant convergence coeffi-cient of 3.6% per month, working against previous appreciation of thetarget currency, and reverse exchange rate momentum then snowballs,with the lagged exchange rate change carrying a now significant andlarge coefficient of +0.2.

In addition to being the preferredmodel based on statistical tests, doesthe nonlinear TECMModel (5) in Table 5 deliver better trading strategies?Panel (b) in Table 5 suggests so. In the calmer period, where the sampleends in 2006, returns are second only to the Naïve ECM (Model 2 inTable 2) at 58 bps per month, positive skew, a Sharpe of 1.58 and an ex-traordinary gain-loss ratio of 3.34. TECM remains a strong performer asthe financial crisis creeps in, with monthly returns of 33 bps per monthand second only to the Naïve ECM model again. The gain-loss ratio stillmakes expected average gains almost twice expected average losses.

However, during the financial crisis, the returns on the TECMstrategy only fall from 33 to 30 bps amonthwhereas Naïve ECM returnsfall into negative territory from 36 bps to −12 bps per month. All thewhile, the TECM strategy maintains a neutral skew, a Sharpe of 0.80and only a marginally lower gain-loss ratio of 1.87. Thus, a simple non-linear extension of the VECM model generates considerable protectionagainst crash risk while providing reasonably high returns.

3.4. Where do performance improvements come from?

What explains the trading performance of TECM? Table 6 breaksdown the fraction of profitable trading positions correctly called byeach model, both in-sample and for all three out-of-sample windows.The basic conclusion is that there are no discernible differences acrossstrategies, as is most easily seen when looking at the pooled results.Even in the out-of-sample window ending in 2008 where TECM ap-pears to dominate Naïve strongly, the difference between correctclassification rates is 54% versus 51%. So what is so great about ourso-called refined models?

The answer is given in Table 7 and explored more formally in thenext section. It is not the ability to pick the right direction of tradesmore often; it is the ability to pick the right direction on trades that arelikely to result in large gains or losses. For simplicity, this table comparesjust the Naïve and TECM models. In-sample (panel a) we find that over2454 pooled trades, the two models agreed on direction 1556 timesand disagreed 898 times. When in agreement, returns were 66 bps permonth, annual Sharpe was 0.81 and skew was −0.47. The worst tradelost 12% in a month, the best gained about the same, 12%. But whenthe models took opposite sides, the differences were stark. The Naïvemodel collected negative returns of 41 bps, the TECM just the opposite.Annual Sharpes were −0.44 versus 0.44. Skew was −0.85 versus 0.85.Naïve suffered a worst trade of 18%, but the TECM had that on the plusside. TECM'sworstmonthwas only a 10% loss, and that was Naïve's best.

Page 8: The carry trade and fundamentals: Nothing to fear but FEER itself

Table 6Fraction of profitable trading positions correctly called.

Naïve Naïve ECM VAR VECM TECM

(a) In sampleAustralia 0.63 0.57 0.57 0.56 0.56Canada 0.58 0.52 0.57 0.56 0.55Germany 0.52 0.53 0.56 0.54 0.56Japan 0.58 0.54 0.61 0.60 0.56New Zealand 0.63 0.59 0.62 0.58 0.59Norway 0.57 0.55 0.57 0.57 0.58Sweden 0.63 0.51 0.66 0.66 0.62Switzerland 0.54 0.55 0.58 0.55 0.58UK 0.53 0.55 0.49 0.53 0.54Pooled 0.58 0.55 0.58 0.57 0.57

(b) Out of sample 2002–2006Australia 0.63 0.60 0.60 0.55 0.68Canada 0.60 0.63 0.50 0.63 0.57Germany 0.60 0.58 0.53 0.58 0.57Japan 0.55 0.63 0.55 0.63 0.55New Zealand 0.75 0.75 0.72 0.75 0.73Norway 0.62 0.63 0.58 0.58 0.58Sweden 0.67 0.58 0.67 0.67 0.67Switzerland 0.52 0.57 0.55 0.52 0.53UK 0.55 0.57 0.57 0.55 0.50Pooled 0.61 0.62 0.59 0.61 0.60

(c) Out of sample 2003–2007Australia 0.62 0.58 0.57 0.50 0.63Canada 0.52 0.57 0.57 0.58 0.58Germany 0.50 0.48 0.43 0.48 0.48Japan 0.55 0.63 0.57 0.62 0.55New Zealand 0.72 0.72 0.67 0.67 0.63Norway 0.52 0.52 0.52 0.48 0.50Sweden 0.60 0.55 0.62 0.63 0.62Switzerland 0.52 0.53 0.53 0.50 0.53UK 0.55 0.50 0.53 0.48 0.45Pooled 0.56 0.56 0.56 0.55 0.55

(d) Out of sample 2004–2008Australia 0.55 0.52 0.55 0.47 0.62Canada 0.43 0.50 0.47 0.55 0.48Germany 0.45 0.43 0.42 0.48 0.47Japan 0.57 0.62 0.62 0.65 0.62New Zealand 0.62 0.62 0.58 0.58 0.58Norway 0.47 0.47 0.52 0.52 0.53Sweden 0.53 0.50 0.57 0.60 0.60Switzerland 0.53 0.52 0.57 0.57 0.48UK 0.45 0.40 0.50 0.48 0.43Pooled 0.51 0.51 0.53 0.54 0.54

Note: The in-sample period is 1986 to 2008.

81Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

The out-of-sample results in panels (b)–(d) reinforce this story, espe-ciallywhen considering panel (d), which refers to the out-of-samplewin-dow ending in 2008 in the middle of the global financial crisis. When themodels agreed, returnswere virtually zero. The largest loss is about 12% ina month again, and the largest gain 10%. But when the models disagreed,mean returns are a 76 bps loss for Naïve, and a 76 bps gain for TECM. Thisis probably largely due to the observation that the worst loss for Naïvewas 18% in a month, which was the largest gain for TECM. On the otherhand, the largest loss for TECMwas only about 7% and thatwas the largestgain for Naïve. With these results it is not hard to see that when themodels disagree, skewness is highly negative for Naïve at −0.73 but ofthe opposite sign for TECM.

We conclude this section by asking two final questions: (1) Howmany of the most profitable trades took place in the extreme regimes ofthe TECM strategy where it differs most from Naïve? And, how muchchurning is there in the portfolios? To answer the first question, we con-sidered the 898 trades were the models disagree in the in-sample andthen tabulatedwhat portion of the trades belonged to each of the four re-gimes in the TECM strategy: 28% of the trades fall inside both the UIP andPPP thresholds; 22.5% exceed theUIP threshold but not the PPP threshold;29.2% exceed the PPP but not the UIP threshold; and 20.4% exceed boththresholds are exceed. Therefore, there does not seem to be a specific con-dition that is plucking the best trades consistently.

Each time a position is reversed the trader is likely to incur trans-action fees that eat into returns. We abstract from this matter alto-gether but report here some basic figures to assess its possibleimpact and therefore answer the second of the questions posedabove. We constructed a churning index that counts the number oftimes that a position is reversed. That is, if a strategy always goeslong in one currency and short in another, churning would be 0. Butif on the other hand, positions were reversed each period, churningwould be 1. As a benchmark, we calculated the churning ratio forthe perfect foresight trader—one that picks the correct direction oftrade each time. We did this in-sample and for each of the out-of-sample windows but the results are very similar throughout andtherefore not reported for the out-of-sample windows. In-sample,churning varies by country anywhere between 38% of the time for po-sitions against Sweden and 52% for positions against the U.K. In con-trast, the Naïve strategy is considerably cheaper to execute, withchurning varying from 1% for trades against Australia to 6% for tradesagainst the U.K. This is not surprising given the basic premise of theNaïve strategy. What about the TECM strategy? Churning ranges be-tween a low of 18% for trades against Japan to a high of 31% for tradesagainst the U.K., which is much more commensurate with the perfectforesight trader but presumably slightly more costly to implement inpractice.

4. Evaluating carry trade returns and risk

Is this all too good to be true? Recall that the perfect foresightreturns to a carry trade are

xtþ1 ¼ Δetþ1 þ i�t−it� �

:

If we denote xtþ1 as the one period-ahead forecast and noticingthat (it*− it) is known at time t, then the only source of uncertaintyin this forecast comes from Δetþ1: It appears then that formal evalua-tion of carry trade returns boils down to a standard predictive evalu-ation of nominal exchange rates, but at least since Meese and Rogoff(1983) the consensus view is that exchange rate forecasts are no bet-ter than a random walk when judged by fit (RMSE).

A moment's reflection reveals that an investor's objectives are notwell aligned with this point of view. The reason is that the investoruses Δetþ1 to determine the direction of trade but Δetþ1 does not di-rectly determine the actual profits of the investment. To see this de-fine dt+1∈{−1,1} as an indicator of the ex-post profitable directionof the carry trade, i.e., dt+1=sign(xt+1). The returns realized by theinvestor can be defined as

μ tþ1 ¼ dtþ1xtþ1;

where dtþ1 is the binary forecast for dt+1 implicit in xtþ1. In generaldtþ1 ¼ sign xtþ1−c

� �for c∈(−∞,∞). It may be tempting to restrict

c=0, as would be reasonable for a risk neutral investor. In that casewe could use standard techniques to assess direction (e.g., Pesaranand Timmermann, 1992). However, risk averse investors facing possi-bly asymmetric return distributions may wish to guard against tailrisk by choosing c≠0. And in any case, we want to weigh directionby the returns associated with each position, an issue to which we re-turn below.

Setting the forecasting problem up as a binary choice or directionproblem makes sense on a number of dimensions. First, as a statisticalmatter, making successful continuous exchange rate (and hence, return)forecasts is a harder challenge, as thoseworking in theMeese–Rogoff tra-dition have shown; but a directional forecast sets a lower bar, one that re-cent work suggests could be surmountable (e.g., Cheung et al., 2005).Second, it is all that is needed in the case of equal-weight currency port-folios. Thirdly, directional forecasting is important for traders. Like otherfunds, currency strategies face the risk of redemptions or closure if

Page 9: The carry trade and fundamentals: Nothing to fear but FEER itself

19 The origin of the ROC curve can be traced back to radar signal detection theory de-veloped in the 1950s and 1960s (Peterson and Birdsall, 1953), and its potential formedical diagnostic testing was recognized as early as 1960 (Lusted, 1960). More re-cently, ROC analysis has been introduced in psychology (Swets, 1973), in atmosphericsciences (Mason, 1982), and in the machine learning literature (Spackman, 1989). SeePepe (2003) for a survey.

Table 7When does the TECM model outperform the Naïve model?

(a) In sample N Mean Std. dev. Skewness Sharpe Min Max

All trades pooledNaïve 2454 0.0027 0.0299 −0.6752 0.3115 −0.1773 0.1180TECM 2454 0.0057 0.0295 0.1068 0.6603 −0.1202 0.1773

Trade where models agreeNaïve 1556 0.0066 0.0282 −0.4709 0.8085 −0.1202 0.1180TECM 1556 0.0066 0.0282 −0.4709 0.8085 −0.1202 0.1180

Trade where models disagreeNaïve 898 −0.0041 0.0315 −0.8539 −0.4449 −0.1773 0.1018TECM 898 0.0040 0.0316 0.8481 0.4352 −0.1018 0.1773

(b) Out of sample 2002–2006 N Mean Std. dev. Skewness Sharpe Min Max

All trades pooledNaïve 540 0.0054 0.0264 −0.1938 0.7037 −0.0832 0.0744TECM 540 0.0058 0.0263 −0.1239 0.7694 −0.0832 0.0744

Trade where models agreeNaïve 442 0.0069 0.0267 −0.2128 0.8875 −0.0832 0.0744TECM 442 0.0069 0.0267 −0.2128 0.8875 −0.0832 0.0744

Trade where models disagreeNaïve 98 −0.0013 0.0240 −0.2940 −0.1905 −0.0717 0.0520TECM 98 0.0013 0.0240 0.2940 0.1905 −0.0520 0.0717

(c) Out of sample 2003–2007 N Mean Std. dev. Skewness Sharpe Min Max

All trades pooledNaïve 540 0.0030 0.0265 −0.1867 0.3937 −0.0912 0.0744TECM 540 0.0033 0.0265 −0.0962 0.4297 −0.0832 0.0912

Trade where models agreeNaïve 400 0.0043 0.0264 −0.1913 0.5580 −0.0832 0.0744TECM 400 0.0043 0.0264 −0.1913 0.5580 −0.0832 0.0744

Trade where models disagreeNaïve 140 −0.0005 0.0267 −0.1730 −0.0680 −0.0912 0.0736TECM 140 0.0005 0.0267 0.1730 0.0680 −0.0736 0.0912

(d) Out of sample 2004–2008 N Mean Std. dev. Skewness Sharpe Min Max

All trades pooledNaïve 540 −0.0024 0.0315 −0.2683 −1.0608 −0.1773 0.1038TECM 540 0.0030 0.0314 0.5984 0.3292 −0.1215 0.1773

Trade where models agreeNaïve 347 0.0004 0.0282 −0.4949 0.0525 −0.1215 0.1038TECM 347 0.0004 0.0282 −0.4949 0.0525 −0.1215 0.1038

Trade where models disagreeNaïve 193 −0.0076 0.0362 −1.3709 −0.7264 −0.1773 0.0736TECM 193 0.0076 0.0362 1.3709 0.7264 −0.0736 0.1773

Notes: The in-sample period is 1986 to 2008. The Sharpe ratio is annualized. Other statistics are monthly.

82 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

negative returns are frequent and/or large. It is nice to pick the biggerwinners, rather than just small ones, but not if the strategy risks blowingup. The point is only amplifiedwhen funds are leveraged. Thus froma tra-der's point of view, it is important to knit together returns andexposure inevaluating carry trade strategies, and not just direction.

4.1. Hits and misses: the correct classification frontier

Variables used to determine a binary outcome are generically calledclassifiers. In this paperwewill denote such a variable as δtþ1 to indicatethat it could be a conditional mean prediction from a regression, a con-ditional probability prediction from a binary choice model, the singleindex from a dimension reduction technique, etc. A binary predictioncan therefore be constructed as dtþ1 ¼ sign δtþ1−c

� �for c∈(−∞,∞).

Intuitively, dt+1 generates an empirical mixture distribution for δtþ1,each constituent distribution associated with whether dt+1=−1 or 1.A good classifier is one where these two distributions are easily separable(a concept to be made precise below) with little to no overlap betweenthem. That way, observations of δtþ1 to the left of the threshold c have ahigher chance of being properly sorted into the dt+1=−1 bin and viceversa. At the other extreme, strict no-arbitrage is equivalent to a perfectoverlap of the constituent distributions in the mixture: any given δtþ1 isjust as likely to come from the dt+1=−1 bin as it is from the dt+1=1bin. In general, the classifier's two empirical distributions need not be

symmetric nor have the same shape, thus resulting in different profit op-portunities and exposure. Proper evaluation therefore requires character-izing the returns and risk associated with each opportunity and not justdirectional ability.

Accounting for all these features requires new methods and thefoundation for the solutions we propose comes from techniques com-mon in biostatistics and engineering. The specific device that we drawon here is the receiver operating characteristic (ROC) curve, whichcomes from signal detection theory.19 The methods that we presentbelow combine this literature and results in Jordà and Taylor(2011), which the reader may wish to consult for a more detailed ex-planation of some concepts.

An investor considering a carry trademust entertain four possible out-comes and the probability of their occurrence. Specifically, let TP(c)=P δtþ1 > ch ���dtþ1 ¼ 1� denote the true positive rate, sometimes also called

sensitivity or recall rate, and let TN(c)= P δtþ1bch ���dtþ1 ¼ −1� denote the

true negative rate, sometimes also called specificity. The counterparts to

Page 10: The carry trade and fundamentals: Nothing to fear but FEER itself

Table 8Area Under the correct classification frontier (CCF).

Model AUC Standard error z-score p-value

(a) In sampleNaïve 0.5905 0.012 7.85 0.000Naïve ECM 0.5604 0.012 5.19 0.000VAR 0.5996 0.012 8.67 0.000VECM 0.5935 0.012 8.11 0.000TECM 0.6084 0.011 9.47 0.000

(b) Out of sample 2002–2006Naïve 0.6239 0.025 4.99 0.000Naïve ECM 0.6462 0.024 5.97 0.000VAR 0.6012 0.025 4.04 0.000VECM 0.6428 0.025 5.82 0.000TECM 0.6258 0.025 5.08 0.000

(c) Out of sample 2003–2007Naïve 0.5945 0.025 3.80 0.000Naïve ECM 0.5989 0.025 3.98 0.000VAR 0.5782 0.025 3.12 0.002VECM 0.5962 0.025 3.87 0.000TECM 0.5704 0.025 2.80 0.005

(d) Out of sample 2004–2008Naïve 0.5170 0.025 0.68 0.495Naïve ECM 0.5097 0.025 0.39 0.697VAR 0.5529 0.025 2.14 0.033VECM 0.5620 0.025 2.51 0.012TECM 0.5546 0.025 2.21 0.027

Notes: In sample is 1987 to 2008, out of sample is estimated with recursively. The AUCstatistic is a Wilcoxon–Mann–Whitney U-statistic with asymptotic normal distributionand its value ranges from 0.5 under the null to a maximum of 1 (an ideal classifier). SeeJordà and Taylor (2011).

83Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

these rates are the false negative rate FN(c)= P δtþ1bch ���dtþ1 ¼ 1� and the

false positive rate FP(c)= P δtþ1 > ch ���dtþ1 ¼ −1� with TP(c)+FN(c)=1

and TN(c)+FP(c)=1. Naturally, the investor is interested in maximizingthe “true” rates andminimizing the “false” rates by choosing c. But the na-ture of the trade-offs is a little different. If the investor chooses a small c,then as c→−∞ he ensures that TN(c)=1 but TP(c)=0, and vice versawhen he lets c→∞.

A plot of {TP(c),TN(c)}c for all values of c is a summary of all thecorrect classification combinations of a given classifier δtþ1—in eco-nomic parlance, a production possibilities frontier of correct classifi-cation. For this reason, we will call such a plot the correctclassification frontier or CCF. This curve can be displayed in the unitsquare. For an uninformative classifier, the CCF will be the line bisect-ing the unit square with slope −1: the marginal correct classificationgain in one direction results in the exact same marginal correct clas-sification loss in the other direction.

This situation corresponds to strict no-arbitrage and for a risk-neutral investor, the natural null of efficient markets. The better aclassifier is, the more the CCF bows away from the origin and abovethe simplex; and in the limit, a perfect classifier has a CCF that hugsthe top-right corner of the unit-square. This corner represents perfectarbitrage—a classifier with this CCF would permit an investor to al-ways take the profitable side of a trade. Most classifiers' CCFs exist be-tween these two extremes, thus generating approximate arbitrageopportunities. Such opportunities may be consistent with the repre-sentative investor's attitude toward risk and indeed consistent withkernels that price other assets correctly. This is an issue that we willrevisit below.

At this point, we make no mention of the returns associated witheach {TP(c),TN(c)} pair, nor we discuss yet how an investor's prefer-ences interact with the CCF (as the market with two goods would in-teract the production possibilities frontier with the utility extractedfrom each good). But it is useful to introduce a summary measure ofclassification ability based on the area under the CCF, henceforthAUC. Under strict no-arbitrage, the AUC=1/2, whereas for perfect ar-bitrage, AUC=1, with typical CCFs attaining values in-between. TheAUC turns out to be closely related to the Wilcoxon–Mann–WhitneyU-statistic—a non-parametric measure of how “separate” the two em-pirical distributions of the classifier δtþ1 are in the mixture deter-mined by the true state dt+1∈ {−1,1}. The AUC is also convenientbecause under the null of no classification ability, Hsieh andTurnbull (1996) show that its large sample distribution is Gaussian.More details on these results are provided in Jordà and Taylor (2011).

Table 9 shows the pooled AUC in- and out-of-sample for each ofthe three windows 2002–2006; 2003–2007; and 2004–2008 thatwe have been considering. The results are particularly revealing andexplain why the Naïve strategy had been so popular in recent times.The 2002–2006 window was a particularly good time for guessingthe correct direction of a trade using the interest differential, withAUC values exceeding 0.60 in all cases and the null of strict no-arbitrage easily rejected. Naïve and VAR do well but their perfor-mance improves even more when the FEER term is added.

However, in the 2004–2008window the value of these simple strat-egies comes crashing down along with the global financial system.Naïve and Naïve ECM have AUCs indistinguishable from the strict no-arbitrage value of 0.5, although VAR, VECMand TECMmanage to escapethis fate. But these results are old news: we knew from Section 3.4 thatTECM really shines, not in better directional prediction, but in spottingtrades with high returns. Thus, the next section takes our analysis tothe evaluation of classification ability weighted by returns.

20 A similar improvement to ours is the instance-varying ROC curve described in Fawcett(2006).

4.2. Gains and losses: The return-weighted CCF* curve

Tables 7 and 8 showed that although the TECMmodel does not pickwinners any better than some simpler specifications, it is able to avoid

large losses responsible for negative skewness in the distribution ofreturns. Thus, a classifier that correctly pinpoints 99 penny trades butmisses the dollar trade has almost perfect classification ability but is amoney loser. For this reason, Jordà and Taylor (2011) introduce anovel refinement to the construction of the CCF that accounts for therelative profits and losses of the classification mechanism.20 We pro-ceed in the spirit of gain–loss measures of investment performance(Bernardo and Ledoit, 2000) by attaching weights to a given classifier'supside and downside outcomes in proportion to observed gains andlosses. This alternative performance measure will give little weight toright or wrong bets when the payoffs are small; but when payoffs arelarge it will penalize classifiers for picking trades that turn out to bebig losers, and reward them for picking big winners.

To keep the new curve appropriately normalized, consider all thetrades where “long FX” was the ex-post correct trade to have made.The maximum gain from classifying all these trades correctly withdtþ1 ¼ 1 would be

L ¼ ∑dtþ1¼1

xtþ1:

Now consider all the trades where “short FX”was the ex-post cor-rect trade to have made. Similarly, the maximum gain from thesetrades would be

S ¼ ∑dtþ1¼−1

xtþ1

�� ��

since xt+1b0 when dt+1=−1.We can now redefine, or rescale, the true positive and true nega-

tive rates TP(c) and TN(c) in terms of the long-side and short-sidereturns x actually achieved, relative to the maximally attainable

Page 11: The carry trade and fundamentals: Nothing to fear but FEER itself

84 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

gains thus defined. For any given threshold c, this leads to new statis-tics as follows:

TP� cð Þ ¼∑d tþ1¼1jdtþ1¼1 xtþ1

L; and TN� cð Þ ¼

∑d tþ1¼−1jdtþ1¼−1 xtþ1

�� ��S

where we use |xt+1| to calculate TN*∗(c) since xt+1b0 when dt+1=−1.

The plot of TP*(c) versus TN*(c) is then, by construction, limited tothe unit square, and generates what we term a returns-weighted CCF,which is denoted CCF*. Associated with the CCF* curve is a corre-sponding returns-weighted AUC statistic, denoted AUC*∗. Intuitively,if AUC* exceeds 0.5 this is evidence that the classifier outperformsthe strict no-arbitrage null from a return-weighted perspective orgain–loss ratio of 1. Detailed derivation of these statistics and theirproperties is provided in Jordà and Taylor (2011), along with tech-niques for inference.

Using these return-weighted statistics strongly reinforces our argu-ment that classifiers based on more refined models (VECM/TECM) areinformative; in contrast the performance of more Naïve models is notstatistically distinguishable from the null. Table 10 collects summaryAUC* statistics for the same in- and out-of-sample periods discussedin Table 9.

The AUC*∗ statistics are generally higher than the correspondingAUC statistics reported in Table 9, and the z-scores attained aremuch higher too. The out-of-sample window 2002–2006 now con-tains AUC*∗ values at almost 0.68 (VECM), which is quite remarkable.A stark pattern starts to emerge about the benefits of including a FEERfactor as strategies that include this term seem to do best regardlessof period of evaluation. What about the last window, 2004–2008,which includes the period of financial turbulence? Here too the re-sults are very revealing. In fact, we report values of AUC*∗ below 0.5for the Naïve strategies suggesting that doing exactly the opposite ofwhat these strategies suggested would have been more profitable.As it stands, these strategies were consistent money losers, more sothan flipping a coin at conventional levels of significance. However,the VECM and especially the TECM strategies did reasonably well

Table 9Area under the returns-weighted correct classification frontier (CCF⁎).

Model AUC⁎ Standard error z-score p-value

(a) In sampleNaïve 0.5690 0.0116 5.94 0.0000Naïve ECM 0.6060 0.0114 9.26 0.0000VAR 0.6291 0.0113 11.42 0.0000VECM 0.6496 0.0112 13.41 0.0000TECM 0.6769 0.0109 16.21 0.0000

(b) Out of sample 2002–2006Naïve 0.6242 0.025 5.01 0.000Naïve ECM 0.6627 0.024 6.72 0.000VAR 0.6118 0.025 4.48 0.000VECM 0.6754 0.024 7.32 0.000TECM 0.6635 0.024 6.76 0.000

(b) Out of sample 2003–2007Naïve 0.5835 0.025 3.34 0.001Naïve ECM 0.6059 0.025 4.27 0.000VAR 0.5625 0.025 2.49 0.013VECM 0.5994 0.025 4.00 0.000TECM 0.5862 0.025 3.45 0.001

(b) Out of sample 2004–2008Naïve 0.4355 0.025 −2.62 0.009Naïve ECM 0.4419 0.025 −2.36 0.018VAR 0.5317 0.025 1.28 0.202VECM 0.5577 0.025 2.33 0.020TECM 0.5971 0.024 3.98 0.000

Notes: In sample is 1987 to 2008, out of sample is estimated with recursively. The AUCstatistic is a Wilcoxon–Mann–Whitney U-statistic with asymptotic normal distributionand its value ranges from 0.5 under the null to a maximum of 1 (an ideal classifier). SeeJordà and Taylor (2011).

even during this period, easily beating the strict no-arbitrage null.We note that the TECM was not significantly different than themost successful strategy in any period, but clearly weathered the fi-nancial storm best.

4.3. Beyond RMSE: trading-based predictive ability tests

Do any of our models generate meaningful improvement from atrader's perspective? The AUC-based tests suggest that our preferredstrategies pick the correct direction of carry trade more often thannot, even when weighing by returns. Gain–loss ratios, which providebounds on the set of permissible pricing kernels, indicate that ap-proximate arbitrage opportunities (in the sense of Bernardo andLedoit, 1999, 2000) would have been available even during the globalfinancial crisis. This section complements this analysis with a moreconventional predictive ability testing framework based on loss func-tions that better reflect an investor's interests.

Extant tests designed to assess the marginal predictive ability ofone model against another are based, generally speaking, on a com-parison of the forecast loss function from one model to the other.Such is the approach proposed by the popular Diebold and Mariano(1995) and West (1996) frameworks and subsequent literature,with the usual choice of loss function being the root-mean squarederror (RMSE).

The approach that we pursue instead is to consider a variety of lossfunctions and use a framework that naturally extends our AUC-basedanalysis. For this purpose we generate a series of out-of-sample, one-period ahead forecasts from a rolling sample of fixed length. It is clearthat in this type of set-up, estimation uncertainty never vanishes andfor this reason we adopt the conditional predictive ability testingmethods introduced in Giacomini and White (2006). These tests havethe advantage of permitting heterogeneity and dependence in the fore-cast errors, and their asymptotic distribution is based on the view thatthe evaluation sample goes to infinity even though the estimation sam-ple remains of fixed length.

Specifically, let {Lt+1i }t=R

T−1 denote the loss function associated withthe sequence of one step-ahead forecasts, where R denotes the fixedsize of the rolling estimation-window from t=1 to T-1, and leti=0,1 (with the index 0 indicating forecasts generated with theunit root null model and the index 1 indicating the alternativemodel). Then, the test statistic

GW1;0 ¼ ΔLσ L=

ffiffiffiP

p d→N 0;1ð Þ ð4Þ

provides a simple test of predictive ability, where

ΔL ¼ 1P

XT−1

t¼R

L1tþ1−L0tþ1

� �;

and

σ L ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1P

XT−1

t¼R

L1tþ1−L0tþ1

� �2vuut

(since E(ΔL)=0 under the null. See theorem 1 in Giacomini andWhite, 2006); and where P=T−R+1.

The flexibility in defining the loss function permitted in theGiacomini and White (2006) framework is particularly useful forour application. The framework extends naturally to a panel context,where the forecasts for N different currencies over P periods arepooled together. In that case, assuming independence across unitsin the panel, we may implement the above test with the observationcount P replaced by NP. Because one may suspect that the panel unitsare not independent, we implement a cluster-robust covariance

Page 12: The carry trade and fundamentals: Nothing to fear but FEER itself

Table 10Forecast evaluation: loss-function statistics for equally-weighted portfolios.

RMSE RTN Sharpe Skew

(×1000) (annual%) (annual) (monthly)

(a) In sampleNaïve 0.44* 3.19** 0.58** -0.35**Naïve ECM 0.44** 4.10** 0.72** 0.30VAR 0.42** 5.54** 0.99** 0.10VECM 0.42** 6.90** 1.14** 0.37**TECM 0.40** 6.96** 1.35** 0.47**

(b) Out of sample, 2002–2006Naïve 0.42** 6.64** 1.59** −0.29Naïve ECM 0.41** 7.74** 1.49** 0.26VAR 0.40** 5.49** 1.07** 0.09VECM 0.38** 7.03** 1.36** 0.14TECM 0.38** 7.25** 1.58** 0.22

(c) Out of sample, 2003–2007Naïve 0.39 3.68* 0.87* −0.10Naïve ECM 0.38* 4.39** 0.95** 0.29VAR 0.38** 2.66 0.52 −0.09VECM 0.38 2.57 0.52 −0.37TECM 0.38 4.02* 0.85* 0.21

(d) Out of sample, 2004–2008Naïve 0.55 −2.89 −0.47 −0.52Naïve ECM 0.54 −1.41 −0.29 −0.61VAR 0.53 0.44 0.08 −0.36VECM 0.53 3.27 0.58 −0.11TECM 0.52 3.64* 0.80* 0.23

** (*) denotes significant at the 95% (90%) confidence level. To correct for cross-sectional dependence, a cluster-robust covariance correction is used. See text.

85Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

correction however. Pooled results are reported in the appendix forcompleteness.

The traditional summary statistic of predictive performance is thewell-known root-mean squared error, but even if RMSE is a majorfocus of the academic literature, it is of little interest to those in finan-cial markets, since models with good (poor) fit measured by RMSEcould still generate poor (respectively, good) returns. So what aboutinvestors' preferred performance criteria? Fortunately, the flexibilityof the framework allows one to consider loss functions based on alter-native metrics.

Even before one adjusts for risk, an investor cares about averagereturns, an obvious first alternative. Next, it is natural to normalizereturns by their volatility and the Sharpe ratio is the second metricthat we consider. But the Sharpe ratio largely depends on symmetryand quadratic utility—it is a poor measure when the risk of a suddencrash is a consideration. For this reason our third metric is based onthe skewness of returns, which allows us to assess the models' abilityto avoid infrequent but particularly large negative returns. More de-tails on the specific forms of these loss functions are reserved forthe appendix. To make comparability easier with Tables 2, 3 and 5,Table 11 reports the average returns, Sharpe ratio and skewness forequal-weighted portfolios based on each of the strategies we haveconsidered, using asterisks to indicate when the Giacomini andWhite (2006) statistics are statistically significant with respect tothe null of strict no-arbitrage.21

Table 10 summarizes the results for full in-sample (panel (a)) and thethree out-of-sample (panel (b)) windows: 2002–2006, 2003–2007, and2004–2008. In each case we report: RMSE (in percent); Annualized Re-turn (in percent); Annualized Sharpe Ratio; and the monthly Skewnesscoefficient. We do this for the Naïve; the Naïve ECM the VAR; theVECM; and the TECM models. A ⁎⁎ (⁎) indicates that the Giacomini andWhite (2006) statistic is significant at the conventional 95% (90%) confi-dence level.

The results in column 1 show that the difference in the models asjudged by RMSE are miniscule. However, measured by metrics thatmatter to traders, the differences are considerable. For example, inour preferred model, in-sample the difference in annual mean returnsis +3.19% for Naïve versus +6.96% for TECM. But out-of-sample thedifferences quickly become more dramatic, especially when lookingat the last window, 2004–2008. In that case the difference is−2.89% versus +3.64%. Annual Sharpe Ratios are boosted from+0.58 to +1.35 in-sample, to a virtual tie in the 2002–2006 and2003–2007 out-of-sample windows at +1.59 versus +1.58, and+0.87 versus +0.85 respectively, with the most dramatic differencescoming with the global financial crisis window of 2004–2008 with−0.47 for Naïve and +0.80 for TECM. Skewness is also generally bet-ter with TECM, improving from −0.35 to +0.47 in-sample, to −0.52and 0.23 for the out-of-sample window 2004–2008. Even in the2002–2006 window where the differences between strategies aresmall, notice that although Naïve generates a Sharpe Ratio of +1.59,which is slightly higher than the ratio for TECM at +1.58, Naïve has askewness of −0.29 versus +0.22 for TECM. There is little doubt whichof these returns a currency hedge fund manager would have preferred.The GW tests also show that, with the exception of skewness, these per-formance improvements are statistically significant, out-of-sample.

4.4. Predictive ability tests in possibly unstable environments

One concern that the skeptical readermight harbor at this point is thatwe have been lucky in the choice of our out-of-sample period. We sharethat concern andhave experimentedwith a variety ofwindows. Our anal-ysis in Table 10 uses three non-overlapping out-of-sample windows to

21 For completeness, the pooled results are provided in the Appendix A.

evaluate our proposed carry trade strategies, and this suggests that some-thing more fundamental than kismet is at work.

In this section we try to more formally insulate our results againstthe fear that model and performance instability could yet undermineour claims. We can add one more layer of flexibility to our testingsetup by using Giacomini and Rossi's (2010) fluctuations test. Theirtest is design to address the problem that predictors which workwell in one period, may not work well in another.

The test is meant to establish which model predicts best whenthere may be such instabilities. Flexibility is achieved by continuouslymonitoring the entire path of relative model performance rather thanthrough averages over fixed windows. As a result, the new Giacominiand Rossi (2010) framework allows for possible instabilities.

Conceptually, the Giacomini and Rossi (2010) approach borrows ele-ments of the Giacomini andWhite (2006) approach used in the previoussection, specifically the notion of testing local conditional predictive abil-ity. The advantage is that by conditioning on the forecasting model, thetesting environment allows for comparisons between nested and non-nested alternatives using flexible loss functions. We leave the specificsof how the fluctuations test is constructed to the original reference, andinstead provide some intuition here.

The fluctuations test considers average relative forecasting abilityover small windows rolled-over the entire out-of-sample interval so asto allow for local variation. The test then accommodates the serial corre-lation naturally introduced by the rolling windows, and adjusts the as-ymptotic distribution to the fact that the test is not calculated as a fixedaverage over the entire out-of-sample interval. In addition, one mustchoose the size of the rolling window relative to the available out-of-sample and Giacomini and Rossi (2010) recommend that ratio to beabout 0.3. Applied to our data, this ratio translates to rolling windows of24months in length over the six years of out-of-sample interval available.

Table 11 reports the results of the fluctuation test and omits RMSEbased loss, which trivially rejects the null that each proposed modelconsidered has superior predictive ability rather overwhelmingly.Rather, we find it more helpful to concentrate on trading-based lossfunctions derived from measures of returns, Sharpe-ratio and skew-ness, just as we did in Table 10.

Page 13: The carry trade and fundamentals: Nothing to fear but FEER itself

Table 11Forecast evaluation using fluctuations test for equally weighted portfolios.

Model RTN Sharpe Skew

(annual %) (annual) (monthly)

Naïve 3.03* 3.99** −12.41**Naïve-ECM 2.77 2.56 0.89VAR 9.02** 10.17** −2.89VECM 3.85** 3.97** 2.54TECM 2.94* 2.93* 0.43

** (*) denotes significant at the 95% (90%) significance level. The critical value at 95%(90%) confidence is 3.15 (2.92). See text.

86 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

Table 11 lends support to our running story and encapsulates thefindings in Table 10 rather well. By this metric, the VAR model wouldmost strongly reject the null, albeit with positive skewness (as evincedby the negative value that the fluctuations test attains). Instead, al-though VECM and TECM have somewhat less dramatic rejections,they do so with negative (albeit not statistically significant) skew.

To illustrate differences in returns, Sharpe-ratio and skewness overtime, Fig. 1 displays the accumulated returns one would have obtainedout-of-sample over the period January 2002 to December 2008with theNaïve strategy (the focus of most of the literature) relative to the TECMstrategy. We plot the accumulated returns to smooth some of themonth-to-month fluctuations and because, in a moment, Section 5 dis-plays a similar figure based on two tradable indices (now exchange-traded funds) from Deutsche Bank which loosely resemble these twostrategies. The two figures are more easily comparable this way.

Fig. 1 makes clear that, except for an interlude between the secondhalf of 2007 and the first quarter of 2008, TECM-based returns wouldhave been superior to Naïve-based returns, at times by as much as 10%.A good illustration of the differences in skewness between these twostrategies is the dramatically different turns each takes in the latter partof 2008. In the space of only a few months, the Naïve strategy wouldhave declined by about 20% whereas the opposite would have been trueof the TECM strategy so that by the end of 2008, the differences in cumu-lated returns would reach the 35%–40% mark—a considerable difference.

5. Alternative crash protection strategies

In this section we compare the merits of our fundamentals basedmodel with the recent literature, where two alternative crash protec-tion strategies have emerged. The first is a contract-based strategy:Burnside et al. (2008b,c) explore a crash-free trading strategy whereFX options are used to hedge against a tail event that takes the formof a collapse in the value of the high-yield currency. The second is amodel based strategy: Brunnermeier et al. (2009) argue that an

100

110

120

130

140

150

2002m1 2003m1 2004m1 2005m1 2006m1 2007m1 2008m1 2009m1TECM

Fig. 1. Out-of-sample cumulated returns: Naïve versus TECM model, 2002–08. Thechart shows the cumulated returns one would have obtained out-of-sample from Jan-uary 2002 to December 2008 with the Naïve and TECM strategies. The initial value is100 in January 2002.

important cause of carry trade crashes is the arrival of a state of theworld characterized by an increase in market volatility or illiquidity.

5.1. Options

In Burnside et al. (2008b,c) a Naïve carry trade strategy is thestarting point. Traders go long the high yield currencies and shortthe low yield currencies. The focus is also on major currencies, ashere, although the sample window goes back to the 1970s era of cap-ital controls, and applies to a larger set of 20 currencies. Returns maybe computed for individual currencies, portfolios, of for subsets ofcurrencies (e.g., groups of 1, 2 or 3 highest yield currencies againstsimilar groups of lowest yield currencies). However, each Naïve strat-egy can be augmented by purchase of at-the-money put options onthe long currencies.22

This augmentation leads to positive skew by construction. It avertstail risk, but at a price. How large is that price? For the 1987/1 to2008/1 period a Naïve unhedged equally-weighted carry trade had areturn of 3.22% per annum with an annualized Sharpe ratio of 0.54and a skewness of −0.67. This compares with the U.S. stock market's6.59% return, 0.45 Sharpe ratio, and skewness of−1.16 over the sameperiod. Normality of returns is rejected in both cases. Implementingthe option-hedged carry trade boosts the Sharpe ratio a little, elimi-nates the negative skew, but at a significant cost. Added insurancecosts of 0.71% per annum mean that the hedged strategy lowersreturns to only 2.51% annually, has a Sharpe of 0.71, and a significantpositive skew of 0.75.

In contrast, let us now compare our similar equal-weight Naïvecarry trading strategy with our fundamentals based strategy, inTable 10, panel (a). Comparing the Naïve and TECMmodels, our strat-egy increases returns on an annualized basis from 3.19% to 6.96%, again of 377 bps, with a Sharpe ratio of 1.35 and eliminating the neg-ative skew of −0.35 to generate a positive skew of 0.47. Unlike theoption-based strategy, our approach can increase returns and lowerskew at the same time.

We make some further observations. To be fair, we recognize thatthe strategies implemented by Burnside et al. (2008b,c) are crude andmainly illustrative. The options they employ are at-the-money, andpush skew into strong positive territory. In reality, investors mightpurchase option that is more out-of-the-money; at much lowercosts, this could protect against only the large crashes, yet still elimi-nate the worst of the negative skew. This was argued by Jurek (2008),who found that only 30%–40% of Naïve carry trade profits need bespent on average to insure against crash risk, leaving the puzzleintact.

Nonetheless, even if lower cost option strategies are found, our re-sults suggest that there is a better no-cost way to hedge againstcrashes by using data on deviations from long-run fundamentals,and this information also serves to boost absolute returns and Sharperatios too. Moreover, option contracts have other drawbacks, many ofwhich were shown to be salient issues after the financial crash of2008. Option prices embed volatility, so in periods of turmoil theprice of insurance can rise just when it is most needed, dependingon the implied FX volatility (Bhansali, 2007). And as with all deriva-tives, options carry counterparty risk. In addition, option price andposition data from exchanges may not be representative, since thisactivity represents only a tiny fraction of FX option trading, the vastmajority being over-the-counter. Also, like forward and future deriv-atives, options carry bid-ask spreads that can explode such as to makemarkets dysfunctional in crisis periods. Finally, while Burnside et al.(2008b,c) show the viability of an options-based strategy for 20widely-traded major currencies, FX derivative markets are thin tononexistent for many emerging market currencies, and bid-ask

22 This strategy is also studied by Bhansali (2007).

Page 14: The carry trade and fundamentals: Nothing to fear but FEER itself

100

120

140

160

2002m1 2003m1 2004m1 2005m1 2006m1 2007m1 2008m1 2009m1

DBCRUSI (Carry/Value/Momentum)

Fig. 2. Naïve carry ETF versus augmented carry ETF: real-world index cumulated returnsfor two Deutsche Bank USD Carry Indices, 2002–08. ETFs are based on the DeutscheBank G10 Currency Harvest USD Index (DBCFHX) and Deutsche Bank Currency ReturnsUSD Index (DBCRUSI). The indices are based on a portfolio of nine G10 currencies versusthe U.S. dollar, with weights of ±1 or 0 depending on signals. Indices predate the ETF in-ception dates. Index values are daily with January 1, 2002 equal to 100.

87Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

spreads can be wide, so in those markets currency strategists mayhave to seek alternative plain-vanilla approaches, like ours, thatmight help avert currency crash risk.

5.2. VIX signal

In Brunnermeier et al. (2009) it is argued that various forms of gen-eralized asset market distress, by reducing risk appetite, may be signif-icant contributors to carry trade crashes. The logic of the argument isthat such conditionsmay prompt a pull back from all risky asset classes,leading to short-run losses via order flow effects, or the price impact oftrades. To provide empirical support for this proposition the authorsshow that, on a weekly basis, changes in the VIX volatility index (aproxy for distress) correlate with the returns to Naïve carry trade strat-egies—that is, trading strategieswhichdonot condition on any informa-tion except the interest differential. An increase in VIX was found tocorrelate with lower carry trade returns, all else equal, suggesting thatliquidity risk might partially explain the excess returns. However,lagged changes in VIX had little or no predictive power for next periodsreturns. This prompts us to explore whether the same arguments holdtrue in our sample, and if we detect any difference when we use ourpreferred trading strategy. We found that in our data, using the same1992–2006 sample period, at a monthly frequency, there was no evi-dence that the change in VIX was correlated with returns contempora-neously, nor that it helped predict next period's returns.

Specifically, in the spirit of Brunnermeier et al. (2009), we com-puted loadings of the returns on the contemporaneous and laggedchange in VIX. We first regressed the actual (signed) returns of tradesbased on the Naïve model on the contemporaneous change in VIX forall months and all currencies in our data over their sample period1992–2006. We then did the same for the lagged change in VIX. Wethen re-ran both regressions using the actual (signed) returns of ourpreferred TECM model. None of the four regression was the slope co-efficient significant, and in all cases it actually had the “wrong” posi-tive sign, suggesting that rising VIX was slightly positively correlatedwith profits this month, or next month.

We make some further observations. To be fair, we should notethat Brunnermeier et al. (2009) present these results at a weekly fre-quency. In our dataset, we have to work at monthly frequency giventhe underlying price index data used to construct the real exchangerate fundamental signal. We conjecture that the VIX signal mayhave stronger correlations with current or future returns at weeklyor even fortnightly frequency, even when placed in a model likeours augmented to account for real exchange rate fundamentals, al-though further research is warranted in this area.

If the sample is extended using our full dataset, including throughthe volatile crisis period in 2007–2008, the coefficients are not stableand we do sometimes find a statistically significant role for VIX sig-nals in predicting negative actual (signed) returns of the NaïveModel 1, although this episode is outside the window considered byBrunnermeier et al. (2009). But this result disappears, and the signeven reverses, when we include the real exchange rate term anduse our preferred TECM model, suggesting that the real exchangerate deviation alone may serve as an adequate crash warning. Infact, in these models the sign of the coefficient on the change in VIXwas often positive and significant, suggesting potentially the oppositeinterpretation: when VIX is rising, arbitrage capital may be beingwithdrawn, but this has a tendency to leave more profits on thetable for the risk tolerant, or “patient capital,” and for traders pursu-ing profitable fundamental-based strategies with inherent skew pro-tection of the sort developed here.

6. Reality check: exchange traded funds

Beyond out-of-sample forecasting, an alternative way to judge ourpreferredmodeling and trading strategies against the Naïve benchmark

is to compare the recent performances of different currency funds in thereal world. When comparing the merits of Naïve versus fundamentalsbased strategies, the relative performance of these funds in the financialcrisis 2008 proves to be very revealing.

First, we seek a fund representative of the Naïve long-short ap-proach, the basic carry strategies of Burnside et al. (2008a), and thesame authors with Kleshchelski (2008a,b), henceforth BER/K. Forthis purpose we focus on the Deutsche Bank G10 Currency HarvestUSD Index (AMEX: DBCFHX, Bloomberg: DBHVG10U). This indexhas been constructed back to March 1993 using historical data.Since April 2006 this index has been made available to retail inves-tors—in the United States, for example, it trades as the PowersharesDB G10 Currency Harvest Fund (AMEX: DBV). These indices track aportfolio based on 2:1 leveraged currency positions selected fromten major “G10” currencies with the U.S. dollar as a base. For every$2 invested, the index goes long (+$1) in the three currencies withthe highest interest rates, and short ($-1) in the three with the lowestinterest rates, and with 0 invested in the other three, with $1 placedin Treasuries for margin. The portfolio is reweighted only every quar-ter and is implemented via futures contracts. If the U.S. dollar is one ofthe long or short currencies, that position makes no profit in the basecurrency and is dropped, so leverage falls to 1.66.

This ETF's Naïve approach to forecasting any currency pair via thecarry signal matches the BER/K approach. The main differences arethat the index is not equal weight and tries to pick 6 winners out of9, the fund's leverage is finite and so this fund has to pay the oppor-tunity costs of satisfying the margin requirement, and the fund alsocharges a 0.75% per annum management fee. The performance ofthe index in the period 2002 to 2008 is shown in Fig. 2, where thestarting value of the index is rebased to 100. The first five or soyears, up to mid/late 2007 were a golden age for Naïve strategies:the index return was about 10% per year, and the index peaked over150 in mid-2007. Even after some reversals, as late as the middle of2008 the index stood at over 140, up 40% on January 2002 levels.

It is much less obvious what particular peso problem can explainthe high Sharpe ratio associated with the equally weighted carrytrade. That strategy seems to involve “picking up pennies in front ofan unknown truck that has never been seen.”

Unfortunately, in the second half of 2008 the Naïve–BER/K type ofcarry trade strategies was hit by the financial equivalent of a runawayeighteen-wheeler. The index gave up about 10% in the third quarter anda further 20% in October in a major unwind. After all the index stoodclose to its January 2002 level of 100, and five and a half years of gainshad been erased.

Page 15: The carry trade and fundamentals: Nothing to fear but FEER itself

Table 12Forecast evaluation: loss-function statistics for pooled data.

Squared error Return Sharpe Ratio Skewness

(×1000) (annual %) (annual) (monthly)

(a) In sampleNaïve 0.90** 3.19** 0.30** −0.17**Naïve ECM 0.88** 4.10** 0.39** 0.03VAR 0.88** 5.54** 0.53** −0.07VECM 0.86** 6.90** 0.66** 0.09TECM 0.84** 6.96** 0.66** 0.05

(b) Out of sample2002–2006Naïve 0.70** 6.64** 0.70** −0.12Naïve ECM 0.69** 7.74** 0.82** −0.08VAR 0.69** 5.49** 0.58** −0.09VECM 0.67** 7.03** 0.75** −0.20TECM 0.66** 7.25** 0.77** −0.07

(c) Out of sample2003–2007Naïve 0.70* 3.68** 0.39** −0.07Naïve ECM 0.69** 4.39** 0.47** −0.03VAR 0.70 2.66** 0.28** −0.14VECM 0.70 2.57* 0.28* −0.16TECM 0.70 4.02** 0.43** −0.03

(d) Out of sample2004–2008Naïve 1.01** −2.89** −0.27** −0.32**Naïve ECM 1.01** −1.41 −0.13 −0.18VAR 1.00 0.44 0.04 −0.16*VECM 0.99 3.27** 0.30** −0.04TECM 0.97 3.64** 0.33** 0.02

Notes: ** (*) denotes significant at the 95% (90%) confidence level. To correct for cross-sectional dependence, a cluster-robust covariance correction is used. See text.

88 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

Could a more sophisticated carry trade strategy have avoided thiscarnage, while still preserving positive returns overall? Yes. We canlook for a different index which matches better with our more sophis-ticated trading strategy. We focus on the Deutsche Bank CurrencyReturns USD Index (Bloomberg: DBCRUSI). This investible indexoperates on the same set of “G10” currencies but amalgamates threedifferent trading signals for each currency, where again the signalsare +1 (long), 0 (no position), and −1 (short). The first signal is thesame as before: theNaïve carry trade signal based purely on the interestrate, the 3-month LIBOR rate (what DB calls the “Carry Index”), updatedevery three months. The second signal is based on a calculation of thedeviation of the currency from its FEER level, based on common-currency price levels from the OECD (what DB calls the “ValuationIndex”) continuously updated every three months. The third signal isbased onmomentum, and takes into account the USD return to holdingthat currency in the previous period (what DB calls the “MomentumIndex”) updated every month. Clearly this augmented carry strategy isclose in spirit to our augmented models, given its mix of signals. Inour model “carry” corresponds to the interest differential; “momen-tum” corresponds mostly to the lagged exchange rate change (givenpersistence in the interest differential); and “valuation” correspondsto our real exchange rate deviation.

The Deutsche Bank Currency Returns Index is not available as an ETFin the U.S., but it does trade as an ETF in Frankfurt or London. A test ofour approach is to ask: did investors in this more sophisticated fundfare better in 2008 than those invested in the Naïve DBV fund. AsFig. 2 shows, they did much better. On the 2003–07 upswing thisindex was less volatile, but returns were still strong, up about 35% atthe peak. Momentum signals got the money quickly out of crashingcurrencies (and back in when they rose), and valuation signalshedged against dangerously overvalued currencies (and jumped inotherwise), tempering the Naïve carry signal. The real test was in2008. The index did remarkably well: performance was absolutelyflat, and kept 30% of the cumulative gains locked in. Looking at thethree components of this strategy, a 30%+ loss on the carry signalin 2008, was offset by a 10% gain on the PPP signal and a 20% gainon the momentum signal.

The relative performance of the ETFs lines up with the results of ourtrading strategies as shown back in Table 7. In the out of sample resultsthe Naïve–BER/K typemodel showed a loss of 2.84% per year from 2004to 2008. Ourmore sophisticated VECM showed a gain of 4.15% per year.That differential of 7% per year would cumulate to a roughly 30% diver-gence in index values over a four year period. Despite some differencesin methods between our strategies and the ETFs, this matches the cu-mulative divergence seen for the real-world ETFs in Fig. 2.

Fundamentals matter. Comparing the ETFs makes clear the gainsfrom using a more sophisticated currency strategy incorporating aFEER measure. Although academic research has focused on theNaïve strategies, it is no secret in the financial markets that more re-fined models, taking into account the signals discussed here andmany others, are actively employed by serious fund managers andare now leaking out into traded products.

7. Conclusion

The global financial crisis of 2008 dealt a blow to many invest-ments, including the carry trade. Naïve strategies based on arbitrag-ing interest rate differentials across countries under the belief that itwas equiprobable for exchange rates to appreciate or depreciate hadbeen consistently profitable for many years. But these returns werevolatile and subject to occasional crashes. Even averaging returnsover four years, by the end of 2008 one could not reject the null ofstrict no-arbitrage. Was this the end of the carry trade puzzle?

The elegant simplicity of the Naïve carry trade hadmade for a persua-sive story. But it left on the table important clues about the secularmovements of exchange rates toward fundamental long-run

equilibrium, clues that we used to construct slightly more refined carrytrade strategies. These refinements are sufficient to generate persistentpositive returns with moderate exposure and zero or even mildly pos-itive skew. More importantly, they would have eked out positivereturns even during the financial crisis at a time when little elsecould. The implied gain–loss ratios from these strategies in-sampleand over different out-of-sample windows appear to exceed those as-sociated with equity investments. These approximate arbitrage oppor-tunities (in the sense of Bernardo and Ledoit, 1999) impose newbounds on the set of admissible pricing kernels implied by investors'attitudes toward risk. Our exhaustive diagnostic analysis indicatesthat these are no in-sample flukes.

We arrive at these conclusions using novel methods. Informationthat is insufficient to scientifically validate a model of economic be-havior can be sufficient to determine the returns-weighted optimaldirection of trade. This observation had two consequences for ouranalysis. On one hand, we structured tests of predictive abilitywith loss functions that matter to the investor (rather than the sci-entist): returns, Sharpe, and skew. On the other hand, the binarychoice of which currency to sell short and which to go long in fallsneatly into a classification framework of the sort that is often dis-cussed in biostatistics and signal processing. With such a frame-work, we were able to weave the opportunities provided by agiven classifier – investment strategy – with the investor's prefer-ences over each choice. The statistics and the connection betweenopportunities and choices are explored in more detail in a compan-ion paper (Jordà and Taylor, 2011).

The global financial crisis may have made the Naïve carry tradea curious relic in international finance, but our paper reinvigoratesthe debate using only marginally more complicated strategies.

Page 16: The carry trade and fundamentals: Nothing to fear but FEER itself

89Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

Appendix A

The loss functions of the Giacomini and White (2006) statistics inexpression (4) that correspond to three trading-based metrics are, forthe returns of a given strategy:

Ltþ1 ¼ μ tþ1;

for the Sharpe ratio

Ltþ1 ¼ μ tþ1ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1P ∑T−1

t¼R μ tþ1−μ� �2q ;

and for skewness, we use Pearson's second definition of skewness toprovide a comparable scale with the returns- and Sharpe ratio-basedloss functions, specifically

Litþ1 ¼ 3 μ tþ1−μ 0:50ð Þ� �ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1P ∑T−1

t¼R μ tþ1−μ� �2q ;

where μ(0.50) refers to the median of μ tþ1 for t=R,…,T−1.In Table 12 we report parallel results to those in Table 10 by pool-

ing instead of by portfolios. To adjust for any heterogeneity, we use acluster-robust correction for the panel.

References

Alexius, Annika, 2001. Uncovered interest parity revisited. Review of International Eco-nomics 9 (3), 505–517.

Bacchetta, Philip, van Wincoop, Eric, 2006. Can information heterogeneity explain theexchange rate determination puzzle? The American Economic Review 96 (3),552–576.

Backus, David K., Gregory, Allan W., Telmer, Chris I., 1993. Accounting for forward ratesin markets for foreign currency. Journal of Finance 48 (5), 1887–1908.

Bakker, Age, Chapple, Bryan, 2002. Advanced country experiences with capital accountliberalization. Occasional Paper, 214. International Monetary Fund, Washington,D.C.

Bekaert, Geert, Hodrick, Robert J., 1993. On biases in the measurement of foreign ex-change risk premiums. Journal of International Money and Finance 12 (2),115–138.

Bernardo, Antonio E., Ledoit, Olivier, 1999. Approximate Arbitrage. Anderson School ofManagement, UCLA. Working paper 18–99.

Bernardo, Antonio E., Ledoit, Olivier, 2000. Gain, loss, and asset pricing. Journal of PoliticalEconomy 108 (1), 144–172.

Bhansali, Vineer, 2007. Volatility and the carry trade. Journal of Fixed Income 17 (3),72–84.

Brunnermeier, Markus K., Nagel, Stefan, Pedersen, Lasse H., 2009. Carry trades and cur-rency crashes. NBER Macroeconomics Annual 2008. : Daron Acemoglu, KennethRogoff, Michael Woodford. University of Chicago Press, Chicago, pp. 313–347.

Burnside, A. Craig, 2011. The cross-section of foreign currency risk premia and consumptiongrowth risk: comment. The American Economic Review 101 (7), 3456–3476.

Burnside, Craig A., Eichenbaum, Martin, Kleshchelski, Isaac, Rebelo, Sergio, 2006. Thereturns to currency speculation. NBER Working Papers, 12489.

Burnside, Craig A., Eichenbaum, Martin, Rebelo, Sergio, 2007. The returns to currencyspeculation in emerging markets. NBER Working Papers 12916.

Burnside, A. Craig, Eichenbaum, Martin, Rebelo, Sergio, 2008a. Carry trade: the gains ofdiversification. Journal of the European Economic Association 6 (2–3), 581–588.

Burnside, Craig A., Eichenbaum, Martin, Kleshchelski, Isaac, Rebelo, Sergio, 2008b. Dopeso problems explain the returns to the carry trade? CEPR Discussion Papers,6873. June.

Burnside, A. Craig, Eichenbaum, Martin, Kleshchelski, Isaac, Rebelo, Sergio, 2008c. Dopeso problems explain the returns to the carry trade? NBER Working Papers14054.

Černý, Aleš, 2003. Generalised Sharpe ratios and asset pricing in incomplete markets.European Finance Review 7 (2), 191–233.

Cheung, Yin-Wong, Chinn, Menzie D., Garcia Pascual, Antonio, 2005. Empirical ex-change rate models of the nineties: are any fit to survive? Journal of InternationalMoney and Finance 24 (7), 1150–1175.

Chinn, Menzie D., Frankel, Jeffrey A., 2002. Survey data on exchange rate expectations:more currencies, more horizons, more tests. In: Allen, W., Dickinson, D. (Eds.), Mone-tary Policy, Capital Flows and Financial Market Developments in the Era of FinancialGlobalisation: Essays in Honour of Max Fry. Routledge, London, pp. 145–167.

Chong, Yanping, Jordà, Òscar, Taylor, Alan M., 2012. The Harrod-Balassa-SamuelsonHypothesis: Real Exchange Rates and their Long-Run Equilibrium. InternationalEconomic Review 53 (2), 609–634.

Clarida, Richard, Waldman, Daniel, 2008. Is bad news about inflation good news for theexchange rate? And, if so, can that tell us anything about the conduct of monetary

policy? In: Campbell, John Y. (Ed.), Asset Prices and Monetary Policy. University ofChicago Press, Chicago, pp. 371–396.

Clinton, Kevin, 1998. Transactions costs and covered interest arbitrage: theory and ev-idence. Journal of Political Economy 96 (2), 358–370.

Coakley, Jerry, Fuertes, Ana María, 2001. A Non-linear analysis of excess foreign ex-change returns. The Manchester School 69 (6), 623–642.

Cochrane, John H., Saá-Requejo, Jesús, 2000. Beyond arbitrage: good-deal asset pricebounds in incomplete markets. Journal of Political Economy 108 (1), 79–119.

DeMiguel, Victor, Garlappi, Lorenzo, Uppal, Raman, 2009. Optimal versus naive diver-sification: how inefficient is the 1/N portfolio strategy? Review of Financial Studies22 (5), 1915–1953.

Diebold, Francis X., Mariano, Roberto S., 1995. Comparing predictive accuracy. Journalof Business and Economic Statistics 13 (3), 253–263.

Dominguez, Kathryn M., 1986. Are foreign exchange forecasts rational? New evidencefrom survey data. Economics Letters 21 (3), 277–281.

Dooley, Michael P., Shafer, Jeffrey R., 1984. Analysis of short-run exchange rate behav-ior: March 1973 to 1981. In: Bigman, David, Taya, Teizo (Eds.), Floating ExchangeRates and the State of World Trade and Payments, pp. 43–70.

Engel, Charles, Hamilton, James D., 1990. Long swings in the dollar: are they in the dataand do markets know it? The American Economic Review 80 (4), 689–711.

Fama, Eugene F., 1984. Forward and spot exchange rates. Journal of Monetary Econom-ics 14 (3), 319–338.

Fawcett, Tom, 2006. ROC graphs with instance-varying costs. Pattern Recognition Let-ters 27 (8), 882–891.

Fisher, Eric O'N, 2006. The forward premium in a model with heterogeneous prior be-liefs. Journal of International Money and Finance 25 (1), 48–70.

Frankel, Jeffrey A., 1980. Tests of rational expectations in the forward exchange market.Southern Economic Journal 46 (4), 1083–1101.

Frankel, Jeffrey A., Froot, Kenneth A., 1987. Using survey data to test standard proposi-tions regarding exchange rate expectations. The American Economic Review 77(1), 133–153.

Frenkel, Jacob A., Levich, Richard M., 1975. Covered interest rate arbitrage: unexploitedprofits? Journal of Political Economy 83 (2), 325–338.

Froot, Kenneth A., Frankel, Jeffrey A., 1989. Forward discount bias: is it an exchange riskpremium? Quarterly Journal of Economics 104 (1), 139–161.

Froot, Kenneth A., Ramadorai, Tarun, 2008. Institutional portfolio flows and interna-tional investments. Review of Financial Studies 21 (2), 937–971.

Froot, Kenneth A., Thaler, Richard H., 1990. Foreign exchange. The Journal of EconomicPerspectives 4 (3), 179–192.

Fujii, Eiji, Chinn, Menzie D., 2000. Fin de Siècle real interest parity. NBER Working Pa-pers, 7880.

Giacomini, Raffaella, Rossi, Barbara, 2010. Forecast comparisons in unstable environ-ments. Journal of Applied Econometrics 25 (4), 595–620.

Giacomini, Raffaella, White, Halbert, 2006. Tests of conditional predictive ability. Econ-ometrica 74 (6), 1545–1578.

Granger, Clive W.J., Teräsvirta, Timo, 1993. Modelling Nonlinear Economic Relation-ships. Oxford University Press, Oxford.

Hansen, Lars Peter, Hodrick, Robert J., 1980. Forward exchange rates as optimal predictorsof future spot rates: an econometric analysis. Journal of Political Economy 88 (5),829–853.

Hsieh, Fushing, Turnbull, Bruce W., 1996. Nonparametric and semiparametric estima-tion of the receiver operating characteristic curve. Annals of Statistics 24 (1),25–40.

Ilut, Cosmin L., 2010. Ambiguity aversion: implications for the uncovered interest rateparity puzzle. Working Papers, 10–53. Department of Economics, Duke University.

Jeanne, Olivier, Rose, Andrew K., 2002. Noise trading and exchange rate regimes. Quar-terly Journal of Economics 117 (2), 537–569.

Jordà, Òscar, Taylor, Alan M., 2011. Performance evaluation of zero net-investmentstrategies. NBER Working Paper, 17150.

Jurek, Jakub W., 2008. Crash-Neutral Currency Carry Trades. Princeton University,Photocopy.

Khandani, Amir E., Kim, Adlar J., Lo, AndrewW., 2010. Consumer credit-risk models viamachine-learning algorithms. Journal of Banking and Finance 34, 2767–2787.

Levich, Richard M., Thomas III, Lee R., 1993. The significance of technical trading-ruleprofits in the foreign exchange market: a bootstrap approach. Journal of Interna-tional Money and Finance 12 (5), 451–474.

Lusted, Lee B., 1960. Logical analysis in roentgen diagnosis. Radiology 74 (2), 178–193.Lustig, Hanno, Verdelhan, Adrien, 2007. The cross section of foreign currency risk pre-

mia and consumption growth risk. The American Economic Review 97 (1), 89–117.Mason, Ian B., 1982. A model for the assessment of weather forecasts. Australian

Meterological Magazine 30 (4), 291–303.Meese, Richard A., Rogoff, Kenneth, 1983. Empirical exchange rate models of the se-

venties. Journal of International Economics 14 (1–2), 3–24.Michael, Panos, Nobay, A. Robert, Peel, David A., 1997. Transactions costs and nonlinear

adjustment in real exchange rates: an empirical investigation. Journal of PoliticalEconomy 105 (4), 862–879.

Obstfeld, Maurice, Taylor, Alan M., 1997. Nonlinear aspects of goods-market arbitrageand adjustment: Heckscher's commodity points revisited. Journal of the Japaneseand International Economies 11 (4), 441–479.

Obstfeld, Maurice, Taylor, Alan M., 2004. Global Capital Markets: Integration, Crisis, andGrowth. Cambridge University Press, Cambridge.

Pedroni, Peter, 1999. Critical values for cointegration tests in heterogeneous panels withmultiple regressors. Oxford Bulletin of Economics and Statistics 61 (S1), 653–670.

Pedroni, Peter, 2004. Panel cointegration: asymptotic and finite sample properties ofpooled time series tests with an application to the PPP hypothesis. EconometricTheory 20 (3), 597–625.

Page 17: The carry trade and fundamentals: Nothing to fear but FEER itself

90 Ò. Jordà, A.M. Taylor / Journal of International Economics 88 (2012) 74–90

Pepe, Margaret S., 2003. The Statistical Evaluation of Medical Tests for Classificationand Prediction. Oxford University Press, Oxford.

Pesaran, M. Hashem, Timmermann, Allan, 1992. A simple nonparametric test of predic-tive performance. Journal of Business and Economic Statistics 10 (4), 461–465.

Peterson,WesleyW., Birdsall, Theodore G., 1953. The theory of signal detectability: part I.the general theory. Electronic Defense Group. : Technical Report, 13. University ofMichigan, Ann Arbor.

Plantin, Guillaume, Shin, Hyun Song, 2007. Carry Trades and Speculative Dynamics.Princeton University, Photocopy.

Pojarliev, Momtchil, Levich, Richard M., 2007. Do professional currency managers beatthe benchmark? NBER Working Papers, 13714.

Pojarliev, Momtchil, Levich, Richard M., 2008. Trades of the living dead: style Differences.Style Persistence and Performance of Currency Fund Managers: NBER Working Pa-pers, 14355.

Poterba, James M., Summers, Lawrence H., 1986. The persistence of volatility and stockmarket fluctuations. The American Economic Review 76 (5), 1142–1151.

Sager, Michael, Taylor, Mark P., 2008. Generating Currency Trading Rules from theTerm Structure of Forward Foreign Exchange Premia. University of Warwick. Octo-ber, Photocopy.

Sarno, Lucio, Taylor, Mark P., 2002. The Economics of Exchange Rates. Cambridge UniversityPress, Cambridge.

Sarno, Lucio, Valente, Giorgio, Leon, Hyginus, 2006. Nonlinearity in deviations from un-covered interest parity: an explanation of the forward bias puzzle. Review of Finance10 (3), 443–482.

Shleifer, Andrei, Vishny, Robert W., 1997. A survey of corporate governance. Journal ofFinance 52 (2), 737–783.

Silvennoinen, Annastiina, Teräsvirta, Timo, He, Changli, 2008. Unconditional Skewnessfrom Asymmetry in the Conditional Mean and Variance. Department of EconomicStatistics, Stockholm School of Economics. Photocopy.

Sinclair, Peter J.N., 2005. How policy rates affect output, prices, and labour, open econ-omy issues, and inflation and disinflation. In: Mahadeva, Lavan, Sinclair, Peter(Eds.), How Monetary Policy Works. Routledge, London, pp. 53–81.

Spackman, Kent A., 1989. Signal detection theory: valuable tools for evaluating inductivelearning. Proceedings of the Sixth International Workshop on Machine Learning.Morgan Kaufman, San Mateo, Calif, pp. 160–163.

Swets, John A., 1973. The relative operating characteristic in psychology. Science 182(4116), 990–1000.

Taylor, Alan M., Taylor, Mark P., 2004. The purchasing power parity debate. The Journalof Economic Perspectives 18 (4), 135–158.

Taylor, Mark P., Peel, David A., Sarno, Lucio, 2001. Nonlinear mean-reversion in real ex-change rates: toward a solution to the purchasing power parity puzzles. InternationalEconomic Review 42 (4), 1015–1042.

West, Kenneth D., 1996. Asymptotic inference about predictive ability. Econometrica64 (5), 1067–1084.