
Northfield Information Services 184 High Street ⋅ Boston, MA 02110 ⋅ 617.451.2222 ⋅ 617.451.2122 fax www.northinfo.com

Quantitative Investing as a Liberal Art

Dan diBartolomeo Northfield Information Services, Inc.

INVESCO Atlanta, July 20, 1999

Abstract

Investment practitioners utilizing quantitative methods often are faced with an unpleasant reality: investment decision models which have produced excellent results in back-tests and simulations may achieve very poor results when actually implemented. Four general areas of possible causation are discussed: (1) the conflict between the theoretical and professional environments, (2) failure to clearly identify the objective function, (3) the inherent limitations of back-testing and simulated trading, and (4) failure to consider estimation error in applying the results of models.

Comment on Notation

As this paper is meant to be presented in a live forum at the Financial Analysts Seminar, certain organizational liberties have been taken in the preparation of the text. Sixteen exhibits (printed two to a page) are attached. References to each exhibit have been inserted into the text and are printed in italicized type.


PART ONE: The Sociological Context

Investment practitioners who utilize quantitative models often find that these models do not work nearly as well in practice as they do in back-tests or simulated trading. To understand this phenomenon we must understand not only the nature but also the context of the quantitative investment process.

Investment models are developed by people who generally consider their career to be that of an "analyst". If we take a moment to consider the character of the analyst, we immediately come upon the most common error in the construction of quantitative models: mistaking what is complex for what is necessarily profound. With increasing complexity comes an inherent tendency toward instability, just as a Ferrari requires more tune-ups than a Chevy.

The second exhibit (exhibit page 1, lower section) provides an interesting insight into the nature of the "analyst". The reader should note that Poe was the first American author of the detective genre of fiction. It should also be noted that in writing another of his short stories, The Mystery of Marie Roget, he performed an act of analysis of which we should all be jealous. Needing some basis for a mystery story, Poe came upon an unsolved and highly publicized murder in New York City in the late 1830s. From his home in Baltimore, Poe, like any good quant, carefully assimilated all publicly known information on the case (various newspaper accounts). Through careful reconciliation of the conflicting facts presented in the newspapers, Poe reached a conclusion as to the most likely murderer among the publicly recognized suspects, and hence had an ending for his fictionalized account of the case. To avoid libel, Poe changed the venue of the case from New York to Paris (Hoboken was now the Left Bank) and changed the victim's name from Mary Rodgers to Marie Roget. After several years of unsuccessfully trying to get the story published, it was finally printed as a serial in Harper's Magazine in 1842. Upon its publication, the real-life person represented by the character Poe had chosen to name as the murderer came forward and confessed.

The consistent bias toward overly complex models is also reinforced by another human imperfection: ego. The building of quantitative models often involves the use of advanced mathematics, statistics and computer methods of which only a minority of the investment community have a functional working knowledge. As such, quantitative analysts within the industry have taken on a mystique, much like the high priests of a pagan religion. To retain this special status, like the Catholic monks of the Dark Ages, quantitative analysts must prevent such knowledge from spreading into the language of the common people. Let us not forget that Galileo was not persecuted by the Church for his heretical scientific theories until he began to publish in Italian rather than Latin.


Investment analysis also suffers from an affliction which impacts all of the social sciences: the immortality of published research. Unlike the physical sciences, where the laws of nature change very slowly if at all, the social sciences study phenomena which are the result of the hand of man, and hence are unstable through time. While theoretical work may have a somewhat longer shelf life, empirical research may be meaningless by the time ink meets paper. Unfortunately, many practitioners assume that because something was published, it must be valid and hence can be taken as a valid basis for extending the line of research.

A good current example of this problem is the extensive effort made by fixed income researchers to account for the mean-reverting tendency of interest rates in their models of fixed income instruments. It might be wondered how many of them bothered to determine whether it is either necessary or appropriate to consider the mean reversion issue at all. If one performs a simple time series analysis of monthly returns of a major bond index such as the Lehman Brothers Government/Corporate (Jan 1973 through May 1992), one sees that the first autocorrelation coefficient is positive (.17) and statistically significant (T=2.67), rather than negative as would indicate a mean-reverting series. A provocative view of the publication issue can be found in the third exhibit (exhibit page 2, upper section).

Another major roadblock to successful implementation of quantitative models is the business risk faced by people in the investment management business. For example, in a typical US pension fund, asset allocation policy is set by the fund sponsor while active management of portfolios is contracted to outside investment management firms. Because a fund sponsor cannot fire itself for a bad asset allocation decision, but can easily fire a manager, the level of risk taking by active managers is often below that which would be consistent with the aggressiveness inherent in the sponsor's asset allocation policy. To make matters worse, large funds often have multiple active managers for each asset class, adding diversification across managers. As such, the need for diversification within each manager's portfolio is reduced. The typical result is an active management posture within an asset class which is an order of magnitude more timid than is consistent with the asset class benchmark.
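As an illustration only, a minimal Python sketch of such a test might look like the following. The return series here is simulated placeholder data rather than the Lehman index itself; the point is simply the mechanics of estimating the first autocorrelation coefficient and its approximate t-statistic.

```python
import numpy as np

def lag1_autocorrelation(returns):
    """Lag-1 autocorrelation of a return series with an approximate t-statistic.

    Under the null of no serial correlation, the standard error of the first
    autocorrelation coefficient is roughly 1/sqrt(n), so t is about rho * sqrt(n).
    """
    r = np.asarray(returns, dtype=float)
    r = r - r.mean()
    rho1 = np.dot(r[:-1], r[1:]) / np.dot(r, r)  # sample autocorrelation at lag 1
    t_stat = rho1 * np.sqrt(len(r))              # approximate significance test
    return rho1, t_stat

# Simulated monthly returns as placeholder data (Jan 1973 - May 1992 spans 233 months).
rng = np.random.default_rng(0)
monthly_returns = rng.normal(0.007, 0.02, 233)
rho, t = lag1_autocorrelation(monthly_returns)
print(f"lag-1 autocorrelation = {rho:.3f}, t-statistic = {t:.2f}")
```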


PART TWO: Failure to Define the Objective Function

A very common failing of quantitative models is the failure to clearly identify the assumed objective function of the investor whose funds are to be invested according to the scheme. Most practitioners have a vague sense that investors would prefer more return to less and would prefer less uncertainty of return to more. These two factors are commonly known as the first and second stochastic dominance criteria. Beyond this, specification of the goals of an investment model often becomes ill-defined.

Most investment models assume a mean-variance utility function, which is consistent with the basis of current portfolio theory. While such a mean-variance approach provides an elegant unity to investment theory, it is conceptually defensible if and only if we assume that returns are symmetrically distributed. Clearly, investors have no objection to unexpectedly high returns. They are unhappy only with unexpectedly low returns.

Let us suppose we are developing a valuation model for equity portfolio management. Is our goal to find the model which produces the highest return in the long run (maximize the geometric mean return), the highest return-to-variability ratio (mean-variance efficiency), or the model which maximizes return while keeping the probability of underperforming some stated goal below an acceptable level (the Telser criterion)? Depending on what your portfolio management process is actually trying to accomplish, the effectiveness of any model will be judged differently.

An insidious problem is that many quantitative models are judged using measurement criteria which are dependent on the assumed utility function of the investor, without recognition of the form of that utility function. For example, two common measurements of investment performance, the Treynor ratio and Jensen's alpha, are applications of the classic Capital Asset Pricing Model. One of the assumptions which underlies the standard CAPM is that all investors make their investment decisions on the basis of a mean-variance utility function. Given that only a small minority of the professional investment community can describe the nature of a mean-variance utility function, the assumption that something like the "invisible hand" of Adam Smith is guiding all investors to act in accordance with this approach may be heroic. Some interesting views of the operative motivation of investors are presented in the fourth exhibit (exhibit page 2, lower section).
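To make the distinction concrete, the following sketch (in Python, with purely hypothetical track records) scores the same return series under each of the three criteria described above: long-run geometric mean, return-to-variability, and a Telser-style probability of falling below a stated floor.

```python
import numpy as np

def objective_summary(returns, floor=0.0):
    """Score one track record under three different objective functions.

    geometric_mean        : compound growth criterion (maximize long-run wealth)
    return_to_variability : mean-variance style reward-to-risk ratio
    shortfall_probability : estimated chance of returning less than `floor`
                            (the Telser safety-first style criterion)
    """
    r = np.asarray(returns, dtype=float)
    geometric_mean = np.prod(1.0 + r) ** (1.0 / len(r)) - 1.0
    return_to_variability = r.mean() / r.std(ddof=1)
    shortfall_probability = float(np.mean(r < floor))
    return {"geometric_mean": geometric_mean,
            "return_to_variability": return_to_variability,
            "shortfall_probability": shortfall_probability}

# Two hypothetical model track records (monthly returns), judged under each criterion.
rng = np.random.default_rng(1)
model_a = rng.normal(0.010, 0.040, 120)  # higher mean, more variable
model_b = rng.normal(0.008, 0.020, 120)  # lower mean, steadier
for name, record in [("Model A", model_a), ("Model B", model_b)]:
    print(name, objective_summary(record, floor=0.0))
```

Depending on which criterion the investor actually cares about, the two hypothetical records can rank in different orders.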


PART THREE: Back-Testing and Simulations

Conceptual Problems (exhibit page four, upper section)

Most quantitative models go through at least one stage of gestation known as a "back-test". This process is an empirical study which attempts to judge whether a particular investment technique would have been successful had it been applied in the past. If it appears from the test that the methods under study would have been successful in meeting the (hopefully well-defined) goals of our investors, then the method is often tried in simulated or "paper" trading under purportedly current investment conditions. The central point of this paper is to consider the extent to which back-tests and simulated trading may reasonably be relied upon as indicative of likely results in actual implementation. For a short answer to this question, please see the third page of exhibits.

The essence of a back-test is to step backward in time to some past point and then come forward, simulating what we believe our actions would have been, assuming we had previously formulated the investment approach which we are now testing (and which was presumably only formulated in the recent past). This brings forward the first of several conceptual problems with back-tests. Even non-financial scientists such as Stephen Hawking have noted the potential investment reward in knowing the future. In A Brief History of Time, he said:

"It is rather difficult to talk about human memory because we don't know how the brain works in detail. We do, however, know all about how computer memories work. I shall therefore discuss the psychological arrow of time for computers. I think it reasonable to assume that the arrow for computers is the same as that for humans. If it were not, one could make a killing on the stock exchange by having a computer that would remember tomorrow's prices. To summarize, the laws of science do not distinguish between the forward and backward directions of time. However, there are at least three arrows of time which distinguish the past from the future. They are the thermodynamic arrow, the direction of time in which disorder increases (just think of a teenager's room - Author' Note), the psychological arrow, the direction of time in which we remember the past and not the future and the cosmological arrow, the direction of time in which the universe expands rather the contracts."

We can borrow from another principle of physics, the Heisenberg Uncertainty Principle, to illustrate further limits on the validity of back-testing and simulation methods. In order to predict the future price and rate of return of an investment asset, we must be able to accurately measure its current value and rate of return. Prices, however, are valid only for an instant, while returns must be measured over periods. As such, we can only get an accurate gauge of prices if we are willing to accept return data for an infinitely short period. Conversely, we can get meaningful return data only if we are willing to accept the average price of the period over which return is measured. We cannot simultaneously know both items with perfect validity.


Another conceptual problem with back-tests is the implication of the detached observer. The tests assume that had we developed the technique we are now testing at some earlier date, we could have applied the technique, garnered the potential success thereof, and had no impact on the markets. In financial markets, where there must be a buyer for every seller and a seller for every buyer, this is conceptually weak. Even in a market of infinite liquidity, competitive business pressures cause market participants to observe each other. It is unlikely that any technique offering a meaningful competitive advantage would cause no response among other market participants.

Most back-tests rely on parametric statistics. Most test results are analyzed with statistics which presume from the outset that the distribution of the estimators is known in advance. We assume a neat world of normal distributions and linear effects. As such, we find only those empirical results which can be found under those constraints. The most profound problem with back-tests and simulations is that, no matter how rigorously they are performed, there is the final, overriding assumption that the world of tomorrow will behave like the world of today or yesterday.

Technological Problems (exhibit page four, lower section)

Many back-tests obtain falsely positive results due to the improper management of technology and resources. The most obvious and often repeated flaw is the use of today's cheap computer capacity to sift through mountains of past data which it would have been economically impractical to research even if the conceptual basis of the technique under study had been evident. If we gave modern M-16 rifles to Napoleon's troops, they certainly would have killed far more of the enemy soldiers, but could we then credit Napoleon with being a better general? A parallel problem exists in academic studies which utilize large amounts of free graduate student labor to research strategies it would not be economically feasible to implement as a practitioner.

Another common problem is the handling of data errors. Many practitioners routinely perform large amounts of data "cleansing" before beginning a research test. In a back-test, we have the luxury of artificial time. In the real-time application of an investment technique, it may not be possible to perform extensive data cleansing operations. While extensive data checks may be appropriate for academic research, the practitioner must simulate real conditions as much as possible, including the practical limitations of ongoing implementation. A related flaw in many tests is reliance on databases which are subject to revision. Economic statistics, for example, are routinely revised by the government agencies that issue them. If we are building a model which will react to information as it is released, have we tested on "as released" or "as revised" information?
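One simple safeguard is to store each statistic as a set of dated release "vintages" and let the back-test see only what had actually been released as of each decision date. The minimal Python sketch below illustrates the idea; the GDP dates and values are invented purely for illustration.

```python
import datetime as dt

def value_as_of(vintages, decision_date):
    """Return the most recently released value known at `decision_date`.

    `vintages` is a list of (release_date, value) pairs for one statistic,
    including later revisions. Using the final revised number instead quietly
    gives a back-test information it could not have had at the time.
    """
    known = [(d, v) for d, v in vintages if d <= decision_date]
    if not known:
        return None
    return max(known)[1]  # value from the latest release on or before the date

# Hypothetical release vintages for a single quarterly GDP growth figure.
gdp_q1 = [(dt.date(1999, 4, 30), 0.032),  # advance estimate
          (dt.date(1999, 5, 27), 0.029),  # first revision
          (dt.date(1999, 6, 25), 0.035)]  # final revision

print(value_as_of(gdp_q1, dt.date(1999, 5, 1)))  # 0.032 -- what was actually known
print(value_as_of(gdp_q1, dt.date(1999, 7, 1)))  # 0.035 -- the revised figure
```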


Statistical Problems (exhibit page five, upper and lower sections)

Among the myriad flaws of statistical procedure which can beset a back-test or simulation are look-ahead bias and survival bias. Look-ahead bias simply means utilizing information which could not have been known at the past times at which actions relating to that information were purportedly undertaken. Survival bias is limiting the subjects of a test (the security universe) to those securities which continued to exist at the end of the test period. A blatant example of survival bias was a famous study which tried to assess the health of 30-year-old American males by getting birth records and sending out surveys to the birthday boys. The entire lot seemed remarkably healthy until someone noticed that very few responses had been received from the deceased.

Related to our earlier discussion of inadequate specification of the objective function is the practical problem of how to select the control group or benchmark for comparison. Many researchers have shown considerable skill (valuable nevertheless) in selecting benchmarks which they thought would perform poorly, hence giving a relative edge to whatever technique they were choosing to explore.

A phenomenon called "data mining" often arises in poorly designed experiments. We often try to establish the validity of an investment technique with the use of T-stats and other descriptive statistics which measure the likelihood that our results (hopefully successful) arose by chance. For example, if we presume that a technique is valid only if we are 95% confident that the successful result is not a chance event, we will still obtain many false successes if we test hundreds of different techniques. Many practitioners of technical analysis fall into this trap. It has even been computerized with so-called indicator optimization programs. For more on this issue, see the eleventh exhibit (upper section, exhibit page six).

Many practitioners build quantitative models based on a set of data and then test the model only on the data from which it was derived. This "in-sample" testing is a self-fulfilling prophecy and always works well. In essence, the results of in-sample tests represent the upper limit of the usefulness of a technique, not the expected value of its usefulness. There is also a host of more mundane statistical flaws which can remove the validity of an experimental design, such as inappropriate handling of outliers and the inclusion of highly correlated independent variables in a model. Earlier we made mention of the typical reliance on parametric statistics, which can be rendered meaningless by higher order characteristics of a distribution such as skewness and kurtosis.

Given the reasons cited here for potentially misleading results of a back-test or simulation, the practitioner is advised to proceed carefully with claims for discovery of powerful indicators of future investment performance. The sentiments expressed in The Prophet (exhibit twelve, lower section, exhibit page six) are somehow very appropriate to this circumstance.
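The multiple-testing trap is easy to demonstrate. In the sketch below, two hundred strategies with no skill whatsoever are each tested at the conventional 95% confidence level; a handful will "pass" purely by chance. The parameters are illustrative assumptions, not a claim about any particular study.

```python
import numpy as np

def false_discoveries(n_strategies=200, n_months=120, t_cutoff=1.96, seed=2):
    """Count how many skill-free strategies still 'pass' a 95% confidence test.

    Every strategy's monthly active return is pure noise, so any strategy whose
    t-statistic clears the cutoff is a false discovery produced by searching
    across many candidates.
    """
    rng = np.random.default_rng(seed)
    active = rng.normal(0.0, 0.02, size=(n_strategies, n_months))
    t_stats = active.mean(axis=1) / (active.std(axis=1, ddof=1) / np.sqrt(n_months))
    return int(np.sum(t_stats > t_cutoff))

# Roughly 2-3% of the 200 noise strategies will look 'significant' at this level.
print(false_discoveries())
```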


PART FOUR: Estimation Error

In building quantitative models, one of the most obvious and yet most overlooked aspects of the process is taking estimation error into account. Many of the things we need to forecast, such as the relative performance of different stocks, are relatively difficult to get right. On the other hand, we can predict with great certainty that certain minimum transaction costs will occur with each trade that we do. We need to be honest with ourselves. If our quantitative model is to correctly weigh a variety of predicted factors, each with a different level of certainty attached to the prediction, we must adjust our estimates of future events to incorporate the level of our prediction accuracy.

A popular approach to this process is known as Bayesian statistics. Put simply, the process is to come up with two expected values for each item we need to forecast. The first value, known as a prior, represents our best guess of what the value of the predicted item will be if our prediction process has no validity. The second estimate is our actual prediction for the item. For example, we might predict that a certain stock will outperform the market by 10% in the upcoming year. Our prior (assuming our prediction has no particular validity) is that the stock will outperform the market by 0% in the upcoming year. If we know that our past predictions of such things have had a particular level of accuracy, we may proportionately combine the two forecast values into one forecast with greater expected accuracy.

Once we admit that estimation error exists, the formation of efficient frontiers and similar constructs of portfolio theory becomes considerably more complex. Efficient frontiers become fuzzy bands of overlapping portfolios which are not meaningfully distinguished from one another in terms of risk and return. Statistical resampling methods such as the "bootstrap" and the "jackknife" can be used to estimate the level of estimation error in portfolio construction. Results can be improved through the use of non-parametric statistics and other techniques not requiring assumptions of linearity, symmetry or normality. For example, most of the analysis of utility functions can be extended to include situations where the distribution is not known in advance by use of Tchebyshev's inequality. Another useful technique is a form of discriminant analysis known as entropy minimization. In entropy minimization, a search algorithm is used to identify critical values of descriptive variables which can then be combined in an ad hoc "screening" fashion to most efficiently separate those members of the universe which have met some arbitrary but predefined criteria for success.
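A minimal sketch of such a precision-weighted (Bayesian) combination follows. The variances used here are illustrative stand-ins for the accuracy one would estimate from one's own forecasting record.

```python
def shrink_forecast(forecast, prior, forecast_variance, prior_variance):
    """Precision-weighted (Bayesian) combination of a forecast with a prior.

    Each estimate is weighted by its precision (1 / variance), so a noisier
    forecast is pulled more strongly toward the skill-free prior.
    """
    w_forecast = 1.0 / forecast_variance
    w_prior = 1.0 / prior_variance
    return (w_forecast * forecast + w_prior * prior) / (w_forecast + w_prior)

# The example from the text: a 10% predicted outperformance against a 0% prior.
# The variances below are purely illustrative assumptions about the accuracy
# of one's own historical predictions.
blended = shrink_forecast(forecast=0.10, prior=0.0,
                          forecast_variance=0.09, prior_variance=0.03)
print(blended)  # 0.025 -- the raw 10% view is shrunk to a 2.5% expectation
```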


Conclusions

Back-tests and simulations are of limited usefulness in predicting future investment outcomes. Many of the limitations are conceptual and apply irrespective of the level of statistical and experimental rigor. Considerable insight into the problems of quantitative investment analysis can be obtained from a subjective analysis of the professional and sociological context in which such analysis takes place. The fifteenth exhibit (exhibit page eight, upper section) should do much to properly reorient and broaden the perspective of practitioners. Most models fail the final test of successful implementation simply because practitioners had no right to expect that a model would work in the real world just because it worked on paper.

Presenter's Note

There once was a playwright who was commissioned to prepare a work for mid-summer presentation to a prominent and discriminating audience. Faced with relatively little lead time, he was not altogether happy with the result. Finding myself in a similar position, I have taken his words as the final exhibit.

WE are the problem

The mental features discoursed of as the analytical, are, in themselves, but little susceptible of analysis. We appreciate them only in their effects. We know of them, among other things, that they are always to their possessor, when inordinately possessed, a source of the liveliest enjoyment. As the strong man exults in his physical ability, delighting in such exercises as call his muscles into action, so glories the analyst in that moral activity which disentangles. He derives pleasure even from the most trivial occupation which brings his talent into play. He is fond of enigmas, of conundrums, of hieroglyphics; exhibiting in his solutions of each a degree of acumen which appears to the ordinary apprehension as praeternatural. His results, brought about by the very soul and essence of method, have, in truth, the whole air of intuition.

Edgar A. Poe, Murders in the Rue Morgue

What to Publish?

When we were first drawn together as a society, it had pleased God to enlighten our minds so far as to see that some doctrines, which we once esteemed truths, were errors; and that others, which we had esteemed errors, were real truths. From time to time, He has been pleased to afford us farther light, and our principles have been improving and our errors diminishing. Now we are not sure that we have arrived at the end of this progression, and at the perfection of spiritual or theological knowledge; and we fear that, if we should once print our confession of faith, we should feel ourselves as if bound and confined by it, and perhaps be unwilling to receive farther improvement, and our successors still more so, as conceiving what we their elders and founders had done, to be something sacred, never to be departed from.

Michael Welfare, Founder, The Society of Dunkers

Defining the Utility Function

Three Views:

"can only be taken as a result of animal spirits - of a spontaneous urge to action ratherthan inaction, and not as the outcome of a weighted average of benefits multiplied byquantitative probabilities."

John Maynard Keynes on investor's decisions, 1923

"The age of the computer, systematic analysis, and rational decision making based oneconomic theory has dawned."

Rudd and Clasing, Modern Portfolio Theory, 1982

"When it is a question of money, everybody is of the same religion."

Voltaire (in conversation)

Question: How to Blow a Back Test?

Answer: Believe the Results.

End of Presentation

Just kidding!

Conceptual Problems

The Stephen Hawking Problem: Why doesn't time work in reverse?

The Heisenberg Uncertainty Principle: Only finer detail helps

The Scientific Method: The implication of the detached observer.

The search for neatness: The linear, normally distributed world.

Technological Problems

Data Mining:

Cheap Computer Capacity: Napoleon with M-16s. High kill ratio, but does it mean anything? Beware free graduate student labor.

Database error rates: To clean or not to clean, that is the question.

Data cleaning beyond your control: Revised versus reported data

Statistical Problems

Survival Bias: The health of 30-year-old American males

Picking the Appropriate Benchmark: The slow rabbit problem

Meaning of T-Stats: Spurious correlation

Testing on In-Sample Data: How to always be right; what in-sample tests can do

Statistical Problems II

Outliers: The art of the science

Survivability: The multiple factor environment

Multicollinearity: Controlling cross-correlation in multiple models

Skewness: Unimodal leptokurtosis

Correlation Not Causality

Even trained statisticians often fail to appreciate the extent to which statistics are vitiated by the unrecorded assumptions of their interpreters... It is easy to prove that the wearing of tall hats and the carrying of umbrellas enlarges the chest, prolongs life and confers comparative immunity from disease. A university degree, a daily bath, the owning of thirty pairs of trousers, a knowledge of Wagner's music, a pew in church, anything, in short, that implies more means and better nurture... can be statistically palmed off as a magic spell conferring all sorts of privileges... The mathematician whose correlations would fill a Newton with admiration, may, in collecting and accepting data and drawing conclusions from them, fall into quite crude errors by just such popular oversights as I have been describing.

George Bernard Shaw, The Doctor's Dilemma

Market Inefficiencies

Say not, "I have found the truth," but rather, "I have found atruth."

Say not, "I have found the path of the soul." Say rather, "I havemet the soul walking upon my path."

Kahlil GibranThe Prophet

Estimation Error

Validity of parametric statistics: Mean-variance utility functions

The not-so-efficient frontier

Bayesian adjustments: To thine own self be true

Statistical Gymnastics: The bootstrap and jackknife

Differentness tests: A simple Chi-Square

Non-Parametric Methods

Simple improvements:
Median versus mean
Rank correlation
Coefficient of concordance

Not-so-simple improvement: Entropy minimization

The Era of Quant

"Between the Second World War to end allwars, and the Third, which by eliminatingmankind altogether, will actually do thetrick."

Philip MacDonald, The List of Adrian Messenger

A Parting Apology...

If we shadows have offended,
Think but this and all is mended,
That you have but slumber'd here
While these visions did appear.
And this weak and idle theme,
No more yielding but a dream,
Gentles, do not reprehend.
If you pardon, we will mend.
And, as I am an honest Puck,
If we have unearned luck
Now to scape the serpent's tongue,
We will make amends ere long;
Else the Puck a liar call.
So, good night unto you all.
Give me your hands, if we be friends,
And Robin shall restore amends.

Puck (closing verse), A Midsummer Night's Dream, by William Shakespeare