
INFERENCE TO THE BEST EXPLANATION, DUTCH BOOKS, AND INACCURACY MINIMISATION

BY IGOR DOUVEN

Bayesians have traditionally taken a dim view of the Inference to the Best Explanation (IBE), arguing that, if IBE is at variance with Bayes’ rule, then it runs afoul of the dynamic Dutch book argument. More recently, Bayes’ rule has been claimed to be superior on grounds of conduciveness to our epistemic goal. The present paper aims to show that neither of these arguments succeeds in undermining IBE.

The inference rule called ‘Inference to the Best Explanation’ (IBE) assigns confirmation-theoretic import to explanatory considerations. According to some, IBE is the cornerstone of scientific methodology.1 But critics have argued that, if IBE is at variance with Bayes’ rule, then, like any other such rule, it is to be rejected as leading to irrational belief updates.2 For many years, the standard argument for this claim has been the so-called dynamic Dutch book argument, which purports to show that updating by any rule other than Bayes’ makes one liable to sure financial losses. However, even Bayesians themselves have increasingly come to regard this argument as addressing the wrong issue, to wit, that of whether it is rational from a practical, rather than an epistemic, viewpoint to deviate from Bayes’ rule. This has led some theorists to pursue a different strategy

1 See, for example, R. Boyd, ‘The Current Status of Scientific Realism’ in J. Leplin (ed.), Scientific Realism (University of California Press, 1984), pp. 41–82, and E. McMullin, The Inference that Makes Science (Marquette University Press, 1992).

2 We will be concerned throughout with update rules applicable to learning events in which an agent becomes certain of a proposition of which he or she was previously uncertain. Bayesians acknowledge that other types of learning event call for different update rules. See, for example, R. Jeffrey, The Logic of Decision (University of Chicago Press, 2nd ed., 1983), Ch. 11, and B. C. van Fraassen, Laws and Symmetry (Oxford UP, 1989), Ch. 13.

The Philosophical Quarterly Vol. 63, No. 252, July 2013. ISSN 0031-8094. doi: 10.1111/1467-9213.12032

© 2013 The Author. The Philosophical Quarterly © 2013 The Editors of The Philosophical Quarterly. Published by John Wiley & Sons Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK, and 350 Main Street, Malden, MA 02148, USA


in defence of Bayes’ rule, a strategy that is purported to offer a distinctively epistemic argument in favour of that rule, one spelled out in terms of inaccuracy minimisation. Roughly, the argument is that by updating via any non-Bayesian rule, one’s degrees of belief are not as accurate as they would have been had one updated via Bayes’ rule.

This paper aims to show that neither of the aforesaid arguments succeeds in undermining IBE. The first part of the paper argues that, while current developments in mainstream epistemology may help to deflect some of the criticism the dynamic Dutch book argument has met with, the argument fails nonetheless, since it rests on an unfounded (and unstated) premise. The second part focuses on the inaccuracy-minimisation defence of Bayes’ rule, arguing that there appear to be several equally legitimate ways to interpret the notion of inaccuracy minimisation, and using computer simulations to show that under some of them it may be IBE rather than Bayes’ rule that does best with regard to inaccuracy minimisation.

I. THE DYNAMIC DUTCH BOOK ARGUMENT REVISITED

For many decades, the Ramsey–de Finetti Dutch book argument has been viewed as key to the Bayesian account of rationality. According to this argument, we are susceptible to Dutch books—collections of bets ensuring a negative net pay-off come what may—precisely if our degrees of belief violate the axioms of probability. From this, Ramsey and de Finetti concluded that rational degrees of belief are formally probabilities.
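As a minimal illustration of the static argument (not an example taken from the paper), suppose an agent's degrees of belief in A and in not-A are both .6, which violates additivity. The sketch below assumes that she regards credence times stake as a fair price for a bet that pays the stake if the proposition is true; the function name and the numbers are hypothetical.

```python
def net_payoff(credences, stake, a_is_true):
    """Agent's net gain after buying a bet on A and a bet on not-A,
    each at the price she herself regards as fair (credence * stake)."""
    cost = (credences["A"] + credences["not_A"]) * stake
    win_a = stake if a_is_true else 0.0        # bet on A pays only if A is true
    win_not_a = 0.0 if a_is_true else stake    # bet on not-A pays only if A is false
    return win_a + win_not_a - cost

incoherent = {"A": 0.6, "not_A": 0.6}   # hypothetical credences summing to 1.2
coherent = {"A": 0.6, "not_A": 0.4}     # hypothetical credences summing to 1

for a_is_true in (True, False):
    print(a_is_true,
          round(net_payoff(incoherent, 1.0, a_is_true), 2),
          round(net_payoff(coherent, 1.0, a_is_true), 2))
```

With the incoherent credences the net pay-off is negative whichever way A turns out, whereas with the coherent pair this particular pair of bets breaks even.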

Ian Hacking may have been the first to observe that the Ramsey–de Finetti argument in fact does nothing to justify Bayes’ rule.3 However, a few years after the publication of Hacking’s paper, Paul Teller reported a Dutch book argument—which he attributed to David Lewis—aimed at justifying Bayes’ rule as the only rational update rule.4 This dynamic Dutch book argument (as it is now called) purports to show that if a person updates by some rule other than Bayes’, she can be offered a series of bets at different points in time such that each bet will seem fair at the time it is offered, yet jointly the bets guarantee a financial loss. What is worse—the argument continues—the person could have seen this loss

3 I. Hacking, ‘Slightly More Realistic Personal Probabilities’, Philosophy of Science, 34, 1967, pp. 311–25.

4 P. Teller, ‘Conditionalization and Observation’, Synthese, 26, 1973, pp. 218–58. The source of the argument reported by Teller was later published as D. Lewis, ‘Why Conditionalize?’ in his Papers on Metaphysics and Epistemology (Cambridge UP, 1999), pp. 403–7.


coming. This vulnerability to ‘dynamic’ Dutch books has convinced many that non-Bayesian updating is a mark of irrationality.5

Recently, however, the Dutch book approach to defending Bayesianism has come under a cloud. Critics have argued that when we are concerned with the rationality of degrees of belief as well as the change thereof over time, we are concerned with questions of epistemic rather than practical rationality. Given that being vulnerable to cunning bookies seems primarily a practical liability, the Dutch book arguments have been said to be beside the point.6

In response to this, some (e.g., Brian Skyrms7) have claimed that Dutch book vulnerability does flag an underlying epistemic defect: it is a manifestation of the fact that a person deems one and the same bet or series of bets as both fair and not fair and thus is in an inconsistent state of mind. Even if this claim is true for the (static) Ramsey–de Finetti argument,8 the point does not carry over to the dynamic Dutch book argument. An agent may be susceptible to engage in the kind of betting over time that figures in that argument without at any one time holding inconsistent views on the fairness of any bets. Naturally, after a learning event she may regard a bet as unfair that previously she regarded as fair, but the same would have been true had she been a Bayesian learner.

Still, Bayesians may not be altogether defenceless against the above critique. In particular, they may be able to get some mileage out of the pragmatic turn that a number of epistemologists are currently taking. The epistemic status of a belief has traditionally been thought to depend solely on matters that bear on the truth of the belief, like the quality of one’s evidence or whether or not one is reliably connected to what the belief is about. But over the past years, various authors have argued that the epistemic status of a belief is inextricably bound up with the believer’s practical situation, in particular, with what is at stake for her in believing correctly.9

Bayesians wishing to maintain the integrity of the Dutch book defence may not want to buy into any particular one of the arguments that have been

5 We should actually speak of putative vulnerability to dynamic Dutch books: that a non-Bayesian updater is bound to regard all bets in a dynamic Dutch book as fair has been disputed in I. Douven, ‘Inference to the Best Explanation Made Coherent’, Philosophy of Science, 66, 1999, pp. S424–35; see also M. Tregear, ‘Utilising Explanatory Factors in Induction?’, British Journal for the Philosophy of Science, 55, 2004, pp. 505–19.

6 See J. Joyce, ‘A Nonpragmatic Vindication of Probabilism’, Philosophy of Science, 65, 1998, pp. 575–603, Sect. 2, and references given there.

7 B. Skyrms, ‘Coherence’ in N. Rescher (ed.), Scientific Inquiry in Philosophical Perspective (University Press of America, 1987), pp. 225–41.

8 But see Joyce, ‘A Nonpragmatic Vindication of Probabilism’, p. 585 f, for a critique.

9 See, for example, J. Fantl and M. McGrath, ‘Evidence, Pragmatics, and Justification’, Philosophical Review, 111, 2002, pp. 67–94, and J. Stanley, Knowledge and Practical Interests (Oxford UP, 2005).


advanced in favour of this ‘pragmatic encroachment view’ (as it has been called). However, it may suffice for them to argue that the mere prominence in current epistemology of the debate on pragmatic encroachment is enough to call into question the existence of the clear-cut divide between the epistemic and the pragmatic that the critics of the Dutch book arguments are presupposing. For—it may be said—if there were such a clear-cut divide, contributions to this debate should have gone down like lead balloons. And—Bayesians may conclude—if there is no such divide, then little is left of the charge that Dutch book arguments address the wrong type of rationality.

Be this as it may, there is a deeper problem with the dynamic Dutch book argument, one that remains even if the pragmatic encroachment view is endorsed. For note that Dutch book invulnerability is only one among an in principle indefinite number of practical interests that people may have. What if IBE, or some other non-Bayesian rule, serves other practical interests better than Bayes’ rule? Could the Bayesian in that case still maintain that, in view of the dynamic Dutch book argument, Bayes’ rule is the only rational update rule? Surely there exist practical goals whose achievement would more than make up for running the risk of being fleeced by a Dutch bookie—especially in view of the fact that Dutch bookies only occur as fictional characters in philosophers’ tales!

To show that IBE may indeed have compensating practical advantages, we use a particular probabilistic version of IBE that we apply in the context of a simple statistical model. Let $\{H_i\}_{i \leq n}$ be a set of self-consistent, mutually exclusive, and jointly exhaustive hypotheses, and $\Pr$ one’s probability function prior to learning $E$. Then, according to the version of IBE to be considered, one’s new probability for $H_i$ after learning $E$ (and nothing stronger) equals

$$\frac{\Pr(H_i)\,\Pr(E \mid H_i) + f(H_i, E)}{\sum_{j=1}^{n} \bigl(\Pr(H_j)\,\Pr(E \mid H_j) + f(H_j, E)\bigr)},$$

where $f$ is a function that assigns a bonus point (> 0) to the hypothesis (or hypotheses, in case of a tie) that explain(s) $E$ best in light of the background knowledge, and assigns nothing to the other hypotheses.10
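The update rule just stated can be sketched in a few lines of Python, side by side with Bayes' rule. This is only an illustration: the function names and the three-hypothesis numbers are hypothetical and do not come from the paper.

```python
def bayes_update(prior, likelihood):
    """Bayes' rule: the new probability of H_i is Pr(H_i)Pr(E|H_i), renormalised."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [x / total for x in joint]

def ibe_update(prior, likelihood, bonus):
    """Probabilistic IBE as stated above: add the bonus f(H_i, E) to each
    numerator Pr(H_i)Pr(E|H_i), then renormalise over all hypotheses."""
    boosted = [p * l + b for p, l, b in zip(prior, likelihood, bonus)]
    total = sum(boosted)
    return [x / total for x in boosted]

prior = [0.5, 0.3, 0.2]        # Pr(H_i), hypothetical values
likelihood = [0.2, 0.5, 0.9]   # Pr(E | H_i), hypothetical values
bonus = [0.0, 0.0, 0.1]        # assume the third hypothesis explains E best

print(bayes_update(prior, likelihood))
print(ibe_update(prior, likelihood, bonus))
```

When every bonus is 0 the two functions return the same distribution, and a positive bonus shifts probability towards the hypothesis that receives it.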

For present concerns, all we need is a definition of ‘best explanation’ for the following statistical model. Let $\{H_i\}_{0 \leq i \leq 10}$ be a set of ‘bias hypotheses’ concerning a given coin, where $H_i$ is the hypothesis that the bias for heads

10 A number of authors, including Peter Lipton and Jonathan Weisberg, have proposed versions of IBE that are compatible with Bayesianism; see P. Lipton, Inference to the Best Explanation (Routledge, 2nd ed., 2004), Ch. 7, and J. Weisberg, ‘Locating IBE in the Bayesian Framework’, Synthese, 167, 2009, pp. 125–43. Note that the present version is equivalent to Bayes’ rule if, and only if, there is no best explanation among the hypotheses and $f$ assigns 0 to all of them.


is $i/10$. These hypotheses are supposed to be jointly exhaustive. Let $E_j$ indicate that the outcome of the $j$-th of a series of tosses with the given coin is $E$. Then we say that $H_i$ best explains $E_j$ iff $i/10$ is closer to the actual frequency of heads in the first $j$ tosses than $k/10$, for all $k \in \{0, \ldots, 10\}$ different from $i$. For definiteness, in this case let $f(H_i, E_j) = .1$. If, for some $i$ and $k$, $i/10$ and $k/10$ are equally close to the actual frequency of heads, and closer than $l/10$ for all $l$ different from $i$ and $k$, the bonus is split: $f(H_i, E_j) = f(H_k, E_j) = .05$. No other hypotheses receive bonuses. So, for instance, if 681 heads have been observed in a series of 999 tosses and the 1000th toss is again heads, then in this model, and absent any further information about the coin, $H_7$ provides the best explanation of that last outcome and thus receives a bonus if updating proceeds by the above rule.
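The definition of the bonus for the coin model can be written out as follows. This is a sketch under the stated definition only; the function name is hypothetical, and exact fractions are used (an implementation choice, not something the paper specifies) so that ties are detected reliably.

```python
from fractions import Fraction

def bonus(heads, tosses, total_bonus=Fraction(1, 10)):
    """Bonuses f(H_0, E_j), ..., f(H_10, E_j), given the number of heads
    observed in the first j = tosses tosses."""
    freq = Fraction(heads, tosses)
    dist = [abs(Fraction(i, 10) - freq) for i in range(11)]
    closest = min(dist)
    best = [i for i, d in enumerate(dist) if d == closest]
    # One winner receives .1; two tied winners receive .05 each.
    share = total_bonus / len(best)
    return [share if i in best else Fraction(0) for i in range(11)]

# The example from the text: 682 heads in 1000 tosses.
print([i for i, b in enumerate(bonus(682, 1000)) if b > 0])   # -> [7]
# A tie: a frequency of 1/4 is equally close to 2/10 and 3/10.
print([i for i, b in enumerate(bonus(1, 4)) if b > 0])        # -> [2, 3]
```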

We will continue to use ‘IBE’ as a label for the broad idea that explanatory considerations have confirmation-theoretic import and use ‘$\mathsf{IBE}$’ (in sans serif font) to designate our probabilistic explication of that idea. It is to be emphasised that nothing in the following will hinge on whether $\mathsf{IBE}$ is the best or even a satisfactory explication of IBE. The main role of $\mathsf{IBE}$ will be in bringing into relief hidden, problematic premises in the dynamic Dutch book and inaccuracy-minimisation arguments. If we have failed to capture the notion of best explanation even for the above simple model, that will not make those arguments valid without the hidden premises, nor will it make these premises appear unproblematic.

Now, suppose the Bayesian and the explanationist (as we shall henceforth call the IBE-updater) are watching the same sequence of coin tosses to which the above model pertains. Both have started with a flat probability distribution over the eleven bias hypotheses, and they update their probabilities for these hypotheses via their respective update rules. It has been said that a proposition ‘is assertable to the extent that it has high subjective probability for its assertor’.11 If so, we may ask who—the Bayesian or the explanationist

11 F. Jackson, ‘On Assertion and Indicative Conditionals’, Philosophical Review, 88, 1979, pp. 565–89, at p. 565. It has been argued that a proposition is assertable if it is rationally acceptable; see, for example, I. Douven, ‘Assertion, Knowledge, and Rational Credibility’, Philosophical Review, 115, 2006, pp. 449–85 and ‘Assertion, Moore, and Bayes’, Philosophical Studies, 144, 2009, pp. 361–75. This yields Jackson’s view cited in the text if high subjective probability suffices for rational acceptability. That might seem too simplistic, in particular in view of so-called lottery propositions (propositions to the effect that a given ticket in a large fair lottery with only one winner will lose), for, although highly probable, such propositions do not at all seem rationally acceptable. But see I. Douven, ‘The Lottery Paradox and the Pragmatics of Belief’, Dialectica, 66, 2012, pp. 351–73, and ‘Putting the Pragmatics of Belief to Work’ in A. Capone, F. Lo Piparo, and M. Carapezza (eds), Perspectives on Pragmatics and Philosophy (Springer, in press), for a proposal on which high probability is enough for rational acceptability, with the exception of lottery propositions, which are excluded on principled grounds.


—is more likely to first be in a position to assert the truth about the bias of the coin, supposing the sequence is long enough for both eventually to be in that position.

To answer this question, we ran computer simulations of sequences of coin tosses that were long enough for both the Bayesian and the explanationist to assign a probability greater than .99 to the true bias hypothesis (following Richard Foley’s suggestion that .99 is a reasonable threshold for ‘high subjective probability’12). More exactly, we ran 1000 sequences of (simulated) tosses of a coin with bias .1 for heads, 1000 sequences of tosses of a coin with bias .2 for heads, and so on, up to bias .9 (the results for bias 0 and bias 1 were determined analytically). It was further simulated that, after each toss, the Bayesian and the explanationist updated their probabilities for the various bias hypotheses, and it was registered who was first in assigning a probability greater than .99 to the true hypothesis and also how many more updates the other party needed to assign a probability greater than .99 to the truth as well. Table 1 summarises the results of these simulations. As can be seen in this table, the explanationist was, on average, much faster to come to a position to assert the truth about the bias of the coin than the Bayesian.
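A single run of such a simulation might look like the sketch below. It is not the author's code and need not reproduce the reported figures exactly; the function names, the cap on the number of tosses, and the choice to record the first toss at which each party's probability for the truth exceeds .99 are all assumptions made for illustration.

```python
import random

BIASES = [i / 10 for i in range(11)]   # bias for heads under H_0, ..., H_10

def bonuses(heads, tosses, total=0.1):
    """Bonus vector for the coin model; distances are compared as integers
    (|10*heads - i*tosses|) so that ties are detected exactly."""
    dist = [abs(10 * heads - i * tosses) for i in range(11)]
    closest = min(dist)
    best = [i for i, d in enumerate(dist) if d == closest]
    return [total / len(best) if i in best else 0.0 for i in range(11)]

def update(probs, is_heads, bonus=None):
    """One update on a toss; with no bonus this is plain Bayes' rule."""
    like = [b if is_heads else 1 - b for b in BIASES]
    bonus = bonus or [0.0] * 11
    new = [p * l + f for p, l, f in zip(probs, like, bonus)]
    total = sum(new)
    return [x / total for x in new]

def race(true_index, rng, threshold=0.99, max_tosses=100_000):
    """Return (n_bayes, n_ibe): the first toss at which each party assigns
    more than `threshold` to the true hypothesis (None if the cap is hit)."""
    bayes = ibe = [1 / 11] * 11        # flat prior for both parties
    heads = 0
    n_bayes = n_ibe = None
    for n in range(1, max_tosses + 1):
        is_heads = rng.random() < BIASES[true_index]
        heads += is_heads
        bayes = update(bayes, is_heads)
        ibe = update(ibe, is_heads, bonuses(heads, n))
        if n_bayes is None and bayes[true_index] > threshold:
            n_bayes = n
        if n_ibe is None and ibe[true_index] > threshold:
            n_ibe = n
        if n_bayes is not None and n_ibe is not None:
            break
    return n_bayes, n_ibe

rng = random.Random(2013)              # arbitrary seed, for repeatability
print(race(3, rng))                    # one sequence for a coin with bias .3
```

Repeating `race` many times per bias value and averaging `n_bayes - n_ibe` would give a statistic of the kind the text describes.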

If this sounds unremarkable, imagine that the hypotheses concern some scientifically interesting quantity—such as the success rate of a medical treatment, or the probability of depressive relapse—rather than the bias of a coin, and the tosses are observations or experiments aimed at determining that quantity. Which researcher would not want to use an update rule that increases her chances of being in a position to make public a scientific theory, or a new medical treatment, before the (Bayesian) competition is?

Bias   X     SD    n<−100   −100⩽n<0   n=0   0<n⩽100   n>100
.0     24    0     0        0          0     1000      0
.1     64    66    1        86         0     709       204
.2     128   83    0        6          0     384       610
.3     180   101   0        0          0     184       816
.4     211   115   0        0          0     99        901
.5     220   117   0        0          0     111       889
.6     212   111   0        0          0     106       894
.7     172   101   0        0          0     195       805
.8     126   86    0        5          3     393       599
.9     65    64    4        69         0     716       211
1.0    24    0     0        0          0     1000      0

Table 1. Results of 1000 simulations of sequences of coin tosses for each bias value: X gives the average over 1000 simulations of the number n of updates by which the explanationist was faster than the Bayesian in assigning a probability above .99 to the correct bias hypothesis; SD gives the standard deviation of the sample; the other numbers indicate in how many of the simulations the explanationist was more than 100, between 0 and 100, etc., updates faster than the Bayesian.

12 R. Foley, ‘The Epistemology of Belief and the Epistemology of Degrees of Belief’, American Philosophical Quarterly, 29, 1992, pp. 111–24, at p. 113.

However, there is a possible downside to the apparent success of IBE described above. The reason why the explanationist so often beats the Bayesian in these simulations in assigning a high probability to the truth is that she is, in a clear sense, bolder in her responses to new information, due to the fact that she adds a bonus to the best explanation. It does not take much to see that this same feature makes the explanationist more prone to assign a high probability to some false hypothesis: a row of consecutive tosses producing a subsequence in which the relative frequency of heads starkly deviates from the probability for heads is more likely to push the explanationist’s probability for some false bias hypothesis over the .99 threshold than it is to push the more cautious Bayesian’s probability for some false hypothesis over that threshold. Thus, even if IBE may put one more rapidly in a position to assert truths, it may also put one more rapidly in a position that licenses the assertion of falsehoods.

Consider, then, an arguably fairer comparison between IBE and Bayes’ rule. A magician has a set of eleven coins in his pocket, one coin with a perfect bias for heads, one with a bias of .1 for heads, one with a bias of .2 for heads, and so on. He entices a Bayesian and an explanationist into playing a game against one another. In this game, the magician picks one of the coins and starts tossing it, showing each outcome to the two players. A player scores a point by raising her hand and asserting the truth about the bias of the coin. However, if the player gets the bias wrong, the point goes to the other player. If both the players raise their hands after the same update and both are right about the bias, they both receive a point; if they are both wrong, neither receives a point; if one is right and the other wrong, the former receives two points. Once a player has asserted a bias hypothesis—whether correctly or incorrectly—the magician puts the coin back into his pocket and picks a (not necessarily different) coin. Then everything starts all over. This procedure is repeated 100 times, after which the player with the highest score is declared the winner. Note that this game seems to treat the possible advantages and disadvantages of IBE-updating in an evenhanded manner: because of her boldness, the explanationist might often be first to identify the truth, but by the same token she might earn quite a few points for her opponent. Who is more likely to win the game, the explanationist or the Bayesian? Or are their chances of winning equal?
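The scoring scheme of this game can be sketched as below. Several details are assumptions rather than anything the text fixes: a player is taken to raise her hand as soon as some hypothesis exceeds .99 and to assert that hypothesis, coins are drawn uniformly at random, and a cap on tosses guards against non-termination. All names are hypothetical, and the sketch makes no claim to reproduce the reported scores.

```python
import random

BIASES = [i / 10 for i in range(11)]

def bonuses(heads, tosses, total=0.1):
    # Integer comparison of |heads/tosses - i/10| keeps ties exact.
    dist = [abs(10 * heads - i * tosses) for i in range(11)]
    closest = min(dist)
    best = [i for i, d in enumerate(dist) if d == closest]
    return [total / len(best) if i in best else 0.0 for i in range(11)]

def update(probs, is_heads, bonus=None):
    like = [b if is_heads else 1 - b for b in BIASES]
    bonus = bonus or [0.0] * 11
    new = [p * l + f for p, l, f in zip(probs, like, bonus)]
    total = sum(new)
    return [x / total for x in new]

def confident(probs, threshold=0.99):
    """Index of the hypothesis the player would assert, or None."""
    top = max(range(11), key=probs.__getitem__)
    return top if probs[top] > threshold else None

def play_round(rng, max_tosses=100_000):
    """Points (bayesian, explanationist) earned on one coin."""
    truth = rng.randrange(11)
    bayes = ibe = [1 / 11] * 11
    heads = 0
    for n in range(1, max_tosses + 1):
        is_heads = rng.random() < BIASES[truth]
        heads += is_heads
        bayes = update(bayes, is_heads)
        ibe = update(ibe, is_heads, bonuses(heads, n))
        b, e = confident(bayes), confident(ibe)
        if b is None and e is None:
            continue
        if b is not None and e is not None:      # both raise their hands
            if b == truth and e == truth:
                return 1, 1
            if b == truth:
                return 2, 0
            if e == truth:
                return 0, 2
            return 0, 0
        if b is not None:                        # only the Bayesian asserts
            return (1, 0) if b == truth else (0, 1)
        return (0, 1) if e == truth else (1, 0)  # only the explanationist asserts
    return 0, 0                                  # cap reached: nobody scores

def play_game(rng, rounds=100):
    bayes_score = ibe_score = 0
    for _ in range(rounds):
        b, e = play_round(rng)
        bayes_score += b
        ibe_score += e
    return bayes_score, ibe_score

print(play_game(random.Random(7)))               # arbitrary seed
```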


To answer these questions, we ran 1000 simulations of the game. In these simulations, the explanationist always won, and typically did so by a wide margin. To be more exact, the mean of the differences between the explanationist’s score and the Bayesian’s score was 27 (SD = 9); the smallest difference between the two players’ scores at the end of a game was 5, and the greatest was 52.

This is not meant to demonstrate the superiority of IBE. Sets of trick coins of the variety required for playing the above game are hardly more common than Dutch bookies. Moreover, nothing said so far, or to be said hereafter, excludes the possibility that there are circumstances under which Bayesian updating offers practical advantages more consequential than Dutch book invulnerability. Instead, the point is to draw attention to a hidden assumption in the Dutch book approach, to wit, that Dutch book invulnerability trumps all other practical goals we may have. Absent an argument for this assumption, the Dutch book approach does not pose a threat to $\mathsf{IBE}$ or any other probabilistic explication of IBE, even granting that the pragmatic pervades the epistemic.

II. IBE AND OUR EPISTEMIC GOAL

The relation between the pragmatic and the epistemic is immaterial to an approach to justifying Bayes’ rule, and more generally the tenets of Bayesianism, that has emerged over the past fifteen years or so. Various theorists, disconcerted by the apparently pragmatic focus of the Dutch book arguments, have sought to justify the Bayesian tenets in strictly epistemic terms. Specifically, they have sought to show that those tenets are most conducive to the achievement of our epistemic goal, as spelled out for graded beliefs.

This development owes much to Jim Joyce’s ‘A Nonpragmatic Vindication of Probabilism’, in which a first attempt is made to formulate an epistemic goal in terms of graded beliefs, and which argues that, for every degrees-of-belief function that violates the probability axioms, there is a degrees-of-belief function that obeys those axioms and that is closer to the epistemic goal. Joyce thought that this vindicated the ‘synchronic’ part of Bayesianism, according to which degrees of belief ought to be probabilities, though in a later paper he admits that the earlier argument was not airtight.13 For present purposes, however, it is mostly Joyce’s conception of our epistemic goal as pertaining to graded beliefs that matters.

13 J. Joyce, ‘Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief’, in F. Huber and C. Schmidt-Petri (eds), Degrees of Belief (Springer, 2009), pp. 263–97.


Among mainstream epistemologists, who tend to be concerned first and foremost with categorical beliefs, it is almost universally held that our epistemic goal is to believe all that is true and only what is true. Until recently, there was no equally clear conception of an epistemic goal in terms of graded beliefs. This changed when Joyce proposed an epistemic goal for graded beliefs that was explicitly meant to be analogous to the aforementioned epistemic goal for categorical beliefs.14 The proposal is that our system of degrees of belief ought to be gradationally accurate, and in fact as gradationally accurate as any other system of degrees of belief that we might adopt. The notion of gradational accuracy is defined in technical terms, in particular in terms of so-called scoring rules (about which more below). But the basic intuition underlying it is clear enough, to wit, that the higher one’s degree of belief in a true proposition is, the more accurate one is, ceteris paribus, and also the lower one’s degree of belief in a false proposition is, the more accurate one is, ceteris paribus.
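To make the basic intuition concrete, here is a small sketch that uses the quadratic (Brier) score, one widely used scoring rule. Choosing this rule is an assumption made purely for illustration; the passage above does not commit to a particular rule, and the function name and numbers are hypothetical.

```python
def brier_inaccuracy(credences, true_index):
    """Quadratic (Brier) inaccuracy of credences over mutually exclusive
    hypotheses: the sum of squared distances from the truth values
    (1 for the true hypothesis, 0 for each false one)."""
    return sum((c - (1.0 if i == true_index else 0.0)) ** 2
               for i, c in enumerate(credences))

# Suppose the first of two hypotheses is true.
confident = [0.9, 0.1]
undecided = [0.5, 0.5]
print(brier_inaccuracy(confident, 0) < brier_inaccuracy(undecided, 0))  # -> True
```

On this measure, raising one's credence in the true hypothesis and lowering it in the false one both reduce inaccuracy, and an agent who is certain of the truth has inaccuracy 0.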

In the same vein, Hannes Leitgeb and Richard Pettigrew have tried to give a nonpragmatic justification not only of the synchronic part of Bayesianism but also of its diachronic part, that is, of Bayes’ rule.15 They share Joyce’s view of our epistemic goal, which they put as follows:16

Accuracy: An epistemic agent ought to approximate the truth. In other words, she ought to minimise her inaccuracy.

Also like Joyce, Leitgeb and Pettigrew make the notion of inaccuracy precise by reference to scoring rules. What they then argue is that Bayesian updating minimises expected inaccuracy. More exactly, they argue that if, and only if, an agent updates via Bayes’ rule, she minimises the expected inaccuracy of her post-update probability function, where the expectation is calculated according to her pre-update probability function.17
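A toy numerical check can convey the flavour of this claim. The sketch below is not Leitgeb and Pettigrew's own formalism: it assumes a finite set of four worlds, the Brier score as the inaccuracy measure, and an expectation that weights each world compatible with the evidence by its prior probability. The names and numbers are hypothetical.

```python
def brier(q, w):
    """Brier inaccuracy of the credence vector q if world w is actual."""
    return sum((q[v] - (1.0 if v == w else 0.0)) ** 2 for v in range(len(q)))

def expected_inaccuracy(q, prior, evidence):
    """Expected inaccuracy of q, weighting each world compatible with the
    evidence by its prior (pre-update) probability."""
    return sum(prior[w] * brier(q, w) for w in evidence)

prior = [0.4, 0.3, 0.2, 0.1]    # hypothetical prior over four worlds
evidence = [0, 1]               # learning E rules out worlds 2 and 3
p_e = sum(prior[w] for w in evidence)

# Bayes' rule: renormalise the prior over the worlds left open by E.
bayes = [prior[w] / p_e if w in evidence else 0.0 for w in range(4)]
# A hypothetical bolder posterior that shifts extra weight to world 0.
rival = [0.7, 0.3, 0.0, 0.0]

print(expected_inaccuracy(bayes, prior, evidence))
print(expected_inaccuracy(rival, prior, evidence))
```

Under these assumptions the Bayesian posterior has the lower expected inaccuracy, although this says nothing about which posterior turns out to be closer to the truth in the actual world.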

What Leitgeb and Pettigrew aim to show for Bayes’ rule is in an important respect disanalogous with what Joyce aims to show for the synchronic part of Bayesianism. While Joyce argues that an agent whose degrees of belief are not probabilities fails to minimise actual inaccuracy, Leitgeb and Pettigrew argue that an agent who updates by a rule other than Bayes’ fails to minimise expected inaccuracy. Nothing they say precludes the possibility that some non-Bayesian rule outperforms Bayes’ rule

14 Joyce, ‘A Nonpragmatic Vindication of Probabilism’, p. 578 f.

15 H. Leitgeb and R. Pettigrew, ‘An Objective Justification of Bayesianism I: Measuring Inaccuracy’, Philosophy of Science, 77, 2010, pp. 201–35, and ‘An Objective Justification of Bayesianism II: The Consequences of Minimizing Inaccuracy’, Philosophy of Science, 77, 2010, pp. 236–72.

16 Leitgeb and Pettigrew, ‘An Objective Justification of Bayesianism I’, ibid., p. 202.

17 Leitgeb and Pettigrew, ‘An Objective Justification of Bayesianism II’, ibid., p. 249 f.


with respect to actual inaccuracy minimisation. More importantly, Leitgeb and Pettigrew are concerned only with the inaccuracy of the immediate post-update belief state. But an equally—if not more—legitimate question one may ask about update rules is which of them will lead to the most accurate belief state in the long run. Indeed, given that the epistemic goal for graded beliefs is supposed to be analogous to the epistemic goal as discussed by mainstream epistemologists, the long-run question is the more natural one to ask. After all, in mainstream epistemology the epistemic goal is mostly conceived as an ultimate goal to which all our epistemic endeavours are geared and in light of which they are to be assessed.18

This points to what seems to be a major problem for the inaccuracy-minimisation approach, to wit, that Accuracy permits of a number of different interpretations. For instance, it can be interpreted as demanding that every single update minimise expected inaccuracy, as Leitgeb and Pettigrew do, or that every update minimise actual inaccuracy, or that every update be aimed at realising the long-term project of coming to have a minimally inaccurate representation of the world, even if individual updates do not always minimise inaccuracy or expected inaccuracy. What makes this problematic is that all of these sound like legitimate epistemic goals, and that it is by no means obvious that if rule R is most conducive to the realisation of one goal and rule R′ is most conducive to the realisation of a second, then R = R′, or at least the rules are equivalent in that they always yield the same output, given the same input.

In fact, there is not just the question of whether the goal of inaccuracy minimisation is meant as pertaining (only) to the ‘next update’ or is rather meant as a long-term project. Even if we assume that it is conceived as a long-term project, there are still further distinctions that can be made. One notable distinction concerns the question of whether we should aim to have a minimally inaccurate probability function in the long run, however far in the future that may be, or whether it is better to have a moderately accurate probability function in the shorter run. It is certainly imaginable that, for purely epistemic reasons, one might opt for the latter,

18 See, for example, A. Latus, ‘Our Epistemic Goal’, Australasian Journal of Philosophy, 78, 1998, pp. 28–39, at p. 29, who says that in asking what our epistemic goal is ‘we are interested in what we want the overall result of our various ways of “finding out”…to be’. Alvin Goldman also argues that (what he calls) epistemic systems are to be evaluated in terms of their accuracy, which he phrases in terms of degrees of truth-possession, which in turn is defined by reference to a scoring rule; see his ‘Epistemic Relativism and Reasonable Disagreement’, in R. Feldman and T. Warfield (eds), Disagreement (Oxford UP, 2010), pp. 187–215, Sect. 2. However, he is very clear that what he has in mind is long-term accuracy: ‘Epistemic system E is better than epistemic system E* iff conformity to E would produce (in the long run) a higher total amount of degrees of truth-possession than conformity to E* would produce’ (p. 194).


supposing there is a choice to be made. We are, after all, curious about the truth,19 and curiosity typically comes with a sense of urgency. As a result, we might prefer an update rule that is more likely to take us fairly close to the truth in a reasonably short time span over one that is more likely to take us extremely close to the truth in the long run but less likely to take us even fairly close to the truth in (say) the middle-long run.20

We could go on for some time disambiguating Accuracy in this way, if only because speed and accuracy of convergence to the truth can be traded off in an indefinite number of ways. However, our aim here is not to catalogue all reasonable precisifications of Accuracy but rather to show that the inaccuracy-minimisation argument poses no real threat to IBE. As Leitgeb and Pettigrew argue, Bayesian updating minimises the expected inaccuracy of our next belief state relative to our current belief state, and thereby may be said to be most conducive to one epistemic goal, but that does little to undermine IBE as long as it is understood that IBE (or a particular explication of it) is most conducive to some other epistemic goal or goals, where possibly the different epistemic goals can both (or all) be regarded as explications of the same broad idea that we should strive for a minimally inaccurate belief system.

At this juncture, Bayesians might try to argue that expected inaccuracy minimisation of our next belief state as judged from our present one trumps any other epistemic goal we may have. The prospects of this strategy are about as bleak as the prospects of showing that Dutch book invulnerability trumps any other practical goal that we may have; bleaker still, for it would seem absurd to claim that it is epistemically more important to have an update rule that minimises expected inaccuracy than to have one that actually minimises inaccuracy.21

Alternatively, Bayesians might try to show that Bayes’ rule does best with respect to all epistemic goals. This may be hard to establish in an

19 See, for example, C. Hempel, ‘Aspects of Scientific Explanation’, in his Aspects of Scientific Explanation and Other Essays in the Philosophy of Science (Free Press, 1965), pp. 331–496, at p. 333, and R. Foley, The Theory of Epistemic Rationality (Harvard UP, 1987), at p. 11.

20 I. Douven, ‘Simulating Peer Disagreements’, Studies in History and Philosophy of Science, 41, 2010, pp. 148–57, Sect. 4, makes a parallel point in terms of the qualitative notion of truth approximation for update procedures that take into account peer opinions. (See G. Oddie, ‘Truthlikeness’, The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), available at http://plato.stanford.edu/entries/truthlikeness/, for a useful overview of the literature on truth approximation.)

21 This seems especially true in view of Robbie Williams’ observation that expected inaccuracy minimisation defences of Bayes’ rule raise the question of ‘why [you should] trust an outdated belief state to tell you how to fix your beliefs now you have new information’; J. R. G. Williams, ‘Generalized Probabilism: Dutch Books and Accuracy Domination’, Journal of Philosophical Logic, 41, 2012, pp. 811–40, at p. 835.


a priori manner, supposing that among the relevant goals are ones whose realisation requires actual inaccuracy minimisation. How well a rule does in terms of truth approximation may plausibly depend not only on the rule but also on which world is the actual one. For example, even if explanationists have so far failed to produce a reason for believing that best explanations tend to be true, we might happen to inhabit a world in which, perhaps purely coincidentally, best explanations do tend to be true. Worse yet (for the Bayesians), the simulations to be reported below, which compare Bayes’ rule and IBE in terms of gradational inaccuracy in the setting of our earlier statistical model, give reason to doubt that Bayes’ rule is the most conducive to the realisation of every epistemic goal.22

In these simulations, gradational inaccuracy is measured by means of two scoring rules. One is the popular Brier scoring rule. Given a set $\{H_i\}_{i \le n}$ of self-consistent, mutually exclusive, and jointly exhaustive hypotheses, the Brier score penalty for an agent whose probabilities are given by $\Pr$ equals $\sum_{i=1}^{n} \bigl(\llbracket H_i \rrbracket - \Pr(H_i)\bigr)^2$, where $\llbracket H_i \rrbracket$ designates the semantic value of $H_i$ (1 if $H_i$ is true, 0 otherwise).23 The other is the log score rule, another popular scoring rule.24 In contrast to the Brier rule, the log score rule penalises an agent only on the basis of her probability for the true hypothesis: where $H_i$ is the true hypothesis, an agent’s log score penalty equals $-\ln\bigl(\Pr(H_i)\bigr)$.
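The two penalties just defined can be illustrated with a minimal Python sketch. The function names and the example credences are assumptions made for illustration only; they do not come from the simulations reported in this paper.

```python
import math

def brier_penalty(probs, true_index):
    """Sum over all hypotheses of (semantic value - probability) squared."""
    return sum(((1.0 if i == true_index else 0.0) - p) ** 2
               for i, p in enumerate(probs))

def log_penalty(probs, true_index):
    """Negative natural log of the probability given to the true hypothesis."""
    return -math.log(probs[true_index])

# Hypothetical credences over three hypotheses, the second of which is true.
credences = [0.2, 0.7, 0.1]
print(brier_penalty(credences, 1))
print(log_penalty(credences, 1))
```

Note that the Brier penalty depends on the whole distribution, whereas the log penalty changes only when the probability of the true hypothesis changes.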

For each possible bias value, we ran 1000 simulations of a sequence of 1000 tosses. As previously, the explanationist and the Bayesian updated their degrees of belief after each toss. We registered in how many of those 1000 simulations the explanationist incurred a lower penalty than the Bayesian at various reference points, at which we calculated both Brier penalties and log score penalties. The outcomes of these simulations are displayed in Table 2. They show that, on either measure of inaccuracy, IBE is most often the winner (it incurs the lowest penalty) at each reference point. Hence, at least in the present kind of context, IBE seems a better choice than Bayes’ rule.
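A scaled-down version of such a comparison might be sketched as follows. Several things here are assumptions of the sketch rather than details confirmed by this section: the hypothesis space (eleven bias hypotheses, 0 to 1 in steps of .1), the uniform prior, and above all the explanationist rule, which is rendered as Bayes’ rule plus a fixed bonus of 0.1 for the hypothesis closest to the observed frequency of heads (split evenly on ties). The function names are hypothetical, and only the log score is tracked.

```python
import math
import random

# Assumed hypothesis space: bias = 0, .1, ..., 1.
BIASES = [i / 10 for i in range(11)]

def bayes_update(probs, heads):
    """Conditionalise on one toss (heads is True or False)."""
    joint = [p * (b if heads else 1 - b) for p, b in zip(probs, BIASES)]
    total = sum(joint)
    return [j / total for j in joint]

def ibe_update(probs, heads, n_heads, n_tosses, bonus=0.1):
    """Hypothetical explanationist rule: Bayes plus a bonus for the
    hypothesis whose bias is closest to the observed frequency."""
    joint = [p * (b if heads else 1 - b) for p, b in zip(probs, BIASES)]
    freq = n_heads / n_tosses
    gaps = [abs(b - freq) for b in BIASES]
    best = [i for i, g in enumerate(gaps) if g - min(gaps) < 1e-12]
    for i in best:
        joint[i] += bonus / len(best)  # ties share the bonus
    total = sum(joint)
    return [j / total for j in joint]

def run(true_index, n_tosses, rng):
    """Return (IBE log penalty, Bayes log penalty) after n_tosses tosses."""
    bayes = [1 / len(BIASES)] * len(BIASES)  # assumed uniform prior
    ibe = list(bayes)
    n_heads = 0
    for t in range(1, n_tosses + 1):
        heads = rng.random() < BIASES[true_index]
        n_heads += heads
        bayes = bayes_update(bayes, heads)
        ibe = ibe_update(ibe, heads, n_heads, t)
    return -math.log(ibe[true_index]), -math.log(bayes[true_index])

rng = random.Random(0)
results = [run(5, 100, rng) for _ in range(200)]  # fair coin, 200 short runs
ibe_wins = sum(i < b for i, b in results)
mean_diff = sum(i - b for i, b in results) / len(results)
print(ibe_wins, mean_diff)
```

The sketch reports both the number of runs in which the explanationist incurs the strictly lower penalty and the difference between the mean penalties, which are the two quantities Table 2 records.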

22 An anonymous referee noted that, while the simulations suffice to make my case against the inaccuracy minimisation defence of Bayes’ rule, an analytical investigation of IBE would be worthwhile in its own right. I fully agree, but this is a task better dealt with in a separate paper.

23 Leitgeb and Pettigrew, ‘An Objective Justification of Bayesianism I: Measuring Inaccuracy’, argue that the Brier score is the scoring rule via which to measure gradational inaccuracy. However, their argument rests on several premises the defence of which they relegate to future work.

24 In his ‘The Justification of Induction’, Philosophy of Science, 59, 1992, pp. 527–39, Roger Rosenkrantz also uses the log score rule in an attempt to vindicate Bayes’ rule that foreshadows Leitgeb and Pettigrew’s.


Or maybe not, for we also calculated averages, taken over the 1000 simulations for each bias value, of the penalties incurred by the explanationist and the Bayesian at the designated reference points. Table 2 gives, for each of these points, the mean of the IBE penalties minus the mean of the Bayes’ rule penalties (in parentheses). As can be seen from those numbers, if there is a difference between the two means, it is always in favour of Bayes’ rule. The reason for this is manifest from the spread of the simulation outcomes (not represented here), which shows that, although IBE wins in most instances, it is typically by a relatively small margin, whereas in some of the runs in which IBE loses, it incurs considerably greater

Bias    100           250           500           750           1000

Brier score
 .0     1000 (0)      1000 (0)      1000 (0)      0 (0)         0 (0)
 .1     923 (.02)     993 (0)       1000 (0)      1000 (0)      1000 (0)
 .2     748 (.10)     937 (.02)     991 (0)       1000 (0)      1000 (0)
 .3     717 (.11)     918 (.03)     984 (0)       998 (0)       999 (0)
 .4     690 (.11)     865 (.06)     968 (.01)     990 (0)       997 (0)
 .5     679 (.12)     879 (.05)     972 (0)       992 (0)       998 (0)
 .6     660 (.14)     904 (.03)     977 (0)       992 (0)       995 (0)
 .7     698 (.12)     907 (.03)     978 (.01)     995 (0)       1000 (0)
 .8     754 (.09)     947 (.02)     996 (0)       999 (0)       999 (0)
 .9     910 (.04)     990 (0)       1000 (0)      1000 (0)      1000 (0)
1.0     1000 (0)      1000 (0)      1000 (0)      0 (0)         0 (0)

Log score
 .0     1000 (0)      1000 (0)      0 (0)         0 (0)         0 (0)
 .1     926 (.18)     993 (.01)     1000 (0)      924 (0)       515 (0)
 .2     750 (.75)     937 (.47)     991 (.12)     1000 (0)      1000 (0)
 .3     717 (.90)     918 (.58)     984 (.12)     998 (.05)     999 (0)
 .4     691 (1.08)    865 (1.08)    968 (.66)     990 (.36)     997 (.14)
 .5     679 (1.18)    879 (.94)     972 (.56)     992 (.24)     998 (.09)
 .6     664 (1.20)    904 (.86)     977 (.24)     992 (.20)     995 (.21)
 .7     698 (.99)     907 (.72)     978 (.28)     995 (.12)     1000 (0)
 .8     756 (.68)     947 (.39)     996 (.08)     999 (.01)     999 (0)
 .9     910 (.25)     990 (0)       1000 (0)      932 (0)       513 (0)
1.0     1000 (0)      1000 (0)      0 (0)         0 (0)         0 (0)

Table 2. Results of 1000 simulations of sequences of 1000 coin tosses for each bias value: the columns give the number of simulations in which IBE incurred a lower Brier score penalty (above) or log score penalty (below) than Bayes’ rule after 100, 250, 500, 750, or 1000 tosses; in parentheses is the mean of the IBE penalties minus the mean of the Bayes’ rule penalties, taken over the 1000 simulations (rounded to two decimal places).


penalties than does Bayes’ rule. This, in turn, is due to the fact, noted earlier, that the explanationist reacts in a bolder fashion to the evidence than does the Bayesian and is therefore more easily led astray by rows of tosses which produce subsequences with deviating relative frequencies.
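How a rule can win most runs and still have the higher mean penalty is easy to see in a toy calculation. The numbers below are invented for illustration and are not simulation outcomes from this paper; the function name is likewise hypothetical.

```python
def compare(ibe_penalties, bayes_penalties):
    """Return (number of runs IBE wins, mean IBE penalty minus mean Bayes penalty)."""
    wins = sum(i < b for i, b in zip(ibe_penalties, bayes_penalties))
    mean_diff = (sum(ibe_penalties) / len(ibe_penalties)
                 - sum(bayes_penalties) / len(bayes_penalties))
    return wins, mean_diff

# Made-up penalties for ten runs: IBE wins nine narrowly and loses one badly.
ibe = [0.10] * 9 + [2.00]
bayes = [0.12] * 9 + [0.30]

wins, mean_diff = compare(ibe, bayes)
print(wins)           # 9
print(mean_diff > 0)  # True: the mean difference favours Bayes' rule
```

Nine small wins of 0.02 each are outweighed by a single loss of 1.70, so the difference of means comes out positive even though IBE wins 90 per cent of the runs.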

The graphs in Figure 1 illustrate these connections. The left panel in the middle row represents the explanationist’s (gray) and the Bayesian’s (black) probabilities after each toss of a sequence of 250 tosses with a fair coin. Both are early on pushed off the right track, but the explanationist much more so than the Bayesian. This has a clearly recognizable effect on the differences between the penalties that they incur, as can be seen in the middle panel of the same row for the Brier rule and in the right panel for the log score rule.

From these simulations the two update rules under consideration come out doing better in different respects; this fact further buttresses the point of this section. For there seems to be no clear answer to the question of whether it is better, epistemically speaking, to use an update rule that in general achieves greater accuracy than other update rules, even if typically not much greater accuracy; or to use an update rule that is less likely

Fig. 1. Simulations with a randomly chosen sequence of 250 tosses with a coin with bias = .1 (top row), bias = .5 (middle row), and bias = .9 (bottom row). Left column: probabilities assigned to the true bias hypothesis; middle column: Brier penalties; right column: log score penalties. Gray squares: values for the explanationist; black triangles: values for the Bayesian.


than another update rule to ever make one vastly inaccurate, even though the former typically makes one somewhat more inaccurate than the latter. Naturally, we would prefer a rule that offered the best of both worlds, or better yet, of all worlds; in other words, one that was most conducive to any epistemic goal we might have. But there may be no rule that fits this bill, and in any event, the above simulations give reason to believe that Bayes’ rule is not that rule. To be sure, these simulations do not warrant a more positive verdict about IBE either. But the aim of this section was not to make a case for IBE, or even for IBE. Rather, it was to highlight and question a hidden assumption in the inaccuracy-minimisation defence of Bayes’ rule, to wit, that there is but one way in which an update rule can be said to minimise inaccuracy.

In closing, I would like to consider a possible Bayesian response to the foregoing. In section I, it was argued that what makes the dynamic Dutch book argument objectionable may not so much be the fact that the argument addresses the wrong kind of rationality (practical rationality) but rather that it focuses on just one possible practical concern to the neglect of all others, apparently without good reason. If pragmatic considerations can be legitimately invoked in the present discussion, then one might try to argue that, even if the goal or goals that Bayes’ rule serves best are not privileged from an epistemic viewpoint, they are privileged from a practical viewpoint. Specifically, one might try to argue that, if the results of the above simulations have some general validity, and do not just hold for the particular type of statistical model considered in this paper, then the fact that updating via Bayes’ rule leads, on average, to a lower penalty is enough to justify this rule. And indeed, it does follow from the said fact that were one to keep betting on consecutive tosses with a given coin, each time posting betting odds in accordance with one’s degrees of belief at that time, then one would maximise one’s expected payoff by updating via Bayes’ rule.

But it would be false to think that payoffs and scoring rule penalties must always be so strictly related as in this example. To see how they can come apart, suppose that you are the owner of one of two engineering firms in a city. The two firms are in permanent competition for contracts from the local authorities. As a rule, a contract goes to the firm that proposes the best solution to whatever the engineering problem is that the authorities want to be solved. Given that the engineers employed by your firm are about as competent as the engineers employed by the other firm, which firm comes up with the best plan typically depends on the accuracy of the information on which they could base their proposals. Note that, under these circumstances, being able to base one’s proposal on information that is just slightly more accurate than the information available to the competition is enough to get the contract. By contrast, if the available information is less accurate, it is immaterial whether it is only slightly less accurate or much less accurate: one will not get the contract either way; in this case, a miss is as good as a mile. Again supposing the results of our simulations to have some general validity, they suggest that if you and your team update via IBE while the competition updates via Bayes’ rule, then you are likely to get the vast majority of the contracts. Once in a while (very rarely) a contract will go to the other firm. When it does, the information on which your proposal was based was probably much more inaccurate than the information on which the rival plan was based. However, this has no financial consequences beyond the fact that you miss out on the contract, which would also have happened if the information available to you had only been slightly less accurate than the information available to the other firm.
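The contrast between the two payoff structures can be made concrete with a small sketch. The penalties, the contract value, and the linear cost are all invented for illustration; in particular, the linear loss is only a stand-in for a setting, like the betting scenario, in which every unit of inaccuracy costs the same.

```python
def winner_take_all(own, rival, contract_value=1.0):
    """Total payoff when each contract goes to the firm with the lower penalty."""
    return sum(contract_value for o, r in zip(own, rival) if o < r)

def linear_loss(own, cost_per_unit=1.0):
    """Total loss when each unit of inaccuracy is equally costly."""
    return cost_per_unit * sum(own)

# Made-up inaccuracy penalties over twenty contracts.
ibe_firm = [0.10] * 19 + [2.50]    # slightly more accurate, one bad miss
bayes_firm = [0.12] * 19 + [0.30]  # slightly less accurate, never far off

print(winner_take_all(ibe_firm, bayes_firm))  # 19.0
print(winner_take_all(bayes_firm, ibe_firm))  # 1.0
print(linear_loss(ibe_firm) > linear_loss(bayes_firm))  # True
```

Under the winner-take-all payoff the size of the one bad miss is irrelevant, whereas under the linear loss it dominates the total.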

Thus, there are circumstances under which it is clearly better to update via IBE, practically speaking, given that it greatly increases the chance of ending up with a more accurate belief system than one would have ended up with had one updated via Bayes’ rule. Naturally, this is only one sense in which one can aim to minimise one’s inaccuracy, and the foregoing considerations in fact suggest that, from a pragmatic viewpoint, there may be no one best update rule. Depending on the circumstances and on what exactly one’s interests are, Bayes’ rule may serve one’s interests best, or IBE may do so, or, perhaps, sometimes yet another rule may do so. In epistemology, contextual approaches have been much in the limelight lately. I am not aware of anyone suggesting that which update rule to go by may have no context-independent answer. The suggestion is worth exploring, I think, but this must wait for another occasion.

III. CONCLUSION

We have considered the (currently) main arguments for the claim that Bayes’ rule is the only rational update rule. Against the dynamic Dutch book argument, it was argued that if updating via Bayes’ rule has certain practical advantages in comparison with other rules, a version of IBE may have different, possibly more important, practical advantages in comparison with Bayes’ rule. Equally, against the inaccuracy-minimisation approach, it was argued that if Bayes’ rule minimises inaccuracy in one or in some senses, a version of IBE may minimise inaccuracy in another sense or senses, where for all anyone has shown, no one of these senses can be said to be privileged in the sense of capturing the epistemic goal. Consequently, neither the dynamic Dutch book approach nor the inaccuracy-minimisation approach succeeds in challenging the rationality of IBE-updating.25

University of Groningen

25 I am grateful to Jake Chandler, Sylvia Wenmackers, and two anonymous referees for helpful comments. An earlier version was presented at the Institut Jean–Nicod (Paris). I thank the audience for stimulating questions.
