INFERENCES AND EXPLANATIONS AT THE K/T BOUNDARY...AND BEYOND
David Waldner
Department of Politics, University of Virginia
[email protected]
October 13, 2003

Draft version of a chapter written for the volume Theory and Evidence, edited by Ned Lebow and Mark Lichbach. Comments welcome.
Introduction: From Inferences to Explanations
After a long period of dormancy in the post-Kuhnian era, arguments for methodological
unity are again attracting attention in the social sciences. The contemporary version of the claim
that all valid scientific knowledge is based on a core set of regulatory principles has been stated
most clearly in the volume Designing Social Inquiry (henceforth DSI) written by the Harvard
political scientists Gary King, Robert Keohane, and Sidney Verba.1 While philosophers authored
most 20th-century briefs for or against methodological unity, the authors of DSI are practicing
empirical social scientists who evince little patience for philosophical investigations. Their work
builds on philosophical precedents to argue that there is a first-order scientific method--test
theories by their observable implications is their version of what philosophers call the
‘hypothetico-deductive method’--but it moves quickly to second-order methodological
considerations distilled from statistical reasoning and techniques. It is not sufficient, they imply,
to follow Karl Popper’s lead and propose only falsifiable hypotheses: to count as warranted
knowledge--to generate valid inferences, in the authors’ terminology--social scientists must
evaluate hypotheses by avidly adhering to a strict set of standards and regulations. The provocative conclusion that they draw is that qualitative research can and should emulate the logic of quantitative research: indeed, only those methods yield valid inferences, which are the central element of science.
In this essay, I return to the philosophical roots of these debates to demonstrate the
inadequacy of the new methodological unity. At the core of my argument is the distinction
between confirmation and explanation. DSI implicitly equates these two components of scholarly
inquiry. At first glance, this presumed equivalence seems either innocuous or correct: we propose
hypotheses in order to explain, after all, and so confirming those hypotheses seems to complete
the explanatory enterprise. Even if explanation entails something beyond the confirmation of
hypotheses, that something extra does not seem to contain much surplus value: this is why, I
suppose, statistics textbooks so often slight the concept of explanation.2
Confirmation differs significantly from explanation. We confirm hypotheses via contested
appraisal of their evidentiary grounds and theoretical logics, an operation the philosopher Richard
1Designing Social Inquiry: Scientific Inference in Qualitative Research (Princeton University Press, 1994).
2Christopher Achen, to give a prominent example, argues that “social scientists neither have nor want correct, stable functional forms for their explanations…Social theories rarely say more than that, ceteris paribus, certain variables are related…A great many other factors may also be influential; they are outside the theory and must be controlled somehow.” See his Interpreting and Using Regression (Newbury Park, CA: 1982), 16.
Miller calls “fair causal comparison of a hypothesis with its current rivals.”3 To say that a
hypothesis has been confirmed is to claim that it has weathered sufficient scrutiny relative to the
current state of theorizing and data gathering that belief in its approximate truth is more
reasonable than disbelief but is also subject to revision in the face of future data gathering or
theorizing. To inquire into confirmation is thus to ask when a body of data confirms a hypothesis.
We explain, on the other hand, by using these confirmed hypotheses in prescribed ways. To
inquire into explanation is to ask when a hypothesis adequately explains a phenomenon.
DSI provides a framework for making valid inferences. But while all explanations are
based on inferences, not all inferences constitute explanations. Not all explanatory inferences,
furthermore, work as good explanations. Explanation requires something more than valid
inferences. That something more is causal mechanisms. To prefigure a claim made later in the
paper, we explain by identifying chains of causal mechanisms that were, under the specific
circumstances, sufficient to produce the outcome. To not identify the relevant mechanisms is to
not explain; to identify them only partially is to gesture at without completing the explanation.
Explanatory inferences must therefore contain causal mechanisms; many valid inferences
self-evidently do not do so. Rules for establishing valid inferences that do not explicitly recognize
this distinction do not exhaust the explanatory enterprise. One of the two main defects of DSI is
that it fails to adequately grasp this point. Its framework can therefore at best be considered
necessary but not sufficient for achieving explanatory goodness.
Were they to concede this omission, the authors of DSI might still claim that their
framework is necessary for making valid inferences. I contest this claim as well. Causal
mechanisms discharge a dual function: they are the distinguishing characteristic of explanations,
but they can also be used to enhance or impeach the credibility of hypotheses as well. They play,
in other words, a crucial role in establishing inferential goodness. DSI concedes this point, but
without acknowledging the diverse means by which causal mechanisms contribute to
(dis)confirmation. Causal mechanisms exercise veto power over hypotheses, rendering
implausible hypotheses that otherwise evince the features of inferential goodness, and rendering
more credible those hypotheses accompanied by plausible mechanisms. Put differently, causal
mechanisms promote inferential goodness via theory, not via research design. They adjudicate
contests between multiple hypotheses that are otherwise equally consistent with the data. I
3See his Fact and Method: Explanation, Confirmation, and Reality in the Natural and Social Sciences (Princeton, 1987), esp. chapter 4.
demonstrate below that thinking about confirmation via causal mechanisms vastly expands our
repertoire for making valid inferences beyond what DSI explicitly sanctions. Indeed, under
specific conditions delineated below, fair causal comparison allows us to discount or even
disregard procedures and rules central to the case championed by DSI. Far from resulting in
inferential errors, these heterodox strategies support major scientific achievements, as we shall see
below. The research methods advocated by DSI, therefore, are neither necessary nor sufficient
for achieving explanatory adequacy. This conclusion vindicates methodological pluralism: social
science inquiry is not methodologically monochromatic.
Is it fair to argue that DSI equates the confirmation of hypotheses (making valid claims
about causal relationships inferred from the data) with explanation? DSI, after all, clearly states
that “the goal is inference.” In making central the transition from “immediate data to something
broader that is not directly observed,” the authors echo Karl Popper, who once described science as piercing the veil of appearances:

There is a reality behind the world as it appears to us, possibly a many-layered reality, of which the appearances are the outermost layers. What the scientist does is boldly to guess, daringly to conjecture, what these inner realities are like...But there is another, a special kind of boldness--the boldness of predicting aspects of the world of appearance which have so far been overlooked but which it must possess if the conjectured reality is (more or less) right, if the explanatory hypotheses are (approximately) true.4
Note how Popper equates making inferences about unknowns and unobservables with explanatory
hypotheses. DSI gives many indications that they are following in his footsteps and equating the
confirmation of hypotheses (in their language, verifying the validity of causal inferences) and
explanation. Thus, not long after proclaiming that “the goal is inference”, they insist that
“explanation--connecting causes and effects--is the ultimate goal...” (34). This substitution of the
language of explanation for that of inference occurs so regularly that Henry Brady writes that “It
is not exactly clear how “explanation” fits into KKV’s categories of descriptive and causal
inference, but one reasonable interpretation is that they consider explanation to be identical with
causal inference.”5 Brady attributes the reticence of DSI on this important matter to their reliance
on statistical thinking, for “The statistics literature [in contrast to the philosophical literature
which intimately links causality and explanation] is exceptional in defining causality without
4In his “The Problem of Demarcation,” in David Miller, ed., Popper Selections (Princeton University Press, 1985), 122.
5Henry Brady, “Doing Good and Doing Better: Symposium on Designing Social Inquiry, Part 2,” The Political Methodologist 6 (Spring, 1995), 13 at footnote 6.
discussing explanation.”6 However, before fully conceding this point, it seems fair to make two
observations: first, DSI does make a definitional distinction between inferences and explanations:
the former involves moving from observables to non-observables, the latter involves “connecting
causes and effects.” (34) Second, buried innocuously in a footnote is the claim that “At its core,
real explanation is always based on causal inferences.” (75, at footnote 1) But these possible
ways of distinguishing inferences from explanations are never explored, even when the context
calls for that discussion.
It may be objected that I am holding DSI to unfair standards by using the criteria of
necessary and sufficient conditions. Yet these are the only standards that are relevant to the claim
of methodological unity, which is precisely why this project has fared so poorly over the last
century.7 Logical positivists used a transcendental principle of verification to distinguish the
meaningful statements of science from the meaningless statements of non-science. Statements
such as “all swans are white” could be verified by defining swans and observing their color;
statements such as “the absolute is eternal” and other expressions of philosophical idealism, on the
other hand, could not be so verified and were thus meaningless. Insofar as it ruled out various
forms of philosophical idealism then prevalent in European philosophy, logical positivism could
claim to have identified a necessary condition for science, but certainly not a sufficient condition,
for obviously false statements such as “the earth is flat” were clearly verifiable and thus had to
count as science.
In response, Karl Popper proposed a new solution to what he called “the problem of
demarcation.” Popper’s solution was to contrast empirical science to “pre-scientific myths and
6“Doing Good and Doing Better,” 14. Brady also criticizes DSI for equating causal explanations with all types of explanations, some of which are obviously non-causal. This critique was originally levied by Michael Scriven against Carl Hempel, who responded that this was analogous to objecting to a definition of a mathematical proof for its inability to account for the use of the word proof in the phrase “86-proof Scotch.”
7 Building on the preexisting distinction between knowledge (episteme) and opinion (doxa), Aristotle established the view that science was distinguished by the infallibility of its findings, a result of reasoning deductively from first principles. The idea that science conferred apodictic certainty stabilized faith in methodological unity that continued into the first centuries of the scientific revolution. Key figures such as Galileo and Newton identified with Aristotle’s claims that scientific conclusions were demonstrative and thus incorrigible, even as they omitted Aristotle’s concern for explicitly causal demonstration. Methodological unity thus persisted into the nineteenth century when, in part under the influence of the probabilistic revolution, philosophers and practicing scientists adopted a fallibilistic perspective: scientific theories could in fact be in error and theories were only relatively superior to their rivals. At this point, methodological pluralism became more prominent, as scientists forged new methodological tools to replace deductive reasoning and began to embrace, with varying degrees of enthusiasm, inductive reasoning. See Larry Laudan, Beyond Positivism and Relativism: Theory, Method, and Evidence (Westview Press, 1996): 210-15. DSI retains as a core tenet the premise that all of our beliefs can be revised; uncertainty cannot be avoided. There is therefore no return to Aristotle.
metaphysics,” whose claims were not strictly meaningless but rather non-empirical. The defining
characteristics of science were two-fold and transcendental: bold conjectures and decisive
refutations. Conjectures are imaginative leaps from outward appearances to inner realities; they
are bold when they “stick their necks out,” when they take great risks of being proved wrong.
Science is, Popper then counter-intuitively claimed, not the search for certain knowledge, but
rather for certain refutation.8
Treating Popper’s demarcation criterion as a necessary condition for science created some
nagging problems, for it implied that many recognized scientific achievements did not in fact
constitute science. The Copernican revolution was one example of this revalorization of past
achievements: although Copernicus’ decentering of the earth clashed boldly with accepted belief, it did not make any new prediction that could be falsified by crucial experiments and thus, as Popper frankly acknowledged, it is by his terminology “unscientific or metaphysical.”9 A second
problem is that, like the logical positivists, Popper too must allow that obviously false claims are
in fact science: if they have been falsified, then they must be falsifiable, and so they fall on the
science side of his demarcation principle.10 Falsifiability also cannot be a sufficient but not necessary condition for science, for then astrology might fail the falsifiability criterion but gain admission on other criteria. Most troubling, however, is that while vulnerability to refutation
may indeed be necessary to achieve scientific status, what is the status of propositions that are not
falsified?11 If falsifiability is a necessary but not a sufficient condition, then we would have to
conclude that physics might be a science for it passes the necessity test, but we would not be sure
that it was indeed a science. The only relevant standards for assessing meta-analytic frameworks
8For the formal analysis, see his The Logic of Scientific Discovery (Routledge, 1992 [1959]). For further commentary, see his Conjectures and Refutations: The Growth of Scientific Knowledge (Basic Books, 1963).
9More accurately, Popper allows that Copernicus made some minor predictions and to this extent, his work is scientific.
10Unlike the logical positivists, Popper claims that Marxism was scientific until its adherents dogmatically ignored evidence that it had indeed been falsified. But in that sense, pre-1917 Marxism remains a science.
11Richard Feynman put it best: if the theory disagrees with the empirical evidence, “it is wrong. In that simple statement is the key to science.” But surely there must be at least two keys to science, one that blocks entry to many propositions, but one that permits entry to others.
is that they be both necessary and sufficient for achieving scientific status and thus warranted
belief.12
Given these problems, it is perhaps not surprising that Post-Popperian philosophers have
more frequently championed methodological pluralism. Most famously, Thomas Kuhn
distinguished not between science and non-science but between normal and crisis science. The
former is governed not by transcendental, unified rules but by field- and time-specific
achievements, or what he called “paradigms,” while the latter is governed, in part, by extra-
scientific considerations.13 More infamously, Paul Feyerabend argued for methodological
anarchy. Feyerabend was not, contrary to the many claims of his critics, anti-science. What he
objected to was the “narrow-minded extension of the latest scientific fashion to all areas of human
endeavour--in short what I object to is a rationalistic interpretation and defense of science.”14
Feyerabend could thus argue that Galileo’s method “worked” in kinematics and in other fields as
well, but simultaneously and consistently argue that “it does not follow that it should be
universally applied.”15
DSI confidently rejects these pluralistic impulses. From the opening sentence of the
preface, the book unambiguously claims that there is a unified methodology that confers the status
of legitimate knowledge to propositions tested according to its framework. The core of DSI is a
neat restatement of the hypothetico-deductive method, one which the authors hope will produce an
ideal-typical scholar who “uses theory to generate observable implications, then systematically
applies publicly known procedures to infer from evidence whether what the theory implied is
correct.”16 But the authors are unsatisfied by this general disposition towards scientific inquiry; it
may be necessary to produce sound results, but it is not sufficient for that goal. Popper and his
followers spoke about a general intellectual temperament; DSI speaks about concrete research
steps. Consequently, their criteria are not a restatement of the hypothetico-deductive method, for
that method can be used poorly to derive patently unsound conclusions. Rather, they clearly state
12Laudan, Beyond Positivism and Relativism , 215-22.
13Thomas Kuhn, The Structure of Scientific Revolutions 2nd ed. (University of Chicago Press, 1970).
14Paul Feyerabend, Against Method 3rd ed. (London: Verso, 1993 [1975]), 122.
15Against Method, 123.
16Gary King, Robert O. Keohane, and Sidney Verba, “The Importance of Research Design in Political Science,” American Political Science Review 89 (June 1995), 476.
what they believe to be the proper usage of the hypothetico-deductive method. Here is where
DSI generates considerable controversy, for their claim is that the specific rules they advance for
using the H-D method are those “rules that are sometimes more clearly stated in the style of
quantitative research.” (6). These rules pertain to, inter alia, avoiding selection bias when
choosing what to observe, correcting for multicollinearity, diagnosing and treating endogeneity,
and increasing the number of observations to avoid indeterminate research designs. These rules,
moreover, are in no way specific to statistical analysis; rather the authors claim that these rules
transcend but are best embodied in statistical analysis, and they are therefore available for
qualitative research as well. Note finally that by claiming that these rules have been best
articulated by statistical researchers, DSI implies that it is this second-order collection of rules and
not the H-D method itself that constitutes the scientific method and which thus distinguishes
scientific inquiry from casual observation.
The authors of DSI should be commended for moving so far beyond earlier efforts to
establish methodological unity. And they should be held accountable to strict standards, not only
because those strict standards are logically implicated by their enterprise, but because they give
good presumptive reasons to believe they will meet those standards. This essay argues, however,
that their efforts ultimately fail.
In the spirit of methodological pluralism that animates this essay, I make this argument in
two ways: through conceptual and logical analysis of the evaluative criteria for confirmation and
explanation and through a case study of scientific progress, one which has been erroneously
recruited to the cause of methodological unity. I begin with the case study, followed by sections
on explanation, inference, and explanatory adequacy.
Inferences and Explanations on the K/T Boundary
An obvious retort to the claim that qualitative analysis should follow the lead of
quantitative research is that the former focuses on highly complex and unique events. The authors
of DSI meet this challenge squarely and expertly, arguing first that complexity is a function of the
analytic apparatus brought to bear on an event, and second that even “unambiguously unique
events” can be studied using the scientific methods they champion. To support this important
claim, they briefly consider one such unique event, the extinction of the dinosaurs.
According to DSI, to study a unique event scientifically is to use the hypothetico-
deductive method (H-D method), which is composed of one or more hypotheses--statements
whose truth value is to be evaluated in terms of their consequences--one or more statements of
initial conditions, and one or more observable predictions, or states of the world which can be
deductively implied by the conjoined hypotheses and initial conditions and which must therefore be observed if the theory is true. These observations are thus the test of the theory. This is
precisely, according to DSI, the method by which the dinosaur extinction was studied. The
authors neatly assemble this position as follows:

One hypothesis to account for dinosaur extinction...posits a cosmic collision: a meteorite crashed into the earth at about 72,000 kilometers an hour, creating a blast greater than that from a full-scale nuclear war. If this hypothesis is correct, it would have the observable implication that iridium (an element common in meteorites but rare on earth) should be found in the particular layer of the earth’s crust that corresponds to sediment laid down sixty-five million years ago; indeed, the discovery of iridium at predicted layers in the earth has been taken as partial confirming evidence for the theory. (11)
It is true that the iridium anomaly provides powerful support for the meteorite (bolide-
impact) hypothesis; and it is true that the team of Berkeley scientists responsible for this
hypothesis treated the iridium anomaly as the observable implication of something that could not
be directly observed. But the scientific study of the dinosaur extinction does not resemble the
summary contained in DSI.17 Rather, as we shall see, researchers reasoned backwards from the
iridium anomaly to its cause; and they focused not on inferential goodness but rather on causal
mechanisms. It is these two facts that set the stage for the philosophical discussion to follow.
Causal mechanisms loomed so large in the study of the dinosaur extinction because a key
member of the Berkeley team, the geologist Walter Alvarez, stumbled on the iridium anomaly as a
by-product of his work on plate tectonics, a theory that gained scientific credibility only in the
1960s.18 The theory of continental drift had originally been proposed in the early twentieth century by the German meteorologist Alfred Wegener, who made an inference from continental morphology that all of
the continents must once have been joined; from this inference, he made the second inference that
the continents must be drifting apart. This conjecture boldly challenged the prevailing view that
continental position was fixed. The scientific community rejected Wegener’s hypothesis because
he could not propose a plausible causal mechanism explaining how the continents could move.
Wegener mistakenly claimed that the drifting continents plowed through a solid earth as a ship
plows through the ocean, a view that was correctly rejected by physicists as being physically
impossible. The theory was accepted in the 1960s, in part due to new observations, but more
17My account is based entirely on materials that became available only after the publication of DSI.
18The main source for what follows is Walter Alvarez, T. rex and the Crater of Doom (Vintage Books, 1997).
importantly because a new causal mechanism was adduced: the continents rest on tectonic plates
which are carried along in convection currents generated by the earth’s internal thermodynamic
processes. It was the absence of a plausible causal mechanism and not a fallacious inference
induced by a faulty research design that prompted the initial rejection of the theory; and it was the
subsequent depiction of a plausible causal mechanism that led to the theory’s later acceptance,
not a research-design based set of observational inferences.19
Alvarez was a paleomagnetist who specialized in the sub-continental “microplates” of the
Mediterranean, which took him to the medieval city of Gubbio, north of Rome in the Apennine
Mountains. In a canyon just outside the city is an outcrop of pink limestone called the Scaglia rossa, whose exposed face spans the Cretaceous period, running into the more recent Tertiary
period. The boundary between these two periods is called the K/T boundary.20 For his research,
Alvarez sampled rocks down through the Cretaceous and up through the Tertiary, crossing the
K/T boundary. He dated his rock samples in part by working with a specialist in foraminifera--
“forams” for short, single-celled marine organisms whose microfossils can be identified and dated
precisely. Forams were plentiful below the boundary, scarce above it; upon learning that the
dinosaur extinction basically coincided with the foram extinction, Alvarez decided to turn his
attention to explaining the mass extinction at the K/T boundary.
The meteorite hypothesis had long been proposed by this time, but Alvarez did not analyze
iridium deposits to test it. Rather, he sought to discover how long the K/T boundary lasted,
collecting data to adjudicate between two contending paradigms of geologic change: gradualism
and catastrophism.21 Alvarez’s research did not distinguish between competing catastrophic
hypotheses, of which there were many: rather he devised a test of whether the dinosaur extinction
had been relatively abrupt or gradual. Gubbio limestone was composed of 95% calcium
carbonate (composed, in turn, overwhelmingly of the fossilized remains of forams) and 5% clay.
19Timothy McKeown has properly placed the discovery of causal mechanisms at the center of his statement of methodological pluralism in which identifying causal processes replaces the empirical tests demanded by the H -D method. See his extremely valuable essay, “Case Studies and the Statistical Worldview: Review of King, Keohane, and Verba’s Designing Social Inquiry: Scientific Inference in Qualitative Methods ,” International Organization 53 (Winter 1999): 161-90, esp. 185-86.
20Cretaceous derives from creta, the Latin word for chalk (German Kreide), because in this last third of the Mesozoic era chalk was widely deposited in shallow seas. The letter K, from Kreide, is substituted for the C of Cretaceous to distinguish the Cretaceous from the earlier Cambrian period.
21The great 19th-century geologist Charles Lyell attacked the biblically inspired paradigm of catastrophism and replaced it with gradualism, whose motto was natura non facit saltum: nature does not make jumps. Stephen Jay Gould provides a characteristically concise account in his [complete citation.
The K/T boundary, however, is a physical boundary of almost pure clay. So the logical question
to ask is how long it took to deposit those clay sediments. The two scenarios Alvarez posited
predicted different levels of iridium, 0.1 parts per billion for the relatively slow scenario (short-
term increase in clay deposits with constant level of fossilized foram deposits) and virtually none
for the relatively fast scenario (abrupt cessation of foram deposits with constant level of clay
deposits). Both scenarios assumed a constant rate of iridium accumulation.22
The results were astounding: nine parts per billion, roughly ninety times higher than the
expected amount if the rate of sedimentation had been relatively slow, on the scale of thousands
of years. Neither scenario was supported, in other words, by the data. Rather, the assumption of
a constant rate of iridium accumulation was upended. This finding, needless to say, raised new
questions without answering old ones, for now Alvarez had to figure out an explanation for “all
that iridium.” Numerous answers could be proposed for this question--a meteorite impact was
one possibility, but so were massive volcanic eruptions. Indeed, other hypotheses existed, some
of which were non-catastrophic, such as an encounter with a cloud of interstellar dust and gas.
Any credible explanation for the elevated levels of iridium, moreover, had to do double duty, to
also answer the question, “what caused the extinction of the dinosaurs?”
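The quantitative logic of the two scenarios can be put in back-of-the-envelope form (a sketch in Python using only the figures given in the surrounding text; the constant names are mine, not Alvarez’s):

```python
# Back-of-the-envelope version of Alvarez's iridium test, using the
# figures reported in the text (ppb = parts per billion).

PREDICTED_SLOW_PPB = 0.1   # slow-deposition scenario: measurable meteorite-dust iridium
PREDICTED_FAST_PPB = 0.0   # fast-deposition scenario: virtually no iridium
MEASURED_PPB = 9.0         # the anomalous Gubbio measurement

# How far the measurement exceeds the higher of the two predictions
ratio = MEASURED_PPB / PREDICTED_SLOW_PPB
print(f"measured iridium is {ratio:.0f}x the slow-scenario prediction")
```

Neither prediction comes anywhere near the measured value, which is why it was the shared assumption of a constant iridium accumulation rate, rather than either scenario, that had to give way.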
Thus, contrary to the report of DSI, the Berkeley team23 did not propose elevated levels of
iridium as a test of the meteorite hypothesis. A long process of serendipity and trial and error led
to the discovery of elevated iridium levels, a finding which itself demanded explanation.24 Rather
than using the iridium as a deductive test of an existing hypothesis, they used their findings and
worked backward from them.25 The H-D method was ancillary to this core methodology.26
22Iridium exists on earth in the same proportion that it exists on meteorites, but like other heavy elements, iridium has become concentrated in the earth’s core. Most iridium in the earth’s crust has thus been deposited by meteorites. Meteorite dust accumulates slowly: if there is a measurable accumulation, the rate of sedimentation must also have been relatively slow, suggesting that the rising proportion of clay at the K/T boundary was caused by an abrupt extinction of forams; if there were virtually no iridium accumulation, then powerful support would be given to the hypothesis that the K/T boundary was due to rising levels of clay deposition and not an abrupt extinction.
23The team consisted of Walter Alvarez, his father, the Nobel-prize winning physicist Luis Alvarez, Frank Asaro, a specialist in neutron-activation analysis, and Helen V. Michel, a specialist in plutonium chemistry.
24For the steps and missteps along the way to the discovery of the iridium anomaly, see T. rex and the Crater of Doom, 19-71.
25On the relationship between an ontology of causal mechanisms and reasoning via abduction or via inference to the best explanation, see Ian Shapiro and Alexander Wendt, “The Difference that Realism Makes: Social Science and the Politics of Consent,” Politics & Society 20 (June 1992): 197-223.
Rather than following the framework summarized by DSI, and this is a key point, they thought in
terms of causal mechanisms--or, in the context of this debate, killing mechanisms. For over a
year, Walter Alvarez recounts, the Berkeley team regularly returned to the impact hypothesis but
continuously rejected it, not because it was inconsistent with the evidence, for it was consistent,
but rather because they could not understand why an impact would cause worldwide extinction...

A supernova had seemed more reasonable because it would have bathed the entire Earth in lethal radiation, thus explaining the global character of the extinction. But a supernova was out, and impact seemed to provide no global killing mechanism. For over a year we had searching discussions that always ended in frustration, and I would lie awake at night thinking, “There has to be a connection between the extinction and the iridium. What can it possibly be?”27
Thus, it was not debate over research design but efforts to identify a plausible causal
mechanism that drove the Berkeley team. By late 1979, Luis Alvarez believed that he had found
the appropriate mechanism: a large impact would have created a global dust cloud that would
cause the collapse of the entire food chain, resulting in mass extinction. No research was
conducted in support of this hypothesis; rather, when initial calculations of the quantity of dust and its effects were approved by a Berkeley astronomer, Luis Alvarez exclaimed, “We’ve got the
answer.” Within weeks, the meteorite impact hypothesis was presented at a conference and
within a year, the seminal report appeared in the journal Science.28
Explanation
Contrary to what DSI claims, the extinction of the dinosaurs was studied scientifically but
not exclusively by the methods they advocate. Causal mechanisms, moreover, play a more
prominent role in the story I have just told than they play in the methodology advanced in DSI: their absence discredited theories on grounds other than their evidentiary warrant, while their presence powerfully supported theories whose evidentiary warrant was not superior to that of rivals.
The iridium anomaly, in other words, was consistent with more than one hypothesis: the Berkeley
26For example, as a test of the hypothesis that a supernova explosion deposited the iridium, they searched for the presence of plutonium-244; its complete absence falsified the supernova hypothesis. My claim is that the H-D method is not absolutely necessary to valid findings, not that it is superfluous to scientific inquiry.
27T. Rex and the Crater of Doom, 76, emphasis in original.
28Luis W. Alvarez, Walter Alvarez, Frank Asaro, and Helen V. Michel, “Extraterrestrial Cause for the Cretaceous-Tertiary Extinction,” Science 208 (6 June, 1980): 1095-1108.
team judged the meteorite hypothesis superior solely because it alone could be credibly linked to
the iridium anomaly and to the dinosaur extinction. Finally, the function of causal mechanisms is
not exhausted by their role in establishing the relative superiority of a given hypothesis. Causal
mechanisms are also of crucial importance to explanatory adequacy.
To understand the explanatory role of causal mechanisms, we must be more attentive to
the profound distinction between confirmation of a hypothesis and the use of that hypothesis in an
explanation. The H-D method is a powerful means for providing a hypothesis with evidentiary
support, a process usually called confirmation. It allows us to judge a body of evidence as
confirming a hypothesis. But to ask “when does a body of evidence confirm a hypothesis?” is a
distinct question from asking “when does a hypothesis adequately explain an outcome?” DSI is
primarily an answer to the first question, an answer that relies heavily on, but also goes beyond,
the Hypothetico-Deductive Method. The H-D method permits us to move from reports of direct
observation to statements about unobservables, for we have not directly observed the causal effect
in question. This is precisely what DSI intends with its discussion of inference as moving from
immediate data to “something broader that is not directly observed.” (8). Because the statements
we aim for are about unobservables, the problem of validity looms large, which is why the H-D
method is near-universally considered to be the scientific method.29 Hence, the clarion call of
DSI is to develop concrete and falsifiable theories with as many observable implications as possible.
But, to repeat, the H-D method is clearly understood in the philosophical literature to be a
method of confirmation of one or more hypotheses: those hypotheses do not automatically serve
as explanations or as good explanations. Some confusion on this score is inevitable due to the
structural similarities between the H-D method and the deductive-nomological model of
explanation, advanced by Carl Hempel, which was the leading model of explanation for several decades
before it was completely discredited some forty years ago. The H-D method deduces an
observational prediction from the conjunction of one or more hypotheses and one or more
statements of initial conditions. If the observational prediction is true, then the hypothesis has passed the
test: it has not been falsified and it has received some degree of inductive confirmation. The D-N
model of explanation, on the other hand, deduces an outcome to be explained from the
conjunction of one or more general laws and one or more statements of initial conditions. The
29See, for example, John Earman and Wesley C. Salmon, “The Confirmation of Scientific Hypotheses,” in Wesley C. Salmon, et. al., Introduction to the Philosophy of Science (New York: Prentice-Hall, Inc., 1992), 44.
outcome is explained by subsuming it under the general law, showing it to be a specific instance of
a more general phenomenon known to be true.
Here’s an example. We might test the hypothesis “All celestial bodies follow elliptical
orbits” by adding the initial condition “The earth is a celestial body” and then deducing the
observational prediction, “The earth follows an elliptical orbit.” Observing that the earth does have an elliptical orbit does not make the hypothesis true: the observation, having been deduced from the hypothesis, provides only inductive support for it. But
suppose we accept the hypothesis as true, conferring on it the status of general law. Now we can
explain why the earth has an elliptical orbit: all celestial bodies have elliptical orbits, the earth is a
celestial body, therefore, (we logically deduce), the earth must have an elliptical orbit.
Structural similarities aside, the difference is of acute importance: whereas the
D-N model uses a well-confirmed law to explain an observational statement, the H-D method uses
an observational statement to provide inductive confirmation of a hypothesis that might be used
in future D-N-type explanations. It thus might seem that only a short step separates the H-D
method of confirmation from the D-N model of explanation: this, I suppose, is one reason that
confirmation and explanation are so often conflated by those who accept the D-N model of
explanation as legitimate, a position I show below is unwarranted. But in fact that step dividing
the two is as large as the gap between a partially confirmed hypothesis and a well-confirmed
general law.
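The structural parallel, and the difference in direction, can be sketched schematically. This is an editor’s sketch of the two schemas as they appear in the philosophical literature: H stands for a hypothesis, C for a statement of initial conditions, O for an observational prediction, L for a well-confirmed general law, and E for the outcome to be explained.

```latex
% H-D confirmation: reason "upward" from a successful prediction.
% From H and C we deduce O; if O is then observed, H receives
% some inductive support -- it is confirmed, not proven.
\[
\frac{(H \wedge C) \vdash O \qquad O \text{ is observed}}
     {H \text{ receives inductive support}}
\]

% D-N explanation: reason "downward" from an accepted law.
% E is explained by deducing it from laws and initial conditions.
\[
\frac{L_1, \ldots, L_k \qquad C_1, \ldots, C_m}
     {\therefore\; E}
\]
```

The premises have the same logical shape; what differs is direction and status: the H-D method moves from an observation back up to a partially confirmed hypothesis, while the D-N model moves from an already well-confirmed law down to the outcome.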
Explanations are inferential, but not all inferences are explanations. One way to think
about this point is that symptoms are inferentially relevant but explanatorily irrelevant.
Cosmologists believe that the universe is expanding as distant galaxies recede at high velocities.
They infer this hypothesis from the Doppler effect; light from these distant galaxies shifts towards
the red end of the spectrum. But nobody believes that the red-shift of these galaxies’ light
explains why they are receding from us. The consensus explanation, rather, is that the “big bang”
that originated the universe sent its parts speeding off in different directions.
A second way to establish the distinction between explanations and inferences is the
problem of temporal asymmetry. From knowledge of initial conditions of the earth, sun, and the
moon, and using the general laws of celestial mechanics, we can predict--infer an unobserved
event--a future eclipse. We can even claim, with confidence, that the antecedent conditions and
general laws explain the eclipse. In this instance, the inference coincides with the explanation.
But consider a slightly modified scenario: from knowledge of the initial conditions of the earth,
sun, and moon, combined with the general laws of celestial mechanics, we can infer that an eclipse
occurred ten thousand years ago. But nobody would claim that the present positions of celestial
bodies explain a past eclipse, as this violates a basic law of causal explanation: causes cannot
follow their effects. Once again we see that while inferences and explanations may overlap, they
are not equivalent sets.
What, then, are the criteria of explanatory goodness? When does a set of hypotheses
explain an outcome? And why does the deductive-nomological model fail as a model of
explanation? These questions all have the same answer. The D-N model fails because it does not
demand incorporation of causal mechanisms: explanations contain causal mechanisms, and good
explanations contain the mechanisms needed to fully connect cause and effect.
The D-N model fails as a model of explanation because many arguments faithfully follow
its form but manifestly fail to explain. At minimum, therefore, the D-N model is not sufficient for
an explanation.30 We have already seen this point with reference to receding galaxies and
eclipses. The counterexamples to the D-N model--arguments that meet its formal qualifications
but perform miserably as explanations--are so numerous that many texts in the literature refer to
them by number or nickname. Consider two more such examples, both courtesy of Wesley
Salmon.31 Mumbling an ancient incantation, I pour salt into a beaker of water; the salt dissolves. I observe that every time someone mumbles the same incantation while pouring salt
into water, the salt dissolves. Note that this example fails miserably to confirm the hypothesis that
ritual incantations are causally related to salt dissolution in water, for any good research design
would notice quickly that salt dissolves in water in the absence of the hex. But no matter for the
D-N model, for the general law “Salt dissolves when poured into water and accompanied by ritual
incantations” is a true general law. Therefore, by the formal structure of the D-N model, the hex
explains the dissolution of the salt. Or, in Salmon’s final example, Mr. Jones explains his failure
to get pregnant by pointing to his taking contraceptives. Again, the explanation is absurd, but it
fits the logical structure of the D-N model, for it contains the general law “Men who take
contraceptives will not become pregnant.” The D-N model failed, and failed miserably, because it
tried to substitute a set of formal requirements for causal mechanisms. We know that water
causes salt to dissolve; we know that men cannot become impregnated; we know that both the
30I would also argue that it is not necessary to explanatory goodness, but this ancillary point need not detain us here.
31See his “Four Decades of Scientific Explanation,” in Philip Kitcher and Wesley C. Salmon, eds., Minnesota Studies in the Philosophy of Science, XIII: Scientific Explanation (University of Minnesota Press, 1989), 46-50.
hex and the contraceptives are explanatorily irrelevant. But the D-N model cannot
distinguish between logically sound arguments and causal explanations.
But why must explanations contain causal mechanisms? Why can their function not be restricted to making valid inferences? One answer, given in different forms
by Jon Elster and Charles Tilly, is that relations between independent and dependent variables are
so unstable and indeterminate that knowledge of causal mechanisms is essentially all that we can
know.32 But we need not make that ontological wager to agree that knowledge of causal
mechanisms is critical to our enterprise.33 We ask “Why?” because identifying causal mechanisms
gives us a different type of knowledge than we acquire from estimating causal effects, to use the
language of DSI.34 To seek lawlike generalizations--including those with stable probabilities--
grants us what Salmon calls “nomic expectability.” When we know that X is associated with Y at
probability p, we know only that to observe X is to expect to observe Y with the same probability.
And, as we have seen, that expectation is consistent with multiple formulations of the relationship
between X and Y.
To know causal mechanisms gives us a different type of knowledge, what Salmon calls the
ontic conception of explanatory import. Causal mechanisms tell us not only that something
occurs (with regularity p), but also why it occurs. This is knowledge of how the world works. In
other words, to know causal mechanisms is to take DSI seriously when it defines explanation as
connecting X and Y, not just observing their associations. Only when we know the causal
mechanisms can we claim to have explained a phenomenon.
This function of causal mechanisms is scarcely evident in DSI. Indeed, causal mechanisms
and the concept of causality itself were largely absent from early twentieth-century efforts to
establish the unity of science. To their credit, the authors of DSI explicitly adopt a more modern
notion of causal explanations. But they misidentify the role of causal mechanisms in providing
explanations, and they place far more emphasis on establishing empirical associations than on
32They reach this conclusion by very different routes, however. See Jon Elster, “A Plea for Mechanisms,” in Peter Hedström and Richard Swedberg, eds., Social Mechanisms: An Analytical Approach to Social Theory (Cambridge University Press, 1998), 45-73; and Charles Tilly, “Mechanisms in Political Processes,” Annual Review of Political Science 4 (2001). 33 For the argument that causal mechanisms integrate otherwise disparate knowledge, see David Dessler, “Beyond Correlations: Toward a Causal Theory of War,” International Studies Quarterly 35 (1991): 337-55.
34This section is based on Wesley C. Salmon, “Why Ask “Why,”” in his Causality and Explanation (Oxford University Press, 1998), 125-141.
identifying causal mechanisms and using them in explanations. They evince an allergy to causal
mechanisms in two ways.35 On the one hand, they insist on affording conceptual and
epistemological priority to causal effects over causal mechanisms. On the other hand, they express
considerable skepticism about the feasibility of a mechanistic approach to causality. By the end of
the book, causal mechanisms have been relegated to a distinctively auxiliary role, that of
increasing the number of observations and thus overcoming, in part, the problem of indeterminate
research designs. I expand on each of these points below.
Explanation, DSI avers, consists of “connecting causes and effects.” (34). One might
think that this definition would lead the authors to stress the notion of “connection” in their
discussion of causal explanations. Yet the authors expend considerable energy in chapter 3,
where their main discussion of causality takes place, to define causality in terms of “causal
effects.” A causal effect, they argue, is “the difference between the systematic component of
observations made when the explanatory variable takes one value and the systematic component
of comparable observations when the explanatory variable takes on another value.” (81-82). This
definition has some virtues, in that it manifests keen awareness of the distinction between random
and systematic causes and is clearly rooted in a counterfactual approach to causality. On the
other hand, it substitutes an operational definition--how do we measure causality--for a the
semantic definitions--What does it mean to say that X causes Y--that one typically finds in the
philosophical literature.
Indeed, they go to great lengths to downplay the role of causal mechanisms in scientific
inquiry. First, they insist that their definition of causal effects is “logically prior” to the
identification of causal mechanisms. This is because identifying causal mechanisms requires a
disciplined research design, which in turn requires estimating causal effects--in a sense, measuring
what is to be explained. (86) It is not clear why this makes causal effects “logically prior” to
causal mechanisms: if causal mechanisms exist, then they produce causal effects. Causal
mechanisms thus appear to be ontologically prior to causal effects, and the latter should then be
assigned a second-order, instrumental function. Moreover, as we have seen, the Berkeley team
identified a causal mechanism without a disciplined research design; that identification was a
theoretical enterprise, not an empirical one. Emphasizing causal effects need not and should not
entail discounting the importance of causal mechanisms.
35See also McKeown, “Case Studies and the Statistical World View,” 162-64.
A second way that DSI attempts to discount the importance of causal mechanisms is by
claiming that the requirement of identifying causal mechanisms is self-defeating, for if causal
mechanisms are any causal links between two variables, then the identification of any one causal mechanism creates two new pairs of linked variables, creating the need to identify two more causal mechanisms, leading to infinite regress and a combinatorial explosion of variables and
mechanisms.36 This objection, if true, would seem to be a brief for the total dismissal of concern
for mechanisms, a point that KKV adopt by implication. But the reasoning does not survive
scrutiny: by the same reasoning, after all, we would rediscover Zeno’s paradox and conclude that, because any object in motion must forever cover half of its remaining distance, motion is impossible. The real world does not work like this. Zeno managed to ignore time; KKV ignore the distinction between causal mechanisms and events. They treat everything that temporally mediates two events as a causal mechanism. Causal mechanisms are structures
and entities with the capacity to generate events; they are not equivalent to the events themselves.
Pragmatic considerations intervene, moreover: no reasonable person asks for an explanation that
contains each and every possible detail imaginable. But even if we adopt a more permissive
definition of causal mechanisms, it is not clear that we would always suffer the problem of infinite
regress: Steven Weinberg, for example, once gave a perfectly reasonable explanation for why
chalk is white that reached the sub-atomic level in only five steps.37
Following these objections, KKV drop the subject of causal mechanisms until the book’s
final pages, where the topic is reintroduced only in the context of increasing the number of
observable implications of a theory. “By providing more observations relevant to the implications
of a theory,” they write of the qualitative method of process tracing, “such a method can help to
overcome the dilemmas of small-n research and enable investigators and their readers to increase their confidence in the findings of social science.” (226-28) In other words, causal mechanisms are understood, and given a legitimate role in social science, only as handmaidens of inference:
qualitative researchers should identify and test causal mechanisms not because this is what
adequate explanations demand but because this is what adequate hypothesis testing demands in
36The point here seems to be to discredit the notion of causal mechanisms and downplay their role in explanations: otherwise, the objection of infinite regress would apply equally forcefully--in this case, not much--to DSI as well.
37Weinberg asks a sixth “why” question: why does the world consist of the fields of quarks, electrons, protons and so forth? This question, he admits, is unanswerable. Applying the implicit standards of DSI, quantum physicists explain nothing; if they do not hold that position, then their point about infinite regress simply does not convince. For Weinberg’s exposition, see his Dreams of a Final Theory: A Scientist’s Search for the Ultimate Laws of Nature (Vintage Books, 1992), 21-26.
the absence of the statistical manipulation of data. Once again, we find DSI conflating the
confirmation of hypotheses with the elaboration of explanations.
To be sure, the identification of causal mechanisms does play a role in the confirmation of
hypotheses: James Johnson assigns to causal mechanisms the definitional function of making
theory T “more credible in the sense that [the mechanism] renders the explanations that T
generates more fine-grained.”38 Johnson’s analytic point matches the conclusion we drew from
the example of dinosaur extinctions. But we need not agree that the function of credibility
enhancement exhausts the role of mechanisms; nor, to stick with the question of making
overarching theories more credible, need we subscribe to the claim of DSI that mechanisms make
theories more credible by increasing the number of observations and thus overcoming the small-N
problem, which is not how they were used in the confirmation of the meteorite hypothesis.
Consider: falling barometers are empirically associated with the arrival of storm systems;
but we immediately reject any claim that the relationship is causal in nature, not because of
research-design considerations or the results of a statistical study, but because we know that there
is no mechanism linking falling barometers to storms. Or, to use an example from political
science, after establishing a statistical association between political culture and the durability of
democratic institutions, Ronald Inglehart writes:

    It is conceivable that we have the causal arrow reversed. Perhaps many decades of living under democratic institutions produces greater life satisfaction. We don’t rule this factor out. Indeed, we think it does contribute to overall life satisfaction. But theoretical considerations suggest that the process mainly works the other way around. It seems more likely that a global sense of well-being would also shape one’s attitudes toward politics than that what is experienced in one relatively narrow aspect of life would determine one’s overall sense of satisfaction.39
Inglehart does not reason about causal processes in order to overcome a problem of indeterminate
research design: his statistical methods have presumably dispatched that problem. He thinks
about causal processes in order to better understand the nature of the empirical relationship that
he has already identified. We care about causal mechanisms in these and many other examples not
38James Johnson, “Conceptual Problems As Obstacles to Progress in Political Science,” Journal of Theoretical Politics 15 (2003), 94.
39See his “The Renaissance of Political Culture,” American Political Science Review 82 (December 1988), 1217. This is not to endorse his reasoning: citizens of the non-democracies included in his survey--life-satisfaction levels were taken from 1981--such as South Africa under apartheid and Hungary under communist rule, along with citizens of countries then only recently removed from military rule, such as Spain, Portugal, and Greece, might take exception to his characterization of politics as a “relatively narrow aspect of life.”
to guard against invalid inferences, but to distinguish between well-confirmed accidental
correlations and well-confirmed causal relationships. Thinking in terms of causal mechanisms is
thus not a matter of research design; rather, I argue below, it is a matter of thinking about causal
processes and their modes of propagation.
At this point, DSI might, with some frustration, respond merely by repeating their claim
that even if we upgrade the value of causal mechanisms, identifying causal mechanisms requires
causal inferences. To see why that strategy must fail, let us consider, now, the question of
inferential goodness.
Inferences
“In a world in which almost everything is influenced by many different factors,” Robin
Dunbar reminds us, “confounding variables are the bane of a scientist’s life.”40 Take an event, any
event: it is preceded by and coincident to an astonishingly large number of other events, all of
which are enveloped by a vast coterie of environmental characteristics, qualities, and conditions.
Any standing feature or any episode of discrete temporal change could, in principle, be the cause
of anything else. And as we attempt, however feebly, to measure covariations, we find many,
many things undergoing simultaneous change. The problem with qualitative analysis, from the
perspective of quantitative analysis, is the perceived absence of reliable means to control for these
confounding variables. The quantity of reasonable causes of a phenomenon may quickly
outnumber the cases being studied--too many variables, too few cases is the bumper-sticker sized
statement of this raw, existential condition more formally called the problem of indeterminate
research designs or the “small-n” problem.
DSI is certainly not the first book to argue that statistical analysis is better equipped than
case-study research to handle the problem of confounding variables and thus to issue in
determinate judgments and valid inferences. It differs from these earlier works in three ways: by
forcefully restating the value of the H-D method; by counseling case-study researchers to emulate
the strategies of statistical analysis and thus overcome their inherent methodological weaknesses;
and by articulating these strategies clearly and gesturing at how they can be employed non-quantitatively. There is no inherent reason, the authors argue, why qualitative researchers cannot
define their theories to maximize the number of observable implications which must be seen if the
theory is true; to select carefully their cases with full knowledge of the problems of selection bias;
40The Trouble with Science (Harvard University Press, 1995), 14.
to increase the number of observations, especially by “making many observations from few”; and
to avoid endogeneity, measurement error, and bias in the exclusion of relevant variables.
The Berkeley team was acutely aware of the need to seek observational implications of
their theories; it is precisely this element of their study that set it off from the dozens of earlier
speculative enterprises.41 Yet the Berkeley team spent little time worrying about problems of
indeterminate research designs even though the number of their observations was dwarfed by the
number of existing hypotheses.42 Indeed, the core methodological precepts of DSI played
virtually no role in confirming the hypothesis linking an extraterrestrial impact to the K/T
extinction. Recall from the case study above that only after the discovery of the iridium anomaly
did the Berkeley team begin to consider the possibility of an extraterrestrial event as the cause of
the mass extinction: that discovery was not a test of any specific hypothesis. The key moment in
their decision that the bolide impact hypothesis was vindicated was the discovery of a causal
mechanism with three properties: (1) it was consistent with the iridium anomaly; (2) it could be plausibly linked to the biological extinction; and (3) the causal mechanism linking the impact to the subsequent deaths was consistent with all known scientific laws and models. Reviewing the
process of research and reasoning that supported the impact hypothesis thus reveals that although
the H-D method in its various guises was frequently employed, it was absent at critical moments
of the research cycle, and thus, contingent on two ancillary issues considered immediately below,
it cannot be considered a necessary element of confirmation.
Note further how the Berkeley team explicitly eschewed many of the specific precepts of
DSI. Take, for example, the proposition that qualitative researchers should increase the number
of observations. Observing downstream implications of an independent variable increases the
number of available observations, but it does not contribute to overcoming the problem of
indeterminate research designs in any meaningful statistical sense (and, presumably, a clever
scholar could simultaneously increase the number of observations connected to rival hypotheses
as well). Recalibrating the ratio of variables to cases requires investigating other instances of the
phenomenon in question; in this case, we would expect the Berkeley team to examine other
instances of mass extinction and their relationship to extraterrestrial impacts. This research
41For a thorough review, see Michael J. Benton, “Scientific Methodologies in Collision: The History of the Study of the Extinction of the Dinosaurs,” Evolutionary Biology 24 (1990): 371-400.
42In the example of indeterminate research designs given in DSI, a researcher has seven causal variables and only three observations. Applying this to the Berkeley group, we would have dozens of causal variables and only one observation.
strategy is not ruled out by the uniqueness of the dinosaur extinction: indeed, no scientific
researcher even refers to this event as one of dinosaur extinction, for it involved the extinction of
over forty percent of all genera, making it one of five mass extinctions.43 This fact was explicitly
recognized by the Berkeley group, who began their historic publication by stating, “In the 570-million-year period for which abundant fossil remains are available, there have been five great biological crises, during which many groups of organisms died out.”44 Yet far from worrying
about the problem of indeterminate research design, the Berkeley group ignored these other four
instances of mass extinction and concentrated solely on explaining the K/T extinction.45 This
research strategy explicitly violates one of the most important lessons of DSI: avoid selection bias,
about which the authors warn in stark terms: “When observations are selected on the basis of a
particular value of the dependent variable, nothing whatsoever can be learned about the causes of
the dependent variable without taking into account other instances when the dependent variable
takes on other values.” (129). Not only did the Berkeley group omit study of other instances of
mass extinction or explicit study of non-extinction periods (during which time extraterrestrial
impacts were in fact quite common),46 but they also omitted study of the far more numerous
instances of sub-mass extinctions. As the paleontologist J. John Sepkoski has scrupulously
demonstrated, a histogram of all extinctions evinces a highly skewed distribution, one associated
with power laws, suggesting not only that there is no sharp discontinuity between small and large
extinctions, but also that the mass extinctions may in fact be random events without specific
causes.47 Finally, note that when prior to publication the Berkeley team looked for additional
observations supporting the iridium anomaly, they searched only for iridium anomalies at the K/T
43The best introduction is Stephen Schwartz, Extinction [complete citation].
44“Extraterrestrial Cause,”1095.
45They raised the possibility that extraterrestrial collisions caused all of the mass extinctions, but speculated that some of these collisions would have involved comets that were mostly ice, so that no Ir anomaly would exist. “Extraterrestrial Cause,” 1107.
46See, for example, K.A. Farley, “Geochemical evidence for a comet shower in the late Eocene,” Science 280 (May 22, 1998): 1250-54.
47Power laws inversely relate the magnitude of an event to its frequency: small earthquakes are far more common than powerful ones, and small numbers of species and genera will go extinct far more often than mass extinctions will occur. There are various ways to model this relationship so that mass extinctions do not have causes distinct from small-scale extinctions; if this were true, there would be no reason to look for a unique or rare cause of the K/T extinction. David Raup provides powerful evidence against this line of reasoning in his Extinction: Bad Genes or Bad Luck? (W.W. Norton and Company, 1991).
boundary itself, another instance of selection bias, albeit one whose logic is well-supported by the
research situation.48
The authors of DSI might make three objections to my account of the dinosaur extinction:

• Objection #1: The story I have told refers to “the irrational nature of discovery,” to the process by which theories are generated. The framework of DSI, on the other hand, refers to the evaluation of existing theories.49
• Response #1: The meteorite hypothesis predates the discovery of the iridium anomaly, and the Berkeley group repeatedly discussed and rejected the hypothesis until they came up with a plausible causal mechanism. With that mechanism in place, they claimed to have provided “direct physical evidence” for a “satisfactory explanation.”
• Objection #2: The meteorite hypothesis was only “partially” confirmed by the iridium anomaly. (11)
• Response #2: This objection raises the question of when confirmation occurs, however. While DSI treats the original research as exemplary by its own standards, its authors do not consider the hypothesis to be confirmed, writing (and again conflating confirmation with explanation) that
a hypothesis is not considered to be a reasonably certain explanation until it has been evaluated and passed a number of demanding tests. At minimum, its implications must be consistent with our knowledge of the external world; at best, it should predict what Imre Lakatos refers to as “new facts,” that is, those formerly unobserved. (12)
By these criteria, however, the hypothesis was indeed confirmed. The impact hypothesis was consistent with knowledge of the external world; moreover, it had predicted what Lakatos calls “novel, excess information,” predicting, for example, the presence of an impact crater as well as the existence of extraterrestrial events coincident to other mass extinctions. Lakatos, moreover, always recognized that a theory was to be evaluated relative to rival theories. Although the authors of DSI note the existence of a rival theory--that the K/T extinction was a product of massive volcanic eruptions--they do not reference this rival theory or the problem of theoretical rivals in their discussion of confirmation. Thus, by the abbreviated sketch of confirmation that DSI offers, the hypothesis was indeed confirmed.50
48A global distribution of the iridium anomaly is a necessary condition for there to have been a large bolide impact coincident to the K/T boundary. Selecting on the dependent variable is in fact obligatory under these circumstances. See Brian Skyrms, Choice & Chance: An Introduction to Inductive Logic, 3rd ed. (Wadsworth Publishing Company, 1986), 90-94.
49This defense of DSI from its critics is made most explicitly in their “Importance of Research Design,” 476.
50It is true, of course, that the Berkeley team did not consider the matter settled: they concluded their article with two potential tests of their hypothesis. First, their hypothesis could in the future be tested against the other four major instances of mass extinction; second, they could find the crater associated with the meteor impact (whose size they were able to forecast). Note, by the way, that there was no reason to expect to find that crater: the greater probability was that it would be an oceanic impact, and subduction would have eliminated almost all traces of the pre-Tertiary ocean floor.
• Objection #3: The meteorite hypothesis was not confirmed by the original report; it was confirmed by subsequent research (including the discovery of geological formations believed to be unique to bolide impacts and the impact crater itself) that hewed more closely to the strictures of DSI. The hypothesis was confirmed, in other words, by a retrospectively valid research design produced by an entire scientific community, not a single research team.51
• Response #3: To use subsequent research to render valid the original research goes well beyond the claim made by the authors of DSI, who claim only that a classic single-case study was in fact consistent with their model of good research design because its author was “contributing to a large scholarly literature. As such, he was not trying to estimate a causal effect from a single observation; nor was he selecting on his dependent variable.”52 But as we have seen, the Berkeley group was not contributing to an existing literature; they were trying to estimate a causal effect from a single observation, and they were selecting on the dependent variable. If retrospective and collective research designs validate research, then no research can ever be discredited without full knowledge of future ideas, a condition that Popper has shown to be logically incoherent.
All of these objections, whatever their individual merits, ignore the research strategy that
governed the reasoning of the Berkeley team. The Berkeley team may have neglected to follow
the rules of DSI, but that does not mean that they followed no rules or that the rules they followed
were illegitimate. They explicitly justified reasoning backwards from the iridium anomaly to the
meteorite hypothesis and accepting the latter as provisionally warranted belief prior to the
gradual accumulation of new supporting evidence. In their 1980 Science article, they claim

to present direct physical evidence for an unusual event at exactly the same time of the extinctions in the planktonic realm. None of the current hypotheses adequately accounts for this evidence, but we have developed a hypothesis that appears to offer a satisfactory explanation for nearly all the available paleontological and physical evidence.53
Put differently, the impact hypothesis counts as warranted belief consequent to its meeting
all of the following three conditions:
51“Importance of Research Design,” 477. Note that the subsequent research further confirms that an impact took place; little of it directly bears on the question of the extinction itself, a point we return to below.
52“Importance of Research Design,” 477. The authors are responding to points raised by a number of contributors to a symposium on DSI. While these contributors allow for a collective dimension to research designs, they all cast this quality as scholars “strategically choosing observations based upon knowledge of cases from parallel studies.” The unit of evaluation might thus be the community of scholars and not the individual researcher, a point first raised by Thomas Kuhn’s concept of a paradigm, which governed the beliefs and practices of a community of scholars. This claim is not akin to retrospectively judging a research design sound based on subsequent work, however. The quoted sentence is from David Laitin, “Disciplining Political Science,” American Political Science Review 89 (June 1995), 456.
53“Extraterrestrial Cause,” 1095.
• It accounts for an incredibly significant fact--the iridium anomaly--whose importance outweighs almost every other available piece of data;
• It accounts for that fact better than its rivals, such as the supernova hypothesis; and
• It accounts for the outcome in question by way of a credible causal mechanism.
Emerging from this criterial list are the outlines of an alternative methodology based explicitly
on the search for causal mechanisms. For Richard Miller, confirmation is a process of “fair causal
comparison.” According to Miller’s formal definition, “A hypothesis is confirmed just in case its
approximate truth, and the basic falsehood of its rivals, is entailed by the best causal account of
the history of data-gathering and theorizing out of which the data arose.” To put this point more
colloquially, to claim that my hypothesis has been confirmed is to state, “These are the facts. This
is how they are explained assuming the approximate truth of the favored hypothesis. This is why
they are not explained as well on the rival hypotheses which are the current competitors.”54
Because confirmation is based on a relationship to evidence and to rivals, it must always be tentative, contingent on the availability of data and the state of rivals. A confirmed
hypothesis, in other words, can be dislodged by new evidence or displaced by new rivals.
Acceptance of a hypothesis thus means only that “acceptance is taken to be more reasonable than
rejection, but suspended judgment is not excluded.”55
Miller’s account, however, does not explicitly tie confirmation to explanation: because the
latter is our ultimate goal, and because explanation requires causal mechanisms, we can add to his
account the condition that vindicated hypotheses must be connected to the outcomes they explain by
way of credible causal mechanisms. As the Berkeley team realized, making the search for causal
mechanisms an integral element of the process of confirmation--mobilizing, in other words, the
twin functions of causal mechanisms, as means for enhancing credibility of hypotheses and as
explanatory devices--vastly expands our repertoire for engaging in fair, causal comparison. In the
following list, to say that a theory is rejected is to render its acceptance less reasonable than a
competitor which does not suffer an analogous problem.

• A hypothesis can be rejected because its posited causal mechanism is considered
inconsistent with generally accepted principles and thus implausible. This was the case, to give just one example, with the rejection of the initial formulation of plate-tectonics theory. It is also the reason given for the rejection of functionalist models of social change.
• A hypothesis can be rejected because it does not logically imply the outcome attributed to it. The observed association between advanced levels of economic
54Miller, Fact and Method, 155, 163.
55Miller, Fact and Method, 158.
modernity and democracy might plausibly explain why democracies in wealthy countries survive without explaining why wealthy countries became democratic.56
• A hypothesis can be rejected because its posited causal mechanism is considered conceptually inadequate so that it provides no insight into “how things work.” This form of rejection is much stronger when the hypothesis in question is the latest version of a research programme that has long suffered this problem. James Johnson levels this charge against political-culture research, for example.57
• A hypothesis can be rejected because it is considered to lack causal depth: it may be considered definitionally sufficient for the outcome in question (candidate X won because more voters cast ballots for her); it may be considered part of the normal course of affairs (bridges may collapse while cars drive over them, but the cause is presumably a structural defect, not the cars that the bridge was designed to support); or it may be subsumed by a hypothesis that lies further back on a causal chain (the East Asian financial crisis of the late 1990s was triggered by a run on local currencies; it was caused by a syndrome of structural imbalances).58
• A hypothesis can be rejected because its emphasis on large-scale causes of large-scale effects is demonstrably invalid in circumstances in which small-scale causes can have large effects or in circumstances in which stochastic processes follow power laws, usually producing small effects but occasionally producing large effects whose magnitude is inversely related to their frequency.59
• A hypothesis can be rejected because its truth implies necessary observations that cannot be made by credible techniques. This was the means by which the Berkeley team rejected the otherwise plausible inference that the iridium anomaly was produced by a supernova explosion. The absence of long-term economic convergence on a global scale has similarly discredited many simple models of economic growth, while the presence of a handful of rapid developers has been held to discredit dependency theory.
• A hypothesis can be rejected, finally (but perhaps not conclusively), because it is shown to rest on an invalid inference stemming from a faulty research design.
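The power-law point in the list above can be made concrete with a short simulation (a sketch of my own, not drawn from the chapter’s sources; the distribution parameters are purely illustrative). Magnitudes drawn from a power law are overwhelmingly small, yet rare, enormous events do occur--frequency falls as magnitude rises:

```python
import random

random.seed(42)

# Sample event magnitudes from a power law, P(X > x) = (x / X_MIN) ** -ALPHA,
# using inverse-transform sampling: x = X_MIN * (1 - u) ** (-1 / ALPHA).
ALPHA, X_MIN, N = 1.5, 1.0, 100_000
events = [X_MIN * (1 - random.random()) ** (-1 / ALPHA) for _ in range(N)]

small = sum(1 for x in events if x < 10)     # routine, small-magnitude events
huge = sum(1 for x in events if x >= 1000)   # rare, catastrophic-magnitude events

# Small events vastly outnumber huge ones, yet huge ones occur without any
# distinct cause -- the stochastic process alone produces both scales.
print(f"{small} small events, {huge} huge events out of {N}")
```

On this kind of model, a mass extinction needs no cause different in kind from routine background extinction, which is exactly the possibility Raup’s evidence (footnote 47) is marshaled against.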
As this list demonstrates, there is more than one way to engage in fair, causal comparison,
only some of which are covered by the regulatory framework of DSI. The list demarcates a core
56Adam Przeworski et al., Democracy and Development: Political Institutions and Well-Being in the World, 1950-1990 (Cambridge University Press, 2000).
57Johnson, “Conceptual Problems.”
58Particularly good on the pragmatics of explanation is John Gerring, Social Science Methodology: A Criterial Framework (Cambridge University Press, 2001), 90-99.
59For the argument that non-linear models of change discredit a diverse array of existing approaches to political science, see Alan Zuckerman, “Reformulating Explanatory Standards and Advancing Theory in Comparative Politics,” in Mark Irving Lichbach and Alan S. Zuckerman, eds., Comparative Politics: Rationality, Culture, and Structure (Cambridge University Press, 1997), 277-310. For reasons to be skeptical, see David Waldner, “Anti Anti-Determinism,” paper presented to the Annual Convention of the American Political Science Association, Boston, MA, September 2002, esp. sections 5 and 6.
working model, one that permits us to believe in some ideas because they appear, by current
standards and knowledge, superior to their rivals. It is thus reasonable to commit to them, even
tentatively and in full knowledge that superior alternatives might yet emerge. We reach this
judgment, moreover, in diverse ways. Statistical studies and qualitative studies striving to mimic
statistical exactitude are powerful members of our methodological ensemble, but they have
valuable accomplices whose contributions should be neither overlooked nor slighted. Those allies
are largely based on the consideration of causal mechanisms, and many of those considerations are
conceptual/theoretical, not empirical, in nature.
Within the social sciences, a growing number of scholars have considered analogous
methodological alternatives, loosely grouped together under the name “process-tracing.”
Sometimes lost in this important literature, however, is the distinction between confirming a
hypothesis against the evidence and confirming a hypothesis by eliminating rival hypotheses
against the evidence. Fair, causal comparison always demands both elements of inquiry.
Consider, for example, Jack Goldstone’s metaphor of the detective who

will draw on her experience with similar cases in making judgments about which factors pertain to this particular case (inductive insights); but the causal reasoning proposed to explain a crime or accident will be a linking of particular facts involved in this case with general principles regarding how opportunities, motivations, and circumstances conduce to particular action (deductive reasoning).60
The problem with this approach is that it relies too heavily on a confirmationist strategy of
finding evidence consistent with a hypothesis, and does not explicitly incorporate an
eliminationist strategy of rejecting alternative hypotheses.61 Changing the metaphor a bit,
scholars must simultaneously act as prosecutors and defense attorneys. Timothy
McKeown’s version of process tracing makes clear that detectives build cases in large part
by eliminating rival explanations:
An observation may be consistent with several different hypotheses about the identity of the killer and rules out few suspects. No one observation establishes the identity of the killer, but the detective’s background knowledge, in conjunction with a series of observations, provides the basis for judgments that generate or eliminate suspects...Rival theories are assessed and disposed of, generally by
60Jack A. Goldstone, “Methodological Issues in Comparative Macrosociology,” Comparative Social Research 16 (1997), 113.
61Goldstone explicitly rejects any role for Millian methods in his approach, even though these methods are best understood not as means to generate a theory but rather as means to falsify alternatives.
showing that they are not successful in accounting for all the observations. The suspect may attempt to argue that it is all a coincidence, but the detective knows that someone has to be the killer and the evidence against the suspect is so much stronger than the evidence against anybody else that one can conclude beyond a reasonable doubt that the suspect should be arrested.62
Fair, causal comparison, then, always involves the simultaneous confirmation of
one hypothesis, by showing it to be consistent with the data and the current state of
theorizing, and the elimination of rival hypotheses. Some of this work is done by
evaluating the fit between hypothesis and evidence; some of it is done by independent
evaluation of a theory’s causal mechanisms. As the list enumerated above indicates, there
are diverse specific techniques for engaging in fair, causal comparison. These techniques
mean that fair, causal comparison can distinguish valid from invalid process tracing.
Consider, as an illustration, an example of process tracing from a book that has
won wide acclaim for its methodological sophistication and its attention to causal
mechanisms, Robert Putnam’s Making Democracy Work.63 He uses process tracing to
explain why northern Italian citizens possess civic culture, a virtue lacking in southern
Italians that Putnam has argued is causally related to superior performance in northern
political institutions. By this account, ordinary citizens partaking of their regional civic
culture enact their own institutional fates. Thus in the nineteenth century, mutual aid
societies and other forms of voluntary organization proliferated in the north but were
starkly absent in the south, where, according to an historian quoted by Putnam, “The
peasants were in constant competition with each other for the best strips of land on the
latifondo, and for what meagre resources were available. Vertical relationships between
landlord and client, and obsequiousness to the landlord, were more important than fixed
solidarities.”64
Process tracing thus seems to confirm Putnam’s account: long-standing cultural
differences are associated with different patterns of behavior and different institutional
62“Case Studies and the Statistical Worldview,” 170-71. Substituting the metaphor of a crossword puzzler for the detective, Susan Haack makes a remarkably similar case in her essay “Puzzling out Science,” contained in her Manifesto of a Passionate Moderate (The University of Chicago Press, 1998), 95.
63Robert Putnam, Making Democracy Work: Civic Traditions in Modern Italy (Princeton University Press, 1993). DSI treats the book as a methodological exemplar of the possibilities of combining quantitative and qualitative research. More skeptical on this score is Sidney Tarrow, “Bridging the Quantitative-Qualitative Divide in Political Science,” American Political Science Review 89 (June 1995), 471.
64 Making Democracy Work, 143, quoting the historian Paul Ginsborg who is in turn citing an Italian scholar.
outcomes. Yet Putnam does not consider any alternative hypothesis, even as he provides
clear evidence that cultural values did not exist autonomous of other structures and that
both culture and institutions were induced by the broader socioeconomic context. Thus,
as aristocratic rule in the north was declining,

From 1504 until 1860, all of Italy south of the Papal States was ruled by the Hapsburgs and the Bourbons, who (as Anthony Pagden has recently described in detail) systematically destroyed horizontal ties of solidarity in order to maintain the primacy of vertical ties of dependence and exploitation.65
The use of power--coercive and otherwise--to prevent peasants from achieving social
solidarity that might be used to challenge upper-class hegemony did not end with
unification, for post-1860 “The southern feudal nobility...used private violence, as well as
their privileged access to state resources, to reinforce vertical relations of dominion and
personal dependency and to discourage horizontal solidarity.”66
Thus, Putnam’s historical sketch provides two very different images. In one
depiction, peasants mistrust one another and anxiously seek patronage from local elites,
thus creating and recreating relations of dependency inconsistent with and injurious to
civic culture: this is the genuine cultural interpretation that Putnam wishes to vindicate.
But in the second image, elites--who have no northern analogue--themselves create and
recreate these vertical ties and actively discourage horizontal solidarity among lower
classes. The question is, if southern peasants were culturally hostile to horizontal norms
of solidarity and engagement, why did elites so persistently feel the need to maintain
vertical ties and destroy alternative horizontal ones? Putnam’s own sources suggest this
alternative reading which Putnam takes no measures to reject.67 On the contrary, his own
65Making Democracy Work, 136.
66Making Democracy Work, 145. Putnam follows this material with the claim that southern peasants made recourse to patron-client relations, a rational decision given their wretched position. But that wretched position was forced upon them; it seems that it was power relations and not culture that created vertical relationships. Were culture autonomously generative of such relations, upper classes would not themselves need to make recourse to violence.
67One might respond that Putnam’s historical sources did not take the form of credible rival hypotheses: but these historical materials are fully consistent with theoretical accounts that should be considered rival hypotheses, most notably Antonio Gramsci, Selections from the Prison Notebooks (International Publishers, 1971) or James Scott, Domination and the Arts of Resistance: Hidden Transcripts (Yale University Press, 1990). For an exemplary attempt to adjudicate between Gramsci and Scott, which act as rival accounts to one another, see Susan Stokes, Cultures in Conflict: Social Movements and the State in Peru (University of California Press, 1995).
interpretation consistently highlights the independent agency of southern peasants while
completely obscuring the agency of southern landowners and their allies. In Putnam’s
account, it is culturally deprived peasants who willingly create their own relations of
exploitation.68
The differences between the work of the Berkeley team and Putnam’s research
team, with the first engaging in fair causal comparison while violating many norms of DSI and the second adhering to DSI while avoiding fair causal comparison, illustrate why fair,
causal comparison must render methodological pluralism more credible than
methodological unity: there are multiple ways to engage in fair, causal comparison, only
some of which are captured by the rules and regulations contained in DSI. The point is
not that research design considerations are dispensable: the point, rather, is that efforts to
confirm propositions and use them in explanations need not be based solely on the logic of
statistical inference. Statistical inferences, after all, are largely unconcerned with causal
mechanisms: consequently, the inferences that result cannot explain. Giving attention to
causal mechanisms not only permits us to distinguish explanatory from non-explanatory
inferences, but it also gives us diverse means to engage in fair causal comparison.
Some of the research designs that result will precede gathering data; at other times,
scholars will approach existing data with a disciplined plan to adjudicate debates. The
latter strategy is in no way inferior to the former: DSI is right to worry about the
confirmationist bias, but fair causal comparison is adequately equipped to manage these
concerns, as the critical reflections on Putnam’s process tracing indicate.
The basic rule is this: if existing data can be interpreted to vindicate a hypothesis
against its rivals in a process of fair, causal comparison, then no further data gathering is
necessary. If existing data cannot accomplish this task because the existing data is
consistent with multiple hypotheses, then new research is needed and that research must
be guided by a research design oriented explicitly toward the current contributors to the
theoretical controversy. What this means is that the appropriate research strategy is
always a function of the existing state of knowledge--the data that is available, the
controversies it was gathered to adjudicate, and the state of theoretical
68Making Democracy Work, 177-78.
contestation.69 Under this rule, the Berkeley team made a reasonable claim to have
explained the mass extinction at the K/T boundary. The unexpected discovery of the
iridium anomaly--a discovery which was not pursuant to the test of any specific theory and
which was not followed by a new, systematic research design--provided sufficient data to
vindicate the impact hypothesis because it was made at a time of an incredible proliferation
of explanations, none of which commanded any evidentiary warrant. The discovery itself
was sufficient to eliminate most rival explanations; a simple test led to the rejection of one
more potent rival; and the anomaly, when coupled with a plausible causal mechanism, was sufficient to
confirm, in the sense discussed above, the impact hypothesis.70
Note, finally, that fair, causal comparison requires careful attention to the history
of data gathering. Sometimes we have to think about how our evidence has been
obtained, for that history might sensitize us to biases in the evidence itself. One of the main
concerns of DSI is how we select our data; Stanley Lieberson makes the very different
point that we must learn to think about the causal processes producing our data.71
Theory-confirming or theory-informing observations, in other words, do not exist in a
theoretical vacuum. The impact hypothesis, for example, implies that mass extinction was
a simultaneous event, at least on a geologic scale. Fossil evidence supporting this claim
may be misleading, because sometimes erosion forms gaps in the geologic record, and
missing stratigraphic units give the appearance of sudden extinctions. But despite this
bias in favor of the hypothesis, the fossil record does not directly support the impact claim,
most centrally because the existing fossil record suggests that dinosaurs were already in a
steep demographic decline by the middle of the Cretaceous period, leaving few to be killed
by a meteor impact. There is good reason to believe, however, that the inference is a
faulty one caused by unavoidable sampling bias. Because our fossil record documents
only a small percentage of all fossils present in a rock formation (which in turn, given the
numerous hurdles to be overcome before remains become fossilized, is itself only a sliver
69The authors of DSI allow for this when they admit that a single case study can have enormous ramifications when it contributes to existing research and attendant debates. The problem is that they provide no logic for understanding this eventuality.
70That is to say, confirmation could be applied in 1980 and withdrawn subsequent to the future elaboration of a plausible rival hypothesis such as the volcano hypothesis.
71Stanley Lieberson, Making it Count: The Improvement of Social Research and Theory (University of California Press, 1985).
of the true population), the rarer a species, the more imperfect our sampling of it. Because we
are then less likely to find its fossils at a particular point like the K/T
boundary, we are more likely to conclude that the species died out much earlier. This is why
we are sure that plentiful foraminifera did largely go extinct at the K/T boundary, but the fossil
record for less numerous dinosaurs is far more ambiguous. This defense of the impact
hypothesis, known as the Signor-Lipps effect, receives support from some ingenious
sampling techniques involving species well-known to have gone extinct simultaneously.72
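The sampling logic just described can be sketched in a few lines of code (an illustrative simulation of my own, with made-up preservation rates, not figures from the paleontological literature). Two taxa vanish at exactly the same horizon; only the rate at which they leave recoverable fossils differs:

```python
import random

random.seed(0)

BOUNDARY = 1000           # true, simultaneous extinction horizon (the "K/T boundary")
STRATA = range(BOUNDARY)  # stratigraphic levels below the boundary

def last_appearance(preservation_rate):
    """Highest stratum at which the taxon is actually sampled as a fossil.

    Both taxa are alive at every stratum up to the boundary; only the
    probability of leaving a recoverable fossil differs.
    """
    found = [level for level in STRATA if random.random() < preservation_rate]
    return max(found) if found else None

forams = last_appearance(0.5)       # abundant taxon: fossils at half the strata
dinosaurs = last_appearance(0.005)  # rare taxon: fossils at 1 in 200 strata

# The abundant taxon's record reaches (nearly) to the boundary; the rare
# taxon's record typically stops well short of it, mimicking an earlier,
# gradual extinction even though both died out at the same instant.
print(f"last foram fossil at level {forams}, last dinosaur fossil at level {dinosaurs}")
```

The gap between a rare taxon’s last recorded fossil and its true extinction level is exactly the artifact that makes the dinosaur record look like a long pre-boundary decline.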
Inference and Explanation at the K/T Boundary...and Beyond

Analysis of causal mechanisms can lead to the confirmation of hypotheses, in part
by disconfirming rivals: this lesson has large implications for how we think about
confirmation for it suggests that the rules of DSI may be of central significance without
being necessary components of every act of confirmation. I have also argued that causal
mechanisms are necessary to adequate explanations: their importance, in other words,
goes beyond abetting inferential goodness. Now it might seem that in order to aid the
process of making valid inferences, causal mechanisms must be correctly identified; and
that correctly identified causal mechanisms constitute good explanations; and so the twin
role of mechanisms--confirmation and explanation--are in fact, contrary to what I have
argued, equivalent. This is not the case: correctly identified causal mechanisms constitute
adequate explanations only if they span the gap between cause and effect. Adequate
explanations, more often than not, require causal chains. To see why this is so, let us
briefly examine why the impact hypothesis can be a confirmed hypothesis but a poor
explanation.
That a large meteor struck the earth at the K/T boundary is beyond dispute. This
does not mean that the impact caused the mass extinction marking the end of the
Cretaceous period. To make that claim, we must rule out alternative hypotheses and
connect cause and effect in a chain of causal mechanisms. Without entering the full debate
between the impact hypothesis and its volcanic rival, let us concede the point to the impact
side and ignore the issue of rivals.73 How well does the impact hypothesis explain? This
72The Signor-Lipps effect is discussed lucidly in James Lawrence Powell, Night Comes to the Cretaceous: Dinosaur Extinction and the Transformation of Modern Geology (W.H. Freeman and Company, 1998), 130-41.
73For the full set of reasons to discount heavily the hypothesis that the K/T extinction was caused by huge volcanic outpourings on the Indian subcontinent, see Powell, Night Comes to the Cretaceous, 85-95.
question breaks down into sub-questions: What are the geological and biological
mechanisms that resulted in the mass extinction and by what other mechanisms were they
linked? And what explains the actual pattern of extinctions? Why, in other words, did
some genera become extinct while others did not?74
The original mechanism posited by the Berkeley research team was based on an
extrapolation from the explosion of the Krakatoa volcano in 1883. That explosion had
kicked up enough dust and ash to alter global atmospheric conditions for months; expand
the scale of those effects to the size of a catastrophic impact, they reasoned, and dust in the
air would kill plant life, leading to a collapse of the food chain and mass extinction.
Observations made in 1994 when Shoemaker-Levy 9 struck Jupiter support calculations
that, given its tremendous speed, the meteorite would have carried with it enormous
energy, far greater than that contained in the global supply of nuclear weapons, energy
that would have to be dissipated post-impact. The predicted effects include “shock waves,
tsunamis (tidal waves), acid rain, forest fires, darkness caused by atmospheric dust and
soot, and global heating or global cooling.”75 There is evidence for a variety of these
post-impact scenarios, including global wildfires, acid rain, and a decade-long “impact”
winter.76 But even book-length, enthusiastic defenses of the impact hypothesis devote
startlingly little space to fleshing out these scenarios, more typically concluding that “we
know that [the impact] must have had some combination of the effects described. What
we do not know is just how the many lethal possibilities would have interacted with each
other and with living organisms.”77 Indeed, uncertainty remains whether the extinction
was virtually instantaneous or extended over thousands of years.78
74We know that answering this last question requires moving beyond geologic and environmental mechanisms to investigate genera-specific biological factors. The two dinosaur groups suffered extinction of all twenty-two of their genera. Given an overall extinction rate of 43 percent of all genera, the probability that every genus of the dinosaurs would go extinct by chance alone is virtually nil. For the ingenious reasoning, see David Raup, Extinction: Bad Genes or Bad Luck? (W.W. Norton & Company, 1991), 88-105.
75Raup, Extinction: Bad Genes or Bad Luck?, 161.
76See the brief discussion in Powell, Night Comes to the Cretaceous, 176-79.
77Powell, Night Comes to the Cretaceous, 179.
78Early in the debate the Berkeley group acknowledged that extinctions might have been spread over as much as 10^4 to 10^5 years, claiming only that “a major impact would produce important environmental changes and that instantaneous extinction in all groups is not a necessary corollary of the impact theory.” See their “The End of the Cretaceous: Sharp Boundary or Global Transition?” Science 223 (16 March 1984), 1184.
With that last point in mind, we turn next to the variable biotic responses to the K/T
environmental disturbances. Given catastrophic environmental changes, we might expect
uniform rates of extinction across all genera, but this is not what is observed for this or
any of the five major mass extinctions. Thus, we have two specific questions concerning
biological mechanisms of extinction: what mechanism led to the mass killing, and why did
it kill some genera but not others? These questions have prompted even firm supporters
of the impact hypothesis to conclude that the theory contains some “puzzling features,” in
the words of Richard Fortey. “There are many animals and plants that did survive,”
Fortey continues, “and somehow it does not seem satisfying to call them ‘lucky ones’ and
leave it at that. Their survival should chime in with the fatal scenario.”79
Others have drawn even stronger conclusions. Responding to what has been called the
“Dante’s Inferno” scenario of a broad cluster of environmental catastrophes subsequent to
the impact, William Clemens states,
I think the results of studies of patterns of survival and extinction of terrestrial vertebrates fully falsify the hypothesis that an impact caused the terminal Cretaceous extinctions of terrestrial vertebrates through the series of environmental catastrophes embodied in the “Dante’s Inferno” scenario. Ancestors of groups that are today known to be unable to tolerate major climatic change, such as frogs, salamanders, lizards, turtles, and birds, survived whatever caused the extinction of the other dinosaurs.80
Clemens is not rejecting the impact hypothesis in toto. Rather, he is pointedly reminding
us that extinction is ultimately a biological phenomenon, and along with verifying the
geological consequences of the impact, we need to specify carefully the biological causal
mechanisms.
Defenders of the impact hypothesis treat the call for specified mechanisms as an
unreasonable assault on the impact hypothesis itself, which they insist has met many tests
and failed none of them and so deserves to be considered corroborated.81 This response is
79Richard Fortey, Life: A Natural History of the First Four Billion Years of Life on Earth (Alfred A. Knopf, 1998), 253. Fortey stresses the anomalous survival of insects which have an annual life cycle and rely on live plants for food and shelter and so should not have survived even a decade-long environmental catastrophe. Other puzzling aspects of survival include bony fish, coral, and birds, which are now widely recognized to be direct descendants of dinosaurs.
80“On the Mass Extinction Debates: An Interview with William A. Clemens,” in William Glen, ed., The Mass-Extinction Debates: How Science Works in a Crisis (Stanford University Press, 1994), 245-46.
81Powell, Night Comes to the Cretaceous, 179-80.
based on the conflation of confirmation and explanation. We can agree that the impact
hypothesis has been confirmed--the impact itself occurred and we have good reason to
believe it was the ultimate cause of extinction--yet still maintain that we do not yet possess
an adequate explanation of how the dinosaurs and other genera became extinct. Writing
recently in the journal Paleobiology, Norman MacLeod characterizes the physical
evidence for the bolide impact hypothesis as “overwhelming” and the hypothesis itself as
“fully proven, though a number of interesting subsidiary controversies still exist.”
MacLeod rightly insists that the issue at hand is one of standards of explanatory adequacy,
as the specification of precise extinction mechanisms is an indivisible part of explaining any mass extinction event. Just as geologists remained skeptical about continental drift until a precise causal mechanism...was proposed...paleontologists will remain skeptical about the connection between impacts and extinctions until precise biological/ecological mechanisms are proposed that uniquely account for observed taxic patterns and the stratigraphic timing of K/T extinction and survivorship.82
Inferential goodness, we must conclude, is not equivalent to explanatory goodness.
While the specification of causal mechanisms can support inferential goodness, for reasons
discussed at length above, identifying causal mechanisms is not sufficient for adequate
explanations. Instead, when it comes to explanations, we should become greedy: we
should expect a full and complete causal chain, tightly linking cause and effect. Let’s
consider an example from contemporary social science, turning again to Robert Putnam’s
Making Democracy Work. Since we want to focus on the question of explanatory
goodness, let us ignore the issues raised above and concede the inferential validity of the
work: northern Italy enjoys high civic culture and well-performing institutions while
southern Italy lacks civic culture and suffers ill-performing institutions. Why is this?
What is the causal link between civic culture and high-performing institutions? To his
credit, Putnam addresses this issue head on, and his provision of a credible causal
mechanism has led reviewers to praise the book highly. In a phrase, the answer is social
capital: northern Italians have developed norms of reciprocity and mutual collaboration for
the collective good, permitting them to engage in collective action which in turn reinforces
the norms of reciprocity; southern Italians, mired in distrust, cannot engage in collective
82Norman MacLeod, “K/T Redux,” Paleobiology 22 (1996), 315.
action. These two outcomes are both stable equilibria, one producing fortunate outcomes,
the other producing gross misfortune.
For Putnam, the lesson is clear: “These contrasting social contexts plainly affected
how the new institutions worked.”83 But just as the paleontologist MacLeod asked for
biological mechanisms to go along with geological ones, we might ask Putnam for social
and institutional mechanisms to complement and complete the causal chain that begins
with social capital. These mechanisms have to be shown to be consistent with the
institutional record. The image that Putnam conveys is that when faced with poor
institutional performance, northern Italians band together and demand better performance,
while southern Italians respond to poor performance lethargically, if at all. One would
then expect some evidence of this propensity for collective action on behalf of better
institutions, evidence which is strikingly absent given Putnam’s quarter century of “poking
and soaking” in Italy. Indeed, Putnam spends exactly two paragraphs (just as the
Alvarez group devoted most of their original article to making the case for an impact and
minimal effort to connecting the impact to the extinction) directly confronting the causal link
between democratic governance and a vigorous civil society: On the demand side, citizens in civic communities expect better government and (in part through their own efforts), they get it. They demand more effective public service, and they are prepared to act collectively to achieve their shared goals. Their counterparts in less civic regions more commonly assume the role of alienated and cynical supplicants.
On the supply side, the performance of representative government is facilitated by the social infrastructure of civic communities and by the democratic values of both officials and citizens. Most fundamental to the civic community is the social ability to collaborate for shared interests. Generalized reciprocity...generates high social capital and underpins collaboration.84
I find three points of interest in this handful of sentences. First, citizens in the
north of Italy not only expect better government, but they get it “in part through their own
efforts.” Again, we have no evidence that they make those efforts, only a theoretical
argument that they should be disposed to making those efforts. Moreover, by
parenthetically claiming that citizens’ efforts are only part of the story, Putnam raises the
possibility of causal incompleteness. Second, in the following sentence we are told that
83Making Democracy Work, 182.
84Making Democracy Work, 182-83.
northern Italians “are prepared to act collectively to achieve their shared goals.”
Everything about the argument to that point prepares us to anticipate that citizens do act,
even if this is only “part” of the overall story: now we find that they have a latent
predisposition to act, but do not necessarily make it manifest. Finally, there is a supply
side to go along with the demand side: presumably, this is what Putnam means when he
attributes better outcomes “in part” to the demand-side actions of citizens. The supply
side implies, without stating it clearly or providing any evidence, that public officials in
northern Italy are, relative to their southern counterparts, either more inclined to provide
good public service, or more capable of providing good public service, or both. But this
raises a question: if officials are disposed and/or equipped to provide good public service,
what precisely is the role of the demand side? Why must northern citizens (be prepared
to) act collectively on behalf of good public service if their public officials are
independently prepared to provide those services? Logically, one might think that the
demand side, which now looks far more anemic than much of the prior 181 pages had led
us to believe, would be necessary only if the supply side did not exist. It is quite possible
that these logical relations could be sorted out and that evidence could be provided for the
final version of the causal chain; until that is accomplished, we should be skeptical of the
connections between civic culture and institutional performance. Even if we accept as
confirmed the inference that it is civic culture which most centrally distinguishes the two
halves of Italy (and there are independent reasons to not accept it), we still must conclude
that inferential goodness is not accompanied here by explanatory goodness.
Conclusion
Ever since Hume, philosophers and empirical scientists have recognized that causal
explanations are based on inferences. Not all inferences are explanations, however, and
many explanatory propositions--ones that contain at least a sketch of the relevant causal
mechanisms-- perform poorly as explanations. This essay has sought to demarcate
boundaries between inferences, explanations, and adequate explanations. Doing so, I have
argued, raises serious doubts about the feasibility of methodological unity. Despite the
powerful arguments contained in DSI, I have argued that a framework for inferences does
not function doubly as a framework for explanations; the DSI framework thus cannot be
considered sufficient for generating good explanations. The means by which good
explanations are generated are causal mechanisms: once we fully assimilate that point, we
find there are multiple and diverse ways to infirm or confirm hypotheses, many of which
are not even implicit in the framework of DSI: that framework thus cannot be considered
as necessary for explanatory adequacy either.
It might be objected that the sorts of complete explanations I advocate here are neither
feasible nor desirable. This seems to be the position of the authors of many statistical
textbooks, who apparently agree that Social science explanations are, at best, partial theories that indicate only the few important potential influences. Even if one could establish the list of all influences, this list would be sufficiently long to prohibit most analysis. Generally, researchers are trying to estimate accurately the importance of the influences that are central to policy making or to theoretical developments.85
Paul Humphreys, a philosopher who has written widely on the philosophy of explanation,
characterizes such incompletely specified explanations as “explanatorily
informative,” arguing that incomplete explanations are not necessarily untrue ones, and
that explanatory incompleteness has in no way hindered progress on many important
scientific and public-policy issues.86
These objections, I think, only vindicate my point. Different communities of
scholars may in fact have very different yet non-competitive standards of explanatory
adequacy. Indeed, the authors of DSI recognize this very point. On the book’s first
page, they write of two “styles” of research. Quantitative research uses numbers to
measure “specific aspects of phenomena [and] it abstracts from particular instances to seek
general description or to test causal hypotheses...” Qualitative research, on the other
hand, tends “to focus on one or a small number of cases...and to be concerned with a
rounded or comprehensive account of some event or unit.” (3-4). KKV claim that these
matters of style are “methodologically and substantively unimportant.”
But if qualitativists are licensed to desire “rounded or comprehensive accounts,”
then the standards to which I hold them, with all of their methodological implications, are
commensurately legitimate. Quantitativists may deny the centrality of causal mechanisms
85Eric A. Hanushek and John E. Jackson, Statistical Methods for Social Scientists (Academic Press, 1977), 12, as cited in Lieberson, Making it Count, 186.
86Paul Humphreys, The Chances of Explanation: Causal Explanation in the Social, Medical, and Physical Sciences (Princeton University Press, 1989).
and thus reject the implications I draw from consideration of those mechanisms, but that
may simply be because their predisposition toward abstracting from particular
instances leads them to discount the value or the feasibility of such comprehensive
accounts. Style, it may turn out, matters greatly.
True, giving comprehensive causal accounts may be desirable but not feasible. It
may be the case that phenomena are sequelae to enormous numbers of inter-correlated
causal influences, so that comprehensive and fully specified lists are unattainable. And it
might be the case that the sort of definitive tests of hypotheses represented by the
Berkeley team’s confirmation of the bolide-impact hypothesis are equally unattainable in
the social sciences, where even a defender of case-study methods like Stephen Van Evera
claims that “Most predictions have low uniqueness and low certitude.” When passed, they
do not rule out rivals (low uniqueness), but failing a test leaves relatively unharmed
propositions that make only probabilistic predictions (low certitude). Most social science,
Van Evera avers, consists of “straw-in-the-wind tests.”87

It is perfectly legitimate to accept any one or all of these wagers, to settle on an ontological and epistemological position. But a state of great uncertainty, in which we cannot be sure of any causal relationships, let alone the underlying causal structure of the world, might also prompt some agnosticism, or at least a rational desire to diversify one’s methodological portfolio. Until it can be proven that explanatory adequacy as discussed here is a chimera, the philosophical position outlined here is valid and its methodological implications follow. Methodological pluralism cannot be defeated by assuming a world that would deliver that defeat.
87Stephen Van Evera, Guide to Methods for Students of Political Science (Cornell University Press, 1997).