INFERENCES AND EXPLANATIONS AT THE K/T BOUNDARY...AND BEYOND
David Waldner
Department of Politics, University of Virginia
[email protected]
October 13, 2003

Draft version of a chapter written for the volume Theory and Evidence, edited by Ned Lebow and Mark Lichbach. Comments welcome.
Introduction: From Inferences to Explanations
After a long period of dormancy in the post-Kuhnian era, arguments for methodological
unity are again attracting attention in the social sciences. The contemporary version of the claim
that all valid scientific knowledge is based on a core set of regulatory principles has been stated
most clearly in the volume Designing Social Inquiry (henceforth DSI) written by the Harvard
political scientists Gary King, Robert Keohane, and Sidney Verba.1 While philosophers authored
most 20th-century briefs for or against methodological unity, the authors of DSI are practicing
empirical social scientists who evince little patience for philosophical investigations. Their work
builds on philosophical precedents to argue that there is a first-order scientific method--test
theories by their observable implications is their version of what philosophers call the
‘hypothetico-deductive method’--but it moves quickly to second-order methodological
considerations distilled from statistical reasoning and techniques. It is not sufficient, they imply,
to follow Karl Popper’s lead and propose only falsifiable hypotheses: to count as warranted
knowledge--to generate valid inferences, in the authors’ terminology--social scientists must
evaluate hypotheses by avidly adhering to a strict set of standards and regulations. The provocative conclusion that they draw is that qualitative research can and should emulate the logic of quantitative research: indeed, only those methods yield valid inferences, which are the central element of science.
In this essay, I return to the philosophical roots of these debates to demonstrate the
inadequacy of the new methodological unity. At the core of my argument is the distinction
between confirmation and explanation. DSI implicitly equates these two components of scholarly
inquiry. At first glance, this presumed equivalence seems either innocuous or correct: we propose
hypotheses in order to explain, after all, and so confirming those hypotheses seems to complete
the explanatory enterprise. Even if explanation entails something beyond the confirmation of
hypotheses, that something extra does not seem to contain much surplus value: this is why, I
suppose, statistics textbooks so often slight the concept of explanation.2
Confirmation differs significantly from explanation. We confirm hypotheses via contested
appraisal of their evidentiary grounds and theoretical logics, an operation the philosopher Richard
1Designing Social Inquiry: Scientific Inference in Qualitative Research (Princeton University Press, 1994).
2Christopher Achen, to give a prominent example, argues that “social scientists neither have nor want correct, stable functional forms for their explanations…Social theories rarely say more than that, ceteris paribus, certain variables are related…A great many other factors may also be influential; they are outside the theory and must be controlled somehow.” See his Interpreting and Using Regression (Newbury Park, CA: 1982), 16.
Miller calls “fair causal comparison of a hypothesis with its current rivals.”3 To say that a
hypothesis has been confirmed is to claim that it has weathered sufficient scrutiny relative to the
current state of theorizing and data gathering that belief in its approximate truth is more
reasonable than disbelief but is also subject to revision in the face of future data gathering or
theorizing. To inquire into confirmation is thus to ask when a body of data confirms a hypothesis.
We explain, on the other hand, by using these confirmed hypotheses in prescribed ways. To
inquire into explanation is to ask when a hypothesis adequately explains a phenomenon.
DSI provides a framework for making valid inferences. But while all explanations are
based on inferences, not all inferences constitute explanations. Not all explanatory inferences,
furthermore, work as good explanations. Explanation requires something more than valid
inferences. That something more is causal mechanisms. To prefigure a claim made later in the
paper, we explain by identifying chains of causal mechanisms that were, under the specific
circumstances, sufficient to produce the outcome. To not identify the relevant mechanisms is to
not explain; to identify them only partially is to gesture at without completing the explanation.
Explanatory inferences must therefore contain causal mechanisms; many valid inferences
self-evidently do not do so. Rules for establishing valid inferences that do not explicitly recognize
this distinction do not exhaust the explanatory enterprise. One of the two main defects of DSI is
that it fails to adequately grasp this point. Its framework can therefore at best be considered
necessary but not sufficient for achieving explanatory goodness.
Were they to concede this omission, the authors of DSI might still claim that their
framework is necessary for making valid inferences. I contest this claim as well. Causal
mechanisms discharge a dual function: they are the distinguishing characteristic of explanations,
but they can also be used to enhance or impeach the credibility of hypotheses as well. They play,
in other words, a crucial role in establishing inferential goodness. DSI concedes this point, but
without acknowledging the diverse means by which causal mechanisms contribute to
(dis)confirmation. Causal mechanisms exercise veto power over hypotheses, rendering
implausible hypotheses that otherwise evince the features of inferential goodness, and rendering
more credible those hypotheses accompanied by plausible mechanisms. Put differently, causal
mechanisms promote inferential goodness via theory, not via research design. They adjudicate
contests between multiple hypotheses that are otherwise equally consistent with the data. I
3See his Fact and Method: Explanation, Confirmation, and Reality in the Natural and Social Sciences (Princeton, 1987), esp. chapter 4.
demonstrate below that thinking about confirmation via causal mechanisms vastly expands our
repertoire for making valid inferences beyond what DSI explicitly sanctions. Indeed, under
specific conditions delineated below, fair causal comparison allows us to discount or even
disregard procedures and rules central to the case championed by DSI. Far from resulting in
inferential errors, these heterodox strategies support major scientific achievements, as we shall see
below. The research methods advocated by DSI, therefore, are neither necessary nor sufficient
for achieving explanatory adequacy. This conclusion vindicates methodological pluralism: social
science inquiry is not methodologically monochromatic.
Is it fair to argue that DSI equates the confirmation of hypotheses (making valid claims
about causal relationships inferred from the data) with explanation? DSI, after all, clearly states
that “the goal is inference.” In making central the transition from “immediate data to something
broader that is not directly observed,” the authors echo Karl Popper, who once described science as piercing the veil of appearances:

There is a reality behind the world as it appears to us, possibly a many-layered reality, of which the appearances are the outermost layers. What the scientist does is boldly to guess, daringly to conjecture, what these inner realities are like...But there is another, a special kind of boldness--the boldness of predicting aspects of the world of appearance which have so far been overlooked but which it must possess if the conjectured reality is (more or less) right, if the explanatory hypotheses are (approximately) true.4
Note how Popper equates making inferences about unknowns and unobservables with explanatory
hypotheses. DSI gives many indications that they are following in his footsteps and equating the
confirmation of hypotheses (in their language, verifying the validity of causal inferences) and
explanation. Thus, not long after proclaiming that “the goal is inference”, they insist that
“explanation--connecting causes and effects--is the ultimate goal...” (34). This substitution of the
language of explanation for that of inference occurs so regularly that Henry Brady writes that “It
is not exactly clear how “explanation” fits into KKV’s categories of descriptive and causal
inference, but one reasonable interpretation is that they consider explanation to be identical with
causal inference.”5 Brady attributes the reticence of DSI on this important matter to their reliance
on statistical thinking, for “The statistics literature [in contrast to the philosophical literature
which intimately links causality and explanation] is exceptional in defining causality without
4In his “The Problem of Demarcation,” in David Miller, ed., Popper Selections (Princeton University Press, 1985), 122.
5Henry Brady, “Doing Good and Doing Better: Symposium on Designing Social Inquiry, Part 2,” The Political Methodologist 6 (Spring, 1995), 13 at footnote 6.
discussing explanation.”6 However, before fully conceding this point, it seems fair to make two
observations: first, DSI does make a definitional distinction between inferences and explanations:
the former involves moving from observables to non-observables, the latter involves “connecting
causes and effects.” (34) Second, buried innocuously in a footnote is the claim that “At its core,
real explanation is always based on causal inferences.” (75, at footnote 1) But these possible
ways of distinguishing inferences from explanations are never explored, even when the context
calls for that discussion.
It may be objected that I am holding DSI to unfair standards by using the criteria of
necessary and sufficient conditions. Yet these are the only standards that are relevant to the claim
of methodological unity, which is precisely why this project has fared so poorly over the last
century.7 Logical positivists used a transcendental principle of verification to distinguish the
meaningful statements of science from the meaningless statements of non-science. Statements
such as “all swans are white” could be verified by defining swans and observing their color;
statements such as “the absolute is eternal” and other expressions of philosophical idealism, on the
other hand, could not be so verified and were thus meaningless. Insofar as it ruled out various
forms of philosophical idealism then prevalent in European philosophy, logical positivism could
claim to have identified a necessary condition for science, but certainly not a sufficient condition,
for obviously false statements such as “the earth is flat” were clearly verifiable and thus had to
count as science.
In response, Karl Popper proposed a new solution to what he called “the problem of
demarcation.” Popper’s solution was to contrast empirical science to “pre-scientific myths and
6“Doing Good and Doing Better,” 14. Brady also criticizes DSI for equating causal explanations with all types of explanations, some of which are obviously non-causal. This critique was originally levied by Michael Scriven against Carl Hempel, who responded that this was analogous to objecting to a definition of a mathematical proof for its inability to account for the use of the word proof in the phrase “86-proof Scotch.”
7 Building on the preexisting distinction between knowledge (episteme) and opinion (doxa), Aristotle established the view that science was distinguished by the infallibility of its findings, a result of reasoning deductively from first principles. The idea that science conferred apodictic certainty stabilized faith in methodological unity that continued into the first centuries of the scientific revolution. Key figures such as Galileo and Newton identified with Aristotle’s claims that scientific conclusions were demonstrative and thus incorrigible, even as they omitted Aristotle’s concern for explicitly causal demonstration. Methodological unity thus persisted into the nineteenth century when, in part under the influence of the probabilistic revolution, philosophers and practicing scientists adopted a fallibilistic perspective: scientific theories could in fact be in error and theories were only relatively superior to their rivals. At this point, methodological pluralism became more prominent, as scientists forged new methodological tools to replace deductive reasoning and began to embrace, with varying degrees of enthusiasm, inductive reasoning. See Larry Laudan, Beyond Positivism and Relativism: Theory, Method, and Evidence (Westview Press, 1996): 210-15. DSI retains as a core tenet the premise that all of our beliefs can be revised; uncertainty cannot be avoided. There is therefore no return to Aristotle.
metaphysics,” whose claims were not strictly meaningless but rather non-empirical. The defining
characteristics of science were two-fold and transcendental: bold conjectures and decisive
refutations. Conjectures are imaginative leaps from outward appearances to inner realities; they
are bold when they “stick their necks out,” when they take great risks of being proved wrong.
Science is, Popper then counter-intuitively claimed, not the search for certain knowledge, but
rather for certain refutation.8
Treating Popper’s demarcation criterion as a necessary condition for science created some
nagging problems, for it implied that many recognized scientific achievements did not in fact
constitute science. The Copernican revolution was one example of this revalorization of past
achievements: although Copernicus’ decentering of the earth clashed boldly with accepted belief, it did not make any new prediction that could be falsified by crucial experiments and thus, as Popper frankly acknowledged, it is by his terminology “unscientific or metaphysical.”9 A second
problem is that, like the logical positivists, Popper too must allow that obviously false claims are
in fact science: if they have been falsified, then they must be falsifiable, and so they fall on the
science side of his demarcation principle.10 Falsifiability also cannot be a sufficient but not necessary condition for science, for then astrology might fail the falsifiability criterion but gain admission on other criteria. Most troubling, however, is that while vulnerability to refutation
may indeed be necessary to achieve scientific status, what is the status of propositions that are not
falsified?11 If falsifiability is a necessary but not a sufficient condition, then we would have to
conclude that physics might be a science for it passes the necessity test, but we would not be sure
that it was indeed a science. The only relevant standards for assessing meta-analytic frameworks
8For the formal analysis, see his The Logic of Scientific Discovery (Routledge, 1992 [1959]). For further commentary, see his Conjectures and Refutations: The Growth of Scientific Knowledge (Basic Books, 1963).
9More accurately, Popper allows that Copernicus made some minor predictions and to this extent, his work is scientific.
10Unlike the logical positivists, Popper claims that Marxism was scientific until its adherents dogmatically ignored evidence that it had indeed been falsified. But in that sense, pre-1917 Marxism remains a science.
11Richard Feynman put it best: if the theory disagrees with the empirical evidence, “it is wrong. In that simple statement is the key to science.” But surely there must be at least two keys to science, one that blocks entry to many propositions, but one that permits entry to others.
is that they be both necessary and sufficient for achieving scientific status and thus warranted
belief.12
Given these problems, it is perhaps not surprising that Post-Popperian philosophers have
more frequently championed methodological pluralism. Most famously, Thomas Kuhn
distinguished not between science and non-science but between normal and crisis science. The
former is governed not by transcendental, unified rules but by field- and time-specific
achievements, or what he called “paradigms,” while the latter is governed, in part, by extra-
scientific considerations.13 More infamously, Paul Feyerabend argued for methodological
anarchy. Feyerabend was not, contrary to the many claims of his critics, anti-science. What he
objected to was the “narrow-minded extension of the latest scientific fashion to all areas of human
endeavour--in short what I object to is a rationalistic interpretation and defense of science.”14
Feyerabend could thus argue that Galileo’s method “worked” in kinematics and in other fields as
well, but simultaneously and consistently argue that “it does not follow that it should be
universally applied.”15
DSI confidently rejects these pluralistic impulses. From the opening sentence of the
preface, the book unambiguously claims that there is a unified methodology that confers the status
of legitimate knowledge to propositions tested according to its framework. The core of DSI is a
neat restatement of the hypothetico-deductive method, one which the authors hope will produce an
ideal-typical scholar who “uses theory to generate observable implications, then systematically
applies publicly known procedures to infer from evidence whether what the theory implied is
correct.”16 But the authors are unsatisfied by this general disposition towards scientific inquiry; it
may be necessary to produce sound results, but it is not sufficient for that goal. Popper and his
followers spoke about a general intellectual temperament; DSI speaks about concrete research
steps. Consequently, their criteria are not a restatement of the hypothetico-deductive method, for
that method can be used poorly to derive patently unsound conclusions. Rather, they clearly state
12Laudan, Beyond Positivism and Relativism , 215-22.
13Thomas Kuhn, The Structure of Scientific Revolutions 2nd ed. (University of Chicago Press, 1970).
14Paul Feyerabend, Against Method 3rd ed. (London: Verso, 1993 [1975]), 122.
15Against Method, 123.
16Gary King, Robert O. Keohane, and Sidney Verba, “The Importance of Research Design in Political Science,” American Political Science Review 89 (June 1995), 476.
what they believe to be the proper usage of the hypothetico-deductive method. Here is where
DSI generates considerable controversy, for their claim is that the specific rules they advance for
using the H-D method are those “rules that are sometimes more clearly stated in the style of
quantitative research.” (6). These rules pertain to, inter alia, avoiding selection bias when
choosing what to observe, correcting for multicollinearity, diagnosing and treating endogeneity,
and increasing the number of observations to avoid indeterminate research designs. These rules,
moreover, are in no way specific to statistical analysis; rather the authors claim that these rules
transcend but are best embodied in statistical analysis, and they are therefore available for
qualitative research as well. Note finally that by claiming that these rules have been best
articulated by statistical researchers, DSI implies that it is this second-order collection of rules and
not the H-D method itself that constitutes the scientific method and which thus distinguishes
scientific inquiry from casual observation.
The authors of DSI should be commended for moving so far beyond earlier efforts to
establish methodological unity. And they should be held accountable to strict standards, not only
because those strict standards are logically implicated by their enterprise, but because they give
good presumptive reasons to believe they will meet those standards. This essay argues, however,
that their efforts ultimately fail.
In the spirit of methodological pluralism that animates this essay, I make this argument in
two ways: through conceptual and logical analysis of the evaluative criteria for confirmation and
explanation and through a case study of scientific progress, one which has been erroneously
recruited to the cause of methodological unity. I begin with the case study, followed by sections
on explanation, inference, and explanatory adequacy.
Inferences and Explanations on the K/T Boundary
An obvious retort to the claim that qualitative analysis should follow the lead of
quantitative research is that the former focuses on highly complex and unique events. The authors
of DSI meet this challenge squarely and expertly, arguing first that complexity is a function of the
analytic apparatus brought to bear on an event, and second that even “unambiguously unique
events” can be studied using the scientific methods they champion. To support this important
claim, they briefly consider one such unique event, the extinction of the dinosaurs.
According to DSI, to study a unique event scientifically is to use the hypothetico-
deductive method (H-D method), which is composed of one or more hypotheses--statements
whose truth value is to be evaluated in terms of their consequences--one or more statements of
initial conditions, and one or more observable predictions, or states of the world which can be
deductively implied by the conjoined hypotheses and initial conditions and which must therefore be observed if the theory is true. These observations are thus the test of the theory. This is
precisely, according to DSI, the method by which the dinosaur extinction was studied. The
authors neatly assemble this position as follows:

One hypothesis to account for dinosaur extinction...posits a cosmic collision: a meteorite crashed into the earth at about 72,000 kilometers an hour, creating a blast greater than that from a full-scale nuclear war. If this hypothesis is correct, it would have the observable implication that iridium (an element common in meteorites but rare on earth) should be found in the particular layer of the earth’s crust that corresponds to sediment laid down sixty-five million years ago; indeed, the discovery of iridium at predicted layers in the earth has been taken as partial confirming evidence for the theory. (11)
It is true that the iridium anomaly provides powerful support for the meteorite (bolide-
impact) hypothesis; and it is true that the team of Berkeley scientists responsible for this
hypothesis treated the iridium anomaly as the observable implication of something that could not
be directly observed. But the scientific study of the dinosaur extinction does not resemble the
summary contained in DSI.17 Rather, as we shall see, researchers reasoned backwards from the
iridium anomaly to its cause; and they focused not on inferential goodness but rather on causal
mechanisms. It is these two facts that set the stage for the philosophical discussion to follow.
Causal mechanisms loomed so large in the study of the dinosaur extinction because a key
member of the Berkeley team, the geologist Walter Alvarez, stumbled on the iridium anomaly as a
by-product of his work on plate tectonics, a theory that gained scientific credibility only in the
1960s.18 The theory of continental drift had originally been proposed in the early twentieth century by the German meteorologist Alfred Wegener, who made an inference from continental morphology that all of
the continents must once have been joined; from this inference, he made the second inference that
the continents must be drifting apart. This conjecture boldly challenged the prevailing view that
continental position was fixed. The scientific community rejected Wegener’s hypothesis because
he could not propose a plausible causal mechanism explaining how the continents could move.
Wegener mistakenly claimed that the drifting continents plowed through a solid earth as a ship
plows through the ocean, a view that was correctly rejected by physicists as being physically
impossible. The theory was accepted in the 1960s, in part due to new observations, but more
17My account is based entirely on materials that became available only after the publication of DSI.
18The main source for what follows is Walter Alvarez, T. rex and the Crater of Doom (Vintage Books, 1997).
importantly because a new causal mechanism was adduced: the continents rest on tectonic plates
which are carried along in convection currents generated by the earth’s internal thermodynamic
processes. It was the absence of a plausible causal mechanism and not a fallacious inference
induced by a faulty research design that prompted the initial rejection of the theory; and it was the
subsequent depiction of a plausible causal mechanism that led to the theory’s later acceptance,
not a research-design based set of observational inferences.19
Alvarez was a paleomagnetist who specialized in the sub-continental “microplates” of the
Mediterranean, which took him to the medieval city of Gubbio, north of Rome in the Apennine
Mountains. In a canyon just outside the city is an outcrop of pink limestone called the Scaglia rossa, whose exposed face spans the Cretaceous period, running into the more recent Tertiary
period. The boundary between these two periods is called the K/T boundary.20 For his research,
Alvarez sampled rocks down through the Cretaceous and up through the Tertiary, crossing the
K/T boundary. He dated his rock samples in part by working with a specialist in foraminifera--
“forams” for short, single-celled marine organisms whose microfossils can be identified and dated
precisely. Forams were plentiful below the boundary, scarce above it; upon learning that the
dinosaur extinction basically coincided with the foram extinction, Alvarez decided to turn his
attention to explaining the mass extinction at the K/T boundary.
The meteorite hypothesis had long been proposed by this time, but Alvarez did not analyze
iridium deposits to test it. Rather, he sought to discover how long the K/T boundary lasted,
collecting data to adjudicate between two contending paradigms of geologic change: gradualism
and catastrophism.21 Alvarez’s research did not distinguish between competing catastrophic
hypotheses, of which there were many: rather he devised a test of whether the dinosaur extinction
had been relatively abrupt or gradual. Gubbio limestone was composed of 95% calcium
carbonate (composed, in turn, overwhelmingly of the fossilized remains of forams) and 5% clay.
19Timothy McKeown has properly placed the discovery of causal mechanisms at the center of his statement of methodological pluralism in which identifying causal processes replaces the empirical tests demanded by the H -D method. See his extremely valuable essay, “Case Studies and the Statistical Worldview: Review of King, Keohane, and Verba’s Designing Social Inquiry: Scientific Inference in Qualitative Methods ,” International Organization 53 (Winter 1999): 161-90, esp. 185-86.
20Cretaceous derives from creta, the Latin word for chalk (German Kreide), because in this last third of the Mesozoic era chalk was widely deposited in shallow seas. The letter K, from Kreide, is substituted for the C of Cretaceous to distinguish the Cretaceous from the earlier Cambrian period.
21The great 19th-century geologist Charles Lyell attacked the biblically inspired paradigm of catastrophism and replaced it with gradualism, whose motto was natura non facit saltum: nature does not make jumps. Stephen Jay Gould provides a characteristically concise account in his [complete citation.
The K/T boundary, however, is a physical boundary of almost pure clay. So the logical question
to ask is how long it took to deposit those clay sediments. The two scenarios Alvarez posited
predicted different levels of iridium, 0.1 parts per billion for the relatively slow scenario (short-
term increase in clay deposits with constant level of fossilized foram deposits) and virtually none
for the relatively fast scenario (abrupt cessation of foram deposits with constant level of clay
deposits). Both scenarios assumed a constant rate of iridium accumulation.22
The results were astounding: nine parts per billion, roughly ninety times higher than the
expected amount if the rate of sedimentation had been relatively slow, on the scale of thousands
of years. Neither scenario was supported, in other words, by the data. Rather, the assumption of
a constant rate of iridium accumulation was upended. This finding, needless to say, raised new
questions without answering old ones, for now Alvarez had to figure out an explanation for “all
that iridium.” Numerous answers could be proposed for this question--a meteorite impact was
one possibility, but so were massive volcanic eruptions. Indeed, other hypotheses existed, some
of which were non-catastrophic, such as an encounter with a cloud of interstellar dust and gas.
Any credible explanation for the elevated levels of iridium, moreover, had to do double duty, to
also answer the question, “what caused the extinction of the dinosaurs?”
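The quantitative logic of the two scenarios can be put in back-of-the-envelope form (a sketch in Python using only the figures given in the surrounding text; the constant names are mine, not Alvarez’s):

```python
# Back-of-the-envelope version of Alvarez's iridium test, using the
# figures reported in the text (ppb = parts per billion).

PREDICTED_SLOW_PPB = 0.1   # slow-deposition scenario: measurable meteorite-dust iridium
PREDICTED_FAST_PPB = 0.0   # fast-deposition scenario: virtually no iridium
MEASURED_PPB = 9.0         # the anomalous Gubbio measurement

# How far the measurement exceeds the higher of the two predictions
ratio = MEASURED_PPB / PREDICTED_SLOW_PPB
print(f"measured iridium is {ratio:.0f}x the slow-scenario prediction")
```

Neither prediction comes anywhere near the measured value, which is why it was the shared assumption of a constant iridium accumulation rate, rather than either scenario, that had to give way.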
Thus, contrary to the report of DSI, the Berkeley team23 did not propose elevated levels of
iridium as a test of the meteorite hypothesis. A long process of serendipity and trial and error led
to the discovery of elevated iridium levels, a finding which itself demanded explanation.24 Rather
than using the iridium as a deductive test of an existing hypothesis, they used their findings and
worked backward from them.25 The H-D method was ancillary to this core methodology.26
22Iridium exists on earth in the same proportion that it exists on meteorites, but like other heavy elements, iridium has become concentrated in the earth’s core. Most iridium in the earth’s crust has thus been deposited by meteorites. Meteorite dust accumulates slowly: if there is a measurable accumulation, the rate of sedimentation must also have been relatively slow, suggesting that the rising proportion of clay at the K/T boundary was caused by an abrupt extinction of forams; if there were virtually no iridium accumulation, then powerful support would be given to the hypothesis that the K/T boundary was due to rising levels of clay deposition and not an abrupt extinction.
23The team consisted of Walter Alvarez, his father, the Nobel-prize winning physicist Luis Alvarez, Frank Asaro, a specialist in neutron-activation analysis, and Helen V. Michel, a specialist in plutonium chemistry.
24For the steps and missteps along the way to the discovery of the iridium anomaly, see T. rex and the Crater of Doom, 19-71.
25On the relationship between an ontology of causal mechanisms and reasoning via abduction or via inference to the best explanation, see Ian Shapiro and Alexander Wendt, “The Difference that Realism Makes: Social Science and the Politics of Consent,” Politics & Society 20 (June 1992): 197-223.
Rather than following the framework summarized by DSI, and this is a key point, they thought in
terms of causal mechanisms--or, in the context of this debate, killing mechanisms. For over a
year, Walter Alvarez recounts, the Berkeley team regularly returned to the impact hypothesis but
continuously rejected it, not because it was inconsistent with the evidence, for it was consistent,
but rather because they could not understand why an impact would cause worldwide extinction...

A supernova had seemed more reasonable because it would have bathed the entire Earth in lethal radiation, thus explaining the global character of the extinction. But a supernova was out, and impact seemed to provide no global killing mechanism. For over a year we had searching discussions that always ended in frustration, and I would lie awake at night thinking, “There has to be a connection between the extinction and the iridium. What can it possibly be?”27
Thus, it was not debate over research design but efforts to identify a plausible causal
mechanism that drove the Berkeley team. By late 1979, Luis Alvarez believed that he had found
the appropriate mechanism: a large impact would have created a global dust cloud that would
cause the collapse of the entire food chain, resulting in mass extinction. No research was
conducted in support of this hypothesis; rather, when initial calculations of the quantity of dust and its effects were approved by a Berkeley astronomer, Luis Alvarez exclaimed, “We’ve got the
answer.” Within weeks, the meteorite impact hypothesis was presented at a conference and
within a year, the seminal report appeared in the journal Science.28
Explanation
Contrary to what DSI claims, the extinction of the dinosaurs was studied scientifically but
not exclusively by the methods they advocate. Causal mechanisms, moreover, play a more
prominent role in the story I have just told than they play in the methodology advanced in DSI: their absence discredited theories on grounds other than their evidentiary warrant, while their presence powerfully supported theories whose evidentiary warrant was not superior to that of rivals.
The iridium anomaly, in other words, was consistent with more than one hypothesis: the Berkeley
26For example, as a test of the hypothesis that a supernova explosion deposited the iridium, they searched for the presence of plutonium-244; its complete absence falsified the supernova hypothesis. My claim is that the H-D method is not absolutely necessary to valid findings, not that it is superfluous to scientific inquiry.
27T. Rex and the Crater of Doom, 76, emphasis in original.
28Luis W. Alvarez, Walter Alvarez, Frank Asaro, and Helen V. Michel, “Extraterrestrial Cause for the Cretaceous-Tertiary Extinction,” Science 208 (6 June, 1980): 1095-1108.
team judged the meteorite hypothesis superior solely because it alone could be credibly linked to
the iridium anomaly and to the dinosaur extinction. Finally, the function of causal mechanisms is
not exhausted by their role in establishing the relative superiority of a given hypothesis. Causal
mechanisms are also of crucial importance to explanatory adequacy.
To understand the explanatory role of causal mechanisms, we must be more attentive to
the profound distinction between confirmation of a hypothesis and the use of that hypothesis in an
explanation. The H-D method is a powerful means for providing a hypothesis with evidentiary
support, a process usually called confirmation. It allows us to judge a body of evidence as
confirming a hypothesis. But to ask “when does a body of evidence confirm a hypothesis?” is a
distinct question from asking “when does a hypothesis adequately explain an outcome?” DSI is
primarily an answer to the first question, an answer that relies heavily on, but also goes beyond,
the Hypothetico-Deductive Method. The H-D method permits us to move from reports of direct
observation to statements about unobservables, for we have not directly observed the causal effect
in question. This is precisely what DSI intends with its discussion of inference as moving from
immediate data to “something broader that is not directly observed.” (8). Because the statements
we aim for are about unobservables, the problem of validity looms large, which is why the H-D
method is near-universally considered to be the scientific method.29 Hence, the clarion call of
DSI is to develop concrete and falsifiable theories with as many observable implications as possible.
But, to repeat, the H-D method is clearly understood in the philosophical literature to be a
method of confirmation of one or more hypotheses: those hypotheses do not automatically serve
as explanations or as good explanations. Some confusion on this score is inevitable due to the
structural similarities between the H-D method and the deductive-nomological model of
explanation, advanced by Carl Hempel, which was the leading model of explanation for several decades
before it was completely discredited some forty years ago. The H-D method deduces an
observational prediction from the conjunction of one or more hypotheses and one or more
statements of initial conditions. If the observational prediction is true, then the hypothesis has passed the
test: it has not been falsified and it has received some degree of inductive confirmation. The D-N
model of explanation, on the other hand, deduces an outcome to be explained from the
conjunction of one or more general laws and one or more statements of initial conditions. The
29See, for example, John Earman and Wesley C. Salmon, “The Confirmation of Scientific Hypotheses,” in Wesley C. Salmon, et. al., Introduction to the Philosophy of Science (New York: Prentice-Hall, Inc., 1992), 44.
outcome is explained by subsuming it under the general law, showing it to be a specific instance of
a more general phenomenon known to be true.
Here’s an example. We might test the hypothesis “All celestial bodies follow elliptical
orbits” by adding the initial condition “The earth is a celestial body” and then deducing the
observational prediction, “The earth follows an elliptical orbit.” Observing that the earth does have an elliptical orbit does not make the hypothesis true: the observation, having been deduced from the hypothesis, provides only inductive support for it. But
suppose we accept the hypothesis as true, conferring on it the status of general law. Now we can
explain why the earth has an elliptical orbit: all celestial bodies have elliptical orbits, the earth is a
celestial body, therefore, (we logically deduce), the earth must have an elliptical orbit.
Structural similarities aside, the difference is of acute importance: whereas the
D-N model uses a well-confirmed law to explain an observational statement, the H-D method uses
an observational statement to provide inductive confirmation of a hypothesis that might be used
in future D-N-type explanations. It thus might seem that only a short step separates the H-D
method of confirmation from the D-N model of explanation: this, I suppose, is one reason that
confirmation and explanation are so often conflated by those who accept the D-N model of
explanation as legitimate, a position I show below is unwarranted. But in fact that step dividing
the two is as large as the gap between a partially confirmed hypothesis and a well-confirmed
general law.
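The structural parallel, and the difference in direction, can be sketched schematically. This is an editor’s sketch of the two schemas as they appear in the philosophical literature: H stands for a hypothesis, C for a statement of initial conditions, O for an observational prediction, L for a well-confirmed general law, and E for the outcome to be explained.

```latex
% H-D confirmation: reason "upward" from a successful prediction.
% From H and C we deduce O; if O is then observed, H receives
% some inductive support -- it is confirmed, not proven.
\[
\frac{(H \wedge C) \vdash O \qquad O \text{ is observed}}
     {H \text{ receives inductive support}}
\]

% D-N explanation: reason "downward" from an accepted law.
% E is explained by deducing it from laws and initial conditions.
\[
\frac{L_1, \ldots, L_k \qquad C_1, \ldots, C_m}
     {\therefore\; E}
\]
```

The premises have the same logical shape; what differs is direction and status: the H-D method moves from an observation back up to a partially confirmed hypothesis, while the D-N model moves from an already well-confirmed law down to the outcome.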
Explanations are inferential, but not all inferences are explanations. One way to think
about this point is that symptoms are inferentially relevant but explanatorily irrelevant.
Cosmologists believe that the universe is expanding as distant galaxies recede at high velocities.
They infer this hypothesis from the Doppler effect; light from these distant galaxies shifts towards
the red end of the spectrum. But nobody believes that the red-shift of these galaxies’ light
explains why they are receding from us. The consensus explanation, rather, is that the “big bang”
that originated the universe sent its parts speeding off in different directions.
A second way to establish the distinction between explanations and inferences is the
problem of temporal asymmetry. From knowledge of initial conditions of the earth, sun, and the
moon, and using the general laws of celestial mechanics, we can predict--infer an unobserved
event--a future eclipse. We can even claim, with confidence, that the antecedent conditions and
general laws explain the eclipse. In this instance, the inference coincides with the explanation.
But consider a slightly modified scenario: from knowledge of the initial conditions of the earth,
sun, and moon, combined with the general laws of celestial mechanics, we can infer that an eclipse
occurred ten thousand years ago. But nobody would claim that the present positions of celestial
bodies explain a past eclipse, as this violates a basic law of causal explanation: causes cannot
follow their effects. Once again we see that while inferences and explanations may overlap, they
are not equivalent sets.
What, then, are the criteria of explanatory goodness? When does a set of hypotheses
explain an outcome? And why does the deductive-nomological model fail as a model of
explanation? These questions all have the same answer. The D-N model fails because it does not
demand incorporation of causal mechanisms: explanations contain causal mechanisms, and good
explanations contain the mechanisms needed to fully connect cause and effect.
The D-N model fails as a model of explanation because many arguments faithfully follow
its form but manifestly fail to explain. At minimum, therefore, the D-N model is not sufficient for
an explanation.30 We have already seen this point with reference to receding galaxies and
eclipses. The counterexamples to the D-N model--arguments that meet its formal qualifications
but perform miserably as explanations--are so numerous that many texts in the literature refer to
them by number or nickname. Consider two more such examples, both courtesy of Wesley
Salmon.31 Mumbling an ancient incantation, I pour salt into a beaker of water; the salt dissolves. I observe that every time someone mumbles the same incantation while pouring salt
into water, the salt dissolves. Note that this example fails miserably to confirm the hypothesis that
ritual incantations are causally related to salt dissolution in water, for any good research design
would notice quickly that salt dissolves in water in the absence of the hex. But no matter for the
D-N model, for the general law “Salt dissolves when poured into water and accompanied by ritual
incantations” is a true general law. Therefore, by the formal structure of the D-N model, the hex
explains the dissolution of the salt. Or, in Salmon’s final example, Mr. Jones explains his failure
to get pregnant by pointing to his taking contraceptives. Again, the explanation is absurd, but it
fits the logical structure of the D-N model, for it contains the general law “Men who take
contraceptives will not become pregnant.” The D-N model failed, and failed miserably, because it
tried to substitute a set of formal requirements for causal mechanisms. We know that water
causes salt to dissolve; we know that men cannot become impregnated; we know that both the
30I would also argue that it is not necessary to explanatory goodness, but this ancillary point need not detain us here.
31See his “Four Decades of Scientific Explanation,” in Philip Kitcher and Wesley C. Salmon, eds., Minnesota Studies in the Philosophy of Science, XIII: Scientific Explanation (University of Minnesota Press, 1989), 46-50.
hex and the contraceptives are explanatorily irrelevant. But the D-N model cannot
distinguish between logically sound arguments and causal explanations.
But why must explanations contain causal mechanisms? Why can their function not be restricted to making valid inferences? One answer, given in different forms
by Jon Elster and Charles Tilly, is that relations between independent and dependent variables are
so unstable and indeterminate that knowledge of causal mechanisms is essentially all that we can
know.32 But we need not make that ontological wager to agree that knowledge of causal
mechanisms is critical to our enterprise.33 We ask “Why?” because identifying causal mechanisms
gives us a different type of knowledge than we acquire from estimating causal effects, to use the
language of DSI.34 To seek lawlike generalizations--including those with stable probabilities--
grants us what Salmon calls “nomic expectability.” When we know that X is associated with Y at
probability p, we know only that to observe X is to expect to observe Y with the same probability.
And, as we have seen, that expectation is consistent with multiple formulations of the relationship
between X and Y.
To know causal mechanisms gives us a different type of knowledge, what Salmon calls the
ontic conception of explanatory import. Causal mechanisms tell us not only that something
occurs (with regularity p), but also why it occurs. This is knowledge of how the world works. In
other words, to know causal mechanisms is to take DSI seriously when it defines explanation as
connecting X and Y, not just observing their associations. Only when we know the causal
mechanisms can we claim to have explained a phenomenon.
This function of causal mechanisms is scarcely evident in DSI. Indeed, causal mechanisms
and the concept of causality itself were largely absent from early twentieth-century efforts to
establish the unity of science. To their credit, the authors of DSI explicitly adopt a more modern
notion of causal explanations. But they misidentify the role of causal mechanisms in providing
explanations, and they place far more emphasis on establishing empirical associations than on
32They reach this conclusion by very different routes, however. See Jon Elster, “A Plea for Mechanisms,” in Peter Hedström and Richard Swedberg, eds., Social Mechanisms: An Analytical Approach to Social Theory (Cambridge University Press, 1998), 45-73; and Charles Tilly, “Mechanisms in Political Processes,” Annual Review of Political Science 4 (2001). 33 For the argument that causal mechanisms integrate otherwise disparate knowledge, see David Dessler, “Beyond Correlations: Toward a Causal Theory of War,” International Studies Quarterly 35 (1991): 337-55.
34This section is based on Wesley C. Salmon, “Why Ask “Why,”” in his Causality and Explanation (Oxford University Press, 1998), 125-141.
identifying causal mechanisms and using them in explanations. They evince an allergy to causal
mechanisms in two ways.35 On the one hand, they insist on affording conceptual and
epistemological priority to causal effects over causal mechanisms. On the other hand, they express
considerable skepticism about the feasibility of a mechanistic approach to causality. By the end of
the book, causal mechanisms have been relegated to a distinctively auxiliary role, that of
increasing the number of observations and thus overcoming, in part, the problem of indeterminate
research designs. I expand on each of these points below.
Explanation, DSI avers, consists of “connecting causes and effects.” (34). One might
think that this definition would lead the authors to stress the notion of “connection” in their
discussion of causal explanations. Yet the authors expend considerable energy in chapter 3,
where their main discussion of causality takes place, to define causality in terms of “causal
effects.” A causal effect, they argue, is “the difference between the systematic component of
observations made when the explanatory variable takes one value and the systematic component
of comparable observations when the explanatory variable takes on another value.” (81-82). This
definition has some virtues, in that it manifests keen awareness of the distinction between random
and systematic causes and is clearly rooted in a counterfactual approach to causality. On the
other hand, it substitutes an operational definition--how do we measure causality--for a the
semantic definitions--What does it mean to say that X causes Y--that one typically finds in the
philosophical literature.
Indeed, they go to great lengths to downplay the role of causal mechanisms in scientific
inquiry. First, they insist that their definition of causal effects is “logically prior” to the
identification of causal mechanisms. This is because identifying causal mechanisms requires a
disciplined research design, which in turn requires estimating causal effects--in a sense, measuring
what is to be explained. (86) It is not clear why this makes causal effects “logically prior” to
causal mechanisms: if causal mechanisms exist, then they produce causal effects. Causal
mechanisms thus appear to be ontologically prior to causal effects, and the latter should then be
assigned a second-order, instrumental function. Moreover, as we have seen, the Berkeley team
identified a causal mechanism without a disciplined research design; that identification was a
theoretical enterprise, not an empirical one. Emphasizing causal effects need not and should not
entail discounting the importance of causal mechanisms.
35See also McKeown, “Case Studies and the Statistical World View,” 162-64.
A second way that DSI attempts to discount the importance of causal mechanisms is by
claiming that the requirement of identifying causal mechanisms is self-defeating, for if causal
mechanisms are any causal links between two variables, then the identification of any one causal mechanism creates two new pairs of linked variables, creating the need to identify two more causal mechanisms, leading to infinite regress and a combinatorial explosion of variables and
mechanisms.36 This objection, if true, would seem to be a brief for the total dismissal of concern
for mechanisms, a point that KKV adopt by implication. But the reasoning does not survive
scrutiny: by the same reasoning, after all, we would rediscover Zeno’s paradox and conclude that, because any object in motion must forever cover half of its remaining distance, motion is impossible. The real world does not work like this. Zeno managed to ignore time; KKV ignore the distinction between causal mechanisms and events. They treat everything that temporally mediates two events as a causal mechanism. Causal mechanisms are structures
and entities with the capacity to generate events; they are not equivalent to the events themselves.
Pragmatic considerations intervene, moreover: no reasonable person asks for an explanation that
contains each and every possible detail imaginable. But even if we adopt a more permissive
definition of causal mechanisms, it is not clear that we would always suffer the problem of infinite
regress: Steven Weinberg, for example, once gave a perfectly reasonable explanation for why
chalk is white that reached the sub-atomic level in only five steps.37
Following these objections, KKV drop the subject of causal mechanisms until the book’s
final pages, where the topic is reintroduced only in the context of increasing the number of
observable implications of a theory. “By providing more observations relevant to the implications
of a theory,” they write of the qualitative method of process tracing, “such a method can help to
overcome the dilemmas of small-n research and enable investigators and their readers to increase their confidence in the findings of social science.” (226-28) In other words, causal mechanisms are understood, and given a legitimate role in social science, only as handmaidens of inference:
qualitative researchers should identify and test causal mechanisms not because this is what
adequate explanations demand but because this is what adequate hypothesis testing demands in
36The point here seems to be to discredit the notion of causal mechanisms and downplay their role in explanations: otherwise, the objection of infinite regress would apply equally forcefully--in this case, not much--to DSI as well.
37Weinberg asks a sixth “why” question: why does the world consist of the fields of quarks, electrons, protons and so forth? This question, he admits, is unanswerable. Applying the implicit standards of DSI, quantum physicists explain nothing; if they do not hold that position, then their point about infinite regress simply does not convince. For Weinberg’s exposition, see his Dreams of a Final Theory: A Scientist’s Search for the Ultimate Laws of Nature (Vintage Books, 1992), 21-26.
the absence of the statistical manipulation of data. Once again, we find DSI conflating the
confirmation of hypotheses with the elaboration of explanations.
To be sure, the identification of causal mechanisms does play a role in the confirmation of
hypotheses: James Johnson assigns to causal mechanisms the definitional function of making
theory T “more credible in the sense that [the mechanism] renders the explanations that T
generates more fine-grained.”38 Johnson’s analytic point matches the conclusion we drew from
the example of dinosaur extinctions. But we need not agree that the function of credibility
enhancement exhausts the role of mechanisms; nor, to stick with the question of making
overarching theories more credible, need we subscribe to the claim of DSI that mechanisms make
theories more credible by increasing the number of observations and thus overcoming the small-N
problem, which is not how they were used in the confirmation of the meteorite hypothesis.
Consider: falling barometers are empirically associated with the arrival of storm systems;
but we immediately reject any claim that the relationship is causal in nature, not because of
research-design considerations or the results of a statistical study, but because we know that there
is no mechanism linking falling barometers to storms. Or, to use an example from political
science, after establishing a statistical association between political culture and the durability of
democratic institutions, Ronald Inglehart writes:

    It is conceivable that we have the causal arrow reversed. Perhaps many decades of living under democratic institutions produces greater life satisfaction. We don’t rule this factor out. Indeed, we think it does contribute to overall life satisfaction. But theoretical considerations suggest that the process mainly works the other way around. It seems more likely that a global sense of well-being would also shape one’s attitudes toward politics than that what is experienced in one relatively narrow aspect of life would determine one’s overall sense of satisfaction.39
Inglehart does not reason about causal processes in order to overcome a problem of indeterminate
research design: his statistical methods have presumably dispatched that problem. He thinks
about causal processes in order to better understand the nature of the empirical relationship that
he has already identified. We care about causal mechanisms in these and many other examples not
38James Johnson, “Conceptual Problems As Obstacles to Progress in Political Science,” Journal of Theoretical Politics 15 (2003), 94.
39See his “The Renaissance of Political Culture,” American Political Science Review 82 (December 1988), 1217. This is not to endorse his reasoning: citizens of the non-democracies included in his survey--life-satisfaction levels were taken from 1981--such as South Africa under apartheid and Hungary under communist rule, along with citizens of countries then only recently removed from military rule, such as Spain, Portugal, and Greece, might take exception to his characterization of politics as a “relatively narrow aspect of life.”
to guard against invalid inferences, but to distinguish between well-confirmed accidental
correlations and well-confirmed causal relationships. Thinking in terms of causal mechanisms is
thus not a matter of research design; rather, I argue below, it is a matter of thinking about causal
processes and their modes of propagation.
At this point, DSI might, with some frustration, respond merely by repeating their claim
that even if we upgrade the value of causal mechanisms, identifying causal mechanisms requires
causal inferences. To see why that strategy must fail, let us consider, now, the question of
inferential goodness.
Inferences
“In a world in which almost everything is influenced by many different factors,” Robin
Dunbar reminds us, “confounding variables are the bane of a scientist’s life.”40 Take an event, any
event: it is preceded by and coincident to an astonishingly large number of other events, all of
which are enveloped by a vast coterie of environmental characteristics, qualities, and conditions.
Any standing feature or any episode of discrete temporal change could, in principle, be the cause
of anything else. And as we attempt, however feebly, to measure covariations, we find many,
many things undergoing simultaneous change. The problem with qualitative analysis, from the
perspective of quantitative analysis, is the perceived absence of reliable means to control for these
confounding variables. The quantity of reasonable causes of a phenomenon may quickly
outnumber the cases being studied--too many variables, too few cases is the bumper-sticker sized
statement of this raw, existential condition more formally called the problem of indeterminate
research designs or the “small-n” problem.
DSI is certainly not the first book to argue that statistical analysis is better equipped than
case-study research to handle the problem of confounding variables and thus to issue in
determinate judgments and valid inferences. It differs from these earlier works in three ways: by
forcefully restating the value of the H-D method; by counseling case-study researchers to emulate
the strategies of statistical analysis and thus overcome their inherent methodological weaknesses;
and by articulating these strategies clearly and gesturing at how they can be employed non-quantitatively. There is no inherent reason, the authors argue, why qualitative researchers cannot
define their theories to maximize the number of observable implications which must be seen if the
theory is true; to select carefully their cases with full knowledge of the problems of selection bias;
40The Trouble with Science (Harvard University Press, 1995), 14.
to increase the number of observations, especially by “making many observations from few”; and
to avoid endogeneity, measurement error, and bias in the exclusion of relevant variables.
The Berkeley team was acutely aware of the need to seek observational implications of
their theories; it is precisely this element of their study that set it off from the dozens of earlier
speculative enterprises.41 Yet the Berkeley team spent little time worrying about problems of
indeterminate research designs even though the number of their observations was dwarfed by the
number of existing hypotheses.42 Indeed, the core methodological precepts of DSI played
virtually no role in confirming the hypothesis linking an extraterrestrial impact to the K/T
extinction. Recall from the case study above that only after the discovery of the iridium anomaly
did the Berkeley team begin to consider the possibility of an extraterrestrial event as the cause of
the mass extinction: that discovery was not a test of any specific hypothesis. The key moment in
their decision that the bolide impact hypothesis was vindicated was the discovery of a causal
mechanism with three properties: (1) it was consistent with the iridium anomaly; (2) it could be plausibly linked to the biological extinction; and (3) the causal mechanism linking the impact to the subsequent deaths was consistent with all known scientific laws and models. Reviewing the
process of research and reasoning that supported the impact hypothesis thus reveals that although
the H-D method in its various guises was frequently employed, it was absent at critical moments
of the research cycle, and thus, contingent on two ancillary issues considered immediately below,
it cannot be considered a necessary element of confirmation.
Note further how the Berkeley team explicitly eschewed many of the specific precepts of
DSI. Take, for example, the proposition that qualitative researchers should increase the number
of observations. Observing downstream implications of an independent variable increases the
number of available observations, but it does not contribute to overcoming the problem of
indeterminate research designs in any meaningful statistical sense (and, presumably, a clever
scholar could simultaneously increase the number of observations connected to rival hypotheses
as well). Recalibrating the ratio of variables to cases requires investigating other instances of the
phenomenon in question; in this case, we would expect the Berkeley team to examine other
instances of mass extinction and their relationship to extraterrestrial impacts. This research
41For a thorough review, see Michael J. Benton, “Scientific Methodologies in Collision: The History of the Study of the Extinction of the Dinosaurs,” Evolutionary Biology 24 (1990): 371-400.
42In the example of indeterminate research designs given in DSI, a researcher has seven causal variables and only three observations. Applying this to the Berkeley group, we would have dozens of causal variables and only one observation.
strategy is not ruled out by the uniqueness of the dinosaur extinction: indeed, no scientific
researcher even refers to this event as one of dinosaur extinction, for it involved the extinction of
over forty percent of all genera, making it one of five mass extinctions.43 This fact was explicitly
recognized by the Berkeley group, who began their historic publication by stating, “In the 570-million-year period for which abundant fossil remains are available, there have been five great biological crises, during which many groups of organisms died out.”44 Yet far from worrying
about the problem of indeterminate research design, the Berkeley group ignored these other four
instances of mass extinction and concentrated solely on explaining the K/T extinction.45 This
research strategy explicitly violates one of the most important lessons of DSI: avoid selection bias,
about which the authors warn in stark terms: “When observations are selected on the basis of a
particular value of the dependent variable, nothing whatsoever can be learned about the causes of
the dependent variable without taking into account other instances when the dependent variable
takes on other values.” (129). Not only did the Berkeley group omit study of other instances of
mass extinction or explicit study of non-extinction periods (during which time extraterrestrial
impacts were in fact quite common),46 but they also omitted study of the far more numerous
instances of sub-mass extinctions. As the paleontologist J. John Sepkoski has scrupulously
demonstrated, a histogram of all extinctions evinces a highly skewed distribution, one associated
with power laws, suggesting not only that there is no sharp discontinuity between small and large
extinctions, but also that the mass extinctions may in fact be random events without specific
causes.47 Finally, note that when prior to publication the Berkeley team looked for additional
observations supporting the iridium anomaly, they searched only for iridium anomalies at the K/T
43The best introduction is Stephen Schwartz, Extinction [complete citation].
44“Extraterrestrial Cause,”1095.
45They raised the possibility that extraterrestrial collisions caused all of the mass extinctions, but speculated that some of these collisions would have involved comets that were mostly ice, so that no Ir anomaly would exist. “Extraterrestrial Cause,” 1107.
46See, for example, K.A. Farley, “Geochemical evidence for a comet shower in the late Eocene,” Science 280 (May 22, 1998): 1250-54.
47Power laws inversely relate the magnitude of an event to its frequency: small earthquakes are far more common than powerful ones, and small numbers of species and genera will go extinct far more often than mass extinctions will occur. There are various ways to model this relationship so that mass extinctions do not have causes distinct from small-scale extinctions; if this were true, there would be no reason to look for a unique or rare cause of the K/T extinction. David Raup provides powerful evidence against this line of reasoning in his Extinction: Bad Genes or Bad Luck? (W.W. Norton and Company, 1991).
boundary itself, another instance of selection bias, albeit one whose logic is well-supported by the
research situation.48
The authors of DSI might make three objections to my account of the dinosaur extinction:

• Objection #1: The story I have told refers to “the irrational nature of discovery,” to the process by which theories are generated. The framework of DSI, on the other hand, refers to the evaluation of existing theories.49
• Response #1: The meteorite hypothesis predates the discovery of the iridium anomaly, and the Berkeley group repeatedly discussed and rejected the hypothesis until they came up with a plausible causal mechanism. With that mechanism in place, they claimed to have provided “direct physical evidence” for a “satisfactory explanation.”
• Objection #2: The meteorite hypothesis was only “partially” confirmed by the iridium anomaly. (11)
• Response #2: This objection raises the question of when confirmation occurs, however. While DSI treats the original research as exemplary by its own standards, its authors do not consider the hypothesis to be confirmed, writing (and again conflating confirmation with explanation) that
a hypothesis is not considered to be a reasonably certain explanation until it has been evaluated and passed a number of demanding tests. At minimum, its implications must be consistent with our knowledge of the external world; at best, it should predict what Imre Lakatos refers to as “new facts,” that is, those formerly unobserved. (12)
By these criteria, however, the hypothesis was indeed confirmed. The impact hypothesis was consistent with knowledge of the external world; moreover, it had predicted what Lakatos calls “novel, excess information,” predicting, for example, the presence of an impact crater as well as the existence of extraterrestrial events coincident to other mass extinctions. Lakatos, moreover, always recognized that a theory was to be evaluated relative to rival theories. Although the authors of DSI note the existence of a rival theory--that the K/T extinction was a product of massive volcanic eruptions--they do not reference this rival theory or the problem of theoretical rivals in their discussion of confirmation. Thus, by the abbreviated sketch of confirmation that DSI offers, the hypothesis was indeed confirmed.50
48A global distribution of the iridium anomaly is a necessary condition for there to have been a large bolide impact coincident to the K/T boundary. Selecting on the dependent variable is in fact obligatory under these circumstances. See Brian Skyrms, Choice & Chance: An Introduction to Inductive Logic, 3rd ed. (Wadsworth Publishing Company, 1986), 90-94.
49This defense of DSI from its critics is made most explicitly in their “Importance of Research Design,” 476.
50It is true, of course, that the Berkeley team did not consider the matter settled: they concluded their article with two potential tests of their hypothesis. First, their hypothesis could in the future be tested against the other four major instances of mass extinction; second, they could find the crater associated with the meteor impact (whose size they were able to forecast). Note, by the way, that there was no reason to expect to find that crater: the greater probability was that it would be an oceanic impact, and subduction would have eliminated almost all traces of the pre-Tertiary ocean floor.
• Objection #3: The meteorite hypothesis was not confirmed by the original report; it was confirmed by subsequent research (including the discovery of geological formations believed to be unique to bolide impacts and the impact crater itself) that hewed more closely to the strictures of DSI. The hypothesis was confirmed, in other words, by a retrospectively valid research design produced by an entire scientific community, not a single research team.51
• Response #3: To use subsequent research to render valid the original research goes well beyond the claim made by the authors of DSI, who claim only that a classic single-case study was in fact consistent with their model of good research design because its author was “contributing to a large scholarly literature. As such, he was not trying to estimate a causal effect from a single observation; nor was he selecting on his dependent variable.”52 But as we have seen, the Berkeley group was not contributing to an existing literature; they were trying to estimate a causal effect from a single observation, and they were selecting on the dependent variable. If retrospective and collective research designs validate research, then no research can ever be discredited without full knowledge of future ideas, a condition that Popper has shown to be logically incoherent.
All of these objections, whatever their individual merits, ignore the research strategy that
governed the reasoning of the Berkeley team. The Berkeley team may have neglected to follow
the rules of DSI, but that does not mean that they followed no rules or that the rules they followed
were illegitimate. They explicitly justified reasoning backwards from the iridium anomaly to the
meteorite hypothesis and accepting the latter as provisionally warranted belief prior to the
gradual accumulation of new supporting evidence. In their 1980 Science article, they claim

to present direct physical evidence for an unusual event at exactly the same time of the extinctions in the planktonic realm. None of the current hypotheses adequately accounts for this evidence, but we have developed a hypothesis that appears to offer a satisfactory explanation for nearly all the available paleontological and physical evidence.53
Put differently, the impact hypothesis counts as warranted belief consequent to its meeting
all of the following three conditions:
51“Importance of Research Design,” 477. Note that the subsequent research further confirms that an impact took place; little of it directly bears on the question of the extinction itself, a point we return to below.
52“Importance of Research Design,” 477. The authors are responding to points raised by a number of contributors to a symposium on DSI. While these contributors allow for a collective dimension to research designs, they all cast this quality as scholars “strategically choosing observations based upon knowledge of cases from parallel studies.” The unit of evaluation might thus be the community of scholars and not the individual researcher, a point first raised by Thomas Kuhn’s concept of a paradigm, which governed the beliefs and practices of a community of scholars. This claim is not akin to retrospectively judging a research design sound based on subsequent work, however. The quoted sentence is from David Laitin, “Disciplining Political Science,” American Political Science Review 89 (June 1995), 456.
53“Extraterrestrial Cause,” 1095.
• It accounts for an incredibly significant fact--the iridium anomaly--whose importance outweighs almost every other available piece of data;
• It accounts for that fact better than its rivals, such as the supernova hypothesis; and
• It accounts for the outcome in question by way of a credible causal mechanism.
Emerging from this criterial list are the outlines of an alternative methodology based explicitly
on the search for causal mechanisms. For Richard Miller, confirmation is a process of “fair causal
comparison.” According to Miller’s formal definition, “A hypothesis is confirmed just in case its
approximate truth, and the basic falsehood of its rivals, is entailed by the best causal account of
the history of data-gathering and theorizing out of which the data arose.” To put this point more
colloquially, to claim that my hypothesis has been confirmed is to state, “These are the facts. This
is how they are explained assuming the approximate truth of the favored hypothesis. This is why
they are not explained as well on the rival hypotheses which are the current competitors.”54
Because confirmation is based on a relationship to evidence and to rivals, it must always be tentative, contingent on the availability of data and the state of rivals. A confirmed
hypothesis, in other words, can be dislodged by new evidence or displaced by new rivals.
Acceptance of a hypothesis thus means only that “acceptance is taken to be more reasonable than
rejection, but suspended judgment is not excluded.”55
Miller’s account, however, does not explicitly tie confirmation to explanation: because the
latter is our ultimate goal, and because explanation requires causal mechanisms, we can add to his
account the condition that vindicated hypotheses must be connected to the outcomes they explain by
way of credible causal mechanisms. As the Berkeley team realized, making the search for causal
mechanisms an integral element of the process of confirmation--mobilizing, in other words, the
twin functions of causal mechanisms, as means for enhancing credibility of hypotheses and as
explanatory devices--vastly expands our repertoire for engaging in fair, causal comparison. In the
following list, to say that a theory is rejected is to render its acceptance less reasonable than a
competitor which does not suffer an analogous problem.

• A hypothesis can be rejected because its posited causal mechanism is considered
inconsistent with generally accepted principles and thus implausible. This was the case, to give just one example, with the rejection of the initial formulation of plate-tectonics theory. It is also the reason given for the rejection of functionalist models of social change.
• A hypothesis can be rejected because it does not logically imply the outcome attributed to it. The observed association between advanced levels of economic
54Miller, Fact and Method, 155, 163.
55Miller, Fact and Method, 158.
modernity and democracy might plausibly explain why democracies in wealthy countries survive without explaining why wealthy countries became democratic.56
• A hypothesis can be rejected because its posited causal mechanism is considered conceptually inadequate so that it provides no insight into “how things work.” This form of rejection is much stronger when the hypothesis in question is the latest version of a research programme that has long suffered this problem. James Johnson levels this charge against political-culture research, for example.57
• A hypothesis can be rejected because it is considered to lack causal depth: it may be considered definitionally sufficient for the outcome in question (candidate X won because more voters cast ballots for her); it may be considered part of the normal course of affairs (bridges may collapse while cars drive over them, but the cause is presumably a structural defect, not the cars that the bridge was designed to support); or it may be subsumed by a hypothesis that lies further back on a causal chain (the East Asian financial crisis of the late 1990s was triggered by a run on local currencies; it was caused by a syndrome of structural imbalances).58
• A hypothesis can be rejected because its emphasis on large-scale causes of large-scale effects is demonstrably invalid in circumstances in which small-scale causes can have large effects or in circumstances in which stochastic processes follow power laws, usually producing small effects but occasionally producing large effects whose magnitude is inversely related to their frequency.59
• A hypothesis can be rejected because its truth implies necessary observations that cannot be made by credible techniques. This was the means by which the Berkeley team rejected the otherwise plausible inference that the iridium anomaly was produced by a supernova explosion. The absence of long-term economic convergence on a global scale has similarly discredited many simple models of economic growth, while the presence of a handful of rapid developers has been held to discredit dependency theory.
• A hypothesis can be rejected, finally (but perhaps not conclusively), because it is shown to rest on an invalid inference stemming from a faulty research design.
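The power-law point in the list above can be made concrete with a short simulation (a sketch of my own, not drawn from the chapter’s sources; the distribution parameters are purely illustrative). Magnitudes drawn from a power law are overwhelmingly small, yet rare, enormous events do occur--frequency falls as magnitude rises:

```python
import random

random.seed(42)

# Sample event magnitudes from a power law, P(X > x) = (x / X_MIN) ** -ALPHA,
# using inverse-transform sampling: x = X_MIN * (1 - u) ** (-1 / ALPHA).
ALPHA, X_MIN, N = 1.5, 1.0, 100_000
events = [X_MIN * (1 - random.random()) ** (-1 / ALPHA) for _ in range(N)]

small = sum(1 for x in events if x < 10)     # routine, small-magnitude events
huge = sum(1 for x in events if x >= 1000)   # rare, catastrophic-magnitude events

# Small events vastly outnumber huge ones, yet huge ones occur without any
# distinct cause -- the stochastic process alone produces both scales.
print(f"{small} small events, {huge} huge events out of {N}")
```

On this kind of model, a mass extinction needs no cause different in kind from routine background extinction, which is exactly the possibility Raup’s evidence (footnote 47) is marshaled against.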
As this list demonstrates, there is more than one way to engage in fair, causal comparison,
only some of which are covered by the regulatory framework of DSI. The list demarcates a core
56Adam Przeworski et al., Democracy and Development: Political Institutions and Well-Being in the World, 1950-1990 (Cambridge University Press, 2000).
57Johnson, “Conceptual Problems.”
58Particularly good on the pragmatics of explanation is John Gerring, Social Science Methodology: A Criterial Framework (Cambridge University Press, 2001), 90-99.
59For the argument that non-linear models of change discredit a diverse array of existing approaches to political science, see Alan Zuckerman, “Reformulating Explanatory Standards and Advancing Theory in Comparative Politics,” in Mark Irving Lichbach and Alan S. Zuckerman, eds., Comparative Politics: Rationality, Culture, and Structure (Cambridge University Press, 1997), 277-310. For reasons to be skeptical, see David Waldner, “Anti Anti-Determinism,” paper presented to the Annual Convention of the American Political Science Association, Boston, MA, September 2002, esp. sections 5 and 6.
working model, one that permits us to believe in some ideas because they appear, by current
standards and knowledge, superior to their rivals. It is thus reasonable to commit to them, even
tentatively and in full knowledge that superior alternatives might yet emerge. We reach this
judgment, moreover, in diverse ways. Statistical studies and qualitative studies striving to mimic
statistical exactitude are powerful members of our methodological ensemble, but they have
valuable accomplices whose contributions should be neither overlooked nor slighted. Those allies
are largely based on the consideration of causal mechanisms, and many of those considerations are
conceptual/theoretical, not empirical, in nature.
Within the social sciences, a growing number of scholars have considered analogous
methodological alternatives, loosely grouped together under the name “process-tracing.”
Sometimes lost in this important literature, however, is the distinction between confirming a
hypothesis against the evidence and confirming a hypothesis by eliminating rival hypotheses
against the evidence. Fair, causal comparison always demands both elements of inquiry.
Consider, for example, Jack Goldstone’s metaphor of the detective who

will draw on her experience with similar cases in making judgments about which factors pertain to this particular case (inductive insights); but the causal reasoning proposed to explain a crime or accident will be a linking of particular facts involved in this case with general principles regarding how opportunities, motivations, and circumstances conduce to particular action (deductive reasoning).60
The problem with this approach is that it relies too heavily on a confirmationist strategy of
finding evidence consistent with a hypothesis, and does not explicitly incorporate an
eliminationist strategy of rejecting alternative hypotheses.61 Changing the metaphor a bit,
scholars must simultaneously act as prosecutors and defense attorneys. Timothy
McKeown’s version of process tracing makes clear that detectives build cases in large part
by eliminating rival explanations:
An observation may be consistent with several different hypotheses about the identity of the killer and rules out few suspects. No one observation establishes the identity of the killer, but the detective’s background knowledge, in conjunction with a series of observations, provides the basis for judgments that generate or eliminate suspects...Rival theories are assessed and disposed of, generally by
60Jack A. Goldstone, “Methodological Issues in Comparative Macrosociology,” Comparative Social Research 16 (1997), 113.
61Goldstone explicitly rejects any role for Millian methods in his approach, even though these methods are best understood not as means to generate a theory but rather as means to falsify alternatives.
showing that they are not successful in accounting for all the observations. The suspect may attempt to argue that it is all a coincidence, but the detective knows that someone has to be the killer and the evidence against the suspect is so much stronger than the evidence against anybody else that one can conclude beyond a reasonable doubt that the suspect should be arrested.62
Fair, causal comparison, then, always involves the simultaneous confirmation of
one hypothesis, by showing it to be consistent with the data and the current state of
theorizing, and the elimination of rival hypotheses. Some of this work is done by
evaluating the fit between hypothesis and evidence; some of it is done by independent
evaluation of a theory’s causal mechanisms. As the list enumerated above indicates, there
are diverse specific techniques for engaging in fair, causal comparison. These techniques
mean that fair, causal comparison can distinguish valid from invalid process tracing.
Consider, as an illustration, an example of process tracing from a book that has
won wide acclaim for its methodological sophistication and its attention to causal
mechanisms, Robert Putnam’s Making Democracy Work.63 He uses process tracing to
explain why northern Italian citizens possess civic culture, a virtue lacking in southern
Italians that Putnam has argued is causally related to superior performance in northern
political institutions. By this account, ordinary citizens partaking of their regional civic
culture enact their own institutional fates. Thus in the nineteenth century, mutual aid
societies and other forms of voluntary organization proliferated in the north but were
starkly absent in the south, where, according to an historian quoted by Putnam, “The
peasants were in constant competition with each other for the best strips of land on the
latifondo, and for what meagre resources were available. Vertical relationships between
landlord and client, and obsequiousness to the landlord, were more important than fixed
solidarities.”64
Process tracing thus seems to confirm Putnam’s account: long-standing cultural
differences are associated with different patterns of behavior and different institutional
62“Case Studies and the Statistical Worldview,” 170-71. Substituting the metaphor of a crossword puzzler for the detective, Susan Haack makes a remarkably similar case in her essay “Puzzling out Science,” contained in her Manifesto of a Passionate Moderate (The University of Chicago Press, 1998), 95.
63Robert Putnam, Making Democracy Work: Civic Traditions in Modern Italy (Princeton University Press, 1993). DSI treats the book as a methodological exemplar of the possibilities of combining quantitative and qualitative research. More skeptical on this score is Sidney Tarrow, “Bridging the Quantitative-Qualitative Divide in Political Science,” American Political Science Review 89 (June 1995), 471.
64 Making Democracy Work, 143, quoting the historian Paul Ginsborg who is in turn citing an Italian scholar.
outcomes. Yet Putnam does not consider any alternative hypothesis, even as he provides
clear evidence that cultural values did not exist autonomous of other structures and that
both culture and institutions were induced by the broader socioeconomic context. Thus,
as aristocratic rule in the north was declining,

From 1504 until 1860, all of Italy south of the Papal States was ruled by the Hapsburgs and the Bourbons, who (as Anthony Pagden has recently described in detail) systematically destroyed horizontal ties of solidarity in order to maintain the primacy of vertical ties of dependence and exploitation.65
The use of power--coercive and otherwise--to prevent peasants from achieving social
solidarity that might be used to challenge upper-class hegemony did not end with
unification, for post-1860 “The southern feudal nobility...used private violence, as well as
their privileged access to state resources, to reinforce vertical relations of dominion and
personal dependency and to discourage horizontal solidarity.”66
Thus, Putnam’s historical sketch provides two very different images. In one
depiction, peasants mistrust one another and anxiously seek patronage from local elites,
thus creating and recreating relations of dependency inconsistent with and injurious to
civic culture: this is the genuine cultural interpretation that Putnam wishes to vindicate.
But in the second image, elites--who have no northern analogue--themselves create and
recreate these vertical ties and actively discourage horizontal solidarity among lower
classes. The question is, if southern peasants were culturally hostile to horizontal norms
of solidarity and engagement, why did elites so persistently feel the need to maintain
vertical ties and destroy alternative horizontal ones? Putnam’s own sources suggest this
alternative reading which Putnam takes no measures to reject.67 On the contrary, his own
65Making Democracy Work, 136.
66Making Democracy Work, 145. Putnam follows this material with the claim that southern peasants made recourse to patron-client relations, a rational decision given their wretched position. But that wretched position was forced upon them; it seems that it was power relations and not culture that created vertical relationships. Were culture autonomously generative of such relations, upper classes would not themselves need to make recourse to violence.
67One might respond that Putnam’s historical sources did not take the form of credible rival hypotheses: but these historical materials are fully consistent with theoretical accounts that should be considered rival hypotheses, most notably Antonio Gramsci, Selections from the Prison Notebooks (International Publishers, 1971) or James Scott, Domination and the Arts of Resistance: Hidden Transcripts (Yale University Press, 1990). For an exemplary attempt to adjudicate between Gramsci and Scott, which act as rival accounts to one another, see Susan Stokes, Cultures in Conflict: Social Movements and the State in Peru (University of California Press, 1995).
interpretation consistently highlights the independent agency of southern peasants while
completely obscuring the agency of southern landowners and their allies. In Putnam’s
account, it is culturally deprived peasants who willingly create their own relations of
exploitation.68
The differences between the work of the Berkeley team and Putnam’s research
team, with the first engaging in fair causal comparison while violating many norms of DSI and the second adhering to DSI while avoiding fair causal comparison, illustrate why fair,
causal comparison must render methodological pluralism more credible than
methodological unity: there are multiple ways to engage in fair, causal comparison, only
some of which are captured by the rules and regulations contained in DSI. The point is
not that research design considerations are dispensable: the point, rather, is that efforts to
confirm propositions and use them in explanations need not be based solely on the logic of
statistical inference. Statistical inferences, after all, are largely unconcerned with causal
mechanisms: consequently, the inferences that result cannot explain. Giving attention to
causal mechanisms not only permits us to distinguish explanatory from non-explanatory
inferences, but it also gives us diverse means to engage in fair causal comparison.
Some of the research designs that result will precede gathering data; at other times,
scholars will approach existing data with a disciplined plan to adjudicate debates. The
latter strategy is in no way inferior to the former: DSI is right to worry about the
confirmationist bias, but fair causal comparison is adequately equipped to manage these
concerns, as the critical reflections on Putnam’s process tracing indicate.
The basic rule is this: if existing data can be interpreted to vindicate a hypothesis
against its rivals in a process of fair, causal comparison, then no further data gathering is
necessary. If existing data cannot accomplish this task because the existing data is
consistent with multiple hypotheses, then new research is needed and that research must
be guided by a research design oriented explicitly toward the current contributors to the
theoretical controversy. What this means is that the appropriate research strategy is
always a function of the existing state of knowledge--the data that is available, the
controversies it was gathered to adjudicate, and the state of theoretical
68Making Democracy Work, 177-78.
contestation.69 Under this rule, the Berkeley team made a reasonable claim to have
explained the mass extinction at the K/T boundary. The unexpected discovery of the
iridium anomaly--a discovery which was not pursuant to the test of any specific theory and
which was not followed by a new, systematic research design--provided sufficient data to
vindicate the impact hypothesis because it was made at a time of an incredible proliferation
of explanations, none of which commanded any evidentiary warrant. The discovery itself
was sufficient to eliminate most rival explanations; a simple test led to the rejection of one
more potent rival; and the anomaly, when coupled with a plausible causal mechanism, was sufficient to
confirm, in the sense discussed above, the impact hypothesis.70
Note, finally, that fair, causal comparison requires careful attention to the history
of data gathering. Sometimes we have to think about how our evidence has been
obtained, for that history might sensitize us to biases in the evidence itself. One of the main
concerns of DSI is how we select our data; Stanley Lieberson makes the very different
point that we must learn to think about the causal processes producing our data.71
Theory-confirming or theory-informing observations, in other words, do not exist in a
theoretical vacuum. The impact hypothesis, for example, implies that mass extinction was
a simultaneous event, at least on a geologic scale. Fossil evidence supporting this claim
may be misleading, because sometimes erosion forms gaps in the geologic record, and
missing stratigraphic units give the appearance of sudden extinctions. But despite this
bias in favor of the hypothesis, the fossil record does not directly support the impact claim,
most centrally because the existing fossil record suggests that dinosaurs were already in a
steep demographic decline by the middle of the Cretaceous period, leaving few to be killed
by a meteor impact. There is good reason to believe, however, that the inference is a
faulty one caused by unavoidable sampling bias. Because our fossil record documents
only a small percentage of all fossils present in a rock formation (which in turn, given the
numerous hurdles to be overcome before remains become fossilized, is itself only a sliver
69The authors of DSI allow for this when they admit that a single case study can have enormous ramifications when it contributes to existing research and attendant debates. The problem is that they provide no logic for understanding this eventuality.
70That is to say, confirmation could be applied in 1980 and withdrawn subsequent to the future elaboration of a plausible rival hypothesis such as the volcano hypothesis.
71Stanley Lieberson, Making it Count: The Improvement of Social Research and Theory (University of California Press, 1985).
of the true population), the rarer a species, the more imperfect our sampling of it. Because we
are then less likely to find its fossils at a particular point like the K/T
boundary, we are more likely to conclude that the species died out much earlier. This is why
we are sure that plentiful foraminifera did largely go extinct at the K/T boundary, but the fossil
record for less numerous dinosaurs is far more ambiguous. This defense of the impact
hypothesis, known as the Signor-Lipps effect, receives support from some ingenious
sampling techniques involving species well-known to have gone extinct simultaneously.72
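The sampling logic just described can be sketched in a few lines of code (an illustrative simulation of my own, with made-up preservation rates, not figures from the paleontological literature). Two taxa vanish at exactly the same horizon; only the rate at which they leave recoverable fossils differs:

```python
import random

random.seed(0)

BOUNDARY = 1000           # true, simultaneous extinction horizon (the "K/T boundary")
STRATA = range(BOUNDARY)  # stratigraphic levels below the boundary

def last_appearance(preservation_rate):
    """Highest stratum at which the taxon is actually sampled as a fossil.

    Both taxa are alive at every stratum up to the boundary; only the
    probability of leaving a recoverable fossil differs.
    """
    found = [level for level in STRATA if random.random() < preservation_rate]
    return max(found) if found else None

forams = last_appearance(0.5)       # abundant taxon: fossils at half the strata
dinosaurs = last_appearance(0.005)  # rare taxon: fossils at 1 in 200 strata

# The abundant taxon's record reaches (nearly) to the boundary; the rare
# taxon's record typically stops well short of it, mimicking an earlier,
# gradual extinction even though both died out at the same instant.
print(f"last foram fossil at level {forams}, last dinosaur fossil at level {dinosaurs}")
```

The gap between a rare taxon’s last recorded fossil and its true extinction level is exactly the artifact that makes the dinosaur record look like a long pre-boundary decline.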
Inference and Explanation at the K/T Boundary...and Beyond

Analysis of causal mechanisms can lead to the confirmation of hypotheses, in part
by disconfirming rivals: this lesson has large implications for how we think about
confirmation for it suggests that the rules of DSI may be of central significance without
being necessary components of every act of confirmation. I have also argued that causal
mechanisms are necessary to adequate explanations: their importance, in other words,
goes beyond abetting inferential goodness. Now it might seem that in order to aid the
process of making valid inferences, causal mechanisms must be correctly identified; and
that correctly identified causal mechanisms constitute good explanations; and so the twin
role of mechanisms--confirmation and explanation--are in fact, contrary to what I have
argued, equivalent. This is not the case: correctly identified causal mechanisms constitute
adequate explanations only if they span the gap between cause and effect. Adequate
explanations, more often than not, require causal chains. To see why this is so, let us
briefly examine why the impact hypothesis can be a confirmed hypothesis but a poor
explanation.
That a large meteor struck the earth at the K/T boundary is beyond dispute. This
does not mean that the impact caused the mass extinction marking the end of the
Cretaceous period. To make that claim, we must rule out alternative hypotheses and
connect cause and effect in a chain of causal mechanisms. Without entering the full debate
between the impact hypothesis and its volcanic rival, let us concede the point to the impact
side and ignore the issue of rivals.73 How well does the impact hypothesis explain? This
72The Signor-Lipps effect is discussed lucidly in James Lawrence Powell, Night Comes to the Cretaceous: Dinosaur Extinction and the Transformation of Modern Geology (W.H. Freeman and Company, 1998), 130-41.
73For the full set of reasons to discount heavily the hypothesis that the K/T extinction was caused by huge volcanic outpourings on the Indian subcontinent, see Powell, Night Comes to the Cretaceous, 85-95.
question breaks down into sub-questions: What are the geological and biological
mechanisms that resulted in the mass extinction and by what other mechanisms were they
linked? And what explains the actual pattern of extinctions? Why, in other words, did
some genera become extinct while others did not?74
The original mechanism posited by the Berkeley research team was based on an
extrapolation from the explosion of the Krakatoa volcano in 1883. That explosion had
kicked up enough dust and ash to alter global atmospheric conditions for months; expand
the scale of those effects to the size of a catastrophic impact, they reasoned, and dust in the
air would kill plant life, leading to a collapse of the food chain and mass extinction.
Observations made in 1994 when Shoemaker-Levy 9 struck Jupiter support calculations
that, given its tremendous speed, the meteorite would have carried with it enormous
energy, far greater than that contained in the global supply of nuclear weapons, energy
that would have to be dissipated post-impact. The predicted effects include “shock waves,
tsunamis (tidal waves), acid rain, forest fires, darkness caused by atmospheric dust and
soot, and global heating or global cooling.”75 There is evidence for a variety of these
post-impact scenarios, including global wildfires, acid rain, and a decade-long “impact”
winter.76 But even book-length, enthusiastic defenses of the impact hypothesis devote
startlingly little space to fleshing out these scenarios, more typically concluding that “we
know that [the impact] must have had some combination of the effects described. What
we do not know is just how the many lethal possibilities would have interacted with each
other and with living organisms.”77 Indeed, uncertainty remains whether the extinction
was virtually instantaneous or extended over thousands of years.78
74We know that answering this last question requires moving beyond geologic and environmental mechanisms to investigate genera-specific biological factors. The two dinosaur groups suffered extinction of all twenty-two of their genera. Given an overall extinction rate of 43 percent of all genera, the probability that every genus of the dinosaurs would go extinct by chance alone is virtually nil. For the ingenious reasoning, see David Raup, Extinction: Bad Genes or Bad Luck? (W.W. Norton & Company, 1991), 88-105.
75Raup, Extinction: Bad Genes or Bad Luck?, 161.
76See the brief discussion in Powell, Night Comes to the Cretaceous, 176-79.
77Powell, Night Comes to the Cretaceous, 179.
78Early in the debate the Berkeley group acknowledged that extinctions might have been spread over as much as 10^4 to 10^5 years, claiming only that “a major impact would produce important environmental changes and that instantaneous extinction in all groups is not a necessary corollary of the impact theory.” See their “The End of the Cretaceous: Sharp Boundary or Global Transition?” Science 223 (16 March 1984), 1184.
With that last point in mind, we turn next to the variable biotic responses to the K/T
environmental disturbances. Given catastrophic environmental changes, we might expect
uniform rates of extinction across all genera, but this is not what is observed for this or
any of the five major mass extinctions. Thus, we have two specific questions concerning
biological mechanisms of extinction: what mechanism led to the mass killing, and why did
it kill some genera but not others? These questions have prompted even firm supporters
of the impact hypothesis to conclude that the theory contains some “puzzling features,” in
the words of Richard Fortey. “There are many animals and plants that did survive,”
Fortey continues, “and somehow it does not seem satisfying to call them ‘lucky ones’ and
leave it at that. Their survival should chime in with the fatal scenario.”79
Others have drawn even stronger conclusions. Responding to what has been called the
“Dante’s Inferno” scenario of a broad cluster of environmental catastrophes subsequent to
the impact, William Clemens states,
I think the results of studies of patterns of survival and extinction of terrestrial vertebrates fully falsify the hypothesis that an impact caused the terminal Cretaceous extinctions of terrestrial vertebrates through the series of environmental catastrophes embodied in the “Dante’s Inferno” scenario. Ancestors of groups that are today known to be unable to tolerate major climatic change, such as frogs, salamanders, lizards, turtles, and birds, survived whatever caused the extinction of the other dinosaurs.80
Clemens is not rejecting the impact hypothesis in toto. Rather, he is pointedly reminding
us that extinction is ultimately a biological phenomenon, and along with verifying the
geological consequences of the impact, we need to specify carefully the biological causal
mechanisms.
Defenders of the impact hypothesis treat the call for specified mechanisms as an
unreasonable assault on the impact hypothesis itself, which they insist has met many tests
and failed none of them and so deserves to be considered corroborated.81 This response is
79Richard Fortey, Life: A Natural History of the First Four Billion Years of Life on Earth (Alfred A. Knopf, 1998), 253. Fortey stresses the anomalous survival of insects which have an annual life cycle and rely on live plants for food and shelter and so should not have survived even a decade-long environmental catastrophe. Other puzzling aspects of survival include bony fish, coral, and birds, which are now widely recognized to be direct descendants of dinosaurs.
80“On the Mass Extinction Debates: An Interview with William A. Clemens,” in William Glen, ed., The Mass-Extinction Debates: How Science Works in a Crisis (Stanford University Press, 1994), 245-46.
81Powell, Night Comes to the Cretaceous, 179-80.
based on the conflation of confirmation and explanation. We can agree that the impact
hypothesis has been confirmed--the impact itself occurred and we have good reason to
believe it was the ultimate cause of extinction--yet still maintain that we do not yet possess
an adequate explanation of how the dinosaurs and other genera became extinct. Writing
recently in the journal Paleobiology, Norman MacLeod characterizes the physical
evidence for the bolide impact hypothesis as “overwhelming” and the hypothesis itself as
“fully proven, though a number of interesting subsidiary controversies still exist.”
MacLeod rightly insists that the issue at hand is one of standards of explanatory adequacy,
as the specification of precise extinction mechanisms is an indivisible part of explaining any mass extinction event. Just as geologists remained skeptical about continental drift until a precise causal mechanism...was proposed...paleontologists will remain skeptical about the connection between impacts and extinctions until precise biological/ecological mechanisms are proposed that uniquely account for observed taxic patterns and the stratigraphic timing of K/T extinction and survivorship.82
Inferential goodness, we must conclude, is not equivalent to explanatory goodness.
While the specification of causal mechanisms can support inferential goodness, for reasons
discussed at length above, identifying causal mechanisms is not sufficient for adequate
explanations. Instead, when it comes to explanations, we should become greedy: we
should expect a full and complete causal chain, tightly linking cause and effect. Let’s
consider an example from contemporary social science, turning again to Robert Putnam’s
Making Democracy Work. Since we want to focus on the question of explanatory
goodness, let us ignore the issues raised above and concede the inferential validity of the
work: northern Italy enjoys high civic culture and well-performing institutions while
southern Italy lacks civic culture and suffers ill-performing institutions. Why is this?
What is the causal link between civic culture and high-performing institutions? To his
credit, Putnam addresses this issue head on, and his provision of a credible causal
mechanism has led reviewers to praise the book highly. In a phrase, the answer is social
capital: northern Italians have developed norms of reciprocity and mutual collaboration for
the collective good, permitting them to engage in collective action which in turn reinforces
the norms of reciprocity; southern Italians, mired in distrust, cannot engage in collective
82Norman MacLeod, “K/T Redux,” Paleobiology 22 (1996), 315.
action. These two outcomes are both stable equilibria, one producing fortunate outcomes,
the other producing gross misfortune.
For Putnam, the lesson is clear: “These contrasting social contexts plainly affected
how the new institutions worked.”83 But just as the paleontologist MacLeod asked for
biological mechanisms to go along with geological ones, we might ask Putnam for social
and institutional mechanisms to complement and complete the causal chain that begins
with social capital. These mechanisms have to be shown to be consistent with the
institutional record. The image that Putnam conveys is that when faced with poor
institutional performance, northern Italians band together and demand better performance,
while southern Italians respond to poor performance lethargically, if at all. One would
then expect some evidence of this propensity for collective action on behalf of better
institutions, evidence which is strikingly absent given Putnam’s quarter century of “poking
and soaking” in Italy. Indeed, Putnam spends exactly two paragraphs (just as the
Alvarez group devoted most of their original article to making the case for an impact and
minimal effort to connecting the impact to the extinction) directly confronting the causal link
between democratic governance and a vigorous civil society: On the demand side, citizens in civic communities expect better government and (in part through their own efforts), they get it. They demand more effective public service, and they are prepared to act collectively to achieve their shared goals. Their counterparts in less civic regions more commonly assume the role of alienated and cynical supplicants.
On the supply side, the performance of representative government is facilitated by the social infrastructure of civic communities and by the democratic values of both officials and citizens. Most fundamental to the civic community is the social ability to collaborate for shared interests. Generalized reciprocity...generates high social capital and underpins collaboration.84
I find three points of interest in this handful of sentences. First, citizens in the
north of Italy not only expect better government, but they get it “in part through their own
efforts.” Again, we have no evidence that they make those efforts, only a theoretical
argument that they should be disposed to making those efforts. Moreover, by
parenthetically claiming that citizens’ efforts are only part of the story, Putnam raises the
possibility of causal incompleteness. Second, in the following sentence we are told that
83Making Democracy Work, 182.
84Making Democracy Work, 182-83.
northern Italians “are prepared to act collectively to achieve their shared goals.”
Everything about the argument to that point prepares us to anticipate that citizens do act,
even if this is only “part” of the overall story: now we find that they have a latent
predisposition to act, but do not necessarily make it manifest. Finally, there is a supply
side to go along with the demand side: presumably, this is what Putnam means when he
attributes better outcomes “in part” to the demand-side actions of citizens. The supply
side implies, without stating it clearly or providing any evidence, that public officials in
northern Italy are, relative to their southern counterparts, either more inclined to provide
good public service, or more capable of providing good public service, or both. But this
raises a question: if officials are disposed and/or equipped to provide good public service,
what precisely is the role of the demand side? Why must northern citizens (be prepared
to) act collectively on behalf of good public service if their public officials are
independently prepared to provide those services? Logically, one might think that the
demand side, which now looks far more anemic than much of the prior 181 pages had led
us to believe, would be necessary only if the supply side did not exist. It is quite possible
that these logical relations could be sorted out and that evidence could be provided for the
final version of the causal chain; until that is accomplished, we should be skeptical of the
connections between civic culture and institutional performance. Even if we accept as
confirmed the inference that it is civic culture which most centrally distinguishes the two
halves of Italy (and there are independent reasons to not accept it), we still must conclude
that inferential goodness is not accompanied here by explanatory goodness.
Conclusion
Ever since Hume, philosophers and empirical scientists have recognized that causal
explanations are based on inferences. Not all inferences are explanations, however, and
many explanatory propositions--ones that contain at least a sketch of the relevant causal
mechanisms-- perform poorly as explanations. This essay has sought to demarcate
boundaries between inferences, explanations, and adequate explanations. Doing so, I have
argued, raises serious doubts about the feasibility of methodological unity. Despite the
powerful arguments contained in DSI, I have argued that a framework for inferences does
not function doubly as a framework for explanations; the DSI framework thus cannot be
considered sufficient for generating good explanations. The means by which good
explanations are generated are causal mechanisms: once we fully assimilate that point, we
find there are multiple and diverse ways to infirm or confirm hypotheses, many of which
are not even implicit in the framework of DSI: that framework thus cannot be considered
as necessary for explanatory adequacy either.
It might be objected that the sorts of complete explanations I advocate here are neither
feasible nor desirable. This seems to be the position of the authors of many statistical
textbooks, who apparently agree that Social science explanations are, at best, partial theories that indicate only the few important potential influences. Even if one could establish the list of all influences, this list would be sufficiently long to prohibit most analysis. Generally, researchers are trying to estimate accurately the importance of the influences that are central to policy making or to theoretical developments.85
Paul Humphreys, a philosopher who has written widely on the philosophy of explanation,
characterizes such incompletely specified explanations as “explanatorily
informative,” arguing that incomplete explanations are not necessarily untrue ones, and
that explanatory incompleteness has in no way hindered progress on many important
scientific and public-policy issues.86
These objections, I think, only vindicate my point. Different communities of
scholars may in fact have very different yet non-competitive standards of explanatory
adequacy. Indeed, the authors of DSI recognize this very point. On the book’s first
page, they write of two “styles” of research. Quantitative research uses numbers to
measure “specific aspects of phenomena [and] it abstracts from particular instances to seek
general description or to test causal hypotheses...” Qualitative research, on the other
hand, tends “to focus on one or a small number of cases...and to be concerned with a
rounded or comprehensive account of some event or unit.” (3-4). KKV claim that these
matters of style are “methodologically and substantively unimportant.”
But if qualitativists are licensed to desire “rounded or comprehensive accounts,”
then the standards to which I hold them, with all of their methodological implications, are
commensurately legitimate. Quantitativists may deny the centrality of causal mechanisms
85Eric A. Hanushek and John E. Jackson, Statistical Methods for Social Scientists (Academic Press, 1977), 12, as cited in Lieberson, Making it Count, 186.
86Paul Humphreys, The Chances of Explanation: Causal Explanation in the Social, Medical, and Physical Sciences (Princeton University Press, 1989).
and thus reject the implications I draw from consideration of those mechanisms, but that
may simply be because their predisposition toward abstracting from particular
instances leads them to discount the value or the feasibility of such comprehensive
accounts. Style, it may turn out, matters greatly.
True, giving comprehensive causal accounts may be desirable but not feasible. It
may be the case that phenomena are sequelae to enormous numbers of inter-correlated
causal influences, so that comprehensive and fully specified lists are unattainable. And it
might be the case that the sort of definitive tests of hypotheses represented by the
Berkeley team’s confirmation of the bolide-impact hypothesis are equally unattainable in
the social sciences, where even a defender of case-study methods like Stephen Van Evera
claims that “Most predictions have low uniqueness and low certitude.” When passed, they
do not rule out rivals (low uniqueness), but failing a test leaves relatively unharmed
propositions that make only probabilistic predictions (low certitude). Most social science,
Van Evera avers, consists of “straw-in-the-wind tests.”87

It is perfectly legitimate to accept any one or all of these wagers, to settle on an ontological and epistemological position. But a state of great uncertainty, in which we cannot be sure of any causal relationships, let alone the underlying causal structure of the world, might also prompt some agnosticism, or at least a rational desire to diversify one’s methodological portfolio. Until it can be proven that explanatory adequacy as discussed here is a chimera, the philosophical position outlined here is valid and its methodological implications follow. Methodological pluralism cannot be defeated by assuming a world that would deliver that defeat.
87Stephen Van Evera, Guide to Methods for Students of Political Science (Cornell University Press, 1997).