Department of Political Science
Annual Review of Political Science 20 (May/June 2017)
Draft: 30 November 2015
Estimated pages: 23
Please do not cite without permission
Qualitative methods, broadly construed, extend back through the foggy mists of time to the very
beginnings of social and political analysis (however that might be dated). Self-conscious reflection
on those methods is comparatively recent. The first methodological statements of contemporary
relevance grew out of the work of logicians, philosophers, and historians in the nineteenth
century, most importantly J.S. Mill (1843). To be sure, these scholars were on a quest for science,
understood as a unified venture. So the notion of a method that applies only to qualitative data
would have made little sense to them.
At the turn of the twentieth century, a bifurcation appeared between quantitative and
qualitative methods (Platt 1992). The natural sciences, along with economics, moved fairly
quickly and without much fuss into the quantitative camp, while the humanities remained largely
qualitative in orientation. The social sciences found themselves in the middle – divided between
scholars aligned with each camp, and some who embraced both. For this reason, the qual/quant
distinction has assumed considerable importance in these fields, and very little importance
outside these fields.
Perhaps it is not coincidental that the quest for a “method” of qualitative inquiry has
proceeded further in the social sciences than in the humanities. And among the social sciences
one might argue that political science has gone further than any other in developing the field of
qualitative methods. Accordingly, this review article focuses primarily on work produced by
political scientists, with an occasional glance at neighboring disciplines.
I begin by discussing the time-honored qualitative/quantitative distinction. What is
qualitative data and analysis, and how does it differ from quantitative data and analysis? I propose
an explicit definition for “qualitative” and then explore the implications of that definition. The
second section focuses on several areas of qualitative research that seem especially prominent
and/or fecund, judging by output over the past decade. This includes (a) case-selection, (b)
frameworks for qualitative inquiry, (c) rules of thumb for qualitative inquiry, and (d)
Qual and Quant
Although the qual/quant distinction is ubiquitous in social science today, the distinction is
viewed differently by scholars in either camp. As a rule, scholars whose work is primarily
quantitative tend to view social science as a unified endeavor, following similar rules and
assumptions. The naturalistic ideal centers on goals such as replication, cumulation, and
consensus – all of which point toward a single logic of inference (Beck 2006, 2010; King,
Keohane & Verba 1994).
By contrast, scholars whose work is primarily qualitative tend to view the two modes of
inquiry as distinctive, perhaps even incommensurable. They are more likely to identify with the
idea that knowledge of the world is embedded in theoretical, epistemological, or ontological
frameworks from which we can scarcely disentangle ourselves. They may also identify with the
phenomenological idea that all human endeavor, including science, is grounded in human
experience. Since experiences – often couched in positions of differential power and status –
vary, one can reasonably expect that the methods and goals of social science might also vary. The
apparent embeddedness of knowledge reinforces qualitative scholars’ predilection toward
pluralism, as it suggests that there are fundamentally – and legitimately – different ways of going
about business (Ahmed & Sil 2012; Bennett & Elman 2006: 456-57; Goertz & Mahoney 2012;
Hall 2003; Mahoney & Goertz 2006; Shapiro, Smith & Masoud 2004; Sil 2000; Yanow &
Following the axiom that where one sits determines where one stands, let us also
consider the stakes in this controversy. Over the past century quantitative work has been on the
ascendant and qualitative work has been cast in a defensive posture. There are lots of qualitative
practitioners but comparatively few qualitative methodologists. Consequently, it has been
difficult for researchers to explain their work in ways that those in the quantitative tradition can
understand and respect. One may surmise that many qualitative scholars, uncomfortable with the
prospect of absorption into a “quantitative template,” have sought to emphasize the
distinctiveness of what they do for strategic reasons – establishing a nature preserve for an
endangered species, as it were.
Whatever its intellectual and sociological sources, the question of unity or disunity
depends upon how one chooses to define similarity and difference. Any two objects will share
some characteristics and differ in others. It follows that they may be either compared or
contrasted, depending upon the author’s point of view. Quantitatively inclined scholars may
choose to focus on similarities while qualitatively inclined scholars choose to focus on
differences. Both views are correct, as far as they go. The half-empty/half-full conundrum seems
difficult to overcome in this particular context.1 To put the matter in a more specific frame: all
may agree with Brady & Collier (2010) that there are “diverse tools” (the pluralistic perspective)
as well as “shared standards” (the monist perspective). But they do not necessarily agree on what
those shared standards are or to what extent they should discipline the work of social science.
Any attempt to resolve the monism/pluralism question that begins with high-level
concepts (e.g., monism and pluralism, logic of inquiry, epistemology, commensurability,
naturalism, interpretivism) is probably doomed to failure. These words are loaded, and once they
have been uttered the die is cast. Participants from either camp will dig in their heels.
I propose to take a ground-level approach that avoids hot-button concepts from
philosophy of science and focuses instead on matters of definition. What, exactly, is qualitative
data? And what, by contrast, is quantitative data? We shall then explore the repercussions of this
distinction, working toward some tentative conclusions that – hopefully – all may agree with,
even if they do not resolve all aspects of the qual/quant debate.
Since qualitative and quantitative are antonyms one cannot define one without defining the
other. I begin, therefore, by listing some of the attributes commonly associated with these terms:
Qualitative work is expressed in natural language while quantitative work is
expressed in numbers and in statistical models. Qualitative work employs small
samples, while quantitative work is large-n. Qualitative work is often focused on the
subjective feelings and understandings of those under study and, accordingly, with
techniques such as ethnography and unstructured interviews. Quantitative work is
often focused on seemingly objective conditions, or things held in common.
Qualitative work draws on cases chosen in an opportunistic or purposive fashion
while quantitative work employs systematic (random) sampling. Qualitative work is
often focused on particular individuals, events, and contexts, lending itself to an
idiographic style of analysis. Quantitative work is more likely to be focused on
features that (in the researcher’s view) can be generalized across a larger population,
lending itself to a nomothetic style of analysis.
Let us suppose that all of the foregoing contrasts contain some empirical truth; that is,
the foregoing characteristics co-vary in the work of social scientists. And let us further suppose
that they resonate with common usage of these terms, as reflected in work on the subject (e.g.,
Bennett & Elman 2006; Brady 2010; Caporaso 2009; Collier & Elman 2008; Glassner & Moreno
1989; Goertz & Mahoney 2012; Hammersley 1992; King, Keohane & Verba 1994; McLaughlin
1991; Levy 2007; Morgan 2012; Patton 2002; Schwartz & Jacobs 1979; Shweder 1996; Snow
1959/1993; Strauss & Corbin 1998). If so, we have usefully surveyed the field. But we have not
provided anything more than a map of this rugged terrain.
1 This is nicely illustrated in recent arguments about causation (Reiss 2009).
My goal is to arrive at a minimal definition that bounds our subject in a fairly crisp
fashion, that resonates with extant understandings, and that does not trespass on other well-
established terms. (It would not be efficient, semantically speaking, to conflate qualitative with
idiographic, ethnographic, or some other term in this family of concepts.) In addition, it would
be helpful if the proffered definition accounts for (in a loosely causal sense) the various attributes
commonly associated with the terms “qualitative” and “quantitative” as surveyed above.
With these goals in mind, I propose that a fundamental feature of qualitative work is its
use of non-comparable observations – observations that pertain to different aspects of a causal or
descriptive question. As an example, one may consider the clues in a typical detective story. One
clue concerns the suspect’s motives; another concerns his location at the time the crime was
committed; a third concerns a second suspect; and so forth. Each observation, or clue, draws
from a different population. This is why they cannot be arrayed in a matrix (rectangular) dataset
and must be dealt with in prose (aka narrative analysis). It is also why we have difficulty counting
such observations. The time-honored question of quantitative research – What is the n? – is
impossible to answer in a definitive fashion. Likewise, styles of inference based on qualitative
data operate somewhat differently than styles of inference based on quantitative data.
I therefore define quantitative observations as comparable (along whatever dimensions
are relevant) and qualitative observations as non-comparable, regardless of how many there are.
When qualitative observations are employed for causal analysis they may be referred to as causal-
process observations (Brady 2010), though I shall continue to employ the more general (and less
bulky) term, qualitative observation, which applies to both descriptive and causal inferences.
A qualitative or quantitative analysis, accordingly, is one whose inferences rest
on one or the other sort of data. If the work is quantitative, it enlists patterns of covariation
found in a matrix of observations and analyzed with a formal model (e.g., set theory/QCA,
frequentist statistics, Bayesian probabilities, randomization inference) to reach a descriptive or
causal inference. If the work is qualitative, the inference is based on bits and pieces of non-
comparable observations that address different aspects of a problem. Traditionally, these are
analyzed in an informal fashion, an issue taken up below.
Some strategies of data collection seem inherently qualitative, e.g., unstructured
interviews, participant-observation (ethnography), and archival work. This is because researchers
are likely to incorporate a wide variety of clues drawn from different kinds of sources and
addressing different aspects of a problem. The heterogeneity of the evidence makes it non-
comparable, and hence qualitative. Other data collection strategies such as standardized surveys
are inherently quantitative, as they involve counting large numbers of observations that are
comparable – by assumption. Of course, they might not actually be comparable. We are speaking
here of assumptions about the data generating process, not about the truth with a capital T. But
we cannot avoid assumptions about the world, and these assumptions – quite rightly – lead
researchers to adopt one or the other method of apprehending that reality.
Converting Words to Numbers
No qualitative observation is immune from quantification. Interviews, pictures, ethnographic
notes, and texts drawn from other sources may be coded, either through judgments exercised by
coders or through mathematical algorithms (Grimmer & Stewart 2013). By coding I refer to the
systematic measurement of the phenomenon at hand – reducing the available information to a
small number of dimensions, consistently defined across the units of interest. All that is required,
following our definition, is that multiple observations of the same kind be produced and (voila!)
quantitative observations are born. These may then be represented in the matrix format familiar
to those who work with rectangular datasets.
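The coding operation described above can be sketched in a few lines. The sketch is a minimal illustration with invented data and a deliberately crude keyword-based coding scheme (all names and rules here are hypothetical, not drawn from any actual study): each unstructured note is reduced to the same small set of consistently defined dimensions, so the results can be stacked as rows of a rectangular dataset.

```python
# Hypothetical qualitative observations: unstructured interview notes.
notes = [
    {"id": "resp01", "text": "Grew up on a farm; strongly distrusts the city council."},
    {"id": "resp02", "text": "Lifelong urban resident, active in local party politics."},
    {"id": "resp03", "text": "Moved from a farm to the city; trusts the council."},
]

# A (deliberately crude) coding scheme: the same dimensions, consistently
# defined, are applied to every unit of interest.
def code_note(note):
    text = note["text"].lower()
    return {
        "id": note["id"],
        "rural_background": int("farm" in text),        # 1 = mentions rural origin
        "distrusts_council": int("distrusts" in text),  # 1 = expresses distrust
    }

# Coding every note yields comparable rows -- a matrix (rectangular) dataset.
matrix = [code_note(n) for n in notes]
for row in matrix:
    print(row)
```

Note how the unique texture of each note (the move to the city, the party activism) is discarded in the process: exactly the information loss discussed below.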
Of course, there are often practical obstacles to quantification. Perhaps additional
sources (informants, pictures, texts) are unavailable. Perhaps, if available, they are not really
comparable, or they introduce problems of causal identification (e.g., heterogeneity across cases
that could pose a problem of noise or confounding). Alternatively, it may be possible to generate
additional (comparable) observations but not worthwhile, e.g., because the first observation is
sufficient to prove the point at issue. Sometimes, one clue is decisive. Nonetheless, in principle,
if the researcher’s assumptions of comparability are justified, qualitative data can become
quantitative data. The plural of anecdote is data.
Something is always lost in the process of reducing qualitative information to quantitative
data. One must ignore the unique aspects of each qualitative observation in order render them
comparable. If one wishes to generalize across a population, ignoring idiosyncratic features of
the data is desirable. But if one wishes to shed light on these heterogeneous features the
conversion of qualitative to quantitative data will iron out the ruggedness of the landscape –
obscuring variation of theoretical interest. Information loss must be reckoned with.2
Finally, and perhaps most importantly, there is an asymmetry between qual and quant.
One can convert qualitative data to quantitative data but not the reverse. It is a one-way street.
Once a piece of information is rendered in a matrix template whatever unique aspects may have
adhered to that observation have been lost. Data reduction is possible, but not expansion. The
singular of data is not anecdote, which is to say one can never recover an anecdote from a data
matrix.
It follows from our discussion that the utility of qualitative and quantitative data varies according
to the researcher’s goals.
First, qualitative data is likely to be more useful insofar as a study is focused on a single
case (or event), or a small number of cases (or events). Such investigations bear close
resemblance, methodologically speaking, to a detective’s quest to explain a crime, which may be
thought of as a single event or a small number of associated events (if it is a string of crimes
committed by the same person or group). The reason that these investigations often rest on
qualitative data is that the researcher wishes to know a lot about the chosen case/event, and this
requires a supple mode of investigation that allows one to draw different kinds of observations
from different populations. Whether case-level analysis is warranted may rest on other, more
fundamental aspects of the analysis. For example, case-level analysis is more plausible if the cases
of theoretical interest are heterogeneous and scarce (e.g., nation-states) rather than homogeneous
and plentiful (e.g., firms or individuals), if the causal factor cannot be manipulated by the
researcher, if the causal factor or outcome is extremely rare, if the goal of the analysis is
exploratory rather than confirmatory, and so forth (Gerring 2016).
Second, qualitative data is likely to be more important in a causal analysis insofar as the
researcher seeks to identify token (singular, actual) causes rather than general (type) causes. The
former demands a case-level analysis – focused on what caused an outcome in a particular
instance. It also usually involves counterfactual thought-experiments – focused on what would
have happened if a causal factor had been present/absent – and these sorts of analyses are
almost by definition qualitative in nature.
Third, qualitative data is likely to be more important if the case-level analysis is idiographic
(focused on describing or explaining the case) rather than nomothetic (focused on generalizing
beyond the chosen case). Critical-juncture explanations offer a case in point, since such junctures
are the product of unique events that (by definition) cannot be predicted or explained by a
general model (cites). It is difficult to imagine applying a quantitative model, or matrix
observations, to a critical juncture. The general point is that unique features of a case are often
easier to represent qualitatively than quantitatively. Indeed, the very act of quantitative
measurement and analysis presupposes a metric that stretches beyond the case under study.
2 Of course, any rendering of a complex phenomenon involves some loss of information. This is true even for the
most faithful – and lengthy – descriptions of reality such as those produced by ethnomethodologists (Garfinkel
Other features of a theory or an analysis do not seem to have a bearing on the relative
utility of qualitative and quantitative data.3 The main point, then, is that wherever the focus of
analysis is limited to one or several units qualitative data is likely to be employed, though it may
be supplemented by quantitative data. Indeed, most case study research combines qualitative and
quantitative analysis (Gerring 2016), and most cross-case research involves some recourse to
qualitative data, either in an anecdotal or illustrative fashion or as a way to address concerns
about causal identification or causal mechanisms, as discussed below. In this light, one might
argue that all research is multi-method, a subject to which we return.
A Selection of Recent Topics
Having defined our terms, and explored some of the ramifications of this definition, we turn
now to recent work on the topic of qualitative methods. My treatment is highly selective, focused
on several areas that have received a good deal of attention from scholars in recent years. This
includes case selection, frameworks for qualitative inquiry, rules of thumb for qualitative inquiry, and
Evidently, many important subjects are left aside in this short review. I do not address set
theory and qualitative comparative analysis (QCA), as this extraordinarily large and complex
subject does not fit my proposed definition of qualitative research and is, in any case,
encompassed by other recent reviews (e.g., Mahoney 2010; Mahoney & Vanderpoel 2015;
Rihoux 2013). I do not address typological methods and concept formation (e.g., Bennett &
Elman 2006; Elman 2005), data archiving, transparency, and replication (Elman & Kapiszewski
2014; Elman, Kapiszewski & Vinuela 2010; Lieberman 2010), comparative historical analysis
(Mahoney & Thelen 2015), or the organizational features of qualitative methods (Collier &
Elman 2008) – all of which are nicely handled by previous reviews. Nor do I have much to say
about specific data collection methods – interviewing, ethnography, archival work, and so forth
(e.g., Kapiszewski, MacLean & Read 2015). Most importantly, I focus primarily on causal
inference, leaving aside many knotty questions pertaining to descriptive inference (Gerring
Case Selection
We have observed that case-based analysis is likely to contain qualitative observations (even if it
also incorporates quantitative observations). Consequently, the question of case-selection – how
a case, or a small number of cases, is chosen from a large number of potential cases – is central
to qualitative analysis.
Quite a number of case-selection typologies have been proposed over the years, with a
noticeable acceleration in the past decade. Mill (1843/1872) proposes the method of difference
(aka most-similar method) and method of agreement (aka most-different method), along with
several others that have not gained traction.
3 For example, the long-standing distinction between research that seeks a complete explanation of an outcome
(“causes-of-effects”) and research that narrows its scope to a single hypothesis (“effects-of-causes”) seems to bear
ambivalently on the qual/quant divide. Note that a causes-of-effects explanation may be provided solely on the basis
of quantitative data, e.g., a “full” regression model. Likewise, an effects-of-causes explanation may be provided
based solely on qualitative data, i.e., a process-tracing analysis.
4 This imposes a “positivist” lens on the questions under discussion, as many interpretivists – and presumably all
post-structuralists – view their research goals quite differently. Nonetheless, recent methodological work on
qualitative methods in political science is focused mostly on causal inference, following well-established disciplinary
preferences (Gerring 2012). From this perspective, our fairly restrictive scope-conditions may be justified even
though they leave a vast expanse of qualitative research outside the purview of this study.
Lijphart (1971: 691) proposes six case study types: atheoretical, interpretative,
hypothesis-generating, theory-confirming, theory-infirming, and deviant. Eckstein (1975)
identifies five species: configurative-idiographic, disciplined-configurative, heuristic, plausibility
probes, and crucial-case. Skocpol & Somers (1980) identify
three logics of comparative history: macro-causal analysis, parallel demonstration of theory, and
contrast of contexts. Gerring (2007) and Seawright & Gerring (2008) identify nine techniques:
typical, diverse, extreme, deviant, influential, crucial, pathway, most-similar, and most-different.
Levy (2008) identifies five case study research designs: comparable, most and least likely, deviant,
and process tracing. Rohlfing (2012: ch3) identifies five case-types – typical, diverse, most-likely,
least-likely, and deviant – which are applied differently according to the purpose of the case
study. Blatter & Haverland (2012: 24-26) identify three explanatory approaches – covariational,
process tracing, and congruence analysis – each of which offers a variety of case-selection
options.
Building on these efforts, Gerring & Cojocaru (2016) propose a new typology that
(arguably) qualifies as the most comprehensive to date, incorporating much of the foregoing
literature. Its organizing feature is the goal that a case study is intended to serve, identified in the
first column of Table 1. Column 2 specifies the number of cases (n) in the case study. It will be
seen that case studies enlist a minimum of one or two cases, with no clearly defined ceiling.
Column 3 clarifies which dimensions of the case are relevant for case-selection, i.e., descriptive
features (D), causal factors of theoretical interest (X), background factors (Z), and/or the
outcome (Y). Column 4 specifies the criteria used to select a case(s) from a universe of possible
cases. Column 5 offers an example of each case-selection strategy. In what follows, I offer a brief
resume of the resulting typology.
Table 1: Case-Selection Strategies
Goals/Strategies N Factors Criteria for cases Examples
I. DESCRIPTIVE (to describe)
● Typical 1+ D Mean, mode, or median of D Lynd & Lynd (1929) Middletown
● Diverse 2+ D Typical sub-types Fenno (1977, 1978) Home Style
II. CAUSAL (to explain Y)
1. Exploratory (to identify HX)
● Outcome 1+ Y Maximize variation in Y Skocpol (1979) States and Social Revolutions
● Index 1+ Y First instance of ∆Y Pincus (2011) 1688: First Modern Revolution
● Deviant 1+ Z Y Poorly explained by Z Alesina et al (2001) Why Doesn’t US Have Welfare State?
● Most-similar 2+ Z Y Similar on Z, different on Y Epstein (1964) A Comparative Study of Canadian Parties
● Most-different 2+ Z Y Different on Z, similar on Y Karl (1997) Paradox of Plenty
● Diverse 2+ Z Y All possible configurations of Z (assumption: X ∈ Z) Moore (1966) Social Origins of Dictatorship and Democracy
2. Estimating (to estimate HX)
● Longitudinal 1+ X Z X changes, Z constant or biased against HX Friedman & Schwartz (1963) Monetary History of US
● Most-similar 2+ X Z Similar on Z, different on X Posner (2004) Political Salience of Cultural Difference
3. Diagnostic (to assess HX)
● Influential 1+ X Z Y Greatest impact on P(HX) Ray (1993) Wars between Democracies
● Pathway 1+ X Z Y X→Y strong, Z constant or biased against HX Mansfield & Snyder (2005) Electing to Fight
● Most-similar 2+ X Z Y Similar on Z, different on X & Y, X→Y strong Walter (2002) Committing to Peace
D = descriptive features (other than those to be described in a case study). HX = causal hypothesis of interest. P(HX) = the probability of HX. X = causal factor(s) of theoretical
interest. X→Y = apparent or estimated causal effect. Y = outcome of interest. Z = vector of background factors that may affect X and/or Y.
Many case studies are primarily descriptive, which is to say they are not organized around a
central, overarching causal hypothesis. Although writers are not always explicit about their
selection of cases, most of these decisions might be described as following a typical or diverse case
strategy. That is, they aim to identify a case, or cases, that exemplify a common pattern (typical)
or patterns (diverse). This follows from the minimal goals of descriptive analysis. Where the goal
is to describe there is no need to worry about the more complex desiderata that might allow one
to gain causal leverage on a question of interest.
Other case studies are oriented toward causal analysis. A good case (or set of cases) for
purposes of causal analysis is generally one that exemplifies quasi-experimental properties, i.e., it
replicates the virtues of a true experiment even while lacking a manipulated treatment (Gerring &
McDermott 2007). Specifically, for a given case (observed through time) or for several cases
(compared to each other), variation in X should not be correlated with other factors that are also
causes of Y, which might serve as confounders (Z), generating a spurious (non-causal)
relationship between X and Y.
Exploratory cases attempt to identify a possible cause of an outcome of theoretical
interest. The outcome, Y, is established, and is usually framed as a research question. What
accounts for variation in Y? Or, if Y is a discrete event, Why does Y occur? The researcher may also
have an idea about background conditions, Z, that influence Y but are not of theoretical interest.
The purpose of the study, in any case, is to identify X, regarded as a possible or probable cause
of Y. Specific techniques of exploratory case selection may be classified as outcome, index, deviant,
most-different, most-similar, or diverse, as specified in Table 1.
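For intuition, the exploratory most-similar criterion in Table 1 (similar on Z, different on Y) can be implemented algorithmically. The sketch below uses invented data; the case names, covariate values, and the simple absolute-difference distance measure are assumptions chosen purely for illustration.

```python
# A minimal sketch of exploratory most-similar case selection: among cases
# measured on background factors Z and outcome Y, pick the pair of cases most
# similar on Z yet different on Y (hypothetical data).
from itertools import combinations

cases = {
    # name: (Z1, Z2, Y) -- Z: background covariates, Y: binary outcome
    "A": (0.90, 0.80, 1),
    "B": (0.85, 0.75, 0),
    "C": (0.10, 0.20, 1),
    "D": (0.15, 0.90, 0),
}

def z_distance(a, b):
    # Similarity on background factors: sum of absolute differences on Z.
    (z1a, z2a, _), (z1b, z2b, _) = cases[a], cases[b]
    return abs(z1a - z1b) + abs(z2a - z2b)

# Candidate pairs must differ on Y; choose the pair closest on Z.
pairs = [p for p in combinations(cases, 2) if cases[p[0]][2] != cases[p[1]][2]]
best_pair = min(pairs, key=lambda p: z_distance(*p))
print(best_pair)
```

Here cases A and B are nearly identical on background factors but diverge on the outcome, so they form the most-similar pair; the divergence invites a search for the as-yet-unidentified X.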
Estimating cases aim to test a hypothesis by estimating a causal effect. That might mean a
precise point estimate along with a confidence interval (e.g., from a time-series or synthetic
matching analysis), or an estimate of the “sign” of a relationship, i.e., whether X has a positive,
negative, or no relationship to Y. The latter is more common, not only because of the small size
of the sample (at the case level) but also because it is more likely to be generalizable across a
population of cases. In either situation, case selection rests on information about X and Z (not
Y). Two general approaches are viable – longitudinal and most similar – as outlined in Table 1.
Diagnostic case studies help to confirm, disconfirm, or refine a hypothesis (garnered from
the literature on a subject or from the researcher’s own ruminations) and identify the generative
agent (mechanism) at work in that relationship. All the elements of a causal model – X, Z, and Y
– are generally involved in the selection of a diagnostic case. Specific strategies may be classified
as influential, pathway, or most-similar, as shown in Table 1.
Note that virtually all of these case selection strategies may be executed in an informal,
qualitative fashion or by employing a quantitative algorithm. For example, a deviant case could
be chosen based on a researcher’s sense about which case(s) is poorly explained by extant
theories. Or it might be chosen by looking at residuals from a regression model. Discussion of
the pros and cons of algorithmic case selection can be found in Gerring (2016).
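As an illustration of the algorithmic route just mentioned, a deviant case might be selected as the observation with the largest absolute residual from a regression of Y on Z. The sketch below uses invented data and a bivariate least-squares fit computed by hand; it is a toy illustration of the idea, not the specific procedure discussed in Gerring (2016).

```python
# A minimal sketch of algorithmic deviant-case selection: fit a simple
# bivariate regression of Y on Z and pick the case least well explained
# by the model, i.e., the one with the largest absolute residual.

cases = ["A", "B", "C", "D", "E"]
Z = [1.0, 2.0, 3.0, 4.0, 5.0]   # background covariate (hypothetical)
Y = [1.1, 2.0, 2.9, 8.0, 5.1]   # outcome; "D" sits far off the line

# Ordinary least squares for a single regressor, computed by hand.
n = len(Z)
mz, my = sum(Z) / n, sum(Y) / n
beta = sum((z - mz) * (y - my) for z, y in zip(Z, Y)) / sum((z - mz) ** 2 for z in Z)
alpha = my - beta * mz

# Residuals: observed Y minus model prediction; the deviant case maximizes |residual|.
residuals = [y - (alpha + beta * z) for z, y in zip(Z, Y)]
deviant = max(zip(cases, residuals), key=lambda cr: abs(cr[1]))[0]
print(deviant)
```

The same residuals could of course come from a multivariate model; the selection rule (maximize the absolute residual) is unchanged.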
The reader may wonder, how might one know whether a designated strategy will achieve
what it is intended to achieve? Evidently, there are serious problems of validation to wrestle with.
Several attempts have been made to assess varying case selection strategies using simulation
techniques. Herron & Quinn (2015) assess estimating strategies, i.e., where the case is intended
to measure causal effects. Seawright (2015b) assesses diagnostic strategies, where the case is
designed to help confirm or disconfirm a causal hypothesis. Lucas & Szatrowski (2014) assess
QCA-based strategies of case-selection.
It would take some time to discuss these complex studies, so I shall content myself with
several summary judgments. First, case selection techniques have different goals, so any attempt
to compare them must focus on the goals that are appropriate to that technique. A technique
whose purpose is exploratory (to identify a new hypothesis about Y) cannot be judged by its
efficacy in identifying causal mechanisms, for example. Second, among these goals, estimating
causal effects is the least common – and, by all accounts, the least successful – so any attempt to
gauge the effectiveness of case selection methods should probably focus primarily on exploratory
and diagnostic functions. Third, case selection techniques are best practiced when
taking into account change over time in the key variables, rather than static cross-sectional
analyses – as most of the simulation exercises appear to do. Finally, and most importantly, it is
difficult and perhaps impossible to simulate the complex features involved in an in-depth case
analysis. The question of interest – which case(s) would best serve my purpose if I devoted a
case study to it? – is hard to model without introducing assumptions that pre-judge the results of
the case study and are in this respect endogenous to the case-selection strategy.5
In my opinion, testing the viability of case selection strategies in a rigorous fashion would
involve a methodological experiment of the following sort. First, assemble a panel of researchers
with similar background knowledge of a subject. Second, identify a subject deemed ripe for case
study research, i.e., one that is not well studied, has received no authoritative treatment, and is not
amenable to experimental manipulation. Third, select cases algorithmically, following one of the
protocols laid out in Table 1. Fourth, randomly assign these cases to the researchers with
instructions to pursue all case study goals – exploratory, estimating, and diagnostic. Fifth,
assemble a panel of judges, who are well-versed in the subject of theoretical focus, to evaluate
how well each case study achieved each of these goals. These could be scored on a questionnaire
using ordinal, Likert-style categories. Judges would be instructed to decide independently
(without conferring), though there might be a second round of judgments following a
deliberative process in which they shared their thoughts and their preliminary decisions.
Such an experiment would be time-consuming and costly (assuming participants receive
some remuneration). And it would need to be iterated across several research topics and with
several panels of researchers and judges in order to make strong claims of generalizability.
Nonetheless, it might be worth pursuing given the possible downstream benefits – if, that is,
some strategies can be shown to be superior to others.6
Frameworks for Qualitative Inquiry
Having discussed case selection, we proceed to case analysis, with a focus on the qualitative
components of that inquiry. Here, we stumble upon the most mysterious, and most contested,
aspect of qualitative methods.
Because of its informal nature, qualitative evidence is often regarded with suspicion. It’s
hard to articulate what a convincing inference might consist of, and how to know it when one
sees it. What are the methodological standards of qualitative data analysis (sometimes referred to
as process tracing)?
To remedy this situation, a number of recent studies try to make sense of qualitative data,
imposing order on the seeming chaos. Proposed frameworks include set theory (Mahoney 2012;
Mahoney & Vanderpoel 2015), acyclic graphs (Waldner 2015b), and – most commonly – Bayesian
inference (Beach & Pedersen 2013: 83-99; Bennett 2008, 2015; Crandell et al. 2011; George &
5 For example, Herron & Quinn (2015: 9) make the assumption that the potential outcomes inherent in a case (i.e.,
the unit-level causal relationship) will be discovered by the case study researcher in the course of an intensive
analysis of the case. Yet, “discoverability” is the very thing that case selection techniques are designed to achieve.
That is, a case selection technique is regarded as superior insofar as it offers a higher probability of discovering an
unknown feature of a case.
6 Note, however, that this experiment disregards qualitative judgments by researchers that might be undertaken after
an algorithmic selection of cases. These qualitative judgments might serve as mediators. It could be, for example,
that some case-selection strategies work better when the researcher is allowed to make final judgments – from
among a set of potential cases that meet the stipulated case-selection criteria – based on knowledge of the potential
cases. One must also consider a problem of generalizability that stems from the use of algorithmic procedures for
selecting cases. It could be that subjects for which algorithmic case selection is feasible (i.e., where values for X, Z,
and Y can be measured across a large sample) are different from subjects for which algorithmic case selection is
infeasible. If so, we could not generalize the results of this experiment to the latter genre of case study research.
McKeown 1985; Gill et al. 2005; Humphreys & Jacobs 2014, in process; McKeown 1999;
Rohlfing 2012: 180-99).
These efforts have performed an enormous service to the cause of qualitative inquiry,
fitting qualitative methods into frameworks that are already well established for quantitative inquiry. It should
be no surprise that there are multiple frameworks, just as there are multiple frameworks for
quantitative methodology. Scholars may debate whether, or to what extent, these frameworks are
compatible with each other; this important debate is orthogonal to the present topic. The point
to stress is that qualitative inquiry can be understood within the rubric of general causal
frameworks. There is, in this sense, a single logic of inquiry.
Exploring these complex frameworks in detail would take us far afield; interested readers
may explore the cited literature. However, I do want to highlight one point. Thus far,
applications of set theory, acyclic graphs, and bayesianism to qualitative methods have focused
on making sense of the activity rather than providing a practical guide to research. It remains to
be seen whether these can be developed in such a way as to alter the ways that qualitative
researchers go about their business. Let me illustrate.
Some years ago, Van Evera (1997) proposed a fourfold typology of tests that has since
been widely adopted (e.g., Bennett & Checkel 2015: 17; George & Bennett 2005; Mahoney &
Vanderpoel 2015; Waldner 2015a). A “hoop” test is necessary (but not sufficient) for
demonstrating Hx. A “smoking-gun” test is sufficient (but not necessary) for demonstrating Hx.
A “doubly-decisive” test is necessary and sufficient for demonstrating Hx. A “straw-in-the-wind”
test is neither necessary nor sufficient, constituting weak or circumstantial evidence. These
concepts, diagramed in Table 2, are useful for classifying the nature of evidence according to a
researcher’s judgment. However, the hard question – the judgment itself – is elided. When does a
particular piece of evidence qualify as a hoop, smoking-gun, doubly-decisive, or straw-in-the-
wind test (or something in between)?
Table 2: Qualitative Tests and their Presumed Inferential Role

                              Sufficient to demonstrate Hx?
                              No                     Yes
  Necessary to         No    straw-in-the-wind      smoking-gun
  demonstrate Hx?      Yes   hoop                   doubly-decisive
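Since the typology reduces to two binary judgments (is passing the test necessary for Hx, and is it sufficient?), it can be expressed as a tiny lookup. This sketch is illustrative only; the hard part, as noted above, is making the two judgments in the first place.

```python
def classify_test(passing_necessary: bool, passing_sufficient: bool) -> str:
    """Map Van Evera's two criteria onto his four test types."""
    if passing_necessary and passing_sufficient:
        return "doubly-decisive"
    if passing_necessary:
        return "hoop"
    if passing_sufficient:
        return "smoking-gun"
    return "straw-in-the-wind"

print(classify_test(True, False))  # prints "hoop"
```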
Likewise, Bayesian frameworks are useful for combining evidence from diverse quarters
in a logical fashion with the use of subjective assessments, e.g., the probability that a hypothesis
is true, ex ante, and assessments of the probability that the hypothesis is true if a piece of
evidence (stipulated in advance) is observed. The hard question, again, is the case-specific
judgment. Consider the lengthy debate that has ensued over the reasons for electoral system
choice in Europe (Kreuzer 2010). Humphreys & Jacobs (2014) use this example to sketch out
their application of Bayesian inference to qualitative research. In particular, they explore the “left
threat” hypothesis, which suggests that the presence of a large left-wing party explains the
adoption of proportional representation (PR) in the early twentieth century (Boix 1999). The
authors point out that “for cases with high left threat and a shift to PR, the inferential task is to
determine whether they would have…or would not have…shifted to PR without left threat”
(Humphreys & Jacobs 2014: 28). Bayesian frameworks do nothing to ease this inferential task,
which takes the form of a counterfactual thought-experiment. Similar judgments are required by
other frameworks – set theory, acyclic graphs, and so forth.
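The bookkeeping of Bayesian updating is itself simple; what the framework cannot supply are the case-specific likelihoods. A minimal sketch of sequential updating on two pieces of evidence, with a prior and likelihoods invented for illustration:

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' rule: updated probability of hypothesis H after observing E."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))

# Invented numbers: a sceptical prior (0.3), then two observations whose
# likelihoods under H and not-H must be supplied by case expertise.
p = 0.3
for p_e_h, p_e_not_h in [(0.9, 0.4), (0.8, 0.2)]:
    p = posterior(p, p_e_h, p_e_not_h)
print(f"posterior after both observations: {p:.2f}")  # prints 0.79
```

Every number fed into `posterior` encodes exactly the sort of counterfactual, case-specific judgment discussed above; the arithmetic merely keeps those judgments consistent.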
To get a feel for the level of detail required in qualitative research we may benefit from a
closer look at a particular inquiry. Helpfully, Tasha Fairfield (2013: 55-6; see also 2015) provides
a scrupulous blow-by-blow account of the sleuthing required to reach each case-level inference
in her study of how policymakers avoid political backlash when they attempt to tax economic
elites. One of her three country cases is Chile, which is observed during and after a recent
presidential election. Fairfield explains,
During the 2005 presidential campaign, right candidate Lavín blamed Chile’s
persistent inequality on the left and accused President Lagos of failing to deliver his
promise of growth with equity. Lagos responded by publicly challenging the right to
eliminate 57 bis, a highly regressive tax benefit for wealthy stockholders that he
called “a tremendous support for inequality.” The right accepted the challenge and
voted in favor of eliminating the tax benefit in congress, deviating from its prior
position on this policy and the preferences of its core business constituency.
The following three hypotheses encompass the main components of my argument
regarding why the right voted in favor of the reform:
Hypothesis 1. Lagos’ equity appeal motivated the right to accept the reform, due to
concern over public opinion.
Hypothesis 2. The timing of the equity appeal—during a major electoral
campaign—contributed to its success.
Hypothesis 3. The high issue-salience of inequality contributed to the equity
appeal’s success.
The following four observations, drawn from different sources, provide indirect,
circumstantial support for Hypothesis 1:
Observation 1a (p. 48): The Lagos administration considered eliminating 57 bis in
the 2001 Anti-Evasion reform but judged it politically infeasible given business-
right opposition (interview: Finance Ministry-a, 2005).
Observation 1b: The Lagos administration subsequently tried to reach an agreement
with business to eliminate 57 bis without success (interview, Finance Ministry-b).
Observation 1c: Initiatives to eliminate the exemption were blocked in 1995 and
1998 due to right opposition. (Sources: congressional records, multiple interviews)
Observation 1d: Previous efforts to eliminate 57 bis did not involve concerted
equity appeals. Although Concertación governments had mentioned equity in prior
efforts, technical language predominated, and government statements focused
much more on 57 bis’ failure to stimulate investment rather than its regressive
distributive impact (congressional records, La Segunda, March 27, 1998, El
Mercurio, April 1, 1998, Interview, Ffrench-Davis, Santiago, Chile, Sept. 5, 2005).
Inference: These observations suggest that right votes to eliminate 57 bis would
have been highly unlikely without some new, distinct political dynamic. Lagos’
strong, high-profile equity appeal, in the unusual context of electoral competition
from the right on the issue of inequality, becomes a strong candidate for explaining
the right’s acceptance of the reform.
The appendix continues in this vein for several pages, focused relentlessly on explaining the
behavior of one particular set of actors in one event, i.e., the motivation of the right-wing in
favoring the reform. This event is just one of a multitude of events discussed in connection with
the Chilean case study, to which must be added the equally complex set of events occurring in
Argentina and Bolivia in Fairfield’s three-country study. Clearly, reaching case-level inferences is
a painstaking affair.
One may conclude that if researchers agreed on case-level judgments then general frameworks
could be successful in cumulating these judgments into a higher-level inference, accompanied by
a (very useful!) confidence interval. But if one cannot assume case-level consensus, conclusions
based on qualitative judgments combined through a Bayesian (or other) framework represent
little more than one researcher’s views, which might vary appreciably from another’s. Readers
who are not versed in the intricacies of Chilean politics will have a hard time ascertaining
whether Fairfield’s judgments are correct.
This sort of problem could be overcome with a crowd-based approach. Specifically, one
might survey a panel of experts – chosen randomly or with an aim to represent diverse
perspectives – on each point of judgment. One could then cumulate these judgments into an
overall inference, in which the confidence interval reflects the level of disagreement among
experts (among other things). Unfortunately, not just any crowd will do. The extreme difficulty
of case study research derives in no small part from the expertise that case study researchers
bring to their task. I cannot envision a world in which lay coders, recruited through Amazon
Mechanical Turk or Facebook, would replace that expertise, honed through years of work on a particular
problem and in a particular site (a historical period, country, city, village, organization,…).
To be credible, a crowd-based approach to the problem of judgment would need to
enlist the small community of experts who study a subject and can be expected to make
knowledgeable judgments about highly specific questions such as the “left wing threat.” In the
previous example, it would entail enlisting scholars versed in the politics of early twentieth
century Europe. This procedure is conceivable, but difficult to implement. How would one
identify a random, or otherwise representative, sample? (What is the sampling frame?) How
would one motivate scholars to undertake the task? How would one elicit honest judgments
about the specific questions on a questionnaire, uncorrupted by broader judgments about the
theoretical question at hand (which they would probably be able to infer)?
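The aggregation step of such a crowd-based approach is the easy part; everything difficult lies in recruiting and motivating the experts. A sketch with invented judgments, in which the width of the reported interval reflects disagreement among the panel:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical probability judgments from a small panel of area experts on a
# single case-level question. All values are invented for illustration.
judgments = [0.2, 0.35, 0.15, 0.4, 0.25, 0.3, 0.2]

m = mean(judgments)
se = stdev(judgments) / sqrt(len(judgments))

# A rough 95% interval whose width grows with expert disagreement.
low, high = m - 1.96 * se, m + 1.96 * se
print(f"pooled judgment: {m:.2f} (95% interval: {low:.2f} to {high:.2f})")
```

With only seven judges a t-interval or a hierarchical model would be more defensible than this normal approximation; the point is only that the spread of expert opinion, rather than a single researcher’s view, drives the reported uncertainty.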
Likewise, if one goes to the trouble of constructing a common coding frame (a
questionnaire), an on-line system for recording responses, a system of recruitment, and a
Bayesian (or some other) framework for integrating judgments, the considerable investment in
time and expense of such a venture would probably justify extending the analysis to many cases,
chosen randomly, so that a representative sample can be attained and stochastic threats to
inference minimized. In this fashion, procedures to integrate qualitative data into a quantitative
framework seem likely to morph from case studies into cross-case coding exercises. This is not
to argue against the idea. It is simply to point out that any standardization of procedures tends to
work against the intensive focus on one or several cases which (by our definition) characterizes
case study research.
Rules of Thumb for Qualitative Inquiry
My tentative conclusion, based on the discussion in the previous section and pending further
developments in this fast-moving field, is that it is hard to improve upon the informal “rules of
thumb” that have traditionally governed qualitative analysis. They have their faults, to be sure.
But remedying those faults may not be possible within the constraints of a case study framework.
What, then, are these rules of thumb? A good deal of work has gone into identifying
informal procedures to guide causal inference using qualitative data (Beach & Pedersen 2013;
Bennett & Checkel 2015; Brady & Collier 2004; Collier 2011; George 1979; Hall 2006; Jacobs
2015; Mahoney 2012; Roberts 1996; Schimmelfennig 2015; Waldner 2012, 2015a, 2015b). They
may be summarized as follows…
• Analyze sources according to their relevance (to the question of theoretical interest), proximity
(whether the source is in a position to know what s/he is claiming), authenticity (the source is not
fake or reflecting the influence of someone else), validity (the source is not biased), and diversity
(collectively, sources represent a diversity of viewpoints on the question at hand).
• When identifying a new causal factor or theory, look for one (a) that is potentially generalizable
to a larger population, (b) that is neglected in the extant literature on your subject, (c) that greatly
enhances the probability of an outcome (if binary) or explains a lot of variation on that outcome
(if interval-level), and (d) that is exogenous (not explained by other factors).
• Canvass widely for rival explanations, which also serve as potential confounders. Treat them
seriously (not as “straw men”), dismissing them only when warranted. Utilize this logic of
elimination, where possible, to enhance the strength of the favored hypothesis.
• For each explanation, construct as many testable hypotheses as possible, paying close attention to
within-case opportunities – e.g., mechanisms and alternative outcomes.
• Enlist counterfactual thought-experiments in an explicit fashion, making clear which features of
the world are being altered, and which are assumed to remain the same, in order to test the
viability of a theory. Also, focus on periods when background features are stable (so they don’t
serve as confounders) and minimize changes to the world (the minimal-rewrite rule) so that the
alternate scenario is tractable.
• Utilize chronologies and diagrams to clarify temporal and causal interrelationships among
complex causal factors. Include as many features as possible so that the time-line is continuous.
Researchers should bear in mind that these diverse rules of thumb are intended to shed
light on causal inference for an individual case or a small set of cases. Inferences for that case(s)
may – or may not – be generalizable to a larger population.
Multimethod Research
The final section of this study is devoted to multimethod research, where both qual and quant
styles of evidence are brought to bear on the same general research question (Brewer & Hunter
2006; Goertz 2015; Harrits 2011; Lieberman 2005; Seawright 2015a). While multimethod
research is increasingly common, there are serious questions about its effectiveness (Lohmann
2007). Doing more than one thing might mean doing multiple things poorly, by dint of limited
time or expertise. Nor is it clear whether qualitative and quantitative analysis can speak to one
another productively (Ahmed & Sil 2012).
In discussing this question it is important not to confuse disagreement with
incommensurability. If qual and quant tests of a proposition are truly independent there is always
the possibility that they will elicit different, perhaps even directly contradictory, answers. For
example, the most common style of multimethod analysis combines a quantitative analysis of
many units with an in-depth, qualitative (or at least partially qualitative) analysis of a single case
or a small set of cases, which Lieberman (2005) refers to as a nested analysis. Occasionally, these
two analyses reach different conclusions about a causal relationship (though, one suspects,
authors do not always bring these disagreements to the fore). However, the same disagreements
also arise from rival quantitative analyses (e.g., conducted with different samples or
specifications) and rival qualitative analyses (e.g., focused on different research sites or generated
by different researchers). Disagreement about whether X causes Y, or about the mechanisms at
work, does not entail that multimethod research is unavailing. Sometimes, triangulation does not
confirm one’s hypothesis. It is still useful information; and for those worried about confirmation
bias, it is critical.
In any case, Seawright (2015a) points out that when qualitative and quantitative evidence
is combined these analyses are usually oriented toward somewhat different goals. Typically, a
large-n cross-case analysis is focused on measuring a causal effect while a small-n within-case
analysis is focused on identifying a causal mechanism. As such, the two styles of evidence cannot
directly conflict since their objectives are different. They nonetheless inform each other in a
complementary fashion.
One way to assess the value of multimethod research is to determine how influential it
might be in readers’ judgments about causal inference. A recent study by Gerring & Seawright
(2016) subjects this question to an experimental test based on a sample of respondents drawn
from recent summer schools sponsored by the Institute for Qualitative and Multimethod
Research (at Syracuse University) and the European Consortium for Political Research (at the
University of Ljubljana, Slovenia). Three treatments are devised from multimethod articles
published in top political science journals. The Control condition consists of the entire article, as
published; the Qualitative condition consists of the same article minus the quantitative analysis;
the Quantitative condition consists of the article minus the qualitative analysis. After reading the
assigned sections of the article, subjects are asked whether they believe that the hypothesis is true
(a) with respect to the sample under study (internal validity) and (b) with respect to a broader
population (external validity). The authors thus utilize multimethod research to interrogate the
persuasiveness of each of its components.
The findings of this study are somewhat disturbing for those who prize multimethod
research. In particular, there seems to be no advantage in packaging qualitative and quantitative
evidence together in a multimethod approach to causal inference. Indeed, respondents found
multimethod research less generalizable than single-method quantitative research. That is, adding
a qualitative component to a quantitative analysis appears to weaken a study’s claim to external
validity even when all else about that study remains the same. These results are all the more
surprising given that the chosen sample of respondents is presumably disposed to favor a
multimethod approach to social science.
One may surmise that some of the obstacles to successful multimethod research are
practical in nature, and may have affected the results of this study. Because of limited space, it is
not easy to incorporate multiple research designs in the same article-length study. Perhaps the
respondents in the foregoing study were unable to process the complexities of a multimethod
study because qualitative data, in particular, is hard to present in a highly concise form. Likewise,
because multiple methods require different skill-sets it is not always possible for the same
researcher to undertake qualitative and quantitative approaches to a topic. Perhaps the chosen
multimethod studies were not convincing because the writers were not sufficiently conversant in
both methods.
This leaves open another way of viewing multimethod research. If the two modes of
analysis are difficult to combine in the same study by the same author perhaps they may be
profitably united within a larger “research cycle” (Lieberman 2016) that includes a diversity of
methods and authors. This allows scholars with a qual or quant bent to do what they do best,
concentrating their efforts on their particular skill-set and on one particular context that they can
become intimately acquainted with. The research cycle also mitigates a presentational problem –
stuffing results from myriad analyses into a 10,000-word article.
Unfortunately, the research cycle approach to multimethod research also encounters
obstacles. In particular, one must wonder whether cumulation can occur successfully across
diverse studies utilizing diverse research methods. Note that political science work is not highly
standardized, even when focused on the same research question and when utilizing the same
quantitative method. This inhibits the integration of findings, and helps to account for the
scarcity of meta-analyses in political science. Qualitative studies are even less likely to be
standardized in a way that allows for their integration into an ongoing research trajectory. Inputs
and outputs may be defined and operationalized in disparate ways, or perhaps not clearly
operationalized at all. And because samples are not randomly chosen, any aggregation of studies
cannot purport to represent a larger population in an unbiased fashion.
There is yet another angle on this topic that offers what is perhaps a more optimistic –
not to mention realistic – reading of the multimethod ideal. Rather than conceptualizing
qualitative and quantitative research as separate research designs we might regard them as
integral components of the same design.
Note that there are few purely qualitative studies. Although the main burden of inference
may be carried by qualitative data, this is often supplemented by a large-n cross-case analysis or a
large-n within-case analysis (where observations are drawn from a lower level of analysis).
Likewise, there are few purely quantitative analyses, since quantitative work is usually (always?)
accompanied by qualitative observations of one sort or another. At a minimum, qualitative data
is trotted out by way of illustration. At a maximum, qualitative data is essential to causal inference.
In this vein, a number of recent studies highlight the vital role played by qualitative data,
even when the research design is experimental or quasi-experimental. Although we tend to think
of these designs as being quantitative – since they generally incorporate a large number of
comparable units – they may also contain important qualitative components.
There is, to begin with, the problem of research design. Without an ethnographic
understanding of the research site and the individuals who are likely to serve as subjects it is
impossible to design an experiment that adequately tests a hypothesis of interest. It is impossible
to define a confounder “in the abstract.” In-depth case-based understanding is especially
important in the context of field experimentation, where one assumes that context matters a
good deal to how subjects are likely to react to a given treatment.
Second, one must assess potential threats to inference. Where the assignment is randomized,
ex ante comparability is assured. But ex post comparability remains a serious threat to inference.
For example, experiments often face problems of compliance, so it is incumbent on the
researcher to ascertain whether subjects adhered to the prescribed protocol and, if not, which
subjects violated the protocol. Where significant numbers of subjects attrit (withdraw from
participation) there is an important question about what motivated their withdrawal and what
sort of subjects were inclined to withdraw. In field experiments, where a significant time lag
often separates the treatment and the outcome of theoretical interest, one must try to determine
whether subjects under study may have communicated with one another, introducing potential
problems of interference and/or contamination (interference across treatment and control
groups).
Third, there is a question of causal mechanisms. Assuming a treatment effect can be
measured without bias, what is it that accounts for the connection between X and Y?
Finally, there are questions of generalizability. In order to determine the external validity of
an experiment one must have a good sense of the research site and the subjects who have been
studied. Specifically, one must be able to assess the extent to which these individuals, and this
particular treatment effect, can be mapped across other – potentially quite different – settings.
These issues – of research design, inferential threats, causal mechanisms, and
generalizability – are often assessable with qualitative data. Indeed, they may only be assessable
by means of a rich, contextual knowledge of a research project as it unfolds on a particular site.
Paluck (2010) argues, further, that experimental designs may be combined with
qualitative measurement to access outcomes that would not be apprehended with traditional
quantitative measures. As an example, she explores Chattopadhyay & Duflo’s (2004) study of
women leaders in India. While praising this landmark study, Paluck (2010: 61) points out,
participant observation of women leaders outside of the council settings—such as
in their homes, where they visit with other women—could have revealed whether
they were influenced by women constituents in these more informal settings.
Intensive interviews could compare social processes in villages with female or male
council leaders to reveal how beliefs about women leaders’ efficacy shift. For
example, did other council members, elders, or religious leaders make public
statements about female leaders or the reservation system? Was there a tipping
point at which common sentiment in villages with female leaders diverged from
villages with male leaders? Such qualitatively generated insights could have enabled
this study to contribute more to general theories of identity, leadership, and political
and social change. Moreover, ethnographic work could compare understandings of
authority and political legitimacy in villages with female- and male-led councils. Do
the first female leaders inspire novel understandings of female authority and
legitimacy, or are traditional gender narratives invoked just as frequently to explain
women’s new power and position?
Paluck concludes that experiments provide an opportunity for qualitative analysis, one that is
grossly under-utilized due to scholars’ parochial attitudes. Qualitative scholars who
wish to understand the impact of modernization would be well-advised to construct a field
experiment in which an agent of modernization – e.g., a bridge, road, harbor, radio tower – is
randomized across sites, allowing for an opportunity to systematically compare treatment and
control groups over time, using all the ethnographic tools at their disposal.
Where the treatment is not randomly assigned (i.e., in observational research) there are
additional issues pertaining to potential assignment (or selection) bias. Here, qualitative data
often comes into play (Dunning 2012). For example, Jeremy Ferwerda & Nicholas Miller (2014)
argue that devolution of power reduces resistance to foreign rule. To make this case, they focus on France
during World War Two, when the northern part of the country was ruled directly by German
forces and the southern part was ruled indirectly by the “Vichy” regime headed by Marshal
Pétain. The key methodological assumption of their regression discontinuity design is that the
line of demarcation was assigned in an as-if random fashion. For the authors, and for their critics
(Kocher & Monteiro 2015), this assumption requires in-depth qualitative research – research that
promises to uphold, or call into question, the authors’ entire analysis.
As a second example, we may consider Romer & Romer’s (2010) analysis of the impact
of tax changes on economic activity. Because tax changes are non-random, and likely to be
correlated with the outcome of interest, anyone interested in this question must be concerned
with bias arising from the assignment of the treatment. To deal with this threat, Romer & Romer
make use of the narrative record provided by presidential speeches and congressional reports to
elucidate the motivation of tax policy changes in the postwar era. This allows them to distinguish
policy changes that might have been motivated by economic performance from those that may
be considered as-if random. By focusing solely on the latter, they claim to provide an unbiased
test of the theory that tax increases are contractionary.
Ahmed, Amel, Rudra Sil. 2012. “When Multi-Method Research Subverts Methodological
Pluralism - Or, Why We Still Need Single-Method Research.” Perspectives on
Politics 10:4 (December) 935-953.
Alesina, Alberto, Edward Glaeser, Bruce Sacerdote. 2001. “Why Doesn’t the US Have a
European-Style Welfare State?” Brookings Papers on Economic Activity 2, 187-277.
Beach, Derek, Rasmus Brun Pedersen. 2013. Process-Tracing Methods: Foundations and Guidelines.
Ann Arbor, MI: University of Michigan Press.
Beck, Nathaniel. 2006. “Is Causal-Process Observation An Oxymoron?” Political Analysis 14(3).
Beck, Nathaniel. 2010. “Causal Process ‘Observations’: Oxymoron or (Fine) Old Wine.” Political
Analysis 18(4): 499–505.
Bennett, Andrew, Colin Elman. 2006. “Qualitative Research: Recent Developments in Case
Study Methods.” Annual Review of Political Science 9:455–76.
Bennett, Andrew. 2008. “Process Tracing: A Bayesian Approach.” In Janet Box-Steffensmeier,
Henry Brady, & David Collier (eds), Oxford Handbook of Political Methodology (Oxford: Oxford
University Press) 702-21.
Bennett, Andrew. 2015. “Disciplining our Conjectures: Systematizing Process Tracing with
Bayesian Analysis.” In Andrew Bennett & Jeffrey T. Checkel (eds), Process Tracing: From
Metaphor to Analytic Tool (Cambridge: Cambridge University Press) 276-98.
Bennett, Andrew, Jeffrey T. Checkel (eds). 2015. Process Tracing: From Metaphor to Analytic Tool.
Cambridge: Cambridge University Press.
Blatter, Joachim, Markus Haverland. 2012. Designing Case Studies: Explanatory Approaches in Small-n
Research. Palgrave Macmillan.
Boix, Carles. 1999. “Setting the Rules of the Game: The Choice of Electoral Systems in
Advanced Democracies.” American Political Science Review 93:3, 609-624.
Brady, Henry E. 2010. “Data-Set Observations versus Causal-Process Observations: The 2000
U.S. Presidential Election.” In Henry E. Brady & David Collier (eds), Rethinking Social Inquiry:
Diverse Tools, Shared Standards. 2nd ed. (Lanham, MD: Rowan & Littlefield) 237–42.
Brady, Henry E., David Collier (eds). 2004. Rethinking Social Inquiry: Diverse Tools, Shared Standards.
Lanham: Rowman & Littlefield.
Brady, Henry E., David Collier (eds). 2010. Rethinking Social Inquiry: Diverse Tools, Shared Standards.
2nd ed. Lanham, MD: Rowan & Littlefield.
Brewer, John, Albert Hunter. 2006. Foundations of Multimethod Research: Synthesizing Styles.
Thousand Oaks, CA: Sage.
Caporaso, James. 2009. “Is There a Quantitative-Qualitative Divide in Comparative Politics?” In
Todd Landman and Neil Robinson (eds), Sage Handbook of Comparative Politics (Thousand Oaks, CA: Sage).
Chattopadhyay, Raghabendra, Esther Duflo. 2004. “Women as policy makers: Evidence from a
randomized policy experiment in India.” Econometrica 72 (5): 1409-43.
Collier, David, Colin Elman. 2008. “Qualitative and Multimethod Research: Organizations,
Publications, and Reflections on Integration.” In Janet M. Box-Steffensmeier, Henry Brady,
and David Collier (eds), The Oxford Handbook of Political Methodology (Oxford: Oxford University Press).
Collier, David. 2011. “Understanding Process Tracing.” PS: Political Science and Politics 44(4).
Crandell, Jamie L., Corrine I. Voils, YunKyung Chang, Margarete Sandelowski. 2011. “Bayesian
data augmentation methods for the synthesis of qualitative and quantitative research
findings.” Quality & Quantity 45, 653–669.
Crasnow, Sharon. 2012. “The Role of Case Study Research in Political Science: Evidence for
Causal Claims.” Philosophy of Science 79:5 (December) 655-66.
Dunning, Thad. 2012. Natural Experiments in the Social Sciences: A Design-Based Approach.
Cambridge: Cambridge University Press.
Eckstein, Harry. 1975. “Case Studies and Theory in Political Science.” In Fred I. Greenstein and
Nelson W. Polsby (eds), Handbook of Political Science, vol. 7. Political Science: Scope and Theory
(Reading, MA: Addison-Wesley).
Elman, Colin, Diana Kapiszewski, Lorena Vinuela. 2010. “Qualitative Data Archiving: Rewards
and Challenges.” PS: Political Science & Politics 43, 23-27.
Elman, Colin, Diana Kapiszewski. 2014. “Data Access and Research Transparency in the
Qualitative Tradition.” PS: Political Science & Politics 47:1, 43-47.
Elman, Colin. 2005. “Explanatory Typologies in Qualitative Studies of International Politics.”
International Organization 59:2 (April) 293-326.
Epstein, Leon D. 1964. “A Comparative Study of Canadian Parties.” American Political Science
Review 58 (March) 46-59.
Fairfield, Tasha. 2013. “Going Where the Money Is: Strategies for Taxing Economic Elites in
Unequal Democracies.” World Development 47, 42–57.
Fairfield, Tasha. 2015. Private Wealth and Public Revenue in Latin America: Business Power and Tax
Politics. Cambridge: Cambridge University Press.
Fenno, Richard F., Jr. 1977. “U.S. House Members in Their Constituencies: An Exploration.”
American Political Science Review 71:3 (September) 883-917.
Fenno, Richard F., Jr. 1978. Home Style: House Members in their Districts. Boston: Little, Brown.
Ferwerda, Jeremy, Nicholas Miller. 2014. “Political Devolution and Resistance to Foreign Rule:
A Natural Experiment.” American Political Science Review 108:3 (August) 642-60.
Friedman, Milton, Anna Jacobson Schwartz. 1963. A Monetary History of the United States, 1867-
1960. Princeton: Princeton University Press.
Garfinkel, Harold. 1967. Studies in Ethnomethodology. Englewood Cliffs: Prentice-Hall.
George, Alexander L. 1979. “Case Studies and Theory Development: The Method of Structured,
Focused Comparison.” In Paul Gordon Lauren (ed), Diplomacy: New Approaches in History,
Theory, and Policy (New York: The Free Press).
George, Alexander L., Timothy J. McKeown. 1985. “Case Studies and Theories of
Organizational Decision-making.” In Robert F. Coulam & Richard A. Smith (eds), Advances in
Information Processing in Organizations (Greenwich, Conn.: JAI Press) 21–58.
George, Alexander L., Andrew Bennett. 2005. Case Studies and Theory Development. Cambridge, MA: MIT Press.
Gerring, John, Jason Seawright. 2016. “The Inference in Causal Inference: A Psychology for
Social Science Methodology.” Unpublished manuscript, Department of Political Science, Boston University.
Gerring, John, Lee Cojocaru. 2016. “Case-Selection: A Diversity of Methods and Criteria.”
Unpublished manuscript, Boston University, Department of Political Science.
Gerring, John, Rose McDermott. 2007. “An Experimental Template for Case-Study Research.”
American Journal of Political Science 51:3 (July) 688-701.
Gerring, John. 2007. Case Study Research: Principles and Practices. Cambridge: Cambridge University Press.
Gerring, John. 2012. “Mere Description.” British Journal of Political Science 42:4 (October) 721-46.
Gerring, John. 2016. Case Study Research: Principles and Practices, 2d ed. Cambridge: Cambridge University Press.
Gill, Christopher J., Lora Sabin, Christopher H. Schmid. 2005. “Why Clinicians are Natural
Bayesians.” BMJ 330:1080-3 (May 7).
Glassner, Barry, Jonathan D. Moreno (eds). 1989. The Qualitative-Quantitative Distinction in the Social
Sciences (Boston Studies in the Philosophy of Science, 112).
Goertz, Gary, James Mahoney. 2012. A Tale of Two Cultures: Qualitative and Quantitative Research in
the Social Sciences. Princeton: Princeton University Press.
Goertz, Gary. 2015. Multimethod Research, Causal Mechanisms, and Selecting Cases. Unpublished
manuscript, Department of Political Science, University of Notre Dame.
Grimmer, Justin, Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of
Automatic Content Analysis Methods for Political Texts.” Political Analysis 21:3, 267-97.
Hall, Peter A. 2003. “Aligning Ontology and Methodology in Comparative Politics.” In James
Mahoney and Dietrich Rueschemeyer (eds), Comparative Historical Analysis in the Social Sciences
(Cambridge: Cambridge University Press).
Hall, Peter A. 2006. “Systematic process analysis: when and how to use it.” European Management
Review 3, 24–31.
Hammersley, Martyn. 1992. “Deconstructing the Qualitative-Quantitative Divide.” In Julie
Brannen (ed), Mixing Methods: Qualitative and Quantitative Research (Aldershot: Avebury).
Harrits, Gitte Sommer. 2011. “More Than Method? A Discussion of Paradigm Differences
within Mixed Methods Research.” Journal of Mixed Methods Research 5(2): 150–66.
Herron, Michael C., Kevin M. Quinn. 2015. “A Careful Look at Modern Case Selection
Methods.” Sociological Methods & Research (forthcoming).
Humphreys, Macartan, Alan M. Jacobs. [in process] Integrated Inferences: A Bayesian Integration of
Qualitative and Quantitative Approaches to Causal Inference. Cambridge: Cambridge University Press.
Humphreys, Macartan, Alan M. Jacobs. 2014. “Mixing Methods: A Bayesian Approach, v.3.”
American Political Science Review (forthcoming).
Jacobs, Alan. 2015. “Process Tracing the Effects of Ideas.” In Andrew Bennett, Jeffrey T.
Checkel (eds), Process Tracing: From Metaphor to Analytic Tool (Cambridge: Cambridge University Press).
Kapiszewski, Diana, Lauren M. MacLean, Benjamin L. Read. 2015. Field Research in Political
Science: Practices and Principles. Cambridge: Cambridge University Press.
Karl, Terry Lynn. 1997. The Paradox of Plenty: Oil Booms and Petro-States. Berkeley: University of California Press.
King, Gary, Robert O. Keohane, Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in
Qualitative Research. Princeton: Princeton University Press.
Kocher, Matthew, Nuno Monteiro. 2015. “What’s in a Line? Natural Experiments and the Line
of Demarcation in WWII Occupied France.” Unpublished manuscript, Department of
Political Science, Yale University.
Kreuzer, Markus. 2010. “Historical Knowledge and Quantitative Analysis: The Case of the
Origins of Proportional Representation.” American Political Science Review 104:369–92.
Levy, Jack S. 2007. “Qualitative Methods and Cross-Method Dialogue in Political Science.”
Comparative Political Studies 40(2): 196–214.
Levy, Jack S. 2008. “Case Studies: Types, Designs, and Logics of Inference.” Conflict Management
and Peace Science 25:1–18.
Lieberman, Evan S. 2005. “Nested Analysis as a Mixed-Method Strategy for Comparative
Research.” American Political Science Review 99:3 (August) 435-52.
Lieberman, Evan S. 2010. “Bridging the Qualitative-Quantitative Divide: Best Practices in the
Development of Historically Oriented Replication Databases.” Annual Review of Political Science
Lieberman, Evan S. 2016. “Improving Causal Inference through Non-Causal Research: Can the
Bio-Medical Research Cycle Provide a Model for Political Science?” Unpublished manuscript,
Department of Political Science, MIT.
Lijphart, Arend. 1971. “Comparative Politics and the Comparative Method.” American Political
Science Review 65, 682-93.
Lohmann, Susanne. 2007. “The Trouble with Multi-Methodism.” Newsletter of the APSA Organized
Section on Qualitative Methods 5(1): 13–17.
Lucas, Samuel R., Alisa Szatrowski. 2014. “Qualitative Comparative Analysis in Critical
Perspective.” Sociological Methodology 44:1, 1–79.
Lynd, Robert Staughton, Helen Merrell Lynd. 1929/1956. Middletown: A Study in American Culture.
New York: Harcourt, Brace.
Mahoney, James, Gary Goertz. 2006. “A Tale of Two Cultures: Contrasting Quantitative and
Qualitative Research.” Political Analysis 14:3 (Summer) 227-49.
Mahoney, James, Kathleen Thelen (eds). 2015. Advances in Comparative-Historical Analysis.
Cambridge: Cambridge University Press.
Mahoney, James, Rachel Sweet Vanderpoel. 2015. “Set Diagrams and Qualitative Research.”
Comparative Political Studies 48:1 (January) 65-100.
Mahoney, James. 2010. “After KKV: The New Methodology of Qualitative Research.” World
Politics 62 (1): 120–47.
Mahoney, James. 2012. “The Logic of Process Tracing Tests in the Social Sciences.” Sociological
Methods & Research 41:4 (November) 566-590.
Mansfield, Edward D., Jack Snyder. 2005. Electing to Fight: Why Emerging Democracies Go to War.
Cambridge: MIT Press.
McKeown, Timothy J. 1999. “Case Studies and the Statistical World View.” International
Organization 53 (Winter) 161-190.
McLaughlin, Eithne. 1991. “Oppositional Poverty: The Quantitative/Qualitative Divide and
Other Dichotomies.” The Sociological Review 39 (May): 292-308.
Mill, John Stuart. 1843/1872. The System of Logic, 8th ed. London: Longmans, Green.
Moore, Barrington, Jr. 1966. Social Origins of Dictatorship and Democracy: Lord and Peasant in the
Making of the Modern World. Boston: Beacon Press.
Morgan, Mary. 2012. “Case Studies: One Observation or Many? Justification or Discovery?”
Philosophy of Science 79:5 (December) 655-66.
Paluck, Elizabeth Levy. 2010. “The Promising Integration of Qualitative Methods and Field
Experiments.” The ANNALS of the American Academy of Political and Social Science 628, 59-71.
Patton, Michael Quinn. 2002. Qualitative Research & Evaluation Methods. Thousand Oaks, CA: Sage.
Pincus, Steve. 2011. 1688: The First Modern Revolution. New Haven: Yale University Press.
Platt, Jennifer. 1992. “‘Case Study’ in American Methodological Thought.” Current Sociology 40:1.
Posner, Daniel. 2004. “The Political Salience of Cultural Difference: Why Chewas and
Tumbukas are Allies in Zambia and Adversaries in Malawi.” American Political Science Review
98:4 (November) 529-46.
Ray, James Lee. 1993. “Wars between Democracies: Rare or Nonexistent?” International Interactions.
Reiss, Julian. 2009. “Causation in the Social Sciences: Evidence, Inference, and Purpose.”
Philosophy of the Social Sciences 39(1): 20–40.
Rihoux, Benoit. 2013. “Qualitative Comparative Analysis (QCA), Anno 2013: Reframing The
Comparative Method’s Seminal Statements.” Swiss Political Science Review 19:2, 233–45.
Roberts, Clayton. 1996. The Logic of Historical Explanation. University Park: Pennsylvania State University Press.
Rohlfing, Ingo. 2012. Case Studies and Causal Inference: An Integrative Framework. Palgrave Macmillan.
Romer, Christina D., David H. Romer. 2010. “The Macroeconomic Effects of Tax Changes:
Estimates Based on a New Measure of Fiscal Shocks.” American Economic Review 100 (June).
Schimmelfennig, Frank. 2015. “Efficient Process Tracing: Analyzing the Causal Mechanisms of
European Integration.” In Andrew Bennett, Jeffrey T. Checkel (eds), Process Tracing: From
Metaphor to Analytic Tool (Cambridge: Cambridge University Press) 98-125.
Schwartz, Howard, Jerry Jacobs. 1979. Qualitative Sociology: A Method to the Madness. New York: Free Press.
Seawright, Jason, John Gerring. 2008. “Case-Selection Techniques in Case Study Research: A
Menu of Qualitative and Quantitative Options.” Political Research Quarterly 61:2 (June) 294-308.
Seawright, Jason. 2015a. “The Case for Selecting Cases that are Deviant or Extreme on the
Independent Variable.” Sociological Methods & Research (forthcoming).
Seawright, Jason. 2015b. Multi-Method Social Science: Combining Qualitative and Quantitative Tools.
Cambridge: Cambridge University Press, forthcoming.
Shapiro, Ian, Rogers Smith, Tarek Masoud (eds). 2004. Problems and Methods in the Study of Politics.
Cambridge: Cambridge University Press.
Shweder, Richard A. 1996. “Quanta and Qualia: What is the ‘Object’ of Ethnographic Method?”
In Richard Jessor, Anne Colby, and Richard A. Shweder (eds), Ethnography and Human
Development: Context and Meaning in Social Inquiry (Chicago: University of Chicago Press).
Sil, Rudra. 2000. “The Division of Labor in Social Science Research: Unified Methodology or
‘Organic Solidarity’?” Polity 32:4 (Summer) 499-531.
Skocpol, Theda, Margaret Somers. 1980. “The Uses of Comparative History in Macrosocial
Inquiry.” Comparative Studies in Society and History 22:2 (April) 147-97.
Skocpol, Theda. 1979. States and Social Revolutions: A Comparative Analysis of France, Russia, and
China. Cambridge: Cambridge University Press.
Snow, C.P. 1959/1993. The Two Cultures. Cambridge: Cambridge University Press.
Strauss, Anselm, Juliet Corbin. 1998. Basics of Qualitative Research: Techniques and Procedures for
Developing Grounded Theory. Thousand Oaks: Sage.
Van Evera, Stephen. 1997. Guide to Methods for Students of Political Science. Ithaca: Cornell University Press.
Waldner, David. 2012. “Process Tracing and Causal Mechanisms.” In Harold Kincaid (ed),
Oxford Handbook of Philosophy of Social Science (Oxford: Oxford University Press) 65-84.
Waldner, David. 2015a. “Process Tracing and Qualitative Causal Inference.” Security Studies 24:2.
Waldner, David. 2015b. “What Makes Process Tracing Good? Causal Mechanisms, Causal
Inference, and the Completeness Standard in Comparative Politics.” In Andrew Bennett,
Jeffrey T. Checkel (eds), Process Tracing: From Metaphor to Analytic Tool (Cambridge: Cambridge
University Press) 126-52.
Walter, Barbara. 2002. Committing to Peace: The Successful Settlement of Civil Wars. Princeton:
Princeton University Press.
Yanow, Dvora, Peregrine Schwartz-Shea (eds). 2013. Interpretation and Method: Empirical Research
Methods and the Interpretive Turn, 2d ed. Armonk, NY: M E Sharpe.
Qual and Quant
Converting Words to Numbers
A Selection of Recent Topics
Table 1: Case-Selection Strategies
Frameworks for Qualitative Inquiry
Table 2: Qualitative Tests and their Presumed Inferential Role
Rules of Thumb for Qualitative Inquiry