Department of Political Science
Annual Review of Political Science 20 (May/June 2017)
Draft: 30 November 2015
Estimated pages: 23
Please do not cite without permission
Qualitative methods, broadly construed, extend back through the foggy mists of time to the very
beginnings of social and political analysis (however that might be dated). Self-conscious reflection
on those methods is comparatively recent. The first methodological statements of contemporary
relevance grew out of the work of logicians, philosophers, and historians in the nineteenth
century, most importantly J.S. Mill (1843). To be sure, these scholars were on a quest for science,
understood as a unified venture. So the notion of a method that applies only to qualitative data
would have made little sense to them.
At the turn of the twentieth century, a bifurcation appeared between quantitative and
qualitative methods (Platt 1992). The natural sciences, along with economics, moved fairly
quickly and without much fuss into the quantitative camp, while the humanities remained largely
qualitative in orientation. The social sciences found themselves in the middle – divided between
scholars aligned with each camp, and some who embraced both. For this reason, the qual/quant
distinction has assumed considerable importance in these fields, and very little importance
outside these fields.
Perhaps it is not coincidental that the quest for a “method” of qualitative inquiry has
proceeded further in the social sciences than in the humanities. And among the social sciences
one might argue that political science has gone further than any other in developing the field of
qualitative methods. Accordingly, this review article focuses primarily on work produced by
political scientists, with an occasional glance at neighboring disciplines.
I begin by discussing the time-honored qualitative/quantitative distinction. What is
qualitative data and analysis, and how does it differ from quantitative data and analysis? I propose
an explicit definition for “qualitative” and then explore the implications of that definition. The
second section focuses on several areas of qualitative research that seem especially prominent
and/or fecund, judging by output over the past decade. This includes (a) case-selection, (b)
frameworks for qualitative inquiry, (c) rules of thumb for qualitative inquiry, and (d)
Qual and Quant
Although the qual/quant distinction is ubiquitous in social science today, the distinction is
viewed differently by scholars in either camp. As a rule, scholars whose work is primarily
quantitative tend to view social science as a unified endeavor, following similar rules and
assumptions. The naturalistic ideal centers on goals such as replication, cumulation, and
consensus – all of which point toward a single logic of inference (Beck 2006, 2010; King,
Keohane & Verba 1994).
By contrast, scholars whose work is primarily qualitative tend to view the two modes of
inquiry as distinctive, perhaps even incommensurable. They are more likely to identify with the
idea that knowledge of the world is embedded in theoretical, epistemological, or ontological
frameworks from which we can scarcely disentangle ourselves. They may also identify with the
phenomenological idea that all human endeavor, including science, is grounded in human
experience. Since experiences – often couched in positions of differential power and status –
vary, one can reasonably expect that the methods and goals of social science might also vary. The
apparent embeddedness of knowledge reinforces qualitative scholars’ predilection toward
pluralism, as it suggests that there are fundamentally – and legitimately – different ways of going
about business (Ahmed & Sil 2012; Bennett & Elman 2006: 456-57; Goertz & Mahoney 2012;
Hall 2003; Mahoney & Goertz 2006; Shapiro, Smith & Masoud 2004; Sil 2000; Yanow &
Following the axiom that where one sits determines where one stands, let us also
consider the stakes in this controversy. Over the past century quantitative work has been on the
ascendant and qualitative work has been cast in a defensive posture. There are lots of qualitative
practitioners but comparatively few qualitative methodologists. Consequently, it has been
difficult for researchers to explain their work in ways that those in the quantitative tradition can
understand and respect. One may surmise that many qualitative scholars, uncomfortable with the
prospect of absorption into a “quantitative template,” have sought to emphasize the
distinctiveness of what they do for strategic reasons – establishing a nature preserve for an
endangered species, as it were.
Whatever its intellectual and sociological sources, the question of unity or disunity
depends upon how one chooses to define similarity and difference. Any two objects will share
some characteristics and differ in others. It follows that they may be either compared or
contrasted, depending upon the author’s point of view. Quantitatively inclined scholars may
choose to focus on similarities while qualitatively inclined scholars choose to focus on
differences. Both views are correct, as far as they go. The half-empty/half-full conundrum seems
difficult to overcome in this particular context.1 To put the matter in a more specific frame: all
may agree with Brady & Collier (2010) that there are “diverse tools” (the pluralistic perspective)
as well as “shared standards” (the monist perspective). But they do not necessarily agree on what
those shared standards are or to what extent they should discipline the work of social science.
Any attempt to resolve the monism/pluralism question that begins with high-level
concepts (e.g., monism and pluralism, logic of inquiry, epistemology, commensurability,
naturalism, interpretivism) is probably doomed to failure. These words are loaded, and once they
have been uttered the die is cast. Participants from either camp will dig in their heels.
I propose to take a ground-level approach that avoids hot-button concepts from
philosophy of science and focuses instead on matters of definition. What, exactly, is qualitative
data? And what, by contrast, is quantitative data? We shall then explore the repercussions of this
distinction, working toward some tentative conclusions that – hopefully – all may agree with,
even if they do not resolve all aspects of the qual/quant debate.
Since qualitative and quantitative are antonyms one cannot define one without defining the
other. I begin, therefore, by listing some of the attributes commonly associated with these terms:
Qualitative work is expressed in natural language while quantitative work is
expressed in numbers and in statistical models. Qualitative work employs small
samples, while quantitative work is large-n. Qualitative work is often focused on the
subjective feelings and understandings of those under study and, accordingly, with
techniques such as ethnography and unstructured interviews. Quantitative work is
often focused on seemingly objective conditions, or things held in common.
Qualitative work draws on cases chosen in an opportunistic or purposive fashion
while quantitative work employs systematic (random) sampling. Qualitative work is
often focused on particular individuals, events, and contexts, lending itself to an
idiographic style of analysis. Quantitative work is more likely to be focused on
features that (in the researcher’s view) can be generalized across a larger population,
lending itself to a nomothetic style of analysis.
Let us suppose that all of the foregoing contrasts contain some empirical truth; that is,
the foregoing characteristics co-vary in the work of social scientists. And let us further suppose
that they resonate with common usage of these terms, as reflected in work on the subject (e.g.,
Bennett & Elman 2006; Brady 2010; Caporaso 2009; Collier & Elman 2008; Glassner & Moreno
1989; Goertz & Mahoney 2012; Hammersley 1992; King, Keohane & Verba 1994; McLaughlin
1991; Levy 2007; Morgan 2012; Patton 2002; Schwartz & Jacobs 1979; Shweder 1996; Snow
1959/1993; Strauss & Corbin 1998). If so, we have usefully surveyed the field. But we have not
provided anything more than a map of this rugged terrain.
1 This is nicely illustrated in recent arguments about causation (Reiss 2009).
My goal is to arrive at a minimal definition that bounds our subject in a fairly crisp
fashion, that resonates with extant understandings, and that does not trespass on other well-
established terms. (It would not be efficient, semantically speaking, to conflate qualitative with
idiographic, ethnographic, or some other term in this family of concepts.) In addition, it would
be helpful if the proffered definition accounts for (in a loosely causal sense) the various attributes
commonly associated with the terms “qualitative” and “quantitative” as surveyed above.
With these goals in mind, I propose that a fundamental feature of qualitative work is its
use of non-comparable observations – observations that pertain to different aspects of a causal or
descriptive question. As an example, one may consider the clues in a typical detective story. One
clue concerns the suspect’s motives; another concerns his location at the time the crime was
committed; a third concerns a second suspect; and so forth. Each observation, or clue, draws
from a different population. This is why they cannot be arrayed in a matrix (rectangular) dataset
and must be dealt with in prose (aka narrative analysis). It is also why we have difficulty counting
such observations. The time-honored question of quantitative research – What is the n? – is
impossible to answer in a definitive fashion. Likewise, styles of inference based on qualitative
data operate somewhat differently than styles of inference based on quantitative data.
I therefore define quantitative observations as comparable (along whatever dimensions
are relevant) and qualitative observations as non-comparable, regardless of how many there are.
When qualitative observations are employed for causal analysis they may be referred to as causal-
process observations (Brady 2010), though I shall continue to employ the more general (and less
bulky) term, qualitative observation, which applies to both descriptive and causal inferences.
A qualitative or quantitative analysis, accordingly, is one whose inferences rest
on one or the other sort of data. If the work is quantitative, it enlists patterns of covariation
found in a matrix of observations and analyzed with a formal model (e.g., set theory/QCA,
frequentist statistics, Bayesian probabilities, randomization inference) to reach a descriptive or
causal inference. If the work is qualitative, the inference is based on bits and pieces of non-
comparable observations that address different aspects of a problem. Traditionally, these are
analyzed in an informal fashion, an issue taken up below.
Some strategies of data collection seem inherently qualitative, e.g., unstructured
interviews, participant-observation (ethnography), and archival work. This is because researchers
are likely to incorporate a wide variety of clues drawn from different kinds of sources and
addressing different aspects of a problem. The heterogeneity of the evidence makes it non-
comparable, and hence qualitative. Other data collection strategies such as standardized surveys
are inherently quantitative, as they involve counting large numbers of observations that are
comparable – by assumption. Of course, they might not actually be comparable. We are speaking
here of assumptions about the data generating process, not about the truth with a capital T. But
we cannot avoid assumptions about the world, and these assumptions – quite rightly – lead
researchers to adopt one or the other method of apprehending that reality.
Converting Words to Numbers
No qualitative observation is immune from quantification. Interviews, pictures, ethnographic
notes, and texts drawn from other sources may be coded, either through judgments exercised by
coders or through mathematical algorithms (Grimmer & Stewart 2013). By coding I refer to the
systematic measurement of the phenomenon at hand – reducing the available information to a
small number of dimensions, consistently defined across the units of interest. All that is required,
following our definition, is that multiple observations of the same kind be produced and (voila!)
quantitative observations are born. These may then be represented in the matrix format familiar
to those who work with rectangular datasets.
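The coding operation described above can be sketched in a few lines. The sketch is a minimal illustration with invented data and a deliberately crude keyword-based coding scheme (all names and rules here are hypothetical, not drawn from any actual study): each unstructured note is reduced to the same small set of consistently defined dimensions, so the results can be stacked as rows of a rectangular dataset.

```python
# Hypothetical qualitative observations: unstructured interview notes.
notes = [
    {"id": "resp01", "text": "Grew up on a farm; strongly distrusts the city council."},
    {"id": "resp02", "text": "Lifelong urban resident, active in local party politics."},
    {"id": "resp03", "text": "Moved from a farm to the city; trusts the council."},
]

# A (deliberately crude) coding scheme: the same dimensions, consistently
# defined, are applied to every unit of interest.
def code_note(note):
    text = note["text"].lower()
    return {
        "id": note["id"],
        "rural_background": int("farm" in text),        # 1 = mentions rural origin
        "distrusts_council": int("distrusts" in text),  # 1 = expresses distrust
    }

# Coding every note yields comparable rows -- a matrix (rectangular) dataset.
matrix = [code_note(n) for n in notes]
for row in matrix:
    print(row)
```

Note how the unique texture of each note (the move to the city, the party activism) is discarded in the process: exactly the information loss discussed below.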
Of course, there are often practical obstacles to quantification. Perhaps additional
sources (informants, pictures, texts) are unavailable. Perhaps, if available, they are not really
comparable, or they introduce problems of causal identification (e.g., heterogeneity across cases
that could pose a problem of noise or confounding). Alternatively, it may be possible to generate
additional (comparable) observations but not worthwhile, e.g., because the first observation is
sufficient to prove the point at issue. Sometimes, one clue is decisive. Nonetheless, in principle,
if the researcher’s assumptions of comparability are justified, qualitative data can become
quantitative data. The plural of anecdote is data.
Something is always lost in the process of reducing qualitative information to quantitative
data. One must ignore the unique aspects of each qualitative observation in order render them
comparable. If one wishes to generalize across a population, ignoring idiosyncratic features of
the data is desirable. But if one wishes to shed light on these heterogeneous features the
conversion of qualitative to quantitative data will iron out the ruggedness of the landscape –
obscuring variation of theoretical interest. Information loss must be reckoned with.2
Finally, and perhaps most importantly, there is an asymmetry between qual and quant.
One can convert qualitative data to quantitative data but not the reverse. It is a one-way street.
Once a piece of information is rendered in a matrix template whatever unique aspects may have
adhered to that observation have been lost. Data reduction is possible, but not expansion. The
singular of data is not anecdote, which is to say one can never recover an anecdote from a data
matrix.
It follows from our discussion that the utility of qualitative and quantitative data varies according
to the researcher’s goals.
First, qualitative data is likely to be more useful insofar as a study is focused on a single
case (or event), or a small number of cases (or events). Such investigations bear close
resemblance, methodologically speaking, to a detective’s quest to explain a crime, which may be
thought of as a single event or a small number of associated events (if it is a string of crimes
committed by the same person or group). The reason that these investigations often rest on
qualitative data is that the researcher wishes to know a lot about the chosen case/event, and this
requires a supple mode of investigation that allows one to draw different kinds of observations
from different populations. Whether case-level analysis is warranted may rest on other, more
fundamental aspects of the analysis. For example, case-level analysis is more plausible if the cases
of theoretical interest are heterogeneous and scarce (e.g., nation-states) rather than homogeneous
and plentiful (e.g., firms or individuals), if the causal factor cannot be manipulated by the
researcher, if the causal factor or outcome is extremely rare, if the goal of the analysis is
exploratory rather than confirmatory, and so forth (Gerring 2016).
Second, qualitative data is likely to be more important in a causal analysis insofar as the
researcher seeks to identify token (singular, actual) causes rather than general (type) causes. The
former demands a case-level analysis – focused on what caused an outcome in a particular
instance. It also usually involves counterfactual thought-experiments – focused on what would
have happened if a causal factor had been present/absent – and these sorts of analyses are
almost by definition qualitative in nature.
Third, qualitative data is likely to be more important if the case-level analysis is idiographic
(focused on describing or explaining the case) rather than nomothetic (focused on generalizing
beyond the chosen case). Critical-juncture explanations offer a case in point, since such junctures
are the product of unique events that (by definition) cannot be predicted or explained by a
general model (cites). It is difficult to imagine applying a quantitative model, or matrix
observations, to a critical juncture. The general point is that unique features of a case are often
easier to represent qualitatively than quantitatively. Indeed, the very act of quantitative
measurement and analysis presupposes a metric that stretches beyond the case under study.
2 Of course, any rendering of a complex phenomenon involves some loss of information. This is true even for the
most faithful – and lengthy – descriptions of reality such as those produced by ethnomethodologists (Garfinkel
Other features of a theory or an analysis do not seem to have a bearing on the relative
utility of qualitative and quantitative data.3 The main point, then, is that wherever the focus of
analysis is limited to one or several units qualitative data is likely to be employed, though it may
be supplemented by quantitative data. Indeed, most case study research combines qualitative and
quantitative analysis (Gerring 2016), and most cross-case research involves some recourse to
qualitative data, either in an anecdotal or illustrative fashion or as a way to address concerns
about causal identification or causal mechanisms, as discussed below. In this light, one might
argue that all research is multi-method, a subject to which we return.
A Selection of Recent Topics
Having defined our terms, and explored some of the ramifications of this definition, we turn
now to recent work on the topic of qualitative methods. My treatment is highly selective, focused
on several areas that have received a good deal of attention from scholars in recent years. This
includes case selection, frameworks for qualitative inquiry, rules of thumb for qualitative inquiry, and
Evidently, many important subjects are left aside in this short review. I do not address set
theory and qualitative comparative analysis (QCA), as this extraordinarily large and complex
subject does not fit my proposed definition of qualitative research and is, in any case,
encompassed by other recent reviews (e.g., Mahoney 2010; Mahoney & Vanderpoel 2015;
Rihoux 2013). I do not address typological methods and concept formation (e.g., Bennett &
Elman 2006; Elman 2005), data archiving, transparency, and replication (Elman & Kapiszewski
2014; Elman, Kapiszewski & Vinuela 2010; Lieberman 2010), comparative historical analysis
(Mahoney & Thelen 2015), or the organizational features of qualitative methods (Collier &
Elman 2008) – all of which are nicely handled by previous reviews. Nor do I have much to say
about specific data collection methods – interviewing, ethnography, archival work, and so forth
(e.g., Kapiszewski, MacLean & Read 2015). Most importantly, I focus primarily on causal
inference, leaving aside many knotty questions pertaining to descriptive inference (Gerring
Case Selection
We have observed that case-based analysis is likely to contain qualitative observations (even if it
also incorporates quantitative observations). Consequently, the question of case-selection – how
a case, or a small number of cases, is chosen from a large number of potential cases – is central
to qualitative analysis.
Quite a number of case-selection typologies have been proposed over the years, with a
noticeable acceleration in the past decade. Mill (1843/1872) proposes the method of difference
(aka most-similar method) and method of agreement (aka most-different method), along with
several others that have not gained traction.
3 For example, the long-standing distinction between research that seeks a complete explanation of an outcome
(“causes-of-effects”) and research that narrows its scope to a single hypothesis (“effects-of-causes”) seems to bear
ambivalently on the qual/quant divide. Note that a causes-of-effects explanation may be provided solely on the basis
of quantitative data, e.g., a “full” regression model. Likewise, an effects-of-causes explanation may be provided
based solely on qualitative data, i.e., a process-tracing analysis.
4 This imposes a “positivist” lens on the questions under discussion, as many interpretivists – and presumably all
post-structuralists – view their research goals quite differently. Nonetheless, recent methodological work on
qualitative methods in political science is focused mostly on causal inference, following well-established disciplinary
preferences (Gerring 2012). From this perspective, our fairly restrictive scope-conditions may be justified even
though they leave a vast expanse of qualitative research outside the purview of this study.
Lijphart (1971: 691) proposes six case study types: atheoretical, interpretative,
hypothesis-generating, theory-confirming, theory-infirming, and deviant. Eckstein (1975)
identifies five species: configurative-idiographic, disciplined-configurative, heuristic, plausibility
probes, and crucial-case. Skocpol & Somers (1980) identify
three logics of comparative history: macro-causal analysis, parallel demonstration of theory, and
contrast of contexts. Gerring (2007) and Seawright & Gerring (2008) identify nine techniques:
typical, diverse, extreme, deviant, influential, crucial, pathway, most-similar, and most-different.
Levy (2008) identifies five case study research designs: comparable, most and least likely, deviant,
and process tracing. Rohlfing (2012: ch3) identifies five case-types – typical, diverse, most-likely,
least-likely, and deviant – which are applied differently according to the purpose of the case
study. Blatter & Haverland (2012: 24-26) identify three explanatory approaches – covariational,
process tracing, and congruence analysis – each of which offers a variety of case-selection
options.
Building on these efforts, Gerring & Cojocaru (2016) propose a new typology that
(arguably) qualifies as the most comprehensive to date, incorporating much of the foregoing
literature. Its organizing feature is the goal that a case study is intended to serve, identified in the
first column of Table 1. Column 2 specifies the number of cases (n) in the case study. It will be
seen that case studies enlist a minimum of one or two cases, with no clearly defined ceiling.
Column 3 clarifies which dimensions of the case are relevant for case-selection, i.e., descriptive
features (D), causal factors of theoretical interest (X), background factors (Z), and/or the
outcome (Y). Column 4 specifies the criteria used to select a case(s) from a universe of possible
cases. Column 5 offers an example of each case-selection strategy. In what follows, I offer a brief
resume of the resulting typology.
Table 1: Case-Selection Strategies
Goals/Strategies N Factors Criteria for cases Examples
I. DESCRIPTIVE (to describe)
● Typical 1+ D Mean, mode, or median of D Lynd & Lynd (1929) Middletown
● Diverse 2+ D Typical sub-types Fenno (1977, 1978) Home Style
II. CAUSAL (to explain Y)
1. Exploratory (to identify HX)
● Outcome 1+ Y Maximize variation in Y Skocpol (1979) States and Social Revolutions
● Index 1+ Y First instance of ∆Y Pincus (2011) 1688: First Modern Revolution
● Deviant 1+ Z Y Poorly explained by Z Alesina et al (2001) Why Doesn’t US Have Welfare State?
● Most-similar 2+ Z Y Similar on Z, different on Y Epstein (1964) A Comparative Study of Canadian Parties
● Most-different 2+ Z Y Different on Z, similar on Y Karl (1997) Paradox of Plenty
● Diverse 2+ Z Y All possible configurations of Z (assumption: X ∈ Z) Moore (1966) Social Origins of Dictatorship and Democracy
2. Estimating (to estimate HX)
● Longitudinal 1+ X Z X changes, Z constant or biased against HX Friedman & Schwartz (1963) Monetary History of US
● Most-similar 2+ X Z Similar on Z, different on X Posner (2004) Political Salience of Cultural Difference
3. Diagnostic (to assess HX)
● Influential 1+ X Z Y Greatest impact on P(HX) Ray (1993) Wars between Democracies
● Pathway 1+ X Z Y X→Y strong, Z constant or biased against HX Mansfield & Snyder (2005) Electing to Fight
● Most-similar 2+ X Z Y Similar on Z, different on X & Y, X→Y strong Walter (2002) Committing to Peace
D = descriptive features (other than those to be described in a case study). HX = causal hypothesis of interest. P(HX) = the probability of HX. X = causal factor(s) of theoretical
interest. X→Y = apparent or estimated causal effect. Y = outcome of interest. Z = vector of background factors that may affect X and/or Y.
Many case studies are primarily descriptive, which is to say they are not organized around a
central, overarching causal hypothesis. Although writers are not always explicit about their
selection of cases, most of these decisions might be described as following a typical or diverse case
strategy. That is, they aim to identify a case, or cases, that exemplify a common pattern (typical)
or patterns (diverse). This follows from the minimal goals of descriptive analysis. Where the goal
is to describe there is no need to worry about the more complex desiderata that might allow one
to gain causal leverage on a question of interest.
Other case studies are oriented toward causal analysis. A good case (or set of cases) for
purposes of causal analysis is generally one that exemplifies quasi-experimental properties, i.e., it
replicates the virtues of a true experiment even while lacking a manipulated treatment (Gerring &
McDermott 2007). Specifically, for a given case (observed through time) or for several cases
(compared to each other), variation in X should not be correlated with other factors that are also
causes of Y, which might serve as confounders (Z), generating a spurious (non-causal)
relationship between X and Y.
Exploratory cases attempt to identify a possible cause of an outcome of theoretical
interest. The outcome, Y, is established, and is usually framed as a research question. What
accounts for variation in Y? Or, if Y is a discrete event, Why does Y occur? The researcher may also
have an idea about background conditions, Z, that influence Y but are not of theoretical interest.
The purpose of the study, in any case, is to identify X, regarded as a possible or probable cause
of Y. Specific techniques of exploratory case selection may be classified as outcome, index, deviant,
most-different, most-similar, or diverse, as specified in Table 1.
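For intuition, the exploratory most-similar criterion in Table 1 (similar on Z, different on Y) can be implemented algorithmically. The sketch below uses invented data; the case names, covariate values, and the simple absolute-difference distance measure are assumptions chosen purely for illustration.

```python
# A minimal sketch of exploratory most-similar case selection: among cases
# measured on background factors Z and outcome Y, pick the pair of cases most
# similar on Z yet different on Y (hypothetical data).
from itertools import combinations

cases = {
    # name: (Z1, Z2, Y) -- Z: background covariates, Y: binary outcome
    "A": (0.90, 0.80, 1),
    "B": (0.85, 0.75, 0),
    "C": (0.10, 0.20, 1),
    "D": (0.15, 0.90, 0),
}

def z_distance(a, b):
    # Similarity on background factors: sum of absolute differences on Z.
    (z1a, z2a, _), (z1b, z2b, _) = cases[a], cases[b]
    return abs(z1a - z1b) + abs(z2a - z2b)

# Candidate pairs must differ on Y; choose the pair closest on Z.
pairs = [p for p in combinations(cases, 2) if cases[p[0]][2] != cases[p[1]][2]]
best_pair = min(pairs, key=lambda p: z_distance(*p))
print(best_pair)
```

Here cases A and B are nearly identical on background factors but diverge on the outcome, so they form the most-similar pair; the divergence invites a search for the as-yet-unidentified X.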
Estimating cases aim to test a hypothesis by estimating a causal effect. That might mean a
precise point estimate along with a confidence interval (e.g., from a time-series or synthetic
matching analysis), or an estimate of the “sign” of a relationship, i.e., whether X has a positive,
negative, or no relationship to Y. The latter is more common, not only because of the small size
of the sample (at the case level) but also because it is more likely to be generalizable across a
population of cases. In either situation, case selection rests on information about X and Z (not
Y). Two general approaches are viable – longitudinal and most similar – as outlined in Table 1.
Diagnostic case studies help to confirm, disconfirm, or refine a hypothesis (garnered from
the literature on a subject or from the researcher’s own ruminations) and identify the generative
agent (mechanism) at work in that relationship. All the elements of a causal model – X, Z, and Y
– are generally involved in the selection of a diagnostic case. Specific strategies may be classified
as influential, pathway, or most-similar, as shown in Table 1.
Note that virtually all of these case selection strategies may be executed in an informal,
qualitative fashion or by employing a quantitative algorithm. For example, a deviant case could
be chosen based on a researcher’s sense about which case(s) is poorly explained by extant
theories. Or it might be chosen by looking at residuals from a regression model. Discussion of
the pros and cons of algorithmic case selection can be found in Gerring (2016).
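As an illustration of the algorithmic route just mentioned, a deviant case might be selected as the observation with the largest absolute residual from a regression of Y on Z. The sketch below uses invented data and a bivariate least-squares fit computed by hand; it is a toy illustration of the idea, not the specific procedure discussed in Gerring (2016).

```python
# A minimal sketch of algorithmic deviant-case selection: fit a simple
# bivariate regression of Y on Z and pick the case least well explained
# by the model, i.e., the one with the largest absolute residual.

cases = ["A", "B", "C", "D", "E"]
Z = [1.0, 2.0, 3.0, 4.0, 5.0]   # background covariate (hypothetical)
Y = [1.1, 2.0, 2.9, 8.0, 5.1]   # outcome; "D" sits far off the line

# Ordinary least squares for a single regressor, computed by hand.
n = len(Z)
mz, my = sum(Z) / n, sum(Y) / n
beta = sum((z - mz) * (y - my) for z, y in zip(Z, Y)) / sum((z - mz) ** 2 for z in Z)
alpha = my - beta * mz

# Residuals: observed Y minus model prediction; the deviant case maximizes |residual|.
residuals = [y - (alpha + beta * z) for z, y in zip(Z, Y)]
deviant = max(zip(cases, residuals), key=lambda cr: abs(cr[1]))[0]
print(deviant)
```

The same residuals could of course come from a multivariate model; the selection rule (maximize the absolute residual) is unchanged.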
The reader may wonder, how might one know whether a designated strategy will achieve
what it is intended to achieve? Evidently, there are serious problems of validation to wrestle with.
Several attempts have been made to assess varying case selection strategies using simulation
techniques. Herron & Quinn (2015) assess estimating strategies, i.e., where the case is intended
to measure causal effects. Seawright (2015b) assesses diagnostic strategies, where the case is
designed to help confirm or disconfirm a causal hypothesis. Lucas & Szatrowski (2014) assess
QCA-based strategies of case-selection.
It would take some time to discuss these complex studies, so I shall content myself with
several summary judgments. First, case selection techniques have different goals, so any attempt
to compare them must focus on the goals that are appropriate to that technique. A technique
whose purpose is exploratory (to identify a new hypothesis about Y) cannot be judged by its
efficacy in identifying causal mechanisms, for example. Second, among these goals, estimating
causal effects is the least common – and, by all accounts, the least successful – so any attempt to
gauge the effectiveness of case selection methods should probably focus primarily on exploratory
and diagnostic functions. Third, case selection techniques are best practiced when
taking into account change over time in the key variables, rather than static cross-sectional
analyses – as most of the simulation exercises appear to do. Finally, and most importantly, it is
difficult and perhaps impossible to simulate the complex features involved in an in-depth case
analysis. The question of interest – which case(s) would best serve my purpose if I devoted a
case study to it? – is hard to model without introducing assumptions that pre-judge the results of
the case study and are in this respect endogenous to the case-selection strategy.5
In my opinion, testing the viability of case selection strategies in a rigorous fashion would
involve a methodological experiment of the following sort. First, assemble a panel of researchers
with similar background knowledge of a subject. Second, identify a subject deemed ripe for case
study research, i.e., one that is not well studied, has received no authoritative treatment, and is not
amenable to experimental manipulation. Third, select cases algorithmically, following one of the
protocols laid out in Table 1. Fourth, randomly assign these cases to the researchers with
instructions to pursue all case study goals – exploratory, estimating, and diagnostic. Fifth,
assemble a panel of judges, who are well-versed in the subject of theoretical focus, to evaluate
how well each case study achieved each of these goals. These could be scored on a questionnaire
using ordinal, Likert-style categories. Judges would be instructed to decide independently
(without conferring), though there might be a second round of judgments following a
deliberative process in which they shared their thoughts and their preliminary decisions.
Such an experiment would be time-consuming and costly (assuming participants receive
some remuneration). And it would need to be iterated across several research topics and with
several panels of researchers and judges in order to make strong claims of generalizability.
Nonetheless, it might be worth pursuing given the possible downstream benefits – if, that is,
some strategies can be shown to be superior to others.6
Frameworks for Qualitative Inquiry
Having discussed case selection, we proceed to case analysis, with a focus on the qualitative
components of that inquiry. Here, we stumble upon the most mysterious, and most contested,
aspect of qualitative methods.
Because of its informal nature, qualitative evidence is often regarded with suspicion. It’s
hard to articulate what a convincing inference might consist of, and how to know it when one
sees it. What are the methodological standards of qualitative data analysis (sometimes referred to
as process tracing)?
To remedy this situation, a number of recent studies try to make sense of qualitative data,
imposing order on the seeming chaos. Proposed frameworks include set theory (Mahoney 2012;
Mahoney & Vanderpoel 2015), acyclic graphs (Waldner 2015b), and – most commonly – Bayesian
inference (Beach & Pedersen 2013: 83-99; Bennett 2008, 2015; Crandell et al. 2011; George &
5 For example, Herron & Quinn (2015: 9) make the assumption that the potential outcomes inherent in a case (i.e.,
the unit-level causal relationship) will be discovered by the case study researcher in the course of an intensive
analysis of the case. Yet, “discoverability” is the very thing that case selection techniques are designed to achieve.
That is, a case selection technique is regarded as superior insofar as it offers a higher probability of discovering an
unknown feature of a case.
6 Note, however, that this experiment disregards qualitative judgments by researchers that might be undertaken after
an algorithmic selection of cases. These qualitative judgments might serve as mediators. It could be, for example,
that some case-selection strategies work better when the researcher is allowed to make final judgments – from
among a set of potential cases that meet the stipulated case-selection criteria – based on knowledge of the potential
cases. One must also consider a problem of generalizability that stems from the use of algorithmic procedures for
selecting cases. It could be that subjects for which algorithmic case selection is feasible (i.e., where values for X, Z,
and Y can be measured across a large sample) are different from subjects for which algorithmic case selection is
infeasible. If so, we could not generalize the results of this experiment to the latter genre of case study research.
McKeown 1985; Gill et al. 2005; Humphreys & Jacobs 2014, in process; McKeown 1999;
Rohlfing 2012: 180-99).
These efforts have performed an enormous service to the cause of qualitative inquiry,
fitting qualitative methods into frameworks that are already well established for quantitative inquiry. It should
be no surprise that there are multiple frameworks, just as there are multiple frameworks for
quantitative methodology. Scholars may debate whether, or to what extent, these frameworks are
compatible with each other; this important debate is orthogonal to the present topic. The point
to stress is that qualitative inquiry can be understood within the rubric of general causal
frameworks. There is, in this sense, a single logic of inquiry.
Exploring these complex frameworks in detail would take us far afield; interested readers
may explore the cited literature. However, I do want to highlight one point. Thus far,
applications of set theory, acyclic graphs, and bayesianism to qualitative methods have focused
on making sense of the activity rather than providing a practical guide to research. It remains to
be seen whether these can be developed in such a way as to alter the ways that qualitative
researchers go about their business. Let me illustrate.
Some years ago, Van Evera (1997) proposed a fourfold typology of tests that has since
been widely adopted (e.g., Bennett & Checkel 2015: 17; George & Bennett 2005; Mahoney &
Vanderpoel 2015; Waldner 2015a). A “hoop” test is necessary (but not sufficient) for
demonstrating Hx. A “smoking-gun” test is sufficient (but not necessary) for demonstrating Hx.
A “doubly-decisive” test is necessary and sufficient for demonstrating Hx. A “straw-in-the-wind”
test is neither necessary nor sufficient, constituting weak or circumstantial evidence. These
concepts, diagramed in Table 2, are useful for classifying the nature of evidence according to a
researcher’s judgment. However, the hard question – the judgment itself – is elided. When does a
particular piece of evidence qualify as a hoop, smoking-gun, doubly-decisive, or straw-in-the-
wind test (or something in between)?
Table 2: Qualitative Tests and their Presumed Inferential Role

                              Sufficient to demonstrate Hx?
                              No                     Yes
  Necessary to         No    straw-in-the-wind      smoking-gun
  demonstrate Hx?      Yes   hoop                   doubly-decisive
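Since the typology reduces to two binary judgments (is passing the test necessary for Hx, and is it sufficient?), it can be expressed as a tiny lookup. This sketch is illustrative only; the hard part, as noted above, is making the two judgments in the first place.

```python
def classify_test(passing_necessary: bool, passing_sufficient: bool) -> str:
    """Map Van Evera's two criteria onto his four test types."""
    if passing_necessary and passing_sufficient:
        return "doubly-decisive"
    if passing_necessary:
        return "hoop"
    if passing_sufficient:
        return "smoking-gun"
    return "straw-in-the-wind"

print(classify_test(True, False))  # prints "hoop"
```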
Likewise, Bayesian frameworks are useful for combining evidence from diverse quarters
in a logical fashion with the use of subjective assessments, e.g., the probability that a hypothesis
is true, ex ante, and assessments of the probability that the hypothesis is true if a piece of
evidence (stipulated in advance) is observed. The hard question, again, is the case-specific
judgment. Consider the lengthy debate that has ensued over the reasons for electoral system
choice in Europe (Kreuzer 2010). Humphreys & Jacobs (2014) use this example to sketch out
their application of Bayesian inference to qualitative research. In particular, they explore the “left
threat” hypothesis, which suggests that the presence of a large left-wing party explains the
adoption of proportional representation (PR) in the early twentieth century (Boix 1999). The
authors point out that “for cases with high left threat and a shift to PR, the inferential task is to
determine whether they would have…or would not have…shifted to PR without left threat”
(Humphreys & Jacobs 2014: 28). Bayesian frameworks do nothing to ease this inferential task,
which takes the form of a counterfactual thought-experiment. Similar judgments are required by
other frameworks – set theory, acyclic graphs, and so forth.
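The bookkeeping of Bayesian updating is itself simple; what the framework cannot supply are the case-specific likelihoods. A minimal sketch of sequential updating on two pieces of evidence, with a prior and likelihoods invented for illustration:

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' rule: updated probability of hypothesis H after observing E."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))

# Invented numbers: a sceptical prior (0.3), then two observations whose
# likelihoods under H and not-H must be supplied by case expertise.
p = 0.3
for p_e_h, p_e_not_h in [(0.9, 0.4), (0.8, 0.2)]:
    p = posterior(p, p_e_h, p_e_not_h)
print(f"posterior after both observations: {p:.2f}")  # prints 0.79
```

Every number fed into `posterior` encodes exactly the sort of counterfactual, case-specific judgment discussed above; the arithmetic merely keeps those judgments consistent.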
To get a feel for the level of detail required in qualitative research we may benefit from a
closer look at a particular inquiry. Helpfully, Tasha Fairfield (2013: 55-6; see also 2015) provides
a scrupulous blow-by-blow account of the sleuthing required to reach each case-level inference
in her study of how policymakers avoid political backlash when they attempt to tax economic
elites. One of her three country cases is Chile, which is observed during and after a recent
presidential election. Fairfield explains,
During the 2005 presidential campaign, right candidate Lavín blamed Chile’s
persistent inequality on the left and accused President Lagos of failing to deliver his
promise of growth with equity. Lagos responded by publicly challenging the right to
eliminate 57 bis, a highly regressive tax benefit for wealthy stockholders that he
called “a tremendous support for inequality.” The right accepted the challenge and
voted in favor of eliminating the tax benefit in congress, deviating from its prior
position on this policy and the preferences of its core business constituency.
The following three hypotheses encompass the main components of my argument
regarding why the right voted in favor of the reform:
Hypothesis 1. Lagos’ equity appeal motivated the right to accept the reform, due to
concern over public opinion.
Hypothesis 2. The timing of the equity appeal—during a major electoral
campaign—contributed to its success.
Hypothesis 3. The high issue-salience of inequality contributed to the equity
appeal’s success.
The following four observations, drawn from different sources, provide indirect,
circumstantial support for Hypothesis 1:
Observation 1a (p. 48): The Lagos administration considered eliminating 57 bis in
the 2001 Anti-Evasion reform but judged it politically infeasible given business-
right opposition (interview: Finance Ministry-a, 2005).
Observation 1b: The Lagos administration subsequently tried to reach an agreement
with business to eliminate 57 bis without success (interview, Finance Ministry-b).
Observation 1c: Initiatives to eliminate the exemption were blocked in 1995 and
1998 due to right opposition. (Sources: congressional records, multiple interviews)
Observation 1d: Previous efforts to eliminate 57 bis did not involve concerted
equity appeals. Although Concertación governments had mentioned equity in prior
efforts, technical language predominated, and government statements focused
much more on 57 bis’ failure to stimulate investment rather than its regressive
distributive impact (congressional records, La Segunda, March 27, 1998, El
Mercurio, April 1, 1998, Interview, Ffrench-Davis, Santiago, Chile, Sept. 5, 2005).
Inference: These observations suggest that right votes to eliminate 57 bis would
have been highly unlikely without some new, distinct political dynamic. Lagos’
strong, high-profile equity appeal, in the unusual context of electoral competition
from the right on the issue of inequality, becomes a strong candidate for explaining
the right’s acceptance of the reform.
The appendix continues in this vein for several pages, focused relentlessly on explaining the
behavior of one particular set of actors in one event, i.e., the motivation of the right-wing in
favoring the reform. This event is just one of a multitude of events discussed in connection with
the Chilean case study, to which must be added the equally complex set of events occurring in
Argentina and Bolivia in Fairfield’s three-country study. Clearly, reaching case-level inferences is
a painstaking affair.
One may conclude that if researchers agreed on case-level judgments then general frameworks
could be successful in cumulating these judgments into a higher-level inference, accompanied by
a (very useful!) confidence interval. But if one cannot assume case-level consensus, conclusions
based on qualitative judgments combined through a Bayesian (or other) framework represent
little more than one researcher’s views, which might vary appreciably from another’s. Readers
who are not versed in the intricacies of Chilean politics will have a hard time ascertaining
whether Fairfield’s judgments are correct.
This sort of problem could be overcome with a crowd-based approach. Specifically, one
might survey a panel of experts – chosen randomly or with an aim to represent diverse
perspectives – on each point of judgment. One could then cumulate these judgments into an
overall inference, in which the confidence interval reflects the level of disagreement among
experts (among other things). Unfortunately, not just any crowd will do. The extreme difficulty
of case study research derives in no small part from the expertise that case study researchers
bring to their task. I cannot envision a world in which lay coders, recruited through Amazon
Mechanical Turk or Facebook, would replace that expertise, honed through years of work on a particular
problem and in a particular site (a historical period, country, city, village, organization,…).
To be credible, a crowd-based approach to the problem of judgment would need to
enlist the small community of experts who study a subject and can be expected to make
knowledgeable judgments about highly specific questions such as the “left wing threat.” In the
previous example, it would entail enlisting scholars versed in the politics of early twentieth
century Europe. This procedure is conceivable, but difficult to implement. How would one
identify a random, or otherwise representative, sample? (What is the sampling frame?) How
would one motivate scholars to undertake the task? How would one elicit honest judgments
about the specific questions on a questionnaire, uncorrupted by broader judgments about the
theoretical question at hand (which they would probably be able to infer)?
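The aggregation step of such a crowd-based approach is the easy part; everything difficult lies in recruiting and motivating the experts. A sketch with invented judgments, in which the width of the reported interval reflects disagreement among the panel:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical probability judgments from a small panel of area experts on a
# single case-level question. All values are invented for illustration.
judgments = [0.2, 0.35, 0.15, 0.4, 0.25, 0.3, 0.2]

m = mean(judgments)
se = stdev(judgments) / sqrt(len(judgments))

# A rough 95% interval whose width grows with expert disagreement.
low, high = m - 1.96 * se, m + 1.96 * se
print(f"pooled judgment: {m:.2f} (95% interval: {low:.2f} to {high:.2f})")
```

With only seven judges a t-interval or a hierarchical model would be more defensible than this normal approximation; the point is only that the spread of expert opinion, rather than a single researcher’s view, drives the reported uncertainty.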
Likewise, if one goes to the trouble of constructing a common coding frame (a
questionnaire), an on-line system for recording responses, a system of recruitment, and a
Bayesian (or some other) framework for integrating judgments, the considerable investment in
time and expense of such a venture would probably justify extending the analysis to many cases,
chosen randomly, so that a representative sample can be attained and stochastic threats to
inference minimized. In this fashion, procedures to integrate qualitative data into a quantitative
framework seem likely to morph from case studies into cross-case coding exercises. This is not
to argue against the idea. It is simply to point out that any standardization of procedures tends to
work against the intensive focus on one or several cases which (by our definition) characterizes
case study research.
Rules of Thumb for Qualitative Inquiry
My tentative conclusion, based on the discussion in the previous section and pending further
developments in this fast-moving field, is that it is hard to improve upon the informal “rules of
thumb” that have traditionally governed qualitative analysis. They have their faults, to be sure.
But remedying those faults may not be possible within the constraints of a case study framework.
What, then, are these rules of thumb? A good deal of work has gone into identifying
informal procedures to guide causal inference using qualitative data (Beach & Pedersen 2013;
Bennett & Checkel 2015; Brady & Collier 2004; Collier 2011; George 1979; Hall 2006; Jacobs
2015; Mahoney 2012; Roberts 1996; Schimmelfennig 2015; Waldner 2012, 2015a, 2015b). They
may be summarized as follows…
• Analyze sources according to their relevance (to the question of theoretical interest), proximity
(whether the source is in a position to know what s/he is claiming), authenticity (the source is not
fake or reflecting the influence of someone else), validity (the source is not biased), and diversity
(collectively, sources represent a diversity of viewpoints on the question at hand).
• When identifying a new causal factor or theory, look for one (a) that is potentially generalizable
to a larger population, (b) that is neglected in the extant literature on your subject, (c) that greatly
enhances the probability of an outcome (if binary) or explains a lot of variation on that outcome
(if interval-level), and (d) that is exogenous (not explained by other factors).
• Canvass widely for rival explanations, which also serve as potential confounders. Treat them
seriously (not as “straw men”), dismissing them only when warranted. Utilize this logic of
elimination, where possible, to enhance the strength of the favored hypothesis.
• For each explanation, construct as many testable hypotheses as possible, paying close attention to
within-case opportunities – e.g., mechanisms and alternative outcomes.
• Enlist counterfactual thought-experiments in an explicit fashion, making clear which features of
the world are being altered, and which are assumed to remain the same, in order to test the
viability of a theory. Also, focus on periods when background features are stable (so they don’t
serve as confounders) and minimize changes to the world (the minimal-rewrite rule) so that the
alternate scenario is tractable.
• Utilize chronologies and diagrams to clarify temporal and causal interrelationships among
complex causal factors. Include as many features as possible so that the time-line is continuous.
Researchers should bear in mind that these diverse rules of thumb are intended to shed
light on causal inference for an individual case or a small set of cases. Inferences for that case(s)
may – or may not – be generalizable to a larger population.
Multimethod Research
The final section of this study is devoted to multimethod research, where both qual and quant
styles of evidence are brought to bear on the same general research question (Brewer & Hunter
2006; Goertz 2015; Harrits 2011; Lieberman 2005; Seawright 2015a). While multimethod
research is increasingly common, there are serious questions about its effectiveness (Lohmann
2007). Doing more than one thing might mean doing multiple things poorly, by dint of limited
time or expertise. Nor is it clear whether qualitative and quantitative analysis can speak to one
another productively (Ahmed & Sil 2012).
In discussing this question it is important not to confuse disagreement with
incommensurability. If qual and quant tests of a proposition are truly independent there is always
the possibility that they will elicit different, perhaps even directly contradictory, answers. For
example, the most common style of multimethod analysis combines a quantitative analysis of
many units with an in-depth, qualitative (or at least partially qualitative) analysis of a single case
or a small set of cases, which Lieberman (2005) refers to as a nested analysis. Occasionally, these
two analyses reach different conclusions about a causal relationship (though, one suspects,
authors do not always bring these disagreements to the fore). However, the same disagreements
also arise from rival quantitative analyses (e.g., conducted with different samples or
specifications) and rival qualitative analyses (e.g., focused on different research sites or generated
by different researchers). Disagreement about whether X causes Y, or about the mechanisms at
work, does not entail that multimethod research is unavailing. Sometimes, triangulation does not
confirm one’s hypothesis. It is still useful information; and for those worried about confirmation
bias, it is critical.
In any case, Seawright (2015a) points out that when qualitative and quantitative evidence
is combined these analyses are usually oriented toward somewhat different goals. Typically, a
large-n cross-case analysis is focused on measuring a causal effect while a small-n within-case
analysis is focused on identifying a causal mechanism. As such, the two styles of evidence cannot
directly conflict since their objectives are different. They nonetheless inform each other in a
complementary fashion.
One way to assess the value of multimethod research is to determine how influential it
might be in readers’ judgments about causal inference. A recent study by Gerring & Seawright
(2016) subjects this question to an experimental test based on a sample of respondents drawn
from recent summer schools sponsored by the Institute for Qualitative and Multimethod
Research (at Syracuse University) and the European Consortium for Political Research (at the
University of Ljubljana, Slovenia). Three treatments are devised from multimethod articles
published in top political science journals. The Control condition consists of the entire article, as
published; the Qualitative condition consists of the same article minus the quantitative analysis;
the Quantitative condition consists of the article minus the qualitative analysis. After reading the
assigned sections of the article, subjects are asked whether they believe that the hypothesis is true
(a) with respect to the sample under study (internal validity) and (b) with respect to a broader
population (external validity). The authors thus utilize multimethod research to interrogate the
persuasiveness of each of its components.
The findings of this study are somewhat disturbing for those who prize multimethod
research. In particular, there seems to be no advantage in packaging qualitative and quantitative
evidence together in a multimethod approach to causal inference. Indeed, respondents found
multimethod research less generalizable than single-method quantitative research. That is, adding
a qualitative component to a quantitative analysis appears to weaken a study’s claim to external
validity even when all else about that study remains the same. These results are all the more
surprising given that the chosen sample of respondents is presumably disposed to favor a
multimethod approach to social science.
One may surmise that some of the obstacles to successful multimethod research are
practical in nature, and may have affected the results of this study. Because of limited space, it is
not easy to incorporate multiple research designs in the same article-length study. Perhaps the
respondents in the foregoing study were unable to process the complexities of a multimethod
study because qualitative data, in particular, is hard to present in a highly concise form. Likewise,
because multiple methods require different skill-sets it is not always possible for the same
researcher to undertake qualitative and quantitative approaches to a topic. Perhaps the chosen
multimethod studies were not convincing because the writers were not sufficiently conversant in
both methods.
This leaves open another way of viewing multimethod research. If the two modes of
analysis are difficult to combine in the same study by the same author perhaps they may be
profitably united within a larger “research cycle” (Lieberman 2016) that includes a diversity of
methods and authors. This allows scholars with a qual or quant bent to do what they do best,
concentrating their efforts on their particular skill-set and on one particular context that they can
become intimately acquainted with. The research cycle also mitigates a presentational problem –
stuffing results from myriad analyses into a 10,000-word article.
Unfortunately, the research cycle approach to multimethod research also encounters
obstacles. In particular, one must wonder whether cumulation can occur successfully across
diverse studies utilizing diverse research methods. Note that political science work is not highly
standardized, even when focused on the same research question and when utilizing the same
quantitative method. This inhibits the integration of findings, and helps to account for the
scarcity of meta-analyses in political science. Qualitative studies are even less likely to be
standardized in a way that allows for their integration into an ongoing research trajectory. Inputs
and outputs may be defined and operationalized in disparate ways, or perhaps not clearly
operationalized at all. And because samples are not randomly chosen, any aggregation of studies
cannot purport to represent a larger population in an unbiased fashion.
There is yet another angle on this topic that offers what is perhaps a more optimistic –
not to mention realistic – reading of the multimethod ideal. Rather than conceptualizing
qualitative and quantitative research as separate research designs we might regard them as
integral components of the same design.
Note that there are few purely qualitative studies. Although the main burden of inference
may be carried by qualitative data, this is often supplemented by a large-n cross-case analysis or a
large-n within-case analysis (where observations are drawn from a lower level of analysis).
Likewise, there are few purely quantitative analyses, since quantitative work is usually (always?)
accompanied by qualitative observations of one sort or another. At a minimum, qualitative data
is trotted out by way of illustration. At a maximum, qualitative data is essential to causal inference.
In this vein, a number of recent studies highlight the vital role played by qualitative data,
even when the research design is experimental or quasi-experimental. Although we tend to think
of these designs as being quantitative – since they generally incorporate a large number of
comparable units – they may also contain important qualitative components.
There is, to begin with, the problem of research design. Without an ethnographic
understanding of the research site and the individuals who are likely to serve as subjects it is
impossible to design an experiment that adequately tests a hypothesis of interest. It is impossible
to define a confounder “in the abstract.” In-depth case-based understanding is especially
important in the context of field experimentation, where one assumes that context matters a
good deal to how subjects are likely to react to a given treatment.
Second, one must assess potential threats to inference. Where the assignment is randomized,
ex ante comparability is assured. But ex post comparability remains a serious threat to inference.
For example, experiments often face problems of compliance, so it is incumbent on the
researcher to ascertain whether subjects adhered to the prescribed protocol and, if not, which
subjects violated the protocol. Where significant numbers of subjects attrit (withdraw from
participation) there is an important question about what motivated their withdrawal and what
sort of subjects were inclined to withdraw. In field experiments, where a significant time lag
often separates the treatment and the outcome of theoretical interest, one must try to determine
whether subjects under study may have communicated with one another, introducing potential
problems of interference and/or contamination (interference across treatment and control
groups).
Third, there is a question of causal mechanisms. Assuming a treatment effect can be
measured without bias, what is it that accounts for the connection between X and Y?
Finally, there are questions of generalizability. In order to determine the external validity of
an experiment one must have a good sense of the research site and the subjects who have been
studied. Specifically, one must be able to assess the extent to which these individuals, and this
particular treatment effect, can be mapped across other – potentially quite different – settings.
These issues – of research design, inferential threats, causal mechanisms, and
generalizability – are often assessable with qualitative data. Indeed, they may only be assessable
by means of a rich, contextual knowledge of a research project as it unfolds on a particular site.
Paluck (2010) argues, further, that experimental designs may be combined with
qualitative measurement to access outcomes that would not be apprehended with traditional
quantitative measures. As an example, she explores Chattopadhyay & Duflo’s (2004) study of
women leaders in India. While praising this landmark study, Paluck (2010: 61) points out,
participant observation of women leaders outside of the council settings—such as
in their homes, where they visit with other women—could have revealed whether
they were influenced by women constituents in these more informal settings.
Intensive interviews could compare social processes in villages with female or male
council leaders to reveal how beliefs about women leaders’ efficacy shift. For
example, did other council members, elders, or religious leaders make public
statements about female leaders or the reservation system? Was there a tipping
point at which common sentiment in villages with female leaders diverged from
villages with male leaders? Such qualitatively generated insights could have enabled
this study to contribute more to general theories of identity, leadership, and political
and social change. Moreover, ethnographic work could compare understandings of
authority and political legitimacy in villages with female- and male-led councils. Do
the first female leaders inspire novel understandings of female authority and
legitimacy, or are traditional gender narratives invoked just as frequently to explain
women’s new power and position?
Paluck concludes that experiments provide an opportunity for qualitative analysis, one that is
grossly under-utilized due to scholars’ parochial attitudes. Qualitative scholars who
wish to understand the impact of modernization would be well-advised to construct a field
experiment in which an agent of modernization – e.g., a bridge, road, harbor, radio tower – is
randomized across sites, allowing for an opportunity to systematically compare treatment and
control groups over time, using all the ethnographic tools at their disposal.
Where the treatment is not randomly assigned (i.e., in observational research) there are
additional issues pertaining to potential assignment (or selection) bias. Here, qualitative data
often comes into play (Dunning 2012). For example, Jeremy Ferwerda & Nicholas Miller (2014)
argue that devolution of power reduces resistance to foreign rule. To make this case, they focus on France
during World War Two, when the northern part of the country was ruled directly by German
forces and the southern part was ruled indirectly by the “Vichy” regime headed by Marshal
Pétain. The key methodological assumption of their regression discontinuity design is that the
line of demarcation was assigned in an as-if random fashion. For the authors, and for their critics
(Kocher & Monteiro 2015), this assumption requires in-depth qualitative research – research that
promises to uphold, or call into question, the authors’ entire analysis.
As a second example, we may consider Romer & Romer’s (2010) analysis of the impact
of tax changes on economic activity. Because tax changes are non-random, and likely to be
correlated with the outcome of interest, anyone interested in this question must be concerned
with bias arising from the assignment of the treatment. To deal with this threat, Romer & Romer
make use of the narrative record provided by presidential speeches and congressional reports to
elucidate the motivation of tax policy changes in the postwar era. This allows them to distinguish
policy changes that might have been motivated by economic performance from those that may
be considered as-if random. By focusing solely on the latter, they claim to provide an unbiased
test of the theory that tax increases are contractionary.
Ahmed, Amel, Rudra Sil. 2012. “When Multi-Method Research Subverts Methodological
Pluralism - Or, Why We Still Need Single-Method Research.” Perspectives on
Politics 10:4 (December) 935-953.
Alesina, Alberto, Edward Glaeser, Bruce Sacerdote. 2001. “Why Doesn’t the US Have a
European-Style Welfare State?” Brookings Papers on Economic Activity 2, 187-277.
Beach, Derek, Rasmus Brun Pedersen. 2013. Process-Tracing Methods: Foundations and Guidelines.
Ann Arbor, MI: University of Michigan Press.
Beck, Nathaniel. 2006. “Is Causal-Process Observation An Oxymoron?” Political Analysis 14(3).
Beck, Nathaniel. 2010. “Causal Process ‘Observations’: Oxymoron or (Fine) Old Wine.” Political
Analysis 18(4): 499–505.
Bennett, Andrew, Colin Elman. 2006. “Qualitative Research: Recent Developments in Case
Study Methods.” Annual Review of Political Science 9:455–76.
Bennett, Andrew. 2008. “Process Tracing: A Bayesian Approach.” In Janet Box-Steffensmeier,
Henry Brady, & David Collier (eds), Oxford Handbook of Political Methodology (Oxford: Oxford
University Press) 702-21.
Bennett, Andrew. 2015. “Disciplining our Conjectures: Systematizing Process Tracing with
Bayesian Analysis.” In Andrew Bennett & Jeffrey T. Checkel (eds), Process Tracing: From
Metaphor to Analytic Tool (Cambridge: Cambridge University Press) 276-98.
Bennett, Andrew, Jeffrey T. Checkel (eds). 2015. Process Tracing: From Metaphor to Analytic Tool.
Cambridge: Cambridge University Press.
Blatter, Joachim, Markus Haverland. 2012. Designing Case Studies: Explanatory Approaches in Small-n
Research. Palgrave Macmillan.
Boix, Carles. 1999. “Setting the Rules of the Game: The Choice of Electoral Systems in
Advanced Democracies.” American Political Science Review 93:3, 609-624.
Brady, Henry E. 2010. “Data-Set Observations versus Causal-Process Observations: The 2000
U.S. Presidential Election.” In Henry E. Brady & David Collier (eds), Rethinking Social Inquiry:
Diverse Tools, Shared Standards. 2nd ed. (Lanham, MD: Rowan & Littlefield) 237–42.
Brady, Henry E., David Collier (eds). 2004. Rethinking Social Inquiry: Diverse Tools, Shared Standards.
Lanham: Rowman & Littlefield.
Brady, Henry E., David Collier (eds). 2010. Rethinking Social Inquiry: Diverse Tools, Shared Standards.
2nd ed. Lanham, MD: Rowan & Littlefield.
Brewer, John, Albert Hunter. 2006. Foundations of Multimethod Research: Synthesizing Styles.
Thousand Oaks, CA: Sage.
Caporaso, James. 2009. “Is There a Quantitative-Qualitative Divide in Comparative Politics?” In
Todd Landman and Neil Robinson (eds), Sage Handbook of Comparative Politics (Thousand Oaks, CA: Sage).
Chattopadhyay, Raghabendra, Esther Duflo. 2004. “Women as policy makers: Evidence from a
randomized policy experiment in India.” Econometrica 72 (5): 1409-43.
Collier, David, Colin Elman. 2008. “Qualitative and Multimethod Research: Organizations,
Publications, and Reflections on Integration.” In Janet M. Box-Steffensmeier, Henry Brady,
and David Collier (eds), The Oxford Handbook of Political Methodology (Oxford: Oxford University Press).
Collier, David. 2011. “Understanding Process Tracing.” PS: Political Science and Politics 44(4).
Crandell, Jamie L., Corrine I. Voils, YunKyung Chang, Margarete Sandelowski. 2011. “Bayesian
data augmentation methods for the synthesis of qualitative and quantitative research
findings.” Quality & Quantity 45, 653–669.
Crasnow, Sharon. 2012. “The Role of Case Study Research in Political Science: Evidence for
Causal Claims.” Philosophy of Science 79:5 (December) 655-66.
Dunning, Thad. 2012. Natural Experiments in the Social Sciences: A Design-Based Approach.
Cambridge: Cambridge University Press.
Eckstein, Harry. 1975. “Case Studies and Theory in Political Science.” In Fred I. Greenstein and
Nelson W. Polsby (eds), Handbook of Political Science, vol. 7. Political Science: Scope and Theory
(Reading, MA: Addison-Wesley).
Elman, Colin, Diana Kapiszewski, Lorena Vinuela. 2010. “Qualitative Data Archiving: Rewards
and Challenges.” PS: Political Science & Politics 43, 23-27.
Elman, Colin, Diana Kapiszewski. 2014. “Data Access and Research Transparency in the
Qualitative Tradition.” PS: Political Science & Politics 47:1, 43-47.
Elman, Colin. 2005. “Explanatory Typologies in Qualitative Studies of International Politics.”
International Organization 59:2 (April) 293-326.
Epstein, Leon D. 1964. “A Comparative Study of Canadian Parties.” American Political Science
Review 58 (March) 46-59.
Fairfield, Tasha. 2013. “Going Where the Money Is: Strategies for Taxing Economic Elites in
Unequal Democracies.” World Development 47, 42–57.
Fairfield, Tasha. 2015. Private Wealth and Public Revenue in Latin America: Business Power and Tax
Politics. Cambridge: Cambridge University Press.
Fenno, Richard F., Jr. 1977. “U.S. House Members in Their Constituencies: An Exploration.”
American Political Science Review 71:3 (September) 883-917.
Fenno, Richard F., Jr. 1978. Home Style: House Members in their Districts. Boston: Little, Brown.
Ferwerda, Jeremy, Nicholas Miller. 2014. “Political Devolution and Resistance to Foreign Rule:
A Natural Experiment.” American Political Science Review 108:3 (August) 642-60.
Friedman, Milton, Anna Jacobson Schwartz. 1963. A Monetary History of the United States, 1867-
1960. Princeton: Princeton University Press.
Garfinkel, Harold. 1967. Studies in Ethnomethodology. Englewood Cliffs: Prentice-Hall.
George, Alexander L. 1979. “Case Studies and Theory Development: The Method of Structured,
Focused Comparison.” In Paul Gordon Lauren (ed), Diplomacy: New Approaches in History,
Theory, and Policy (New York: The Free Press).
George, Alexander L., Timothy J. McKeown. 1985. “Case Studies and Theories of
Organizational Decision-making.” In Robert F. Coulam & Richard A. Smith (eds), Advances in
Information Processing in Organizations (Greenwich, Conn.: JAI Press) 21–58.
George, Alexander L., Andrew Bennett. 2005. Case Studies and Theory Development. Cambridge, MA: MIT Press.
Gerring, John, Jason Seawright. 2016. “The Inference in Causal Inference: A Psychology for
Social Science Methodology.” Unpublished manuscript, Department of Political Science, Boston University.
Gerring, John, Lee Cojocaru. 2016. “Case-Selection: A Diversity of Methods and Criteria.”
Unpublished manuscript, Boston University, Department of Political Science.
Gerring, John, Rose McDermott. 2007. “An Experimental Template for Case-Study Research.”
American Journal of Political Science 51:3 (July) 688-701.
Gerring, John. 2007. Case Study Research: Principles and Practices. Cambridge: Cambridge University Press.
Gerring, John. 2012. “Mere Description.” British Journal of Political Science 42:4 (October) 721-46.
Gerring, John. 2016. Case Study Research: Principles and Practices, 2d ed. Cambridge: Cambridge University Press.
Gill, Christopher J., Lora Sabin, Christopher H. Schmid. 2005. “Why Clinicians are Natural
Bayesians.” BMJ 330:1080-3 (May 7).
Glassner, Barry, Jonathan D. Moreno (eds). 1989. The Qualitative-Quantitative Distinction in the Social
Sciences (Boston Studies in the Philosophy of Science, 112).
Goertz, Gary, James Mahoney. 2012. A Tale of Two Cultures: Qualitative and Quantitative Research in
the Social Sciences. Princeton: Princeton University Press.
Goertz, Gary. 2015. Multimethod Research, Causal Mechanisms, and Selecting Cases. Unpublished
manuscript, Department of Political Science, University of Notre Dame.
Grimmer, Justin, Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of
Automatic Content Analysis Methods for Political Texts.” Political Analysis 21:3, 267-97.
Hall, Peter A. 2003. “Aligning Ontology and Methodology in Comparative Politics.” In James
Mahoney and Dietrich Rueschemeyer (eds), Comparative Historical Analysis in the Social Sciences
(Cambridge: Cambridge University Press).
Hall, Peter A. 2006. “Systematic process analysis: when and how to use it.” European Management
Review 3, 24–31.
Hammersley, Martyn. 1992. “Deconstructing the Qualitative-Quantitative Divide.” In Julie
Brannen (ed), Mixing Methods: Qualitative and Quantitative Research (Aldershot: Avebury).
Harrits, Gitte Sommer. 2011. “More Than Method? A Discussion of Paradigm Differences
within Mixed Methods Research.” Journal of Mixed Methods Research 5(2): 150–66.
Herron, Michael C., Kevin M. Quinn. 2015. “A Careful Look at Modern Case Selection
Methods.” Sociological Methods & Research (forthcoming).
Humphreys, Macartan, Alan M. Jacobs. [in process] Integrated Inferences: A Bayesian Integration of
Qualitative and Quantitative Approaches to Causal Inference. Cambridge: Cambridge University Press.
Humphreys, Macartan, Alan M. Jacobs. 2014. “Mixing Methods: A Bayesian Approach, v.3.”
American Political Science Review (forthcoming).
Jacobs, Alan. 2015. “Process Tracing the Effects of Ideas.” In Andrew Bennett, Jeffrey T.
Checkel (eds), Process Tracing: From Metaphor to Analytic Tool (Cambridge: Cambridge University Press).
Kapiszewski, Diana, Lauren M. MacLean, Benjamin L. Read. 2015. Field Research in Political
Science: Practices and Principles. Cambridge: Cambridge University Press.
Karl, Terry Lynn. 1997. The Paradox of Plenty: Oil Booms and Petro-States. Berkeley: University of California Press.
King, Gary, Robert O. Keohane, Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in
Qualitative Research. Princeton: Princeton University Press.
Kocher, Matthew, Nuno Monteiro. 2015. “What’s in a Line? Natural Experiments and the Line
of Demarcation in WWII Occupied France.” Unpublished manuscript, Department of
Political Science, Yale University.
Kreuzer, Markus. 2010. “Historical Knowledge and Quantitative Analysis: The Case of the
Origins of Proportional Representation.” American Political Science Review 104:369–92.
Levy, Jack S. 2007. “Qualitative Methods and Cross-Method Dialogue in Political Science.”
Comparative Political Studies 40(2): 196–214.
Levy, Jack S. 2008. “Case Studies: Types, Designs, and Logics of Inference.” Conflict Management
and Peace Science 25:1–18.
Lieberman, Evan S. 2005. “Nested Analysis as a Mixed-Method Strategy for Comparative
Research.” American Political Science Review 99:3 (August) 435-52.
Lieberman, Evan S. 2010. “Bridging the Qualitative-Quantitative Divide: Best Practices in the
Development of Historically Oriented Replication Databases.” Annual Review of Political Science
Lieberman, Evan S. 2016. “Improving Causal Inference through Non-Causal Research: Can the
Bio-Medical Research Cycle Provide a Model for Political Science?” Unpublished manuscript,
Department of Political Science, MIT.
Lijphart, Arend. 1971. “Comparative Politics and the Comparative Method.” American Political
Science Review 65, 682-93.
Lohmann, Susanne. 2007. “The Trouble with Multi-Methodism.” Newsletter of the APSA Organized
Section on Qualitative Methods 5(1): 13–17.
Lucas, Samuel R., Alisa Szatrowski. 2014. “Qualitative Comparative Analysis in Critical
Perspective.” Sociological Methodology 44:1, 1–79.
Lynd, Robert Staughton, Helen Merrell Lynd. 1929/1956. Middletown: A Study in American Culture.
New York: Harcourt, Brace.
Mahoney, James, Gary Goertz. 2006. “A Tale of Two Cultures: Contrasting Quantitative and
Qualitative Research.” Political Analysis 14:3 (Summer) 227-49.
Mahoney, James, Kathleen Thelen (eds). 2015. Advances in Comparative-Historical Analysis.
Cambridge: Cambridge University Press.
Mahoney, James, Rachel Sweet Vanderpoel. 2015. “Set Diagrams and Qualitative Research.”
Comparative Political Studies 48:1 (January) 65-100.
Mahoney, James. 2010. “After KKV: The New Methodology of Qualitative Research.” World
Politics 62 (1): 120–47.
Mahoney, James. 2012. “The Logic of Process Tracing Tests in the Social Sciences.” Sociological
Methods & Research 41:4 (November) 566-590.
Mansfield, Edward D., Jack Snyder. 2005. Electing to Fight: Why Emerging Democracies Go to War.
Cambridge: MIT Press.
McKeown, Timothy J. 1999. “Case Studies and the Statistical World View.” International
Organization 53 (Winter) 161-190.
McLaughlin, Eithne. 1991. “Oppositional Poverty: The Quantitative/Qualitative Divide and
Other Dichotomies.” The Sociological Review 39 (May): 292-308.
Mill, John Stuart. 1843/1872. The System of Logic, 8th ed. London: Longmans, Green.
Moore, Barrington, Jr. 1966. Social Origins of Dictatorship and Democracy: Lord and Peasant in the
Making of the Modern World. Boston: Beacon Press.
Morgan, Mary. 2012. “Case Studies: One Observation or Many? Justification or Discovery?”
Philosophy of Science 79:5 (December) 655-66.
Paluck, Elizabeth Levy. 2010. “The Promising Integration of Qualitative Methods and Field
Experiments.” The ANNALS of the American Academy of Political and Social Science 628, 59-71.
Patton, Michael Quinn. 2002. Qualitative Research & Evaluation Methods. Thousand Oaks, CA: Sage.
Pincus, Steve. 2011. 1688: The First Modern Revolution. New Haven: Yale University Press.
Platt, Jennifer. 1992. “‘Case Study’ in American Methodological Thought.” Current Sociology 40:1.
Posner, Daniel. 2004. “The Political Salience of Cultural Difference: Why Chewas and
Tumbukas are Allies in Zambia and Adversaries in Malawi.” American Political Science Review
98:4 (November) 529-46.
Ray, James Lee. 1993. “Wars between Democracies: Rare or Nonexistent?” International Interactions.
Reiss, Julian. 2009. “Causation in the Social Sciences: Evidence, Inference, and Purpose.”
Philosophy of the Social Sciences 39(1): 20–40.
Rihoux, Benoit. 2013. “Qualitative Comparative Analysis (QCA), Anno 2013: Reframing The
Comparative Method’s Seminal Statements.” Swiss Political Science Review 19:2, 233–45.
Roberts, Clayton. 1996. The Logic of Historical Explanation. University Park: Pennsylvania State University Press.
Rohlfing, Ingo. 2012. Case Studies and Causal Inference: An Integrative Framework. Palgrave Macmillan.
Romer, Christina D., David H. Romer. 2010. “The Macroeconomic Effects of Tax Changes:
Estimates Based on a New Measure of Fiscal Shocks.” American Economic Review 100 (June).
Schimmelfennig, Frank. 2015. “Efficient Process Tracing: Analyzing the Causal Mechanisms of
European Integration.” In Andrew Bennett, Jeffrey T. Checkel (eds), Process Tracing: From
Metaphor to Analytic Tool (Cambridge: Cambridge University Press) 98-125.
Schwartz, Howard, Jerry Jacobs. 1979. Qualitative Sociology: A Method to the Madness. New York: Free Press.
Seawright, Jason, John Gerring. 2008. “Case-Selection Techniques in Case Study Research: A
Menu of Qualitative and Quantitative Options.” Political Research Quarterly 61:2 (June) 294-308.
Seawright, Jason. 2015a. “The Case for Selecting Cases that are Deviant or Extreme on the
Independent Variable.” Sociological Methods & Research (forthcoming).
Seawright, Jason. 2015b. Multi-Method Social Science: Combining Qualitative and Quantitative Tools.
Cambridge: Cambridge University Press, forthcoming.
Shapiro, Ian, Rogers Smith, Tarek Masoud (eds). 2004. Problems and Methods in the Study of Politics.
Cambridge: Cambridge University Press.
Shweder, Richard A. 1996. “Quanta and Qualia: What is the ‘Object’ of Ethnographic Method?”
In Richard Jessor, Anne Colby, and Richard A. Shweder (eds), Ethnography and Human
Development: Context and Meaning in Social Inquiry (Chicago: University of Chicago Press).
Sil, Rudra. 2000. “The Division of Labor in Social Science Research: Unified Methodology or
‘Organic Solidarity’?” Polity 32:4 (Summer) 499-531.
Skocpol, Theda, Margaret Somers. 1980. “The Uses of Comparative History in Macrosocial
Inquiry.” Comparative Studies in Society and History 22:2 (April) 147-97.
Skocpol, Theda. 1979. States and Social Revolutions: A Comparative Analysis of France, Russia, and
China. Cambridge: Cambridge University Press.
Snow, C.P. 1959/1993. The Two Cultures. Cambridge: Cambridge University Press.
Strauss, Anselm, Juliet Corbin. 1998. Basics of Qualitative Research: Techniques and Procedures for
Developing Grounded Theory. Thousand Oaks: Sage.
Van Evera, Stephen. 1997. Guide to Methods for Students of Political Science. Ithaca: Cornell University Press.
Waldner, David. 2012. “Process Tracing and Causal Mechanisms.” In Harold Kincaid (ed),
Oxford Handbook of Philosophy of Social Science (Oxford: Oxford University Press) 65-84.
Waldner, David. 2015a. “Process Tracing and Qualitative Causal Inference.” Security Studies 24:2.
Waldner, David. 2015b. “What Makes Process Tracing Good? Causal Mechanisms, Causal
Inference, and the Completeness Standard in Comparative Politics.” In Andrew Bennett,
Jeffrey T. Checkel (eds), Process Tracing: From Metaphor to Analytic Tool (Cambridge: Cambridge
University Press) 126-52.
Walter, Barbara. 2002. Committing to Peace: The Successful Settlement of Civil Wars. Princeton:
Princeton University Press.
Yanow, Dvora, Peregrine Schwartz-Shea (eds). 2013. Interpretation and Method: Empirical Research
Methods and the Interpretive Turn, 2d ed. Armonk, NY: M E Sharpe.
Qual and Quant
Converting Words to Numbers
A Selection of Recent Topics
Table 1: Case-Selection Strategies
Frameworks for Qualitative Inquiry
Table 2: Qualitative Tests and their Presumed Inferential Role
Rules of Thumb for Qualitative Inquiry