Backward Induction in the Wild? Evidence from Sequential ......in which forward-looking senators...

Backward Induction in the Wild?Evidence from Sequential Voting in the

U.S. Senate∗

Jorg L. Spenkuch B. Pablo Montagnes Daniel B. Magleby

Northwestern University Emory University Binghamton University

March 2017

Abstract

In the U.S. Senate, roll calls are held in alphabetical order. We document that sen-

ators early in the order are less likely to vote with the majority of their own party

than those whose last name places them at the end of the alphabet. To speak to

the mechanism behind this result, we develop a simple model of sequential voting,

in which forward-looking senators rely on backward induction in order to free ride

on their colleagues. Estimating our model structurally, we find that this form of

strategic behavior is an important part of equilibrium play. At the same time, there

appears to be a great amount of heterogeneity in senators’ use of backward rea-

soning. We also consider, but ultimately dismiss, alternative explanations related

to learning about common values and vote buying.

∗This paper merges and supersedes “Backward Induction in the Wild: Evidence from the U.S. Senate”(Spenkuch 2014) and “What’s in a Name? Strategic Voting and Moral Hazard in the U.S. Senate” (Magleby,Montagnes and Spenkuch 2015). We are grateful to Sandeep Baliga, Alessandra Casella, Tim Feddersen,Moshe Hoffman, Navin Kartik, Scott Kominers, Steven Levitt, John List, Carlo Prato, Yuval Salant, RichardVan Weelden, and Oscar Volij for many helpful conversations and suggestions. We have also benefittedfrom audience comments at the University of Chicago, Columbia, Northwestern, Michigan State, as well asnumerous conferences. Keith Poole, Jim Snyder, and Tim Groseclose generously shared their data on roll-callvotes and supermajority requirements. Enrico Berkes, Yuxuan Chen, Moonish Maredia, Daniel Wu, and JunYang provided excellent research assistance. This research was supported in part through the computationalresources provided by the Quest high-performance computing facility at Northwestern University. All viewsexpressed in this paper as well as any remaining errors are solely our responsibility. Correspondence shouldbe addressed to the first author at MEDS Department, Kellogg School of Management, 2211 Campus Dr,Evanston, IL 60208, or by e-mail: j-spenkuch@kellogg.northwestern.edu.

1. Introduction

Over the last half-century, the concepts and techniques of noncooperative game theory have

become central to economics and the social sciences more generally (Kreps 1990). Yet, game-

theoretic analyses are often criticized for lack of empirical content (e.g., Green and Shapiro

1994; Kadane and Larkey 1983; Simon 1955). If game theory is to be a positive theory

of actual human behavior–as opposed to a normative one of how people should behave–

then understanding how closely the theory’s assumptions and predictions are borne out in

real-world conduct is a matter of first-order importance.

In this paper, we examine the behavior of U.S. Senators during legislative roll-call votes. On

theoretical grounds, “strategic voting is an ineradicable possibility in all voting systems” and

should “almost always [be] present in legislatures” (Riker 1982, p. 167). Empirical evidence of

strategic behavior by legislators, however, is extremely scarce. Groseclose and Milyo (2010),

for instance, “only find ten roll call votes–in the entire history of Congress–on which a

researcher has claimed that sophisticated voting occurred” (p. 65).

Our analysis exploits the fact that the Senate holds roll calls in alphabetical order. This

does not only allow us to learn about senators’ calculus of voting, but it also affords us a

unique opportunity to study the game-theoretical concept of backward induction outside the

laboratory.

In the first part of the paper, we document two empirical regularities. First, the majority

party in the Senate is much more likely to narrowly win a roll-call vote than to narrowly

lose it. Second, senators who, by virtue of their last name, get to vote earlier in the order

are more likely to deviate from the party line than those who get to vote late.

Figure 1 illustrates this pattern in the raw data. Importantly, we only find vote-order effects

on roll calls that end up being close. We further demonstrate that there are no comparable

order effects in the modern House of Representatives, where roll calls are not held according

to the alphabet.

To account for senators’ inherent tendencies to deviate from the party line, our preferred

estimates control for senator fixed effects. We, therefore, do not assume that preferences are

as good as randomly assigned. Identification in our setup comes from two sources of plausibly

exogenous variation: (i) changes in the alphabetical composition of the Senate over time,

and (ii) within-Congress variation in the set of senators who participate in a given roll call.

Focusing on either source produces qualitatively equivalent results.

In the second part of the paper, we develop a simple game-theoretic model of “parties

as teams,” which, by appealing to backward induction, has the potential to explain the

previously unknown set of facts. In our model, senators have both instrumental preferences

over policy outcomes and expressive preferences over their recorded votes. We assume that

senators from the same party have similar instrumental interests, i.e., they prefer the same

set of bills to pass. Yet, at least some of them have conflicting expressive preferences. That

is, conditional on the outcome of the vote, some senators prefer not to be on the record

supporting the bill–even if they would rather see it pass than fail. Such situations may,

for instance, arise if constituent interests conflict with the official party position, or when

senators’ personal preferences are misaligned with those of their party.

In equilibrium, a conflicted senator’s behavior depends on how likely her vote is to be

pivotal. At first blush it may appear that conflicted agents would benefit from voting after

the choices of all others have been revealed. The model, however, shows that when preferences

are common knowledge, the exact opposite is often the case. If a measure is known to have

more supporters than needed, then early voters are able to reject it without affecting the

ultimate outcome of the roll call. This is because voters later in the order will, if necessary,

adhere to the party line in order to carry the bill to passage. As a result, forward-looking

senators can free ride on their colleagues, which gives rise to a negative relationship between

alphabetical rank and defection.

Example. For a simple example of this logic, consider the extensive-form game depicted

in Figure 2. Party D still requires two “yea” votes for the bill to pass. All of its three

remaining members, however, are conflicted. That is, they prefer to say “nay,” but only if

the measure ends up being approved anyway. If the senator who gets to vote first (i.e., D1) is

forward-looking, she realizes that her fellow party members (i.e., D2 and D3) would rather

abandon their own positions than be responsible for letting the bill fail. She, therefore, votes

“nay,” while her colleagues are forced to vote “yea.”

At its core, our theory builds on the game-theoretic idea that agents backward induct

in order to anticipate the actions of others (Kuhn 1953; Selten 1965; von Neumann and

Morgenstern 1944). To learn more about backward reasoning in this important real-world

setting, in the third part of the paper we structurally estimate our model. Our findings are

two-fold: First, the data clearly favor the null hypothesis of backward induction over the

alternative of myopic play. Second, there is a great amount of heterogeneity in senators’

reliance on backward reasoning. Taking our results at face value, about 32% of senators

appear to explicitly account for the expected behavior of those who have not yet voted,

while the behavior of another 16% is best described by a heuristic that crudely resembles

the backward induction strategy. The remaining senators behave myopically. Interestingly,

ancillary results suggest that senators become more likely to reason backwards after having

participated in a few hundred roll calls.

In the fourth part of the paper, we explicitly discuss two alternative mechanisms that

might also give rise to negative vote-order effects. The first alternative is based on theories

of herding and learning about common values (e.g., Ali and Kartik 2012; Banerjee 1992;

Bikhchandani et al. 1992; Callander 2007; Welch 1992). According to this explanation, sen-

ators exercise hindsight rather than foresight. Intuitively, senators might be uncertain as to

the best position to take on a particular bill, which is why they may look to the votes of

others to inform their own decision. If legislators interpret the choices of copartisans who

voted before them as cues, then agents toward the end of the alphabet have received a greater

number of signals. They should thus be more likely to ignore their private information and

vote with the majority of their own party.

We empirically test for herding and learning about common values by asking whether a

senator’s choice is more strongly correlated with the preceding votes of her colleagues than

one would expect under the null hypothesis of no cue taking. The answer turns out to be “no.”

That is, although our empirical approach performs well in Monte Carlo simulations, we find

no evidence of herding. In addition, we argue that an explanation based on learning about

common values cannot explain the result that senators’ choices depend on the probability of

casting a pivotal vote.

The second alternative mechanism we consider is vote buying (Dal Bo 2007; Dekel et al.

2008, 2009; Groseclose and Snyder 1996; Snyder 1990). Vote buying by the party leadership

can explain why the majority party is more likely to narrowly win a roll-call vote than to

narrowly lose it–although it may also give rise to supermajorities. A vote-buying mechanism

may also be consistent with senators’ observed choices being a function of pivot probabilities.

However, for such a theory to rationalize why legislators later in the vote order are more

likely to support the party line than those with last names closer to the beginning of the

alphabet, it would have to be the case that the party leadership is more likely to “buy off”

the former.

To better understand whether rational party leaders should target senators who vote late,

we extend the model of Dekel et al. (2009) to settings in which legislators vote sequentially.

We show that the basic insights gleaned from their static theory carry over to the dynamic

case. In particular, party leaders should target legislators who are close to indifferent, inde-

pendent of the vote order. We further argue that models that emphasize decisions by the

party leadership cannot easily explain why we find substantial individual-level heterogeneity

in how senators respond to the possibility that their vote might be decisive, and why vote

order effects only emerge after senators have participated in sufficiently many roll calls.

Our findings contribute to two separate literatures. An important body of work in be-

havioral economics documents that individuals in the laboratory do not properly backward

induct, e.g., in ultimatum games (see Camerer 2003 for review). Most relevant for our paper

are the results of McKelvey and Palfrey (1992) on the centipede game (see also Fey et al.

1996; Nagel and Tang 1998; Rapoport et al. 2003; Zauner 1999). In order to understand why

observed outcomes coincide so rarely with those prescribed by backward induction, recent

research has tested a host of potential explanations, ranging from cognitive limitations and

failures of common knowledge of rationality to preferences for fairness and altruism (Binmore

et al. 2002; Dufwenberg et al. 2010; Gneezy et al. 2010; Johnson et al. 2002; Levitt et al.

2011; Palacios-Huerta and Volij 2009). This strand of the literature typically concludes that

social preferences or departures from rationality cannot fully explain the observed violations

of Nash equilibrium (Binmore et al. 2002; Johnson et al. 2002).1 Instead, failures of backward

induction are, at least in part, attributed to cognitive limitations–though subjects can be

taught to reason backwards (Dufwenberg et al. 2010; Gneezy et al. 2010).

While studies conducted in the laboratory have provided important insights into hu-

man decision-making, there remain inherent methodological limitations (see Levitt and List

2007). Tests of fundamental game-theoretical concepts outside of the laboratory, however,

are scant.2 It remains, therefore, unknown to which extent failures of backward reasoning

generalize to other, real-world contexts.

Although our results corroborate the basic tenets of game theory more closely than one

might have expected based on the extant literature, certain results from the laboratory

appear to travel very well. In particular, our finding that experience makes senators more

likely to engage in strategic preemption complements existing evidence according to which

the behavior of professionals is often more consistent with the predictions of standard theory

than that of novices (e.g., List 2003, 2004; Palacios-Huerta and Volij 2008). At the same

time, legislators’ low speed of learning underscores the importance of studying real-world

settings in which individuals had sufficient time to accumulate experience.

We also contribute to a large literature on voting in legislatures (e.g., Levitt 1996; Lon-

dregan 2000; Krehbiel 1998; McCarty et al. 2016; Poole 2005; Poole and Rosenthal 1997;

Rhode 1991; Snyder and Groseclose 2000, among many others). Theory and a variety of

anecdotes suggest that politicians might act strategically in roll-call settings (see Enelow

1981; Jenkins and Munger 2003; Riker 1986; Volden 1998). Yet, thorough empirical evidence

1An important exception are the results of Palacios-Huerta and Volij (2009), who argue that failure ofbackward induction is due to a lack of common knowledge of rationality, and Baghestanian and Frey (2016),who replicate the findings in the former paper with GO instead of chess players.2The most important exception is a growing literature on the use of mixed strategies in professional sports.

While earlier work studying settings as wide-ranging as tennis serves in Wimbledon and penalty kicks insoccer cannot reject minimax play (Chiappori et al. 2002; Hsu et al. 2007; Palacios-Huerta 2003; Walker andWooders 2001), Kovash and Levitt (2009) show that pitches in Major League Baseball and play choices inthe National Football League exhibit too much serial correlation to be consistent with players using mixedstrategies. They suggest that earlier studies’ inability to reject the null hypothesis may be due to a lack ofstatistical power.

of tactical voting in Congress is almost nonexistent (see Landha 1994; Poole and Rosenthal

1997; Wilkerson 1999). To rationalize the surprising absence of evidence, Groseclose and Mi-

lyo (2010) provide a formal model in which agents vote simultaneously and strategic voting

does not arise in any pure-strategy equilibrium.3 To the best of our knowledge, our finding

that senators’ choices depend on the probability of casting a pivotal vote is one of the first

systematic pieces of evidence to suggest that tactical voting is, in fact, widespread.4 Our

work, therefore, helps to better understand legislators’ calculus of voting and, as a result,

the genesis of public policy.

2. Roll-Call Votes in the U.S. Senate

Article I of the Constitution states that “each House shall keep a Journal of its Proceedings,

and [. . . ] the Yeas and Nays of the Members of either House on any question shall, at the

Desire of one fifth of those Present, be entered on the Journal.”5 According to the Rules of

the Senate, a senator who has the floor may, at any time, ask for the Yeas and Nays on the

bill, motion, amendment, etc. that is currently pending. If at least 11 senators (i.e., one fifth

of the minimal quorum) raise their hands in support of the request, then the eventual vote

on the issue will be conducted by calling the roll, with each senator’s vote being recorded.

Although a roll-call request has no effect on when the issue will be voted upon, the low

requirement for ordering the Yeas and Nays, coupled with the fact that senators often care

intensely about their track record, means that the Senate decides most controversial issues

by roll-call votes.6

Regarding the manner in which roll calls are to be conducted, Rule XII of the Senate

requires that

“when the yeas and nays are ordered, the names of Senators shall be called alphabetically; and

each Senator shall, without debate, declare his assent or dissent to the question, unless excused

by the Senate; and no Senator shall be permitted to vote after the decision shall have been

announced by the Presiding Officer, but may for sufficient reasons, with unanimous consent,

change or withdraw his vote.”

In practice, when the time to vote has come, the presiding officer announces that “the Yeas

3As in our model of “parties as teams,” Groseclose and Milyo (2010) assume that at least some legislatorshave “position taking” (i.e., expressive) concerns that differ from their policy preferences. In subsequentwork, Groseclose and Milyo (2013) allow legislators to vote sequentially, which leads them to conclude that,in such settings, strategic voting may be part of equilibrium play.4For recent studies attempting to estimate the extent of strategic voting among ordinary citizens, see

Kawai and Watanabe (2013) and Spenkuch (2015).5In describing the voting procedures in the Senate, this section borrows heavily from Rybicki (2013).6Neither voice nor division votes are recognized by the Rules of the Senate. They are permitted by

precedent. In practice, division votes are very rare and voice votes are almost exclusively used on uncontestedquestions. Sometimes these are even decided “without objection” and without a formal vote.

and Nays have been ordered and the clerk will call the roll.” The clerk then calls senators

in alphabetical order. Senators who are present declare their choice. Following the initial

call of the roll, the clerk recapitulates the vote by respectively identifying those who voted

“yea” and “nay.” Senators who were absent when their name was first called, but have since

arrived on the floor, are allowed to go to the rostrum and still cast their vote. The clerk

calls their name, and repeats the senator’s choice. Usually, the presiding officer announces

the decision fifteen minutes after the beginning of the roll call–though votes are sometimes

kept open longer for more senators to hurry to the floor. On average, senators participate in

about 85% of calls.

It is important to note that, on the majority of roll calls, a nonnegligible number of senators

arrive on the floor late, i.e., after the clerk first called their name. Consequently, the actual

order in which votes are submitted is not strictly alphabetical. Nevertheless, changes in the

alphabetical composition of the chamber do provide quasi-random variation in the order in

which senators were first allowed to cast their votes. That is, a senator whose last name

starts with the letter “A” can always announce his decision before a colleague whose last

name starts with a “Z.”

3. A First Look at the Data

3.1. Data Sources and Summary Statistics

Our analysis uses data on all roll-call votes in the U.S. Senate since the emergence of the

two-party system, i.e., from the beginning of the 35th until the end of the 112th Congress

(1857—2013). The data contain senators’ names, party affiliation, and final votes. They come

from the Congressional Record and have been transcribed by Keith Poole and coauthors.7

Table 1 presents descriptive statistics. On average, about 95.5 distinct senators serve in a

given Congress, participating in almost 512 roll calls per two-year period–though the latter

number varies widely over time. According to the definition in Snyder and Groseclose’s (2000)

seminal work on party influence, about half of the almost 40,000 roll calls in the data end up

being “lopsided” in the sense that more than 65% or less than 35% of senators vote “yea.”8

The remaining half is said to be contested, or “close.” Approximately 56% of roll calls are

divisive. That is, the majority of senators from one party takes a position opposite from that

of the majority of the other party.

In total, the data consist of almost 2.9 million individual votes, of which about 18.4% go

7For precise defintions as well as additional information on the sources of all variables used throughoutthe analysis, see Appendix H.8For votes that require a supermajority, e.g., treaties and cloture votes, the corresponding cutoffs are

51.7% and 81.7% (i.e., 66.7%±15%). Data on supermajority requirements come from Snyder and Groseclose(2000) and have been manually extended through the 112th Congress.

against the party line. More precisely, senator i’s vote is said to deviate from the party line

whenever it does not coincide with the majority of others from the same party (not counting

i herself). The intuitive appeal of this definition is based on the idea that, on average,

the positions of copartisans should be aligned. That is, senators’ own preferences and their

party’s stance are likely highly correlated. Thus, looking at i’s colleagues provides a way

to gauge whether a given bill, amendment, etc. is popular within her party, while avoiding

endogeneity issues arising from i’s choice itself.9 Alternative definitions of the party line

might, for instance, be based on the votes of party leaders or the parties’ whips. Reassuringly,

they lead to qualitatively similar results.10

3.2. Descriptive Analysis

Focusing on members of the Democratic and Republican Parties, Table 2 presents our main

empirical finding. The numbers therein are based on the following econometric model:

(1) di,p,r,c = µi + λoi,r,c + εi,p,r,c.

Here, di,p,r,c is an indicator variable equal to one if senator i deviated from the party line on

roll call r during Congress c, µi marks a senator fixed effect, and oi,r,c denotes i’s alphabetical

percentile ranking among her colleagues. That is, oi,r,c takes on a value of zero for the senator

whom the clerk calls first, whereas it is one for the agent whose surname ranks her last.11

The coefficient of interest is λ. It measures by how much the opportunity of voting ear-

lier affects the probability that senators adhere to the party line. By including µi, all of

our specifications control for agents’ idiosyncratic tendencies to deviate from the majority

position of their own party. Put differently, senator fixed effects account for the fact that

surnames themselves are unlikely to be as good as randomly assigned. For instance, names

are known to be indicative of ethnicity and social class (see, e.g., Bertrand and Mullainathan

2004; Fryer and Levitt 2004). In fact, names even contain information about partisanship.

Looking only at the first letters of senators’ last names, a χ2-test rejects the null hypothesis

that their distribution is the same for Democrats and Republicans (p < .01). We, therefore,

control for senators’ intrinsic disposition.

9The most important downside of this method of inferring the party line is that it is undefined wheneverthere are exactly as many “yeas” as there are “nays” among a senator’s colleagues. This is the case for about1.4% of observations, which are consequently discarded.10Appendix Table A.2 replicates the papers’ main descriptive finding for alternative definitions of the

party line. An important disadvantage of defining the party line by how the party leadership votes is that,for procedural reasons, the majorty-party leader sometimes votes against a bill that he in actuality supports.Another disadvantage is that parties did not adopt today’s leadership system until the late 1910s, whichrenders earlier data unusable.11Formally, oi ≡ si−1

S−1 , where S denotes the number of senators and si is i’s raw alphabetical rank.

Identification in our setup comes from two sources of plausibly exogenous variation: (i)

changes in the alphabetical composition of the Senate over time (most of which is due to

the replacement of retiring senators or those who fail to get reelected), and (ii) within-

Congress variation in the set of senators who participate in a given roll call (e.g., because

other senators were not on Capitol Hill when roll call r was held, or because they abstained

due to a conflict of interest). That is, conditional on having a particular last name, senator

i is allowed to vote earlier on some roll calls than on others because a colleague who ranked

ahead of her in the alphabet was replaced by someone whose last name comes after hers, or

because another colleague happened to be absent on a particular day.

The results in the first three columns of Table 2 are based on alphabetical rank among

senators who participated in a particular roll call, whereas the ones in the remaining three

columns use the order of all senators who officially served in Congress at the time when

roll call r was conducted. Within each set of regressions, the leftmost estimates control for

senator fixed effects, while the ones in the middle and rightmost columns include senator-

by-Congress fixed effects. The estimate in column (1), therefore, exploits both within- and

across-Congress variation in roll call—specific alphabetical rank, while the ones in columns

(2) and (3) rely solely on the former. By contrast, the results in columns (4)—(6) discard any

variation arising from senators not participating in some roll-call votes. Instead, identification

comes from changes in the alphabetical composition of the chamber over time. Column (3)

allows for both across- and within-Congress changes, whereas columns (5) and (6) use only

the latter (i.e., variation due to deaths, expulsions, or sudden departures for other reasons).

Our specifications in columns (3) and (6) also control for the fraction of copartisans who

deviate from the party line during the same roll call. If there are no strategic interactions

between senators, then controlling for the choices of others from the same party yields more

precise estimates by implicitly accounting for roll call—specific unobservables that might

affect how controversial a given measure is. If there are strategic interactions, however, then

the specifications in columns (3) and (6) include an endogenous control variable, which will

generally bias the coefficient of interest (cf. Cameron and Trivedi 2005, ch. 4). In Appendix

A, we prove that if senators systematically preempt each other–as in our theory of “parties

as teams”–then the point estimates in columns (3) and (6) are, in fact, upward biased.

That is, they constitute upper bounds on the true vote-order effect. By contrast, for the

coefficients in the remaining columns to be biased, one would have to believe that changes

in senators’ preferences are systematically correlated with whether they get to vote earlier

or later than usual.

Importantly, all point estimates in Table 2 are negative and statistically significant at

conventional levels. In fact, the estimate based on the least amount of potentially suspect

variation, i.e., the one in column (5), is the most negative of all. At the same time, it is also

the least precisely estimated. Taking the 95%-confidence intervals implied by the standard

errors in Table 2 at face value, one can reject neither large nor small effects of senators’

alphabetical ranking on the probability of defection. Nevertheless, the evidence in Table 2

does imply that the opportunity to vote early causes senators to deviate from the party

line.12

Figure 3 presents results from a placebo test. Specifically, we randomly reassign senators’

names and reestimate equation (1) based on the newly induced ordering. Repeating this

process sufficiently many times, we compare our preferred point estimate from column (1)

in Table 2 to the distribution of placebo coefficients. As one might expect, we find that the

median placebo estimate is approximately zero. More importantly, 98.4% of them are larger

than the true coefficient, which suggests that our finding in the previous table is unlikely to

be due to chance.13

In Table 3, we show that vote-order effects in the Senate are driven by “close” rather than

“lopsided” roll calls. We also establish that there are no corresponding effects in the mod-

ern House of Representatives, where, after the introduction of electronic voting machines,

alphabetical roll calls have become obsolete. In electronic “roll calls,” there exists no pre-

determined order in which legislators get to cast their vote. Any congressman is allowed to

submit his choice as soon as the vote has been opened. Reassuringly, the point estimates for

the House show no systematic correlation between defection and alphabetical rank. In fact,

two of the six coefficients even have the “wrong” sign, and none of them are statistically

significant.14

12There exists the possibility that the results in columns (1)—(3) are affected by selective abstention. Forthe point estimates in the these columns to be biased, it would have to be the case that abstention isconcentrated in certain parts of the vote order and correlated with other senators’ propensity to deviatefrom the party line. To assuage concerns about selective abstention driving our results, we show in AppendixTable A.3 that the point estimates become, if anything, more negative when we restrict attention to rollcalls with greater levels of participation.13Appendix Figure A.1 replicates this placebo test based on the specification in colum (3) of Table 2. The

results are qualitatively identical.14For additional context, when a recorded vote is held in the House, congressmen typically have fifteen

minutes to cast their votes by inserting an identification card into one of forty voting stations distributedthroughout the House floor and pushing a button corresponding to “yes,” “no,” or “present.” There isno prespecified order. If a member votes yes (no) on the issue under consideration a green (red) light isilluminated next to her name on a large “scoreboard” mounted above the Speaker’s chair. If she votes“present,” a yellow light goes off instead.

Before the introduction of voting machines during the 93rd Congress, recorded votes were held by orallycalling the roll, but they were not permitted in the Committee of the Whole, the form in which the Houseordinarily operates to debate and vote on amendments. As a consequence, congressmen voted on many issuesin anonymity and not in alphabetical order (cf. Koempel et al. 2008). In 1970, for instance, the House usedvoice, division, or teller votes on issues ranging from a measure to exempt potatoes from federal marketingorders to American troops in Cambodia, the antiballistic missile system, and school desegregation (cf. Con-

Next, we examine the frequency with which the majority party in the Senate wins close

roll calls. Using McCrary’s (2008) discontinuity test, Figure 4 shows that there exists a large

discontinuity around a winning margin of zero. More precisely, there are about twice as

many roll calls that the majority narrowly wins than it narrowly loses, and the difference is

statistically highly significant (p < .001).15

In sum, our analysis of sequential roll-call votes in the U.S. Senate documents three empir-

ical regularities. First, the majority party is much more likely to barely win a roll-call vote

than to just lose it. Second, senators who, by virtue of their last name, get to vote earlier

in the order are more likely to deviate from the party line than their colleagues who get to

vote late. Third, the vote order affects the choices of senators only on roll calls that end up

being close.

4. Parties as Teams

In this section, we present a simple model of sequential voting that, by appealing to backward

induction, has the potential to rationalize these findings. The theory is closely related to that

of Groseclose and Milyo (2013) and to the dynamic free-riding problem in Iaryczower and

Oliveros (2015).

Primitives. Let there be a finite set of senators, i = 1, 2, . . . , S, who can either vote

“yea” or “nay.” Each of them belongs to one of two parties, Democrats (D) or Republicans

(R). The Democratic Party is in the majority, i.e., |D| > |R|. It supports the bill that is

currently under consideration. The Republican Party, on the other hand, would like to see

it fail. Passage of the measure requires strictly more “yeas” than “nays.”

Members of both parties derive expressive utility directly from how they vote. In line with

the standard, spatial model of voting, we let ξ and ζ denote the “yea”- and “nay”-positions

on the bill, with ξ < ζ. We further assume that senators evaluate these positions according

to the distance from their personal ideal point xi. These ideal points might arise because

senators themselves are ideological (Levitt 1996), or because they are being held accountable

by their constituents (Mayhew 1974). Specifically, we assume that Ui (ξ) = − (xi − ξ)2 and

Ui (ζ) = − (xi − ζ)2.

In addition to their position-taking utility, agents also have instrumental preferences. That

is, they value whether or not the bill ultimately passes. Such instrumental preferences might

be due to concerns about their party’s reputation or “brand” (Downs 1957; Snyder and Ting

gressional Quarterly 1971). In the data, roll calls held after the introduction of electronic voting outnumberthose before by about two to one. Interestingly, point estimates for the period before the 93rd Congress arelower than those afterwards. Large standard errors, however, make direct comparisons speculative.15This result holds regardless of whether the Vice President belongs to the majority party, although the

estimated discontinuity is smaller when he does not (cf. Appendix Figure A.2).

2002), or because party elites exert pressure on rank and file members (Rhode 1991; Snyder

and Groseclose 2000). Thus, irrespective of how a given senator votes herself, all Democrats

receive utility δD > 0 if the bill passes, whereas Republicans incur a penalty of δR < 0.

Without loss of generality, define αi ≡ Ui (ξ) − Ui (ζ) and normalize each agent’s utility

from voting “nay” when the bill is rejected to zero. The following 2×2 matrix summarizes

senators’ payoffs, depending on their party affiliation p ∈ {D,R}.

bill passes bill rejected

vote “yea” αi + δp αi

vote “nay” δp 0

The important point to note is that, conditional on the overall outcome of the roll call,

all senators would like to follow their position-taking preferences, i.e., vote “yea” whenever

αi > 0. If their vote ended up being pivotal, however, then senators for whom αiδp < 0 and

|αi| < |δp| would be better off by voting against their expressive preferences and with the

party line instead.16 We say that these agents are “conflicted.”

Timing & Information. Senators publicly announce their choice in the order of their

exogenously determined index. The vote order as well as individual senators’ payoffs are com-

mon knowledge. Although the latter assumption is unlikely to be literally correct, senators

do interact frequently and parties often hold straw polls in advance of important votes. We

would, therefore, expect agents to be rather well-informed about each other’s preferences.

The substantive implication of the common knowledge assumption is that forward-looking

senators do not only know the current vote count, but by using backward induction they

can perfectly predict the choices of their colleagues who have not yet voted. In our struc-

tural estimation in Section 5, we relax this assumption. For now, however, we abstract from

uncertainty, as analyzing a finite extensive-form game with perfect information significantly

simplifies the exposition.17

Equilibrium. We focus on subgame-perfect Nash equilibria (SPNE). Generically, the

game above admits a unique SPNE, which is in pure strategies. To see this, consider the

senator who votes last. Having observed all previous votes, she knows whether or not her

vote will be pivotal. Generically, αi 6= 0 and |αi| 6= |δp|, which implies that it is not optimal

16Consistent with Ali and Kartik (2012), our notion of pivotality is dynamic in that senators evaluate thepossibility that their choice affects the outcome of the vote, taking into account the best responses by laterplayers. Thus, roll calls that are ex post decided by more than one vote–and hence would be unaffected byan ex post vote change–might nevertheless have had many dynamically pivotal moments.17Another reason to abstract from uncertainty is that it would work against finding negative vote-order

effects in the data. That is, if uncertainty about preferences was quantitatively important, then senatorsearly in the order might prefer to “play it safe” rather than preempt their colleagues. In Appendix B, weshow that if uncertainty is low enough, senators’ optimal strategies coincide with those in Proposition 1.

for her to play a mixed strategy. As a consequence, the second-to-last senator also knows

whether her choice changes the outcome of the roll call, which, again, results in a strictly

dominant action. Proceeding along the same lines, every player has a dominant action at

any node of the game tree.

For senators whose position-taking preferences are either aligned with the party line (i.e.,

αiδp > 0) or dominate it (i.e., |αi| > |δp|), equilibrium strategies are straightforward. These

individuals choose “yea” if and only if αi > 0.

A more interesting situation arises when senators are conflicted, i.e., when αiδp < 0 and

|αi| < |δp|. As in the example in Figure 2, these agents reason backwards to determine

whether their vote will be pivotal.

To formally characterize conflicted senators’ strategy, we introduce the following notation.

Let Y (N) denote the set of all agents who will vote “yea” (“nay”) for sure, and let Y (N) be

the group of senators who would do so if their vote was known to be decisive. |·|i′>i denotes

the number of agents from a particular set who submit their choice after i. In addition, define

y (n) as the minimal number of “yeas” (“nays”) required for the bill to pass (fail), and let

(yi, ni) be the current vote count when it is i’s turn to submit her choice. All of these objects

are part of senators’ information set.

Proposition 1: In the unique generic subgame-perfect equilibrium of the game, Demo-

cratic senators for whom αiδp < 0 and |αi| < |δp| abandon the party line if and only

if |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 6= y − yi, whereas their Republican counterparts defect whenever

|N |i′>i +∣∣∣N ∣∣∣

i′>i+ 1 6= n− ni.

Proof: All proofs are collected in Appendix A.

In words, a conflicted senator counts the number of agents who have not yet voted and

who would choose her party’s preferred outcome, either for sure or if their vote was pivotal.

If there are enough others who would go along with the party line if need be, or too few, she

defects.

Empirical Implications. Proposition 1 suggests that being allowed to vote early con-

fers an advantage because it lets forward-looking senators preempt each other. The first

conflicted agent may be able to defect without rendering the roll call lost because there are

enough others who can be counted on to carry the bill to passage. Subsequent senators,

however, can rely on fewer and fewer of their colleagues, making it, on average, less likely

that they defect.

A sufficient (but not necessary) condition for this intuition to go through and for negative

vote-order effects to arise in equilibrium is that the ideal points of legislators from either

party do not overlap. Under this assumption, we can prove the following propositions.

Proposition 2: Suppose that maxi∈D xi < mini∈R xi. There exists some ι ∈ {0, 1, . . . , S}such that conflicted senators defect if and only if i ≤ ι.

Further,

(a) 1 ≤ ι < S if |Y | < y < |Y |+∣∣∣Y ∣∣∣ or |N | < n < |N |+

∣∣∣N ∣∣∣;(b) ι = 0 if |Y | < |Y |+

∣∣∣Y ∣∣∣ = y or |N | < |N |+∣∣∣N ∣∣∣ = n;

(c) ι = S if |Y | ≥ y or |N | ≥ n.

Implication: Deviations from the party line should be more common early in the vote

order than at the end.

Substantively, the first part of the proposition states that, along the equilibrium path,

conflicted senators deviate from the party line if and only if they appear “early enough” in

the order. The second part delineates the situations in which the vote order does and does

not impose any limits on defection. Intuitively, vote order effects arise only when one side

needs some but not all of its conflicted members to go along with the party line. Together

with the next proposition, this implies that order effects do not occur on roll calls that end

up being “lopsided.”

Proposition 3: In equilibrium, the majority party wins a roll call vote if and only if

|Y | +∣∣∣Y ∣∣∣ ≥ y. If, in addition, maxi∈D xi < mini∈R xi and |Y | < y, then the bill receives

exactly as many “yea” votes as required for passage. By contrast, if maxi∈D xi < mini∈R xi

and |Y | +∣∣∣Y ∣∣∣ < y, then, in equilibrium, the majority party falls at least

∣∣∣Y ∣∣∣ votes short of

winning.

Implication: The majority party should be more likely to barely win a roll-call vote than

to just lose it.

Implication: Vote-order effects should only be observed on roll calls that are ex post close.

The first implication follows from the fact that strategic defection might turn a comfortable

majority into a narrow victory, i.e., it makes wins endogenously close. Its effect on losses,

however, is the opposite. If a party is going to lose, then, in equilibrium, its conflicted

members exacerbate the defeat by defecting.

To see the second implication, note that if the roll call does not end up being close, i.e.,

if the “yea” position obtains with more than the minimal majority, then it must have been

the case that |Y | ≥ y. If a majority of senators intrinsically supports the bill, then all others

are free to deviate from the party line, irrespective of their position in the vote order. There

are, therefore, no vote-order effects on roll calls that are not endogenously close.

Our proofs of Propositions 2 and 3 depend on the assumption of nonoverlapping prefer-

ences to avoid intractable combinatorics with an arbitrary number of senators. While this

restriction may appear to be very strong, a large literature in political science has produced

considerable evidence to suggest that it is approximately satisfied for most of the period since

the emergence of the two-party system (see Barber and McCarty 2015; Poole and Rosen-

thal 1997; Theriault 2008). In fact, McCarty et al. (2016) report estimates of senators’ ideal

points for recent Congresses that are perfectly consistent with our assumption.

Only during the middle of the twentieth century, when, by historical standards, polarization

was unusually low and party labels contained less information about legislators’ ideology,

are nonoverlapping preferences a priori unreasonable. At the same time, we stress that this

assumption is by no means necessary for the model to rationalize our empirical findings.

In Appendix D, we present simulation results that yield similar comparative statics without

imposing such a restriction on senators’ preferences. We also provide additional reduced-form

evidence that supports our model.

Broadly summarizing, by appealing to backward induction our theory of “parties as teams”

explains (i) why the majority party is much more likely to barely win a roll-call vote than to

just lose it, (ii) why senators who appear early in the vote order are more likely to deviate

from the party line than their colleagues who get to vote late, and (iii) why vote-order effects

only arise on roll calls that end up being close.

Common Knowledge of (Ir)rationality. To arrive at these predictions, we implic-

itly assume that senators’ rationality is common knowledge. In many games, common knowl-

edge of rationality (CKR) is unlikely to be exactly satisfied; and even small deviations from

CKR may give rise to equilibrium strategies that are dramatically different from the one

prescribed by backward induction (see, e.g., Kreps et al. 1982; Reny 1992). For instance, if

one of the players in the centipede game believes that her opponent is irrational, then it

need not be optimal for her to play the backward-induction strategy either. It is, therefore,

important to consider whether plausible deviations from CKR would affect the empirical

implications above.

By way of example, suppose that senator i ∈ D was conflicted but myopic. That is, she

always deviates from the party line, even in situations in which her vote is necessary for the

bill to pass. If this is known to all of her colleagues, then Propositions 1—3 continue to go

through, only the “accounting” changes. That is, instead of being counted as part of Y , i

would now belong to N . Similarly, if senator i was conflicted but would nonetheless always

vote with the party line, then i ∈ Y . In general, the predictions of our theory would remain

unaffected by deviations from CKR as long as agents know about others’ irrationality, and

provided that the deviant behavior can be rationalized by an alternative set of payoffs over

a senator’s recorded vote and the ultimate outcome of the bill.

Instead of it being known for sure that some agents do not play the backward-induction

strategy, it could also be the case that CKR fails because there is a small probability that

agents make a mistake. In Appendix B, we consider situations in which there is uncertainty

about whether conflicted senators stick with the party line. We show that when the prob-

ability of mistakes is low enough, then the backward-induction strategy in Proposition 1

remains optimal. As a consequence, the theory’s empirical predictions would continue to go

through.

5. A Closer Look at the 112th Congress

At its core, our model builds on the game-theoretic idea that forward-looking agents try to

anticipate the behavior of others in order to maximize their own payoffs. After all, systematic

preemption is predicated upon agents being forward-looking, or on behavioral rules that are

ultimately based on backward-induction logic. In order to learn more about strategic foresight

in this important real-world setting, we now turn to the 112th Congress (01/2011—01/2013).

5.1. Data and Reduced-Form Evidence

A significant limitation of existing roll-call data is that they only contain senators’ final

choices. While knowledge of senators’ names is sufficient to reconstruct the order in which

they were first allowed to submit their votes, the Congressional Record does not indicate

whether a senator submitted his choice after the initial, alphabetical call of the roll, nor

does it contain any information on whether a given vote was ever changed or withdrawn.

In Appendix C, we show that allowing for vote changes in our model of “parties as teams”

would not affect equilibrium behavior. Nonetheless, whether senators do, in fact, flip-flop is

ultimately an emprical question.

To overcome these limitations and provide evidence that senators are not playing a game

that is altogether different from the one in the previous section, we contacted the C-SPAN

network and requested video recordings of all 486 roll calls conducted during the 112th

Congress. Human coders watched each recording and transcribed every senator’s vote, whether

it was submitted during the alphabetical call of the roll, as well as any subsequent changes.

They also recorded the exact order in which votes were submitted by latecomers, i.e., after

the clerk had stopped calling senators’ names. A second coder rewatched the video to check

the transcription for errors, ensuring that senators’ final choices match the Congressional

Record.18

To the extent that the newly collected information is generalizable, it is possible to establish

the following three facts: (i) As only 0.5% of initial votes are ever changed or withdrawn,

“flip-flopping” is quantitatively negligible. (ii) Consistent with the idea that senators use

the opportunity to cast their vote as early as possible in order to preempt their colleagues,

senators at the beginning of the alphabet are more likely to vote during the initial call of

the roll than those towards the end. (iii) Alphabetical rank and actual vote order covary

closely. Regressing the position in which a senator ended up voting on her alphabetical rank

and a set of senator fixed effects produces a point estimate of 1.066 (with a standard error

of .088).

Next, we provide suggestive evidence in support of the assumption that senators are more

likely to support the party line when their vote might be pivotal. Relying on the exact order

in which votes were cast, we group the data by the number of “yeas” and “nays” that were

still required when a given senator voted as well as the number of copartisans who had yet

to do so. We then define the empirical frequency of a pivotal vote as the share of instances

in which a different choice by agents who voted in the same “state of the world” would have

changed the overall outcome of the roll call.

Figure 5 shows the average defection rate for each value of this crude indicator of ex post

pivotality. Although our measure has few unique values, the data indicate a strongly negative

relationship–consistent with our theory’s core assumption.

5.2. Structural Estimation

To more rigorously assess the prevalence and importance of backward reasoning among

senators, we now structurally estimate our model of “parties as teams.” In doing so, we

relax the assumption that payoffs are common knowledge.

Setup. Specifically, in order to accommodate the possibility that senators’ views on a

given bill may only be partially known to their colleagues, we let player i’s position-taking

utility from voting “yea” on roll call r be given by Ui,r (ξr) = − (xi − ξr)2 + ε+i,r, while that

from choosing “nay” is Ui,r (ζr) = − (xi − ζr)2 + ε−i,r. Intuitively, the preference shocks ε+i,r

and ε−i,r indicate how much senator i’s valuation of a particular position differs from what

would be expected based on her ideology. We posit that these differences are i.i.d. random

variables that are only observed by i herself. All other elements of the game remain common

knowledge.19

18For additional information on the coding process, see Appendix H.19In particular, we continue to assume that agents know the order in which votes end up being cast, or,

equivalently, that they can predict the outcome of whichever game determines the order of votes that aresubmitted after the initial, alphabetical call of the roll.

Since senators can no longer perfectly anticipate whether others will support the party

line, the overall outcome of the vote is generally uncertain. Let qi,r(yi,r + yi,r) denote senator

i’s belief that her party’s preferred position obtains, conditional on having already observed

yi,r “yeas” as well as her own vote, yi,r. Further, let δ be the instrumental utility associated

with achieving the party’s goal.

With this notation in hand, a senator votes “yea” if and only if Ui r (ξr) + δqi,r(yi,r + 1) ≥Ui,r (ζr)+δqi,r(yi,r). Assuming that ε+i,r and ε−i,r are normally distributed with E

[ε+i,r

[ε−i,r]

and Var[ε+i,r − ε−i,r

]= σ2, the probability that i assents to the bill when yi,r others have

already done so is equal to

Pr(yi,r = 1|yi,r

ε−i,r − ε+i,r ≤

− (xi − ξr)2 + (xi − ζr)2

+δ[qi,r(yi,r + 1)− qi,r(yi,r)

] = Φ (βrxi + γr + δ∆qi,r) ,

where βr = 2σ

(ξr − ζr), γr = 1σ

(ζ2r − ξ2

r ), ∆qi,r ≡ 1σ

[qi,r(yi,r + 1)− qi,r(yi,r)

], and Φ denotes

the standard normal CDF.

For now, we maintain that all agents form beliefs about the outcome of the roll call by

reasoning backwards. Hence, the probability of casting a pivotal vote, i.e., ∆qi,r, is a function

of the observed number of “yeas,” the roll call—specific parameters (βr, γr), and the ideological

ideal points of senators who have yet to vote.20

Estimation. Given the assumptions above, the likelihood is

L (x, β, γ, δ|Ψ) =C∏r=1

S∏i=1

Φ (βrxi + γr + δ∆qi,r)yi,r(2)

× [1− Φ (βrxi + γr + δ∆qi,r)]1−yi,r ,

where x = (x1, . . . , xS), β = (β1, . . . , βC), γ = (γ1, . . . , γC), and Ψ denotes the data. To

estimate (x, β, γ, δ) we follow the Bayesian Markov Chain Monte Carlo approach of Clinton

et al. (2004) (see Appendix E for details).

Clinton et al. (2004) pioneer the Bayesian analysis of roll-call data. The key difference

between their setup and ours is that the former assumes legislators are pure position-takers,

20To construct qi,r (·), let δi equal δ if i’s party supports the bill and −δ otherwise. Further, let qS,r (y) =I[y ≥ y

], and proceed recursively:

qi,r (y) = qi+1,r (y + 1) Φ(βrxi+1 + γr + δi [qi+1,r(y + 1)− qi+1,r(y)]

)+ qi+1,r (y)

[1− Φ

(βrxi+1 + γr + δi [qi+1,r(y + 1)− qi+1,r(y)]

If i’s party prefers the bill to pass, then qi,r (y) = qi,r (y). Otherwise, qi,r (y) = 1− qi,r (y).

i.e., δ = 0. Thus, compared to conventional analyses of voting in legislatures, we estimate one

additional parameter, which governs whether senators behave strategically. Our structural

model, therefore, nests the standard one.

We favor the Bayesian over a frequentist approach for two reasons. First, it produces valid

inferences even when the number of model parameters is a function of the sample size.

Second, it allows us to directly assign probabilities to substantive hypotheses about how

senators behave. For instance, below we ask whether senators are strategic, and we calculate

the posterior probability of H0 : δ > 0.21

In our view, the most important drawback of a Bayesian analysis is that the results might

be driven by the prior. To mitigate this concern and to “let the data speak” as much as

possible, we only impose vague priors. Specifically, we assume that (x, β, γ, δ) ∼ N (0, κI),

where I denotes the identity matrix and κ = 25. As the presentation of our results makes

clear, the information in the data “swamps” these priors.

Before we proceed to the results, however, it is useful to point out how each parameter is

identified. Note, in our hand-collected data for the 112th Congress, we only observe senators’

votes and the order in which they were cast.

Identification. As in virtually all random utility models, the parameters in (2) can only

be recovered up to some arbitrary normalizations. To set the overall scale, we let σ = 1. A

peculiarity of the spatial model of voting is that we additionally need to fix the scale and

directionality of senators’ ideal points. This is because, for any κ ≷ 0, βrxi = (κβr)(xiκ

We resolve this indeterminacy by anchoring the ideal point of Senator John Kerry (D-MA)

at −1 and that of Senator John McCain (R-AZ) at 1. Hence, an ideal point of xi = 0 should

be interpreted as ideologically halfway between Kerry and McCain, whereas a senator with

xi < −1 (xi > 1) would be more liberal (conservative) than the former (latter). These are

pure normalizations; no matters of substance are involved.

In the absence of strategic behavior, i.e., with δ = 0, the likelihood function above is iso-

morphic to that of a one-dimensional latent variable or IRT model. Identification of these

models has been studied extensively in the psychometrics and, to a lesser extent, the polit-

ical science literature (see, e.g., Bock et al. 1988; Bollen 1989; Jackman 2001; Rivers 2003;

Thurstone 1947). Poole (2000), for instance, establishes that the ordering of legislators’ ideal

points can be recovered without imposing parametric assumptions on the random compo-

nent in senators’ utility function. Peress (2012) shows that, as long as ε+i,r and ε−i,r are i.i.d.,

nonparametric identification extends to the cardinal case and encompasses all parameters in

the standard IRT framework.

21By contrast, classical p-values do not generally correspond to the probability that the null hypothesis iscorrect.

To build intuition for how the parameters in our model depend on the data, first focus on a

set of “lopsided” votes–as in Snyder and Groseclose (2000). Since the outcome of these roll

calls is never truly in question, senators’ choices are pinned down by their position-taking

preferences, i.e., δ∆qi,r ≈ 0. When legislators decide on the same set of issues, then those who

tend to vote alike must have similar ideal points. Given the normalizations above, senators

who often vote with John Kerry will have negative ideal points, whereas senators whose

choices align with those of John McCain will have positive x. A senator will be placed to the

left (right) of Kerry (McCain) if she disagrees even more frequently with McCain (Kerry)

than Kerry (McCain) does. Thus, even if senators are, in fact, strategic, it is still possible to

infer their ideal points from “lopsided” votes.

Next, consider the roll call—specific parameters γ and β, which denote the bill’s valence

and partisan divisiveness, respectively. A measure that receives many “yeas” from both sides

of the aisle has high valence and low divisiveness. By contrast, roll calls that split neatly

along partisan lines have low γ and a high absolute value of β. Whether the latter is positive

or negative depends on which of the parties supports the bill.

Lastly, identification of δ comes from two sources. First, conditional on the intensity of

a senator’s position-taking preferences (i.e., the value of βrxi + γr), is she more likely to

support the party line during a “close” roll call than a “lopsided” one? Second, comparing

the votes of different legislators on the same bill, are senators less likely to defect when, due

to the vote order, their own vote stands a higher chance of being decisive? If so, then δ is

positive.

The preceding arguments clarify which features of the data allow us to draw inferences

about different model parameters. Intuitively, our Bayesian MCMC approach treats all pa-

rameters as random variables. Using Bayes Rule, it combines our diffuse priors with the

information in the data to obtain the posterior distribution of (x, β, γ, δ). By construction,

the posterior summarizes all available information about these unknown quantities.

5.3. Results

Baseline. Figure 6 plots the posterior mean of senators’ estimated ideal points against

the first dimension of Poole and Rosenthal’s (1997) DW-Nominate scores, the most widely

used index of legislators’ ideology. Although the two measures are based on different mod-

elling assumptions, their rank-order correlation exceeds .9–consistent with the idea that

ideological ideal points can be nonparametrically identified.

To speak to the question of whether senators exhibit strategic foresight, we turn to the

posterior distribution of δ. According to our theory of “parties as teams,” senators engage in

backward induction to optimally decide between pursuing their instrumental and position-

taking goals. If senators actually have instrumental preferences and if they are forward-

looking, then δ should be positive. We, therefore, evaluate the null hypothesis H0 : δ > 0

against the alternative H1 : δ ≤ 0. The latter encompasses the assumption of pure position-

taking, which underlies almost all analyses of roll-call voting in the political science literature.

Figure 7 depicts the marginal posterior of δ (solid line) relative to its prior (dashed line).

Reassuringly, the posterior is considerably more concentrated, suggesting that the data rather

than the prior are driving our results. Most importantly, positive values account for 96.4% of

probability mass. As a consequence, the posterior odds of H0 versus H1 are Pr(H0|Ψ)Pr(H1|Ψ)

≈ 26.8,

which leads us to conclude that there is strong evidence of strategic behavior in the U.S.

Senate.

A the same time, heterogeneity in senators’ subjective beliefs may muddle the interpreta-

tion of our estimates. Suppose, for instance, that only a subset of agents actually engage in

backward induction, while others behave myopically. If correct, then the posterior in Figure

7 would systematically understate the importance of strategic considerations. In order to

decouple heterogeneity in expectation formation from senators’ instrumental concerns, we

now extend our baseline model to allow for different “types” of legislators.22

Extended Model. In particular, we empirically differentiate among three distinct ex-

pectation-formation mechanisms: (i) myopia, (ii) backward induction, and (iii) a behavioral

heuristic that crudely resembles the backward-induction strategy in Proposition 1. That is,

rather than explicitly contemplating the likely behavior of others, some senators may take

a mental shortcut by comparing the number of copartisans who have yet to submit their

vote with the number of “yeas” or “nays” that are still required for their party’s preferred

outcome to obtain. If the former falls short of the latter by exactly one, then agents who rely

on this heuristic believe that their own vote is going to be decisive. Otherwise, they conclude

that their choice has no effect on the ultimate outcome of the roll call. Put differently, agents

who reason according to this heuristic expect that all senators follow the party line if need

Assuming that senators’ types are common knowledge, and letting π1, π2, and π3 denote

22Since we have no substantive interest in the roll call—specific parameters (β, γ), we relegate the presen-tation of these results to Appendix Figures A.18 and A.19.23Our approach of addressing heterogeneity in reasoning by introducing differentially sophisticated “types”

is similar to that in the literature on level-k thinking (see, e.g., Gill and Prowse 2016 or the review ofCrawford et al. 2013), with the important difference that the “heuristic type” in our analysis is typically notaccomodated by the aforementioned framework.

the population share of each, the likelihood of our extended model is given by

L (x, β, γ, δ, π|Ψ) =C∏r=1

S∏i=1

3∑t=1

πtΦ (βrxi + γr + δ∆qi,r,t)yi,r

× [1− Φ (βrxi + γr + δ∆qi,r,t)]1−yi,r ,

where ∆qi,r,t refers to senator i’s subjective probability of casting a pivotal vote when she

reasons like type t.24 Since we have no strong a priori justification to favor one expectation

formation mechanism over the other, we impose a symmetric Dirichlet prior with parameters

a1 = a2 = a3 = 1. That is, our prior is that (π1, π2, π3) are uniformly distributed over the

unit simplex. We continue to assume that all senators place the same weight on achieving

their party’s preferred outcome, and, in light of the evidence from our baseline model, we

restrict δ to be weakly positive.

Reassuringly, senators’ estimated ideal points remain virtually unaffected when we allow

for heterogeneity in expectation formation (cf. Appendix Figure A.14). The posterior distri-

bution of δ, however, shifts markedly to the right. As shown in Figure 8, once we account for

the possibility that some agents may be myopic, the posterior mean of δ increases to 6.15.

To put this number into perspective, consider a senator who, based on her ideology, would

be expected to assent to a bill with probability Φ (3) = .999. A value of δ ≈ 6 implies that if

her party opposes the measure and she knows her vote to be pivotal, then that same senator

will now reject the bill with probability .999. The evidence, therefore, suggests that senators

care intensely about the party line. They are not pure position takers.

Yet, the evidence also implies that many legislators are not fully sophisticated. Figure 9

depicts the marginal posteriors of π1, π2, and π3. Taking the posterior means at face value,

almost 52% of senators behave myopically. Another 16% are estimated to rely on the heuristic

described above, while 32% of agents actually reason backwards.

To be clear, there are potentially many ways for senators to form expectations about

the outcome of the roll call, and our structural model distinguishes only three of them.

Nevertheless, we believe that even such a crude differentiation is useful because the estimated

type shares are informative about broad categories of mental reasoning.

For instance, we do not claim that nearly a third of senators engage in exact backward

24More specifically, let |P |i′>i denote the number of senators who vote after i and whose party supportsthe bill. For senators who rely on the heuristic described above,

∆qi,r,3 =

{1 if |P |i′>i + 1 = y − y0 otherwise

whereas ∆qi,r,1 = 0 for myopic agents. The beliefs of legislators who reason backwards continue to be definedrecursively, similarly to the construction in footnote 20.

induction. However, for Bayes Rule to assign considerable mass to π2 rather than π3 or π1,

it must be the case that the choices of these legislators are more consistent with explicit

backward reasoning than either of the alternatives. That is, these senators are forward-

looking and consider the likely behavior of at least a few others who have not yet voted–

above and beyond what could be inferred based on party affiliation alone. It is this type that

most closely resembles the idealized rational economic agent.

By contrast, senators whose conduct is best described by the mental shortcut above do

not react to the likely actions of others, although they do behave with an eye toward the

future. These types realize that the outcome of the roll call depends on who has yet to vote,

but they lack the sophistication to account for uncertainty in the choices of other players.

Still, game theory provides a useful tool to understand even these senators’ behavior. After

all, without strategic interactions there would be no preemption and no (attempted) free

riding, both of which are reflected in this type’s choices. Only when it comes to myopic agents

would a game-theoretical model lead us astray. Since our results suggest that about half of

senators fall into this category, we ask what predicts whether a legislator lacks foresight?

Ancillary Results. To speak to this issue, we calculate the mean posterior probability

of each senator being myopic. We then regress these probabilities on a parsimonious set of

individual-level characteristics and present the results in Table 4. Although our estimates

are very noisy–in part because only 102 senators served during the 112th Congress–there

is some evidence to suggest that agents with little to no prior experience in roll-call settings

are more likely to act myopically.

In Figure 10, we return to all roll-call votes since the emergence of the two-party system and

study whether experience correlates with whether senators engage in preemption. The results

are based on the empirical model in equation (1), but allow for the impact of alphabetical

rank, i.e., λ, to vary with the number of votes in which a senator had already participated

at the time a given roll call was held.

For senators who have fewer than a thousand votes under their belt, there is no evidence

that they preempt their colleagues when they are allowed to vote earlier. Not only are the

respective point estimates jointly insignificant (p = .765), but each of them is rather close

to zero. By contrast, estimates for agents who have participated in more than a thousand

previous calls are large and statistically significant. Furthermore, it is possible to reject the

null that the least experienced senators (i.e., those who have cast fewer than a hundred

roll call votes) react at least as strongly to alphabetical rank as the most experienced ones,

i.e., those who have voted more than 5,000 times before (p = .037). Even senators with the

experience of 500 to 1,000 calls appear to react less to the opportunity to vote early than

their colleagues with more than 5,000 previous votes (p = .045). Exploiting the fact that

senators vote on more bills in some Congresses than in others, Appendix Figure A.3 shows

that these patterns are qualitatively robust to conditioning on the time a senator has served.

This suggests that it is not seniority, but experience with the game, that leads agents to

preempt each other.

6. Alternative Explanations

Although the evidence above supports the idea that strategic voting coupled with back-

ward reasoning is an important part of equilibrium play, we now discuss two alternative

mechanisms that might also give rise to our main finding of negative vote-order effects.

6.1. Herding and Learning about Common Values

The first alternative is based on theories of herding and learning about common values (e.g.,

Ali and Kartik 2012; Banerjee 1992; Bikhchandani et al. 1992; Callander 2007; Welch 1992).

At its core, this explanation assumes that legislators are uncertain about whether supporting

a particular bill is in their own best interest. They, therefore, resort to the choices of others

to inform their decisions. In particular, if a senator who votes later in the order observes

that those before her have condoned a particular measure, she might infer that assenting to

the bill is in her own best interest–even if her initial instinct was to oppose it.

Example. Rather than explicate a general model that would simply duplicate existing

contributions, we confine ourselves to presenting a stylized example that conveys the intuition

for why herding and learning about common values might explain vote-order effects in the

Senate. Let there be three senators, i = 1, 2, 3, who must choose between voting for or

against a particular bill, vi ∈ {0, 1}. Each legislator receives a strictly positive payoff if her

choice matches the unknown state of the world ω ∈ {0, 1}, and zero otherwise. Initially,

all senators believe both states to be equally likely. They then receive private, independent

signals si ∈ {0, 1} that are correct with probability p > 12.25

Senators vote in the order of their index. As p > 12, senator 1 will always choose v1 = s1.

Senator 2 is, therefore, perfectly able to infer her colleague’s signal. She then updates her

belief about the true state of the world according to Bayes Rule. Whenever s2 = v1, she

concludes that Pr(ω = s2) > 12

and votes her signal. If s2 6= v1, i.e., if senator 2 knows there

to be two contradictory signals, her posterior belief is that both states are equally likely. She

then randomizes between playing v2 = s2 and v2 = v1.

Now consider senator 3’s problem, conditional on having observed senators 1 and 2. If her

colleagues made different choices, she knows that they must have received opposing signals.

25More precisely, p ≡ Pr(si = 1|ω = 1) = Pr(si = 0|ω = 0).

Since her copartisans’ signals cancel each other out, senator 3 chooses to rely on her private

information and votes v3 = s3. If, however, she observes that both of her colleagues voted in

the same way, she will generally follow suit.

To see why, note that her signal and that of senator 1 offset each other. Without observing

senator 2, she would believe both states of the world to be equally likely. However, as the vote

of senator 2 is also informative about the state of the world, she infers that Pr(ω = s2) > 12.

She, therefore, ignores her private information and adheres to the majority position.

The key takeaway from this simple example is that herding may occur because the votes

of earlier senators are informative about copartisans’ (common) best course of action. In-

tuitively, whenever senators look at fellow party members for cues, learning about common

values leads to positive serial correlation in choices. Moreover, given that those who vote

later in the order observe a greater number of their colleagues, they will, on average, have

a more precise posterior. Later voters should thus be more likely than earlier ones to ignore

their private information and go with the majority position in their own party.

Our empirical test of this mechanism probes the prediction that a senator’s own choice

depends causally on those of her colleagues who have already voted. In particular, we estimate

(3) yi,p,r = µi + θy<i,p,r + χi,p,r + ηi,p,r,

where yi,p,r is an indicator variable for whether senator i voted “yea” on roll call r, y<i,p,r

denotes the share of her copartisans who did so before it was i’s turn, and µi marks a senator

fixed effect. To account for the fact that senators have not only observed a different mix of

potential signals, but also differentially many, we include fixed effects for i’s position in the

vote order, χi,p,r.

The parameter of interest is θ. It measures the effect of copartisans’ observable choices on

those of senator i. If there is learning about common values, then, all else equal, we would

expect that a senator is more likely to say “yea” the greater the share of her colleagues who

have already chosen that position. That is, θ should be positive.

For θ to be consistently estimated, it must be the case that the error term in equation (3)

is uncorrelated with the regressors. This, however, is unlikely to be the case. Even if senator

i is not taking any cues, she is likely to assent to the same set of bills as her colleagues voting

before her–simply because members of the same party tend to have similar preferences

(Krehbiel 1993). Unobserved heterogeneity in the popularity of bills may, therefore, lead to

spuriously positive estimates of θ.26

26For instance, using the House as a placebo test, we obtain an estimate of θ in excess of .9, even thoughcongressmen generally do not submit their votes in alphabetical order.

Thus, instead of asking whether θ is statistically distinguishable from zero, we ask whether

it is larger than one would expect under the null hypothesis of no cue taking, conditional on

bill characteristics. If not, we are forced to conclude that the data are consistent with the

To obtain the distribution of the estimator under the null hypothesis, we build on an idea

in Scheinkman and LeBaron (1989) and resort to placebo techniques (see also Efron 1982, ch.

5). Specifically, we randomly assign senators to a particular spot in the vote order. Holding

their actual choices fixed, we reestimate equation (3) after replacing the original y<i,c with

that induced by the placebo ordering. Repeating this procedure sufficiently many times gives

rise to the null distribution. By construction, “randomly reshuffling” the vote order breaks

the serial dependence in senators’ choices. Under the null, nonzero values of the estimator

must be due to other, omitted variables or to random noise.27

Figure 11 shows the CDF of the placebo estimates. The upper panel relies on senators’

alphabetical ordering during all roll-call votes since the emergence of the two-party system.

In the lower panel, we rely on our hand-collected data on the sequence in which votes

were cast during the 112th Congress. As one might expect if copartisans’ preferences are

highly correlated, the median of each placebo distribution is much closer to one than to

zero. Strikingly, the original point estimate (depicted by the vertical line) is actually smaller

than the vast majority of values in the respective null distribution. The evidence provides,

therefore, no support for the herding mechanism.

In order to demonstrate that our placebo approach is not severely underpowered, we show

in Appendix F that it does detect negative serial correlation in defection among copartisans–

as predicted by systematic preemption in our model of “parties as teams.” Further, the

appendix reports results from a Monte Carlo study, which suggest that this approach reliably

detects herding in data sets considerably smaller than ours.

Lastly, and perhaps most importantly, we note that a herding mechanism cannot easily

explain why senators’ choices depend on the probability of casting a pivotal vote. Thus, we

not only fail to find evidence in support of learning about common values, but we can also

reject such an explanation on the basis of it being incomplete.

6.2. Vote Buying

The second alternative explanation we consider is vote buying (Dal Bo 2007; Dekel et al.

2008, 2009; Groseclose and Snyder 1996; Snyder 1990). One way to think about vote buying

is as one of several potential justifications for the payoff structure in our theory of “parties

27Among physicists, this approach is known as the method of surrogate data. It is commonly used to testfor nonlinearities in time series on superfluids, brain waves, and sunspots (see, e.g., Theiler et al. 1992).

as teams.” Suppose, for instance, that party leaders can write outcome-dependent contracts

with rank-and-file members. Rather than stipulating actual cash payments–which would be

illegal–such contracts may take on the form of “pork,” i.e., particularized benefits written

into the actual bill. Thus, at least a subset of agents would receive some benefit δ if and only

if the bill passes. Note, however, for negative vote-order effects to occur in such a setting, it

would still have to be the case that agents reason backwards.

Of course, if the party leadership could, it might want to offer payments that are conditional

on an individual senator’s choice instead of the overall outcome. The perhaps more relevant

question is, therefore, whether vote buying can explain vote-order effects without relying

on senators to backward induct. For such a theory to rationalize why agents later in the

vote order are more likely to support the party line than those with last names closer to

the beginning of the alphabet, it would have to be the case that, in equilibrium, the party

leadership is more likely to bribe the former.

A robust prediction of traditional vote-buying models is that the legislators who are

“bought off” are closer to being indifferent than those who do not receive any payments

(see, e.g., Dekel et al. 2008, 2009; Groseclose and Snyder 1996; Snyder 1990). The reason is

that it is cheaper to sway someone who feels less strongly about a particular issue. Thus,

for conventional vote-buying theories to explain the patterns in the data without appealing

to senators employing backward induction, one would have to believe that senators become

more likely to intrinsically support a measure when their alphabetical rank increases.

However, existing theories do not explicitly account for the fact that senators cast their

votes sequentially. It is, therefore, not clear whether the conclusions from these models do,

indeed, apply to roll calls in the Senate. In an effort to better understand whether rational

party leaders should target legislators who appear later in the alphabet, we have extended the

model of Dekel et al. (2009) to settings in which votes are submitted in order. In Appendix

G, we show that the basic insights gleaned from their static theory carry over to the dynamic

case. In particular, we prove that, in equilibrium, party leaders should target legislators based

on their preferences and independent of the vote order.

This leaves two salient possibilities for a vote buying mechanism to explain our findings:

(i) party leaders may be unaware of senators’ preferences (and make insufficient attempts to

become informed prior to the vote); or (ii) party leaders could be irrational. Consider, for

instance, the following decision rule. Suppose the leadership chooses to acquire little to no

information about the preferences of rank-and-file members. Instead of approaching the set

of senators who are the “cheapest,” party leaders may simply wait for the likely outcome of

the roll call to become apparent, and, if it is going to be close, pay whatever is necessary to

“convince” the agents who have yet to submit their vote.

Clearly, in settings in which it is nearly costless to acquire information about preferences,

such a strategy would not be optimal. Nonetheless, if party leaders did follow this or a similar

behavioral rule, then vote buying would give rise to negative order effects without relying

on backward reasoning.

Yet, models that emphasize decisions by the party leadership rather than individual sen-

ators have trouble explaining why we find substantial individual-level heterogeneity in how

rank-and-file members react to the probability of casting the decisive vote. Moreover, if vote

buying causes negative order effects without senators reasoning backwards, then it is not

clear why agents only respond to their position in the vote order after having participated in

sufficiently many roll calls. The result that experience matters, even conditional on seniority,

leads us to further discount the vote-buying mechanism.

7. Concluding Remarks

Game theory posits that agents try to anticipate the actions of others in order to maximize

their own payoffs. Nowhere else is this idea as purely embodied as in the principle of backward

induction in dynamic games of perfect information. Although backward induction provides

each player with an impeccable way to arrive at an optimal strategy, and despite the fact that

it is widely used to analyze games’ subgame-perfect equilibria, there remains one nagging

issue: when tested in the laboratory, the predictions of backward induction have not held up

to empirical scrutiny.

In this paper, we study backward reasoning in the “wild,” i.e., the United States Senate.

Our analysis builds on the fact that the Senate holds roll calls in alphabetical order. Ex-

ploiting quasi-random variation in the alphabetical composition of the chamber over time,

we document two striking facts. First, the majority party is considerably more likely to nar-

rowly win a roll-call vote than to narrowly lose it. Second, senators who appear early in the

vote order are more likely to deviate from the party line than those towards the end.

To rationalize these findings, we propose a theory of “parties as teams.” Our model assumes

that senators are position takers who, all else equal, would like to vote for the alternative

preferred by themselves or by their constituents. Yet, senators also care about the ultimate

outcome of the roll call–either because they are concerned about their party’s reputation

or due to the influence of party elites. For the subset of individuals whose own preferences

are not aligned with the position of their party, a conflict of interest arises. They would like

to defect, but only if their vote does not end up being decisive. In equilibrium, conflicted

senators are able to deviate without rendering the roll call lost if they appear early enough in

the vote order. Agents towards the end of the order, however, must stick with the party line

to still carry the bill to passage. Since agents only defect when their own vote is going to be

inconsequential, the theory predicts the relative dearth of narrow defeats for the majority.

Negative vote-order effects arise because senators backward induct in order to free ride on

their colleagues.

Our structural estimates imply that many, but by no means all, senators explicitly reason

backwards. Given the pervasiveness of failures of backward induction in the laboratory, this

result may be unexpected. At the same time, certain experimental results travel very well.

We find that free riding and strategic preemption increase as senators gain experience, which

complements existing evidence suggesting that the behavior of professionals is often more

consistent with the predictions of standard theory than that of novices.

More broadly, the evidence we present supports anecdotal accounts according to which

politicians engage in strategic behavior during legislative votes. Our results also speak to

a long-standing debate among political scientists about the extent of party influence (e.g.,

Krehbiel 1993; McCarty et al. 2001; Rhode 1991; Snyder and Groseclose 2000, among many

others). Our findings for the 112th Congress are consistent with the notion that parties are

influential enough to induce senators to follow the party line if need be. Party discipline is

not strong enough, however, to prevent legislators from opposing the party’s official position

if the opportunity permits.

References

Ali, S. N., andN. Kartik (2012). “Herding with Collective Preferences.” Economic Theory, 51(3): 601—626.

Asheim, G. B., and M. Dufwenberg (2003). “Deductive Reasoning in Extensive Games.” Economic

Journal, 113(487): 305—325.

Baghestanian, S., and S. Frey (2016). “GO figure: Analytic and Strategic Skills are Separable.” Journal

of Behavioral and Experimental Economics, 64: 71—80.

Banerjee, A. V. (1992). “A Simple Model of Herd Behavior.” Quarterly Journal of Economics, 107(3):

797—817.

Barber, M. and N. McCarty (2015). “Causes and Consequences of Polarization,” (pp. 15—58) in N.

Persily (ed.), Solutions to Political Polarization in America. New York: Cambridge University Press.

Bertrand, M., and S. Mullainathan (2004). “Are Emily and Greg More Employable than Lakisha

and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review, 94(4):

991—1013.

Bikhchandani, S., D. Hirshleifer, and I. Welch (1992). “A Theory of Fads, Fashion, Custom, and

Cultural Change as Informational Cascades.” Journal of Political Economy, 100(5): 992—1026.

Binmore, K., J. McCarthy, G. Ponti, L. Samuelson, and A. Shaked (2002). “A Backward Induction

Experiment.” Journal of Economic Theory, 104(1): 48—88.

Bock, R. D., R. Gibbons, and E. J. Muraki (1988). “Full Information Factor Analysis.” Applied Psy-

chological Measurement, 12(3): 261—280.

Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: Wiley.

Callander, S. (2007). “Bandwagons and Momentum in Sequential Voting.” Review of Economic Studies,

74(3): 653—684.

Camerer, C. (2003). Behavioral Game Theory. Princeton, NJ: Princeton University Press.

Cameron, A. C., and P. K. Trivedi (2005). Microeconometrics: Methods and Applications. New York:

Cambridge University Press.

Chiappori, P.-A., S. D. Levitt, and T. Groseclose (2002). “Testing Mixed-Strategy Equilibria When

Players Are Heterogeneous: The Case of Penalty Kicks in Soccer.” American Economic Review, 92(4):

1138—1151.

Clinton, J., S. Jackman, and D. Rivers (2004). “The Statistical Analysis of Roll Call Data.” American

Political Science Review, 98(2): 355—370.

Congressional Quarterly (1971). “Members Vote in Anonymity on Many Crucial Issues,” (p. 454) in

Congressional Quarterly Almanac, 1970. Washington, DC: Congressional Quarterly, Inc.

Crawford, V. P., M. A. Costa-Gomes, and N. Iriberri (2013). “Structural Models of Nonequilibrium

Strategic Thinking: Theory, Evidence, and Applications.” Journal of Economic Literature, 51(1): 5—62.

Dal Bo, E. (2007). “Bribing Voters.” American Journal of Political Science, 51(4): 789—803.

Dekel, E., M. O. Jackson, and A. Wolinsky (2008). “Vote Buying: General Elections.” Journal of

Political Economy, 116(2): 351—380.

, , and (2009). “Vote Buying: Legislatures and Lobbying.” Quarterly Journal of

Political Science, 4(2): 103—128.

Downs, A. (1957). An Economic Theory of Democracy. New York: Harper and Row.

Dufwenberg, M., R. Sundaram, and D. J. Butler (2010). “Epiphany in the Game of 21.” Journal of

Economic Behavior and Organization, 75(2), 132—143.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. Philadelphia: Society for

Industrial and Applied Mathematics.

Enelow, J. M. (1981). “Saving Amendments, Killer Amendments, and an Expected Utility Theory of

Voting.” Journal of Politics, 43(4): 1062—1089.

Fey, M., R. D. McKelvey, and T. R. Palfrey (1996). “An Experimental Study of Constant Sum

Centipede Games.” International Journal of Game Theory, 25(3): 269—287.

Fryer, R. G., and S. D. Levitt (2004). “The Causes and Consequences of Distinctively Black Names.”

Quarterly Journal of Economics, 119(3): 767—805.

Gill, D., and V. Prowse (2016). “Cognitive Ability, Character Skills, and Learning to Play Equilibrium: A

Level-k Analysis.” Journal of Political Economy, 124(6): 1619—1676.

Gneezy, U., A. Rustichini, and A. Vostroknutov (2010). “Experience and Insight in the Race Game.”

Journal of Economic Behavior and Organization, 75(2), 144—155.

Green, D. P., and I. Shapiro (1994). Pathologies of Rational Choice Theory. New Haven, CT: Yale

University Press.

Groseclose, T., and J. M. Snyder (1996). “Buying Supermajorities.” American Political Science Review,

90(2): 303—315.

, and J. Milyo (2010). “Sincere versus Sophisticated Voting in Congress: Theory and Evidence.”

Journal of Politics, 72(1): 60—73.

, and (2013). “Sincere versus Sophisticated Voting when Legislators Vote Sequentially.”

Social Choice and Welfare, 40(3): 745—751.

Hsu, S., C. Huang, and C. Tang (2007). “Minimax Play at Wimbledon: Comment.” American Economic

Review, 97(1): 517—523.

Iaryczower, M., and S. Oliveros (2015). “Competing for Loyalty: The Dynamics of Rallying for Sup-

port.” mimeographed, Princeton University.

Jackman, S. (2001). “Multidimensional Analysis of Roll Call Data via Bayesian Simulation: Identification,

Estimation, Inference, and Model Checking.” Political Analysis, 9(3): 227—241.

Jenkins, J. A., and M. C. Munger (2003). “Investigating the Incidence of Killer Amendments in

Congress.” Journal of Politics, 65(2): 498—517.

Johnson, E. J., C. Camerer, S. Sen, and T. Rymon (2002). “Detecting Failures of Backward Induction:

Monitoring Information Search in Sequential Bargaining.” Journal of Economic Theory, 104(1): 16—47.

Landha, K. K. (1994). “Coalitions and Congressional Voting.” Public Choice, 78(1): 43—63.

Levitt, S. D. (1996). “How Do Senators Vote? Disentangling the Role of Voter Preferences, Party Affiliation,

and Senator Ideology.” American Economic Review, 86(3): 425—441.

, and J. A. List (2007). “What Do Laboratory Experiments Measuring Social Preferences Reveal

About the Real World?” Journal of Economic Perspectives, 21(2): 153—174.

, J. A. List, and S. E. Sadoff (2011). “Checkmate: Exploring Backward Induction among Chess

Players.” American Economic Review, 101(2): 975—990.

List, J. A. (2003). “Does Market Experience Eliminate Market Anomalies?” Quarterly Journal of Eco-

nomics, 118(1): 41—71.

(2004). “Neoclassical Theory Versus Prospect Theory: Evidence from the Marketplace.” Economet-

rica, 72(2): 615—625.

Londregan, J. (2000). Legislative Institutions and Ideology in Chile’s Democratic Transition. New York:

Cambridge University Press.

Kadane, J. B., and P. D. Larkey (1983). “The Confusion of Is and Ought in Game Theoretic Contexts.”

Management Science, 29(12): 1365—1378.

Kawai, K., and Y. Watanabe (2013). “Inferring Strategic Voting.” American Economic Review, 103(2):

624—662.

Koempel, M. L., J. R. Straus, and J. Schneider (2008). “Recorded Voting in the House of Represen-

tatives.” Congressional Research Service Report RL34570.

Kovash, K., and S. D. Levitt (2009). “Professionals Do Not Play Minimax: Evidence from Major League

Baseball and the National Football League.” NBER Working Paper No. 15347.

Krehbiel, K. (1993). “Where’s the Party?” British Journal of Political Science, 23(2): 235—266.

(1998). Pivotal Politics. Chicago: University of Chicago Press.

Kreps, D. M. (1990). Game Theory and Economic Modelling. New York: Oxford University Press.

, P. Milgrom, J. Roberts, andR. Wilson (1982). “Rational Cooperation in the Finitely Repeated

Prisoners’ Dilemma.” Journal of Economic Theory, 27(2):245—252.

Kuhn, H. W. (1953). “Extensive Games and the Problem of Information,” (pp. 193—216) in H. W. Kuhn,

and A. Tucker (eds.), Contributions to the Theory of Games, Vol. 2. Princeton, NJ: Princeton University

Press.

McCarty, N., K. T. Poole, H. Rosenthal (2001). “The Hunt for Party Discipline in Congress.” Amer-

ican Political Science Review, 95(3): 673—687.

, , and (2016) Polarized America: The Dance of Unequal Riches. Boston: MIT Press.

Magleby, D. B., B. P. Montagnes, and J. L. Spenkuch (2015) “What’s in a Name? Strategic Voting

and Moral Hazard in the U.S. Senate,” mimeographed, Northwestern University.

Mayhew, D. R. (1974). Congress: The Electoral Connection. New Haven, CT: Yale University Press.

McCrary, J. (2008). “Manipulation of the Running Variable in the Regression Discontinuity Design: A

Density Test.” Journal of Econometrics, 142(2): 698—714.

McKelvey, R. D., and T. R. Palfrey (1992). “An Experimental Study of the Centipede Game.” Econo-

metrica, 60(4): 803—836.

Nagel, R., and F. F. Tang (1998). “Experimental Results on the Centipede Game in Normal Form: An

Investigation on Learning.” Journal of Mathematical Psychology, 42(2-3): 356—384.

Palacios-Huerta, I. (2003). “Professionals Play Minimax.” Review of Economic Studies, 70(2): 395—415.

, andO. Volij (2008). “Experientia Docet: Professionals Play Minimax in Laboratory Experiments.”

Econometrica, 76(1): 71—115.

, and O. Volij (2009). “Field Centipedes.” American Economic Review, 99(4): 1619—1635.

Peress, M. (2012). “Identification of a Semiparametric Item Response Model.” Psychometrika, 77(2): 223—

Poole, K. T. (2000). “Nonparametric Unfolding of Binary Choice Data.” Political Analysis, 8(3): 211—237.

(2005). Spatial Models of Parliamentary Voting. Cambridge, UK: Cambridge University Press.

, and H. Rosenthal (1997). Congress: A Political-Economic History of Roll Call Voting. New York:

Oxford University Press.

Rapoport, A., W. E. Stein, J. E. Parco, and T. E. Nicholas (2003). “Equilibrium Play and Adaptive

Learning in a Three-Person Centipede Game.” Games and Economic Behavior, 43(2): 239—265.

Reny, P. J. (1992). “Rationality in Extensive-Form Games.” Journal of Economic Perspectives, 6(4): 103—

Rhode, D. W. (1991). Parties and Leaders in the Postreform House. Chicago: University of Chicago Press.

Riker, W. H. (1982). Liberalism Against Populism. San Francisco: W.H. Freeman.

(1986). The Art of Political Manipulation. New Haven, CT: Yale University Press.

Rivers, D. (2003). “Identification of Multidimensional Spatial Voting Models.” mimeographed, Stanford

University.

Rybicki, E. (2013). “Voting and Quorum Procedures in the Senate.” Congressional Research Service Report

96-452.

Scheinkman, J. A., and B. LeBaron (1989). “Nonlinear Dynamics and Stock Returns.” Journal of Busi-

ness, 62(3): 311—337.

Selten, R. (1965). “Spieltheoretische Behandlung eines Oligopolmodels mit Nachfragetragheit.” Zeitschrift

fur die gesamte Staatswissenschaft, 121: 301—324 and 667—689.

Simon, H. A. (1955). “A Behavioral Model of Rational Choice.” Quarterly Journal of Economics, 69(1):

99—118.

Snyder, J. M. (1990). “On Buying Legislatures.” Economics and Politics, 3(2): 93—109.

, and T. Groseclose (2000). “Estimating Party Influence in Congressional Roll-Call Voting.”

American Journal of Political Science, 44(2): 193—211.

, and M. M. Ting (2002). “An Informational Rationale for Political Parties.” American Journal of

Political Science, 46(1): 90—110.

Spenkuch, J. L. (2014). “Backward Induction in the Wild: Evidence from the U.S. Senate.” mimeographed,

Northwestern University.

(2015). “(Ir)rational Voters?” mimeographed, Northwestern University.

Theiler, J., S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer (1992). “Testing for Nonlin-

earity in Time Series: The Method of Surrogate Data.” Physica D, 58: 77-94.

Theriault, S. (2008).Party Polarization in Congress. New York: Cambridge University Press.

Thurstone, L. L. (1947). Multiple Factor Analysis. Chicago: Universiyt of Chicago Press.

Volden, C. (1998). “Sophisticated Voting in Supermajoritarian Settings.” Journal of Politics, 60(1): 149—

von Neumann, J., and O. Morgenstern (1944). Theory of Games and Economic Behavior. Princeton,

NJ: Princeton University Press.

Walker, M., andWooders, J. (2001). “Minimax Play at Wimbledon.” American Economic Review, 91(5):

1521—1538.

Welch, I. (1992). “Sequential Sales, Learning, and Cascades.” Journal of Finance, 47(2): 695—732.

Wilkerson, J. D. (1999). “Killer Amendments in Congress.” American Political Science Review, 93(3):

535—552.

Zauner, K. G. (1999). “A Payoff Uncertainty Explanation of Results in Experimental Centipede Games.”

Games and Economic Behavior, 26(1): 157—185.

A. All Roll Calls

B. Close Roll Calls

C. Lopsided Roll Calls

Figure 1: Probability of Deviating from the Party Line, U.S. Senate

Notes: Figure shows the average frequency with which senators deviate from the majority of their copartisans, depending on their position in the vote order. The upper panel considers all roll calls during the 35th to 112th Congresses. The remaining two panels restrict attention to close and lopsided roll calls, respectively. Roll calls are classified as either "lopsided" or "close" according to the definition in Snyder and Groseclose (2000).

Figure 2: Example of Sequential Voting Game

Notes: Figure shows an example of the sequential voting game in Section 4 with one party and three players, all of whom receive payoff α = -1 if they vote "yea" and δ = 2 if the bill ends up being approved. Two "yea" votes are needed for passage. The thick lines indicate each player's optimal action at a particular node in the game tree.

Figure 3: Empirical CDF of Placebo Estimates for λ in Equation (1)

Notes: Figure shows the empirical cumulative distribution function for placebo estimates of λ in equation (1), based on 10,000 randomly generated orderings. The vertical line indicates the point estimate in the original data.

Figure 4: Distribution of Excess Votes in Favor of Majority Party's Position

Notes: Figure shows the frequency of excess votes (relative to the threshold required for passage) in favor of the position held by the Senate's majority party, as well as the estimated density function and the associated 95%-confidence intervals. The underlying data come from roll calls in the U.S. Senate that required a simple majority and were held during the 35th–112th Congresses. Density estimates are based on local linear regressions with a bandwidth of 4, applied separately on each side of the cutoff. See McCrary (2008) for details on the estimation procedure.

Figure 5: Defection Rates as a Function of Pivotality, 112th Congress

Notes: Figure depicts the relationship between deviations from the party line and the empirical frequency of casting a pivotal vote. The latter is defined as the overall share of instances in which a different choice would have changed the outcome of the roll call, conditional on the number of "yeas" and "nays" that were still required when a particular senator voted as well as the number of copartisans choosing after her.

Figure 6: Posterior Mean of Senators' Ideal Points vs DW-Nominate Scores

Notes: Figure plots the posterior mean of senators' ideal points in our baseline structural model against the left–right dimension of DW-Nominate scores. "D" and "R" respectively denote Democratic and Republican senators. Estimates are based on 1,000,000 MCMC draws.

Figure 7: Posterior Distribution of δ, Baseline Model

Notes: Figure depicts the marginal posterior density of δ in our baseline structural model (solid line) as well as the associated prior (dashed line). The thick line indicates the 90% highest posterior density region. Estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .03.

Figure 8: Posterior Distribution of δ after Accounting for Myopic Senators

Notes: Figure shows the marginal posterior density of δ in our extended model (solid line) as well as the associated prior (dashed line). The thick line indicates the 90% highest posterior density region. Estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .1.

A. Posterior Share of Myopic Senators

B. Posterior Share of Backward-Reasoning Senators

C. Posterior Share of "Intermediate Type"

Figure 9: Marginal Posterior of Type Shares

Notes: Figure shows the marginal posterior distributions for the type shares in our extended model (solid lines) as well as the associated priors (dashed lines). The upper panel refers to the share of myopic agents (π1). The middle panel shows the fraction of backward inductors (π2), while the lower one pertains to the share of senators using the heuristic outlined in the main text (π3). All estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .005.

Figure 10: Order Effects, by Senators' Prior Experience

Notes: Figure shows point estimates and the associated 95%-confidence intervals for λ in equation (1), depending on the number of roll calls in which senators had previously participated. Estimates are based on senators' roll call–specific rank. Confidence intervals account for heteroskedasticity and clustering of the residuals at the Congress level.

≤ 100 100 to 500 500 to 1,000 1,000 to 2,000 2,000 to 5,000 ≥ 5,000

Number of Roll Calls in which Senator Previously Participated

A. 35th–112th Congresses

B. 112th Congress

Figure 11: Null Distribution for θ in Equation (3)

Notes: Figure shows the empirical cumulative distribution function of θ in equation (3) under the null hypothesis of no herding (solid line) vis-à-vis the actual point estimate (dashed line). The upper panel relies on senators' roll call–specific alphabetical ordering and uses data on all roll-call votes during the 35th–112th Congresses. The lower panel restricts attention to the 122th Congress and relies on the exact order in which votes ended up being cast. Both distributions are based on 10,000 randomly generated placebo orderings.

Variable Mean Median SD Min Max

Congress Level (N = 78):Number of Roll Calls 511.9 492 272.2 84 1,311Number of Distinct Senators 95.45 101 12.18 54 111Number of Distinct Democrats 47.81 48 15.24 10 82Number of Distinct Republicans 46.41 47 10.33 16 67

Roll Call Level (N = 39,929 ):Number of Valid Votes 76.10 85 21.85 14 100Outcome "Close" .500 0 .500 0 100Outcome "Lopsided" .500 1 .500 0 100Divisive .564 1 .496 0 100

Vote Level (N = 2,897,879) :Alphabetical Rank .500 .500 .292 0 1Deviation from Party Line .184 0 .388 0 1

Table 1: Summary Statistics for Roll-Call Votes in the U.S. Senate, 1857–2013

Notes: Entries are descriptive statistics for the most important variables used throughout the analysis. For precise definitions of all variables, see the Data Appendix.

(1) (2) (3) (4) (5) (6)

Alphabetical Rank -.172*** -.115** -.030*** -.221*** -.511** -.114*(.055) (.049) (.011) (.071) (.255) (.065)

Controls:Senator Fixed Effects Yes No No Yes No NoSenator × Congress Fixed Effects No Yes Yes No Yes YesFraction of Copartisans Deviating No No Yes No No Yes

R-Squared .055 .074 .241 .055 .074 .241Number of Observations 2,897,879 2,897,879 2,897,879 2,897,879 2,897,879 2,897,879

Table 2: Likelihood of Deviating from the Party Line as a Function of Alphabetical Rank, U.S. Senate

Deviate

Notes: Entries are coefficients and standard errors from estimating equation (1) by ordinary least squares. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. As explained in the main text, Alphabetical Rank corresponds to senators' alphabetical percentile ranking among their colleagues. The three left-most columns are based on senators' alphabetical rank among those who participated in a given roll call, whereas the three right-most columns construct rank based on the entire set of senators who served in Congress at the time a given roll call was held. ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

A. Roll Call Specific Order B. Order Among All Senators

A. Senate

(1) (2) (3) (4) (5) (6)

Alphabetical Rank -.172*** -.030*** -.053 -.018 -.313*** -.046***(.055) (.011) (.038) (.016) (.090) (.015)

All All Lopsided Lopsided Close CloseRoll Calls Roll Calls Roll Calls Roll Calls Roll Calls Roll Calls

Controls:Senator Fixed Effects Yes No Yes No Yes NoSenator × Congress Fixed Effects No Yes No Yes No YesFraction of Copartisans Deviating No Yes No Yes No Yes

B. House of Representatives

(7) (8) (9) (10) (11) (12)

Alphabetical Rank -.074 .034 -.152 -.009 -.025 .047(.174) (.021) (.199) (.019) (.231) (.043)

Controls:Senator Fixed Effects Yes No Yes No Yes NoSenator × Congress Fixed Effects No Yes No Yes No YesFraction of Copartisans Deviating No Yes No Yes No Yes

R-Squared .047 .231 .029 .247 .144 .271Number of Observations 9,618,470 9,618,470 5,230,083 5,230,083 4,388,387 4,388,387Notes: Entries are coefficients and standard errors from estimating equation (1) by ordinary least squares. The upper panel does so for the U.S. Senate, while the entries in the lower panel refer to the House of Representatives after the introduction of electronic voting machines. Estimates are based on legislators' roll call–specific alphabetical rank. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. As explained in the main text, roll calls are classified as "close" or "lopsided" according to the cutoffs in Snyder and Groseclose (2000). ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

Table 3: Deviations from the Party Line as a Function of Alphabetical Rank, U.S. Senate & House of Representatives

Deviate

Sample

Deviate

Sample

Senator Characteristic (1) (2) (3) (4)

< 100 Roll Calls Prior Experience .118* .137* .147* .121(.065) (.078) (.079) (.087)

Age .002 .002 .002(.003) (.003) (.003)

Female .052 .064(.078) (.078)

Republican .076(.055)

Average of Dependent Variable .525 .525 .525 .525R-Squared .032 .036 .042 .064Number of Observations 102 102 102 102

Table 4: Correlates of Myopic Play

Mean Posterior Probability of Myopic Type

Notes: Entries are coefficients and standard errors from regressing the estimated mean posterior probability that a particular senator is myopic on different sets of covariates. Heteroskedasticity robust standard errors are reported in parentheses. ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

Online Appendix to

“Backward Induction in the Wild?

Evidence from Sequential Voting in the U.S. Senate”

Contents

A Omitted Proofs 4

A.1 Controlling for the Votes of Copartisans Introduces Bias . . . . . . . . . . . 4

A.2 Proof of Proposition 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

A.5 Vote Order Effects with S = 5 and Overlapping Ideal Points . . . . . . . . . 7

B Parties as Teams with Low Enough Uncertainty 8

C Parties as Teams with Delay and Vote Changes 9

D Simulation Results and Reduced-Form Comparative Statics 12

D.1 Simulated Comparative Statics . . . . . . . . . . . . . . . . . . . . . . . . . 12

D.2 Additional Reduced-Form Evidence . . . . . . . . . . . . . . . . . . . . . . . 13

E Structural Estimation: Additional Details 14

E.1 Estimation Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

E.2 Additional Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

F Detecting Herding: Some Monte Carlo Evidence 17

F.1 Evidence on Systematic Preemption . . . . . . . . . . . . . . . . . . . . . . . 18

F.2 Monte Carlo Evidence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

G A Model of Sequential Vote Buying 20

H Data Appendix 24

H.1 Universe of Roll Calls, 1857—2013 . . . . . . . . . . . . . . . . . . . . . . . . 24

H.2 Roll Calls in the 112th Congress . . . . . . . . . . . . . . . . . . . . . . . . . 25

H.3 Senator Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

References 26

List of Figures

A.1 Empirical CDF of Placebo Estimates for λ, Controlling for Copartisans’ Choices 28

A.2 Winning Margings, by Party Affiliation of Vice President . . . . . . . . . . . 29

A.3 Estimated Order Effects by Senators’ Prior Experience, Controlling for Time

Served in the Senate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

A.4 Example of Sequential Voting Game with Uncertainty . . . . . . . . . . . . . 31

A.5 Example of Nonmonotonic Defection . . . . . . . . . . . . . . . . . . . . . . 32

A.6 Simulation Results for Situations when Only Majority Party Senators are

Conflicted . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

A.7 Simulation Results for Situations when Senators from Both Parties are Con-

flicted . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

A.8 Simulation Results for Situations in which Both Parties are Split and Almost

Equal in Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

A.9 Distribution of the Majority Party’s Roll Call-Specific Seat Advantage, U.S.

Senate 1857—2013 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

A.10 Deviations from the Party Line in the Raw Data, by Majority Party’s Seat

Advanatge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

A.11 Estimated Order Effects, by Decade . . . . . . . . . . . . . . . . . . . . . . . 38

A.12 Iterative History of δ in the MCMC Algorithm . . . . . . . . . . . . . . . . . 39

A.13 Iterative History of Type Shares in the MCMC Algorithm . . . . . . . . . . 40

A.14 Posterior Mean of Senators’ Ideal Point, Allowing for 3 Types . . . . . . . . 41

A.15 Posterior Distribution of δ in a Model without “Intermediate Type” . . . . . 42

A.16 Posterior Mean of Senators’ Ideal Point, Allowing for Rational and Myopic

Types Only . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

A.17 Posterior of Type Shares in a Model without “Intermediate Types” . . . . . 44

A.18 Posterior of β, Pooling over Roll Calls . . . . . . . . . . . . . . . . . . . . . 45

A.19 Posterior of γ, Pooling over Roll Calls . . . . . . . . . . . . . . . . . . . . . . 46

A.20 Null Distribution for ψ in Equation (E.1) . . . . . . . . . . . . . . . . . . . . 47

List of Tables

A.1 Deviations from the Party Line, by Voting Position . . . . . . . . . . . . . . 48

A.2 Estimated Order Effects for Alternative Definitions of the Party Line . . . . 49

A.3 Estimated Order Effects for Roll Calls with Little Abstention . . . . . . . . . 50

A.4 Deviations from the Party Line as a Function of Alphabetical Rank among

All Members of the Chamber . . . . . . . . . . . . . . . . . . . . . . . . . . 51

A.5 Reduced-Form Comparative Statics . . . . . . . . . . . . . . . . . . . . . . . 52

A.6 Order Effects, by Senators’ Characteristics . . . . . . . . . . . . . . . . . . . 53

A.7 Monte Carlo Rejection Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

A.8 Preference Configurations and Deviations with S = 5 . . . . . . . . . . . . . 55

Appendix A: Omitted Proofs

A.1. Controlling for the Votes of Copartisans Introduces Bias

In Section 3.2, we state that when defection by one senator prevents some of her colleagues from

deviating from the party line as well, then controlling for the votes of copartisans introduces upward

bias into the OLS estimates. Here, we state this claim more formally and prove it.

Consider the following econometric model:

di,c = α+ βoi,c + γd−i,c + ηi,c,

where d−i,c denotes the share of i’s copartisans who deviated on roll call c, and all other symbols

are as defined in the main text. If senators preempt each other, then the OLS estimator of β is

generally biased. In particular, if agents behave as in our model and oi,c is not correlated with ηi,c,

then plim βOLS > β.

To see this, let oi,c = r0 + r1d−i,c + R be the linear projection of oi,c on the space spanned by

d−i,c and a constant, and let di,c = q0 + q1d−i,c + Q be the linear projection of di,c on the same

space. By the Frisch-Waugh Theorem (Frisch and Waugh 1933)

plim βOLS =Cov(Q,R)

Var (R)

= β +Cov(ηi,c, oi,c − r1d−i,c)

Var (R).

Hence, βOLS is unbiased only if Cov(ηi,c, oi,c − r1d−i,c) = 0.

Strategic preemption implies that Cov(d−i,c, ηi,c

)< 0. The reason is that if a particular senator

is conflicted and defects, then, on average, fewer of her copartisans will be free to deviate before

the roll call is lost. When Cov(ηi,c, oi,c) = 0, we have Cov(ηi,c, oi,c − r1d−i,c) = −r1 Cov(ηi,c, d−i,c).

By construction, r1 ≡Cov(d−i,c,oi,c)

Var(d−i,c). If senators preempt each other, then r1 > 0. This is because

the negative vote order effects in our model imply that senators who vote later in the order are

more likely to vote with the party line. Thus, d−i,c is, on average, smaller when oi,c is small, i.e.,

Cov(d−i,c, oi,c

)> 0. As a result, −r1 Cov(ηi,c, y−i,c) > 0 and plim βOLS > β, as desired.

A.2. Proof of Proposition 1

Our proof of Proposition 1 proceeds in two steps. First, we show that the proposed strategy is

subgame-perfect. We then show that, generically, the game admits a unique SPNE.

Subgame-Perfection. To see that the stratgey specified in Proposition 1 is a Nash equilibrium,

consider any node at which a conflicted Democrat chooses, and suppose that all others continue to

play their equilibrium strategies.

If |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 > y − yi, then, by construction of Y and Y , the Democratic party will

win the roll call even if senator i deviates. This is because there are enough others in Y who will

vote subsequently and stick with the party line if need be. Put differently, if |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+

1 > y − yi and everybody plays their equilibrium strategies, then it can never be the case that

|Y |i′′>i′ +∣∣∣Y ∣∣∣

i′′>i′+ 1 < y − yi′ for any i′ ∈ D with i′ > i, which means that the Democrats must

win. A conflicted senator anticipates this and, therefore, defects.

If |Y |i′>i+∣∣∣Y ∣∣∣

i′>i+ 1 < y−yi, however, then the Democratic party cannot win the roll call, even

if i votes “yea.” This is because |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 < y− yi implies |N |i′>i +

∣∣∣N ∣∣∣i′>i

+ 1 ≥ n− ni,which in turn means that if everybody else plays their equilibrium strategies, then the Republicans

can guarantee themselves victory. Since a conflicted Democrat cannot affect the overall outcome of

the roll call, it is optimal to defect whenever |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 < y − yi.

If |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 = y − yi, then conflicted Senators must vote with the party line or else

the roll call will be lost. By way of contradiction, suppose a conflicted Democrat voted “nay.”

If there is no other (conditional) “yea”-voter after i, i.e., if∣∣∣Y ∣∣∣

i′>i+ |Y |i′>i = 0 , then defecting

will immediately cause the roll call to be lost since |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 = y − yi implies that

y − yi > 0. If there is another (conditional) “yea”-voter following i, say i′, it will be the case that

|Y |i′′>i′ +∣∣∣Y ∣∣∣

i′′>i′+ 1 < y− yi′ and |N |i′′>i′ +

∣∣∣N ∣∣∣i′′>i′

+ 1 ≥ n−ni′ , which, based on the argument

above, also implies that the Republican Party would win for sure. Thus, conflicted Democrats find

it optimal to stick with the party line.

After replacing Y (Y ) with N (N) and y (y) with n (n), the same arguments apply for conflicted

Republicans. Since we considered an arbitrary node, we can start at the end of the game and

proceed by backward induction to show that the strategies in Propostion 1 are subgame-perfect,

as desired.

Uniqueness. Generically, it will be the case that αi 6= 0 and |αi| 6= |δp| for all players and

parties, which implies that mixing is not optimal for the last player. Thus, the second-to-last

player’s vote either changes the outcome for sure, or it will be inconsequential with certainty. Since,

generically, αi 6= 0 and |αi| 6= |δp|, the second-to-last player strictly prefers one of his actions

over the other. Proceeding along the same lines, no other player will be indifferent between “yea”

and “nay.” This shows that any subgame-perfect equilibrium must be in pure strategies. Since the

number of players is finite, backward induction terminates and it produces a unique subgame-perfect

equilibrium.

Our proof of Proposition 2 proceeds in four steps. First, we establish a lemma that clarifies the role

of nonoverlapping preferences. We then prove (a)—(c), which together establish that, for each roll

call, there is a cutoff value for defection.

Lemma: If maxi∈D xi < mini∈R xi, then it cannot be the case that members of both parties are

conflicted with respect to the same roll call vote.

Let s ≡ arg maxi∈D xi and t ≡ arg mini∈R xi. Suppose by way of contradiction that our premise

holds, but there are members of both parties that are conflicted. Having conflicted members of

both parties implies there exists some i ∈ D, and j ∈ R such that αi < 0 < αj . For the conflicted

Democrat, αi < 0 implies that xi is closer to ζ than ξ. Since ξ < ζ and xs ≥ xi then xs is closer to

ζ than ξ. Similarly, for the conflicted Republican, αj > 0 implies that xj is closer to ξ than ζ. Since

ξ < ζ and xt ≤ xj then xt is closer to ξ than ζ. Taken together we have xs is closer to ζ than ξ and

xt is closer to ξ than ζ and ξ < ζ which implies xs > xt which is a contradiction of our premise.

Case (a): If |Y | < y < |Y | +∣∣∣Y ∣∣∣, then, by the lemma, N = ∅, and at least one conflicted

majority party senator must vote with the party line for the bill to pass. Let q =∣∣∣Y ∣∣∣+ |Y |− y, and

let q = 1, . . . ,∣∣∣Y ∣∣∣ index all conflicted senators according to the order in which they vote. Clearly,

if all q > q adhere to the equilibrium strategies in Proposition 1, then any q ≤ q will be able to

defect without rendering the roll call lost. They will, therefore, do so. Since all q ≤ q defect along

the equlibrium path, any q > q must vote with the party line. From 1 ≤ q <∣∣∣Y ∣∣∣, it then follows

that ι, the position of q in the overall vote order, is 1 ≤ ι < S.

The proof for |N | < n < |N |+∣∣∣N ∣∣∣ proceeds along similar lines.

Case (b): If |Y | < |Y |+∣∣∣Y ∣∣∣ = y, then the lemma implies that N = ∅. Hence, all i ∈ Y must

vote with the party line for the majority party to win. By Proposition 1 no conflicted majority

party members defects along the equilibrium path, and ι = 0.

The proof for |N | < |N |+∣∣∣N ∣∣∣ = n proceeds along similar lines.

Case (c): If |Y | ≥ y, then the majority party wins the roll call with certainty, even if all of

its conflicted members defect and if all conflicted minority party senators voted with their party.

Hence, along the equilibrium path, all senators are free to defect, which implies that ι = S.

The proof for |N | ≥ n proceeds along similar lines.

If |Y |+∣∣∣Y ∣∣∣ ≥ y, then for i = 1 we have that |Y |i′>1 +

∣∣∣Y ∣∣∣i′>1

+ 1 ≥ y. If all majority-party senators

play their equilibrium strategies, then |Y |i′′>i′ +∣∣∣Y ∣∣∣

i′′>i′+ 1 ≥ y − yi′ for all i′ ∈ D. Thus, the

majority party wins.

If |Y |+∣∣∣Y ∣∣∣ < y, however, let i = mini∈D i, then we have that |Y |i′>i +

∣∣∣Y ∣∣∣i′>i

+ 1 < y, which in

turn implies that, along the equilibrium path, |Y |i′′>i′ +∣∣∣Y ∣∣∣

i′′>i′+ 1 < y − yi′ for all i′ ∈ D. As a

consequence, the majority party loses. This shows that the majority wins if and only if |Y |+∣∣∣Y ∣∣∣ ≥ y ,

as desired.

To see that the bill will receive exactly y “yea”-votes whenever maxi∈D xi < mini∈R xi and

|Y | < y ≤ |Y | +∣∣∣Y ∣∣∣, note that, by the lemma above, N = ∅. Further, note that we are either in

case (a) or (b) of Proposition 2. If the latter, then, in equilibrium all i ∈ Y ∪ Y vote “yea” (i.e.,

they support the party line), while all i ∈ N vote “nay.” This results in y total “yeas.” If we are

in case (a) of Proposition 2, then the first ι =∣∣∣Y ∣∣∣ + |Y | − y conflicted senators find it optimal to

defect, while all remaining ones stick with the party line. By construction of ι and given that all

i ∈ N say “nay,” we, again, arrive at y total “yeas”-votes.

Now, suppose |Y |+∣∣∣Y ∣∣∣ < y, which implies that the majority loses. Along the equilibrium path,

any i ∈ Y defects. If∣∣∣Y ∣∣∣ ≥ 1 then N = ∅, and the bill will only get |Y | votes. Thus, if |Y |+

∣∣∣Y ∣∣∣ < y

the majority falls more than∣∣∣Y ∣∣∣ votes short. If

∣∣∣Y ∣∣∣ = 0, then, given that the majority loses, the

claim that it falls at least∣∣∣Y ∣∣∣ votes short of y is trivially true.

A.5. Vote Order Effects with S = 5 and Overlapping Ideal Points

To show that the assumption of nonoverlapping ideal points is not strictly necessary for Proposition

2(a), which predicts negative vote-order effects, we now consider a variant of our model of “parties

as teams” with five senators, i.e., S = 5, and no such restriction on preferences.

Note, regardless of whether senators’ ideal points do or do not overlap, every senator is either an

unconflicted yea voter (Y ), a unconflicted nay voter (N), a conflicted yea voter (Y ), or a conflicted

nay voter (N). Furthermore, conditional on knowing a senator’s type, αi and δ contain no relevant

information for determining her choices. Thus, equilibrium behavior is uniquely determined the by

the “preference configuration,” i.e., the profile of senator’s types and the order in which they vote.

We assume simple majority rule, i.e., y = n = 3, and restrict attention to all 252 configurations

which correspond to the case considered in Proposition 2(a), i.e., |Y | < y < |Y |+∣∣∣Y ∣∣∣ or |N | < n <

|N | +∣∣∣N ∣∣∣. Consistent with the language in the main text, we say that a senator “deviates” if she

belongs to Y and votes “nay” or if she belongs to N and chooses “yea.”

Appendix Table A.8 lists all preference configurations for which |Y | < y < |Y | +∣∣∣Y ∣∣∣ (left-most

colums), the equilibrium choices of agents (middle columns), as well as indicators that highlight

deviations (right-most columns). (Limiting ourselves to |Y | < y < |Y |+∣∣∣Y ∣∣∣ is without further loss

of generality as results for the case of |N | < n < |N |+∣∣∣N ∣∣∣ can be obtained by a simple relabeling.)

Counting the number of deviations at each position in the vote order shows that they are more

frequent early in the order than late. Thus, if one believes that all preference profiles are a priori

equally likely, then one would expect to find negative vote-order effects on average.

Put differently, a Bayesian with no prior information on which preference profiles are more prev-

elant among senators than others would aggregate over the set of possible configurations by taking

simple means, which would result in a posterior with monotonically declining deviation rates.

Of course, there exist preference profiles for which deviations are nonmonotonic in the vote

order, and even a (smaller) number of configurations in which only senators at the end of the order

deviate. Thus, if one believes that the latter configurations are sufficiently more likely to occur

than others, then negative vote-order effects need not necessarily obtain. Nonetheless, the point of

our argument is that even a model that allows for the ideal points of senators to overlap has the

potential to rationalize our main empirical finding. Nonoverlapping ideal points are sufficient to

generate negative vote-order effects but not necessary.

Appendix B: Parties as Teams with Low Enough Uncertainty

This appendix illustrates by way of example how uncertainty may lead to situations in which

senators early in the vote order may prefer to “play it safe” rather than deviate from the party

line. At the same time, the appendix demonstrates that if uncertainty is low enough (relative to

the gains from deviating), then conflicted senators’ optimal strategies coincide with those in the

common knowledge environment considered in Proposition 1.1

To see that there may be situations in which senators prefer to “play it safe” rather than preempt

their colleagues, consider the game in Figure A.4. Here, the Democratic party requires one more

“yea” for the bill to pass. Senator D1 is conflicted in the sense that she would like to vote “nay,”

but only if her vote was not pivotal. In one state of nature, Senator D2’s disdain for the bill is

so strong that she will vote “nay” regardless of whether this causes the bill to fail. In the other

state of nature, D2 would support the bill if need be. Alternatively, one can think of the states of

nature as determining whether D2 is conflicted and rational. In the first state, D2 is conflicted but

irrational in the sense that she would always vote “nay.” In the second state, D2 is rational and

plays the equilibrium strategy specified by Proposition 1. Figure A.4 is, therefore, also an example

of behavior when common knowledge of rationality fails.

Let q be the probability that D2 defects no matter what, i.e., q ≡ Pr (α2 < −δ). Given the payoffs

specified in the figure, D1 deviates from the party line if q < 12 . More generally, there will be a

cutoff value p = αδ , such that D1 finds it worthwhile (in expectation) to defect if only if q < p. As

a consequence, there may be nodes in the game tree at which senators “play it safe.”

To see that “low enough uncertainty” would lead to senators adopting the same strategies as in

the common knowledge environment, focus on a conflicted Democrat, i, and fix her position-taking

payoff from voting “yea” αi = α < 0. Further, let δ ≡ δD and assume a constant probability of

“mistakes”, i.e., that senators further down in the vote order deviate from their party’s position

with probability q even if that caused the roll call to be lost for sure.

We first show that if q is small enough such that α+ δ < (1− q)∣∣Y ∪Y ∣∣

i′>i δ, then that conflicted

Democrat would defect whenever |Y |i′>i+∣∣∣Y ∣∣∣

i′>i+1 > y−yi. In words, with low enough uncertainty

the senator would prefer to rely on the support of all∣∣∣Y ∪ Y ∣∣∣

i′>iremaining copartisans to still carry

the bill to passage rather than to vote for the bill herself (even if it that guaranteed passage and

the measure might otherwise fail). Since α < 0, α + δ > 0 and∣∣∣Y ∪ Y ∣∣∣

i′>i< ∞, there exists a

q1i > 0, such that for all q < q1

i the condition above is always satisfied.

To establish that the senator would deviate from the party line when |Y |i′>i+∣∣∣Y ∣∣∣

i′>i+1 < y−yi

it suffices for q to be small enough such that 0 > α +

[1− (1− q)

∣∣N∪N∣∣i′>i

]δ, i.e., the senator

would rather defect and lose the roll for sure call than hope for the opposition to make at least one

1This does not necessarily mean that the outcome of the game will be the same, as the outcome dependsalso on realized uncertainty. It does imply, however, that senators whose preferences make them “conflicted”(i.e., for whom |αi| < |δp| and αiδp < 0) would choose the same action at any node in the game tree as inthe game with common knowledge.

mistake. Again, given that α < 0, α+ δ > 0 and∣∣∣N ∪N ∣∣∣

i′>i<∞, there exists a q2

i > 0 such that

for all q < q2i the condition holds.

Lastly, when |Y |i′>i +∣∣∣Y ∣∣∣

i′>i+ 1 = y − yi the conflicted senator would vote with the party line

if q was small enough such that

[1− (1− q)

∣∣N∪N∣∣i′>i

]δ ≤ α + (1− q)

∣∣Y ∪Y ∣∣i′>i−1δ. The left-hand

side of this inequality gives the expected payoff if i defects but the roll call would be won for sure if

at least one member of the opposition defected as well. As defection by more than one member of

the opposing party may be required for the roll call to be won in the end, this constitutes a weak

upper bound on the true expected payoff from defecting. The right hand-side denotes the expected

payoff from voting with the party line and having to rely on all remaining copartisans to carry

the bill to passage (which is a weak lower bound on the true expected payoff from supporting the

party). For q = 0, the left-hand side of the inequality equals zero and the right-hand side equals

α + δ > 0. For q = 1, however, the left-hand side is equal to δ > 0, whereas the right-hand side is

α < 0. Thus, by the Intermediate Value Theorem, there exists a q3i > 0 at which equality holds.

Furthermore since the left hand side is increasing in q and the right hand side is decreasing in q,

the inequality is satisfied for all q < q3i .

Combining the previous three arguments, let qi = min{q1i , q

2i , q

3i } and further repeating the

process at any node of the game tree where a member of D votes, let qD = mini∈D qi. if q is lower

than the resulting qD then senators of party D will defect if and only if |Y |i′>i+∣∣∣Y ∣∣∣

i′>i+1 6= y−yi,

as in Proposition 1.

To determine the maximal amount of uncertainty under which senators from both parties would

still play the same strategies as in Proposition 1, repeat the same steps for members of the Repub-

lican party and choose the smallest of qD and qR.

Appendix C: Parties as Teams with Delay and Vote Changes

In this appendix, we consider two extensions to our model of “parties as teams.” First, we allow for

the possibility that a senator may skip her opportunity to vote when she is first called and instead

votes in an exogenous order after all of her colleagues have been afforded an initial opportunity

to cast their vote. Second, we allow for the possibility that after all senators have voted in order,

an agent may change her vote, provided she obtains unanimous consent from all other senators. In

what follows, we maintain the assumptions of Propositions 1—3 and modify the model as described.

Delayed Vote. By Senate rules, a senator who is not on the floor when called may still cast

her vote after every other senator has been afforded an opportunity to vote. If we extend our model

to account for the possibility of strategic delay, we can show that (i) our equilibrium of interest

continues to hold, and (ii) if there is any positive cost to delay, then it continues to be the unique

generic equilibrium.

The intuition for the robustness of our result is that skipping the opportunity to vote early is

never beneficial and may be costly. If a senator is unconflicted, or if the outcome of the vote will

be decided by more than one vote, then the possibility to delay does not affect behavior. Moreover,

if a conflicted senator is presented with an opportunity to vote in which deviating would not cause

her party to lose the roll call, then waiting may give a conflicted copartisan the opportunity to

preempt the senator who chooses to delay. There are no gains from delaying a vote.

More formally, consider the following extension to our model of “parties as teams.” When senator

i is called to vote, she can either cast her vote on the spot or delay. If she chooses to delay, then

she will be allowed to cast a “yea” or “nay” vote after the last senator (S) has been afforded his

opportunity to vote.

Denote the number of senators who skip the vote by K ≥ 0. We assume that the set of senators

who choose to delay, vote in a new exogenous, random order, i = 1, ...,K. There is no opportunity to

skip the second vote. Let (yS , nS) be the vote count after each senator has been afforded their first

opportunity to vote. Note that the game that proceeds after senator S is afforded his opportunity

to vote is equivalent to the game previously analyzed with a new yea threshold yK = y− yS and a

new nay threshold nK = n− nS . In particular, all the behavior described in Proposition 1 continues

to hold. Let∣∣∣Y ∣∣∣

ı>ibe set of all conditional yea voters that have not yet voted.2 Now, consider a

senator i from the Democratic Party (the argument is the symmetric for Republicans) deciding

whether or not to delay her vote. The senator will find herself in one of four cases:

(1) Senator i is unconflicted.

(2) Senator i is a conflicted type and |Y |i′>i +∣∣∣Y ∣∣∣

ı>i+ 1 = y − yi.

ı>i+ 1 < y − yi.

ı>i+ 1 > y − yi.

In cases 1-3, the equlibrium behavior of senator i would be the same if she voted in the called

order or if she delayed and voted in the “second” round.

In case 4, if |Y |i′>i < y − yi so the vote of some conflicted senators is need following i, then by

delaying the initial vote and submitting her choice in the second round, one of three things are

possible.

Possibility 1: Enough other conflicted voters have delayed their vote so that there is still an excess

of conditional yea-voters choosing after i in the second round. Senator i can, therefore, still vote

nay and see the bill pass.

Possibility 2: The bill fails, regardles of i’s choice in the second round.

Possibility 3: The conflicted senator finds herself in a situation where her vote is required for

passage.

Both possibility 2 and 3 are strictly worse than voting when called, while possibility 1 is equally

In case 4, if |Y |i′>i ≥ y − yi then no conflicted vote is required for the bill to pass. Conflicted

voters have, therefore, a strict incentive to always vote nay.

In all cases, there is no incentive to delay the vote. Hence, if we assume that when indifferent

2Note that if skipping is not permitted then∣∣∣Y ∣∣∣

ı>i=∣∣∣Y ∣∣∣

i′>i

senators do not delay their vote, or if there is ε > 0 cost of delaying, then no senator will choose

to do so. Therefore,∣∣∣Y ∣∣∣

ı>i=∣∣∣Y ∣∣∣

i′>iand the equilibrium behavior in the extended game will be

observationally equivalent to equilibrium of the original game, i.e., the game in the main text.

If there is no cost to delaying a vote, then there exists the possibility of multiple equilibria, in

some of which unconflicted senators skip their vote. However, even in these equilibria voting early

presents conflicted senators the opportunity to free-ride on those who are first allowed to vote after

Vote Changes. By Senate rules, after all votes have been cast, a senator may request to

change her recorded vote. If and only if no other senator objects, the vote change is granted. We

consider this possibility and show that since consequential vote changes would not be granted with

unanimous consent, there is never a gain to casting a vote and changing it at a later time. Thus, if

there is any cost to reversing one’s choice–for example because senators are wary of being percieved

as “flip-floppers”–then the equilibrium we consider in the main text will be the unique generic

equilibrium of the extended game.

Consider the following modification to our model of “parties as teams.” After a complete call

of the roll, each senator is offered the opportunity to change her vote (in some exogenously given

order). If a senator does not ask for a vote change, nothing occurs. If a senator does request a vote

change, then she pays a positive cost ε > 0 and each of the remaining senators is simultaneously

offered the opportunity to oppose the change at no cost. If at least one agent opposes the change,

then the original vote remains in place. After every senator has been offered an opportunity to

change their vote the game ends according to the final tally.

The first thing to note is that since, by assumption, there is always at least one senator that prefers

the bill to pass and one senator that prefers the bill to fail, unanimous consent will be granted only

if it does not alter the outcome of the bill. Additionally, even if a vote change would not reverse

the outcome of the roll call, it is still an equilibrium for two or more of the remaining senators to

oppose a vote change. Since vote changes do not affect senators’ utility when they would not alter

the ultimate outcome of the call, these “no change”-equilibria are robust to standard refinements

of voting games, which limit players to weakly undominated strategies. Thus, in the vote-change

subgame, there is always an equilibrium where no vote change is permitted, and, as consequence,

our results in Propositions 1-3 remain unchanged even if ε = 0.

However, if we assume that when indifferent senators permit a vote change, we can still show

that our results continue to hold in such “permissive” equilibria. First, consider the behavior of

an unconflicted Democratic senator who is considering whether to initially vote “nay” when all

other players continue to play as in Proposition 1 (the argument is symmetric for unconflicted

Republicans). We will show that voting “yea” continues to be a strictly dominant strategy if ε > 0

and a weakly dominant strategy if ε = 0.

After the initial call of the roll, an unconflicted Democratic senator will find herself in one of

three cases:

(1) The bill passes by one or more votes.

(2) The bill fails by two or more votes.

(3) The bill fails by exactly one vote.

In cases 1 and 2, if the senator initially said “nay,” she would find it optimal to ask for a vote

change, which would be granted in a “permissive” equilibrium. However, doing so will cost her ε.

In case 3, the senator would not be permitted to change a “nay”-vote.

If the senator had initiall voted “yea,” the outcome would be unchanged in cases 1 and 2, but she

would economize on ε. In case 3, the outcome of the vote may or may not change (depending on the

choices of agents i′ > i). If unchanged, the senator would still economize on ε, but if the outcome

did change, then the senator would be strictly better off on the instrumental dimension and save ε.

Thus, if ε = 0 the senator is weakly better off voting her “true preference” when first called, and if

ε > 0 she is strictly better off. In either case, it remains an equilibrium for unconflicted Democratic

senators to vote “yea.”

Given that unconflicted senators continue to initially vote their preference, it follows that it is an

equilibrium for conflicted senators to vote according to Proposition 2 when first called. Since the

outcome of the roll call cannot change after the initial call of the roll, deviating from the behavior

prescribed by Proposition 2 would lower the payoffs of conflicted senators.

In sum, we have shown that the equilibrium in the main text continues to hold if vote changes

were permited, and that it is the unique generic equilibrium if there is any cost to changing one’s

Appendix D: Simulation Results and Reduced-Form Comparative Statics

D.1. Simulated Comparative Statics

In the main text we derive the comparative static of negative vote-order effects under the assumption

that the ideological ideal points of members of both parties do not overlap. This restriction rules out

situations in which Democrats and Republicans are conflicted. Although there is ample empirical

evidence to suggest that our assumption is approximately satisfied for most of the period that

we study, it is nonetheless restrictive. The example in Figure A.5, for instance, shows that when

members of both parties are conflicted, then defection is not always monotonic in rank. It is,

therefore, important to show that nonoverlapping preferences are not necessary for negative vote-

order effects to obtain on average.

In Appendix Figures A.6 and A.7 we present simulation results that suggest that our conclusions

about negative vote order effects and winning margins hold more generally. Each panel is based

on 10 million roll calls in which 100 senators follow their equilibrium strategies in Proposition 1.

For each roll call, the order in which senators vote is randomly determined. The panels on the left

depict the average frequency with which agents in a given position deviate from the party line. The

panels on the right show the total number of “yea” votes.

Figure A.6 depicts the situation predicted by simple agenda-setter models in political science

(e.g., Romer and Rosenthal 1978): a bill that is unpopular with all members of the minority as

well as some of the majority. The probability that any given majority party senator is conflicted

equals 30%. In Figure A.7 we let members of both parties be conflicted with probability 30%. With

nonoverlapping preferences, such a situation would never occur. Yet, provided that the margin of

majority is neither too large nor too small, on average, deviations from the party line decrease with

rank.3 Moreover, the majority often wins with only a very small margin. Simulation results for

other parameter combinations deliver qualitatively similar results and are available upon request.

Figure A.8 shows that when members of both parties would like to deviate and the margin of

majority is small, then the average probability of defection need not be monotonic, even on average.4

Although typical seat margins in the U.S. Senate suggest that such scenarios are unlikely to be

empirically important (cf. Appendix Figure A.9), it is worth noting that, provided that the majority

party has more members than “yea” votes needed for passage, voting very early still confers an

advantage.

D.2. Additional Reduced-Form Evidence

According to our analytical results in the main text as well as the simulations in Figures A.6—A.8,

rank and defection should be strongly correlated when the minority party is united and when the

seat advantage of the majority is sizeable but not large enough for the roll call to be lopsided.

By contrast, when the majority party has only a one- or two-seat advantage, then, in equilibrium,

almost all of its conflicted members must stick with the party line or else the roll call will be lost.

Under these circumstances one would not expect to see a large negative point estimate. In addition,

the simulations suggest that for calls on which the minority is split, the correlation between rank

and defection should only be modestly negative, if at all. This is because, defection by members of

the minority lessens the need of conflicted majority party senators to stick with the party line.

By and large, these predictions are borne out in the results in Appendix Table A.5. Dividing roll

calls by the median defection rate among members of the minority party and estimating λ on the

sample of calls on which the minority was “split” shows that rank and defection are practically

uncorrelated under these circumstances. The same is true when the majority party enjoys a very

large or a very small seat advantage, but not in an intermediate range–as predicted by the theory.

Interestingly, λ is estimated to be positive (but statistically insignificant) for cases in which the

majority party owns only one or two seats more than the minority. Though the simulation results

in Figure A.8 are consistent with a positive point estimate for certain parameter combinations, we

note that the raw data indicate a small negative slope (cf. Appendix Figure A.10). In any case,

3The rate of decline is lower in the latter figure because defection by minority-party senators lessens theneed for members of the majority who come later in the order to support their own party’s position.

4This observation may be surprising. It is due to the fact that defection by a conflicted member of the

minority decreases yi without lowering∣∣∣Y ∪ Y ∣∣∣

i′>i. Thus, in any particular game, defection by majority-

party members need not be monotonic in rank. When the seat advantage of the majority is small enough,this need not even be true on average.

we do not observe strongly negative vote-order effects when the majority party has barely more

members than the minority, as predicted.

In Figure A.11, we show how λ varies over time. Although political scientists disagree about how

best to measure partisanship, they generally concur that parties were less important in the 1960s

and 1970s, but much more so before and thereafter (see, e.g., Rhode 1991; Snyder and Groseclose

2000). Viewed through the lens of our theory of “parties as teams,” one would not expect to see much

of a relationship between alphabetical rank and defection during the period in which partisanship

was low. After all, when δp ≈ 0, senators have little incentive to backward induct. Point estimates

before 1960 and after 1980, however, should be negative and large.

Although our estimates are imprecise and do not capture some of the more subtle patterns in

Snyder and Groseclose’s (2000) seminal work on party influence, we do observe markedly smaller

vote-order effects during the middle of the twentieth century. Note, a related (but ultimately)

distinct explanation for the lack of vote-order effects during this period is that the ideal points

of Democratic and Repubilcan senators were more likely to overlap during the decades of low

partisanship–as suggested by a large literature in political science (see, e.g., Poole and Rosenthal

1997). Nonetheless, the observed pattern is in line with our similation results, which predict lower

vote-order effects when members of both parties are conflicted.

Appendix Table A.6 explores additional sources of heterogeneity in vote-order effects. Consistent

with our theory, vote-order effects are larger among members of the majority than their minority-

party counterparts. Although the estimated difference is perhaps somewhat smaller than one might

have expected, it is statistically highly significant (p < .01). There are also small differences with

respect to electoral incentives. Comparing the first four and the last two years of their terms,

senators are slightly more likely to engage in strategic preemption when the next election is already

on the horizon (p = .107). The remaining entries explore whether senators’ response to changes in

their alphabetical rank differs by age, gender, or formal education. Although the difference is barely

statistically significant at conventional levels, if at all, the point estimates suggest that senators

without a college education may actually be more likely to exploit the opportunity to preempt their

colleagues than those without one.5

Appendix E: Structural Estimation: Additional Details

E.1. Estimation Procedure

As stated in the main text, our estimation procedure builds on the Markov Chain Monte Carlo

(MCMC) approach of Clinton et al. (2004). In what follows, we detail our procedure, borrowing

heavily from the description of Clinton et al. (2004).

5One admittedly speculative explanation is that the few individuals making it to the Senate without beingformally educated might possess higher-than-average innate intelligence, which could be especially conduciveto recognizing the advantages conferred by being allowed to vote early.

Baseline Model. Intuitively, our MCMC algorithm “explores” the joint posterior density of

(x, β, γ, δ) by successively sampling from the conditional densities that together characterize the

joint distribution. Define the latent variable y∗i,r ≡[Ui,r (ξr) + δqi,r(yi,r + 1)

]−[Ui,r (ζr) + δqi,r(yi,r)

and note that yi,r = 1 if y∗i,r ≥ 0 and 0 otherwise. We augment the Gibbs sampler with y∗i,r, which

greatly reduces the computational complexity because it allows us to exploit standard results on the

Bayesian analysis of linear regression models. We also treat ∆q as an unknown parameter vector.

Letting z index the iterations of the MCMC algorithm, a given iteration consists of sampling from

the following conditional densities.

1. g (y∗|y, x, β, γ, δ,∆q) : At the start of iteration z, we have(x(z−1), β(z−1), γ(z−1), δ(z−1),∆q(z−1)

Depending on whether a given senator votes “yea” (yi,r = 1) or “nay” (yi,r = 0), we sample

y∗i,r from one of the two following truncated normal distributions:

y∗i,r|(yi,r = 0, x

(z−1)i , β(z−1)

r , γ(z−1)r , δ(z−1),∆q

(z−1)i,r

)∼ N

(z−1)i,r , 1

)I(y∗i,r < 0

)y∗i,r|

(yi,r = 1, x

(z−1)i , β(z−1)

r , γ(z−1)r , δ(z−1),∆q

(z−1)i,r

)∼ N

(z−1)i,r , 1

)I(y∗i,r ≥ 0

where µ(z−1)i,r = x

(z−1)i β

(z−1)r + γ

(z−1)r + δ(z−1)∆q

(z−1)i,r and I (·) is an indicator function. For

abstentions we sample from N(µ

(z−1)i,r , 1

), effectively creating multiple imputations.

2. g (β, γ|y∗, x, δ,∆q) : For every r, we run a “Bayesian regression” of (y∗ − δ∆q) on x and a

constant, and then sample from the posterior density. That is, we sample β(z)r and γ

(z)r from

a multivariate normal with mean

[X(z−1)′X(z−1) + T−1

]−1 [X(z−1)′

(y∗(z)·,r − δ(z−1)∆q

(z−1)·,r

)+ T−1

]and covariance matrix

[X(z−1)′X(z−1) + T−1

]−1,

where X(z−1) is an S × 2 matrix with a typical row(x

(z−1)i,r , 1

), y∗(z)·,r and ∆q

(z−1)·,r are both

of dimension S × 1, and τ0 and T0 respectively denote the prior mean and covariance matrix

of (βr, γr).

3. g (x|y∗, β, γ, δ,∆q) : For every i, we now run a “Bayesian regression” of (y∗ − δ∆q − γ) on β

(and no constant). More specifically, letting ν0 and V0 denote the prior mean and variance

of xi, we sample x(z)i from a normal distribution with mean

[β(z)′β(z) + V −1

]−1 [β(z)′

(y∗(z)i,· − δ(z−1)∆q

(z−1)i,· − γ(z)

)+ V −1

and variance

[β(z)′β(z) + V −1

]−1,

where β(z), y∗(z)i,· and ∆q

(z−1)i,· are all of dimension C×1. After updating all xi, we renormalize

x so that the ideal points of John Kerry and John McCain are equal to −1 and 1, respectively.

4. g (δ|y∗, x, β, γ,∆q) : Next, we run a “Bayesian regression” of (y∗ − β′x− γ) on ∆q (and no

constant). In particular, let l0 and L0 denote the prior mean and variance of δ, respectively.

We then sample δ(z) from a normal with mean

[∆q(z−1)′∆q(z−1) + L−1

]−1 [∆q(z−1)′

(y∗(z) − β(z)x(z) − γ(z)

)+ L−1

and variance

[∆q(z−1)′∆q(z−1) + L−1

]−1.

Note, ∆q(z−1) and y∗(z) are SC×1 vectors, while β(z), γ(z), and x(z) are comformably stacked.

5. g (∆q|y∗, x, β, γ, δ) : Lastly, we use(x(z), β(z), γ(z), δ(z)

)to update ∆q

(z)i,r based on the con-

struction in footnote 20 in the main text. Note, conditional on(x(z), β(z), γ(z), δ(z)

), ∆q

(z)i,r has

a distribution with all mass at single point. In calculating ∆q(z)i,r we consider the exact order

in which votes were cast, ignoring abstentions. For senators who abstained on a particular

vote, we set ∆q(z)i,r = 0.

Iterating the above steps produces a Markov chain of our parameter vector with the joint pos-

terior density as its limiting distribution. Put differently, after a large number of iterations of the

alogrithm, successive samples of the paramter vector are drawn from its posterior distribution.

Extended Model. The Gibbs sampler for our extended model is very similar to the one above,

with the important difference that we augment the model by introducing finite mixture components

in order to represent each expectation-formation mechanism. Letting ti denote the sampled type

of senator i, steps 1—5 are preceded by:

0.1. g (ti|y∗, x, β, γ, δ,∆q, π) : For each i, we sample t(z)i from a multinomial distribution with

parameter vector

πi,k =

π(z−1)k

C∏r=1

(z−1)r x

(z−1)i + γ

(z−1)r + δ(z−1)∆q

(z−1)i,r,k

)yi,r [1− Φ

(z−1)r x

(z−1)i + γ

(z−1)r + δ∆q

(z−1)i,r,k

)]1−yi,r3∑

π(z−1)m

C∏r=1

(β(z−1)r x

(z−1)i +γ

(z−1)r +δ(z−1)∆q

(z−1)i,r,m

)yi,r[1−Φ

(β(z−1)r x

(z−1)i +γ

(z−1)r +δ∆q

(z−1)i,r,m

)]1−yi,r .

0.2 g (π|y∗, x, β, γ, δ,∆q, t) : We update π(z) by sampling from a Dirichlet distribution with pa-

rameter vector

ak = nk + ak,

where nk =S∑i=1I(t(z)i = k

)and ak is the prior on the Dirichlet parameters.

Steps 1—5 above then proceed conditional on t(z) and π(z). In particular, ∆q(z−1)i,r becomes ∆q

(z−1)i,r,ti

Implementation. We implement the above algorithms using compiled C++ code that we call

from within MATLAB. All results we report are based on 10 parallel chains with 120,000 iterations

each, of which the first 20,000 are discarded as “burn in” period. To obtain starting points for each

chain we implement the nonparametric unfolding procedure of Poole (2000) and add random noise

to the resulting values. We confirm convergence of the Markov chain based on the Potential Scale

Reduction Factor (PSRF) of Gelman and Rubin (1992). In Appendix Figures A.12 and A.13, we

show the iterative history of the (thinned) Markov chain for the most important parameters in our

model, i.e., for δ and (π1, π2, π3).

E.2. Additional Results

For completeness, Appendix Figure A.14 plots the posterior mean of senators’ ideal points in our

extended model against the first dimension of Poole and Rosenthal’s (1997) DW-Nominate scores,

i.e., it replicates Figure 6 in the main text. As the comparison of both figures demonstrates, allowing

for different “types” of legislators has almost no impact on the estimated ideal points.

In addition to the extended structural model for which we present results in the text, we have

also estimated a slightly more parsimonous extension of our baseline theory. This extension only

allows for two types of legislators: (i) myopic ones, and (ii) backward inductors. Results are shown

in Appendix Figures A.15—A.17. Interestingly, the posterior distribution of π2, i.e., the share of

senators reasoning backwards, is very similar to that in the model with three types. It appears that

allowing for an additional, “intermediate” type reduces primarily the estimated fraction of myopic

senators.

Similarly, we find that allowing for two or three types makes very little difference for the posterior

of δ. This suggest that it is the myopic rather than the “intermediate” type that causes the rightward

shift in the posterior relative to our baseline model.

For each of the three structural models that we have estimated, Appendix Figures A.18 and

A.19 depict the marginal posterior distributions of β and γ, respectively. Rather than showing the

distribution for each bill-specific parameter, we pool over all roll calls. More detailed results are

available from the authors upon request.

Appendix F: Detecting Herding: Some Monte Carlo Evidence

This appendix speaks to the ability of our placebo approach to detect “abnormal serial correlation,”

as predicted by theories of herding and learning about common values. First, using data from the

35th—112th Congresses, we show that our approach detects lower-than-expected serial correlation

in defection–consistent with systematic preemption. We then present evidence from a Monte Carlo

study, which we have conducted to directly evaluate the ability of the method to detect herding in

small- to medium-sized data sets.

F.1. Evidence on Systematic Preemption

Our theory of “parties as teams” predicts that senators systematically preempt each other. If

correct, then we would expect that deviations from the party line by senator i are negatively

correlated with defection by whichever copartisan votes next. However, simply estimating

(E.1) di,p,r = µi + ψdi+1,p,r + ηi,p,r

by OLS is subject to the concern that ψ might be spuriously positive whenever unobserved roll-

call characteristics cause senators from the same party to be skeptical about a bill. We, therefore,

rely on our placebo approach and ask whether ψ is smaller than one would expect under the null

hypothesis of no preemption.

The answer turns out to be “yes.” Appendix Figure A.20 shows the null distribution. Consistent

with our model of “parties as teams,” our actual estimate of ψ is smaller than 96.4% of placebo

coefficients. We, therefore, reject the null.

F.2. Monte Carlo Evidence

We first describe the data-generating process for our Monte Carlo study and the simulation proce-

dure. We then present results on the frequency of type I and type II errors.

Data-Generating Process. Each Monte Carlo simulation features 100 senators, of which 60

belong to the majority party. Senator i votes “yea” on roll call c if and only if her intrinsic utility

from doing so exceeds zero, i.e.,

yi,c =

1 if y∗i,c ≥ 0

0 otherwise,

with y∗i,c given by

y∗i,c = αp + θy<i,c + εi,c.

As in the main text, y<i,c stands for the share of i’s copartisans who chose “yea” before it was

i’s turn to vote. By construction, there are herding effects in senators’ choices whenever θ 6= 0.

The absolute value of α governs how similar copartisans’ preferences are absent any cue-taking

behavior. For simplicity, α varies only by party, with αD = −αR. The noise term εi,c ∼ N (0, σ)

denotes senator i’s idiosyncratic preference regarding bill c. It is independently distributed across

senators and roll calls. Thus, by construction, we have that

Pr(yi,c = 1|y<i,c

(αp + θy<i,c

where Φ denotes the standard normal CDF.

Estimation & Simulation. For each simulation run, i.e., each parameter combination (a, θ),

we use the data-generating process above to create a “reference data set.” We then estimate

yi,c = κ + ϕy<i,c + ηi,c

on the reference data in order to obtain the “true” point estimate, i.e., ϕ. To determine whether

ϕ is larger than one would expect under the null of no bandwagon effects, we apply our placebo

approach to the reference data set, relying on 1,000 “randomly reshuffled” vote orders to construct

the null distribution, as explained in the main text. Lastly, for each simulation run, we record

whether the null hypothesis would have been rejected at the 5%-significance level, i.e, whether ϕ

is greater than 95% of placebo coefficients.

Since the purpose of this exercise is to evaluate the ability of surrogate data testing to detect band-

wagon effects across a range of alternative settings, we vary the parameters α = {0, .1σ, .2σ, .3σ}and θ = {0, .05σ, .1σ, .15σ}, as well as the number of available observations per senator, |c| =

{500, 1,000, 10,000}. For each parameter combination, we conduct 1,000 independent Monte Carlo

runs. The resulting rejection rates are reported in Appendix Table A.7.

Results. The evidence in Table A.7 indicates that our placebo approach is more likely to detect

herding effects (i.e., less likely to commit a type II error) when the econometrician observes more

roll calls per senator and when preferences are less correlated among copartisans (i.e., when α is

lower). The intuition for the latter pattern is simple: a higher correlation increases the likelihood

that copartisans would “naturally” vote in the same way, which makes it more difficult to pick up

any potential effect of one legislator on the other.

Importantly, the Monte Carlo results suggest that our approach exhibits adequate to excellent

statistical power, even in relatively small data sets. In particular, as long as the econometrician

observes 500 to 1,000 more roll calls (∼1 Congress), moderate bandwagon effects (i.e., θ = .15σ)

are reliably detected. Only for small and very small herding effects (i.e., θ = .05σ or θ = .1σ) to be

picked up does it take up to 10,000 roll calls.

For comparison, in our data for the 35th—112th Congress, we observe nearly 40,000 roll calls.

We, therefore, suspect that our approach would have detected herding among senators with high

probability, if it existed.

It is also reassuring to note that theoretical and actual rates of committing a type I error are

fairly closely aligned (cf. the column for θ = 0).

Appendix G: A Model of Sequential Vote Buying

This appendix extends the model of Dekel et al. (2009) (henceforth DJW) to the sequential vot-

ing structure of the Senate to show that their result that winning coalitions are made up of the

lowest cost legislators continues to hold. Thus, in order for vote buying to account for the patterns

we observe, senators would need to become intrinsically more supportive of a measure as their

alphabetical rank increases.

In what follows we closely adhere to the model of DJW while introducing an explicit sequential

legislator-vote-round structure with an exogenous and fixed vote order.

In particular, the model is as follows. Let there be a legislature of size N odd deciding between

a new policy x supported by party X and a status quo y supported by party Y . Policy x is

implemented if it receives a majority of the votes. Each legislator i is characterized by her utility

each policy ui(x) and ui(y) and the order in which she votes j.6 We denote the net benefit they

get from voting for the policy x by v∆xi = ui(x) − ui(y) and, analogously, from voting for y,

v∆yi = ui(y) − ui(x). We assume that v∆x

i is strictly decreasing in i, so that i = 1 is the greatest

natural supporter of x and i = N is the greatest supporter of y (or least supporter of x.)7 Let

J(i) denote the position in the order in which legislator i votes and let I(j) be the rank in terms

of support for the legislator who votes jth. So, legislator i votes in the J(i)th position and the

legislator voting in the jth position is the I(j)th most intense supporter of policy x. Given the

ordering of legislators by i, let m be the preference median legislator and assume that v∆xm > 0 so

that a majority of legislators prefers voting for x. As mentioned, there are two vote buyers X and

Y , where X can generally be thought of the majority party and Y the opposition or an outside

group. X would like to pass the bill and values policy x over y at WX . Y would like the status

quo to remain and values it WY more than policy x.8 Let nX = |{i : v∆xi > 0}| be the number of

legislators that would support passage without vote buying.

We assume that there is a smallest increment of resources that can be transfered ε and that v∆xi ,

WX , and WY are not integer multiples of ε in order to avoid indifference. For a positive number k,

let dkeε (resp. bkcε) be the smallest multiple of ε greater than k (resp. largest multiple of ε smaller

than k). If k is negative, then dkeε = −d|k|eε and bkcε = −b|k|cεFollowing DJW, we assume that both parties can make binding offers that depend only on a

senator’s choice. Let BX(j) and BY (j) denote the final offers made to the legislator voting in

position j by parties X and Y respectively. We also assume that voter v∆xi votes for x if and only

if BX(J(i)) + v∆xi > BY (J(i)). If party Y were to prevail, the cheapest possible way for them to

do so would be the purchase a minimum majority buying the votes of the weakest supporters of x

6We assume that legislators only care about their vote for a few reasons. First, we want to adhere tothe structure of DJW who make the same assumption. Second, Dal Bo (2007) shows that a vote buyer caneliminate any legislator’s pivotality concerns by buying one more vote than needed. Finally, we want tocontrast the vote buying setting to our parties as teams model.

7This assumption is slightly stronger than DJW who assume that v∆xi is non-increasing. Our stronger

assumption rules out indifference between winning coalitions.8Alternatively, we could think of these quantities as their effective budgets.

which given the incremental unit ε is equal to v∆X =∑ni=mdv∆x

i eε. Further, we let bWY cε > v∆X

to avoid the trivial situation where Y can never prevail.

Vote buying proceeds exactly as in DJW but instead of bids being made simultaneously to all

legislators, vote buying proceeds round by round. Thus, bidding occurs for the vote of legislator j

after all votes from legislators in position 1 to j − 1 have been recorded. After bidding, legislator j

announces her vote and the game proceeds to the next round with legislator j + 1 until all votes

have been cast.

The particulars are as follows: In each round, parties alternate in making offers and observe all

past offers. So for example, party X begins by making an offer and Y can counter X’s offer at which

point X can counter and so on. Once an offer has been made by a party, the subsequent offer by that

same party cannot be lower. Again following DJW, we assume there is a small positive cost, γ > 0,

of making an offer in any round. The bidding round ends when two consecutive offer opportunities

pass where the vote of the legislator does not change. We say that a party drops out if it does not

raise its bids between rounds and the buying round ends. Thus, a party’s strategy specifies how much

more is offered over the previous offer given the vote count and previous offers. Following DJW, we

assume that parties employ strategies that are analogous to their Least Expensive Majority (LEM)

strategies in which at each time the party bids (that is the party does not drop out), it purchases

the vote of the legislator in the least expensive manner provided the bid does not exceed the net

value of winning. Given these strategies, each round amounts to a complete information English

auction with bidding costs and jump bidding permitted for the legislator’s vote.

If the game does not end, then the payoff for both parties is −∞. Otherwise, party k’s payoff if

she wins is Wk minus the total cost of votes purchased and the cost of bidding. If party k loses,

then her final payoff is a loss equal to the cost of votes purchased and the cost of bidding.

Our solution concept is subgame perfect equilibrium. In order to account for position of play

in the game tree, we introduce additional notation. Let Y (j) = 1 (= 0) denote a yea (nay) vote

in round j. Legislator i votes for the final offer that gives him the highest total utility and so

votes yea if BX(J(i)) + v∆Xi > BY (J(i)) and votes nay if BX(J(i)) + v∆X

i < BY (J(i)). Let

R(j) = N+12 −

∑j−1t=1 Y (t) be the number of yea votes needed for passage and M(j) = N + 1− j be

the remaining votes at the beginning of round j.

Denote by VX(R,M) and VY (R,M) the continuation value of the buyers when there are R yes

votes needed and M rounds remaining. Finally, let WX(R(j),M(j)) be the gross benefit to party

X of prevailing (winning the round) absent of the cost of buying a vote in round j when there are R

yea votes remaining and note that WX(R(j),M(j)) = VX(R−1,M −1)−VX(R,M −1). Similarly,

WY (R(j),M(j)) is party Y ’s gross value of prevailing in round j and is equal to VY (R,M − 1) −VY (R− 1,M − 1).

Proposition 1 is nearly identical to proposition 1 in DJW (2009) and follows from the same logic

of backward induction, limits to rational bids and perfect information. The result extends trivially

to our setting as backward induction implies that each voting round amounts to a simple version

of their game in which the parties compete for a single vote (rather than seeking a majority), that

the values of the parties in the rounds are WY (R(j),M(j)) and WX(R(j),M(j)), and that we can

we can work backwards through the rounds. We present the proof, but note that it follows their

logic closely and should not be construed as a novel contribution.

Proposition 1: Given positive costs to making an offer γ > 0, the vote buying game has a pure

strategies equilibrium. In every equilibrium, the same party wins and the losing party never makes

any offers for votes.

Proof: First, note that any legislator-bidding round j with gross winning benefits of winning

WX(R(j),M(j)) and WY (R(j),M(j)) is equivalent to a simple one legislator game with termi-

nal payoffs (WX(R(j),M(j)), 0) if X prevails and (0, WY (R(j),M(j))) if Y prevails. We first

show that for any game equivalent to the round j legislator bidding round, there is an equiv-

alent truncated game with bounded bidding. In each bidding period within a round, the offer

of at least one party must strictly increase or else the round ends. So after h rounds the min-

imum offer to the legislator is hε and for h large enough, the offer to the legislator will be

greater than max{WX(R(j),M(j)), WY (R(j),M(j))}. In order for k to make an offer greater than

Wk(R(j),M(j)) it must be that k is certain that the other party will outbid them. However, it

cannot be the case that both parties are certain to lose and be outbid, so after some finite number

of rounds both parties will quit bidding. This implies there is a finite bound to each party’s bids

and so we can truncate bids to some finite maximum bid for each party and preserve all equilibria

of the game.

Now, the truncated equivalent to each round is a finite, sequential-move game of perfect infor-

mation and so the existence of equilibrium follows as one can be found by backward induction.

Furthermore, there is no indifference anywhere in the game due to our assumption that values

are not ε-multiples. In particular, this implies that each terminal node in the round has a unique

winner. As each party prefers to win in the truncated game, in any subgame working backwards

by induction from those nodes that precede terminal nodes, there is a unique winner. Since there

is a unique winner, it cannot be the case that the losing party makes an offer as they could deviate

to offering nothing an save the cost of bidding and making payments.

Having shown that every voting round with gross benefits of winning WX(R(j),M(j)) and

WY (R(j),M(j)) has a unique winner, we can apply the same set of inductive arguments on the

game as whole and show that there is a unique winner and the loser never bids as required. Q.E.D.

The following remark highlights the first steps of our proof as a useful result to reference later.

Remark 1: Given positive costs to making an offer γ > 0, each round j with gross winning

benefits of winning WX(R(j),M(j)) and WY (R(j),M(j)) has a unique winner and the losing party

never makes any offers.

Having established that in our environment the same party wins in all equilibria, we now char-

acterize the least costly equilibrium for the winning party and show that it is characterized by a

threshold type where all legislators below the threshold (if X wins) or above the threshold (if Y

wins) vote with the winning party. These are the same winning coalitions that emerge in the DJW

models, where offers are made to all legislators simultaneously. Our key contribution is to show

that these coalitions extend to the sequential setting.

Let WBk(i) denote the bid (possibly 0) paid by k to i when party k wins and i is part of her

winning coalition. The winner’s objective is to collect the cheapest winning coalition or to

(P) minA⊆I

∑i∈A

WBk(i) subject to |A| ≥ N + 1

To show that the solution to this program will be characterized by a threshold, we characterize

WBk(i) for every legislator.

Lemma 1: For γ > 0 and sufficiently small, in any legislator round j, party k wins the round if

and only if bWk((R(j),M(j))cε + v∆kI(j) > bW`((R(j),M(j))cε

Proof: First note that since v∆kI(j) is not an integer multiple of ε, so it is never the case that

bWk((R(j),M(j))cε + v∆kI(j) = bW`((R(j),M(j))cε. To see that k wins if the condition holds, let

k bid her maximum willingness to pay bWk((R(j),M(j))cε, which she will be willing to do if

γ > 0 is sufficiently small. The most ` would be willing to bid is bW`((R(j),M(j))cε so legislator j

votes for k’s position if bWk((R(j),M(j))cε + vkI(j) > bW`((R(j),M(j))cε + v`I(j) which is exactly

our condition. Since k has a winning strategy, by Remark 1, k wins the round. To see that the

condition is necessary, assume that it does not hold. Then by the same reasoning as above it must

be the case that ` wins the round. Q.E.D.

Corollary 1: If k wins the round and bWk((R(j),M(j))cε ≤ bW`((R(j),M(j))cε then v∆kI(j) > 0

Lemma 2: If k wins the round and ` must make a strictly positive offer to win the vote of legislator

j, then ` will quit the round immediately without making an offer.

Proof: Our argument is by induction. First note that ` quits at any node when she must bid

greater than bW`((R(j),M(j))cε. Assume that ` will quit at any node where the minimum bid

needed to win is qε. Now consider a node α where ` must spend (q − 1)ε. If ` makes the offer and

the game proceeds to node α′. Because we assume that k wins the round, k can respond by adding

ε to her offer and be in a position to win the vote. The game then proceeds to node α′′ where `

must now bid qε to prevail and so by our induction hypothesis, ` will quit. Thus, at node α′ the

continuation equilibrium must have k winning. Thus, `’s offer at α leads to a loss of γ, the cost of

making an offer and so ` would have been better off quiting at node α. Q.E.D.

Lemma 3: If k wins round j then WBk(I(j)) = max{0,−v∆kI(j)}

Proof: We proceed by cases. Case 1: k wins and bWk((R(j),M(j))cε ≥ bW`((R(j),M(j))cεand v∆k

I(j) > 0. By lemma 2, ` never bids and so k can win for a bid of 0. Case 2: k wins and

bWk((R(j),M(j))cε ≥ bW`((R(j),M(j))cε and v∆kI(j) < 0. If k never bids a positive amount then

round ends with ` winning, a contradiction of our premise. When k first bids a positive amount

it must bid at least dv∆kI(j)eε or else again ` wins. Now in order for ` to win, it must make an

offer equal to dv∆kI(j) − dv∆k

I(j)eεeε = ε > 0 and so by Lemma 2 drops out. Case 3: k wins and

bWk((R(j),M(j))cε < bW`((R(j),M(j))cε and v∆kI(j) < 0 is ruled out by Corollary 1. Case 4: k wins

and bWk((R(j),M(j))cε < bW`((R(j),M(j))cε and v∆kI(j) > 0. By assumption k wins and so by

Lemma 2 ` will never make an offer, so k wins with a bid of 0. Collecting the cases it is clear that

k’s winning bid is equal to max{0,−v∆kI(j)}. Q.E.D.

Proposition 2: If X prevails then the lowest cost winning coalition is defined by an nX ≥ nX

where i votes for x if and only if i ≤ nX . If Y prevails then the lowest cost coalition is defined by

nY ≤ m such that i votes for y if and only if i ≥ nY

Proof: The proposition follows immediately from the vote buyer’s program (P) and the charac-

terization of WBk(I(j)) in Lemma 3 Q.E.D.

Proposition 2 is the chief result of this appendix. It establishes that the coalitions that emerge

from vote buying should include those most likely to support the winner’s position absent vote

buying. Thus, unless changes in senators’ position-taking preferences are correlated with changes

in the vote order, vote buying cannot generate our main empirical results on order effects.

Appendix H: Data Appendix

This appendix provides a description of all data used in the paper, as well as precise definitions

together with the sources of all variables.

H.1. Universe of Roll Calls, 1857—2013

Data on all roll call votes in the United States Senate were kindly provided by Keith Poole.9 They

are based on careful codings of the Congressional Record. The data contain senators’ names, home

states, party affiliation, and final votes. They neither indicate the actual order in which votes were

submitted, nor do they contain any information on whether a given senator changed or withdrew his

initial vote. Unfortunately, this information is not part of the Congressional Record. The analysis

in this paper restricts attention to the votes of Democratic and Republican senators since the

emergence of the two-party system, i.e. from the 35th to the 112th Congress (1857—2013). The

following variables are being used:

9They are publicly accessible at http://www.voteview.com.

Party Line is defined for each roll call vote that a senator submits. It equals the vote choice of the

simple majority of other senators from the same party (not including the senator for whose vote it

is calculated).

Deviate is an indicator variable equal to one if a senator’s vote differs from the party line, as defined

above. It is zero otherwise and undefined for senators who did not participate in a given roll call.

Alphabetical Rank is defined as si−1S−1 , where S denotes the number of senators who particiapte in

a given roll call, and si is senator i’s raw alphabetical rank among participants. si is constructed

based on senators’ last names, as contained in the raw data. Roughly speaking, the variable Rank

corresponds to senators’ alphebatical percentile ranking among their colleagues (divided by 100).

“Close” vs “Lopsided” Roll Calls are categorized as in Snyder and Groseclose (2000). That is, a

roll call is said to be “lopsided” whenever more than 65% or less than 35% of Senators voted “yea.”

For votes that require a supermajority, e.g., treaties and cloture votes, the corresponding cutoffs

are 51.7% and 81.7% (i.e. 66.7% ± 15%). Data on supermajority requirements come from Snyder

and Groseclose (2000) and have been manually extended through the 112th Congress.

Divisive is an indicator variable equal to one if the majority of one party votes in the opposite

direction of the majority of the other party. It is zero otherwise.

“Split” Minority is an indicator variable equal to one if fewer than the median percentage of

minority-party senators deviate from the party line on a particular roll call. It is zero otherwise.

Seat Advantage is defined as the difference in the number of senators between the majority and

minority parties who participate in a given roll call.

DW-Nominate Scores were kindly provided by Keith Poole. For a description of the DW-Nominate

estimation procedure, see Poole (2005).

Experience is defined as the total number of roll call votes that a senator had ever submitted before

a particualr roll call was conducted.

H.2. Roll Calls in the 112th Congress

As described in the main text, a team of research assistants collected information above and beyond

what is contained in the Congressional Record for the 112th Congress. To this end, the C-SPAN

network was asked to furnish video recordings of all 486 roll calls held during this period, i.e.,

from January 3, 2011 to January 3, 2013. Due to technically difficulties C-SPAN was only able

to produce videos for 484 out of 486 calls. We retrieved the remaining two recordings from the

National Archives in Washington, D.C.

A research assistant watched each recording, pausing frequently to transcribe every senator’s

choice, whether it was submitted during the alphabetical call of the roll, as well as any subsequent

changes. Recording senators’ votes during the initial period is facilitated by the fact that, directly

after the call of the roll, the clerk summarizes the vote by respectively identifying those who

voted “yea” and “nay.” The research assistant also recorded the exact order in which votes were

submitted (or changed) after the clerk had stopped calling the roll, but before the roll call had been

officially closed. Doing so is relatively straightforward, as the clerk generally announces the name

of the senator who just submitted his vote, followed by his choice. A second research assistant then

rewatched each video in order to check the transcription for errors and to ensure that senators’

final choices in the data match the official Congressional Record.

H.3. Senator Characteristics

Raw data on senators’ characteristics come from the Database of Congressional Historical Statis-

tics and were obtained through the Inter-University Consortium for Political and Social Research

(ICPSR 3371). The data were manually checked for errors and extended to cover all senators who

served before the end of the 112th Congress. Whenever the information in the Biographical Direc-

tory of the U.S. Congress differed from the raw data, the latter was changed to conform to the

former.10 Throughout the analysis, the following variables are used:

Age is defined as a senator’s age (in years) at the beginning of a particular Congress.

Gender is defined as the senator’s biological sex.

College Educated is an indicator variable equal to one if the Biographical Directory of the U.S.

Congress indicates that the senator graduated from college. It is zero otherwise.

End of Term refers to the last two years of a senator’s regular 6-year term.

References

Clinton, J., S. Jackman, and D. Rivers (2004). “The Statistical Analysis of Roll Call Data.”

American Political Science Review, 98(2): 355—370.

Dal Bo, E. (2007). “Bribing Voters.” American Journal of Political Science, 51(4): 789—803.

Dekel, E.,M. O. Jackson, andA. Wolinsky (2008). “Vote Buying: General Elections.” Journal

of Political Economy, 116(2): 351—380.

, , and (2009). “Vote Buying: Legislatures and Lobbying.” Quarterly Jour-

nal of Political Science, 4(2): 103—128.

Frisch, R., and F. V. Waugh (1933). “Partial Time Regressions as Compared with Individual

Trends.” Econometrica, 1(4): 387—401.

Gelman, A., and D. B. Rubin (1992). “Inference from Iterative Simulation Using Multiple Se-

quences.” Statistical Science, 7(4): 457—511.

Poole, K. T. (2000). “Nonparametric Unfolding of Binary Choice Data.” Political Analysis, 8(3):

211—237.

10The Biographical Directory of the U.S. Congress is available at http://bioguide.congress.gov/.

(2005). Spatial Models of Parliamentary Voting. Cambridge, UK: Cambridge University

Press.

, and H. Rosenthal (1997). Congress: A Political-Economic History of Roll Call Voting.

New York: Oxford University Press.

Rhode, D. W. (1991). Parties and Leaders in the Postreform House. Chicago: University of

Chicago Press.

Romer, T., and H. Rosenthal (1978). “Political Resource Allocation, Controlled Agendas, and

the Status Quo.” Public Choice, 33(4): 27—43.

Snyder, J. M., and T. Groseclose (2000). “Estimating Party Influence in Congressional Roll-

Call Voting.” American Journal of Political Science, 44(2): 193—211.

Notes: Figure shows the empirical cumulative distribution function for placebo estimates of λ in equation (1), controlling for the mean deviation rate among copartisans. Estimates are based on 10,000 randomly generated placebo orderings. The vertical line indicates the point estimate in the original data.

Figure A.1: Empirical CDF of Placebo Estimates for λ, Controlling for Copartisans' Choices

A. Vice President Belongs to Majority Party

B. Vice President Doesn't Belong to Majority Party

Figure A.2: Winning Margings, by Party Affiliation of Vice President

Notes: Figure shows the frequency of excess votes (relative to the threshold required for passage) in favor of the position held by the Senate's majority party, as well as the estimated density function and the associated 95%-confidence intervals. The upper panel restricts attention to situations in which the Vice President belongs to the majority party, whereas the lower one presents results for the opposite case. The underlying data come from roll calls in the U.S. Senate that required a simple majority and were held during the 35th–112th Congresses. Density estimates are based on local linear regressions with a bandwidth of 4, applied separately on each side of the cutoff. See McCrary (2008) for details on the estimation procedure.

Figure A.3: Estimated Order Effects by Senators' Prior Experience, Controlling for Time Served in the Senate

Notes: Figure shows point estimates and the associated 95%-confidence intervals for λ in equation (1), controlling for the number of Congresses in which a senator had previously served. Estimates are based on senators' roll call–specific rank. Confidence intervals account for heteroskedasticity and clustering of the residuals at the Congress level.

≤ 100 100 to 500 500 to 1,000 1,000 to 2,000 2,000 to 5,000 ≥ 5,000

Number of Roll Calls in which Senator Previously Participated

Figure A.4: Example of Sequential Voting Game with Uncertainty

Notes: Figure shows an example of our sequential voting game with uncertainty about preferences. For simplicity, there is one party and two players. One "yea" vote is required for passage. Holding the outcome of the roll call fixed, both D1 and D2 would like to go on the record opposing the bill. D1, however, would always rather vote "yea" than let the measure fail for sure. By contrast, D2's payoffs are determined by nature and unknown to D1. In one state of nature, D2 will always vote "nay" regardless of whether this causes the bill to fail. In the other state, D2 is willing to support the party line if need be. The thick lines indicate each player's optimal action at a particular node in the game tree.

Figure A.5: Example with Nonmonotonic Defection

Notes: Figure shows an example of our sequential voting game with two parties and five Senators. All senators are conflicted. Specifically, Democrats receive payoff α = -1 if they vote "yea" and δ = 2 if the bill ends up being approved, whereas Republicans receive α = 1 and δ = -2. Three "yea" votes are needed for passage. The thick lines indicate each player's optimal action at a particular node in the game tree.

Figure A.6: Simulation Results for Situations when Only Majority Party Senators are Conflicted

Notes: Panels on the left depict the average rate of defection as a function of when a senator gets to cast her vote. Panels on the right show a histogram of the total number of "yea" votes on a simulated roll call. All results are based on 10 million simulations in which 100 senators play the equilibrium strategies prescribed by Proposition 1, assuming that the bill passes when a simple majority assents. For each simulation, the order in which agents vote is randomly determined. The probability that a particular member of the majorty party is "conflicted" equals 30%. The preferences of all other senators are aligned with their parties' opposing stances. In the upper two panels, 55 senators belong to the majority. In the middle and lower two panels, the number of majority party members is 65 and 85, respectively.

A. Defection Rates B. Number of "Yeas"I.

Notes: Panels on the left depict the average rate of defection as a function of when a senator gets to cast her vote. Panels on the right show a histogram of the total number of "yea" votes on a simulated roll call. All results are based on 10 million simulations in which 100 senators play the equilibrium strategies prescribed by Proposition 1, assuming that the bill passes when a simple majority assents. For each simulation, the order in which agents vote is randomly determined. The probability that a member of either party is "conflicted" equals 30%. The preferences of all other senators are aligned with their parties' opposing stances. In the upper two panels, 55 senators belong to the majority. In the middle and lower two panels, the number of majority party members is 65 and 85, respectively.

Figure A.7: Simulation Results for Situations when Senators from Both Parties are ConflictedA. Defection Rates B. Number of "Yeas"

Notes: Figure depicts the expected average rate of defection as a function of when a senator gets to cast her vote. The results in each panel are based on 10 million simulated roll calls in which 100 senators follow the equilibrium strategies in Proposition 1. For each roll call, the order in which agents vote is randomly determined. The majority party's preferred outcome obtains whenever a simple majority votes "yea." In all simulations, there is a 30% probability that any given senator is "conflicted." The preferences of all other agents are aligned with their parties' opposing stances. Panels I–IV lower the number of senators who belong to the majority party from 55 to 52.

III. 53 Majority Party Members IV. 52 Majority Party Members

Figure A.8: Simulation Results for Situations in which Both Parties are Split and Almost Equal in SizeI. 55 Majority Party Members II. 54 Majority Party Members

Figure A.9: Distribution of the Majority Party's Roll Call-Specific Seat Advantage, U.S. Senate 1857–2013

Notes: Figure shows the distribution of the majority party's seat advantage during the 35th–112th Congresses, restricting attention to senators who participated in a given roll call. The majority party is defined by roll call, i.e., the party with the most senators participating.

A. Less than Two Seats

B. Three to Twenty Seats

C. More than Twenty-One Seats

Notes: Figure shows the average frequency with which senators deviate from the majority of their copartisans, depending on their position in the vote order. The upper panel restricts attention to roll calls in which the majority party had a seat advantage of at most 2 while the remaining two panels focus on roll calls with seat advantages of 3 to 20 and at least 21, respectively.

Figure A.10: Deviations from the Party Line in the Raw Data, by Majority Party's Seat Advanatge

Figure A.11: Estimated Order Effects, by Decade

Notes: Figure shows point estimates and the associated 95%-confidence intervals for λ, estimated decade by decade. Estimates are based on equation (1) and senators' roll call–specific rank. Confidence intervals account for heteroskedasticity and clustering of the residuals at the Congress level.

A. Model with Homogenous Senators (Baseline Model)

B. Model with 3 Types of Senators (Extended Model)

Figure A.12: Iterative History of δ in the MCMC Algorithm

Notes: Figure shows the iterative history of δ in our MCMC algorithm (thin line) and its posterior mean (thick line). The upper panel pertains to our baseline model, while the lower one refers to our extended structural model. All parallel chains consist of 120,000 iterations, of which the first 20,000 are discarded as burn-in period. For this figure we thin each chain by a factor of 1,000.

A. Share of Myopic Senators (π 1 )

B. Share of Backward Reasoning Senators (π2 )

C. Share of "Intermediate Type" (π 3 )

Figure A.13: Iterative History of Type Shares in the MCMC Algorithm

Notes: Figure shows the iterative history of π1, π2, and π3 in the MCMC algorithm for our extended model (thin line) and their posterior mean (thick line). All parallel chains consist of 120,000 iterations, of which the first 20,000 are discarded as burn-in period. For this figure we thin each chain by a factor of 1,000.

Figure A.14: Posterior Mean of Senators' Ideal Points, Allowing for 3 Types

Notes: Figure plots the posterior mean of senators' ideal points in our extended structural model against the left–right dimension of DW-Nominate scores. "D" and "R" respectively denote Democratic and Republican senators. Estimates are based on 1,000,000 MCMC draws.

Figure A.15: Posterior Distribution of δ in a Model without "Intermediate Type"

Notes: Figure shows the marginal posterior density of δ in a variant of our structural model with myopic types and backward inductors only (solid line) as well as the associated prior (dashed line). The thick line indicates the 90% highest posterior density region. Estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .1.

Figure A.16: Posterior Mean of Senators' Ideal Points, Allowing for Rational and Myopic Types Only

Notes: Figure plots the posterior mean of senators' ideal points in a variant of our structural model with myopic types and backward inductors only against the left-right dimension of DW-Nominate scores. "D" and "R" respectively denote Democratic and Republican senators. Estimates are based on 1,000,000 MCMC draws.

A. Posterior Share of Myopic Senators

B. Posterior Share of Backward Reasoning Senators

Figure A.17: Posterior of Type Shares in a Model without "Intermediate Types"

Notes: Figure shows the marginal posterior distributions for the type shares in a variant of our structural model with myopic types and backward inductors only (solid lines) as well as the associated priors (dashed lines). The upper panel refers to the share of myopic agents (π1), while the lower one pertains to the share of senators reasoning backwards (π2). All estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .005.

A. Baseline Model

B. Extended Structural Model

C. Model without "Intermediate" Type

Figure A.18: Posterior of β, Pooling over Roll Calls

Notes: Figure shows the marginal posterior distribution of β pooling over all roll calls during the 112th Congress (solid lines) as well as the associated priors (dashed lines). The upper panel refers to our baseline model, while the middle and lower panels pertain to our extended structural model and a variant thereof with myopic types and backward inductors only. All estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .1.

A. Baseline Model

B. Extended Structural Model

C. Model without "Intermediate" Type

Figure A.18: Posterior of γ, Pooling over Roll Calls

Notes: Figure shows the marginal posterior distribution of γ pooling over all roll calls during the 112th Congress (solid lines) as well as the associated priors (dashed lines). The upper panel refers to our baseline model, while the middle and lower panels pertain to our extended structural model and a variant thereof with myopic types and backward inductors only. All estimates are based on 1,000,000 MCMC draws and smoothed using a Gaussian kernel with a bandwidth of .1.

Figure A.20: Null Distribution for ψ in Equation (E.1)

Notes: Figure shows the empirical cumulative distribution function for placebo estimates of ψ in equation (E.1) under the null hypothesis of no herding (solid line) vis-à-vis the actual point estimate (dashed line). Estimates are based on 10,000 randomly generated placebo orderings.

Position Raw Mean Standard Error

1 .185 .0022 .193 .0023 .170 .0024 .182 .0025 .193 .0026 .187 .0027 .189 .0028 .197 .0029 .186 .00210 .194 .00211 .193 .00212 .204 .00213 .190 .00214 .192 .00215 .191 .00216 .198 .00217 .197 .00218 .194 .00219 .199 .00220 .189 .00221 .189 .00222 .184 .00223 .187 .00224 .191 .00225 .193 .00226 .193 .00227 .187 .00228 .198 .00229 .189 .00230 .195 .00231 .190 .00232 .192 .00233 .191 .00234 .182 .00235 .183 .00236 .181 .00237 .178 .00238 .179 .00239 .184 .00240 .184 .00241 .191 .00242 .188 .00243 .186 .00244 .191 .00245 .186 .00246 .191 .00247 .189 .00248 .184 .00249 .182 .00250 .179 .00251 .181 .00252 .181 .00253 .185 .00254 .177 .00255 .183 .00256 .180 .00257 .172 .00258 .167 .00259 .174 .00260 .176 .00261 .163 .00262 .168 .00263 .156 .00264 .160 .00265 .168 .00266 .168 .00267 .163 .00268 .169 .00269 .172 .00270 .166 .00271 .171 .00272 .183 .00273 .174 .00274 .184 .00375 .175 .00276 .170 .00277 .171 .00278 .177 .00379 .179 .00380 .177 .00381 .178 .00382 .189 .00383 .194 .00384 .187 .00385 .177 .00386 .184 .00387 .195 .00388 .186 .00389 .202 .00390 .197 .00391 .176 .00392 .181 .00393 .190 .00394 .179 .00395 .169 .00396 .160 .00397 .155 .00498 .141 .00499 .163 .005

100 .166 .007

Table A.1: Deviations from the Party Line, by Voting Position

Notes: Entries are mean defection rates at each position in the vote order together with the associated standard errors.

Roll Call Specific Order

Order Among All Senators N

Based on Majority of Fellow Party Members -.172*** -.221*** 2,897,879(.055) (.071)

Based on Vote of Party Leader -.367*** -.422*** 2,176,089(.087) (.108)

Based on Vote of Party Whip -.237** -.276** 2,251,000(.095) (.112)

Table A.2: Estimated Order Effects for Alternative Definitions of the Party Line

Notes: Entries in the center columns are point estimates and standard errors for λ, given different definitions of the party line. The respective definition is indicated on the left of each row. Estimates are based on equation (1) and either the roll call–specific order or the order of all senators who served in the chamber when a given vote was held. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. The column on the right shows the number of valid observations associated with each definition. Sample sizes vary because parties did not adopt today's leadership system until the early twentieth century. ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

Order Among All Senators N

All Roll Calls -.172*** -.221*** 2,897,879(.055) (.071)

10% Abstention or Less -.193** -.284*** 2,123,525(.075) (.093)

5% Abstention or Less -.295*** -.395*** 1,834,738(.087) (.101)

1% Abstention or Less -.416 -.435 205,258(.237) (.249)

Table A.3: Estimated Order Effects for Roll Calls with Little Abstention

Notes: Entries in the center columns are point estimates and standard errors for λ, by level of abstention. Estimates are based on equation (1) and either the roll call–specific order or the order of all senators who served in the chamber when a given vote was held. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. The column on the right shows the number of valid observations within each subsample. ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

A. Senate

(1) (2) (3) (4) (5) (6)

Alphabetical Rank -.221*** -.511** -.100* -.357* -.383*** -.692*(.071) (.257) (.054) (.199) (.126) (.358)

Controls:Senator Fixed Effects Yes No Yes No Yes NoSenator × Congress Fixed Effects No Yes No Yes No Yes

B. House of Representatives

(7) (8) (9) (10) (11) (12)

Alphabetical Rank -.290 .304 -.410 -.338 -.104 .973(.205) (.928) (.248) (1.435) (.262) (.974)

Controls:Senator Fixed Effects Yes No Yes No Yes NoSenator × Congress Fixed Effects No Yes No Yes No Yes

R-Squared .047 .060 .029 .047 .144 .163Number of Observations 9,618,470 9,618,470 5,230,083 5,230,083 4,388,387 4,388,387Notes: Entries are coefficients and standard errors from estimating equation (1) by ordinary least squares. The upper panel does so for the U.S. Senate, while the entries in the lower panel refer to the House of Representatives after the introduction of electronic voting machines. Estimates are based on legislators' alphabetical rank among everyone who served in the chamber at the time the vote was held. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. As explained in the main text, roll calls are classified as "close" or "lopsided" according to the cutoffs in Snyder and Groseclose (2000). ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

Table A.4: Deviations from the Party Line as a Function of Alphabetical Rank among All Members of the Chamber

Deviate

Sample

Deviate

Sample

Order Among All Senators

Baseline -.172*** -.221*** 2,897,879(.055) (.071)

Split Minority .013 .041 1,414,016(.046) (.065)

By Majority Party's Seat Advanatge:1 or 2 Seats .075 .216 251,203

(.079) (.180)

3 to 5 Seats -.115 -.113 202,566(.108) (.137)

6 to 10 Seats -.180* -.280* 877,082(.104) (.141)

11 to 20 Seats -.182 -.372*** 720,202(.114) (.128)

> 20 Seats .007 .094 772,785(.078) (.120)

Table A.5: Reduced-Form Comparative Statics

Notes: Entries in the center columns are point estimates and standard errors for λ in different subsamples of the data. The respective restriction is indicated on the left of each row. Estimates are based on equation (1) and either the roll call–specific order or the order of all senators who served in the chamber when a given vote was held. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. The column on the right lists the number of observations in each subsample. ***, **, and * denote statistical significance at the 1%-, 5%-, and 10%-levels, respectively.

F-Test F-Test λ [p -value] λ [p -value]

Baseline -.172*** -.221***(.055) (.071)

By Party Memebrship:Majority Party -.180*** -.231***

(.054) (.072)Minority Party -.163*** -.214***

(.056) (.073)By Age:

< 50 Years -.161*** -.206***(.053) (.069)

50 to 65 Years -.160*** -.204***(.053) (.069)

> 65 Years -.160*** -.204***(.055) (.071)

By Gender:Male -.171*** -.220***

(.055) (.072)Female -.235* -.278*

(.133) (.151)

By Educational Attainment:Less than College -.267*** -.340***

(.077) (.107)College Educated -.139** -.180**

(.059) (.075)

By Closeness to Next Election:End of Term -.179*** -.229***

(.056) (.072)Not End of Term -.170*** -.220***

(.054) (.071)

Notes: Entries in the center column are point estimates and standard errors for λ in different subsamples of the data. The respective restriction is indicated on the left of each row. Heteroskedasticity robust standard errors are clustered by Congress and reported in parentheses. Within each panel, the rightmost column displays p -values from an F-test for equality of coefficients.

A. Roll Call Specific Order B. Order Among All Senators

Table A.6: Order Effects, by Senators' Characteristics

A. 500 Roll Calls

Valence (α) 0 .05σ .1σ .15σ0 .045 .269 .623 .902

.1σ .055 .227 .621 .871

.2σ .040 .197 .479 .756

.3σ .052 .159 .350 .526

B. 1,000 Roll Calls

Valence (α) 0 .05σ .1σ .15σ0 .042 .409 .877 .996

.1σ .044 .394 .845 .993

.2σ .042 .292 .740 .954

.3σ .056 .234 .538 .759

C. 10,000 Roll Calls

Valence (α) 0 .05σ .1σ .15σ0 .046 .997 1 1

.1σ .048 .997 1 1

.2σ .044 .975 1 1

.3σ .045 .899 1 1Notes: Entries denote the fraction of Monte Carlo simulations for which the null hypothesis of no herding is rejected at the 5%-significance level. Each entry is based on a 1,000 simulations. For details on the data generating process and the estimation procedure, see Appendix F.

Table A.7: Monte Carlo Rejection Rates

Herding Effect (θ)

S1 S2 S3 S4 S5 S1 S2 S3 S4 S5 S1 S2 S3 S4 S5Y Y Ỹ Ỹ Ỹ y y n n y 0 0 1 1 0Y Y Ỹ Ỹ Ñ y y n y y 0 0 1 0 1Y Y Ỹ Ỹ N y y n y n 0 0 1 0 0Y Y Ỹ Ñ Ỹ y y n y n 0 0 1 1 1Y Y Ỹ N Ỹ y y n n y 0 0 1 0 0Y Y Ñ Ỹ Ỹ y y y n n 0 0 1 1 1Y Y N Ỹ Ỹ y y n n y 0 0 0 1 0Y Ỹ Y Ỹ Ỹ y n y n y 0 1 0 1 0Y Ỹ Y Ỹ Ñ y n y y y 0 1 0 0 1Y Ỹ Y Ỹ N y n y y n 0 1 0 0 0Y Ỹ Y Ñ Ỹ y n y y n 0 1 0 1 1Y Ỹ Y N Ỹ y n y n y 0 1 0 0 0Y Ỹ Ỹ Y Ỹ y n n y y 0 1 1 0 0Y Ỹ Ỹ Y Ñ y n y y y 0 1 0 0 1Y Ỹ Ỹ Y N y n y y n 0 1 0 0 0Y Ỹ Ỹ Ỹ Y y n n y y 0 1 1 0 0Y Ỹ Ỹ Ỹ Ỹ y n n y y 0 1 1 0 0Y Ỹ Ỹ Ỹ Ñ y n y y y 0 1 0 0 1Y Ỹ Ỹ Ỹ N y n y y n 0 1 0 0 0Y Ỹ Ỹ Ñ Y y n y y y 0 1 0 1 0Y Ỹ Ỹ Ñ Ỹ y n y y n 0 1 0 1 1Y Ỹ Ỹ N Y y n y n y 0 1 0 0 0Y Ỹ Ỹ N Ỹ y n y n y 0 1 0 0 0Y Ỹ Ñ Y Ỹ y n y y n 0 1 1 0 1Y Ỹ Ñ Ỹ Y y n y n y 0 1 1 1 0Y Ỹ Ñ Ỹ Ỹ y n y n y 0 1 1 1 0Y Ỹ N Y Ỹ y n n y y 0 1 0 0 0Y Ỹ N Ỹ Y y n n y y 0 1 0 0 0Y Ỹ N Ỹ Ỹ y n n y y 0 1 0 0 0Y Ñ Y Ỹ Ỹ y y y n n 0 1 0 1 1Y Ñ Ỹ Y Ỹ y y n y n 0 1 1 0 1Y Ñ Ỹ Ỹ Y y y n n y 0 1 1 1 0Y Ñ Ỹ Ỹ Ỹ y y n n y 0 1 1 1 0Y N Y Ỹ Ỹ y n y n y 0 0 0 1 0Y N Ỹ Y Ỹ y n n y y 0 0 1 0 0Y N Ỹ Ỹ Y y n n y y 0 0 1 0 0Y N Ỹ Ỹ Ỹ y n n y y 0 0 1 0 0Ỹ Y Y Ỹ Ỹ n y y n y 1 0 0 1 0Ỹ Y Y Ỹ Ñ n y y y y 1 0 0 0 1Ỹ Y Y Ỹ N n y y y n 1 0 0 0 0Ỹ Y Y Ñ Ỹ n y y y n 1 0 0 1 1Ỹ Y Y N Ỹ n y y n y 1 0 0 0 0Ỹ Y Ỹ Y Ỹ n y n y y 1 0 1 0 0Ỹ Y Ỹ Y Ñ n y y y y 1 0 0 0 1Ỹ Y Ỹ Y N n y y y n 1 0 0 0 0Ỹ Y Ỹ Ỹ Y n y n y y 1 0 1 0 0Ỹ Y Ỹ Ỹ Ỹ n y n y y 1 0 1 0 0Ỹ Y Ỹ Ỹ Ñ n y y y y 1 0 0 0 1Ỹ Y Ỹ Ỹ N n y y y n 1 0 0 0 0Ỹ Y Ỹ Ñ Y n y y y y 1 0 0 1 0Ỹ Y Ỹ Ñ Ỹ n y y y n 1 0 0 1 1Ỹ Y Ỹ N Y n y y n y 1 0 0 0 0Ỹ Y Ỹ N Ỹ n y y n y 1 0 0 0 0Ỹ Y Ñ Y Ỹ n y y y n 1 0 1 0 1Ỹ Y Ñ Ỹ Y n y y n y 1 0 1 1 0Ỹ Y Ñ Ỹ Ỹ n y y n y 1 0 1 1 0Ỹ Y N Y Ỹ n y n y y 1 0 0 0 0Ỹ Y N Ỹ Y n y n y y 1 0 0 0 0Ỹ Y N Ỹ Ỹ n y n y y 1 0 0 0 0Ỹ Ỹ Y Y Ỹ n n y y y 1 1 0 0 0Ỹ Ỹ Y Y Ñ n y y y y 1 0 0 0 1Ỹ Ỹ Y Y N n y y y n 1 0 0 0 0Ỹ Ỹ Y Ỹ Y n n y y y 1 1 0 0 0Ỹ Ỹ Y Ỹ Ỹ n n y y y 1 1 0 0 0Ỹ Ỹ Y Ỹ Ñ n y y y y 1 0 0 0 1Ỹ Ỹ Y Ỹ N n y y y n 1 0 0 0 0

Table A.8: Preference Configurations and Deviations with S=5

Preference Configuration Equilibrium Votes Deviation

Ỹ Ỹ Y Ñ Y n y y y y 1 0 0 1 0Ỹ Ỹ Y Ñ Ỹ n y y y n 1 0 0 1 1Ỹ Ỹ Y N Y n y y n y 1 0 0 0 0Ỹ Ỹ Y N Ỹ n y y n y 1 0 0 0 0Ỹ Ỹ Ỹ Y Y n n y y y 1 1 0 0 0Ỹ Ỹ Ỹ Y Ỹ n n y y y 1 1 0 0 0Ỹ Ỹ Ỹ Y Ñ n y y y y 1 0 0 0 1Ỹ Ỹ Ỹ Y N n y y y n 1 0 0 0 0Ỹ Ỹ Ỹ Ỹ Y n n y y y 1 1 0 0 0Ỹ Ỹ Ỹ Ỹ Ỹ n n y y y 1 1 0 0 0Ỹ Ỹ Ỹ Ỹ Ñ n y y y y 1 0 0 0 1Ỹ Ỹ Ỹ Ỹ N n y y y n 1 0 0 0 0Ỹ Ỹ Ỹ Ñ Y n y y y y 1 0 0 1 0Ỹ Ỹ Ỹ Ñ Ỹ n y y y n 1 0 0 1 1Ỹ Ỹ Ỹ N Y n y y n y 1 0 0 0 0Ỹ Ỹ Ỹ N Ỹ n y y n y 1 0 0 0 0Ỹ Ỹ Ñ Y Y n y y y y 1 0 1 0 0Ỹ Ỹ Ñ Y Ỹ n y y y n 1 0 1 0 1Ỹ Ỹ Ñ Ỹ Y n y y n y 1 0 1 1 0Ỹ Ỹ Ñ Ỹ Ỹ n y y n y 1 0 1 1 0Ỹ Ỹ N Y Y n y n y y 1 0 0 0 0Ỹ Ỹ N Y Ỹ n y n y y 1 0 0 0 0Ỹ Ỹ N Ỹ Y n y n y y 1 0 0 0 0Ỹ Ỹ N Ỹ Ỹ n y n y y 1 0 0 0 0Ỹ Ñ Y Y Ỹ n y y y n 1 1 0 0 1Ỹ Ñ Y Ỹ Y n y y n y 1 1 0 1 0Ỹ Ñ Y Ỹ Ỹ n y y n y 1 1 0 1 0Ỹ Ñ Ỹ Y Y n y n y y 1 1 1 0 0Ỹ Ñ Ỹ Y Ỹ n y n y y 1 1 1 0 0Ỹ Ñ Ỹ Ỹ Y n y n y y 1 1 1 0 0Ỹ Ñ Ỹ Ỹ Ỹ n y n y y 1 1 1 0 0Ỹ N Y Y Ỹ n n y y y 1 0 0 0 0Ỹ N Y Ỹ Y n n y y y 1 0 0 0 0Ỹ N Y Ỹ Ỹ n n y y y 1 0 0 0 0Ỹ N Ỹ Y Y n n y y y 1 0 0 0 0Ỹ N Ỹ Y Ỹ n n y y y 1 0 0 0 0Ỹ N Ỹ Ỹ Y n n y y y 1 0 0 0 0Ỹ N Ỹ Ỹ Ỹ n n y y y 1 0 0 0 0Ñ Y Y Ỹ Ỹ y y y n n 1 0 0 1 1Ñ Y Ỹ Y Ỹ y y n y n 1 0 1 0 1Ñ Y Ỹ Ỹ Y y y n n y 1 0 1 1 0Ñ Y Ỹ Ỹ Ỹ y y n n y 1 0 1 1 0Ñ Ỹ Y Y Ỹ y n y y n 1 1 0 0 1Ñ Ỹ Y Ỹ Y y n y n y 1 1 0 1 0Ñ Ỹ Y Ỹ Ỹ y n y n y 1 1 0 1 0Ñ Ỹ Ỹ Y Y y n n y y 1 1 1 0 0Ñ Ỹ Ỹ Y Ỹ y n n y y 1 1 1 0 0Ñ Ỹ Ỹ Ỹ Y y n n y y 1 1 1 0 0Ñ Ỹ Ỹ Ỹ Ỹ y n n y y 1 1 1 0 0N Y Y Ỹ Ỹ n y y n y 0 0 0 1 0N Y Ỹ Y Ỹ n y n y y 0 0 1 0 0N Y Ỹ Ỹ Y n y n y y 0 0 1 0 0N Y Ỹ Ỹ Ỹ n y n y y 0 0 1 0 0N Ỹ Y Y Ỹ n n y y y 0 1 0 0 0N Ỹ Y Ỹ Y n n y y y 0 1 0 0 0N Ỹ Y Ỹ Ỹ n n y y y 0 1 0 0 0N Ỹ Ỹ Y Y n n y y y 0 1 0 0 0N Ỹ Ỹ Y Ỹ n n y y y 0 1 0 0 0N Ỹ Ỹ Ỹ Y n n y y y 0 1 0 0 0N Ỹ Ỹ Ỹ Ỹ n n y y y 0 1 0 0 0

Average Deviation Rate: .619 .429 .333 .270 .222Notes: The three left-most columns list all possible preference configurations for S=5 in which |Y| < ȳ < |Y| + |Ỹ| . As in the main text, Y (N) indicates that a senator always vote "yea" ("nay"). Ỹ (Ñ) indicates that a senator would vote "yea" ("nay") if and only if doing so would change the outcome of the roll call. The three columns in the middle show agents' equilibrium choices, with y (n ) implying that a senator votes "yea" ("nay"). The three right-most columns indicate whether an agent deviates from her instrumental preferences. As explained in Appendix A.5, we say that a senator deviates if either i∈Ỹ and she votes "nay" or i∈Ñ and she says "yea."

Table A.8 continued

Backward Induction in the Wild? Evidence from Sequential ......in which forward-looking senators...

Documents

3.2.2. Dynamic Games of complete information: Backward Induction and Subgame perfection

Index [jeffe.cs.illinois.edu]jeffe.cs.illinois.edu/teaching/algorithms/book/99-backmatter.pdf · summary of past decisions, backward induction, see dynamic programming Baguenaudier,

Microeconomics II Lecture 2: Backward induction and subgame …€¦ · backward induction, i.e., by starting by contemplating the last stage of the game and requiring that all earlier

Senators 2013

Uncertainty Aversion and Backward Inductionfm · subgame perfection and show that the backward induction outcome breaks down in the subgame-perfect equilibrium of the centipede game,

Extensive Form Games: Backward Induction and Imperfect

BRANCHING TIME, PERFECT INFORMATION GAMES ...faculty.econ.ucdavis.edu/faculty/bonanno/PDF/bi_logic.pdfBRANCHING TIME, PERFECT INFORMATION GAMES AND BACKWARD INDUCTION Giacomo Bonanno¤

Checkmate: Exploring Backward Induction among Chess …rady.ucsd.edu/docs/...Checkmate_Backward_Induction... · VOl. 101 NO. 2 lEVITT ETAl.: BACkWARd INduCTION 977 strategies. Behavior

AN APPROXIMATE DYNAMIC PROGRAMMING ......For a small, tractable problem, the backward dynamic programming (BDP) algorithm (also known as backward induction or nite{horizon value iteration)

Chapter 9 Backward Induction - MIT OpenCourseWare 9 Backward Induction We now start analyzing the dynamic games with complete information. These notes focus on the perfect-information

Chapter 3 : Dynamic Game Theory....Nash equilibrium of a dynamic game. This is by using the principle of backward induction. b. Backward Induction. Backward induction is the principle

Senators Phil

Approximation Algorithms for the Stochastic Lot-sizing ... · PDF fileApproximation Algorithms for the Stochastic ... graph algorithms, ... They proposed backward induction algorithms

Extensive Form Games: Backward Induction and …kevinlb/teaching/cs532l - 2008-9...RecapBackward InductionImperfect-Information Extensive-Form GamesPerfect Recall Lecture Overview

2.3. Value of Information: Decision Trees and Backward Induction

3.2.2. Dynamic Games of complete information: Backward Induction and Subgame perfection - Credibility and Commitment -

Backward Induction with Players who Doubt Others

Cooperation in the Finitely Repeated Prisoner’s Dilemmaecon.ucsb.edu/~sevgi/EmbreyFrechetteYuksel_Jan2017.pdfhow backward induction prevails in nitely repeated games. However, it

How Backward are the Other Backward Classes? Changing ... · How Backward are the Other Backward Classes? Changing Contours ... How Backward are the Other Backward ... hardly any

Backward induction or forward reasoning? An experiment of ... · and question backward induction (3) by not observing its necessary information retrieval. In an additional treatment,