
Theory Dec. (2012) 72:205–219
DOI 10.1007/s11238-011-9247-6

Infinity in the lab. How do people play repeated games?

Lisa Bruttel · Ulrich Kamecke

Published online: 13 April 2011
© Springer Science+Business Media, LLC. 2011

Abstract We introduce a novel mechanism to eliminate endgame effects in repeated prisoner’s dilemma experiments. In the main phase of a supergame our mechanism generates more persistent cooperation than finite horizon or random continuation rules. Moreover, we find evidence for cooperation-enhancing “active/reactive” strategies which concentrate in the initial phase of a supergame as subjects gain experience.

Keywords Prisoner’s dilemma · Infinite horizon · Experiment · Continuation rule

JEL Classification C91 · D80

Electronic supplementary material The online version of this article (doi:10.1007/s11238-011-9247-6) contains supplementary material, which is available to authorized users.

L. Bruttel (B)
Department of Economics, Universität Konstanz, Box 131, 78457 Constance, Germany
e-mail: [email protected]

U. Kamecke
Department of Business and Economics, Humboldt-Universität zu Berlin, Spandauer Str. 1, 10099 Berlin, Germany
e-mail: [email protected]

1 Introduction

Many economic games such as the prisoner’s dilemma, trust games as well as standard oligopoly games model fundamental social situations in which individual rationality and selfishness prevent players from reaching an efficient solution. However, if these one-shot games are repeated, the conflict between individual and social payoff maximization often changes in nature, because the threat of adverse reactions may stabilize the efficient solution.

This argument has considerable explanatory power. Experimental evidence1 and the historic example2 of the Maghreb traders show that people are well aware of the reactions which they trigger with noncooperative behavior. As a consequence, the outcome of repeated social interaction becomes systematically more cooperative even though the noncooperative stage game outcome maintains part of its descriptive appeal. Formally speaking, the above-mentioned conflict becomes an equilibrium selection problem.

Game theory has two problems when explaining such situations. First, the conflict between individual and social payoff maximization is much more evident in game models than in reality, because theoretically it can only be solved if the game is repeated infinitely often,3 which can never happen in reality.4 Second, game theory does not offer convincing solution principles for the resulting equilibrium selection problem. The development of a theory of the evolution of cooperation in repeated social interaction is an ongoing task to which we want to contribute in this paper.

Experimental research has contributed several insights to the theory of repeated social interaction. Experiments confirm that repetitions indeed increase cooperation rates in simple strategic situations such as the prisoner’s dilemma (Dal Bó 2005) or trust games (Bohnet and Huck 2004). According to Selten et al. (1997), participants distinguish between three phases of a repeated game. In the initial phase they try to learn the cooperativeness of their opponents, while revealing (or hiding) their own attitude or intention. If they manage to start cooperation, they go on playing a main phase of the game as if the experiment continued forever, using some type of stationary stick-and-carrot strategy which depends on a few characteristics of the present state of the game but certainly not on the whole history. In this main phase the experimenter usually finds a considerable and stable share of cooperative outcomes. Only during the end phase of the game do participants return to the noncooperative stage game strategies. This is also found in Normann and Wallace (2010), who compare treatments with a (known or unknown) finite horizon and with a random continuation rule. They cannot find any systematic treatment effect during the main phase of the repeated game. Only in the last rounds of the treatments with a known final round do participants behave systematically differently from the treatment with random continuation.

1 See, for instance, Normann and Wallace (2010), Dal Bó (2005) or Camera and Casari (2009).
2 See Greif (1993).
3 Or if the stage game has more than one equilibrium, which is not the case for many of the standard stage games considered here.
4 Many experimenters use random continuation rules to circumvent the problem. However, the credibility of a constant continuation probability is difficult to sustain, because the participants never spend more than a few hours in the lab. The number of rounds cannot exceed a practical upper limit. In other words, realistic continuation probabilities must eventually drop to zero.


In finitely repeated games the final breakdown of cooperation makes it difficult to study the details of the cooperative main phase, because it is complicated to locate the exact border between the main and the end phase. In particular, the distinction of the two phases becomes difficult in repeated supergames as the endgame effects may unravel when the players gain experience.5 Selten and Stoecker (1986) show how participants learn to anticipate the endgame effect. The final deviation moves to earlier rounds when the supergame is repeated several times.6 In our study we avoid this kind of endgame effect by eliminating the end phase of the game.

Our new mechanism to eliminate endgame effects is related to the strategy method introduced by Axelrod (1984). Axelrod invited “the profession” to submit computer programs playing the infinitely repeated prisoner’s dilemma. As the computer can determine the payoff of the infinitely repeated game, this approach eliminates all endgame considerations. In our experiment, we move Axelrod’s strategy approach closer to the interactive decision making in an ongoing infinitely repeated game. First, we let the subjects play three opening rounds of a standard prisoner’s dilemma game, and then we extract state-dependent Markov strategies which are used to compute the payoffs for the remaining infinitely repeated prisoner’s dilemma game. This procedure eliminates endgame effects arising from a known finite end of the game, enabling us to study the alleged “more cooperative play” during the main phase of a repeated prisoner’s dilemma game in isolation. In order to relate the behavior of our participants to the literature on repeated prisoner’s dilemma games, we compare our two main treatments to supergames with a known end and a random continuation version of the game.

For our strategy extraction we distinguish two different methods. In the Strategy treatment, we let the participants choose between “cooperate” (c) and “defect” (d) for three rounds. After that they have to submit a general decision rule conditioning on the four possible states (c, c), (c, d), (d, c) or (d, d). This strategy, combined with the observed play in the first three rounds, is then used to compute the resulting sequence of decisions and the resulting payoff of the infinitely repeated game.
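The extraction step can be sketched as a small simulation. The representation below (a dictionary mapping last-round states to actions) and the helper names are our own illustration, not the software used in the experiment:

```python
# Sketch: encode a submitted decision rule as a map from the previous
# state (own action, other's action) to the next action, and roll play
# forward deterministically from round 4 on. 'c' = cooperate, 'd' = defect.

def continue_play(opening_a, opening_b, rule_a, rule_b, n_rounds):
    """Extend the three observed opening rounds using each player's
    submitted Markov rule, which conditions on the previous state."""
    hist = list(zip(opening_a, opening_b))
    while len(hist) < n_rounds:
        state = hist[-1]                  # (A's action, B's action)
        a = rule_a[state]
        b = rule_b[(state[1], state[0])]  # B sees the mirrored state
        hist.append((a, b))
    return hist

# Two of the rules discussed in the text, written as four-state maps:
tit_for_tat = {('c','c'): 'c', ('c','d'): 'd', ('d','c'): 'c', ('d','d'): 'd'}
grim        = {('c','c'): 'c', ('c','d'): 'd', ('d','c'): 'd', ('d','d'): 'd'}

play = continue_play(['c','c','c'], ['c','c','c'], tit_for_tat, grim, 10)
print(play[-1])  # ('c', 'c'): mutual cooperation persists
```

Since the rules condition only on one of four states, the generated sequence necessarily enters a cycle, so the discounted payoff of the infinite continuation is computable in finite time.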

The strategy approach cuts the supergame into two phases with “hot” and “cold” decision making. To control for the consequences of such a structural break, we conduct another treatment called Moore where we employ an implicit strategy extraction method. Participants of this treatment continue to select “cooperate” or “defect” after the third round until identical decisions reoccur. As soon as the two participants reach one of the four possible states (c, c), (c, d), (d, c), or (d, d) for the second time, the computer program computes the payoff of the infinitely repeated game as if the two participants repeated the observed cycle in between forever. This interpretation of the behavior is equivalent to the extraction of a one- or a two-state (incomplete) Moore machine,7 and it is therefore closely related to the method introduced by Engle-Warnick and Slonim (2004, 2006) to analyze the behavior in a repeated trust experiment.8 Similarly, Camera et al. (2010) use two-state finite automata to classify strategies in a prisoner’s dilemma experiment.9

5 Dal Bó (2005) has a similar design but considers only experiments with an expected duration of four rounds at most, so that he eliminates the main phase of the game. He comes to the opposite conclusion that there is in fact less cooperation under a finite horizon than under a random continuation rule.
6 This result has been confirmed by Engle-Warnick and Slonim (2006, p. 621), who claim that “strategies inferred over time evolve consistently with best response behavior toward the unique non-cooperative equilibrium in the finite game and the maximally cooperative outcome in the indefinite game.” Andreoni and Miller (1993), on the other hand, observe shorter endgames over the course of the experiment.
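The implied payoff computation can be illustrated as follows. The function name and the example sequence are hypothetical; the stage payoffs and discount factor anticipate the design of Sect. 2 (Table 1, δ = 0.8):

```python
# Sketch of the Moore payoff rule: once a state recurs, the rounds in
# between form a cycle that is assumed to repeat forever, so the
# discounted payoff of the infinite game is a finite computation.

def moore_payoff(states, payoff, delta=0.8):
    """states: observed (own, other) states from round 1 on, ending with
    the first state that occurs for the second time. Returns the
    discounted payoff of the pre-cycle rounds plus the infinite cycle."""
    first = {}
    for t, s in enumerate(states):
        if s in first:                    # cycle covers rounds first[s]..t-1
            start, k = first[s], t - first[s]
            pre = sum(delta**i * payoff[states[i]] for i in range(start))
            cyc = sum(delta**(start + i) * payoff[states[start + i]]
                      for i in range(k))
            return pre + cyc / (1 - delta**k)
        first[s] = t

pay = {('c','c'): 65, ('c','d'): 10, ('d','c'): 100, ('d','d'): 35}
# e.g. a free ride in round 1, then mutual cooperation until (c,c) recurs:
print(moore_payoff([('d','c'), ('c','c'), ('c','c')], pay))  # 360.0
```

The cycle term uses the geometric-series identity: a cycle of length k, discounted once at its first occurrence, is repeated with weight 1/(1 − δ^k).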

Selten and Stoecker (1986) generate stronger learning effects in a finitely repeated prisoner’s dilemma game by reinviting the same participants for a second session.10 To check whether such long-run learning plays a role in our context, we implement a repetition of two of our treatments with experienced subjects. In the treatment with a known finite end we expect to replicate the unraveling endgame effect. In the Moore treatment we expect more systematic behavior of experienced participants, because they might better understand the strategy extraction method.

The results of our treatment with a known finite horizon confirm that cooperation collapses by the end of a supergame, that the point in time where it collapses moves to earlier rounds in later repetitions, and that it does so much more drastically with experienced subjects. With random continuation we find decreasing cooperation rates within supergames, which may result from the participants’ doubts about the stopping probability being the same after each round of a supergame. Our new mechanisms, on the other hand, produce approximately constant cooperation rates during the main phase and thus seem appropriate to overcome these problems. Moreover, we detect a systematic pattern in behavior trying to establish mutually cooperative behavior in the initial phase of the repeated game. Several participants do not accept a breakdown of cooperation after the first round of a session. Instead of reacting with the expected grim or tit-for-tat strategy, they try to reestablish cooperation before they continue with defection when the game switches into the “eternity” mode. As a consequence, participants in the Moore treatment reduce their cooperation rates significantly when it comes to the second phase of the game, in which the decisions “for eternity” are made. With experienced players, the new mechanism produces clearer results.

Our paper is structured as follows. In Sect. 2, we introduce the design and procedures of the experiment. Some behavioral predictions are discussed in Sect. 3. In Sect. 4, we discuss the experimental results. Section 5 concludes.

7 Note that the set of Moore machines with the two states ‘c’ and ‘d’ is equivalent to the set of our simple Markov strategies.
8 Engle-Warnick and Slonim (2004, 2006) use Moore machine strategies as an analytical tool, but they do not incorporate them in the experimental design.
9 Deviations from the simple action plans prescribed by these automata are treated as random errors in their paper. Compared to their approach, we go one step further by actually restricting choices such that random errors are not possible.
10 The suspicion that learning effects occur overnight rather than within one experimental session is in line with the results of Brosig et al. (2007). In their experiment they observe the behavior of the same subjects in different dictator and prisoner’s dilemma games over three waves of experimental sessions, which take place on different days, and they also find more selfish (rational) behavior by reinviting participants for second and third sessions.


Table 1  Payoff matrix

             Defect     Cooperate
Defect       35; 35     100; 10
Cooperate    10; 100    65; 65

2 Design and procedures

2.1 Design

In our experiment we compare four treatments in which subjects play the prisoner’s dilemma game from Table 1 repeatedly.11

The treatments differ with respect to the rule by which the number of repetitions is determined and in their intertemporal discount factors. In treatment Known, we have a finite horizon of seven rounds in each supergame. Profits are discounted with factor 0.8855.12 In treatment Random, we consider a random continuation rule with 20% termination probability. This is equivalent to a discount factor of 0.8. In treatments Moore and Strategy we consider our new mechanism, once with an implicitly learning computer (Moore) and once with explicit programming of the continuation strategies (Strategy). In these new mechanisms the stage game is repeated forever, but participants do not need to enter their decisions themselves all the time. In both treatments the participants play the prisoner’s dilemma game for three initial rounds13 before the strategies for future play are programmed. In Strategy the programming is explicitly done by the participants after the third round of each supergame. They enter the way they want to react to the four situations [(cooperate, cooperate); (cooperate, defect); (defect, cooperate); (defect, defect)] into the computer. In Moore the computer learns how a participant reacts in these situations. The participants play three initial rounds in which the computer does not interpret their behavior. From the fourth round on, the computer assumes that the participants will react to a known situation in the same way as before. As soon as a participant faces a situation for the second time, the computer program has learned enough to calculate the final outcomes, because Markovian players will continue the “cycle” of play between the two identical situations forever.14 In this treatment the exact number of rounds in which participants enter their decisions is not determined in advance but depends on their play. The discount factor in treatments Moore and Strategy is 0.8.

11 In the experimental instructions (for an English translation see the online appendix of this paper) we use a neutral labeling for the choices.
12 We calibrated the discount factor in Known such that uniform cooperative or defective play would yield the same payoff per supergame as in the other three treatments with infinite repetition of the stage game. We chose seven rounds as the horizon for a better comparison with the two versions of the new mechanism.
13 We chose three as the number of rounds of the initial phase because this was the average duration of the initial phase in the strategies programmed by the participants in the experiment of Selten et al. (1997).
14 This mechanism will in general not detect a complete strategy, as some situations in the prisoner’s dilemma may never be reached before the cycle is concluded.
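The calibration in footnote 12 can be checked directly: with identical stage payoffs in every round, only the discount weights matter, and seven rounds at factor 0.8855 carry almost exactly the same total weight as an infinite horizon at factor 0.8:

```python
# Verify footnote 12: sum of discount weights over seven rounds at 0.8855
# vs. the infinite-horizon weight at 0.8, which is 1 / (1 - 0.8) = 5.
delta_known, delta_inf, horizon = 0.8855, 0.8, 7

finite_weight = sum(delta_known**t for t in range(horizon))
infinite_weight = 1 / (1 - delta_inf)

print(round(finite_weight, 3), infinite_weight)  # 5.005 5.0
```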


Table 2  Number of participants per treatment

                          Known           Random    Moore           Strategy
                          1–20    21–40             1–20    21–40
Participants              24      24      36        48      24      24
Independent observations  4       4       6         8       4       4

2.2 Procedures

The sessions took place in the computer laboratory at Humboldt-University Berlin and in the lakelab, the laboratory for experimental economics at the University of Konstanz. Participants were recruited from undergraduate classes of different subjects. The experiment was conducted using Fischbacher’s z-Tree software (2007). The sessions in Berlin included 12 subjects, those in Konstanz 24 subjects at a time. Table 2 summarizes the number of participants in each treatment. In total, 134 subjects participated in the experiment, 46 of them twice: in some sessions of treatments Known and Moore we invited participants to a second session.15

All sessions lasted about 1 h, including the time for reading the instructions and answering a questionnaire after the experiment, and the participants earned on average 12.98 Euros.16 Subjects participated in 20 supergames of the same treatment. In all treatments participants were randomly rematched with another opponent after each supergame. The participants of one session were matched within independent groups of six participants. Participants were not informed about the division into groups. The instructions allude to different opponents in different supergames.

3 Behavioral hypotheses

We take the distinction between an initial phase, a main phase, and an end phase as the starting point of our analysis. In the game of Selten et al. (1997), the initial phase lasts two to four rounds. We therefore think that the average of three rounds is a good approximation of the average duration of the initial phase not only in Moore and Strategy but also in Known and Random. According to Selten et al. (1997), the end phase lasts two rounds. This is mainly relevant in treatment Known with a finite duration. Consequently, the main phase of Known comprises rounds 4 and 5. In Random there is no clear distinction between the main and the end phase because the final round is not known. In Moore and Strategy, all rounds after round 3 are part of the main phase. By construction of these two treatments there is no end phase.

15 In these sessions of Known we invited 12 participants who had previously gained experience with Known. Another 12 participants had gained experience with any of our treatments. In Moore we had 22 participants out of 24 showing up for the experienced session in Konstanz. Therefore, we had to allow for 2 additional participants, who had already gained experience in other prisoner’s dilemma experiments but not in our specific experiment.
16 Participants in the Moore session with experienced participants received an additional showup fee of 10 Euros.


In the following, we will derive hypotheses about the behavior in these three phases of a supergame. Like Selten et al. (1997), we argue that decisions in the initial phase are predetermined and depend on the cooperativeness of the player in question rather than on the decisions of the opponent. In the main phase this relation reverses. Here, we expect that both decisions of the pair of players in the previous round determine their play in the next round. In the end phase, we expect decreasing cooperation rates toward the end of a supergame.

We expect players to use the initial phase to “get to know” each other. Participants will open the supergame by playing either c or d. In the next round, they will most likely continue to cooperate if both started with c. If one or both of them defected in the first round, they may continue with mutual defection, but we also expect some “active/reactive” behavior in which participants try to manipulate the opponent’s behavior in order to reach a favorable outcome. An example of “active/reactive” behavior would be an attempt to reach long-run cooperation with, for instance, a tit-for-tat opponent. This requires an unexpected cooperative move to trigger cooperative play, that is, a choice of c after (c, d) or (d, d), followed by a second unexpected c in the following round while waiting for the other’s reaction. Additionally, the opponent has to anticipate the second unexpected c and must be willing to follow the other’s initiative to coordinate at (c, c). In this way “active/reactive” decisions generate strategic behavior in which the decision in round t depends not only on the state in t − 1 but also on that in t − 2. Even though this behavior regularly leads to lower payoffs (as a large fraction of opponents do not reciprocate cooperation), it seems to appear consistently in repeated games.17

However, we also expect that such attempts to establish cooperation will be given up after a while if the opponent persistently defects. We therefore expect this behavior to concentrate in the initial phase of the supergame. To summarize:

Hypothesis 1 (Initial phase) During the initial phase players will use “active/reactive” decisions to establish cooperative choices. After the initial phase, there will be a drop in average cooperation rates, because players give up “active/reactive” behavior if it has not established mutual cooperation so far.

If this hypothesis holds, we should see unexpected cooperation in rounds 2 and 3, and the (c, d) and (d, c) states should concentrate in the first few rounds. With experience, players will learn that “active/reactive” behavior is either immediately successful (if the opponent is a tit-for-tat player) or never successful (if the opponent systematically defects). Thus, (c, d) and (d, c) states will become rare in the main phase of a supergame and concentrate in the initial phase.

After the initial phase of the supergame we expect a main phase to begin in which individual decision making can be described by a quadruple σ ∈ {c, d}^4 which assigns a decision to each of the four possible states: mutual cooperation s1 = (c, c), coordination failure s2 = (c, d) and s3 = (d, c), and mutual defection s4 = (d, d).18 In Moore and Strategy, this assumption is actually implemented as a restriction of the strategy space. In Known and Random, we expect to observe the same effect, as “active/reactive” behavior should become rare after the initial phase (and particularly in later supergames).

17 Dal Bó (2005) is not very successful with his attempt to relate his portion of (c, d) and (d, c) states to best reply behavior in the repeated prisoner’s dilemma.
18 A similar set of strategies was used by Camera et al. (2010) or Engle-Warnick and Slonim (2004) to analyze the outcomes of a standard repeated game experiment. Moreover, it is equivalent to the set of two-state Moore machines (see Binmore and Samuelson 1992) used for a similar purpose by

If this assumption holds, our prisoner’s dilemma game has three equilibrium strategies. Players either always defect playing (d, d, d, d),19 or always cooperate with one of the equilibrium strategies “tit-for-tat” (c, d, c, d) or “grim” (c, d, d, d). The possible combinations of the strategies of two players can result in the following patterns of play: always defect, always cooperate, or a first-round coordination failure followed by defection thereafter.20

During the main phase of the prisoner’s dilemma experiment we therefore expect that the participants either cooperate or defect consistently. We further expect that participants in one matching group will not coordinate on one of the pure equilibria. Instead, we predict that both outcome patterns will coexist,21 because the critical fraction of cooperating participants at which a cooperator gets just as much as a defector is α = 0.227 (and larger if there are active/reactive participants), so that the observed payoffs will probably not favor one of the natural equilibrium strategies. We will test the following hypothesis:
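The critical fraction α follows from the continuation payoffs of the stage game in Table 1 at δ = 0.8. The arithmetic can be sketched as follows (the variable names are ours):

```python
# Expected discounted payoffs (delta = 0.8) for a grim/tit-for-tat
# cooperator and a persistent defector, using the stage payoffs of Table 1.
delta = 0.8
T, R, P, S = 100, 65, 35, 10  # temptation, reward, punishment, sucker

coop_vs_coop = R / (1 - delta)              # mutual cooperation forever: 325
coop_vs_def  = S + delta * P / (1 - delta)  # 10, then mutual defection: 150
def_vs_coop  = T + delta * P / (1 - delta)  # 100, then mutual defection: 240
def_vs_def   = P / (1 - delta)              # mutual defection forever: 175

# alpha solves: alpha*325 + (1-alpha)*150 = alpha*240 + (1-alpha)*175
alpha = (def_vs_def - coop_vs_def) / (
    (coop_vs_coop - coop_vs_def) - (def_vs_coop - def_vs_def))
print(round(alpha, 3))  # 0.227
```

The equation reduces to 110α = 25, hence α = 25/110 ≈ 0.227.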

Hypothesis 2 (Main phase) During the main phase of a supergame the participants will either cooperate as long as the state s1 = (c, c) is realized and defect after one of the states s2 = (c, d), s3 = (d, c), or s4 = (d, d), or they will defect consistently. The willingness to cooperate in the main phase in round t depends only on the state of the game in t − 1 but not on previous decisions which were made more than one round ago.

If this hypothesis holds, we should not observe (c, d) states in the main phase. The individual decisions in this round should concentrate on c after (c, c) and on d after the other three states. Furthermore, states in round t − 2 should not have a significant explanatory impact on decisions in t. Finally, comparing the fraction of such states in the sessions of Moore with experienced subjects to that in the sessions with inexperienced subjects allows us to test whether the concentration of “active/reactive” behavior in the initial phase is clearer after some time for reflection between the two repetitions.

The central feature of the treatments Moore and Strategy is the elimination of the endgame. We therefore expect endgame effects only in Known and to a certain extent also in Random, but not in Moore or Strategy. In Known, every participant knows the last round of the game. In line with the experimental literature on repeated games, we therefore expect participants to select “defect” at least in the very last round of the experiment. In Random, the players only know that the experiment cannot last forever. We expect an increasing suspicion that the game will not continue for a long time. Such a suspicion would lead to a larger fraction of participants playing “defect” as the number of rounds already played becomes larger. Our expectations about the endgame behavior are summarized by the following hypothesis:

Footnote 18 (continued)
Engle-Warnick and Slonim (2004, 2006). Selten et al. (1997) also found that behavior in the main phase mainly conditions on decisions in the immediately preceding round.
19 Or a behaviorally equivalent strategy.
20 Note that tat-for-tit (Binmore 1992, p. 386–387) is not an equilibrium in our game because it pays to deviate to (d, d, d, d) against a tat-for-tit player.
21 See again Camera et al. (2010). Even with random matching after every round in groups of only four participants they observe coexistence of systematic cooperation and defection.

Hypothesis 3 (End phase) Endgame effects will let cooperation rates drop significantly in the end phase of Known and continuously over the course of Random, while they remain consistently high in the main phase of Moore and Strategy.

Again we expect that learning within and between experimental sessions will make endgame effects stronger the more experience players gain. We think that participants reconsider their strategies particularly carefully between two sessions of the same experiment. This expectation rests on the results of Selten and Stoecker (1986) and Brosig et al. (2007), who confirm particularly pronounced learning processes between sessions. Within one session of Known, we already expect the strength of the endgame effect to increase. Between the two sessions of Known we expect this final breakdown of cooperation to be even stronger.

4 Results

Figure 1 depicts the average cooperation rates in the four treatments, separating the data for Known and Moore into inexperienced (supergames 1–20) and experienced (supergames 21–40) participants. As expected, the cooperation rates in the main phase of Moore and Strategy are much more stable than in Known and Random, where they decrease at the end (in Known) and over the whole game (in Random).22 In Known this difference is caused by an unraveling endgame effect within and between sessions. Figure 2 shows that the final drop of the cooperation rates becomes stronger and moves to earlier rounds in later supergames, and that this effect is considerably stronger in the sessions of Known 21–40 with experienced subjects.

A comparison of cooperation rates in the first and second round in Fig. 1 indicates that active/reactive behavior in fact has an important influence on outcomes. Cooperation rates do not drop as much as expected from the first to the second round.

Comparing the frequency of (c, d) and (d, c) states in the second and fourth round of each treatment allows us to test the hypothesis that active/reactive behavior concentrates in the initial phase of the supergames. Combining the data from all treatments and treating only matching groups as independent observations, the relative frequency of these mixed states is significantly smaller in round 4 than in round 2, irrespective of whether we include the data from experienced players or not (Wilcoxon signed rank test, one-sided, p-value <0.01).23 The difference is also significant in all individual treatments except Moore 21–40. Table 3 shows the corresponding p-values, separated by treatment.

22 In Random, the cooperation rate continues to follow a downward trend after the seven rounds shown in the picture.

Fig. 1  Average cooperation rate in all treatments over time

Fig. 2  Cooperation rate in Known with inexperienced and experienced participants over time, grouped in blocks of five supergames (SG)

Table 3  p-values: (c, d)-outcomes in rounds 2 vs. 4

KNOWN 1–20     <0.1
KNOWN 21–40    <0.1
RANDOM         <0.05
MOORE 1–20     <0.01
MOORE 21–40    0.1250
STRATEGY       <0.1
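With only four to eight matching groups per treatment, the test statistic can be computed exactly. The routine below is a minimal sketch of a one-sided exact Wilcoxon signed rank test; the matching-group frequencies are hypothetical placeholders, not the experimental data:

```python
from itertools import product

def wilcoxon_one_sided(x, y):
    """Exact one-sided Wilcoxon signed rank test of H1: x > y,
    assuming no ties in absolute differences and no zero differences."""
    d = [a - b for a, b in zip(x, y)]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0] * len(d)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    w_obs = sum(r for r, di in zip(ranks, d) if di > 0)
    # Under H0, each of the 2^n sign patterns on the ranks is equally likely.
    n = len(d)
    hits = sum(1 for signs in product([0, 1], repeat=n)
               if sum(r for r, s in zip(range(1, n + 1), signs) if s) >= w_obs)
    return hits / 2**n

# Hypothetical mixed-state frequencies per matching group, round 2 vs. round 4:
round2 = [0.35, 0.40, 0.45, 0.50, 0.55, 0.60]
round4 = [0.30, 0.30, 0.30, 0.30, 0.30, 0.30]
print(wilcoxon_one_sided(round2, round4))  # 0.015625 (= 1/64)
```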

23 We use only rounds 2 and 4 (and not 3 and 5 as well) for the following two reasons: First, in the treatments with experienced players it is not totally clear whether round 3 belongs to the initial or to the main phase. Second, round 5 is problematic because (i) in Known 21–40 (c, d) states might already reflect endgame defections and (ii) we do not have decisions for all players in Moore and Strategy for this round.


For further tests of our hypotheses we estimate a logit model for the willingness to cooperate, wilco_t. In this model, initial indicates rounds from the initial phase of the experiment. It is equal to one in rounds 1–3 and zero otherwise. The dummy variable end is included in Known only. It is one in the last two rounds of a supergame and zero otherwise.

In our model, we treat the two previous states as exogenous regressors, including cc_{t−1}, cd_{t−1} and dc_{t−1} as well as cc_{t−2}, cd_{t−2} and dc_{t−2} as dummies for the outcome of the previous two rounds. The cooperation level after mutual defection, dd_{t−1} and dd_{t−2}, is the baseline level of cooperation. The values of cc_{t−τ}, cd_{t−τ} and dc_{t−τ}, τ ∈ {1; 2}, are then to be interpreted relative to the cooperation rate in a (d, d) state. Our hypothesis on the main phase behavior implies that the outcomes should only depend on the previous state of the game. Thus, we expect no impact of cc_{t−2}, cd_{t−2} and dc_{t−2} on wilco_t.

The variable r counts the rounds within a supergame, and s indexes the supergame within a session. If participants in Random lose their confidence in the continuation probability, the round index r would have a negative influence on the individual willingness to cooperate, so r can capture this type of endgame effect. With the variable end, on the other hand, we capture the expected endgame effect in a repeated prisoner's dilemma game with a known finite horizon. Including both s and end in the regression for Known, we can distinguish whether an endgame effect appears only in the last rounds or whether it advances to earlier rounds; such a learning effect would be captured by a negative influence of the supergame index s. For a better comparison between treatments, we include for Moore and Strategy all decisions up to the seventh round in the analysis, no matter whether they were made by the subjects themselves or by the computer program.
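As a concrete illustration, the regressors of this logit model can be built from raw play roughly as follows. The data layout, the helper build_rows, and the state-naming convention (own action first, opponent's action second) are our own assumptions for this sketch, not the authors' code.

```python
# Hypothetical construction of the logit regressors from one subject's
# raw play in a single supergame. Variable names follow the paper;
# everything else is our own illustrative encoding.

def build_rows(my_play, opp_play, horizon_known=True):
    """Return one dict of regressors per round t >= 3 (two lags needed).

    my_play / opp_play: strings of 'c'/'d', one character per round.
    """
    T = len(my_play)
    rows = []
    for t in range(2, T):  # 0-indexed; t = 2 is round 3
        state1 = my_play[t - 1] + opp_play[t - 1]  # state in t-1
        state2 = my_play[t - 2] + opp_play[t - 2]  # state in t-2
        rows.append({
            "wilco": 1 if my_play[t] == "c" else 0,
            "initial": 1 if t < 3 else 0,                    # rounds 1-3
            "end": 1 if horizon_known and t >= T - 2 else 0,  # last two rounds
            "cc1": 1 if state1 == "cc" else 0,
            "cd1": 1 if state1 == "cd" else 0,
            "dc1": 1 if state1 == "dc" else 0,
            # (d, d) is the omitted baseline category
            "cc2": 1 if state2 == "cc" else 0,
            "cd2": 1 if state2 == "cd" else 0,
            "dc2": 1 if state2 == "dc" else 0,
            "r": t + 1,  # round index within the supergame
        })
    return rows

# A ten-round supergame: mutual cooperation that collapses at the end.
rows = build_rows("ccccccccdd", "ccccccccdd")
print(rows[0])   # round 3: cooperation after two (c, c) states
print(rows[-1])  # last round: defection in a (d, d) state, end = 1
```

The resulting rows could then be fed to any logit estimator; the paper clusters standard errors by matching group, which this sketch does not attempt.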

The regression results are reported in Table 4. To account for differences between the treatments, we report the values for all treatments separately. Additionally, the last column of Table 4 contains a regression on the combined data of all treatments to provide a joint test of the hypothesis on main phase behavior.

The variable initial is significantly positive in Moore 21–40 and Strategy. This provides some evidence that cooperation-increasing active/reactive behavior moves from the main to the initial phase for experienced players or for those who are forced to think about a complete strategy.

In all treatments the willingness to cooperate is significantly larger after mutual cooperation cc_{t−1} than after the reference state dd_{t−1}. The states cd_{t−1} and dc_{t−1} have a weaker, but still positive and significant, effect on cooperation compared to dd_{t−1}. The lag-2 dummy variables cc_{t−2}, cd_{t−2} and dc_{t−2} have less influence on the willingness to cooperate in round t, as their coefficients are much smaller than those of the lag-1 variables. The last regression in Table 4 shows that, taking all data together, the willingness to cooperate two periods after a (c, c) or (c, d) state is still significantly larger than after a (d, d) state.24 However, this effect does not exist in Moore 21–40. This confirms that the artificial simplification of strategies in Moore captures many

24 This result remains robust if we exclude the data from experienced players, which are contained twice in this version of the regression.
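To put the pooled logit coefficients in perspective, one can convert them into implied cooperation probabilities. Using the constant (−2.084) and the cc_{t−1} coefficient (4.423) from the last ("All") column of Table 4, and ignoring the round and supergame trends and the lag-2 terms, a rough back-of-the-envelope reading is:

```python
# Rough implied cooperation probabilities from the pooled logit,
# using only the constant and the cc_{t-1} dummy (other regressors
# set to zero) -- a simplification for illustration.
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

p_after_dd = logistic(-2.084)            # baseline: after mutual defection
p_after_cc = logistic(-2.084 + 4.423)    # after mutual cooperation
print(round(p_after_dd, 3), round(p_after_cc, 3))  # -> 0.111 0.912
```

That is, roughly 11% cooperation after a (d, d) state versus roughly 91% after a (c, c) state, which conveys the size of the lag-1 effect behind the significance stars.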


Table 4 Parameter values in the regression

For each treatment, column (1) reports the specification without and column (2) the specification with the lag-2 state dummies; the last column pools all treatments.

            Known 1–20           Known 21–40          Random
            (1)        (2)       (1)        (2)       (1)        (2)
Initial     −0.319     −0.543    0.0127     −0.266    −0.0159    −0.103
            (0.395)    (0.443)   (0.620)    (0.694)   (0.268)    (0.235)
End         −0.455     0.178     −1.467*    −0.722
            (0.632)    (0.647)   (0.765)    (0.785)
cc_{t−1}    3.577***   3.582***  5.051***   5.158***  5.175***   5.345***
            (0.538)    (0.883)   (0.108)    (0.362)   (0.439)    (0.680)
cd_{t−1}    1.594***   1.988***  1.208***   2.254***  1.492***   1.788***
            (0.580)    (0.657)   (0.277)    (0.246)   (0.356)    (0.433)
dc_{t−1}    0.737***   1.018***  1.510***   2.040***  1.477***   1.816***
            (0.220)    (0.247)   (0.470)    (0.483)   (0.278)    (0.342)
cc_{t−2}               0.112                −0.236               0.376
                       (0.694)              (0.590)              (0.283)
cd_{t−2}               −0.121               −0.259               0.606**
                       (0.414)              (0.927)              (0.265)
dc_{t−2}               −0.0281              −0.814***            0.452
                       (0.370)              (0.303)              (0.339)
r           −0.205**   −0.524*** −0.0997    −0.485    −0.0767**  −0.0418
            (0.0958)   (0.186)   (0.267)    (0.340)   (0.0390)   (0.0527)
s           0.00109    −0.00616  −0.0594*** −0.0731*** −0.0201   −0.0159
            (0.0140)   (0.0203)  (0.0126)   (0.0151)  (0.0228)   (0.0233)
Constant    −0.728     0.630     −0.691     1.417     −2.071***  −2.703***
            (0.968)    (1.461)   (1.028)    (1.520)   (0.539)    (0.752)

            Moore 1–20           Moore 21–40          Strategy             All
            (1)        (2)       (1)        (2)       (1)        (2)
Initial     0.0502     0.0383    0.758**    0.654*    0.712***   0.870***  0.173
            (0.202)    (0.232)   (0.340)    (0.380)   (0.154)    (0.220)   (0.151)
End                                                                        −1.232***
                                                                           (0.468)
cc_{t−1}    5.021***   4.125***  6.713***   6.284***  4.177***   3.646***  4.423***
            (0.450)    (0.565)   (0.535)    (0.721)   (0.657)    (0.498)   (0.312)
cd_{t−1}    1.442***   1.166***  1.581***   1.361***  1.225***   1.287***  1.548***
            (0.223)    (0.321)   (0.446)    (0.511)   (0.248)    (0.352)   (0.183)
dc_{t−1}    1.789***   1.475***  1.606***   1.505***  1.273***   1.292***  1.509***
            (0.208)    (0.190)   (0.447)    (0.369)   (0.311)    (0.280)   (0.116)
cc_{t−2}               1.564***             0.538                1.612***  0.920***
                       (0.289)              (0.419)              (0.315)   (0.231)
cd_{t−2}               1.441***             0.585                1.311***  0.849***
                       (0.218)              (0.393)              (0.431)   (0.180)
dc_{t−2}               0.466**              0.136                0.458     0.182
                       (0.221)              (0.412)              (0.333)   (0.142)
r           −0.0974    −0.131*   −0.0771    −0.109    0.0968***  0.159**   −0.0894**
            (0.0613)   (0.0745)  (0.208)    (0.203)   (0.0376)   (0.0700)  (0.0398)
s           0.00157    −0.00219  −0.00342   −0.00723  −0.00393   0.00776   −0.00852
            (0.0111)   (0.0115)  (0.00407)  (0.00882) (0.0132)   (0.00653) (0.00708)
Constant    −1.956***  −2.158*** −2.455**   −2.336**  −2.466***  −3.639*** −2.084***
            (0.413)    (0.438)   (0.958)    (0.955)   (0.488)    (0.841)   (0.295)

Standard errors in brackets, clustered by matching group. *** denotes significance at the 1% level, ** at the 5% level, and * at the 10% level. For an alternative specification with fixed effects for matching groups instead, see the online appendix of this paper.


relevant aspects of unrestricted decisions, at least with experienced participants.25

However, the partly significant lag-2 variables cd_{t−2} and dc_{t−2} in the other treatments provide evidence for persistent active/reactive play.26 In Moore 1–20 all lag-2 states containing at least one cooperating player have a positive impact on the willingness to cooperate in t. In our interpretation, players in this treatment perceive the infinite repetition of their play in the main phase as a strong incentive to coordinate on mutual cooperation using active/reactive play. The experienced participants in Moore 21–40 consequently have the highest cooperation rates in the first round of a supergame. As they have also experienced that an exploited attempt to establish cooperation yields very low profits, they give up active/reactive behavior in the second session. If partners do not cooperate from the beginning, they will not start doing so in the crucial strategy elicitation phase either.

Endgame effects show up in Known and Random. In Known there is a significant negative influence of r for the inexperienced players,27 and the supergame index s is significantly negative for experienced participants. This indicates a stronger unraveling of the endgame effect with experienced players. We take this as support for the general conclusion that learning between sessions changes behavior more fundamentally than learning within sessions. The pure endgame effect captured with the variable end seems to be much weaker compared to the general decrease of cooperation over time. In Random we find a significant downward trend within one supergame only in the first regression.28 Including the lag-2 states, r is still negative, but no longer significant.

5 Discussion

Little is known about how people decide in repeated games. In the context of a repeated prisoner's dilemma supergame, we study how behavior evolves when we restrict decisions to Moore machines with a "memory" of only one round, compared to unrestricted decisions under a known finite or randomly determined horizon. According to our data, this simplification captures a large share of actual decisions. Further, we find evidence for what we call "active/reactive" behavior: attempts to (re-)establish cooperation among formerly noncooperating participants. The inexperienced Moore players do not accept a cooperation failure easily, and remarkably many participants try to establish cooperation before they move to the expected noncooperative equilibrium play. However, experienced participants seem to learn that such behavior does not pay and use less forgiving strategies instead.

25 This result remains robust in the alternative specification with fixed effects for matching groups. Only dc_{t−2} turns out to be significant in Moore, even though its coefficient gets even smaller.

26 These results may alternatively be generated by a subject-type effect rather than a decision-making effect, because it would also show up in the comparison of average decisions if, for example, some subjects have a persistently high general tendency to cooperate after the opponent defected. If there are subjects who always cooperate, this may show up in the regression as a significantly positive cd_{t−2} because they run into this situation more often. However, the effect is relatively robust if we control for subject fixed effects (see the online appendix to this paper).

27 In the alternative specification with fixed effects for matching groups, this holds also for the experienced players.

28 In Strategy, the time trend of cooperation captured by r is even significantly positive.

Our experiment sheds some light into the black box of strategy formation in repeated games. Future research may relax our strict Markov assumption, since we found that behavior two rounds ago actually matters for current decisions in the form of active/reactive behavior. A mechanism similar to the automata in Engle-Warnick and Slonim (2006), which allow for a longer memory, could yield useful further insights into active/reactive behavior. The sample-7 equilibrium concept for mixed strategies introduced by Selten and Chmura (2008) goes in this direction as well. Given the relatively short expected duration of actual play in our experiment, an immediate application of their method to our data is not possible. However, as they find that the sample-7 equilibrium explains much of their data in mixed 2 × 2 games, this concept can be a helpful tool for studying strategies in prisoner's dilemma games that condition decisions on more than the two previous rounds.
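The one-round-memory restriction of the Moore treatment can be made concrete with a small sketch. The encoding below (a machine as a table mapping the previous (own, opponent) state to this round's action) and the helper moore_play are our own illustration, not the paper's implementation; tit-for-tat and a grim-like rule are two such machines.

```python
# Illustrative one-round-memory Moore machines for the repeated
# prisoner's dilemma: the next action depends only on the previous
# state (own action, opponent's action). Own construction.

def moore_play(machine, first_move, opp_history):
    """Replay a machine against a fixed sequence of opponent actions."""
    own = [first_move]
    for t, opp in enumerate(opp_history[:-1]):
        own.append(machine[(own[t], opp)])  # react to state in round t
    return "".join(own)

tit_for_tat = {("c", "c"): "c", ("c", "d"): "d",
               ("d", "c"): "c", ("d", "d"): "d"}
grim_like   = {("c", "c"): "c", ("c", "d"): "d",
               ("d", "c"): "d", ("d", "d"): "d"}

print(moore_play(tit_for_tat, "c", "ccdcc"))  # -> "cccdc" (mirrors with one-round lag)
print(moore_play(grim_like, "c", "ccdcc"))    # -> "cccdd" (never returns to c)
```

A mechanism with longer memory, as suggested above, would simply condition the table on the last two (or more) states instead of one.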

Acknowledgments We would like to thank Urs Fischbacher, Werner Güth, Zohal Hessami, Oliver Kirchkamp, Stefan Kolassa, Axel Ockenfels and conference participants in Goslar for many helpful suggestions. The comments of two anonymous referees helped improve the paper considerably. Stephi Benke and Sebastian Schwenen provided excellent research support.

References

Andreoni, J., & Miller, J. H. (1993). Rational cooperation in the finitely repeated prisoners' dilemma: Experimental evidence. Economic Journal, 103, 570–585.

Axelrod, R. (1984). The evolution of cooperation. New York: Basic Books.

Binmore, K. G. (1992). Fun and games. Lexington, MA: D.C. Heath.

Binmore, K. G., & Samuelson, L. (1992). Evolutionary stability in repeated games played by finite automata. Journal of Economic Theory, 57, 278–305.

Bohnet, I., & Huck, S. (2004). Repetition and reputation: Implications for trust and trustworthiness when institutions change. American Economic Review, Papers and Proceedings of the One Hundred Sixteenth Annual Meeting, 94(2), 362–366.

Brosig, J., Riechmann, T., & Weimann, J. (2007). Selfish in the end? An investigation of consistency and stability of individual behavior. Munich Personal RePEc Archive 2035.

Camera, G., & Casari, M. (2009). Cooperation among strangers under the shadow of the future. American Economic Review, 99(3), 979–1005.

Camera, G., Casari, M., & Bigoni, M. (2010). Cooperative strategies in groups of strangers: An experiment. Working Paper.

Dal Bó, P. (2005). Cooperation under the shadow of the future: Experimental evidence from infinitely repeated games. American Economic Review, 95(5), 1591–1604.

Engle-Warnick, J., & Slonim, R. L. (2004). The evolution of strategies in a repeated trust game. Journal of Economic Behavior and Organization, 55, 553–573.

Engle-Warnick, J., & Slonim, R. L. (2006). Inferring repeated-game strategies from actions: Evidence from trust game experiments. Economic Theory, 28, 603–632.

Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10(2), 171–178.

Greif, A. (1993). Contract enforceability and economic institutions in early trade: The Maghribi traders' coalition. American Economic Review, 83(3), 525–548.

Normann, H., & Wallace, B. (2010). The impact of the termination rule on cooperation in a prisoner's dilemma experiment. Working Paper.

Selten, R., & Chmura, T. (2008). Stationary concepts for experimental 2×2 games. American Economic Review, 98(3), 938–966.

Selten, R., & Stoecker, R. (1986). End behavior in sequences of finite prisoner's dilemma supergames. Journal of Economic Behavior and Organization, 7, 47–70.

Selten, R., Mitzkewitz, M., & Uhlich, G. R. (1997). Duopoly strategies programmed by experienced players. Econometrica, 65(3), 517–555.
