
Computers and the Humanities 30: 433-439, 1997. © 1997 Kluwer Academic Publishers. Printed in the Netherlands.

The Enemy Within: Autocorrelation Bias in Content Analysis of Narratives

Robert Hogenraad 1, Dean P. McKenzie 2 & Colin Martindale 3
1 Psychology Department, Catholic University of Louvain, 10 Place du Cardinal Mercier, B-1348 Louvain-la-Neuve, Belgium; e-mail: [email protected]; 2 Department of Psychological Medicine, Monash University, Clayton, Victoria 3168, Australia; e-mail: [email protected]; 3 Department of Psychology, University of Maine, Orono, Maine 04469-5742, USA; e-mail: [email protected]

Key words: autocorrelation, computer-aided content analysis, narratives, serial dependencies

Abstract

Many content analysis studies involving temporal data are biased by some unknown dose of autocorrelation. The effect of autocorrelation is to inflate or deflate the significant differences that may exist among the different parts of texts being compared. The solution consists in removing effects due to autocorrelation, even if the latter is not statistically significant. Procedures such as Crosbie's (1993) ITSACORR remove the effect of at least first-order autocorrelations and can be used with small samples. The AREG procedure of SPSS (1994) and the AUTOREG procedure of SAS (1993) can be employed to detect and remove first-order autocorrelations, and higher-order ones too in the case of AUTOREG, while several methods specifically intended for small samples (Huitema and McKean, 1991, 1994) have been developed. Four examples of content analysis studies with and without autocorrelation are discussed.

"I1 y a darts le mot, dans le verbe, quelque chose de sacr6 qui nous d6fend d'en faire un jeu de hasard" [There is in the word, within the word, something sacred that prevents us from making it a game of

chancel (Baudelaire, Th6ophile Gautier (I), 1887, Oeuvres completes, Biblioth&tue de La PlEiade, p. 103)

1. The student's complaint

Imagine for a moment that one discovers during examinations that the scores students receive from an examiner depend systematically upon the scores obtained by the student who was tested just before them. Suppose further that one discovers that students receive systematically lower scores when they follow a brilliant student and higher scores when they follow a mediocre one. Such a bias would be rightly seen as unfair academic practice. Corrective measures would quickly be taken.

The trouble with our imaginary examiner is that his or her scores follow a pattern, which is how autocorrelation may be defined (see Flôres and Ginsburgh, 1996, for a real-life musical competition example). The pattern may be a simple one, as when scores depend on the scores of the student tested just before (first-order autocorrelation). The pattern may be more elaborate, as when the dependency involves scores of every second student (second-order autocorrelation), or of every third student (third-order). The dependency may also consist of complex combinations of, for example, first- and second-order autocorrelations, or second- and fourth-order ones.
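To make the notion concrete, the lag-k autocorrelation of a series can be estimated as the correlation between the series and a copy of itself shifted back by k positions. The short Python sketch below is our own illustration (the analyses reported later relied on SAS and IMSL routines, not on this code); the toy scores are invented for the example.

    import numpy as np

    def lag_autocorrelation(x, lag):
        """Estimate the lag-k autocorrelation of a one-dimensional series:
        the Pearson correlation of x[t] with x[t - lag]."""
        x = np.asarray(x, dtype=float)
        if lag <= 0 or lag >= len(x):
            raise ValueError("lag must lie between 1 and len(x) - 1")
        return np.corrcoef(x[lag:], x[:-lag])[0, 1]

    # Hypothetical examiner scores: each mark tends to react against the
    # previous one, which shows up as negative first-order autocorrelation.
    scores = [14, 9, 15, 8, 16, 7, 15, 9, 14, 8]
    for k in (1, 2, 3):
        print(f"lag-{k} autocorrelation: {lag_autocorrelation(scores, k):+.2f}")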

2. Observations in content analysis

What seems a matter of common sense in the example of the student rating bias seems less worrying for narratives. Yet we in content analytic studies rarely pay attention to this sort of statistical dependency. The role of content analysts, like that of historians, is to extract, analyze, and synthesize information obtained from independently collected facts. If content analysts allow autocorrelational dependencies to distort their analyses, they introduce bias into their results. They might then just as well rely on the opinions of unqualified judges about their texts.

The general requirement in tests of statistical significance is that observations or ratings must be independent of each other (Judd, McLelland, and Culhane, 1995). We do all we can to prevent the answers of one interviewee from being contaminated by the answers of another. If we were studying temperature, stock prices, admissions to psychiatric hospitals, or other temporal data, then we would generally take autocorrelation into account (Box, Jenkins, and Reinsel, 1994; McKenzie, Clarke, and Martindale, 1996; Pollock, 1987). However, many content analysis studies involving temporal data are also probably biased by some unknown amount of autocorrelation. The presence of autocorrelation may grossly distort the level of significance when we use parametric procedures such as analysis of variance (Kenny and Judd, 1986), regression (Ostrom, 1990; McKenzie, Clarke, and Martindale, 1996), the t-test (Crosbie, 1993), and the permutation test (Hayes, 1996). Contrary to what Sells, Smith, and Sprenkle (1995) argue, autocorrelation has also been found to affect the results of nonparametric procedures such as the binomial test (Crosbie, 1987), the bootstrap (Young, 1994), cross-validation (Burman, Chow, and Nolan, 1994), and the Mann-Whitney test (Zimmerman, 1993).

The words in a text are not independent (Bratley and Fortier, 1983; Stubbs, 1995): They follow a pattern, which is why autocorrelation exists. Narratives are characteristically predicated on their seriality (Sigelman, 1995): The main challenge for future research, says Baayen (1992, p. 361) about word frequency distributions, is to construct linguistically less naive models that do not build on the unrealistic assumption that in language words appear at random. As Brunet (1991, p. 73) puts it, "A writer does not pick ... words out of an urn, and - except by way of surrealist experimentation - ... does not choose them by sticking a pin in a dictionary".

The effect of dependencies or correlations between adjacent observations is to inflate or deflate the real differences or trends that may exist in the different parts of a text being compared. Positive autocorrelation inflates the significance of results, but negative autocorrelation deflates it. For example, Crosbie (1993) found that, in the presence of a first-order positive autocorrelation of 0.80, the t-test rejected the null hypothesis up to ten times as often as would be expected. In the case of a first-order positive autocorrelation of 0.20, the t-test still rejected the null hypothesis up to twice as often as it should have.

Positive autocorrelation causes each score to be more similar to the preceding one than would occur by chance; this similarity reduces the variability in the series and may cause small differences to become signals of a change that does not exist. In a similar way, a negative autocorrelation causes each score to be more different from the preceding one; this difference increases the variability of the series (Crosbie, 1993), enough so as to hide a possible statistically significant difference. In other words, and independently of interactions between several autocorrelations, when we are looking for, say, polynomial trends, the presence of positive autocorrelations will make our results look more statistically significant than they actually are; negative autocorrelations will have the reverse effect (Simonton, 1990).
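The effect on variability between adjacent observations can be checked by simulation. The sketch below (our illustration; the rho values, series length, and random seed are arbitrary) generates first-order autoregressive series and reports the standard deviation of successive differences: with positive autocorrelation adjacent values resemble each other, so successive differences shrink; with negative autocorrelation adjacent values react against each other, so successive differences grow.

    import numpy as np

    rng = np.random.default_rng(0)

    def ar1_series(rho, n=500):
        """Generate a first-order autoregressive series x[t] = rho*x[t-1] + noise."""
        x = np.zeros(n)
        for t in range(1, n):
            x[t] = rho * x[t - 1] + rng.normal()
        return x

    for rho in (0.6, 0.0, -0.6):
        x = ar1_series(rho)
        # Variability between adjacent observations: small under positive rho,
        # large under negative rho.
        print(f"rho = {rho:+.1f}: sd of successive differences = {np.diff(x).std():.2f}")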

In short, in the case of autocorrelations, observations covary simply because they happen to lie some distance from one another. This sort of dependency can be removed by randomizing the observations, which is indeed what good experimenters do when conducting an experiment but which we text analysts can only do with statistical techniques, because the dependency is in the text. As Bratley and Fortier have it, "when one studies a literary text as literature, it is fundamental that it is not a random sample of words, but art: A structure designed to produce an aesthetic effect" (1983, p. 20). Definitely, attention to autocorrelations must now be part of the tool-kit of the content analyst, for reasons to be developed here. We dare say that if we fail to consider autocorrelations, most of what could go wrong (in matters of textual analysis) will do so.

3. The answer to problems of serial dependency

The solution to the problem of serial dependency consists in removing the variability due to autocorrelations (Simonton, 1977), even if the latter are not statistically significant. According to Gottman, indeed, "sophisticated examinations of the effect of even moderate autocorrelation in the data show that estimates of effect size that assume no autocorrelation may be badly biased, and the actual confidence intervals may differ markedly from what we would assume them to be" (1981, p. 56). In practice, removing, say, first-order autocorrelation is done by subtracting, from each observed datum, the value of the residual (that is, predicted value minus actual value) of the preceding datum multiplied by the value of the autocorrelation (Crosbie, 1993, p. 967). This operation removes dependency in the residuals from the regression while also restoring the conditions of a basic statistical assumption concerning regression (Gottman, 1981, p. 57).
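The operation can be rendered in a few lines. The Python sketch below is a minimal illustration of the idea rather than the ITSACORR or AUTOREG code itself; it assumes a simple linear trend model, and it takes residuals in the more common sense of actual minus fitted values.

    import numpy as np

    def remove_ar1_component(y, rho):
        """Return a copy of y with the first-order autoregressive component
        of the detrended residuals removed: subtract rho * e[t-1] from y[t],
        where e[t] is the residual from a simple linear trend fit."""
        y = np.asarray(y, dtype=float)
        t = np.arange(len(y))
        slope, intercept = np.polyfit(t, y, 1)      # simple trend model
        residuals = y - (intercept + slope * t)     # e[t] = actual - fitted
        corrected = y.copy()
        corrected[1:] -= rho * residuals[:-1]       # remove the lag-1 carry-over
        return corrected

The corrected series can then be submitted to whatever test was planned for the original one, with the degrees of freedom adjusted accordingly.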

Box-Jenkins (ARIMA) type models (Box et al., 1994; Gottman, 1981; SAS, 1993), classically used until now, require large samples, e.g. at least 25 to 50 observations, and can be difficult to interpret (Huitema and McKean, 1991; Crosbie, 1993). More recent procedures, such as Crosbie's (1993) ITSACORR, DeCarlo and Tryon's C-statistic (1993), and Huitema and McKean's rF1 (1994), are able to identify (and, in ITSACORR's case, remove) the effect of at least first-order positive and negative autocorrelations in small samples. In the examples that follow, we removed autocorrelations using the AUTOREG procedure of SAS (1993).

AUTOREG can be used to estimate first- and higher-order autocorrelation, while AREG (SPSS, 1994) can be used to estimate first-order autocorrelation only. For the sake of simplicity of illustration, we have used AUTOREG both to estimate and remove first- and higher-order autocorrelation independently of trends in the data, and then analyzed with GLM (General Linear Model, in SAS, 1985) what is left once autocorrelation is taken out. The standard procedure followed here involved four steps: (1) estimate a model with GLM, (2) estimate and remove the autocorrelation from the observed series with AUTOREG (with the probability level for significance of autocorrelations set at 0.50), (3) derive the new corrected series from the predicted and residual values given by AUTOREG, and (4) apply whatever statistical test is pertinent to the newly corrected series, including regressions, using the appropriate degrees of freedom. This conceptual procedure is the one followed by previous authors such as Crosbie (1993, p. 967) and Moore, Osgood, Larzelere, and Chamberlain (1994, p. 721). Even if not statistically significant in itself, autocorrelation has been found (Crosbie, 1993, p. 967; Gottman, 1981, p. 56) to affect results. A probability level fixed at 0.05 would be too low, while setting it to 0.99 might be too far the other way. A level of 0.50 was employed as a compromise, though more information needs to be collected on this topic. The AUTOREG procedure of SAS (1993, p. 185) suggests setting probability levels between 0.001 and 0.20.
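The same four steps can be mimicked outside SAS. The Python sketch below is an illustrative stand-in for the GLM/AUTOREG pairing, not a reproduction of it: ordinary least squares plays the role of GLM, a normal approximation (standard error of roughly 1/sqrt(n)) supplies the p-value of each lag, and the lenient 0.50 threshold of the text is applied.

    import numpy as np
    from scipy import stats

    def analyse_with_ar_correction(y, poly_order=2, max_lag=5, p_keep=0.50):
        """Sketch of the four-step procedure: (1) fit a polynomial trend,
        (2) screen residual autocorrelations, (3) derive a corrected series,
        (4) re-fit the trend on the corrected series and report its R^2."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        t = np.arange(n)

        # Step 1: polynomial trend model (stand-in for GLM).
        residuals = y - np.polyval(np.polyfit(t, y, poly_order), t)

        # Step 2: keep lags whose approximate two-sided p-value is below p_keep.
        kept = {}
        for lag in range(1, max_lag + 1):
            r = np.corrcoef(residuals[lag:], residuals[:-lag])[0, 1]
            p = 2 * stats.norm.sf(abs(r) * np.sqrt(n))
            if p < p_keep:
                kept[lag] = r

        # Step 3: corrected series = observed values minus the retained
        # autoregressive components of the residuals.
        corrected = y.copy()
        for lag, r in kept.items():
            corrected[lag:] -= r * residuals[:-lag]

        # Step 4: re-fit the trend on the corrected series.
        fitted = np.polyval(np.polyfit(t, corrected, poly_order), t)
        r2 = 1 - ((corrected - fitted) ** 2).sum() / ((corrected - corrected.mean()) ** 2).sum()
        return kept, corrected, r2

Degrees of freedom for the final test should be adjusted for every autocorrelation removed, as noted above.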

In psychological data, positive first-order autocorrelation is the most common case (Gottman, 1981; Moore et al., 1994; Simonton, 1990). Positive autocorrelation - that is, the value at observation t has a positive relation with the value at the observation just before - results in inflated differences. Most statistical tests, such as regression, undertaken on such a series would signal a linear or polynomial trend that does not actually exist, or at least not to the extent indicated. Autocorrelation is negative when the value at point t tends to move in the direction opposite to that of the immediately preceding observation. Here, the increase in variability is such that it may mask a significant difference in the series, a difference that would be hidden or covered up by too much variability. Negative autocorrelation results in underestimated differences: Regression methods may miss a linear or polynomial trend that does exist in the series.

Regarding narratives, a useful pointer is the question of the extent to which what is written at any point in a narrative depends on what came before in the same narrative. This condition, if fulfilled, is enough to create a positive or negative dependency which, if systematic, is all that is needed to cause an appearance of change in a series even without any genuine change.

If we can only guess at the reasons for serial dependencies in texts, there is little doubt, however, about the facts. Thus, Martindale (1990, p. 316 ff.) shows, among many other cases, the presence of a significant fifth-order autocorrelation in book VI of Virgil's Aeneid for the variable "primordial content" and a similar fourth-order autocorrelation, for the same variable, in Melville's Moby Dick. Primordial thought content, in Martindale's (1990) theory of aesthetic evolution, is the sort of free-associative and concrete mode of thought that allows novelists, for example, to develop new images and new combinations of words. This mode of thought, and its opposite, symbolic or conceptual thought content, are assessed through a thesaurus that contains words that have been uniquely ascribed to either primordial or conceptual thought contents. Another example is Ide (1989, p. 281), who uncovered an 11th-order serial dependency in the fluctuations of image density in William Blake's "The Four Zoas". Clearly, we should not restrict the examples that follow to serial dependencies of the first order only.


4. Examples

4.1. Monte Carlo simulation

In the spirit of Crosbie (1993), we shall first examine the performance of relevant statistical methods on simulated data, and then proceed with real data. In the examples that follow, we wish to search for polynomial regressions (curvilinear relationships) (Cohen and Cohen, 1975), as applied by Hogenraad, McKenzie, Morval, and Ducharme (1995), Martindale (1990), Simonton (1990), West (1991) and others. The effects of autocorrelation on this type of analysis have not until now been studied. We therefore did a 'Monte Carlo' (e.g. Noreen, 1989) experiment, using a computer to generate many random samples from a population, add various amounts of autocorrelation, and then count up the number of "significant results". If we define "significant" as the 0.05 level, then no more than 5% of our results should be significant, given that the data are purely random.

Whether behavioral data in general (Judd et al., 1995) or textual data in particular (Stubbs, 1995) are ever truly normal is a contentious issue. However, parametric methods such as polynomial regression require the data (or rather the residuals, Cohen and Cohen, 1975) to follow a normal or Gaussian curve. Ten thousand samples (with mean = 0 and standard deviation = 1) of sizes 50 and 100 were randomly drawn from a normal population, using the IMSL (IMSL, 1991) RNNOF function. Levels of autocorrelation ranging from 0 to 0.50, in increments of 0.10, were then added to the data: for each data point from the second to the last, the new value is the original random value plus ρ times the preceding observation, where ρ is the desired level of autocorrelation. Polynomial regressions (up to the fifth order) were then performed using the IMSL (IMSL, 1991) subroutines DCORVC and DRSTEP.
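The sketch below reproduces the spirit of this experiment on a reduced scale (our illustration: a quadratic-trend F test stands in for the stepwise polynomial regressions run with DCORVC and DRSTEP, and the number of trials is smaller than the ten thousand samples used in the study).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    def false_positive_rate(rho, n=50, trials=2000, alpha=0.05):
        """Proportion of purely random series, to which first-order
        autocorrelation rho has been added, that a quadratic regression
        declares significant at the alpha level."""
        t = np.arange(n)
        X = np.column_stack([np.ones(n), t, t ** 2])   # intercept, linear, quadratic
        hits = 0
        for _ in range(trials):
            y = rng.normal(size=n)
            y[1:] += rho * y[:-1]                      # add autocorrelation as in the text
            beta = np.linalg.lstsq(X, y, rcond=None)[0]
            ss_res = ((y - X @ beta) ** 2).sum()
            ss_tot = ((y - y.mean()) ** 2).sum()
            f = ((ss_tot - ss_res) / 2) / (ss_res / (n - 3))   # overall F test
            if stats.f.sf(f, 2, n - 3) < alpha:
                hits += 1
        return hits / trials

    for rho in (0.0, 0.1, 0.3, 0.5):
        print(f"rho = {rho:.1f}: rejection rate = {false_positive_rate(rho):.3f}")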

Employing a conventional α of 0.05 and using random data free of trends or patterns, we should reject the null hypothesis no more than 5% of the time. Yet positive autocorrelation, of 0.50 for example, makes the test too liberal: We reject the null hypothesis 50% of the time rather than the 5% we would expect with random data. Note that even the presence of an autocorrelation of 0.10 led to a twofold inflation, while an autocorrelation of -0.10 produced significant results only half as often as it should.

[Figure 1. Uncorrected and corrected profiles for serial dependency of Jules Verne's Twenty Thousand Leagues Under the Sea in primordial thought content. Rate of primordial thought (y-axis) plotted by chapter, 1-46 (x-axis); legend: uncorrected, corrected, best fit (corrected).]

4.2. Jules Verne's Twenty Thousand Leagues Under the Sea

Jules Verne's Twenty Thousand Leagues Under the Sea (English translation, Project Gutenberg) is composed of 2 parts of 23 chapters each, totalling 101,034 words, 8,330 different words before lemmatization, and 6,393 different words after lemmatization (Hogenraad, Daubies, and Bestgen, 1995).

A tally was made of the presence, as a percentage of words, of primordial thought content in each of the 46 chapters of Twenty Thousand Leagues Under the Sea. A polynomial analysis yielded a significant M-shaped profile for the variable "primordial thought content" [R² = 0.29, F(4, 41) = 4.13, p < 0.01]. However, more telling than fitting a profile is the examination of the raw data before and after removing the lag 3 autocorrelation of -0.48 (n = 46, t = 3.45, p < 0.001) (Figure 1). Removing the negative autocorrelation of -0.48 reduced the variability of the series while also increasing its R² to 0.36 [F(5, 40) = 4.46, p < 0.01]. Except for its reduced variability, the new upper profile corrected for serial dependency closely follows the ups and downs of the uncorrected one.

4.3. Hamlet

Hamlet is a sad drama. Is it sad all the way through the text? What would be the answer of a computer equipped with a good dictionary of, say, sadness, and a good content analysis package to go with it? We analyzed Hamlet (Oxford Text Archives, modernized spelling) with the help of PROTAN (Hogenraad, Daubies, and Bestgen, 1995). Hamlet is composed of 27,421 words, 4,227 different ones before lemmatization, and 3,483 after. We compared Hamlet's words to the words in a dictionary designed to measure the degree of pleasantness or sadness in language (Whissell, Fournier, Pelland, Weir and Makarec, 1986). Whissell's dictionary of evaluation was constructed by averaging the ratings made by judges using 7-point intensity scales. Judges had to decide whether a given word was pleasant (weight of 7) or unpleasant (weight of 1) or somewhere in between (weights of 2-6).

We then compared the evaluation words in the dictionary with Hamlet's words segmented into 20 equal parts of 1,371 words each. Arbitrary segments were used here because of the unequal lengths of the natural segments (acts or scenes) in the play. We observed an inverted U-shaped profile [R² = 0.47, F(2, 17) = 7.64, p < 0.01]. Removing the first-order autocorrelation of 0.22 (n = 20, t = -1.30, p < 0.50) and the second-order autocorrelation of -0.33 (n = 20, t = 1.68, p < 0.50) produced a small gain in explained variance, i.e. [R² = 0.59, F(4, 15) = 5.38, p < 0.01]. The new corrected set of data should presumably stand up better to further statistical tests.
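For readers who want to reproduce this kind of tally, the sketch below shows the general mechanics with a tiny, invented word list standing in for Whissell's dictionary (which contains thousands of rated words) and for the PROTAN machinery: the text is cut into equal-length word segments and each segment receives the mean weight of the dictionary words it contains.

    import re

    # Hypothetical miniature dictionary of affect: pleasantness weights
    # between 1 (unpleasant) and 7 (pleasant). Illustrative values only.
    AFFECT_WEIGHTS = {"sweet": 6.5, "love": 6.8, "death": 1.4, "grief": 1.6, "night": 3.2}

    def segment_scores(text, n_segments=20, weights=AFFECT_WEIGHTS):
        """Cut a text into n_segments equal-length word blocks and return the
        mean affect weight of the dictionary words found in each block
        (None when a block contains no dictionary word)."""
        words = re.findall(r"[a-z']+", text.lower())
        size = len(words) // n_segments
        scores = []
        for i in range(n_segments):
            block = words[i * size:(i + 1) * size]
            hits = [weights[w] for w in block if w in weights]
            scores.append(sum(hits) / len(hits) if hits else None)
        return scores

The resulting per-segment series is what the polynomial regression and the autocorrelation checks described above are applied to.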

4.4. The institutional discourse of science

We have examined until now serial dependency in language in the texts of individual writers. What about institutional language such as that of science? The textual status of science permits us to re-create a history of science based on the analysis of the words that make science (Appleby, 1994; Braudel, 1958; Martindale, 1990; Simonton, 1990).

Snow (1959) made the point that science must be animated by an inherent a priori optimism. How otherwise could a scientist build and test his or her hypotheses? We analyzed all the titles of the leading experimental psychology journal, Psychological Review, from 1894 to 1989 (Hogenraad, Bestgen, and Durieux, 1992). The 3,430 titles of Psychological Review collected over 96 years made up 25,114 words, 3,999 different words before lemmatization, and 3,317 after lemmatization. We analyzed this database for the rate of pleasantness of title words using Whissell et al.'s (1986) dictionary of affect adapted to the PROTAN software (Hogenraad, Daubies and Bestgen, 1995).

[Figure 2. Uncorrected and corrected rates of pleasantness for serial dependency in the titles of Psychological Review from 1894 to 1989. Rate of pleasantness (y-axis) plotted by year, 1894-1984 (x-axis); legend: uncorrected, corrected, best fit (corrected).]

The rate of pleasantness in Psychological Review unfolds over 96 years in a U-shaped sequence of high, low, and high pleasantness [R² = 0.28, F(2, 93) = 18.2, p < 0.001] (uncorrected profile in Figure 2). Presumably, the renewed rate of pleasantness that one can observe from the 1970s on marks the beginning of the cognitive success story in experimental psychology. However, the mood of the titles of Psychological Review is also characterized by a negative autoregressive process (r = -0.23, n = 96, t = 2.25, p < 0.05) at lag 5 which, in itself, is remarkable. What it means is that, throughout this text-based history, the mood at any period of this behavioral science usually depends on the reverse of the mood that existed five years before. So, the a priori optimism of experimental psychology is much more than that. It is a communicative enthusiasm (which is one more way to define an autoregressive process) turning into disparagement. Removing this "contagious optimism" causes the fit of the new profile to increase to R² = 0.33 [F(4, 91) = 11.03, p < 0.001] (corrected profile in Figure 2). One degree of freedom is accounted for by the lag 5 dependency; the other three are accounted for by the three variables representing the cubic relationship.


5. Conclusions

The examples we have just reviewed have their own limitations. For example, most of the narratives analyzed here are characterized by negative dependencies. Perhaps there is something to be built on that. In hindsight, it now appears that a homogeneous positive autocorrelation in a narrative would bring just what a narrative needs the least, i.e. monotony, and boredom in the end (Martindale, 1990). Actually, although knowledge has been built up about psychological data, less empirical experience is available concerning serial dependency in narratives. Obviously, we need to accumulate more empirical data in order to develop a new knowledge base regarding serial dependency in narratives. The present paper is a practical step in that direction and a place to start from. Poetry, plays and other texts are written over time and therefore possess an inherent temporal order. It would be unfortunate to see good content analytic work jeopardized by trivial biases such as statistical dependencies between widely separated elements. The adverse effects of such dependencies are known. It is advisable to control these effects using readily available techniques. Identifying and removing serial dependencies matters for content analysis results because it leaves them essentially intact, though with some changes: heightened rather than transformed.

References

Appleby, J., L. Hunt, and M. Jacob. Telling the Truth about History. New York: Norton, 1994.

Baayen, H. "Statistical Models for Word Frequency Distributions: A Linguistic Evaluation." Computers and the Humanities 26 (1992), 347-363.

Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time-series Analysis: Forecasting and Control (3rd ed.). Englewood Cliffs, NJ: Prentice-Hall, 1994.

Bratley, P. and P. A. Fortier. "Themes, Statistics and the French Novel." In Sixth International Conference on Computers in the Humanities. Eds. S. K. Burton and D. D. Short. Rockville, MD: Computer Science Press, 1983, pp. 18-25.

Braudel, F. "Histoire et Sciences Sociales: La Longue Durée" [History and Social Science: The Long Duration]. Annales 13 (1958), 725-753.

Brunet, E. "What Do Statistics Tell Us?" In Research in Humanities Computing. 1. Selected Papers from the ALLC/ACH Conference, Toronto, June 1989. Ed. I. Lancashire. Oxford, England: Clarendon Press, 1991, pp. 70-92.

Burman, P., E. Chow, and D. Nolan. "A Cross-validatory Method for Dependent Data." Biometrika 81 (1994), 351-358.

Cohen, J. and P. Cohen. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, 1975.

Crosbie, J. "The Inability of the Binomial Test to Control Type I Error with Single-subject Data." Behavioral Assessment 9 (1987), 141-150.

Crosbie, J. "Interrupted Time-series Analysis with Brief Single- subject Data." Journal of Consulting and Clinical Psychology 61 (1993), 966-974.

DeCarlo, L. T. and W. W. Tryon. "Estimating and Testing Autocorre- lation with Small Samples: A Comparison of the C-statistic to a Modified Estimator." Behaviour Research & Therapy 3 (1993), 781-788.

Fl6res, R. G., Jr. and V. A. Ginsburgh. "The Queen Elisabeth Musical Competition: How Fair is the Final Ranking?." The Statistician 45 (1996), 97-104.

Gottman, J. M. Time Series Analysis: A Comprehensive Introduction for Social Scientists. Cambridge, England: Cambridge Univer- sity Press, 1981.

Hayes, A. E "Permutation Test is not Distribution-free: Testing H0: p = 0." Psychological Methods 1 (1996), 184-198.

Hogenraad, R., Y. Bestgen, and J. F. Durieux. "Psychology as Liter- ature." Genetic, Social, and General Psychology Monographs 118 (1992), 455-478.

Hogenraad, R., C. Daubies, and Y. Bestgen. Une Th~orie et une M~thode G~n~rale d'Analyse Textuelle Assist~e par Ordi- nateur: Le Syst~me PROTAN (PROTocol ANalyzer), Ver- sion du 2 mars 1995 [A General Theory and Method of Computer-aided Text Analysis: The PROTAN System (PRO- Tocol Analyzer), Version of March 2, 1995]. Unpublished document, Psychology Department, Catholic University of Louvain, Louvain-la-Neuve, Belgium, 1995, 265 pages [http://www.psp.ucl.ac.be/~ upso/protan/PROTANAE.htm i].

Hogenraad, R., D. P. McKenzie, J. Morval, and F. A. Ducharme. "Paper Trails of Psychology: The Words that Made Applied Behavioral Sciences." Journal of Social Behavior and Person- ality 10 (1995), 491-516.

Huitema, B. E. and J. W. McKean. "Autocorrelation Estimation and Inference with Small Samples."Psychological Bulletin 110 (1991), 291-304.

Huitema, B. E. and J. W. McKean. "Two Reduced-bias Autocorre- lation Estimators: rF1 and rF2." Perceptual and Motor Skills 78 (1994), 323-330.

Ide, N. M. "A Statistical Measure of Theme and Structure." Com- puters and the Humanities 23 (1989), 277-283.

IMSL. User's Manual, STAT/Library: FORTRAN Subroutines for Statistical Analysis, Version 2. Houston, TX: IMSL, 1991.

Judd, C. M., G. H. McLelland, and S. E. Culhane. "Data Analysis: Continuing Issues in the Everyday Analysis of Psychological Data." Annual Review of Psychology 46 (1995), 433-465.

Kenny, D. A. and C. M. Judd. "Consequences of Violating the Independence Assumption in the Analysis of Variance." Psychological Bulletin 99 (1986), 422-431.

Martindale, C. The Clockwork Muse: The Predictability of Artistic Change. New York: Basic Books, 1990.

McKenzie, D. P., D. M. Clarke, and C. Martindale. "Autocorrelation and Admission Diversion" [letter and reply]. Psychiatric Services 47 (1996), 91-92.

Moore, K. J., D. W. Osgood, R. E. Larzelere, and P. Chamberlain. "Use of Pooled Time Series in the Study of Naturally Occurring Clinical Events and Problem Behavior in a Foster Care Setting." Journal of Consulting and Clinical Psychology 62 (1994), 718-728.

Noreen, E. W. Computer-intensive Methods for Testing Hypotheses: An Introduction. New York: Wiley, 1989.

Ostrom, C. W., Jr. Time Series Analysis: Regression Techniques (2nd ed.). Newbury Park, CA: Sage, 1990.


Pollock, D. S. G. "The Methods of Time-series Analysis." Interdisciplinary Science Reviews 12 (1987), 128-135.

SAS Institute, Inc. SAS User's Guide: Statistics, Version 5 Edition. Cary, NC: SAS Institute Inc., 1985.

SAS Institute, Inc. SAS/ETS User's Guide, Version 6 (2nd ed.). Cary, NC: SAS Institute Inc., 1993.

Sells, S. P., T. E. Smith, and D. H. Sprenkle. "Integrating Qualitative and Quantitative Research Methods: A Research Model." Family Process 34 (1995), 199-218.

Sigelman, L. "By Their (New) Words Shall Ye Know Them: Edith Wharton, Marion Mainwaring, and The Buccaneers." Computers and the Humanities 29 (1995), 271-283.

Simonton, D. K. "Cross-sectional Time-series Experiments: Some Suggested Statistical Analyses." Psychological Bulletin 84 (1977), 489-502.

Simonton, D. K. Psychology, Science, and History: An Introduction to Historiometry. New Haven, CT: Yale University Press, 1990.

Snow, C. P. The Two Cultures. Cambridge, England: Cambridge University Press, 1959.

SPSS, Inc. SPSS Trends 6.1. Chicago: SPSS Inc., 1994.

Stubbs, M. "Collocations and Semantic Profiles: On the Cause of the Trouble with Quantitative Studies." Functions of Language 2 (1995), 23-55.

West, A. N. "Primary Process Content in the King James Bible: The Five Stages of Christian Mysticism." Computers and the Humanities 25 (1991), 227-238.

Whissell, C., M. Fournier, R. Pelland, D. Weir, and K. Makarec. "A Dictionary of Affect in Language. IV. Reliability, Validity, and Applications." Perceptual and Motor Skills 62 (1986), 875-888.

Young, G. A. "Bootstrap: More Than a Stab in the Dark." Statistical Science 9 (1994), 382-415.

Zimmerman, D. W. "A Note on Nonindependence and Nonparametric Tests." Perceptual and Motor Skills 76 (1993), 407-412.

Authors' note

This work has benefited from the help of the Belgian National Fund for Scientific Research. Parts of this paper were presented at the ZUMA/SCCAC Conference on "Text Analysis and Computers" (Mannheim, Germany, 18-21 September 1995). We would like to thank Yves Bestgen for his interest and help in this work, and reviewers Paul Fortier and Harald Baayen for their comments on an earlier version of this paper.

Robert Hogenraad is currently Senior Research Associate with the Belgian National Fund for Scientific Research at the Psychology Department of the Catholic University of Louvain at Louvain-la-Neuve, Belgium. His research interests have concentrated in recent years on the text-based history of social sciences. His recent publications are: "Les mots qui ont fait les relations industrielles" (Québec, Presses de l'Université Laval, 1994) (with R. Boulard and D. P. McKenzie); "Paper trails of psychology: The words that made applied behavioral sciences" (with D. P. McKenzie, J. Morval, and F. A. Ducharme), Journal of Social Behavior and Personality 10 (1995), 491-516; "Trails of social science: The visibility of scientific change in criminological journals" (with D. Kaminski and D. P. McKenzie), Social Science Information/Information sur les Sciences Sociales 34 (1995), 663-685.

Colin Martindale is professor at the Psychology Department of the University of Maine. He is currently president of the International Association of Empirical Aesthetics (IAEA) (http://www.ume.maine.edu/~iaea) and editor of the journal "Empirical Studies of the Arts". Recent publications are: "Fame more fickle than fortune: On the distribution of literary eminence", Poetics 23 (1995); "Noncoding DNA, Zipf's law, and language" (with A. K. Konopka), Science 268 (1995), 784; "How can we measure a society's creativity". In Dimensions of Creativity. Ed. M. Boden. Cambridge: M.I.T. Press, 1994, pp. 159-197.

Dean P. McKenzie is a professional officer and statistician involved with statistical analysis and development at the Department of Psychological Medicine, Monash University, Melbourne, Australia. Recent publications are: "Classification by similarity: An overview of statistical methods of case-based reasoning" (with R. S. Forsyth), Computers in Human Behavior 11 (1995), 273-288; "On the utility of content analysis in author attribution: The Federalist" (with C. Martindale), Computers and the Humanities 29 (1995), 259-270; "The construction of a brief case-finding instrument for depression" (with D. M. Clarke, R. J. Marshall, and G. C. Smith), Integrative Psychiatry (in press).