Christopher Smith
May 4, 2010
REL 441 HC, American Scriptures
Dr. Richard Bushman
The Urantia Book as a Test Case for Statistical Authorship Attribution in Genre-Distinctive Texts
Introduction
A recent statistical study of the Book of Mormon by Matthew Jockers, Daniela Witten,
and Craig Criddle (hereafter Jockers et al.) attempted to determine who authored that book by
measuring a set of word frequencies for each chapter and comparing them to word frequencies in
texts known to have been penned by a set of candidate authors. The word frequencies were
analyzed using two related classification methods: Delta and Nearest Shrunken Centroids (NSC). In
their control tests, the authors found both methods to be quite accurate, with NSC emerging
as a slightly more robust technique.i Applying the two methods to the Federalist Papers
yielded similarly encouraging results, with Delta producing only three cross-validation errors
and NSC producing none.ii
Although the Federalist Papers are a classic test case, they are not really
comparable to the Book of Mormon. According to Shlomo Argamon, the assumptions of word-
frequency analysis “fundamentally limit use of the method [to cases in which] all the samples
(from all authors) are of pretty much the same textual variety, otherwise we would expect the
word frequency distributions over the comparison set to be a mixture of several disparate
distributions, one for each genre found in the set, thus potentially biasing results depending on
the variety of the test text.”iii The Jockers and Witten study of the Federalist Papers satisfied this
criterion, but the Jockers et al. study of the Book of Mormon unequivocally did not.
In the Jockers et al. study of the Book of Mormon, the individual candidate authors’
“wordprints” were based largely on texts of a single genre or style, with a different genre
predominating for each author. Very few of the samples were of a similar type to the Book of
Mormon. Under these conditions, we would expect the control samples to be reliably attributed
to the proper author even if—perhaps especially if—the Delta method is highly sensitive to genre
and context. If the method is genre-sensitive, however, we would expect to obtain much less
accurate results when testing the candidate authors against a text of a different genre, such as the
Book of Mormon.iv
The present study applies the Delta word frequency classification method to the Urantia
Book (also known as the Urantia Papers), a religious text in many respects comparable to the
Book of Mormon. Like the Book of Mormon, the Urantia Book is highly distinctive in its genre
and style. Also like the Book of Mormon, the Urantia Book claims to have been authored by a
number of divinely inspired superhuman narrators. Skeptics of each book, meanwhile, disagree
as to whether each had a single human author or is the product of a multiple-author conspiracy.
If the Delta attribution method can produce meaningful results when applied to the Urantia
Book, it would tend to bolster its applicability to the Book of Mormon and to other, similar
cross-genre cases.
Unfortunately, the method turns out to be of dubious usefulness in choosing among
candidate authors. When the 197 Urantia Papers were tested against seven candidate authors,
including three likely candidates and four control authors, the large majority of the Papers were
attributed to two of the control authors: Sigmund Freud and myself. Only a very few of the
Papers were attributed to the candidate who, from other evidence, seems to be their most likely
author. Similarly, a test of the text’s internal authorship claims turned out to be moderately
successful in choosing the correct narrator, but it is difficult to assess the significance of this
finding, given that narrator and genre tended to be covariates. The method turns out to be highly
robust for determining the genre of a text, which demonstrates that it is very context-sensitive.
Another application of the method turned out to be more fruitful. In addition to genre,
the method also turns out to be somewhat sensitive to changes in an author’s style over time.
Controlling for genre, we can use the method to chart stylistic trends within the Book. The
basically linear developmental trend that emerges is suggestive of unitary rather than multiple
authorship.
i Matthew L. Jockers, Daniela M. Witten, and Craig S. Criddle, “Reassessing Authorship of the Book of Mormon Using Delta and Nearest Shrunken Centroid Classification,” Literary and Linguistic Computing Advance Access (December 6, 2008), available from http://llc.oxfordjournals.org/cgi/content/short/23/4/465 [accessed April 16, 2010].
ii Matthew L. Jockers and Daniela M. Witten, “A Comparative Study of Machine Learning Methods for Authorship Attribution,” Literary and Linguistic Computing Advance Access (April 12, 2010), available from http://llc.oxfordjournals.org/cgi/content/full/fqq001 [accessed April 16, 2010].
iii Shlomo Argamon, “Interpreting Burrows’s Delta: Geometric and Probabilistic Foundations,” Literary and Linguistic Computing Advance Access (March 1, 2008), available from http://llc.oxfordjournals.org/cgi/content/full/23/2/131 [accessed April 16, 2010]. Even Jockers and Witten admit that “context-specific words” can adversely impact the results. Thus even in the Federalist Papers case, “if the Madison training texts and the test texts address a particular topic that is not addressed by the Hamilton or Jay training texts, then the NSC classifier might use these words as very strong evidence that the test texts were written by Madison.” Jockers and Witten, “Comparative Study,” 6.
iv The control tests by which Jockers et al. determined their accuracy rates, moreover, split the author corpora in half and tested the two halves against each other. Such split-halving will tend to average out genre differences on both sides of the test. For the control tests to be comparable to testing individual Book of Mormon chapters, they should have separated out small, individual texts and tested them against the larger corpus. See Jockers et al., “Reassessing Authorship,” 7.
The application of Delta to the Book of Mormon is further complicated by studies which show that when authors who have no familiarity with statistical attribution methods attempt to obfuscate their style or to imitate the style of another author, the accuracy of stylometric methods is reduced “to the level of random guessing.” Since the Book of Mormon imitates the King James Version of the Bible, stylometry is unlikely to be useful in determining its authorship. See Michael Brennan and Rachel Greenstadt, “Practical Attacks Against Authorship Recognition Techniques,” available from www.cs.drexel.edu/~greenie/brennan_paper.pdf [accessed April 16, 2010].
The Urantia Authorship Controversy
The story of the Urantia Book began sometime between 1906 and 1911, when
psychologist and former Adventist minister William S. Sadler examined an individual known as
the “sleeping subject” (probably Wilfred Kellogg), whose wife was concerned about his
“abnormal movements” while sleeping. To Sadler’s surprise, Kellogg began to speak in his
sleep, claiming to be “a student visitor on an observation mission from another planet.” Sadler
was initially skeptical, but eventually came to believe that this was an authentic spiritual
phenomenon. A group called “the Forum” formed in 1923 and posed questions for the celestial
beings; the questions were then put to the sleeping subject by Sadler and the five other members of the
“Contact Commission”. Answers to the Forum’s questions were provided as formal essays
known as the Urantia Papers. The Forum and the Commission were sworn to secrecy about the
identity of the sleeping subject and the mode by which the Book was received, for fear that
people would become preoccupied with these details rather than studying the Book itself. Sadler
insisted, however, that the Book was not received through channeling or automatic writing. He
strongly implied that the typed pages of the manuscript simply materialized in the room. In 1950
the Urantia Foundation was founded to publish the Book, which it did in 1955.v
The Papers themselves claim to have been written by celestial beings in order to inform
the denizens of Urantia—which is what they call our planet—about God, science, history, the
cosmos, and the life and teachings of Jesus. For the most part, the authors of the Papers are
identified by order of being rather than by name.vi It is clear in some cases, however, that certain
Papers are supposed to have been written by the same individuals. Ken Glasziou has tested these
v Sarah Lewis, “The Peculiar Sleep: Receiving the Urantia Book,” in The Invention of Sacred Tradition, James R. Lewis and Olav Hammer, eds. (Cambridge University Press, 2007), 200-203; Marian Rowley and William S. Sadler, “A History of the Urantia Movement,” typed manuscript (1960), available from http://urantiabook.org/archive/history/histumov.htm [accessed April 21, 2010].
internal authorship claims using a technique pioneered by Mosteller and Wallace that basically
looks at the frequencies with which certain “function words” (articles, conjunctions, and
demonstrative pronouns) are used to begin sentences or clauses. Glasziou found that the method
was able to distinguish among five of the narrators with a high level of statistical significance.vii
Skeptics of the Book’s internal claims have proposed a number of possible human
authors. The most likely scenarios would seem to be that the Book was dictated by the sleeping
subject, that the manuscripts were planted by William Sadler, or that the Book was collectively
authored by the members of the Contact Commission.
Certain stylistic continuities throughout the Book would seem to point in the direction of
unitary authorship, even though the four different “parts” into which it is divided—particularly
the fourth—deal with quite different subject matter. Numbered lists are employed in every
section of the Book. The vocabulary of the Book is almost comically pretentious throughout,
and concepts are presented in excruciating detail. There is a persistent concern to delineate
hierarchies of being and to fit Hebraic and Christian concepts into a systematic, scientific
framework. These features are found even in the most distinctive part of the book: the extended
narrative of the life of Jesus attributed to the Second Midwayer attached to the Apostle Andrew.
The author turns the teachings of Jesus into a systematic philosophy, and in his narration of the
events of Jesus’s life exhibits an obsession with minor details such as names, locations, and exact
dates. It seems highly unlikely that more than one individual in William Sadler’s circle could
have possessed the distinctive turn of mind of which the Urantia Book seems to be a product.
vi Sometimes a paper is said to be “presented” or “sponsored” by a particular being rather than “written” or “indited”. It is difficult to know whether these should all be treated as equivalent terms.
vii Ken Glasziou, “Part 3: Who Wrote the Urantia Papers” (1996), available from http://urantiabook.org/archive/readers/doc183.htm [accessed April 21, 2010].
As for the identity of the author, William S. Sadler seems the most likely candidate.
Martin Gardner has quite convincingly argued that while some of the conceptual content of the
papers may have been channeled through Wilfred Kellogg, it was in fact Sadler who formulated
the written text. Gardner provides a very extensive list of unusual words and phrases that appear
both in the Urantia Book and in Sadler’s many works. He also demonstrates that the science of
the Book—particularly its endorsements of eugenics and of De Vries’ “mutation theory” of
evolution—reflects Sadler’s own strongly held views. Other parallels are found in its
theology, its psychiatric prescriptions, and its economic and political theories.viii
Certainly it would have taken a prolific writer who was well-read on many subjects to
produce the English text of the Urantia Book. Its more than 2,000 printed pages provide a
virtually comprehensive view of life, religion, and the universe. Of those involved in the
production of the Book, it is Sadler who best fits this description. A reviewer of one of his
psychology books complained that it did not take long to discover why the book had 1229 “big
pages”: “The author wishes to tell everything, and in . . . a rambling way, a way of much
overlapping, of not a little repetition.”ix These words could as easily have been written of the
Urantia Book. A great many of Sadler’s works fit this description, and he wrote quite a few, on
many different subjects.
The Delta Methodology
The Delta methodology employed in the present paper is slightly modified from that of
Jockers et al.
viii Martin Gardner, Urantia: The Great Cult Mystery (Amherst, New York: Prometheus Books, 1995), 273-320, 423-35.
ix Review of William S. Sadler, Theory and Practice of Psychiatry, The Journal of Nervous and Mental Disease 86, no. 5 (November 1937): 605-606.
First, lists of frequently occurring words are generated according to two slightly different
criteria. The first, slightly more stringent criterion admits only those words that occur at least
once in each of the 197 Urantia Papers. There are 39 such words in all.x The second criterion is
to admit all words that occur in each sample corpus—that is, each set of sample texts from a
particular author or genre—at least once per thousand words.xi Both rules are designed to
exclude infrequent or highly contextual words that might skew the results. Most of the tests in
the present paper will employ the second rule, but the first will be used in the test of time-
dependence (which requires testing every Urantia Paper individually against every other Urantia
Paper).
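The two selection criteria can be sketched in Python. This is a minimal illustration, not the program actually used in this study; the `tokenize` helper and the toy documents are my own assumptions.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase word tokens; punctuation is stripped."""
    return re.findall(r"[a-z']+", text.lower())

def words_in_every_document(documents):
    """Criterion 1: words occurring at least once in every document."""
    return set.intersection(*(set(tokenize(d)) for d in documents))

def frequent_in_every_corpus(corpora, per_thousand=1.0):
    """Criterion 2: words occurring at least `per_thousand` times per
    1,000 words in every sample corpus."""
    common = None
    for corpus in corpora:
        tokens = tokenize(corpus)
        threshold = per_thousand * len(tokens) / 1000.0
        frequent = {w for w, c in Counter(tokens).items() if c >= threshold}
        common = frequent if common is None else common & frequent
    return common

# Toy documents, not the actual Urantia Papers:
docs = ["the father of all is the one", "the one is in all and the father"]
print(sorted(words_in_every_document(docs)))
# ['all', 'father', 'is', 'one', 'the']
```

Both functions return candidate word lists; in practice the Criterion 1 list is computed over the 197 individual Papers, while Criterion 2 operates on the per-author (or per-genre) sample corpora.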
Next, a set of word-frequency vectors for all sample texts is produced. The mean is
subtracted from each vector and it is divided by its standard deviation in order to turn the raw
word frequencies into z-scores. In effect, this weights each word vector equally. If
we used raw frequencies rather than z-scores, word vectors with higher absolute variance across
all samples would end up more heavily weighted in the final Delta comparison. For example, if
“and” occurs between 20 and 100 times per thousand words across all samples, whereas “for”
occurs between only 5 and 10 times per thousand words across all samples, then aggregating the
x The words generated under this criterion are: a, all, and, are, as, at, be, but, by, even, for, from, have, in, is, it, more, no, not, of, on, one, or, so, such, that, the, their, these, they, this, time, to, when, which, with. Most of these are what authorship-attribution experts term “function words”. Although they have occasionally been described as “non-contextual”, I have found in my own testing that usage of most of these words is strongly influenced by genre and context.
xi For the eight narrator case, the words used are: a, all, an, and, as, at, be, but, by, father, for, from, have, his, in, is, it, life, no, not, of, on, one, only, or, such, that, the, their, there, these, they, this, time, to, when, which, who, with. For the seven author case, the words used are: a, all, an, and, are, as, at, be, but, by, do, for, from, have, in, into, is, it, not, of, on, one, or, so, that, the, their, there, these, this, to, was, we, when, which, will, with. For the genre-controlled case, the words used are: a, all, an, and, are, as, at, be, been, but, by, even, first, for, from, god, has, have, he, his, human, in, into, is, it, life, man, more, no, not, of, on, one, only, or, other, so, spiritual, such, that, the, their, there, these, they, this, those, time, to, upon, was, when, which, while, who, will, with, world, would.
frequency distances between texts would result in the "for" distances being swamped out by the
much larger "and" distances. Dividing each word vector by its standard deviation normalizes the
various word vectors by their variance so that they are comparable and equally weighted.xii
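The standardization step can be sketched as follows (illustrative frequency values, assuming numpy):

```python
import numpy as np

# Rows are text samples, columns are marker words ("the", "and", "for"),
# and values are illustrative rates per 1,000 words.
freqs = np.array([
    [60.0, 25.0,  8.0],
    [50.0, 30.0,  6.0],
    [70.0, 20.0, 10.0],
])

# Standardize each word column: subtract its mean and divide by its
# standard deviation, so every word contributes on a comparable scale.
z_scores = (freqs - freqs.mean(axis=0)) / freqs.std(axis=0)

print(z_scores.mean(axis=0))  # each column now has mean 0...
print(z_scores.std(axis=0))   # ...and standard deviation 1
```

After standardization, a word that varies between 20 and 100 occurrences per thousand and a word that varies between 5 and 10 contribute on the same scale.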
Next, we find the “Delta” distances for each word vector between each test document and
each sample corpus. If we are testing sample X against author Y, for example, then we find the
absolute value of the difference between sample X’s z-score for a given word vector and author
Y’s average z-score for that vector. Once we have done this for all vectors, we average them
together in order to get an average distance between the text and the author. If author Y’s
distance from the sample is smaller than all other authors’, then we conclude that author Y is the
most probable of our candidate authors to have written the text.
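Given standardized scores, the Delta comparison itself reduces to a mean absolute difference. A sketch, using hypothetical z-score profiles (the author names and numbers are invented for illustration):

```python
import numpy as np

def delta_distance(test_z, author_samples_z):
    """Burrows-style Delta: mean absolute difference between a test
    text's z-scores and an author corpus's average z-scores."""
    return np.mean(np.abs(test_z - author_samples_z.mean(axis=0)))

def attribute(test_z, candidates):
    """Return the candidate author with the smallest Delta distance."""
    return min(candidates,
               key=lambda name: delta_distance(test_z, candidates[name]))

# Hypothetical z-score profiles over three marker words:
candidates = {
    "Author A": np.array([[1.0, -0.5, 0.2], [0.8, -0.3, 0.4]]),
    "Author B": np.array([[-1.0, 0.6, -0.2], [-0.8, 0.2, -0.4]]),
}
test_text = np.array([0.9, -0.4, 0.3])
print(attribute(test_text, candidates))  # nearest profile: Author A
```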
An estimate of the accuracy of the method can be obtained by performing control tests in
which each of an author’s sample texts is individually subtracted out from his or her corpus, and
tested against all authors. In theory, the percentage of these control tests that result in attribution
to the correct author represents the method’s accuracy rate. As we have already discussed,
however, this method of estimating accuracy is rendered very problematic by the effect of
exogenous variables such as genre.
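The leave-one-out control procedure can be sketched as follows (toy, well-separated data; the real tests used the candidate authors' actual writing samples):

```python
import numpy as np

def leave_one_out_accuracy(author_samples):
    """author_samples maps author name -> z-score matrix (one row per
    sample text).  Each text is held out in turn, the profiles are
    rebuilt without it, and the held-out text is attributed to the
    candidate with the smallest mean absolute z-score distance."""
    correct = total = 0
    for name, samples in author_samples.items():
        for i in range(len(samples)):
            held_out = samples[i]
            profiles = {
                n: (np.delete(s, i, axis=0) if n == name else s).mean(axis=0)
                for n, s in author_samples.items()
            }
            guess = min(profiles,
                        key=lambda n: np.mean(np.abs(held_out - profiles[n])))
            correct += guess == name
            total += 1
    return correct / total

# Toy data: two authors with clearly separated styles.
toy = {
    "A": np.array([[1.0, -1.0], [0.9, -0.8], [1.1, -1.2]]),
    "B": np.array([[-1.0, 1.0], [-0.9, 0.8], [-1.1, 1.2]]),
}
print(leave_one_out_accuracy(toy))  # 1.0
```

Note that the toy data achieves perfect accuracy precisely because the two "styles" are cleanly separated, which is the very situation the genre objection warns about.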
Another way to estimate accuracy is a simple chi-squared test. This tells us if our results
are substantially different from what would be expected if we assigned each text to a random
author. Generally, if a chi-squared test results in a “p-score” under .05, the result is considered
“statistically significant”. Again, however, that our results are non-random does not mean that
authorship is the causative variable. If other variables, such as genre, are resulting in
xii The use of standard deviation in weighting the word vectors assumes that the “spread” of each word vector across all our sample texts is a typical spread. If we have a large number of samples from one author or genre, then the standard deviation for one or more of the word vectors used in the analysis might skew small, in which case the vector(s) in question would still be unduly weighted.
substantially skewed and misleading results, we would still expect a low p-score. Thus, neither
of our options for accuracy-estimation is really reliable.
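A two-category goodness-of-fit version of such a test, my approximation (assuming scipy), using the narrator-test figures reported in the Results section:

```python
from scipy.stats import chisquare

# Narrator test: 114 samples, 8 candidates.  Random assignment would be
# expected to yield 114/8 = 14.25 correct attributions.
n, k = 114, 8
observed = [89, n - 89]        # correct, incorrect
expected = [n / k, n - n / k]  # 14.25, 99.75
stat, p = chisquare(observed, f_exp=expected)
print(f"chi-squared = {stat:.1f}, p = {p:.2g}")  # p far below .05
```

As the main text cautions, a tiny p-value here shows only that the attributions are non-random, not that authorship (rather than genre or sequence) produced the pattern.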
Using the above method, several tests will be conducted. The first will test the internal
authorship claims of the Urantia Book, in order to determine whether its celestial narrators can
be distinguished from each other. The second will test each of the Urantia Papers against three
likely human authors and four control authors. The third will divide the Urantia Book into three
broad genre categories, and test each of the Papers against these categories in order to determine
to what extent genre may be influencing our results. And finally, an additional set of tests will
be performed in order to establish whether the style of the Urantia Book undergoes a linear
evolution when the Papers are arranged sequentially and tested against each other. An attempt
will here be made to control for genre by testing each genre independently against itself, and
then testing the samples from each genre against the larger book (excluding the other samples
from the same genre as a given test text).
All of these analyses are performed using a computer program that I designed myself
using Visual Basic 6 (in conjunction with some features of Microsoft Excel 2003). My program
automates most of the processes required, including determining word ratios, calculating z-
scores and Delta distances, and creating graphs. The role of the researcher, then, is simply to
prepare and classify the text samples to be studied, and to evaluate the results.
Results
In order to test the internal claims of the Urantia Book, it was necessary to omit from the
test those narrators who are identified too ambiguously in the text to allow for meaningful
testing. Narrators to whom only a single paper was attributed were also omitted. This left eight
subjects: the Chief of Seraphim,xiii the Chief of the Archangels of Nebadon,xiv the Divine
Counselor assigned to reveal the attributes of God,xv the Mighty Messenger temporarily
sojourning on Urantia,xvi the Perfector of Wisdom commissioned by the Ancient of Days,xvii the
Second Midwayer attached to the Apostle Andrew,xviii the Solitary messenger of Orvonton,xix and
Solonia, the Seraphic voice in the Garden.xx
Of the 114 text samples tested, 89 (~78%) were attributed correctly. If the
results were random, 14.25 correct attributions would have been expected. Thus, the results
were highly statistically significant (p < .0001). This would seem to confirm Ken Glasziou’s
results (based on a similar method) supporting the text’s internal claim to multiple authorship.
Actually, though, our chi-squared results indicate only that the results of the test were non-
random. Since the narrator variable is highly correlated to the genre and sequence (i.e., time)
variables, it is very difficult to know which variable actually determined our results. For
example, the most accurate attributions were made to the Second Midwayer attached to the
Apostle Andrew. The section of the book for which this narrator is supposed to have been
responsible also happens to be the most distinctive in terms of content and genre. The texts in
this section that were misattributed tended to be those that deviated from the theme and genre of
the remainder of the section. Thus, it is very possible that the ability of the method to distinguish
among narrators is merely an artifact of the genre, content, or time differentials between the text-
groupings each narrator is supposed to have produced.
xiii Papers 82-84, 113, 114.
xiv Papers 33, 35.
xv Papers 0-5.
xvi Papers 32, 34, 40, 42, 54, 55, 115-118.
xvii Papers 11-14.
xviii Papers 121-96.
xix Papers 107-112.
xx Papers 73-76.
For our test of possible early twentieth-century human forgers, text samples were
assembled for three of the most likely authors and four control authors. The three likely authors
are William S. Sadler,xxi Lena Sadler,xxii and William Sadler, Jr.xxiii (Wilfred Kellogg was
excluded because no writing samples could be obtained for him in a searchable digital format.)
The four control authors were Sigmund Freud,xxiv myself,xxv and the biblical writers Matthew and
Luke (from the King James Version).xxvi
Of the 105 control samples, 79 (~75%) were attributed to the correct author. This is
again a very statistically significant result (p < .0001). As in the previous test, however,
sequence and genre were exogenous variables that correlated highly with authorship. The extent
to which authorship was the measured variable, then, is an open question. As for the Urantia
xxi The William S. Sadler sample consists of four chapters from his book The Mind at Mischief: Tricks and Deceptions of the Subconscious and How to Cope with Them (New York: Funk & Wagnalls Company, 1929), available from http://www.cimmay.us/pdf/sadler.pdf [accessed April 21, 2010]. This is one of his shorter, more popular works, and so is perhaps not ideal for comparison to the Urantia Book. But it was the only work for which a searchable digital text was available.
xxii The Lena Sadler sample consists of three chapters from a book she co-authored with her husband: The Mother and Her Child (Toronto: McClelland, Goodchild & Stewart, 1916), available from http://www.gutenberg.org/files/20817/20817-h/20817-h.htm [accessed April 21, 2010]. Again, it is hardly ideal to use a co-authored text, but this was the only digital writing sample available for her. She does seem to have been the primary author of most of the book.
xxiii The William Sadler, Jr. sample consists of three chapters from his book, A Study of the Master Universe: A Development of Concepts in the Urantia Book (Second Society Foundation, 1968), available from http://urantiabook.org/studies/smu/index.html [accessed April 21, 2010].
xxiv The Sigmund Freud sample consists of a chapter from his Dream Psychology: Psychoanalysis for Beginners, tr. M. D. Eder (New York: The James A. McCann Company, 1920), available from http://www.gutenberg.org/files/15489/15489-h/15489-h.htm [accessed April 21, 2010], and three chapters from The Interpretation of Dreams, tr. A. A. Brill, 3rd ed. (New York: Macmillan, 1911), available from http://www.psychwww.com/books/interp/toc.htm [accessed April 21, 2010].
xxv The Christopher Smith sample consists of academic and narrative writings from my personal files, as well as several entries from my personal religious studies blog.
xxvi For these samples, the books of Matthew, Luke, and Acts were divided into individual chapters according to the KJV chapter numbering. The KJV text was obtained from http://www.biblegateway.com [accessed April 21, 2010].
Book itself, 91 of the Papers were attributed to Sigmund Freud, 74 to Christopher Smith (that’s
me), 17 to Luke, 11 to William Sadler, 3 to Lena Sadler, 2 to Matthew, and none to William
Sadler, Jr. In a number of cases—particularly in the section that narrates the life of Jesus—the
individual Papers were more similar to the author to which they were assigned than to the
averages for the Urantia Book itself. Obviously, these results are nonsensical. Sigmund Freud
lived in Austria and wrote in German, and thus cannot have authored the Book. I was not even
alive at the time the Book was written. The results for the most likely authors, meanwhile, are
all considerably below the level of randomness. There are a few possible explanations for these
results. The first is that the true author(s) of the Book was/were not included in the test. The
second is that the method simply is not robust for attribution across genres.
In order to get some idea of the sensitivity of the method to genre differences, I
conducted a third test in which the Urantia Papers were divided into three broad genre categories,
and each individual Paper was then subtracted out and individually tested against these
groupings. The three groupings chosen were Cosmo-Theology (which includes texts on the
nature of God, the hierarchies of celestial beings, the structure of the universe, and the
philosophy of religion),xxvii Earth History (including evolutionary, sociological, and religious
history),xxviii and Pseudo-Biblical Narrative (including events and teachings from the life of Jesus,
as well as the author’s reflections on said events and teachings).xxix It might be desirable to
subdivide these genres further, but unfortunately the subgenres are so highly interwoven that to
separate them would be a life’s work. Even the three divisions above required some rather
arbitrary judgments in the cases of several Papers that mix aspects of more than one category. In
xxvii Papers 0-56, 99-120, 196.
xxviii Papers 57-98, 121, 195.
xxix Papers 122-94.
any case, this test attributed 181 (~92%) of the 197 Papers to the proper genre, with a high level
of statistical significance (p < .0001). As in previous cases, the determinative variable(s) might
be something other than, or in addition to, genre alone.
Presumably, the test that had the highest accuracy is the one that measured the variable
that was most determinative for our results. The accuracies reported for the tests above,
however, are not really comparable, since the number of attribution “candidates” in each case
was different. (For example, there is a much higher likelihood of a “lucky guess” when choosing
among three genres than when choosing among eight narrators.) Thus, a somewhat different
measure of accuracy was created, based on the attribution “rank” the method assigned to the
correct candidate. If all the correct candidates were assigned first rank, the accuracy of the
method is said to be 100%. If all were assigned the mean rank (second out of three, for
example), the accuracy of the method is said to be 0%. If all correct candidates were assigned
last rank, the method’s accuracy is said to be -100%. This measure is calculated by the
following formula:
weighted accuracy = 1 − (sum of observed ranks − n) / (n × highest rank − n × mean rank),
where n is the number of samples.
This basically measures the deviation of our results from the mean, “random” level of each test,
and expresses it as a percentage of the total possible variance. It thus equally weights tests with
different numbers of candidates, such that we can compare accuracy rates across multiple tests.
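The weighted accuracy measure can be expressed directly in code (a sketch of the formula above, with the three boundary cases from the text):

```python
def weighted_accuracy(observed_ranks, highest_rank):
    """+1.0 if every correct candidate was ranked first, 0.0 at the
    mean ("random") rank, -1.0 if every correct candidate was ranked
    last."""
    n = len(observed_ranks)
    mean_rank = (1 + highest_rank) / 2
    return 1 - (sum(observed_ranks) - n) / (highest_rank * n - mean_rank * n)

# With three candidates (ranks 1-3, mean rank 2):
print(weighted_accuracy([1, 1, 1, 1], 3))  # 1.0  (all ranked first)
print(weighted_accuracy([2, 2, 2, 2], 3))  # 0.0  (all at mean rank)
print(weighted_accuracy([3, 3, 3, 3], 3))  # -1.0 (all ranked last)
```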
When the weighted accuracy measure is applied to the narrator and genre tests, we find
that the narrator test was ~83% accurate, and the genre test was ~91% accurate. Thus, the
narrator results could be explained as an artifact of covariance with genre, but the genre results
could not be entirely explained as an artifact of covariance with narrator. The genre variable
offers predictive power above and beyond what the narrator variable provides. Having said this,
it is very possible that both narrator and genre are causative variables.
Applying our weighted accuracy measure to the control tests for our human author
candidates returns an accuracy of ~88%. Thus author, too, may be a causative variable. On the
other hand, much of the accuracy of the control tests may be an artifact of covariance with genre.
The absurd results of our attempt to attribute the Urantia Papers to a human author suggest that
major differences in text-type may completely obscure the effects of the authorship variable.
(Alternatively, deliberate obfuscation or imitation on the part of the author may be to blame.)
A final variable to be tested is time, or sequence. Assuming that the Urantia Papers were
written by a single author in the sequence in which they currently appear, we might expect to
detect a more or less linear pattern of stylistic development over the course of the Book.
Significant deviations from this pattern might indicate that the Book had multiple authors or that
the Papers are arranged out of sequence.
In order to assess time dependence, we first test a Urantia Paper against every other
individual Paper. We then graph the Paper’s Delta distances from the other Papers, in sequence,
and perform a linear regression analysis. The slope of the resulting regression line is basically a
measure of the test-Paper’s relative similarity to the front and back halves of the book. If the
slope is positive, the Paper is most similar to Papers near the beginning of the Book. If the slope
is negative, the Paper is most similar to Papers near the end of the Book. Assuming that style is
time-dependent, we would expect our regression slopes to become gradually more negative as
we repeat this analysis for successive test-Papers throughout the Book. And, in fact, this is more
or less what we observe in Figure 1.
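The per-Paper regression step can be sketched as follows (toy distance series, assuming numpy; the real analysis regresses each Paper's 196 Delta distances against paper sequence):

```python
import numpy as np

def sequence_slope(delta_distances):
    """Regress a Paper's Delta distances against sequence position and
    return the slope.  A positive slope means distances grow through
    the book, i.e. the test Paper most resembles Papers near the
    beginning; a negative slope means it most resembles Papers near
    the end."""
    positions = np.arange(len(delta_distances))
    slope, _intercept = np.polyfit(positions, delta_distances, 1)
    return slope

# Toy series: a "front-of-book" paper and a "back-of-book" paper.
early_like = [0.8, 0.9, 1.0, 1.1, 1.2]   # distances grow with position
late_like  = [1.2, 1.1, 1.0, 0.9, 0.8]   # distances shrink with position
print(sequence_slope(early_like) > 0, sequence_slope(late_like) < 0)
```

Plotting these slopes for successive test-Papers is what produces the roughly linear trend seen in the figures.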
Here again, however, we face the problem of exogenous variables. There are three major
genre groupings in the Urantia Book, and they fall more or less in sequence. Thus, the linear
pattern here could easily be an artifact of shifts in genre. In order to control for this problem,
each genre was tested internally against itself, excluding the rest of the Book. The same negative
linear patterns emerge in Figures 2, 3, and 4.
Figure 1
Figure 2
Figure 3
Figure 4
These figures suggest that, at the very least, each of the individual genres had a single
author. Time-dependence across genres is more difficult to assess. It is possible to conduct a
control test in which each individual text within a given genre is tested against the rest of the
Book, excluding the other texts of the test text’s genre. As Figure 5 shows, we still find a
negative linear trend for two of the three genres, but the third genre has a positive linear trend.
Moreover, in the cases of the first two genres, the fit of the data to the regression lines is not as
good as in the within-genre tests. There are a few possible explanations for these results. First,
our genre classifications may be inadequate. Second, the genres may be arranged out of
sequence. (For example, the undated fourth Part of the Urantia Book may have been written
prior to Parts I-III.) Third, the different genres may be the work of different authors. And fourth
and finally, an author’s “voice” for a given genre may evolve independently of his or her
“voices” for other genres, such that we would not expect time-dependence to be clearly
discernible across genres. (For example, a given word may be becoming more frequent in an
author’s narrative writings even as it is becoming less frequent in his or her theological writings.
Presumably, the more different two genres are from each other, the greater the probability that
they will exhibit independent developmental patterns.) It is difficult to know which, if any, of
the above propositions explains our results. Possibly more than one is true.
Figure 5
Conclusion and Directions for Future Research
The results reported here suggest that while Delta classification of word-frequency scores
may well be an accurate authorship attribution method for texts of the same general type, it
cannot be considered effective for attributing highly distinctive pseudonymous texts. Moreover,
analysts should be very careful in their assumptions about causality. That the results of a Delta
analysis are statistically significant does not mean that authorship is the variable they are
measuring. Exogenous variables, including genre, narrator, and sequence of composition, may
considerably complicate attempts to draw meaningful conclusions about authorship from a Delta
analysis. Identifying and finding ways to control for important variables may help achieve more
reliable results.
Even if the Delta method cannot be used to identify the author of a genre-distinctive text,
however, it may still be useful for drawing conclusions about the internal makeup of that text. If
linear patterns of stylistic development can be discerned in the text, it may indicate that the text
was written in sequence by a single author. Alternatively, deviations from this pattern of
development may indicate multiple authorship or out-of-sequence composition. Other texts
should be analyzed using this approach in order to assess its overall usefulness in making
determinations about authorship and order of composition.
Given the present state of statistical authorship methods, they are unlikely to supplant
more traditional modes of analysis for the foreseeable future. If we conclude that William S.
Sadler wrote the Urantia Book, for example, it must be based primarily on carefully collated
textual and historical evidence. Even so, statistical methods can perhaps provide some additional
insight, if carefully controlled and responsibly interpreted. In the present case, the statistics
suggest that the three major genre groupings in the Book seem likely to each have had a single
author, and this author may have been the same individual for all three groups. To say more than
that would require further methodological refinement beyond the scope of the present study.