
PIRLS: Poisson Iteratively Reweighted Least Squares Computer Program for Additive, Multiplicative, Power, and Non-linear Models

Leif E. Peterson, Center for Cancer Control Research, Baylor College of Medicine, Houston, Texas 77030

November 3, 1997

Abstract

A computer program to estimate Poisson regression coefficients, standard errors, Pearson χ2 and deviance goodness of fit (and residuals), leverages, and Freeman-Tukey, standardized, and deletion residuals for additive, multiplicative, power, and non-linear models is described. Data used by the program must be stored in and input from a disk file. The output file contains an optional list of the input data, observed and fitted count data (deaths or cases), coefficients, standard errors, relative risks and 95% confidence intervals, variance-covariance matrix, correlation matrix, goodness of fit statistics, and residuals for regression diagnostics.

Key Words: Poisson Regression, Iteratively Reweighted Least Squares, Regression Diagnostics, Matrix Manipulation.

1 INTRODUCTION

1.1 Poisson Distribution

Poisson regression models are commonly used for modeling risk factors and other covariates when the rates of disease are small, e.g., cancer [1, 2, 3, 4, 5]. An area of research in which Poisson regression of cancer rates is commonly employed is the investigation of radiation-induced cancer among Hiroshima and Nagasaki atomic bomb survivors [6, 7, 8] and individuals occupationally [9], medically [10] and environmentally [11] exposed to ionizing radiation. Poisson regression has also been used to investigate the effects of information bias on lifetime risks [12], diagnostic misclassification [13, 14], and dose-response curves for radiation-induced chromosome aberrations [15]. Several computer programs have been developed that perform Poisson regression [16, 17, 18, 19, 20]. While the PREG program [3] can fit additive, multiplicative, and non-linear models, it does not provide the capability to fit power models or to perform regression diagnostics. This report documents the PIRLS Poisson regression algorithm for fitting additive, multiplicative, power, and non-linear models and regression diagnostics using leverages of the Hat matrix, deletion residuals, and standardized residuals.

The critical data required to apply the computer code are:

• Number of cases for each stratum
• Number of person-years for each stratum
• Design matrix of independent variables

There is only one output format, listing the input data, observed and fitted count data (deaths or cases), coefficients, standard errors, relative risks (RR) and 95% confidence intervals, variance-covariance matrix, correlation matrix, goodness of fit statistics, and residuals for regression diagnostics. The program and all of its subroutines


are written in FORTRAN-77 and are run on a desktop PC with an AMD-K6-166MHz (MMX) chip. The algorithm can be compiled and run on Unix machines as well. The PC version of the executable requires 180,224 bytes of random access memory (RAM) to operate. Average execution time is several seconds per run, depending on the amount of input data.

2 PROGRAM DESCRIPTION

The computer program proceeds as follows. During run-time, the user's interaction with the program involves the following:

• Specifying the type of model to be fit (1-additive, 2-multiplicative, 3-power, and 4-non-linear)
• For additive models, specifying the power, ρ, which ranges from 1 for purely additive models to 0 for purely multiplicative models

• Specifying the number of independent variables (not cases or person-years) that are to be read from the input file

• Specifying the number of parameters to estimate
• For non-linear models, optional starting values for parameters
• For multiplicative models, specifying if the regression coefficients are to be exponentiated to obtain RR and its 95% CI

• Specifying whether or not the variance-covariance and correlation matrices are written to the output file whose name is specified by the user

• Specifying whether or not the input data are written to the output file whose name is specified by the user
• Specifying whether or not the observed and fitted count data (deaths or cases) are written to the output file
• Specifying whether or not the leverages, and standardized and deletion residuals are written to the output file

Once the data above have been specified during run time, the program does the following:

1. First, a “starting” regression is performed by regressing the ratio of counts (deaths or cases) to person-years of follow-up as the dependent variable on the independent variables. For non-linear models, the default starting regression can be skipped by specifying starting values for each parameter.

2. The differences between the observed and fitted counts (error residuals) from the starting regression are used as the dependent variable; the cross product of the inverse of the information matrix (variance-covariance matrix) and the score vector gives the solution vector.

3. The solution vector is then added to the original regression coefficients obtained during the starting regression.

4. Iterations are performed until the sum of the solution vector is below a fixed value called the convergence criterion, for which the default in PIRLS is 10−5.

5. After convergence is reached, the standard errors are obtained from the diagonal of the variance-covariance matrix, and if specified, residuals for regression diagnostics are estimated.

6. Depending on specifications made by the user, various data are written to the output file, and then the program terminates.

3 NOTATION AND THEORY

3.1 Binomial Basis of Poisson Regression

For large n and small p, e.g., a cancer rate of say 50/100,000 population, binomial probabilities are approximated by the Poisson distribution given by the form

P(x) = m^x e^(−m) / x! ,   (1)


Table 1: Deaths from coronary disease among British male doctors [21].

                    Non-smokers                      Smokers
Age group, i    di      Ti        λi(0)          di      Ti        λi(1)
35-44           2       18,790    0.1064         32      52,407    0.6106
45-54           12      10,673    1.1243         104     43,248    2.4047
55-64           28      5,710     4.9037         206     28,612    7.1998
65-74           28      2,585     10.8317        186     12,663    14.6885
75-84           31      1,462     21.2038        102     5,317     19.1838

where the irrational number e is 2.71828.... The Poisson distribution has also been widely used in science in its own right, without direct reference to the binomial distribution. As such, np is replaced by µ in the relationship

f(x) = µ^x e^(−µ) / x! ,   (2)

where µ is the mean of the Poisson distribution. A common rule of thumb is that, as long as npq is less than 5, the data may be treated as Poisson distributed.

3.2 Poisson Assumption

The Poisson assumption states that, for stratified data, the number of deaths, d, in each cell takes the values x = 0, 1, 2, . . . according to the formula

P(d = x) = (λT)^x e^(−λT) / x! ,   (3)

where λ is the rate parameter, which is equal to the number of deaths in each cell divided by the respective number of person-years (T) in that same cell. As an example, Table 1 lists a multiway contingency table for the British doctor data for coronary disease and smoking [21]. The rates for non-smokers, λi(0), are simply the ratio of the number of deaths to Ti in age group i and exposure group 0, whereas the rates for smokers, λi(1), are the ratio of deaths to Ti in age group i for exposure group 1.
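As a concrete illustration of the rate parameter, the short Python sketch below (an illustration only, not part of the FORTRAN program; NumPy is assumed) computes λi(0) and λi(1) from the counts and person-years in Table 1, expressed per 1,000 person-years:

import numpy as np

# British doctors data, Table 1 (Doll and Hill, 1966)
deaths_nonsmokers = np.array([2, 12, 28, 28, 31])
pyears_nonsmokers = np.array([18790, 10673, 5710, 2585, 1462])
deaths_smokers = np.array([32, 104, 206, 186, 102])
pyears_smokers = np.array([52407, 43248, 28612, 12663, 5317])

# Rate parameter lambda = deaths / person-years, per 1,000 person-years
rate_nonsmokers = deaths_nonsmokers / pyears_nonsmokers * 1000.0
rate_smokers = deaths_smokers / pyears_smokers * 1000.0

print(np.round(rate_nonsmokers, 4))  # [ 0.1064  1.1243  4.9037 10.8317 21.2038]
print(np.round(rate_smokers, 4))     # [ 0.6106  2.4047  7.1998 14.6885 19.1838]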

3.3 Ordinary Least Squares

Let us now look at the sum of squares for common linear regression models. We recall that most of our linear regression experience is with the model

Y = Xβ + ε ,   (4)

where Y is a column vector of responses, X is the data matrix, β is the column vector of coefficients and ε is the error. Another way of expressing the observed dependent variable, yi, of (4) in model terms is

yi = β0 + β1xi1 + β2xi2 + · · · + βjxij + εi ,   (5)

and in terms of the fitted values, ŷi, and residuals, ei, is

yi = β0 + β1xi1 + β2xi2 + · · · + βjxij + ei .   (6)

The residual sum of squares of (6) is

SS(β) = Σ_{i=1}^{n} {yi − ŷi}² = Σ_{i=1}^{n} ei² = e′e .   (7)


Because of the relationship

e = Y − Xβ ,   (8)

we can rewrite (7) in matrix notation as

SS(β) = (Y −Xβ)′(Y −Xβ). (9)

3.4 Partitioning Error Sum of Squares for Weighted Least Squares

Suppose we have the general least squares equation

Y = Xβ + ε ,   (10)

and according to Draper and Smith [22] set

E(ε) = 0 ,   V(ε) = Vσ² ,   ε ∼ N(0, Vσ²) ,   (11)

and say that

P′P = PP′ = P² = PP = V .   (12)

Now if we let

f = P⁻¹ε ,   E(f) = 0 ,   (13)

and then premultiply (10) by P⁻¹, as in

P⁻¹Y = P⁻¹Xβ + P⁻¹ε ,   (14)

or

Z = Qβ + f .   (15)

By applying basic least squares methods to (14), the residual sum of squares is

f′f = ε′(P⁻¹)′P⁻¹ε ,   (16)

since

(P⁻¹ε)′ = ε′(P⁻¹)′ ,   (17)

and since

(B⁻¹A⁻¹) = (AB)⁻¹ ,   (18)

then

f′f = ε′(PP)⁻¹ε ,   (19)

which is equivalent to

f′f = ε′V⁻¹ε .   (20)

The residual sum of squares for (14) then becomes

SS(β) = (Y − Xβ)′V⁻¹(Y − Xβ) .   (21)
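To make (21) concrete, the following Python sketch (illustrative only, not the PIRLS source) minimizes the weighted sum of squares by solving the weighted normal equations β = (X′V⁻¹X)⁻¹X′V⁻¹Y; the data values are made up for the example and an invertible X′V⁻¹X is assumed:

import numpy as np

def weighted_least_squares(X, Y, V):
    """Minimize (Y - X b)' V^{-1} (Y - X b) over b."""
    Vinv = np.linalg.inv(V)                  # V is the error covariance up to sigma^2
    XtVinv = X.T @ Vinv
    return np.linalg.solve(XtVinv @ X, XtVinv @ Y)

# Hypothetical example: 4 observations, 2 parameters, diagonal V
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Y = np.array([1.1, 2.9, 5.2, 6.8])
V = np.diag([1.0, 0.5, 0.5, 2.0])            # unequal error variances
print(weighted_least_squares(X, Y, V))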

3.5 Partitioning Error Sum of Squares for Iteratively Reweighted Least Squares

According to McCullagh and Nelder [23], a generalized linear model is composed of the following three parts:

• Random component, λ, which is normally distributed with mean (1/Ti)·µi and variance µi/Ti².

• Systematic component, x, the covariates related to the linear predictor, (x′iβ), by the form

(x′iβ) = Σ_{j=1}^{p} xijβj .   (22)

Table 2: The rate parameter and its various link functions in PIRLS.

Form of predictor for λ                              Model             Risk coefficients
λi(x′iβ) = (x′iβ)                                    Additive          Absolute Risk
λi(x′iβ) = exp(x′iβ)                                 Multiplicative    Relative Risk
λi(x′iβ) = (x′iβ)^(1/ρ)                              Power             N/A
λi(x′iβ) = β1xi1[1 − {1 − e^(−β2xi2)}^β3]            Non-linear        N/A

• The link between the random and systematic components for a linear model is given by

λ̂i = λi(x′iβ) ,   (23)

where λi is the observed rate parameter and λ̂i = λi(x′iβ) is the predicted rate parameter. It should be pointed out that the rate parameter, λi(x′iβ), serves as its own link function. Thus, the rate parameter provides a linearization of our distributional models so that we can use multiple linear regression to obtain Best Linear Unbiased Estimators (BLUEs). In effect, the rate parameter allows us to reduce non-linear (non-Gaussian) models to linear models so that we can solve the system of linear equations. The rate parameter can take on various linear and non-linear functions, shown in Table 2. The power transformation was adapted from [24].

As shown in Table 2, the random component, λi, can take on a non-linear relationship with the systematic component, x. To linearize λi in a least squares approach by a succession of stages or iterations, we carry out a Taylor series expansion of λi about a point β0. Since we know that

g(β) ≃ g(β0) + g′(β0)(β − β0) ,   (24)

a Taylor series expansion of λi about β0 gives us

λi ≃ λi(x′iβ0) + Σ_{j=1}^{p} [∂λi(x′iβ)/∂βj]_{β=β0} (βj − β0j) + ei .   (25)

Rearranging this equation, we have

λi − λi(x′iβ0) = Z0δ0 + ei , (26)

where

Z0ij = [∂λi(x′iβ)/∂βj]_{β=β0} ,   (27)

and

δ0i = βi − β0i . (28)

The Jacobian matrix of partial derivatives of λi(x′iβ) with respect to the parameters β at the zeroth iteration is then

Z0 = [ z11  z12  · · ·  z1p
       z21  z22  · · ·  z2p
       ...  ...  . . .  ...
       zn1  zn2  · · ·  znp ] = [Z0ij],  n × p .   (29)

From (26), the residuals, ei, now become

ei = λi − λi(x′iβ)− Zδ . (30)


3.5.1 Development of Dependent Variable, Y

Using (30), let λi − λ0i(x′iβ) be equal to Yi. We now rewrite the residuals

ei = Yi − Zδ ,   (31)

so that the column vector Y of dependent variables becomes

Y = [ λ1 − λ01(x′1β)
      λ2 − λ02(x′2β)
      ...
      λn − λ0n(x′nβ) ] .   (32)

3.5.2 Development of Weighting Matrix, W

We know that Var(cX) = c² Var(X), so if we substitute in λ we have

Var((1/Ti)·µi) = (1/Ti)² Var(µi) = µi/Ti² .   (33)

Therefore, each diagonal element of the weight matrix becomes

wii = 1/Var(λi) = Ti²/µi = (Ti · Ti)/(λi · Ti) = Ti/λi .   (34)

Because λi can be very small and Ti very large, and with a computer it is better to divide by a larger number than a smaller one, we choose the weight Ti²/µi. The entire weighting matrix W is then

W = [ T1²/µ1    0         · · ·   0
      0         T2²/µ2    · · ·   0
      ...       ...       . . .   ...
      0         0         · · ·   Tn²/µn ] .   (35)
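As a small illustration (not the FORTRAN code), the diagonal weights wii = Ti²/µi of (35) can be formed directly from the person-years and the fitted counts; the values below are the person-years (per 1,000) and rounded fitted counts for the non-smoker records of Example 1 in the Appendix:

import numpy as np

T = np.array([18.790, 10.673, 5.710, 2.585, 1.462])   # person-years / 1,000
mu_hat = np.array([1.6, 17.5, 36.0, 35.0, 28.0])      # fitted counts (rounded, from Example 1)

w = T**2 / mu_hat               # diagonal elements w_ii = T_i^2 / mu_hat_i
W = np.diag(w)                  # the full weighting matrix of equation (35)
print(np.round(w, 4))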

3.5.3 Defining Maximum Likelihood Solution Vector, δ

Let V⁻¹ = W, and now substitute our new definition of the residuals from (31) into (21) to obtain the new sum of squares

SS(δ) = (Y − Zδ)′W(Y − Zδ)   (36)
      = Y′WY − (Zδ)′WY − Y′WZδ + δ′Z′WZδ .   (37)

The transpose (Zδ)′ of the term (Zδ)′WY in (37) is equal to δ′Z′ because (AB)′ = B′A′. In addition, since (Zδ)′WY is a 1 × 1 matrix, or scalar, its transpose Y′WZδ has the same value. Thus, (37) is transformed into

SS(δ) = Y′WY − 2δ′Z′WY + δ′Z′WZδ .   (38)

The δ′ factors on the right side of (38) drop out upon differentiation because, if we set

A = (Z′WY) ,   (39)

and from Searle [25] know that

∂(x′A)/∂x = A ,   (40)

and if we set

A = (Z′WZ) ,   (41)

and from [25] know that

∂(x′Ax)/∂x = 2Ax ,   (42)

then the derivative of the quadratic term in (38) is

∂(δ′Z′WZδ)/∂δ = 2Z′WZδ .   (43)

In order to minimize (38) with respect to the δ parameters, we take the first partial derivative. This gives us

∂SS(δ)/∂δ = −2Z′WY + 2(Z′WZ)δ ,   (44)

which, set equal to zero, provides us with the consistent equations

(Z′WZ)δ = Z′WY ,   (45)

which have the solution

δ = (Z′WZ)⁻¹Z′WY .   (46)

At the ith iteration, the values of the parameters are

βi = βi−1 + δ .   (47)

Convergence is reached at the point when

{βi − β(i−1)}/β(i−1) < ε ,   (48)

where ε is 10⁻⁵ by default in PIRLS.
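A minimal sketch of one update, (46)-(48), in Python (illustrative, not the FORTRAN implementation; it assumes the caller supplies the Jacobian Z, weight matrix W, and working dependent variable Y at the current iteration, and that no coefficient is exactly zero in the convergence test):

import numpy as np

def irls_step(beta, Z, W, Y):
    """One iteration of (46)-(47): delta = (Z'WZ)^{-1} Z'WY, beta_new = beta + delta."""
    ZtW = Z.T @ W
    delta = np.linalg.solve(ZtW @ Z, ZtW @ Y)
    return beta + delta, delta

def has_converged(beta_new, beta_old, eps=1e-5):
    """Relative-change criterion of (48); assumes beta_old has no zero entries."""
    return np.all(np.abs((beta_new - beta_old) / beta_old) < eps)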

3.6 Variance-covariance Matrix

The variance-covariance matrix, with parameter variances on the diagonal and covariances on the off-diagonals, at convergence is the inverse of the information matrix (the weighted dispersion matrix)

I⁻¹_β = (Z′WZ)⁻¹ = [ σ1²   σ12   · · ·   σ1j
                     σ12   σ2²   · · ·   σ2j
                     ...   ...   . . .   ...
                     σ1j   σ2j   · · ·   σj² ] .   (49)

3.7 Standard Errors of Regression Coefficients

The standard errors of the regression coefficients are equivalent to √(σj²). A Wald test for each parameter is simply βj/s.e.(βj) and indicates significance if the ratio is < −1.96 or > 1.96 at the α = 0.05 level of significance.
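For example, given the inverse information matrix of (49), the standard errors and Wald statistics can be read off as in the following sketch (illustrative Python; the numbers are loosely based on the first two coefficients of Example 1 in the Appendix, with the covariance entries rounded to three decimals, so the printed values differ slightly):

import numpy as np

covariance = np.array([[0.004, 0.003],
                       [0.003, 0.047]])      # rounded (Z'WZ)^{-1} entries for B1, B2 of Example 1
beta = np.array([0.0841, 1.6407])

se = np.sqrt(np.diag(covariance))             # standard errors = sqrt of the diagonal
wald = beta / se                              # significant at alpha = 0.05 if |wald| > 1.96
print(np.round(se, 4), np.round(wald, 2))     # close to the printed .0661/.2179 and 1.27/7.53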

3.8 Residuals and Regression Diagnostics

The error residual, ei, has mean 0 and measures the lack of concordance of the observed µi against the fitted value µ̂i. The greater the value of ei, the worse the fit of the model. Pearson residuals, rP, are estimated as

rP = (µi − µ̂i)/√µ̂i ,   (50)

and provide a measure of the residual variation. The deviance residuals

rD = 2[µi log(µi/µ̂i) + (µ̂i − µi)] ,   (51)

allow the investigator to determine the goodness of fit of the model, i.e., how well the observed data agree with the fitted values. Freeman and Tukey [26] introduced a variance-stabilized residual for Poisson models given by

rFT = √µi + √(µi + 1) − √(4µ̂i + 1) .   (52)

The leverage [23] for each observation, hi, provides information about the geometric distance between a given point (zi1, zi2, · · · , zip) and the centroid of all points in the predictor space. Individual leverages, hi, are obtained from the diagonal of the “Hat” matrix given by

H = √W Z(Z′WZ)⁻¹Z′√W .   (53)

Because Σ hi = trace(H) = p, observations with high leverage can approach unity. A criterion introduced by Hoaglin and Welsch [27] for determining high leverage states that a point has high leverage if hi > 2p/n. Points with high leverage are located at a substantial distance from the fitted regression line, and act to “pull” the fitted line toward their location. Increases in the goodness of fit of a model can be obtained by refitting the model after the observation(s) with high leverage have been removed.

Cook [28] introduced the residual, ∆(β)−ij, which measures the difference between β when the observation is included in the model and β when the observation is not in the model. The more positive the value of ∆(β)−ij, the more influence an observation has when compared with other observations. The individual ∆(β)−ij residuals, forming an n × p matrix, are obtained by the relationship

∆(β)−ij = −(Z′WZ)⁻¹ zij ei wii / (1 − hi) .   (54)

Finally, the standardized residual is used to measure the model error introduced by a particular observation when its error residual has constant variance:

ri = ei / √(1 − hi) .   (55)

Values for ri that exceed 1.96 tend to be outliers at the α = 0.05 level of significance.
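The diagnostics of (53)-(55) can be sketched as follows (illustrative Python, not the FORTRAN routine FIT; it assumes the Jacobian Z, diagonal weight matrix W, and error residuals e are already available from a converged fit):

import numpy as np

def diagnostics(Z, W, e):
    """Leverages from the hat matrix (53), standardized residuals (55), and the 2p/n flag."""
    Wsqrt = np.sqrt(W)                               # W is diagonal, so sqrt is element-wise
    ZtWZ_inv = np.linalg.inv(Z.T @ W @ Z)
    H = Wsqrt @ Z @ ZtWZ_inv @ Z.T @ Wsqrt           # hat matrix, equation (53)
    h = np.diag(H)                                   # leverages h_i; their sum equals p
    r = e / np.sqrt(1.0 - h)                         # standardized residuals, equation (55)
    high_leverage = h > 2 * Z.shape[1] / Z.shape[0]  # Hoaglin-Welsch criterion h_i > 2p/n
    return h, r, high_leverage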

3.9 Goodness-of-Fit Statistics

The Pearson χ2 goodness of fit is

χ² = Σ_{i=1}^{n} rP² ,   (56)

and the deviance goodness of fit is

D = Σ_{i=1}^{n} rD ,   (57)

which is also χ2 distributed. If a model fits, its χ2 and D will be lower than the degrees of freedom (n− p).
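The two statistics can be sketched directly from observed and fitted counts (illustrative Python; the observed and fitted values below are taken from the non-linear example in the Appendix, and small differences from the printed output are due to rounding of the fitted counts; zero observed counts would need the convention 0·log(0) = 0):

import numpy as np

def goodness_of_fit(obs, fitted, n_params):
    """Pearson chi-square (56) and deviance (57) with their degrees of freedom."""
    pearson = np.sum((obs - fitted) ** 2 / fitted)
    deviance = np.sum(2.0 * (obs * np.log(obs / fitted) - (obs - fitted)))
    dof = obs.size - n_params
    return pearson, deviance, dof

obs = np.array([60.0, 66.0, 46.0, 82.0, 105.0, 123.0, 12.0])
fitted = np.array([57.3, 73.0, 37.5, 91.0, 101.4, 113.9, 19.9])
print(goodness_of_fit(obs, fitted, 3))   # roughly (7.6, 8.0, 4), as in the Example 4 output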

3.10 Structure

Program flow is outlined in Figure 1. The starting regression to obtain the initial values of the parameter vector, βi, is performed by calling:

CALL INIT(NREG, NCOL, NROW, ISTART, START, B, G, RHO, IPRED, IDATA, INFILE, OUTFILE, IFLT)

For non-linear models, the user can specify starting values for the parameters and skip the multiplicative fit to obtain starting parameter estimates. The solution vector, δ, used to modify the regression coefficients during each iteration is determined through calls of:

CALL PRED(NCOL, NROW, G, IDF, DEV, TPR, NREG, RHO, IPRED, ZTWZ1, ZTWY, ITER, WIN, YDIFF, OEXPE, OOBS, OFT, OPR, XIN, XT, IFLT)

Lastly, the leverages, deletion residuals, and standardized residuals are obtained by calling:

CALL FIT(WIN, YDIFF, XIN, XT, ZTWZ1, OEXPE, LEV, DRES, SRES, NCOL, NROW, IFLT)


[Figure 1: Program flow in PIRLS. Start; read di, Ti, and X; estimate λi = di/Ti and wi = Ti; starting regression β = (X′WX)⁻¹X′WY with Y = λ (additive), log(λ) (multiplicative), λ^ρ (power), or log(λ) (non-linear, optional). Iteratively reweighted least squares: estimate Yi = di − λ(x′β)Ti, then wii and zij, with λ(x′β) = x′β (additive), exp(x′β) (multiplicative), (x′β)^(1/ρ) (power), or β1xi1[1 − {1 − e^(−β2xi2)}^β3] (non-linear example); compute δi+1 = (Z′WZ)⁻¹Z′WY and update βi+1 = βi + δi+1; repeat while ||δi+1|| > 10⁻⁵ and stop when ||δi+1|| < 10⁻⁵.]

Figure 1: Program flow in PIRLS.


The parameters, arrays, and variables used in PIRLS are defined below.

MP=4000        Parameter  Constant  Fixed array size for maximum number of records
NP=28          Parameter  Constant  Fixed array size for maximum number of variables
IP=1           Parameter  Constant  Fixed array size for when 1 is needed
NIP=16         Parameter  Constant  Fixed array size for when more than 1 is needed
INFILE         Character  Input     Input filename and path
OUTFILE        Character  Input     Output filename and path
NROW           Integer    Input     Number of input records read
NCOL           Integer    Input     Number of parameters specified by user
ISTART         Integer    Input     For non-linear models; 0=user-specified starting values, 1=starting values estimated by multiplicative fit in subroutine INIT
START          Real       Input     For non-linear models; starting values specified by user when ISTART=0
IPRED          Integer    Input     Number of parameters to be read in from input file
NREG           Integer    Input     Type of regression model to be fit (1-additive, 2-multiplicative, 3-power, 4-non-linear)
IDATA          Integer    Input     Write input data to output file, 1=yes, 0=no
ICOVAR         Integer    Input     Write variance-covariance and correlation matrices to output file, 1=yes, 0=no
ICELLS         Integer    Input     Write observed and fitted counts, di, to output file, 1=yes, 0=no
IEX            Integer    Input     Exponentiate regression coefficients, for multiplicative models only, i.e., NREG=2
IREG           Integer    Input     Estimate standardized and deletion residuals and leverages, 1=yes, 0=no
RHO            Real       Input     Power transform for power models, ρ
OBS(MP)        Real       Input     Number of deaths or cases in subgroup (record), µi
T(MP)          Real       Input     Number of person-years of follow-up in subgroup (record), T
X(MP,NP)       Real       Work      Design matrix of input independent variables, X
LAMBDA(MP)     Real       Work      Observed rate, λi
LAMHAT(MP)     Real       Work      Predicted rate, λ̂i
EXPE(MP)       Real       Work      Expected deaths or cases in subgroup, µ̂i = λ̂i × T
Y(MP)          Real       Work      Error residual, Yi = µi − µ̂i
W(MP)          Real       Work      Variance weights, W = T²/Var(λi)
Z(MP,NP)       Real       Work      Jacobian matrix of first partial derivatives, Z
ZTWZ(MP,NP)    Real       Work      Information matrix, (Z′WZ)
ZTWZ1(NP,NP)   Real       Work      Inverse of information matrix, (Z′WZ)⁻¹
ZTWY(NP,1)     Real       Work      Score vector, Z′WY
FT(MP)         Real       Work      Freeman-Tukey residuals, rFT
PR(MP)         Real       Work      Pearson χ² residuals, rP
DEV            Real       Work      Deviance residuals, rD
VARCOV(NP,NP)  Real       Work      Variance-covariance matrix used for output, (Z′WZ)⁻¹
G(NP,1)        Real       Work      Solution vectors, δ
B(NP,1)        Real       Work      Regression coefficients updated at each iteration, βi
SE(NP)         Real       Work      Standard error of βi
CORR(NP,NP)    Real       Work      Correlation matrix of coefficients
LL(NP)         Real       Work      Lower bound of 95% confidence limit for relative risk (multiplicative model only)
UL(NP)         Real       Work      Upper bound of 95% confidence limit for relative risk (multiplicative model only)
ML(NP)         Real       Work      Relative risk (multiplicative model only)
LEV(MP,MP)     Real       Work      Leverages from the Hat matrix, hi
DRES(MP,NP)    Real       Work      Deletion residuals, ∆(β)ij
SRES(MP)       Real       Work      Standardized residuals, ri
IDF            Integer    Work      Degrees of freedom used for goodness of fit
DEV            Real       Work      Deviance goodness of fit, D
TPR            Real       Work      Pearson χ² goodness of fit, χ²
IFLT           Integer    Output    Premature exit codes:
                                    0=Convergence reached
                                    1=NCOL < 1
                                    2=NROW < 1
                                    3=Degrees of freedom (NROW−NCOL) < 1
                                    4=|Σβj| = 0
                                    5=|ΣGj| = 0
                                    6=Iterations > 100


3.11 Auxiliary Algorithms

MATMUL [29] performs matrix multiplication for the matrix manipulation. TRNPOS transposes arrays as needed for matrix multiplication. MATINV calls SVDCMP, based on [30], to invert the information matrix using singular value decomposition. ALG and GAMAIN are used for estimating tabled values of χ².

3.12 Restriction

All input parameters are checked for nullity and a fault message is returned if there is an illegal entry.

3.13 Time

A thorough investigation of absolute timing has not been performed; however, it should be noted that execution time is a function of the number of model parameters and effects to be considered.

3.14 Precision

The algorithm may be converted to double precision by making the following changes:

1. Change REAL to DOUBLE PRECISION in the algorithm.

2. Change the constants to double precision.

3. Change EXP to DEXP and ALOG to DLOG in all applicable routines.

4. Make appropriate changes in auxiliary routines MATMUL, TRNPOS, MATINV, SVDCMP, and in functions ALG and GAMAIN.

3.15 Application - Fitting a Multiplicative Model

3.15.1 Data Arrangement and Partial Derivatives

We shall enumerate the steps required for fitting a multiplicative model. Before we start the regression run, however, we must set up the data file and also determine the first partial derivatives of the rate parameter λ with respect to each parameter in the model.

1. Arranging the data in an ASCII text file. The first step is to arrange our data in an ASCII text file. The typical ASCII text file arrangement of data for a Poisson regression run using data in Table 1 is shown in the following:

Cases T/1,000 Age Smoke

---------------------------------

2. 18.790 1 0 0 0 0 0

12. 10.673 0 1 0 0 0 0

28. 5.710 0 0 1 0 0 0

28. 2.585 0 0 0 1 0 0

31. 1.462 0 0 0 0 1 0

32. 52.407 1 0 0 0 0 1

104. 43.248 0 1 0 0 0 1

206. 28.612 0 0 1 0 0 1

186. 12.663 0 0 0 1 0 1

102. 5.317 0 0 0 0 1 1

where the person-years in Table 1 are divided by 1,000 and the covariates denoting the age group are dummy coded (0,1), with a 1 to indicate that the rates are from a given cell and whether or not the subgroup smoked.


2. Determining Partial Derivatives of λi(x′iβ) With Respect to Each βj. Next we must write out the partial derivatives of the rate parameter with respect to each parameter (these will be used later in the algorithm). The partial derivatives are used in the Jacobian Z matrix to form the scores during each iteration. Thus, each partial derivative forms an element, zij, in the Jacobian Z matrix. Given that the partial derivative of an exponentiated function of the form e^u is e^u du, the partial derivative of λi(x′iβ) with respect to each parameter, βj, is as follows:

zij = ∂λi(x′iβ)/∂βj = e^(x′iβ) xij ,   (58)

3.15.2 Steps in the Algorithm

The following section discusses the steps taken by the algorithm for performing iteratively reweighted least squares.

1. Definition of the model. The model we will fit is a log-linear model

λi(x′iβ) = e^(β1xi1 + β2xi2 + β3xi3 + β4xi4 + β5xi5 + β6xi6) ,   (59)

which is equivalent to

log(λi(x′iβ)) = (β1xi1 + β2xi2 + β3xi3 + β4xi4 + β5xi5 + β6xi6) . (60)

2. Estimate column vector of dependent variables for the log-linear model. The algorithm starts by reading in the data from the input file and then estimates the rate parameter, λi, for each record as the ratio of the number of observed deaths to the number of observed person-years. The log of λi is then calculated to obtain the dependent variable column vector

Y = [ log(d1/(T1/1,000))
      log(d2/(T2/1,000))
      ...
      log(d10/(T10/1,000)) ] ,   (61)

and when substituting in the data above we get

Y = [ log(2/18.790)
      log(12/10.673)
      log(28/5.710)
      log(28/2.585)
      log(31/1.462)
      log(32/52.407)
      log(104/43.248)
      log(206/28.612)
      log(186/12.663)
      log(102/5.317) ] .   (62)


3. Construction of the X matrix of independent variables. In addition to the estimation of the dependent variable Y column vector, the algorithm constructs the X matrix of independent variables from the independent variables read in from the input file as

X = [ 1 0 0 0 0 0
      0 1 0 0 0 0
      0 0 1 0 0 0
      0 0 0 1 0 0
      0 0 0 0 1 0
      1 0 0 0 0 1
      0 1 0 0 0 1
      0 0 1 0 0 1
      0 0 0 1 0 1
      0 0 0 0 1 1 ] .   (63)

4. “Starting Regression” - log-linear regression to estimate initial values of parameters. Once the Y and X matrices are constructed, the algorithm fits the log-linear model

log(λi(x′iβ)) = (β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6) ,   (64)

which in matrix terms is given as

β = (X′WX)⁻¹X′WY ,   (65)

where the diagonal elements of W are set equal to T. Figure 1 shows the program flow in PIRLS for the various models. Table 3 shows the link functions for λ for each starting regression and the linear predictor used for estimating λi(x′iβ) during subsequent iterations.

Table 3: Link functions used during “starting regression” and linear predictor used to estimate λ during subsequent iterations.

Model                     Link function for λ    Form of predictor for λi(x′iβ)
Additive                  λi                     λi(x′iβ) = (x′iβ)
Multiplicative            log(λi)                λi(x′iβ) = exp(x′iβ)
Power                     λi^ρ                   λi(x′iβ) = (x′iβ)^(1/ρ)
Non-linear (optional)∗    log(λi)                λi(x′iβ) = exp(x′iβ)

∗ For non-linear models, the user can specify starting values for parameters, rather than using the default multiplicative fit to obtain starting values.

5. Estimation of predicted rate parameter. After the first regression or iteration, the regression coefficients, βj, are used to estimate predicted rates from the linear predictor

λi(x′iβ) = e^(Σ_{j=1}^{p} xijβj) = e^(x′iβ) .   (66)

6. Estimation of expected number of deaths. The predicted rates are then multiplied by the observed person-years in each record to give the expected number of deaths as

µ̂i = d̂i(x′iβ) = λi(x′iβ)Ti .   (67)

7. Estimation of dependent variable. The expected number of deaths, µ̂i, is then subtracted from the observed deaths, µi, to give the dependent variable

Yi = µi − µ̂i = di − λi(x′iβ)Ti .   (68)

8. Construction of column vector Y. The column vector Y is then equal to [Yi].


Table 4: Survival data for splenic stem cell colony formation after irradiation and transplantation (see ref. [31]).

Colonies, µ    Trials, Ti    Cell concentration, xi1    Dose (Sv), xi2
60.0           6.0           1.25                       0.00
66.0           7.0           1.75                       0.96
46.0           4.0           3.00                       1.92
82.0           9.0           7.20                       2.88
105.0          11.0          24.00                      4.32
123.0          15.0          75.00                      5.76
12.0           4.0           120.00                     6.72

9. Estimation of the partial derivatives for each row vector of the data matrix X. For each record in the data matrix, X, estimate the value of the partial derivatives using the current parameter values, βj. For our log-linear model in (59), we get

zij = ∂λi(x′iβ)/∂βj = e^(x′iβ) xij .   (69)

10. Construction of the Jacobian matrix of partial derivatives. Now the algorithm constructs the n × p Jacobian matrix Z composed of the first partial derivatives estimated above, which appears as Z = [zij].

11. Construction of the inverse variance weighting matrix. Since the Poisson variance of µi changes at each iteration, we must weight the dispersion matrix with weights defined by Ti²/µ̂i, given as W = diag[wii], where wii = Ti²/µ̂i.

12. Estimation of the solution vector at each iteration. The maximum likelihood solutions at the next and subsequent iterations are obtained through the matrix manipulation

δ = (Z′WZ)−1Z′WY . (70)

13. Updating regression coefficients. At each iteration, the new solution vectors are added to the previous parameter estimates given as

βi = βi−1 + δ . (71)

14. Repeat steps 5-13 above until convergence is reached. The above steps (5 through 13) are repeated until the Euclidean norm of the score vector is below some non-negative value

{βi − β(i−1)}/β(i−1) < ε , (72)

where ε is 10−5 by default in PIRLS. At this point, convergence is reached at a global maximum.
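The following self-contained Python sketch walks through steps 1-14 for the multiplicative model on the Table 1 data (an illustration of the algorithm, not the FORTRAN source; NumPy is assumed). The final coefficients should agree closely with the Example 2 output in the Appendix:

import numpy as np

# British doctors data (Table 1): deaths d, person-years/1,000 T, design matrix X
d = np.array([2., 12., 28., 28., 31., 32., 104., 206., 186., 102.])
T = np.array([18.790, 10.673, 5.710, 2.585, 1.462,
              52.407, 43.248, 28.612, 12.663, 5.317])
X = np.zeros((10, 6))
X[np.arange(10), [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]] = 1.0   # age-group dummies
X[5:, 5] = 1.0                                            # smoking indicator

# Step 4: starting log-linear regression, beta = (X'WX)^{-1} X'W log(lambda), W = diag(T)
lam = d / T
W0 = np.diag(T)
beta = np.linalg.solve(X.T @ W0 @ X, X.T @ W0 @ np.log(lam))

# Steps 5-14: iteratively reweighted least squares for the multiplicative model
for iteration in range(100):
    rate = np.exp(X @ beta)             # predicted rates, equation (66)
    mu_hat = rate * T                   # expected deaths, equation (67)
    Y = d - mu_hat                      # dependent variable, equation (68)
    Z = rate[:, None] * X               # Jacobian, equation (69): z_ij = exp(x'b) x_ij
    W = np.diag(T**2 / mu_hat)          # weights w_ii = T_i^2 / mu_hat_i
    ZtW = Z.T @ W
    delta = np.linalg.solve(ZtW @ Z, ZtW @ Y)   # solution vector, equation (70)
    beta = beta + delta                          # update, equation (71)
    if np.max(np.abs(delta)) < 1e-5:             # convergence criterion
        break

# Should be close to the Example 2 output:
# -1.0116  0.4724  1.6159  2.3389  2.6885  0.3545
print(np.round(beta, 4))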

3.16 Application - Fitting a Non-linear Model

For this example, we will fit a non-linear model to estimate the number of stem cell colonies in spleens of laboratory mice into which irradiated cells were transplanted [31]. Table 4 shows the colonies formed, µ, number of trials, Ti, cell concentration, xi1, and radiation dose (Sv), xi2, arranged in tabular notation. The non-linear model we will fit to obtain maximum likelihood estimates of colonies formed is of the form

λi = β1xi1[1− {1− e(−β2xi2)}β3 ] , (73)

where xi1 is the concentration of transplanted cells, xi2 is the radiation dose (Sv), and β1, β2, and β3 are the parameters of interest.


In order to build the Jacobian matrix, Z, we need to take the first partial derivatives of λi with respect to β1, β2, and β3, i.e., ∂λi/∂β1, ∂λi/∂β2, and ∂λi/∂β3. Since there are three β coefficients, the first column entries for row i of Z, i.e., zi1, will be estimated by use of the equation

∂λi/∂β1 = xi1[1 − {1 − e^(−β2xi2)}^β3] .   (74)

The first partial derivative in the 2nd column entry of row i of Z, i.e., zi2, is estimated as

∂λi/∂β2 = −β1xi1{1 − e^(−β2xi2)}^β3 β3xi2 e^(−β2xi2) / (1 − e^(−β2xi2)) .   (75)

The third column of Z has elements zi3, which are estimated as

∂λi/∂β3 = −β1xi1{1 − e^(−β2xi2)}^β3 ln(1 − e^(−β2xi2)) .   (76)

When fitting this model, we chose to specify starting values of β1 = 7.6364, β2 = 0.9341, and β3 = 2.8924 from [3], rather than use the default multiplicative fit to obtain starting values. Input and output for this model using the data in Table 4 are shown in the Appendix.
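As an illustration (not the FORTRAN code), the model (73) and the partial derivatives (74)-(76) can be written directly from Table 4 in Python and evaluated at the starting values to obtain one row of Z per record; the guards handle the first record, where xi2 = 0 makes the inner term vanish:

import numpy as np

# Table 4: trials T, cell concentration x1, dose (Sv) x2
T = np.array([6., 7., 4., 9., 11., 15., 4.])
x1 = np.array([1.25, 1.75, 3.00, 7.20, 24.00, 75.00, 120.00])
x2 = np.array([0.00, 0.96, 1.92, 2.88, 4.32, 5.76, 6.72])
b1, b2, b3 = 7.6364, 0.9341, 2.8924            # starting values used in Example 4

u = 1.0 - np.exp(-b2 * x2)                      # inner term of equation (73)
lam = b1 * x1 * (1.0 - u**b3)                   # rate, equation (73)

# Jacobian columns, equations (74)-(76); both derivative limits are 0 when u = 0
z1 = x1 * (1.0 - u**b3)
with np.errstate(divide='ignore', invalid='ignore'):
    z2 = np.where(u > 0, -b1 * x1 * u**b3 * b3 * x2 * np.exp(-b2 * x2) / u, 0.0)
    z3 = np.where(u > 0, -b1 * x1 * u**b3 * np.log(u), 0.0)
Z = np.column_stack([z1, z2, z3])

expected = lam * T                              # close to the EXP column of the Example 4 output
print(np.round(expected, 1))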

3.17 Input

During run-time, the user must respond to the following queries concerning each run:

• The type of model to be fit (1-additive, 2-multiplicative, 3-power, and 4-non-linear)
• For additive models, the power, ρ, which ranges from 1 for purely additive models to 0 for purely multiplicative models
• The number of independent (input) variables to be read in from the input file
• The number of parameters to estimate
• For non-linear models, optional starting values for parameters
• For multiplicative models, whether the regression coefficients are to be exponentiated to obtain RR and its 95% CI
• Whether or not the variance-covariance and correlation matrices are written to the output file
• Whether or not the input data are written to the output file
• Whether or not the observed and fitted count data (deaths or cases) are written to the output file
• Whether or not the leverages, and standardized and deletion residuals are written to the output file

4 USER INPUT DATA FILE

PIRLS uses one input data file (ASCII) whose name is specified by the user at run-time. The input file contains three types of information:

1. Run title (text) in record 1

2. FORTRAN format statement (in parentheses) in record 2 needed for reading the input data in later records

3. Input data, i.e., cases, person-years, and independent variables in all subsequent records

The examples in the Appendix give listings of input files.

5 SAMPLE INPUT/OUTPUT CASES

The output for all runs is in tabular form in the ASCII text file whose name is specified by the user at run-time. Four examples using an input file are given in the Appendix.


Table 5: Names and descriptions of each PIRLS file.

Filename     Description             Format
INFILE∗      Input filename          ASCII text
OUTFILE∗     Output filename         ASCII text
PIRLS.FOR    FORTRAN source code     ASCII text
PIRLS.EXE    Executable file         FORTRAN

∗ Input and output filenames are specified by user at run-time.

6 FILENAMES

Table 5 lists the names of the files that were used to program, link and execute PIRLS. As one notices, the source code is an ASCII text file, and the object and executable files have been compiled with Microsoft FORTRAN Powerstation 4.0. There is one required input data file, whose name is specified at run-time.

7 AVAILABILITY

The program and all of its subroutines are available from the Journal of Statistical Software free of charge at http://www.stat.ucla.edu/journals/jss/

8 ACKNOWLEDGEMENTS

The author is grateful to the reviewers for helpful comments. Work on this algorithm was performed under NASA contract NAS-9 18236.

References

[1] Frome, E.L., Kutner, M.H., Beauchamp, J.J. (1973) Regression analysis of Poisson-distributed data. JASA. 68:935-940.

[2] Frome, E.L. (1983) The analysis of rates using Poisson regression methods. Biometrics. 39:665-674.

[3] Frome, E.L. PREG: A computer program for Poisson regression analysis. Oak Ridge Associated Universities Report ORAU-178 (1981). Oak Ridge National Laboratory. Oak Ridge (TN), ORAU.

[4] Muirhead, C.R., Darby, S.C. (1987) Modelling the relative and absolute risks of radiation-induced cancers.J. R. Stat. Soc. A. 150:83-118.

[5] Breslow, N.E., Day, N.E. (1987) Statistical Methods in Cancer Research. Volume II: The design and analysis of cohort studies. Chapter 4. IARC Scientific Publication No. 82. Lyon, IARC.

[6] Pierce, D.A., Shimizu, Y., Preston, D.L., Vaeth, M., Mabuchi, K. (1996) Studies of mortality of atomic bomb survivors. Report 12, Part I. Cancer: 1950-1990. Radiat. Res. 146:1-27.

[7] Thompson, D.E., Mabuchi, K., Ron, E., Soda, M., Tokunaga, M., Ochikubo, S., Sugimoto, S., Ikeda, S., Terasaki, M., Izumi, S. (1994) Cancer Incidence in Atomic Bomb Survivors. Part II: Solid Tumors, 1958-1987. Radiat. Res. 137:S17-S67.


[8] Preston, D.L., Kusimi, S., Tomonaga, M., Izumi, S., Ron, S., Kuramoto, A., Kamada, N., Dohy, H., Matsui, T., Nonaka, H., Thompson, D.E., Soda, M., Mabuchi, K. (1994) Cancer Incidence in Atomic Bomb Survivors. Part III: Leukemia, Lymphoma and Multiple Myeloma, 1950-87. Radiat. Res. 137:S68-S97.

[9] Cardis, E., Gilbert, E.S., Carpenter, L., Howe, G., Kato, I., Armstring, B.K., Beral, V., Cowper, G., Douglas, A., Fix, J., Fry, S.A., Kaldor, J., Lave, C., Salmon, L., Smith, P.G., Voelz, G.L., Wiggs, L.D. (1995) Effects of low doses and low dose rates of external ionizing radiation: Cancer mortality among nuclear industry workers in three countries. Radiat. Res. 142:117-132.

[10] Ron, E., Lubin, J.H., Shore, R.E., Mabuchi, K., Modan, B., Pottern, L.M., Schneider, A.B., Tucker, M.A., Boice, J.D. Jr. (1995) Thyroid cancer after exposure to external radiation: a pooled analysis of seven studies. Radiat. Res. 141:259-277.

[11] Kerber, R.A., Till, J.E., Simon, S.E., Lyon, J.L., Thomas, D.C., Preston-Martin, S., Rallison, M.L., Lloyd, R.D., Stevens, W.S. (1993) A cohort study of thyroid disease in relation to fallout from nuclear weapons testing. J. Am. Med. Assoc. 270:2076-2082.

[12] Peterson, L.E., Schull, W.J., Davis, B.R., Buffler, P.A. (1994) Information Bias and Lifetime Mortality Risks of Radiation-Induced Cancer. U.S. NRC NUREG Report GR-0011. Nuclear Regulatory Commission, Washington, D.C.

[13] Sposto, R., Preston, D.L., Shimizu, Y., Mabuchi, K. (1992) The effect of diagnostic misclassification on non-cancer and cancer mortality dose-response in A-bomb survivors. Biometrics. 48:605-617.

[14] Whittemore, A.S., Gong, G. (1991) Poisson regression with misclassified counts: application to cervical cancer mortality rates. Appl. Stat. 40:81-93.

[15] Frome, E.J., DuFfrain, R.J. (1986) Maximum likelihood estimation for cytogenetic dose-response curves. Biometrics. 42:73-84.

[16] Ripley, B.D., Venables, W.N. (1994) Modern Applied Statistics with S-PLUS. New York, Springer-Verlag.

[17] Tierney, L. (1990) Lisp-Stat: An object-oriented environment for statistical computing and dynamic graphics.New York, John Wiley and Sons.

[18] STATA (1997) - Poisson Regression, User’s Guide. College Station(TX), Stata Press.

[19] Preston, D.L. and Pierce, D.A. (1997) AMFIT: A program for parameter estimation in additive and multiplicative rate models with grouped survival data - methods, models, and examples. AMFIT User's Guide. Seattle (WA), Hirosoft International Corporation.

[20] EGRET (1997) - Poisson Regression, User's Guide. Statistics and Epidemiology Research Corporation. Seattle (WA), SERC.

[21] Doll, R., Hill, A.B. (1966) Mortality of British doctors in relation to smoking: Observations on coronary thrombosis. Natl. Cancer Inst. Monogr. 19:205-268.

[22] Draper, N.R., Smith, H. (1981) Applied Regression Analysis. New York, John Wiley and Sons.

[23] McCullagh, P., Nelder, J.A. (1983) Generalized Linear Models. London, Chapman and Hall.

[24] Aranda-Ordaz, F.J. (1983) An extension of the proportional-hazards model for grouped data. Biometrics.39:109-117.

[25] Searle, S.R. (1982) Matrix Algebra Useful for Statistics. New York, John Wiley and Sons.

[26] Freeman, M.F., Tukey, J.W. (1950) Transformations related to the angular and the square root. Annals of Stat. 21:607-611.

[27] Hoaglin, D.C., Welsch, R.E. (1978) The Hat matrix in regression and ANOVA. Am. Statistician. 32:17-22.

[28] Cook, R.D. (1977) Detection of influential observations in linear regression. Technometrics. 19:15-18.

[29] Heiberger, R. (1989) Computation for the analysis of designed experiments. New York, John Wiley and Sons.

[30] Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T. (1989) Numerical Recipes: The Art of Scientific Computing. New York, Cambridge University Press.

[31] Till, J.E., McCulloch, E.A. (1961) A direct measurement of radiation sensitivity of normal mouse bone marrow cells. Radiat. Res. 14:213-222.


9 APPENDIX: SAMPLE INPUT/OUTPUT CASES

Example 1: Additive (linear) model using data in Table 1 adapted from [21]. The input data file for this run is distribution file EXAMP1-3.DAT and is listed below:

PIRLS

(F4.0,F8.3,6F3.0)

2. 18.790 1. 0. 0. 0. 0. 0.

12. 10.673 0. 1. 0. 0. 0. 0.

28. 5.710 0. 0. 1. 0. 0. 0.

28. 2.585 0. 0. 0. 1. 0. 0.

31. 1.462 0. 0. 0. 0. 1. 0.

32. 52.407 1. 0. 0. 0. 0. 1.

104. 43.248 0. 1. 0. 0. 0. 1.

206. 28.612 0. 0. 1. 0. 0. 1.

186. 12.663 0. 0. 0. 1. 0. 1.

102. 5.317 0. 0. 0. 0. 1. 1.

The screen queries to be entered by the user are as follows:

ENTER THE NAME OF INPUT FILE:

EXAMP1-3.DAT

ENTER THE NAME OF INPUT FILE:

EXAMPLE1.OUT

ENTER 1-ADD 2-MULT 3-POWER 4-NON-LINEAR

1

ENTER RHO: 1 - ADDITIVE - OTHER

1

ENTER NUMBER OF PREDICTOR (INPUT) VARIABLES

6

NUMBER OF PARAMETERS TO BE FITTED?

6

PRINT INFORMATION AND CORR MATRICES? 0-NO 1-YES

1

PRINT INPUT DATA? 0-NO 1-YES

1

PRINT FITTED CELLS? 0-NO 1-YES

1

PERFORM REGRESSION DIAGNOSTICS? 0-NO 1-YES

1

The output file contains:

TITLE: PIRLS


MODEL SELECTED: ADDITIVE

INPUT DATA FORMAT: (F4.0,F8.3,6F3.0)

2. 18.790 1. 0. 0. 0. 0. 0.

12. 10.673 0. 1. 0. 0. 0. 0.

28. 5.710 0. 0. 1. 0. 0. 0.

28. 2.585 0. 0. 0. 1. 0. 0.

31. 1.462 0. 0. 0. 0. 1. 0.

32. 52.407 1. 0. 0. 0. 0. 1.

104. 43.248 0. 1. 0. 0. 0. 1.

206. 28.612 0. 0. 1. 0. 0. 1.

186. 12.663 0. 0. 0. 1. 0. 1.

102. 5.317 0. 0. 0. 0. 1. 1.

ITERATION: 1 DEVIANCE: 86.7977

ITERATION: 2 DEVIANCE: 89.7618

ITERATION: 3 DEVIANCE: 93.6565

ITERATION: 4 DEVIANCE: 98.2919

ITERATION: 5 DEVIANCE: 22.3763

ITERATION: 6 DEVIANCE: 7.7339

ITERATION: 7 DEVIANCE: 7.4384

ITERATION: 8 DEVIANCE: 7.4331

ITERATION: 9 DEVIANCE: 7.4330

ITERATION: 10 DEVIANCE: 7.4330

ITERATION: 11 DEVIANCE: 7.4330

ITERATION: 12 DEVIANCE: 7.4330

ITERATION: 13 DEVIANCE: 7.4330

ITERATIONS: 13

REC OBS EXP FREEMAN-TUKEY PEARSON

1 2.0 1.6 .4403 .1113

2 12.0 17.5 -1.3592 1.7346

3 28.0 36.0 -1.3638 1.7751

4 28.0 35.0 -1.1909 1.3856

5 31.0 28.0 .5896 .3156

6 32.0 35.4 -.5343 .3202

7 104.0 96.5 .7722 .5822

8 206.0 197.3 .6327 .3874

9 186.0 178.7 .5558 .2952

10 102.0 105.1 -.2763 .0895

B COEFFICIENT STD. ERROR WALD SCORE


1 .0841 .0661 1.2724 .00013366

2 1.6407 .2179 7.5310 -.00000497

3 6.3035 .4565 13.8080 -.00000113

4 13.5241 .9642 14.0270 -.00000030

5 19.1696 1.7045 11.2466 -.00000012

6 .5907 .1255 4.7055 -.00000556

POWER: 1.00

GOODNESS-OF-FIT TESTS

STATISTIC ESTIMATE D.F. PROB.

CHI-SQUARE 6.997 4 .1361

DEVIANCE 7.433 4 .1147

VARIANCE-COVARIANCE MATRIX OF COEFFICIENTS.

B1 B2 B3 B4 B5 B6

B1 .004

B2 .003 .047

B3 .003 .010 .208

B4 .003 .010 .011 .930

B5 .003 .009 .010 .010 2.905

B6 -.004 -.012 -.013 -.013 -.012 .016

CORRELATION MATRIX OF COEFFICIENTS.

B1 B2 B3 B4 B5 B6

B1 1.000

B2 .211 1.000

B3 .111 .097 1.000

B4 .053 .046 .024 1.000

B5 .028 .025 .013 .006 1.000

B6 -.490 -.431 -.226 -.107 -.057 1.000

TRACE OF THE HAT MATRIX AND ADJUSTED RESIDUALS.

RECORD LEVERAGE ADJ. RESID.

1 .9763 2.1668

2 .3088 -1.5841

3 .1888 -1.4792

4 .1777 -1.2981

5 .2216 .6367

6 .9318 -2.1671

7 .7680 1.5841

8 .8229 1.4792


9 .8248 1.2981

10 .7794 -.6367

DELETION RESIDUALS OR APPROXIMATE CHANGE IN COEFFICIENTS AFTER

DELETION OF EACH RECORD.

RECORD B1 B2 B3 B4 B5 B6

1 -.9193 -.6402 -.7019 -.7049 -.6663 .8551

2 .0148 .2307 .0471 .0473 .0447 -.0573

3 .0052 .0151 .3258 .0167 .0158 -.0202

4 .0021 .0061 .0067 .5818 .0063 -.0081

5 -.0006 -.0018 -.0020 -.0020 -.5790 .0024

6 .0223 -.6403 -.7020 -.7050 -.6663 .8552

7 .0148 -.5164 .0471 .0473 .0447 -.0573

8 .0052 .0151 -1.3999 .0167 .0158 -.0202

9 .0021 .0061 .0067 -2.6924 .0063 -.0081

10 -.0006 -.0018 -.0020 -.0020 2.0342 .0024

On the first page of the output, we see that the model is indeed additive; we also see the input data and the value of the deviance goodness of fit at each iteration. The numbers of observed and expected deaths are given, and the Freeman-Tukey and Pearson residuals as well. At convergence, the coefficient for smoking (No. 6) is 0.5907. Thus, smoking adds 0.59 deaths per 1,000 person-years (or 59/100,000) of follow-up at each age. However, since the χ² and D goodness of fit statistics are greater than the 4 degrees of freedom, we conclude that the model does not fit the data well. Table 6 shows the comparison between the modeled baseline rates and the modeled baseline rates to which 0.59/1,000 was added.

Table 6: Interpretation of absolute risks from the additive model.

Age group, i    Baseline rate, λi0(x′i0β) (βj),    Smoker rate, λi1(x′i1β) (λi0(x′i0β) + βsmoke),
                per 1,000 T                         per 1,000 T
35-44           0.0841                              0.6748
45-54           1.6407                              2.2314
55-64           6.3035                              6.8942
65-74           13.5241                             14.115
75-84           19.1696                             19.7603

Example 2: Multiplicative (log-linear) model using data in Table 1 adapted from [21]. The input data file for this run is the same as that in Example 1. The screen queries to be entered by the user are as follows:

ENTER THE NAME OF INPUT FILE:

EXAMP1-3.DAT

ENTER THE NAME OF INPUT FILE:

EXAMPLE2.OUT

ENTER 1-ADD 2-MULT 3-POWER 4-NON-LINEAR

2


ENTER NUMBER OF PREDICTOR (INPUT) VARIABLES

6

NUMBER OF PARAMETERS TO BE FITTED?

6

EXPONENTIATE COEFFICIENTS? 0-NO 1-YES

1

PRINT INFORMATION AND CORR MATRICES? 0-NO 1-YES

0

PRINT INPUT DATA? 0-NO 1-YES

0

PRINT FITTED CELLS? 0-NO 1-YES

0

PERFORM REGRESSION DIAGNOSTICS? 0-NO 1-YES

0

The output file contains:

TITLE: PIRLS

MODEL SELECTED: MULTIPLICATIVE

INPUT DATA FORMAT: (F4.0,F8.3,6F3.0)

ITERATION: 1 DEVIANCE: 53.9002

ITERATION: 2 DEVIANCE: 17.9280

ITERATION: 3 DEVIANCE: 12.1932

ITERATION: 4 DEVIANCE: 12.1324

ITERATION: 5 DEVIANCE: 12.1324

B COEFFICIENT STD. ERROR WALD SCORE

1 -1.0116 .1918 -5.2751 .00000207

2 .4724 .1304 3.6236 .00000969

3 1.6159 .1147 14.0944 -.00002298

4 2.3389 .1162 20.1343 .00001905

5 2.6885 .1250 21.5110 .00001919


6 .3545 .1074 3.3019 .00002873

POWER: .00

GOODNESS-OF-FIT TESTS

STATISTIC ESTIMATE D.F. PROB.

CHI-SQUARE 11.155 4 .0249

DEVIANCE 12.132 4 .0164

At convergence, the coefficient for smoking (No. 6) is 0.3545. Since we exponentiated the regression coefficients, we can interpret the RR for smoking and the 95% confidence interval as being 1.43 (1.15, 1.76). This implies that the risk of coronary death in smokers is 1.43 times the age-specific rates of the non-smokers. The goodness of fit statistics for this run are greater than those generated in Example 1, indicating that the multiplicative model fits worse when compared with the additive model. Table 7 shows the comparison between the modeled baseline rates and the modeled baseline rates multiplied by the RR of 1.43.

Table 7: Interpretation of relative risks from the multiplicative model.

Age group, i    Baseline rate, λi0(x′i0β) (e^βj),    Smoker rate, λi1(x′i1β) (λi0(x′i0β) × e^βsmoke),
                per 1,000 T                           per 1,000 T
35-44           0.36                                  0.5148
45-54           1.60                                  2.28
55-64           5.03                                  7.192
65-74           10.37                                 14.83
75-84           14.71                                 21.04
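The relative risk and its 95% confidence interval quoted above follow directly from the smoking coefficient and its standard error in the Example 2 output; a quick check in Python (illustrative only):

import numpy as np

beta_smoke, se_smoke = 0.3545, 0.1074                    # from the Example 2 output
rr = np.exp(beta_smoke)
lower = np.exp(beta_smoke - 1.96 * se_smoke)
upper = np.exp(beta_smoke + 1.96 * se_smoke)
print(round(rr, 2), round(lower, 2), round(upper, 2))    # 1.43 1.15 1.76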

Example 3: Power model using data in Table 1 adapted from [21]. The input data file for this run is the same as that in Example 1. The screen queries to be entered by the user are as follows:

ENTER THE NAME OF INPUT FILE:

EXAMP1-3.DAT

ENTER THE NAME OF INPUT FILE:

EXAMPLE3.OUT

ENTER 1-ADD 2-MULT 3-POWER 4-NON-LINEAR

3

ENTER NUMBER OF PREDICTOR (INPUT) VARIABLES

6

NUMBER OF PARAMETERS TO BE FITTED?

6

EXPONENTIATE COEFFICIENTS? 0-NO 1-YES


0

PRINT INFORMATION AND CORR MATRICES? 0-NO 1-YES

0

PRINT INPUT DATA? 0-NO 1-YES

0

PRINT FITTED CELLS? 0-NO 1-YES

0

PERFORM REGRESSION DIAGNOSTICS? 0-NO 1-YES

0

This run fitted ten models with ρ varying from one to zero, i.e., going from a purely additive link, ρ = 1, to a purely multiplicative link, ρ = 0. The model equation is λi(x′iβ) = (x′iβ)^(1/ρ), where ρ is the power. By inspecting the fits of the models as a function of ρ, one can see that for these example data, the lowest deviance is obtained with ρ between 0.5 and 0.6. A single run can now be made to fit the model with ρ = 0.55 by following the directions given in Example 1, but by specifying 0.55 for ρ instead of one. Table 8 lists the modeled baseline rates, βj^(1/0.55), and the modeled baseline rates to which βsmoke^(1/0.55) was added.

Table 8: Interpretation of relative risks from the power model.

Age group, i    Baseline rate, λi0(x′i0β) (βj^(1/0.55)),    Smoker rate, λi1(x′i1β) (λi0(x′i0β) + βsmoke^(1/0.55)),
                per 1,000 T                                  per 1,000 T
35-44           0.096                                        0.3742
45-54           1.218                                        1.494
55-64           5.123                                        5.400
65-74           11.649                                       11.925
75-84           17.081                                       17.357

Example 4: Non-linear model using cell concentrations and radiation doses to estimate the number of splenic stem cell colonies formed after transplantation into laboratory mice. The input data (see Table 4) for this run is provided in distribution file EXAMPLE4.DAT and is listed below:

Till and McCulloch (1961)

(F4.0,F4.0,F7.2,F6.2)

60. 6. 1.25 .00

66. 7. 1.75 .96

46. 4. 3.00 1.92

82. 9. 7.20 2.88

105. 11. 24.00 4.32

123. 15. 75.00 5.76

12. 4. 120.00 6.72

The screen queries to be entered by the user are as follows:

ENTER THE NAME OF INPUT FILE:

EXAMPLE4.DAT

ENTER THE NAME OF INPUT FILE:


EXAMPLE4.OUT

ENTER 1-ADD 2-MULT 3-POWER 4-NON-LINEAR

4

ENTER NUMBER OF PREDICTOR (INPUT) VARIABLES

2

NUMBER OF PARAMETERS TO BE FITTED?

3

INPUT OR ESTIMATE STARTING VALUES: 0-INPUT 1-EST

0

INPUT STARTING VALUE FOR COEFFICIENT: 1

7.6364

INPUT STARTING VALUE FOR COEFFICIENT: 2

0.9341

INPUT STARTING VALUE FOR COEFFICIENT: 3

2.8924

PRINT INFORMATION AND CORR MATRICES? 0-NO 1-YES

1

PRINT INPUT DATA? 0-NO 1-YES

1

PRINT FITTED CELLS? 0-NO 1-YES

1

PERFORM REGRESSION DIAGNOSTICS? 0-NO 1-YES

1

Results in the output file EXAMPLE4.OUT (filename specified above) indicate that 2 iterations were required before convergence was reached. Although the deviance and chi-square GOF values were greater than the 4 d.f., the Wald statistics for the parameters were statistically significant. The scores for all coefficients were near zero. The regression parameters, standard errors (with the exception of s.e.(β1)), correlation matrix, chi-square and deviance GOFs were identical to results in Frome [3].

TITLE: Till and McCulloch (1961)

MODEL SELECTED: NON-LINEAR

INPUT DATA FORMAT: (F4.0,F4.0,F7.2,F6.2)

60. 6. 1.25 .00

66. 7. 1.75 .96

46. 4. 3.00 1.92


82. 9. 7.20 2.88

105. 11. 24.00 4.32

123. 15. 75.00 5.76

12. 4. 120.00 6.72

STARTING VALUES FOR PARAMETERS

PARAMETER STARTING VALUE

1 7.6364000

2 .9341000

3 2.8924000

ITERATION: 1 DEVIANCE: 8.0174

ITERATION: 2 DEVIANCE: 8.0174

ITERATIONS: 2

REC OBS EXP FREEMAN-TUKEY PEARSON

1 60.0 57.3 .3874 .1298

2 66.0 73.0 -.8079 .6713

3 46.0 37.5 1.3496 1.9260

4 82.0 91.0 -.9414 .8942

5 105.0 101.4 .3807 .1296

6 123.0 113.9 .8536 .7194

7 12.0 19.9 -1.9041 3.1247

B COEFFICIENT STD. ERROR WALD SCORE

1 7.6364 .1186 64.3725 .00030114

2 .9341 .0399 23.4229 -.00096028

3 2.8924 .7476 3.8688 .00006890

POWER: .00

GOODNESS-OF-FIT TESTS

STATISTIC ESTIMATE D.F. PROB.

CHI-SQUARE 7.595 4 .1076

DEVIANCE 8.017 4 .0909

VARIANCE-COVARIANCE MATRIX OF COEFFICIENTS.

B1 B2 B3

B1 .014

B2 -.002 .002


B3 -.066 .025 .559

CORRELATION MATRIX OF COEFFICIENTS.

B1 B2 B3

B1 1.000

B2 -.343 1.000

B3 -.741 .853 1.000

TRACE OF THE HAT MATRIX AND ADJUSTED RESIDUALS.

RECORD LEVERAGE ADJ. RESID.

1 .8060 .8181

2 .3663 -1.0293

3 .2251 1.5766

4 .4551 -1.2811

5 .2954 .4289

6 .6358 1.4055

7 .2162 -1.9966

DELETION RESIDUALS OR APPROXIMATE CHANGE IN COEFFICIENTS AFTER

DELETION OF EACH RECORD.

RECORD B1 B2 B3

1 -.1978 .0228 .9235

2 .0581 .0141 .0336

3 .0065 -.0276 -.4434

4 -.0339 .0336 .6643

5 .0060 -.0010 -.0749

6 -.0201 .0499 .4879

7 .0231 -.0354 -.4386


