Non-full rank models
Remedies to deal with the rank deficiency of the design matrix
Testing hypotheses
Linear models and their mathematical foundations: Factorial models
Steffen Unkel
Department of Medical Statistics, University Medical Center Göttingen
Winter term 2018/19 1/34
Motivating example
Suppose the interest is in the corn yield when different fertilizers are available and corn is planted in different soil types. The questions one is interested in answering are:
1 Does fertilizer type have an effect on crop yield?
2 Does soil type have an effect on crop yield?
3 Do the two treatment factors interact? For instance, there may be no difference between fertilizer 1 and fertilizer 2 in soil type 1, but fertilizer 1 may produce a greater corn yield than fertilizer 2 in soil type 2.
Factorial experiments, also known as factorial designs, are used to answer such questions.
The analysis of data arising from such experiments involves factorial models, which facilitate an analysis of the effects due to the treatment factors on some response.
What are factorial models?
In factorial models the response (e.g. yield) is considered to be expressible as the sum of...
1 effects due to individual factors (e.g. fertilizer and soil type) acting one at a time,
2 effects due to pairs of factors above and beyond their separate contributions (two-factor interactions), and so on.
A factor has a limited number of variations that are used in the experiment, known as factor levels.
We will only consider balanced models that have an equal number of observations in each factor level or factor combination.
The design matrix in factorial models
The term “factorial” refers to a particular class of design matrices $X$.
In the case of factorial models, $X$ is a matrix of zeros and ones only, and is sometimes called an incidence matrix.
Factorial models are often called analysis of variance (ANOVA) models to distinguish them from linear regression models, for which the covariates are continuous.
We shall use the terms factorial models and regression models as descriptors for different kinds of matrices $X$.
One-way ANOVA model
The one-way balanced model can be expressed as
$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad i = 1, \dots, I;\ j = 1, \dots, J,$$
where $\mu$ is the grand mean, $\alpha_1, \alpha_2, \dots, \alpha_I$ represent the effects of $I$ treatments, each of which is applied to $J$ experimental units, and $y_{ij}$ is the response of the $j$th observation among the $J$ units that receive the $i$th treatment.
The random error terms are denoted by $\varepsilon_{ij}$.
In some experimental situations, the $I$ groups may represent samples from $I$ populations whose means we wish to compare, populations that are not created by applying treatments.
One-way ANOVA: assumptions
To complete the model, we make the following assumptions:
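A1 $E(\varepsilon_{ij}) = 0 \ \ \forall\ i, j$.
A2 $\mathrm{Var}(\varepsilon_{ij}) = \sigma^2 \ \ \forall\ i, j$.
A3 $\mathrm{Cov}(\varepsilon_{ij}, \varepsilon_{rs}) = 0 \ \ \forall\ (i, j) \neq (r, s)$.
Occasionally, we will make use of the following additional assumption:
A4 $\varepsilon_{ij} \sim N(0, \sigma^2) \ \ \forall\ i, j$.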
Any of these assumptions may fail to hold with real data.
Alternative one-way ANOVA formulation
The mean for the $i$th treatment or population can be denoted by $\mu_i$.
Thus, using assumption A1 we have $E(y_{ij}) = \mu_i = \mu + \alpha_i$.
We can rewrite the one-way model equation as
$$y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, \dots, I;\ j = 1, \dots, J.$$
This is called the cell-means model formulation.
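As a small numerical illustration of the cell-means formulation, the sketch below builds the design matrix for a balanced one-way layout and checks that least squares returns the treatment means. The sizes (I = 3, J = 2), the response values and all variable names are assumptions chosen only for this example.

```python
import numpy as np

# Hypothetical balanced one-way layout: I = 3 treatments, J = 2 units each.
I, J = 3, 2
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])  # illustrative data, ordered by treatment

# Cell-means design matrix: one indicator column per treatment, no intercept.
W = np.kron(np.eye(I), np.ones((J, 1)))       # shape (I*J, I)

# W has full column rank, so the least-squares solution is unique.
rank_W = np.linalg.matrix_rank(W)
mu_hat, *_ = np.linalg.lstsq(W, y, rcond=None)

# In the cell-means model the estimates are simply the treatment means.
group_means = y.reshape(I, J).mean(axis=1)
print(rank_W, mu_hat, group_means)
```

In this parameterization there is one parameter per cell, so no rank deficiency arises.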
Two-way ANOVA model with interaction
The two-way balanced model with interaction can be expressed as
$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \quad i = 1, \dots, I;\ j = 1, \dots, J;\ k = 1, \dots, K.$$
The effect of factor A at the $i$th level is $\alpha_i$, and the term $\beta_j$ is due to the $j$th level of factor B.
The term $\gamma_{ij}$ represents the interaction AB between the $i$th level of A and the $j$th level of B.
If $\gamma_{ij}$ is omitted, we have a two-way ANOVA model without interaction.
Two-way ANOVA model: assumptions
To complete the model, we make the following assumptions:
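A1 $E(\varepsilon_{ijk}) = 0 \ \ \forall\ i, j, k$.
A2 $\mathrm{Var}(\varepsilon_{ijk}) = \sigma^2 \ \ \forall\ i, j, k$.
A3 $\mathrm{Cov}(\varepsilon_{ijk}, \varepsilon_{rst}) = 0 \ \ \forall\ (i, j, k) \neq (r, s, t)$.
Occasionally, we will make use of the following additional assumption:
A4 $\varepsilon_{ijk} \sim N(0, \sigma^2) \ \ \forall\ i, j, k$.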
Any of these assumptions may fail to hold with real data.
Alternative two-way ANOVA formulation
Let $\mu_{ij} = E(y_{ijk})$ denote the mean of a random observation in the $(ij)$th cell.
Using assumption A1 we have $E(y_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$.
We can rewrite the two-way model equation in the cell-means formulation as
$$y_{ijk} = \mu_{ij} + \varepsilon_{ijk}, \quad i = 1, \dots, I;\ j = 1, \dots, J;\ k = 1, \dots, K.$$
Experimental situations for two-way ANOVA
There are two experimental situations in which the two-way ANOVA model seems appropriate:
1 Factors A and B represent two types of treatment, for example, various levels of fertilizer and soil type applied in an agricultural experiment. We apply each of the combinations of the levels of A and B to a number of (randomly) selected experimental units.
2 In another situation, the populations may exist naturally, for example, gender (males and females) and political preference (e.g. Democrats, Republicans). A (random) sample of a number of observations is obtained from each of the $I \times J$ populations.
Crossed versus nested factors
Crossed factor structure:
Recall that with crossed factors we see every level of factor A at every level of factor B.
This means that factor level 1 of factor A has the same meaning across all levels of factor B.
Nested factor structure:
We call factor B nested in factor A if we have different levels of B within each level of A.
Examples:
1 Patients are nested in hospitals.
2 Samples are nested in batches.
3 Students are nested in classes. Classes are nested in schools.
A factorial design can have both crossed and nested factors.
Example of a nested factorial design
Suppose that we want to analyze student performance. Data from different classes from different schools (on a student level) are available.
Questions of interest:
What is the grade variability between different schools?
What is the grade variability between classes within the same school?
What is the grade variability between students within the same class?
This is a nested factorial design, as classes are clearly not crossed with schools, and students are not crossed with classes.
We will revisit nested factor structures in the lectures on linear mixed-effects modelling.
Rank deficient design matrices
We can write an ANOVA model in matrix form as $y = X\beta + \varepsilon$.
ANOVA models are often expressed with more parameters than can be estimated, which results in $X$ being rank deficient.
If $X$ is rank deficient, then $X^\top X$ is singular.
Consequently, the normal equations $X^\top X\beta = X^\top y$ do not have a unique solution.
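To make the rank deficiency concrete, the following sketch constructs the overparameterized design matrix of a balanced one-way model with parameters (µ, α1, α2, α3). The layout (I = 3, J = 2) and the variable names are assumptions for illustration only.

```python
import numpy as np

I, J = 3, 2
# Intercept column plus one indicator column per treatment.
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])

n, p = X.shape
k = np.linalg.matrix_rank(X)
XtX = X.T @ X

# The intercept column is the sum of the indicator columns, so rank(X) < p.
print(n, p, k)
print(np.linalg.matrix_rank(XtX))
```

Here p = 4 but the rank is 3, so X'X is a singular 4 × 4 matrix.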
Estimation of β
If $X$ is $n \times p$ with $\mathrm{rank}(X) = k < p \le n$, the system of equations $X^\top X\beta = X^\top y$ is consistent.
Since the normal equations are consistent, a solution is given by
$$\hat\beta = (X^\top X)^{-} X^\top y,$$
where $(X^\top X)^{-}$ is any generalized inverse of $X^\top X$.
For a particular generalized inverse $(X^\top X)^{-}$, $E(\hat\beta) = (X^\top X)^{-} X^\top X\beta$.
The expression $(X^\top X)^{-} X^\top X\beta$ is not invariant to the choice of $(X^\top X)^{-}$.
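The dependence on the generalized inverse can be seen numerically. The sketch below uses two generalized inverses of X'X for a small one-way layout: the Moore-Penrose inverse, and the inverse of a full-rank 3 × 3 block padded with zeros. The data and names are illustrative assumptions.

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])   # illustrative responses
XtX, Xty = X.T @ X, X.T @ y

# Generalized inverse 1: the Moore-Penrose inverse.
G1 = np.linalg.pinv(XtX)

# Generalized inverse 2: invert the nonsingular lower-right block, pad with zeros.
G2 = np.zeros((4, 4))
G2[1:, 1:] = np.linalg.inv(XtX[1:, 1:])

beta1, beta2 = G1 @ Xty, G2 @ Xty

# (X'X)^- X'X, the matrix appearing in E(beta_hat), for each choice.
H1, H2 = G1 @ XtX, G2 @ XtX
print(beta1, beta2)
```

Both vectors solve the normal equations and give the same fitted values, yet the solutions themselves and the matrices H1 and H2 differ.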
Estimation of β (2)
Suppose there is a $p \times n$ matrix $A$ such that $E(Ay) = \beta$. If so, then
$$\beta = E(Ay) = E(A(X\beta + \varepsilon)) = AX\beta.$$
Since this must hold for all $\beta$, we have $AX = I_p$.
But $\mathrm{rank}(AX) \le \mathrm{rank}(X) = k < p$, hence $AX$ cannot be equal to $I_p$.
Conclusion: there are no linear functions of $y$ that yield an unbiased estimator of $\beta$.
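Estimation of $\sigma^2$
We define
$$\mathrm{SSE} = (y - X\hat\beta)^\top (y - X\hat\beta) = y^\top y - \hat\beta^\top X^\top y = y^\top \left[ I - X (X^\top X)^{-} X^\top \right] y,$$
where $\hat\beta$ is any solution to the normal equations $X^\top X\hat\beta = X^\top y$.
For an estimator of $\sigma^2$, we define
$$s^2 = \frac{\mathrm{SSE}}{n - k},$$
where $n$ is the number of rows of $X$ and $k = \mathrm{rank}(X)$.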
Estimation of $\sigma^2$ (2)
For $s^2 = \mathrm{SSE}/(n - k)$, the following properties hold:
i. $E(s^2) = \sigma^2$.
ii. The estimator $s^2$ is invariant to the choice of $\hat\beta$ or to the choice of the generalized inverse $(X^\top X)^{-}$.
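Property ii can be checked on a toy data set. In the sketch below, two different solutions of the normal equations are plugged into SSE/(n − k); the data, the null-space direction and the function name `s2` are assumptions made for this illustration.

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])   # illustrative responses
n, k = X.shape[0], np.linalg.matrix_rank(X)

# One solution of the normal equations (minimum-norm least squares).
beta_a = np.linalg.pinv(X) @ y

# Another solution: shift along a direction d with X d = 0.
d = np.array([1.0, -1.0, -1.0, -1.0])
beta_b = beta_a + 2.5 * d

def s2(beta):
    """SSE / (n - k) for a given solution of the normal equations."""
    resid = y - X @ beta
    return resid @ resid / (n - k)

# Projection form of SSE: y' [I - X (X'X)^- X'] y.
P = X @ np.linalg.pinv(X.T @ X) @ X.T
sse_proj = y @ (np.eye(n) - P) @ y
print(s2(beta_a), s2(beta_b), sse_proj / (n - k))
```

All three values coincide because SSE depends on the solution only through the fitted values.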
Maximum likelihood estimation
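Let $y \sim N_n(X\beta, \sigma^2 I)$, where $X$ is $n \times p$ of rank $k < p \le n$.
The maximum likelihood estimators for $\beta$ and $\sigma^2$ are given by
$$\hat\beta = (X^\top X)^{-} X^\top y, \qquad \hat\sigma^2 = \frac{1}{n} (y - X\hat\beta)^\top (y - X\hat\beta).$$
It holds that $\hat\beta \sim N_p\left( (X^\top X)^{-} X^\top X\beta,\ \sigma^2 (X^\top X)^{-} X^\top X (X^\top X)^{-} \right)$.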
Linear combinations of the parameters
Having established that we cannot estimate $\beta$, we next inquire as to whether we can estimate any linear combination of the parameters, say $\lambda^\top\beta$.
A linear function of parameters, $\lambda^\top\beta$, is said to be estimable if there exists a linear combination of the observations with an expected value equal to $\lambda^\top\beta$.
In other words, $\lambda^\top\beta$ is estimable if there exists a vector $a$ such that $E(a^\top y) = \lambda^\top\beta$.
Is a particular function $\lambda^\top\beta$ estimable?
Conditions for estimability
A particular linear function $\lambda^\top\beta$ is estimable if and only if any one of the following equivalent conditions holds:
i. $\lambda^\top$ is a linear combination of the rows of $X$; that is, there exists a vector $a$ such that $a^\top X = \lambda^\top$.
ii. $\lambda^\top$ is a linear combination of the rows of $X^\top X$ or $\lambda$ is a linear combination of the columns of $X^\top X$; that is, there exists a vector $r$ such that $r^\top X^\top X = \lambda^\top$ or $X^\top X r = \lambda$.
iii. $\lambda$ or $\lambda^\top$ is such that
$$X^\top X (X^\top X)^{-} \lambda = \lambda \quad \text{or} \quad \lambda^\top (X^\top X)^{-} X^\top X = \lambda^\top,$$
where $(X^\top X)^{-}$ is any (symmetric) generalized inverse of $X^\top X$.
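Condition iii lends itself to a direct numerical check. The sketch below implements it with the Moore-Penrose inverse as the (symmetric) generalized inverse; the helper name `is_estimable`, the tolerance and the one-way example are assumptions for illustration.

```python
import numpy as np

def is_estimable(X, lam, tol=1e-10):
    """Check condition iii: lam' (X'X)^- X'X == lam' (up to tolerance)."""
    XtX = X.T @ X
    return np.allclose(lam @ np.linalg.pinv(XtX) @ XtX, lam, atol=tol)

# One-way layout with parameters (mu, alpha_1, alpha_2, alpha_3).
I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])

contrast = np.array([0.0, 1.0, -1.0, 0.0])    # alpha_1 - alpha_2
cell_mean = np.array([1.0, 1.0, 0.0, 0.0])    # mu + alpha_1
alpha_1 = np.array([0.0, 1.0, 0.0, 0.0])      # alpha_1 alone

print(is_estimable(X, contrast), is_estimable(X, cell_mean), is_estimable(X, alpha_1))
```

The contrast and the cell mean lie in the row space of X and are estimable; the single effect α1 does not and is not.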
Linearly independent functions of β
A set of functions $\lambda_1^\top\beta, \lambda_2^\top\beta, \dots, \lambda_m^\top\beta$ is said to be linearly independent if the coefficient vectors $\lambda_1, \lambda_2, \dots, \lambda_m$ are linearly independent.
In the non-full-rank model $y = X\beta + \varepsilon$, the number of linearly independent estimable functions of $\beta$ is equal to the rank of $X$.
All estimable functions can be obtained from $X\beta$ or $X^\top X\beta$.
Thus we can examine linear combinations of the rows of $X$ or of $X^\top X$ to see what functions of the parameters are estimable.
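Estimators of $\lambda^\top\beta$
From what has been stated previously, we have the following estimators of $\lambda^\top\beta$:
E1 $a^\top y$, where $a^\top$ satisfies $\lambda^\top = a^\top X$.
E2 $r^\top X^\top y$, where $r^\top$ satisfies $\lambda^\top = r^\top X^\top X$.
E3 $\lambda^\top\hat\beta$, where $\hat\beta$ is a solution of $X^\top X\hat\beta = X^\top y$.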
Properties
The estimators E1–E3 have the following properties:
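i. $E(a^\top y) = E(r^\top X^\top y) = E(\lambda^\top\hat\beta) = \lambda^\top\beta$.
ii. $\mathrm{Var}(r^\top X^\top y) = \sigma^2 r^\top X^\top X r = \sigma^2 r^\top\lambda$.
iii. $\mathrm{Var}(\lambda^\top\hat\beta) = \sigma^2 \lambda^\top (X^\top X)^{-} \lambda$.
iv. The estimators $\lambda^\top\hat\beta$ and $r^\top X^\top y$ are BLUE.
v. The estimator $a^\top y$ is not guaranteed to have minimum variance.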
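The difference between E1 and the BLUE can be illustrated for the contrast α1 − α2 in a one-way layout. In the sketch below the vector `a` picks a single pair of observations, while `r` solves X'Xr = λ; the data and all names are assumptions, and variances are reported in units of σ².

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])   # illustrative responses
XtX, Xty = X.T @ X, X.T @ y

lam = np.array([0.0, 1.0, -1.0, 0.0])          # alpha_1 - alpha_2

# E1: one possible a with a'X = lam' (here y_12 - y_21).
a = np.array([0.0, 1.0, -1.0, 0.0, 0.0, 0.0])
e1 = a @ y

# E2: r solves X'X r = lam.
r = np.linalg.pinv(XtX) @ lam
e2 = r @ Xty

# E3: lam' beta_hat for a solution of the normal equations.
beta_hat = np.linalg.pinv(XtX) @ Xty
e3 = lam @ beta_hat

# Variances divided by sigma^2.
var_e1 = a @ a          # Var(a'y) / sigma^2
var_blue = r @ lam      # Var(r'X'y) / sigma^2 = r' lam
print(e1, e2, e3, var_e1, var_blue)
```

E2 and E3 agree and equal the difference of the two treatment means, whereas this particular E1 is unbiased but has twice the variance.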
Purpose of reparameterization
In reparameterization, we transform the non-full-rank model $y = X\beta + \varepsilon$, where $X$ is $n \times p$ of rank $k < p \le n$, to the full-rank model
$$y = Z\gamma + \varepsilon,$$
where $Z$ is $n \times k$ of rank $k$ and $\gamma = U\beta$ is a set of $k$ linearly independent functions of $\beta$ with $U$ being a $k \times p$ matrix of rank $k < p$.
Thus $Z\gamma = X\beta$ and we can write $Z\gamma = ZU\beta = X\beta$, where $X = ZU$.
Then, $ZUU^\top = XU^\top$ and $Z = XU^\top (UU^\top)^{-1}$.
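The construction of Z can be traced on a small example. The sketch below takes the one-way layout with γ = (µ + α1, µ + α2, µ + α3), which is one possible choice of U among many; the layout and names are assumptions for illustration.

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])

# U maps beta = (mu, alpha_1, alpha_2, alpha_3) to gamma_i = mu + alpha_i.
U = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])

# Z = X U' (U U')^{-1}; U U' is k x k and nonsingular because rank(U) = k.
Z = X @ U.T @ np.linalg.inv(U @ U.T)

print(np.linalg.matrix_rank(Z))
print(np.allclose(Z @ U, X))
```

For this choice of U, the resulting Z is the cell-means design matrix of the one-way model.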
Full-rank model
To establish that $Z$ is full-rank, note that $\mathrm{rank}(Z) \ge \mathrm{rank}(ZU) = \mathrm{rank}(X) = k$.
However, $Z$ cannot have rank greater than $k$ since $Z$ has $k$ columns.
Therefore, $\mathrm{rank}(Z) = k$ and the model $y = Z\gamma + \varepsilon$ is a full-rank model.
Estimation in the full-rank model
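For the full-rank model we can use the normal equations $Z^\top Z\hat\gamma = Z^\top y$ to obtain the unique solution $\hat\gamma = (Z^\top Z)^{-1} Z^\top y$.
An unbiased estimator of $\sigma^2$ is given by
$$s^2 = \frac{1}{n - k} (y - Z\hat\gamma)^\top (y - Z\hat\gamma).$$
It holds that $Z\hat\gamma = X\hat\beta$ and
$$(y - X\hat\beta)^\top (y - X\hat\beta) = (y - Z\hat\gamma)^\top (y - Z\hat\gamma).$$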
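A short numerical check of these identities, again for a one-way layout reparameterized through γ_i = µ + α_i. The data, the choice of U and the variable names are assumptions for this sketch.

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])   # illustrative responses
n, k = X.shape[0], np.linalg.matrix_rank(X)

# Reparameterize with gamma_i = mu + alpha_i.
U = np.hstack([np.ones((I, 1)), np.eye(I)])
Z = X @ U.T @ np.linalg.inv(U @ U.T)

# Unique solution in the full-rank model.
gamma_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)

# Any solution in the non-full-rank model (here via the Moore-Penrose inverse).
beta_hat = np.linalg.pinv(X) @ y

sse_z = (y - Z @ gamma_hat) @ (y - Z @ gamma_hat)
sse_x = (y - X @ beta_hat) @ (y - X @ beta_hat)
s2 = sse_z / (n - k)
print(gamma_hat, sse_z, sse_x, s2)
```

The fitted values and the residual sums of squares agree, so both models lead to the same s².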
Indeterminacy of estimable functions
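The set $U\beta = \gamma$ is only one set of linearly independent estimable functions.
Let $V\beta = \delta$ be another set. Then there exists a matrix $W$ such that $y = W\delta + \varepsilon$.
Now an estimable function $\lambda^\top\beta$ can be expressed as
$$\lambda^\top\beta = b^\top\gamma = c^\top\delta.$$
Hence
$$\widehat{\lambda^\top\beta} = b^\top\hat\gamma = c^\top\hat\delta,$$
and either reparameterization gives the same estimator of $\lambda^\top\beta$.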
Purpose of side conditions
Side conditions provide (linear) constraints that make the parameters unique.
Side conditions must be non-estimable functions of $\beta$.
Since $\mathrm{rank}(X) = k < p \le n$, the rank deficiency of $X$ is $p - k$.
In order to obtain a unique solution vector $\hat\beta$, we must define side conditions that make up this deficiency in rank.
Defining side conditions
We define side conditions $T\beta = 0$, where $T$ is a $(p - k) \times p$ matrix of rank $p - k$ such that $T\beta$ is a set of non-estimable functions.
If $y = X\beta + \varepsilon$, where $X$ is $n \times p$ of rank $k < p \le n$, and if $T$ is a $(p - k) \times p$ matrix of rank $p - k$ such that $T\beta$ is a set of non-estimable functions, then there is a unique vector $\hat\beta$ that satisfies both $X^\top X\hat\beta = X^\top y$ and $T\hat\beta = 0$.
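One way to compute the unique solution is to stack the normal equations and the side conditions into a single consistent system. The sketch below does this for a one-way layout with the sum-to-zero condition α1 + α2 + α3 = 0; the data, this choice of T and the use of a least-squares solver are assumptions for illustration.

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])   # illustrative responses
XtX, Xty = X.T @ X, X.T @ y

# Side condition T beta = 0 with T = (0, 1, 1, 1); p - k = 1 row is needed.
T = np.array([[0.0, 1.0, 1.0, 1.0]])

# T beta is non-estimable: appending T to X raises the rank from 3 to 4.
rank_aug = np.linalg.matrix_rank(np.vstack([X, T]))

# Stack normal equations and side condition; the system has full column rank.
A = np.vstack([XtX, T])
b = np.append(Xty, 0.0)
beta_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
print(rank_aug, beta_hat)
```

With this side condition the solution is the overall mean followed by the deviations of the treatment means from it.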
Testable hypotheses
We consider hypotheses about the regression coefficients in the model $y = X\beta + \varepsilon$, where $y \sim N_n(X\beta, \sigma^2 I)$ and $X$ is an $n \times p$ design matrix of rank $k < p \le n$.
A hypothesis is said to be testable if there exists a set of linearly independent estimable functions $\lambda_1^\top\beta, \lambda_2^\top\beta, \dots, \lambda_t^\top\beta$ such that $H_0$ is true if and only if $\lambda_1^\top\beta = \lambda_2^\top\beta = \cdots = \lambda_t^\top\beta = 0$.
Often the subset of $\beta$'s whose equality we wish to test is such that every contrast $\sum_i c_i\beta_i$ is estimable; $\sum_i c_i\beta_i$ is a contrast if $\sum_i c_i = 0$.
Example of a testable hypothesis
Suppose that we have the model
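$$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}, \quad i, j = 1, 2, 3,$$
and the hypothesis of interest is $H_0\colon \alpha_1 = \alpha_2 = \alpha_3$.
By taking linear combinations of the rows of $X\beta$, we can obtain two linearly independent estimable functions $\alpha_1 - \alpha_2$ and $\alpha_1 + \alpha_2 - 2\alpha_3$.
$H_0$ is true if and only if $\alpha_1 - \alpha_2$ and $\alpha_1 + \alpha_2 - 2\alpha_3$ are simultaneously equal to zero.
Therefore, $H_0$ is testable and is equivalent to
$$H_0\colon \begin{pmatrix} \alpha_1 - \alpha_2 \\ \alpha_1 + \alpha_2 - 2\alpha_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$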
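The estimability of the two functions in this example can be verified numerically. The sketch below builds the 9 × 7 design matrix for the additive two-way model with one observation per cell and parameter order (µ, α1, α2, α3, β1, β2, β3); the ordering of observations and the names are assumptions.

```python
import numpy as np

ones3 = np.ones((3, 1))
# Observations ordered with i varying slowest and j fastest.
X = np.hstack([np.ones((9, 1)),
               np.kron(np.eye(3), ones3),     # indicators for alpha_i
               np.kron(ones3, np.eye(3))])    # indicators for beta_j
XtX = X.T @ X
k = np.linalg.matrix_rank(X)

# Rows of C: alpha_1 - alpha_2 and alpha_1 + alpha_2 - 2 alpha_3.
C = np.array([[0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, -2.0, 0.0, 0.0, 0.0]])

# Estimability check: C (X'X)^- X'X must reproduce C.
P = np.linalg.pinv(XtX) @ XtX
estimable = np.allclose(C @ P, C)

# For comparison, alpha_1 on its own is not estimable.
e = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
alpha1_estimable = np.allclose(e @ P, e)
print(k, estimable, np.linalg.matrix_rank(C), alpha1_estimable)
```

The design matrix has rank 5, and both rows of C pass the check while being linearly independent of each other.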
General linear hypothesis approach
As illustrated, a hypothesis such as $H_0\colon \alpha_1 = \alpha_2 = \alpha_3$ can be expressed in the form $H_0\colon C\beta = 0$.
We can test this hypothesis in a manner analogous to that used for the general linear hypothesis test for the full-rank model.
We assume that $y \sim N_n(X\beta, \sigma^2 I)$, where $X$ is $n \times p$ of rank $k < p \le n$. Let $C$ be an $m \times p$ matrix of rank $m \le k$ such that $C\beta$ is a set of linearly independent estimable functions and let $\hat\beta = (X^\top X)^{-} X^\top y$.
General linear hypothesis test
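If $H_0\colon C\beta = 0$ is true, the test statistic
$$F = \frac{\mathrm{SSH}/m}{\mathrm{SSE}/(n - k)} = \frac{(C\hat\beta)^\top \left[ C (X^\top X)^{-} C^\top \right]^{-1} (C\hat\beta)/m}{\mathrm{SSE}/(n - k)}$$
is distributed as $F(m, n - k)$.
Reject $H_0$ if $F \ge F_{\alpha; m; n-k}$, where $F_{\alpha; m; n-k}$ is the upper $\alpha$ percentage point of the (central) $F$ distribution with $m$ and $n - k$ degrees of freedom.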
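The test statistic can be computed directly from its matrix form. The sketch below does so for a one-way layout and the hypothesis α1 = α2 = α3, and compares SSH with the classical between-group sum of squares; the data and names are assumptions. The critical value is not computed here; if SciPy is available, `scipy.stats.f.ppf(1 - alpha, m, n - k)` would be one way to obtain it.

```python
import numpy as np

I, J = 3, 2
X = np.hstack([np.ones((I * J, 1)), np.kron(np.eye(I), np.ones((J, 1)))])
y = np.array([4.0, 6.0, 7.0, 9.0, 1.0, 3.0])   # illustrative responses
n, k = X.shape[0], np.linalg.matrix_rank(X)

G = np.linalg.pinv(X.T @ X)                    # a generalized inverse of X'X
beta_hat = G @ X.T @ y

# H0: alpha_1 = alpha_2 = alpha_3, written as C beta = 0 with m = 2 rows.
C = np.array([[0.0, 1.0, -1.0, 0.0],
              [0.0, 1.0, 1.0, -2.0]])
m = C.shape[0]

Cb = C @ beta_hat
SSH = Cb @ np.linalg.solve(C @ G @ C.T, Cb)
SSE = (y - X @ beta_hat) @ (y - X @ beta_hat)
F = (SSH / m) / (SSE / (n - k))

# Classical between-group sum of squares for comparison.
means = y.reshape(I, J).mean(axis=1)
ss_between = J * np.sum((means - y.mean()) ** 2)
print(SSH, ss_between, SSE, F)
```

In this balanced one-way case the matrix form of SSH reproduces the between-group sum of squares from the usual ANOVA table.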