Journal of Statistical Planning and Inference 104 (2002) 459–468

www.elsevier.com/locate/jspi

Optimality of mixed-level supersaturated designs

Shu Yamada a,*, Tomomi Matsui b

a Department of Management Science, Science University of Tokyo, Kagurazaka 1-3, Shinjuku, Tokyo 162-8601, Japan
b Department of Mathematical Engineering and Information Physics, Faculty of Engineering, University of Tokyo, Hongo 7-3-1, Bunkyo, Tokyo 113-8656, Japan

* Corresponding author. Tel.: +81-3-3260-4271; fax: +81-3-3235-6479. E-mail address: [email protected] (S. Yamada).

Received 14 April 2000; received in revised form 14 May 2001; accepted 4 June 2001

0378-3758/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0378-3758(01)00248-8

Abstract

This paper presents generalized theorems on the optimality of supersaturated designs in terms of low dependency over all pairs of column vectors. Some mixed-level supersaturated designs are constructed using a method based on these theorems. An index is proposed for measuring the efficiency of a supersaturated design and is applied to evaluate the constructed mixed-level supersaturated designs. © 2002 Elsevier Science B.V. All rights reserved.

MSC: Primary 62K15; Secondary 05B20

Keywords: $\chi^2$ statistic; Dependency between two columns; Saturated and orthogonal design; Lower bound of dependency

1. Introduction

When an experiment is expensive and the number of factors is large, a useful first step is to identify a few active factors among the many candidate factors, because the optimum conditions of the active factors can then be determined from additional experiments. A supersaturated design is helpful for this first step: it is a kind of fractional factorial design in which the number of columns is greater than or equal to the number of rows, so it can include a larger number of factors than the usual fractional factorial designs.

Supersaturated design was originated by Satterthwaite (1959) as a random balance design and formulated by Booth and Cox (1962) in a systematic manner. The main aim in constructing a supersaturated design is to generate two-level column vectors while maintaining low dependency (non-orthogonality) over all pairs of column vectors. In previous studies of two-level supersaturated designs, the dependency between two column vectors has been measured using the squared inner product between the two column vectors, based on the dependency between the estimates of the factor effects. Specifically, the covariance of two factor effects assigned to the two column vectors can be represented by a function of the inner product when the two factor effects are considered in analyzing the collected data.

The papers by Lin (1993), Wu (1993), Iida (1994) and Deng et al. (1994) constructed supersaturated designs from existing designs, such as the half fraction of the Plackett and Burman design proposed by Lin (1993). Lin (1995) and Cheng (1997) gained insight into the properties of the design criteria. The construction methods proposed by Nguyen (1996) and Li and Wu (1997) are algorithmic approaches. Tang and Wu (1997) and Yamada and Lin (1997) constructed supersaturated designs by applying the properties of an orthogonal base.

As regards the evaluation of the total dependency of a supersaturated design, the average of the squared inner products over all pairs of columns, sometimes denoted by $E(s^2)$, is widely applied. Tang and Wu (1997) and Nguyen (1996) independently showed a lower bound of $E(s^2)$. In addition, Tang and Wu (1997) showed that a supersaturated design partitioned into saturated and orthogonal designs is optimal in terms of the $E(s^2)$ criterion. Based on this property, they developed a method for constructing supersaturated designs by sequentially adding matrices generated by permuting the rows of an initial matrix that is a saturated and orthogonal design. Their algorithm is justified by the property of the $E(s^2)$ criterion.

The three-level supersaturated design defined by Yamada and Lin (1999) is a natural extension of the two-level supersaturated design. They defined a measure of dependency between two column vectors as a modification of the $\chi^2$ statistic, which is originally applied to hypothesis tests in two-way contingency tables. They constructed three-level supersaturated designs from two-level supersaturated designs while maintaining low dependency as defined by this measure. Yamada et al. (1999) showed a method for constructing a three-level supersaturated design by adding designs generated by permuting the rows of an initial three-level orthogonal design.

In this paper, we first present generalized theorems on the optimality of mixed-level supersaturated designs using the $\chi^2$ statistic, in a manner similar to the three-level case. Next we describe mixed-level supersaturated designs generated using a construction method based on the generalized theorems. Finally, we propose an index to represent the efficiency of supersaturated designs and apply it to the constructed mixed-level designs.

2. The design criteria for mixed-level supersaturated design

Let $c_l$ be an $n$-dimensional column vector consisting of equal numbers of 1s, 2s, ..., $l$s, and let $\mathcal{C}^n_l$ be the set of all such column vectors $c_l$, where $n$ is a multiple of $l$. For example, when $l = 3$ and $n = 12$, each of the levels $\{1, 2, 3\}$ appears four times. This equal occurrence property of column vectors has been assumed in previous studies (see e.g. Booth and Cox, 1962; Lin, 1993; Wu, 1993) and it is also assumed here. A column vector $c_l \in \mathcal{C}^n_l$ and an $n \times p$ matrix $C^n_l = (c_{l\cdot 1}, \ldots, c_{l\cdot p})$ are called an $l$-level column vector and an $l$-level design matrix, respectively. An $n \times \sum_{i=1}^{q} p_i$ matrix $C^n = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1}, \ldots, c_{l_q\cdot 1}, \ldots, c_{l_q\cdot p_q})$, where $c_{l_i\cdot k} \in \mathcal{C}^n_{l_i}$ $(1 \le i \le q,\ 1 \le k \le p_i)$, is called a mixed-level design matrix consisting of $l_1, l_2, \ldots, l_q$-level column vectors. For example, a mixed-level design consisting of two- and three-level column vectors is denoted by $C = (c_{2\cdot 1}, \ldots, c_{2\cdot p_1}, c_{3\cdot 1}, \ldots, c_{3\cdot p_2})$, where $q = 2$, $l_1 = 2$, $l_2 = 3$, $c_{2\cdot k} \in \mathcal{C}^n_2$ $(1 \le k \le p_1)$ and $c_{3\cdot m} \in \mathcal{C}^n_3$ $(1 \le m \le p_2)$.

We next consider a definition of saturated and supersaturated designs based on the possibility of estimating the factor effects assigned to the column vectors. We define the degree of saturation by

$$v = \frac{\sum_{i=1}^{q} (l_i - 1)\, p_i}{n - 1} \tag{1}$$

for any $n \times \sum_{i=1}^{q} p_i$ design matrix $C^n = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1}, \ldots, c_{l_q\cdot 1}, \ldots, c_{l_q\cdot p_q})$, where $c_{l_i\cdot k} \in \mathcal{C}^n_{l_i}$ $(1 \le i \le q,\ 1 \le k \le p_i)$. The degree $v$ indicates the number of full-dimensional orthogonal bases in the given mixed-level design matrix $C^n$. Thus, a design is called a supersaturated design when $v > 1$ and a saturated design when $v = 1$, based on the number of full-dimensional orthogonal bases.

As regards the measure of dependency, Yamada and Lin (1999) defined an index between two multi-level column vectors by analogy with the $\chi^2$ statistic. Let $n^{rs}(c_{l_1}, c_{l_2})$ be the number of rows whose values are $(r, s)$ in the $n \times 2$ matrix $(c_{l_1}, c_{l_2})$, where $c_{l_1} \in \mathcal{C}^n_{l_1}$ and $c_{l_2} \in \mathcal{C}^n_{l_2}$. Then $\sum_{1 \le r \le l_1} \sum_{1 \le s \le l_2} n^{rs}(c_{l_1}, c_{l_2}) = n$. The dependency between $c_{l_1} \in \mathcal{C}^n_{l_1}$ and $c_{l_2} \in \mathcal{C}^n_{l_2}$ is measured by

$$\chi^2(c_{l_1}, c_{l_2}) = \sum_{1 \le r \le l_1} \sum_{1 \le s \le l_2} \frac{\left(n^{rs}(c_{l_1}, c_{l_2}) - n/(l_1 l_2)\right)^2}{n/(l_1 l_2)}. \tag{2}$$
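As a minimal sketch of Eqs. (1) and (2), the following Python functions compute the degree of saturation and the pairwise $\chi^2$ value with exact rational arithmetic. The function names and the three example columns are made up for illustration and are not part of the paper.

```python
from collections import Counter
from fractions import Fraction


def degree_of_saturation(n, columns_per_level):
    """Eq. (1): columns_per_level maps each level count l_i to p_i."""
    numerator = sum((l - 1) * p for l, p in columns_per_level.items())
    return Fraction(numerator, n - 1)


def chi2_pair(c1, l1, c2, l2):
    """Eq. (2) for an l1-level column c1 and an l2-level column c2."""
    n = len(c1)
    expected = Fraction(n, l1 * l2)   # n / (l1 * l2)
    counts = Counter(zip(c1, c2))     # n^{rs}: rows whose values are (r, s)
    return sum(
        (counts[(r, s)] - expected) ** 2 / expected
        for r in range(1, l1 + 1)
        for s in range(1, l2 + 1)
    )


# Made-up balanced columns with n = 12 runs, for illustration only.
c2 = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
c3_crossed = [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3]
c3_nested = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]

v = degree_of_saturation(12, {2: 11, 3: 5})
print(v)                                  # (11 * 1 + 5 * 2) / 11
print(chi2_pair(c2, 2, c3_crossed, 3))    # every (r, s) cell holds n / 6 = 2 rows
print(chi2_pair(c2, 2, c3_nested, 3))
```

Using `Fraction` avoids rounding in the non-integer expected count $n/(l_1 l_2)$, which is convenient when $\chi^2$ values are later compared against thresholds.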

For a mixed-level design $C^n = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1}, \ldots, c_{l_q\cdot 1}, \ldots, c_{l_q\cdot p_q})$, we introduce the sum of the $\chi^2$ values

$$\chi^2(C^n) = \sum_{1 \le i < j \le q} \sum_{1 \le k \le p_i} \sum_{1 \le m \le p_j} \chi^2(c_{l_i\cdot k}, c_{l_j\cdot m}) + \sum_{1 \le i \le q} \sum_{1 \le k < m \le p_i} \chi^2(c_{l_i\cdot k}, c_{l_i\cdot m}) \tag{3}$$

as a criterion of total dependency, following the basic idea of the popular design criterion $E(s^2)$. For example, $\chi^2(C^n) = 0$ implies that $C^n$ is a design consisting of mutually orthogonal columns, as in the case of $E(s^2) = 0$.

Another class of design criteria is the maximum $\chi^2$ value, analogous to the maximum squared inner product applied in Lin (1995) for two-level designs. In this paper, we utilize the following design criteria:

$$\max\{\chi^2(c_{l_i\cdot k}, c_{l_i\cdot m}) \mid 1 \le k < m \le p_i\}, \tag{4}$$

$$\max\{\chi^2(c_{l_i\cdot k}, c_{l_j\cdot m}) \mid 1 \le k \le p_i,\ 1 \le m \le p_j\} \tag{5}$$

as secondary criteria, in a manner similar to Yamada and Lin (1997).
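The sum in Eq. (3) and the maximum in Eq. (4) can be sketched as follows. The five columns are the three-level columns labelled $C^{12}_{3a}$ in Table 1, transcribed here as strings; the helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def chi2_pair(c1, l1, c2, l2):
    """Eq. (2), computed exactly from the two-way counts of (r, s) rows."""
    expected = Fraction(len(c1), l1 * l2)
    pairs = list(zip(c1, c2))
    return sum(
        (pairs.count((r, s)) - expected) ** 2 / expected
        for r in range(1, l1 + 1)
        for s in range(1, l2 + 1)
    )


def chi2_values(columns, levels):
    """chi2 of every unordered pair of columns; Eq. (3) is the sum of these values."""
    return {
        (a, b): chi2_pair(columns[a], levels[a], columns[b], levels[b])
        for a, b in combinations(range(len(columns)), 2)
    }


# The five three-level columns of C^{12}_{3a} as read from Table 1, one string per column.
c3a = ["111123222333", "112311223233", "112323123312", "121323331212", "122332132131"]
columns = [[int(ch) for ch in col] for col in c3a]

values = chi2_values(columns, [3] * len(columns))
total = sum(values.values())   # Eq. (3)
worst = max(values.values())   # Eq. (4); Eq. (5) is the same maximum over cross-level pairs
print(total, worst)
```

For a design with columns of several levels, passing the per-column level counts in `levels` covers both sums of Eq. (3) at once, since together they range over every unordered pair of columns.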


3. Preliminaries

Let $d^*_r$ be an $n$-dimensional column vector consisting of $n/l$ 1s and $n(l-1)/l$ 0s, where a 1 in the $j$th row of $d^*_r$ indicates an $r$ in the $j$th row of $c_l$ $(1 \le r \le l)$. For example, $c_3 = (1, 2, 3, 1, 2, 3)^{\top}$ is represented by $D^*_3 = (d^*_1, d^*_2, d^*_3)$ with $d^*_1 = (1, 0, 0, 1, 0, 0)^{\top}$, $d^*_2 = (0, 1, 0, 0, 1, 0)^{\top}$ and $d^*_3 = (0, 0, 1, 0, 0, 1)^{\top}$, where "$\top$" denotes the transpose. In general, an $l$-level column vector $c_l$ is represented by an $n \times l$ matrix $D^*_l = (d^*_1, \ldots, d^*_l)$. The above definition is introduced in order to prove the generalized theorems in the following section and corresponds to the dummy variable representation, which is sometimes applied in regression analysis for qualitative dependent variables.

Let $D^*_{l_i}$ and $D^*_{l_j}$ be the matrices corresponding to $c_{l_i} \in \mathcal{C}^n_{l_i}$ and $c_{l_j} \in \mathcal{C}^n_{l_j}$, respectively. The $\chi^2$ statistic between $c_{l_i}$ and $c_{l_j}$ is given by

$$\chi^2(c_{l_i}, c_{l_j}) = \operatorname{tr}\left( \left( D^{*\top}_{l_i} D^*_{l_j} - \frac{n}{l_i l_j} J_{l_i \times l_j} \right)^{\top} \left( D^{*\top}_{l_i} D^*_{l_j} - \frac{n}{l_i l_j} J_{l_i \times l_j} \right) \right) \Big/ \left( \frac{n}{l_i l_j} \right), \tag{6}$$

where $J_{a \times b}$ is the $a \times b$ matrix whose elements are all 1. Furthermore, we define

$$D_{l_i} = \frac{l_i^{1/2}}{n^{1/4}} \left( D^*_{l_i} - \frac{1}{l_i} J_{n \times l_i} \right) \tag{7}$$

as the dummy variable representation, so that the $\chi^2$ value between $c_{l_i}$ and $c_{l_j}$ is

$$\chi^2(c_{l_i}, c_{l_j}) = \operatorname{tr}\left( D^{\top}_{l_j} D_{l_i} D^{\top}_{l_i} D_{l_j} \right). \tag{8}$$

The dummy variable representation $D = (D_{l_1\cdot 1}, \ldots, D_{l_1\cdot p_1}, \ldots, D_{l_q\cdot 1}, \ldots, D_{l_q\cdot p_q})$ corresponding to $C^n = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1}, \ldots, c_{l_q\cdot 1}, \ldots, c_{l_q\cdot p_q})$ is also convenient for representing the sum of the $\chi^2$ values:

$$\chi^2(C^n) = \tfrac{1}{2} \left( \operatorname{tr}(D^{\top} D D^{\top} D) - v n (n - 1) \right). \tag{9}$$

Let $C^n_{l_1} = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1})$, $c_{l_1\cdot k} \in \mathcal{C}^n_{l_1}$ $(1 \le k \le p_1)$, be an $l_1$-level saturated and orthogonal design such that $p_1(l_1 - 1) = n - 1$ and $\chi^2(C^n_{l_1}) = 0$, and let $D_{l_1}$ be the $n \times p_1 l_1$ matrix defined by the dummy variable representation corresponding to $C^n_{l_1}$. The diagonal and off-diagonal elements of $D_{l_1} D^{\top}_{l_1}$ are $(n-1)/n^{1/2}$ and $-1/n^{1/2}$, respectively.

Let $D_{l_2}$ be the $n \times l_2$ matrix corresponding to a column vector $c_{l_2} \in \mathcal{C}^n_{l_2}$. The matrix $D_{l_2} D^{\top}_{l_2}$ has the following properties:

(i) The diagonal elements of the matrix are equal to $(l_2 - 1)/n^{1/2}$.
(ii) In each column (row) vector of the matrix, $(l_2 - 1)/n^{1/2}$ and $-1/n^{1/2}$ appear $n/l_2$ and $n(l_2 - 1)/l_2$ times, respectively.
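The following sketch builds the scaled dummy variable matrix of Eq. (7) for the example column $c_3 = (1, 2, 3, 1, 2, 3)^{\top}$ and checks Eq. (8) and property (i) numerically. It uses plain lists and floating-point arithmetic; the second column and the function names are made up for illustration.

```python
import math


def dummy(c, l):
    """Eq. (7): D_l = l^(1/2) / n^(1/4) * (D*_l - J / l), as a list of n rows."""
    n = len(c)
    scale = math.sqrt(l) / n ** 0.25
    return [
        [scale * ((1.0 if value == r else 0.0) - 1.0 / l) for r in range(1, l + 1)]
        for value in c
    ]


def chi2_trace(d_i, d_j):
    """Eq. (8): tr(D_j' D_i D_i' D_j), the sum of squared entries of D_i' D_j."""
    total = 0.0
    for a in range(len(d_i[0])):
        for b in range(len(d_j[0])):
            entry = sum(row_i[a] * row_j[b] for row_i, row_j in zip(d_i, d_j))
            total += entry * entry
    return total


c3 = [1, 2, 3, 1, 2, 3]         # the example column from the text
c3_other = [1, 1, 2, 2, 3, 3]   # a second balanced column, made up for this check
d, d_other = dummy(c3, 3), dummy(c3_other, 3)

chi2 = chi2_trace(d, d_other)   # Eq. (2) gives 3 for this pair
diagonal = [sum(x * x for x in row) for row in d]   # diagonal of D D'
print(round(chi2, 6), [round(x, 4) for x in diagonal])
```

The trace in Eq. (8) is evaluated as a squared Frobenius norm, which avoids forming the $n \times n$ product explicitly; each diagonal entry of $D D^{\top}$ should equal $(l - 1)/n^{1/2} = 2/\sqrt{6}$ here.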

4. Generalized theorems

Theorem 1. Any design $C^n = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1}, \ldots, c_{l_q\cdot 1}, \ldots, c_{l_q\cdot p_q})$ satisfies the inequality

$$\chi^2(C^n) \ge \tfrac{1}{2} v (v - 1) n (n - 1), \tag{10}$$

where $c_{l_i\cdot k} \in \mathcal{C}^n_{l_i}$ $(1 \le i \le q,\ 1 \le k \le p_i)$ and $v$ is the degree of saturation of $C^n$.

Proof. We denote the ordered eigenvalues of the positive semidefinite matrix $D^{\top} D$ by $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-1} \ge \lambda_n = \cdots = \lambda_{p^*} = 0$, where $p^* = \sum_{i=1}^{q} p_i$ and $D$ is the dummy variable representation corresponding to the design matrix $C^n$. Because $D^{\top} D$ is symmetric, the eigenvalues of $D^{\top} D D^{\top} D$ are $\{\lambda_1^2, \lambda_2^2, \ldots, \lambda_{p^*}^2\}$, and by Eq. (9) the sum of the $\chi^2$ values of the design matrix $C^n$ satisfies

$$\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_{n-1}^2 = \operatorname{tr}(D^{\top} D D^{\top} D) = 2 \chi^2(C^n) + v n (n - 1). \tag{11}$$

A lower bound of the value $\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_{n-1}^2$ can be obtained as the optimal value of the convex quadratic programming problem:

minimize $\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_{n-1}^2$
subject to $\lambda_1 + \lambda_2 + \cdots + \lambda_{n-1} = \operatorname{tr}(D^{\top} D) = v n^{1/2} (n - 1)$.

The above convex quadratic programming problem has the unique optimal solution $\lambda^*_1 = \lambda^*_2 = \cdots = \lambda^*_{n-1} = v n^{1/2}$. Hence, the optimal value of $\operatorname{tr}(D^{\top} D D^{\top} D)$ is equal to $(v n^{1/2})^2 (n - 1)$, from which one derives

$$\chi^2(C^n) \ge \tfrac{1}{2} \left( (v n^{1/2})^2 (n - 1) - v n (n - 1) \right) = \tfrac{1}{2} v (v - 1) n (n - 1). \tag{12}$$
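The optimal value of the quadratic program can also be checked directly with the Cauchy–Schwarz inequality. The step below is a short worked sketch that uses only the trace constraint stated in the proof.

```latex
% Worked step: the constraint fixes the sum of the eigenvalues,
% \lambda_1 + \cdots + \lambda_{n-1} = v n^{1/2} (n - 1).
\[
  \Bigl( \sum_{k=1}^{n-1} \lambda_k \Bigr)^{2}
  \le (n - 1) \sum_{k=1}^{n-1} \lambda_k^{2}
  \quad \Longrightarrow \quad
  \sum_{k=1}^{n-1} \lambda_k^{2}
  \ge \frac{\bigl( v n^{1/2} (n - 1) \bigr)^{2}}{n - 1}
  = v^{2} n (n - 1),
\]
% with equality if and only if \lambda_1 = \cdots = \lambda_{n-1} = v n^{1/2}.
```

Substituting $v^2 n (n-1)$ for the sum of squared eigenvalues in Eq. (11) gives $2\chi^2(C^n) + v n (n-1) \ge v^2 n (n-1)$, which rearranges to the bound in Eq. (10).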

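We next consider a design that attains the lower bound proved above. To obtain such a design, the following lemma is useful.

Lemma 1. Let $C^n_{l_1} = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1})$ be an $l_1$-level saturated and orthogonal design such that $p_1(l_1 - 1) = n - 1$ and $\chi^2(C^n_{l_1}) = 0$, where $c_{l_1\cdot k} \in \mathcal{C}^n_{l_1}$ $(1 \le k \le p_1)$. Then every column vector $c_{l_2} \in \mathcal{C}^n_{l_2}$ satisfies the equality

$$\sum_{1 \le k \le p_1} \chi^2(c_{l_1\cdot k}, c_{l_2}) = n (l_2 - 1).$$

Proof. Consider the dummy variable representations $D_{l_1}$ and $D_{l_2}$ corresponding to $C^n_{l_1} = (c_{l_1\cdot 1}, \ldots, c_{l_1\cdot p_1})$ and $c_{l_2}$, respectively. The sum of the $\chi^2$ values can be described by

$$\sum_{1 \le k \le p_1} \chi^2(c_{l_1\cdot k}, c_{l_2}) = \operatorname{tr}\left( D^{\top}_{l_1} D_{l_2} D^{\top}_{l_2} D_{l_1} \right) = \operatorname{tr}\left( D_{l_1} D^{\top}_{l_1} D_{l_2} D^{\top}_{l_2} \right). \tag{13}$$

The properties of the elements of $D_{l_1} D^{\top}_{l_1}$ and $D_{l_2} D^{\top}_{l_2}$ shown in the previous section ensure that all $n$ diagonal elements of the matrix $D_{l_1} D^{\top}_{l_1} D_{l_2} D^{\top}_{l_2}$ are equal to $l_2 - 1$, so the trace equals $n(l_2 - 1)$ and the lemma is obtained.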
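Lemma 1 can be checked numerically on a small case. The sketch below assumes the standard 9-run, four-column, three-level orthogonal array as the saturated and orthogonal design (it is not taken from the paper) and an arbitrary balanced three-level column that is not one of its columns.

```python
from fractions import Fraction
from itertools import product


def chi2_pair(c1, l1, c2, l2):
    """Eq. (2): chi-square dependency between two balanced columns."""
    expected = Fraction(len(c1), l1 * l2)
    pairs = list(zip(c1, c2))
    return sum(
        (pairs.count((r, s)) - expected) ** 2 / expected
        for r in range(1, l1 + 1)
        for s in range(1, l2 + 1)
    )


# Assumed example design: rows (i, j, i + j, i + 2j) mod 3, shifted to levels 1..3.
rows = [(i, j, (i + j) % 3, (i + 2 * j) % 3) for i, j in product(range(3), repeat=2)]
l9 = [[row[k] + 1 for row in rows] for k in range(4)]   # list of four columns

# Saturated: p1 * (l1 - 1) = 4 * 2 = n - 1. Orthogonal: every pair has chi2 = 0.
orthogonal = all(
    chi2_pair(l9[a], 3, l9[b], 3) == 0 for a in range(4) for b in range(a + 1, 4)
)

# An arbitrary balanced three-level column (made up for this check).
c = [1, 1, 2, 2, 3, 3, 1, 2, 3]
lemma1_sum = sum(chi2_pair(col, 3, c, 3) for col in l9)
print(orthogonal, lemma1_sum)   # the lemma predicts n * (l2 - 1) = 9 * 2
```

The individual $\chi^2$ values differ from column to column, but their sum is fixed by the lemma regardless of which balanced column `c` is chosen.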

Theorem 2. Let $C^n = (C^n_{l_1}, \ldots, C^n_{l_q})$ be a mixed-level supersaturated design consisting of saturated and orthogonal designs $C^n_{l_i} = (c_{l_i\cdot 1}, \ldots, c_{l_i\cdot p_i})$ $(1 \le i \le q)$ such that $\chi^2(C^n_{l_i}) = 0$ and $p_i(l_i - 1) = n - 1$, i.e. $C^n_{l_i}$ is saturated with respect to the estimation of all main effects. Then this design $C^n$ is optimal in terms of the sum of $\chi^2$ values.

Proof. We show that the design $C^n$ consisting of saturated and orthogonal designs attains the lower bound of the sum of $\chi^2$ values. Because each member of the partition is orthogonal, we have the following:

$$\chi^2(C^n) = \sum_{1 \le i < j \le q} \sum_{1 \le k \le p_i} \sum_{1 \le m \le p_j} \chi^2(c_{l_i\cdot k}, c_{l_j\cdot m}). \tag{14}$$

Also, Lemma 1 and $p_i(l_i - 1) = n - 1$ $(1 \le i \le q)$ imply

$$\sum_{1 \le i < j \le q} \sum_{1 \le k \le p_i} \sum_{1 \le m \le p_j} \chi^2(c_{l_i\cdot k}, c_{l_j\cdot m}) = \tfrac{1}{2} q (q - 1) n (n - 1). \tag{15}$$

Now, when $C^n$ consists of $q$ saturated and orthogonal design matrices, the degree of saturation is equal to $q$, i.e. $v = q$. Thus, the above equation implies that the mixed-level supersaturated design $C^n$ consisting of saturated and orthogonal designs is optimal in terms of the sum of $\chi^2$ values, since it attains equality in the lower bound shown in Eq. (10).
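A small numerical illustration of Theorem 2 follows. It is only a sketch: it assumes the standard 9-run three-level orthogonal array as the saturated and orthogonal building block, stacks it with a row-permuted copy ($q = 2$, with $l_1 = l_2 = 3$ chosen for brevity), and compares the sum of $\chi^2$ values with the bound in Eq. (10). The permutation is arbitrary.

```python
from fractions import Fraction
from itertools import combinations, product


def chi2_pair(c1, l1, c2, l2):
    """Eq. (2), computed exactly."""
    expected = Fraction(len(c1), l1 * l2)
    pairs = list(zip(c1, c2))
    return sum(
        (pairs.count((r, s)) - expected) ** 2 / expected
        for r in range(1, l1 + 1)
        for s in range(1, l2 + 1)
    )


# Assumed building block: rows (i, j, i + j, i + 2j) mod 3, which is saturated
# because 4 * (3 - 1) = 9 - 1.
rows = [(i, j, (i + j) % 3, (i + 2 * j) % 3) for i, j in product(range(3), repeat=2)]
perm = [4, 0, 7, 2, 8, 1, 5, 3, 6]   # an arbitrary row permutation
permuted = [rows[p] for p in perm]

# C^n = (original block, row-permuted block): two saturated orthogonal parts.
columns = [[r[k] + 1 for r in block] for block in (rows, permuted) for k in range(4)]
n, l = 9, 3

total = sum(chi2_pair(a, l, b, l) for a, b in combinations(columns, 2))
v = Fraction(len(columns) * (l - 1), n - 1)          # Eq. (1)
bound = Fraction(1, 2) * v * (v - 1) * n * (n - 1)   # Eq. (10)
print(v, total, bound)
```

Any other row permutation should give the same total, because Lemma 1 fixes the contribution of each column of the second block.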

5. Design evaluation

Theorem 1 is useful for evaluating the dependency because the lower bound can be applied to any type of supersaturated design. We define an index for measuring the design efficiency of a given design $C^n$ by

$$\frac{(1/2)\, v (v - 1) n (n - 1)}{\chi^2(C^n)}, \tag{16}$$

which we call the $\chi^2$-efficiency. In this index, the numerator is the lower bound of the sum of $\chi^2$ values for any mixed-level supersaturated design with the given $n$ and $v$, and the denominator is the sum of $\chi^2$ values of the given design matrix $C^n$. The $\chi^2$-efficiency therefore takes a value from 0 to 1 and indicates the degree of attainment compared with an optimal design in terms of the sum of $\chi^2$ values. For example, when the $\chi^2$-efficiency is equal to 1, the design is optimal in terms of the sum of $\chi^2$ values.

This index is derived by analogy with well-known design indices such as D-efficiency and G-efficiency. Specifically, D-efficiency is defined by the ratio of the value of the D-optimality criterion under the given design to its value under the optimal design. D-efficiency and G-efficiency are introduced in standard textbooks on optimum designs such as Atkinson and Donev (1992).
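Eq. (16) is straightforward to evaluate once $n$, the column counts and the sum of $\chi^2$ values are known. The sketch below applies it to two of the 12-run mixed-level designs evaluated in Section 6, using the column counts and $\chi^2$ sums reported in Table 2; the function names are illustrative.

```python
from fractions import Fraction


def degree_of_saturation(n, columns_per_level):
    """Eq. (1): columns_per_level maps each level count l_i to p_i."""
    return Fraction(sum((l - 1) * p for l, p in columns_per_level.items()), n - 1)


def chi2_lower_bound(n, v):
    """Right-hand side of Eq. (10)."""
    return Fraction(1, 2) * v * (v - 1) * n * (n - 1)


def chi2_efficiency(n, columns_per_level, chi2_total):
    """Eq. (16): lower bound divided by the attained sum of chi2 values."""
    v = degree_of_saturation(n, columns_per_level)
    return chi2_lower_bound(n, v) / chi2_total


# 11 two-level columns plus 7 (or 31) three-level columns, chi2 sums 216 and 2649.
eff_b = chi2_efficiency(12, {2: 11, 3: 7}, 216)
eff_c = chi2_efficiency(12, {2: 11, 3: 31}, 2649)
print(round(float(eff_b), 2), round(float(eff_c), 2))   # 0.88 0.93
```

Note that the bound is negative when $v < 1$, so the index is informative only for saturated or supersaturated designs.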

6. Example

In this section we describe the construction of a mixed-level supersaturated design consisting of two- and three-level column vectors with $n = 12$ runs, based on the theorems shown in Section 4. Theorems 1 and 2 imply that a supersaturated design partitioned into saturated and orthogonal designs is optimal in terms of the sum of $\chi^2$ values. However, there is no three-level orthogonal design with $n = 12$ runs, although there are two-level saturated and orthogonal designs with $n = 12$ runs. Therefore, we construct mixed-level designs such that $C^n = (C^n_2, C^n_3)$, where $C^n_2 = (c_{2\cdot 1}, \ldots, c_{2\cdot (n-1)})$ is a two-level saturated and orthogonal design and $C^n_3 = (c_{3\cdot 1}, \ldots, c_{3\cdot p_2})$ is a three-level design.

When $C^n_2$ is a saturated and orthogonal design, Lemma 1 implies that the sum of the $\chi^2$ values $\sum_{1 \le k \le n-1} \chi^2(c_{2\cdot k}, c_3)$ is a constant for all $c_3 \in \mathcal{C}^n_3$, i.e. the sum of $\chi^2$ values is invariant to the selection of $c_3$. Therefore, we introduce the following criteria:

$$\max\{\chi^2(c_{2\cdot k}, c_{3\cdot m}) \mid 1 \le k \le n - 1,\ 1 \le m \le p_2\},$$

$$\max\{\chi^2(c_{3\cdot k}, c_{3\cdot m}) \mid 1 \le k < m \le p_2\}.$$

The following lemma is convenient.

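Lemma 2. Let $C^n_2 = (c_{2\cdot 1}, \ldots, c_{2\cdot (n-1)})$ be a two-level saturated and orthogonal design. Then there is no three-level design $C^n_3 = (c_{3\cdot 1}, \ldots, c_{3\cdot p_2})$ such that $\max\{\chi^2(c_{2\cdot k}, c_{3\cdot m}) \mid 1 \le k \le n - 1,\ 1 \le m \le p_2\} < 2n/(n-1)$.

Proof. The condition $\max\{\chi^2(c_{2\cdot k}, c_{3\cdot m}) \mid 1 \le k \le n - 1,\ 1 \le m \le p_2\} < 2n/(n-1)$ implies $\sum_{1 \le k \le n-1} \chi^2(c_{2\cdot k}, c_{3\cdot m}) < 2n$, while Lemma 1 implies $\sum_{1 \le k \le n-1} \chi^2(c_{2\cdot k}, c_{3\cdot m}) = 2n$. This is a contradiction.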

The set of all possible values of $\chi^2(c_2, c_3)$ is $\{0.0, 2.0, 6.0, 8.0\}$, which is obtained by a brute-force computer search examining all pairs of one $c_2 \in \mathcal{C}^{12}_2$ and one $c_3 \in \mathcal{C}^{12}_3$. Lemma 2 shows that there is no column vector $c_3 \in \mathcal{C}^{12}_3$ that satisfies $\max\{\chi^2(c_{2\cdot k}, c_3) \mid 1 \le k \le n - 1\} < 2n/(n-1) = 24/11$. This inequality implies that there does not exist any three-level column $c_3 \in \mathcal{C}^{12}_3$ satisfying $\max \chi^2_{23} = 2.0$, where $\max \chi^2_{23}$ denotes the maximum $\chi^2$ value between the two-level columns and a three-level column. Therefore, we explore three-level column vectors $c_3 \in \mathcal{C}^{12}_3$ under the condition $\max \chi^2_{23} = 6.0$, which is the minimum value greater than $24/11$ in the set $\{0.0, 2.0, 6.0, 8.0\}$.

The following $(12 \times 11)$ matrix

$$C^{12}_2 = (c_{2\cdot 1}, \ldots, c_{2\cdot 11}) = \begin{pmatrix}
1 & 1 & 1 & 1 & 2 & 1 & 1 & 2 & 2 & 2 & 2 \\
1 & 1 & 1 & 2 & 1 & 2 & 2 & 1 & 1 & 2 & 2 \\
1 & 1 & 2 & 1 & 1 & 2 & 2 & 2 & 2 & 1 & 1 \\
1 & 2 & 2 & 2 & 1 & 1 & 1 & 1 & 2 & 1 & 2 \\
1 & 2 & 1 & 2 & 2 & 1 & 2 & 2 & 1 & 1 & 1 \\
1 & 2 & 2 & 1 & 2 & 2 & 1 & 1 & 1 & 2 & 1 \\
2 & 1 & 2 & 1 & 2 & 1 & 2 & 1 & 1 & 1 & 2 \\
2 & 1 & 1 & 2 & 2 & 2 & 1 & 1 & 2 & 1 & 1 \\
2 & 1 & 2 & 2 & 1 & 1 & 1 & 2 & 1 & 2 & 1 \\
2 & 2 & 1 & 1 & 1 & 2 & 1 & 2 & 1 & 1 & 2 \\
2 & 2 & 1 & 1 & 1 & 1 & 2 & 1 & 2 & 2 & 1 \\
2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2
\end{pmatrix}$$

is a saturated and orthogonal design matrix, and is equivalent to the design with $n = 12$ runs by Plackett and Burman (1946). The algorithm for constructing design matrices $C^{12} = (C^{12}_2, C^{12}_3)$ is as follows. Initially, we set the set of three-level column vectors to the empty set. In each iteration, a three-level column vector $c_3$ satisfying

$$\max\{\chi^2(c_{2\cdot k}, c_3) \mid 1 \le k \le n - 1\} \le 6.0$$

and

$$\max\{\chi^2(c_{3\cdot k}, c_3) \mid 1 \le k \le p^*_2\} \le \chi^2_c$$

is searched for and added to the three-level design, where $p^*_2$ is the number of three-level column vectors already enumerated at each stage.

We examine all column vectors in the set $\mathcal{C}^{12}_3$ in an arbitrary order. The threshold value $\chi^2_c$ is selected from the set $\{1.5, 3.0, 6.0, 11.5, 12.0, 15.0, 24.0\}$, which is the set of all possible $\chi^2$ values between two three-level column vectors. This set is also obtained by a brute-force computer search examining all pairs of $c_3 \in \mathcal{C}^{12}_3$. Table 1 shows matrices constructed using the algorithm. The left portion, consisting of five column vectors, is generated to satisfy the two conditions $\max\{\chi^2(c_{2\cdot k}, c_{3\cdot m}) \mid 1 \le k \le n - 1,\ 1 \le m \le p_2\} \le 6.0$ and $\max\{\chi^2(c_{3\cdot k}, c_{3\cdot m}) \mid 1 \le k < m \le p_2\} \le \chi^2_c = 1.5$. In a similar manner, seven and thirty-one columns are explored under the conditions $\chi^2_c = 3.0$ and $6.0$, respectively. The result is also summarized in Table 1.

Table 1
Explored three-level design matrices

| $C^{12}_{3a}$ ($\chi^2_c = 1.5$) | $C^{12}_{3b}$ ($\chi^2_c = 3.0$) | $C^{12}_{3c}$ ($\chi^2_c = 6.0$) |
|---|---|---|
| 1 1 1 1 1 | 1 1 1 1 1 1 1 | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 |
| 1 1 1 2 2 | 1 1 1 1 2 2 2 | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 |
| 1 2 2 1 2 | 1 2 2 2 1 1 2 | 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 3 3 |
| 1 3 3 3 3 | 1 2 2 2 2 2 1 | 1 2 2 2 2 1 1 1 2 2 2 2 2 2 3 3 3 1 1 2 2 2 3 3 3 1 1 3 3 1 1 |
| 2 1 2 2 3 | 2 1 2 3 1 3 3 | 2 1 2 2 3 2 3 3 1 1 2 2 2 3 1 2 2 2 3 2 3 3 1 2 3 1 3 1 1 1 2 |
| 3 1 3 3 2 | 3 2 1 3 2 3 3 | 3 3 2 3 2 2 1 3 2 3 1 3 3 3 2 1 2 2 3 3 1 1 3 2 2 3 2 1 3 1 1 |
| 2 2 1 3 1 | 2 2 3 1 3 3 1 | 2 2 2 3 1 3 2 2 1 3 3 1 3 3 2 2 3 1 2 1 2 3 2 3 3 3 3 3 1 2 3 |
| 2 2 2 3 3 | 2 3 1 2 3 2 3 | 2 3 3 2 2 1 2 3 3 2 3 2 1 1 2 1 3 3 2 3 1 3 1 2 1 2 1 2 3 3 2 |
| 2 3 3 1 2 | 2 3 3 3 2 1 2 | 2 3 3 3 3 3 3 1 2 1 3 3 2 3 3 3 1 3 3 1 2 1 3 3 1 2 1 3 2 2 3 |
| 3 2 3 2 1 | 3 3 2 1 3 1 3 | 3 2 3 1 2 2 3 2 3 2 1 3 3 2 1 3 1 3 2 2 3 3 2 1 2 3 3 3 1 3 3 |
| 3 3 1 1 3 | 3 3 3 2 1 3 1 | 3 2 3 3 3 3 2 3 3 3 2 1 3 1 3 3 3 2 1 3 3 2 3 1 2 1 3 2 2 2 1 |
| 3 3 2 2 1 | 3 1 3 3 3 2 2 | 3 3 1 2 3 3 3 2 3 3 3 3 1 2 3 2 2 3 3 3 3 2 2 3 3 3 2 1 3 3 2 |

Table 2 shows the number of column vectors, the degree of saturation $v$, the lower bounds, $\chi^2(C^n)$ and the $\chi^2$-efficiency for the design matrices shown in Table 1. The frequencies of the $\chi^2$ values are shown in Table 3. For example, there are 55 pairs of column vectors where one column vector is selected from $C^{12}_2$ and the other from $C^{12}_{3a}$. Table 3 indicates that, of these pairs, 9 have a $\chi^2$ value of 0, 39 have a value of 2, and 7 have a value of 6.

Table 2
Evaluation of constructed supersaturated designs

| | Three-level: $C^{12}_{3a}$ | Three-level: $C^{12}_{3b}$ | Three-level: $C^{12}_{3c}$ | Mixed-level: $(C^{12}_2, C^{12}_{3a})$ | Mixed-level: $(C^{12}_2, C^{12}_{3b})$ | Mixed-level: $(C^{12}_2, C^{12}_{3c})$ |
|---|---|---|---|---|---|---|
| Number of columns | 5 | 7 | 31 | 16 | 18 | 42 |
| Degree of saturation | 0.91 | 1.27 | 5.64 | 1.91 | 2.27 | 6.64 |
| Lower bound | - | 22.9 | 1724.7 | 114.5 | 190.9 | 2468.7 |
| $\chi^2(C^{12})$ | 15.0 | 48.0 | 1905.0 | 135.0 | 216.0 | 2649.0 |
| $\chi^2$-efficiency | - | 0.47 | 0.91 | 0.85 | 0.88 | 0.93 |

Table 3
Frequency of $\chi^2$ values

(i) $\chi^2(c_2, c_3)$

| $\chi^2$ | $(C^{12}_2, C^{12}_{3a})$ | $(C^{12}_2, C^{12}_{3b})$ | $(C^{12}_2, C^{12}_{3c})$ |
|---|---|---|---|
| 0 | 9 | 17 | 51 |
| 2 | 39 | 48 | 249 |
| 6 | 7 | 12 | 41 |

(ii) $\chi^2(c_3, c_3)$

| $\chi^2$ | $(C^{12}_{3a}, C^{12}_{3a})$ | $(C^{12}_{3b}, C^{12}_{3b})$ | $(C^{12}_{3c}, C^{12}_{3c})$ |
|---|---|---|---|
| 1.5 | 10 | 10 | 136 |
| 3.0 | 0 | 11 | 91 |
| 6.0 | 0 | 0 | 238 |

According to the $\chi^2$-efficiency, $(C^{12}_2, C^{12}_{3b})$ and $(C^{12}_2, C^{12}_{3c})$ are relatively better than $C^{12}_{3b}$ and $C^{12}_{3c}$, respectively. This is because adding a saturated and orthogonal design is advantageous in terms of the $\chi^2$-efficiency: the addition of high-efficiency matrices brings high efficiency to the resulting design. This fact is confirmed by the definition of the efficiency shown in Eq. (16).
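The search loop and the frequency counts can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' program: it scans a short hypothetical candidate list instead of enumerating all of $\mathcal{C}^{12}_3$, and the function and variable names are made up. The two-level rows and the five $C^{12}_{3a}$ columns are transcribed from the matrix and Table 1 in this section.

```python
from fractions import Fraction


def chi2_pair(c1, l1, c2, l2):
    """Eq. (2), computed exactly."""
    expected = Fraction(len(c1), l1 * l2)
    pairs = list(zip(c1, c2))
    return sum(
        (pairs.count((r, s)) - expected) ** 2 / expected
        for r in range(1, l1 + 1)
        for s in range(1, l2 + 1)
    )


def greedy_select(base, candidates, max_cross, max_within):
    """Scan candidates once, keeping each three-level column that meets both
    thresholds against the two-level base and the columns kept so far."""
    kept = []
    for c in candidates:
        cross_ok = all(chi2_pair(b, 2, c, 3) <= max_cross for b in base)
        within_ok = all(chi2_pair(k, 3, c, 3) <= max_within for k in kept)
        if cross_ok and within_ok:
            kept.append(c)
    return kept


# Rows of the 12-run two-level design C^{12}_2, as read from the matrix in this section.
c2_rows = [
    "11112112222", "11121221122", "11211222211", "12221111212",
    "12122122111", "12212211121", "21212121112", "21122211211",
    "21221112121", "22111212112", "22111121221", "22222222222",
]
base = [[int(ch) for ch in col] for col in zip(*c2_rows)]

# Hypothetical candidate list: the five columns of C^{12}_{3a} from Table 1, one
# made-up column that breaks the cross threshold, and a repeated column.
candidate_strings = [
    "111122223333",   # made up: chi2 with the first two-level column is 8
    "111123222333", "112311223233",
    "111123222333",   # repeat of an accepted column: chi2 with itself is 24
    "112323123312", "121323331212", "122332132131",
]
candidates = [[int(ch) for ch in s] for s in candidate_strings]

kept = greedy_select(base, candidates, max_cross=6, max_within=Fraction(3, 2))
cross = [chi2_pair(b, 2, c, 3) for b in base for c in kept]
within = [chi2_pair(kept[i], 3, kept[j], 3)
          for i in range(len(kept)) for j in range(i + 1, len(kept))]
frequency = {str(value): cross.count(value) for value in sorted(set(cross))}
print(len(kept), frequency, sum(cross) + sum(within))
```

With these inputs the five $C^{12}_{3a}$ columns are kept, and the cross-level frequencies and the total agree with the $(C^{12}_2, C^{12}_{3a})$ entries of Tables 3 and 2. Because the scan is greedy, the result of a full search over $\mathcal{C}^{12}_3$ would depend on the order in which candidates are examined.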

7. Concluding remarks

We have presented two generalized theorems; they are useful for evaluation and provide a theoretical background for any type of supersaturated design. Specifically, Theorem 1 shows a lower bound of the sum of the $\chi^2$ values for any type of supersaturated design, which can be regarded as a generalization of previous studies for two-level supersaturated designs in terms of the $E(s^2)$ criterion. This lower bound yields the $\chi^2$-efficiency as an index for measuring the degree of attainment compared with an optimal design in terms of the sum of $\chi^2$ values. This index has the advantage that it can be used for any type of supersaturated design.

Theorem 2 suggests some methods for constructing optimal supersaturated designs in terms of the sum of $\chi^2$ values, and it can be regarded as a generalization of the previous studies for two-level supersaturated designs. This theorem implies the optimality of supersaturated designs constructed by the sequential addition of design matrices generated by permuting the rows of a saturated and orthogonal design matrix. This construction method is utilized by Yamada et al. (1999) for three-level designs without theoretical justification of the optimality. In general, Theorem 2 suggests the possibility of constructing optimal $l$-level supersaturated designs $(l \ge 2)$ in terms of the sum of $\chi^2$ values through this construction method. In the case of mixed-level designs, further investigation is needed, as one referee showed the impossibility of constructing optimal two-level and three-level supersaturated designs based on Theorem 2.

As regards data analysis for supersaturated designs, stepwise selection has been applied under an assumption of "effect sparsity", e.g. Lin (1993). Furthermore, Westfall et al. (1998) and Yamada (1999) evaluate the Type I and II selection errors in stepwise regression. Data analysis for supersaturated designs is another important topic, alongside the design construction discussed in this paper.

Acknowledgements

We thank the editor and the referees for their useful comments, which helped to improve this paper.

References

Atkinson, A.C., Donev, A.N., 1992. Optimum Experimental Designs. Oxford University Press, Oxford.
Booth, K.H.V., Cox, D.R., 1962. Some systematic supersaturated designs. Technometrics 4, 489–495.
Cheng, C.S., 1997. $E(s^2)$-optimal supersaturated designs. Statist. Sinica 7, 929–939.
Deng, L.Y., Lin, D.K.J., Wang, J.N., 1994. Supersaturated design using Hadamard matrix. IBM Research Report RC19470, IBM Watson Research Center.
Iida, T., 1994. A construction method of two-level supersaturated design derived from L12. Jpn. J. Appl. Statist. 23, 147–153 (in Japanese).
Li, W.W., Wu, C.F.J., 1997. Columnwise-pairwise algorithms with applications to the construction of supersaturated designs. Technometrics 39, 171–179.
Lin, D.K.J., 1993. A new class of supersaturated designs. Technometrics 35, 28–31.
Lin, D.K.J., 1995. Generating systematic supersaturated designs. Technometrics 37, 213–225.
Nguyen, N.K., 1996. An algorithmic approach to constructing supersaturated designs. Technometrics 38, 69–73.
Plackett, R.L., Burman, J.P., 1946. The design of optimum multifactorial experiments. Biometrika 33, 303–325.
Satterthwaite, F.E., 1959. Random balance experimentation (with discussion). Technometrics 1, 111–137.
Tang, B., Wu, C.F.J., 1997. A method for constructing supersaturated designs and its $E(s^2)$ optimality. Canad. J. Statist. 25, 191–201.
Westfall, P.H., Young, S.S., Lin, D.K.J., 1998. Forward selection error control in the analysis of supersaturated design. Statist. Sinica 8, 101–117.
Wu, C.F.J., 1993. Construction of supersaturated designs through partially aliased interactions. Biometrika 80, 661–669.
Yamada, S., 1999. Selection errors of stepwise regression in the analysis of supersaturated design. Proceedings of the Section of Physical and Engineering Sciences 1999, American Statistical Association, 79–84.
Yamada, S., Lin, D.K.J., 1997. Supersaturated design including an orthogonal base. Canad. J. Statist. 25, 203–213.
Yamada, S., Lin, D.K.J., 1999. Three-level supersaturated design. Statist. Probab. Lett. 45, 31–39.
Yamada, S., Ikebe, Y., Hashiguchi, H., Niki, N., 1999. Construction of three-level supersaturated design. J. Statist. Plann. Inference 81, 183–193.