

STATISTICS IN MEDICINE
Statist. Med. 2008; 27:845–863
Published online 4 July 2007 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/sim.2952

Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data

Xiaowei Yang 1,2,∗,† and Kun Nie 2

1 Division of Biostatistics, Department of Public Health Sciences, School of Medicine, University of California, Davis, CA 95616, U.S.A.

2 BayesSoft, Inc., 2221 Caravaggio Drive, Davis, CA 95618, U.S.A.

SUMMARY

Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor–outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data. Copyright © 2007 John Wiley & Sons, Ltd.

KEY WORDS: functional linear regression model; adaptive Neyman test; thresholding test; Fourier transform; wavelet transform; longitudinal data analysis

∗Correspondence to: Xiaowei Yang, Division of Biostatistics, Department of Public Health Sciences, School of Medicine, Med Sci 1-C, University of California, Davis, CA 95616, U.S.A.

†E-mail: [email protected]

Contract/grant sponsor: National Institute on Drug Abuse; contract/grant numbers: N44 DA35513, R03 DA016721, P50 DA 12755

Received 8 December 2005; Accepted 26 April 2007
Copyright © 2007 John Wiley & Sons, Ltd.


1. INTRODUCTION

In clinical trials or epidemiological studies with a longitudinal design, certain features are repeatedly measured on one or several groups of subjects so that the evolution of the responses of interest can be estimated or hypothesized causal relationships can be investigated [1]. Facilitated by the advances of modern information technology, it is common to encounter medical longitudinal data with large numbers of repeated measures. For example, in a smoking cessation clinical trial, up to 36 breath samples were collected from each smoker and analyzed to test the efficacy of behavioral therapies [2]. In an epidemiological study, a group of dialysis patients was followed for several years until death, with kidney function and body composition characteristics measured monthly [3]. To deal with the within-subject dependency among repeated measures, many multivariate modeling strategies have been introduced, e.g. conditional models with random effects, marginal models with semi-parametric formulation, and transition models assuming Markov processes [4]. These modeling strategies were mainly developed for longitudinal data with sparse time grids. As sampling grids become dense, the data structure grows more complicated, with nonlinear time patterns, and the between-subject variance becomes more influential.

As an alternative to longitudinal data analysis (LDA), functional data analysis (FDA) may offer a more suitable solution, at least theoretically. When measures are recorded densely over time, they are typically termed functional or curve data, and accordingly the method chosen to analyze them is called FDA [5]. Although functional data contain experimental errors or noise, their effects can be greatly reduced by smoothing the data recorded at closely spaced time points. From the perspective of FDA, the unit of data analysis is an entire curve from a subject observed in continuum, even though it is recorded at discrete time points in practice. There are basically two steps in applying FDA, although other analytical schemes are feasible. The first step is to represent trajectories on discrete time grids as smooth curves via expansion with basis functions or smoothing with local weighting. In the second step, we conduct analysis by fitting a model within the functional space, e.g. functional principal component analysis (FPCA) or functional linear models. For a full account of the related theory and applications, refer to the monographs by Ramsay and Silverman [5].

Until very recently, FDA and LDA had been viewed as disjoint enterprises, and efforts are now actively being made to reconcile the two methods. Within the context of principal component analysis, Hall and colleagues [6] investigated the asymptotic features of FPCA applied to longitudinal data with sparse time grids. As they showed for a sample of n curves with random measurement times, the estimation of eigenvalues is a semi-parametric problem with root-n consistent estimators, even if only a few observations are recorded for each function. The comparison of LDA and FDA was extensively discussed in the 2004 emerging issues section of Statistica Sinica [7], which indicates that the two methods are quite distinct, with different focuses and types of modeling approach. FDA tends to apply non-parametric methods for exploring the data to find interesting descriptive features of a phenomenon, whereas LDA is more interested in making inferences regarding outcome–predictor relationships by conducting hypothesis testing with parametric models.

As discussed by Cardot and colleagues [8–10], a functional linear model offers a natural extension of traditional linear models. Theoretically, within the framework of functional linear models, the effects of predictors on the response curves can be estimated and, accordingly, hypotheses can be tested by comparing nested models. Depending on the types of response and predictors, we can conceive three forms of functional linear models: those with a functional response and scalar predictors; those with a scalar response and functional predictors; and those with functional variables on both sides of the equation.


In this paper, we restrict our attention to the first type of functional linear model to analyze longitudinal data with continuous repeated measures as responses; in this case, we call it the functional linear regression model (FLRM). Generalized functional linear models for non-Gaussian responses were also developed by Muller and Stadtmuller [11]. One advantage of the FLRM is that time-varying treatment effects can be studied while requiring fewer assumptions on the mean structure and the intra-subject correlation pattern. An interesting example of functional data is the urinary metabolite progesterone data studied by Brumback and Rice [12].

To apply the FLRM, a stationary Gaussian process is a convenient assumption for the measurement process. Within the original domain, least-squares estimation methods, with or without a penalty on smoothness, can be applied directly to the data, and the estimated regression coefficient functions in the FLRM can be tested directly using a statistic similar in form to the F statistic in traditional linear regression models. Shen and Faraway [13] proposed a computationally efficient approximation of the asymptotic distribution of the functional F statistic, which was later applied to the smoking cessation data [14]; those data are reanalyzed in this article using a group of transform-based methods for fitting FLRMs.

Viewing longitudinal data as groups of curves to be compared, Fan and Lin [15] developed the adaptive Neyman test and the thresholding test within the framework of high-dimensional analysis of variance (HANOVA) [15, 16]. Both testing methods depend on dimension-reduction or smoothing techniques using orthogonal transformation. After the Fourier transform, the estimated coefficients in the frequency domain are combined to yield the adaptive Neyman statistic. Similarly, the thresholding statistic is calculated after performing the wavelet transform. In this paper, we discuss how to extend these two testing methods from the setting of HANOVA to the setting of FLRMs. In the following section, the smoking cessation clinical trial with longitudinal carbon monoxide data is introduced as a motivating study. Then, in Section 3, FLRMs are defined both in the original time domain and in the transformed Fourier or wavelet domain. The testing methods based on Neyman's truncation and wavelet thresholding are also developed there for FLRMs within the transformed domains. In Section 4, the carbon monoxide data are analyzed to illustrate the performance of the testing methods. Finally, additional remarks and guidelines for applying FLRMs to longitudinal data are given in Section 5.

2. A MOTIVATING STUDY

The development of this work was closely related to the analysis of a clinical trial of smoking cessation in methadone-maintained tobacco smokers [2]. This study was designed to test the effectiveness of two forms of behavioral therapy, relapse prevention (RP) and contingency management (CM), for improving smoking cessation outcomes with nicotine transdermal pharmacotherapy. A total of 174 participants were randomly assigned to one of four behavioral treatment groups: a control group that received no behavioral therapy (42 subjects); RP-only (42 subjects); CM-only (43 subjects); and a combined RP+CM condition (47 subjects). The RP program applied cognitive techniques and behavioral skills to instill smoking cessation and to prevent relapse. In this program, the interventions were delivered in weekly hour-long group counseling sessions. The CM for tobacco smoking was accomplished by providing vouchers that could be exchanged for goods or services once a participant provided scheduled breath samples indicating smoking abstinence.


Figure 1. Mean levels of carbon monoxide across the treatment groups. For each plot, the y-axis indicates the log(CO + 1) transform of the original carbon monoxide level (p.p.m.) and the x-axis indicates the clinic visit number for study participants (1, ..., 36). Both individual profiles and the mean profile are plotted for each of the four treatment conditions: control, RP-only, CM-only, and RP+CM (RP, relapse prevention; CM, contingency management).

The repeated measures of interest in this 12-week study are the expired breath samples, which were collected three times per week and analyzed for carbon monoxide levels (p.p.m.) to indicate recent tobacco abstinence.

After a log(1 + y) transform, the observed individual trajectories and the smoothed mean function for carbon monoxide levels within each treatment group are depicted in Figure 1. From the plots, we first notice that the between-subject variances are considerably high and the mean functions look quite nonlinear. Within each group, the mean level of carbon monoxide declines quickly from the baseline level during the first two weeks. The mean curve then oscillates around a flat level that differs across the four groups. For significance testing, a convenient approach is to compare the observed carbon monoxide levels across treatment conditions at each time point using ANOVA. While this method provides a useful tool for exploratory purposes, it is limited for making inferences regarding treatment efficacy because it is difficult to combine the results and p-values from the multiple comparisons. Another problem with this point-wise ANOVA is introduced by missing values, which occurred in this study due to either participants' early withdrawal or occasional missed clinic visits.


As commonly seen in other clinical trials with a fixed design, the proportion of missing values increases over time and usually reaches its highest level around the termination of the study. In this carbon monoxide data set, the missingness rate increases from less than 1 per cent in the first week to as high as 24 per cent in the last week. As we investigated earlier using incomplete longitudinal modeling strategies [17], there was strong evidence suggesting that the missing values in this data set were not 'missing completely at random' (MCAR) [18]. In fact, it appears that the probability of missingness for the current data point is high if the previously observed values are high. This implies that the actual mean carbon monoxide levels would be higher than those shown in Figure 1 had there been no missing values, especially toward the end of the study. In other words, we argue that smokers who failed to maintain abstinence had a higher dropout probability than those who maintained abstinence well.

A more appropriate analytical alternative is to use longitudinal models as discussed by Diggle et al. [4]. For this data set, we could approximately divide the 12-week study period into two parts: the period of instilling tobacco abstinence (weeks 1–3) and the period of maintaining abstinence (weeks 4–12). Within each period, linear forms were assumed and a piecewise random-intercept mixed regression model was fitted [2]. The fitted model suggested that participants receiving CM had lower levels of carbon monoxide than those receiving no CM (T_4680 = 9.88 with p-value < 0.0001). This random-effects linear model assumed that the missing values were 'missing at random' (MAR), which is less restrictive than MCAR. It also accommodates covariates that are not categorical, such as age (Age), baseline carbon monoxide level (BaseCO), and the number of nicotine patches received during the study (Patches). BaseCO was not treated as the response variable at time zero because it was measured on a randomly chosen day during the two-week baseline period.

Although this linear mixed-effects model may provide a reasonable solution for the main analysis task, it does not fit the nonlinear form of the trajectories well, since it assumes simple piecewise linear segments. Moreover, the complicated covariance structure among repeated measures within and across subjects may not be characterized adequately by the longitudinal model, which usually seeks parsimony by assuming a simple parametric covariance structure. In the rest of this article, we discuss how to improve our analysis by reanalyzing the data using the strategy of FDA.

3. FUNCTIONAL LINEAR MODELS AND HYPOTHESIS TEST

Essentially the same as in other branches of statistics, the goal of FDA is either descriptive or inferential. Descriptive analysis aims to represent, display, and summarize data to explore interesting features or patterns, whereas inferential analysis aims to derive or evaluate predictor–outcome relationships. In many medical studies, the second goal is of primary interest, and functional linear models provide a useful tool for explaining the variation in a functional response variable by a set of inputs or independent variables.

3.1. The functional view of longitudinal data

In a longitudinal study with pre-fixed time grids, repeated measures (y_i1, y_i2, ..., y_iT) are collected on each subject i = 1, ..., n at times t = 1, ..., T. In longitudinal models, multivariate distributions are assumed for the vectors of measures, and the aim is to estimate the parameters characterizing these probability distributions.


From the perspective of FDA, each vector is viewed as a collection of discrete samples from a continuous curve {y_i(t): t ∈ [0, T], i = 1, ..., n}, and this curve is viewed as the unit and target of data analysis.

Therefore, the first step of FDA is to construct the underlying curves from the observed vectors. If the repeated measures are errorless, this is done simply by interpolation; otherwise, smoothing techniques are required. The most popular options for smoothing are linear smoothers, which estimate {y_i(t): t ∈ [0, T], i = 1, ..., n} by

$$y_i(t) = \sum_{j=1}^{T} S_{ij}(t)\, y_{ij}$$

This is a linear combination of the discrete observations using weights {S_ij(t): i = 1, ..., n; j = 1, ..., T}. An intuitive choice is to give higher weights to observations closer in time to t; this is the strategy of smoothing by local weights using various forms of kernel function [19].
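The linear smoother above can be sketched in a few lines of Python; the Gaussian kernel, the bandwidth value, and the placeholder data below are illustrative assumptions rather than choices made in the paper.

```python
import numpy as np

def kernel_smooth(t_grid, y, t_eval, bandwidth=2.0):
    """Linear smoother y_i(t) = sum_j S_ij(t) y_ij with Gaussian-kernel weights
    (a Nadaraya-Watson type smoother); bandwidth is an illustrative choice."""
    t_grid = np.asarray(t_grid, dtype=float)
    y = np.asarray(y, dtype=float)
    smoothed = np.empty(len(t_eval))
    for idx, t in enumerate(t_eval):
        w = np.exp(-0.5 * ((t_grid - t) / bandwidth) ** 2)   # higher weight near t
        smoothed[idx] = np.sum(w * y) / np.sum(w)            # weights S_ij(t) sum to 1
    return smoothed

# Example: smooth one subject's 36 repeated measures on the observation grid itself
t = np.arange(1, 37)
y = np.log1p(np.random.default_rng(4).gamma(5.0, 2.0, size=36))  # placeholder trajectory
y_smooth = kernel_smooth(t, y, t_eval=t, bandwidth=2.0)
```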

Alternatively, we can estimate y_i(t) by a linear combination of K known basis functions (e.g. Fourier, polynomial, spline, and wavelet bases),

$$y_i(t) = \sum_{k=1}^{K} c_{ik}\, \phi_k(t)$$

where the number K controls the degree of smoothness. The expansion coefficients c_ik (i = 1, ..., n; k = 1, ..., K) can be determined by minimizing the least-squares criterion Σ_{j=1}^{T} [y_ij − Σ_{k=1}^{K} c_ik φ_k(j)]². Ideally, we hope K is small, at least smaller than n, in order to obtain smoother curves and preserve adequate degrees of freedom for model fitting and hypothesis testing in the next step of FDA. A popular way of choosing a proper value of K is the roughness penalty approach [5].
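To make the expansion step concrete, the following Python sketch computes the least-squares expansion coefficients c_ik for a Fourier basis evaluated on a common time grid; the function names, the placeholder data, and the choice K = 5 are illustrative assumptions, not part of the paper's implementation.

```python
import numpy as np

def fourier_basis(t, K, T):
    """Evaluate K Fourier basis functions (constant plus sine/cosine pairs) at times t.
    K is assumed odd: 1 constant + R pairs, so K = 2R + 1."""
    R = (K - 1) // 2
    cols = [np.ones(len(t))]
    for r in range(1, R + 1):
        cols.append(np.sin(2 * np.pi * r * t / T))
        cols.append(np.cos(2 * np.pi * r * t / T))
    return np.column_stack(cols)                      # shape (len(t), K)

T, K = 36, 5
t = np.arange(1, T + 1)
Phi = fourier_basis(t, K, T)                          # T x K basis matrix
y_obs = np.random.default_rng(0).normal(size=(174, T))   # placeholder n x T repeated measures

# Least-squares coefficients c_ik: minimize sum_j [y_ij - sum_k c_ik phi_k(j)]^2
C, *_ = np.linalg.lstsq(Phi, y_obs.T, rcond=None)     # K x n coefficient matrix
y_smooth = (Phi @ C).T                                # smoothed curves evaluated on the grid
```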

To make the curves ready for the next step of analysis, other types of pre-processing may be necessary. For example, if the study design is observational with unbalanced and non-fixed time grids, we may need to align and register the curves using warping functions so that they conform to a common time grid. Once the data are ready for analysis, descriptive summary statistics can be naturally defined and calculated, e.g. mean, variance, and covariance functions. Advanced modeling schemes such as FPCA and canonical correlation analysis can be conducted too. For details of these methods, refer to the monographs of Ramsay and Silverman [5, 20]. In the rest of this section, we restrict our discussion to analyses based on functional linear models.

3.2. Functional linear regression model (FLRM)

As a natural extension of multivariate linear models, the functional linear model offers a useful tool for making causal inferences regarding the outcome–predictor relationship within the context of functional data. When y_ij is continuous and normally distributed, a convenient assumption for {y_i(t): t ∈ [0, T], i = 1, ..., n} is a stationary Gaussian process, and the functional linear model is defined as

$$y_i(t) = x_i'\,\beta(t) + \varepsilon_i(t)$$

Copyright q 2007 John Wiley & Sons, Ltd. Statist. Med. 2008; 27:845–863DOI: 10.1002/sim

Page 7: Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data

FUNCTIONAL LINEAR MODELS AFTER FOURIER AND WAVELET TRANSFORM 851

where β(t) = (β_1(t), ..., β_p(t))′ and x_i = (x_i1, ..., x_ip)′ is the vector of observations of the scalar predictor variables X_1, ..., X_p on subject i. These predictors are usually measured at baseline in a clinical trial. Each β_j(t) (j = 1, ..., p; t ∈ [0, T]) is a regression coefficient function, and ε_i(t) is a Gaussian-process error function with zero mean and unknown covariance function r(s, t) = cov(Y(s), Y(t)). This model degenerates to standard multiple linear regression when the time t is fixed. It is the time-varying nature of the regression coefficients that differentiates this model from classic linear regression models. In this article, we reserve the name functional linear regression model (FLRM) for this model [13, 21]. Other types of functional linear models are conceivable and appear in the literature, e.g. those with scalar outcome variables but functional predictors X_ij(t) [9], and those with functional outcome variables and functional predictors [22].

Let X = (x_1, ..., x_n)′ denote the design matrix, Y(t) = (y_1(t), ..., y_n(t))′ denote the vector of observed response values at time t for all subjects, and ε(t) = (ε_1(t), ..., ε_n(t))′. We can then represent the FLRM in matrix–vector form.

3.2.1. FLRM.

$$Y(t) = X\beta(t) + \varepsilon(t)$$

In general, we have two options to fit this model. The first is to conduct point-wise minimization of the least-squares criterion within the original time domain. Without restricting the way in which β(t) varies as a function of t, the least-squares criterion SSE(β(t)) = ∫ ||Y(t) − Xβ(t)||² dt can be minimized at each t. Assuming a general type of restriction on β (i.e. Lβ(t) = 0 with matrix L orthogonal to X), the least-squares estimate is

$$\hat{\beta}(t) = (X'X + \lambda L'L)^{-1} X' Y(t)$$

where λ is the Lagrange multiplier. One may also add a roughness or other type of penalty to the least-squares criterion to obtain smoother or regularized estimators, which take a similar form to the estimate above, except that λ > 0 is then a parameter controlling the amount of penalty [23, 24].

The second option for fitting an FLRM is via orthogonal expansion. As seen in Section 3.1, when K linearly independent basis functions φ(t) = (φ_1(t), ..., φ_K(t))′ are used to represent the curves {y_i(t): i = 1, ..., n, t ∈ [0, T]}, we have Y(t) = Cφ(t), where the (i, k)th (i = 1, ..., n; k = 1, ..., K) element of the matrix C is the coefficient of φ_k(t) in expanding y_i(t). Using the same set of basis functions to expand the regression coefficient functions, we have β(t) = Bφ(t), where B is a p × K matrix of expansion coefficients. Similarly, the error vector ε(t) is expanded as ε(t) = Eφ(t) with coefficients in matrix E. By doing so, we transform the FLRM into a multivariate linear regression model within the new space spanned by φ(t) (t ∈ [0, T]), i.e.

3.2.2. FLRM∗.

$$C = XB + E$$

For convenience, we label this model FLRM∗. We can solve the system of linear equations (X′X + λL′L)B = X′C to obtain

$$\hat{B} = (X'X + \lambda L'L)^{-1} X' C$$

which is the least-squares estimator with restrictions or penalties.
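As an illustration of the estimator above, here is a minimal Python sketch of the (penalized) least-squares fit of FLRM∗; the function name, the random design matrix, and the unpenalized default are assumptions made for the example, not the authors' implementation.

```python
import numpy as np

def fit_flrm_star(X, C, L=None, lam=0.0):
    """Estimate B in C = X B + E by (penalized) least squares:
    B_hat = (X'X + lam * L'L)^{-1} X'C."""
    XtX = X.T @ X
    if L is not None and lam > 0:
        XtX = XtX + lam * (L.T @ L)          # optional restriction/penalty term
    return np.linalg.solve(XtX, X.T @ C)     # p x K matrix of estimated coefficients

# Example: n subjects, p predictors, K transform-domain coefficients per subject
rng = np.random.default_rng(1)
n, p, K = 174, 7, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])   # design matrix with intercept
C = rng.normal(size=(n, K))                                       # e.g. Fourier or wavelet coefficients
B_hat = fit_flrm_star(X, C)
```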


3.3. Testing hypothesis in FLRM

Within the framework of functional regression analysis, the typical null hypothesis in significance testing has the form

$$H_0: \beta_j(t) = 0 \quad (t \in [0, T])$$

where 1 ≤ j ≤ p. Using the smoking cessation study as an example, if x_j is the dummy variable for the treatment named CM, the null hypothesis is equivalent to saying that there is no treatment effect of CM during the study. In many cases, hypothesis testing is conducted by comparing two nested FLRMs: the full model (Ω) with x_j included and the reduced model (ω) with x_j excluded. Not many satisfactory solutions are available for this functional model comparison.

If the models Ω and ω are fitted within the original time domain, the simplest method is to examine the point-wise F statistic at each t,

$$F(t) = \frac{(\mathrm{RSS}_{\omega}(t) - \mathrm{RSS}_{\Omega}(t))/(p - q)}{\mathrm{RSS}_{\Omega}(t)/(n - p)}$$

where dim(Ω) = p, dim(ω) = q, RSS_Ω(t) = Σ_i (y_i(t) − ŷ_i^Ω(t))², and RSS_ω(t) = Σ_i (y_i(t) − ŷ_i^ω(t))², with ŷ_i^Ω(t) (or ŷ_i^ω(t)) denoting the predicted value from Ω (or ω). This method carries the serious problem of multiple comparison. To overcome this issue, computation-intensive methods were proposed based on either permutation [5] or the bootstrap [21]. Later, Shen and Faraway [13] generalized the point-wise F statistic to the global one:

$$F = \frac{(\mathrm{rss}_{\omega} - \mathrm{rss}_{\Omega})/(p - q)}{\mathrm{rss}_{\Omega}/(n - p)}$$

where rss_Ω = Σ_i ∫ (y_i(t) − ŷ_i^Ω(t))² dt and rss_ω = Σ_i ∫ (y_i(t) − ŷ_i^ω(t))² dt. The asymptotic distribution of this statistic is not an F distribution, but it can be approximated by one with adjusted degrees of freedom. This method was applied to our smoking cessation data set [14].
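For readers who prefer code to formulas, the following Python sketch approximates the global F statistic by replacing the integral over t with a sum over the observed grid; it is a simplified stand-in under that assumption, not the approximation of the null distribution developed in [13].

```python
import numpy as np

def global_f_statistic(Y, X_full, X_reduced):
    """Global functional F statistic, with the integral over t approximated by
    a sum over the common observation grid. Y: n x T matrix of response curves."""
    n, _ = Y.shape
    p, q = X_full.shape[1], X_reduced.shape[1]

    def rss(X):
        # ordinary least squares fit at every grid point simultaneously
        B, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ B
        return np.sum(resid ** 2)            # sum over subjects and time points

    rss_full, rss_red = rss(X_full), rss(X_reduced)
    return ((rss_red - rss_full) / (p - q)) / (rss_full / (n - p))
```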

Alternatively, we can define and test the null hypothesis using FLRM∗ within the space spanned by the basis functions, i.e.

$$H_0^*: b_j(k) = 0 \quad (k = 1, \ldots, K)$$

where j = 1, ..., p and b_j = (b_j(1), ..., b_j(K))′ is the jth row of the coefficient matrix B in FLRM∗. Testing this null hypothesis is equivalent to testing the one in the original domain. Since FLRM∗ is basically a multivariate model of dimension K, which is usually much smaller than T (the number of repeated measures), the testing task becomes simpler. Certain basis functions possess useful features that can further simplify this task. In the following, we discuss how to conduct the above hypothesis test with Fourier and wavelet basis functions.

3.4. Adaptive Neyman test in the Fourier domain

Perhaps the most well-known basis expansion is provided by the Fourier series, which has been widely used in engineering and the physical sciences [25, 26]. With the Fourier transform, the function y_i(t) can be expressed in the frequency domain as a linear combination of sine and cosine functions with different frequencies:


$$y_i(t) \approx c_{i1} + c_{i2}\sin(2\pi t/T) + c_{i3}\cos(2\pi t/T) + \cdots + c_{i,K-1}\sin(2\pi R t/T) + c_{i,K}\cos(2\pi R t/T)$$

where K = 2R + 1 and the basis functions are φ_1(t) = 1, φ_2r(t) = sin(2πrt/T), and φ_2r+1(t) = cos(2πrt/T) (r = 1, ..., R). An appealing feature of this basis is that the derivatives of the represented curves are still smooth because the basis functions are always differentiable. The basis is often used for dimension reduction by choosing K ≪ T; the Fourier transform compresses the original signal into the low-frequency coefficients after discarding the high-frequency components, which usually represent noise. Another useful feature of the Fourier transform is that the temporal correlation within the time domain can be eliminated after the transformation, and the frequency coefficients b_j(k) are independent of each other with Gaussian distributions under mild conditions.

Determined by the nature of the Fourier basis, the absolute values of b_j(1), ..., b_j(m) should be much larger than those of b_j(m+1), ..., b_j(K), where m is usually much smaller than K. Therefore, when testing H*_0, it suffices to test whether the first m elements are equal to zero. The adaptive Neyman statistic [15, 16] provides an ideal instrument for such a test. Within the frequency domain, three steps are required to perform an adaptive Neyman test. First, an estimation method (e.g. least squares) is used to obtain point estimates of (b_j(1), ..., b_j(K)) and their standard errors for the regression coefficients in FLRM∗. Then, the adaptive Neyman statistic is calculated from the standardized estimates of the regression coefficients, i.e.

$$T^*_{\mathrm{AN}} = \max_{1 \le m \le K} \left\{ (\sqrt{2m})^{-1} \sum_{k=1}^{m} \left[ \left( \frac{\hat{b}_j(k)}{\mathrm{SE}(\hat{b}_j(k))} \right)^2 - 1 \right] \right\}$$

which is the maximum over K partial sums, each summing the deviations from zero of the first m standardized elements (m = 1, ..., K). Finally, the p-value for this test is obtained using either the asymptotic or the simulated distribution of T*_AN. As seen in Nie [27], when K is large, one can obtain the p-value from the asymptotic double-exponential distribution

$$P\{T_{\mathrm{AN}} \le x\} \to \exp\{-\exp(-x)\}$$

where $T_{\mathrm{AN}} = \sqrt{2\log\log T}\; T^*_{\mathrm{AN}} - 2\log\log T - 0.5\log\log\log T + 0.5\log(4\pi)$. The essential idea of the proof can be seen in Fan and Lin [15] for comparing groups of curves within the framework of HANOVA. When K is small, one can use the distribution table derived from simulation studies by Fan and Lin [15] to obtain p-values.
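A minimal Python sketch of this computation is given below; it takes the standardized coefficient estimates as input, uses the number of coefficients as the dimension in the normalization, and applies the asymptotic double-exponential approximation, which, as noted above, is only reasonable when K is fairly large (for small K, the simulated tables of Fan and Lin are used instead). The function name and interface are illustrative assumptions.

```python
import numpy as np

def adaptive_neyman(b_hat, se):
    """Adaptive Neyman statistic T*_AN for one coefficient function, computed from
    the standardized transform-domain estimates, with an asymptotic p-value."""
    z2 = (np.asarray(b_hat, dtype=float) / np.asarray(se, dtype=float)) ** 2
    K = z2.size
    partial = np.cumsum(z2 - 1.0)                 # sum_{k<=m} (z_k^2 - 1)
    m = np.arange(1, K + 1)
    t_star = np.max(partial / np.sqrt(2.0 * m))   # maximize over m = 1, ..., K

    ll = np.log(np.log(K))                        # requires K >= 3; dimension taken as K here
    t_an = (np.sqrt(2.0 * ll) * t_star - 2.0 * ll
            - 0.5 * np.log(ll) + 0.5 * np.log(4.0 * np.pi))
    p_value = 1.0 - np.exp(-np.exp(-t_an))        # P(T_AN > observed) under H0, asymptotically
    return t_star, p_value
```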

3.5. Thresholding test after wavelet transform

Wavelets provide another family of basis functions for the space of square-integrable functions. By choosing a suitable mother wavelet function ψ(t), the basis functions can be constructed by dilation and translation, i.e.

$$\psi_{jk}(t) = 2^{j/2}\,\psi(2^j t - k) \quad (j = 1, 2, \ldots;\ k = 1, 2, \ldots)$$


The term 'wavelet' refers to a localized wave-like function with zero mean and compact support. The wavelet transform provides a tool for multi-resolution analysis in the sense that ψ_jk yields local information about y_i(t) near position 2^(−j)k on scale 2^(−j). In other words, wavelets provide a sequence of degrees of locality, whereas the Fourier representation only provides global information at different frequencies. The wavelet transform therefore has better resolution when the original signal has discontinuities or sharp aberrations. Wavelets are often used for dimension reduction too: one can apply the wavelet transform to a signal curve and then set to zero all wavelet coefficients smaller than a threshold level. The original signal can then be reconstructed from the wavelets with large coefficients by the inverse wavelet transform. More details on the wavelet transform can be found in [28, 29].

When wavelet basis functions are used, we obtain FLRM∗ in the wavelet domain. By carefully choosing K wavelets (usually the first K in the order φ, ψ_10, ψ_11, ...), we can estimate B and b_j (j = 1, ..., p) using least-squares estimation with or without penalty. For many practical curves, the large coefficients among (b_j(1), ..., b_j(K)) fall on only a few components. To take advantage of this feature of the wavelet transform, we can test H*_0 using the so-called wavelet thresholding statistic

$$T^*_{\mathrm{H}} = \sum_{k=1}^{K} \left\{ \left( \frac{\hat{b}_j(k)}{\mathrm{SE}(\hat{b}_j(k))} \right)^2 I\left( \left| \frac{\hat{b}_j(k)}{\mathrm{SE}(\hat{b}_j(k))} \right| > \delta \right) \right\}$$

where the cutoff level δ > 0 is called the thresholding parameter and I(·) is the indicator function. It has been shown that T_H = σ^(−1)(T*_H − μ) has an asymptotic standard normal distribution [16], and the null hypothesis H*_0 is rejected at level α when

$$T_{\mathrm{H}} = \sigma^{-1}(T^*_{\mathrm{H}} - \mu) > z_{1-\alpha}$$

where $\delta = \sqrt{2\log(K a_T)}$ for some sequence $a_T \to 0$ (as $K \to \infty$) at a logarithmic rate, $\mu = \sqrt{2/\pi}\, a_T^{-1}\, \delta\,(1 + \delta^{-2})$, and $\sigma^2 = \sqrt{2/\pi}\, a_T^{-1}\, \delta^3\,(1 + 3\delta^{-2})$. The choice of δ involves a trade-off: if δ is too large, the thresholding statistic filters out all the important coefficients; if it is too small, it accumulates too much noise. For details on choosing values of a_T and the related theory of wavelet thresholding, refer to [15, 16, 30].
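The following Python sketch computes T*_H and its standardized version T_H from standardized coefficient estimates; the tuning constant a_T is left as an explicit argument because its choice is application specific, and the product K·a_T must exceed 1 for the threshold δ to be defined. The function name and interface are assumptions made for illustration.

```python
import numpy as np

def thresholding_test(b_hat, se, a_T):
    """Thresholding statistic T*_H and its standardized version T_H (Section 3.5).
    Requires K * a_T > 1 so that delta = sqrt(2 log(K * a_T)) is well defined."""
    z = np.asarray(b_hat, dtype=float) / np.asarray(se, dtype=float)
    K = z.size
    delta = np.sqrt(2.0 * np.log(K * a_T))          # thresholding parameter
    t_star = np.sum(z ** 2 * (np.abs(z) > delta))   # keep only coefficients above the threshold
    mu = np.sqrt(2.0 / np.pi) * delta * (1.0 + delta ** -2) / a_T
    sigma = np.sqrt(np.sqrt(2.0 / np.pi) * delta ** 3 * (1.0 + 3.0 * delta ** -2) / a_T)
    t_h = (t_star - mu) / sigma                     # asymptotically N(0, 1) under H0
    return t_star, t_h
```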

4. APPLICATION

In this section, we revisit the carbon monoxide data to illustrate the hypothesis testing strategies based on Neyman truncation and wavelet thresholding, within a multiple imputation framework for dealing with missing values.

4.1. Inference based on multiple imputation for handling missing values

About 20 per cent of the carbon monoxide levels were missing due to participants' intermittent missed clinic visits or early withdrawal. To handle this problem, the method of multiple imputation [31] was applied. After the log(1 + y) transform, we treated the repeated measures (y_i1, ..., y_iT)′ as multivariate Gaussian, i.e. y_i ∼ N(μ, Σ).


By specifying a normal prior distribution for the mean vector (i.e. μ | Σ ∼ N(μ_0, τ^(−1)Σ)) and an inverted Wishart distribution for the covariance matrix (i.e. Σ ∼ W^(−1)(r, Λ)), we drew multiple imputations using the iterative data augmentation algorithm [32], as implemented in the Missing Data Analysis library of S-PLUS [33]. This algorithm consists of two steps per iteration. In the imputation step, the missing values y_i^mis are drawn conditionally on the observed values y_i^obs from the posterior predictive distribution y_i^mis | y_i^obs ∼ N(μ_mis|obs, Σ_mis|obs), with μ_mis|obs and Σ_mis|obs derived from regressing y_i^mis on y_i^obs; in the posterior step, new parameters μ and Σ are drawn using the complete data with the current imputed values. Since there was no prior information at hand, we adopted the non-informative form of the normal-inverted-Wishart prior distribution. The data augmentation process was initialized from the maximum likelihood estimates of μ and Σ obtained by the EM algorithm. Several diagnostic tools suggested that the process converged within 200 iterations, and we ran an additional 2000 iterations to generate five sets of imputations at intervals of 500 iterations. As seen later, our analyses suggested that the between-imputation variation for most parameters was relatively low. Therefore, the fraction of missing information should be fairly low, and the efficiency of a parameter estimate based on five imputations would be very high. Rubin [31] shows that the efficiency of an estimate based on m imputations is approximately (1 + γ/m)^(−1), where γ is the rate of missing information for the quantity being estimated. If γ ≤ 10 per cent, the efficiency is higher than 98 per cent.

For each imputed data set, we performed FDA using either FLRM or FLRM∗, and then made inferences by summarizing the multiple analysis results. When the ultimate analysis goal is to estimate a parameter, the multiple versions of the estimator can be combined explicitly using Rubin's rule [31], which takes into account the within-imputation and between-imputation variances. Nonetheless, when hypothesis testing is conducted on each data set, the p-values or test statistics cannot be combined easily. To address this difficulty, we propose a strategy based on the idea of bootstrap resampling [34]. First, we pooled all five imputed data sets together. Then, we randomly redrew with replacement a large number (e.g. S = 200) of subsets of this pooled data set, each with a sample size of N = 174 as in the original data. Finally, we applied the adaptive Neyman test or the wavelet thresholding test to each resampled data set and counted the fraction of acceptances among the S tests. Such an acceptance rate can be viewed approximately as a generalized p-value for making final inferences.
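A rough Python sketch of this pool-and-resample strategy is shown below; the interface is hypothetical (each imputed data set is assumed to be an array with one row per subject carrying that subject's covariates and responses, and test_fn is assumed to return a p-value for a single data set), so it indicates the flow of the procedure rather than reproducing the authors' implementation.

```python
import numpy as np

def pooled_resampling_acceptance(imputed_sets, test_fn, n_subjects=174,
                                 S=200, alpha=0.05, seed=0):
    """Pool the imputed data sets, redraw S bootstrap samples of the original size,
    apply the test to each, and report the acceptance rate among the S tests."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate(imputed_sets, axis=0)   # stack the m imputed data sets row-wise
    rejections = 0
    for _ in range(S):
        idx = rng.integers(0, pooled.shape[0], size=n_subjects)  # sample subjects with replacement
        rejections += test_fn(pooled[idx]) < alpha
    return 1.0 - rejections / S                     # fraction of acceptances across the S tests
```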

4.2. FLRMs with F-test in the original time domain

For each imputed data set, we fitted an FLRM in the time domain using least-squares estimation without smoothing or restriction. Seven predictors were included in the full model (Ω): an intercept term (X_1 = 1), dummy variables for RP (X_2 = 1 or 0, indicating RP or not) and CM (X_3 = 1 or 0, indicating CM or not), their interaction (X_4 = X_2 × X_3), baseline carbon monoxide level (X_5 = BaseCO), age (X_6 = Age), and the number of nicotine patches received during treatment (X_7 = Patches). As seen in Yang et al. [14], the estimated regression coefficients β_j(t) (j = 2, ..., 6) looked similar across the imputed data sets, again suggesting that the fraction of missing information is small. After comparing Ω with the nested model ω (with X_2, X_4, and X_6 excluded from Ω) using the global version of the F test, it was found that the remaining covariates were significant in predicting the trajectories of carbon monoxide levels. Both the estimate of β_3(t) and the functional F test of H_0: β_3(t) = 0 strongly supported a favorable efficacy of the CM treatment (refer to Yang et al. [14] for more details). As our simulation studies indicated, this functional F-test does not have good power when the sorted eigenvalues of the error process decrease quickly.


Unfortunately, this may be the case for the carbon monoxide data. In the following subsections, we reanalyze the same data based on the Fourier or wavelet transform.

4.3. Adaptive Neyman test in the Fourier domain

An alternative way to test β_j(t) = 0 (t ∈ [0, T]) is to test b_j(k) = 0 (k = 1, ..., K) in a transformed domain. When K is large enough, the model FLRM∗ represents the model FLRM satisfactorily; thus, tests on the parameters of FLRM∗ can be used to make inferences about the parameters of the FLRM. On the other hand, for the purpose of dimension reduction, K should still be smaller than T (the number of repeated measures per person). For each imputed data set, residual analysis of the fitted FLRMs Ω and ω provided little evidence against the assumption of a stationary Gaussian process for the error terms ε_i(t). Therefore, the adaptive Neyman test within the Fourier domain could be applied.

The first step in fitting an FLRM∗ is to smooth the original trajectories of carbon monoxide levels via the Fourier basis {1, sin(2πt/T), cos(2πt/T), ..., sin(2πRt/T), cos(2πRt/T)}. In practice, the repeated measures y_i(t) (t = 1, ..., T) on the ith subject are discrete samples and thus should be transformed into the frequency domain via the discrete Fourier transform (DFT) [26]. Assuming that the trajectory outside the range [0, T] is extended T-periodically (i.e. y_i(t) = y_i(t + T)), the DFT is defined as

$$c_i(k) = \frac{1}{T} \sum_{t=1}^{T} y_i(t)\, e^{-\mathrm{j}(k-1)(t-1)2\pi/T}$$

where j is the square root of −1, e^(jθ) = cos(θ) + j sin(θ), and c_i(k) is the kth Fourier coefficient for subject i. Since the y_i(t) are real numbers (rather than complex ones), the DFT yields a coefficient vector (C_i(1), ..., C_i(T)) that is symmetric about the Nyquist frequency, i.e. Im(C_i(1)) = 0, Re(C_i(k)) = Re(C_i(T + 1 − k)), and Im(C_i(k)) = −Im(C_i(T + 1 − k)). Here, Re(·) and Im(·) denote the operations of extracting the real and imaginary parts of a complex number.
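The following Python sketch (using numpy rather than the S-PLUS FDA library used in the paper) computes the scaled DFT above and extracts the low-frequency real and imaginary parts in the order used later for model fitting; the function name and placeholder data are illustrative assumptions.

```python
import numpy as np

def low_frequency_features(y, n_freq=3):
    """Scaled DFT of one subject's trajectory, returning
    (Re c(1), Re c(2), Im c(2), ..., Re c(n_freq), Im c(n_freq))."""
    T = len(y)
    c = np.fft.fft(np.asarray(y, dtype=float)) / T   # c[k-1] = (1/T) sum_t y(t) e^{-j(k-1)(t-1)2pi/T}
    feats = [c[0].real]                               # Im c(1) = 0 for real-valued data
    for k in range(1, n_freq):
        feats.extend([c[k].real, c[k].imag])
    return np.array(feats)                            # 2*n_freq - 1 features, e.g. 5 when n_freq = 3

# Example: with T = 36 visits, n_freq = 3 yields the five low-frequency coefficients
co_trajectory = np.log1p(np.random.default_rng(2).gamma(5.0, 2.0, size=36))  # placeholder data
features = low_frequency_features(co_trajectory, n_freq=3)
```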

For all the imputed carbon monoxide data sets, it was found that the DFT with K = 5 was sufficient to represent the original trajectories. When these trajectories were viewed as signals in the time domain, more than 95 per cent of their energy could be explained by the first five low-frequency components (i.e. C_i(1), ..., C_i(5)). It was also seen that the testing results based on the adaptive Neyman statistics would not change much if K were enlarged further. For each imputed data set, the DFT was conducted using the FDA library of Ramsay and Silverman in S-PLUS [20]. The smoothed curves for all individuals are depicted in Figure 2 for the first imputed data set; they look very similar for the other imputed data sets.

The adaptive Neyman test was then conducted in the frequency domain by working with the Fourier coefficients C_i(k) (i = 1, ..., 174; k = 1, ..., 5) corresponding to each imputed data set. These coefficients were output from low to high frequencies in an array in which the real and imaginary parts alternate. Owing to the symmetry, only the top five coefficients (Re(C_i(1)), Re(C_i(2)), Im(C_i(2)), Re(C_i(3)), Im(C_i(3))) were used in fitting FLRM∗. With all the predictors of interest (X_1–X_7) included in the model, the least-squares method without penalty was used to obtain the estimated regression coefficients b_j(k) (j = 1, ..., 7; k = 1, ..., 5). The adaptive Neyman statistics were then calculated for testing the effect of each predictor.


Figure 2. Mean levels and smoothed carbon monoxide levels using the Fourier basis functions. For each plot, the y-axis indicates the log(1 + y) transform of the original carbon monoxide level (p.p.m.) and the x-axis indicates the clinic visit number for study participants (1, ..., 36). Both individual profiles and the mean profile are plotted for each of the four treatment conditions: control, RP-only, CM-only, and RP+CM (RP, relapse prevention; CM, contingency management). These plots were drawn using the first imputed data set; plots for the other imputed data sets look similar.

Specifically, for each predictor X_j, the adaptive Neyman statistic was

$$T^*_{\mathrm{AN}}(j) = \max_{1 \le m \le 5} \left\{ (\sqrt{2m})^{-1} \sum_{k=1}^{m} \left[ \left( \frac{\hat{b}_j(k)}{\mathrm{SE}(\hat{b}_j(k))} \right)^2 - 1 \right] \right\}$$

Since K = 5 is small, we derived the p-values from the simulated tables given by Fan and Lin [15]. For all five imputed data sets, the test statistics and p-values for the seven predictors are listed in Table I. To make overall inferences across imputations, the bootstrap resampling strategy was applied with S = 200, and the acceptance rates support conclusions consistent with those in Table I.


Table I. Estimated adaptive Neyman statistics for testing the regression coefficients in the model FLRM*.

Imputed data set   Intercept   RP     CM      RP*CM   BaseCO   Age    Patches
1                  134.13      0.46   10.62   1.12    21.55    2.59   32.56
2                   97.58      0.44    8.78   0.84    20.32    2.23   13.00
3                  132.73      0.96   11.36   0.55    24.76    2.57   25.51
4                  113.57      1.62    8.45   1.20    19.47    1.44   17.49
5                  115.41      0.31    8.34   1.79    18.50    2.35   14.31

Note: The critical value of the adaptive Neyman test at significance level 0.05 is 3.50 [11]. In the original table, the shaded cells correspond to adaptive Neyman tests with p-values < 0.05.

Both the individual and combined tests consistently suggest that CM, BaseCO, and Patches are significant predictors, with p-values smaller than 0.05.

The adaptive Neyman test offered a good solution for the functional analysis of the carbon monoxide data mainly because the Fourier transform reduced the dimensionality from T = 36 to K = 5 and diluted the temporal correlation between intra-subject repeated measures. By doing so, we saved computation time and imputation effort. Nonetheless, this saving came at the price of losing the simplicity of clinical interpretation in the original domain. It was challenging to explain the test statistics and testing results in the frequency domain to our clinical investigators. For the purpose of interpretation, we had to go back to the original domain by applying the inverse Fourier transform to reconstruct the FLRMs. This reverse procedure turned out to be approximately the procedure of fitting another group of FLRMs to the smoothed curves y_i(t) (i = 1, ..., 174) shown in Figure 2.

By regressing the smoothed curves on all the predictors using least-squares estimation, we obtained β̂_j(t) = (X′X)^(−1)X′Y(t), which are depicted in Figure 3 for the first imputed data set. Since Y(t) = (y_1(t), ..., y_174(t))′ is a vector of smooth curves in the time domain, the β̂_j(t) are smooth too. One can compare the smoothed plots in this figure with the non-smoothed ones in Yang et al. [14] to verify that they look very similar. The plot of β̂_j(t) provides an intuitive visual tool for displaying the time-varying treatment effect of predictor X_j. As seen in Figure 3, the fitted coefficient functions of X_2, X_4, and X_6 are close to the zero line, indicating that the treatment effect of RP, the interaction effect between RP and CM, and the effect of age are approximately negligible. The regression coefficient functions for CM and Patches are mostly negative throughout the study period, indicating favorable effects of CM and nicotine patch replacement. The coefficient function β̂_5(t) takes positive values throughout the study period, implying that the higher the baseline carbon monoxide level, the more difficult it is to achieve abstinence.

4.4. Wavelet thresholding test

For each imputed data set, we conducted the wavelet thresholding test in the following steps. First, the discrete wavelet transform [28, 29] was applied using the pyramid algorithm of Mallat [35], as implemented in the R library waveslim. We used the Daubechies orthogonal compactly supported wavelet basis of length L = 8 [29]. This transform was applied to each subject's vector of repeated measures (y_i1, ..., y_i36). The function dwt.nondyadic() was used because T = 36 is not a power of 2; the boundary condition was set to 'periodic', the default option of the function; and the depth of the decomposition was 4, the largest integer smaller than log_2 36.


Figure 3. Estimated regression coefficient functions in the functional regression model using smoothed curves. After fitting the functional regression model FLRM to the smoothed carbon monoxide curves by least squares, these plots depict the regression coefficient functions corresponding to the predictor variables: dummy variables for the two treatment conditions (RP and CM), the interaction term (RP*CM), baseline carbon monoxide level (BaseCO), smoker's age (Age), and the number of nicotine patches a smoker received during the study (Patches). The y-axis indicates the estimated values of the regression coefficients and the x-axis indicates the clinic visit number for each smoker (1, ..., 36).

After this step, we obtained a set of wavelet coefficients (c_i1, ..., c_i35) for each subject, ordered first by dilation and then by translation.
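For readers working in Python rather than R, the sketch below performs a comparable multi-level discrete wavelet transform with PyWavelets; it uses the db4 filter (a Daubechies filter of length 8) with periodized boundaries and lets the library choose the decomposition depth, so the resulting coefficients are analogous to, but not identical with, those produced by waveslim's dwt.nondyadic() in the paper.

```python
import numpy as np
import pywt  # PyWavelets, used here as a stand-in for the R waveslim routine in the paper

def wavelet_coefficients(y, wavelet="db4", level=None):
    """Decompose one subject's trajectory into wavelet coefficients ordered by scale;
    level=None lets PyWavelets pick the maximal useful decomposition depth."""
    coeffs = pywt.wavedec(np.asarray(y, dtype=float), wavelet,
                          mode="periodization", level=level)
    return np.concatenate(coeffs)   # smooth coefficients first, then detail coefficients by scale

# Example: one trajectory of T = 36 repeated measures
y = np.random.default_rng(3).normal(size=36)          # placeholder trajectory
w = wavelet_coefficients(y)                           # length depends on T, filter, and boundary handling
```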

In the next step, we fitted the multivariate linear model FLRM∗ by regressing the wavelet coefficients on the set of predictors of interest, X_1–X_7. Using the least-squares method, we estimated the regression coefficients {b_j(1), ..., b_j(35); j = 1, ..., 7}. Because the thresholding statistic summarizes all 35 coefficients, the ordering of the wavelet coefficients when fitting FLRM∗ does not really matter.

In the last step, we calculated the thresholding statistic T_H for each predictor variable using the formula given in Section 3.5, with n = 174, a_T = log^(−2.5)(n), δ = √(2 log(35 a_T)), μ = √(2/π) a_T^(−1) δ(1 + δ^(−2)), and σ² = √(2/π) a_T^(−1) δ³(1 + 3δ^(−2)). The estimated thresholding statistics are summarized in Table II for each imputed data set. Using the asymptotic standard normal distribution, the testing results suggested that X_1, X_3, X_5, and X_7 were significant factors for predicting the wavelet coefficients, and hence the carbon monoxide levels in the original domain.


Table II. Estimated thresholding statistics for testing the regression coefficients in the model FLRM*.

Imputed data set   Intercept   RP      CM      RP*CM   BaseCO   Age     Patches
1                  192.65       0.84   16.04    0.65   34.16     1.47   36.73
2                  139.65       0.50   11.70    0.29   28.40     1.04   17.10
3                  189.46       0.36   16.79    1.83   31.48     0.20   51.13
4                  173.38      −1.36   12.68   −0.53   26.95    −0.18   39.51
5                  158.37      −0.19   12.02   −0.75   28.50    −0.38   21.79

Note: The critical value of the thresholding test at significance level 0.05 is 1.96 [11]. In the original table, the shaded cells correspond to thresholding tests with p-values < 0.05.

Again, the bootstrap resampling strategy with S = 200 was applied to make an overall thresholding test after combining the five imputed data sets. This all-in-one test suggests the same results as the individual tests in Table II, which are also consistent with the conclusions supported by the earlier adaptive Neyman test.

As found by Fan [16], the power of the thresholding test can be improved if δ is replaced by the hard-thresholding parameter δ_H = √(2 log(T a_T)) with a_T = min(4 b_max^(−4), log^(−2) T), where b_max = max_{1≤k≤T} |b_j(k)/SE(b_j(k))|. The analysis was redone using this hard-thresholding parameter and ended with similar testing results.

5. DISCUSSION

This paper illustrates how to perform FDA for longitudinal data via the method of orthogonal basis expansion. More specifically, the Fourier-based adaptive Neyman test and the wavelet thresholding test were introduced to evaluate the effect of a continuous or categorical factor in predicting the trajectory of repeated measures. The carbon monoxide levels from a phase-II smoking cessation clinical trial were analyzed with both longitudinal models and functional regression models. Both analyses consistently supported the conclusion that smokers receiving CM provided breath samples with lower carbon monoxide levels than smokers receiving other therapies. It was also found that smokers who received larger numbers of nicotine patches during treatment, or who had lower carbon monoxide levels at baseline, showed significantly better tobacco abstinence during the study period.

The idea of the adaptive Neyman test and the thresholding test was first proposed by Fan and Lin [15] to test differences between groups of curves. When comparing curves between two groups, they translated the curves into the Fourier or wavelet domain and then applied the adaptive Neyman or thresholding test. When the number of groups is larger than two, they generalized the two-group comparison to HANOVA, just as the t-test extends to ANOVA. In this article, we further generalized HANOVA to the setting of the FLRM, a functional form of the linear model. The parallel model FLRM∗ was used to analyze the carbon monoxide data set within the Fourier or wavelet domain. For comparison purposes, Table III lists the results of HANOVA using the adaptive Neyman test for the same group of imputed data sets. The HANOVA testing results jointly support conclusions consistent with those inferred from FLRM∗ in comparing the treatment effects among the four groups.


Table III. Estimated adaptive Neyman statistics for pair-wise comparisons in HANOVA.

                            Imputed data set
Comparison                  1              2              3              4              5
Control versus RP-only      0.24           0.26           0.57           1.23           0.23
Control versus CM-only      5.71 (0.005)   5.00 (0.010)   6.83 (0.003)   6.15 (0.005)   4.96 (0.010)
Control versus RP+CM        6.57 (0.003)   6.54 (0.003)   5.51 (0.005)   4.84 (0.010)   6.01 (0.005)
RP-only versus CM-only      5.63 (0.005)   6.30 (0.0025)  5.37 (0.01)    7.91 (0.001)   4.83 (0.010)
RP-only versus RP+CM        6.54 (0.003)   8.27 (0.0005)  4.23 (0.025)   6.38 (0.003)   6.00 (0.005)
CM-only versus RP+CM        0.80           0.38           0.59           0.98           1.75
Overall                     6.58 (0.003)   7.16 (0.003)   5.88 (0.005)   6.86 (0.003)   5.84 (0.005)

Note: The critical value of the adaptive Neyman test at significance level 0.05 is 1.96 [11]. In the original table, the shaded cells correspond to adaptive Neyman tests with p-values < 0.05. P-values are shown in parentheses.

The obvious advantage of the functional regression model is that numerical covariates such as age, baseline carbon monoxide level, and number of nicotine patches can contribute to the analysis.

In practice, a complete FDA procedure consists of three components: data exploration, model fitting, and inference making. By conducting data pre-processing and exploration with various graphical tools, we can gain a rough understanding of the distribution of the data or curves. For continuous response outcomes, FLRMs can be fitted using techniques such as least-squares estimation with or without various regularization schemes. For binary or categorical repeated measures, generalized functional regression models could be envisioned. Although we restricted our discussion mostly to repeated measures with a balanced design in this article, it is feasible to reconstruct the data to obtain estimates of y_i(t) over a common grid {t_j; j = 1, ..., T}. Various smoothing techniques can be used for this task, e.g. model-based cross-validation [36], kernel-based or spline-based non-parametric regression [37], or the outlier-robust method LOWESS [38]. The choice among smoothing techniques should not be crucial. When making inferences, various forms of hypotheses can be tested using the adaptive Neyman or thresholding test statistics; other testing methods such as the point-wise or global F-test could be applied as well.

With the advent of modern information technologies, repeated measures are collected at higher frequencies over longer periods of time. For such data with a large-scale time grid, traditional vector-based LDA may not be adequate, and the FDA view of LDA becomes appealing. First, it is more robust than standard longitudinal models, requiring fewer assumptions on the mean and correlation structures. Second, by making full use of the data along with dimension-reduction schemes, it achieves high statistical power, handles missing values and unbalanced time grids, and computes quickly. Third, the FLRM provides an intuitive tool for illustrating time-varying treatment effects and for formally testing an overall difference.

On the other hand, the method has some limitations. Perhaps the most notable one concerns the statistical interpretation of the test statistics from FLRM∗ within the transformed domain. Because the regression coefficients of an FLRM∗ are associated with frequencies in the Fourier domain or with scale and location parameters in the wavelet domain, they are not easy to explain to medical researchers, who are usually non-statisticians. As recommended by an anonymous reviewer, one solution is to transform the b_j(k)'s obtained from the thresholding or Neyman truncation procedures back to β_j(t)'s, which lie in the original time domain and can be plotted as in Figure 3. A similar idea appears in Morris et al. [39]. Methods for constructing 95 per cent confidence intervals or bands for the Neyman truncation and wavelet thresholding tests are currently under development.
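The back-transformation suggested above can be carried out with a standard discrete wavelet toolkit. The sketch below, written with the PyWavelets package purely for illustration, hard-thresholds a set of wavelet-domain coefficients and inverts the transform to obtain a time-domain coefficient curve; the threshold rule, wavelet choice, and all names are assumptions of this sketch rather than the exact procedure used in the paper.

import numpy as np
import pywt

def beta_time_domain(beta_wavelet, sigma, wavelet='db4'):
    # beta_wavelet: wavelet-domain coefficients as a pywt list [cA_J, cD_J, ..., cD_1]
    n = sum(len(c) for c in beta_wavelet)
    thr = sigma * np.sqrt(2 * np.log(n))          # universal hard threshold (illustrative)
    kept = [beta_wavelet[0]] + [pywt.threshold(c, thr, mode='hard') for c in beta_wavelet[1:]]
    return pywt.waverec(kept, wavelet)            # beta_j(t) on the original time grid

# toy round trip: decompose a noisy coefficient curve, threshold, reconstruct for plotting
t = np.linspace(0, 1, 64)
noisy = np.sin(2 * np.pi * t) * (t > 0.25) + 0.1 * np.random.default_rng(1).normal(size=64)
beta_hat = beta_time_domain(pywt.wavedec(noisy, 'db4'), sigma=0.1)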

In this article, we adopted a two-step strategy for handling missing values in the carbon monoxide data, which separated the imputation and analysis procedures into two independent steps. The multivariate-normal-based imputation model assumed that the missing values were 'missing at random.' Although such an assumption is often reasonable, it cannot be verified from the observed data, and its influence on the FDA results should be assessed further. Drawing on the framework of Bayesian inference, it may be possible to develop a one-step strategy for functional regression analysis of incomplete longitudinal data, in which the imputation and FLRM-fitting steps are carried out simultaneously, as in [40].
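The two-step workflow can be summarized in a few lines of Python. The sketch below is illustrative only: IterativeImputer with posterior sampling is used merely as a stand-in for the multivariate-normal imputation software of [32, 33], the function names are ours, and the argument test can be any two-group statistic, such as the adaptive Neyman sketch given earlier.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer

def impute_then_test(y_with_nan, groups, test, n_imputations=5):
    # y_with_nan: repeated measures with NaNs; groups: 0/1 group labels; test: two-group statistic
    stats = []
    for m in range(n_imputations):
        imputer = IterativeImputer(sample_posterior=True, random_state=m)
        y_complete = imputer.fit_transform(y_with_nan)      # one completed data set
        stats.append(test(y_complete[groups == 0], y_complete[groups == 1]))
    return stats                                            # one statistic per imputation, as in Table III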

We emphasize that the analysis of the carbon monoxide data is mainly intended to demonstrate the proposed FDA techniques; the data set may not be ideal for the application of the adaptive Neyman or thresholding test. Other modeling strategies exist, such as smoothing splines. In fact, we fitted several longitudinal models with B-spline bases to this data set and obtained consistent conclusions. The spline methods, unfortunately, share the same limitation as the Fourier- and wavelet-based methods, namely weak statistical interpretability. To further evaluate the performance of the proposed strategies, we conducted several simulation studies in which data were generated under various scenarios. We found that the adaptive Neyman test performs better when the large signal values are concentrated in the first few components, whereas the thresholding test performs better when the large values are concentrated in a small number of components that are not necessarily the first few. For details of the simulation results, refer to Nie [27].
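The qualitative pattern just described can be explored with a toy Monte Carlo experiment; the sketch below is not the simulation study of Nie [27], and the signal sizes, dimensions, and unnormalized statistics are assumptions chosen only to illustrate how both tests can be calibrated by simulation and their empirical power compared.

import numpy as np

rng = np.random.default_rng(2)
n, reps = 128, 2000

def t_neyman(z):
    # unnormalized adaptive Neyman statistic (calibrated by simulation below)
    m = np.arange(1, len(z) + 1)
    return (np.cumsum(z**2 - 1) / np.sqrt(2 * m)).max()

def t_threshold(z):
    # hard-thresholding statistic with a universal-type threshold
    delta = np.sqrt(2 * np.log(len(z)))
    return np.sum(z**2 * (np.abs(z) > delta))

def empirical_power(signal):
    null = np.array([[t_neyman(z), t_threshold(z)] for z in rng.normal(size=(reps, n))])
    alt = np.array([[t_neyman(z), t_threshold(z)] for z in signal + rng.normal(size=(reps, n))])
    crit = np.quantile(null, 0.95, axis=0)       # simulated 5 per cent critical values
    return (alt > crit).mean(axis=0)             # power of (adaptive Neyman, thresholding)

first_few = np.zeros(n); first_few[:3] = 3.0             # signal in the first few components
scattered = np.zeros(n); scattered[[10, 50, 100]] = 3.0  # signal in a few scattered components
print(empirical_power(first_few), empirical_power(scattered))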

ACKNOWLEDGEMENTS

This work was partially supported by the National Institute on Drug Abuse through an SBIR contract (N44 DA35513) and two research grants (R03 DA016721 and P50 DA 12755). We especially thank Hamutahl Cohen and Jinhui Li for their editorial assistance.

REFERENCES

1. Hand DJ, Crowder MJ. Practical Longitudinal Data Analysis. Chapman & Hall: London, 1996.
2. Shoptaw S, Rotheram-Fuller E, Yang X, Frosch D, Nahom D, Jarvik ME, Rawson RA, Ling W. Smoking cessation in methadone maintenance. Addiction 2002; 97:1317–1328.
3. Kaysen GA, Dubin JA, Muller HG, Mitch WE, Rosales LM, Levin NW. Relationships among inflammation, nutrition and physiologic mechanisms establishing albumin levels in hemodialysis patients. Kidney International 2002; 61(6):2240–2249.
4. Diggle PJ, Liang KY, Zeger SL. Analysis of Longitudinal Data. Oxford University Press: Oxford, 1994.
5. Ramsay JO, Silverman BW. Functional Data Analysis. Springer: New York, 1997.
6. Hall P, Muller H-G, Wang J-L. Properties of principal component methods for functional and longitudinal data analysis. The Annals of Statistics 2006; 34:1493–1517.
7. Davidian M, Lin X, Wang J-L. Introduction (Emerging issues in longitudinal and functional data analysis). Statistica Sinica 2004; 14:613–614.
8. Cardot H, Ferraty F, Mas A, Sarda P. Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics 2003; 30:241–255.
9. Cardot H, Ferraty F, Sarda P. Functional linear model. Statistics and Probability Letters 1999; 45:11–22.
10. Cardot H, Goia A, Sarda P. Testing for no effect in functional regression models, some computational approaches. Communications in Statistics—Simulation and Computation 2003; 33:179–199.
11. Muller H-G, Stadtmuller U. Generalized functional linear models. The Annals of Statistics 2005; 33:774–805.
12. Brumback BA, Rice JA. Smoothing spline models for the analysis of nested and crossed samples of curves (with Discussion). Journal of the American Statistical Association 1998; 93:961–994.
13. Shen Q, Faraway JJ. A generalized F test for linear models with functional responses. Statistica Sinica 2004; 14:1239–1257.
14. Yang X, Shen Q, Xu H, Shoptaw S. Functional regression analysis using an F test for longitudinal data with large numbers of repeated measures. Statistics in Medicine 2007; 26(7):1552–1566.
15. Fan J, Lin SK. Tests of significance when data are curves. Journal of the American Statistical Association 1998; 93:1007–1021.
16. Fan J. Test of significance based on wavelet thresholding and Neyman's truncation. Journal of the American Statistical Association 1996; 91:674–688.
17. Yang X, Shoptaw S. Assessing missing data assumptions in longitudinal studies: an example using a smoking cessation trial. Drug and Alcohol Dependence 2005; 77:213–225.
18. Little RJA, Rubin DB. Statistical Analysis with Missing Data. Wiley: New York, 2002.
19. Silverman BW. Density Estimation for Statistics and Data Analysis. Chapman & Hall: London, 1986.
20. Ramsay JO, Silverman BW. Applied Functional Data Analysis. Springer: New York, 2002.
21. Faraway JJ. Regression analysis for a functional response. Technometrics 1997; 39:254–261.
22. Yao F, Muller H-G, Wang J-L. Functional linear regression analysis for longitudinal data. The Annals of Statistics 2005; 33:2873–2903.
23. Wahba G, Wang Y, Gu C, Klein R, Klein B. Smoothing spline ANOVA for exponential families, with application to the Wisconsin epidemiological study of diabetic retinopathy. The Annals of Statistics 1995; 23:1865–1895.
24. Gu C. Smoothing Spline ANOVA Models. Springer: New York, 2002.
25. Bloomfield P. Fourier Analysis of Time Series—An Introduction. Wiley: New York, 1976.
26. Bracewell R. The Fourier Transform and its Applications (3rd edn). McGraw-Hill: New York, 1999.
27. Nie K. Hypothesis testing for high-dimensional data sets. Ph.D. Thesis, University of California, Los Angeles, 2004.
28. Silverman BW. Wavelets in statistics: some recent developments. Physica-Verlag: Heidelberg, 1998; 15–26.
29. Vidakovic B. Statistical Modeling by Wavelets. Wiley: New York, 1999.
30. Vidakovic B. Wavelet-based functional data analysis: theory, applications and ramifications. Proceedings PSFVIP-3, Maui, HI, U.S.A., 18–21 March 2002.
31. Rubin DB. Multiple Imputation for Nonresponse in Surveys. Wiley: New York, 1987.
32. Schafer JL. Analysis of Incomplete Multivariate Data. Chapman and Hall Series Monographs on Statistics and Applied Probability, vol. 72. Chapman & Hall: London, 1997.
33. Schimert J, Schafer JL, Hesterberg T, Fraley C, Clarkson DB. Analyzing Data with Missing Values in S-PLUS. Insightful Corporation: Seattle, WA, 2000.
34. Efron B, Tibshirani R. An Introduction to the Bootstrap. Chapman & Hall: London, 1993.
35. Mallat SG. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 1989; 11(7):674–693.
36. Rice JA, Silverman BW. Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society, Series B 1991; 53:233–243.
37. Wahba G. Spline Models for Observational Data. SIAM: Philadelphia, PA, 1990.
38. Cleveland W. Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 1979; 74:829–836.
39. Morris JS, Vannucci M, Brown PJ, Carroll RJ. Wavelet-based nonparametric modeling of hierarchical functions in colon carcinogenesis. Journal of the American Statistical Association 2003; 98:573–583.
40. Yang X, Belin TR, Boscardin J. Simultaneous vs sequential handling of imputation and variable selection for linear regression models. Biometrics 2005; 61:498–506.
