
Comparison of variants of canonical correlation analysis and partial least squares for combined analysis of MRI and genetic data


NeuroImage xxx (2014) xxx–xxx

YNIMG-11847; No. of pages: 22; 4C: 6, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20

Contents lists available at ScienceDirect

NeuroImage

journal homepage: www.elsevier.com/locate/ynimg

Comparison of variants of canonical correlation analysis and partial least squares for combined analysis of MRI and genetic data

Claudia Grellmann a,b,⁎, Sebastian Bitzer a, Jane Neumann a,b, Lars T. Westlye c,d, Ole A. Andreassen c, Arno Villringer a,b,e,f, Annette Horstmann a,b

a Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1A, 04103 Leipzig, Germany
b Leipzig University Hospital, IFB Adiposity Diseases, Philipp-Rosenthal-Straße 27, 04103 Leipzig, Germany
c Oslo University Hospital, NORMENT KG Jebsen Centre for Psychosis Research, Kirkeveien 166, PO Box 4956, Nydalen, 0424 Oslo, Norway
d University of Oslo, Department of Psychology, PO Box 1094, Blindern, 0317 Oslo, Norway
e Leipzig University Hospital, Clinic of Cognitive Neurology, Liebigstraße 16, 04103 Leipzig, Germany
f Mind and Brain Institute, Berlin School of Mind and Brain, Humboldt-University, Unter den Linden 6, 10099 Berlin, Germany

⁎ Corresponding author at: Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1A, 04103 Leipzig, Germany.

E-mail addresses: [email protected] (C. Grellmann), [email protected] (S. Bitzer), [email protected] (J. Neumann), [email protected] (L.T. Westlye), [email protected] (O.A. Andreassen), [email protected] (A. Villringer), [email protected] (A. Horstmann).

http://dx.doi.org/10.1016/j.neuroimage.2014.12.025
1053-8119/© 2014 Elsevier Inc. All rights reserved.

Please cite this article as: Grellmann, C., et al., Comparison of variants of canonical correlation analysis and partial least squares for combined analysis of MRI and genetic data, NeuroImage (2014), http://dx.doi.org/10.1016/j.neuroimage.2014.12.025

Article info

Article history: Accepted 9 December 2014. Available online xxxx.

Keywords: Canonical correlation analysis; Partial least squares correlation; Functional magnetic resonance imaging; Single nucleotide polymorphisms

Abstract

The standard analysis approach in neuroimaging genetics studies is the mass-univariate linear modeling (MULM) approach. From a statistical view, however, this approach is disadvantageous, as it is computationally intensive, cannot account for complex multivariate relationships, and has to be corrected for multiple testing. In contrast, multivariate methods offer the opportunity to include combined information from multiple variants to discover meaningful associations between genetic and brain imaging data. We assessed three multivariate techniques, partial least squares correlation (PLSC), sparse canonical correlation analysis (sparse CCA) and Bayesian inter-battery factor analysis (Bayesian IBFA), with respect to their ability to detect multivariate genotype–phenotype associations. Our goal was to systematically compare these three approaches with respect to their performance and to assess their suitability for high-dimensional and multi-collinearly dependent data as is the case in neuroimaging genetics studies. In a series of simulations using both linearly independent and multi-collinear data, we show that sparse CCA and PLSC are suitable even for very high-dimensional collinear imaging data sets. Among those two, the predictive power was higher for sparse CCA when voxel numbers were below 400 times sample size and candidate SNPs were considered. Accordingly, we recommend sparse CCA for candidate phenotype, candidate SNP studies. When voxel numbers exceeded 500 times sample size, the predictive power was the highest for PLSC. Therefore, PLSC can be considered a promising technique for multivariate modeling of high-dimensional brain–SNP associations. In contrast, Bayesian IBFA cannot be recommended, since additional post-processing steps were necessary to detect causal relations. To verify the applicability of sparse CCA and PLSC, we applied them to an experimental imaging genetics data set provided for us. Most importantly, application of both methods replicated the findings of this data set.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Many neurological and psychiatric disorders are associated with genetic factors. The most common polymorphism in the human genome is the single-nucleotide polymorphism (SNP) (Crawford and Nickerson, 2005). It has been shown that genetic variation has a closer relationship to brain imaging measures than to cognitive or clinical diagnostic measures (Gottesman and Gould, 2003; Hibar et al., 2011; Smit et al., 2012).


As a consequence, recent attention in imaging neuroscience has been focused on the genome-wide search for genetic variants that explain the variability observed in both brain structure and function.

The standard approach to jointly examine the brain and the genome is the so-called mass-univariate linear modeling (MULM) approach. MULM consists of fitting all possible univariate linear regression models and searching for a subset of important predictors, for which the null hypothesis of no association can be rejected. Detecting causal relations in this vein is, however, statistically and computationally challenging: SNPs are likely to interact with each other in their influence on brain structure and function, and MULM is thus not able to detect complex links that may exist between imaging and genetic data. A further limitation is related to the need to determine an experiment-wide significance level that accounts for the multiple testing problem. Imaging genetics studies bear a high risk of false-positive findings unless appropriate


corrections are made. These limitations call for dedicated multivariate methods that combine information from multiple markers simultaneously into the analysis and that are statistically powerful, even if the number of variables heavily exceeds the available sample size.

Two different strategies can be pursued to appropriately discover meaningful associations between genetic and brain imaging data using multivariate strategies. Forward modeling observes a collection of brain imaging measures to find new genetic variants that cause phenotypic variation (Meyer-Lindenberg, 2012). Likewise, established genetic variants might be considered for understanding the role of brain structure and function in health and illness (backward modeling).

Partial least squares analysis (PLS) (Tucker, 1958; Wold, 1975) and canonical correlation analysis (CCA) (Hotelling, 1936) are common multivariate approaches to study the relationship between two sets of variables by building orthogonal linear combinations of the observed variables of each block (so-called latent variables). PLS aims at maximizing, at each step, the covariance between latent variables. In contrast, CCA maximizes the correlation between latent variables and thus corrects for within-set covariances prior to the decomposition. In the past years there have been several publications applying different variants of PLS and CCA to detect multivariate genotype–brain imaging phenotype associations (e.g., Boutte and Liu, 2010; Hardoon et al., 2009; Le Floch et al., 2012; Wan et al., 2011). However, comprehensive comparisons of these methods with respect to sample size, data characteristics, and their respective strengths and weaknesses are still limited. One study addressing this issue compared penalized PLS regression and regularized kernel CCA in combination with different strategies of dimension reduction to look for associations between simulated SNP and imaging data (Le Floch et al., 2012). The authors systematically assessed the generalizability of the multivariate association using cross-validation and its significance using permutation tests. However, for computational reasons, examination was limited to the first two pairs of latent variables. Furthermore, the authors concluded that univariate SNP filtering is essential to overcome the overfitting issues of their multivariate methods. It is thus still unclear whether other variants of PLS or CCA might be capable of handling high-dimensional settings without prior dimension reduction.
Moreover, it is striking that the authors addressed the use of their selected methods, penalized PLS regression and regularized kernel CCA, in the context of high-dimensional SNP and comparably low-dimensional imaging data (sample size 500 with 85,772 SNPs and 34 brain locations of interest). It is therefore of interest to discover how multivariate approaches perform on whole-brain imaging data, as the covariance structure between imaging variables is expected to be much stronger than the covariance structure between various SNPs (Kovacevic et al., 2013).

In this study, we systematically compare three promising multivariate methods, partial least squares correlation (PLSC) (Tucker, 1958; McIntosh et al., 1996), sparse canonical correlation analysis (sparse CCA) (Witten et al., 2009) and Bayesian inter-battery factor analysis (Bayesian IBFA) (Klami et al., 2013), with respect to their performance using simulated imaging and SNP data to provide practical advice for their use in genetic neuroimaging studies. In particular, we consider the applicability of these methods in the context of whole-brain imaging data, when some prior knowledge on the involvement of certain SNPs of interest exists. Subsequently, we also compare the performance of PLSC, sparse CCA and Bayesian IBFA on simulated whole-brain imaging and high-dimensional SNP array data. We selected our methods based on the following properties. We decided on PLSC because singular value decomposition (SVD) is directly applied to the cross-product matrix. In contrast to the penalized PLS regression applied by Le Floch et al. (2012), it is a non-iterative strategy, which is fast especially for high-dimensional data sets. Moreover, PLSC has been shown to be particularly suited to the analysis of the relationships between measures of brain activity and of behavior or experimental design (Krishnan et al., 2011; McIntosh and Lobaugh, 2004). Sparse CCA includes an L1 penalization of canonical weights such that highly correlated


sparse linear combinations of the two sets of variables are identified. Sparse canonical weights are beneficial, as they make it easier to differentiate between important and less important variables. Sparse CCA was further chosen because it has repeatedly been shown to give good results in genotype–phenotype association studies (Avants et al., 2010; Wan et al., 2011; Witten et al., 2009; Witten and Tibshirani, 2009), even though these studies lack exact performance evaluation criteria. Moreover, we included a Bayesian CCA approach, because treating CCA as a generative model is expected to be more robust than the classical linear algebraic solution (Klami et al., 2013). However, most Bayesian CCA implementations break down for large dimensionalities with small sample sizes. We selected Bayesian IBFA because it is computationally efficient and therefore works for high-dimensional data (Virtanen et al., 2011). Bayesian IBFA imposes group-wise sparsity to estimate the posterior of the model. It is therefore able to reliably separate the correlated effects from non-shared ones (Virtanen et al., 2011) and to automatically determine the component number. The latter is particularly important if underlying relations are not known a priori.

With the systematic assessment and comparison of the methods, we address the following questions: How many non-informative variables may be included in addition to causal variants to still be able to discover meaningful associations between imaging and genetic data? If such relations are represented by the methods, are they also statistically significant and reliable? How many components need to be considered to detect all causal relations? Which methods are suitable for performance evaluation? Do sparse CCA, Bayesian IBFA and PLSC yield comparable performance, or where do they differ? To further verify our findings, we apply the methods to an experimental imaging genetics data set provided for us, with the objective of replicating the findings previously described.

In summary, the primary goal of this study is to provide a tutorial for researchers in the field of imaging genetics that collects and elaborately compares three methods adapted to the needs of high-dimensional data sets such as in imaging genetics, that offers advice on which method to choose according to the properties of the data set, and that also supplies tools for performance evaluation and interpretation of results. An application-orientated comparative collection of methods for imaging genetics, in particular in the context of whole-brain imaging data when some prior knowledge on the involvement of certain SNPs of interest exists (backward model (Meyer-Lindenberg, 2012)), is not available to date. However, backward models need to be considered separately from forward models, since the covariance structure between imaging variables is expected to be much stronger than the covariance structure between various SNPs, which might be methodologically challenging for some statistical methods.

2. Methods

An overview of the three methods, sparse canonical correlation analysis (sparse CCA), Bayesian inter-battery factor analysis (Bayesian IBFA) and partial least squares correlation (PLSC), is given in Table 1. Displayed are, together with the actual model, the criterion which is maximized by each method and the number of latent variables. Bayesian IBFA results in a single latent variable, whereas sparse CCA and PLSC give two separate variables for each data set that are maximally correlated with each other. Further, we contrast the orthogonality of latent variables, influencing the interpretability of components, and the number of components, since a method is preferable if the number of components is automatically determined and does not need to be set by the user prior to analysis. In addition, it is shown which methods are iterative, as iterative approaches might be computationally expensive, whether resulting weights are sparse, as sparsity enables the user to identify causal relations easily, and whether the approaches are susceptible to resampling, which has an influence on permutation testing and bootstrapping.


Table 1
Overview of methods. The table gives an overview of the main characteristics of sparse CCA, Bayesian IBFA and PLSC.

Sparse CCA
- Criterion to maximize: max_{‖w1‖=‖w2‖=1} corr(X1w1, X2w2)
- Model: max_{‖w1‖=‖w2‖=1} w1′X1′X2w2 − λ1‖w1‖1 − λ2‖w2‖1, s.t. ‖w1‖² ≤ 1, ‖w2‖² ≤ 1
- Latent variables: Z1 = X1W1, Z2 = X2W2
- Orthogonality of latent variables: z1l′z1k = 0 and z2l′z2k = 0 for l ≠ k
- Number of components: defined by user
- Iterative approach: yes; sparsity of weights: yes; influence of resampling: susceptible

Bayesian IBFA
- Model: Zc = [Z; Z1; Z2] ~ N(0, I), Y = [X1; X2] ~ N(WZc, Σ), with W = [W1 V1 0; W2 0 V2] and Σ = [σ1²I 0; 0 σ2²I]
- Latent variables: Z = W′Y
- Number of components: automatic determination by group-wise application of the ARD prior
- Iterative approach: yes; sparsity of weights: yes; influence of resampling: susceptible

PLSC
- Criterion to maximize: max_{‖w1‖=‖w2‖=1} cov(X1w1, X2w2)
- Model: S = X1′X2 = W1DW2′
- Latent variables: Z1 = X1W1, Z2 = X2W2
- Orthogonality of latent variables: z1l′z2k = 0 for l ≠ k
- Number of components: min(p1, p2)
- Iterative approach: no; sparsity of weights: no; influence of resampling: not susceptible


2.1. Canonical correlation analysis

Two of the considered methods, sparse CCA and Bayesian IBFA, are based on the general concept of canonical correlation analysis (CCA). CCA (Hotelling, 1936) is a common method to model the relationship between two sets of variables X1 and X2 of dimensions n × p1 and n × p2 based on their correlation. For each data set, the goal is to successively build orthogonal linear combinations of the observed variables of this set (so-called latent variables), such that at each step the correlation between the pair of latent variables is maximal. The following criterion is optimized in each step:

max_{‖w1‖=‖w2‖=1} corr(X1w1, X2w2),  (1)

where w1 and w2 are weight vectors. Assuming that the variables of each data set are standardized column-wise, the function to be maximized is

max_{‖w1‖=‖w2‖=1} (w1′X1′X2w2) / (√(w1′X1′X1w1) · √(w2′X2′X2w2)).  (2)

The matrices X1′X2, X1′X1 and X2′X2 are estimates of the cross-covariance and covariance matrices, respectively. The total number of canonical correlations is equal to the number of variables in the smaller data set, i.e. min(p1, p2).

The solution of Eq. (2) is not affected by rescaling w1 or w2, either together or independently. The CCA optimization problem formulated in Eq. (2) is therefore equivalent to maximizing the numerator subject to w1′X1′X1w1 = w2′X2′X2w2 = 1. The solution to Eq. (2) may be obtained by solving the generalized eigenproblem

(X1′X1)⁻¹ X1′X2 (X2′X2)⁻¹ X2′X1 w1 = δ² w1
(X2′X2)⁻¹ X2′X1 (X1′X1)⁻¹ X1′X2 w2 = δ² w2  (3)

where δ denotes the canonical correlation.

In high-dimensional data analysis, when the number of variables exceeds the number of observations, CCA results in weight vectors that are not uniquely defined, because it involves the computation of two inverses, (X1′X1)⁻¹ and (X2′X2)⁻¹. To overcome these problems, we have used two alternative approaches, which are introduced in the subsequent sections.
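To make the classical solution concrete, the generalized eigenproblem of Eq. (3) can be solved directly with standard linear algebra whenever n exceeds p1 and p2. The following sketch is illustrative only (toy dimensions and data, not part of the original study); it checks that the correlation of the resulting latent variables equals the leading canonical correlation δ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n >> p1, p2, so X1'X1 and X2'X2 are invertible.
n, p1, p2 = 500, 5, 4
X1 = rng.standard_normal((n, p1))
X2 = 0.5 * X1[:, :p2] + rng.standard_normal((n, p2))  # induce correlation

# Column-wise standardization, as assumed for Eq. (2).
X1 = (X1 - X1.mean(0)) / X1.std(0)
X2 = (X2 - X2.mean(0)) / X2.std(0)

C11, C22, C12 = X1.T @ X1, X2.T @ X2, X1.T @ X2

# Eq. (3): eigenvalues of this matrix are the squared canonical
# correlations delta^2, eigenvectors are the weight vectors w1.
M = np.linalg.solve(C11, C12) @ np.linalg.solve(C22, C12.T)
evals, evecs = np.linalg.eig(M)
order = np.argsort(evals.real)[::-1]
delta = np.sqrt(np.clip(evals.real[order], 0.0, None))  # canonical correlations
w1 = evecs.real[:, order[0]]

# The paired weight vector is proportional to (X2'X2)^-1 X2'X1 w1; the
# correlation of the resulting latent variables equals delta[0].
w2 = np.linalg.solve(C22, C12.T @ w1)
corr = np.corrcoef(X1 @ w1, X2 @ w2)[0, 1]
print(round(delta[0], 4), round(abs(corr), 4))
```

With n much smaller than p1 or p2, `np.linalg.solve` would fail or become unstable, which is exactly the non-invertibility problem motivating the two alternatives below.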

2.2. Sparse canonical correlation analysis

One possibility to solve the non-invertibility issue of CCA is to apply L1-penalties to the canonical weights. L1-regularization is given by the term λ · ‖x‖1.


Sparse CCA methods have been proposed by several authors (Parkhomenko et al., 2007, 2009; Waaijenborg et al., 2008; Wiesel et al., 2008; Witten et al., 2009). Witten et al. (2009) introduced a penalized matrix decomposition (PMD) on the cross-covariance matrix aimed at identifying sparse linear combinations of the two sets of variables that are highly correlated with each other. PMD solves the optimization problem

max_{‖w1‖=‖w2‖=1} w1′X1′X2w2 − λ1 · P1(w1) − λ2 · P2(w2)  (4)

subject to w1′X1′X1w1 ≤ 1, w2′X2′X2w2 ≤ 1. Note that the equality constraints have been relaxed to inequality constraints to make the sets convex. P1 and P2 are convex penalty functions, which can take on a variety of forms. A useful example is the LASSO (least absolute shrinkage and selection operator) (Tibshirani, 1996)

P1(w1) = Σ_{i=1}^{p1} |w1i| = ‖w1‖1.

It has been shown in other high-dimensional problems that treating the covariance matrix as diagonal can yield good results (Dudoit et al., 2001; Tibshirani et al., 2003). Therefore, substituting the identity matrices I1 for X1′X1 and I2 for X2′X2 gives the so-called diagonal penalized CCA

max_{‖w1‖=‖w2‖=1} w1′X1′X2w2 − λ1‖w1‖1 − λ2‖w2‖1  (5)

subject to ‖w1‖² ≤ 1, ‖w2‖² ≤ 1.

Since the objective function is biconvex, i.e. it is convex in w1 with w2 fixed and vice versa, the sparse CCA criterion may be iteratively solved by updating w1, holding w2 fixed, and vice versa until convergence. The update for w1 is given by

1. w1 = argmin_{w1} ½‖X1′X2w2 − w1‖² + λ‖w1‖1
2. Normalize: w1* = w1/‖w1‖ if ‖w1‖ > 0, and w1* = 0 otherwise.

The update for w2 is similar.

For sparse CCA, λ1 and λ2 are not set directly. Witten et al. (2009) have provided an algorithm to select tuning parameters c1 and c2 as the number of variables of each data set that should be given a non-zero weight. This algorithm is based on permutation. Alternatively, values of c1 and c2 can be chosen to result in the desired sparsity of w1 and w2.

Sparse CCA has been extended to sparse supervised CCA (Witten and Tibshirani, 2009), which is able to search for variables in the two data sets that are correlated with each other and associated with an outcome measure. It is also possible to consider more than two data sets (sparse multiple CCA). As we are only interested in the relation of imaging and genetic data, we did not use any of those extensions.

canonical correlation analysis and partial least squares for combined.1016/j.neuroimage.2014.12.025

4 C. Grellmann et al. / NeuroImage xxx (2014) xxx–xxx

Recently, Chi et al. (2013) have proposed an extension to the work on PMD for sparse CCA algorithms. The authors have weakened the identity covariance assumption to account for correlation structure in both data sets. The objective function to be maximized is therefore the same,

max_{‖w1‖=‖w2‖=1} w1′X1′X2w2 − λ1‖w1‖1 − λ2‖w2‖1.  (6)

Alterations were made to the constraints, such that full covariance matrices are included. This extension will be considered in future work.

2.3. Bayesian inter-battery factor analysis

A second method for coping with large dimensionalities is a Bayesian treatment of the inter-battery factor analysis (IBFA) model (Klami et al., 2013). The IBFA model (Browne, 1979) not only extracts the correlations between data sets but also decomposes the data into shared and data set-specific components. The term CCA therefore emphasizes the search for correlations (shared components), whereas IBFA accentuates the decomposition into shared and data set-specific components.

Given are two multivariate random variables x1 ∈ ℝ^(p1×1) and x2 ∈ ℝ^(p2×1), which are considered to be generated by the same unobserved latent variable z ∈ ℝ^(K×1). Data samples are observed as matrices Xm = [x1^m, …, xn^m], m = 1, 2, with n observations, which are assumed to be mean centered. By feature-wise concatenation Y = [X1; X2], the Bayesian IBFA model is given by

Zc ~ N(0, I), Y ~ N(WZc, Σ)  (7)

with Zc = [Z; Z1; Z2], W = [W1 V1 0; W2 0 V2] and diagonal noise covariance Σ = [σ1²I 0; 0 σ2²I] ∈ ℝ^(p×p), p = p1 + p2, indicating independence of the noise over the features. The notation N(μ, Σ) corresponds to the normal distribution with mean μ and covariance Σ.

The model in Eq. (7) is a simplified factor analysis model with a specific form of sparse structure for the linear projection W. For the first data set, we have X1 ~ N(W1Z + V1Z1, σ1²I), and for the second, respectively, X2 ~ N(W2Z + V2Z2, σ2²I). The shared latent variables Z capture the variation common to both data sets. They are linearly transformed to represent the observations X1 and X2 by multiplication with Wm ∈ ℝ^(pm×K), m = 1, 2. The remaining variation specific to each data set, modeled by latent variables Zm, is transformed by another linear mapping VmZm, where Vm ∈ ℝ^(pm×Km), m = 1, 2. The actual observations are thus generated by the sum of the two matrix factorizations, followed by additive noise with a covariance matrix of low-rank structure. Note that the probabilistic IBFA model in Eq. (7) results in a single latent variable Z, whereas classical CCA gives two separate variables Z1 = X1W1 and Z2 = X2W2 that are maximally correlated. However, to compare the results of classical CCA and Bayesian IBFA, it is possible either to average the canonical scores Z1 and Z2 of classical CCA or to produce two separate latent variables by estimating the distribution of Z conditional on having observed only one of the views, p(Z|X1) and p(Z|X2).

For Bayesian analysis, the model in Eq. (7) needs to be complemented with priors for the model parameters. To automatically learn the group-wise sparsity structure for W, an automatic relevance determination (ARD) prior (Neal, 1996) is implemented in a group-wise manner

p(W) = ∏_{k=1}^{Kc} [ ∏_{d1=1}^{p1} N(w_{d1,k} | 0, α_{1,k}⁻¹) · ∏_{d2=p1+1}^{p1+p2} N(w_{d2,k} | 0, α_{2,k}⁻¹) ],
α_{m,k} ~ Gamma(α0, β0).  (8)


For each weight w_{dm,k} of the kth component, m = 1, 2, the prior infers the posterior of the α_{m,k}, whose elements each have a non-informative Gamma prior with small values for the hyperparameters α0 and β0. Shared components will have small α_{m,k} (large variance), whereas components specific to either data view will have small α_{m,k} only for the active view. The model automatically infers the total number of components by making unnecessary components w_{dm,k} inactive for both views.

The noise precision parameters τm = σm⁻² of the covariance matrix are given Gamma priors

τm ~ Gamma(α0^τ, β0^τ).  (9)

For inference, a mean-field approximation q(Θ) = ∏_j q(θj) of the posterior p(Θ) is used, factorized over all elements of the model (7),

q(W, τm, αm, Zc) = ∏_{N=1}^{n} q(zc^N) · ∏_{m=1}^{2} q(τm) q(αm) · ∏_{d=1}^{p1+p2} q(w_{d,:}).  (10)

The separate terms q(·) are updated alternatingly to minimize the Kullback–Leibler divergence DKL(p, q) between q(W, τm, αm, Zc) and p(W, τm, αm, Zc|Y), to obtain an approximation that best matches the true posterior. This is equivalent to maximizing the lower bound

L(q) = log p(Y) − DKL(p, q) = ∫ q(W, τm, αm, Zc) · log [ p(W, τm, αm, Zc, Y) / q(W, τm, αm, Zc) ].  (11)

The model in Eq. (7) is invariant to linear transformations, since the likelihood of the model is invariant to all invertible R ∈ ℝ^(K×K): W*Z* = (WR)(R⁻¹Z) = WZ. Thus, probabilistic CCA finds the same subspace as the classical solution, but interpretation of components is limited and requires further constraints or post-processing. For Bayesian IBFA, this problem is solved by maximizing the variational lower bound with respect to R. Given a fixed likelihood, the only way the variational bound can improve is by rotating the components such that the posterior p(Θ) best matches the prior distribution, which assumes independent latent variables. Hence, the model is forced to find latent variables that are a posteriori maximally independent of each other, which improves interpretability and convergence speed.
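The rotation invariance is easy to check numerically. A minimal sketch (toy dimensions assumed) confirms that any invertible R leaves the product WZ, and hence the likelihood, unchanged:

```python
import numpy as np

rng = np.random.default_rng(4)
p, K, n = 8, 3, 20                      # assumed toy dimensions
W = rng.standard_normal((p, K))
Z = rng.standard_normal((K, n))

# For any invertible R, the likelihood term is unchanged:
# W*Z* = (W R)(R^-1 Z) = W Z.
R = rng.standard_normal((K, K))         # almost surely invertible
W_star = W @ R
Z_star = np.linalg.inv(R) @ Z
print(np.allclose(W_star @ Z_star, W @ Z))   # True
```

This is exactly why the components of a probabilistic CCA fit are only identified up to rotation, and why Bayesian IBFA optimizes the variational bound over R to pin down an interpretable solution.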

2.4. Partial least squares correlation

An alternative to CCA is a method that also searches for linear combinations of the original variables X1 ∈ ℝ^{n×p1} and X2 ∈ ℝ^{n×p2}, but for which the criterion of maximal correlation is balanced with the requirement to explain as much variance as possible. This method is called PLS. There are two basic types of PLS methods, partial least squares regression (PLSR) (Wold, 1975) and partial least squares correlation (PLSC) (McIntosh et al., 1996). PLSR is a regression technique that predicts one data set from another (Krishnan et al., 2011). We, however, implemented PLSC, which analyzes the association of X1 and X2 by maximizing the covariance between the pair of latent variables,

\max_{\|w_1\| = \|w_2\| = 1} \operatorname{cov}(X_1 w_1, X_2 w_2) \qquad (12)

PLSC was introduced as Tucker Inter-battery Analysis (Tucker, 1958) and was first applied to functional neuroimaging data and behavioral outcome measures by McIntosh et al. (1996). For PLSC, Singular Value Decomposition (SVD) is performed on the cross-product matrix X_1' X_2, which, in contrast to CCA, is not corrected for within-set covariance prior to the decomposition,

S = X_1' X_2 = ADB' = d_1 a_1 b_1' + d_2 a_2 b_2' + \ldots + d_K a_K b_K' \qquad (13)

Please cite this article as: Grellmann, C., et al., Comparison of variants of canonical correlation analysis and partial least squares for combined analysis of MRI and genetic data, NeuroImage (2014), http://dx.doi.org/10.1016/j.neuroimage.2014.12.025

C. Grellmann et al. / NeuroImage xxx (2014) xxx–xxx

where K = min(p_1, p_2). The coefficients of PLSC, W_1 and W_2, equal the left and right singular vectors A and B. They are called saliences if it is known a priori that there is a causal relation between X_1 and X_2, and are comparable to the canonical weights of CCA. The singular values d_i, i = 1, …, K, in the SVD provide the covariance between the linear combinations Z_1 = X_1 W_1 and Z_2 = X_2 W_2, so-called latent variables or scores. In contrast to CCA, which forces the latent variables of each data set to be mutually orthogonal, PLSC requires orthogonality between each latent variable z_{1l} and each latent variable z_{2k} for l ≠ k.

For PLSC, the salience of a certain variable, w_{1l}, is proportional to the covariance of the corresponding variable x_{1l} with the scores for the other block Z_2, w_{1l} ∝ cov(x_{1l}, Z_2), and likewise w_{2l} ∝ cov(x_{2l}, Z_1). Hence, adding or removing a further variable x_{1(l+1)} (or x_{2(l+1)}) has only a small effect on w_{1l} (or w_{2l}) (Wegelin, 2000). In contrast, the coefficients of CCA are computed as multiple regression coefficients, such that adding and deleting variables have a huge effect.
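The PLSC computation described above (SVD of the cross-product matrix, Eq. (13)) can be sketched in a few lines of numpy. This is a minimal illustration of the decomposition, not the authors' implementation; the function name `plsc` and the centering convention are our own assumptions:

```python
import numpy as np

def plsc(X1, X2):
    """Minimal PLSC sketch: SVD of the cross-product of column-centered blocks."""
    X1c = X1 - X1.mean(axis=0)
    X2c = X2 - X2.mean(axis=0)
    S = X1c.T @ X2c                          # p1 x p2 cross-product matrix
    A, d, Bt = np.linalg.svd(S, full_matrices=False)
    W1, W2 = A, Bt.T                         # saliences (left/right singular vectors)
    Z1, Z2 = X1c @ W1, X2c @ W2              # latent variables (scores)
    return W1, W2, Z1, Z2, d                 # d holds the covariances of paired scores
```

By construction, the inner product of the kth pair of scores recovers the kth singular value, Z1[:, k]' Z2[:, k] = d_k, which is the (unnormalized) covariance maximized in Eq. (12).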

2.5. Assessment of significance

To assess the significance of the correlation (CCA) and covariance (PLSC) of latent variables, respectively, we applied permutation tests. For this purpose, subjects, i.e. rows of the input matrices, are randomly reassigned without replacement and CCA and PLSC are recalculated. At each permutation, the resulting statistic (correlation or covariance) is compared to the statistic obtained on the original data; the probability value equals the proportion of permutations in which the statistic of the permuted data exceeds the original value. For each permutation test performed, we considered 5000 permutations in order to get a good estimate of the empirical p-values.
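The permutation scheme just described can be sketched generically. The helper below is illustrative only (its name and signature are ours); `stat` would be the canonical correlation or the PLSC covariance of the first latent-variable pair:

```python
import numpy as np

def permutation_pvalue(x, y, stat, n_perm=5000, seed=0):
    """Empirical p-value: proportion of row-permuted statistics reaching the observed one."""
    rng = np.random.default_rng(seed)
    observed = stat(x, y)
    exceed = sum(stat(x[rng.permutation(len(x))], y) >= observed
                 for _ in range(n_perm))
    return exceed / n_perm
```

For a strongly associated pair of variables, almost no permuted statistic reaches the observed value, so the empirical p-value approaches zero.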

It has been shown that the rate of false positives is quite high when the overall correlation structure is weak, such as in studies searching for associations between brain imaging and SNP measures. Therefore it is often more important to detect reliable associations, measured by the stability of pairings between left and right singular vectors of the SVD for any set of subjects (Kovacevic et al., 2013). For PLSC, Kovacevic et al. (2013) recently introduced split-half reliability testing as an alternative to significance testing. The split-half resampling starts by decomposing the cross-product matrix S using SVD, S = ADB'. To test the stability of the pairings between left and right singular vectors, subjects are randomly split and SVD is used to decompose the split cross-product matrices S_1 and S_2. The original matrices A and B are then projected onto each half of S to obtain half-sample matching pairings,

A_1 = S_1 B D^{-1} \quad \text{and} \quad A_2 = S_2 B D^{-1}
B_1 = S_1' A D^{-1} \quad \text{and} \quad B_2 = S_2' A D^{-1} \qquad (14)

The correlations of A_1 and A_2, as well as B_1 and B_2, are taken as mean correlation across split-halves. The procedure is repeated many times. By randomly permuting subjects and repeating the split-half correlation estimation for each permuted data set, a null distribution for the split-half correlations p_corr^A and p_corr^B is created, which is used to estimate the probability of exceeding the correlations from the original un-permuted data set. For each split-half reliability test performed, we considered 100 half-splits and 100 permutations.
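One split-half replication of the projections in Eq. (14) might be sketched as follows. This is a schematic helper of our own (restricted here to the first singular-vector pair, on already-centered blocks), not the pipeline of Kovacevic et al. (2013):

```python
import numpy as np

def split_half_corr(X1, X2, rng):
    """One split-half replication of the singular-vector pairings (Eq. 14)."""
    n = X1.shape[0]
    S = X1.T @ X2
    A, d, Bt = np.linalg.svd(S, full_matrices=False)
    B, Dinv = Bt.T, np.diag(1.0 / d)
    idx = rng.permutation(n)
    h1, h2 = idx[: n // 2], idx[n // 2:]
    S1, S2 = X1[h1].T @ X2[h1], X1[h2].T @ X2[h2]
    A1, A2 = S1 @ B @ Dinv, S2 @ B @ Dinv        # half-sample matched left vectors
    B1, B2 = S1.T @ A @ Dinv, S2.T @ A @ Dinv    # half-sample matched right vectors
    corr_A = np.corrcoef(A1[:, 0], A2[:, 0])[0, 1]
    corr_B = np.corrcoef(B1[:, 0], B2[:, 0])[0, 1]
    return corr_A, corr_B
```

Averaging these correlations over many random splits, and comparing against splits of subject-permuted data, yields the reliability p-values described above.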

For evaluation of the reliability of the weights of sparse CCA andPLSC, we used bootstrapping. In both data sets, subjects are randomlyreassigned with replacement, such that the assignment of subjects ismaintained, whereas each subject's contribution is changed.
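A common way to summarize such bootstrap resampling is a bootstrap ratio per variable (mean resampled weight divided by its bootstrap standard deviation). The sketch below is our own illustration of that idea, not the authors' code; `fit_weights` is a hypothetical callback returning the first weight pair:

```python
import numpy as np

def bootstrap_weight_ratio(X1, X2, fit_weights, n_boot=200, seed=0):
    """Bootstrap ratio per variable: mean resampled weight / bootstrap std."""
    rng = np.random.default_rng(seed)
    n = X1.shape[0]
    samples = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample subjects with replacement
        w1, _ = fit_weights(X1[idx], X2[idx])
        if samples:                               # align arbitrary sign to first resample
            w1 = np.sign(w1 @ samples[0]) * w1
        samples.append(w1)
    samples = np.asarray(samples)
    return samples.mean(axis=0) / samples.std(axis=0)
```

Variables with large absolute ratios contribute stably across resamples; variables whose weights fluctuate around zero do not.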

The output of Bayesian IBFA does not contain the actual canonical weights but their posterior, i.e. mean weights as well as the covariances of the weights, which we used to assess the contribution of each variable. Hence, we could replace the time-consuming bootstrapping by directly evaluating the credibility of the weights.


2.6. Performance evaluation

For performance evaluation we used the predictive squared correlation coefficient (Cramer, 1980) and the model selection procedure CHull (Wilderjans et al., 2013).

The predictive squared correlation coefficient Q2 is one minus the ratio of the sum of squared differences between the observed variable x_obs and the predicted variable x_pred, which is called the predictive residual sum of squares (PRESS), and the sum of squared differences between the observed variable and its mean (sum of squares, SS),

Q^2 = 1 - \frac{\sum_{i=1}^{n} \left( x_i^{\mathrm{obs}} - x_i^{\mathrm{pred}} \right)^2}{\sum_{i=1}^{n} \left( x_i^{\mathrm{obs}} - \bar{x}^{\mathrm{obs}} \right)^2} = 1 - \frac{\mathrm{PRESS}}{\mathrm{SS}} \qquad (15)

It measures the prediction performance of the model. In case of ideal prediction, we obtain Q2 = 1. For bad predictions, PRESS is greater than SS. Thus, the predictive squared correlation coefficient can even be negative, which means that in prediction the model performs worse than the observed variables' mean.
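Eq. (15) translates directly into code. A minimal sketch (the function name is ours):

```python
import numpy as np

def q_squared(x_obs, x_pred):
    """Predictive squared correlation coefficient Q2 = 1 - PRESS / SS (Eq. 15)."""
    press = np.sum((x_obs - x_pred) ** 2)            # predictive residual sum of squares
    ss = np.sum((x_obs - x_obs.mean()) ** 2)         # sum of squares around the mean
    return 1.0 - press / ss
```

Perfect prediction gives Q2 = 1, predicting the observed mean gives Q2 = 0, and predictions worse than the mean give negative values, matching the interpretation above.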

For Bayesian IBFA, which results in a single latent variable Z, the predictive squared correlation coefficient Q2 was computed directly. In contrast, for sparse CCA and PLSC we used the average of the canonical scores Z1 and Z2.

The second method, CHull, identifies a model that balances model fit and model complexity. For each model, a goodness of fit or misfit value f and a complexity value c is computed. CHull then determines the convex hull of the fit-measure-by-complexity-measure plot and identifies a model by computing a scree-test value st that indicates how much better a solution is compared to a less complex one, relative to how much better a solution is in comparison with a more complex one,

st_i = \frac{(f_i - f_{i-1}) / (c_i - c_{i-1})}{(f_{i+1} - f_i) / (c_{i+1} - c_i)} \qquad (16)

for all models m_i located on the upper (goodness of fit) or lower (misfit) boundary of the convex hull. Finally, the model with the largest st-value is selected. Note that for the first and last complexity measure there is no less complex or more complex solution for comparison and st will equal zero.
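The scree-test value of Eq. (16) is straightforward to compute for models already lying on the hull boundary. A small sketch (helper name ours; hull extraction omitted for brevity):

```python
def chull_scree(fit, complexity):
    """CHull scree-test values st_i (Eq. 16); boundary models get st = 0."""
    st = [0.0] * len(fit)
    for i in range(1, len(fit) - 1):
        gain_prev = (fit[i] - fit[i - 1]) / (complexity[i] - complexity[i - 1])
        gain_next = (fit[i + 1] - fit[i]) / (complexity[i + 1] - complexity[i])
        st[i] = gain_prev / gain_next if gain_next != 0 else float("inf")
    return st
```

For a fit curve that jumps at the second model and then flattens (e.g. fit 0.0, 0.8, 0.9, 0.95 at complexities 1–4), the largest st-value falls on the second model, the elbow of the curve.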

As goodness of fit measure for CHull, we estimated the out-of-sample correlation coefficient (CCA) and out-of-sample covariance (PLSC), respectively, by 10-fold cross-validation (CV), as suggested by Le Floch et al. (2012). At each fold of the CV, we calculated training weights W1_train and W2_train on the training set and estimated the latent variables of the test set by the linear mappings Z1_test = X1_test W1_train and Z2_test = X2_test W2_train. The out-of-sample correlation coefficient (CCA) is the correlation between the Z1_test and Z2_test variables. The same holds for the out-of-sample covariance of PLSC. Note that the out-of-sample correlation coefficient and the out-of-sample covariance might be negative, as they reflect an average correlation or covariance of test samples over folds.
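The cross-validated out-of-sample measure can be sketched as below. This is an illustrative helper of ours, with `fit_weights` a hypothetical callback returning the first weight pair from the training fold:

```python
import numpy as np

def out_of_sample_corr(X1, X2, fit_weights, k_folds=10, seed=0):
    """Mean test-fold correlation of latent variables over a k-fold CV."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X1.shape[0])
    folds = np.array_split(idx, k_folds)
    corrs = []
    for f in range(k_folds):
        test = folds[f]
        train = np.concatenate([folds[g] for g in range(k_folds) if g != f])
        w1, w2 = fit_weights(X1[train], X2[train])     # weights from training fold only
        z1, z2 = X1[test] @ w1, X2[test] @ w2          # latent variables of the test fold
        corrs.append(np.corrcoef(z1, z2)[0, 1])
    return np.mean(corrs)
```

Because each fold's correlation can be negative on held-out subjects, the average can be negative too, as noted above.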

2.7. Simulated data

Statistical inferences using MULM in the context of imaging genetics data are problematic since dependencies among collinear variables are ignored. It is furthermore challenging to handle the massive multiple testing problem. In a series of simulations, we aimed to determine whether PLSC, sparse CCA or Bayesian IBFA can overcome these limitations of MULM. Therefore, we simulated two data sets, as described in Le Floch et al. (2012). Each data set consisted of 100 samples. In contrast to Le Floch et al. (2012), who addressed the use of multivariate approaches for high-dimensional SNP and comparably low-dimensional imaging data, we wanted to evaluate the performance of our methods in the context of whole brain imaging data, as imaging variables are expected to be in much higher collinearity than SNP measures (Kovacevic et al., 2013). For the imaging data set, we generated 90,000 voxels using a multivariate normal distribution with mean and covariance parameters estimated from experimental data. The genotype data set consisted of 50 SNPs that were generated using the gs algorithm (Li and Chen, 2008) based on phase III HapMap data. HapMap III is the third phase of the International HapMap project (The HapMap Consortium, 2003). It includes 11 populations. We considered a sample collected by the Centre d'Etude du Polymorphisme Humain (CEPH) in 1980 from people living in Utah with ancestry from Northern and Western Europe (CEU). SNPs were coded using the additive genetic model, 0 for one type of homozygous, 1 for heterozygous and 2 for the other type of homozygous individuals.

To induce a causal relation between simulated imaging and SNP data, we randomly selected one voxel and 3 SNPs with minor allele frequency (MAF) greater than 0.2, such that the pairwise correlation between that voxel, voxels in multi-collinearity with the selected voxel, and the selected SNPs was on average 0.3. In other simulation studies, the authors usually controlled the correlation between causal imaging phenotypes and causal SNPs at a value of, on average, 0.5 (e.g. Boutte and Liu, 2010; Le Floch et al., 2012). However, we selected an average correlation of 0.3, since a weaker association might be more realistic and is in line with p-values reported in studies considering univariate associations between specific SNPs and brain areas of interest (e.g. Filippini et al., 2009; Ousdal et al., 2012; Potkin et al., 2009).
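One simple way to induce a target correlation between a voxel and a genetic score is to mix a standardized score with orthogonalized noise. The numpy sketch below is our own toy construction (binomial draws stand in for the gs-simulated SNPs) and illustrates the principle rather than the authors' simulation code:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 100, 0.3                                  # sample size, target correlation

# hypothetical additive-coded causal SNPs (0/1/2) with MAF 0.3
snps = rng.binomial(2, 0.3, size=(n, 3)).astype(float)
g = snps.sum(axis=1)
g = (g - g.mean()) / g.std()                     # standardized genetic score

noise = rng.normal(size=n)
noise -= noise.mean()
noise -= (noise @ g) / (g @ g) * g               # make noise sample-orthogonal to g
noise /= noise.std()

voxel = r * g + np.sqrt(1 - r**2) * noise        # sample correlation with g equals r
```

Because the noise is exactly orthogonal to g in the sample, the resulting correlation is r by construction rather than only in expectation.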

We applied sparse CCA, Bayesian IBFA and PLSC to study the relationship of causal voxels and SNPs, whereby the dimensionality of the imaging data set was increased stepwise from 100 voxels, equaling sample size, to 1000, 10,000, 20,000, 30,000, 40,000, 50,000, 70,000 and finally 90,000 voxels, exceeding sample size by a multiple. See Fig. 1 for a schematic representation of the simulated data.

To further compare PLSC, sparse CCA and Bayesian IBFA with respect to their performance in more realistic imaging genetics settings, we extended our simulation experiment by also considering a large number of SNPs. We therefore simulated an imaging data set consisting of 90,000 highly collinear voxels and a genotype data set consisting of 10,000 SNPs. The number of SNPs was chosen to resemble the number of SNPs genotyped on a small human SNP array (LaFramboise, 2009). As described above, we then induced a causal relation between one randomly selected voxel and 3 SNPs, such that the pairwise correlation between that voxel, voxels in multi-collinearity with the selected voxel, and the selected SNPs was on average 0.3.

Fig. 1. Simulated data set. For the first simulation, the imaging data set consisted of 100 voxels, and 50 SNPs were generated. One voxel and three SNPs were randomly selected and a causal relation between that voxel, voxels in multi-collinearity with the selected voxel, and the selected SNPs, as shown in red, was induced. The dimensionality of the imaging data was increased stepwise from 100 to 90,000 voxels.

To discuss whether the failure of any method is related to the dimensionality of the imaging data set or to the degree of multi-collinearity between voxels, we created a second simulation series consisting of linearly independent imaging variables. For imaging genetics settings this might be irrelevant but could be applied to other areas of life science. We simulated two data sets as we did for multi-collinear imaging data, each consisting of 100 samples. For the imaging data set, 100,000 voxels were generated using a standard multivariate normal distribution. The genotype data set consisted of 50 SNPs.

To induce causal relations between simulated imaging and SNP data, we randomly selected 5 SNPs and 300 voxels. This time we simulated two independent causal patterns, as for linearly independent data this might be more realistic. For the first causal pattern, the first 2 SNPs were averaged and associated with the first 100 voxels such that the pairwise correlation between the voxels and the selected SNPs was on average 0.3. Similarly, the remaining 3 SNPs and 200 voxels were linked to each other to account for the second causal pattern. The dimensionality of the imaging data was again increased stepwise from 1000 to 10,000, 20,000, 30,000, 40,000, 50,000 and finally 100,000 variables.

2.8. Experimental data

In this study we aimed at replicating a study previously described by Ousdal et al. (2012) in order to verify the applicability of sparse CCA, Bayesian IBFA and PLSC for experimental imaging genetics data. The original study was performed to test the hypothesis that monoamines such as dopamine, serotonin and norepinephrine are important modulators of amygdala activity (LeDoux, 2007). Therefore, the authors combined genome-wide microarray SNP and functional imaging data during a face-matching task, a common and validated paradigm to measure amygdala activity (Carre et al., 2010; Hariri et al., 2002). As the amygdala was selected as region of interest, the original study is a candidate phenotype study, and the authors used mass-univariate linear modeling to detect relevant SNPs that may explain variability within amygdala activity. In contrast, for the application of sparse CCA, Bayesian IBFA and PLSC, we included whole brain measures, because our goal was not only to replicate the results of the original study, but also to detect further brain regions involved during the face-matching task.

The original study was designed as follows. Participants were 224 individuals (109 women), either healthy controls or patients with a diagnosis of schizophrenia spectrum disorder, bipolar disorder or psychosis not otherwise specified. Patients were recruited from the psychiatric unit of Oslo University Hospital and diagnosed via the Structured Clinical Interview for DSM-IV Axis I disorders (SCID-I). Healthy control subjects were randomly selected from the Norwegian citizen registration. MRI scans were acquired on a 1.5 T Siemens Magnetom Sonata scanner. During the face-matching task, participants were asked which of two stimuli, human faces expressing anger or fear, matches a target stimulus. For sensorimotor control, stimuli were geometrical shapes. Genotyping was done using the Affymetrix Genome-Wide Human SNP Array 6.0. For quality control, SNPs with call rates below 97% and minor allele frequency less than 5% were removed. In addition, individuals with call rates below 97% or outlying levels of heterozygosity (greater than three standard deviations from the mean) were eliminated. After this, information on 224 individuals and 546,381 SNPs was available. For more detailed information about participants, experimental task, BOLD fMRI data acquisition or genotyping, see Ousdal et al. (2012).

The original analysis was performed using SPM2 following standard preprocessing pipelines and standard first level analysis. For second level analysis, the amygdala was extracted using the automatic anatomical labels (aal) amygdala mask in the WFU PickAtlas toolbox provided in SPM. Statistical analysis was performed controlling only for diagnosis type (schizophrenia, bipolar disorder or other psychosis), as gender and age were not significantly different across subjects. At first, in order to detect the peak voxel with the highest differential activation across individuals for both hemispheres, a t-test was applied to individual contrast values (contrast "faces minus shapes") of every amygdala voxel. Subsequently, contrast values for the right and left amygdala peak voxels were tested for association with each SNP separately using an additive model of genetic effect. Bonferroni correction was performed to correct for multiple testing over SNPs and phenotypes. The top candidate SNP was then analyzed with a random-effects two-sample t-test SPM analysis to explore differences in amygdala BOLD response of CC-homozygous individuals relative to T-allele carriers (CT and TT). Small volume correction based on anatomically defined bilateral amygdala and false discovery rate (FDR) corrected p-values were used to correct for multiple testing. For more detailed information about the statistical analysis we refer to the original publication (Ousdal et al., 2012).

In order to verify the applicability of sparse CCA, Bayesian IBFA and PLSC for experimental imaging genetics data and to make use of the advantages of multivariate methods, we included whole brain measures instead of using a region of interest approach. For our analysis, we were supplied by the authors with updated fMRI data of the original paper using FSL software for preprocessing. Genotype and demographic information were provided for all 224 participants and five SNPs, namely rs10014254 (C/T), rs11722038 (A/G), rs17529323 (A/C), rs382013 (A/G) and rs437633 (A/G). SNPs were recoded using an additive genetic model, 0 for major allele homozygous individuals, 1 for heterozygous individuals and 2 for minor allele homozygous individuals. After removing missing data for all SNPs, we were left with 208 subjects. Prior to the statistical analysis, we corrected for diagnosis type (schizophrenia, bipolar disorder or other psychosis) as in the original publication.
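The additive recoding amounts to counting minor alleles per genotype. The tiny helper below is purely illustrative (its name is ours, and the minor-allele assignment in the example is assumed for illustration, not taken from the genotype data):

```python
def additive_code(genotype, minor):
    """Additive model: 0 major homozygous, 1 heterozygous, 2 minor homozygous."""
    return sum(allele == minor for allele in genotype)

# e.g. for a C/T SNP, assuming T is the minor allele
assert additive_code("CC", "T") == 0
assert additive_code("CT", "T") == 1
assert additive_code("TT", "T") == 2
```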

3. Results

3.1. Simulated data

3.1.1. Multi-collinear imaging data and candidate SNPs

For sparse CCA we computed ten components at each dimensionality, respectively, and applied the model selection procedure CHull as well as the predictive squared correlation coefficient to detect the components carrying causal information. In contrast, Bayesian IBFA automatically determines how many components to detect by implementing sparsity through groupwise application of an automatic relevance determination (ARD) prior that makes unnecessary components inactive for each of the two data sets separately. For PLSC we computed as many components as necessary to explain at least 80% of variance. As some of the Bayesian IBFA and PLSC components we computed were not carrying causal information, we also applied CHull and the predictive squared correlation coefficient for model selection. The advantage of sparse CCA compared to PLSC is that penalties are applied to the canonical weights such that it is possible to differentiate between important and less important variables without much effort. Bayesian IBFA does not output the actual canonical weights but their posterior, such that we used the mean weights as well as the covariances of the weights to assess the credibility of variables and to exclude less important voxels and SNPs from the model. PLSC, however, does not involve the computation of any inverse matrices and therefore works without any further preprocessing. The SVD only needs to be computed once, so compared to sparse and Bayesian CCA, which are iterative strategies, PLSC is quite fast, especially for higher dimensions.

A summary of the results on simulated multi-collinear imaging data is given in Table 2.

For sparse CCA and PLSC, causal SNPs and voxels are represented in the first component, which is associated with the highest canonical correlation or covariance, for all simulated dimensionalities of the imaging data set. This result is confirmed by both the cumulative out-of-sample correlation coefficient, which is used as measure of fit for model selection with CHull, and the predictive squared correlation coefficient. In terms of PLSC, Table 2 further illustrates that the causal component may be identified by the percentage of variance it explains, as the first component already explains 30% to 60% of overall variance for high and low dimensions, respectively. In contrast, the number of components learned by Bayesian IBFA is high even if the number of voxels equals sample size, and the causal pattern is usually represented in one of the backmost components.

To identify causal components, we used CHull with the cumulative out-of-sample correlation coefficient (out-of-sample covariance) (Fig. 2) as measure of fit as well as the predictive squared correlation coefficient (Fig. 3). For sparse CCA and PLSC, the cumulative out-of-sample correlation coefficient (cumulative out-of-sample covariance) steeply increases for the first component and levels out for higher component numbers for all simulated dimensionalities. Consequently, the model selection procedure CHull suggests considering only the first sparse CCA and PLSC component, with one exception in terms of sparse CCA for 50,000 considered voxels. In Fig. 2 this is highlighted by blue dots. The green dots mark the components truly carrying causal information. Note that in Table 2 and Fig. 2, the covariance and out-of-sample covariance of the causal PLSC component seem to increase with higher dimensionality, in contrast to the correlation and out-of-sample correlation of sparse CCA. However, the covariance (and the out-of-sample covariance) is an unbounded measure of association, which cannot be compared across data sets of increasing voxel numbers. The actual strength of the association of latent variables is comparable for PLSC and sparse CCA.

The performance of sparse CCA and PLSC was further evaluated using the predictive squared correlation coefficient. In Fig. 3, it decreases after the first component for all simulated dimensionalities, giving further evidence for considering only the first sparse CCA and PLSC component, which is indeed carrying the causal information.

In contrast to sparse CCA and PLSC, Fig. 2 shows that for Bayesian IBFA, we are not able to detect the causal component using CHull. As the number of components is high, the representation of the causal pattern varies strongly across folds. The predictive squared correlation coefficient Q2, however, is suitable for model selection in Bayesian IBFA. When 100 imaging phenotypes are considered, the predictive squared correlation coefficient steeply increases for the third component and reaches a plateau for further components, revealing that component three is the causal one. The same holds true when 10,000 and 40,000 voxels are considered. For 1000, 20,000 and 30,000 imaging phenotypes, Q2 already levels out after the first component, which truly represents causal voxels and SNPs. For 50,000 imaging phenotypes or more, it becomes hard to differentiate between causal and non-informative components using the predictive squared correlation coefficient.

An illustration of voxel and SNP weights of sparse CCA, Bayesian IBFA and PLSC for 1000 imaging phenotypes is provided in Figs. 4 and 5, respectively. The component selected by the predictive squared correlation coefficient (and CHull in terms of sparse CCA and PLSC) is shown. For the imaging data set, the voxels we defined as causal, which include the causal voxel itself, as shown in red, and voxels correlated to the causal voxel at a value of 0.8 or higher (yellow), get the highest weights. However, the weight difference between causal and non-causal voxels is small, due to the fact that many voxels are in multi-collinearity with the causal voxel and our threshold to define a causal voxel (pairwise correlation ≥ 0.8) is very strict. Nevertheless, using bootstrapping, we showed that all voxels we defined as causal are reliable. In contrast to the voxel weight profile, the causal SNPs, as shown in red, clearly contribute most to the SNP weight profile of the first component.

For sparse CCA and Bayesian IBFA, using permutation tests, we detected significant canonical correlations only for up to 50,000 voxels and only for 1000 voxels, respectively. The reason is that for canonical correlation analysis type models, canonical weights are computed as


Table 2
Simulation results for multi-collinear imaging data.
The dimensionality of the imaging data set was increased stepwise from 100 to 90,000 voxels. The number of SNPs was kept constant. We computed ten components for sparse CCA at each dimensionality, respectively. In contrast, for Bayesian IBFA the component number was automatically determined and for PLSC, we considered as many components as necessary to explain at least 80% of variance. For each dimensionality of the imaging data set, the p-value of permutation testing is illustrated. For PLSC, we also indicate the p-values pW1 and pW2 of latent variable reliability based on split-half resampling. Furthermore it is shown which component represents the causal pattern, the canonical correlation (CCA) or covariance (PLSC) of that component as well as the out-of-sample correlation (CCA) or out-of-sample covariance (PLSC), estimated on the test data by 10-fold CV. For PLSC, the percentage of variance the causal component explains is also given. Finally, it is displayed how many components are selected by CHull and the predictive squared correlation coefficient Q2 and whether it is possible to detect the causal pattern considering model selection procedures.

Dimensionality of MRI data | Method | No. of components | p-Value (pW1, pW2) | Component of causal pattern | Correlation (CCA)/covariance (PLSC) | Out-of-sample corr. (CCA)/cov. (PLSC) | Components selected by CHull | Components selected by Q2 | Detectability
100    | Sparse CCA    | 10 | 0.0***                | 1 | 0.6751            | 0.5223  | 1  | 1   | pperm/CHull/Q2
       | Bayesian IBFA | 11 | 0.034*                | 3 | 0.6346            | 0.2454  | 4  | 3   | pperm/Q2
       | PLSC          | 3  | 0.0238* (0.02, 0.0)   | 1 | 6.2615 (60.99%)   | 2.0734  | 1  | 1   | pperm/CHull/Q2
1000   | Sparse CCA    | 10 | 0.0004***             | 1 | 0.7090            | 0.5147  | 1  | 1   | pperm/CHull/Q2
       | Bayesian IBFA | 20 | 0.008**               | 1 | 0.6643            | 0.1280  | 17 | 1   | pperm/Q2
       | PLSC          | 4  | 0.0024** (0.02, 0.0)  | 1 | 18.2164 (49.99%)  | 7.2065  | 1  | 1   | pperm/CHull/Q2
10,000 | Sparse CCA    | 10 | 0.0176*               | 1 | 0.6098            | 0.2963  | 1  | 1   | pperm/CHull/Q2
       | Bayesian IBFA | 25 | 0.088                 | 3 | 0.6025            | 0.1921  | 7  | 3   | Q2
       | PLSC          | 6  | 0.0133* (0.01, 0.02)  | 1 | 44.1569 (36.68%)  | 12.1333 | 1  | 1   | pperm/CHull/Q2
20,000 | Sparse CCA    | 10 | 0.0072**              | 1 | 0.6748            | 0.4251  | 1  | 1   | pperm/CHull/Q2
       | Bayesian IBFA | 24 | 0.078                 | 1 | 0.6997            | 0.1839  | 1  | 1   | CHull/Q2
       | PLSC          | 6  | 0.0002*** (0.02, 0.0) | 1 | 73.5203 (42.16%)  | 37.2138 | 1  | 1   | pperm/CHull/Q2
30,000 | Sparse CCA    | 10 | 0.0144*               | 1 | 0.6846            | 0.3342  | 1  | 1   | pperm/CHull/Q2
       | Bayesian IBFA | 16 | 0.369                 | 1 | 0.6570            | 0.1475  | 2  | 1   | Q2
       | PLSC          | 7  | 0.0114* (0.05, 0.04)  | 1 | 74.1553 (32.84%)  | 30.3046 | 1  | 1   | pperm/CHull/Q2
40,000 | Sparse CCA    | 10 | 0.014*                | 1 | 0.5995            | 0.4142  | 1  | 1   | pperm/CHull/Q2
       | Bayesian IBFA | 19 | 0.3911                | 3 | 0.6532            | 0.0740  | 19 | 3   | Q2
       | PLSC          | 7  | 0.0116* (0.36, 0.06)  | 1 | 82.6020 (30.06%)  | 28.1449 | 1  | 1   | pperm/CHull/Q2
50,000 | Sparse CCA    | 10 | 0.0098**              | 1 | 0.6941            | 0.3830  | 8  | 1   | pperm/Q2
       | Bayesian IBFA | 61 | 0.746                 | 4 | 0.6824            | 0.0810  | 30 | 4   | Q2
       | PLSC          | 10 | 0.0006*** (0.02, 0.0) | 1 | 101.4630 (35.67%) | 80.2084 | 1  | 1   | pperm/CHull/Q2
70,000 | Sparse CCA    | 10 | 0.0576                | 1 | 0.6930            | 0.2501  | 1  | 1   | CHull/Q2
       | Bayesian IBFA | 64 | 0.353                 | 7 | 0.6797            | 0.0361  | 30 | (7) | x
       | PLSC          | 11 | 0.004** (0.08, 0.0)   | 1 | 111.8040 (31.81%) | 78.6747 | 1  | 1   | pperm/CHull/Q2
90,000 | Sparse CCA    | 10 | 0.0738                | 1 | 0.6843            | 0.3205  | 1  | 1   | CHull/Q2
       | Bayesian IBFA | 68 | 0.283                 | 4 | 0.7037            | 0.0177  | 1  | (4) | x
       | PLSC          | 11 | 0.0132* (0.02, 0.02)  | 1 | 123.7516 (30.04%) | 84.5184 | 1  | 1   | pperm/CHull/Q2

(*p < 0.05, **p < 0.01, ***p < 0.001).


Fig. 2. Comparison of sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) results on multi-collinear imaging data of different input dimensions using out-of-sample correlation or out-of-sample covariance. The number of variables in the imaging data set was increased from 100 to 1,000, 10,000, 20,000, 30,000, 40,000, 50,000, 70,000 and finally 90,000 voxels. The y-axis depicts the cumulative average out-of-sample correlation coefficient (sparse CCA and Bayesian IBFA) or the cumulative average out-of-sample covariance (PLSC) for all components, respectively. Blue dots are used to highlight how many components should be considered according to the model selection procedure CHull. For higher component numbers, the model fit does not considerably improve. The green dots mark the components truly carrying causal information. For sparse CCA and PLSC, the first component carries causal information for all simulated dimensionalities and, using CHull, we are always able to detect that component. For Bayesian IBFA, only 30 components are plotted as the out-of-sample correlation coefficient levels out for higher component numbers. In general, CHull fails to detect the causal component. For reasons of lucidity, the variance of the out-of-sample correlation coefficient (standard deviation of the out-of-sample covariance) across folds is not shown. It is small for low dimensions and increases with the number of considered voxels.


multiple regression coefficients, such that adding and deleting variables has a huge effect on the canonical correlation. In addition, if there are many more variables than observations, CCA is in general exposed to overfitting, such that noise is described instead of underlying relationships. Hence, as the number of variables increases, resampling has an increasing influence on the canonical correlation. In contrast to the CCA methods, PLSC is less prone to effects of resampling and, using permutation tests, we detected significant covariances for all simulated dimensionalities.

However, using permutation tests, the rate of false positives might be high when searching for associations between brain imaging and SNP measures. To assess the reliability of latent variables we additionally used split-half reliability testing (Kovacevic et al., 2013). This strategy is so far only available for PLSC. In split-half reliability testing, a latent variable is considered reliable when both pW1corr and pW2corr are smaller than 0.05. Accordingly, for PLSC, latent variables were reliable for all simulated voxel numbers other than 40,000 and 70,000 voxels.
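A split-half scheme of this kind can be sketched as follows (an illustrative Python sketch; the number of splits, the permutation-based null and the use of absolute correlations to handle SVD sign ambiguity are our assumptions, not details taken from Kovacevic et al., 2013):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 120, 40, 8
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
Y[:, 0] += X[:, 0]                     # planted association

def saliences(X, Y):
    # first pair of PLSC weight vectors from the SVD of the cross-product
    U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U[:, 0], Vt[0, :]

def split_half_corrs(X, Y, rng):
    # split subjects in two, estimate weights in each half, correlate them
    idx = rng.permutation(len(X))
    a, b = idx[: len(X) // 2], idx[len(X) // 2:]
    ua, va = saliences(X[a], Y[a])
    ub, vb = saliences(X[b], Y[b])
    # absolute correlation handles the SVD sign ambiguity
    return abs(np.corrcoef(ua, ub)[0, 1]), abs(np.corrcoef(va, vb)[0, 1])

obs_w1, obs_w2 = np.mean([split_half_corrs(X, Y, rng) for _ in range(50)], axis=0)

# null distribution: break the X-Y link by shuffling rows of Y
null = np.array([split_half_corrs(X, Y[rng.permutation(n)], rng)
                 for _ in range(100)])
p_w1 = np.mean(null[:, 0] >= obs_w1)
p_w2 = np.mean(null[:, 1] >= obs_w2)
print(p_w1, p_w2)
```

The two p-values play the role of pW1corr and pW2corr in the text: a latent variable whose weight vectors replicate across independent halves better than chance is considered reliable.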

To directly compare sparse CCA, Bayesian IBFA and PLSC results for multi-collinear imaging data, we computed the ratio of the out-of-sample correlation (out-of-sample covariance) estimated by 10-fold cross-validation to the overall correlation (covariance) of latent variables, which is illustrated in Fig. 6. Relative to the overall correlation, the out-of-sample correlation is highest for sparse CCA for up to 40,000 voxels. For more than 50,000 voxels, the ratio is higher for PLSC. For Bayesian IBFA the ratio decreases when 10,000 voxels or more are considered; for 40,000 imaging phenotypes or more it is below 10%. The prediction performance of Bayesian IBFA is thus very low and, using model selection tools, we are no longer able to detect causal components.
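In code, this predictive-power ratio reads roughly as follows (a minimal Python sketch on simulated toy data; the fold handling and the SVD-based PLSC step are assumptions for illustration, not the authors' pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 100, 50, 10          # subjects, voxels, SNPs (toy sizes)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
Y[:, 0] += 0.8 * X[:, 0]       # plant one causal association

def plsc_weights(X, Y):
    # first PLSC component: leading singular vectors of the cross-product
    U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U[:, 0], Vt[0, :]

# overall covariance of latent variables on the full sample
w_x, w_y = plsc_weights(X, Y)
overall_cov = np.cov(X @ w_x, Y @ w_y)[0, 1]

# 10-fold CV: estimate weights on training folds, covariance on test folds
folds = np.array_split(rng.permutation(n), 10)
oos = []
for test in folds:
    train = np.setdiff1d(np.arange(n), test)
    wx, wy = plsc_weights(X[train], Y[train])
    oos.append(np.cov(X[test] @ wx, Y[test] @ wy)[0, 1])
ratio = np.mean(oos) / overall_cov
print(f"out-of-sample / overall covariance ratio: {ratio:.2f}")
```

A ratio near 1 means the association generalizes to held-out subjects; a ratio near 0, as reported for Bayesian IBFA at high dimensionality, means the in-sample association is mostly overfitting.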

3.1.2. Multi-collinear imaging and high-dimensional SNP array data

As for multi-collinear imaging data with candidate SNP selection, we again computed ten components for sparse CCA. In contrast, Bayesian IBFA automatically determines how many components to detect, and for PLSC we computed as many components as necessary to explain at least 80% of variance. Model selection was performed using CHull and the predictive squared correlation coefficient. A summary of the results for high-dimensional SNP array data is given in Table 3.

Table 3 illustrates that PLSC is the only method that is able to detect causal variables when both imaging and SNP dimensionality is high. The

Please cite this article as: Grellmann, C., et al., Comparison of variants of canonical correlation analysis and partial least squares for combined analysis of MRI and genetic data, NeuroImage (2014), http://dx.doi.org/10.1016/j.neuroimage.2014.12.025

causal pattern is represented in the first component, which is associated with the highest covariance. This result is confirmed by both the cumulative out-of-sample covariance and the predictive squared correlation coefficient, as shown in Fig. 7. In contrast, using sparse CCA, model selection using both the cumulative out-of-sample correlation coefficient and the predictive squared correlation coefficient suggests that the first component is the causal one. However, knowing ground truth, neither the causal voxel nor the causal SNPs are highly weighted in that component (causal SNPs have medium weights and the causal voxel is zero-weighted). For Bayesian IBFA, there is a component representing causal voxels and SNPs. However, it is difficult to detect the causal pattern using model selection tools, as the number of components learned by the ARD prior is very high.

An illustration of voxel and SNP weights of PLSC is provided in Figs. 8 and 9. The component selected by the predictive squared correlation coefficient and CHull is shown. For visualization purposes, the voxel weight profile is zoomed in on a range of 100 voxels around the causal voxel. SNP weights are only displayed for causal SNPs. In general, causal voxels (including the causal voxel itself (red) and voxels correlated to the causal voxel at a value of 0.8 or higher (yellow)) and two out of the three causal SNPs (SNP 1584 and SNP 7074) get the highest weights. The third causal SNP (SNP 2410) is not detected by PLSC, which is reasonable since its association to the causal voxel is, by chance, weaker than for the other two causal SNPs. In addition to our causal SNPs, five other SNPs are given high weights by PLSC, namely SNP 885, SNP 1176, SNP 1216, SNP 6935 and SNP 6961. All of these SNPs are actually linked to the causal voxel, as revealed by means of Pearson correlation (rSNP885 = 0.3259, rSNP1176 = 0.3016, rSNP1216 = 0.2617, rSNP6935 = 0.3184 and rSNP6961 = −0.3465).

Using permutation tests, we only detected significant covariances for PLSC. Canonical correlations were not significant for either sparse CCA or Bayesian IBFA. To assess the reliability of latent variables of PLSC we used split-half reliability testing and considered latent variables as reliable when both pW1corr and pW2corr were smaller than 0.05. Accordingly, latent variables were nearly significant, with pW1corr = 0.08 and pW2corr = 0.0.

3.1.3. Linearly independent imaging data

As for the simulations considering multi-collinear imaging data, we computed ten components for sparse CCA and as many components



Fig. 3. Predictive squared correlation coefficients on different input dimensions for sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) results using multi-collinear imaging data. The y-axis depicts the predictive squared correlation coefficient when the number of voxels was stepwise increased from 100 to 90,000 imaging phenotypes.



Fig. 4. Voxel weight profiles of sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) for multi-collinear imaging data. The causal voxel is plotted in red, voxels in multi-collinearity with that causal voxel are shown in yellow.


as necessary to explain at least 80% of variance for PLSC, respectively. Model selection was performed using CHull and the predictive squared correlation coefficient. A summary of the results on simulated linearly independent imaging data is given in Table 4.

For both sparse CCA and PLSC, SNPs and imaging phenotypes of the second causal pattern, which is the stronger one, always contribute most to the first component, associated with the highest canonical correlation or covariance, respectively. However, the representation of the less strong causal pattern varies depending on the dimensionality of the imaging data set. When only 1,000 imaging phenotypes are considered, it is displayed, as expected, in the second component, related to the second highest canonical correlation or covariance. For sparse CCA, increasing the dimensionality of the input imaging data leads to a shift of the representation to the fifth component (10,000 variables) or seventh component (more than 20,000 variables). PLSC is more stable, as variables are weighted approximately equally for all considered dimensionalities and the representation of the less strong first causal pattern is shifted to the seventh component for all simulated dimensionalities of 10,000 voxels or more. However, in contrast to multi-collinear imaging data, many components are in general necessary to account for 80% of variance. Table 4 further shows that the number of components learned by Bayesian IBFA increases from two, as expected knowing ground truth, to eleven as the dimensionality of imaging phenotypes increases, and causal patterns are mostly represented in one of the backmost components.

To identify causal components, we again used CHull with the cumulative out-of-sample correlation coefficient (CCA) or cumulative out-of-

Fig. 5. SNP weight profiles of sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) for multi-collinear imaging data. Causal SNPs are illustrated in red.


sample covariance (PLSC) as measure of fit, and the predictive squared correlation coefficient Q2, as shown in Figs. 10 and 11, respectively. For sparse CCA and PLSC, we are able to detect both causal patterns only for 1,000 imaging phenotypes. For higher voxel numbers, both CHull and Q2 suggest considering only the first component, which carries the stronger causal pattern. This is related to the shift of the less strong causal pattern to backmost components. In contrast to sparse CCA and PLSC, for Bayesian IBFA, CHull is inappropriate for imaging data sets of 10,000 voxels or more, as the representation of causal relations varies strongly across folds. However, using the predictive squared correlation coefficient, we are able to identify both causal patterns for up to 30,000 voxels. For larger data sets, Bayesian IBFA weights are no longer significant (i.e. different from zero).

Figs. 12 and 13 provide an illustration of sparse CCA, Bayesian IBFA and PLSC weights for voxels and SNPs, respectively, on components selected by CHull (sparse CCA and PLSC) or by the predictive squared correlation coefficient (Bayesian IBFA), when 1,000 variables are considered. Variables of the stronger second causal pattern, as shown in green, are represented in the first component, and those of the less strong first causal pattern (red) contribute most to the second component. Using bootstrapping to assess the contribution of each variable weight to the latent variables, we showed that both causal voxels and causal SNPs are reliable. Compared to results on multi-collinear imaging data, it is also easier to differentiate between causal and non-causal variables by eye, because voxels are independent of each other.
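The bootstrap assessment of weight reliability can be sketched like this (illustrative Python on toy data; the subject-level resampling scheme and the |weight / bootstrap s.e.| criterion are common choices we assume here, not specifics from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 100, 30, 5
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
Y[:, 1] += 1.5 * X[:, 4]               # voxel 4 drives SNP 1 (toy ground truth)

def saliences(X, Y):
    # first pair of PLSC weight vectors from the SVD of the cross-product
    U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U[:, 0], Vt[0, :]

u0, v0 = saliences(X, Y)

# resample subjects with replacement and recompute the first-component weights
boot_u = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    ub, _ = saliences(X[idx], Y[idx])
    boot_u.append(np.sign(ub @ u0) * ub)   # align SVD sign across resamples
boot_u = np.array(boot_u)

# bootstrap ratio: weight divided by its bootstrap standard error;
# a large |ratio| marks a voxel whose weight is stable across resamples
ratio = u0 / boot_u.std(axis=0)
print(int(np.argmax(np.abs(ratio))))
```

The sign alignment step matters: singular vectors are only defined up to sign, so naively averaging bootstrap weights would cancel them out.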

Overall, for sparse CCA and Bayesian IBFA, using permutation tests, we detected significant canonical correlations only for up to 10,000



Fig. 6. Ratio of out-of-sample correlation (out-of-sample covariance) to overall correlation (covariance) of latent variables for sparse CCA, Bayesian IBFA and PLSC on multi-collinear imaging data. For PLSC (blue) the ratio of the out-of-sample covariance to the overall covariance of latent variables is shown. For sparse CCA (red) and Bayesian IBFA (green) it is the relation of the out-of-sample correlation to the overall correlation of latent variables.


(for the stronger second causal pattern) and only for 1,000 imaging phenotypes (for both causal patterns), respectively. In contrast, PLSC was significant for up to 50,000 imaging phenotypes (stronger second causal pattern only). To assess the reliability of latent variables of PLSC we used split-half reliability testing and considered latent variables as reliable when both pW1corr and pW2corr were smaller than 0.05. Accordingly, the less strong first causal pattern was not reliable for any simulated dimensionality. The stronger second causal pattern, however, was reliable for 1,000 simulated imaging phenotypes.

As for multi-collinear imaging data, we computed the ratio of the out-of-sample correlation (out-of-sample covariance) estimated by 10-fold cross-validation to the overall correlation (covariance) of latent variables to directly compare sparse CCA, Bayesian IBFA and PLSC. This is illustrated in Fig. 14. For the stronger second causal pattern the ratio is highest using sparse CCA for all simulated dimensionalities. However, for 40,000 voxels or more, it is below 50%, such that model selection by CHull fails to detect the causal component. For Bayesian IBFA the ratio can only be computed for up to 20,000 voxels, as voxel weights were no longer significant for higher dimensionalities. Applying PLSC, the ratio of out-of-sample covariance to overall covariance is poor even for low dimensionalities. This results from the high number of components that need to be considered to account for at least 80% of

Table 3
Simulation results for multi-collinear imaging data and high-dimensional SNP array data.
We created a data set consisting of 90,000 voxels and 10,000 SNPs. For sparse CCA, we considered ten components. In contrast, for Bayesian IBFA the component number was automatically determined, and for PLSC we considered as many components as necessary to explain at least 80% of variance. The table illustrates the p-value of permutation testing (the p-values pW1 and pW2 of latent variable reliability based on split-half resampling for PLSC), the canonical correlation (CCA) or covariance (PLSC) of the causal component as well as the out-of-sample correlation (CCA) or out-of-sample covariance (PLSC) for that component. For PLSC, the percentage of variance the causal component explains is also given. Finally, it is displayed how many components are selected by CHull and the predictive squared correlation coefficient Q2, and whether it is possible to detect the causal pattern considering model selection procedures.

Method | No. of components | p-Value (pW1, pW2) | Component of causal pattern | Correlation (CCA)/covariance (PLSC) | Out-of-sample corr. (CCA)/cov. (PLSC) | Components selected by CHull | Components selected by Q2 | Detectability
Sparse CCA | 10 | 0.3918 | (1) | (0.9416) | (0.1889) | 1 | 1 | x
Bayesian IBFA | 62 | 0.208 | 27 | 0.9923 | 0.0530 | 26 | 57 | x
PLSC | 18 | 0.0288* (0.08, 0.0) | 1 | 1365.6188 (20.54%) | 98.4432 | 1 | 1 | pperm/CHull/Q2

(*p < 0.05).

Please cite this article as: Grellmann, C., et al., Comparison of variants ofanalysis of MRI and genetic data, NeuroImage (2014), http://dx.doi.org/10

variance in the data set. As a consequence, the covariance of the first component, which carries the second causal pattern, is low, just as the out-of-sample covariance.

Using sparse CCA for the less strong first causal pattern, the ratio of the out-of-sample correlation to the overall correlation is below the 50% mark for 10,000 simulated voxels or more, such that, using model selection tools, we are no longer able to detect that pattern. In contrast to the stronger second causal pattern, Bayesian IBFA outperforms the other strategies with regard to the first causal pattern. The ratio of the out-of-sample correlation to the overall correlation is highest for Bayesian IBFA for up to 20,000 voxels, and we are hence able to detect both the stronger second and the less strong first causal pattern. As for the second causal pattern, the ratio of out-of-sample covariance to overall covariance of PLSC is poor even for low dimensionalities considering the first causal pattern.

3.2. Experimental data

In the original publication by Ousdal et al. (2012), individual contrast values for the right and left amygdala peak voxel were tested for association with each SNP separately. The authors reported a significant association between activation of the amygdala peak voxel in the left hemisphere and three SNPs in high linkage disequilibrium (LD), namely rs10014254, rs11722038 and rs17529323. The relation was most significant for rs10014254, with p = 4.16 × 10−8 and p = 0.045 after adjustment for multiple testing across phenotypes and all SNPs using Bonferroni correction. This SNP is located in a regulatory region upstream of the paired-like homeobox 2b (PHOX2B) gene. A significant interaction between SNP and diagnosis type was not reported (p = 0.28).

For multivariate analysis of the data set, we only applied PLSC and sparse CCA as, using simulated imaging and SNP data, we showed that Bayesian IBFA failed at reflecting meaningful associations for voxel numbers above 500 times sample size (see Section 3.1.1). Prior to the statistical analysis, we corrected for diagnosis type (schizophrenia, bipolar disorder or other psychosis) as in the original publication. As only five SNPs were included, for sparse CCA analysis we computed five components and applied the out-of-sample correlation coefficient and the predictive squared correlation coefficient for model selection. Both gave evidence that only the first sparse CCA component should be considered, with a canonical correlation of 0.5565 and a p-value of 0.0721 obtained by permutation testing. Although the canonical correlation was non-significant, it exhibited a stable prediction performance, as the ratio of the out-of-sample correlation, estimated on the test data by 10-fold cross-validation, to the overall correlation of latent variables accounted for approximately 0.35. For PLSC, we considered only SNP and voxel weights of the first component, as the first component already explained 72.76% of variance. The overall covariance of latent variables of 65.8063 was significant by permutation testing (p = 0.0276*) and the predictive power was high (ratio of the out-of-sample covariance to overall covariance of 0.62). An illustration of the SNP weight profiles of



Fig. 7. Model selection using out-of-sample covariance (left) and the predictive squared correlation coefficient (right) for multi-collinear imaging data and high-dimensional SNP array data. For the left figure, the y-axis depicts the cumulative average out-of-sample covariance. Dashed lines indicate the standard deviation of the out-of-sample covariance. The blue dot is used to highlight how many components should be considered according to the model selection procedure CHull. The green dot marks the component truly carrying causal information. For the right figure, the y-axis depicts the predictive squared correlation coefficient. Both model selection procedures indicate that the first component is carrying causal information.


the first sparse CCA and PLSC component is provided in Fig. 15. Using bootstrapping, we showed that exactly the same three SNPs in high LD, namely rs10014254, rs11722038 and rs17529323, were reliable. In contrast to Ousdal et al. (2012), they were contributing equally to the canonical correlation or covariance, respectively. The two SNPs rs382013 and rs437633 were zero-weighted (sparse CCA) or only had a marginal positive and negative weight (PLSC), respectively.

In the original publication, SPM random-effects two-sample t-test analysis further revealed a significantly increased activation in right (x = 16, y = −8, z = −16, p(SVC) < 0.05) and left (x = −26, y = −4, z = −14, p(SVC) < 0.05) amygdala for T-allele carriers (CT or TT) of


Fig. 8. Voxel weight profile of PLSC for multi-collinear imaging data and high-dimensional SNP array data. The causal voxel is plotted in red, voxels in multi-collinearity with that causal voxel are shown in yellow. For visualization purposes, the voxel weight profile is zoomed in on a range of 100 voxels around the causal voxel.


rs10014254 compared to homozygous CC-individuals. For multivariate analysis using sparse CCA and PLSC, all voxels found to be reliable using bootstrapping are positively weighted. As SNPs were recoded using an additive genetic model, they reflect brain regions that, during the face-matching task, are directly associated with the SNPs rs10014254, rs11722038 and rs17529323, in line with Ousdal et al. (2012). An illustration of the voxel weight profile is given in Fig. 16, showing that we are able to replicate the findings published by Ousdal et al. (2012). Three SNPs in LD, rs10014254, rs11722038 and rs17529323, are significantly associated with amygdala activity (x = −22, y = −4, z = −12; x = 18, y = −4, z = −12) in both hemispheres. However, as we considered the whole brain for multivariate analysis instead of selecting the amygdala as region of interest, we also found some other brain regions to be associated with the three SNPs. They include cerebellum (x = −28, y = −54, z = −20; x = 22, y = −54, z = −16), left hippocampus (x = −32, y = −10, z = −14), left lingual gyrus (x = −20, y = −46, z = −4), right putamen (x = 28, y = 4, z = −2) and left lateral occipital cortex (x = −30, y = −66, z = 28).

After multivariate analysis using sparse CCA and PLSC, the latent variables of both the imaging and SNP data set were searched for group differences according to diagnosis type. However, no differences were found, which is in line with the findings of Ousdal et al. (2012).

4. Discussion

4.1. Simulated data

4.1.1. The impact of the dimensionality of the data sets

Using simulated imaging and SNP data, we aimed at identifying how

many non-informative variables might be included in addition to causal variants to still be able to discover meaningful associations between imaging and genetic data. When we used mean and covariance parameters of real MRI data to model the imaging data set, Bayesian IBFA gave meaningful results until voxel numbers exceeded 500 times sample size, whereas sparse CCA and PLSC were able to identify causal relations for all simulated dimensionalities. To assess whether the failure of Bayesian IBFA for high voxel numbers was related to the dimensionality or to the degree of multi-collinearity between simulated voxels, we created an imaging data set using the standard normal distribution and again stepwise increased the number of considered voxels. Bayesian IBFA


was applicable for voxel numbers up to 300 times sample size. For larger data sets, Bayesian IBFA weights were no longer significant (i.e. different from zero). Sparse CCA and PLSC gave meaningful results for higher dimensionalities. However, when the number of voxels exceeded 300 times sample size, it became difficult to identify causal patterns using model selection tools.

The comparison of the results on linearly independent and multi-collinear imaging data reveals that all three methods performed better when simulated voxels were highly collinear. This is caused by the number of causal voxels itself increasing as dimensionality rises. For linearly independent imaging variables, the number of causal voxels was chosen to be 300 for all simulated dimensionalities. When we increased the dimensionality of the imaging data set, we increased the number of non-informative phenotypes such that there were many more non-informative than causal variables. In contrast, for multi-collinear imaging data, the number of causal voxels increased with dimensionality. This is due to the fact that the number of voxels in multi-collinearity with causal voxels, and hence the number of voxels associated with the causal SNPs, was increased. Consequently, it was easier for our multivariate methods to differentiate between causal and non-informative variables. As for all three methods high-dimensional

Fig. 9. SNP weight profile of PLSC for multi-collinear imaging data and high-dimensional SNP array data. Causal SNPs are illustrated in red. Other SNPs detected by PLSC are shown in yellow. SNPs are named according to their number in the simulated data set.

images caused problems when voxels were independent, but only Bayesian IBFA failed for high-dimensional collinear data sets, it can be carefully assumed that Bayesian IBFA is susceptible to the degree of multi-collinearity, not to the dimensionality of the imaging data set itself. However, to verify this conclusion, further simulations are necessary, focusing on linearly independent variables for which the information content is also enhanced with increasing dimensionality.

To further compare PLSC, sparse CCA and Bayesian IBFA with respect to their performance in more realistic imaging genetics settings, we extended our simulation experiment by also considering a large number of SNPs. We increased the dimensionality of the genotype data set to 10,000 SNPs, resembling the number of SNPs genotyped on a small human SNP array. PLSC was the only method that was able to detect both causal voxels and causal SNPs. This is in line with our results on multi-collinear imaging data when candidate SNPs were selected. However, for the higher SNP numbers of current whole-genome scans together with whole-brain imaging data, PLSC will be inefficient, since the cross-product matrix is decomposed using SVD, and a prior dimensionality reduction is highly recommended to accommodate the large numbers of variables.
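One standard way to make the SVD step tractable at this scale is to reduce the imaging block first, as suggested above (an illustrative Python sketch; PCA via SVD of the data matrix is our assumed reduction, and the sizes are toy values):

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, q = 60, 5000, 300        # few subjects, many voxels and SNPs
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))

# --- PCA step: with n << p, at most n components carry variance ---
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # cheap: O(n^2 p)
scores = U * s                                      # n x n PCA scores

# --- PLSC on the reduced representation ---
K = scores.T @ (Y - Y.mean(axis=0))                 # n x q instead of p x q
Upls, spls, Vpls_t = np.linalg.svd(K, full_matrices=False)

# map the first component's weights back to voxel space
voxel_weights = Vt.T @ Upls[:, 0]
snp_weights = Vpls_t[0, :]
print(voxel_weights.shape, snp_weights.shape)       # (5000,) (300,)
```

Because the PCA retains all of the (at most n) non-trivial directions of the imaging block, rotating the PLSC weights back through the PCA loadings loses no information while shrinking the decomposed matrix from p × q to n × q.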


Table 4
Simulation results for linearly independent imaging data.
The dimensionality of the imaging data set was increased stepwise from 1,000 to 100,000 phenotypes. The number of SNPs was kept constant. For sparse CCA, we computed ten components at each dimensionality, respectively. In contrast, Bayesian IBFA automatically learns how many components to select. For PLSC, we considered as many components as necessary to explain at least 80% of variance. Illustrated is the p-value of permutation testing for each dimensionality (the p-values pW1 and pW2 of latent variable reliability based on split-half resampling for PLSC), and it is shown which components represent the two causal patterns, the canonical correlation (CCA) or covariance (PLSC) of those components as well as the out-of-sample correlation (CCA) or out-of-sample covariance (PLSC), estimated on the test data by 10-fold CV. For PLSC, the percent of variance these components explain is given in addition. Furthermore, it is displayed how many components are selected by CHull and the predictive squared correlation coefficient Q2. In the last column, it is shown whether it is possible to detect the two causal patterns considering those model selection procedures. In terms of Bayesian IBFA, canonical weights were no longer significant for 40,000 voxels or more and corresponding columns in the result table are empty.

Dimensionality of MRI data | Method | No. of components | p-Value (pW1, pW2) | Causal pattern | Component | Correlation (CCA)/covariance (PLSC) | Out-of-sample corr. (CCA)/cov. (PLSC) | Selected by CHull | Selected by Q2 | Detectability
1,000 | Sparse CCA | 10 | 0.0*** | 1 | 2 | 0.9483 | 0.9341 | 2 | 1 | pperm/CHull
 | | | 0.0*** | 2 | 1 | 0.9818 | 0.9687 | | | pperm/CHull/Q2
1,000 | Bayesian IBFA | 3 | 0.02* | 1 | 2 | 0.8975 | 0.6585 | 2 | 1,2 | pperm/CHull/Q2
 | | | 0.015* | 2 | 1 | 0.9108 | 0.8043 | | | pperm/CHull/Q2
1,000 | PLSC | 21 | 0.0*** (0.0, 0.19) | 1 | 2 | 7.1408 (8.38%) | 2.0454 | 2 | 1 | pperm/CHull
 | | | 0.0*** (0.0, 0.0) | 2 | 1 | 11.7506 (22.68%) | 2.9912 | | | pperm/CHull/Q2
10,000 | Sparse CCA | 10 | 0.053 | 1 | 5 | 0.9725 | 0.3012 | 1 | 1 | x
 | | | 0.015* | 2 | 1 | 0.9858 | 0.9501 | | | pperm/CHull/Q2
10,000 | Bayesian IBFA | 8 | 0.097 | 1 | 6 | 0.8751 | 0.6895 | 7 | 5,6 | Q2
 | | | 0.0857 | 2 | 5 | 0.8992 | 0.6516 | | | Q2
10,000 | PLSC | 25 | 0.02* (0.07, 0.73) | 1 | 7 | 14.2899 (3.96%) | 0.1195 | 1 | 1 | pperm
 | | | 0.0*** (0.0, 0.99) | 2 | 1 | 18.2599 (6.46%) | 1.1887 | | | pperm/CHull/Q2
20,000 | Sparse CCA | 10 | 0.8842 | 1 | 7 | 0.9774 | 0.0272 | 1 | 1 | x
 | | | 0.8422 | 2 | 1 | 0.9875 | 0.9070 | | | CHull/Q2
20,000 | Bayesian IBFA | 10 | 0.5211 | 1 | 9 | 0.8994 | 0.1752 | 9 | 8,9 | CHull/Q2
 | | | 0.3132 | 2 | 8 | 0.8996 | 0.3130 | | | CHull/Q2
20,000 | PLSC | 25 | 0.08 (0.12, 0.71) | 1 | 7 | 19.9202 (3.89%) | 0.2040 | 1 | 1 | x
 | | | 0.0*** (0.0, 0.97) | 2 | 1 | 24.0979 (5.69%) | 0.7011 | | | pperm/CHull/Q2
30,000 | Sparse CCA | 10 | 1.0 | 1 | 7/9 | 0.9844/0.9865 | 0.3302/0.2615 | 1 | 1 | x
 | | | 0.9856 | 2 | 1 | 0.9883 | 0.7600 | | | CHull/Q2
30,000 | Bayesian IBFA | 11 | 0.5981 | 1 | 9 | 0.9076 | 0.2257 | 10 | 9 | Q2
 | | | 0.5498 | 2 | – | – | – | | | x
30,000 | PLSC | 25 | 0.17 (0.13, 0.31) | 1 | 7 | 24.3343 (3.88%) | 0.0765 | 1 | 1 | x
 | | | 0.0*** (0.0, 1.0) | 2 | 1 | 28.8167 (5.44%) | 0.5180 | | | pperm/CHull/Q2
40,000 | Sparse CCA | 10 | 1.0 | 1 | 7 | 0.9982 | 0.2475 | 8 | 1 | x
 | | | 1.0 | 2 | 1 | 0.9937 | 0.2770 | | | Q2
40,000 | Bayesian IBFA | 9 | 0.6224 | 1 | 6 | – | – | | | x
 | | | 0.2126 | 2 | – | – | – | | | x
40,000 | PLSC | 25 | 0.14 (0.05, 0.53) | 1 | 7 | 28.0766 (3.88%) | 0.0865 | 2 | 1 | x
 | | | 0.0*** (0.0, 0.66) | 2 | 1 | 32.9296 (5.34%) | 0.2415 | | | pperm/Q2
50,000 | Sparse CCA | 10 | 1.0 | 1 | 7 | 0.9876 | 0.3094 | 8 | 1 | x
 | | | 1.0 | 2 | 1 | 0.9629 | 0.2345 | | | Q2
50,000 | Bayesian IBFA | 9 | 0.8411 | 1 | 6 | – | – | | | x
 | | | 0.4082 | 2 | – | – | – | | | x
50,000 | PLSC | 25 | 0.05 (0.04, 0.28) | 1 | 7 | 31.4143 (3.89%) | 0.0334 | 19 | 1 | x
 | | | 0.0*** (0.0, 0.78) | 2 | 1 | 36.6167 (5.28%) | 0.0843 | | | pperm/Q2
100,000 | Sparse CCA | 10 | 1.0 | 1 | 7 | 0.9935 | 0.3587 | – | 1 | x
 | | | 1.0 | 2 | – | – | – | | | x
100,000 | Bayesian IBFA | 2 | 0.8812 | 1 | – | – | – | | | x
 | | | 0.5512 | 2 | – | – | – | | | x
100,000 | PLSC | 25 | 0.64 (0.77, 0.32) | 1 | 7 | 44.1608 (3.86%) | 0.0280 | 4 | 1 | x
 | | | 0.12 (0.04, 0.17) | 2 | 1 | 51.2970 (5.20%) | 0.0473 | | | Q2

(*p < 0.05, **p < 0.01, ***p < 0.001).

C. Grellmann et al. / NeuroImage xxx (2014) xxx–xxx

Please cite this article as: Grellmann, C., et al., Comparison of variants of canonical correlation analysis and partial least squares for combined analysis of MRI and genetic data, NeuroImage (2014), http://dx.doi.org/10.1016/j.neuroimage.2014.12.025

Fig. 10. Comparison of sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) results on linearly independent imaging data of different input dimensions using out-of-sample correlation or out-of-sample covariance. The number of variables in the imaging data set was increased from 1,000 to 10,000, 20,000, 30,000, 40,000, 50,000 and finally 100,000 voxels. The y-axis depicts the cumulative average out-of-sample correlation coefficient (sparse CCA and Bayesian IBFA) or the cumulative average out-of-sample covariance (PLSC) for all components, respectively. The dashed lines show the variance of the out-of-sample correlation coefficients (standard deviation of the out-of-sample covariances). Blue dots highlight how many components should be considered according to the model selection procedure CHull. The green dots mark the components truly carrying causal information. For 1,000 imaging phenotypes we are able to detect both causal patterns according to CHull in all three methods. For sparse CCA and PLSC, increasing the dimensionality of the imaging data set up to 30,000 voxels results in choosing only the first component, which represents the stronger causal pattern. Using Bayesian IBFA, we are in general not able to detect causal components using CHull for more than 1,000 simulated voxels.


4.1.2. Significance of associations

Further, we wanted to discuss whether the causal relations detected by the three methods were also statistically significant. To address this question, we used permutation tests.
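As an illustration of the idea (a minimal numpy sketch of our own, not the authors' exact procedure): subject rows of one block are shuffled to break the genotype–phenotype pairing while leaving each block's internal covariance structure intact, and the first singular value of the cross-product matrix serves as the test statistic.

```python
import numpy as np

def perm_test_first_sv(X1, X2, n_perm=200, seed=0):
    """Permutation p-value for the first singular value of X1'X2.

    X1 (n x p) and X2 (n x q) are subject-by-variable matrices; only the
    rows of X2 are permuted, which destroys the subject-wise pairing but
    preserves the within-set covariance structure of each block.
    """
    rng = np.random.default_rng(seed)
    X1c = X1 - X1.mean(axis=0)
    X2c = X2 - X2.mean(axis=0)
    observed = np.linalg.svd(X1c.T @ X2c, compute_uv=False)[0]
    exceed = 0
    for _ in range(n_perm):
        Xp = X2c[rng.permutation(len(X2c))]
        if np.linalg.svd(X1c.T @ Xp, compute_uv=False)[0] >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)  # add-one to avoid p = 0

# toy example: a shared latent factor z drives both blocks
rng = np.random.default_rng(1)
z = rng.standard_normal(40)
X1 = np.outer(z, np.ones(30)) + rng.standard_normal((40, 30))
X2 = np.outer(z, np.arange(1, 6)) + rng.standard_normal((40, 5))
p = perm_test_first_sv(X1, X2)
```

With a strong shared factor as in the toy data, the observed singular value exceeds essentially all permuted ones and the p-value approaches its minimum of 1/(n_perm + 1).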

For multi-collinear imaging data, the permutation test yielded significant correlations for sparse CCA and Bayesian IBFA for up to 50,000 or 1,000 voxels, respectively. PLSC was significant for all simulated dimensionalities. This is due to the fact that PLSC is less prone to effects of resampling (see Section 2.4).

For linearly independent imaging data, sparse CCA and Bayesian IBFA yielded significant correlations for up to 10,000 and 1,000 phenotypes, respectively. PLSC was significant for up to 50,000 variables. Therefore, compared to results on multi-collinear imaging data, the three methods performed worse. As discussed in Section 4.1.1, for multi-collinear imaging data the number of causal voxels increased with the dimensionality of the imaging data set. Causal relations were therefore more stable and we detected significant canonical correlations and covariances, respectively, for higher-dimensional data sets.

When both the imaging and the SNP data set were high-dimensional, we only detected significant covariances for PLSC. This is in line with our results on multi-collinear imaging data when candidate SNPs were selected.

For neuroimaging genetics studies, it has been shown that the strength of the correlation between imaging variables, even though not necessarily related to the SNPs, can overpower the permutation tests and identify false positive associations (Kovacevic et al., 2013). We therefore also performed split-half reliability testing to assess the reliability of PLSC latent variables. For multi-collinear imaging data, latent variables were reliable for all simulated voxel numbers other than 40,000 and 70,000 voxels, revealing the strength of PLSC in the context of genotype–phenotype association studies. In contrast, for linearly independent imaging data, latent variables were in general not reliable, with the exception of 1,000 simulated imaging phenotypes. This finding supports that PLSC is appropriate, in particular, when the data set comprises highly collinear variables. When applying PLSC, however, split-half reliability testing should always be considered in addition to traditional permutation testing to assure that identified brain–SNP associations are truly reliable.
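A minimal sketch of such a split-half reliability check (our illustration, not the exact procedure of Kovacevic et al.): the sample is repeatedly split in half, the first PLSC salience is re-estimated on each half, and the agreement of the two estimates is averaged.

```python
import numpy as np

def split_half_reliability(X1, X2, n_splits=50, seed=0):
    """Average split-half agreement of the first PLSC SNP salience.

    Subjects are repeatedly split into two halves, the SVD of the
    cross-product matrix is computed on each half, and the absolute
    correlation of the two first right singular vectors is averaged
    (absolute value, because the sign of singular vectors is arbitrary).
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[0]

    def first_snp_salience(rows):
        A = X1[rows] - X1[rows].mean(axis=0)
        B = X2[rows] - X2[rows].mean(axis=0)
        return np.linalg.svd(A.T @ B, full_matrices=False)[2][0]

    cors = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        va = first_snp_salience(idx[: n // 2])
        vb = first_snp_salience(idx[n // 2:])
        cors.append(abs(np.corrcoef(va, vb)[0, 1]))
    return float(np.mean(cors))

# toy data with a shared latent factor -> the salience should be reliable
rng = np.random.default_rng(1)
z = rng.standard_normal(40)
X1 = np.outer(z, np.ones(30)) + rng.standard_normal((40, 30))
X2 = np.outer(z, np.arange(1, 11)) + rng.standard_normal((40, 10))
r = split_half_reliability(X1, X2)
```

When a genuine brain–SNP association drives both halves, the two independently estimated saliences agree closely; for pure noise the average agreement drops toward chance level.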


4.1.3. Number of components to consider

In earlier publications, authors have mainly focused on the first (e.g. Witten et al., 2009; Witten and Tibshirani, 2009) or the first two pairs (e.g. Le Floch et al., 2012; Lin et al., 2013) of canonical correlations or covariances, respectively. Our goal was to find out whether this is sufficient.

For multi-collinear imaging data, when we considered only one causal relation of affected voxels and SNPs, both PLSC and sparse CCA displayed this causal pattern in the first component, associated with the highest canonical correlation or covariance, for all simulated dimensionalities. Using Bayesian IBFA, the representation varied considerably.

When imaging data was modeled using a standard normal distribution incorporating two causal patterns, the stronger causal pattern was always represented in the first component, associated with the highest canonical correlation or covariance. For low-dimensional data sets of 1,000 voxels, the less strong causal pattern was displayed in the component of the second highest canonical correlation or covariance by all three methods. However, when more non-informative voxels were added to the imaging data set, the representation of the less strong causal pattern was shifted to any of the backmost components.

We hence showed that when methods like sparse CCA and PLSC are used, for which the component number is not automatically learned by the model, one should always consider more components than expected, as causal relations might be represented in any of the backmost components. It is of course possible that we detect only one causal relation, such that it is sufficient to consider only the component of the highest canonical correlation or covariance, respectively, in particular when highly collinear whole-brain imaging data is considered together with candidate SNPs. However, other components may also be informative and, as a consequence, there is a need for appropriate model selection tools, which are discussed later (see Section 4.1.4).

4.1.4. Approaches to evaluate the performance

Using simulated data, we further wanted to address which methods might be suitable for model selection. We considered the predictive squared correlation coefficient and the model selection procedure CHull with the out-of-sample correlation coefficient or out-of-sample covariance as goodness-of-fit measure. For PLSC, we examined as many components as necessary to explain at least 80% of variance.
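The out-of-sample goodness-of-fit that feeds such a model selection procedure can be sketched as follows (a simplified illustration of our own, restricted to the first PLSC component and using correlation rather than covariance for readability; fold handling and averaging details are our assumptions):

```python
import numpy as np

def oos_correlation_first_component(X1, X2, n_folds=10, seed=0):
    """Average out-of-sample correlation of the first PLSC latent pair.

    Saliences are estimated on the training subjects of each fold; the
    held-out subjects are projected onto them, and the correlation of
    the two held-out latent variables is averaged over folds.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X1.shape[0]), n_folds)
    cors = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        m1, m2 = X1[train].mean(axis=0), X2[train].mean(axis=0)
        U, _, Vt = np.linalg.svd(
            (X1[train] - m1).T @ (X2[train] - m2), full_matrices=False
        )
        # project held-out subjects onto the training saliences
        l1 = (X1[test] - m1) @ U[:, 0]
        l2 = (X2[test] - m2) @ Vt[0]
        cors.append(np.corrcoef(l1, l2)[0, 1])
    return float(np.mean(cors))

# strong shared signal -> high out-of-sample correlation
rng = np.random.default_rng(2)
z = rng.standard_normal(50)
X1 = np.outer(z, np.ones(30)) + rng.standard_normal((50, 30))
X2 = np.outer(z, np.arange(1, 11)) + rng.standard_normal((50, 10))
r_oos = oos_correlation_first_component(X1, X2)
```

A genuinely predictive component yields a high average held-out correlation, while an overfitted component collapses toward zero out of sample, which is exactly the contrast such goodness-of-fit measures exploit.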


Fig. 11. Predictive squared correlation coefficients on different input dimensions for sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) results using linearly independent imaging data. The y-axis depicts the predictive squared correlation coefficient when the number of voxels was stepwise increased from 1,000 to 100,000 imaging phenotypes.


Fig. 12. Voxel weight profiles of sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) for linearly independent imaging data. Voxels of the stronger second causal pattern (green) are represented in the first component and those of the less strong first causal pattern (red) in the second one.


When there was a high degree of multi-collinearity between simulated voxels, the amount of variance explained strongly reduced the number of components to consider and, for PLSC, the variance criterion is hence recommendable for a first data reduction. However, when imaging data was simulated using a standard normal distribution, the percentage of variance explained was not a sufficient model selection criterion.

According to model selection with the predictive squared correlation coefficient and CHull, we found that the applicability of those tools is very similar for PLSC and sparse CCA. Both methods are appropriate tools for model selection. CHull appears to be even more accurate, as for sparse CCA and PLSC the average of latent variables was computed to estimate Q2, which might be prone to outliers.

In contrast to sparse CCA and PLSC, we showed that the out-of-sample correlation coefficient, and thus CHull, is in general not suitable for detection of causal Bayesian IBFA components, due to the fact that the representation of causal relations might vary strongly across folds of the 10-fold cross validation. In contrast, the predictive squared correlation coefficient Q2 is recommendable for model selection of Bayesian IBFA.

As we showed, various model selection tools might evaluate the performance of different multivariate strategies differently. We therefore suggest always considering several tools for model selection. In our case, both CHull and the predictive squared correlation coefficient Q2 were demonstrated to be suitable approaches.

Fig. 13. SNP weight profiles of sparse CCA (left), Bayesian IBFA (middle) and PLSC (right) for linearly independent imaging data. SNPs of the stronger second causal pattern (green) are represented in the first component and those of the less strong first causal pattern (red) in the second one.


4.1.5. Which methods to choose?

In contrast to other studies focusing on whole-genome SNP data, the major goal of this study was to state under which circumstances it is advisable to use either sparse CCA, Bayesian IBFA or PLSC in the context of whole-brain imaging data, as the covariance structure between imaging variables is expected to be much stronger than the covariance structure between various SNPs. For direct comparison of our three methods we computed the ratio of the out-of-sample correlation (out-of-sample covariance) estimated by 10-fold cross validation to the overall correlation (covariance) of latent variables.

For highly collinear imaging variables, Bayesian IBFA cannot be recommended, since the number of components automatically learned by the ARD prior was always excessively high, such that additional post-processing steps were necessary to differentiate between causal and non-informative components. This was true for both candidate SNPs and high-dimensional SNP array simulations (Sections 3.1.1 and 3.1.2, respectively). In contrast, in Section 3.1.1, sparse CCA and PLSC were applicable for very high-dimensional imaging data sets and the resulting weights were much alike. Among those two, the predictive power was higher for sparse CCA when voxel numbers were below 400 times sample size, as shown in Fig. 6. Hence, sparse CCA seems an appropriate method for candidate phenotype, candidate SNP studies. PLSC is fast, and when voxel numbers were above 500 times sample size its predictive power exceeded that of sparse CCA. Thus, PLSC seems to be the most appropriate



Fig. 14. Ratio of out-of-sample correlation (out-of-sample covariance) to overall correlation (covariance) of latent variables for sparse CCA, Bayesian IBFA and PLSC on linearly independent imaging data. For PLSC (blue) the ratio of the out-of-sample covariance to the overall covariance of latent variables is shown. For sparse CCA (red) and Bayesian IBFA (green) it is the ratio of the out-of-sample correlation to the overall correlation of latent variables.


tool for multivariate analysis of imaging genetics data. It might seem conspicuous that, in Fig. 6, PLSC exhibited a higher predictive power when 50,000 voxels or more were considered. However, it cannot be expected that PLSC performs better the higher the dimensionality. As we worked on randomly generated data sets, the association of causal voxels and SNPs was simply stronger for these dimensionalities, which is in line with the p-values of permutation testing reported for 50,000 voxels or more (Table 2). It is therefore not possible to compare the prediction performance of each individual method across dimensionalities, but only to compare the performance of PLSC against the performance of sparse CCA or Bayesian IBFA for a given voxel number.

Fig. 15. SNP weight profile of the first sparse CCA (left) and PLSC (right) component on experimental data. Three SNPs in high linkage disequilibrium, rs10014254, rs11722038 and rs17529323, are reliable according to the bootstrapping approach, as shown in red.


To further compare PLSC, sparse CCA and Bayesian IBFA with respect to their performance in more realistic imaging genetics settings, we extended our simulation experiment by considering 10,000 SNPs. In line with our results on multi-collinear imaging data when candidate SNPs were selected, PLSC was the only method that was able to detect both causal voxels and causal SNPs.

When a standard normal distribution was used to simulate linearly independent imaging data, Bayesian IBFA outperformed both sparse CCA and PLSC, as it was the only method able to detect both the stronger and the less strong causal pattern. Consequently, when variables of the considered data set are expected to be linearly independent, Bayesian IBFA is recommendable as long as voxel numbers are below 300 times sample size. For higher numbers of variables, results should be interpreted with caution, as it became difficult to identify causal patterns using model selection tools for all considered strategies.

The reason why PLSC was the strongest method for collinear data is that for PLSC, the SVD is performed directly on the cross-product matrix X1′X2, which, in contrast to CCA, is not corrected for within-set covariances prior to the decomposition. As a consequence, there is no need for additional assumptions to account for the non-invertibility issue of CCA and, even more meaningful, we also do not need to assume that within-set covariance matrices are diagonal, as we do for both sparse CCA and Bayesian IBFA. PLSC results are therefore most probably more accurate. On the other hand, PLSC has two disadvantages compared to the CCA methods. Firstly, as the covariance of latent variables is maximized instead of the correlation, we cannot assess the strength of association of the latent variables, because covariances are not standardized. Furthermore, for PLSC, latent variables of different components are not mutually orthogonal as they are for sparse CCA and Bayesian IBFA, which deteriorates interpretability. However, for imaging genetics data, for which variables are assumed to be highly collinear, causal relations are often represented in the first component, and PLSC results may be interpreted comparably to CCA results.
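In code, this decomposition amounts to only a few lines. The following is a schematic numpy sketch (the block names X1 and X2 follow the text; the centering step and toy dimensions are our assumptions):

```python
import numpy as np

def plsc(X1, X2):
    """PLSC: SVD of the cross-product matrix of two column-centred blocks.

    Unlike CCA, the within-set covariances X1'X1 and X2'X2 are never
    inverted, so no regularization or diagonality assumption is needed.
    Returns voxel saliences U, singular values s, and SNP saliences V.
    """
    X1c = X1 - X1.mean(axis=0)
    X2c = X2 - X2.mean(axis=0)
    U, s, Vt = np.linalg.svd(X1c.T @ X2c, full_matrices=False)
    return U, s, Vt.T

rng = np.random.default_rng(0)
X1 = rng.standard_normal((50, 500))   # e.g. 50 subjects, 500 voxels
X2 = rng.standard_normal((50, 20))    # 20 SNPs
U, s, V = plsc(X1, X2)

# latent variables of the first component; their sample covariance
# equals s[0] / (n - 1), since l1'l2 = u' X1c' X2c v = s[0]
l1 = (X1 - X1.mean(0)) @ U[:, 0]
l2 = (X2 - X2.mean(0)) @ V[:, 0]
```

The design choice discussed in the text is visible here: because only the cross-product matrix is decomposed, the p-by-p within-set covariance of a whole-brain data set never has to be formed or inverted, which is what keeps PLSC tractable for very high voxel counts.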

4.2. Experimental data

We were successful in replicating the findings published by Ousdal et al. (2012). We thereby verified that sparse CCA and PLSC are applicable for high-dimensional analysis of experimental data sets and reveal reliable associations between brain imaging and genetic data. Compared to the general linear model applied by Ousdal et al. (2012),



Fig. 16. Voxel weight profile of the first sparse CCA (top) and PLSC (bottom) component on experimental data. The figures illustrate voxels that are found to be reliable using the bootstrapping approach. All voxel weights are positive. For PLSC, only voxels with the top 50% of weights are shown.


the methods are advantageous, as we included whole-brain measures instead of the amygdala peak voxels to search for influencing SNPs.

In line with Ousdal et al. (2012), we found that three SNPs in high LD, rs10014254, rs11722038 and rs17529323, were significantly associated with amygdala activity. These SNPs are located upstream of the paired-like homeobox 2b (PHOX2B) gene. As the PHOX2B gene is known to regulate the expression of enzymes necessary for the biosynthesis of monoamines, such as dopamine and norepinephrine (Brunet and Pattyn, 2002), Ousdal et al. (2012) were able to confirm their hypothesis that the monoaminergic signaling pathway plays a central role in the regulation of amygdala activity. In contrast to Ousdal et al. (2012), who reported that the most significant association was found with rs10014254, in our analysis all three SNPs contributed equally to the canonical correlation or covariance, respectively. However, as we used multivariate strategies, we had to exclude participants for which genotype values were missing, such that the three SNPs were in even higher LD. Furthermore, as reported by Ousdal et al. (2012), we also found that homozygous carriers of the minor allele (TT for rs10014254, GG for rs11722038 and CC for rs17529323), compared to heterozygous individuals and homozygous carriers of the major allele (CC for rs10014254 and AA for rs11722038 and rs17529323), showed increased amygdala activation, as all SNP and voxel weights exposed a direct relation.

As we considered whole-brain measures, we found not only the amygdala but also some other brain regions to be associated with the SNPs rs10014254, rs11722038 and rs17529323 during the emotional face-matching task, including the cerebellum, left hippocampus, left lingual gyrus and right putamen. All these brain regions have been shown to be increasingly activated during processing of emotional faces. An enhanced activation of putamen, cerebellum and amygdala was reported by Fusar-Poli et al. (2009) and Schraa-Tam et al. (2012) during processing of negative emotional faces. Hippocampus and amygdala (Benedetti et al., 2011), putamen (Surguladze et al., 2010) and lingual gyrus (Demenescu et al., 2013) were further shown to be significantly more activated in response to negative facial expressions in chronic schizophrenia patients, patients with a diagnosis of bipolar disorder and patients diagnosed with panic disorder, respectively, compared to healthy controls. In our analysis, however, increased activation of amygdala, hippocampus and putamen was not limited to patients with a diagnosis of either schizophrenia spectrum disorder or bipolar disorder, as analysis of latent variables did not reveal any significant differences between patients of the diagnostic groups, which was in line with the original publication.


We could not confirm from the literature so far that individual differences in activation of cerebellum, hippocampus, lingual gyrus and putamen might be explained by variation of PHOX2B SNPs like rs10014254, rs11722038 and rs17529323. However, we found evidence for the hippocampus to be associated with another gene influencing the monoaminergic signaling pathway, the gene that encodes the enzyme MAOA. MAOA degrades monoamines such as norepinephrine, dopamine and serotonin and plays a critical role in the regulation of their neurotransmission (Lee and Ham, 2008). In that study, Lee and Ham investigated the relationship between the MAOA-upstream variable number of tandem repeats (uVNTR) polymorphism and brain responses to negative facial stimuli using fMRI in healthy Korean women. They reported greater brain activity in the left amygdala in participants with the low-activity allele in the sad versus neutral condition. In the angry versus neutral condition, however, participants with the low-activity allele showed greater brain activity in the right hippocampus and right anterior cingulate cortex. As activation in the left amygdala has been shown to be associated with higher-level cognitive processes such as cognitive representations of fear and conscious emotional learning (Morris et al., 1998), and hippocampus activation has been associated with declarative memory processes (Knight et al., 2004), the authors argued that simultaneous activations of hippocampus and amygdala reflect an interaction in the associative processing of facial emotional stimuli in declarative memory. This might also be true for our data.

5. Conclusion

To conclude, in this study we examined three multivariate methods, sparse CCA, Bayesian IBFA and PLSC, with respect to their performance in genetic neuroimaging studies. We elaborated an application-oriented comparison of those methods, together with a clear statement on which method to choose depending on the properties of the data set, and with tools for performance evaluation and interpretation of results. Our results might therefore facilitate the selection of an appropriate method for researchers in the field of imaging genetics. In particular, we focused on the analysis of whole-brain imaging data, as the covariance structure between imaging variables is expected to be much stronger than the covariance structure between various SNPs, and other studies have already addressed the use of multivariate approaches for whole-genome SNP data. Brain imaging data is naturally highly collinear. However, as one might take advantage of multivariate



strategies for other data sets that are expected to be associated with genetic variation, such as cognitive or clinical diagnostic measures, we also considered the possibility that data sets are not assumed to comprise many dependencies.

For linearly independent data sets, Bayesian IBFA was shown to be recommendable as long as voxel numbers are below 300 times sample size. Bayesian IBFA automatically learned how many components to consider and was the only method able to detect two independent causal patterns between simulated variables of the two data sets. For higher numbers of variables, however, results should be interpreted with caution, as it became difficult to identify causal patterns using model selection tools for all considered strategies.

For multi-collinear data like imaging genetics data, Bayesian IBFA cannot be recommended, since additional post-processing steps were necessary to differentiate between causal and non-informative components. This was true for both candidate SNPs and high-dimensional SNP array simulations. In contrast, sparse CCA proved to be an appropriate method for candidate phenotype, candidate SNP studies. Its predictive power was shown to be high when voxel numbers were below 400 times sample size and only 50 candidate SNPs were considered. PLSC was the fastest method among the considered strategies, and when voxel numbers were above 500 times sample size its predictive power exceeded that of sparse CCA. In addition, it was the only method able to detect both causal voxels and causal SNPs in more realistic imaging genetics settings when the dimensionality of both voxels and SNPs was reasonably high, resembling small human SNP arrays. Thus, PLSC proved to be the most appropriate tool for multivariate analysis of imaging genetics data. However, for the higher SNP numbers of current whole-genome scans together with whole-brain imaging data, PLSC will be inefficient, and a prior dimensionality reduction is highly recommended to accommodate the large numbers of variables.

Acknowledgments

This work was supported by the German Federal Ministry of Education and Research (IFB Adiposity Diseases, FKZ: 01EO1001) to CG, AH, JN, and AV, and the German Research Foundation (CRC 1052 Obesity mechanisms) to AH, JN and AV. We would like to thank Anja Dietrich, who provided valuable feedback on an earlier version of this manuscript.

References

Avants, B.B., Cook, P.A., Ungar, L., Gee, J.C., Grossman, M., 2010. Dementia induces correlated reductions in white matter integrity and cortical thickness: a multivariate neuroimaging study with sparse canonical correlation analysis. NeuroImage 50 (3), 1004–1016.

Benedetti, F., Radaelli, D., Poletti, S., Falini, A., Cavallaro, R., Dallaspezia, S., Riccaboni, R., Scotti, G., Smeraldi, E., 2011. Emotional reactivity in chronic schizophrenia: structural and functional brain correlates and the influence of adverse childhood experiences. Psychol. Med. 41 (3), 509–519.

Boutte, D., Liu, J., 2010. Sparse canonical correlation analysis applied to fMRI and genetic data fusion. IEEE International Conference on Bioinformatics and Biomedicine, pp. 422–426.

Browne, M.W., 1979. The maximum-likelihood solution in inter-battery factor analysis. Br. J. Math. Stat. Psychol. 32, 75–86.

Brunet, J.F., Pattyn, A., 2002. PHOX2 genes — from patterning to connectivity. Curr. Opin. Genet. Dev. 12, 435–440.

Carre, J.M., Fisher, P.M., Manuck, S.B., Hariri, A.R., 2010. Interaction between trait anxiety and trait anger predict amygdala reactivity to angry facial expressions in men but not women. Soc. Cogn. Affect. Neurosci. 7, 213–221.

Chi, E.C., Allen, G.I., Zhou, H., Kohannim, O., Lange, K., Thompson, P.M., 2013. Imaging genetics via sparse canonical correlation analysis. Proc. IEEE Int. Symp. Biomed. Imaging, pp. 740–743.

Cramer, R.D., 1980. BC(DEF) parameters. 2. An empirical structure-based scheme for the prediction of some physical properties. J. Am. Chem. Soc. 102 (6), 1849–1859.

Crawford, D., Nickerson, D., 2005. Definition and clinical importance of haplotypes. Annu. Rev. Med. 56, 303–320.

Demenescu, L.R., Kortekaas, R., Cremers, H.R., Renken, R.J., van Tol, M.J., van der Wee, N.J., Veltman, D.J., den Boer, J.A., Roelofs, K., Aleman, A., 2013. Amygdala activation and its functional connectivity during perception of emotional faces in social phobia and panic disorder. J. Psychiatr. Res. 47 (8), 1024–1031.

Dudoit, S., Fridlyand, J., Speed, T., 2001. Comparison of discrimination methods for the classification of tumors using gene expression data. J. Am. Stat. Assoc. 96, 1151–1160.


Filippini, N., Rao, A.,Wetten, S., Gibson, R.A., Borrie,M., Guzman, D., Kertesz, A., Loy-English, I.,Williams, J., Nichols, T.,Whitcher, B.,Matthews, P.M., 2009. Anatomically-distinct geneticassociations of APOE epsilon4 allele load with regional cortical atrophy in Alzheimer'sdisease. NeuroImage 44 (3), 724–728.

Fusar-Poli, P., Placentino, A., Carletti, F., Landi, P., Allen, P., Surguladze, S., Benedetti, F.,Abbamonte, M., Gasparotti, R., Barale, F., Perez, J.,McGuire, P., Politi, P., 2009. Functionalatlas of emotional faces processing: a voxel-based meta-analysis of 105 functionalmagnetic resonance imaging studies. J. Psychiatr. Neurosci. 34 (6), 418–432.

Gottesman, I.I., Gould, T.D., 2003. The endophenotype concept in psychiatry: etymologyand strategic intentions. Am. J. Psychiatr. 160, 636–645.

Hardoon, D.R., Ettinger, U., Mourão-Miranda, J., Antonova, E., Collier, D., Kumari, V.,Williams, S.C., Brammer, M., 2009. Correlation-based multivariate analysis of geneticinfluence on brain volume. Neurosci. Lett. 450 (3), 281–286.

Hariri, A.R., Mattay, V.S., Tessitore, A., Kolachana, B., Fera, F., Goldman, D., Egan, M.F., Weinberger, D.R., 2002. Serotonin transporter genetic variation and the response of the human amygdala. Science 297, 400–403.

Hibar, D.P., Kohannim, O., Stein, J.L., Chiang, M.-C., Thompson, P.M., 2011. Multilocus genetic analysis of brain images. Front. Genet. 2 (73), 1–21.

Hotelling, H., 1936. Relations between two sets of variates. Biometrika 28 (3/4), 321–377.

Klami, A., Virtanen, S., Kaski, S., 2013. Bayesian canonical correlation analysis. J. Mach. Learn. Res. 14, 965–1003.

Knight, D.C., Cheng, D.T., Smith, C.N., Stein, E.A., Helmstetter, F.J., 2004. Neural substrates mediating human delay and trace fear conditioning. J. Neurosci. 24, 218–228.

Kovacevic, N., Abdi, H., Beaton, D., McIntosh, A.R., 2013. Revisiting PLS resampling: comparing significance versus reliability across range of simulations. New Perspectives in Partial Least Squares and Related Methods, Springer Proceedings in Mathematics and Statistics 56, pp. 159–170.

Krishnan, A., Williams, L.J., McIntosh, A.R., Abdi, H., 2011. Partial least squares (PLS) methods for neuroimaging: a tutorial and review. NeuroImage 56 (2), 455–475.

LaFramboise, T., 2009. Single nucleotide polymorphism arrays: a decade of biological, computational and technological advances. Nucleic Acids Res. 37 (13), 4181–4193.

Le Floch, E., Guillemot, V., Frouin, V., Pinel, P., Lalanne, C., Trincherah, L., Tenenhaus, A., Moreno, A., Zilbovicius, M., Bourgerone, T., Dehaene, S., Thirion, B., Poline, J.-B., Duchesnay, E., 2012. Significant correlation between a set of genetic polymorphisms and a functional brain network revealed by feature selection and sparse partial least squares. NeuroImage 63 (1), 11–24.

LeDoux, J., 2007. The amygdala. Curr. Biol. 17, 868–874.

Lee, B.T., Ham, B.J., 2008. Monoamine oxidase A-uVNTR genotype affects limbic brain activity in response to affective facial stimuli. NeuroReport 19 (5), 515–519.

Li, J., Chen, Y., 2008. Generating samples for association studies based on HapMap data. BMC Bioinforma. 9 (44), 1–13.

Lin, D., Zhang, J., Li, J., Calhoun, V.D., Deng, H.-W., Wang, Y.-P., 2013. Group sparse canonical correlation analysis for genomic data integration. BMC Bioinforma. 14 (1), 245–260.

McIntosh, A.R., Lobaugh, N.J., 2004. Partial least squares analysis of neuroimaging data: applications and advances. NeuroImage 23 (Supplement 1), S250–S263.

McIntosh, A.R., Bookstein, F.L., Haxby, J.V., Grady, C.L., 1996. Spatial pattern analysis of functional brain images using partial least squares. NeuroImage 3 (3), 143–157.

Meyer-Lindenberg, A., 2012. The future of fMRI and genetics research. NeuroImage 62 (2), 1286–1292.

Morris, J.S., Ohman, A., Dolan, R.J., 1998. Conscious and unconscious emotional learning in the human amygdala. Nature 393, 467–470.

Neal, R.M., 1996. Bayesian Learning for Neural Networks. Springer-Verlag.

Ousdal, O.T., Brown, A.A., Jensen, J., Nakstad, P.H., Melle, I., Agartz, I., Djurovic, S., Bogdan, R., Hariri, A.R., Andreassen, O.A., 2012. Association between variants near a monoaminergic pathway gene (PHOX2B) and amygdala reactivity: a genome-wide functional imaging study. Twin Res. Hum. Genet. 15 (3), 273–285.

Parkhomenko, E., Tritchler, D., Beyene, J., 2007. Genome-wide sparse canonical correlation of gene expression with genotypes. BMC Proc. 1, S119.

Parkhomenko, E., Tritchler, D., Beyene, J., 2009. Sparse canonical correlation analysis with application to genomic data integration. Stat. Appl. Genet. Mol. Biol. 8 (1), 1–34.

Potkin, S.G., Turner, J.A., Guffanti, G., Lakatos, A., Fallon, J.H., Nguyen, D.D., Mathalon, D., Ford, J., Lauriello, J., Macciardi, F., FBIRN, 2009. A genome-wide association study of schizophrenia using brain activation as a quantitative phenotype. Schizophr. Bull. 35 (1), 96–108.

Schraa-Tam, C.K., Rietdijk, W.J., Verbeke, W.J., Dietvorst, R.C., van den Berg, W.E., Bagozzi, R.P., De Zeeuw, C.I., 2012. fMRI activities in the emotional cerebellum: a preference for negative stimuli and goal-directed behavior. Cerebellum 11 (1), 233–245.

Smit, D.J.A., van't Ent, D., de Zubicaray, G., Stein, J.L., 2012. Guest editorial—neuroimaging and genetics: exploring, searching, and finding. Twin Res. Hum. Genet. 15 (3), 267–272.

Surguladze, S.A., Marshall, N., Schulze, K., Hall, M.H., Walshe, M., Bramon, E., Phillips, M.L., Murray, R.M., McDonald, C., 2010. Exaggerated neural response to emotional faces in patients with bipolar disorder and their first-degree relatives. NeuroImage 53 (1), 58–64.

The HapMap Consortium, 2003. The international HapMap project. Nature 426, 789–796.

Tibshirani, R., 1996. Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. B 58 (1), 267–288.

Tibshirani, R., Hastie, T., Narasimhan, B., Chu, G., 2003. Class prediction by nearest shrunken centroids, with applications to DNA microarrays. Stat. Sci. 18, 104–117.

Tucker, L.R., 1958. An inter-battery method of factor analysis. Psychometrika 23 (2), 111–136.

Virtanen, S., Klami, A., Kaski, S., 2011. Bayesian CCA via group sparsity. Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 457–464.

Waaijenborg, S., Verselewel De Witt Hamer, P., Zwinderman, A., 2008. Quantifying the association between gene expressions and DNA-markers by penalized canonical correlation analysis. Stat. Appl. Genet. Mol. Biol. 7 (1), 1–27.


Wan, J., Kim, S., Inlow, M., Nho, K., Swaminathan, S., Risacher, S.L., Fang, S., Weiner, M.W., Faisal Beg, M., Wang, L., Saykin, A.J., Shen, L., ADNI, 2011. Hippocampal surface mapping of genetic risk factors in AD via sparse learning models. Med. Image Comput. Comput. Assist. Interv. 14 (2), 376–383.

Wegelin, J.A., 2000. A survey of partial least squares (PLS) methods, with emphasis on the two-block case. Technical Report 371. University of Washington, Department of Statistics.

Wiesel, A., Kliger, M., Hero, A.O., 2008. A Greedy Approach to Sparse Canonical Correlation Analysis. Available at http://arxiv.org/abs/0801.2748v1.

Wilderjans, T.F., Ceulemans, E., Meers, K., 2013. CHull: a generic convex-hull-based model selection method. Behav. Res. Methods 45, 1–15.


Witten, D.M., Tibshirani, R.J., 2009. Extensions of sparse canonical correlation analysis with applications to genomic data. Stat. Appl. Genet. Mol. Biol. 8 (1), 1–27.

Witten, D.M., Tibshirani, R.J., Hastie, T., 2009. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10 (3), 515–534.

Wold, H., 1975. Path models with latent variables: the NIPALS approach. In: Blalock, H.M., et al. (Eds.), Quantitative Sociology: International Perspectives on Mathematical and Statistical Modeling. Academic Press, pp. 307–357.
