Applied Multivariate Analysis
Neil H. Timm
Springer


Springer Texts in Statistics
Advisors: George Casella, Stephen Fienberg, Ingram Olkin
Springer: New York, Berlin, Heidelberg, Barcelona, Hong Kong, London, Milan, Paris, Singapore, Tokyo

Neil H. Timm
Applied Multivariate Analysis
With 42 Figures

Neil H. Timm
Department of Education in Psychology
School of Education
University of Pittsburgh
Pittsburgh, PA, USA

Editorial Board
George Casella, Department of Statistics, University of Florida, Gainesville, FL 32611-8545, USA
Stephen Fienberg, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213-3890, USA
Ingram Olkin, Department of Statistics, Stanford University, Stanford, CA 94305, USA

Library of Congress Cataloging-in-Publication Data
Timm, Neil H.
Applied multivariate analysis / Neil H. Timm.
p. cm. (Springer texts in statistics)
Includes bibliographical references and index.
ISBN 0-387-95347-7 (alk. paper)
1. Multivariate analysis. I. Title. II. Series.
QA278 .T53 2002
519.5'35 dc21    2001049267

ISBN 0-387-95347-7    Printed on acid-free paper.

(c) 2002 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America.
9 8 7 6 5 4 3 2 1    SPIN 10848751
www.springer-ny.com
Springer-Verlag New York Berlin Heidelberg
A member of BertelsmannSpringer Science+Business Media GmbH

To my wife Verena

Preface

Univariate statistical analysis is concerned with techniques for the analysis of a single random variable. This book is about applied multivariate analysis. It was written to provide students and researchers with an introduction to statistical techniques for the analysis of continuous quantitative measurements on several random variables simultaneously. While quantitative measurements may be obtained from any population, the material in this text is primarily concerned with techniques useful for the analysis of continuous observations from multivariate normal populations with linear structure. While several multivariate methods are extensions of univariate procedures, a unique feature of multivariate data analysis techniques is their ability to control experimental error at an exact nominal level and to provide information on the covariance structure of the data. These features tend to enhance statistical inference, making multivariate data analysis superior to univariate analysis.

While in a previous edition of my textbook on multivariate analysis I tried to precede a multivariate method with a corresponding univariate procedure when applicable, I have not taken that approach here. Instead, it is assumed that the reader has taken basic courses in multiple linear regression, analysis of variance, and experimental design. While students may be familiar with vector spaces and matrices, important results essential to multivariate analysis are reviewed in Chapter 2. I have avoided the use of calculus in this text.
Emphasis is on applications, to provide students in the behavioral, biological, physical, and social sciences with a broad range of linear multivariate models for statistical estimation and inference, and exploratory data analysis procedures useful for investigating relationships among a set of structured variables. Examples have been selected to outline the process one employs in data analysis for checking model assumptions and model development, and for exploring patterns that may exist in one or more dimensions of a data set.

To successfully apply methods of multivariate analysis, a comprehensive understanding of the theory and how it relates to a flexible statistical package used for the analysis has become critical. When statistical routines were being developed for multivariate data analysis over twenty years ago, developing a text using a single comprehensive statistical package was risky. Now, companies and software packages have stabilized, thus reducing the risk. I have made extensive use of the Statistical Analysis System (SAS) in this text. All examples have been prepared using Version 8 for Windows. Standard SAS procedures have been used whenever possible to illustrate basic multivariate methodologies; however, a few illustrations depend on the Interactive Matrix Language (IML) procedure. All routines and data sets used in the text are contained on the Springer-Verlag Web site, http://www.springer-ny.com/detail.tpl?ISBN=0387953477, and the author's University of Pittsburgh Web site, http://www.pitt.edu/timm.

Acknowledgments

The preparation of this text has evolved from teaching courses and seminars in applied multivariate statistics at the University of Pittsburgh. I am grateful to the University of Pittsburgh for giving me the opportunity to complete this work. I would like to express my thanks to the many students who have read, criticized, and corrected various versions of early drafts of my notes and lectures on the topics included in this text. I am indebted to them for their critical readings and their thoughtful suggestions. My deepest appreciation and thanks are extended to my former student Dr. Tammy A. Mieczkowski, who read the entire manuscript and offered many suggestions for improving the presentation. I also wish to thank the anonymous reviewers who provided detailed comments on early drafts of the manuscript, which helped to improve the presentation. However, I am responsible for any errors or omissions of the material included in this text. I also want to express special thanks to John Kimmel at Springer-Verlag. Without his encouragement and support, this book would not have been written.

This book was typed using Scientific WorkPlace Version 3.0. I wish to thank Dr. Melissa Harrison of Far Field Associates, who helped with the LaTeX commands used to format the book and with the development of the author and subject indexes. This book has taken several years to develop, and during its development it went through several revisions. The preparation of the entire manuscript and every revision was performed with great care and patience by Mrs. Roberta S. Allan, to whom I am most grateful. I am also especially grateful to the SAS Institute for permission to use the Statistical Analysis System (SAS) in this text. Many of the large data sets analyzed in this book were obtained from the Data and Story Library (DASL) sponsored by Cornell University and hosted by the Department of Statistics at Carnegie Mellon University (http://lib.stat.cmu.edu/DASL/).
I wish to extend my thanks and appreciation to these institutions for making available these data sets for statistical analysis. I would also like to thank the authors and publishers of copyrighted material for making available the statistical tables and many of the data sets used in this book.

Finally, I extend my love, gratitude, and appreciation to my wife Verena for her patience, love, support, and continued encouragement throughout this project.

Neil H. Timm, Professor
University of Pittsburgh

Contents

Preface
Acknowledgments
List of Tables
List of Figures

1 Introduction
  1.1 Overview
  1.2 Multivariate Models and Methods
  1.3 Scope of the Book

2 Vectors and Matrices
  2.1 Introduction
  2.2 Vectors, Vector Spaces, and Vector Subspaces
  2.3 Bases, Vector Norms, and the Algebra of Vector Spaces
  2.4 Basic Matrix Operations
  2.5 Rank, Inverse, and Determinant
  2.6 Systems of Equations, Transformations, and Quadratic Forms
  2.7 Limits and Asymptotics

3 Multivariate Distributions and the Linear Model
  3.1 Introduction
  3.2 Random Vectors and Matrices
  3.3 The Multivariate Normal (MVN) Distribution
  3.4 The Chi-Square and Wishart Distributions
  3.5 Other Multivariate Distributions
  3.6 The General Linear Model
  3.7 Evaluating Normality
  3.8 Tests of Covariance Matrices
  3.9 Tests of Location
  3.10 Univariate Profile Analysis
  3.11 Power Calculations

4 Multivariate Regression Models
  4.1 Introduction
  4.2 Multivariate Regression
  4.3 Multivariate Regression Example
  4.4 One-Way MANOVA and MANCOVA
  4.5 One-Way MANOVA/MANCOVA Examples
  4.6 MANOVA/MANCOVA with Unequal Sigma_i or Nonnormal Data
  4.7 One-Way MANOVA with Unequal Sigma_i Example
  4.8 Two-Way MANOVA/MANCOVA
  4.9 Two-Way MANOVA/MANCOVA Example
  4.10 Nonorthogonal Two-Way MANOVA Designs
  4.11 Unbalanced, Nonorthogonal Designs Example
  4.12 Higher Ordered Fixed Effect, Nested and Other Designs
  4.13 Complex Design Examples
  4.14 Repeated Measurement Designs
  4.15 Repeated Measurements and Extended Linear Hypotheses Example
  4.16 Robustness and Power Analysis for MR Models
  4.17 Power Calculations: Power.sas
  4.18 Testing for Mean Differences with Unequal Covariance Matrices

5 Seemingly Unrelated Regression Models
  5.1 Introduction
  5.2 The SUR Model
  5.3 Seemingly Unrelated Regression Example
  5.4 The CGMANOVA Model
  5.5 CGMANOVA Example
  5.6 The GMANOVA Model
  5.7 GMANOVA Example
  5.8 Tests of Nonadditivity
  5.9 Testing for Nonadditivity Example
  5.10 Lack of Fit Test
  5.11 Sum of Profile Designs
  5.12 The Multivariate SUR (MSUR) Model
  5.13 Sum of Profile Example
  5.14 Testing Model Specification in SUR Models
  5.15 Miscellanea

6 Multivariate Random and Mixed Models
  6.1 Introduction
  6.2 Random Coefficient Regression Models
  6.3 Univariate General Linear Mixed Models
  6.4 Mixed Model Examples
  6.5 Mixed Multivariate Models
  6.6 Balanced Mixed Multivariate Models Examples
  6.7 Double Multivariate Model (DMM)
  6.8 Double Multivariate Model Examples
  6.9 Multivariate Hierarchical Linear Models
  6.10 Tests of Means with Unequal Covariance Matrices

7 Discriminant and Classification Analysis
  7.1 Introduction
  7.2 Two Group Discrimination and Classification
  7.3 Two Group Discriminant Analysis Example
  7.4 Multiple Group Discrimination and Classification
  7.5 Multiple Group Discriminant Analysis Example

8 Principal Component, Canonical Correlation, and Exploratory Factor Analysis
  8.1 Introduction
  8.2 Principal Component Analysis
  8.3 Principal Component Analysis Examples
  8.4 Statistical Tests in Principal Component Analysis
  8.5 Regression on Principal Components
  8.6 Multivariate Regression on Principal Components Example
  8.7 Canonical Correlation Analysis
  8.8 Canonical Correlation Analysis Examples
  8.9 Exploratory Factor Analysis
  8.10 Exploratory Factor Analysis Examples

9 Cluster Analysis and Multidimensional Scaling
  9.1 Introduction
  9.2 Proximity Measures
  9.3 Cluster Analysis
  9.4 Cluster Analysis Examples
  9.5 Multidimensional Scaling
  9.6 Multidimensional Scaling Examples

10 Structural Equation Models
  10.1 Introduction
  10.2 Path Diagrams, Basic Notation, and the General Approach
  10.3 Confirmatory Factor Analysis
  10.4 Confirmatory Factor Analysis Examples
  10.5 Path Analysis
  10.6 Path Analysis Examples
  10.7 Structural Equations with Manifest and Latent Variables
  10.8 Structural Equations with Manifest and Latent Variables Example
  10.9 Longitudinal Analysis with Latent Variables
  10.10 Exogeneity in Structural Equation Models

Appendix
References
Author Index
Subject Index
1 Introduction

1.1 Overview

In this book we present applied multivariate data analysis methods for making inferences regarding the mean and covariance structure of several variables, for modeling relationships among variables, and for exploring data patterns that may exist in one or more dimensions of the data. The methods presented in the book usually involve analysis of data consisting of n observations on p variables and one or more groups. As with univariate data analysis, we assume that the data are a random sample from the population of interest, and we usually assume that the underlying probability distribution of the population is the multivariate normal (MVN) distribution. The purpose of this book is to provide students with a broad overview of methods useful in applied multivariate analysis. The presentation integrates theory and practice, covering both formal linear multivariate models and exploratory data analysis techniques.

While there are numerous commercial software packages available for descriptive and inferential analysis of multivariate data, such as SPSS, S-Plus, Minitab, and SYSTAT, among others, we have chosen to make exclusive use of SAS, Version 8 for Windows.

1.2 Multivariate Models and Methods

Multivariate analysis techniques are useful when observations are obtained for each of a number of subjects on a set of variables of interest, the dependent variables, and one wants to relate these variables to another set of variables, the independent variables. The data collected are usually displayed in a matrix where the rows represent the observations and the columns the variables. The n x p data matrix Y usually represents the dependent variables and the n x q matrix X the independent variables.

When the multivariate responses are samples from one or more populations, one often first makes an assumption that the sample is from a multivariate probability distribution. In this text, the multivariate probability distribution is most often assumed to be the multivariate normal (MVN) distribution. Simple models usually have one or more means mu_i and covariance matrices Sigma_i.

One goal of model formulation is to estimate the model parameters and to test hypotheses regarding their equality. Assuming the covariance matrices are unstructured and unknown, one may develop methods to test hypotheses regarding fixed means. Unlike univariate analysis, if one finds that the means are unequal one does not know whether the differences are in one dimension, two dimensions, or a higher dimension. The process of locating the dimension of maximal separation is called discriminant function analysis. In models to evaluate the equality of mean vectors, the independent variables merely indicate group membership and are categorical in nature. They are also considered to be fixed and nonrandom. To expand this model to more complex models, one may formulate a linear model allowing the independent variables to be nonrandom and to contain either continuous or categorical variables. The general class of multivariate techniques used in this case are called linear multivariate regression (MR) models. Special cases of the MR model include multivariate analysis of variance (MANOVA) models and multivariate analysis of covariance (MANCOVA) models.
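As a small illustration of this layout (the sketch below is illustrative only; the data values are made up and the program is not one of the text's numbered examples), PROC IML can be used to form Y and X and to compute the ordinary least squares estimate B = (X'X)^(-1) X'Y on which the multivariate regression methods of Chapter 4 are based.

proc iml;
  /* hypothetical data: n = 5 subjects, p = 2 dependent and q = 2 independent variables */
  Y = {3 10,  4 12,  6 15,  7 19,  9 22};   /* n x p matrix of dependent variables         */
  X = {1  1,  1  2,  1  3,  1  4,  1  5};   /* n x q matrix (intercept and one regressor)  */
  B = inv(X`*X) * X` * Y;                   /* q x p least squares coefficient matrix      */
  E = Y - X*B;                              /* n x p matrix of residuals                   */
  S = E`*E / (nrow(Y) - ncol(X));           /* estimated error covariance of the responses */
  print B, S;
quit;

Here each column of B contains the regression coefficients for one dependent variable, so the p responses are fit jointly from the same design matrix X.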
In MR models, the same set of independent variables, X, is used to model the set of dependent variables, Y. Models which allow one to fit each dependent variable with a different set of independent variables are called seemingly unrelated regression (SUR) models. Modeling several sets of dependent variables with different sets of independent variables involves multivariate seemingly unrelated regression (MSUR) models. Oftentimes, a model is overspecified in that not all linear combinations of the independent set are needed to explain the variation in the dependent set. These models are called linear multivariate reduced rank regression (MRR) models. One may also extend MRR models to seemingly unrelated regression models with reduced rank (RRSUR) models. Another name often associated with the SUR model is the completely general MANOVA (CGMANOVA) model, since growth curve models (GMANOVA) and more general growth curve (MGGC) models are special cases of the SUR model. In all these models, the covariance structure of Y is unconstrained and unstructured.

In formulating MR models, the dependent variables are represented as a linear structure of both fixed parameters and fixed independent variables. Allowing the variables to remain fixed and the parameters to be a function of both random and fixed parameters leads to classes of linear multivariate mixed models (MMM). These models impose a structure on the covariance matrix so that both the means and the variance and covariance components are estimated. Models included in this general class are random coefficient models, multilevel models, variance component models, panel analysis models, and models used to analyze covariance structures. Thus, in these models, one is usually interested in estimating both the mean and the covariance structure of a model simultaneously.

A general class of models that define the dependent and independent variables as random, but relate the variables using fixed parameters, are the class of linear structural relations (LISREL) models or structural equation models (SEM). In these models, the variables may be both observed and latent. Included in this class of models are path analysis, factor analysis, simultaneous equation models, simplex models, circumplex models, and numerous test theory models. These models are used primarily to estimate the covariance structure in the data. The mean structure is often assumed to be zero.

Other general classes of multivariate models that rely on multivariate normal theory include multivariate time series models, nonlinear multivariate models, and others. When the dependent variables are categorical rather than continuous, one can consider using multinomial logit or probit models or latent class models. When the data matrix contains n subjects (examinees) and p variables (test items), the modeling of test results for a group of examinees is called item response modeling.

Sometimes with multivariate data one is interested in trying to uncover the structure or data patterns that may exist. One may wish to uncover dependencies both within a set of variables and with other variables. One may also utilize graphical methods to represent the data relationships. The most basic displays are scatter plots or a scatter plot matrix involving two or three variables simultaneously.
Profile plots, star plots, glyph plots, biplots, sunburst plots, contour plots, Chernoff faces, and Andrews Fourier plots can also be utilized to display multivariate data.

Because it is very difficult to detect and describe relationships among variables in large dimensional spaces, several multivariate techniques have been designed to reduce the dimensionality of the data. Two commonly used data reduction techniques are principal component analysis and canonical correlation analysis. When one has a set of dissimilarity or similarity measures to describe relationships, multidimensional scaling techniques are frequently utilized. When the data are categorical, the methods of correspondence analysis, multiple correspondence analysis, and joint correspondence analysis are used to geometrically interpret and visualize categorical data.

Another problem frequently encountered in multivariate data analysis is to categorize objects into clusters. Multivariate techniques that are used to classify or cluster objects into categories include cluster analysis, classification and regression trees (CART), classification analysis, and neural networks, among others.

1.3 Scope of the Book

In reviewing applied multivariate methodologies, one observes that several procedures are model oriented and have the assumption of an underlying probability distribution. Other methodologies are exploratory and are designed to investigate relationships among the multivariables in order to visualize, describe, classify, or reduce the information under analysis. In this text, we have tried to address both aspects of applied multivariate analysis. While Chapter 2 reviews basic vector and matrix algebra critical to the manipulation of multivariate data, Chapter 3 reviews the theory of linear models, and Chapters 4-6 and 10 address standard multivariate model-based methods. Chapters 7-9 include several frequently used exploratory multivariate methodologies.

The material contained in this text may be used for either a one-semester course in applied multivariate analysis for nonstatistics majors or as a two-semester course on multivariate analysis with applications for majors in applied statistics or research methodology. The material contained in the book has been used at the University of Pittsburgh with both formats. For the two-semester course, the material contained in Chapters 1-4, selections from Chapters 5 and 6, and Chapters 7-9 are covered. For the one-semester course, Chapters 1-3 are covered; however, the remaining topics covered in the course are selected from the text based on the interests of the students for the given semester. Sequences have included the addition of Chapters 4-6, or the addition of Chapters 7-10, while others have included selected topics from Chapters 4-10. Other designs using the text are also possible.

No text on applied multivariate analysis can discuss all of the multivariate methodologies available to researchers and applied statisticians. The field has made tremendous advances in recent years. However, we feel that the topics discussed here will help applied professionals and academic researchers enhance their understanding of several topics useful in applied multivariate data analysis using the Statistical Analysis System (SAS), Version 8 for Windows.

All examples in the text are illustrated using procedures in base SAS, SAS/STAT, and SAS/ETS.
In addition, features in SAS/INSIGHT, SAS/IML, and SAS/GRAPH are utilized. All programs and data sets used in the examples may be downloaded from the Springer-Verlag Web site, http://www.springer.com/editorial/authors.html. The programs and data sets are also available at the author's University of Pittsburgh Web site, http://www.pitt.edu/timm. A list of the SAS programs, with the implied extension .sas, discussed in the text follows.

Chapter 3: Multinorm, Norm, m3_7_1, m3_7_2, Box-Cox, Ramus, Unorm, m3_8_1, m3_8_7, m3_9a, m3_9d, m3_9e, m3_9f, m3_10a, m3_10b, m3_11_1
Chapter 4: m4_3_1, MulSubSel, m4_5_1, m4_5_1a, m4_5_2, m4_7_1, m4_9_1, m4_9_2, m4_11_1, m4_13_1a, m4_13_1b, m4_15_1, Power, m4_17_1
Chapter 5: m5_3_1, m5_5_1, m5_5_2, m5_7_1, m5_7_2, m5_9_1, m5_9_2, m5_13_1, m5_14_1
Chapter 6: m6_4_1, m6_4_2, m6_4_3, m6_6_1, m6_6_2, m6_8_1, m6_8_2
Chapter 7: m7_3_1, m7_3_2, m7_5_1
Chapter 8: m8_2_1, m8_2_2, m8_3_1, m8_3_2, m8_3_3, m8_6_1, m8_8_1, m8_8_2, m8_10_1, m8_10_2, m8_10_3
Chapter 9: m9_4_1, m9_4_2, m9_4_3, m9_4_3a, m9_4_4, m9_6_1, m9_6_2, m9_6_3
Chapter 10: m10_4_1, m10_4_2, m10_6_1, m10_6_2, m10_8_1
Other: Xmacro, Distnew

Also included on the Web site is the Fortran program Fit.For and the associated manual, Fit-Manual.ps, a PostScript file. All data sets used in the examples and some of the exercises are also included on the Web site; they are denoted with the extension .dat. Other data sets used in some of the exercises are available from the Data and Story Library (DASL) Web site, http://lib.stat.cmu.edu/DASL/. The library is hosted by the Department of Statistics at Carnegie Mellon University, Pittsburgh, Pennsylvania.

2 Vectors and Matrices

2.1 Introduction

In this chapter, we review the fundamental operations of vectors and matrices useful in statistics. The purpose of the chapter is to introduce basic concepts and formulas essential to the understanding of data representation, data manipulation, model building, and model evaluation in applied multivariate analysis. The field of mathematics that deals with vectors and matrices is called linear algebra; numerous texts have been written about the applications of linear algebra and calculus in statistics. In particular, books by Carroll and Green (1997), Dhrymes (2000), Graybill (1983), Harville (1997), Khuri (1993), Magnus and Neudecker (1999), Schott (1997), and Searle (1982) show how vectors and matrices are useful in applied statistics. Because the results in this chapter are to provide the reader with a basic knowledge of vector spaces and matrix algebra, results are presented without proof.

2.2 Vectors, Vector Spaces, and Vector Subspaces

a. Vectors

Fundamental to multivariate analysis is the collection of observations for d variables. The d values of the observations are organized into a meaningful arrangement of d real numbers, called a vector (also called a d-variate response or a multivariate vector-valued observation).¹ Letting y_i denote the ith observation, where i goes from 1 to d, the d x 1 vector y is represented as

    y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_d \end{bmatrix}    (2.2.1)

This representation of y is called a column vector of order d, with d rows and 1 column. Alternatively, a vector may be represented as a 1 x d vector with 1 row and d columns. Then, we denote y as y' and call it a row vector. Hence,

    y' = [y_1, y_2, ..., y_d]    (2.2.2)

Using this notation, y is a column vector and y', the transpose of y, is a row vector. The dimension or order of the vector y is d, where the index d represents the number of variables, elements, or components in y. To emphasize the dimension of y, the subscript notation y_{d x 1}, or simply y_d, is used.

¹ All vectors in this text are assumed to be real valued.
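As a minimal illustration (the numerical values are arbitrary and the program is not part of the text's example set), the PROC IML statements below construct a column vector as in (2.2.1), form its transpose as in (2.2.2), and report its order.

proc iml;
  y  = {1, 3, 2};     /* a 3 x 1 column vector, as in (2.2.1)              */
  yt = y`;            /* its transpose, a 1 x 3 row vector, as in (2.2.2)  */
  d  = nrow(y);       /* the dimension (order) of y                        */
  print y yt d;
quit;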
The vector y with d elements represents, geometrically, a point in a d-dimensional Euclidean space. The elements of y are called the coordinates of the vector. The null vector 0_{d x 1} denotes the origin of the space; the vector y may be visualized as a line segment from the origin to the point y. The line segment is called a position vector. A vector y with n variables, y_n, is a position vector in an n-dimensional Euclidean space. Since the vector y is defined over the set of real numbers R, the n-dimensional Euclidean space is represented as R^n, or in this text as V_n.

Definition 2.2.1 A vector y_{n x 1} is an ordered set of n real numbers representing a position in an n-dimensional Euclidean space V_n.

b. Vector Spaces

The collection of n x 1 vectors in V_n that is closed under the two operations of vector addition and scalar multiplication is called a (real) vector space.

Definition 2.2.2 An n-dimensional vector space is the collection of vectors in V_n that satisfy the following two conditions:
1. If x ∈ V_n and y ∈ V_n, then z = x + y ∈ V_n.
2. If α ∈ R and y ∈ V_n, then z = αy ∈ V_n.
(The notation ∈ is set notation for "is an element of.")

For vector addition to be defined, x and y must have the same number of elements n. Then, all elements z_i in z = x + y are defined as z_i = x_i + y_i for i = 1, 2, ..., n. Similarly, scalar multiplication of a vector y by a scalar α ∈ R is defined as z_i = α y_i.

c. Vector Subspaces

Definition 2.2.3 A subset S of V_n is called a subspace of V_n if S is itself a vector space. The vector subspace S of V_n is represented as S ⊆ V_n.

Choosing α = 0 in Definition 2.2.2, we see that 0 ∈ V_n, so that every vector space contains the origin 0. Indeed, S = {0} is a subspace of V_n, called the null subspace. Now, if α and β are elements of R and x and y are elements of V_n, then all linear combinations αx + βy are in V_n. This subset of vectors is called V_k, where V_k ⊆ V_n. The subspace V_k is called a subspace, linear manifold, or linear subspace of V_n. Any subspace V_k, where 0 < k < n, is called a proper subspace. The subset containing only the zero vector and the subset containing the whole space are extreme examples of vector spaces, called improper subspaces.

Example 2.2.1 Let x = (1, 0, 0)' and y = (0, 1, 0)'. The set of all vectors S of the form z = αx + βy represents a plane (two-dimensional space) in the three-dimensional space V_3. Any vector in this two-dimensional subspace, S = V_2, can be represented as a linear combination of the vectors x and y. The subspace V_2 is called a proper subspace of V_3, so that V_2 ⊂ V_3.

Extending the operations of addition and scalar multiplication to k vectors, a linear combination of the vectors y_i is defined as

    v = \sum_{i=1}^{k} \alpha_i y_i \in V    (2.2.3)

where y_i ∈ V and α_i ∈ R. The set of vectors y_1, y_2, ..., y_k is said to span (or generate) V if

    V = \{ v \mid v = \sum_{i=1}^{k} \alpha_i y_i \}    (2.2.4)

The vectors in V satisfy Definition 2.2.2, so that V is a vector space.

Theorem 2.2.1 Let {y_1, y_2, ..., y_k} be a subset of k, n x 1 vectors in V_n. If every vector in V is a linear combination of y_1, y_2, ..., y_k, then V is a vector subspace of V_n.

Definition 2.2.4 The set of n x 1 vectors {y_1, y_2, ..., y_k} is linearly dependent if there exist real numbers α_1, α_2, ..., α_k, not all zero, such that

    \sum_{i=1}^{k} \alpha_i y_i = 0

Otherwise, the set of vectors is linearly independent.

For a linearly independent set, the only solution to the equation in Definition 2.2.4 is given by α_1 = α_2 = ... = α_k = 0.
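A brief numerical illustration (not one of the text's numbered examples) uses the vectors of Example 2.2.1. A vector z constructed as a linear combination of x and y, as in (2.2.3), yields a linearly dependent set {x, y, z} in the sense of Definition 2.2.4, while replacing z by the third elementary vector e_3 = (0, 0, 1)' yields a linearly independent set. The determinant, treated formally in Section 2.5, is used here only as a quick numerical check: it is zero exactly when the columns are linearly dependent.

proc iml;
  x  = {1, 0, 0};                  /* the two vectors of Example 2.2.1        */
  y  = {0, 1, 0};
  z  = 2*x + 3*y;                  /* a linear combination, equation (2.2.3)  */
  e3 = {0, 0, 1};
  detxyz = det(x || y || z);       /* 0: {x, y, z} is linearly dependent      */
  detxye = det(x || y || e3);      /* 1: {x, y, e3} is linearly independent   */
  print z, detxyz detxye;
quit;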
To determine whether a set of vectors are linearlyindependent or linearly dependent, Definition 2.2.4 is employed as shown in the followingexamples.Example 2.2.2 Lety1 =111, y2 =011, and y3 =142To determine whether the vectors y1, y2, and y3 are linearly dependent or linearly inde-pendent,the equation1y1 + 2y2 + 3y3 = 0is solved for 1, 2, and 3. From Definition 2.2.4,1111 + 2011 + 3142 =000111 +022 +34323 =000This is a system of three equations in three unknowns(1) 1 + 3 = 0(2) 1 + 2 + 43 = 0(3) 1 2 23 = 0From equation (1), 1 = 3. Substituting 1 into equation (2), 2 = 33. If 1and 2are defined in terms of 3, equation (3) is satisfied. If 3= 0, there exist real numbers 1,2, and 3, not all zero such that3i=1i = 0Thus, y1, y2, and y3 are linearly dependent. For example, y1 + 3y2 y3 = 0.Example 2.2.3 As an example of a set of linearly independent vectors, lety1 =011, y2=112, and y3=341 36. 2.2 Vectors, Vector Spaces, and Vector Subspaces 11Using Definition 2.2.4,1011 + 2112 + 3341 =000is a system of simultaneous equations(1) 2 + 33 = 0(2) 1 + 2 + 43 = 0(3) 1 22 + 3 = 0From equation (1), 2 = 33. Substituting 33 for 2 into equation (2), 1 = 3;by substituting for 1 and 2 into equation (3), 3 = 0. Thus, the only solution is 1 =2 = 3 = 0, or {y1, y2, y3} is a linearly independent set of vectors.Linearly independent and linearly dependent vectors are fundamental to the study of ap-pliedmultivariate analysis. For example, suppose a test is administered to n students wherescores on k subtests are recorded. If the vectors y1, y2, . . . , yk are linearly independent,each of the k subtests are important to the overall evaluation of the n students. If for somesubtest the scores can be expressed as a linear combination of the other subtestsyk =k1i=1iyithe vectors are linearly dependent and there is redundancy in the test scores. It is oftenimportant to determine whether or not a set of observation vectors is linearly independent;when the vectors are not linearly independent, the analysis of the data may need to berestricted to a subspace of the original space.Exercises 2.21. For the vectorsy1 =111 and y2 =201find the vectors(a) 2y1 + 3y2(b) y1 + y2(c) y3 such that 3y1 2y2 + 4y3 = 02. For the vectors and scalars defined in Example 2.2.1, draw a picture of the space Sgenerated by the two vectors. 37. 12 2. Vectors and Matrices3. Show that the four vectors given below are linearly dependent.y1 =100, y2 =235, y3 =101, and y4 =0464. Are the following vectors linearly dependent or linearly independent?y1 =111, y2 =123, y3 =2235. Do the vectorsy1 =242, y2 =123, and y3 =61210span the same space as the vectorsx1 =002 and x2 =24106. Prove the following laws for vector addition and scalar multiplication.(a) x + y = y + x (commutative law)(b) (x + y) + z = x + (y + z) (associative law)(c) (y) = ()y = ()y = (y) (associative law for scalars)(d) (x + y) = x + y (distributive law for vectors)(e) ( + )y = y + y (distributive law for scalars)7. Prove each of the following statements.(a) Any set of vectors containing the zero vector is linearly dependent.(b) Any subset of a linearly independent set is also linearly independent.(c) In a linearly dependent set of vectors, at least one of the vectors is a linearcombination of the remaining vectors.2.3 Bases, Vector Norms, and the Algebra of Vector SpacesThe concept of dimensionality is a familiar one from geometry. In Example 2.2.1, thesubspace S represented a plane of dimension two, a subspace of the three-dimensionalspace V3. 
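The hand calculations in Examples 2.2.2 and 2.2.3 can be confirmed computationally: collect the vectors as the columns of a matrix and compare its rank with the number of vectors. The sketch below uses Python/NumPy as an illustration; numpy.linalg.matrix_rank computes a numerical (singular-value-based) rank rather than the exact arithmetic of the examples.

```python
import numpy as np

# Example 2.2.2: y1 + 3*y2 - y3 = 0, so the vectors are linearly dependent.
y1, y2, y3 = np.array([1, 1, 1]), np.array([0, 1, -1]), np.array([1, 4, -2])
A = np.column_stack([y1, y2, y3])
print(np.linalg.matrix_rank(A))    # 2 < 3: linearly dependent
print(y1 + 3 * y2 - y3)            # [0 0 0]

# Example 2.2.3: the only solution of a1*y1 + a2*y2 + a3*y3 = 0 is a1 = a2 = a3 = 0.
B = np.column_stack([[0, 1, 1], [1, 1, -2], [3, 4, 1]])
print(np.linalg.matrix_rank(B))    # 3: linearly independent
```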
Also important is the minimal number of vectors required to span S. 38. 2.3 Bases, Vector Norms, and the Algebra of Vector Spaces 13a. BasesDefinition 2.3.1 Let {y1, y2, . . . , yk } be a subset of k vectors where yi Vn. The set of kvectors is called a basis of Vk if the vectors in the set span Vk and are linearly independent.The number k is called the dimension or rank of the vector space.Thus, in Example 2.2.1 S V2 V3 and the subscript 2 is the dimension or rank ofthe vector space. It should be clear from the context whether the subscript on V representsthe dimension of the vector space or the dimension of the vector in the vector space. Everyvector space, except the vector space {0}, has a basis. Although a basis set is not unique, thenumber of vectors in a basis is unique. The following theorem summarizes the existenceand uniqueness of a basis for a vector space.Theorem 2.3.1 Existence and Uniqueness1. Every vector space has a basis.2. Every vector in a vector space has a unique representation as a linear combinationof a basis.3. Any two bases for a vector space have the same number of vectors.b. Lengths, Distances, and AnglesKnowledge of vector lengths, distances and angles between vectors helps one to understandrelationships among multivariate vector observations. However, prior to discussing theseconcepts, the inner (scalar or dot) product of two vectors needs to be defined.Definition 2.3.2 The inner product of two vectors x and y, each with n elements, is thescalar quantityxy =ni=1xi yiIn textbooks on linear algebra, the inner product may be represented as (x, y) or xy. GivenDefinition 2.3.2, inner products have several properties as summarized in the followingtheorem.Theorem 2.3.2 For any conformable vectors x, y, z, and w in a vector space V and anyreal numbers and , the inner product satisfies the following relationships1. xy = yx2. xx 0 with equality if and only if x = 03. (x)(y) = (xy) z = xz + yz4. (x + y)5. (x + y)(w + z) = x(w + z) + y(w + z) 39. 14 2. Vectors and MatricesIf x = y in Definition 2.3.2, then xx =ni . The quantity (xx)1/2 is called thei=1 x2Euclidean vector norm or length of x and is represented as x. Thus, the norm of x is thepositive square root of the inner product of a vector with itself. The norm squared of x isrepresented as ||x||2. The Euclidean distance or length between two vectors x and y in Vnis x y = [(x y)(x y)]1/2. The cosine of the angle between two vectors by the lawof cosines iscos = xy/ x y 0 180 (2.3.1)Another important geometric vector concept is the notion of orthogonal (perpendicular)vectors.Definition 2.3.3 Two vectors x and y in Vn are orthogonal if their inner product is zero.Thus, if the angle between x and y is 90, then cos = 0 and x is perpendicular to y,written as x y.Example 2.3.1 Letx =112 and y =101The distance between x and y is then x y = [(x y)(x y)]1/2 =14 and thecosine of the angle between x and y is6cos = xy/ x y = 3/2 = 3/23/2) = 150.so that the angle between x and y is = cos1(If the vectors in our example have unit length, so that x = y = 1, then the cos isjust the inner product of x and y. To create unit vectors, also called normalizing the vectors,one proceeds as followsux = x / x =61/1/62/6 and uy = y/ y =1/20/221/and the cos = u3/2, the inner product of the normalized vectors. 
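The quantities in Example 2.3.1 can be reproduced numerically. The sketch below is an illustrative Python/NumPy fragment, not one of the text's SAS programs; numpy.arccos returns radians, so the angle is converted to degrees.

```python
import numpy as np

x = np.array([1.0, 1.0, 2.0])        # vectors of Example 2.3.1
y = np.array([-1.0, 0.0, -1.0])

dist = np.linalg.norm(x - y)                                   # sqrt(14)
cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))  # -sqrt(3)/2
theta = np.degrees(np.arccos(cos_theta))                       # 150 degrees

u_x = x / np.linalg.norm(x)          # normalized (unit) vectors
u_y = y / np.linalg.norm(y)
print(dist, theta, u_x @ u_y)        # the last value again equals cos(theta)
```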
The normal-izedxuy = orthogonal vectors ux and uy are called orthonormal vectors.Example 2.3.2 Letx =124 and y =401Then xy = 0; however, these vectors are not of unit length.Definition 2.3.4 A basis for a vector space is called an orthogonal basis if every pair ofvectors in the set is pairwise orthogonal; it is called an orthonormal basis if each vectoradditionally has unit length. 40. 2.3 Bases, Vector Norms, and the Algebra of Vector Spaces 15y0y xP x x y = xFIGURE 2.3.1. Orthogonal Projection of y on x, Pxy = xThe standard orthonormal basis for Vn is {e1, e2, . . . , en} where ei is a vector of all zeroswith the number one in the i th position. Clearly the ei = 1 and eie j ; for all pairs iand j . Hence, {e1, e2, . . . , en} is an orthonormal basis for Vn and it has dimension (or rank)n. The basis for Vn is not unique. Given any basis for Vk Vn we can create an orthonormalbasis for Vk . The process is called the Gram-Schmidt orthogonalization process.c. Gram-Schmidt Orthogonalization ProcessFundamental to the Gram-Schmidt process is the concept of an orthogonal projection. In atwo-dimensional space, consider the vectors x and y given in Figure 2.3.1. The orthogonalprojection of y on x, Pxy, is some constant multiple, x of x, such that Pxy (yPxy).Since the cos =cos 90 = 0, we set (yx)x equal to 0 and we solve for to find = (yx)/ x2. Thus, the projection of y on x becomesPxy = x = (yx)x/ x2Example 2.3.3 Letx =111 and y =142Then, thePxy = (yx)xx2= 73111Observe that the coefficient in this example is no more than the average of the ele-mentsof y. This is always the case when projection an observation onto a vector of 1s (theequiangular or unit vector), represented as 1n or simply 1. P1y = y1 for any multivariateobservation vector y.To obtain an orthogonal basis {y1, . . . , yr } for any subspace V of Vn, spanned by anyset of vectors {x1, x2, . . . , xk }, the preceding projection process is employed sequentially 41. 16 2. Vectors and Matricesas followsy1 = x1y2 = x2 Py1x2 = x2 (x2y1)y1/ y12 y2y1y3 = x3 Py1x3 Py2x3= x3 (x3y1)y1/y21 (x3y2)y2/y22 y3y2y1or, more generallyyi = xi i1j=1ci j yj where ci j = (xiyj )/yj 2deleting those vectors yi for which yi = 0. The number of nonzero vectors in the setis the rank or dimension of the subspace V and is represented as Vr , r k. To find anorthonormal basis, the orthogonal basis must be normalized.Theorem 2.3.3 (Gram-Schmidt) Every r-dimensional vector space, except the zero-dimen-sionalspace, has an orthonormal basis.Example 2.3.4 Let V be spanned byx1 =11101, x2 =20412, x3 =11311, and x4 =62311To find an orthonormal basis, the Gram-Schmidt process is used. Sety1 = x1 =11101y2 = x2 (x2y1)y1/ y12= 20412 8411101=02210y3 = x3 (x3y1)y1/y1 2 (x3y2)y2/y2 2= 0 42. 2.3 Bases, Vector Norms, and the Algebra of Vector Spaces 17so delete y3;y4 =62311 (x4y1)y1/ y12 (x4y2)y2/ y22=62311 8411101 9902210=42121Thus, an orthogonal basis for V is {y1, y2, y4}. The vectors must be normalized toobtain an orthonormal basis; an orthonormal basis is u1 = y1/4, u2 = y2/3, and26.u3 = y4/d. Orthogonal SpacesDefinition 2.3.5 Let Vr = {x1, . . . , xr} Vn. The orthocomplement subspace of Vr in Vn,represented by V, is a vector subspace of Vn which consists of all vectors y Vn suchthat xiy = 0 and we write Vn = Vr V.The vector space Vn is the direct sum of the subspaces Vn and V. The intersection ofthe two spaces only contain the null space. The dimension of Vn, dim Vn, is equal to thedim Vr + dim V so that the dim V = n r. 
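Before turning to direct sums and orthocomplements in general, the Gram-Schmidt computation of Example 2.3.4 is sketched below in Python/NumPy. The function name gram_schmidt and the zero tolerance tol are illustrative choices, not part of the text; the book's own computations are carried out in SAS.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Return an orthogonal basis for the span of `vectors`, dropping zero vectors."""
    basis = []
    for x in vectors:
        y = x.astype(float)
        for b in basis:
            y = y - (x @ b) / (b @ b) * b      # subtract the projection of x onto b
        if np.linalg.norm(y) > tol:
            basis.append(y)
    return basis

# Vectors of Example 2.3.4; x3 lies in the span of x1 and x2, so it drops out.
x1 = np.array([1, 1, 1, 0, 1])
x2 = np.array([2, 0, 4, 1, 2])
x3 = np.array([-1, 1, -3, -1, -1])
x4 = np.array([6, -2, 3, -1, 1])

basis = gram_schmidt([x1, x2, x3, x4])
print(basis)                                    # y1, then [0,-2,2,1,0] and [4,-2,-1,-2,-1]
orthonormal = [b / np.linalg.norm(b) for b in basis]
```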
More generally, we have the followingresult.Definition 2.3.6 Let S1, S2, . . . , Sk denote vector subspaces of Vn. The direct sum of thesevector spaces, represented asi=1 Si , consists of all unique vectors v =k ki=1 i si wheresi Si , i = 1, . . . , k and the coefficients i R.Theorem 2.3.4 Let S1, S2, . . . , Sk represent vector subspaces of Vn. Then,1. V =ki=1 Si is a vector subspace of Vn, V Vn.2. The intersection of Si is the null space {0}.3. The intersection of V and V is the null space.4. The dim V = n k so that dim V V = n.Example 2.3.5 LetV =101,011= {x1, x2} and y V3 43. 18 2. Vectors and MatricesWe find V using Definition 2.3.5 as followsV = {y V3 | (yx) = 0 for any x V}= {y V3 | (yV}= {y V3 | (yxi } (i = 1, 2)A vector y = [y1, y2, y3] must be found such that yx1 and yx2. This implies thaty1 y3 = 0, or y1 = y3, and y2 = y3, or y1 = y2 = y3. Letting yi = 1,V =111 = 1 and V3 = V VFurthermore, theP1y =yyy and PVy = y P1y =y1 yy2 yy3 yAlternatively, from Definition 2.3.6, an orthogonal basis for V isV =101 ,1/211/2= {v1, v2} = S1 S2and the PV y becomesPv1y + Pv2y =y1 yy2 yy3 yHence, a unique representation for y is y = P1y + PV y as stated in Theorem 2.3.4. Thedim V3 = dim 1 + dim V.In Example 2.3.5, V is the orthocomplement of V relative to the whole space. OftenS V Vn and we desire the orthocomplement of S relative to V instead of Vn. Thisspace is represented as V/S and V = (V/S) S = S1 S2. Furthermore, Vn = V (V/S) S = V S1 S2. If the dimension of V is k and the dimension of S is r , thenthe dimension of V is n k and the dim V/S is k r , so that (n k) + (k r ) +r = nor the dim Vn = dim V + dim(V/S) + dim S as stated in Theorem 2.3.4. In Figure 2.3.2,the geometry of subspaces is illustrated with Vn = S (V/S) V.yi j = + i + ei j i = 1, 2 and j = 1, 2The algebra of vector spaces has an important representation for the analysis of variance(ANOVA) linear model. To illustrate, consider the two group ANOVA modelThus, we have two groups indexed by i and two observations indexed by j . Representingthe observations as a vector,y = [y11, y12, y21, y22] 44. 2.3 Bases, Vector Norms, and the Algebra of Vector Spaces 19V VV/SSVnFIGURE 2.3.2. The orthocomplement of S relative to V, V/Sand formulating the observation vector as a linear model,y =y11y12y21y22=1111 +11001 +00112 +e11e12e21e22The vectors associated with the model parameters span a vector space V often called thedesign space. Thus,V =111111000011= {1, a1, a2}where 1, a1, and a2 are elements of V4. The vectors in the design space V are linearlydependent. Let A = {a1, a2} denote a basis for V. Since 1 A, the orthocomplement ofthe subspace {1} 1 relative to A, denoted by A/1 is given byA/1 = {a1 P1a1, a2 P1a2}=1/21/21/21/21/21/21/21/2The vectors in A/1 span the space; however, a basis for A/1 is given byA/1 =1111where (A/1)1 =A and A V4. Thus, (A/1)1A = V4. Geometrically, as shown inFigure 2.3.3, the design space V A has been partitioned into two orthogonal subspaces1 and A/1 such that A = 1(A/1), where A/1 is the orthocomplement of 1 relative to A,and A A = V4. 45. 20 2. Vectors and MatricesA1y VA/1FIGURE 2.3.3. The orthogonal decomposition of V for the ANOVAThe observation vector y V4 may be thought of as a vector with components in variousorthogonal subspaces. By projecting y onto the orthogonal subspaces in the design space A,we may obtain estimates of the model parameters. 
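A quick numerical check of the decomposition in Example 2.3.5 may be helpful before the ANOVA projections are evaluated: any y in V3 splits uniquely into its projection on span{1} plus an orthogonal remainder. The sketch below is illustrative Python/NumPy, and the entries of y are arbitrary.

```python
import numpy as np

y = np.array([3.0, 7.0, 2.0])            # arbitrary vector in V3
one = np.ones(3)

P1y = (y @ one) / (one @ one) * one      # projection on span{1}: each entry is ybar
PVy = y - P1y                            # remainder, lying in the subspace V of Example 2.3.5

print(P1y, PVy)
print(PVy @ one)                         # 0: the two components are orthogonal
print(np.allclose(y, P1y + PVy))         # True: y = P1y + PVy, a unique decomposition
```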
To see this, we evaluate PAy = P1y +PA/1y.P1y = y1111= 1111PA/1y = PAy P1y= (ya1)a1a12+ (ya2)a2a22 (y1)112=2i=1(yai )ai 2 (y1)12ai=2i=1(yi y)ai =2i=1iaisince (A/1)1 and 1 = a1 + a2. As an exercise, find the projection of y onto A and thePA/1y2.From the analysis of variance, the coefficients of the basis vectors for 1 and A/1 yield theestimators for the overall effect and the treatment effects i for the two-group ANOVAmodel employing the restriction on the parameters that 1+2 = 0. Indeed, the restrictioncreates a basis for A/1. Furthermore, the total sum of squares, y2, is the sum of squaredlengths of the projections of y onto each subspace, y2 = P1y2+PA/1y2+PAy2.The dimensions of the subspaces for I groups, corresponding to the decomposition of y2,satisfy the relationship that n = 1 + (I 1) + (n I ) where the dim A = I and y Vn.Hence, the degrees of freedom of the subspaces are the dimensions of the orthogonal vectorspaces {1}, {A/1} and {A}for the design space A. Finally, the PA/1y2 is the hypothesissum of squares and the PAy2 is the error sum of squares. Additional relationships be-tweenlinear algebra and linear models using ANOVA and regression models are containedin the exercises for this section. We conclude this section with some inequalities useful instatistics and generalize the concepts of distance and vector norms. 46. 2.3 Bases, Vector Norms, and the Algebra of Vector Spaces 21e. Vector Inequalities, Vector Norms, and Statistical DistanceIn a Euclidean vector space, two important inequalities regarding inner products are theCauchy-Schwarz inequality and the triangular inequality.Theorem 2.3.5 If x and y are vectors in a Euclidean space V, then1. (xy)2 x2 y2 (Cauchy-Schwarz inequality)2. x + y x + y (Triangular inequality)In terms of the elements of x and y, (1) becomesixi yi2ix2iiy2i(2.3.2)which may be used to show that the zero-order Pearson product-moment correlation co-efficientis bounded by 1. Result (2) is a generalization of the familiar relationship fortriangles in two-dimensional geometry.The Euclidean norm is really a member of Minkowskis family of norms (Lp-norms)xp =ni=1|xi |p1/p(2.3.3)where 1 p and x is an element of a normed vector space V. For p = 2, wehave the Euclidean norm. When p = 1, we have the minimum norm, x1. For p = ,Minkowskis norm is not defined, instead we define the maximum or infinity norm of x asx = max1in|xi | (2.3.4)Definition 2.3.7 A vector norm is a function defined on a vector space that maps a vectorinto a scalar value such that1. xp 0, and xp = 0 if and only if x = 0,2. xp = || xp for R,3. x + yp xp + yp,for all vectors x and y.Clearly the x2 = (xx)1/2 satisfies Definition 2.3.7. This is also the case for the maxi-mumnorm of x. In this text, the Euclidean norm (L2-norm) is assumed unless noted other-wise.Note that (||x||2)2 = ||x||2 = xx is the Euclidean norm squared of x.While Euclidean distances and norms are useful concepts in statistics since they help tovisualize statistical sums of squares, non-Euclidean distance and non-Euclidean norms areoften useful in multivariate analysis.We have seen that the Euclidean norm generalizes to a 47. 22 2. Vectors and Matricesmore general function that maps a vector to a scalar. In a similar manner, we may generalizethe concept of distance. A non-Euclidean distance important in multivariate analysis is thestatistical or Mahalanobis distance.To motivate the definition, consider a normal random variable X with mean zero andvariance one, X N(0, 1). 
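The comparison developed in the next paragraph can be previewed numerically: a point two units from the origin is far for a standard normal variable but comparatively close for one with variance 4, and dividing by the variance, as in the statistical distance defined below in (2.3.5), captures this. The sketch is an illustration in Python; scipy.stats is used only to obtain the normal probabilities and is not part of the text's software.

```python
from scipy.stats import norm

x0, var_x = 2.0, 1.0        # an observation from X ~ N(0, 1)
y0, var_y = 2.0, 4.0        # an observation from Y ~ N(0, 4)

print(abs(x0), abs(y0))                       # same Euclidean distance from the origin: 2, 2

D2_x = x0 ** 2 / var_x                        # squared statistical distance: 4
D2_y = y0 ** 2 / var_y                        # squared statistical distance: 1
print(D2_x, D2_y)

print(norm.cdf(2, 0, 1) - norm.cdf(0, 0, 1))  # P(0 <= X <= 2) ~ 0.4772
print(norm.cdf(2, 0, 2) - norm.cdf(0, 0, 2))  # P(0 <= Y <= 2) ~ 0.3413 (scale = std = 2)
```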
An observation xo that is two standard deviations from themean lies a distance of two units from the origin since the xo = (02 + 22)1/2 = 2 andthe probability that 0 x 2 is 0.4772. Alternatively, suppose Y N(0, 4) where thedistance from the origin for yo = xo is still 2. However, the probability that 0 y 2becomes 0.3413 so that y is closer to the origin than x. To compare the distances, we musttake into account the variance of the random variables. Thus, the squared distance betweenxi and x j is defined asD2i j= (xi x j )2/ 2 = (xi x j )( 2)1(xi x j ) (2.3.5)where 2 is the population variance. For our example, the point xo has a squared statisticaldistance D2i j= 4 while the point yo = 2 has a value of D2i j= 1 which maintains the in-equalityin probabilities in that Y is closer to zero statistically than X. Di j is the distancebetween xi and x j , in the metric of 2 called the Mahalanobis distance between xi and x j .When 2 = 1, Mahalanobis distance reduces to the Euclidean distance.Exercises 2.31. For the vectorsx =132, y =120, and z =112and scalars = 2 and = 3, verify the properties given in Theorem 2.3.2.2. Using the law of cosinesy x2 = x2 + y2 2 x y cos derive equation (2.3.1).3. For the vectorsy1 =221 and y2 =301(a) Find their lengths, and the distance and angle between them.(b) Find a vector of length 3 with direction cosines2 and cos 2 = y2/ y = 1/cos 1 = y1/ y = 1/2where 1 and 2 are the cosines of the angles between y and each of its refer-encesaxes e1=10, and e2=01.(c) Verify that cos2 1 + cos2 2 = 1. 48. 2.3 Bases, Vector Norms, and the Algebra of Vector Spaces 234. Fory =197 and V =v1 =231 , v2 =504(a) Find the projection of y onto V and interpret your result.(b) In general, if yV , can you find the PV y?5. Use the Gram-Schmidt process to find an orthonormal basis for the vectors in Exer-cise2.2, Problem 4.6. The vectorsv1 =121 and v2 =230span a plane in Euclidean space.(a) Find an orthogonal basis for the plane.(b) Find the orthocomplement of the plane in V3.(c) From (a) and (b), obtain an orthonormal basis for V3.3, 1/7. Find an orthonormal basis for V3 that includes the vector y = [1/3,3].1/8. Do the following.(a) Find the orthocomplement of the space spanned by v = [4, 2, 1] relative toEuclidean three dimensional space, V3.(b) Find the orthocomplement of v = [4, 2, 1] relative to the space spanned byv1= [1, 1, 1] and v1= [2, 0,1].(c) Find the orthocomplement of the space spanned by v1= [1, 1, 1] and v2=[2, 0,1] relative to V3.(d) Write the Euclidean three-dimensional space as the direct sum of the relativespaces in (a), (b), and (c) in all possible ways.9. Let V be spanned by the orthonormal basisv1 =1/201/20and v2 =0201/21/(a) Express x = [0, 1, 1, 1] as x = x1 + x2,where x1 V and x2 V.(b) Verify that the PV x2 = Pv1x2 + Pv2x2.(c) Which vector y V is closest to x? Calculate the minimum distance. 49. 24 2. Vectors and Matrices10. Find the dimension of the space spanned byv11111v21100v30011v41010v5010111. Let yn Vn, and V = {1}.(a) Find the projection of y onto V, the orthocomplement of V relative to Vn.(b) Represent y as y = x1 + x2, where x1 V and x2 V. What are the dimen-sionsof V and V?(c) Since y2 = x12 +x22 = PV y2 + 2, determine a general formPVyfor each of the components of y2. DividePVy2 by the dimension of V.What do you observe about the ratioPVy2/ dim V?12. Let yn Vn be a vector of observations, y = [y1, y2, . . . , yn] and let V = {1, x}where x = [x1, x2, . . . , xn].(a) Find the orthocomplement of 1 relative to V (that is, V/1) so that 1(V/1) =V. 
What is the dimension of V/1?(b) Find the projection of y onto 1 and also onto V/1. Interpret the coefficientsof the projections assuming each component of y satisfies the simple linearrelationship yi = + (xi x).(c) Find y PV y and y PV y2. How are these quantities related to the simplelinear regression model?13. For the I Group ANOVA model yi j = + i + ei j where i = 1, 2, . . . , I and j =1, 2, . . . , n observations per group, evaluate the square lengths P1y2 , 2PA/1y,and 2 for V = {1, a1, . . . , aI }. Use Figure 2.3.3 to relate these quantitiesPAygeometrically.14. Let the vector space V be spanned byv1 11111111{ 1 v2 v3 v4 v5 v6 v7 v8 v911110000 00001111,110011000011001111000000001100000000110000000011A, B, AB } 50. 2.4 Basic Matrix Operations 25(a) Find the space A+ B = 1(A/1)(B/1) and the space AB/(A+ B) so thatV = 1 (A/1) (B/1) + [AB/(A + B)]. What is the dimension of each ofthe subspaces?(b) Find the projection of the observation vector y = [y111, y112, y211, y212, y311,y312, y411, y412] in V8 onto each subspace in the orthogonal decomposition of Vin (a). Represent these quantities geometrically and find their squared lengths.(c) Summarize your findings.15. Prove Theorem 2.3.4.16. Show that Minkowskis norm for p = 2 satisfies Definition 2.3.7.17. For the vectors y = [y1, . . . , yn] and x = [x1, . . . , xn] with elements that have amean of zero,(a) Show that s2y= y2 /(n 1) and s2x= x2 / (n 1) .(b) Show that the sample Pearson product moment correlation between two obser-vationsx and y is r = xy/ x y .2.4 Basic Matrix OperationsThe organization of real numbers into a rectangular or square array consisting of n rowsand d columns is called a matrix of order n by d and written as n d.Definition 2.4.1 A matrix Y of order n d is an array of scalars given asYnd =y11 y12 y1dy21 y22 y2d.........yn1 yn2 yndThe entries yi j of Y are called the elements of Y so that Y may be represented as Y = [yi j ].Alternatively, a matrix may be represented in terms of its column or row vectors asYnd = [v1, v2, . . . , vd ] and vj Vn (2.4.1)orYnd =y1y2...ynand yi VdBecause the rows of Y are usually associated with subjects or individuals each yi is amember of the person space while the columns vj of Y are associated with the variablespace. If n = d, the matrix Y is square. 51. 26 2. Vectors and Matricesa. Equality, Addition, and Multiplication of MatricesMatrices like vectors may be combined using the operations of addition and scalar multi-plication.For two matrices A and B of the same order, matrix addition is defined asA + B = C if and only if C =ci j=ai j + bi j(2.4.2)The matrices are conformable for matrix addition only if both matrices are of the sameorder and have the same number of row and columns.The product of a matrix A by a scalar isA = A = [ai j ] (2.4.3)Two matrices A and B are equal if and only if [ai j] = [bi j ]. To extend the concept of aninner product of two vectors to two matrices, the matrix product AB = C is defined if andonly if the number of columns in A is equal to the number of rows in B. For two matricesAnd and Bdm, the matrix (inner) product is the matrix Cnm such thatAB = C = [ci j ] for ci j =dk=1aikbk j (2.4.4)From (2.4.4), we see that C is obtained by multiplying each row of A by each columnof B. The matrix product is conformable if the number of columns in the matrix A is equalto the number of rows in the matrix B. The column order is equal to the row order formatrix multiplication to be defined. In general, AB= BA. If A = B and A is square, thenAA = A2. 
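The conformability rules and the fact that AB and BA generally differ are easy to see numerically. The sketch below is an illustration in Python/NumPy with arbitrary 2 x 2 matrices; it is not one of the text's SAS programs.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 1]])

print(A + B)    # elementwise addition of conformable matrices
print(A @ B)    # matrix product: rows of A times columns of B -> [[2 3] [4 7]]
print(B @ A)    # generally different from A @ B               -> [[3 4] [4 6]]
print(A @ A)    # for a square matrix, A @ A plays the role of A squared
```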
When A2= A, the matrix A is said to be idempotent.From the definitions and properties of real numbers, we have the following theorem formatrix addition and matrix multiplication.Theorem 2.4.1 For matrices A,B,C, and D and scalars and , the following propertieshold for matrix addition and matrix multiplication.1. A + B = B + A2. (A + B) + C = A + (B + C)3. (A + B) =A+B4. ( + )A =A+A5. (AB)C = A(BC)6. A(B + C) = AB + AC7. (A + B)C = AC + BC8. A + (A) = 09. A + 0 = A10. (A + B)(C + D) = A(C + D) + B(C + D) = AC + AD + BC + BD 52. 2.4 Basic Matrix Operations 27Example 2.4.1 LetA =1 23 74 8 and B =2 27 53 1ThenA + B =3 410 121 9 and 5(A + B) =15 2050 605 45For our example, AB and BA are not defined. Thus, the matrices are said to not be con-formablefor matrix multiplication. The following is an example of matrices that are con-formablefor matrix multiplication.Example 2.4.2 LetA =1 2 35 1 0and B =1 2 11 2 01 2 1ThenAB =(1)(1) + 2(1) + 3(1) 1(2) + 2(2) + 3(2) 1(1) + 2(0) + 3(1)5(1) + 1(1) + 0(1) 5(2) + 1(2) + 0(2) 5(1) + 1(0) + 0(1)=4 8 46 12 5Alternatively, if we represent A and B asA =[a1, a2, . . . , ad ] and B =b1b2...bnThen the matrix product is defined as an outer productAB =dk=1akbkwhere each Ck = akbk is a square matrix, the number of rows is equal to the number ofcolumns. For the example, lettinga1 =15, a2 =21, a3 =30b1= [1, 2, 1] , b2= [1, 2, 0] , b3= [1, 2,1] 53. 28 2. Vectors and MatricesThen3k=1akbk= C1 + C2 + C3=1 2 15 10 5+2 4 01 2 0+3 6 30 0 0=4 8 46 12 5= ABThus, the inner and outer product definitions of matrix multiplication are equivalent.b. Matrix TranspositionIn Example 2.4.2, we defined B in terms of row vectors and A in terms of column vectors.More generally, we can form the transpose of a matrix. The transpose of a matrix And isthe matrix Adn obtained from A =ai jby interchanging rows and columns of A. Thus,Adn=a11 a21 an1a12 a22 an2.........a1d a2d and(2.4.5)Alternatively, if A = [ai j ] then A = [a ji ]. A square matrix A is said to be symmetric ifand only if A = A or [ai j] = [a ji ]. A matrix A is said to be skew-symmetric if A = A.Properties of matrix transposition follow.Theorem 2.4.2 For matrices A, B, and C and scalars and , the following propertieshold for matrix transposition. = BA1. (AB)2. (A + B) = A + B3. (A) = A4. (ABC) = CBA5. (A) = A6. (A+B) = A + BExample 2.4.3 LetA =1 31 4and B =2 11 1 54. 2.4 Basic Matrix Operations 29ThenA =1 13 4and B =2 11 1AB =5 42 3and (AB) =5 24 3= BA(A + B) =3 04 5= A + BThe transpose operation is used to construct symmetric matrices. Given a data matrixYnd , the matrix YY is symmetric, as is the matrix YY. However, YY= YY since theformer is of order d d where the latter is an n n matrix.c. Some Special MatricesAny square matrix whose off-diagonal elements are 0s is called a diagonal matrix. A di-agonalmatrix Ann is represented as A = diag[a11, a22, . . . , ann] or A = diag[aii ] and isclearly symmetric. If the diagonal elements, aii = 1 for all i , then the diagonal matrix Ais called the identity matrix and is written as A = In or simply I. Clearly, IA = AI = Aso that the identity matrix behaves like the number 1 for real numbers. Premultiplicationof a matrix Bnd by a diagonal matrix Rnn = diag[rii ] multiplies each element in thei th row of Bnd by rii ; postmultiplication of Bnd by a diagonal matrix Cdd = diag[c j j ]multiplies each element in the j th column of B by c j j . 
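The row and column scaling effect of diagonal pre- and postmultiplication can be seen directly. The sketch below is an illustrative Python/NumPy fragment with arbitrary entries.

```python
import numpy as np

B = np.array([[1, 2, 3],
              [4, 5, 6]])       # a 2 x 3 matrix
R = np.diag([10, 100])          # 2 x 2 diagonal matrix
C = np.diag([1, 2, 3])          # 3 x 3 diagonal matrix

print(R @ B)    # premultiplication scales the rows of B by 10 and 100
print(B @ C)    # postmultiplication scales the columns of B by 1, 2, and 3
```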
A matrix 0 with all zeros is calledthe null matrix.A square matrix whose elements above (or below) the diagonal are 0s is called a lower(or upper) triangular matrix. If the elements on the diagonal are 1s, the matrix is called aunit lower (or unit upper) triangular matrix.Another important matrix used in matrix manipulation is a permutation matrix. An ele-mentarypermutation matrix is obtained from an identity matrix by interchanging two rows(or columns) of I. Thus, an elementary permutation matrix is represented as Ii,i . Premul-tiplicationof a matrix A by Ii,i, creates a new matrix with interchanged rows of A whilepostmultiplication by Ii,i, creates a new matrix with interchanged columns.Example 2.4.4 LetX =1 1 01 1 01 0 11 0 1and I1,2 =0 1 01 0 00 0 1 55. 30 2. Vectors and MatricesThenA = XX =4 2 22 2 02 0 2 is symmetricI1, 2A =2 2 04 2 22 0 2 interchanges rows 1 and 2 of AAI1, 2 =2 4 22 2 00 2 2 interchanges columns 1 and 2 of AMore generally, an n n permutation matrix is any matrix that is constructed from Inby permuting its columns. We may represent the matrix as In, n since there are n! differentpermutation matrices of order n.Finally, observe that InIn = I2n= In so that In is an idempotent matrix. Letting Jn =1n1n, the matrix Jn is a symmetric matrix of ones. Multiplying Jn by itself, observe thatJ2n= nJn so that Jn is not idempotent. However, n1Jn and In n1Jn are idempotentmatrices. If A2nn= 0, the matrix A is said to be nilpotent. For A3 = 0, the matrix istripotent and if Ak = 0 for some finite k0, it is k potent. In multivariate analysisand linear models, symmetric idempotent matrices occur in the context of quadratic forms,Section 2.6, and in partitioning sums of squares, Chapter 3.d. Trace and the Euclidean Matrix NormAn important operation for square matrices is the trace operator. For a square matrixAnn = [ai j ], the trace of A, represented as tr(A), is the sum of the diagonal elementsof A. Hence,tr (A) =ni=1aii (2.4.6)Theorem 2.4.3 For square matrices A and B and scalars and , the following propertieshold for the trace of a matrix.1. tr(A+B) = tr(A) + tr(B)2. tr(AB) =tr (BA)3. tr(A) = tr(A)4. tr(AA) =tr(AA) = i, ja2i j and equals 0, if and only if A = 0.Property (4) is an important property for matrices since it generalizes the Euclideanvector norm squared to matrices. The Euclidean norm squared of A is defined asA2 =ija2i j= tr(AA) =tr(AA) 56. 2.4 Basic Matrix Operations 31The Euclidean