21
Vol. 32, No. 4, July–August 2013, pp. 533–553 ISSN 0732-2399 (print) ISSN 1526-548X (online) http://dx.doi.org/10.1287/mksc.2013.0779 © 2013 INFORMS The Dimensionality of Customer Satisfaction Survey Responses and Implications for Driver Analysis Joachim Büschken Catholic University of Eichstätt-Ingolstadt, 84059 Ingolstadt, Germany, [email protected] Thomas Otter Goethe University Frankfurt, 60629 Frankfurt am Main, Germany, [email protected] Greg M. Allenby Fisher College of Business, Ohio State University, Columbus, Ohio 43210, [email protected] T he canonical design of customer satisfaction surveys asks for global satisfaction with a product or service and for evaluations of its distinct attributes. Users of these surveys are often interested in the relationship between global satisfaction and attributes; regression analysis is commonly used to measure the conditional associations. Regression analysis is only appropriate when the global satisfaction measure results from the attribute evaluations and is not appropriate when the covariance of the items lie in a low-dimensional subspace, such as in a factor model. Potential reasons for low-dimensional responses are that responses may be haloed from overall satisfaction and there may be an unintended lack of item specificity. In this paper we develop a Bayesian mixture model that facilitates the empirical distinction between regression models and relatively much lower-dimensional factor models. The model uses the dimensionality of the covariance among items in a survey as the primary classification criterion while accounting for the heterogeneous usage of rating scales. We apply the model to four different customer satisfaction surveys that evaluate hospitals, an academic program, smartphones, and theme parks, respectively. We show that correctly assessing the heterogeneous dimensionality of responses is critical for meaningful inferences by comparing our results to those from regression models. Key words : Bayesian estimation; surveys; information processing History : Received: June 4, 2010; accepted: January 2, 2013; Eric Bradlow and then Preyas Desai served as the editor-in-chief and Fred Feinberg served as associate editor for this article. Published online in Articles in Advance April 8, 2013. 1. Introduction Managers use customer satisfaction surveys to mea- sure levels of satisfaction and study drivers of satis- faction. Measures of levels of satisfaction distinguish between more and less satisfied customers, serve as benchmarks against the competition, and are use- ful for tracking customer satisfaction through time (Fornell et al. 1996). Satisfaction driver research relates temporal or cross-sectional variation in overall satis- faction to variation in the evaluation of conceptually independent components of a product or service. Surveys are designed to elicit evaluations of the components of a product or service and an over- all measure of satisfaction. Driver analysis uses the covariance among evaluations to estimate the con- ditional relationship of the overall measure to its components. The standard approach relies on regres- sion analysis and, more recently, finite mixture regres- sion analysis (e.g., Wedel and DeSarbo 2002). 1 The conceptual advantage of regression analysis is that the 1 When these surveys contain explicit multi-item measurement models, driver analysis uses the covariance among latent variables estimated coefficients measure the conditional effects of changing one component score while holding all other component scores constant. In practice, how- ever, it is an open question whether regression analy- sis is meaningful for a specific data set. Despite a firm’s best efforts and known constraints on respondent time, a specific survey may accidentally fail at measuring conceptually independent compo- nent evaluations and effectively ask about total sat- isfaction over and over again. In practice, regression analysis and other techniques for identifying drivers of satisfaction applied to such data will result in rapidly decreasing estimates of driver importance as the number of “predictor” items increases. This is due to the property of a regression coefficient as a con- ditional effect given the other covariates and due to to infer conditional relationships between latent component eval- uations and latent overall satisfaction; latent variable regression models, e.g., LISREL, are used. In this paper we focus on intended single-item measurements of component evaluations and overall satisfaction. However, our conceptual discussion applies straight- forwardly to the analysis of latent covariance structures. 533

The Dimensionality of Customer Satisfaction Survey Responses and Implications for Driver Analysis

Embed Size (px)

Citation preview

Vol. 32, No. 4, July–August 2013, pp. 533–553ISSN 0732-2399 (print) � ISSN 1526-548X (online) http://dx.doi.org/10.1287/mksc.2013.0779

© 2013 INFORMS

The Dimensionality of Customer Satisfaction SurveyResponses and Implications for Driver Analysis

Joachim BüschkenCatholic University of Eichstätt-Ingolstadt, 84059 Ingolstadt, Germany, [email protected]

Thomas OtterGoethe University Frankfurt, 60629 Frankfurt am Main, Germany, [email protected]

Greg M. AllenbyFisher College of Business, Ohio State University, Columbus, Ohio 43210, [email protected]

The canonical design of customer satisfaction surveys asks for global satisfaction with a product or serviceand for evaluations of its distinct attributes. Users of these surveys are often interested in the relationship

between global satisfaction and attributes; regression analysis is commonly used to measure the conditionalassociations. Regression analysis is only appropriate when the global satisfaction measure results from theattribute evaluations and is not appropriate when the covariance of the items lie in a low-dimensional subspace,such as in a factor model. Potential reasons for low-dimensional responses are that responses may be haloedfrom overall satisfaction and there may be an unintended lack of item specificity. In this paper we developa Bayesian mixture model that facilitates the empirical distinction between regression models and relativelymuch lower-dimensional factor models. The model uses the dimensionality of the covariance among items in asurvey as the primary classification criterion while accounting for the heterogeneous usage of rating scales. Weapply the model to four different customer satisfaction surveys that evaluate hospitals, an academic program,smartphones, and theme parks, respectively. We show that correctly assessing the heterogeneous dimensionalityof responses is critical for meaningful inferences by comparing our results to those from regression models.

Key words : Bayesian estimation; surveys; information processingHistory : Received: June 4, 2010; accepted: January 2, 2013; Eric Bradlow and then Preyas Desai served as the

editor-in-chief and Fred Feinberg served as associate editor for this article. Published online in Articles inAdvance April 8, 2013.

1. IntroductionManagers use customer satisfaction surveys to mea-sure levels of satisfaction and study drivers of satis-faction. Measures of levels of satisfaction distinguishbetween more and less satisfied customers, serve asbenchmarks against the competition, and are use-ful for tracking customer satisfaction through time(Fornell et al. 1996). Satisfaction driver research relatestemporal or cross-sectional variation in overall satis-faction to variation in the evaluation of conceptuallyindependent components of a product or service.Surveys are designed to elicit evaluations of thecomponents of a product or service and an over-all measure of satisfaction. Driver analysis uses thecovariance among evaluations to estimate the con-ditional relationship of the overall measure to itscomponents. The standard approach relies on regres-sion analysis and, more recently, finite mixture regres-sion analysis (e.g., Wedel and DeSarbo 2002).1 Theconceptual advantage of regression analysis is that the

1 When these surveys contain explicit multi-item measurementmodels, driver analysis uses the covariance among latent variables

estimated coefficients measure the conditional effectsof changing one component score while holding allother component scores constant. In practice, how-ever, it is an open question whether regression analy-sis is meaningful for a specific data set.

Despite a firm’s best efforts and known constraintson respondent time, a specific survey may accidentallyfail at measuring conceptually independent compo-nent evaluations and effectively ask about total sat-isfaction over and over again. In practice, regressionanalysis and other techniques for identifying driversof satisfaction applied to such data will result inrapidly decreasing estimates of driver importance asthe number of “predictor” items increases. This is dueto the property of a regression coefficient as a con-ditional effect given the other covariates and due to

to infer conditional relationships between latent component eval-uations and latent overall satisfaction; latent variable regressionmodels, e.g., LISREL, are used. In this paper we focus on intendedsingle-item measurements of component evaluations and overallsatisfaction. However, our conceptual discussion applies straight-forwardly to the analysis of latent covariance structures.

533

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses534 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

the item scores each reflecting the same information(up to error) about total satisfaction. We provide anumerical illustration in the Web appendix (availableat http://dx.doi.org/10.1287/mksc.2013.0779).

Some respondents may fail to think clearly aboutthe component evaluations, whereas others are ableand motivated to provide separate evaluations foreach component. In this case the joint distribution ofevaluations is a mixture of low- and high-dimensionalresponses, and regression or other types of driveranalysis should be informed by the high-dimensionalresponses only. However, low-dimensional responsescan still be informative about levels of satisfaction.Alternatively, some respondents may provide randomresponses that indiscriminately fluctuate around someanchor point on the scale. Such responses are collec-tively outliers and do not contribute any substantivelyinteresting information. In fact, the covariance amongthese responses is neither low nor high dimensionalbut zero. Technically, these responses are still informa-tive about item-specific response levels but generallynot about levels of satisfaction.2

The vast majority of customer satisfaction surveysrely on rating scales. These scales are assumed toresult in ordinal data at the individual-respondentlevel, up to measurement error. The basic axiomsof measurement (e.g., Krantz et al. 1971) rule outa direct comparison of ordinal relationships amongrespondents. For example, a two-scale point differ-ence between items reported by one respondent mayequate to a one-scale point difference reported byanother respondent. Both responses only indicate thatthe two respondents’ preferences are in the sameorder. However, when each respondent uses the sameordinal scale across multiple items, it is possible tocharacterize his preference for ordinal categories andseparate out scale use effects that can mask the dimen-sionality of substantive preferences (Rossi et al. 2001,Van Rosmalen et al. 2010, Ying et al. 2006).

In this paper we develop a Bayesian mixturemodel for customer satisfaction surveys that uses thedimensionality of covariances among responses asthe primary classification criterion while accountingfor heterogeneous scale use.3 The model probabilis-tically sorts responses according to the dimensional-ity of their covariance, distinguishing responses witha full covariance structure from those with a lower

2 Response patterns with zero substantive (i.e., net of scale usage)covariance likely result from a lack of motivation or ability to pro-cess the items on the survey.3 When surveys use multi-item scales where individual items arereflective indicators of latent constructs, the classification criterionis the dimensionality of the covariance among latent constructs.However, we leave the technical development of our conceptualargument in a survey context with explicit measurement modelsfor future research.

dimension. This way, driver analysis can be based onthe covariance of full-dimensional responses whereregression is a priori meaningful. It may still turnout to be managerially uninformative a posteriori,depending on the estimated regression coefficients,and the degree of prior management knowledge.However, the model identifies and thus controlsfor low-dimensional covariance structures where theapplication of regression lacks meaning a priori. Sim-ilarly, the model identifies and controls for responseswith zero covariance.

We apply our model to four different customer sat-isfaction surveys that evaluate hospitals, an academicprogram, smartphones, and theme parks, respectively.We find that only one data set, the smartphone sur-vey, essentially conforms to a regression model. Forthe theme park survey, the model determines thatalmost all responses are best accounted for by a one-factor model. The ability to obtain positive support fora much lower-dimensional model than the unstruc-tured, high-dimensional covariance underlying theregression model is unique to the Bayesian approach.The two remaining data sets result in a mixture of aregression model and low-dimensional factor models.In all cases the proposed classification according tothe dimensionality of responses results in intuitivelymeaningful inferences and fits the data better thanordinal regression models and mixtures of ordinalregression models.

The organization of the rest of this paper is as fol-lows. We discuss prior research in §2. We developthe model in §3. In §4, we report results from fourempirical applications. We provide a discussion of ourresults and future research topics in §5.

2. Prior ResearchMethods to determine the dimensionality of a set ofquestionnaire items, or more generally, a set of mea-surements, originated in psychology and psychomet-rics and have found applications across a variety ofdisciplines, including marketing (e.g., Beckwith andLehmann 1975, Iacobucci 1994, Sonnier and Ainslie2011) and finance (e.g., Jones 2006). A fundamen-tal assumption of these methods is that some fixed,possibly unknown, number of latent dimensions orfactors—smaller than the number of measurementstaken—generates the covariance among the observedmeasurements. From the earliest work in exploratoryfactor analysis (Spearman 1904) to sophisticated,detailed confirmatory factor models (e.g., Gibbonsand Hedeker 1992, Bradlow et al. 1999, Wang et al.2002), the hypothesis of a smaller number of latentdimensions is linked to principles of constructing themeasurements. The individual items on the question-naire are formulated with the goal of tapping into

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 535

hypothesized latent dimensions, and each dimensionis represented by multiple items (Thurstone 1937).As a consequence, a general assumption is that thelatent dimensionality of a particular set of measure-ments is homogeneous in a population.

The design of many commercial customer satisfac-tion surveys proceeds differently. Designers often facesevere constraints on respondent time and motiva-tion—they need to keep surveys short and simplewhile trying to maximize information about a multi-tude of individual service or product components andtheir relationship to overall satisfaction. In contrast topsychological tests, the implicit intent of many com-mercial customer satisfaction surveys is therefore toelicit evaluations of dimension equal to the numberof items in the questionnaire so that driver analysiscan be conducted.4 In Rossiter’s C-OAR-SE terminol-ogy, applied customer satisfaction surveys measurehow customers feel about a multitude of concreteattributes (Rossiter 2002); the goal is to collect qualita-tively different information with each additional ques-tion on the survey.5

From the point of view of psychological test devel-opment, or more generally, psychometric measure-ment, this practice may appear strange. However,whereas psychometric measurement eventually asksfor reliable estimates of individuals’ scores along well-defined prior dimensions such as “analytical ability,”customer satisfaction surveys aimed at driver analy-sis emphasize the discovery of relationships betweencomponent evaluations and overall satisfaction. Andin many of these customer satisfaction applications,the goal of highly reliable posterior estimates of latenttrue scores has to be traded off against the measure-ment of additional putative drivers of satisfaction. Ina sense, psychometrics has an inherent preference foritems with low unique variance relative to other itemsmeasuring the same latent dimension, whereas cus-tomer satisfaction research aimed at driver analysisimplicitly maximizes the uniqueness of items in anattempt to cover as many conceptually independentdrivers as possible.6

4 However, a model that motivates observed responses in customersatisfaction surveys from underlying latent variables of lowerdimensionality similar to psychometric testing models was pre-sented by Bradlow and Zaslavsky (1999).5 We are aware of the fact that (mostly) academic studies oftenuse multiple items for individual, more abstract attributes with theintent of reducing measurement error (e.g., Smith et al. 1999). Thesestudies are based on strong prior knowledge about relationshipsamong limited numbers of latent variables. In contrast, standarddriver analysis aims at the “discovery” of the relative importanceof a larger number of putative drivers.6 From a measurement perspective, driver analysis attempts toestablish a purely formative measurement model. The reliability ofsuch a model increases in the number of “causal” indicators with

A basic goal of marketing research is to iden-tify drivers of some desired overall outcome (e.g.,overall satisfaction using regression models). In con-trast to psychometric models that are built aroundthe idea of theory-guided dimensionality reduction,driver analysis in marketing seeks to understand howhypothetical changes in individual component eval-uations translate into changes in overall satisfaction.More recently, this approach has been generalized toaccount for heterogeneity in the coefficients that con-vert changes in component evaluations into changesin total satisfaction (Wedel and DeSarbo 2002, Wedeland Kamakura 2000).7

Valid answers to the question of how hypotheticalchanges in individual component evaluations trans-late into changes in overall satisfaction require thatresponses to component items on a customer satisfac-tion survey are “unique” in the sense that their covari-ance cannot be explained by a relatively small numberof underlying latent dimensions.8 Here, psychomet-rics and psychology, guided by strong theories aboutthe maximum dimensionality behind the covarianceof a set of measurements, rely on hypothesis test-ing. The null hypothesis corresponds to the theoreticalnumber of dimensions, and empirical results that failto reject this null hypothesis are taken as support forthe theory (Jöreskog 1967).

In contrast, driver analysis in customer satisfactionresearch seeks positive evidence that responses are ofdimension equal to the number of questions asked.This is different from rejecting hypotheses about alower-dimensional representation (Kass and Raftery1995). The difference becomes critical in the context ofcomparing the saturated model with the number ofdimensions equal to the number of items to modelsthat posit a lower-dimensional representation. Driveranalysis should not “wait for” the rejection of low-dimensional representations but build on positive evi-dence for unique items, i.e., a covariance structure ofdimensionality equal to the number of items on thesurvey. Similarly, because it is impossible to reject thesaturated model in favor of a more constrained modelwithout imposing a penalty for the number of param-eters, tools are needed that build on positive evidencein favor of a lower-dimensional representation before

significant (conditional) effects on the latent variable, i.e., overallsatisfaction in this context (e.g., Bollen 1989).7 When the goal is to understand heterogeneity in conditionalrelationships among latent variables—a case not covered in thispaper—the analysis relies on finite mixtures of structural equationmodels.8 If this condition is not met, the analyst will run into multicolinear-ity problems, as demonstrated in the Web appendix. Practitionersand academics alike very often find signs of multicolinearity amongputative drivers in these analyses (e.g., Finn 2011).

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses536 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

we can learn about the potentially heterogeneousdimensionality of responses in a particular data set.

The possibility that a given set of items on a ques-tionnaire or survey is of different inherent dimension-ality in different “populations” has received limitedempirical attention both in marketing and psychol-ogy. Although discrete mixtures of factor models havefound applications in psychology (e.g., Lubke andMuthén 2005), the focus has been on relaxing thedistributional assumptions of factor scores. The ideais that the joint distribution of responses to a mea-surement invariant test is caused by an unobservedmixture of discrete “types” that each consist of con-tinuously heterogeneous respondents. Measurementinvariance requires that the way a latent variableexpresses itself in observable measures is populationinvariant and therefore, by definition, requires a fixeddimensionality of the measurements (Meredith 1993).

However, the theory of survey design (e.g., Groveset al. 2004, Hayes 2008) suggests various causes forsurvey responses that are mixtures of distributionswith different covariance dimensionality. Potentialreasons for low-dimensional responses to a set of, inprinciple, specific, unique items include halo effectsfrom overall satisfaction, a lack of motivation to pro-cess the individual items in detail, or some combina-tion of these.

The concept of a halo goes back to Thorndike(1920), who described it as the tendency to think of aperson in general as rather “good” or “inferior” andto color the judgment of specific personal qualities bythis general feeling. Thus, reporting a haloed responseto a specific question is an instance of the more gen-eral concept of response substitution (Gal and Rucker2011). Both the theory of halo and that of responsesubstitution, as well as theories about the motivationto process individual items in detail, are formulatedat the level of individual respondents. Therefore it isa valid empirical question if a sample of responses toa particular survey is of heterogeneous dimensional-ity. And, in a sense, an answer to the well-establishedquestion about the inherent dimensionality of a set ofmeasurements is incomplete without accounting forthe possibility that different respondents may respondto the same set of items with differing underlyingdimensionalities.

Technically, we build on the literatures on finitemixtures (Frühwirth-Schnatter 2006), factor analy-sis (Mulaik 1972), and models for rating scales(Johnson and Albert 1999). We develop Bayesianexploratory factor analysis (Press and Shigemasu1989) as our model of reduced dimensionality anduse finite mixtures to capture the heterogeneity inthe dimensionality of responses. We need to accountfor heterogeneous scale use (Cunningham et al. 1977,Rossi et al. 2001, van Rosmalen et al. 2010, Ying

et al. 2006) when investigating potentially heteroge-neous dimensionalities of survey responses. A scaleusage model is essentially a device to distinguishbetween substantive and nonsubstantive covariancein a particular data set, and we are interested in thestructure of the substantive covariances. Joint identi-fication of substantive and nonsubstantive covariancecomes from the definition of a scale use model in anymodel of scale usage.

In these models, scale use is a respondent trait (ina given survey) that applies indiscriminately to everyitem on that survey. Therefore, models of scale useper se cannot explain variability across items within arespondent. The variance and covariance of responseswithin respondents then identifies the model of sub-stantive interest. The scale use model picks up vari-ation on a collection of (practically) identical scalesacross respondents that is orthogonal to any differ-ences between items. Therefore, joint identification ofsubstantive and nonsubstantive covariances hinges onsurvey items that are consistently perceived as differ-ent from each other.

An intuitive example for consistent differencesbetween items is the combination of two “positive”items and one “negative” item. Depending on thesubstantive model, one might expect positive covari-ance between the positive items and negative covari-ance between each of the positive items and thenegative item. Models of scale use can motivate nei-ther the negative covariances in this example nor thedifferences in the covariances between different itemsbecause they do not distinguish among items by defi-nition. More generally, any difference in the responsedistribution across items cannot be attributed to het-erogeneous scale usage and thus identifies the sub-stantive model. We develop our model formally in thenext section.

3. Mixture Model forCustomer Evaluations

We first derive the model from continuous latentvariables that generate the observed ordinal discreteresponses, e.g., five-point or seven-point rating scales,through a link in the form of a step function. We thendiscuss how to incorporate heterogeneous scale useeffects (Rossi et al. 2001, Ying et al. 2006).

3.1. Mixture ModelWe assume that observed customer evaluations pro-vided on a fixed-point scale arise from unobservedcontinuous evaluations. The latent evaluations aredenoted as

z′i = 4zy1 i1z′

x1 i5 (1)

consisting of an overall score (y) and a compo-nent score (x) assumed to be distributed multivari-ate normal with the covariance matrix dependent on

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 537

the dimensionality of responses. The index i is forrespondents. The latent continuous evaluations arenot observed. We assume that observed censored real-izations are generated from latent continuous evalu-ations according to a cut-point model. Let yi and xij ,j = 11 0 0 0 1 q, denote the observed integer responses forrespondent i, where

yi = k if ci1 k−1 ≤ zy1 i < ci1 k1 (2)

xi1 j = k if ci1 k−1 ≤ zx1 i1 j < ci1 k1 (3)

for K + 1 respondent-specific ordered cut points8ci1 k2 ci1 k−1 ≤ ci1 kk = 11 0 0 0 1K9, where c0 = −� andcK = �. Thus, the observed responses (yi, xi’) followan ordered probit model as in Rossi et al. (2001).

We use unconstrained, “exploratory” linear factormodels to account for responses with low-dimensionalcovariance. For such low-dimensional responses, theevaluations zi arise from a lower-dimensional vectorof latent overarching evaluations �i. We indicate thelength of this vector by h with h < q + 1, i.e., smallerthan the number of items on the questionnaire. Therelationship between evaluations and latent scores is

zi = â h�i + ch + Øhi 0 (4)

Here, â is a coefficient matrix of factor loadings ofdimension (q + 15 × h, ch is a vector intercept, andØhi are unique and independently distributed normalerrors with possibly different variances. Without lossof generality, we assume that the latent overarchingevaluations �i are distributed independently, multi-variate normal; �i ∼ N401 Ih5. The joint distribution ofz is then

zi ∼ N4ch1â hâ h′

+ë 5= N4Ìh1èh51 (5)

where ë is a diagonal matrix of error variances.9

In contrast, responses that are a priori consistentwith a regressions model exhibit a “full” covariancestructure that cannot be projected into a space ofsmaller dimensionality than the number of items onthe questionnaire. One way to capture these responsesis to simply estimate a full, unrestricted covariancematrix. However, in our application we are eventuallyinterested in the regression coefficients for potentiallymeaningful driver analysis, a posteriori, i.e., giventhat fitting a regression model is meaningful a priorias per the dimensionality of the observed responses.Therefore, we derive the full, unrestricted covariancematrix from a regression model:

zy1 i = z′

x1 iÂ+ cfy + �fi 0 (6)

Here, cfy is a scalar intercept. The vector of predictorszx1 i does not include a constant and the intercept cy

9 We do not impose prior identification constraints on the factorloadings and discuss the identification of factor loadings â in theappendix.

allows for a location shift in the expected value of theoverall evaluation (y) relative to the components (x).The joint distribution of zi implied by the regressionmodel is then

zi ∼ N

((

4cfx 5′

Â+ cfy

cfx

)

1

(

Â′èxÂ+�2�y

Â′èx

èx èx

)

)

= N4Ìf 1èf 51 (7)

where cfx and èx are the marginal means and variance–covariance of zx, respectively, i.e., of the putativedrivers of the overall evaluation. The parameter �2

�y

denotes the conditional variance of zy given zx.Equation (7) is equivalent to estimating an un-

restricted, full covariance matrix. Therefore the clas-sification of responses is based on the inherentdimensionality of response vectors where (5) positscovariances of lower dimensionality, h, than the num-ber of items on the questionnaire, q + 1. Anotherimplication is that all factor models in (5) are spe-cial, restricted cases of Equation (7). When h in (5)approaches q + 1 (i.e., when the number of overar-ching latent evaluations approaches the number ofitems on the questionnaire)—specifically, when h ≥

�12 424q + 15+ 1 −

84q + 15+ 15� (Ledermann 1937)—Equation (5) no longer constrains the dimensionalityof the covariance of questionnaire items.10 In this case,the mixture is defined by different covariances (andmeans) of equal dimensionality. The popular finitemixture of regression approach (Wedel and Kamakura2000), for example, assumes equally full dimension-ality of covariances in different mixture componentsa priori.11

Finally, we account for respondents who providerandom responses y and x that indiscriminately fluc-tuate around some anchor point on the scale, afteraccounting for heterogeneous scale usage discussedin §3.2.12 Such responses are collectively outliers anddo not contribute any substantively interesting infor-mation. In fact, the covariance among these responsesis neither low nor high dimensional but zero. Theseobservations obfuscate any covariance-based infer-ence such as regression analysis, factor analysis, orcovariance-based mixture models.

10 Here, � � denotes the “ceiling,” i.e., the smallest integer that isgreater than or equal to the argument.11 Obviously, a regression model or a mixture of regressions can befit to data with a less than full-dimensional covariance structure.The point is that doing so lacks meaning a priori because a low-dimensional covariance is inconsistent with individual componentevaluations driving a global evaluation.12 When combined with our model of heterogeneous scale use,our independence model can capture any observed systematicresponse such as indiscriminate alternations between “high” and“low” responses, which are independent of systematic differencesbetween response items, as perceived by the mixture population ofmore informative, discriminate responses.

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses538 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

Independent evaluations, characterized by a diag-onal covariance matrix, are special cases of bothmodels in (5) and (7). The factor model in (5) pro-duces independent responses when the matrix of fac-tor loadings â contains only zeros. The regressionmodel in (7) produces independent responses whenall regression coefficients are equal to zero and èx isdiagonal. However, when the data are a mixture ofindependent responses and dependent responses ofdifferent dimensionality, the former constitute outlierswith respect to both models (5) and (7). We model thejoint distribution of independent evaluations as

zi ∼ N4Ìu1ë u51 (8)

where ë u is a diagonal matrix of unique variances.We assume that the latent continuous evaluations zi

are marginally distributed according to a componentmixture model:

p4zi5=

H∑

h=1

�hph4zi5+�f pf 4zi5+�upu4zi51 (9)

where � is the mixture probability denoting segmentsize; and the superscripts correspond to the factormodels in (5), the regression model in (7), and theindependence model in (8). H refers to the factormodel with the largest number of factors included inthe mixture. Equation (9) implies that a random obser-vation is drawn with probability �h from the factormodel with h factors, probability �f from the regres-sion model, and probability �u from the independencemodel.

When a survey elicits overall and partial evalua-tions for multiple objects, as in two of our empiri-cal examples, we draw latent component membershipindicators conditionally independently for each objectevaluated by a respondent. Therefore, the structure bywhich responses occur is a priori the result of an inter-action between an object and the respondent ratherthan a respondent trait. In contrast, scale usage, to bediscussed next, is by definition a respondent trait.

3.2. Scale UsageRespondents are often observed to use different por-tions of the scale, with yea-sayers employing theupper end of the scale and naysayers using the bot-tom portion. Some respondents are observed to usea narrow range while others use the entire range ofthe scale. Heterogeneous scale use causes responsesto covary across respondents, similar to any other“trait” that manifests itself across multiple obser-vations.13 This nonsubstantive covariance masks themodel that generated the responses and thus inter-feres with classification based on the inherent dimen-sionality of responses.

13 Therefore, the discrete observed response vectors in our indepen-dence component are not independent, but the latent evaluationsafter correcting for heterogeneous scale use are.

The goal of any form of scale adjustment is tostandardize the data so that inference can proceedamong respondents based on within-respondent vari-ance (Rossi et al. 2001). Alternatively, one can think ofscale adjustments as changing the cutoff’s {ck} amongrespondents (Ying et al. 2006). To ensure that ourstructural inferences are not driven by heterogeneousscale usage, we employ a variant of the general modelsuggested by Ying et al. (2006). The authors proposethe following model of respondent-specific cut pointsfor the ordinal probit model for a rating scale withK categories:

(YFW) ci = 4c11 i1c21 i10001cK−11 i5

=

(

c11 i1c21 i =c11 i+ã11 i1c31 i =c11 i

+

2∑

k=1

ãk1i10001cK−11 i =c11 i+

K−2∑

k=1

ãk1i

)

1

log4ãi5∼N4Ìã1èã51

(10)

and c0 = −� and cK = � for all i. For notational sim-plicity, we write 4c11 i1ã11 i1 0 0 0 1ãK−21 i5, the vector ofcut-point vector baseline and increments, as ãi. Themodel without scale use heterogeneity is obtained asthe limit where èã → 0. Equation (10) is extremelyflexible but requires estimating K−1 cut-point param-eters per respondent. With sparse likelihood infor-mation, the log-normal hierarchical prior on thecut-point increments is likely to be influential. There-fore, we extend the YFW model by specifying a mix-ture of log-normal prior distributions:

log4ãi5∼ N4Ìã1 indi1èã1 indi 51

indi ∼ MultinomialL4Çind50(11)

The indicator variable ind takes on values 11 0 0 0 1L,and �ind is a vector of mixture probabilities. The num-ber of mixture components L is determined by com-paring models with different specifications.

Given the parameters

48ci1k91�14Ìh=11èh=1514Ìh=21èh=25100014Ìh=H1èh=H 51

4Ìf 1èf 514Ìu1èu551

the likelihood of the observed data is a mixture of nor-mal integrals over the rectangular region defined bythe cutoff points. To facilitate estimation, we employdata augmentation (Tanner and Wong 1987). Theappendix summarizes our Markov chain Monte Carlo(MCMC) sampler and provides detailed informationabout subjective prior choices.

3.3. Determining the Number ofMixture Components

Mixture models formalize the trade-off betweendecreasing the amount of data to inform parameters

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 539

in a particular component and uncovering additionalstructure in the data that reflect respondent hetero-geneity. However, the maximum of the likelihood is(at least) weakly increasing in the number of compo-nents. Therefore, the number of relevant componentsis determined using penalized likelihood measures orinformation criteria and cross validation in frequentistsettings (Frühwirth-Schnatter 2006).

In a Bayesian mixture model, subjective priorsstructure the trade-off between increasing the in sam-ple likelihood and decreasing the amount of datato inform parameters in individual components, i.e.,losing efficiency. The subjective prior for the com-ponent sizes � plays a special role in this context(see the appendix for a complete list of subjectiveprior settings). We use a (standard) weakly informa-tive Dirichlet prior, Dir431 0 0 0 135. This prior expressesour lack of prior knowledge about component sizes,including the possibility that one or more componentsof our mixture model are not needed, i.e., have a truesize equal to zero.

Our MCMC sampler explores the posterior and,by construction, visits regions with higher posteriorsupport more often. For example, allocating homoge-neous observations, equally split, between two differ-ent components instead of one decreases the posteriorsupport of the data under the model. This is becausethe posterior of the component-specific parameters isless concentrated than in that state where (homoge-neous) observations are allocated to one component,leaving the second component empty. As explainednext, the empty component is essentially irrelevantfor both the posterior and the prior support of the dataunder the model.

Once all observations are allocated to one com-ponent, the penalty for carrying the empty compo-nent forward exacted by the (weakly informative)Dirichlet prior is by definition, minimal.14 There-fore, MCMC with a weakly informative subjectiveprior on component sizes can be used to implicitlyselect the number of empirically relevant compo-nents (Frühwirth-Schnatter 2006). This property ofBayesian mixture models is related to the observationthat marginal likelihoods for models with a posteri-ori empirically irrelevant (i.e., empty) components areessentially identical to those from the correspondinglyreduced model, conditional on the posterior distribu-tion of group sizes.15

This approach toward component selection workswell as long as the maximum number of (potentially

14 If the penalty from the subjective prior were large, empty com-ponents could not be observed.15 Note that the dimensionality, or, more generally, the subjectiveprior on parameters other than the component size in empty com-ponents, does not affect the marginal likelihood. For example, in the

empty) components under consideration is reason-ably small, which is the case in our applications.When the maximum number of components becomesvery large, any Dirichlet prior of fixed dimensional-ity is informative, and transdimensional MCMC algo-rithms are called for (e.g., Green 1995). We report theresults of various tests of our MCMC sampler usingsimulated data in the Web appendix. We confirm theability of our MCMC to identify the correct (smaller)number of components when we prespecify too manycomponents in the model.

4. Empirical Application4.1. Data SetsWe investigate the empirical performance of our mix-ture model using four customer satisfaction data sets.Incomplete responses were deleted from all data setsprior to any analysis. Incomplete responses containat least one missing satisfaction evaluation. The firstdata set contains the evaluation of hospitals in amajor metropolitan area of the United States. A sam-ple of 383 randomly chosen respondents from threeneighboring zip codes provided complete evaluationsof hospitals in their area. Of those, 203 evaluatedtwo hospitals, and 180 respondents evaluated only asingle hospital; 3.8% of respondents in this data setprovided incomplete responses. The hospitals wereevaluated on the basis of whether the hospital wouldbe the respondent’s first choice for treatment. Poten-tial drivers included various service attributes of therespondent’s hospitalization. The specific wording ofquestions is provided in Table 1. All evaluations weremeasured on a nine-point ordinal rating scale rangingfrom 1 (“strongly disagree”) to 9 (“strongly agree”).Respondents only evaluated hospitals with whichthey were familiar.

The second data set was obtained from studentsenrolled in a business bachelor’s program at a Germanuniversity. Global and partial satisfaction with the pro-gram was measured using the criteria described in

case of two components with identical component-specific priors,

∫∫ N∏

i=1

41 · p4yi � �15p4�15+ 0 · p4yi � �

25p4�255 d�1 d�2

=

∫ N∏

i=1

p4yi � �15p4�15 d�10

Moreover,∫∫ N

i=1

4005p4yi � �15p4�15+ 005p4yi � �

25p4�255 d�1 d�2

∫ N∏

i=1

p4yi � �15p4�15 d�1

when the data are from one component, i.e., homogeneous. Theinequality is strict for finite samples.

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses540 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

Table 1 Attributes and Descriptive Statistics for the Hospital Data Set

Attribute Format in survey

Overall evaluation QuestionFirst choice “If I needed medical tests or treatment, this hospital would be my first choice”

Partial attributes QuestionLanguage “Has doctors and employees who speak my language”Puts patients first “Has doctors and employees that put the needs of patients first”Safe place “It is a safe place to be”Conveniently located “It is conveniently located”Skilled doctors “Has doctors who are highly skilled in their areas of specialty”Modern “It is modern looking and up-to-date”Clean “Has a clean environment”Concerned “This organization is very concerned about the welfare of people like me”

Descriptive statistics for hospital dataa

Attribute N Mode Mean SD

Global evaluation: First choice 586 8 6065 2076Partial evaluations

Language 586 8 7058 2001Puts patients first 586 8 7011 2022Safe place 586 8 7040 2016Conveniently located 586 8 7062 2004Skilled doctors 586 8 7056 2004Modern 586 8 7065 1088Clean 586 8 7072 1088Concerned 586 8 6088 2034

aOn a scale of 1 (strongly disagree) to 9 (strongly agree).

Table 2. The program started in 2005, and a total of545 students were enrolled in it at the time the datawere collected in early 2008. Out of this population,258 students participated in the survey. For all satis-faction measures, a five-point scale was used. None ofthe respondents in this data set provided incompleteresponses.

The third data set comprises satisfaction evaluationof U.S. smartphone users. Evaluations were obtainedonline through the use of an online smartphone userpanel in 2011. Six hundred sixty-two users partici-pated in the study and provided complete responses,with each respondent evaluating his or her smart-phone brand and model. As global scores, we usedparticipants’ likelihoods to recommend their smart-phone to friends or colleagues. As partial scores, weused the evaluation of smartphone attributes used ina recent J.D. Power and Associates smartphone usersatisfaction survey (see Table 3).16

Finally, we use a data set comprising evaluationsof Florida family vacation destinations. In this dataset, 542 complete evaluations were obtained from 311respondents who had visited at least one of the fol-lowing destinations within the past five years: DisneyParks (Magic Kingdom, Epcot, Animal Kingdom, andHollywood Studios), Universal Studios Florida and

16 See J.D. Power and Associates (2012) for a summary of the resultsfrom this survey.

Universal’s Islands of Adventure, and Busch Gardens.The respondents evaluated a maximum of two desti-nations, given that a visit to both had occurred. Theevaluations comprise 17 brand belief dimensions thatwere evaluated on a seven-point scale ranging from“does not describe it at all” to “describes it very well”(see Table 4). Two percent of the responses in thevacation data set were incomplete and deleted.

Figure 1 provides a plot of the data. The left upperpanel plots the hospital data, the left lower panel thestudent data, the upper right panel the Florida vaca-tion data, and the lower right panel the smartphonedata. The median response is on the horizontal axis,and the range of responses on the vertical axis. Eachpoint represents one response vector. The points onthe discrete median–range grid are slightly jittered toillustrate their distribution.

The hospital data and the smartphone data exhibitthe stylized features of many customer satisfactiondata sets. Many response vectors are concentrated atthe upper, positive end of the scale with only smallresponse variation across items. Students also tendtoward a positive median response, but very fewmedian responses are in the top box. Moreover, stu-dents show more response variation across items aftertaking into account that the hospital and the smart-phone data were collected using a nine-point scaleand the student data were collected using a five-pointscale. In the hospital data, for example, 33% of the

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 541

Table 2 Attributes and Descriptive Statistics for the Student Data Set

Attribute Format in survey

Overall evaluation QuestionOverall satisfaction “How satisfied are you with this program in general?”

Attributes QuestionPartner companies “How satisfied are you with the partner companies integrated into the program?”Grades “How satisfied are you with the grades which you have achieved in this program?”Representation “How satisfied are you with the work of the students’ representation?”Partner universities “How satisfied are you with the current choice of foreign partner universities to study abroad?”Administration “How satisfied are you with the support provided by the administration?”Contact to instructors “How satisfied are you with the accessibility of instructors in this program?”Organization “How satisfied are you with the organization of this program?”Quality of instruction “How satisfied are you with the quality of instruction in this program?”Fellow students “How satisfied are you with your fellow students and their input into this program?”

Descriptive statistics for student dataa

Attribute N Mode Mean SD

Global evaluation: Overall satisfaction 256 4 3072 0071Partial evaluations

Partner companies 256 4 3082 1005Grades 256 4 3077 0078Representation 256 3 3002 1010Partner universities 256 4 3096 0083Administration 256 4 3092 0087Contact to instructors 256 3 3030 1009Organization 256 4 3052 0096Quality of instruction 256 3 3002 1008Fellow students 256 4 3072 0095

aOn a scale of 1 (not at all satisfied) to 5 (very satisfied).

response vectors have a range of 1 or less and 8%exhibit the maximum range on the nine-point scale.In the student data set, 16% responded with a rangeequal to 1 and 19% with the maximum range on theresponse scale.

Table 3 Attributes and Descriptive Statistics for the Smartphone Data Set

Overall evaluation QuestionLikelihood of recommendation “Based on your ownership experience, how likely is it that you would

recommend your smartphone model to a friend or colleague?”

Descriptive statistics for smartphone dataa

Attribute N Mode Mean SD

Global evaluation: Likelihood of recommendation 662 9 7055 1068Partial evaluations

Usability of basic features 662 9 7093 1037Ease of navigation menu screens 662 9 7093 1028Speed of operating system 662 9 7030 1064Navigating through the Internet 662 8 7037 1056View and send emails 662 9 7071 1043Installing new applications 662 9 7066 1057Visual appeal 662 9 7096 1040Weight 662 9 7074 1046Clarity of display 662 9 8006 1024Display’s resistance to scratches 662 9 7018 1081Toggle or navigation wheel or buttons 662 9 7036 1060Handling the touch-screen 662 9 7056 1061Multimedia capabilities 662 9 7056 1051Taking pictures with camera 662 9 7062 1055Wi-Fi connectivity 662 9 7058 1058Battery stamina 662 7 6017 2014

aOn a scale of 1 (very bad) to 9 (very good).

In the Florida vacation data set, one brand beliefdimension (“good for one-time visit only”) is ofopposite polarity. Thus, disagreement with this brandbelief indicates that respondents think that the desti-nation is worthwhile to visit again. The distribution

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses542 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

Table 4 Attributes and Descriptive Statistics for the Florida Vacation Data Set

Overall evaluation QuestionGlobal satisfaction “Of the theme parks that you have visited in the past five years, how satisfied are you with

that theme park brand?”

Descriptive statistics for Florida vacation dataa

Attribute N Mode Mean SD

Global evaluation: Global satisfaction 542 7 5088 1030Partial evaluations

Relaxing 542 5 4066 1067Wholesome 542 7 5068 1034Fun 542 7 5079 1033Exciting 542 7 5059 1041Premium 542 6 5027 1050Memorable 542 7 5070 1042Good for one-time visit only 542 1 3036 2000Good enough to go back 542 7 5044 1065Interesting 542 7 5073 1034More fun than most 542 6 5023 1047Offers wide variety 542 6 5054 1038Enjoyable 542 7 5076 1033Best value for price 542 5 4088 1046Authentic 542 7 5044 1044Safe 542 7 5089 1021Great for whole family 542 7 5086 1032Great for adults 542 7 5060 1046

aOn a scale of 1 (does not describe it at all) to 7 (describes it very well).

of responses to this variable has its mode at the low-est scale point. This leads to a much larger range ofthe raw data (see Figure 1).

Figure 1 Plot of Range vs. Median (Jittered Values)

2 4 6 8

0

2

4

6

8

Hospital data

Median

Ran

ge

1 2 3 4 5

0

1

2

3

4

Student data

Median

Ran

ge

Ran

ge

1 2 3 4 5 6 7

0

1

2

3

4

5

6

Florida vacation data

Median

0

2

4

6

8

Ran

ge

2 4 6 8

Smartphone data

Median

4.2. Model ComparisonWe compare the proposed mixture of dimensionalitiesto (1) an ordinal regression model in which all overall

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 543

satisfaction scores are assumed to be formed by aggre-gating partial scores and (2) a mixture of regres-sion models. The simple ordinal regression model canbe viewed as the standard approach to conductingdriver analysis for customer satisfaction data. A moresophisticated approach to driver analysis is a mixtureof regression models that allows for heterogeneityin driver effects across discrete mixture components(Wedel and DeSarbo 2002). We extend the standardmixture of regression model by a component that cap-tures responses without systematic covariance (Equa-tion (8)) for an even comparison to the mixture ofdimensionalities model:

p4zi5=

F∑

f=1

�f pf 4zi5+�upu4zi50 (12)

Regression components (Equation (7)) are indexedby f , and u indicates the independence model fromEquation (8). Equation (12) implies that a randomobservation is drawn with probability �f from theregression mixture component f and probability �u

from the independence model. We note that themixture of regression components corresponds to amixture of covariance matrices with a priori full-dimensional covariance structures è1

z1 0 0 0 1èFz .

In the mixture of dimensionalities, we set the max-imum dimensionality of a factor component to 5and find throughout that only factor components ofdimensionality 2 or smaller are supported by the data.In the mixture of regressions model, we set the max-imum number of regression components F to 3 andfind throughout that less than 3 regression compo-nents’ are required to fit the data.

We estimate all models with random cut points asin Equation (10) and determine the number of com-ponents in the hierarchical mixture of Equation (11)from the data. Overall, there are three basic models(see Table 5 for an overview). We use the log-marginaldensity (LMD) of the data given the model as a mea-sure of comparison and employ the approximation

Table 5 Model Overview

Component forHeterogeneity of Factor model independent Cut-point

Model response styles specification evaluations specification

Ordinal regression None None No Heterogeneouscut points

Mixture ofregressionmodel

Multiple regressioncomponents to account forparameter heterogeneity

None Yes Heterogeneouscut points

Mixture ofregression andfactor model

Mixture of regression andlow-dimensional responsestyles to account forstructural heterogeneity

Multiple-factor modelsto account fordimensionalheterogeneity

Yes Heterogeneouscut points

suggested by Newton and Raftery (1994) to com-pute the log-marginal density from the MCMC out-put. We run the MCMC sampler for 60,000 iterationsand discard the first 40,000 iterations as burn-in. Con-vergence is assessed by graphically monitoring theconvergence of key quantities such as componentsizes and other component-specific parameters.

As additional measures of fit, we compute the meanabsolute deviation (MAD) and root mean squarederror (RMSE) of the predicted ratings, given estimatesof the model parameters that form the hierarchi-cal prior for individual response vectors. We pre-dict latent evaluations by simulating unconstrained zi

from hierarchical prior distributions in Equations (9)and (12). Individual-level cut points and componentallocation are simulated from their hierarchical priors.We use the simulated individual-level cutoff pointsci to transform continuous latent predictions in ordi-nal predictions. The ordinal predictions are then com-pared to the observed ratings. We compute the MADand RMSE at each iteration and average these quanti-ties over the MCMC chain. Thus, the computed MADand RMSE do not condition on posterior compo-nent membership or on posterior information aboutindividual-level cutoff points.

Results are summarized in Tables 6–8. Table 6presents model fits. Tables 7 and 8 summarize theallocation of response vectors to mixture componentsin the mixture of regression model (Table 7) and themixture of dimensionalities model (Table 8). Compar-ing the fit of the models, the mixture of dimension-alities fits all data sets best according to MAD andRMSE. The same holds for LMD for the hospital, stu-dent, and Florida vacation data sets. However, LMDsuggests that the mixture of regressions fits the smart-phone data best but that it is only barely better thanthe simple ordinal regression model.

In the mixture of regressions, three out of four datasets result in a single regression component, afteraccounting for independent responses (see Table 7).Given our weakly informative, subjective prior oncomponent sizes, the posterior indicates that the data

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses544 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

Table 6 Model Fit

Model 1: Model 2: Model 3:Ordinal regression model Mixture of ordinal regression Mixture of dimensionalities

Data set LMD MAD RMSE LMD MAD RMSE LMD MAD RMSE

Hospital −51644 10648 20176 −51675 10648 20170 −51300 10648 20161Student −21797 00710 00951 −21804 00697 00953 −21676 00695 00949Smartphone −121518 10369 10570 −121517 10361 10570 −121528 10375 10576Florida vacation −91413 10396 10464 −91439 10392 10468 −91350 10328 10480

Notes. LMD measured as suggested by Newton and Raftery (1994). The MAD and RMSE are of predicted ordinal ratings based on observed ratings.

Table 7 Allocation to Components in the Mixture of Regression Model

Share of regression components Share ofindependent evaluations

Data set Component 1 Component 2

Hospital 00829 (0.807, 0.862) 00000 (0.000, 0.000) 00171 (0.137, 0.193)Student 00686 (0.473, 0.865) 00000 (0.000, 0.000) 00314 (0.135, 0.527)Smartphone 00807 (0.758, 0.867) 00000 (0.000, 0.000) 00193 (0.133, 0.242)Florida vacation 00778 (0.745, 0.804) 00061 (0.055, 0.068) 00161 (0.135, 0.192)

Note. Reported are posterior means and in parentheses the 2.5% and 97.5% bounds of the posterior distribution.

Table 8 Allocation to Components in the Mixture of Dimensionalities Model

Share of factor components Share of regression Share ofindependent evaluations

Data set h= 1 h= 2 h= 3

Hospital 00341 (0.294, 0.403) 0012 (0.060, 0.172) 00050a (0.036, 0.071) 00395 (0.340, 0.459) 00095 (0.054, 0.174)Student 00334 (0.202, 0.481) 00007 (0.004, 0.012) 00007a (0.006, 0.008) 00364 (0.236, 0.508) 00280 (0.182, 0.368)Smartphone 00006 (0.003, 0.009) 00002 (0.001, 0.004) 00008a (0.006, 0.009) 00748 (0.698, 0.799) 00236 (0.182, 0.289)Florida vacation 00765 (0.747, 0.784) 00004 (0.003, 0.006) 00006a (0.003, 0.01) 00071 (0.066, 0.077) 00154 (0.133, 0.174)

Note. Reported are posterior means and in parentheses the 2.5% and 97.5% bounds of the posterior distribution.aIncludes allocation to h > 3.

are not informative about additional regression com-ponents. A second regression component is only sup-ported in the Florida vacation data set. However, theshare of responses in this component is small (6%).The share of independent evaluations ranges from16% (Florida vacation data) to 31% (student data).

In the mixture of dimensionalities, the shares ofresponses allocated to the factor models range fromessentially 0% (smartphone data) to 77% (Floridavacation data) (see Table 8). No data set supportsfactor components of dimensionality larger than 2,and out of the three data sets where factor compo-nents receive substantial allocation (hospital, student,and Florida vacation), the student data and Floridavacation data only support one-dimensional factorcomponents. Only the hospital data support a two-dimensional factor component in addition to a rela-tively much larger one-dimensional component. Theshare of independent evaluations decreases relativeto the mixture of regression model, with the excep-tion of the smartphone data. Finally, the share of theregression component ranges from 1% (Florida vaca-tion data) to 75% (smartphone data).

4.3. Hospital DataTable 9 summarizes the results from three modelsfor the hospital data. We only report coefficientsthat are a posteriori informed by the data. That is,we do not report coefficients associated with emptycomponents. The mixture of dimensionalities revealsthat about 34% of responses exhibit a one-dimensionalcovariance structure, 12% of responses exhibit a two-dimensional structure, and 10% of responses are clas-sified as random. Only about 40% of the responsesare a priori consistent with a regression model basedon the dimensionality of their covariance.

The factor loadings in the single-factor componentof the mixture are all credibly positive but vary insize. This result is consistent with respondents whoprocess individual items but color all responses byan overarching sentiment as in a factor model. Pro-cessing of items that are not identical causes varia-tion of loadings across items. The influence of oneand the same overarching sentiment when respond-ing to individually different items causes the one-dimensional covariance structure. In the two-factorcomponent of the mixture, the items “first choice,”

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 545

Table9

Parameter

Estim

ates

forthe

Hospita

lDataSe

t

Ordi

nalr

egre

ssio

nM

ixtu

reof

regr

essi

onM

ixtu

reof

dim

ensi

onal

ities

Regr

essi

onco

mpo

nent

1Re

gres

sion

com

pone

nth

=1

h=

2(fa

ctor

1)h

=2

(fact

or2)

Post

Post

Post

Post

Post

Post

Coef

ficie

nts

mea

n2.

5%97

.5%

mea

n2.

5%97

.5%

mea

n2.

5%97

.5%

mea

n2.

5%97

.5%

mea

n2.

5%97

.5%

mea

n2.

5%97

.5%

Glob

al(fa

ctor

)—

——

——

——

——

2035

1083

3003

2027

009

4052

1076

0047

3096

Inte

rcep

t(re

gres

sion

)−

0024

−00

41−

0006

−00

27−

0043

−00

1200

58−

0023

1037

——

——

——

——

—La

ngua

ge−

0012

−00

2700

02−

0011

−00

2400

020029

0002

0058

0095

0067

1032

006

−00

1810

8110

5200

530

2Pu

tspa

tient

sfir

st00

12−

0014

0037

0028

000

550068

0029

1013

1027

1002

1059

0075

0024

1085

0021

−00

200

89Sa

fepl

ace

0042

0019

0067

003

0005

0055

0069

0037

1005

1023

0097

1054

2023

007

5089

00

0Co

nven

ient

lylo

cate

d0019

0004

0033

0019

0007

0032

0065

0031

0093

1012

0075

1056

1004

−00

1330

4−105

−3092

−0022

Skill

eddo

ctor

s0059

0037

0081

0062

0041

0083

0093

0053

1027

1022

0095

1052

−00

28−

1043

0066

1066

0079

3081

Mod

ern

0023

0005

0042

0015

−0005

0036

0014

−00

1400

42105

1015

1092

0014

−00

5410

1800

7−

001

1074

Clea

n−0013

−0037

0011

−00

11−

0035

0014

0065

0016

0097

1036

1006

1072

−00

24−

009

0018

0041

0001

1001

Conc

erne

d00

14−

001

004

0009

−00

1600

3400

27−

0017

0063

1033

1004

1069

0044

−00

0610

300

42−

0002

102

Note

.All

estim

ates

cred

ibly

diffe

rent

from

0ar

ein

bold

.

“puts patients first,” and “safe place” load crediblypositively on the first factor. The items “first choice,”“language,” “skilled doctors,” and “clean” load credi-bly positively on the second factor, and the item “con-veniently located” loads credibly negatively. Thus,hospital evaluations exist in two a priori indepen-dent dimensions in this component of the mixturewhere the second dimension is more about “func-tional” characteristics of the hospital.

In the regression component of our mixture ofdimensionalities that accounts for about 40% ofresponses, six items emerge as statistically credibledrivers of overall hospital evaluations. “Skilled doc-tors” is most important a posteriori, followed by “safeplace,” “puts patients first,” “conveniently located,”“clean,” and “language.”

In comparing items across components of dif-ferent dimensionalities, we see that “modern” and“concerned” are reflective of an overall hospital eval-uation in the single-factor component but do notdrive this overall evaluation. A qualitative differencebetween the two-factor and one-factor componentsis that “modern” is independent of both factors inthe former; “concerned” misses the mark for sta-tistical credibility in the two-factor component onlybarely. A qualitative difference between the two-factorcomponent and the regression component is the pro-nounced judgmental trade-off between “convenientlylocated” and “skilled doctors” in the former. Thistrade-off may be caused by a strong perceived ecolog-ical correlation between these items, but it neverthe-less impedes driver analysis.

Comparing inference from the regression com-ponent in the mixture of dimensionalities to thatfrom established technologies (i.e., ordinal regres-sion and mixtures of ordinal regressions), we seethat both methodologies fail to recognize “lan-guage” and “clean” as credible drivers of hospitaloverall evaluations. Moreover, simple ordinal regres-sion spuriously points to “modern” as a driver ofoverall evaluations. Inference regarding “language”stands out especially. Both established methodolo-gies suggest that “language” has a borderline crediblynegative impact on overall hospital evaluations. Aftercontrolling for “skilled doctors” and other variables, anegative impact of language seems hard to reconcilewith any reasonable set of prior expectations aboutdrivers.

Overall, we obtain qualitatively different and moreface-valid inference about drivers of overall hospitalevaluations from the proposed mixture of dimension-alities, and the comparison to the mixture of regres-sion corroborates the empirical relevance of mixtureof dimensionalities vis-à-vis heterogeneity in full-dimensional covariance structures.

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses546 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

4.4. Student DataTable 10 summarizes the results for the student data.The mixture of dimensionalities reveals that aboutone-third of the responses exhibit a one-dimensionalcovariance structure; 28% of the responses are clas-sified as random. Mixture components featuring fac-tor models of dimensionality greater than 1 were notempirically relevant for this data set. About 36% ofthe responses are a priori consistent with a regres-sion model based on the dimensionality of theircovariance.

“Overall satisfaction” and satisfaction with “partnercompanies,” the student “representation,” and “fel-low students” are credibly positively associated witha one-dimensional latent variable in the single-factorcomponent of the mixture, and satisfaction with the“organization” of this program is borderline credi-bly positive. Satisfaction with the “administration” iscredibly negatively associated with the same latentvariable in this group, and satisfaction with “partneruniversities” is borderline credibly negative.

This pattern is consistent with respondents whoprocess individual items; i.e., heterogeneous prefer-ences for rating scale categories cannot explain themix of positive and negative loadings. However, it isinconsistent with a simple factor process. The strongerthe overarching latent sentiment toward the program,the higher the overall satisfaction, and satisfactionwith partner companies, the student representation,and fellow students but the lower the satisfactionwith the administration.

In comparing inference from the regression com-ponent in the mixture of dimensionalities to thatfrom established technologies, we see that both ordi-nal regression and the mixture of regression uncoverfewer and different credible drivers of overall studentsatisfaction. These methodologies fail to recognize theimportance of satisfaction with partner universities

Table 10 Parameter Estimates for the Student Evaluation Data Set

Ordinal regression Mixture of regression Mixture of dimensionalities

Regression component 1 Regression component h= 1

Coefficients Post mean 2.5% 97.5% Post mean 2.5% 97.5% Post mean 2.5% 97.5% Post mean 2.5% 97.5%

Global (factor) — — — — — — — — — −0034 −0067 −0001Intercept (regression) −001 −0032 001 0006 −0019 0028 2068 1031 4085 — — —Partner companies 0005 −0005 0015 0018 0 0039 0024 −0009 007 −0096 −1074 −0038Grades 0015 0003 0029 0012 −0008 003 0035 0003 0064 −0016 −0051 002Representation 0005 −0004 0014 0 −0013 0013 0031 0001 0061 −0075 −1043 −0017Partner universities 0008 −0004 0021 0012 −0008 0033 0049 0016 0089 0044 −0005 0097Administration −0004 −0019 0011 0001 −0017 0022 0025 −0015 0057 0076 0006 1041Contact to instructors −0001 −001 0008 0007 −0004 0019 0016 −001 0045 −0007 −005 0038Organization −0009 −0021 0002 −001 −0024 0005 0023 −0008 0053 −003 −0078 0006Quality of instruction 001 −0002 0024 0027 0011 0049 0047 0019 008 −0031 −1017 0035Fellow students 0019 0009 0028 0018 0005 0032 0044 0011 0073 −0051 −101 −0003

Note. All estimates credibly different from 0 are in bold.

and the student representation as drivers of over-all student satisfaction. Moreover, all posterior meansof regression coefficients estimated in the mixture ofdimensionalities are clearly positive with substantialposterior mass in the positive domain. In contrast,the borderline credibly negative impact of satisfactionwith the organization of the program inferred by theordinal regression model as well as the mixture ofordinal regressions lacks face validity.

4.5. Florida Vacation DataTable 11 summarizes the results for Florida vaca-tion data. The mixture of dimensionalities revealsthat about three-quarters of responses exhibit a one-dimensional covariance structure; 15% of responsesare classified as random. Mixture components featur-ing factor models of dimensionality greater than 1were not empirically relevant for this data set. Onlyabout 7% of the responses are a priori consistent witha regression model based on the dimensionality oftheir covariance.

In the single-factor component of the mixture ofdimensionalities, all loadings except that of “good fora one-time visit only” are credibly positive. That itemloads credibly negative on the latent factor in thiscomponent of the mixture. The result is consistentwith respondents who process individual items butcolor all responses by an overarching sentiment as ina factor model. A likely reason for the overwhelm-ing share of one-dimensional responses in this case isthat the designers of the survey, perhaps unintention-ally, chose items that are different but not concreteenough to engage respondents beyond their overallevaluation.

The percentage of respondents in the regressioncomponent of the mixture of dimensionalities is toosmall to transform into a posteriori meaningful infor-mation about drivers of overall satisfaction with the

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 547

Table 11 Parameter Estimates for the Florida Vacation Data Set

Ordinal regression Mixture of regression Mixture of dimensionalities

Regression component 1 Regression component 2 Regression component h= 1

Post Post Post Post PostCoefficients mean 2.5% 97.5% mean 2.5% 97.5% mean 2.5% 97.5% mean 2.5% 97.5% mean 2.5% 97.5%

Global (factor) — — — — — — — — — — — — 0033 0022 0046Intercept 0098 0066 1031 0099 0062 1034 2045 −6096 12032 −1079 −14035 10024 — — —

(regression)Relaxing 0014 −0001 0031 0012 −0005 0029 −0029 −4058 3074 1073 −7002 10029 0024 0016 0033Wholesome −0007 −0033 0022 −0016 −0046 0012 0005 −4038 4093 −005 −8037 7047 0032 0022 0042Fun 0021 −0017 0058 0027 −0019 0078 0077 −3026 5038 −1021 −12001 803 0038 0027 005Exciting −0008 −0042 0026 −001 −0041 0022 006 −3071 6038 3079 −7 15029 0035 0025 0048Premium 0006 −0024 0032 0024 −0002 0052 −0082 −5005 3017 −3056 −12001 6054 0035 0024 0047Memorable −0006 −0036 0023 −0022 −0058 0015 −1054 −5072 2019 −0046 −906 9038 0041 0029 0056Good for one-time −0037 −0053 −0021 −0029 −0047 −0014 0051 −3053 5079 −4052 −13077 3082 −0065 −0088 −0045

visit onlyGood enough 0 −0028 0026 0012 −0017 0038 −0006 −503 404 0056 −11019 11088 0062 0044 0083

to go backInteresting −0003 −0033 0029 −001 −0049 0035 0096 −303 5038 2057 −5026 14027 0035 0024 0048More fun −0001 −0026 0026 −0014 −0044 0014 −0064 −5003 401 −0097 −11029 7089 0037 0026 005

than mostOffers wide variety −0014 −0044 0017 −0016 −0044 0011 0039 −4018 4042 0015 −8082 7066 0037 0026 005Enjoyable 0003 −0039 0043 0045 −0003 0097 0054 −3072 6033 304 −5088 13025 0039 0027 0052Best value 0012 −0011 0036 0019 −0007 0048 −0019 −4086 4051 −2011 −15018 1006 0031 0022 0041

for priceAuthentic 0018 −0007 0041 0022 −0005 0046 0031 −3055 403 −2073 −12094 5096 0033 0024 0045Safe −0013 −0035 0012 −0012 −0032 0011 −0029 −4066 5 0041 −8055 9003 0025 0017 0034Great for −0007 −0031 0016 −0024 −0051 0004 0018 −4036 4064 2049 −7012 10056 0043 003 0059

whole familyGreat for adults 0011 −0011 0033 0005 −0019 0028 0026 −4027 5011 0043 −8061 10002 0042 0028 0057

Note. All estimates credibly different from 0 are in bold.

theme park. Both ordinal regression and the mixtureof ordinal regressions point to “good for a one-timevisit only” as the only credible driver of overall sat-isfaction. The result is consistent with our simulatedapplication of a regression model to data in whicha unidimensional covariance structure is dominant(see the Web appendix). Managerially speaking, theresults from the mixture of dimensionalities implythat the recommendation to improve ratings alongthe item “good for a one-time visit only” is not anymore operational, concrete, or actionable than the rec-ommendation to do “something” to improve overallsatisfaction.

4.6. Smartphone DataThe smartphone data set is the only one amongthe four investigated in this paper where the mix-ture of dimensionalities fits the data clearly worsethan either of the two established methodologies (seeTable 6). This is consistent with the minimal poste-rior weight of factor components in the mixture ofdimensionalities (see Table 12). Moreover, the mixtureof regressions barely improves on the simple ordinalregression model in terms of fit and the data do notsupport more than one regression component. Thejoint distribution of responses is therefore consistent

with a single multivariate normal distribution as adata-generating mechanism, after accounting for het-erogeneous scale use.

A posteriori, the smartphone survey therefore iscomposed of concrete, well-differentiated items in theeyes of respondents who are able and motivatedto process these items and give specific responses.The ordinal regression identifies 9 statistically credi-ble drivers of global smartphone evaluations among16 items included in the survey. The relativelymost important performance dimensions are “batterystamina,” “clarity of display,” and “handling thetouch-screen.” Additional statistically credible driversare the “ease of navigation menu screens,” “viewand send emails,” “installing new applications,” the“display’s resistance to scratches,” “taking pictureswith camera,” and “Wi-Fi connectivity.” Furthermore,the phone’s “visual appeal,” the “speed of operat-ing system” and (the satisfaction with) the phone’s“weight” have substantial posterior mass in the pos-itive domain, and no coefficients are estimated credi-bly negative.

4.7. Scale UseWe summarize marginal posterior means and vari-ances of respondent-specific scale use parameters in

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses548 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

Table 12 Parameter Estimates for the Smartphone Data Set

Ordinal regression Mixture of regression Mixture of dimensionalities

Regression component 1 Regression component

Coefficients Post mean 2.5% 97.5% Post mean 2.5% 97.5% Post mean 2.5% 97.5%

Intercept (regression) 0074 0026 1027 1017 0029 2051 3036 1025 5038Usability of basic features 0 −0015 0016 −0001 −0028 002 0005 −0025 0034Ease of navigation menu screens 0019 0004 0035 0036 0011 007 0042 0005 0087Speed of operating system 0009 −0004 0021 0016 −0006 0037 0048 0013 1002Navigating through the Internet −0005 −0021 0012 001 −0017 0044 0017 −0019 0062View and send emails 0019 0003 0036 0021 0 0042 0039 0004 0072Installing new applications 002 0006 0034 001 −001 0031 003 0011 0053Visual appeal 0012 −0002 0026 0023 −0004 0052 001 −0023 004Weight 0006 −0008 002 0003 −0018 0025 0044 0009 0089Clarity of display 0033 0015 0052 0039 0011 0067 0052 0019 0089Display’s resistance to scratches 0019 0004 0035 002 0002 0042 0039 0014 0075Toggle or navigation wheel or buttons 0004 −0012 0019 0002 −0022 0034 0031 −0004 0068Handling the touch-screen 0032 0015 0047 0031 001 0053 0036 0 0073Multimedia capabilities 0 −0016 0018 0006 −0017 0032 0002 −0036 0033Taking pictures with camera 0018 0005 0031 0009 −0007 0027 0001 −0018 0019Wi-Fi connectivity 002 0003 0035 0014 −0001 0032 0021 −0002 0053Battery stamina 0039 0023 0058 005 0023 0092 1013 006 1061

Note. All estimates credibly different from 0 are in bold.

Table 13.17 We also report first differences of poste-rior cut-point means that measure the length of latentcontinuous intervals that correspond to observed rat-ings of 2, 3, 4, etc. Across all four data sets, wesee a tendency for the interval width of ratingsto increase for better, i.e., higher ratings. Anothercommonality across all four data sets is that the cutpoints separating top box ratings from the next lowerratings have the highest posterior variances amongall cut points. Together, these results suggest that bet-ter ratings are somewhat more ambiguous in termsof across-respondent comparisons than ratings at thelower end of the satisfaction scales.

5. DiscussionWe developed a mixture model that uses the dimen-sionality of the covariance among items in a survey asthe primary classification criterion while accountingfor heterogeneity in rating scale usage. This way,our model distinguishes between responses that area priori inconsistent with regression-based driveranalysis because their covariances can be projectedonto a low-dimensional space and responses whereregression analysis makes sense a priori as per their

17 All four data sets empirically support a discrete mixture of con-tinuous priors for the scale random effects (see Equation (11)). Inthe case of the student and the smartphone data, two componentsfit best. In both data sets, a relatively much smaller component cap-tures outliers. The vacation data again are fit best with two compo-nents; however, both are of substantial size (about 60% and about40%, respectively). Finally, the hospital data support three compo-nents with component sizes equal to about 25%, 25%, and 50%,respectively.

high-dimensional covariance structure. Among thepossible reasons for failing consistency with a regres-sion model are an (unintended) lack of specificity ofindividual items on the survey, halo effects, or a lackof motivation to process individual items.

When survey items are not specific enough toengage respondents beyond an overall sentimenttoward the object of evaluation, as in the Floridavacation data, our model produces positive evidencein favor of a low-dimensional factor model. In turn,this evidence alerts the researcher to the inadequacyof regression-based driver analysis based on thesedata. When items are, in principle, specific enoughto engage respondents beyond some low-dimensionaloverarching sentiment structure, as intended by allcommercial customer satisfaction research, our model(probabilistically) separates respondents who nev-ertheless respond based on low-dimensional over-arching sentiments from respondents who providespecific, unique responses to individual items. Weencountered this situation in the hospital and studentdata sets.

Finally, our model controls for respondents whoseem to provide responses based on idiosyncraticpreferences for categories of the rating scale only,i.e., responses that collectively fail to consistently dis-criminate among individual items. The covarianceamong observed responses of this type is driven byheterogeneous scale use only. We find evidence for theempirical relevance of such responses in three out offour data sets.

Our model differs from prior research in marketingand psychology by making a heterogeneous mixture

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 549

Table 13 Posterior Means and Variances of Cut Points

Hospital data Student data FL vacation data Smartphone data

Post mean First diff. Post mean First diff. Post mean First diff. Post mean First diff.

Mean of cut pointsCut1 −30582 — −30416 — −10184 — −30784 —Cut2 −30212 00370 −20410 10006 −00888 00295 −30492 00292Cut3 −20877 00336 −10129 10281 −00530 00358 −30123 00369Cut4 −20581 00296 00836 10967 −00174 00355 −20720 00403Cut5 −10922 00659 — 00175 00350 −20252 00468Cut6 −10548 00374 — 00825 00650 −10652 00599Cut7 −00844 00702 — — −00800 00852Cut8 00235 10080 — — 00325 10125

Variance of cut pointsCut1 1.730 0.226 0.192 0.506Cut2 1.732 0.367 0.239 0.595Cut3 1.767 0.413 0.156 0.672Cut4 1.958 0.832 0.158 0.764Cut5 1.952 — 0.192 1.034Cut6 1.901 — 1.298 1.072Cut7 1.898 — — 1.216Cut8 3.053 — — 2.244

of dimensionalities operational. Traditionally, mar-keting is relatively more interested in driver analy-sis, whereas psychometric tools tend to emphasizetheory-driven dimensionality reduction. The differen-tial emphasis is a natural consequence of the typicaldecision problems in the two fields. Marketing oftenneeds to understand how to influence the experienceby potentially heterogeneous but nevertheless largergroups of consumers. Psychometric tools emphasizeprecise estimation of latent variables at the individ-ual level for testing purposes.18 However, research inboth disciplines assumes that items on a questionnaireor test are measurement invariant across individualsin a survey. Psychological testing and the construc-tion of psychological tests obviously requires (strong)measurement invariance in the relevant population.Driver analysis in marketing implicitly assumes thatrespondents unequivocally interpret the meaning ofindividual items on the survey. In contrast, our modelaccommodates a particularly striking failure of mea-surement invariance: the possibility that responses toone and the same survey by different respondents areof inherently different dimensionality.

We see two immediate opportunities for futureresearch based on this paper. First, our data arethe result of mental processes occurring when oneresponds to the survey but do not contain indica-tors directly linked to information processing such as

18 Obviously, precise individual-level estimates of latent variablesare helpful when the primary goal is to understand causal relation-ships. However, in practical market research, the trade-off betweencontrolling for measurement error and obtaining information aboutadditional dimensions empirically favors the latter. This practicemakes sense in the context of specific items tapping into concreteexperiences with the product or service.

response time (e.g., Otter et al. 2008) or eye-trackingtraces (e.g., van der Lans et al. 2008). We believe thatsuch process measures will be helpful in further refin-ing models of survey response. Second, based on thefour data sets investigated here, we can only offer lim-ited operational guidance of how to best design andadminister surveys for driver analysis. The specificitems on the smartphone survey provide a positiveexample that is in obvious contrast to the collectionof items on the Florida vacation survey that renderdriver analysis meaningless. However, the mixture ofdimensionalities model will help assess the effect ofmanipulations aimed at making surveys for driveranalysis more informative.

Supplemental MaterialSupplemental material to this paper is available at http://dx.doi.org/10.1287/mksc.2013.0779.

AcknowledgmentsThe authors thank The Modellers and especially Jeff Brazelland Kevin van Horne for providing the hospital, the vaca-tion, and the smartphone data sets used in this paper andfor pointing out an error in an earlier version of theirMCMC code.

Appendix: MCMC Sampling, Prior Specification,and Posterior IdentificationOur MCMC recursively samples from the following condi-tional distributions:

Step 1. Si � zi1 4Ìh=11èh=151 4Ìh=21èh=251 0 0 0 1 4Ìh=H1èh=H 51

4Ìf 1èf 51 4Ìu1èu51Ç.Step 2. Ç � 8Si9.Step 3. 4Ìh=11 èh=151 4Ìh=21èh=251 0 0 0 1 4Ìh=H1èh=H 51

4Ìf 1èf 51 4Ìu1èu5 � 8Si1zi9.Step 4. zi � Si1 Ì

Si1 èSi1 8ci1 k9.Step 5. 8ci1 k9 � zi1 indi1 Ì

indiã 1 è

indiã .

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses550 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

Step 6. Ìindiã 1 è

indiã � 8indi1 8ci1 k99.

Step 7. indi � 8ci1 k91 8Ìindiã 1è

indiã 91 Çind.

Step 8. Çind � 8indi9.Step 1 is the standard augmentation of unobserved group

membership in mixture models, denoted Si. Step 2 is astandard conjugate update of the hierarchical prior forthe component membership indicators S. Step 3 updatescomponent-specific parameters that are independent acrossmixture components conditional on {Si}. When we updateparameters of a factor model, we first augment latent fac-tor scores that are integrated out in Step 1 and then updateloadings and error variances. Step 4 cycles through the fullconditional distributions of individual elements of zi’s gen-erating from truncated normal distributions. Step 5 employsa random walk Metropolis step where the random walkis on the underlying log-normally distributed increments.Steps 6–8 are again standard updates in a mixture model.All individual steps are well documented in the literature.

We employ the following weakly informative subjectivepriors in our analysis:

ch ∼ N401105 ∀h= 11 0 0 0 1H1

âh∼ N40110 · I4q+15h5 ∀h= 11 0 0 0 1H1

�h ∼ N401 Ih5 ∀h= 11 0 0 0 1H1

Â∼ N40110 · Iq51

cfy ∼ N4011051

èfzx ∼ IW4q + 31 4q + 35 · Iq51

Ìfzx ∼ N40110 · Iq51

�2�y1 �u

m1m1 �hm1m ∼ IG43135 ∀m=110001q+11∀h=110001H1

cuz ∼ N40110 · Iq+151

èuz ∼ IW4q + 31 4q + 35 · Iq+151

ÇS ∼ Dir431 0 0 0 1350

Subjective priors for the YFW scale use model are

Ìã ∼ N4011051

èã ∼ IW4k+ 21 4k+ 25 · Ik−151

Çind ∼ Dir431 0 0 0 1350

The model we estimate is not likelihood identified. We“margin down” to the space of likelihood-identified param-eters by postprocessing the MCMC output. We first iden-tify the trade-off between cut points and latent variables bystandardizing one latent variable. We then apply orthonor-mal rotations to identify the factor loadings uniquely.

Joint Identification of Cut-Points and Latent Variables. Nei-ther the YFW model nor the extension proposed herein isidentified. More specifically, the mean and the variance ofthe latent evaluations zi and the cut points ci are not jointlyidentified. One can shift all cut points ci by a constant andshift the latent evaluations zi by the same constant andobtain the same ordinal observations. The same holds forrescaling cut points and latent evaluations by an arbitraryfactor. One way to solve the identification problem would

be to fix two cut points. This approach is not desirablebecause it fixes the scale of latent evaluations across individ-uals. Instead, we achieve identification by “postprocessing”the MCMC output. Consider the regressions component ofthe mixture of dimensionalities model:

zy1 i = cfy +ÂT zx1 i + �i0

Letz̃x = zx −�zx11

and c̃ = c −�zx111

where �zx11is the mean of zx11. This implies

z̃y1 i = zy1 i −�zx11= c

fy −�zx11

+Â′4z̃x1 i +�zx115+ �i

= cfy +�zx11

4Â′�− 15+Â′z̃x1 i + �i1

where � is a vector of ones. Furthermore, let

z̃∗

x = z̃x/�zx11and c̃∗

= c̃/�zx111

where �2zx11

is the variance of zx11.It follows that

z̃∗

y = z̃y/�zx11= �−1

zx114c

fy +�zx11

4Â′�− 15+Â′z̃x1 i + �i5

= �−1zx11

4cfy +�zx11

4Â′�− 155+Â′z̃∗

x1 i +�−1zx11

�i0

This implies that we can use the estimated mean and vari-ance of zx11 to perform the following postprocessing steps:

Step 1. c̃∗i = �−1

zx114ci −�zx11

5.

Step 2. cf ∗

y = �−1zx11

4cfy +�zx11

4ÂT �− 155.

Step 3. �2z̃∗y= �2

zy/�2

zx11.

Step 1 adjusts and rescales the cut points, which are thentransformed to the log-delta scale to recover the hyperpa-rameters. Step 2 adjusts and rescales the baseline of theregression model. Step 3 rescales the error variance giventhe variance of zx1. Postprocessing the output from theregression component implies the following postprocessingsteps for the factor models:

z1000

zq+1

=

ch1000

chq+1

+

�1000

�q+1

Î+

Ø1000

Øq+1

0

Normalization and standardization,

z̃∗1000

z̃∗q+1

=�−1zx11

ch1 −Ìzx11

000chq+1 −Ìzx11

+�−1zx11

�1

000

�q+1

Î+�−1zx11

Øh1

000

Øhq+1

1

yields the following identified quantities:1. Ã∗ = �−1

zx114Ã5.

2. ch∗

= �−1zx11

4ch −�zx115.

3. ë ∗ = �−1zx11

ë .McCulloch et al. (2000) provide a detailed discussion of theeffective subjective priors on identified quantities that resultfrom normalizing a variance to 1 by postprocessing. In addi-tion, our transformations involve (i) linear combinations ofa priori normally distributed variables, (ii) products of apriori normally distributed variables, and (iii) shifting and

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 551

rescaling of sums of a priori log-normally distributed quan-tities. We note that identification of all models comparedin our paper relies on the same postprocessing strategy.Therefore, implied subjective priors on likelihood-identifiedparameters are identical across models.

Priors implied by (i) can be easily derived in closed form.Priors implied by (ii) and (iii) are not available in closed formbut are amenable to investigation by simulation. A detailedcharacterization of the implied priors is beyond the scopeof this paper. However, we note that at the cost of forgoingconjugacy when updating è

fzx , an identified sampler results

from fixing the mean and variance of an active variable inthe regression component. Alternatively, one could fix themean and variance of a variable in the independence com-ponent at the same cost or fix the error variance and constantof a variable in one of the factor components.

Identification of Factor Loadings. In the model (or mixturecomponent) with one factor, the assumption of a standardnormal distribution for the factor scores identifies all factorloadings up to their signs. However, in practice, sign flipsare unlikely when at least one loading is credibly differ-ent from zero. In this case the differently signed, symmetricposterior modes are well separated. Careful inspection ofMCMC trace plots of loadings empirically verified that signflips did not occur in any of our analyses.

In models (or mixture components) with more than onefactor, MCMC estimation without prior constraints on load-ings is free to explore an infinite number of observationallyequivalent orthonormally rotated loading matrices at each

Figure A.1 Scatterplots of Loading Draws

0 0.5 1.0 1.5 2.00.8

1.0

1.2

1.4

1.6

1.8

2.0

–2 –1 0 1 2–2

–1

0

1

2

1.3 1.4 1.5 1.6 1.7 1.8 1.9–1.3

–1.2

–1.1

–1.0

–0.9

–0.8

–0.7

Note. Top left: unidentified sampler; top right: “identification” based on a unique item; bottom left: identification based on a common item.

iteration of the sampler. The usual a priori constraint usedto obtain one unique rotation that is meaningfully compa-rable across iterations is to set all factor loadings abovethe main diagonal equal to 0. A drawback of this a pri-ori approach is that identification hinges on the assump-tion that all elements on the main diagonal of the loadingsare different from zero a posteriori. Consider a two-factormodel for six observed items, for example. MCMC estima-tion under the constrained loadings matrix,

�11 0�21 �22�31 �32�41 �42�51 �52�61 �62

1

only uniquely and comparably identifies factor loadingsacross MCMC iterations if indeed �11 6= 0 a posteriori. If the(arbitrarily chosen) first item is unique relative to the otheritems such that �11 = 0 in the data-generating mechanism,or if the data are not informative enough to reliably detect�11 6= 0, identification based on the first item is impossibleor at least tenuous. Figure A.1 shows scatterplots of loadingdraws for one item in a two-factor model. The x axis plotsloadings on the first factor and the y axis loadings on thesecond factor. The top right plot clarifies two points: Thefirst point is that identification based on a unique item isimpossible. In some sense an attempt to identify a particular(orthonormal) rotation based on a unique item has the effect

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey Responses552 Marketing Science 32(4), pp. 533–553, © 2013 INFORMS

to maximize mixing over different rotations. The secondpoint is that a badly chosen identification constraint will beimmediately recognized from the posterior.

The posterior from the unidentified sampler in the toppanel left panel of Figure A.1 only visits a subset of thepossible rotations. Finally, the bottom left plot shows thejoint posterior distribution of loadings when identificationis based on a (reliably) nonunique item.

One may argue that a bad prior identification constraintwill always be recognized and can be switched a posterioriby orthonormal rotation anyway, without the need to rerunthe MCMC. Despite this argument, we prefer identificationa posteriori because the influence of subjective priors onloading coefficients may depend on the choice of the (arbi-trary) prior identification constraint.

A posteriori identification proceeds by orthonormallyrotating the loadings matrix such that elements above themain diagonal are zero. Such rotations involve simplematrix algebra, even when the number of factors is large.Note that the analyst is free to choose any item’s loadingsto form the first, second, etc., rows of the loadings matrixa posteriori.

ReferencesBeckwith NE, Lehmann DR (1975) The importance of halo effects in

multi-attribute attitude models. J. Marketing Res. 12(3):265–275.Bollen KA (1989) Structural Equations with Latent Variables (John

Wiley & Sons, New York).Bradlow ET, Zaslavsky AM (1999) A hierarchical latent-variable

model for ordinal data from a customer satisfaction surveywith “no answer” responses. J. Amer. Statist. Assoc. 94(445):43–52.

Bradlow ET, Wainer H, Wang X (1999) A Bayesian random effectsmodel for testlets. Psychometrika 64(2):153–164.

Cunningham WH, Cunningham ICM, Green RT (1977) The ipsativeprocess to reduce response set bias. Public Opinion Quart.41(3):379–384.

Finn A (2011) Investigating the non-linear effects of e-service qual-ity dimensions on customer satisfaction. J. Retailing ConsumerServices 18(1):27–37.

Fornell C, Johnson MD, Anderson EW, Cha J, Bryant BE (1996) TheAmerican customer satisfaction index: Nature, purpose, andfindings. J. Marketing 60(4):7–18.

Frühwirth-Schnatter S (2006) Finite Mixture and Markov-SwitchingModels (Springer, New York).

Gal D, Rucker DD (2011) Answering the unasked question:Response substitution in consumer surveys. J. Marketing Res.48(1):185–195.

Gibbons RD, Hedeker DR (1992) Full-information item bi-factoranalysis. Psychometrika 57(3):423–436.

Green PJ (1995) Reversible jump Markov chain Monte Carlocomputation and Bayesian model determination. Biometrika82(4):711–732.

Groves RM, Fowler FJ, Couper MP, Lepkowski JM, Singer E,Tourangeau R (2004) Survey Methodology (John Wiley & Sons,Hoboken, NJ).

Hayes BE (2008) Measuring Customer Satisfaction and Loyalty: SurveyDesign, Use, and Statistical Analysis Methods (ASQ Quality Press,Milwaukee).

Iacobucci D (1994) Classic factor analysis. Bagozzi RP, ed. Prin-ciples of Marketing Research (Blackwell, Cambridge, MA),279–316.

J.D. Power and Associates (2012) 2012 U.S. wireless smartphoneand traditional mobile phone customer satisfaction study—Vol. 2 results. Press release (September 6), J.D. Power andAssociates, Westlake Village, CA. http://www.jdpower.com/content/study/s1019it/2012-u-s-wireless-smartphone-and-traditional-mobile-phone-customer-satisfaction-study-vol-2-results.htm.

Johnson VE, Albert JH (1999) Ordinal Data Modeling (Springer,New York).

Jones CS (2006) A nonlinear factor analysis of S&P 500 Index optionreturns. J. Finance 61(5):2325–2363.

Jöreskog KG (1967) Some contributions to maximum likelihood fac-tor analysis. Psychometrika 32(4):443–482.

Kass RE, Raftery AE (1995) Bayes factors. J. Amer. Statist. Assoc.90(430):779–795.

Krantz DH, Luce RD, Suppes P, Tversky A (1971) Foundations ofMeasurement (Academic Press, New York).

Ledermann W (1937) On the rank of the reduced correla-tional matrix in multiple-factor analysis. Psychometrika 2(2):85–93.

Lubke GH, Muthén B (2005) Investigating population heterogeneitywith factor mixture models. Psych. Methods 10(1):21–39.

McCulloch RE, Polson NG, Rossi PE (2000) A Bayesian analysis ofthe multinomial probit model with fully identified parameters.J. Econometrics 99(1):173–193.

Meredith W (1993) Measurement invariance, factor analysis andfactorial invariance. Psychometrika 58(4):525–543.

Mulaik SA (1972) The Foundations of Factor Analysis (McGraw-Hill,New York).

Newton MA, Raftery AE (1994) Approximate Bayesian infer-ence with the weighted likelihood bootstrap. J. Roy. Statist.Soc. Ser. B 56(1):3–48.

Otter T, Allenby G, van Zandt T (2008) An integrated modelof discrete choice and response time. J. Marketing Res. 45(5):593–607.

Press SJ, Shigemasu K (1989) Bayesian inference in factor analy-sis. Olkin I, Gleser LJ, Perlman MD, Press S, Sampson A, eds.Contributions to Probability and Statistics (Springer, New York),271–287.

Rossi P, Gilula Z, Allenby G (2001) Overcoming scale usage het-erogeneity: A Bayesian hierarchical approach. J. Amer. Statist.Assoc. 96(453):20–31.

Rossiter JR (2002) The C-OAR-SE procedure for scale developmentin marketing. Internat. J. Res. Marketing 19(4):305–335.

Smith AK, Bolton RN, Wagner J (1999) A model of customer satis-faction with service encounters involving failure and recovery.J. Marketing Res. 36(3):356–372.

Sonnier G, Ainslie A (2011) Estimating the value of brand-imageassociations: The role of general and specific brand-image.J. Marketing Res. 48(3):518–531.

Spearman C (1904) “General intelligence,” objectively determinedand measured. Amer. J. Psych. 15(2):201–292.

Tanner MA, Wong WH (1987) The calculation of posterior distri-butions by data augmentation. J. Amer. Statist. Assoc. 82(398):528–549.

Thorndike EL (1920) A constant error in psychological ratings.J. Appl. Psych. 4(1):25–29.

Thurstone LL (1937) Current misuse of the factorial methods.Psychometrika 2(2):73–76.

Büschken, Otter, and Allenby: The Dimensionality of Customer Satisfaction Survey ResponsesMarketing Science 32(4), pp. 533–553, © 2013 INFORMS 553

van der Lans R, Pieters R, Wedel M (2008) Eye-movement anal-ysis of search effectiveness. J. Amer. Statist. Assoc. 103(482):452–461.

van Rosmalen J, van Herk H, Groenen PJF (2010) Identify-ing response styles: A latent-class bilinear multinomial logitmodel. J. Marketing Res. 47(1):157–172.

Wang X, Bradlow ET, Wainer H (2002) A general Bayesian modelfor testlets: Theory and applications. Appl. Psych. Measurement26(1):109–128.

Wedel M, DeSarbo WS (2002) Market segment derivation and pro-filing via a finite mixture model framework. Marketing Lett.13(1):17–25.

Wedel M, Kamakura WA (2000) Market Segmentation: Conceptualand Methodological Foundations (Kluwer Academic Publishers,Norwell, MA).

Ying Y, Feinberg F, Wedel M (2006) Leveraging missing ratingsto improve online recommendation systems. J. Marketing Res.43(3):355–365.