Multidimensional Scaling

Key Terms and Concepts

Objects and subjects. Objects, also called variables or stimuli, are the products, candidates, opinions, or other choices to be compared.

Subjects are those doing the comparing. Sometimes the subjects are termed the "source" and the objects are termed the "target". It is possible for the subjects to rate themselves, in which case subjects and objects are the same. There are a number of standard formats for data collection discussed below.

1. Preference method. Subjects may be asked, "Is Choice A more similar to Choice B or to Choice C?"

2. Paired comparison method. Subjects are presented with all possible pairs of objects and asked to compare them, for example: "Please rate the similarity of Choices A and B on a scale from 0 = no similarity to 10 = complete similarity," or to respond to a statement such as "I like Choice A better than Choice B."

3. Confusion data method. Subjects are given a stack of cards, with each card representing an object (ex., product choice, candidate) to be rated. Subjects are asked to sort the cards into stacks, with each stack indicating similar preference or similarity, and with adjacent stacks being more similar than stacks further away.

4. Direct ranking method. Subjects are asked to rank objects from 1 = most preferred to n = least preferred of n objects.

5. Objective methods. While traditionally MDS is used for data of the above types, it may also be used for data on objective distances (ex., driving distance between delivery locations), frequencies (ex., content analysis data on times a newspaper covers a given issue), flows (ex., number of communications or transactions), or agreements (ex., percent of agreeing votes between any pair of members of a city council). More generally, a correlation matrix may be converted to a dissimilarity matrix by using (1 - r) as the measure of distance, where r is the Pearson correlation coefficient (cf. Molinero & Ezzamel, 1991, using MDS to examine correlations of corporate financial ratios). SPSS can convert any conventional dataset into a distance matrix for MDS purposes (see below).
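The (1 - r) conversion mentioned under objective methods can be sketched as follows. This is a minimal illustration, assuming NumPy is available; the function name and the correlation values are hypothetical and are not taken from Molinero & Ezzamel.

```python
import numpy as np

def correlation_to_dissimilarity(r):
    """Convert a correlation matrix to dissimilarities using d = 1 - r."""
    r = np.asarray(r, dtype=float)
    d = 1.0 - r
    np.fill_diagonal(d, 0.0)  # each variable is at zero distance from itself
    return d

# Hypothetical correlations among three variables (illustrative values only).
r = np.array([
    [1.0, 0.8, -0.2],
    [0.8, 1.0, 0.1],
    [-0.2, 0.1, 1.0],
])
d = correlation_to_dissimilarity(r)
```

Under this measure a correlation of +1 maps to distance 0 and a correlation of -1 maps to distance 2, so strongly negatively correlated variables are placed farthest apart.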

Overview

Multidimensional scaling (MDS) uncovers underlying dimensions based on a series of similarity or distance judgments by subjects. That is, MDS may be thought of as a way of representing subjective attributes in objective scales. A type of perceptual mapping, the central MDS output takes the form of a set of scatterplots ("perceptual maps") in which the axes are the underlying dimensions and the points are the products, candidates, opinions, or other objects of comparison. The objective of MDS is to array points in multidimensional space such that the distances separating points physically on the scatterplot(s) reflect as closely as possible the subjective distances obtained by surveying subjects. That is, MDS shows graphically how different objects of comparison do or do not cluster. MDS is mainly used to compare objects when the bases (dimensions) of comparison are not known and may differ from objective dimensions (ex., color, size, shape, weight, etc.) which are observable beforehand by the researcher. Goodness of fit of an MDS model is shown by the stress statistic, phi.

In spite of being designed for judgment data, MDS can be used to analyze any correlation matrix, treating correlation as a type of similarity measure. That is, the higher the correlation of two variables, the closer they will be located in the map created by MDS. Though it is possible to use MDS with objective distance data and with quantitative variables in general, when dimensions are objective and measurable it is more common to use factor analysis to group such variables, or to use Q-mode factor analysis or cluster analysis to group cases. Nonetheless, because MDS does not require assumptions of linearity, metricity, or multivariate normality, it is sometimes preferred over factor analysis even for objective data. On the other hand, MDS does not take account of control relationships as factor analysis does. Pros and cons of MDS vs. factor analysis are discussed below.

MDS is popular in marketing research for brand comparisons, and in psychology, where it has been used to study the dimensionality of personality traits. Other uses include analysis of particular academic disciplines using citation data (Small, 1999) and any application involving ratings, rankings, differences in perceptions, or voting.

Contents

Key concepts and terms

ALSCAL

PROXSCAL

Assumptions

SPSS output

Frequently asked questions

Bibliography

Multidimensional Scaling: Statnotes, from North Carolina State University, Public Admin...

8/26/2010 http://faculty.chass.ncsu.edu/garson/PA765/mds.htm

Decompositional MDS, also called attribute-free MDS, is the most common type. The decompositional approach asks subjects to rate objects on an overall basis without reference to objective attributes such as size, color, cost, etc. This enables the researcher to produce a perceptual map for an individual or a composite map for a group of individuals. However, it is difficult to relate the underlying dimensions uncovered by decompositional MDS to objective factors.

Compositional MDS is an alternative which requires subjects to rate objects on a variety of specific attributes (size, cost, etc.). This approach is undermined if the researcher fails to include all the relevant attributes. Output is for the composite case (multiple subjects) only: perceptual maps are not produced for individuals. That is, only object matrices are created (see below). Compositional MDS may involve conventional statistical procedures such as factor analysis or discriminant function analysis, or may involve specialized procedures such as correspondence analysis, semantic differential analysis, or importance/performance grid analysis. Since the purpose of MDS is to minimize the researcher's a priori structuring of the data, some researchers derogate the compositional approach in favor of the decompositional.

Distance between preferences is the fundamental measurement concept in MDS. Distance may also be called similarity, dissimilarity, or proximity. There exist many alternative distance measures but all are functions of dissimilarity/similarity or preference judgments.

Similarity vs. dissimilarity matrices. For technical reasons, the ALSCAL algorithm is more efficient with dissimilarity/distance measures than with similarity/proximity measures. For this reason SPSS requires distance matrices, not similarity matrices. Cells in the matrix must indicate the degree of dissimilarity between pairs represented by the rows and columns of the matrix. If necessary, the researcher should convert similarity matrices into distance matrices before undertaking MDS analysis in SPSS.
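One simple way to make the conversion just described is to subtract each similarity from the top of the rating scale. The sketch below assumes ratings on the 0 to 10 similarity scale used in the paired comparison example; the function name and the `scale_max` parameter are hypothetical, and this is only one of several possible conversions, not a description of what SPSS does internally.

```python
import numpy as np

def similarity_to_dissimilarity(s, scale_max=10.0):
    """Turn bounded similarity ratings into dissimilarities: d = scale_max - s."""
    s = np.asarray(s, dtype=float)
    d = scale_max - s
    np.fill_diagonal(d, 0.0)  # an object has zero distance from itself
    return d

# Hypothetical similarity ratings among three choices (0 = none, 10 = complete).
s = np.array([
    [10.0, 8.0, 1.0],
    [8.0, 10.0, 3.0],
    [1.0, 3.0, 10.0],
])
d = similarity_to_dissimilarity(s)
```

The most similar pair (rating 8) receives the smallest distance (2), and the least similar pair (rating 1) receives the largest (9), which is the ordering a distance matrix needs.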

Default distance matrices. If the data are distances (ex., rankings, ratings, comparisons) then in the Distances section of the "Multidimensional Scaling" dialog of SPSS, one accepts the default "Data are distances" checkbox.

Creating distance matrices from metric variables. The SPSS Multidimensional Scaling dialog also contains the option to "Create distances from data." This option creates a square, symmetric matrix from ordinary metric or dichotomous data, where the objects are the variables in the original dataset if the default "By variables" is selected in the Measure button dialog. Alternatively, one may select "By cases," but SPSS MDS is limited to 100 variables, so if cases are objects then one may have to use Data/Select Cases to first define the dataset to have 100 cases or fewer (Select Cases has a random sampling option). The Measure button dialog allows the researcher to specify any of a variety of interval, count, or binary distance measures; to transform (ex., standardize) data by case (row) or variable (column); and to create the distance matrix by case or variable.
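The difference between creating distances "By variables" and "By cases" can be illustrated outside SPSS. The following is a minimal sketch assuming NumPy and the Euclidean measure only; the function name, the `by` argument, and the data are hypothetical.

```python
import numpy as np

def euclidean_distances(data, by="variables"):
    """Square symmetric Euclidean distance matrix among variables or among cases."""
    x = np.asarray(data, dtype=float)
    if by == "variables":
        x = x.T  # make each variable a row so that variables become the objects
    elif by != "cases":
        raise ValueError("by must be 'variables' or 'cases'")
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical dataset: 2 cases (rows) by 3 variables (columns).
data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 6.0, 3.0],
])
by_variables = euclidean_distances(data, by="variables")  # 3 x 3 matrix
by_cases = euclidean_distances(data, by="cases")          # 2 x 2 matrix
```

The number of objects in the resulting map equals the number of variables in the first case and the number of cases in the second, which is why the 100-object limit matters when cases are the objects.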

Subject, object, and objective matrices. Distance matrices may be about one subject rating another (subject matrices), about subjects rating objects (object matrices), or may represent objective distance measures (objective matrices).

Subject matrices. When the subjects themselves are also the target, a collective subject matrix is generated. Sociometric data on each student's preferences about classmates, for instance, generates one collective distance matrix, with students being the rows and columns and the cells being the distance measure from the column student as source to the row student as target. The upper data triangle does not necessarily mirror the lower as ratings may not be reciprocal for any pair. The diagonal cells are 0's as those cells are a subject with itself.

Object matrices. When the targets are different from the subjects, object matrices are created, one for each subject. Choice data on each voter's preferences between pairs of candidates, for example, generates multiple individual distance matrices, one for each voter, with the rows and columns being candidates and the cells being the distance measure separating the pair of candidates for that voter (the upper triangle of data will mirror the lower if both are entered, but normally only the lower triangle is shown). The diagonal cells are 0's as those cells are a candidate with him/herself, representing 0 distance for the subject-rater.

Objective matrices. When distance between targets is objectively measured, as in a correlation matrix of objective variables, a single objective matrix is created. Driving distance between cities, as an example which is not a correlation matrix, also generates a single distance matrix, with the rows and columns being cities and the cells being the distance measure separating the cities (the upper triangle of data will mirror the lower if both are entered, but normally only the lower triangle is shown). The diagonal cells are 0's as those cells are a city with itself, representing 0 distance by objective measurement.

SPSS matrix shape. The Shape button in the SPSS Multidimensional Scaling dialog allows the researcher to select from among three formats:

1. Square symmetric. The default, where the rows and columns are the same objects. Corresponding values in the upper and lower triangles are equal. It is not necessary to enter the upper triangle, but one must enter 0's on the diagonal cells. A correlation matrix of each variable with each other is an example of a square symmetric matrix (note the data matrix is not square, only the distance matrix reflected in the correlations).

2. Square asymmetric. Rows and columns are the same objects but the distance from A to B is not necessarily the same as from B to A. Therefore corresponding values in the upper and lower triangles are not equal and the entire matrix must be entered. Zeros are entered on the diagonal, which is ignored in computation.

3. Rectangular. This shape may be used to enter multiple object matrices, where sequential sets of rows represent the object matrices for a sequence of subjects. Alternatively, a rectangular matrix may be a single matrix in which rows and columns represent different sets of objects. By default it is assumed there is a single matrix. If instead there are two or more rectangular matrices, the researcher must fill in the "Number of rows" box to define how many rows there are per matrix (all matrices are in a single file, one set of rows after another). The number of rows must be 4 or greater and must divide evenly into the total number of rows in the data set. Unlike square matrices, the perceptual map will show points for both the column objects and the row objects.
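The three shapes, and the row-count rule for stacked rectangular matrices, can be checked with a few lines of code before data are entered. This is a sketch under stated assumptions: it uses NumPy, and the function names `classify_shape` and `split_stacked` are hypothetical helpers, not SPSS features.

```python
import numpy as np

def classify_shape(matrix):
    """Label a matrix with one of the three shapes described above."""
    m = np.asarray(matrix, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return "rectangular"
    return "square symmetric" if np.allclose(m, m.T) else "square asymmetric"

def split_stacked(data, rows_per_matrix):
    """Split stacked rectangular matrices, enforcing the row rules stated above."""
    x = np.asarray(data, dtype=float)
    if rows_per_matrix < 4:
        raise ValueError("number of rows per matrix must be 4 or greater")
    if x.shape[0] % rows_per_matrix != 0:
        raise ValueError("number of rows must divide evenly into the total rows")
    return [x[i:i + rows_per_matrix] for i in range(0, x.shape[0], rows_per_matrix)]

symmetric = [[0, 1], [1, 0]]
asymmetric = [[0, 1], [2, 0]]
stacked = np.zeros((8, 3))            # two hypothetical 4 x 3 matrices, one after another
matrices = split_stacked(stacked, 4)
```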

SPSS matrix conditionality. The Model button dialog from the main Multidimensional Scaling dialog allows the researcher to select from among three matrix conditions:

1. Matrix. The default. Cell data (distances) can be compared within each distance matrix, as when there is only one matrix or when each matrix represents a different subject.

2. Row. Used (only) for asymmetric and rectangular matrices when one can make meaningful comparisons only among numbers within the rows of each matrix. For instance, out of n objects, each subject picks the first to be the "standard," and then is asked which other one in the set is most similar, next most similar, etc., until all objects are picked and the rankings entered as row 1 in what will be a square asymmetric data matrix. The second row repeats the process, but for the second object as "standard." Etc. for n rows. For each case, comparisons are meaningful only within that case (row). Comparisons would not be meaningful between rows as each row represents a different starting standard.

3. Unconditional. Used (only) for asymmetric and rectangular matrices when one can make meaningful comparisons among all values in the input matrix/matrices.

Level of measurement. Metric MDS handles interval data, but as most data are ordinal, non-metric MDS is most common. The Model button dialog in SPSS lets the user select ordinal, interval, or ratio levels as appropriate. If ordinal, there is an option in SPSS to select "Untie tied observations," which if selected treats the variable as continuous, so that ties are resolved optimally. If interval, there is an option to apply a power or root transform in the range 1 - 4.

MDS as a test of near-metricity of ordinal data. If the researcher has ordinal data and runs MDS, selecting first ordinal and then interval level of data in the SPSS dialog, and then finds the perceptual map is similar for both algorithms, the researcher may conclude that the ordinal data are not markedly non-metric. This can be further corroborated by examining the plot of transformation for the ordinal model, discussed below.

Dimensions. By default MDS in SPSS computes two dimensions and produces a single perceptual map. However, in some cases two dimensions do not suffice to portray the points (objects) adequately and additional dimensions must be computed. If there are too few dimensions, the stress statistic (discussed below) will be too high. To explore whether stress could be reduced by adding dimensions, the researcher can specify more dimensions manually by entering minimum and maximum values between 1 and 6 dimensions in the Model button dialog. To obtain a single solution, enter the same value twice. For weighted models, the minimum dimension should be at least 2. Be careful about adding dimensions, as interpretability declines sharply, especially beyond three dimensions. Most MDS analyses are 2- or 3-dimensional. In the extreme case, when there are as many dimensions as objects, perfect but trivial explanation will be achieved with zero stress.

Optimal number of dimensions. While selecting the number of dimensions which minimizes stress or which is at the elbow of a scree plot are guidelines for selecting optimal dimensionality, the paramount criterion should be interpretability. In the SPSS manual, for instance, the example is given of 1-, 2-, and 3-dimensional solutions for perceived similarity of body parts. The 3-dimensional solution is judged optimal because lower-order solutions are degenerate in the sense that these lower-order perceptual maps depict an oversimplified body structure compared to the 3-dimensional solution. Alternatively, some researchers use principal components factor analysis to determine dimensionality first, then proceed with MDS with the specified number of dimensions derived from the factor analysis.

Rotation of axes. Unlike factor analysis, there is no step for rotation of axes. While MDS assures that objects which are similar are close on the MDS map, the axes and orientation are arbitrary functions of the input data. Thus if the data are inter-city driving distances, the resulting map may portray cities the way a geographic map would, but then again up might be south rather than north, and right might be west rather than east. Likewise, in intuiting the meaning of dimensions, since the axes are arbitrarily oriented, it may be more interpretable to understand point location diagonally rather than vertically/horizontally.
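The arbitrariness of orientation follows from the fact that rotating or reflecting a configuration leaves every interpoint distance unchanged, so the fit to the input distances is identical. A small numerical check, assuming NumPy and using hypothetical coordinates:

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distances between all pairs of rows."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical two-dimensional configuration of three objects.
coords = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])

theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
reflection = np.diag([1.0, -1.0])  # flips the vertical axis ("up" becomes "down")

transformed = coords @ rotation.T @ reflection
same_distances = np.allclose(pairwise_distances(coords),
                             pairwise_distances(transformed))
```

Because `same_distances` holds for any rotation angle, a map with north at the bottom fits inter-city distances exactly as well as one with north at the top.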

Labeling of dimensions. As in factor analysis, there is ambiguity in the labeling of axes in MDS. Subjective procedures use subjects and/or experts to "eyeball" the perceptual maps and infer dimension labels. Kruskal & Wish (1978) recommended regressing MDS dimension coordinates on objective, related variables as independents in order to help assign meaning to the dimensions. Mathematical procedures such as property fitting (ex., PROFIT) correlate subject preference ratings with objective attribute ratings. Note that dimensions will be orthogonal but unlike factor analysis, are not rotated. This means that diagonal patterns of points, not just the original axes, may be best used to intuit the meaning of the underlying dimensions in CMDS and RMDS models. Note, however, in WMDS models, rotation of axes is not allowed.

Models in SPSS ALSCAL. A "model" in MDS is a combination of the type of distance measure, the type of matrix/matrices (including whether they are weighted or unweighted), and the level of measurement. However, a "model" may also refer to an MDS algorithm, which may handle more than one combination of these attributes (for example, the ALSCAL algorithm used by SPSS can handle both metric and nonmetric models). ALSCAL is available in SPSS Base, as opposed to PROXSCAL, which is available only with the SPSS Categories add-on module.

The models supported by the SPSS ALSCAL module are:

Classical MDS (CMDS), a.k.a. Principal Coordinate Analysis or metric CMDS. In SPSS press the Model button in the MDS dialog, then in the Model dialog select "Euclidean distance" in the Scaling Model area. If data are a single matrix, CMDS is performed.
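For readers who want to see what principal coordinate analysis does numerically, the sketch below uses the textbook eigendecomposition approach (double-centering the squared distances). It assumes NumPy, a single square symmetric distance matrix, and hypothetical function names; it is not the iterative alternating least squares procedure that ALSCAL itself runs.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distances between all pairs of rows."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(d, n_dims=2):
    """Principal coordinates from a square symmetric distance matrix."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering      # double-centered matrix
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]       # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical objects at the corners of a 3 x 4 rectangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
d = pairwise_distances(points)
recovered = classical_mds(d, n_dims=2)
```

The recovered coordinates reproduce the input distances, though the configuration may come back rotated or reflected relative to `points`.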

Nonmetric CMDS is analysis where the level of measurement is specified to be ordinal. Output is similar but a different, much more computer-intensive iterative algorithm is applied based on monotonicity (order rather than metric distance is preserved in scaled MDS space and compared to the order in the input ordinal dataset).

Replicated MDS (RMDS). This model is an extension of CMDS to the case where there are multiple matrices (ex., for multiple subjects) and it is assumed that the perceptual map (stimulus configuration) is the same for each matrix (ex., each subject). That is, it is assumed that each dimension in the analysis is equally relevant to each subject's comparisons. For the 2-dimensional solution, a single perceptual map is generated for the set of matrices and summary statistics (overall stress and average RSQ) are reported.

Multiple-matrix principal coordinates analysis. Whereas traditional factor analysis handles only one data matrix at a time, RMDS and INDSCAL, discussed below, allow for simultaneous analysis of multiple matrices, as in the situation where the researcher has multiple samples not suitable for pooled factor analysis. Also, RMDS and INDSCAL have relaxed data distribution assumptions and support analysis of smaller samples.

SPSS. If the input data matrix contains two or more matrices, scaling is set to Euclidean distance, and "individual differences Euclidean distances" is not selected, then RMDS is invoked automatically (there is no dialog selection needed or available). In SPSS press the Model button in the MDS dialog, then in the Model dialog select "Euclidean distance" in the Scaling Model area. RMDS will be metric if level of measurement is set to interval or ratio, and will be nonmetric if set to ordinal. For square symmetric and square asymmetric matrix shapes, it is not necessary to tell SPSS how many rows there are in a matrix since by definition of "square" it must be the same as columns; therefore total rows must be a whole-number multiple of the number of columns. For rectangular matrix shapes, the researcher must enter the number of rows in the Shape button dialog.

Individual differences Euclidean distance (INDSCAL). Also known as weighted MDS (WMDS). This model is also an extension of CMDS to the case of multiple matrices, but it is not assumed that the stimulus configuration is the same for each matrix (ex., for each subject). Each individual may attach different importance to the dimensions in the analysis. The INDSCAL algorithm computes weights representing the importance each subject attaches to each dimension and uses these weights when creating the group perceptual map.
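The role of the subject weights can be seen in the weighted Euclidean distance, in which each squared coordinate difference is multiplied by the subject's weight for that dimension before summing. The sketch below assumes NumPy and the standard weighted Euclidean form; the function name, coordinates, and weights are hypothetical.

```python
import numpy as np

def weighted_distance(x_i, x_j, weights):
    """Weighted Euclidean distance: sqrt(sum_a w_a * (x_ia - x_ja)^2)."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(w * (x_i - x_j) ** 2)))

# Two objects in a hypothetical two-dimensional group space.
a, b = [0.0, 0.0], [3.0, 4.0]

equal_weights = weighted_distance(a, b, [1.0, 1.0])   # both dimensions matter
first_only = weighted_distance(a, b, [1.0, 0.0])      # subject ignores dimension 2
second_only = weighted_distance(a, b, [0.0, 1.0])     # subject ignores dimension 1
```

A subject who ignores the second dimension sees the two objects as 3 units apart rather than 5, which is how the same group configuration can yield different distances for different subjects.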

SPSS. In SPSS press the Model button in the MDS dialog, then in the Model dialog select "Individual differences Euclidean distance" in the Scaling Model area. This algorithm scales the data using the weighted individual differences Euclidean distance (WMDS) model, which requires two or more matrices. There is an option to support negative weights.

Asymmetric Euclidean distance model (ASCAL). This is the model when one sets Shape to be "Asymmetric" and more than one dimension is requested (2 is default).

Asymmetric individual differences Euclidean distance model (AINDS). This is the model when one sets Shape to be "Asymmetric", more than one dimension is requested, and there is more than one data matrix to analyze ("Individual differences Euclidean distance" is selected).

Generalized Euclidean metric individual differences model (GEMSCAL). This model is available only through syntax, using the following lines:

ALSCAL VARIABLES = V1 TO Vn
  /SHAPE = ASYMMETRIC
  /CONDITION = ROW
  /MODEL = GEMSCAL
  /CRITERIA = DIM(4) DIRECTIONS(4)

for the case of n variables, with n>=4, and directions between 1 and the number of dimensions specified in the DIM keyword (here, 4).

ALSCAL Output Options in SPSS.

SPSS menu: In SPSS, select Analyze, Scale, Multidimensional Scaling (ALSCAL); in the Multidimensional Scaling dialog box, enter the objects into the Variable list box (rows and columns will be the same objects, but enter the column headings); in the Distances box leave the default as "Square Matrix" (for other shapes, press the Shape button); click the Model button and enter the level of measurement (ordinal, interval, or ratio), the conditionality (matrix, row, unconditional), the scaling model (Euclidean distance or individual differences Euclidean distance), and accept or change the default number of dimensions. Click Continue and, back in the Multidimensional Scaling dialog box, click the Options button and select the output options you want. Under Options you can also change the default stress values for convergence and choose whether to treat negative distances as missing values. Click Continue to exit Options. Click OK in the Multidimensional Scaling dialog to run the analysis.

Example. Abelson & Sermat (1962) obtained data from 30 students who rated 13 pictures of women on a 9-point dissimilarity scale. Since 13 objects can be combined 2 at a time to give 78 combinations, each student thus gave 78 dissimilarity measurements. These were averaged across students using the method of successive intervals to create a matrix useful for multidimensional scaling. The 13 facial expressions were these:

1. grief: grief at death of mother
2. savor: savoring a coke
3. surprise: very pleasant surprise
4. love: maternal love-baby in arms
5. exhaustn: physical exhaustion
6. wrong: something wrong with plane
7. anger: anger at seeing dog beaten
8. pulling: pulling hard on seat of chair
9. meets: unexpectedly meets old boy friend
10. revulsion: revulsion
11. pain: extreme pain
12. knowfear: knows plane will crash
13. sleep: light sleep
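The count of 78 pairs follows from n(n - 1)/2 with n = 13. As a quick check, the pairs can be enumerated directly; the variable names below are illustrative, and the labels are the short names from the list.

```python
from itertools import combinations

expressions = [
    "grief", "savor", "surprise", "love", "exhaustn", "wrong", "anger",
    "pulling", "meets", "revulsion", "pain", "knowfear", "sleep",
]

# Every unordered pair of distinct expressions that a student had to rate.
pairs = list(combinations(expressions, 2))
n_pairs = len(pairs)
```

These pairs fill the lower triangle of a 13 x 13 square symmetric matrix, which is why only that triangle needs to be entered.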

The SPSS syntax for the example is:

ALSCAL
  VARIABLES=Grief Savor Surprise Love Exhaustion Wrong Anger Pulling Meets Revulsion Pain KnowFear Sleep
  /SHAPE=SYMMETRIC
  /LEVEL=INTERVAL
  /CONDITION=MATRIX
  /MODEL=EUCLID
  /CRITERIA=CONVERGE(0.001) STRESSMIN(0.005) ITER(30) CUTOFF(0) DIMENS(2,3)
  /PLOT=DEFAULT ALL
  /PRINT=DATA HEADER.

S-Stress and Iteration History. SPSS ALSCAL uses minimization of Young's S-stress formula 1 as the criterion for stopping its iterative solution process. Because this criterion is known to yield sub-optimal solutions (Coxon & Jones, 1980; Ramsay, 1988; Weinberg & Menil, 1993), PROXSCAL (in the SPSS Categories module) and PREFSCAL (multidimensional unfolding) are now generally preferred. Specifically, the S-stress loss function gives greater weight to larger dissimilarities, which Ramsay (1988) notes are associated with greater error.

Iteration history for the 3 dimensional solution (in squared distances)

Young's S-stress formula 1 is used.

Iteration   S-stress   Improvement
    1        .14137
    2        .12308      .01829
    3        .12218      .00089

Iterations stopped because S-stress improvement is less than .001000

Stress (phi) is a goodness of fit measure for MDS models. The smaller the stress, the better the fit. Stress measures the difference between interpoint distances in computed MDS space and the corresponding actual input distances, where MDS space is p-space, where p is the number of dimensions, set by SPSS as default to 2, but user-selectable in the Model button dialog to be a value from 1 to 6. High stress may reflect measurement error but also may reflect having too few dimensions. There are two versions, Young's S-stress (based on squared distances) and Kruskal's stress (a.k.a., stress formula 1 or stress 1, based on distances). SPSS generates both but uses S-stress as the criterion for stopping the iterations by which it resets point coordinates to reduce stress, when the improvement in S-stress is .001 or less for that iteration. (The Options dialog lets the researcher adjust this cut-off; if "0" is entered, the algorithm computes 30 iterations). Stress is little affected by sample size provided the number of objects is appreciably more than the number of dimensions (see the assumptions section below).
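The two versions of stress can be written out for a list of object pairs. The sketch below assumes NumPy and commonly cited forms of the formulas: Kruskal's stress 1 normalized by the sum of squared map distances, and S-stress 1 computed on squared values and normalized by the sum of fourth powers of the disparities. The exact normalization used in SPSS output is an assumption here, and the function names and numbers are hypothetical.

```python
import numpy as np

def kruskal_stress1(distances, disparities):
    """Kruskal's stress formula 1 (assumed form): based on distances."""
    d = np.asarray(distances, dtype=float)
    dhat = np.asarray(disparities, dtype=float)
    return float(np.sqrt(np.sum((d - dhat) ** 2) / np.sum(d ** 2)))

def s_stress1(distances, disparities):
    """Young's S-stress formula 1 (assumed form): based on squared distances."""
    d2 = np.asarray(distances, dtype=float) ** 2
    dhat2 = np.asarray(disparities, dtype=float) ** 2
    return float(np.sqrt(np.sum((d2 - dhat2) ** 2) / np.sum(dhat2 ** 2)))

# Hypothetical map distances and disparities for three pairs of objects.
map_distances = [1.0, 2.0, 3.0]
disparities = [1.0, 2.0, 4.0]

stress = kruskal_stress1(map_distances, disparities)
sstress = s_stress1(map_distances, disparities)
perfect = kruskal_stress1(map_distances, map_distances)  # zero when the fit is exact
```

Because S-stress works on squared values, a miss on a large dissimilarity contributes far more than the same miss on a small one, which is the weighting behavior Ramsay (1988) criticized.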

Overall stress is the SPSS label for average stress in RMDS models (because RMDS has more than one matrix). Average stress is the square root of the mean of squared Kruskal stress values.
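The averaging rule just stated (square root of the mean of squared stress values) is a root mean square, sketched below with hypothetical per-matrix stress values.

```python
import math

def overall_stress(stress_values):
    """Root mean square of the per-matrix Kruskal stress values."""
    return math.sqrt(sum(s * s for s in stress_values) / len(stress_values))

# Hypothetical stress values for two subjects' matrices.
per_matrix = [0.3, 0.4]
average = overall_stress(per_matrix)
```

A root mean square is never smaller than the simple arithmetic mean (here 0.35), so poorly fitted matrices pull overall stress up more than well fitted ones pull it down.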

Scree plots are an alternative, graphical, stress-based criterion for determining the optimal number of dimensions, similar to their use in factor analysis to determine the number of factors to extract. In a scree plot, the x axis is the number of dimensions and the y axis is stress. Stress declines as number of dimensions increases. The researcher looks for the "elbow" of the plot, where the curve levels off, using that as the optimal cutting point for stopping computation of additional dimensions. The SPSS MDS module does not support scree plots, but a scree plot may easily be constructed manually by running the MDS model repeatedly, augmenting the number of dimensions by 1 each time (this is done manually in the Model button dialog) and noting the stress for each run, then plotting dimensions by stress. As illustrated in the figure below, the scree plot can leave considerable ambiguity in determining number of dimensions.

Local minima. Stress should decline as the number of dimensions is increased. If stress increases when an additional dimension is allowed, this indicates a local or suboptimal solution, and solutions with still more dimensions should be examined to see whether they reduce stress.

Measures of goodness of fit are effect size measures assessing how well the MDS model fits the data. Stress, discussed above, is one such goodness of fit measure. Others are:

Squared correlation index, R². This is a common fit measure, with R² >= .60 considered acceptable fit. SPSS generates this under the label RSQ. RSQ is simply the squared correlation of the input distances with the scaled p-space distances using MDS coordinates. RSQ reflects the proportion of variance of the input distance data accounted for by the scaled data, or vice versa. For the Abelson-Sermat facial expressions data, RSQ was .845 even for two dimensions, .917 for three dimensions, and of course higher for additional dimensions. SPSS output looks like this:

Stress and squared correlation (RSQ) in distances

RSQ values are the proportion of variance of the scaled data (disparities) in the partition (row, matrix, or entire data) which is accounted for by their corresponding distances. Stress values are Kruskal's stress formula 1.

For matrix
Stress = .10148   RSQ = .91731
Configuration derived in 3 dimensions
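RSQ as described above is a squared Pearson correlation over the pairs of objects. A minimal sketch, assuming NumPy, square symmetric matrices, and hypothetical function names and values:

```python
import numpy as np

def lower_triangle(matrix):
    """Values below the diagonal of a square matrix, one per pair of objects."""
    m = np.asarray(matrix, dtype=float)
    rows, cols = np.tril_indices(m.shape[0], k=-1)
    return m[rows, cols]

def rsq(input_values, map_values):
    """Squared correlation between two equal-length lists of pairwise values."""
    a = np.asarray(input_values, dtype=float)
    b = np.asarray(map_values, dtype=float)
    return float(np.corrcoef(a, b)[0, 1] ** 2)

# Hypothetical input distances and the distances reproduced in the MDS map.
input_matrix = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 3.0], [2.0, 3.0, 0.0]])
map_matrix = np.array([[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]])

fit = rsq(lower_triangle(input_matrix), lower_triangle(map_matrix))
```

In this small example the map swaps the order of two distances, and the resulting RSQ of .25 would fall well below the .60 guideline.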

Average RSQ is output in RMDS models (because RMDS has more than one matrix). Average RSQ is the mean of RSQ values across individual matrices. It represents the average proportion of variance in the MDS p-space distances explained by the input distances, or vice versa.

Individual RSQ. WMDS models output RSQ values for individual matrices. Often, matrices correspond to subjects, in which case the researcher may sort cases into high and low values to identify sets of cases which are well-fitted by the WMDS model and sets which are not. Or the researcher may investigate whether different groups (ex., men and women) are fitted similarly by the model.

Interpretability. Though not a statistical test, perhaps the best criterion for determining the optimal number of dimensions is to run the MDS model for a sequence of dimensions, then pick the one which results in the most interpretable reproduced distances in the MDS map.

Stimulus coordinates and MDS plots. Stimulus coordinates, shown below, are the numerical coordinate locations relating stimuli to dimensions.

Stimulus Coordinates

                           Dimension
Stimulus   Stimulus      1         2         3
Number     Name
   1       Grief        .7614    -.8057    -.2675
   2       Savor       -.9389     .1779     .6737
   3       Surprise   -1.9272    1.4084    -.1769
   4       Love       -1.4629    -.2019    -.0737
   5       Exhausti     .1238   -1.2525     .5999
   6       Wrong        .9674    -.1513    1.0382
   7       Anger       2.3366     .4615     .3991
   8       Pulling     -.8300     .9047    -.5853
   9       Meets      -1.6104     .3462     .6658
  10       Revulsio     .8550    -.6452    -.8355
  11       Pain         .7070    -.1818   -1.1268
  12       KnowFear    1.6198    1.8280    -.0883
  13       Sleep       -.6014   -1.8883    -.2228
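Since the map is only a picture of these coordinates, distances between stimuli can also be computed from the table directly, which helps when the three-dimensional plot is hard to read. The sketch below assumes NumPy; the coordinates are copied from the table above, and the helper name `nearest_neighbor` is hypothetical.

```python
import numpy as np

names = ["Grief", "Savor", "Surprise", "Love", "Exhausti", "Wrong", "Anger",
         "Pulling", "Meets", "Revulsio", "Pain", "KnowFear", "Sleep"]

# Three-dimensional stimulus coordinates from the SPSS table.
coords = np.array([
    [0.7614, -0.8057, -0.2675],
    [-0.9389, 0.1779, 0.6737],
    [-1.9272, 1.4084, -0.1769],
    [-1.4629, -0.2019, -0.0737],
    [0.1238, -1.2525, 0.5999],
    [0.9674, -0.1513, 1.0382],
    [2.3366, 0.4615, 0.3991],
    [-0.8300, 0.9047, -0.5853],
    [-1.6104, 0.3462, 0.6658],
    [0.8550, -0.6452, -0.8355],
    [0.7070, -0.1818, -1.1268],
    [1.6198, 1.8280, -0.0883],
    [-0.6014, -1.8883, -0.2228],
])

diff = coords[:, None, :] - coords[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1))  # 13 x 13 interpoint distances

def nearest_neighbor(name):
    """Name of the stimulus closest to the given one in the 3-dimensional space."""
    i = names.index(name)
    row = distances[i].copy()
    row[i] = np.inf  # exclude the stimulus itself
    return names[int(np.argmin(row))]

closest_to_love = nearest_neighbor("Love")
```

For these coordinates the stimulus closest to Love is Meets, at a distance of about 0.93.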

Plots or maps, obtained under the Options button in the SPSS dialog, are simply the graphical expression of the computed coordinates:

MDS Map, a.k.a. perceptual map but labeled "Derived Stimulus Configuration" in SPSS output, shows Dimension 1 on the X axis and Dimension 2 on the Y axis, for the 2-dimensional solution. The orientation of the points will be an arbitrary function of the input coding. The meaning of the dimensions must be intuited from the alignment of points (or from the "Table of Stimulus Coordinates", which has the same information in numeric form). The researcher looks for clusters of points, indicating a set of similar objects. Discussion of findings focuses on comparison of clusters. Since point placement within a cluster can be highly influenced by small differences, if the researcher wishes to compare points within a cluster, it is recommended that the researcher re-run MDS for just the objects in the cluster of interest.

For the two-dimensional solution to the matrix of facial expression data, SPSS ALSCAL creates the MDS map which appears below.



Here it can be seen there is a love-savor-meets cluster as well as a grief-revulsion-pain-wrong cluster. Anger is closer to the latter cluster than the former. Additional observations might be made on the basis of clustering. The axes are more difficult to interpret than the clusters, but it might be said there are two axes: the horizontal love vs. anger axis, and a vertical sleep vs. alertness axis (inferring that fear of plane crash equates to alertness). However, there is subjectivity and ambiguity. One might use multiple expert interpreters to validate a modal interpretation. Note also, the higher the stress for the solution, the less reliable the location of objects in MDS space and hence the less reliable the interpretation.

For comparison, here is the three-dimensional solution for the same dataset, graphically reflecting the coordinates above:

The three-dimensional map is harder to read. Looking at the table of stimulus coordinates aids in the interpretation. The clusters and first two dimensions remain largely the same. Dimension 1 still runs from love-surprise-meets on the negative pole to anger on the positive pole. Likewise, dimension 2 still runs from sleep-exhaustion on the negative pole to knowfear-surprise on the positive pole. The third dimension is very difficult to interpret (suggesting the two-dimensional solution, being more interpretable while yielding the same clusters, may be better). It goes from pain on the negative pole to wrong on the positive pole, with smaller coordinate values and less well differentiated poles.

Fit plots

Scatterplot of linear fit (a.k.a., Shepard diagram) displays distances on the Y axis and disparities (input distances transformed into MDS p-space) on the X axis. Disparities are the original distances for any two points in the input matrix after transformation. Distances are the reproduced distances and measure the distance of two points in the MDS space created by two dimensions. In a perfect model, the distances and disparities for any two points are equal. Consequently, the more the scatterplot of linear fit forms a straight 45-degree line, the better the fit of the MDS model to the data, for the case of metric scaling. For nonmetric scaling, the best-fitting plot will have a weakly monotonic pattern - that is, it will have a step-line pattern corresponding to the step-function used for the monotonic transformation of the input data, and deviations from the step-line show lack of fit.



Above, for the Abelson-Sermat facial expressions example's two-dimensional solution, it can be seen that the model works fairly well for estimated distances (disparities) of 2 or higher, but much less well for smaller disparities.
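The comparison behind a fit plot can be sketched in a few lines of Python. This is an illustrative sketch only, not SPSS's implementation: it assumes the metric case, takes the disparities to be the input dissimilarities themselves, takes the distances to be Euclidean distances among the fitted coordinates, and summarizes misfit with Kruskal's stress formula 1. The function names and the four-point toy configuration are hypothetical.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distances between all rows of an (n, p) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def shepard_pairs(disparities, coords):
    """Return (disparity, distance) for every pair i < j: the points of the fit plot."""
    pairs = np.triu_indices(disparities.shape[0], k=1)
    return disparities[pairs], pairwise_distances(coords)[pairs]

def kruskal_stress1(disparities, coords):
    """Kruskal's stress formula 1: sqrt(sum (d - dhat)^2 / sum d^2)."""
    dhat, d = shepard_pairs(disparities, coords)
    return np.sqrt(((d - dhat) ** 2).sum() / (d ** 2).sum())

# Hypothetical toy configuration: four objects on the corners of a unit square.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

# A perfect model: disparities equal the map distances, so all points lie on the 45-degree line.
perfect = pairwise_distances(coords)
print(kruskal_stress1(perfect, coords))  # 0.0

# Distort one input value: that pair moves off the line and stress becomes positive.
noisy = perfect.copy()
noisy[0, 1] = noisy[1, 0] = 2.0
print(kruskal_stress1(noisy, coords) > 0)  # True
```

Plotting the two arrays returned by shepard_pairs against each other would give the scatterplot itself; the stress value is one numerical summary of how far the points fall from the line.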

Scatterplot of nonlinear fit. This is produced for nonmetric models (level of measurement is ordinal) and shows observations on the X axis and untransformed input distances (labeled "Observations") on the Y axis. Observations are the values (not id numbers!) of input distances from small to large. A well-fitting model is homoscedastic, with points about as close to the line for low values as for high values.

Above, for the Abelson-Sermat facial expressions example's two-dimensional solution, data treated as ordinal, it can be seen that for lower input distance ratings, the model is somewhat less homoscedastic, though far from random.

Plot of transformation. This is produced for nonmetric models (level of measurement is ordinal) and shows observations on the X axis and distances after monotonic transformation on the Y axis. After transformation, distances are relabeled as disparities to distinguish them. As one moves from small values of observations on the left of the X axis to large ones on the right, points on the line will always be the same or greater in value on the Y axis, by definition of monotonicity. The nonlinear line formed by the plot of transformation is the nonlinear regression line for the ordinal data at hand. The more this regression line is relatively smooth rather than markedly stepped, the more metric-like the ordinal data and the less difference in MDS output by specifying ordinal rather than interval level of measurement.



Above, for the Abelson-Sermat facial expressions example's two-dimensional solution, data treated as ordinal, the transformation line is fairly stepped, suggesting the student ratings are properly treated as ordinal.
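A weakly monotonic transformation of this kind can be illustrated with pool-adjacent-violators regression, a standard way to compute a least-squares nondecreasing fit. The sketch below is an assumption about how such a step function may be obtained, not a reproduction of the ALSCAL routine; the function name and the sample values are hypothetical.

```python
import numpy as np

def monotone_regression(values):
    """Pool-adjacent-violators: least-squares nondecreasing fit to a sequence.

    `values` must already be ordered by the observations (small to large).
    """
    blocks = []  # each block holds [sum of values, count of values]
    for value in values:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return np.concatenate([np.full(count, total / count) for total, count in blocks])

# Hypothetical map distances, listed in order of increasing input observation.
distances = [1.0, 3.0, 2.0, 4.0, 3.5, 5.0]
disparities = monotone_regression(distances)
print(disparities)  # [1.   2.5  2.5  3.75 3.75 5.  ]
```

The tied values in the result (2.5, 2.5 and 3.75, 3.75) are the flat steps of the transformation line; the more such pooling is needed, the more markedly stepped the plot.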

Individual subject plots. If data are ordinal (ordered categorical) in level of measurement and the conditionality (see above) is set to "Matrix," this option generates separate plots for each subject's data. Only group plots are available for other data types.

Other output options

1. Data matrix. If checked, the input matrix and scaled data for each subject are displayed in the output.

2. Model and options summary. Data, model, output, and algorithmic options for the current run are displayed.

3. Weights and coordinates can be saved in SPSS syntax mode, using the OUTFILE subcommand. In the MDS main dialog, click the Help key, then under the Index tab, enter OUTFILE and select the ALSCAL listing for details.

4. Weirdness index. For INDSCAL/WMDS models, SPSS output will show the weirdness index for each subject. The weirdness index is used to flag heavily influential subjects affecting the analysis. In such weighted models, different individuals may attach different importance to different dimensions when making comparisons. A weirdness index of 0 means that individual weights each dimension the same as the average group weight. A weirdness index of 1 means the individual weights a single dimension as all-important and weights the other dimensions as of zero importance.

5. Flattened subject weights. For INDSCAL/WMDS models, SPSS output includes a table of "Flattened Subject Weights" and also a plot of "Flattened Subject Weights" which graphically displays each subject on axes formed by the dimensions. Raw subject weights are interpreted as angles and it is more intuitive to interpret flattened weights, which convert raw subject weight information into distance coordinates. In the "Flattened Subject Weights" plot, subjects with low weights on the dimensions will appear in the middle of the plot (where the 0 points of the axes variables are) and subjects with high weights on one or the other axes will appear toward the right on the X axis or toward the top on the Y axis, or both. In general, subjects in the middle of the plot will have low weirdness indices, and those toward the periphery will have distinctly higher weirdness.

Comparing flattened weights. The table and plot of flattened subject weights allows the researcher to make comparisons among subjects in terms of how they are similar or dissimilar in the emphases they give to the dimensions. Note, however, the flattening transform reduces r dimensions to (r-1) variables (this is because the flattening algorithm transforms the raw weights to add to 1.0, so when r-1 flattened weights are determined, the rth weight is also determined and is redundant). That is, flattened weight space will have one fewer dimensions than the original weight space. To differentiate, the dimensions in flattened weight space are labeled "variables." Note also that because subject weights are not independent, statistical testing of significance of differences between subjects is inappropriate.

PROXSCAL Input and Output Options in SPSS.

SPSS menu: PROXSCAL accepts square data matrices, where the cell entries are dissimilarities (the default - high is more dissimilar) or similarities. Note that "proximity" may mean either dissimilarity or similarity; which one applies is specified in the Model dialog discussed below. The matrices may be symmetrical (the upper triangle mirrors the lower triangle, meaning object A is the same distance from B as B is from A) or asymmetrical (upper and lower triangles differ, as in friendship closeness ratings, where A and B differ in their perceptions of each other).

It is also possible in the input data table to have one or more sourceid variables (ex., to set up groups for men vs. women). Thus DATA LIST / r_id c_id men women. would be followed by four columns of data: the row id of the cell, the column id of the cell, the proximity score for that cell for men, and the proximity score for that cell for women. Thus one would be entering two data matrices. The SPSS manual describes other data entry options.
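To make the layout concrete, the following Python sketch unpacks rows of the form (r_id, c_id, men, women) into one square matrix per source. It only illustrates the data structure; it is not how SPSS reads the file, and the function name, the 1-based ids, the assumption of symmetric proximities, and the sample values are all hypothetical.

```python
import numpy as np

def long_to_matrices(rows, n_objects, n_sources):
    """Build one square proximity matrix per source from long-format rows.

    Each row is (r_id, c_id, proximity_source_1, proximity_source_2, ...),
    with 1-based ids. Symmetry is assumed, so each value is mirrored.
    """
    matrices = np.zeros((n_sources, n_objects, n_objects))
    for r_id, c_id, *proximities in rows:
        for source, value in enumerate(proximities):
            matrices[source, r_id - 1, c_id - 1] = value
            matrices[source, c_id - 1, r_id - 1] = value
    return matrices

# Hypothetical lower-triangle entries for three objects and two sources (men, women).
rows = [
    (2, 1, 3.0, 4.0),
    (3, 1, 5.0, 2.0),
    (3, 2, 1.0, 6.0),
]
men, women = long_to_matrices(rows, n_objects=3, n_sources=2)
print(men)
print(women)
```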

In SPSS, select Analyze, Scale, Multidimensional Scaling (PROXSCAL) (note you must have purchased and installed the SPSS Categories add-on to see this menu choice); in the Multidimensional Scaling: Data Format dialog box which opens, specify your data type as illustrated below (the illustration shows default selections). Note that like ALSCAL, PROXSCAL can create proximities from raw data. Note also that INDSCAL models can be implemented by specifying multiple sources in the Data Format dialog.

Click on the Define button to bring up the dialog shown below, where one may enter the objects into the Variable list box (rows and columns will be the same objects, but enter the column headings) as shown below. Note proximities may be weighted if desired. (Tip: long variable names will clutter the MDS map, even overwriting each other).

Click on the Model button from the above Define dialog to bring up the next dialog, where one may specify the data level ("spline" refers to smooth nondecreasing piecewise polynomial transformations of the original proximities), the matrix shape (not the input matrix shape, which is full square symmetric; rather shape refers to whether the upper or lower data triangles will be analyzed, or for asymmetric data, both), whether matrix entries are similarities or dissimilarities (ex., dissimilarity is the default, where high means more dissimilar), and the number or range of dimensions for which to seek a solution.

It may be desirable to run the analysis once specifying proximities as interval and once as ordinal. The run with the lower stress is the better model. If stress is similar for both runs, the proximity data can be said to approach being metric.

The "Apply transformations" section applies only for multiple data sources as in INDSCAL models, where "Across all sources simultaneously" is selected for global rather than local analysis. Global, or unconditional, analysis is appropriate when there are multiple matrices which are similar in nature.

Note that PROXSCAL supports four alternative scaling models:

1. Identity. This is the default simple Euclidean model as in CMDS in ALSCAL. It is not used when there are multiple sources to be compared.

2. Weighted Euclidean. This is for the INDSCAL model and is used when individual differences are to be modeled, as when there are separate matrices for men and women.

3. Generalised Euclidean. This is equivalent to the GEMSCAL model in ALSCAL.

4. Reduced Rank. This IDIOSCAL variant employs a matrix of minimal rank.

Below, defaults are shown except the default for Dimensions is 2:

Click on the Restrictions button from the Define dialog to bring up another dialog, shown below, where one may constrain certain object coordinates to specific values (no restrictions is the default):

Click on the Options button from the Define dialog to bring up yet another dialog, shown below, where one may select among certain algorithms and convergence criteria. A Simplex starting value is the default, in effect initially placing all objects equidistant, then in one iteration trying to reduce stress, then reducing to the number of requested dimensions. The Torgerson method is the classical approach. If Multiple Random is selected, stress values are computed for multiple runs with different random starting points. Defaults are shown below.



Click on the Plots button from the Define dialog to bring up a dialog, shown below, where one may choose among various plots to output. Checking Stress generates a scree plot, discussed above in the ALSCAL section. Common space is a default and generates the MDS map, also discussed in the ALSCAL section:

Finally, click on the Output button from the Define dialog to bring up a dialog, shown below, where one may specify the desired statistical output. Only common space coordinates and multiple stress measures are default output.



To run the analysis, click Continue in the Output dialog, then back in the Define dialog box, click OK.

Example. The same example is used to illustrate PROXSCAL as discussed above with regard to ALSCAL. The syntax is:

PROXSCAL VARIABLES=Grief Savor Surprise Love Exhaustion Wrong Anger Pulling Meets Revulsion Pain KnowFear Sleep
  /SHAPE=LOWER
  /INITIAL=SIMPLEX
  /TRANSFORMATION=RATIO
  /PROXIMITIES=DISSIMILARITIES
  /ACCELERATION=NONE
  /CRITERIA=DIMENSIONS(2,3) MAXITER(100) DIFFSTRESS(.0001) MINSTRESS(.0001)
  /PRINT=COMMON DISTANCES TRANSFORMATIONS INPUT HISTORY STRESS DECOMPOSITION
  /PLOT=STRESS COMMON.

Iteration history, shown below, is output primarily useful to check convergence and also to see the actual value used to start the iterative process of calculating coordinates (here the default Simplex algorithm is used).

The Residuals plot, output when "Transformed proximities versus distances" is checked under the Plots button, should approximate a straight line consistent with the linear transformation of proximities under the assumption data are numerical. If the plot does not approximate a straight line, then the analysis should be re-run specifying an ordinal transformation.



Stress and Fit Measures. As shown below, PROXSCAL outputs a wider range of fit measures than ALSCAL. Dispersion accounted for (DAF) and Tucker's coefficient of congruence (TCC) are goodness of fit measures, where higher is better fit. The four stress coefficients are measures of misfit, where lower is better fit. Stress-1 is normally used when comparing among solutions; S-stress is not.

As shown below, PROXSCAL outputs a table of decomposition of stress to identify which objects contribute most to overall stress. Not shown here, if there are multiple sources as in an INDSCAL model, the decomposition of stress table also shows which sources contribute most to overall stress.

MDS coordinates. PROXSCAL outputs the coordinates used to graph the MDS map in the section called "Common Space". Below, a two-dimensional solution is computed. The coordinates usually will differ considerably from those computed in ALSCAL, but comparison on this basis is almost impossible as the orientation and scaling of the plots differ. Comparison must be made on the basis of clustering of objects in the MDS map, as illustrated below.
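Why raw coordinates from two programs cannot be compared directly can be seen in a small numerical sketch. It assumes two hypothetical configurations that describe the same map but differ by a rotation, a reflection, and a uniform rescaling; the coordinates, angle, and scale factor are made up for illustration and do not come from the SPSS output.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distances between all rows of an (n, p) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical configuration "A": four objects in two dimensions.
coords_a = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])

# Configuration "B": the same map rotated by 40 degrees, mirrored, and rescaled.
theta = np.deg2rad(40.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
scale = 1.7
coords_b = scale * (coords_a @ rotation.T)
coords_b[:, 1] *= -1.0  # reflect the second dimension

pairs = np.triu_indices(len(coords_a), k=1)
ratios = pairwise_distances(coords_b)[pairs] / pairwise_distances(coords_a)[pairs]

print(np.allclose(coords_a, coords_b))  # False: the raw coordinates disagree
print(np.allclose(ratios, scale))       # True: every distance differs by the same factor
```

Because every inter-point distance changes by the same factor, the relative closeness of objects, and hence the clustering, is identical in the two configurations even though the coordinate tables look unrelated.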

MDS maps. Below, the two-dimensional and three-dimensional solutions are displayed from PROXSCAL. Though the orientation differs, the clustering of objects is substantially similar in PROXSCAL as in ALSCAL for this example.



Assumptions

Proper specification of the model. All relevant objects should be included in the preference comparisons on which the MDS is based. Omission of relevant objects can dramatically affect MDS output. The same is true if correlated but irrelevant objects are included.

Proper level of measurement. Different computational algorithms are applied to ordinal, interval, and ratio data, which must be specified correctly by the researcher. Level of data is specified under the Model button in the SPSS MDS dialog.

Objects >= dimensions. If there are more dimensions than objects, the MDS solution will be unstable. If there are too few objects in relation to dimensions, goodness of fit measures will be inflated. As a rule of thumb, the research design should provide for four times as many objects as dimensions, plus 1 (thus 5 objects for a 1-dimensional solution, 9 for 2-dimensional, etc.).

Similar scales. If variables differ greatly in scale of measurement (ex., dollars income vs. years of education), one should standardize the data first to avoid output distortion. The option to use standardized Z-scores (or other transforms) is found in the main MDS dialog under the option to "Create distances from data."
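The effect of scale differences on distances can be sketched as follows. This is an illustration with made-up data, not the SPSS "Create distances from data" routine; it assumes Euclidean distances between cases and z-scores computed with the sample standard deviation, and the function names are hypothetical.

```python
import numpy as np

def zscore_columns(data):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

def euclidean_distance_matrix(points):
    """Euclidean distances between all rows of an (n, p) array."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical cases: columns are income in dollars and years of education.
data = np.array([[30000.0, 12.0],
                 [52000.0, 16.0],
                 [75000.0, 18.0],
                 [41000.0, 12.0]])

raw = euclidean_distance_matrix(data)
standardized = euclidean_distance_matrix(zscore_columns(data))

print(raw[0, 1])           # about 22000: essentially the income gap alone
print(standardized[0, 1])  # both variables now contribute on a comparable scale
```

In the raw matrix the four-year difference in education between the first two cases is invisible next to the 22,000-dollar income difference, which is the kind of distortion standardization is meant to avoid.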

Comparability. The objects being compared/voted upon/ranked must share one or more meaningful dimensions on which meaningful comparison is possible.

History. Perceptual dimensions may change over time for the same individuals.

Sample size. Large sample size is not required. There must be at least four objects (variables).

Missing values should be a small percentage of total cases. Large numbers of missing values can lead to misleadingly low estimates of stress.

Few ties. The number of ties should not be large as this can lead to misleadingly low estimates of stress.

Data distribution. MDS does not assume any particular data distribution, though variance in the data is necessary for meaningful results. In particular, MDS is robust under non-normal data distributions (Subkoviak & Farr, 1976).

SPSS limits. SPSS supports up to 100 objects on up to 6 dimensions. There can be no more than 32,767 total values in the analysis. The total number of stimulus (object) coordinates plus the number of weights must not be greater than the number of data values. Data weights created by the SPSS WEIGHT command are ignored.

Examples of SPSS MDS Output

Air Flight Distances Between Cities. ALSCAL CMDS on an objective matrix.

Car Attributes. ALSCAL CMDS on an objective matrix formed by the "Create distances from data" and "By variables" option which, in this case, reduced data on 393 automobiles to six objects (variables such as horsepower, engine size, acceleration, etc). Appended to the output is the MDS plot if the "By cases" option is taken instead. Also appended are similar MDS plots for 1-, 2-, and 3-dimensional solutions.



Frequently Asked Questions

What other procedures are related to MDS?

MDS is used to show the relationship among objects related by some distance measure, where the objects typically are subjects and distances are preferences, communication frequencies, or other sociometric data; or where the objects are choices, candidates, or alternatives and distances are preferences, perceived dissimilarities, or other rankings. Related procedures are:

1. Correspondence analysis, which creates perceptual maps showing the relation of values of multiple variables (ex., if variables are alternatives with high, medium, and low risk; high, medium, and low cost; black, white, and red colors; etc., then the plot will depict how close low risk is to high cost and red color).

2. Factor analysis, which is used for data reduction, reducing a large number of variables to a small number of underlying factors. A plot can show the location of variables in factor space. Factor analysis takes account of control relationships among the variables, whereas MDS treats correlation as a simple distance measure and will locate correlated variables close to each other on the MDS map even when their partial correlation is zero. However, factor analysis imposes more stringent assumptions: relationships must be linear in factor analysis but not in MDS, data must be multivariate normal in distribution in factor analysis but not in MDS, and factor analysis assumes metric data whereas ordinal data meet the assumptions of MDS. In general, factor analysis will yield more factors than MDS will yield dimensions, and this may make MDS more interpretable, though MDS and factor analysis are similar in the subjectivity and often difficulty of imputing meaningful labels to the factors and dimensions. MDS also allows explicit comparison of results of ordinal vs. metric models for the same data.

3. Cluster analysis. Hierarchical or k-means cluster analysis is used for data reduction, reducing a large number of cases to a small number of underlying clusters. A plot can show the location of cases in cluster space. Cluster analysis can also be used to cluster variables.

If one has metric or dichotomous raw data, it may be that factor analysis or cluster analysis would be more efficient for the researcher's problem. On the other hand, MDS has relaxed data distribution assumptions, is robust with smaller sample size than is factor analysis, and can handle multiple matrices simultaneously, and so there are instances where the researcher may prefer MDS even for objective matrices.

How does MDS work?

At a very general level, MDS assigns points to arbitrary coordinates in p-dimensional space. Euclidean distances are computed for each pair of points. The computed distances are compared with the input distances to get the stress function. Coordinates in p-space are adjusted in the direction that lowers stress. The process is repeated iteratively until the reduction in stress is less than some default or researcher-specified cutoff amount.
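The loop just described can be sketched in Python. This is a deliberately simple illustration, not the ALSCAL or PROXSCAL algorithm: it assumes metric scaling, uses raw stress (the sum of squared differences between map distances and input distances) as the loss, and adjusts coordinates by plain gradient descent. The function names, step size, tolerance, and toy input are all assumptions.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distances between all rows of an (n, p) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def raw_stress(target, coords):
    """Sum over pairs i < j of (map distance - input distance)^2."""
    pairs = np.triu_indices(target.shape[0], k=1)
    return ((pairwise_distances(coords)[pairs] - target[pairs]) ** 2).sum()

def simple_mds(target, n_dims=2, step=0.01, tol=1e-9, max_iter=5000, seed=0):
    """Return final coordinates and the history of stress values."""
    rng = np.random.default_rng(seed)
    coords = rng.normal(size=(target.shape[0], n_dims))  # arbitrary starting coordinates
    stress = raw_stress(target, coords)
    history = [stress]
    for _ in range(max_iter):
        diff = coords[:, None, :] - coords[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(dist, 1.0)            # avoid division by zero on the diagonal
        coef = (dist - target) / dist
        np.fill_diagonal(coef, 0.0)            # a point exerts no pull on itself
        grad = 2.0 * (coef[:, :, None] * diff).sum(axis=1)
        candidate = coords - step * grad       # move in the direction that lowers stress
        new_stress = raw_stress(target, candidate)
        if stress - new_stress < tol:          # reduction below the cutoff: stop
            break
        coords, stress = candidate, new_stress
        history.append(stress)
    return coords, history

# Hypothetical input distances: the corners of a unit square.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
target = pairwise_distances(square)
coords, history = simple_mds(target)
print(history[0], history[-1])  # stress at the start and at the stopping point
```

A procedure of this kind can stop at a local minimum that depends on the starting coordinates, which is one reason an option such as multiple random starts may be worth using.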

If one has multiple data matrices, why do RMDS or INDSCAL? Why not just do a series of CMDS models, one on each matrix?

Doing a series of CMDS models would be appropriate if the researcher can rule out or is uninterested in the possibility that the matrices share a common structure that would be revealed in RMDS perceptual maps. Although INDSCAL individual difference models have fewer constraints and are much less parsimonious than RMDS models, they also generate statistics (weirdness indices, flattened subject weights) which help the researcher discern patterns across matrices.

What computer programs handle MDS?

In addition to modules in major packages such as SPSS, there are specialized scaling programs, of which ALSCAL, Multiscale, and SMACOF-IB are leading examples. See also NewMDSX, reviewed by Routh (2007). Others include Minissa, INDSCAL, Moscal, Freemap, and X-MDS, and many more are available.

What is Torgerson Scaling?

Sometimes called Torgerson-Gower scaling, this is a classical multidimensional scaling method which minimizes a loss function called "strain." It has largely been replaced by metric and non-metric scaling methods discussed in this section, both of which minimize stress. See Torgerson (1958).
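The classical method is commonly computed by double-centering the squared dissimilarities and taking the leading eigenvectors of the result. The Python sketch below follows that textbook recipe under the assumption of a symmetric dissimilarity matrix; the function names and the four-point example are hypothetical, and it is not SPSS code.

```python
import numpy as np

def torgerson_scaling(dissimilarities, n_dims=2):
    """Classical (Torgerson) scaling of a symmetric dissimilarity matrix."""
    n = dissimilarities.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centered matrix of scalar products.
    b = -0.5 * centering @ (dissimilarities ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]        # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)     # guard against tiny negative values
    return eigvecs[:, order] * np.sqrt(top_vals)

def pairwise_distances(coords):
    """Euclidean distances between all rows of an (n, p) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical objects with known two-dimensional positions.
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 1.0], [0.0, 2.0]])
distances = pairwise_distances(points)

coords = torgerson_scaling(distances, n_dims=2)
print(np.allclose(pairwise_distances(coords), distances))  # True
```

When the input distances are exactly Euclidean in the requested number of dimensions, as here, the recovered map reproduces them, though its orientation is arbitrary. No iteration is involved, which is why this solution is also useful as a starting configuration for stress-minimizing methods.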

How does MDS relate to "smallest space analysis"?

"Smallest space analysis" was a forerunner of MDS, associated with the psychometric approach to assessment variables. That is, MDS represents a later generation of algorithms in the same methodological grouping.

Bibliography

Abdi, H. (2007). Metric multidimensional scaling. In Salkind, N. J., ed., Encyclopedia of measurement and statistics. Thousand Oaks, CA: Sage.

Abelson, R. P. & Sermat, V. (1962). Multidimensional scaling of facial expressions. Journal of Experimental Psychology 63, 546-554.

Borg, I. & Groenen, P. (1997). Modern multidimensional scaling: Theory and applications. New York: Springer.

Cox, T. F. & Cox, M. A. A. (2001). Multidimensional scaling. London: Chapman and Hall.

Coxon, A. P. M. & Jones, Charles (1980). Multidimensional scaling: Exploration to confirmation. Quality and Quantity 14(1), 31-73.

Green, Paul E., Carmone, Frank J., & Smith, Scott M. (1989). Multidimensional scaling: Concepts and applications. Boston: Allyn & Bacon.

Kruskal, Joseph B. & Wish, Myron (1978). Multidimensional scaling. Sage University Paper Series on Quantitative Applications in the Social Sciences. Beverly Hills, CA: Sage Publications.

MacCallum, R. C. (1977). Effects of conditionality on INDSCAL and ALSCAL weights. Psychometrika 42, 297-305.

Mead, A. (1992). Review of the development of multidimensional scaling methods. The Statistician 41(1), 27-39.

Molinero, C. M. & Ezzamel, M. (1991). Multidimensional scaling applied to company failure. Omega 19, 259-274.

Ramsay, J. O. (1977). Maximum likelihood estimation in multidimensional scaling. Psychometrika 42, 241-246.

Ramsay, J. O. (1988). Is multidimensional scaling magic or science? Contemporary Psychology 33, 874-875.

Routh, David A. (2007). Statistical software review. British Journal of Mathematical and Statistical Psychology 60(2), 429-432.

Schiffman, Susan S., Reynolds, M. Lance, & Young, Forrest W. (1981). Introduction to multidimensional scaling. NY: Academic Press.

Small, H. (1999). Visualizing science by citation mapping. Journal of the American Society for Information Science 50(9), 799-813.

Subkoviak, M. J. & Farr, S. D. (1976). Violation of assumed normality in traditional multidimensional scaling. Educational and Psychological Measurement 36(3), 639-645.

Torgerson, W. S. (1958). Theory and methods of scaling. NY: Wiley.

Weinberg, Sharon L. & Menil, Violeta C. (1993). The recovery of structure in linear and ordinal data: INDSCAL versus ALSCAL. Multivariate Behavioral Research 28(2), 215-233.

Young, Forrest W. (1999). Multidimensional scaling. Retrieved 11/17/06 from http://forrest.psych.unc.edu/teaching/p208a/mds/mds.html

Young, Forrest W. & Hamer, R. M. (1987). Multidimensional scaling: History, theory, and applications. Hillsdale, NJ: Lawrence Erlbaum Associates.

Copyright 2006, 2008, 2009 G. David Garson. Last updated 2/8/2009.
