P. PRASHANTH
PhD SCHOLAR
Dept. of Agricultural Extension
I.D. No. RAD/11-10
MULTIDIMENSIONAL SCALING
Outline of Presentation
• Some Historical Milestones
• Statistics and Terms Associated with MDS
• What is MDS?
• How to Conduct Multidimensional Scaling?
• How to Decide Number of Dimensions?
• MDS Applications
• Assumptions and Limitations of MDS
• MDS is a relatively complicated scaling device, but with this sort of scaling one can scale objects, individuals, or both with a minimum of information
• MDS allows a researcher to measure an item in more than one dimension at a time
• MDS is a class of procedures for representing perceptions and preferences of respondents spatially by means of a visual display
• Perceived or psychological relationships among stimuli are represented as geometric relationships among points in a multidimensional space
• These geometric representations are often called spatial maps
• The axes of the spatial map are assumed to denote the psychological bases or underlying dimensions respondents use to form perceptions and preferences for stimuli
• The MDS analysis will reveal the most salient attributes which happen to be the primary determinants for making a specific decision
• According to Beri (1999), MDS is a data reduction technique, the primary purpose of which is to uncover the 'hidden structure' of a set of data. It enables the researcher to represent the proximities between objects spatially, as in a map
• The term 'proximities' means any set of numbers that express the amount of similarity or difference between pairs of objects
• The term 'objects' refers to things or events
• The proximity data can come from similarity judgments, identification confusion matrices, grouping data, same-different errors, or any other measure of pairwise similarity
Some Historical Milestones
• 1635: Van Langren provides a distance matrix and a map
• 1958: Torgerson provides a solution for classical MDS based on eigendecomposition
• Torgerson proposed the first MDS method and coined the term
• 1966: Gower independently provides the same solution for classical MDS and gives the connection to principal component analysis
• Classical MDS – for minimizing stress
• Shepard – provides a heuristic for MDS in psychology
• 1954: Guttman, facet theory: extra information (external variables) is available on the objects according to the facet design by which the objects are generated
• 1986-1998: Meulman: integration of (nonlinear) multivariate analysis and MDS
• Much emphasis on the representation of objects, less on the variables
• Findings by MDS through stress as a dimension reduction technique
• Including a wide variety of MVA techniques:
  - (Nonlinear) PCA
  - Multiple Correspondence Analysis
  - Correspondence Analysis
  - Generalized Canonical Correlation Analysis
  - Discriminant Analysis
• 1999: Heiser, Meulman, Busing: PROXSCAL (i.e. SMACOF) in SPSS (PASW)
• 2009: De Leeuw & Mair: SMACOF in R
Statistics and Terms Associated with MDS
• Similarity judgments. Similarity judgments are ratings on all possible pairs of brands or other stimuli in terms of their similarity using a Likert-type scale.
• Preference rankings. Preference rankings are rank orderings of the brands or other stimuli from the most preferred to the least preferred. They are normally obtained from the respondents.
• Stress. This is a lack-of-fit measure; higher values of stress indicate poorer fits.
• R-square. R-square is a squared correlation index that indicates the proportion of variance of the optimally scaled data that can be accounted for by the MDS procedure. This is a goodness-of-fit measure.
• Spatial map. Perceived relationships among brands or other stimuli are represented as geometric relationships among points in a multidimensional space called a spatial map.
• Coordinates. Coordinates indicate the positioning of a brand or a stimulus in a spatial map.
• Unfolding. The representation of both brands and respondents as points in the same space, by using internal or external analysis, is referred to as unfolding.
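The stress and R-square definitions above can be made concrete with a small sketch. The slides do not give a formula for stress, so the version below assumes Kruskal's Stress-1, one common choice; the function names and the sample numbers are illustrative only.

```python
import math

def stress1(disparities, distances):
    """Kruskal's Stress-1 (assumed definition): sqrt(sum (d - dhat)^2 / sum d^2).

    disparities: optimally scaled data (dhat), one value per pair of stimuli
    distances:   interpoint distances (d) in the MDS map, in the same pair order
    """
    num = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)

def r_square(disparities, distances):
    """Squared Pearson correlation between optimally scaled data and distances."""
    n = len(distances)
    mx = sum(disparities) / n
    my = sum(distances) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(disparities, distances))
    sxx = sum((x - mx) ** 2 for x in disparities)
    syy = sum((y - my) ** 2 for y in distances)
    return sxy ** 2 / (sxx * syy)

# Made-up numbers: map distances are exactly twice the disparities
bad_fit = stress1([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
good_fit = stress1([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Higher stress means a poorer fit, while a higher R-square means a better fit, so the two indices move in opposite directions.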
What is MDS?
Table of travel times by train in French cities
• The example table contains the flying mileages between 10 American cities. The cities are the "objects," and the mileages are the "proximities" (here dissimilarities, since a larger mileage means the cities are less alike).
• An MDS of these data gives the picture in Fig. 1, a map of the relative locations of these 10 cities in the United States.
• This map has 10 points, one for each of the 10 cities. Cities that are similar (have short flying mileages) are represented by points that are close together, and cities that are dissimilar (have large mileages) by points far apart.
• Multidimensional scaling (MDS) is a method that represents measurements of similarity or dissimilarity among pairs of objects as distances between points in a low-dimensional space
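As a rough illustration of how a distance table becomes a map, here is a minimal sketch of classical (Torgerson) MDS using NumPy. It is one of several MDS procedures, not necessarily the one used to produce the figures in these slides, and the four points in the toy check are made up.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical MDS: embed an n x n distance matrix D in k dimensions."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    scale = np.sqrt(np.clip(eigvals[order], 0, None))
    return eigvecs[:, order] * scale          # n x k coordinate matrix

# Toy check: distances between four made-up points on a plane
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
X = classical_mds(D, k=2)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
```

The recovered coordinates `X` may be rotated or reflected relative to `points`, but the interpoint distances `D_hat` match `D`, which is all an MDS map is expected to preserve.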
• Who uses MDS?
  - Psychology
  - Medicine
  - Sociology
  - Chemistry
  - Archaeology
  - Network analysis
  - Biology
  - Economics, etc.
• Similarities and dissimilarities
• A large similarity is approximated by a small distance in MDS
• The similarity between stimuli is inversely related to the distance between the corresponding points in the multidimensional space
• A large dissimilarity is approximated by a large distance in MDS
• General term: proximity
• The Minkowski distance metric
• The Euclidean distance metric
• The city-block distance metric
• The Euclidean distance metric is often used because of its mathematical convenience in MDS procedures
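The three metrics listed above are related: the city-block and Euclidean metrics are the Minkowski metric with exponent 1 and 2 respectively. A minimal sketch, with a function name and sample points chosen purely for illustration:

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two coordinate sequences."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

a, b = (0.0, 0.0), (3.0, 4.0)
city_block = minkowski(a, b, 1)   # |3| + |4|
euclidean = minkowski(a, b, 2)    # sqrt(3^2 + 4^2)
```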
• MDS is used when all the variables (whether metric or non-metric) in a study are to be analyzed simultaneously and all such variables happen to be independent
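• The Euclidean distance is used to model dissimilarity. That is, the distance $d_{ij}$ between points $i$ and $j$ is defined as
  $$ d_{ij} = \sqrt{\sum_{a=1}^{p} (x_{ia} - x_{ja})^{2}} $$
• where $x_{ia}$ specifies the position (coordinate) of point $i$ on dimension $a$, and $p$ is the number of dimensions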
Analysis of a face similarity judgment task
• Similarity ratings for 10 faces are analyzed to reveal some of the perceptual dimensions that subjects might have used when generating similarity judgments for these faces
• The two dimensional scaling solution is shown for the 10 faces.
• After visual inspection, the configuration can be interpreted as the perceptual dimensions of age and adiposity.
• A priori knowledge - Theory or past research may suggest a particular number of dimensions.
• Interpretability of the spatial map - Generally, it is difficult to interpret configurations or maps derived in more than three dimensions.
• Elbow criterion - A plot of stress versus dimensionality should be examined.
• Ease of use - It is generally easier to work with two-dimensional maps or configurations than with those involving more dimensions.
• Statistical approaches - For the sophisticated user, statistical approaches are also available for determining the dimensionality.
Conducting Multidimensional Scaling: Decide on the Number of Dimensions
Conducting Multidimensional Scaling (Fig. 21.1)
Formulate the Problem
Obtain Input Data
Decide on the Number of Dimensions
Select an MDS Procedure
Label the Dimensions and Interpret the Configuration
Assess Reliability and Validity
Conducting Multidimensional Scaling: Formulate the Problem
• Specify the purpose for which the MDS results would be used.
• Select the brands or other stimuli to be included in the analysis. The number of brands or stimuli selected normally varies between 8 and 25.
• The choice of the number and specific brands or stimuli to be included should be based on the statement of the marketing research problem, theory, and the judgment of the researcher.
Obtain Input Data for MDS
i. Obtaining Input Data
a. Perception Data: Direct Approaches
b. Perception Data: Derived Approaches
c. Direct Vs. Derived Approaches
d. Preference Data
Approaches to Create Perceptual Maps
• Attribute-based approaches
• Non-attribute-based approaches
• Attribute-Based Approaches:
• If MDS is used on attribute data, it is known as attribute-based MDS
• Assumption
 – The attributes on which the individuals' perceptions of objects are based can be identified
• Methods Used to Reduce the Attributes to a Small Number of Dimensions
 – Factor Analysis
 – Discriminant Analysis
• MDS attempts to construct a space containing m points such that the m(m − 1)/2 interpoint distances reflect the input data
• The metric (quantitative) approach to MDS treats the input data as interval-scale data and solves, by statistical methods, for the additive constant that minimizes the dimensionality of the solution space
• The non-metric (qualitative) approach first gathers the non-metric similarities by asking respondents to rank order all possible pairs that can be obtained from a set of objects. Such non-metric data are then transformed into some arbitrary metric space, and the solution is obtained by reducing the dimensionality
• After this sort of mapping is performed, the dimensions are usually interpreted and labelled by the researcher
[Fig. 21.2: Input Data for Multidimensional Scaling. MDS input data are either perceptions or preferences; perception data are obtained by direct approaches (similarity judgments) or derived approaches (attribute ratings).]
• Perception Data: Direct Approaches. In direct approaches to gathering perception data, the respondents are asked to judge how similar or dissimilar the various brands or stimuli are, using their own criteria. These data are referred to as similarity judgments.
                          Very                          Very
                          Dissimilar                    Similar
Crest vs. Colgate         1    2    3    4    5    6    7
Aqua-Fresh vs. Crest      1    2    3    4    5    6    7
Crest vs. Aim             1    2    3    4    5    6    7
...
Colgate vs. Aqua-Fresh    1    2    3    4    5    6    7
• The number of pairs to be evaluated is n (n -1)/2, where n is the number of stimuli.
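The pair count n(n − 1)/2 can be checked by enumerating the pairs directly. The sketch below uses the ten toothpaste brands from Table 21.1; the variable names are illustrative.

```python
from itertools import combinations

brands = ["Aqua-Fresh", "Crest", "Colgate", "Aim", "Gleem",
          "Macleans", "Ultra Brite", "Close-Up", "Pepsodent", "Dentagard"]

# Every unordered pair of brands a respondent would have to rate
pairs = list(combinations(brands, 2))
n = len(brands)
expected = n * (n - 1) // 2
```

With 10 stimuli each respondent rates 45 pairs, which is one reason the number of stimuli is usually kept modest.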
Conducting Multidimensional Scaling: Obtain Input Data
Table 21.1: Similarity Ratings of Toothpaste Brands

              Aqua-Fresh  Crest  Colgate  Aim  Gleem  Macleans  Ultra Brite  Close-Up  Pepsodent  Dentagard
Aqua-Fresh
Crest              5
Colgate            6        7
Aim                4        6       6
Gleem              2        3       4      5
Macleans           3        3       4      4     5
Ultra Brite        2        2       2      3     5       5
Close-Up           2        2       2      2     6       5          6
Pepsodent          2        2       2      2     6       6          7           6
Dentagard          1        2       4      2     4       3          3           4          3
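Most MDS programs expect a full symmetric matrix, often of dissimilarities rather than similarities. The sketch below expands the lower triangle of Table 21.1 into such a matrix. Reversing the 1-7 scale as 8 minus the rating is an assumption made for illustration; the slides do not say how the ratings were transformed.

```python
brands = ["Aqua-Fresh", "Crest", "Colgate", "Aim", "Gleem",
          "Macleans", "Ultra Brite", "Close-Up", "Pepsodent", "Dentagard"]

# Lower triangle of Table 21.1 (similarity ratings, 1 = very dissimilar, 7 = very similar)
lower = [
    [],
    [5],
    [6, 7],
    [4, 6, 6],
    [2, 3, 4, 5],
    [3, 3, 4, 4, 5],
    [2, 2, 2, 3, 5, 5],
    [2, 2, 2, 2, 6, 5, 6],
    [2, 2, 2, 2, 6, 6, 7, 6],
    [1, 2, 4, 2, 4, 3, 3, 4, 3],
]

n = len(brands)
dissim = [[0] * n for _ in range(n)]
for i, row in enumerate(lower):
    for j, s in enumerate(row):
        # Assumed conversion: reverse the 7-point scale so larger means less alike
        dissim[i][j] = dissim[j][i] = 8 - s
```

The diagonal stays at zero, since each brand is treated as identical to itself.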
• Perception Data: Derived Approaches. Derived approaches to collecting perception data are attribute-based approaches requiring the respondents to rate the brands or stimuli on the identified attributes using semantic differential or Likert scales.
Whitens teeth          ___ ___ ___ ___ ___ ___ ___ ___ ___ ___  Does not whiten teeth
Prevents tooth decay   ___ ___ ___ ___ ___ ___ ___ ___ ___ ___  Does not prevent tooth decay
.
.
.
Pleasant tasting       ___ ___ ___ ___ ___ ___ ___ ___ ___ ___  Unpleasant tasting
• If attribute ratings are obtained, a similarity measure (such as Euclidean distance) is derived for each pair of brands.
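Deriving a distance for each pair of brands from attribute ratings might look like the sketch below. The brand labels and the ratings are hypothetical, invented only to show the calculation.

```python
import math

# Hypothetical ratings of three brands on three attributes (not from the slides)
ratings = {
    "Brand A": [7, 6, 5],
    "Brand B": [4, 2, 5],
    "Brand C": [7, 6, 2],
}

def euclidean(u, v):
    """Euclidean distance between two rating profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

names = list(ratings)
derived = {
    (p, q): euclidean(ratings[p], ratings[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
```

The resulting pairwise distances play the same role as direct dissimilarity judgments and can be passed to an MDS procedure.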
Conducting Multidimensional Scaling: Obtain Input Data
The direct approach has the following advantages and disadvantages:
• The researcher does not have to identify a set of salient attributes.
• The disadvantages are that the criteria are influenced by the brands or stimuli being evaluated.
• Furthermore, it may be difficult to label the dimensions of the spatial map.
Conducting Multidimensional Scaling: Obtain Input Data – Direct vs. Derived Approaches
The attribute-based approach has the following advantages and disadvantages:
• It is easy to identify respondents with homogeneous perceptions.
• The respondents can be clustered based on the attribute ratings.
• It is also easier to label the dimensions.
• A disadvantage is that the researcher must identify all the salient attributes, a difficult task.
• The spatial map obtained depends upon the attributes identified.
It may be best to use both these approaches in a complementary way. Direct similarity judgments may be used for obtaining the spatial map, and attribute ratings may be used as an aid to interpreting the dimensions of the perceptual map.
• Preference data order the brands or stimuli in terms of respondents' preference for some property.
• A common way in which such data are obtained is through preference rankings.
• Alternatively, respondents may be required to make paired comparisons and indicate which brand in a pair they prefer.
• Another method is to obtain preference ratings for the various brands.
• The configuration derived from preference data may differ greatly from that obtained from similarity data. Two brands may be perceived as different in a similarity map yet similar in a preference map, and vice versa.
Conducting Multidimensional Scaling: Preference Data
Decide the Number of Dimensions: Scree Test (Elbow Test)
• An important issue in MDS is choosing the number of dimensions for the scaling solution.
• A configuration with a high number of dimensions achieves very low stress values but cannot easily be comprehended by the human eye, and is apt to be determined more by noise than by the essential structure in the data.
• On the other hand, a solution with too few dimensions might not reveal enough of the structure in the data
• Stress (or other lack of fit measure) is plotted against the dimensionality.
• Stress often decreases smoothly with increasing dimensionality, which makes the choice of an appropriate dimensionality very difficult with this method
• In Figure 1b, the filled circles show the scree plot for the face similarity dataset
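One rough way to locate an elbow is to look for the dimensionality where the stress curve bends most sharply. The sketch below measures the bend with a second difference; this heuristic and the stress values are illustrative assumptions, not part of the original analysis.

```python
def elbow(stress_by_dim):
    """Return the dimensionality where the stress curve bends most sharply.

    stress_by_dim maps number of dimensions -> stress for consecutive
    dimensionalities; the bend at k is s[k-1] - 2*s[k] + s[k+1].
    """
    dims = sorted(stress_by_dim)
    best_dim, best_bend = None, float("-inf")
    for prev, cur, nxt in zip(dims, dims[1:], dims[2:]):
        bend = stress_by_dim[prev] - 2 * stress_by_dim[cur] + stress_by_dim[nxt]
        if bend > best_bend:
            best_dim, best_bend = cur, bend
    return best_dim

# Made-up stress values with a clear drop between one and two dimensions
illustrative = {1: 0.30, 2: 0.12, 3: 0.09, 4: 0.07, 5: 0.06}
chosen = elbow(illustrative)
```

When stress falls smoothly, all the bends are of similar size and the heuristic gives no clear answer, so interpretability and prior knowledge should carry more weight.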
[Fig. 21.3: Plot of Stress Versus Dimensionality. x-axis: Number of Dimensions (0 to 5); y-axis: Stress (0.0 to 0.3)]
• Even if direct similarity judgments are obtained, ratings of the brands on researcher-supplied attributes may still be collected. Using statistical methods such as regression, these attribute vectors may be fitted in the spatial map.
• After providing direct similarity or preference data, the respondents may be asked to indicate the criteria they used in making their evaluations.
• If possible, the respondents can be shown their spatial maps and asked to label the dimensions by inspecting the configurations.
• If objective characteristics of the brands are available (e.g., horsepower or miles per gallon for automobiles), these could be used as an aid in interpreting the subjective dimensions of the spatial maps.
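Fitting an attribute vector into a spatial map by regression can be sketched as follows, using NumPy least squares. The coordinates and ratings are made up so that the attribute varies only along the first dimension; the function name is illustrative.

```python
import numpy as np

def fit_attribute_vector(coords, ratings):
    """Regress attribute ratings on map coordinates; return direction and R^2."""
    X = np.column_stack([np.ones(len(coords)), coords])   # intercept + dimensions
    beta, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    fitted = X @ beta
    ss_res = np.sum((ratings - fitted) ** 2)
    ss_tot = np.sum((ratings - np.mean(ratings)) ** 2)
    direction = beta[1:] / np.linalg.norm(beta[1:])       # unit vector in the map
    return direction, 1.0 - ss_res / ss_tot

# Made-up map coordinates for four stimuli and their ratings on one attribute
coords = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
ratings = np.array([5.0, 1.0, 3.0, 3.0])
direction, r2 = fit_attribute_vector(coords, ratings)
```

The direction shows where the attribute points in the map, and a high R-square suggests the attribute is a reasonable label for the dimension it aligns with.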
Conducting Multidimensional Scaling: Label the Dimensions and Interpret the Configuration
[Fig. 21.4: A Spatial Map of Toothpaste Brands. Two-dimensional map, both axes from -2.0 to 2.0, showing Aqua-Fresh, Crest, Colgate, Aim, Gleem, Macleans, Ultrabrite, Close Up, Pepsodent, and Dentagard]
[Fig. 21.5: Using Attribute Vectors to Label Dimensions. The spatial map of the ten toothpaste brands with the attribute vectors Fights Cavities, Whitens Teeth, and Cleans Stains added]
• The index of fit, or R-square is a squared correlation index that indicates the proportion of variance of the optimally scaled data that can be accounted for by the MDS procedure. Values of 0.60 or better are considered acceptable.
• Stress values are also indicative of the quality of MDS solutions. While R-square is a measure of goodness-of-fit, stress measures badness-of-fit, or the proportion of variance of the optimally scaled data that is not accounted for by the MDS model. Stress values of less than 10% are considered acceptable.
• If an aggregate-level analysis has been done, the original data should be split into two or more parts. MDS analysis should be conducted separately on each part and the results compared.
Conducting Multidimensional Scaling: Assess Reliability and Validity
• Stimuli can be selectively eliminated from the input data and the solutions determined for the remaining stimuli.
• A random error term could be added to the input data. The resulting data are subjected to MDS analysis and the solutions compared.
• The input data could be collected at two different points in time and the test-retest reliability determined.
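Comparing two solutions needs a measure that ignores rotation and reflection of the map. One simple option, sketched below as an assumption rather than the method used in the slides, is to correlate the interpoint distances of the two configurations. The coordinates are made up.

```python
import math
from itertools import combinations

def interpoint_distances(config):
    """All pairwise Euclidean distances within one configuration."""
    return [math.dist(p, q) for p, q in combinations(config, 2)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def stability(config_a, config_b):
    """Correlation of interpoint distances (same stimuli, same order in both)."""
    return pearson(interpoint_distances(config_a), interpoint_distances(config_b))

original = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
rotated = [(-y, x) for x, y in original]   # the same map turned by 90 degrees
agreement = stability(original, rotated)
```

A value near 1 indicates that the two solutions tell the same story; values well below 1 after deleting a stimulus or adding random error suggest an unstable configuration.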
[Fig. 21.6: Assessment of Stability by Deleting One Brand. Two-dimensional map, both axes from -2.0 to 2.0, showing Aqua-Fresh, Crest, Colgate, Aim, Gleem, Macleans, Ultrabrite, Close Up, and Pepsodent]
[Fig. 21.7: External Analysis of Preference Data. The spatial map of the ten toothpaste brands with an Ideal Point added]
MDS Applications
• Exploratory data analysis: objects are placed as points in a low-dimensional space
• The observed complexity in the original data matrix can often be reduced while preserving the essential information in the data
• Consumers generally prefer a particular brand of a product not on the basis of one attribute but on a number of attributes. The need for multidimensional scaling arises to understand such situations
• It helps in the identification of attributes on the basis of which consumers perceive or evaluate products or brands.
• It enables the positioning of different products or brands on the basis of these attributes.
• It helps to generate a perceptual map indicating location of the brands on the basis of attributes
Assumptions and Limitations of MDS
• It is assumed that the similarity of stimulus A to B is the same as the similarity of stimulus B to A.
• MDS assumes that the distance (similarity) between two stimuli is some function of their partial similarities on each of several perceptual dimensions.
• When a spatial map is obtained, it is assumed that interpoint distances are ratio scaled and that the axes of the map are multidimensional interval scaled.
• A limitation of MDS is that dimension interpretation, relating physical changes in brands or stimuli to changes in the perceptual map, is difficult at best
• MDS is not widely used because of the computational complexity involved
• Many of its methods are quite laborious in terms of both the collection of data and the subsequent analyses
• A political party comparison website for the Dutch parliamentary elections of 2010 asks users to rate 30 political statements, e.g.
• 1. The government needs to cut the budget by billions. The budget deficit should at the latest in 2015
• Agree / Don't know / Disagree
• 2. Those with high income should pay more taxes
• Agree / Don't know / Disagree
• 11 political parties also rated these 30 items
• What is the political landscape in the Dutch elections of 2010?
• Do IMDS on the distances between the 11 parties in 30-dimensional space
• Data on the spatial map: reducing the data about the 11 parties to two dimensions
Thank you