17
Journal of Retailing and Consumer Services 13 (2006) 231–247 Clustering supermarkets: the role of experts Armando B. Mendes a, , Margarida G.M.S. Cardoso b a Mathematical Department, Azores University, R. da Ma˜e de Deus, 9501-801 Ponta Delgada, Portugal b Department of Quantitative Methods, Business School ISCTE, Av. das Forc - as Armadas, 1649-026 Lisboa, Portugal Abstract This work is part of a supermarket chain expansion study and is intended to cluster the existent outlets in order to support the evaluation of outlet performance and new outlet site location. To overcome the curse of dimensionality (a large number of attributes for a very small number of existing outlets) experts’ knowledge is considered in the clustering process. Three alternative approaches are compared for this end, the experts being required to: (1) a priori: provide values for perceived dissimilarities between pairs of outlets; (2) a posteriori: evaluate results from alternative regression trees; (3) interactively: help to select base variables and evaluate results from alternative dendrograms. The later approach provided the best results according to the marketing experts. r 2004 Elsevier Ltd. All rights reserved. Keywords: Clustering; External validation; Experts knowledge integration A supplementary exercise in cluster description involves the investigation of the clusters in order to establish whether or not they can be given substantive interpretations (y). Such substantive descriptions do not make direct use of data, but require investigators to reflect on the results of classification studies. Gordon (1999) 1. Introduction As in Europe, the retail sector in Portugal is going through a restructuring phase. Several authors (e.g. Birkin et al., 2002; Dawson, 2000; Seth and Randall, 1999) identify such factors as increasing consumer mobility, increasing electronic commerce, changing household size, concentration of market power, home market saturation, and changes in planning legislation to justify the new trends in retailing. In the food retail, in particular, after an unprecedented period of hypermar- kets growth, since the late 1970s, both in number and market share, it is now clear that hypermarket activity has slowed down significantly on behalf of the small or medium supermarkets (chain outlets including discount and hard discount chains) that nowadays present a larger dynamism. In Portugal, market share data shows that since 1996 the supermarkets were the only ones to grow simulta- neously in the number of outlets and in the volume of sales and, consequently, to increase the market share from 28% to 34% in the A.C. Nielsen universe. In 1997, the supermarkets reached the leadership and consoli- dated its expansion strategy. According to the most recent data, in 2001, supermarkets’ sales were already broadly superior to the sales in hypermarkets: 47% against just 35% of the total sales of outlets with alimentary products (see Fig. 1). This change in food outlet type is also found in other European countries (Birkin et al., 2002). Much more stringent legislation and the fact that consumers are more demanding, force the retail groups to invest in outlets of smaller dimension, and so in a proximity and quality of goods and services strategy. This investment has a longer run return as well as smaller economies of ARTICLE IN PRESS www.elsevier.com/locate/jretconser 0969-6989/$ - see front matter r 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.jretconser.2004.11.005 Corresponding author.

Clustering Supermarkets the Role of Experts 2006 Journal of Retailing and Consumer Services

Embed Size (px)

DESCRIPTION

Clustering Supermarkets the Role of Experts 2006 Journal of Retailing and Consumer Services

Citation preview

  • nsum

    et

    ar

    . da M

    ISCT

    Abstract

    T d is

    eva o ove

    for is con

    are prior

    ou ion t

    res ided

    r

    Keywords: Clustering; External validation; Experts knowledge integration

    1. Introduction

    to justify the new trends in retailing. In the food retail, inparticular, after an unprecedented period of hypermar-kets growth, since the late 1970s, both in number and

    from 28% to 34% in the A.C. Nielsen universe. In 1997,the supermarkets reached the leadership and consoli-

    more demanding, force the retail groups to invest inoutlets of smaller dimension, and so in a proximity and

    ARTICLE IN PRESSquality of goods and services strategy. This investmenthas a longer run return as well as smaller economies of

    0969-6989/$ - see front matter r 2004 Elsevier Ltd. All rights reserved.

    doi:10.1016/j.jretconser.2004.11.005

    Corresponding author.As in Europe, the retail sector in Portugal is goingthrough a restructuring phase. Several authors (e.g.Birkin et al., 2002; Dawson, 2000; Seth and Randall,1999) identify such factors as increasing consumermobility, increasing electronic commerce, changinghousehold size, concentration of market power, homemarket saturation, and changes in planning legislation

    dated its expansion strategy. According to the mostrecent data, in 2001, supermarkets sales were alreadybroadly superior to the sales in hypermarkets: 47%against just 35% of the total sales of outlets withalimentary products (see Fig. 1).This change in food outlet type is also found in other

    European countries (Birkin et al., 2002). Much morestringent legislation and the fact that consumers areA supplementary exercise in cluster descriptioninvolves the investigation of the clusters in order toestablish whether or not they can be given substantiveinterpretations (y). Such substantive descriptions donot make direct use of data, but require investigatorsto reect on the results of classication studies.Gordon (1999)

    market share, it is now clear that hypermarket activityhas slowed down signicantly on behalf of the small ormedium supermarkets (chain outlets including discountand hard discount chains) that nowadays present alarger dynamism.In Portugal, market share data shows that since 1996

    the supermarkets were the only ones to grow simulta-neously in the number of outlets and in the volume ofsales and, consequently, to increase the market sharehis work is part of a supermarket chain expansion study an

    luation of outlet performance and new outlet site location. T

    a very small number of existing outlets) experts knowledge

    compared for this end, the experts being required to: (1) a

    tlets; (2) a posteriori: evaluate results from alternative regress

    ults from alternative dendrograms. The later approach prov

    2004 Elsevier Ltd. All rights reserved.Journal of Retailing and Co

    Clustering supermark

    Armando B. Mendesa,, MaMathematical Department, Azores University, R

    bDepartment of Quantitative Methods, Business Schooler Services 13 (2006) 231247

    s: the role of experts

    garida G.M.S. Cardosob

    ae de Deus, 9501-801 Ponta Delgada, Portugal

    E, Av. das Forc-as Armadas, 1649-026 Lisboa, Portugal

    intended to cluster the existent outlets in order to support the

    rcome the curse of dimensionality (a large number of attributes

    sidered in the clustering process. Three alternative approaches

    i: provide values for perceived dissimilarities between pairs of

    rees; (3) interactively: help to select base variables and evaluate

    the best results according to the marketing experts.

    www.elsevier.com/locate/jretconser

  • scale, which forces careful decision-making (McGol-drick, 2000; Salvaneschi, 1996). Because smaller outletsare heterogeneous in aspects as location, dimension, andclient behaviour, the denition of outlet clusters isessential in outlet performance and site evaluation.In Sections 24 the next sections, an empirical

    classication of variables used in measuring super-markets performance and the role of expert knowledgein clustering and marketing applications is presented.The data collection phase is described and threeapproaches for experts knowledge integration in super-market cluster are explained. In Section 5 theseapproaches are compared and a cluster proling ispresented. This paper nishes with conclusions and amethodological discussion.

    2. Clustering supermarkets and the role of experts

    This work is part of an expansion study of asupermarket chain with small and medium dimensionoutlets and is intended to cluster the existent outlets.The classication is not just useful to evaluate therelative performance of different locations and outletmanagement, but also to use in analogy forecastmethods for the identication of potential site locations(Mendes and Themido, 2004). For that purpose, severalperformance measures and other attributes were col-lected in a framework dened in this section. Foraddressing the high dimensionality of the data, theintegration of expert knowledge in the clustering ofsupermarkets is suggested.

    2.1. Measuring supermarkets performance

    The retailers soon realized the importance of outletlocation, but understanding all the aspects of outletperformance, site locations, and the consumers beha-viour, forces to collect enormous amounts of informa-

    ARTICLE IN PRESS

    Hypermarkets

    Supermarkets (>'s)

    Supermarkets (

  • inaconuses

    kuofex

    doprga

    ARTICLE IN PRESSRetailscale development (Hardestya and Bearden, 2004),and in expert systems and automatic methods(Turban et al., 2004; Guijarro-Berdinas and Alonso-ranBeassessed (Moutinho et al., 2000; Wedel and Kama-ra, 2000; Jain and Dubes, 1988). In this work, the useexpert knowledge is suggested for non-quantitativeternal validation.The use of expert knowledge, sometimes namedmain knowledge, in order to evaluate the quality ofoblems solutions, is generally applied and investi-ted in diverse areas: marketing applications (Ow-g, 2000; Pasa, 1996; Moutinho and Brownlie, 1994),bed comparisons with other structures. In fact, cluster-g validation is of major importance in order tocomplish the study objectives (Gordon, 1999). Whenly a small number of observations are available, thee of experts based external validation becomessential, since quantitative internal reliability cannotanand in an extensive literature review. The variables aredivided in three groups:

    The location and outlet attributes that are intended toevaluate aspects only outlet and site dependent as theoutlet characteristics, the accessibilities, and the imageof the chain or the services range offered. Among theoutlet characteristics, the commercial or sales area isthe factor of major importance, which is emphasizedin Fig. 2 by an independent branch (Themido et al.,1998; Salvaneschi, 1996).

    The outlets influence area attributes that are relatedwith the evaluation of the trade area (or catchmentarea) generated by the outlet, which is essential inpotential sales forecasting. These attributes aremainly demographic variables but also refer to theimpact of the existent competition.

    Clients characteristics that refer to their preferences,attitudes, behaviour, socioeconomic prole and geo-graphic location are, nally, relevant in the evaluationof outlet performance.

    Few works attempted to classify outlet and siteevaluation variables. One good exception is the workpresented by Clarke et al. (2003b), which is largelycoherent with Fig. 2. These authors used cognitivemaps, based on answers of location experts from thelargest retail chains in United Kingdom, to identify themain variables used in location decisions. This workconrms not only the suggested framework of thevariables but also the high volume of data required inthe outlet/site evaluation studies.

    2.2. The role of experts

    Clustering methods always impose a structure to thedata, which must be validated, using specic measures

    A.B. Mendes, M.G.M.S. Cardoso / Journal oftanzos, 2002; Adelman and Riedel, 1997). Also invisual cluster validation, there is some kind of expert,or at least, user-assisted validation and interpretation(Hathaway and Bezdek, 2003; Hennig and Christlieb,2002; Jones, 1996).In what concerns marketing applications there is a

    long tradition in expert knowledge integration. How-ever, this integration is seldom formalized, opinionsbeeing integrated in the results in a non-explicited, andoften not reported, way. Exceptions can, however befound in some areas of application: Hardestya andBearden (2004) in consumer-related scale developmentresearch, Sanders and Ritzman (2004) in forecastingintegrating judgments, Owrang (2000) in discoveringnew knowledge in large volumes of data, Pasa (1996) inthe use of marketing expertise in marketing decisions,and Clarke et al. (2003b) in analogue outlet identica-tion and site assessment.Owrang (2000), for instance, uses domain or back-

    ground knowledge, in guiding and restricting the searchfor interesting knowledge. Several mechanisms aresuggested and compared for this end. On the otherhand, Pasa (1996) constructs a theoretical model thataids in evaluating marketing expertise and observe howdifferent market conditions inuence whether compa-nies emphasize marketing expertise in marketing deci-sions. He concludes that greater market instability andbigger market presence increase marketing expertisevalue.Still in the marketing literature, Clarke et al. (2003b)

    use soft systems methodologies, integrating qualita-tive expert-based intuition with a structuring mechanismfor analogue outlet identication and site assessment. Inthis work, authors recommend qualitative expert-basedor soft approaches as a complement to hard modellingsystems and they attempt to do this by isolating thedifferences between a given new site and the closestanalogue store outlet, based on multiple dimensions. Inprevious papers, the theoretical basis for managementjudgement and intuition were underlined (Clarke andMackaness, 2001); a method for synthesizing expertjudgements was outlined (Clarke et al., 2000); and theuse of a system in live decision-making environments topromote argumentation within the decision-makinggroup was described (Clarke et al., 2003a).In the pattern recognition literature contributions like

    Liu and Samal (2002) and Pedrycz (2004) may bereferred. Liu and Samal (2002) advocate that, bydenition, the clusters represent same abstract conceptthat is clearly domain dependent. Pedrycz (2004)mention the benecial aspect of incorporating domainknowledge in the fuzzy clustering mechanisms. Forjustifying the use of this knowledge, he suggests that anumber of essential features may not be available orcould not be easily quantied.Another pattern recognition example is the utilization

    ing and Consumer Services 13 (2006) 231247 233of a panel of experts to interpret classication rules,

  • ARTICLE IN PRESSRetail3. Data collection

    A large number of variables were collected in order toaccount for the diversity of attributes that may inuenceoutlets performance evaluation. This diversity of baseclustering data is considered essential (see for exampleWedel and Kamakura, 2000). Of all data collectionprocedures, explained in the next sections, a total of 250variables were obtained, measured in all kind of scales,and covering all the aspects in the suggested variableframework (Fig. 2).

    3.1. In shop surveys

    Shop surveys were held in two different years duringthe study. The rst took place in 2001 and wasaccomplished in all the existent supermarket outlets, inall days of two successive weeks, totalizing 3766 validquestionnaires. The second was conducted in 2003, in aselected group of outlets and in selected days of theweek, in a total of 2394 valid questionnaires.The questions included the clients opinions regarding

    the conguration of the outlet, accessibilities and siteconguration. They provided customers demographicand socioeconomic characterization, attitudinal andbehavioural attributes (motivation, means of transpor-tation, choices and preferences) and the identication ofpresented by Bay and Pazzani (2000). Their workconcludes that many of the generated rules are uselessor redundant and although it points for the subjectivityof the experts interpretations, it conrms the need forthis type of knowledge. In spite of that, few works havebeen presented considering the explicit integration ofexpert knowledge in feature selection and externalvalidation for the clustering analysis (see Jain et al.,1999, for a complete survey).Several authors, as Liu and Samal (2002) and Halkidi

    et al. (2001), propose the use of external validationindexes for measuring the degree of agreement betweenexpert delineated clusters and the ones obtained from amathematical method.In the present application, expert knowledge is

    suggested for non-quantitative external validation of asupermarkets clustering structure. The experts aremarketing analysts specialized in food retail outletlocation, working with the supermarket chain since itsorigin and being responsible for all location andperformance studies. This group of experts could notagree in a cluster structure for the supermarket outlets.Clustering supermarkets was considered a difcult andsubjective task. Therefore, alternative approaches toexpert knowledge integration are adopted, which are notbased in expert delineated clusters.

    A.B. Mendes, M.G.M.S. Cardoso / Journal of234the competition.Quantitative variables as the percentage of customersthat come from home, the average monthly expenses in theoutlet or the average value of purchase were madeavailable through the survey. When no signicantdifferences were observed between the average valuesreferring to 2001 and 2003, average values yielding fromthe two surveys were used. In the rare cases were pairedby outlet sample t statistic where lower than 5% (e.g. forthe mensal expenses in food), the most recent value wasused.Client segmentation with data from both surveys was

    performed resulting in two segments. The segments werecharacterized and termed as preferential costumers andeventual costumers (Cardoso and Mendes, 2002). Inconsequence, the percentage of preferential costumerswas included as a new variable in the study.

    3.2. The mystery shopping programme

    A mystery shopping programme (e.g. Wilson, 2001),was accomplished with a visit to the outlet of anincognito analyst that observed visible aspects, did abuy, and evaluated several aspects in ordinal scales.These in loco observations were performed in all thechain outlets and in some of the most importantcompeting shops.They used a check list with several location attributes,

    outlet characteristics, accessibilities, outlet visibility, andsome related with competition and characterization ofthe influence area. The variables are mainly nominal butsome are evaluations of some aspects of the outlet andservice in an ordinal scale of nine points.Coordinated with the mystery shopping programme

    the outlets GPS coordinates were also collected alongwith their nearby competitors. This GPS coordinatesand the mystery shopping data were loaded in aGeographical Information System and used to deneinuence areas, and to calculate variables used in theoutlet characterization and clustering.

    3.3. Census, geographic and competition data

    A large number of quantitative variables are availablefrom the national geographical base of census 2001 data.This is high-quality demographic data, accessible inseveral disaggregation degrees, and ready to use in aGeographical Information System. To include this data,inuence areas must be dened along with criteria forgeospatial intersection between these areas and thedemographic areas.

    Influence, trade or catchment areas can be dened asan area around the outlet from where it is likely to drawclients. Several methods have been suggested for itsdelimitation (e.g. Boots, 2002; Birkin et al., 2002;McMullin, 2000), in the present case, shortest paths

    ing and Consumer Services 13 (2006) 231247polygons and multiplicative weighted Voronoi diagrams

  • 4.1

    ARTICLE IN PRESS

    oi di

    Retailwere applied (Fig. 3). The latter method allows,simultaneously, incorporating the outlet attractivityand the competition in the outlet proximities (Bootsand South, 1997). A data base with the location of morethan 600 food outlets in Portugal was necessary for themethod implementation.Using space interaction procedures, percentage values

    and densities were calculated for all existing outlets. Bythe end of this process, and in spite of having made acareful selection of the census data, it resulted in morethan half a thousand variables. To reduce this numberthe Pearson correlation coefcient matrix were calcu-lated and the variables with more very strong correla-tions were deleted.Fig. 3. Shortest path polygons (left) and multiplicative weighted Voron

    area and demographic polygons shaded by population density).4.

    secsmfoTowe

    1.

    2.A.B. Mendes, M.G.M.S. Cardoso / Journal ofMethodological approach

    In spite of the data abundance following from lasttion, the number of outlet supermarkets was veryall. This fact hindered the process of variable choicer the outlet clustering and respective characterization.overcome this difculty, three different proceduresre considered for experts knowledge integration:

    In a priori integration approach, the experts wererequired to compare pairs of outlets and evaluatetheir dissimilarities in a perceptual scale. The dissim-ilarities matrix was then directly used in the Wardshierarchical clustering procedure. Finally, the selec-tion of clusters proling variables relied on regres-sions over MDS dimensions.In a posteriori approach, the integration wasaccomplished by evaluating alternative clusteringstructures derived from a supervized learning proce-dure using regression trees. In this approach the base

    opisowhacve

    sositThobinvexususdeotavwe

    fupr. A priori experts knowledge integration

    In this approach the integration of the expertsvariables choice relied on the regression tree proce-dure although experts required diverse regression treeparameterizations and target variables.

    3. In the last procedure an interactive process wasadopted: the experts knowledge was considered insuccessive stages regarding the choice of clusteringvariables and the evaluation of clustering results. Thisprocess is termed interactive integration. WardsHierarchical procedure was based in alternative setsof base variables in order to provide betterclustering, as judged by experts.

    agrams (right) examples. (Point radius proportional to the outlet sales

    ing and Consumer Services 13 (2006) 231247 235inion was made by means of outlet paired compar-ns. The experts were requested to ll a questionnaireere pairs of outlets were compared and evaluatedcording to a scale of ordinal dissimilarities (from 1 ry similar outlets to 9 distinct outlets).The comparison was meant to be generic, althoughme aspects as location, management performance ande as well as clients characterization were emphasized.e resulting symmetrical dissimilarity matrix wastained by consensus among the several expertsolved. This procedure is termed a priori as theperts opinion only regards the dissimilarities matrixed as clustering base. Clusters where then obtaineding the hierarchical Wards method resulting in thendrogram presented in Fig. 4. It should be noted thather hierarchical methods as centroid and grouperage linkage were tried and similar dendrogramsre obtained.From the observation of the fusion index values andsion index variations versus number of clusters chart,esented in the same gure, six clusters were adopted.

  • ARTICLE IN PRESS

    tly applied to the dissimilarity matrix and fusion indexes chart.

    RetailingMore complex stopping rules for determining thenumber of clusters were considered (see very good textsin Everitt et al., 2001; Gordon, 1999). Although

    Fig. 4. Dendrogram of the Ward hierarchical method direc

    A.B. Mendes, M.G.M.S. Cardoso / Journal of236conicting results were obtained, in general, the sixclusters cut was supported.In order to interpret the obtained clusters some

    further exploration of the dissimilarity data wasrequired. MDSMultidimensional Scaling non-metricanalysis, using the ALSCAL algorithm by Takane,Young and Leeuw (Cox and Cox, 2000) was performed.A solution with four dimensions was found whichaccounts for an RSQ of 96% (Kruskal stress value of7.8%). Fig. 5 illustrates the positioning of chain outletsclusters in the extracted MDS dimensions, along withlabels based in the clusters characterization.Regression procedures using several hundreds of

    target variables were performed considering the MDSdimensions as explanatory variables. Results enabledthe identication of variables depicted in Fig. 6.From this gure, dimension 1 is related to outlet

    dimension and car parking facilities and inverselyrelated with outlet visibility and sales per outlet area.Dimension 2 is related to influence area dimension andpercentage of households with children or working inprimary or secondary sectors.MDS dimension 3 is related with the number of elder

    residents and preferential costumers in the inuencearea, and dimension 4 is associated with the percentageof occasional clients and complex trip (passage) clients.These results helped to support the clustersprFi

    Fig

    proand Consumer Services 13 (2006) 231247oling, which may be summarized as follows (seegs. 5 and 6):

    Small outlets: A very homogenous cluster of eightoutlets characterized by very negative values indimension 1 and 3. According to the characterization

    . 5. Outlets in the space of the MDS extracted dimensions and

    le labels.

  • ARTICLE IN PRESS

    location and outlet attributes

    influence area attributes

    clients attributes

    number of salesoperatorsworkingplacesin shop

    parkingplaces

    salesturnover for square meter of sales area

    mysteryshopping

    evaluation of walking trip

    outletvisibility

    mysteryshoppingevaluation of car parking facilities near the outlet

    0

    0.2

    0.4

    0.6

    0.8MDS dimension 1

    MDS dimension 2

    MDS dimension 3

    MDS dimension 4

    percentageof residents working in primary or secondarysector

    percentageof resident femalechildren less than 5 years old

    Influencearea by Voronoidiagramspercentage

    ofhouseholdswith children orgrandchildren less than 7 years old

    number of residentwomen with more than 65 years old

    number of residentfamilies with 1-2 persons

    0

    0.2

    0.4

    0.6

    0.8MDS dimension 1

    MDS dimension 2

    MDS dimension 3

    MDS dimension 4

    percentageof passage respondents

    preferentialrespondents

    percentageofrespondentsdeclaringoccasionalpurchases

    percentageofrespondentsliving in a 5 minuteswalkingdistance

    0

    0.2

    0.4

    0.6

    0.8MDS dimension 1

    MDS dimension 2

    MDS dimension 3

    MDS dimension 4

    Fig. 6. MDS dimensions characterization based on standardized absolute value regression coefcients (black marks represent attributes negatively

    related with MDS dimension).

    A.B. Mendes, M.G.M.S. Cardoso / Journal of Retailing and Consumer Services 13 (2006) 231247 237

  • 4.

    tiore

    19anmmCa

    coraco(seav

    am20th

    ARTICLE IN PRESSRetailSupervized learning methods rely on enormousounts of data for internal validation (Hand et al.,01; Berry and Linoff, 1997). Since, in this applicationeaux and Gayet (2001) and Chou et al. (2000).Alternative target variables were considered for treenstruction: the sales turnover for several years and thetio of sales turnover over the sales area that is a verymmon outlet performance measure in the literaturee for example Birkin et al., 2002). All the remainingailable variables were considered as predictors.chassication and Regression Trees (Breiman et al.,84). Regression trees simultaneously cluster outletsd forecast the outlet turnover based on the targetean values in the tree leafs. Recent decision treearketing applications can be found for instance inrdoso and Moutinho (2003), Cooley (2002), Mi-Cl2. A posteriori experts knowledge integration

    In the second approach, experts knowledge integra-n is made a posteriori, in the evaluation of alternativesults provided by a regression tree method: CARTdimension 3 and low-to-medium values in 4. There-fore, these are outlets with big Voronoi inuenceareas, and consequently low levels of competition,and especially low levels of preferential costumers butalso mean to low levels of passage clients.of these dimensions, these are small outlets with smallinuence areas indicating high levels of competition.

    Neighbourhood outlets: Constitute a ve outlet clusterprimarily characterized by low values in dimension 4related to percentage of passage clients and low valuesin dimension 2 related to inuence area dimension. Sothis cluster has the smallest inuence areas andpercentage of households with children and also veryfew occasional and passage clients.

    High-potential outlets: Four outlets with low-to-medium values in dimension 2 and high values indimension 3. In consequence corresponds to outletswith small-to-medium inuence areas but highpercentages of preferential costumers justifying thehigh-potential label.

    Small high-potential outlets: A three-outlet clusterwith very high values in MDS dimension 3 and verylow values in dimension 1. This is a very high-potentialcluster with many preferential costumers and percen-tages of households with children but very small salesarea.

    Big outlets: Two very big stores as the high values ofdimension 1 conrm. Both have negative values indimension 2 indicating high levels of competition andsmall inuence areas, and high values in dimension 4indicating also high percentages of passage clients.

    Low-potential outlets: Two medium size outlets withhigh values in dimension 2, very law values in

    A.B. Mendes, M.G.M.S. Cardoso / Journal of238e number of cases is very low compared to the numbershould be noted that the biggest outlets are excludedfrom this cluster as were split in the rst tree node. Andso, these are not necessarily the biggest inuence areasfor the existent outlets.The last split variable was calculated as the percentage

    of inquire respondents, which claimed to spend at least75% of the mensal expenses in food on the outlet andthe rest in a hypermarket. As we found that these loyalcostumers had residences near the outlet, this cluster waslatpocaHigh-potential outlets are characterized by all theter splits and big influence areas by the Voronoilygon methods and correspond to bigger sales. Itoutlets sales area and also to the biggest outlet annualsales.Low-potential outlets have small-to-medium dimen-sion, have small number of children in the influencearea and very low sales turnover.In Fig. 7 the best tree is presented. This tree wasevaluated by experts who agreed that all the splits madesense and the clusters in the terminal nodes were alsoreasonable. The corresponding leave-one-out estimatefor the percentage of explained variance, referred to the2002 turnover, is 86%. However, it should be noted thatbetter values were found for trees that were evaluated aspoorer by the experts.Clusters proling may be directly derived from the

    tree, which was the most appreciate characteristic of thisapproach as it greatly facilitates expert cluster valida-tion. Thus, clusters were named according to thepropositional rules associated with the correspondingleaf nodes (see Fig. 7):

    Big outlets correspond to the bigger values for thebigger sales area corresponds to a leaf with meansmaller annual turnover).Leave-one-out error estimates were presented to theexperts for tree selection decision support.of attributes, special attention must be given to themodel stability issue. In fact, several different splittingvariables may lead to similar precision index values or,alternatively, the adoption of only slightly differentbranching rules may signicantly decrease precision.Moreover, the fact that there are so few observationsturn the use of test set error estimates impossible. Inorder to overcome these limitations:

    Several trees with different parameterizations andalternative quasi-tied splitting attributes were grown.

    Only trees with meaningful decision rules wereconsidered by experts as alternative models. Inparticular, splitting variables were rejected whencounterintuitive decision rules emerged (e.g. if a

    ing and Consumer Services 13 (2006) 231247lled neighbourhood outlets and the other transit outlets.

  • ARTICLE IN PRESS

    A

    Retail

    >