Conducting the consumer interviews
Introduction
While all of us are consumers in that on more than one occasion we have purchased consumer
goods for personal or household consumption, this aspect of RUFSAT seeks to provide the
multi-stakeholder platform [MSP] with accurate information on consumer food purchasing
behaviour within the city.
The consumer questionnaire will provide evidence on:
the different sources from which consumers obtain their food [including that which
they grow themselves, purchase, receive as gifts, or that which is supplied through a
food aid agency]
where consumers buy their food [type of retail outlet]
what food they buy
how often they buy food
how much they spend
how they travel and how far they travel to various retail stores to purchase their food
why they purchase food from different retail outlets
the major concerns consumers have about the different foods they purchase and
consume
food wastage [in the home]
the composition of the household diet
the proportion of food eaten at home and away from home
food preparation in the home
food hygiene and safety
food security
and demographic information that can be utilised to identify differences in access to food,
the affordability of food and the composition of the diet across the metropolitan area.
Resources required
To conduct this aspect of the study in a timely manner, it is recommended that at least TWO
enumerators be engaged for a period of no less than three weeks [21 days]. This is based on the
knowledge that enumerators should be capable of conducting 12-15 interviews per day and that
enumerators will require 1-2 days training prior to being released into the field.
The target number of respondents is 450. Given the difficulties associated with gaining access to
individual households, especially those who reside in apartments and for which entry may be
blocked by security personnel, the majority of interviews will be conducted in retail shopping
areas where consumers purchase the majority of their food. Where appropriate, consumers
should be interviewed in both traditional wet markets and modern retail shopping outlets
[supermarkets or hypermarkets] in similar numbers. However, within the slum areas, it may be
necessary to undertake personal interviews on a door-to-door basis [with appropriate security
arrangements].
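The staffing estimate above can be checked with simple arithmetic. The following is a minimal
sketch only: the function name is hypothetical, and it assumes that every field day yields the
stated number of completed interviews per enumerator.

```python
import math

def field_days(target, enumerators, per_day):
    """Whole days the team needs to reach the target number of interviews."""
    return math.ceil(target / (enumerators * per_day))

TARGET = 450       # target number of respondents
ENUMERATORS = 2    # minimum recommended number of enumerators

slowest = field_days(TARGET, ENUMERATORS, 12)  # 12 interviews per enumerator per day
fastest = field_days(TARGET, ENUMERATORS, 15)  # 15 interviews per enumerator per day

print(f"Field days required: {fastest} to {slowest}")
print(f"Including 2 days of training: up to {slowest + 2} days")
```

At the slower rate, field work plus two days of training comes to 21 days, which is consistent
with the three-week engagement recommended above.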
In conducting the interviews, each research team will need to decide whether to collect the
data on tablets or to conduct the interviews in hard copy [paper] and subsequently employ a
data entry operator to enter the data into Excel or directly into appropriate data analysis
software [such as SPSS].
If the data is to be entered directly into a tablet, appropriate software must be selected and the
survey encoded to fit that software. Unless at least one member of the research team has
appropriate software development skills, this may not be an option. Even then: (i) transcribing
the survey instrument may take an inordinate amount of time; (ii) enumerators carrying tablets
in the markets and in the slum areas may be at risk of physical assault and robbery, and thus
of losing data; and (iii) battery life is limited, so the tablets may need to be recharged at
least once during the day. Where tablets are employed, data will need to be downloaded at the
end of each day and saved to an appropriate external hard drive.
Selecting the sample
In selecting the sample it is assumed that the research team has access to:
a physical map of the metropolitan area describing the physical boundaries for the
different suburbs and the location of: (i) traditional retail [wet] markets; (ii) suburban
shopping centres; and (iii) major shopping malls [supermarkets and hypermarkets]
from the most recent census, population estimates and a socio-economic profile for each
suburb
From the most recent census data, the proportion of the population falling within each socio-
economic classification is to be calculated. This will establish the number of strata groups to be
used in sampling. Please note: the number of strata will vary from country to country depending
upon the socio-economic classification utilised by the central statistics agency.
The total sample [450] is then to be divided among the strata in proportion to the population
within each stratum, which gives the approximate number of respondents to be interviewed from
each stratum. Using information available from the central statistics agency, each suburb is to
be assigned to the most appropriate stratum and a number of suburbs selected at random from
each stratum, corresponding to the proportion of the population within each socio-economic
classification. For each suburb selected, the minimum number of interviews to be undertaken
is 25. To summarise:
the total number of suburbs to be sampled is 18 [including the slums]
suburbs are to be placed into socio-economic groups
the number of suburbs selected from each group is determined by the population
distribution
With two enumerators, gathering 25 interviews is equivalent to one day in the field for each
suburb. Ideally, at least 12 interviews should be conducted in a modern retail store and at least
12 in a traditional wet market.
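The allocation of suburbs to strata can be sketched as follows. This is an illustration under
stated assumptions, not a prescribed method: the three strata and their population shares are
hypothetical, and the largest-remainder rounding rule is one reasonable way of turning
proportions into whole suburbs.

```python
INTERVIEWS_PER_SUBURB = 25   # minimum number of interviews per selected suburb

def allocate_suburbs(shares_pct, total_suburbs=18):
    """Share out suburbs across strata in proportion to population.

    shares_pct maps stratum name -> whole-number percent of population (sums to 100).
    Rounding uses the largest-remainder rule (an assumption of this sketch).
    """
    quotas = {s: divmod(total_suburbs * pct, 100) for s, pct in shares_pct.items()}
    allocation = {s: whole for s, (whole, _) in quotas.items()}
    leftover = total_suburbs - sum(allocation.values())
    # Give any remaining suburbs to the strata with the largest remainders.
    by_remainder = sorted(quotas, key=lambda s: quotas[s][1], reverse=True)
    for s in by_remainder[:leftover]:
        allocation[s] += 1
    return allocation

# Hypothetical socio-economic strata and population shares from the census
shares = {"low": 30, "middle": 50, "high": 20}

suburbs = allocate_suburbs(shares)
interviews = {s: n * INTERVIEWS_PER_SUBURB for s, n in suburbs.items()}

print(suburbs)
print(interviews, "total:", sum(interviews.values()))
```

With 18 suburbs and 25 interviews in each, the total comes to the target of 450 respondents.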
In stratifying the population, our objective is to draw a representative sample of the target
population. However, we must acknowledge that:
there can be a significant difference in personal incomes between households residing in
the same suburb
consumers will often travel some distance within the city to purchase food from
preferred outlets. This may be because of the perceived quality of the food, the range or
type of food available, low price, or because of ethnic, cultural or religious reasons
consumers are known to purchase different food items from different retail stores
the population of a city is not constant. On any given day, any number of people may
travel into the city for work and depart in the evening to return to their place of
residence. Even though these people may not reside in the target suburbs, they are
NOT to be excluded.
Shoppers are to be randomly intercepted in the selected market by approaching every nth
shopper. The value of n can be determined in advance, for example every third person, but it
will also depend on the number of people present in the market. If the number of people is
low, every shopper passing the enumerator should be approached for an interview.
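The every-nth rule can be illustrated with a short sketch. The function name and the list of
passers-by are hypothetical; in the field the enumerator simply counts shoppers as they pass.

```python
def select_every_nth(shoppers, n):
    """Return every nth shopper from the stream of passers-by (n = 1 selects everyone)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [s for i, s in enumerate(shoppers, start=1) if i % n == 0]

# Hypothetical stream of ten shoppers passing the enumerator
passers_by = [f"shopper_{i}" for i in range(1, 11)]

busy_market = select_every_nth(passers_by, 3)   # busy market: every third person
quiet_market = select_every_nth(passers_by, 1)  # quiet market: approach everyone

print(busy_market)
print(len(quiet_market))
```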
Interviews are to be conducted on each day of the week and across a variety of times to ensure
that all socio-economic groups have a chance of being represented.
After a brief introduction to the project and the reasons for undertaking the interviews, two key
questions are to be asked of the potential respondents:
1. Are you responsible for at least 40% of the food purchasing decisions within your
household? This screening question will eliminate from consideration those respondents
who do not have the knowledge or experience to complete the survey instrument.
2. Do you have 20 minutes available to participate in this survey? This is to advise the
respondent of the approximate amount of time required to complete the survey, thereby
minimising the number of surveys that are not completed.
Before commencing the interviews, enumerators are to advise respondents that:
the data will only be used for research purposes
it will not be possible to identify any individual or to link any individual household to
the data collected
the respondent's participation is voluntary
the respondent may terminate the interview at any time
Not unexpectedly, respondents may expect to receive some reward or incentive for their
participation. Each team must make its own decision as to whether it will/will not offer an
incentive and also determine the type and value of the incentive. Care must be taken not to
select products that may be perceived as a commercial endorsement.
For the slum areas, the number of slums to be sampled and thus the number of households to
be interviewed will also be determined from the census data. The key difference here is that
enumerators will need to be accompanied by a person who is familiar with and well respected by
the community.
The general location where each interview is conducted is to be recorded and that location
added to the interactive geospatial map for the city.
Development and pretesting of the survey instrument
Although the format for much of the questionnaire is fixed, BEFORE commencing data
collection, each research team must give some thought to defining the different types of retail
outlet operating in the city. The terms used to describe the different types of outlet should be
well understood by the respondents and may include:
traditional wet market
temporary market [operates only on specific days of the week]
farmers market
small neighbourhood grocery stores
convenience store
petrol station
a specialist retail store [butcher, baker, fishmonger, greengrocer, delicatessen]
street vendor or roadside stall [fixed]
a mobile street vendor [may offer door-to-door sales]
supermarkets [independents and/or chain stores]
hypermarkets
internet
others [define]
For this study, there are 7 broad food categories:
meat and poultry
fish/seafood
dairy products
eggs
fresh fruit and vegetables
staples [includes cereals, pulses, roots and tubers, and plantains]
processed foods and beverages
By necessity, because this is a rapid assessment tool where the primary unit of analysis is the
household, some products or commodities may be absent and/or aggregated.
Some, but not all, of these groupings are subsequently used to explore:
the preferred place of purchase [which is known to differ for different food products]
the frequency of purchase
the type of products purchased [within the group]
how much money the consumer spends
why the consumer purchases from the preferred place of purchase
what proportion of the product purchased is wasted
why it is wasted
what the consumer does with the waste product
the consumer's concerns associated with the selected product group. Not unexpectedly,
these concerns will differ with each product group. For example, for fresh fruit and
vegetables, consumers may have reservations about freshness or chemical residues. For
meat, they may be more concerned about growth promotants, antibiotics, animal
welfare or the use of chemical agents to extend the shelf life.
Research teams will gather data for:
fresh fruit and vegetables
either meat, poultry or fish
staple food [cereal, pulse, root or tuber, or plantain]
packaged/processed food [whether canned, bottled, frozen or dehydrated. This includes
snacks, cakes, biscuits, pastries, soft drinks, confectionery, breakfast cereals, bread etc
BUT it does NOT include food purchased from restaurants, fast food outlets or food
service outlets]
minimally processed food [this includes fresh cut vegetables, salad mixes, precut fruit
like pineapples, melons, durian or jackfruit, pomegranates etc]
food purchased from restaurants, fast food outlets and food service outlets, irrespective
of whether the food is eaten on the premises or purchased as a takeaway for
consumption at home
Acknowledging that as personal disposable incomes rise, a greater quantity of food is purchased
away from home, the questionnaire will endeavour to assess:
what proportion of the household food budget is spent on food away from home
where the food is consumed
by which members of the household and
for which occasion [breakfast, lunch, dinner]
It is also important to recognise that not all households are entirely dependent on food
purchases: some households will produce a proportion of their own food, others will receive a
proportion of their food from family members in rural areas and others, especially the
underprivileged, may obtain a proportion or all of their food from food aid programs.
Within the home, the RUFSAT tool will explore key issues associated with food hygiene and
safety. The questionnaire will gather information on:
the type of fuel used for cooking
the source of water
the incidence of diarrhoea
whether anyone in the household has received any information on the safe handling of
food and
whether the household owns a refrigerator or freezer [food storage]
The ultimate outcome of the RUFSAT tool is to provide a planning mechanism for local
government to ensure that all consumers, irrespective of income, have access to a sufficient
quantity of safe, healthy and nutritious food. To evaluate this, the questionnaire will conclude
with four sets of related questions:
the Food Insecurity Experience Scale [FIES], an experience-based metric of the severity
of food insecurity that relies on respondents' direct responses. To retain its integrity, this
scale is NOT to be modified in any way
the type of food consumed within the household for the past week and the variety of
foods consumed
whether any members of the household were unable to eat any types of food for either
medical, religious or personal reasons
whether anyone in the household has received any information on healthy, nutritious
diets
The final section of the questionnaire asks a range of demographic questions about the
respondent but more importantly, about the household. The responses to the various questions
asked here are central to the analysis of the data, for they will provide the independent variables
to be used in the cross-tabulations, t-tests and analysis of variance [ANOVA]. Issues such as
ethnicity, religion and income are hypothesized as having the most significant impact on food
purchases and food consumption, but where a person shops may also be influenced by access to
transport [a car or a motorbike] and by age [with younger people often preferring
to shop at modern retail outlets].
For several of the questions, team members will need to adjust the categories used in the survey
instrument. For example: Q77 – age group – the categories used should mirror those used by the
central statistics agency. This will allow the characteristics of the sample group to be compared
more readily with the population parameters. Similarly, in asking the respondent about
household income [Q92], rather than to ask a direct question [which very few people will
answer], the respondent will be asked to indicate an income range. In establishing the range of
income groups, the categories used by the central statistics agency or taxation department
should be employed.
While other demographic information may or may not be used, the information gathered will be
used in profiling the respondents, thereby enabling a comparison of the sample parameters with
the population estimates. This comparison will provide a ‘goodness of fit’ measure.
Having finalised the questionnaire, the instrument should be pilot tested before undertaking the
survey. Here the researchers may choose to draw a convenience sample from workplace
colleagues [who have NOT been involved in the survey design], friends or family, or if
convenient, to go to the nearest market place or retail store. Around 10-15 surveys should be
sufficient to identify any major faults, errors of omission, or to indicate which questions may
need to be restructured [perhaps as a result of translation] to be better understood.
Data entry and encoding
To facilitate data entry a spreadsheet has been developed into which the responses are to be
entered. Every question and every subsection of every question has a column which is clearly
marked on both the questionnaire and the spreadsheet.
For the majority of questions, the answers are pre-coded, so all that needs to be done is for the
corresponding number to be entered into the appropriate column. However, where there are
open-ended questions, the data entry operator is to establish a master list. The master list is
dynamic: when a new item emerges it is given its own unique code, but where a respondent
gives a response that is already on the list, the number already assigned to that response is
entered. For example, in Question 9, respondents are asked to identify the reasons why they buy
food from their preferred retail store:
1 Freshness
2 Low price
3 Cleanliness
4 Halal
5 Lots of variety/choices
6 Close to home/office
7 Loyalty
8 Nice environment
9
10
Should the respondent provide multiple answers [sufficient space is provided to accommodate
up to five answers], their first responses might be low price [2] and close to home [6]; should
they then say nice environment, and this has not previously been recorded, it is added to the
list as [8]. With only one data operator the master list can be easily constructed at the same
time the data is entered. Should a double entry be inadvertently made, it can be readily
corrected by recoding.
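The way a master list grows during data entry can be sketched as follows. The function name is
hypothetical, and matching answers after trimming spaces and ignoring capitalisation is an
assumption of this sketch; in practice the data operator judges whether two answers mean the
same thing.

```python
def code_response(master_list, answer):
    """Return the code for an answer, adding it to the master list if it is new."""
    key = answer.strip().lower()
    if key not in master_list:
        # New item: assign the next unused code.
        master_list[key] = max(master_list.values(), default=0) + 1
    return master_list[key]

# Master list for Question 9 before 'nice environment' has been recorded
master = {
    "freshness": 1,
    "low price": 2,
    "cleanliness": 3,
    "halal": 4,
    "lots of variety/choices": 5,
    "close to home/office": 6,
    "loyalty": 7,
}

# One respondent's answers, in the order given
answers = ["Low price", "Close to home/office", "Nice environment"]
codes = [code_response(master, a) for a in answers]

print(codes)
print(master["nice environment"])
```

The codes are kept in the order in which the respondent gave them, so each can be entered into
the corresponding response column.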
The other major area where a master list may need to be prepared relates to the occupation of
both the respondent and their spouse. However, developing the master list for this category will
be a two-stage process. In the first instance, occupations are simply recorded, but in the second
stage, the various occupations need to be grouped into much smaller categories to facilitate data
analysis. Ideally, the construction of these groups should come from the categories used by the
central statistics agency or another appropriate government agency.
Data cleaning
BEFORE any analysis of the data is undertaken, a very basic frequency analysis is to be
conducted. The purpose of this is to:
correct any data entry errors – for example, in Q11 we ask respondents to indicate how
important a number of items are to them in their decision to purchase food from a retail
store. Respondents are asked to use a scale from 1 to 6. Should 66 appear, it is
immediately obvious that a double entry has occurred which can be very easily
rectified, but where it’s a 34, 45 or 56 for example, the data operator will need to return
to the questionnaire to identify the true record and correct.
Every questionnaire will have a unique code ranging from 001 to 450. This code is to be
added to the questionnaire by the data operator at the time of data entry so as to avoid
any duplication.
determine the normality of the data. In statistics, normality refers to the normal
distribution of the data. For data that is not normally distributed, analysts must exercise
a degree of caution in drawing conclusions, especially when associations or
relationships between elements are being explored and/or hypotheses statistically tested.
While various tools are available to correct or compensate for non-normal data, the level
of analysis undertaken in RUFSAT is unlikely to require those tools to be employed.
PLEASE NOTE. HAVING CHECKED THAT ALL RESPONSES ARE CORRECT, A
MASTER FILE SHOULD BE CREATED. THE DATA IN THIS FILE IS NOT TO BE
AGGREGATED: IT CONTAINS THE RAW DATA. ONE OR MORE COPIES OF THIS
FILE SHOULD BE STORED IN SEPARATE LOCATIONS TO PREVENT THE
ACCIDENTAL OR DELIBERATE LOSS OF DATA.
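The basic frequency check described in this section can be sketched as follows. The data are
hypothetical and the function name is illustrative; most data analysis packages produce the
same frequency tables directly.

```python
from collections import Counter

def frequency_check(values, valid_codes):
    """Tabulate the values in a column and return (frequencies, invalid entries)."""
    freq = Counter(values)
    invalid = {v: c for v, c in freq.items() if v not in valid_codes}
    return freq, invalid

# Hypothetical entries for one item scored on a scale from 1 to 6;
# 66 and 45 are keying errors that the frequency table exposes
item_scores = [5, 6, 66, 4, 45, 6, 3]
freq, invalid = frequency_check(item_scores, valid_codes=range(1, 7))

# Hypothetical questionnaire codes; "002" has been entered twice
questionnaire_ids = ["001", "002", "002", "003"]
duplicates = [i for i, c in Counter(questionnaire_ids).items() if c > 1]

print(dict(freq))
print("Out of range:", invalid)
print("Duplicate codes:", duplicates)
```

Any value flagged as out of range, other than an obvious double entry, still needs to be
checked against the paper questionnaire before it is corrected.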
Data analysis
Before analysing the data, we must first describe the sample. From the frequency outputs, both
the raw number [frequency] and the percent of responses are to be entered into the following
table:
N %
Gender
Female
Male
Joint decision [where both male/female jointly answer]
Age [of the main respondent]
18 – 24 years
25 – 34 years
35 – 44 years
45 – 54 years
55 – 64 years
65 years and older
Marital status
Single
Married
Separated/divorced/widow[er]
Highest level of education achieved [by principal respondent]
Did not complete primary school
Primary school
Junior high school
Senior high school
Certificate
Diploma
Under-graduate degree
Post-graduate qualification
Occupation of the respondent
[occupational groups to be established]
Occupation of spouse
[occupational groups to be established]
Ethnicity [of principal respondent]
[list of ethnic groups to be entered from master list]
Ethnicity of spouse
[list of ethnic groups to be entered from master list]
Religion
[list of religious groups to be entered from master list]
Number of people living in house
[number of categories to be determined from output]
Immediate family members
[number of categories to be determined from output]
Parents/relatives/extended family
[number of categories to be determined from output]
Number of others living in house
[number of categories to be determined from output]
Suburb in which respondent lives
[list of suburbs to be entered from master list]
Own a motorbike
Yes
No
Own a car
Yes
No
Access to credit cards
Yes
No
Household income
[number of groups to be determined from tax office]
Frequency of payment
Weekly
Fortnightly
Monthly
Irregular
To facilitate reading and so as not to cut the table in the middle of a section, the table may be
split into two or more separate tables.
For the initial descriptive analysis of the sample [and of the data generally], the full range of
responses should be presented. However, for many of the statistical tools that will be used to
analyse the data such as crosstabs, t-tests and analysis of variance, there is a requirement for a
minimum number of respondents. Where these conditions are not fulfilled the analysis will
return an error. To overcome that problem, some aggregation of the data may be necessary.
Let’s take family size for example:
Number of people living in house
        N       %
1       26      5.8
2       65      14.4
3       84      18.7
4       96      21.3
5       78      17.3
6       62      13.8
7       17      3.8
8       13      2.9
9       6       1.3
10      2       0.4
11      0       0.0
12      1       0.2
In a situation like this it would be sensible to aggregate the data for all those households with
7 or more people living in the house so that the revised table would now look like this:
Number of people living in house
        N       %
1       26      5.8
2       65      14.4
3       84      18.7
4       96      21.3
5       78      17.3
6       62      13.8
7+      39      8.7
To facilitate the analysis, the data in the working spreadsheet must now be recoded, so that for
those respondents with 8, 9, 10 or 12 people in the household, a 7 will now be recorded.
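The recoding can be sketched as follows, using the frequencies from the family size example.
The function name and the cap parameter are illustrative only.

```python
from collections import Counter

# Frequencies from the family size example: household size -> number of respondents
counts = {1: 26, 2: 65, 3: 84, 4: 96, 5: 78, 6: 62, 7: 17, 8: 13, 9: 6, 10: 2, 12: 1}

def recode_size(size, cap=7):
    """Recode any household size above the cap to the cap value (7 stands for '7+')."""
    return min(size, cap)

recoded = Counter()
for size, n in counts.items():
    recoded[recode_size(size)] += n

total = sum(recoded.values())
percent = {size: round(100 * n / total, 1) for size, n in recoded.items()}

for size in sorted(recoded):
    label = "7+" if size == 7 else str(size)
    print(label, recoded[size], percent[size])
```

The recoded category 7 now holds 39 respondents [8.7 percent of 450], which matches the
revised table.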
A similar type of analysis might be undertaken to aggregate some of the educational
qualifications, perhaps combining Certificate and Diploma or Under-graduate and
Post-graduate if the numbers are low; recoding any minority ethnic groups and religious groups
as ‘others’; or recoding those respondents who live in suburbs outside the study area as ‘others’.
Before moving to the data which relates to food, the research team should compare the sample
characteristics with those reported by the central statistics agency. Where possible, to facilitate
the comparison, the data should be tabulated. Taking this example from Perth, Western
Australia:
Age [of the main respondent]
ABS brackets    ABS percent    Survey percent    Variance
18-24           16.3           24.6              + 8.3
25-34           17.4           26.0              + 8.6
35-44           18.2           17.9              - 0.3
45-54           17.3           16.8              - 0.5
55-64           14.2           7.2               - 7.0
65+             16.6           7.5               - 9.1
Note: percentage adjusted after the exclusion of people under the age of 17
It is immediately evident that the age groups 18-24 years and 25-34 years are over-represented
while those aged 55 years and over are under-represented. While any number of reasons might
explain some or all of the variation, this is the sample that we have and must now work with.
However, we must note the limitations and be conscious of these in drawing any conclusions.
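The comparison between the sample and the population can be sketched as follows, using the
Perth figures above. The 5 percentage point threshold used to flag a bracket is a hypothetical
choice for illustration and is not part of the RUFSAT method.

```python
brackets = ["18-24", "25-34", "35-44", "45-54", "55-64", "65+"]
abs_pct = [16.3, 17.4, 18.2, 17.3, 14.2, 16.6]      # population percent (ABS)
survey_pct = [24.6, 26.0, 17.9, 16.8, 7.2, 7.5]     # sample percent (survey)

THRESHOLD = 5.0  # hypothetical cut-off, in percentage points

# Difference between survey and population percent for each age bracket
diffs = {b: round(s - p, 1) for b, p, s in zip(brackets, abs_pct, survey_pct)}

over = [b for b, d in diffs.items() if d > THRESHOLD]
under = [b for b, d in diffs.items() if d < -THRESHOLD]

print(diffs)
print("Over-represented:", over)
print("Under-represented:", under)
```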
For Question 1, in recording the proportion of food consumed in the household by the source of
the food, rather than record frequencies, on this occasion, given that the data is metric, the
mean and standard deviation [SD] will be recorded.
Proportion of all food consumed in the household:
Mean SD
grow/produce yourself
receive as a gift from friends or family
receive as part of a remuneration package [from work]
purchase from a retail food store
purchase from a restaurant/food service outlet
receive from a food aid agency
Question 2, who makes the decision to purchase, will be presented in a similar manner.
While the mean will reveal the importance of each source of food, the standard deviation [SD],
which is derived from the variance around the mean, will indicate opportunities to interrogate
the data using some of the socio-demographic variables as the independent variable. For example,
it would not be unreasonable to hypothesize that households in the slum districts [suburbs]
would be more likely to rely on food aid agencies than those residing in more affluent suburbs.
Using household income as the independent variable, one might hypothesize that as income
increases households will purchase a greater quantity of food from restaurants or food service
outlets. However, lower income groups may also purchase a large proportion of their food from
fast food outlets because of convenience. To test such hypotheses, analysis of variance
[ANOVA] will be conducted. However, while ANOVA may reveal a statistically significant
result, it will not reveal where the differences occur. For this reason it will be necessary to use a
post-hoc test, such as Tukey's HSD [honestly significant difference] test or Duncan's multiple
range test.
In the following example [from Perth, Western Australia], respondents were asked to indicate
what proportion of the fresh fruit and vegetables consumed in the household were: (i) grown by
themselves; (ii) purchased from a retail store; or (iii) purchased direct from the producer. The
age categories employed are those used by the Australian Bureau of Statistics:
Mean percent of purchases by age
                   18-24     25-34     35-44     45-54     55-64     65+       P
Grow own           5.62b     7.09b     9.01ab    11.86ab   13.97a    15.33a    0.000
Purchase           85.26a    83.47a    79.66ab   80.34ab   72.58b    77.86ab   0.001
Purchase direct    3.44ab    5.56ab    5.88ab    3.30ab    6.80a     1.67b     0.003
where those items with the same superscript letter are not significantly different at p = 0.05
The superscripts a and b indicate that there are two significantly different groups: Group a and
Group b. If we look at the proportion of the fresh fruit and vegetables grown by the respondent,
it is immediately obvious that those respondents aged 55 and above [a] grow a larger proportion
of the fresh fruit and vegetables that they consume than respondents aged 34 years and below
[b]. Respondents aged 35 to 54 years [ab] are not significantly different from either group.
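For readers who wish to see what a one-way ANOVA computes, the F statistic can be sketched as
follows. The three income groups and their values are entirely hypothetical, and the sketch
stops at the F statistic: the p value and the post-hoc grouping letters would normally be taken
from the data analysis software.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of group name -> list of values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical percent of food purchased from restaurants, by household income group
restaurant_share = {
    "low income": [10, 12, 11],
    "middle income": [15, 14, 16],
    "high income": [20, 22, 21],
}

f_stat, df_between, df_within = one_way_anova_f(restaurant_share)
print(f"F({df_between}, {df_within}) = {f_stat:.1f}")
```

Here the income group is the independent variable and the proportion of food purchased is the
dependent variable.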
In recording the results for Q3 through Q7, given the potentially large number of responses
[depending on the different kinds of retail outlet] it will be best to look at each question
separately. For Question 3, as the data is non-metric [or categorical] only frequencies and
percent can be displayed.
Do you shop for food in any of these places?
Store 1 Store 2 Store 3 Store 4 Store 5 Store 6
N % N % N % N % N % N %
Yes
No
If yes, how often do you shop in each place?
Store 1 Store 2 Store 3 Store 4 Store 5 Store 6
N % N % N % N % N % N %
1
2
3
4
5
6
where 1 is every day
2 is 2-3 times a week
3 is once a week
4 is 2-3 times a month
5 is once a month
6 is seldom
For Question 6 – approximately how far [m] is each retail store from your place of residence? –
the response is metric, hence the mean and standard deviation can be used here.
Store 1 Store 2 Store 3 Store 4 Store 5 Store 6
µ SD µ SD µ SD µ SD µ SD µ SD
m
where µ is the mean
SD is the standard deviation
Questions 8 and 9 are also metric, and thus the mean and standard deviation should be used.
Questions 10 and 11 will require the data entry operator to develop a master list. These are
multiple response questions where respondents are encouraged to give multiple answers. In
marketing research, it is normal to treat the order in which respondents give their replies
as a measure of importance. Thus, when encoding the responses, it is important to place the
code in the appropriate column in the order in which it was presented by the respondent. Each
column is then analysed separately and the responses placed into a table as shown.
This example is drawn from a study of store choice for fresh fruit and vegetables in Perth,
Western Australia. The percentages are calculated not as a percentage of responses, but
rather as a percentage of the number of respondents [as given in Column 1]. As it is
not our intention to differentiate between the different types of retail store, all the data,
irrespective of store type, can be aggregated. However, should the team wish to look at the
responses for different socio-economic groups, while this is relatively easy to do by applying a
filter, it is very time consuming.
Reason   Ranking [1 2 3 4 5]   N   %
Competitive price 65 99 40 14 3 221 46.9
Good quality produce 65 70 30 10 1 176 37.3
Fresh 98 44 24 4 1 171 36.3
Convenience 88 28 14 6 1 137 29.1
Wide range of produce 18 28 28 5 1 80 17.0
Close proximity to home 38 15 8 5 66 14.0
Location 32 17 5 4 58 12.3
Wide range of other foods 7 10 9 4 2 32 6.8
Local produce 8 6 7 1 1 23 4.9
Only shop available 10 2 1 13 2.8
Presentation 2 2 4 3 1 12 2.5
Origin of the produce 4 1 3 1 1 10 2.1
Value for money 2 2 5 1 10 2.1
Credit card facilities 1 3 4 0.8
Quick checkout 2 2 4 0.8
Close to other shops 1 1 1 3 0.6
Atmosphere 1 2 3 0.6
Social meetings 1 1 2 0.4
Ability to self select 1 1 2 0.4
Store layout 1 1 2 0.4
Loyalty programs 1 1 2 0.4
N 471
Note: table has been edited to fit the page
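The way the percentages in the table are derived can be sketched as follows, using three rows
from the Perth example. Each count is divided by the number of respondents [471] and not by
the total number of responses.

```python
RESPONDENTS = 471  # number of respondents in the Perth example

# Number of times each reason was given as the 1st, 2nd, 3rd, 4th and 5th response
rank_counts = {
    "Competitive price": [65, 99, 40, 14, 3],
    "Fresh": [98, 44, 24, 4, 1],
    "Convenience": [88, 28, 14, 6, 1],
}

# For each reason: (total mentions, percent of respondents mentioning it)
summary = {
    reason: (sum(counts), round(100 * sum(counts) / RESPONDENTS, 1))
    for reason, counts in rank_counts.items()
}

for reason, (n, pct) in summary.items():
    print(f"{reason}: N = {n}, {pct}% of respondents")
```

Because respondents can give several answers, the percentages across all reasons will add to
more than 100.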
Question 12 differs from the fixed response categories that have been previously used in that the
response range is a metric scale [like a thermometer] where the difference between 1 and 2 is
the same as the difference between 4 and 5. Hence, in recording the results for this question, the
mean and standard deviation are to be used. To record the data, simply copy and paste the table
and leave just two columns. Having entered the results into the table, order from most important
to least important. Whether this is an ascending or descending scale will be determined by the
prevailing in-country social norms: in some countries 1 is the most important, while in others, 6
is the most important. A six-point scale has been used purposefully to prevent respondents from
utilising a neutral midpoint: they are forced to think about each question. See this example
from Perth, Western Australia:
Importance of criteria respondents use in their choice of retail store
Mean SD
Good quality produce 5.68a 0.66
Fresh produce 5.64a 0.69
A wide range of fresh produce 5.48a 0.80
Clean 5.35a 0.92
Good value for money 5.30a 0.94
Competitive price 5.11a 1.00
All product is clearly priced 5.08b 1.06
I can self select 4.98c 1.19
Fast and efficient check-out 4.96c 1.02
Close to my home 4.95c 1.28
Easy to access 4.91c 1.14
Customer service 4.79d 1.20
Fresh produce is refrigerated 4.78d 1.29
Good access to product on the shelf 4.70e 1.16
A wide range of other food products 4.69e 1.35
Price specials or discounts 4.63e 1.31
Plentiful car parking 4.58e 1.41
Trolleys and baskets easily accessible 4.56e 1.29
One-stop shop – can purchase everything 4.51f 1.50
Origin of the product is clearly displayed 4.43g 1.38
Knowledgeable staff 4.39h 1.31
Good lighting 4.30i 1.25
Attractive presentation 4.24i 1.33
Favourable prior purchase 4.15j 1.39
Extended trading hours 3.97k 1.76
Attractive décor and surroundings 3.95k 1.34
Refund/return policy 3.81l 1.72
Clear signage 3.67m 1.49
Organic produce 3.31n 1.56
Product information available in-store 3.31n 1.46
In-store tastings 2.97o 1.53
Loyalty programs 2.83o 1.62
Advertising on radio/tv/newspapers 2.58p 1.46
Offer home delivery 2.14q 1.46
where 1 is ‘not at all important’ and 6 is ‘very important’
those items with the same superscript are not significantly different at p = 0.05
Ideally, if there is a member of the team who is proficient with statistics and the use of the data
analysis software, a rank ordering of the means should be performed. This will show where
items are significantly different from others. In the example above, there is no significant
difference between quality, cleanliness and price [all carry the superscript a]: they
are all equally important in the consumers’ decision to purchase fresh produce from a retail
store.
To determine what impact the socio-economic dimensions may have on influencing the
respondent’s decision to purchase, ANOVA may be employed, with an appropriate post-hoc test
to identify where the differences may be found. It is also possible, through principal
component analysis, to aggregate the responses and thus identify underlying latent constructs
[or factors] but this is beyond the scope of this rapid appraisal tool.
For each of the sections which follow, fresh fruit and vegetables [Q13-19] is used as the example.
The presentation of these results is somewhat complex.
For Question 13, based on the category of stores selected for Q3-7 and Q9, a master list is to be
prepared to establish a code for each store type. Frequencies and percent are then recorded in
the following table for both the preferred and second most preferred store.
Where respondents buy the majority of their fresh fruit and vegetables
            Most preferred place    Second most preferred place
            Frequency   Percent     Frequency   Percent
Store 1
Store 2
Store 3
Store 4
Store 5
Internet
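A minimal sketch of how the frequency and percent columns might be filled from the coded answers.
The store codes and answers are invented, `frequency_table` is a hypothetical helper, and basing
the percent on respondents who gave an answer is an assumption.

```python
from collections import Counter

# Hypothetical coded answers to Q13: one store code per respondent,
# taken from the master list (None = no answer given).
most_preferred = [1, 2, 1, 3, 1, 2, None, 1]


def frequency_table(codes):
    """Return {code: (frequency, percent of valid answers)}."""
    valid = [c for c in codes if c is not None]
    counts = Counter(valid)
    return {code: (n, 100 * n / len(valid)) for code, n in sorted(counts.items())}


table = frequency_table(most_preferred)
for code, (freq, pct) in table.items():
    print(f"Store {code}: {freq:3d} {pct:6.1f}%")
```

The same function can be run a second time on the codes for the second most preferred store.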
Cross-tabulations by selected socio-economic variables can easily be done to determine what, if
any, impact these variables may have on the place of purchase. Cross-tabulations are utilised in
those cases where both the dependent and independent variables are non-metric. Here is an
example taken from a study that looks at the willingness of consumers in Australia and Japan to
pay a price premium for Fairtrade coffee:
Willingness to pay for Fairtrade coffee from a retail store
              WA               Japan
              N      %         N      %
No more       37     24.8      57     36.5
10% more      50     33.6      56     35.9
15% more      20     13.4      18     11.5
20% more      42     28.2      25     16.0
p = 0.031
Where a statistically valid result is required, the chi-square test is usually applied. In this case,
where the p value is less than 0.05, the researcher can conclude that willingness to pay differs
between the two groups; the percentages show that consumers in Japan are less willing to pay a
price premium than consumers in WA.
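The chi-square test on the counts in the table above can be sketched as follows. In practice the
data analysis software reports this directly; the functions `chi_square_test` and `chi2_sf` are
hypothetical helpers written here only to show the arithmetic.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (integer df)."""
    half = x / 2
    if df % 2 == 0:
        terms = (half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    terms = (half ** (i - 0.5) / math.gamma(i + 0.5)
             for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


def chi_square_test(observed):
    """Pearson chi-square test of independence for a table given as rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df, chi2_sf(chi2, df)


# Counts from the table above: rows are the price premium categories,
# columns are WA and Japan.
observed = [[37, 57], [50, 56], [20, 18], [42, 25]]
chi2, df, p = chi_square_test(observed)
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p:.3f}")
```

The test only indicates that the two distributions differ; which group is less willing to pay is
read from the percentages in the table.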
For Question 14, a table similar to that employed for Q13 can be used, as it is not our intention
to differentiate between the different types of retail store.
Frequency of purchasing fresh fruit and vegetables from the respondents’ preferred and second
most preferred retail store
                     Most preferred place    Second most preferred place
                     Frequency   Percent     Frequency   Percent
Every day
2-3 times a week
Once a week
2-3 times a month
Once a month
Again, cross-tabulations can be easily done using selected socio-economic variables to
determine what if any impact these variables may have on the frequency of purchase.
For Question 15, the table will be analysed in a similar way to Questions 10 and 11. Two tables
must therefore be prepared: one for the preferred store and one for the second most preferred
retail store. However, while it may be possible to use the same master list, respondents may
raise a number of issues, particularly about quality, which will be specific only to fresh fruit and
vegetables. Similarly, for each of the other sub-sections: meat, poultry and/or fish, staple food,
processed food and semi-processed food, different master lists will need to be prepared.
Question 16 is expected to generate a very large list of fresh fruit and vegetable types. However,
the analysis here is to be undertaken somewhat differently. Here it is recommended that a
separate data file be developed: one for each of the different commodity groups. Every fruit or
vegetable mentioned by the respondents [collectively] is to have its own column. Where the
consumer indicates that they purchase that fruit or vegetable, a 1 is to be entered into the
column. Where the consumer does not purchase it, a 0 is to be entered. For example:
Respondent Orange Apple Mango Grape Peach Pear … …
001 1 1 0 1 0 1
002 1 1 1 1 1 1
003 1 1 0 0 0 0
004 1 1 0 1 1 1
005 0 1 0 0 0 1
006 1 0 1 0 0 0
…
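The encoding shown in the example above can be sketched as follows. The answers are invented to
mirror three of the rows, and `encode_purchases` is a hypothetical helper; a spreadsheet achieves
the same result.

```python
# Hypothetical Q16 answers: the fruit each respondent says they purchase.
answers = {
    "001": ["Orange", "Apple", "Grape", "Pear"],
    "003": ["Orange", "Apple"],
    "005": ["Apple", "Pear"],
}


def encode_purchases(answers):
    """Return (columns, rows): one 0/1 column per product mentioned by anyone."""
    columns = []
    for products in answers.values():
        for product in products:
            if product not in columns:
                columns.append(product)  # keep the order of first mention
    rows = {
        respondent: [1 if col in products else 0 for col in columns]
        for respondent, products in answers.items()
    }
    return columns, rows


columns, rows = encode_purchases(answers)
print("Respondent", *columns)
for respondent, flags in rows.items():
    print(respondent, *flags)
```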
When the data encoding process is complete, additional columns may then be added and the
desired socio-economic data for the respondents copied and pasted from the main spreadsheet.
Having entered the data in this way, it is now easy to determine which fruit and vegetables
respondents from different socio-economic groups are consuming. This information can be
readily extracted using cross-tabulations and entered into the following table [using income
groups as an example]. Because the socio-economic groups differ in size, the table is best
analysed using the percent of respondents within each socio-economic group [hence it is not
necessary to add frequencies]. To obtain these figures, each product will need to be analysed as
an independent cross-tab. However, whether or not meaningful statistics can be extracted will be
determined primarily by the number of respondents purchasing the product and the number of
socio-economic groups.
          Percent
          Income Grp 1   Income Grp 2   Income Grp 3   Income Grp 4   Income Grp 5
Orange
Apple
Mango
Grape
Peach
Pear
…
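A minimal sketch of how the percentages for this table might be derived from the 0/1 data file
once the income group has been added to each row. The records are invented and
`percent_purchasing` is a hypothetical helper; it assumes every respondent has a 0 or 1 for every
product.

```python
# Hypothetical rows: (income group, {product: 0/1 purchase indicator}).
records = [
    (1, {"Orange": 1, "Mango": 0}),
    (1, {"Orange": 1, "Mango": 1}),
    (2, {"Orange": 0, "Mango": 1}),
    (2, {"Orange": 1, "Mango": 1}),
    (2, {"Orange": 1, "Mango": 1}),
    (2, {"Orange": 0, "Mango": 1}),
]


def percent_purchasing(records):
    """Return {product: {group: percent of that group's respondents buying it}}."""
    group_sizes = {}
    buyers = {}
    for group, purchases in records:
        group_sizes[group] = group_sizes.get(group, 0) + 1
        for product, bought in purchases.items():
            buyers.setdefault(product, {}).setdefault(group, 0)
            buyers[product][group] += bought
    return {
        product: {g: 100 * n / group_sizes[g] for g, n in by_group.items()}
        for product, by_group in buyers.items()
    }


percents = percent_purchasing(records)
for product, by_group in percents.items():
    print(product, by_group)
```

Dividing by the size of each group, rather than by the whole sample, is what makes groups of
different sizes comparable.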
Question 17, expenditure, is a metric scale and thus mean and standard deviation can be
reported. Using ANOVA, the means can then be compared by the selected socio-economic
variables.
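The comparison of means can be sketched as a one-way ANOVA. The expenditure figures are invented
and `one_way_anova` is a hypothetical helper; it returns only the F statistic and its degrees of
freedom, leaving the p value and any post-hoc test to the data analysis software.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical expenditure on fresh fruit and vegetables by income group.
low = [20, 22, 24]
middle = [30, 32, 34]
high = [40, 42, 44]

f_stat, df_b, df_w = one_way_anova([low, middle, high])
# The tabulated critical value of F(2, 6) at p = 0.05 is about 5.14, so an
# F statistic above that indicates the group means are not all equal.
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```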
Question 18 will require three tables. Q18a, the proportion of fresh fruit and vegetables thrown
away, is a metric scale, for which the mean and standard deviation are reported. To determine
any significant difference in the amount of product thrown away from the preferred store and the
second most preferred store, a paired sample t-test can be employed. Because this test compares
the two answers given by the same respondent, those respondents who answer only one of the
two questions are excluded from it.
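The paired comparison for Q18a can be sketched as follows. The figures are invented and `paired_t`
is a hypothetical helper; it returns the t statistic and degrees of freedom, with the p value left
to the data analysis software.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-sample t statistic; pairs with a missing value (None) are dropped."""
    pairs = [(a, b) for a, b in zip(first, second)
             if a is not None and b is not None]
    diffs = [a - b for a, b in pairs]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical percent thrown away, one value per respondent per store.
preferred = [10, 5, 20, None, 15, 10]
second = [15, 5, 30, 10, 25, None]

t_stat, df = paired_t(preferred, second)
print(f"t({df}) = {t_stat:.2f}")
```

Of the six respondents in this illustration, only the four who answered for both stores
contribute to the test.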
Whereas ANOVA is most often used to look for significant differences where there are more
than two categories for the independent variable [like income group], the t-test is used to make
comparisons between two groups [such as gender: male or female]. In this example from Perth,
Western Australia, the percent of fresh produce obtained from each source [including purchase
from a retail store] is compared for those respondents who have a home garden and those who
do not:
Percent of purchase by garden/no garden
                                            Garden           No garden        p
                                            Mean    SD       Mean    SD
grow yourself                               22.67   21.60     0.00    0.00    0.000
purchase from a retail outlet               67.51   27.57    90.73   19.56    0.000
purchase direct from the grower              5.02   13.45     4.00   13.07    0.258
obtain as a gift from friends and family     3.03    5.72     2.69    7.42    0.469
obtain from other sources                    1.45    7.15     1.87    9.05    0.458
In this instance, the results show that respondents who had a vegetable garden at home, on
average, purchased about 23 percentage points less of their fresh fruit and vegetables from a
retail store [67.51% compared with 90.73%].
Q18b and Q18c will both require master lists to be developed. For the tables, both frequencies
and percent are to be recorded.
Question 19 is to be analysed in a similar manner to Q10 and Q11. Having developed the master
list, the consumers’ concerns are to be entered in the order in which they are presented. Multiple
answers are expected.
While it will be possible to explore potential differences in the respondents’ concerns by
socio-economic group, the analysis will take some time. Should the team wish to undertake this
analysis, in the first instance, the analysis [as described above] should be repeated for each
socio-economic group to determine if there is any difference in the ranking of significance [or
importance] of the constraints. The percent of responses divided by the number of respondents
[see the WA fresh fruit and vegetable data set] can then be copied and pasted into another table
and combined with the data from the analysis of other sub-groups. The comparison between
groups will best be facilitated by the use of percent responses because of differences in the size
of different socio-economic groups. However, please take care in transcribing the data from one
table to the other as the order may change between groups.
Moving now to Questions 57 and 58, as the responses here are metric, mean and standard
deviation can be employed.
Monthly expenditure on meals consumed away from home
                                                        Mean    SD
Monthly expenditure on meals consumed away from home
Monthly expenditure on meals consumed away from home by place of purchase
Mean SD
Restaurants
Fast food outlets
Work canteens
School lunches
Street vendors
Other
To explore differences between different socio-economic groups, ANOVA is to be employed
with Duncan’s multiple range test [as a post-hoc test], or where there are just two categories, the
independent samples t-test.
To record the responses to Q59, an indeterminate number of columns will be required,
depending on the number of immediate family members living in the house, parents and
relatives, and domestic helpers. The data however is metric and thus mean and standard
deviation can be utilised to facilitate data analysis.
Number of meals eaten at home and away from home per week by occasion
                  Breakfast             Lunch                 Dinner
                  Home       Away       Home       Away       Home       Away
                  µ    SD    µ    SD    µ    SD    µ    SD    µ    SD    µ    SD
Self
Spouse
Child 1
Child 2
…
Parent 1
Parent 2
…
Relative 1
Relative 2
…
Domestic helper
To explore any differences between socio-economic groups, use ANOVA and Duncan’s multiple
range test or the independent samples t-test.
While Questions 60, 61 and 62 are also metric, and thus both the mean and standard deviation
are to be recorded, it will also be useful here to record the number of households [N] that are
producing a proportion of their own food; receiving a proportion from friends, family,
neighbours or work; or receiving a proportion from a food aid program.
Proportion [%] of food consumed in the household produced by the household
Proportion [%] of food consumed in the household received as a gift from friends, family,
neighbours or work
Proportion [%] of food consumed in the household from a food aid program
N Mean [µ] SD
Meat/poultry
Fish/seafood
Dairy products
Eggs
Fresh fruit and vegetables
Staple
Processed foods
To explore any differences between socio-economic groups, use ANOVA and Duncan’s multiple
range test or the independent samples t-test.
For Question 61, where respondents do produce a proportion of their own food, the size of the
vegetable garden and/or fish pond and the number of fruit trees, chickens and livestock are to be
recorded as given. Respondents who do not grow any proportion of their own food should be
recorded as 0.
Statistics for home food production
                                                 Mean [µ]   SD
Size of vegetable garden/roof top garden [m²]
Number of fruit trees
Number of chickens
Number of other livestock
Size of fish pond [m²]
Question 62, the source of water, will require a master list to be developed and from that the
frequency and percent recorded.
Questions 65 – 69 are all categorical [non-metric] responses, which require frequency and
percent to be recorded.
Questions 70 – 73 each contain multiple sub-questions. However, in all instances the data is
primarily non-metric, requiring the data enumerator, where appropriate, to develop a master list
and to enter the appropriate code. Frequencies and percent are to be reported.
While Question 74 uses a metric scale, on this occasion, rather than using means and standard
deviation, the results will be easier to interpret using the percent of responses for each food
category.
Number of times in the last seven days each food group was consumed by the household
Percent
0 1 2 3 4 5 6 7
Your staple food
A red meat – beef/mutton/goat
Chicken
Fish
A green leafy vegetable
At least two other vegetables
One or more pulses
Fresh milk
Any other dairy product
One other staple
Fresh fruit
Dried fruit/nuts
Highly processed snack foods
Chocolate/confectionery
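One row of this table might be produced as sketched here. The household answers are invented and
`percent_by_count` is a hypothetical helper.

```python
from collections import Counter


def percent_by_count(times_consumed):
    """Percent of households reporting each count from 0 to 7."""
    counts = Counter(times_consumed)
    n = len(times_consumed)
    return [100 * counts.get(k, 0) / n for k in range(8)]


# Hypothetical answers from ten households: number of times fish was
# consumed in the last seven days.
fish = [0, 0, 0, 1, 1, 2, 2, 2, 3, 7]
fish_percent = percent_by_count(fish)
print("Fish", *(f"{p:.0f}" for p in fish_percent))
```

In this illustration the mean is 1.8 times a week, which on its own would hide the fact that 30%
of the households did not consume fish at all.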
The objective of this question is to examine the diversity of the household diet. It would not be
unreasonable to assume that among the lower income households, the diet will be much less
diverse and may even be deficient for some nutrient groups [hence the need to record and report
non-consumption (0), which would otherwise be lost if we were only to report the mean]. To test
that hypothesis, ANOVA will need to be employed to compare the means for the different
socio-economic groups. Having applied Duncan’s multiple range test, it should be possible to
see which food groups are consumed less often by which socio-economic groups. However, this
analysis does not tell us, nor does it need to, whether the household diet is deficient in a food
group, for we have no measure of the quantity of each food group purchased, nor do we know
which family members consume it. This level of detail requires a far more sophisticated survey
instrument and a great deal more time than the RUFSAT tool permits.
However, one of the factors that can have a pronounced impact on household diet is the dietary
choices that individuals within the household make, whether for personal reasons [such as being
vegan or vegetarian, or to lose weight], for medical reasons [such as food allergies, coronary
heart disease, high cholesterol, diabetes], or for religious reasons. Question 75 seeks to capture
these choices. The data is to be recorded by frequency and percent.
Are any members of the household:
Yes No
N % N %
completely vegetarian
mainly vegetarian
vegan
following a strict plan to lose weight
on a casual diet to lose weight
on a special diet for medical reasons
on a special diet due to allergies
on a special diet for religious reasons
The final question [Q76] utilises the Food Insecurity Experience Scale (FIES), developed by the
Voices of the Hungry (VoH) project. These responses are collected through the FIES Survey
Module (FIES-SM) which consists of eight questions regarding people's access to adequate
food. Used in combination with other measures, the FIES has the potential to contribute to a
more comprehensive understanding of the causes and consequences of food insecurity and to
inform more effective policies and interventions. However, unlike any of the earlier questions,
the responses to this question must always be analyzed together as a scale, not as separate items.
To do that, the data is entered into…. [and this is where I need the how]
While I’ve just read the monograph [http://www.fao.org/3/a-as583e.pdf], this doesn’t tell me how
to do it… and if we can’t put this matrix into a simple tool to analyse it, it doesn’t belong in
RUFSAT.
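As a starting point only, the eight items can at least be combined into a raw score [the number
of affirmative answers, from 0 to 8] for each respondent. The sketch below assumes each item is
coded yes/no, and `fies_raw_score` is a hypothetical helper. It does not reproduce the Rasch-model
scaling that the FIES methodology applies to these scores, so it does not answer the question of
which tool to use.

```python
def fies_raw_score(answers):
    """Count the affirmative answers to the eight FIES-SM items.

    `answers` holds eight values: True (yes), False (no) or None (missing).
    Returns None when any item is missing, so that incomplete cases can be
    handled separately rather than silently scored.
    """
    if len(answers) != 8:
        raise ValueError("expected answers to all eight FIES-SM items")
    if any(a is None for a in answers):
        return None
    return sum(1 for a in answers if a)


# Hypothetical respondent who answered yes to the first three items only.
respondent = [True, True, True, False, False, False, False, False]
score = fies_raw_score(respondent)
print(score)
```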