Conducting the consumer interviews

Introduction

While all of us are consumers in that on more than one occasion we have purchased consumer

goods for personal or household consumption, this aspect of RUFSAT seeks to provide the

multi-stakeholder platform [MSP] with accurate information on consumer food purchasing

behaviour within the city.

The consumer questionnaire will provide evidence on:

the different sources from which consumers obtain their food [including that which

they grow themselves, purchase, receive as gifts, or that which is supplied through a

food aid agency]

where consumers buy their food [type of retail outlet]

what food they buy

how often they buy food

how much they spend

how they travel and how far they travel to various retail stores to purchase their food

why they purchase food from different retail outlets

the major concerns consumers have about the different foods they purchase and

consume

food wastage [in the home]

the composition of the household diet

the proportion of food eaten at home and away from home

food preparation in the home

food hygiene and safety

food security

and demographic information that can be utilised to identify differences in access to food,

the affordability of food and the composition of the diet across the metropolitan area.

Resources required

To conduct this aspect of the study in a timely manner, it is recommended that at least TWO

enumerators be engaged for a period of no less than three weeks [21 days]. This is based on the

knowledge that enumerators should be capable of conducting 12-15 interviews per day and that

enumerators will require 1-2 days training prior to being released into the field.

The target number of respondents is 450. Given the difficulties associated with gaining access to

individual households, especially those who reside in apartments and for which entry may be

blocked by security personnel, the majority of interviews will be conducted in retail shopping

areas where consumers purchase the majority of their food. Where appropriate, consumers

should be interviewed in both traditional wet markets and modern retail shopping outlets

[supermarkets or hypermarkets] in similar numbers. However, within the slum areas, it may be

necessary to undertake personal interviews on a door-to-door basis [with appropriate security

arrangements].

In the actual conduct of the interviews themselves, each research team will need to make its

own decision about whether to collect the data via tablets or whether to conduct the interviews

in hard copy [paper] and to subsequently employ a data enumerator to enter the data into Excel

or directly into appropriate data analysis software [such as SPSS].

If the data is to be entered directly into a tablet, appropriate software must be selected and the

survey encoded to fit that software. Unless at least one member of the research team has

appropriate software development skills, this may not be an option. Even then: (i) transcribing

the survey instrument may take an inordinate amount of time; (ii) there may be an issue around

personal security for the enumerators in the markets and in the slum areas with the risk of

physical assault and robbery, and thus the loss of data; and (iii) battery life is limited and the

tablets may need to be recharged at least once during the day. At the end of each day, where

tablets are employed, data will need to be downloaded and saved to an appropriate external hard

drive.

Selecting the sample

In selecting the sample it is assumed that the research team has access to:

a physical map of the metropolitan area describing the physical boundaries for the

different suburbs and the location of: (i) traditional retail [wet] markets; (ii) suburban

shopping centres; and (iii) major shopping malls [supermarkets and hypermarkets]

from the most recent census, population estimates and a socio-economic profile for each

suburb

From the most recent census data, the proportion of the population falling within each socio-

economic classification is to be calculated. This will establish the number of strata groups to be

used in sampling. Please note: the number of strata will vary from country to country depending

upon the socio-economic classification utilised by the central statistics agency.

The total sample [450] is then to be allocated across the strata in proportion to the share of the

population within each stratum to determine the approximate number of

respondents to be interviewed from each stratum. Using information available from the central

statistics agency, each suburb is to be assigned to the most appropriate stratum and a number of

suburbs selected at random, corresponding to the proportion of the population within each

socio-economic classification. For each suburb selected, the minimum number of interviews to

be undertaken is 25. To summarise:

the total number of suburbs to be sampled is 18 [including the slums]

suburbs are to be placed into socio-economic groups

the number of suburbs selected from each group is determined by the population

distribution

With two enumerators, gathering 25 interviews is equivalent to one day in the field for each

suburb. Ideally, at least 12 interviews should be conducted in a modern retail store and at least

12 in a traditional wet market.
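
The allocation described above can be sketched in a few lines of Python. This is only an illustration: the three strata names and their population shares are hypothetical and must be replaced with the proportions calculated from the census, and the largest-remainder rounding rule is an assumption made here so that the suburbs always add up to 18.

```python
# Hypothetical population shares per socio-economic stratum (from the census).
population_share = {"low": 0.40, "middle": 0.45, "high": 0.15}

TOTAL_SAMPLE = 450
INTERVIEWS_PER_SUBURB = 25
TOTAL_SUBURBS = TOTAL_SAMPLE // INTERVIEWS_PER_SUBURB  # 18 suburbs


def allocate_suburbs(shares, total_suburbs):
    """Allocate suburbs to strata in proportion to population share.

    Uses largest-remainder rounding (an assumption of this sketch) so the
    allocations always sum to total_suburbs.
    """
    raw = {name: share * total_suburbs for name, share in shares.items()}
    alloc = {name: int(value) for name, value in raw.items()}
    leftover = total_suburbs - sum(alloc.values())
    # Hand any remaining suburbs to the strata with the largest fractions.
    by_fraction = sorted(raw, key=lambda name: raw[name] - alloc[name], reverse=True)
    for name in by_fraction[:leftover]:
        alloc[name] += 1
    return alloc


suburbs = allocate_suburbs(population_share, TOTAL_SUBURBS)
respondents = {name: n * INTERVIEWS_PER_SUBURB for name, n in suburbs.items()}
print(suburbs)
print(respondents)
```

With these illustrative shares, each stratum receives a whole number of suburbs and every selected suburb contributes 25 interviews, so the total remains 450.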

In stratifying the population, our objective is to draw a representative sample of the target

population. However, we must acknowledge that:

there can be a significant difference in personal incomes between households residing in

the same suburb

consumers will often travel some distance within the city to purchase food from

preferred outlets. This may be because of the perceived quality of the food, the range or

type of food available, low price, or because of ethnic, cultural or religious reasons

consumers are known to purchase different food items from different retail stores

the population of a city is not constant. On any given day, any number of people may

travel into the city for work and depart in the evening to return to their place of

residence. Even though these people may not reside in the target suburbs they are

NOT to be excluded.

Shoppers are to be randomly intercepted in the selected market by approaching every nth shopper. The value of n

can be determined in advance – for example every third person – but n will also be dependent

on the number of people present in the market. If the number of people is low, every shopper

passing the enumerator should be approached for an interview.
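
The intercept rule above might be expressed as follows. This is a sketch only: the function name and the low-traffic threshold of 10 shoppers are hypothetical, as the text does not specify when a market counts as quiet.

```python
def select_shoppers(passers_by, n=3, low_traffic_threshold=10):
    """Return the shoppers to approach from an ordered list of passers-by.

    low_traffic_threshold is a hypothetical cut-off: at or below it,
    every shopper is approached instead of every nth.
    """
    if len(passers_by) <= low_traffic_threshold:
        return list(passers_by)
    # Slice starting at the nth person, then every nth person after that.
    return list(passers_by[n - 1::n])


busy_market = list(range(1, 31))   # 30 shoppers pass the enumerator
quiet_market = list(range(1, 6))   # only 5 shoppers pass

approached_busy = select_shoppers(busy_market, n=3)
approached_quiet = select_shoppers(quiet_market, n=3)
print(approached_busy)
print(approached_quiet)
```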

Interviews are to be conducted on each day of the week and across a variety of times to ensure

that all socio-economic groups have a chance of being represented.

After a brief introduction to the project and the reasons for undertaking the interviews, two key

questions are to be asked of the potential respondents:

1. Are you responsible for at least 40% of the food purchasing decisions within your

household? This screening question will eliminate from consideration those respondents

who do not have the knowledge or experience to complete the survey instrument.

2. Do you have 20 minutes available to participate in this survey? This is to advise the

respondent of the approximate amount of time required to complete the survey, thereby

minimising the number of surveys that are not completed.

Before commencing the interviews, enumerators are to advise respondents that:

the data will only be used for research purposes

it will not be possible to identify any individual or to link any individual household to

the data collected

the respondent's participation is voluntary

the respondent may terminate the interview at any time

Not unexpectedly, respondents may expect to receive some reward or incentive for their

participation. Each team must make its own decision as to whether it will/will not offer an

incentive and also determine the type and value of the incentive. Care must be taken not to

select products that may be perceived as a commercial endorsement.

For the slum areas, the number of slums to be surveyed and thus the number of households to

be interviewed will also be determined from the census data. The key difference here is that

enumerators will need to be accompanied by a person who is familiar to and well respected by the

community.

The general location where each interview is conducted is to be recorded and that location

added to the interactive geospatial map for the city.

Development and pretesting of the survey instrument

Although the format for much of the questionnaire is fixed, BEFORE commencing data

collection, each research team must give some thought to defining the different types of retail

outlet operating in the city. The terms used to describe the different types of outlet should be

well understood by the respondents and may include:

traditional wet market

temporary market [operates only on specific days of the week]

farmers' market

small neighbourhood grocery stores

convenience store

petrol station

a specialist retail store [butcher, baker, fishmonger, greengrocer, delicatessen]

street vendor or roadside stall [fixed]

a mobile street vendor [may offer door-to-door sales]

supermarkets [independents and/or chain stores]

hypermarkets

internet

others [define]

For this study, there are 7 broad food categories:

meat and poultry

fish/seafood

dairy products

eggs

fresh fruit and vegetables

staples [includes cereals, pulses, roots and tubers, and plantains]

processed foods and beverages

By necessity, because this is a rapid assessment tool where the primary unit of analysis is the

household, some products or commodities may be absent and/or aggregated.

Some, but not all, of these groupings are subsequently used to explore:

the preferred place of purchase [which is known to differ for different food products]

the frequency of purchase

the type of products purchased [within the group]

how much money the consumer spends

why the consumer purchases from the preferred place of purchase

what proportion of the product purchased is wasted

why it is wasted

what the consumer does with the waste product

the consumer's concerns associated with the selected product group. Not unexpectedly,

these concerns will differ with each product group. For example, for fresh fruit and

vegetables, consumers may have reservations about freshness or chemical residues. For

meat, they may be more concerned about growth promotants, antibiotics, animal

welfare or the use of chemical agents to extend the shelf life.

Research teams will gather data for:

fresh fruit and vegetables

either meat, poultry or fish

staple food [cereal, pulse, root or tuber, or plantain]

packaged/processed food [whether canned, bottled, frozen or dehydrated. This includes

snacks, cakes, biscuits, pastries, soft drink, confectionery, breakfast cereals, bread etc

BUT it does NOT include food purchased from restaurants, fast food outlets or food

service outlets]

minimally processed food [this includes fresh cut vegetables, salad mixes, precut fruit

like pineapples, melons, durian or jackfruit, pomegranates etc]

food purchased from restaurants, fast food outlets and food service outlets, irrespective

of whether the food is eaten on the premises or purchased as a takeaway for

consumption at home

Acknowledging that as personal disposable incomes rise, a greater quantity of food is purchased

away from home, the questionnaire will endeavour to assess:

what proportion of the household food budget is spent on food away from home

where the food is consumed

by which members of the household and

for which occasion [breakfast, lunch, dinner]

It is also important to recognise that not all households are entirely dependent on food

purchases: some households will produce a proportion of their own food, others will receive a

proportion of their food from family members in rural areas and others, especially the

underprivileged, may obtain a proportion or all of their food from food aid programs.

Within the home, the RUFSAT tool will explore key issues associated with food hygiene and

safety. The questionnaire will gather information on:

the type of fuel used for cooking

the source of water

the incidence of diarrhoea

whether anyone in the household has received any information on the safe handling of

food and

whether the household owns a refrigerator or freezer [food storage]

The ultimate outcome of the RUFSAT tool is to provide a planning mechanism for local

government to ensure that all consumers, irrespective of income, have access to a sufficient

quantity of safe, healthy and nutritious food. To evaluate this, the questionnaire will conclude

with four sets of related questions:

the Food Insecurity Experience Scale [FIES] is an experience-based metric of severity

of food insecurity that relies on respondents' direct responses. To retain its integrity, this

scale is NOT to be modified in any way

the type of food consumed within the household for the past week and the variety of

foods consumed

whether any members of the household were unable to eat any types of food for either

medical, religious or personal reasons

whether anyone in the household has received any information on healthy, nutritious

diets

The final section of the questionnaire asks a range of demographic questions about the

respondent but more importantly, about the household. The responses to the various questions

asked here are central to the analysis of the data, for they will provide the independent variables

to be used in the cross-tabulations, t-tests and analysis of variance [ANOVA]. Issues such as

ethnicity, religion and income are hypothesized as having the most significant impact on food

purchases and food consumption, but where a person shops may also be influenced by access to,

and the availability of, transport [car or motorbike] and age [with younger people often preferring

to shop from modern retail outlets].

For several of the questions, team members will need to adjust the categories used in the survey

instrument. For example: Q77 – age group – the categories used should mirror those used by the

central statistics agency. This will allow the characteristics of the sample group to be compared

more readily with the population parameters. Similarly, in asking the respondent about

household income [Q92], rather than asking a direct question [which very few people will

answer], the respondent will be asked to indicate an income range. In establishing the range of

income groups, the categories used by the central statistics agency or taxation department

should be employed.

While other demographic information may or may not be used, the information gathered will be

used in profiling the respondents, thereby enabling a comparison of the sample parameters with

the population estimates. This comparison will provide a ‘goodness of fit’ measure.

Having finalised the questionnaire, the instrument should be pilot tested before undertaking the

survey. Here the researchers may choose to draw a convenience sample from workplace

colleagues [who have NOT been involved in the survey design], friends or family, or if

convenient, to go to the nearest market place or retail store. Around 10-15 surveys should be

sufficient to identify any major faults, errors of omission, or to indicate which questions may

need to be restructured [perhaps as a result of translation] to be better understood.

Data entry and encoding

To facilitate data entry a spreadsheet has been developed into which the responses are to be

entered. Every question and every subsection of every question has a column which is clearly

marked on both the questionnaire and the spreadsheet.

For the majority of questions, the answers are pre-coded, so all that needs to be done is for the

corresponding number to be entered into the appropriate column. However, where there are

open-ended questions, the data entry operator is to establish a master list. The master list is

dynamic: when a new item emerges it is given its own unique code, but where a

respondent gives a response that is already on the list, the number already assigned to that response is

entered. For example, in Question 9, respondents are asked to identify the reasons why they buy

food from their preferred retail store:

1 Freshness

2 Low price

3 Cleanliness

4 Halal

5 Lots of variety/choices

6 Close to home/office

7 Loyalty

8 Nice environment

9

10

Should the respondent provide multiple answers [sufficient space is provided to accommodate

up to five answers], their first responses might be low price [2] and close to home [6], but should

they then give a new reason such as nice environment, this is added to the list as [8]. With only one data operator the

master list can be easily constructed at the same time the data is entered. Should a double entry

be inadvertently made, it can be readily corrected by recoding.
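
As an illustration of how a dynamic master list behaves, the sketch below assigns the next unused code to each new answer and reuses the existing code when an answer is already listed. The function name is hypothetical, and matching answers after trimming spaces and ignoring case is an assumption; in practice the data operator decides when two answers mean the same thing.

```python
def code_response(master_list, answer):
    """Return the code for an open-ended answer, adding it if it is new."""
    key = answer.strip().lower()  # assumption: ignore case and stray spaces
    if key not in master_list:
        master_list[key] = len(master_list) + 1  # next unused code
    return master_list[key]


master_list = {}
answers = ["Freshness", "Low price", "freshness ", "Cleanliness", "Low price"]
codes = [code_response(master_list, answer) for answer in answers]
print(codes)
print(master_list)
```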

The other major area where a master list may need to be prepared relates to the occupation for

both the respondent and their spouse. However, developing the master list for this category will

be a two stage process. In the first instance, occupations are simply recorded, but in the second

stage, the various occupations need to be grouped into much smaller categories to facilitate data

analysis. Ideally, the construction of these groups should come from the categories used by the

central statistics agency or another appropriate government agency.

Data cleaning

BEFORE any analysis of the data is undertaken, a very basic frequency analysis is to be

conducted. The purpose of this is to:

correct any data entry errors – for example, in Q11 we ask respondents to indicate how

important a number of items are to them in their decision to purchase food from a retail

store. Respondents are asked to use a scale from 1 to 6. Should 66 appear, it is

immediately obvious that a double entry has occurred which can be very easily

rectified, but where it’s a 34, 45 or 56 for example, the data operator will need to return

to the questionnaire to identify the true record and correct.

Every questionnaire will have a unique code ranging from 001 to 450. This code is to be

added to the questionnaire by the data operator at the time of data entry so as to avoid

any duplication.

determine the normality of the data. In statistics, normality refers to the normal

distribution of the data. For data that is not normally distributed, analysts must exercise

a degree of caution in drawing conclusions, especially when associations or

relationships between elements are being explored and/or hypotheses statistically tested.

While various tools are available to correct or compensate for non-normal data, the level

of analysis undertaken in RUFSAT is unlikely to require those tools to be employed.
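
The frequency check for data entry errors described above could be sketched as follows. The Q11 values are invented for illustration, and the function name is hypothetical; the only rule taken from the text is that valid answers lie on the 1 to 6 scale.

```python
from collections import Counter


def frequency_check(values, valid_codes):
    """Count every entered value and list those outside the valid codes."""
    counts = Counter(values)
    invalid = {value: n for value, n in counts.items() if value not in valid_codes}
    return counts, invalid


# Hypothetical entries for one Q11 item; 66 and 45 are keying errors.
q11_entries = [1, 6, 66, 4, 45, 3, 6]
counts, invalid = frequency_check(q11_entries, valid_codes=range(1, 7))
print(counts)
print(invalid)
```

Any value reported as invalid points the data operator back to the corresponding questionnaire so that the true record can be identified.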

PLEASE NOTE. HAVING CHECKED THAT ALL RESPONSES ARE CORRECT, A

MASTER FILE SHOULD BE CREATED. THE DATA IN THIS FILE IS NOT TO BE

AGGREGATED: IT CONTAINS THE RAW DATA. ONE OR MORE COPIES OF THIS

FILE SHOULD BE STORED IN SEPARATE LOCATIONS TO PREVENT THE

ACCIDENTAL OR DELIBERATE LOSS OF DATA.

Data analysis

Before analysing the data, we must first describe the sample. From the frequency outputs, both

the raw number [frequency] and the percent of responses are to be entered into the following

table:

N %

Gender

Female

Male

Joint decision [where both male/female jointly answer]

Age [of the main respondent]

18 – 24 years

25 – 34 years

35 – 44 years

45 – 54 years

55 – 64 years

65 years and older

Marital status

Single

Married

Separated/divorced/widow[er]

Highest level of education achieved [by principal respondent]

Did not complete primary school

Primary school

Junior high school

Senior high school

Certificate

Diploma

Under-graduate degree

Post-graduate qualification

Occupation of the respondent

[occupational groups to be established]

Occupation of spouse

[occupational groups to be established]

Ethnicity [of principal respondent]

[list of ethnic groups to be entered from master list]

Ethnicity of spouse

[list of ethnic groups to be entered from master list]

Religion

[list of religious groups to be entered from master list]

Number of people living in house

[number of categories to be determined from output]

Immediate family members


Parents/relatives/extended family


Number of others living in house


Suburb in which respondent lives

[list of suburbs to be entered from master list]

Own a motorbike

Yes

No

Own a car

Yes

No

Access to credit cards

Yes

No

Household income

[number of groups to be determined from tax office]

Frequency of payment

Weekly

Fortnightly

Monthly

Irregular

To facilitate reading and so as not to cut the table in the middle of a section, the table may be

split into two or more separate tables.

For the initial descriptive analysis of the sample [and of the data generally], the full range of

responses should be presented. However, for many of the statistical tools that will be used to

analyse the data such as crosstabs, t-tests and analysis of variance, there is a requirement for a

minimum number of respondents. Where these conditions are not fulfilled the analysis will

return an error. To overcome that problem, some aggregation of the data may be necessary.

Let’s take family size for example:


People in household N %

1 26 5.8

2 65 14.4

3 84 18.7

4 96 21.3

5 78 17.3

6 62 13.8

7 17 3.8

8 13 2.9

9 6 1.3

10 2 0.4

11 0 0.0

12 1 0.2

In a situation like this it would be sensible to aggregate the data for all those households with

7 or more people living in the house so that the revised table would now look like this:


People in household N %

1 26 5.8

2 65 14.4

3 84 18.7

4 96 21.3

5 78 17.3

6 62 13.8

7+ 39 8.7

To facilitate the analysis, the data in the working spreadsheet must now be recoded, so that for

those respondents with 8, 9, 10 or 12 people in the household, a 7 will now be recorded.

A similar type of analysis might be undertaken to aggregate some of the educational

qualifications, perhaps combining Certificate and Diploma or Under-graduate and Post-graduate

if the numbers are low; recoding any minority ethnic groups and religious groups as

‘others’; or recoding those respondents who live in suburbs outside the study area as ‘others’.
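
The family size recoding above can be reproduced with a short script. The counts are those given in the family size table; the function name and the use of a single cap value are assumptions of this sketch.

```python
from collections import Counter

# Household size -> number of respondents, from the family size table.
household_counts = {1: 26, 2: 65, 3: 84, 4: 96, 5: 78, 6: 62,
                    7: 17, 8: 13, 9: 6, 10: 2, 12: 1}


def recode_household_size(size, cap=7):
    """Record every household of cap or more people as cap (read as '7+')."""
    return min(size, cap)


recoded = Counter()
for size, n in household_counts.items():
    recoded[recode_household_size(size)] += n

total = sum(recoded.values())
percent = {size: round(100 * n / total, 1) for size, n in recoded.items()}
print(recoded[7], percent[7])
```

The same approach applies to merging education levels or small ethnic and religious groups: map each original code to its new group, then rerun the frequencies.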

Before moving to the data which relates to food, the research team should compare the sample

characteristics with those reported by the central statistics agency. Where possible, to facilitate

the comparison, the data should be tabulated. Taking this example from Perth, Western

Australia:

Age [of the main respondent]

ABS brackets   Percent [ABS]   Percent [survey]   Variance

18-24 16.3 24.6 + 8.3

25-34 17.4 26.0 + 8.6

35-44 18.2 17.9 - 0.3

45-54 17.3 16.8 - 0.5

55-64 14.2 7.2 - 7.0

65+ 16.6 7.5 - 9.1

Note: percentage adjusted after the exclusion of people under the age of 17

It is immediately evident that the age groups 18-24 years and 25-34 years are over-represented

while those aged 55 years and over are under-represented. While any number of reasons might

explain some or all of the variation, this is the sample that we have and must now work with.

However, we must note the limitations and be conscious of these in drawing any conclusions.
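
The comparison in the Perth table is simply the survey percentage minus the population percentage for each age bracket. The sketch below reproduces it; the 5 percentage point threshold used to flag brackets is a hypothetical choice for illustration, not a rule from this guide.

```python
# Percentages taken from the Perth, Western Australia example.
abs_percent = {"18-24": 16.3, "25-34": 17.4, "35-44": 18.2,
               "45-54": 17.3, "55-64": 14.2, "65+": 16.6}
survey_percent = {"18-24": 24.6, "25-34": 26.0, "35-44": 17.9,
                  "45-54": 16.8, "55-64": 7.2, "65+": 7.5}

# Positive values mean the bracket is over-represented in the sample.
difference = {bracket: round(survey_percent[bracket] - abs_percent[bracket], 1)
              for bracket in abs_percent}

FLAG_THRESHOLD = 5.0  # hypothetical cut-off, in percentage points
flagged = [bracket for bracket, gap in difference.items()
           if abs(gap) > FLAG_THRESHOLD]
print(difference)
print(flagged)
```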

For Question 1, in recording the proportion of food consumed in the household by the source of

the food, rather than recording frequencies, on this occasion, given that the data is metric, the mean

and standard deviation [SD] will be recorded.

Proportion of all food consumed in the household:

Mean SD

grow/produce yourself

receive as a gift from friends or family

receive as part of a remuneration package [from work]

purchase from a retail food store

purchase from a restaurant/food service outlet

receive from a food aid agency

Question 2, who makes the decision to purchase, will be presented in a similar manner.
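
For teams working outside a statistics package, the mean and standard deviation for each food source can be calculated as sketched below. The five household responses are invented for illustration and the short source names are hypothetical labels. The sample standard deviation [dividing by n - 1] is assumed here.

```python
from statistics import mean, stdev

# Hypothetical Q1 answers: percent of household food from each source.
responses = [
    {"grow": 10, "gift": 0, "retail": 80, "restaurant": 10},
    {"grow": 0, "gift": 5, "retail": 90, "restaurant": 5},
    {"grow": 20, "gift": 5, "retail": 70, "restaurant": 5},
    {"grow": 0, "gift": 0, "retail": 100, "restaurant": 0},
    {"grow": 25, "gift": 5, "retail": 60, "restaurant": 10},
]

summary = {}
for source in responses[0]:
    values = [household[source] for household in responses]
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    summary[source] = (round(mean(values), 1), round(stdev(values), 1))

for source, (avg, sd) in summary.items():
    print(source, avg, sd)
```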

While the mean will reveal the importance of each source of food, the standard deviation [SD],

which is derived from the variance around the mean, will indicate opportunities to interrogate

the data using some of the socio-demographic variables as the independent variable. For example,

it would not be unreasonable to hypothesize that households in the slum districts [suburbs]

would be more likely to rely on food aid agencies than those residing in more affluent suburbs.

Using household income as the independent variable, one might hypothesize that as income

increases households will purchase a greater quantity of food from restaurants or food service

outlets. However, lower income groups may also purchase a large proportion of their food from

fast food outlets because of convenience. In this case, to test our hypotheses, analysis of variance

[ANOVA] will be conducted. However, while ANOVA may reveal a statistically significant

result, it will not reveal where the differences occur. For this reason it will be necessary to use a

post-hoc test. Commonly used options include Duncan’s multiple range test and Tukey’s HSD [honestly significant difference].
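
To show what ANOVA calculates, the sketch below computes the one-way F statistic by hand for three hypothetical income groups. The figures are invented, and the function name is an assumption. In practice the analysis software reports F, the p value and the post-hoc groupings directly, so this is for understanding only.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical percent of food bought from restaurants, by income group.
low_income = [5, 10, 8, 7]
middle_income = [15, 12, 18, 14]
high_income = [25, 30, 22, 27]

f_stat, df_between, df_within = one_way_anova_f(
    [low_income, middle_income, high_income])
print(round(f_stat, 2), df_between, df_within)
```

A large F relative to the critical value for the stated degrees of freedom indicates that at least one group mean differs; the post-hoc test then shows which ones.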

In the following example [from Perth, Western Australia], respondents were asked to indicate

what proportion of the fresh fruit and vegetables consumed in the household were: (i) grown by

themselves; (ii) purchased from a retail store; or (iii) purchased direct from the producer. From

the Australian Bureau of Statistics, the following age categories were employed:

Mean percent of purchases by age

18-24 25-34 35-44 45-54 55-64 65+ P

Grow own 5.62b 7.09b 9.01ab 11.86ab 13.97a 15.33a 0.000

Purchase 85.26a 83.47a 79.66ab 80.34ab 72.58b 77.86ab 0.001

Purchase direct 3.44ab 5.56ab 5.88ab 3.30ab 6.80a 1.67b 0.003

where those items with the same superscript are not significantly different at p = 0.05

The superscripts a and b indicate that there are two significantly different groups: Group a and Group b. If we

look at the proportion of the fresh fruit and vegetables grown by the respondent, it is

immediately obvious that those respondents aged 55 and above grow a larger proportion of the

fresh fruit and vegetables that they consume than respondents aged 34 years and below. Those

respondents aged 35 to 54 years [ab] do not differ significantly from either group.

In recording the results for Q3 through Q7, given the potentially large number of responses

[depending on the different kinds of retail outlet] it will be best to look at each question

separately. For Question 3, as the data is non-metric [or categorical] only frequencies and

percent can be displayed.

Do you shop for food in any of these places?

Store 1 Store 2 Store 3 Store 4 Store 5 Store 6

N % N % N % N % N % N %

Yes

No

If yes, how often do you shop in each place?


N % N % N % N % N % N %

1

2

3

4

5

6

where 1 is everyday

2 is 2-3 times a week

3 is once a week

4 is 2-3 times a month

5 is once a month

6 is seldom

For Question 6 – approximately how far [m] is each retail store from your place of residence? –

the response is metric, hence the mean and standard deviation can be used here.


µ SD µ SD µ SD µ SD µ SD µ SD

m

where µ is the mean

SD is the standard deviation

Questions 8 and 9 are also metric, and thus the mean and standard deviation should be used.

Questions 10 and 11 will require the data enumerator to develop a master list. However, this is a

multiple response question where respondents are encouraged to give multiple answers. In

marketing research, it is normal to treat the order in which respondents give their replies

as a measure of importance. Thus, when encoding the responses, it is important to place the

code in the appropriate column in the order in which it was presented by the respondent. Each

column is then analysed separately and the responses placed into a table as shown.

This example is drawn from a study of store choice for fresh fruit and vegetables in Perth,

Western Australia. In calculating the percentages, it is not as a percentage of responses, but

rather as a percentage derived from the number of respondents [as given in Column 1]. As it is

not our intention to differentiate between the different types of retail store, all the data,

irrespective of store type, can be aggregated. However, should the team wish to look at the

responses for different socio-economic groups, while this is relatively easy to do by applying a

filter, it is very time consuming.

Ranking N %

1 2 3 4 5

Competitive price 65 99 40 14 3 221 46.9

Good quality produce 65 70 30 10 1 176 37.3

Fresh 98 44 24 4 1 171 36.3

Convenience 88 28 14 6 1 137 29.1

Wide range of produce 18 28 28 5 1 80 17.0

Close proximity to home 38 15 8 5 66 14.0

Location 32 17 5 4 58 12.3

Wide range of other foods 7 10 9 4 2 32 6.8

Local produce 8 6 7 1 1 23 4.9

Only shop available 10 2 1 13 2.8

Presentation 2 2 4 3 1 12 2.5

Origin of the produce 4 1 3 1 1 10 2.1

Value for money 2 2 5 1 10 2.1

Credit card facilities 1 3 4 0.8

Quick checkout 2 2 4 0.8

Close to other shops 1 1 1 3 0.6

Atmosphere 1 2 3 0.6

Social meetings 1 1 2 0.4

Ability to self select 1 1 2 0.4

Store layout 1 1 2 0.4

Loyalty programs 1 1 2 0.4

N 471

Note: table has been edited to fit the page
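
The percentages in a multiple response table are calculated against the number of respondents, not the number of responses. The sketch below shows the arithmetic using three rows from the table above, with N = 471 respondents; the variable names are hypothetical.

```python
N_RESPONDENTS = 471

# Counts by rank [1st to 5th mention] for three rows of the table above.
rank_counts = {
    "Competitive price": [65, 99, 40, 14, 3],
    "Fresh": [98, 44, 24, 4, 1],
    "Convenience": [88, 28, 14, 6, 1],
}

# Total mentions per reason, summed across the five answer columns.
totals = {reason: sum(counts) for reason, counts in rank_counts.items()}

# Percent of respondents who mentioned the reason at any rank.
percent = {reason: round(100 * n / N_RESPONDENTS, 1)
           for reason, n in totals.items()}

for reason in sorted(totals, key=totals.get, reverse=True):
    print(reason, totals[reason], percent[reason])
```

Because each respondent may give up to five answers, the percentages across all reasons will sum to more than 100.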

Question 12 differs from the fixed response categories that have been previously used in that the

response range is a metric scale [like a thermometer] where the difference between 1 and 2 is

the same as the difference between 4 and 5. Hence, in recording the results for this question, the

mean and standard deviation are to be used. To record the data, simply copy and paste the table

and leave just two columns. Having entered the results into the table, order from most important

to least important. Whether this is an ascending or descending scale will be determined by the

prevailing in-country social norms: in some countries 1 is the most important, while in others, 6

is the most important. A six-point scale has been used purposefully to prevent respondents from

utilising the neutral midpoint: they are forced to think about each question. See this example

from Perth, Western Australia:

Importance of criteria respondents use in their choice of retail store

Mean SD

Good quality produce 5.68a 0.66

Fresh produce 5.64a 0.69

A wide range of fresh produce 5.48a 0.80

Clean 5.35a 0.92

Good value for money 5.30a 0.94

Competitive price 5.11a 1.00

All product is clearly priced 5.08b 1.06

I can self select 4.98c 1.19

Fast and efficient check-out 4.96c 1.02

Close to my home 4.95c 1.28

Easy to access 4.91c 1.14

Customer service 4.79d 1.20

Fresh produce is refrigerated 4.78d 1.29

Good access to product on the shelf 4.70e 1.16

A wide range of other food products 4.69e 1.35

Price specials or discounts 4.63e 1.31

Plentiful car parking 4.58e 1.41

Trolleys and baskets easily accessible 4.56e 1.29

One-stop shop – can purchase everything 4.51f 1.50

Origin of the product is clearly displayed 4.43g 1.38

Knowledgeable staff 4.39h 1.31

Good lighting 4.30i 1.25

Attractive presentation 4.24i 1.33

Favourable prior purchase 4.15j 1.39

Extended trading hours 3.97k 1.76

Attractive décor and surroundings 3.95k 1.34

Refund/return policy 3.81l 1.72

Clear signage 3.67m 1.49

Organic produce 3.31n 1.56

Product information available in-store 3.31n 1.46

In-store tastings 2.97o 1.53

Loyalty programs 2.83o 1.62

Advertising on radio/tv/newspapers 2.58p 1.46

Offer home delivery 2.14q 1.46

where 1 is ‘not at all important’ and 6 is ‘very important’

those items with the same superscript are not significantly different at p = 0.05

Ideally, if there is a member on the team who is proficient with statistics and the use of the data

analysis software, a rank ordering of the means should be performed. As demonstrated in the

example above, this will show where items are significantly different from others. In that

example, there is no significant difference between quality, cleanliness and price: they

are all equally important in the consumers’ decision to purchase fresh produce from a retail

store.

To determine what impact the socio-economic dimensions may have on the respondent’s decision

to purchase, ANOVA may be employed, with an appropriate post-hoc test to identify where the

differences may be found. It is also possible, through principal component analysis, to

aggregate the responses and thus identify underlying latent constructs [or factors], but this

is beyond the scope of this rapid appraisal tool.

Using fresh fruit and vegetables [Q13-19] as an example for each of the sections which follow,

the presentation of the results is somewhat more complex.

For Question 13, based on the category of stores selected for Q3-7 and Q9, a master list is to be

prepared to establish a code for each store type. Frequencies and percent are then recorded in

the following table for both the preferred and second most preferred store.

Where respondents buy the majority of their fresh fruit and vegetables

                 Most preferred place      Second most preferred place
                 Frequency    Percent      Frequency    Percent
Store 1
Store 2
Store 3
Store 4
Store 5
Internet
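
The frequency and percent columns can be produced directly from the coded responses. The following is a minimal sketch only: the master list, the store codes and the responses are hypothetical placeholders, and missing answers are assumed to be recorded as None and left out of the percent base.

```python
from collections import Counter

# Hypothetical master list: code -> store type (labels are placeholders).
STORE_CODES = {1: "Store 1", 2: "Store 2", 3: "Store 3",
               4: "Store 4", 5: "Store 5", 6: "Internet"}


def frequency_table(codes):
    """Return {label: (frequency, percent)} for a list of store codes.

    Missing answers (None) are excluded from the percent base.
    """
    valid = [c for c in codes if c is not None]
    counts = Counter(valid)
    total = len(valid)
    table = {}
    for code, label in STORE_CODES.items():
        freq = counts.get(code, 0)
        pct = round(100 * freq / total, 1) if total else 0.0
        table[label] = (freq, pct)
    return table


# Illustrative responses only, not survey data.
most_preferred = [1, 1, 2, 3, 1, 6, 2, None, 1, 2]
table = frequency_table(most_preferred)
for label, (freq, pct) in table.items():
    print(f"{label:<10}{freq:>5}{pct:>8.1f}")
```

The same function can be applied to the second most preferred store to fill the remaining two columns.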

Cross-tabulations can be easily done by selected socio-economic variables to determine what,

if any, impact these variables may have on the place of purchase. Cross-tabulations are used in

those cases where both the dependent and independent variables are non-metric. Here is an

example taken from a study that looks at the willingness of consumers in Australia and Japan to

pay a price premium for Fairtrade coffee:

Willingness to pay for Fairtrade coffee from a retail store

WA Japan

N % N %

No more 37 24.8 57 36.5

10% more 50 33.6 56 35.9

15% more 20 13.4 18 11.5

20% more 42 28.2 25 16.0

p<0.031

Where a statistically valid result is required, the chi-square test is usually applied. In this case,

where the p value is less than 0.05, the researcher can conclude that consumers in Japan are less

willing to pay a price premium than consumers in WA.
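
For teams without a statistics package to hand, the chi-square statistic itself is simple to compute. The sketch below uses the counts from the Fairtrade coffee table; the function name is illustrative, and the statistic is compared with the standard critical value for 3 degrees of freedom at p = 0.05 [7.815] rather than converted to an exact p value.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: no more, 10% more, 15% more, 20% more. Columns: WA, Japan.
fairtrade = [[37, 57], [50, 56], [20, 18], [42, 25]]
statistic, df = chi_square(fairtrade)

CRITICAL_05_DF3 = 7.815  # chi-square critical value, p = 0.05, 3 degrees of freedom
significant = statistic > CRITICAL_05_DF3
print(round(statistic, 2), df, significant)
```

A statistic above the critical value indicates that the distribution of answers differs between the two samples at the 5% level.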

For Question 14, a similar table can be used to that employed for Q13 as it is not our intention

to differentiate between the different types of retail store.

Frequency of purchasing fresh fruit and vegetables from the respondents’ preferred and second

most preferred retail store

                    Most preferred place      Second most preferred place
                    Frequency    Percent      Frequency    Percent
Everyday
2-3 times a week
Once a week
2-3 times a month
Once a month

Again, cross-tabulations can be easily done using selected socio-economic variables to

determine what, if any, impact these variables may have on the frequency of purchase.

For Question 15, the table will be analysed in a similar way to Questions 10 and 11. Two tables

must therefore be prepared: one for the preferred store and one for the second most preferred

retail store. However, while it may be possible to use the same master list, respondents may

raise a number of issues, particularly about quality, which will be specific only to fresh fruit and

vegetables. Similarly, for each of the other sub-sections: meat, poultry and/or fish, staple food,

processed food and semi-processed food, different master lists will need to be prepared.

Question 16 is expected to generate a very large list of fresh fruit and vegetable types. However,

the analysis here is to be undertaken somewhat differently. Here it is recommended that a

separate data file be developed: one for each of the different commodity groups. Every fruit or

vegetable mentioned by the respondents [collectively] is to have its own column. Where the

consumer indicates that they purchase that fruit or vegetable, a 1 is to be entered into the

column. Where the consumer does not purchase it, enter a 0. For example:

Respondent Orange Apple Mango Grape Peach Pear … …

001 1 1 0 1 0 1

002 1 1 1 1 1 1

003 1 1 0 0 0 0

004 1 1 0 1 1 1

005 0 1 0 0 0 1

006 1 0 1 0 0 0

…
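
Where the answers have first been typed in as lists, the 0/1 columns can be generated rather than entered by hand. This is a sketch only: it assumes each respondent's answer is held as a list of product names that have already been spelt consistently, and the function name is illustrative. The answers reproduce the six respondents in the example above.

```python
# Answers to Q16 for the six respondents in the example above.
answers = {
    "001": ["Orange", "Apple", "Grape", "Pear"],
    "002": ["Orange", "Apple", "Mango", "Grape", "Peach", "Pear"],
    "003": ["Orange", "Apple"],
    "004": ["Orange", "Apple", "Grape", "Peach", "Pear"],
    "005": ["Apple", "Pear"],
    "006": ["Orange", "Mango"],
}


def encode(answers):
    """Return (columns, rows): one column per product mentioned by anyone,
    and a list of 1/0 values per respondent in the same column order."""
    columns = []
    for items in answers.values():
        for item in items:
            if item not in columns:
                columns.append(item)  # columns appear in order of first mention
    rows = {
        rid: [1 if col in items else 0 for col in columns]
        for rid, items in answers.items()
    }
    return columns, rows


columns, rows = encode(answers)
print("Respondent", *columns)
for rid, values in rows.items():
    print(rid, *values)
```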

When the data encoding process is complete, additional columns may then be added and the

desired socio-economic data for the respondents copied and pasted from the main spreadsheet.

Having entered the data in this way, it is now easy to determine which fruit and vegetables

respondents from different socio-economic groups are consuming. This information can be

readily extracted using cross-tabulations and entered into the following table [using income

groups as an example]. Because of the different sizes of the different socio-economic groups,

the analysis of the table will be best undertaken using the percent of respondents for each socio-

economic group [hence it is not necessary to add frequencies]. To obtain these figures, each

product will need to be analysed as an independent cross-tab. However, whether or not meaningful

statistics can be extracted will be determined primarily by the number of respondents

purchasing the product and the number of socio-economic groups.

Percent

            Income Grp 1   Income Grp 2   Income Grp 3   Income Grp 4   Income Grp 5
Orange
Apple
Mango
Grape
Peach
Pear
…
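
The percent of respondents in each group who purchase a product can be computed from the 0/1 columns as follows. This is a minimal sketch: the purchase values follow the six-respondent example above, but the income groups assigned to them are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict

# 0/1 purchase data from the example; the income groups are hypothetical.
records = [
    {"id": "001", "income_grp": 1, "Orange": 1, "Apple": 1, "Mango": 0},
    {"id": "002", "income_grp": 2, "Orange": 1, "Apple": 1, "Mango": 1},
    {"id": "003", "income_grp": 1, "Orange": 1, "Apple": 1, "Mango": 0},
    {"id": "004", "income_grp": 2, "Orange": 1, "Apple": 1, "Mango": 0},
    {"id": "005", "income_grp": 1, "Orange": 0, "Apple": 1, "Mango": 0},
    {"id": "006", "income_grp": 2, "Orange": 1, "Apple": 0, "Mango": 1},
]


def percent_by_group(records, product, group_key="income_grp"):
    """Percent of respondents in each group who purchase the product."""
    buyers = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        totals[rec[group_key]] += 1
        buyers[rec[group_key]] += rec[product]
    return {grp: round(100 * buyers[grp] / totals[grp], 1) for grp in sorted(totals)}


for product in ("Orange", "Apple", "Mango"):
    print(product, percent_by_group(records, product))
```

Because each group's percent uses that group's own number of respondents as the base, groups of different sizes can be compared directly.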

Question 17, expenditure, is a metric scale and thus mean and standard deviation can be

reported. Using ANOVA, the means can then be compared by the selected socio-economic

variables.
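
To illustrate what ANOVA computes, the sketch below derives the F statistic for a one-way comparison of means. The expenditure figures and the three groups are hypothetical. The p value and any post-hoc comparison would normally come from the data analysis software, so the sketch stops at F and its degrees of freedom.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical expenditure for three income groups.
spend = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(spend)
print(round(f_stat, 2), df_b, df_w)
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one group mean differs from the others.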

Question 18 will require three tables. Q18a, the proportion of fresh fruit and vegetables thrown

away, is a metric scale, so the mean and standard deviation are reported. To determine any

significant difference in the amount of product thrown away from the preferred store and the

second most preferred store, a paired sample t-test can be employed. The paired test uses only

those respondents who answered both questions; those who answered only one are excluded.
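
The paired comparison can be sketched as follows. The percentages are hypothetical, missing answers are assumed to be recorded as None, and the function name is illustrative. The resulting t value is compared with the t distribution on n - 1 degrees of freedom, a step normally left to the data analysis software.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, n) for paired samples, using only respondents who answered both."""
    pairs = [(a, b) for a, b in zip(first, second) if a is not None and b is not None]
    diffs = [a - b for a, b in pairs]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n


# Hypothetical percent thrown away; None marks a question left unanswered.
preferred = [10, 20, None, 30, 12, 8]
second = [8, 15, 5, 24, None, 9]
t_value, n_pairs = paired_t(preferred, second)
print(round(t_value, 3), n_pairs)
```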

Whereas ANOVA is most often used to look for significant differences where there are three or

more categories for the independent [grouping] variable [like income group], the t-test is used

to make comparisons between two groups [such as gender: male or female]. In this example from

Perth, Western Australia, the percent of fresh produce purchased from a retail store is compared

for those respondents who have a home garden and those who do not:

Percent of purchase by garden/no garden

Garden No garden p

Mean SD Mean SD

grow yourself 22.67 21.60 0.00 0.00 0.000

purchase from a retail outlet 67.51 27.57 90.73 19.56 0.000

purchase direct from the grower 5.02 13.45 4.00 13.07 0.258

obtain as a gift from friends and family 3.03 5.72 2.69 7.42 0.469

obtain from other sources 1.45 7.15 1.87 9.05 0.458

In this instance, the results show that respondents who had a vegetable garden at home obtained,

on average, 23 percentage points less of their fresh fruit and vegetables from a retail store

[67.5% compared with 90.7%].

Q18b and Q18c will both require master lists to be developed. For the tables, both frequencies

and percent are to be recorded.

Question 19 is to be analysed in a similar manner to Q10 and Q11. Having developed the master

list, the consumers’ concerns are to be entered in the order in which they are presented. Multiple

answers are expected.

While it will be possible to explore potential differences in the respondents’ concerns by socio-

economic groups, the analysis will take some time. Should the team wish to undertake this

analysis, in the first instance, the analysis [as described above] should be repeated for each

socio-economic group to determine if there is any difference in the ranking of significance [or

importance] of the constraints. The percent of responses divided by the number of respondents

[see the WA fresh fruit and vegetable data set] can then be copied and pasted into another table

and combined with the data from the analysis of other sub groups. The comparison between

groups will best be facilitated by the use of percent responses because of differences in the size

of different socio-economic groups. However, please take care in transcribing the data from one

table to the other as the order may change between groups.
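
The risk of rows changing order between groups can be avoided by keying every figure to the concern itself rather than to its position in the table. The sketch below assumes the concerns have already been coded against the master list. The concerns, groups and answers shown are hypothetical.

```python
from collections import Counter


def percent_of_respondents(responses):
    """Percent of respondents mentioning each coded concern (multiple answers allowed)."""
    n = len(responses)
    # set() ensures a concern is counted once per respondent.
    counts = Counter(code for answer in responses for code in set(answer))
    return {code: round(100 * c / n, 1) for code, c in counts.items()}


# Hypothetical coded concerns for two hypothetical socio-economic groups.
by_group = {
    "Income Grp 1": [["price", "freshness"], ["price"], ["pesticides", "price"], ["freshness"]],
    "Income Grp 2": [["pesticides"], ["pesticides", "freshness"]],
}

summary = {grp: percent_of_respondents(r) for grp, r in by_group.items()}
concerns = sorted({c for pct in summary.values() for c in pct})

# One row per concern, one column per group, always in the same order.
for concern in concerns:
    print(concern, [summary[grp].get(concern, 0.0) for grp in by_group])
```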

Moving now to Questions 57 and 58, as the responses here are metric, mean and standard

deviation can be employed.

Monthly expenditure on meals consumed away from home

Mean SD

Monthly expenditure on meals consumed away from home

Monthly expenditure on meals consumed away from home by place of purchase

Mean SD

Restaurants

Fast food outlets

Work canteens

School lunches

Street vendors

Other
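
The mean and standard deviation for each row of the table can be obtained as in the following sketch. The expenditure figures are hypothetical, and unanswered items are assumed to be recorded as None and excluded from the calculation.

```python
from statistics import mean, stdev

# Hypothetical monthly expenditure by place of purchase; None marks no answer.
expenditure = {
    "Restaurants": [40, 60, None, 80],
    "Fast food outlets": [10, 20, 30, None],
}


def mean_sd(values):
    """Return (mean, sample SD), ignoring missing answers.

    The SD is None when fewer than two answers are available.
    """
    valid = [v for v in values if v is not None]
    if not valid:
        return None, None
    if len(valid) < 2:
        return mean(valid), None
    return mean(valid), stdev(valid)


for place, values in expenditure.items():
    m, sd = mean_sd(values)
    print(f"{place:<20}{m:>8.2f}{sd:>8.2f}")
```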

To explore differences between different socio-economic groups, ANOVA is to be employed

with a post-hoc test [such as Duncan’s multiple range test or Tukey’s HSD], or where there are

just two categories, the independent t-test.

To record the responses to Q59, an indeterminate number of columns will be required,

depending on the number of immediate family members living in the house, parents and

relatives, and domestic helpers. The data however is metric and thus mean and standard

deviation can be utilised to facilitate data analysis.

Number of meals eaten at home and away from home per week by occasion

                   Breakfast                 Lunch                     Dinner
                   Home         Away         Home         Away         Home         Away
                   µ     SD     µ     SD     µ     SD     µ     SD     µ     SD     µ     SD
Self
Spouse
Child 1
Child 2
…
Parent 1
Parent 2
…
Relative 1
Relative 2
…
Domestic helper

To explore any differences between socio-economic groups, use ANOVA with a post-hoc test

[such as Duncan’s multiple range test or Tukey’s HSD] or the independent t-test.

While Questions 60, 61 and 62 are also metric, and thus both the mean and standard deviation

are to be recorded, it will also be useful here to record the number of households [N] that are

producing a proportion of their own food; receiving a proportion from friends, family,

neighbours or work; or receiving a proportion from a food aid program.

Proportion [%] of food consumed in the household produced by the household

Proportion [%] of food consumed in the household received as a gift from friends, family,

neighbours or work

Proportion [%] of food consumed in the household from a food aid program

N Mean [µ] SD

Meat/poultry

Fish/seafood

Dairy products

Eggs

Fresh fruit and vegetables

Staple

Processed foods

To explore any differences between socio-economic groups, use ANOVA with a post-hoc test

[such as Duncan’s multiple range test or Tukey’s HSD] or the independent t-test.

For Question 61, where respondents do produce a proportion of their own food, the size of the

vegetable garden and/or fish pond and the number of fruit trees, chickens and livestock are to be

recorded as given. Respondents who do not grow any proportion of their own food should be

recorded as 0.

Statistics for home food production

Mean [µ] SD

Size of vegetable garden/roof top garden [m²]

Number of fruit trees

Number of chickens

Number of other livestock

Size of fish pond [m²]

Question 62, the source of water, will require a master list to be developed and from that the

frequency and percent recorded.

Questions 65 – 69 are all categorical [non metric] responses, which require frequency and

percent to be recorded.

Questions 70 – 73 each contain multiple sub-questions. However, in all instances the data is

primarily non metric, requiring the data enumerator, where appropriate, to develop a master list

and to enter the appropriate code. Frequencies and percent are to be reported.

While Question 74 uses a metric scale, on this occasion, rather than using means and standard

deviation, the results will be easier to interpret using the percent of responses for each food

category.

Number of times in the last seven days each food group was consumed by the household

Percent

0 1 2 3 4 5 6 7

Your staple food

A red meat – beef/mutton/goat

Chicken

Fish

A green leafy vegetable

At least two other vegetables

One or more pulses

Fresh milk

Any other dairy product

One other staple

Fresh fruit

Dried fruit/nuts

Highly processed snack foods

Chocolate/confectionery

The objective of this question is to examine the diversity of the household diet. It would not be

unreasonable to assume that among the lower income households, the diet will be much less

diverse and may even be deficient for some nutrient groups [hence the need to record and report

non-consumption (0), which would otherwise be lost if we were only to report the mean]. To test

that hypothesis, ANOVA will need to be employed to compare the mean number of days each food

group was consumed across the different socio-economic groups. Having applied a post-hoc test

[such as Duncan’s multiple range test or Tukey’s HSD], it should be possible to see which food

groups are consumed less often by which socio-economic groups. However, this analysis does

not tell us, nor does it need to, whether the diet is deficient in any food group, for we have no

measure of the quantity of each food group purchased, nor do we know which family members

consume it. This level of detail requires a far more sophisticated survey instrument and a great

deal more time than the RUFSAT tool permits.
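
The percent distribution reported for Question 74 [the share of households consuming a food group on 0 to 7 days] can be produced as in the sketch below. The answers shown are hypothetical and the function name is illustrative.

```python
from collections import Counter


def consumption_distribution(days):
    """Percent of households reporting 0..7 days of consumption for one food group."""
    valid = [d for d in days if d is not None]
    if not valid:
        return [0.0] * 8
    counts = Counter(valid)
    return [round(100 * counts.get(k, 0) / len(valid), 1) for k in range(8)]


# Hypothetical answers for one food group from ten households.
fish = [0, 0, 1, 2, 0, 7, 3, 1, 0, 2]
dist = consumption_distribution(fish)
print("Fish", dist)
```

The first figure in each row is the share of households that did not consume the food group at all in the last seven days.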

However, one of the factors that can have a pronounced impact on household diet is the

dietary choices that individuals within the household make, either by choice [such as vegan or

vegetarian, or to lose weight], for medical reasons [such as food allergies, coronary heart

disease, high cholesterol, diabetes], or for religious reasons. Question 75 seeks to capture these

choices. The data is to be recorded by frequency and percent.

Are any members of the household:

Yes No

N % N %

completely vegetarian

mainly vegetarian

vegan

following a strict plan to lose weight

on a casual diet to lose weight

on a special diet for medical reasons

on a special diet due to allergies

on a special diet for religious reasons

The final question [Q76] utilises the Food Insecurity Experience Scale (FIES), developed by the

Voices of the Hungry (VoH) project. These responses are collected through the FIES Survey

Module (FIES-SM) which consists of eight questions regarding people's access to adequate

food. Used in combination with other measures, the FIES has the potential to contribute to a

more comprehensive understanding of the causes and consequences of food insecurity and to

inform more effective policies and interventions. However, unlike any of the earlier questions,

the responses to this question must always be analysed together as a scale, not as separate items.

To do that, the data is entered into…. [and this is where I need the how]

While I’ve just read the monograph [http://www.fao.org/3/a-as583e.pdf] this doesn’t tell me how

to do it… and if we can’t put this matrix into a simple tool to analyse it, it doesn’t belong in

RUFSAT.
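
As a possible first step only, each respondent's raw score [the number of the eight items answered yes] can be computed as sketched below. This assumes each item is coded 1 for yes and 0 for no, with None for a missing answer. The function name is illustrative. It is not the full FIES analysis: the Voices of the Hungry methodology fits a Rasch model to the responses, which this sketch does not attempt.

```python
FIES_ITEMS = 8  # the FIES-SM consists of eight yes/no questions


def raw_score(answers):
    """Number of affirmative answers (0-8); None if any item is missing.

    Assumes each item is coded 1 for yes and 0 for no.
    """
    if len(answers) != FIES_ITEMS or any(a is None for a in answers):
        return None
    return sum(1 for a in answers if a)


# Hypothetical respondents.
respondents = {
    "001": [1, 1, 1, 0, 0, 0, 0, 0],
    "002": [0, 0, 0, 0, 0, 0, 0, 0],
    "003": [1, 1, None, 0, 0, 0, 0, 0],
}
scores = {rid: raw_score(ans) for rid, ans in respondents.items()}
print(scores)
```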
