EXERCISES
Exercise 1 (Chapter Two):
1.1: Household Characteristics
Open c:\intropov\data\Hh.sav that contains household level variables. There will be 496
households in the file. Note that each column corresponds to each variable, whereas each
row represents each observation or household. All the variables included in the data are
described in Appendix 1.
Among the variables is one called ‘weight’, the sampling weight given to each
household. From it, we can calculate the population weight by multiplying
‘weight’ by the size of the household (see also Chapter 2 of the ‘Poverty Manual’). Answer the
following questions:
(a) Generate the population weight called ‘pop’.
(b) Compare the total number of households and the sum of population.
(c) There are four regions in the survey: Dhaka, Chittagong, Khulna, and Rajshahi.
‘region’ is a string variable. Recode the regional names into a new variable called
‘reg1’, coding Dhaka, Chittagong, Khulna, and Rajshahi as 1, 2, 3, and 4, respectively.
To do so, type the following commands in the Syntax Editor.
STRING reg1 (A8).
IF (region = "Dhaka") reg1 = "1".
IF (region = "Chittagong") reg1 = "2".
IF (region = "Khulna") reg1 = "3".
IF (region = "Rajshahi") reg1 = "4".
VARIABLE LABELS reg1 'four regions (numeric)'.
EXECUTE.
Having executed the commands above, go to the Variable View tab and click on
Type to change ‘reg1’ to a numeric variable. The string variable ‘region’ has now
been recoded into the numeric variable ‘reg1’. Fill in the following table:
Household characteristics                          Region 1  Region 2  Region 3  Region 4  Total
Number of households in the population
Total population
Average distance of a household to paved road
Average distance of a household to nearest bank
% of households with electricity
% of households with toilet
% of population with electricity
% of population with toilet
Average household size
Can you conclude from the results that one region is more affluent than the other?
Describe why.
(d) Household characteristics also vary with the gender of household head. Compute the
means of the variables in the following table.
Household characteristics                                     Male-headed households   Female-headed households
Average household size
Average years of schooling of household head
Average household assets
Average household land holding
Number of households in the population
Sample size
Ratio of sample households to households in the population
Do you think the female-headed households are underrepresented in the sample? [Hint :
Compare the ratio of sample households to households in the population for male- and
female-headed households.] Can you still conclude whether the female-headed
households are more (or less) educated or less (or more) affluent than their counterparts?
Discuss.
1.2: Individual Characteristics
Open c:\intropov\data\ind.sav. This file has information on individual members of
households. Sort this file by ‘hcode’ and then merge it with the household level data
(c:\intropov\data\Hh.sav). Remember that the household level data is also sorted by
‘hcode’. As a result, you will get the new merged data.
Merging files takes more steps in SPSS than in STATA. In the merged file, you
will find many missing values represented by dots (.). This happens because the
household level data has fewer observations than the individual level data. For
instance, the age of the household head will be given only for the first
member of the household, who is the head of the household, and dots will appear for the
other members of the same household. STATA fills these dots automatically with
the value of the first member of the household; SPSS does not. In SPSS, you need
to fill the dots in two steps.
-The first step is to use Split File by ‘hcode’. This will group the data by
household code. Note that Split File On sign will appear on the right corner of
the Data Editor window.
-The second step is to use Replace Missing Values from the Transform menu.
Replace dots with the series mean of the first value of the variable in which you
are interested.
SORT CASES BY hcode .
SPLIT FILE LAYERED BY hcode .
RMV /famsize = SMEAN(famsize) /toilet = SMEAN(toilet) .
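As a cross-check outside SPSS, the intended result of the merge (every member of a household carrying the same household-level values) can be sketched in a few lines of Python. This is only an illustration: the records and the function name merge_household_vars are hypothetical, and only ‘hcode’, ‘famsize’ and ‘toilet’ are names taken from the exercise.

```python
# Hypothetical miniature of the household and individual files, keyed by hcode.
households = {
    101: {"famsize": 3, "toilet": 1},
    102: {"famsize": 2, "toilet": 0},
}
individuals = [
    {"hcode": 101, "age": 45},
    {"hcode": 101, "age": 40},
    {"hcode": 101, "age": 12},
    {"hcode": 102, "age": 60},
    {"hcode": 102, "age": 58},
]


def merge_household_vars(individuals, households):
    """Attach each household's variables to every one of its members."""
    merged = []
    for person in individuals:
        row = dict(person)                       # copy the individual record
        row.update(households[person["hcode"]])  # add the household-level values
        merged.append(row)
    return merged


hsurvey = merge_household_vars(individuals, households)
```

After the merge, no member has a missing household value, which is what the Split File and Replace Missing Values steps are meant to achieve.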
Having gone through these two steps, we will have a merged file without any missing
values or dots. Save this file as c:\intropov\data\hsurvey.sav. Complete the following
questions.
(a) Answer the following questions and discuss how the results vary across regions.
Regional variation                                      Region 1  Region 2  Region 3  Region 4  Total
Average years of education of the population
% of female population
% of working population (with positive working hours)
% of working population (working in farm)
Are the results very different between male and female?
Gender differences                    Male  Female  Total
Average years of schooling
Average age
Average working hours
Average working hours in farm
Average working hours in non-farm
[Note: The individual file also has a weight variable, which is in fact the household
weight, so that the total weight is equal to the total population. A detailed discussion of
‘weight’ is provided in Chapter 2 of the poverty manual.]
1.3: Expenditure
Open the data c:\intropov\data\consume.sav. It contains quantity and expenditure of each
food item at household level. Note that total expenditure (hhexp) is the sum of total food
(expfd) and non-food (expnfd) expenditures for a household. These are household
monthly expenditures. To get per capita expenditure per month, thus, ‘hhexp’, ‘expfd’,
and ‘expnfd’ have to be divided by the size of household. But the household size is not in
c:\intropov\data\consume.sav file but in another file called c:\intropov\data\Hh.sav. Thus,
the two files have to be merged together. Sort the ‘c:\intropov\data\consume.sav’ data by
‘hcode’ and then merge it with c:\intropov\data\Hh.sav.
(i) Compute ‘per capita food expenditure (pcfood)’, ‘per capita non-food
expenditure (pcnfood)’, and ‘per capita total expenditure (pcexp)’.
(ii) Repeat (i) with weight and without weight. How do they differ?
(Note: When weighted, population should be used as a weight, where
population weight = weight given to each household × size of household)*
(iii) Which region has the highest and the lowest per capita food and per capita
total expenditure ?
(iv) Does per capita total expenditure differ between male-headed and female-
headed households?
(v) Do you find any positive correlation between number of years of schooling of
household head and per capita total expenditure? Explain the answer by using
a graph.
(vi) Is per capita total expenditure declining with the size of household? Generate
the square of household size.
COMPUTE size_sq = famsize**2 .
EXECUTE .
* The two estimates can differ widely. The correct procedure is to use the population weight.
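To see why the choice of weight matters, the following minimal Python sketch compares the unweighted, household-weighted and population-weighted means of per capita expenditure. The two households are invented for illustration, and weighted_mean is a hypothetical helper; only the rule population weight = weight × household size comes from the exercise.

```python
# Hypothetical households: sampling weight, household size, monthly total expenditure.
rows = [
    {"weight": 10.0, "famsize": 2, "hhexp": 4000.0},
    {"weight": 10.0, "famsize": 8, "hhexp": 8000.0},
]


def weighted_mean(values, weights):
    """Mean of values, each counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


pcexp = [r["hhexp"] / r["famsize"] for r in rows]    # per capita expenditure
hh_w = [r["weight"] for r in rows]                   # household weight
pop_w = [r["weight"] * r["famsize"] for r in rows]   # population weight

mean_unweighted = sum(pcexp) / len(pcexp)
mean_household = weighted_mean(pcexp, hh_w)
mean_population = weighted_mean(pcexp, pop_w)
```

In this example the larger household is also the poorer one per head, so the population-weighted mean is lower than the household-weighted mean: it counts people rather than households.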
Exercise 2 (Chapter Three):
This part of the exercises focuses on constructing a poverty line. The poverty line
specifies the society’s minimum standard of living to which everyone should be entitled.
Used as a yardstick to identify the poor, the poverty line is thus the baseline for any poverty
analysis. Once the poverty line is determined, one can construct poverty profiles: the
distribution of poverty across sectors, geographical regions, and socioeconomic groups,
and a comparison of key characteristics of the poor with those of the non-poor.
This exercise discusses three methods that have been used to derive the poverty line in
Bangladesh. These are namely direct calorie intake, food energy intake, and cost of basic
needs. These three methods will be exercised in turn. (Note: A food basket considered for
the healthy survival of a typical family in rural Bangladesh is the same as the one used in
STATA exercise)
2.1: Direct Calorie Intake
The file ‘c:\intropov\data\consume.sav’ provides information on quantities of 10 food
items consumed by the households included in the data. Note that ‘potato’ and ‘other
vegetables’ are lumped together into one item called ‘veg’. These quantities of 10 food
items can be converted into calories based on food calorie conversion factors that are
provided in the table below.
Note also that the quantities in the data are expressed in kg per week and thus have to be
converted into grams per day. Based on the calories of the food basket and the quantities of food
consumed by each household in the survey, we can obtain the household’s calorie intake.
In order to get per capita calorie intake, we need to merge c:\intropov\data\consume.sav
with c:\intropov\data\Hh.sav which has the size of household. Generate a variable called
‘pccal’ indicating ‘per capita calorie intake’.
Per capita normative daily requirements and average rural consumer prices
Food item          Calorie    Quantity (gm)    Price (taka/kg)
Rice               1386       397              15.19
Wheat              139        40               12.81
Pulse              153        40               30.84
Milk (cow)         39         58               15.9
Oil (mustard)      180        20               58.24
Meat (beef)        14         12               66.39
Fish               51         48               46.02
Potato             26         27               8.18
Other vegetable    26         150              38.3
Sugar              82         20               30.49
Fruits             6          20               28.86
Total              2112
Classify an individual as poor if his or her per capita calorie intake is less than the
nutritional requirement of 2112 calories per day. Create a new
variable called ‘z_dci’ equal to 100 if the household is poor and zero otherwise. Save the
file as ‘c:\intropov\data\pline.sav’.
IF (pccal < 2112 ) z_dci = 100 .
IF (pccal >= 2112) z_dci = 0 .
EXECUTE.
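The unit conversion is the step most easily got wrong, so a small Python sketch of it may help. It rests on assumptions that are not stated in the exercise: the calories-per-gram factor is derived here as calorie divided by normative quantity from the table above, only two food items are shown, and the household and the function names are hypothetical.

```python
# Calories per gram implied by the table above (calorie / normative grams).
# Only two items are shown; deriving the factor this way is an assumption.
KCAL_PER_GRAM = {"rice": 1386 / 397, "wheat": 139 / 40}

REQUIREMENT = 2112  # calories per person per day


def per_capita_calories(kg_per_week, famsize):
    """Household kg/week for each item -> calories per person per day."""
    total = 0.0
    for item, kg in kg_per_week.items():
        grams_per_day = kg * 1000 / 7          # kg per week -> grams per day
        total += grams_per_day * KCAL_PER_GRAM[item]
    return total / famsize


def z_dci(pccal):
    """100 if calorie intake is below the requirement, else 0."""
    return 100 if pccal < REQUIREMENT else 0


# Hypothetical household of four consuming 10 kg of rice and 1 kg of wheat a week.
pccal = per_capita_calories({"rice": 10.0, "wheat": 1.0}, famsize=4)
```

For this household, ‘pccal’ comes to roughly 1371 calories per person per day, so it would be classified as poor under the direct calorie intake method.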
Discuss the percentage of poor by regions. Which region is the poorest?
2.2: Food-Energy Intake Method
The food energy intake (FEI) method is simple. Since separate poverty lines are
estimated for each region, it takes into account the differences in regional costs of living
and food preference. A classic method of FEI has been proposed by Greer and Thorbecke
(1986). They provide a method that computes the food poverty line at which an
individual’s food energy intake is just sufficient to satisfy his or her calorie requirement
per day. Their proposed cost-of-calorie function is
Ln(E) = a + b C + u
where E is the per capita total expenditure, C is the number of calories obtained from the
food basket, and u is the error term. Once the equation is estimated, we are able to
construct a poverty line for each region. Since the calorie requirement is the same for all
regions at 2112, the poverty line is estimated separately for each region as
pline = exp( a + b × 2112)
where exp stands for the exponential function and a and b are the estimated coefficients of the log
equation above.
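The two formulas above can be checked outside SPSS with a short Python sketch. It fits the weighted least squares line by the standard closed-form expressions and then evaluates the poverty line at 2112 calories. The data are hypothetical and constructed to lie exactly on a known line, and the function names wls_line and fei_poverty_line are illustrative only.

```python
import math


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


def fei_poverty_line(pccal, pcexp, pop, requirement=2112):
    """Regress ln(pcexp) on pccal with weights pop, then evaluate at the requirement."""
    a, b = wls_line(pccal, [math.log(e) for e in pcexp], pop)
    return math.exp(a + b * requirement)


# Hypothetical data lying exactly on ln(E) = 6 + 0.0005*C.
pccal = [1600.0, 2000.0, 2400.0, 2800.0]
pcexp = [math.exp(6 + 0.0005 * c) for c in pccal]
pop = [5.0, 3.0, 4.0, 6.0]
pline = fei_poverty_line(pccal, pcexp, pop)
```

Because the hypothetical data lie exactly on the line, the fit recovers a = 6 and b = 0.0005, and ‘pline’ equals exp(6 + 0.0005 × 2112). With survey data the same steps would be run separately for each region.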
We now apply this methodology to the data. Open c:\intropov\data\pline.sav.
(i) Generate the logarithm of per capita total expenditure.
COMPUTE lpcexp = LN(pcexp) .
EXECUTE.
(ii) Regress the log of per capita total expenditure (‘lpcexp’) against per capita calorie
intake (‘pccal’). Use the weighted least squares method, where the weight is the
population.
REGRESSION
/REGWGT = pop
/STATISTICS COEFF OUTS R ANOVA
/DEPENDENT lpcexp
/METHOD=ENTER pccal .
(iii) What are the estimates of the slope and the constant term?
(iv) Create a variable called ‘feipline’, which is equal to the exponential of
(estimated constant + estimated slope multiplied by 2112). In the command below,
replace a and b with the estimated constant and slope.
COMPUTE feipline = EXP(a + b*2112) .
EXECUTE .
Besides this method, there is a simpler way of calculating a poverty line under the FEI
method. The steps are as follows:
(i) Obtain the weighted mean of per capita total expenditure (with weight = pop)
within the range where per capita calorie intake lies between its lower bound
(=2112*0.9) and its upper bound (=2112*1.1).
(ii) Name this weighted average of per capita total expenditure ‘feipline’.
(iii) Create a variable ‘feipoor’ equal to 100 if ‘feipline’ is greater than per capita total
expenditure and zero otherwise.
COMPUTE feipoor = 0 .
EXECUTE .
IF (feipline > pcexp ) feipoor = 100 .
EXECUTE .
(iv) Compute the percentage of poor by regions. Which region is the poorest by
the FEI method?
2.3: Cost of Basic Needs
Rowntree’s (1901) approach to specifying poverty lines, based on the concept of ‘physical
efficiency’, measures poverty in terms of a lack of command over the basic consumption needs
essential for maintaining physical efficiency. This approach is known as the cost of basic
needs (CBN) method of constructing poverty lines. The method determines the
food and non-food costs of basic consumption baskets and then adds the two costs
to obtain the poverty line. We provide exercises on the food poverty line and the non-food poverty
line separately.
A: Food Poverty Line
First of all, we choose the basket of a reference group. Open C:\intropov\data\Hh.sav and
merge it with c:\intropov\data\consume.sav after sorting the files by ‘hcode’. Call this
merged file ‘c:\intropov\data\exp.sav’. Create per capita food, non-food, and total
expenditure. Generate the cumulative proportion of population (‘cpop’), whose last value
must be one (how to create ‘cpop’ is explained in Exercise 5). Create a reference
group, the bottom 20 percent in the distribution of per capita total expenditure. Type the
following command in the Syntax Editor:
COMPUTE ref = 0 .
IF (cpop<=0.2) ref =1 .
EXECUTE .
Having defined the reference group, merge c:\intropov\data\hsurvey.sav with
c:\intropov\data\vprice.sav. In this case, sort by the variables ‘thana’ and ‘vill’ in
ascending order before merging the two data sets. The file c:\intropov\data\vprice.sav
contains village level price information on all 11 food items. Under the cost of
basic needs method, it is assumed that all individuals belonging to the bottom 20 percent
nationally enjoy the same standard of living but have different consumption patterns.
Given the total expenditure on the basket and the average price of each food item in
the basket for the reference group, compute the quantity of each food item in the basket.
Convert the quantities into calories by using the calorie conversion factors. Make sure that the units
are converted correctly. Compute the cost per calorie by dividing the average
per capita expenditure on the food basket by the average per capita total calories of the
basket for the reference group. Create a variable called ‘costcal’ for the cost per
calorie. Calculate the food poverty line by multiplying ‘costcal’ by 2112. Note that
there is only one food poverty line in this case.
(i) What is the cost per calorie for the reference group?
(ii) Create 5 different quintile groups. Compare the cost per calorie for each of
these quintiles.
(iii) What is the monthly food poverty line? Label this food poverty line
‘fline’. Save this file as ‘c:\intropov\data\pline.sav’.
B: Non-food Poverty Line
Parametric methods of setting non-food poverty lines can be readily estimated using a
food-share Engel curve of the regression form, which is illustrated in ‘Poverty Lines in
Theory and Practice’ by M. Ravallion (1999). In this exercise, we practice non-
parametric ways of defining non-food poverty lines which do not impose a functional
form on the Engel curve. We will illustrate constructing both the upper and the lower
poverty line.
(i) Open file c:\intropov\data\pline.sav.
(ii) Arrange per capita food expenditure in ascending order. This is an important
step to follow. Otherwise, you will get an incorrect result.
SORT CASES BY pcfd .
(iii) Compute the weighted average of per capita non-food expenditure for those
households whose per capita food expenditure lies within plus or minus 10
percent around the food poverty line.
COMPUTE filter_$ = (pcfd>=fline*0.9 & pcfd<=fline*1.1) .
VARIABLE LABEL filter_$ 'pcfd>=fline*0.9 & pcfd<=fline*1.1 (FILTER)' .
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected' .
FORMAT filter_$ (f1.0) .
FILTER BY filter_$ .
EXECUTE .
WEIGHT BY pop .
DESCRIPTIVES VARIABLES = pcnfd /STATISTICS=MEAN .
(iv) Name this mean of per capita non-food expenditure (which is weighted by the
population weight) ‘nfline1’.
(v) Compute the upper poverty line (‘upline’) by summing the food poverty line
and ‘nfline1’.
We can apply the same approach to setting the lower poverty line described above, with
the difference that we compute the non-food expenditure of households in the
neighborhood of the point where per capita total expenditure is equal to the food poverty
line. Answer the following questions. (Note: In this case, per capita total expenditure has
to be sorted in its ascending order)
(i) What is the non-food poverty line obtained from this method?
(ii) Compute the lower poverty line (‘cbnpline’).
(iii) Compare the upper poverty line with the lower poverty line. Which one would
you use? Why?
(iv) Calculate the incidence of poverty using the upper poverty line and the lower
poverty line. How are they different? Discuss.
We have constructed poverty lines based on the three different methods described above.
Discuss how the percentage of people living below the poverty line differs
across the three methodologies. Also discuss which method you would adopt in setting an
official poverty line for your own country.
Exercise 3 (Chapter Four):
3.1: Getting Started
Open the data file c:\intropov\data\example.sav. The file contains the individual
consumption information of three countries. The figures are all monthly consumption. All
three countries have 10 citizens.
(i) Compare the means of consumption for three countries.
(ii) Suppose that a poverty line is set at 126 per month. Given this poverty line,
compute the following poverty estimates for each country.
a : the head-count index
b: the poverty gap index
c: the squared poverty gap index (or the severity of poverty index)
(iii) Repeat (ii) when the poverty line is 130. Which country has the highest
poverty? Why?
(iv) Why would you use the poverty gap index and its squared poverty gap index
rather than the head-count index even though the latter is extremely simple
and widely used?
3.2: Poverty Measures
Now we work with the data file c:\intropov\data\pline.sav. Make sure that you have
variables including per capita total expenditure (‘pcexp’) and poverty lines (‘fline’ and
‘cbnpline’) constructed by the cost of basic needs.
(i) Compute four poverty measures (the head-count ratio, the poverty gap
index, the squared poverty gap index or severity of poverty index, and the Watts
measure) for per capita total expenditure, using both the food poverty line and
the non-food poverty line derived from the cost of basic needs method.
The following program calculates the four poverty measures for the whole population.
IF (pcexp < pline) hcount = 100 .
IF (pcexp >= pline) hcount = 0 .
EXECUTE .
COMPUTE gap = hcount*(pline-pcexp)/pline .
COMPUTE severity = hcount*((pline-pcexp)/pline)**2 .
COMPUTE Watts = hcount*(ln(pline)-ln(pcexp)) .
EXECUTE .
WEIGHT BY pop .
DESCRIPTIVES VARIABLES=hcount gap severity Watts /STATISTICS=MEAN .
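The same four measures can be verified by hand on a small example. The Python sketch below mirrors the program above (poor households contribute 100 times their shortfall terms, and the results are population-weighted means); the four households and the function name poverty_measures are hypothetical.

```python
import math


def poverty_measures(pcexp, pop, pline):
    """Population-weighted headcount, gap, severity (in percent) and Watts measure."""
    total = sum(pop)
    hcount = gap = severity = watts = 0.0
    for x, w in zip(pcexp, pop):
        if x < pline:                       # poor: expenditure below the line
            shortfall = (pline - x) / pline
            hcount += w * 100
            gap += w * 100 * shortfall
            severity += w * 100 * shortfall ** 2
            watts += w * 100 * (math.log(pline) - math.log(x))
    return hcount / total, gap / total, severity / total, watts / total


# Hypothetical sample: four households with equal population weights.
h, g, s, w = poverty_measures([50.0, 75.0, 150.0, 200.0], [1, 1, 1, 1], pline=100.0)
```

With a poverty line of 100, two of the four households are poor, so the head-count ratio is 50, the poverty gap index is 18.75 and the squared poverty gap index is 7.8125.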
(ii) Estimate the incidence of poverty, the poverty gap index (PGR), and the
severity of poverty index (FGT) for specific subgroups using the food poverty
line and the total poverty line.
Subgroup                                            Headcount index   PGR   FGT
(a) 4 Regions
(b) Male-headed households
(c) Female-headed households
(d) Households with more than 5 members
(e) Households with 5 or fewer members
(iii) Poverty calculations are based on a sample of households rather than the
population. Thus, we must compute the standard error of each poverty measure.
When poverty measures have large standard errors, small changes in poverty
may be statistically insignificant and should be interpreted carefully. To
compute corrected standard errors, we suggest two methods.
One way to correct the standard errors is simply to divide the standard deviation
of a poverty measure by the square root of the sample size (n = 496 in our
example). Go to Analyze and choose Descriptive Statistics. Alternatively, type
the following command in the Syntax Editor:
WEIGHT BY pop .
DESCRIPTIVES VARIABLES = HCOUNT PGR FGT / STATISTICS = MEAN STDDEV .
Having obtained the standard deviations of the poverty measures, simply divide them by
the square root of 496 to get their corrected standard errors.
The other method is to adjust the population weight to take the
sample size into account. Type the following command in the Syntax Editor:
COMPUTE pop1 = n*pop/spop .
EXECUTE .
WEIGHT BY pop1 .
DESCRIPTIVES VARIABLES = HCOUNT PGR FGT / STATISTICS = MEAN SEMEAN .
where n, spop, and SEMEAN are the size of sample, the sum of population, and the
standard error of the mean, respectively. In our example, n is equal to 496 and spop is
equal to 13280. Having adjusted the population weight taking into account the size of
sample, simply compute poverty measures and their standard errors, which are
weighted by the adjusted population. Having computed these, fill in the following
table.
                      Headcount ratio   Poverty gap ratio   FGT ratio
Region 1
(Standard errors)
Region 2
(Standard errors)
Region 3
(Standard errors)
Region 4
(Standard errors)
Exercise 4 (Chapter Five)
4.1: Stochastic Dominance
There is no general consensus on the poverty line. Thus, it might be appropriate to measure
poverty using all possible poverty lines in a given range. Note also that the choice of poverty
measure can affect the apparent direction of changes in poverty. Hence, it
is useful to find conditions under which all members of a class of poverty
measures give the same ranking. These issues are dealt with using the idea of stochastic
dominance.
The first-order stochastic dominance test compares the percentage of poor across
regions at every poverty line, which amounts to comparing the cumulative distribution
functions of expenditure for the regions. A simple way
of testing first-order dominance for the four regions is to plot the percentage of
poor on the vertical axis and the poverty lines on the horizontal axis.
Percentage of poor
Poverty line   Region 1   Region 2   Region 3   Region 4
3000           7.08       0.62       1.80       2.31
4000           12.04      9.58       11.78      13.93
5000           27.38      27.27      42.89      32.36
6000           45.25      49.84      58.44      48.52
7000           57.34      61.40      74.04      65.90
Poverty gap ratio
Poverty line   Region 1   Region 2   Region 3   Region 4
3000           0.40       2.39       5.90       11.17
4000           0.09       1.38       4.88       10.42
5000           0.29       1.77       6.68       13.77
6000           0.38       2.08       6.34       12.14
7000           0.31       1.96       6.02       11.95
We have calculated the head-count ratio and the poverty gap ratio for all four regions.
These poverty measures are estimated for various poverty lines as shown in the table
above. The first order dominance curve is the relationship between poverty line (x-axis)
and the corresponding head-count ratio (y-axis).
PLOT FORMAT = OVERLAY
/PLOT = hc1 hc2 hc3 hc4 with pline .
After formatting the graph by interpolation, the graph will look like this:
(i) Does one distribution dominate another?
(ii) Does any of the lines cross another line?
(iii) Can you conclude from the graph that one region has a higher incidence of
poverty than another region? Is it true for other poverty measures?
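Reading crossings off a graph can be supplemented by a direct comparison of the tabulated values. The Python sketch below uses the head-count ratios from the table above and checks, pair by pair, whether one curve lies on or below the other at every tabulated poverty line. It looks only at these five poverty lines, so it cannot rule out crossings between them, and the function name compare_curves is hypothetical.

```python
# Head-count ratios from the table above, at poverty lines 3000 to 7000.
headcount = {
    "Region 1": [7.08, 12.04, 27.38, 45.25, 57.34],
    "Region 2": [0.62, 9.58, 27.27, 49.84, 61.40],
    "Region 3": [1.80, 11.78, 42.89, 58.44, 74.04],
    "Region 4": [2.31, 13.93, 32.36, 48.52, 65.90],
}


def compare_curves(a, b):
    """Return 'a lower', 'b lower', or 'cross' for two curves on the same grid."""
    if all(x <= y for x, y in zip(a, b)):
        return "a lower"
    if all(x >= y for x, y in zip(a, b)):
        return "b lower"
    return "cross"


r2_vs_r3 = compare_curves(headcount["Region 2"], headcount["Region 3"])
r1_vs_r2 = compare_curves(headcount["Region 1"], headcount["Region 2"])
```

On these values, the Region 2 curve lies below the Region 3 curve at every tabulated line, whereas the Region 1 and Region 2 curves cross between 5000 and 6000.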
If the curves do not intersect at all, we do not need to test second- or third-order
dominance, because first-order dominance implies higher poverty on the basis of all
poverty measures, including the head-count ratio. Otherwise, we move on to testing
second-order stochastic dominance, which is the relationship between the poverty line (x-axis)
and the corresponding poverty gap ratio (y-axis). This curve is also called the ‘poverty
deficit curve’. If the second-order dominance condition is satisfied (that is, when the curves do
not intersect), we can say unambiguously that poverty measured by the entire class of Foster,
Greer and Thorbecke poverty measures, with the exception of the head-count ratio, will be
higher in one region than in another. Given the table above, simply plot the
poverty gap ratio (y-axis) against the poverty lines (x-axis) for each of the regions.
Repeat questions (i), (ii) and (iii).
If the poverty deficit curves also intersect, then we move on to third-order stochastic
dominance, which is the relationship between the poverty line (x-axis) and the severity of
poverty (the squared poverty gap index).
Exercise 5 (Chapter Six)
5.1: Lorenz curve
The Lorenz curve is a simple device that has been used widely to describe and analyze
data on income distribution. This curve has become important in recent times because it
provides a useful method of ranking income distribution from the welfare point of view.
The Lorenz curve is defined as the relationship between the proportion of people with
income less than or equal to a specified amount, and the proportion of total income
received by those people.
More generally, the Lorenz curve is represented by a function L(p), which is interpreted
as the fraction of total income received by the bottom pth fraction of people, when the
people are arranged in ascending order of their incomes. The curve is drawn in a unit
square. Thus, if p=0, L(p)=0 and if p=1, L(p)=1. The slope of the curve is positive and
increases monotonically: the curve is convex to the p axis. From this, it follows that L(p) ≤
p. The straight line represented by the equation L(p)=p is called the egalitarian line.
In constructing the Lorenz curve, we need to compute the cumulative proportions of per
capita total expenditure and of population. The following commands are used, given
that you have computed the mean of per capita expenditure (‘mpcexp’) and the sum of
population (‘spop’).
COMPUTE cpcexp = pop*pcexp/mpcexp/spop .
COMPUTE cpop = pop/spop .
EXECUTE .
SORT CASES BY pcexp (A) .
CREATE /cpcexp=CSUM(cpcexp) /cpop=CSUM(cpop) .
Note that ‘cpcexp’, ‘cpop’ and ‘CSUM’ are the cumulative proportion of per capita
expenditure, the cumulative proportion of population and the cumulative sum,
respectively.
Check point: Are the last values of ‘cpcexp’ and ‘cpop’ equal to 1 ? If not, there is a
problem.
After having created ‘cpcexp’ and ‘cpop’, we need the following commands:
COMPUTE p = cpop-pop/spop/2 .
COMPUTE q = cpcexp-pcexp*pop/spop/mpcexp/2 .
Note that we have just made a continuity correction.
(i) Go to the ‘Graphs’ menu and select ‘Interactive’ and ‘Line’. Graph q on the
vertical axis and p on the horizontal axis. Does the Lorenz curve have a
positive slope? Is the curve convex to the p axis? Can you say that its slope is
increasing monotonically?
(ii) Construct the Lorenz curve for Dhaka and Chittagong. Does one curve lie
above the other? Which one is closer to the egalitarian line? Can you
conclude that the distribution of expenditure in the Dhaka region is more
equal than in the Chittagong region? Discuss.
(iii) Do these two Lorenz curves intersect each other? If the two curves intersect,
we cannot say that one region is more equal than the other. In this respect, the
Lorenz curve provides only the partial ranking of distributions.
5.2: Inequality Measures
We exercise four inequality measures in this section – including the Gini index,
generalized Gini index, Atkinson measure, and Theil’s index.
(i) Gini Index
Of all the inequality measures, the Gini index is used most widely. It became popular
because of its direct relationship with the Lorenz curve. The Gini index measures the
extent to which the Lorenz curve departs from the egalitarian line. It is defined as twice
the area between the Lorenz curve and the egalitarian line. This definition ensures that
the value of the Gini index lies between zero (for complete equality) and one (for
complete or most extreme inequality).
Having created the cumulative proportion of per capita total expenditure and population,
the following commands are to generate the Gini index and quintile shares.
COMPUTE gini = 100*(1-2*q) .
EXECUTE .
IF (p<=.20) quint = 1 .
IF (p>.20 and p<=.40) quint = 2 .
IF (p>.40 and p<=.60) quint = 3 .
IF (p>.60 and p<=.80) quint = 4 .
IF (p>.80) quint = 5 .
EXECUTE .
COMPUTE share = 100*pcexp*pop/(spop*mpcexp) .
EXECUTE .
Compute the Gini index for the four regions. Which region is the most unequal among
the four regions?
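The commands above produce one value of 100*(1-2*q) per case; the index itself is the population-weighted mean of that variable, which approximates one minus twice the area under the Lorenz curve. A minimal Python sketch of the same calculation, returning the index on a 0 to 1 scale, is given below; the function name gini and the test data are hypothetical.

```python
def gini(pcexp, pop):
    """Gini index (0 to 1) from weighted data, using midpoint Lorenz ordinates."""
    data = sorted(zip(pcexp, pop))                    # ascending per capita expenditure
    spop = sum(w for _, w in data)
    total = sum(x * w for x, w in data)               # equals spop * mpcexp
    cum_share = 0.0
    area = 0.0
    for x, w in data:
        share = x * w / total                         # this case's expenditure share
        q = cum_share + share / 2                     # continuity-corrected L(p)
        area += (w / spop) * q                        # weighted mean of q
        cum_share += share
    return 1 - 2 * area


g_unequal = gini([0.0, 1.0], [1, 1])
g_equal = gini([5.0, 5.0, 5.0], [1, 2, 3])
```

Two people with expenditures 0 and 1 give an index of 0.5, and identical expenditures give 0, as expected for complete equality.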
(ii) Atkinson’s Measure
The inequality measure proposed by Atkinson is
\[
A = 1 - \frac{x^{*}}{\mu}
\]
which is in fact a measure of loss of welfare caused as a consequence of inequality in the
society. x* is called ‘equally distributed equivalent level of income’ which is the level of
per capita income that if received by everyone, would make the total welfare exactly
equal to the total welfare generated by the actual income distribution.
With homothetic utility function, Atkinson’s index is equal to
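\[
A(\varepsilon) = 1 - \frac{1}{\mu}\left[\sum_{i=1}^{n} f_i\, x_i^{\,1-\varepsilon}\right]^{\frac{1}{1-\varepsilon}}, \qquad \varepsilon \neq 1
\]
\[
A(\varepsilon) = 1 - \frac{1}{\mu}\exp\left(\sum_{i=1}^{n} f_i \log_e(x_i)\right), \qquad \varepsilon = 1
\]
where f_i is the relative frequency of income x_i, µ is the mean income, and ε is a
measure of the degree of inequality aversion.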
The following program can be used to compute Atkinson’s measures for pcexp when ε
is 1, 1.5, and 2.
Compute lpcexp=Ln(pcexp).
Compute pcexp1=(pcexp)**(-0.5).
Compute pcexp2= (pcexp)**(-1).
Execute.
Calculate the weighted mean (weight = pop) of lpcexp, pcexp1 and pcexp2 using the
descriptive command:
WEIGHT BY pop .
DESCRIPTIVES VARIABLES = lpcexp pcexp1 pcexp2 / STATISTICS = MEAN .
The following calculations give Atkinson’s measure for each value of the
inequality-aversion parameter, where mpcexp is the weighted mean of pcexp:
A(ε = 1) = 1 - exp(mean lpcexp)/mpcexp.
A(ε = 1.5) = 1 - (mean pcexp1)**(-2)/mpcexp.
A(ε = 2) = 1 - (mean pcexp2)**(-1)/mpcexp.
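As a check on the three calculations above, the following Python sketch computes Atkinson’s measure directly from a list of incomes and their relative frequencies. The two-person example and the function name atkinson are hypothetical; the formula is the one given above for a homothetic utility function.

```python
import math


def atkinson(x, f, eps):
    """Atkinson index for incomes x with relative frequencies f (summing to 1)."""
    mu = sum(fi * xi for fi, xi in zip(f, x))
    if eps == 1:
        ede = math.exp(sum(fi * math.log(xi) for fi, xi in zip(f, x)))
    else:
        ede = sum(fi * xi ** (1 - eps) for fi, xi in zip(f, x)) ** (1 / (1 - eps))
    return 1 - ede / mu   # ede is the equally distributed equivalent income x*


# Hypothetical two-person example with incomes 1 and 4.
a1 = atkinson([1.0, 4.0], [0.5, 0.5], eps=1)
a15 = atkinson([1.0, 4.0], [0.5, 0.5], eps=1.5)
a2 = atkinson([1.0, 4.0], [0.5, 0.5], eps=2)
```

For incomes 1 and 4 the mean is 2.5; with ε = 1 the equally distributed equivalent income is the geometric mean, 2, so the index is 0.2, and with ε = 2 it is the harmonic mean, 1.6, so the index is 0.36.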
Repeat the following example to compute the Atkinson’s inequality measure. Fill in the
following gaps.
Household       Per capita     Relative     log(exp)   (1) × (3)   (exp)^(-0.5)   (exp)^(-1)
                expenditure    frequency    (3)
                (exp) (1)      (feq) (2)
1               1              0.03
2               1000           0.03
3               2000           0.03
4               3000           0.07
5               4000           0.17         8.29       33176       0.016          0.000
6               5000           0.2
7               6000           0.2
8               10000          0.17
9               12000          0.07         9.39       112712      0.009          0.000
10              14000          0.03
Weighted Mean   6100                        8.36       54498       0.044          0.030
Atkinson index (ε = 1):
Atkinson index (ε = 1.5): 0.91
Atkinson index (ε = 2):
Is the Atkinson’s inequality measure increasing as the inequality aversion parameter
increases?
(iv) Theil’s Index
Theil (1967) proposed two inequality measures that are based on the notion of entropy in
information theory. The two entropy measures are defined as
\[
T_0 = \log(\mu) - \int_0^{\infty} \log(x)\, f(x)\, dx
\]
\[
T_1 = \frac{1}{\mu}\int_0^{\infty} x \log(x)\, f(x)\, dx - \log(\mu)
\]
where µ is the mean income and f(x) is the density function.
Compute the two entropy measures from the table presented above. Is your result for T0
equal to 0.36? Is your result for T1 equal to 0.22?
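For discrete data, the integrals above become frequency-weighted sums: the mean of log(x) and the mean of x log(x), which are the ‘log(exp)’ and ‘(1) × (3)’ columns of the table. A minimal Python sketch follows; the two-person example and the function name theil are hypothetical.

```python
import math


def theil(x, f):
    """Return (T0, T1) for incomes x with relative frequencies f (summing to 1)."""
    mu = sum(fi * xi for fi, xi in zip(f, x))
    mean_log = sum(fi * math.log(xi) for fi, xi in zip(f, x))
    mean_xlog = sum(fi * xi * math.log(xi) for fi, xi in zip(f, x))
    t0 = math.log(mu) - mean_log
    t1 = mean_xlog / mu - math.log(mu)
    return t0, t1


# Hypothetical two-person example with incomes 1 and 4.
t0, t1 = theil([1.0, 4.0], [0.5, 0.5])
```

Both measures are zero when everyone has the same income and positive otherwise; for incomes 1 and 4, T0 equals log(1.25).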
Exercise 6 (Chapter Seven)
Poverty profiles describe the nature and extent of poverty. They provide a breakdown of
aggregate poverty according to various socioeconomic and demographic characteristics
of households. They show how poverty varies across subgroups of society, defined for example by region,
household size, or age. Poverty profiles can also show the impact of sectoral and
regional patterns of economic change on aggregate poverty.
6.1: Characteristics of the poor
Open the file c:\intropov\data\pline.sav. Note that people whose per capita expenditure is
less than the per capita monthly poverty line defined by the cost of basic needs are classified
as ‘poor’, and as ‘non-poor’ otherwise. Answer the following questions.
Poor Non-poor
Average distance of a household to paved road
Average distance of a household to nearest bank
% households with electricity
% households with sanitary toilets
Average household assets
Average household land holding
Average household size
% households headed by female
% households headed by male
Average years of schooling of head
Average age of household head
Average total working hours in farming
Average total working hours in non-farming
Calculate the head-count ratio, the poverty gap ratio, and the severity of poverty by all
household characteristics shown above. Construct graphs for each of these subgroups.
Discuss a poverty profile of rural Bangladesh based on your findings.
Exercise 7 (Chapter Eight)
Suppose that we want to explain per capita total expenditure in terms of socioeconomic
and demographic household characteristics in the data. We estimate a regression model
with the logarithm of per capita expenditure as the dependent variable. The explanatory
variables can include;
- gender of household head
- age of household head
- age-square of household head
- size of household
- size-square of household
- education and employment status of household head
- access to basic infrastructure such as distance to a paved road or to bank
- asset positions such as land holding
- region or urban/rural
- and other variables
We generate variables that do not exist in the data.
COMPUTE lpcexp = ln(pcexp) .
COMPUTE sq_age = age**2 .
COMPUTE sq_size = size**2 .
EXECUTE .
There are categorical variables in the regression model, such as gender and region. These
have to be converted into dummy variables. For instance, create a new variable equal to 1
if the head of household is male and 0 otherwise. Note that only one of the two gender
dummies should be included in the regression model.
IF (gender = 1) male = 1 .
IF (gender = 2) male = 0 .
EXECUTE .
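The dummy coding used here, and for the regional dummies described next, follows one pattern: one 0/1 variable per category, with one category left out of the model as the reference. A minimal Python sketch of that pattern is given below. The helper make_dummies is hypothetical, the data are made up, the gender coding (1 = male, 2 = female) is assumed from the commands above, and the choice of Rajshahi as the omitted region is arbitrary.

```python
def make_dummies(values, categories):
    """Return a dict mapping each category to a 0/1 list over the observations."""
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Made-up observations; assumed coding: 1 = male, 2 = female.
gender = [1, 2, 1, 1]
male = make_dummies(gender, [1])[1]

region = ["Dhaka", "Khulna", "Rajshahi", "Dhaka"]
region_dummies = make_dummies(region, ["Dhaka", "Chittagong", "Khulna", "Rajshahi"])

# Keep three of the four regional dummies; the omitted one is the reference category.
included = {name: col for name, col in region_dummies.items() if name != "Rajshahi"}

print(male)
print(included)
```

Each coefficient on an included dummy is then read as the difference from the omitted category.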
Similarly, create regional dummy variables (‘reg1’, ‘reg2’, and so on, one per region).
There will be four dummies, yet only three of them should be included in the model. To run
a regression model, open the Syntax Editor, write the following command, and then click
on the Run button to execute the analysis.
REGRESSION
 /DESCRIPTIVES
 /REGWGT=pop
 /STATISTICS COEFF OUTS R ANOVA
 /DEPENDENT lpcexp
 /METHOD=ENTER male age sq_age size sq_size edu road land reg1 reg2 reg3 (include more variables).
The Regression command is used to produce both simple and multiple regression equations and associated
statistics.
The /DESCRIPTIVES subcommand tells SPSS to produce descriptive statistics for all the variables
included in the analysis. These statistics include means, standard deviations, a correlation matrix, and so
on.
The /REGWGT subcommand indicates that the regression model is weighted (by population in our
example).
The /STATISTICS subcommand produces statistical results of the model – including R-square, adjusted
R-square, sum of squares, degrees of freedom, estimated coefficients, t and F statistics, etc.
The /DEPENDENT subcommand is used to identify the dependent variable in the regression model. In this
example, our dependent variable is the log of per capita expenditure.
The /METHOD subcommand must immediately follow the /DEPENDENT subcommand. This
subcommand is used to tell SPSS the way you want your independent variables to be added to the
regression equation. ENTER is the most direct method used to build a regression equation; it tells SPSS
simply to enter all the independent variables that you indicate for inclusion in the regression equation.
Since each dummy variable takes only the values 0 and 1, a model that includes the full
set of dummies for a categorical variable (for example, both male and female) cannot be
estimated by the ordinary least squares (OLS) method. The dummies in a full set sum to
one for every household, so there is perfect multicollinearity between the dummy
variables and the constant term in the regression model. To overcome this problem, a
constraint on the coefficients has to be imposed; the simplest is to drop one dummy from
each set, as in the command above. Another problem you have to deal with is
heteroskedasticity. Since each sampled household has a different population weight
attached due to the sampling design used, the OLS method will give inefficient
coefficient estimates because of heteroskedasticity. The coefficients in the model,
however, can be estimated efficiently using the weighted least squares (WLS) method,
where population is used as the weight. Thus, the model is estimated based on the
restricted weighted least squares method.
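Both points in this paragraph can be seen on a small scale. The Python sketch below solves the weighted normal equations (X'WX)b = X'Wy directly and shows that the system has no unique solution once a constant and a full set of dummies are entered together. It is an illustration under stated assumptions, not a description of how SPSS computes its estimates: the function weighted_least_squares is hypothetical, the singularity check uses a simple absolute tolerance, and the four households are made-up data.

```python
def weighted_least_squares(rows, y, weights):
    """Solve the weighted normal equations (X'WX) b = X'Wy by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'WX | X'Wy].
    a = [[0.0] * (k + 1) for _ in range(k)]
    for x, yi, w in zip(rows, y, weights):
        for i in range(k):
            for j in range(k):
                a[i][j] += w * x[i] * x[j]
            a[i][k] += w * x[i] * yi
    for col in range(k):
        # Partial pivoting: pick the largest remaining entry in this column.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-9:
            raise ValueError("perfect multicollinearity: X'WX is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, k + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up data: four households, log expenditure and population weights.
lpcexp = [1.0, 2.0, 1.5, 2.5]
pop = [2, 1, 3, 4]
male = [1, 0, 1, 0]
female = [1 - m for m in male]

# Constant plus one dummy: the coefficients are identified.
x_ok = [[1, m] for m in male]
coefficients = weighted_least_squares(x_ok, lpcexp, pop)
print(coefficients)

# Constant plus both dummies: male + female reproduces the constant column.
x_trap = [[1, m, f] for m, f in zip(male, female)]
try:
    weighted_least_squares(x_trap, lpcexp, pop)
    trap_detected = False
except ValueError:
    trap_detected = True
print(trap_detected)
```

With a constant and a single dummy, the intercept is the weighted mean of the omitted group and the dummy coefficient is the difference between the two weighted group means.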
(i) What is the R-square of this model?
(ii) Do the signs of the coefficients match your hypotheses?
(iii) Are the coefficients significant at the 5% significance level?
It is a good idea to visually examine the scatter plot of the two variables when
interpreting a regression analysis. To do so, you may need to type:
PLOT / FORMAT REGRESSION / PLOT lpcexp WITH (explanatory variables) .
In addition, the scatter plot of the residuals against the fitted values will help you to see
visually whether the model is a good fit. To carry out this task, we need additional
subcommands in the regression.
REGRESSION
 /DESCRIPTIVES
 /STATISTICS COEFF OUTS R ANOVA
 /DEPENDENT lpcexp
 /METHOD=ENTER male age sq_age size sq_size edu road land reg1 reg2 reg3 (include more variables)
 /SCATTERPLOT = (*ZRESID, *ZPRED)
 /SAVE ZRESID(ZRESID) ZPRED(ZPRED) .
After having saved ZPRED and ZRESID, simply type the following command in the
Syntax Editor:
PLOT / FORMAT REGRESSION / PLOT ZRESID WITH ZPRED .