21
National Transfer Accounts 1 Data Data and Estimation and Estimation Issues Issues Sang-Hyop Lee Sang-Hyop Lee University of Hawaii at Manoa University of Hawaii at Manoa

N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

Embed Size (px)

Citation preview

Page 1: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts1

Data Data and Estimation and Estimation IssuesIssues

Sang-Hyop LeeSang-Hyop Lee

University of Hawaii at ManoaUniversity of Hawaii at Manoa

Page 2: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts2

DataData Sets for Statistical Sets for Statistical AnalysisAnalysis► Cross sectionCross section► Time seriesTime series► Cross section time series; useful for Cross section time series; useful for

aggregate cohort analysisaggregate cohort analysis► Panel (longitudinal)Panel (longitudinal)

Repeated cross-section design: most commonRepeated cross-section design: most common Rotating panel design (Cote d’Ivore 1985 data)Rotating panel design (Cote d’Ivore 1985 data) Supplemental cross-section design (Kenya & Supplemental cross-section design (Kenya &

Tanzania 1982/83 data, MFLS)Tanzania 1982/83 data, MFLS)► Cross section with retrospective informationCross section with retrospective information► Micro vs. MacroMicro vs. Macro

Page 3: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts3

Quality of Survey DataQuality of Survey Data

►Constructing NTA requires individual or Constructing NTA requires individual or household micro survey data sets.household micro survey data sets.

►A good survey data set has the A good survey data set has the properties ofproperties of Extent (richnessExtent (richness): it has the ): it has the variablesvariables of of

interest at a certain level of details.interest at a certain level of details. ReliabilityReliability: the variables are : the variables are measured measured

without errorwithout error.. ValidityValidity: the data set is : the data set is representativerepresentative..

Page 4: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts4

Data Problem (An example)Data Problem (An example)►FIES (64,433 household with FIES (64,433 household with

233,225 individuals)233,225 individuals) Measured for only urban areaMeasured for only urban area (Valid?) (Valid?) No single person householdNo single person household (Valid?) (Valid?) No individual level income, only No individual level income, only

household levelhousehold level (Rich?) (Rich?) No information of income for family No information of income for family

owned businessowned business (Rich?) (Rich?) Measured for Measured for up to 8 household up to 8 household

membersmembers: discrepancy between the : discrepancy between the sum of individual and household sum of individual and household incomeincome (Valid? Rich?)(Valid? Rich?)

Page 5: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts5

Extent (Richness):Extent (Richness):Missing/Change of VariablesMissing/Change of Variables

►Not Not measuredmeasured in the data in the data Only measured for a certain groupOnly measured for a certain group Labor portion of self-employed incomeLabor portion of self-employed income

►Change of variables over timeChange of variables over time Institutional/policy changeInstitutional/policy change New consumption items, new jobs, etcNew consumption items, new jobs, etc

►Change of survey Change of survey instrument/collapsinginstrument/collapsing

Page 6: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts6

Reliability: Reliability: Measurement Measurement ErrorError► Response errorResponse error

RRespondents do not know what is requiredespondents do not know what is required IIncentive to understatencentive to understate/overstate/overstate RRecall biasecall bias: related with period of survey: related with period of survey Using wrong/different reporting unitsUsing wrong/different reporting units

► Reporting errorReporting error: : heaping or outliersheaping or outliers► Coding errorCoding error► Overestimate/UnderestimateOverestimate/Underestimate

Parents do not report their children until the Parents do not report their children until the children have namechildren have name

Detect by checking survival rate of single Detect by checking survival rate of single ageage

► Discrepancy between aggregate value Discrepancy between aggregate value and individual valueand individual value

Page 7: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts7

Validity: Validity: CensorCensoringing

►Selection based on characteristicsSelection based on characteristics►Top/Bottom codingTop/Bottom coding►Censoring Censoring due to the time of surveydue to the time of survey

Duration of unemploymentDuration of unemployment (left and right (left and right censoring)censoring)

Completed years of schoolingCompleted years of schooling

►Attrition (Panel data)Attrition (Panel data)

Page 8: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts8

Categorical/Qualitative Categorical/Qualitative VariablesVariables

►Converting categorical to single Converting categorical to single continuous variablescontinuous variables Grouped by age (population, public Grouped by age (population, public

education consumption)education consumption) Income category (FPL)Income category (FPL)

► Inconsistency over timeInconsistency over time►Categorical Categorical continuous, and vice continuous, and vice

versaversa

Page 9: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts9

Units, Real vs. NominalUnits, Real vs. Nominal► Be cBe careful about the reporting unitareful about the reporting unit

MeasurMeasurementement units units Reporting period units (reference period, Reporting period units (reference period,

seasonal fluctuation, recall bias) seasonal fluctuation, recall bias)

► Nominal vs. RealNominal vs. Real Aggregation across itemsAggregation across items Quality change (e.g. computer)Quality change (e.g. computer) Where inflation is a substantial problem Where inflation is a substantial problem

Page 10: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts10

Solution for Solution for Missing VariablesMissing Variables► Ignore it; random non-responseIgnore it; random non-response► Give up: find other source of data (FIES vs. LFS)Give up: find other source of data (FIES vs. LFS)► ImputeImpute

Based on their characteristics or mean valueBased on their characteristics or mean value Based on the value of other peer groupBased on the value of other peer group Modified zero order regressions (y on x)Modified zero order regressions (y on x)

- Create dummy variable for missing Create dummy variable for missing variables of x (z)variables of x (z)

- Replace missing variable with 0 (x’)Replace missing variable with 0 (x’)- Regress y on x’ and z, rather than y on xRegress y on x’ and z, rather than y on x

Page 11: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts11

Households vs. IndividualsHouseholds vs. Individuals

► Consumption and income measurement are Consumption and income measurement are individual level individual level

► But a lot of data are gathered from But a lot of data are gathered from householdhousehold Allocating household consumption Allocating household consumption

(income) to individual household (income) to individual household members is a critical part of estimation members is a critical part of estimation

Adjusting using aggregate (Adjusting using aggregate (mmacro) controlacro) control

Page 12: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts12

Headship (Thailand, 1996)Headship (Thailand, 1996)

0

10

20

30

40

50

60

70

80

0 10 20 30 40 50 60 70 80 90+

Age

Hea

dshi

p R

ate

(per

cent

)

Economic Head

Self-reported Head

`

Page 13: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts13

MeasuringMeasuring Consumption Consumption

► Underestimation: e.g. British FESUnderestimation: e.g. British FES Using aggregate control mitigate the Using aggregate control mitigate the

problem.problem.

► Home produced items: both income and Home produced items: both income and consumption.consumption.

► Allocation across individuals is difficult Allocation across individuals is difficult ► Estimating Estimating some some profilesprofiles, such as health , such as health

expenditure are also difficultexpenditure are also difficult in part due in part due to various source of financing.to various source of financing.

Page 14: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts14

Measuring IncomeMeasuring Income

► ““All of the difficulties of measuring All of the difficulties of measuring consumption apply with greater force to the consumption apply with greater force to the measurement of income” (Deaton, p. 29).measurement of income” (Deaton, p. 29). Need detailed information on “transactions” Need detailed information on “transactions”

(inflow and outflow): an enormous task(inflow and outflow): an enormous task Incentive to understate: using aggregate control Incentive to understate: using aggregate control

mitigate the problem.mitigate the problem. Some surveys did not attempt to collect Some surveys did not attempt to collect

information on asset income (e.g. NSS of India)information on asset income (e.g. NSS of India)

► Allocating self-employment income across Allocating self-employment income across individuals is difficult.individuals is difficult.

Page 15: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts15

Data Data CleaningCleaning

► Case by caseCase by case► Find out what data sets are available and Find out what data sets are available and

choose the best one (template for choose the best one (template for workshop)workshop)

► Detect outliers and examine them carefullyDetect outliers and examine them carefully► A serious examination is required when A serious examination is required when

inflation matters to check whether actual inflation matters to check whether actual estimation process generate a variable estimation process generate a variable

► Make variables consistentMake variables consistent► Convert categorical variable to continuous Convert categorical variable to continuous

variable, etc.variable, etc.

Page 16: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts16

Weighting Weighting and Clusteringand Clustering

► Weight should be used in the summary of Weight should be used in the summary of variables/direct variables/direct tabulation/regression/smoothing.tabulation/regression/smoothing.

► Frequency Weights; fw indicate replicated Frequency Weights; fw indicate replicated data. The weight tells the command how data. The weight tells the command how many observations each observation really many observations each observation really represents.represents.. tab edu [w=wgt] . tab edu [w=wgt] tab edu [fw=wgt] tab edu [fw=wgt]

► Analytic Weights; aw are inversely Analytic Weights; aw are inversely proportional to the variance of an proportional to the variance of an observation. It is appropriate when you are observation. It is appropriate when you are dealing with data containing averages.dealing with data containing averages.. su edu [w=wgt] . su edu [w=wgt] su edu [aw=wgt] su edu [aw=wgt]. reg wage edu [w=wgt] . reg wage edu [w=wgt] reg wage edu reg wage edu [aw=wgt][aw=wgt]

Page 17: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts17

Weighting and ClusteringWeighting and Clustering (cont’d)(cont’d)►Probability Weights; pw are the sample Probability Weights; pw are the sample

weight which is the inverse of the weight which is the inverse of the probability that this observation was probability that this observation was sampled.sampled.

. reg wage edu [pw=wgt] . reg wage edu [pw=wgt] reg wage reg wage edu [(a)w=wgt], robustedu [(a)w=wgt], robust

. reg wage edu [pw=wgt], cluster(hhid) . reg wage edu [pw=wgt], cluster(hhid)

reg wage edu [(a)w=wgt], reg wage edu [(a)w=wgt], cluster(hhid)cluster(hhid)

Page 18: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts18

SmoothingSmoothing► Shows the pattern more clearly by Shows the pattern more clearly by

reducing sampling variancereducing sampling variance► Should not eliminate real features of the Should not eliminate real features of the

datadata Avoid too much smoothing (e.g. old-Avoid too much smoothing (e.g. old-

age health expenditure.)age health expenditure.) We don’t want to smooth some profiles We don’t want to smooth some profiles

(e.g. education)(e.g. education) Basic components should be Basic components should be

smoothed, but not aggregationssmoothed, but not aggregations► Type of smoothingType of smoothing

““lowess” smoothing (Stata) does not lowess” smoothing (Stata) does not incorporate sample weightincorporate sample weight

Friedman’s super smoothing (R) does.Friedman’s super smoothing (R) does.

Page 19: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts19

DiscussionDiscussion► Data type/quality varies across countries.Data type/quality varies across countries.► Estimation method could vary across Estimation method could vary across

countries depending on data.countries depending on data.► However,However, some standard measure some standard measure could could

be applied.be applied. Definition Definition Specification Specification Estimation Estimation

using weight using weight Smoothing Smoothing Macro Macro control control Present your work!Present your work!

If some component vary substantially If some component vary substantially by age, then it is estimateby age, then it is estimatedd separately separately

(education, health, etc)(education, health, etc)

Page 20: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts20

AcknowledgementAcknowledgement

Support for this project has been provided by the Support for this project has been provided by the following institutions:following institutions:

► the John D. and Catherine T. MacArthur Foundation; the John D. and Catherine T. MacArthur Foundation; ► the National Institute on Aging: NIA, R37-AG025488 the National Institute on Aging: NIA, R37-AG025488

and NIA, R01-AG025247; and NIA, R01-AG025247; ► the International Development Research Centre the International Development Research Centre

(IDRC);(IDRC);► the United Nations Population Fund (UNFPA); the United Nations Population Fund (UNFPA); ► the Academic Frontier Project for Private Universities: the Academic Frontier Project for Private Universities:

matching fund subsidy from MEXT (Ministry of matching fund subsidy from MEXT (Ministry of Education, Culture, Sports, Science and Technology), Education, Culture, Sports, Science and Technology), 2006-10, granted to the Nihon University Population 2006-10, granted to the Nihon University Population Research Institute.Research Institute.

Page 21: N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

National Transfer Accounts21

The EndThe End