Introduction to Stata and DASP

Abdelkrim Araar, Sami Bibi and Jean-Yves Duclos

Workshop on poverty and social impact analysis
Dakar, Senegal, 08-12 June 2010

PEP and UNDP, June 2010

Outline

Introduction to Stata

DASP: a Stata package for distributive analysis

Conclusion

Objectives

• Understand the basic structure of the Stata software;
• Discover the various elements of Stata’s graphical interface;
• Understand how DASP can be useful for the analysis of distributions of welfare;
• See the basic structure of DASP modules;
• Consider the challenges and difficulties of measuring poverty and welfare in a money-metric framework;
• See how the information required by DASP is related to important issues in welfare economics;
• See how to generate tabular and graphical output with DASP.

Introduction to Stata

Stata in a few words

• Stata is a statistical software package used by many firms and academic institutions throughout the world.
• Stata allows users, inter alia, to:
  - process data;
  - perform statistical analysis;
  - draw graphs;
  - run statistical simulations;
  - design specialized and complementary programs.
• Stata has become increasingly popular in recent years. This can be explained in part by its optimized design and its simple graphical interface.

Main Stata graphical interfaces

A. The commands window. This allows typing a Stata command line and executing it by pressing the Enter key.

• Page-Up: recalls the preceding command for editing.
• Page-Down: recalls the next command for editing.
• Tab: completes the name of a variable.

B. The review window. This window displays the last few lines typed and executed in the commands window.

• Click once on a command line that appears in this window to copy it to the commands window;
• Double-click on a command line that appears in this window to execute it;
• Right-click to show a menu that allows you to copy the commands used during the session or save them in a *.do file.

C. The variables window. This window lists the names of the variables in the open data file, as well as their labels and formats.

• Click once on a variable to copy its name to the commands window;
• Right-click to show a menu that allows you to rename variables or add notes to the current data file.

D. The results window. This window displays the results of the submitted Stata commands.

• Select part or all of the results and right-click to copy the selection in text or table format.

Dialog box and Stata commands

• Stata’s main menu also contains other items that give access to dialog boxes.
• Dialog boxes facilitate learning the syntax of Stata commands. To execute commands, Stata offers three possibilities:
  1. Typing the command in the commands window and pressing Enter;
  2. Executing the Stata command from a dialog box;
  3. Executing a *.do file (an ASCII text file that contains a set of successive Stata command lines).
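
As a minimal sketch of the third possibility, the lines below could be saved in a do file and run with the do command. The file name example.do and the use of the auto data set that ships with Stata are illustrative assumptions, not part of the workshop material.

```stata
* example.do: a hypothetical do file (the name is an assumption)

* Load the example data set that ships with Stata
sysuse auto, clear

* Summary statistics for two variables
summarize price mpg

* More detailed statistics (percentiles, skewness, kurtosis) for one variable
summarize price, detail
```

Typing do example.do in the commands window would then execute these lines in order.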

• To display the dialog box of a command, two options are available.
  - The first is to select the command from Stata’s main menu.
    Example: Main Menu: Statistics ⇒ Summaries ⇒ Summary statistics ⇒ Summary statistics
  - The second is to type the command db followed by the command of interest and then press Enter.
    Example: db summarize


Figure 1: Dialog box of the Summarize command

The buttons at the bottom of the dialog box:

• Help (?): displays the help for the command.
• Reset (R): resets the dialog box fields to their default values.
• Copy: copies to the clipboard the syntax that would be generated by clicking on OK.
• OK: executes the command and closes the dialog box.
• Cancel: closes the dialog box.
• Submit: executes the command without closing the dialog box. This button is useful when we plan to explore the command with different options.

• After clicking on Submit or on OK, the generated command appears automatically in the commands window.
• Each of the three forms of execution has its own usefulness.
  1. Dialog boxes generate accurate Stata syntax for the selected options. This helps users learn Stata’s command syntax quickly.
  2. A do file may contain a set of command lines that together form an entire program. Users can save it to reuse and modify it later.
  3. More advanced Stata users can use the commands window directly to generate statistical results quickly.

DASP: a Stata package for distributive analysis

DASP in a few words

• Stata enables programmers to provide specialized “.ado” routines that add to the power of the software.
• DASP, which stands for Distributive Analysis Stata Package, is mainly designed to assist researchers and policy analysts interested in conducting distributive analysis with Stata.
• DASP uses Stata for two main reasons:
  - Stata is a powerful tool for storing and managing household survey data. Combining DASP and Stata makes it possible to use the same environment for processing and analyzing data.
  - Stata makes it easy to add specialized programs, so that programmers can add to its power and flexibility.

DASP features

DASP allows users to:

• Estimate the most popular statistics (indices, curves) used for the analysis of poverty, inequality, social welfare, and equity;
• Estimate the differences in such statistics;
• Estimate standard errors and confidence intervals by taking full account of survey design;
• Perform the most popular distributive decomposition procedures;
• Check for the ethical robustness of distributive comparisons;
• Support distributive analysis on more than one database at the same time.

Other DASP features

• Contains optimized algorithms for the estimation of distributive indices;
• Unifies syntax and parameter use across various estimation procedures for distributive analysis;
• For each DASP module, three types of files are provided¹:
  - *.ado file: contains the program of the module.
  - *.hlp file: contains the help material for the given module.
  - *.dlg file: allows the user to perform the estimation using the module’s dialog box.

¹ For more information about DASP modules, see the user manual (Araar and Duclos, 2007).

DASP’s main menu

DASP’s windows menu makes it possible to quickly access each of the dialog boxes. These are grouped by main theme.

DASP’s main variables

• VARIABLE OF INTEREST. This is the variable that usually captures living standards. It can represent, for instance, income per capita or expenditures per adult equivalent.
• SIZE VARIABLE. This refers to the “ethical” or physical size of the observation. This variable usually refers to the number of household members.
• GROUP VARIABLE. Say that we wish to estimate poverty within a country’s rural area or within female-headed families. One way to do this is to force DASP to focus on a population subgroup defined as those for whom some GROUP VARIABLE (say, area of residence) equals a given GROUP NUMBER (say 2, for rural area).
• SAMPLING WEIGHT. Sampling weights are the inverse of the sampling probability. This variable should be set upon the initialization of the data set.
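
To illustrate how these four variables combine, the following sketch computes a poverty headcount by hand with standard Stata commands rather than with a DASP module. It assumes that individual-level estimates weight each household by its sampling weight times its size. The data, the variable names (exppc, size, sweight, zone) and the poverty line of 10000 are all invented for illustration.

```stata
* Toy data: all names and values are hypothetical
clear
input exppc size sweight zone
 8000 5 120 2
12000 4 150 2
20000 3 100 1
30000 2  90 1
 9000 6 110 2
end

* Each household counts for (sampling weight x household size) individuals
generate double popw = sweight * size

* Poverty indicator for an assumed poverty line of 10000
generate byte poor = (exppc < 10000)

* Weighted share of poor individuals in the group where zone equals 2
summarize poor [aweight = popw] if zone == 2
display "Headcount in group 2: " r(mean)
```

Here exppc plays the role of the variable of interest, size the size variable, zone the group variable (with group number 2) and sweight the sampling weight.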

Using variables in DASP

• DASP makes it possible to use more than one data file simultaneously.
• The user should initialize each data file before using it with DASP. This initialization is done by:
  - Labeling variables and values for categorical variables;
  - Initializing the sampling design with the command svyset;
  - Saving the initialized data file.
• It is useful to add a character such as “I” to the names of initialized files (example: Uganda99I.dta) in order to distinguish them.
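
The three initialization steps could look as follows. This is only a sketch on toy data: the variable names (psu, strata, sweight, size, zone, exppc), the label text and the file name mysurveyI.dta are assumptions, and a real survey would need the design variables documented by its producer.

```stata
* Toy data standing in for a household survey (all names are hypothetical)
clear
input psu strata sweight size zone exppc
1 1 120 5 1  8000
1 1 120 4 1 12000
2 1 150 3 1 20000
3 2  90 2 2 30000
4 2 110 6 2  9000
end

* 1. Label variables and the values of categorical variables
label variable exppc "Per capita expenditures"
label variable size  "Household size"
label variable zone  "Area of residence"
label define zonelbl 1 "Rural" 2 "Urban"
label values zone zonelbl

* 2. Declare the sampling design: primary sampling units, weights, strata
svyset psu [pweight = sweight], strata(strata)

* 3. Save the initialized file, with "I" appended to its name
save "mysurveyI.dta", replace
```

Because svyset stores the design with the data, saving the file afterwards means the design does not have to be declared again in later sessions.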

Inputting DASP commands

• Stata and DASP commands can be entered directly into the commands window.
• An alternative is to use dialog boxes. For this, the command db should be typed, followed by the name of the relevant DASP module. Example: db ifgt.

Applications and files in DASP

Two main types of applications are provided in DASP.

For the first type, the estimation procedure uses only one data file, the data file in “memory” (or “loaded”). The relevant variables must be specified from that file.

For the second type, two distributions are needed. For each of these two distributions, the user can specify the currently loaded data file (the one in memory) or one saved on disk.

Producing curves with DASP

• DASP was designed specifically to facilitate the use of curves to display distributive information.
• For instance, if we wish to graph Lorenz curves to compare inequality between rural and urban areas, we can simply type the following command line:
    clorenz exppc, hgroup(zone) hsize(size)
  where in this example exppc is per capita expenditures, size is household size and zone is the zone variable (1 = rural / 2 = urban).

• Executing the clorenz command opens a graph window showing the resulting Lorenz curves.
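
For readers who want to see what a Lorenz curve plots, the sketch below builds one by hand with standard Stata commands on toy data. It is not how clorenz is implemented, and it assumes that each household is weighted by its sampling weight times its size; the data and the variable sweight are invented for illustration.

```stata
* Toy data: all names and values are hypothetical
clear
input exppc size sweight
 8000 5 120
12000 4 150
20000 3 100
30000 2  90
 9000 6 110
end

* Population weight of each household
generate double popw = sweight * size

* Rank households from poorest to richest and accumulate shares
sort exppc
generate double cumpop = sum(popw)
generate double cumexp = sum(popw * exppc)
generate double p = cumpop / cumpop[_N]
generate double L = cumexp / cumexp[_N]

* Add the origin (0,0) as a sixth observation so the curve starts there
set obs 6
replace p = 0 in 6
replace L = 0 in 6
sort p

* Lorenz curve against the 45-degree line of perfect equality
twoway (line L p) (function y = x, range(0 1)), ///
    xtitle("Cumulative population share (p)") ///
    ytitle("Cumulative expenditure share L(p)") ///
    legend(order(1 "Lorenz curve" 2 "45-degree line"))
```

The /// line continuations work in a do file; in the commands window the twoway command would have to be typed on a single line.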

• For many curves, DASP can show confidence intervals at a chosen confidence level (set to 95% by default).
• For instance, to draw confidence intervals around FGT curves, we can use the cfgtsm DASP module:
    cfgtsm exppc, alpha(0) hsize(size) hgroup(sex) max(100000)

• Executing the cfgtsm command opens a graph window.

Figure: Drawing the confidence interval of distributive curves (FGT curves)

Saving DASP graphs

• Graphs produced with DASP or Stata can be saved in many different formats. Among them:
  - *.gph is Stata’s graph format. It allows the graph to be re-edited later (with Stata 10 or higher).
  - *.wmf is the Windows metafile format. This format can easily be inserted into Word documents. The user can also copy a Stata graph and paste it directly into a Word document.
  - *.eps is the encapsulated PostScript format. This format can easily be inserted into LaTeX documents.
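
A short sketch of how these formats can be produced from the command line, using a scatter plot of Stata's example auto data set in place of a DASP graph. The file name mygraph is an assumption.

```stata
* Draw any graph; here a scatter plot from the auto data shipped with Stata
sysuse auto, clear
scatter mpg weight

* Stata's own format, which can be reopened and edited later
graph save "mygraph.gph", replace

* Encapsulated PostScript, for instance for LaTeX documents
graph export "mygraph.eps", replace

* Windows metafile; this export is only available in Stata for Windows
graph export "mygraph.wmf", replace
```

Both commands act on the graph currently displayed, so they can be issued right after a DASP curve has been drawn.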

Conclusion

Summary

• Stata is a popular software package that provides powerful statistical applications and is simple to use.
• Stata commands can be entered through dialog boxes, do files, or the commands window.
• DASP facilitates the estimation of the most popular statistics used for the analysis of poverty, inequality, social welfare, and equity, and provides various sophisticated statistical tools to check for the robustness and the precision of such statistics.
• DASP unifies syntax and parameter use across various estimation procedures for distributive analysis.
• DASP allows the use of two distributions at the same time, and simplifies the production of tables and graphs.

Relevant DASP commands

• FGT and EDE-FGT poverty indices (ifgt).
• FGT curve with confidence interval (cfgts).
• Lorenz and concentration curves (clorenz).

Exercises with Stata and DASP

• Exercises 1.1, 1.2, 1.3

Reference

Araar, A. and J.-Y. Duclos (2007): “DASP: Distributive Analysis Stata Package,” PEP, CIRPÉE and World Bank, Université Laval.