Elite_data Analysis With Stata

Introduction to Data Analysis with Stata
Sara Godoy. Advanced Group. November 2011

Nonparametric Analysis

Non-parametric tests: Summary

NATURE OF DEPENDENT VARIABLE: CATEGORICAL/NOMINAL
    One-sample:                   Binomial test
    Two-sample, related/matched:  McNemar test
    Two-sample, independent:      Fisher's exact test
    K-sample:                     Chi-square test

NATURE OF DEPENDENT VARIABLE: ORDINAL/INTERVAL
    One-sample:                   Kolmogorov-Smirnov one-sample test
    Two-sample, related/matched:  Wilcoxon signed ranks test
    Two-sample, independent:      Wilcoxon-Mann-Whitney test
    K-sample:                     Kruskal-Wallis test
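As a rough guide, each test in the summary has a built-in Stata command. The sketch below is not from the slides: it assumes the example datasets shipped with Stata (auto and bpwide), and the 0/1 indicators and cutoffs are hypothetical, created only so that the categorical tests have something to run on.

```stata
* Sketch only: datasets, indicator variables and cutoffs are assumptions.
sysuse auto, clear

* One-sample, categorical: binomial test of P(foreign == 1) = 0.5
bitest foreign == 0.5

* Two independent samples, categorical: chi-square and Fisher's exact test
generate byte highmpg = mpg > 20              // hypothetical 0/1 indicator
tabulate foreign highmpg, chi2 exact

* K independent samples, categorical: chi-square test
tabulate rep78 foreign, chi2

* One-sample, ordinal/interval: Kolmogorov-Smirnov against a fitted normal
quietly summarize mpg
ksmirnov mpg = normal((mpg - r(mean)) / r(sd))

* Two independent samples, ordinal/interval: Wilcoxon-Mann-Whitney test
ranksum mpg, by(foreign)

* K independent samples, ordinal/interval: Kruskal-Wallis test
kwallis mpg, by(rep78)

* The related-samples tests need paired measurements
sysuse bpwide, clear

* Two related samples, ordinal/interval: Wilcoxon signed ranks test
signrank bp_before = bp_after

* Two related samples, categorical: McNemar test on paired 0/1 indicators
generate byte high_before = bp_before > 155   // hypothetical cutoff
generate byte high_after  = bp_after  > 155
mcc high_before high_after
```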

Non-parametric correlation

A Spearman correlation is used when one or both of the variables are not assumed to be normally distributed and interval-scaled, but are assumed to be at least ordinal. The values of the variables are converted to ranks, and the ranks are then correlated.

Syntax: spearman [varlist] [if] [, options]

spearman read write

 Number of obs =     200
Spearman's rho =  0.6167

Test of Ho: read and write are independent
    Prob > |t| =  0.0000

The results suggest that the relationship between read and write (rho = 0.6167, p < 0.0001) is statistically significant.
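To make the "converted to ranks and then correlated" step concrete, the following sketch computes the coefficient both ways. It assumes Stata's bundled auto dataset rather than the read/write data used on the slide, and the variable names rank_price and rank_mpg are made up for the illustration.

```stata
* Sketch under assumptions: auto.dta stands in for the slide's data.
sysuse auto, clear

* Spearman's rho, number of observations and p-value
spearman price mpg, stats(rho obs p)

* The same coefficient by hand: rank each variable (ties get the
* average rank), then take the ordinary Pearson correlation of the ranks
egen rank_price = rank(price)
egen rank_mpg   = rank(mpg)
correlate rank_price rank_mpg
```

The correlation reported by correlate on the ranks should match the rho reported by spearman.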

P-values meaning

A p-value is a measure of how much evidence we have against the null hypothesis (H0). The p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.

One often "rejects the null hypothesis" when the p-value is less than the significance level.
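The decision rule can be sketched in a few lines of Stata. Everything specific here is an assumption made for illustration: the auto dataset, the hypothesised mean of 20, and the 0.05 significance level.

```stata
* Sketch only: dataset, null value and alpha are illustrative assumptions.
sysuse auto, clear

* One-sample t test of H0: mean(mpg) = 20
ttest mpg == 20

* ttest leaves the two-sided p-value in r(p); save it before it is overwritten
scalar pval  = r(p)
scalar alpha = 0.05

if pval < alpha {
    display "p = " %6.4f pval ": reject H0 at the " alpha " level"
}
else {
    display "p = " %6.4f pval ": fail to reject H0 at the " alpha " level"
}
```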

xi: regress csat expense percent income high college i.region, robust

Number of obs =      50
F(8, 41)      =   69.82
Prob > F      =  0.0000
R-squared     =  0.9111
Root MSE      =  21.492

             |               Robust
        csat |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
     expense |   -.002021   .0035883    -0.56   0.576    -.0092676    .0052256
     percent |  -3.007647   .2358047   -12.75   0.000    -3.483864    -2.53143
      income |  -.1674421   1.196409    -0.14   0.889    -2.583638    2.248754
        high |   1.814731    1.02694     1.77   0.085    -.2592168    3.888679
     college |   4.670564   1.599798     2.92   0.006     1.439705    7.901422
  _Iregion_2 |   69.45333   17.99933     3.86   0.000     33.10295    105.8037
  _Iregion_3 |   25.39701   12.52558     2.03   0.049      .101086    50.69293
  _Iregion_4 |   34.57704    9.44989     3.66   0.001      15.4926    53.66149
       _cons |   808.0206   67.86418    11.91   0.000     670.9661    945.0751

NOTE: By default xi excludes the first value. To select a different value, type the following before running the regression:

. char region[omit] 4
. xi: regress csat expense percent income high college i.region, robust

This will select Midwest (4) as the reference category for the dummy variables.
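For comparison, here is a sketch of the same idea on Stata's bundled auto dataset (an assumption; the slides use a different dataset), with rep78 standing in for region and 3 chosen arbitrarily as the reference level. The second regression uses factor-variable notation, available from Stata 11 onward, which sets the base level inline and does not need xi.

```stata
* Sketch only: auto.dta and the base level 3 are illustrative assumptions.
sysuse auto, clear

* xi approach, as in the note: set the omitted category, then run xi
char rep78[omit] 3
xi: regress price mpg weight i.rep78, robust

* Factor-variable approach: ib3. makes level 3 the reference category
regress price mpg weight ib3.rep78, vce(robust)
```

Both regressions should give the same coefficients; only the names of the dummy variables in the output differ.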

Regression: correlation matrix

Below is a correlation matrix for all variables in the model. The numbers are Pearson correlation coefficients, which range from -1 to 1. Values close to 1 or -1 indicate a strong correlation, and values close to 0 a weak one. A negative value indicates an inverse relationship (roughly, when one goes up the other goes down). The second line in each row is the significance level (p-value) of the correlation, and a star marks correlations significant at the 0.05 level.

pwcorr csat expense percent income high college, star(0.05) sig

             |     csat  expense  percent   income     high  college
-------------+------------------------------------------------------
        csat |   1.0000
             |
     expense |  -0.4663*  1.0000
             |   0.0006
             |
     percent |  -0.8758*  0.6509*  1.0000
             |   0.0000   0.0000
             |
      income |  -0.4713*  0.6784*  0.6733*  1.0000
             |   0.0005   0.0000   0.0000
             |
        high |   0.0858   0.3133*  0.1413   0.5099*  1.0000
             |   0.5495   0.0252   0.3226   0.0001
             |
     college |  -0.3729*  0.6400*  0.6091*  0.7234*  0.5319*  1.0000
             |   0.0070   0.0000   0.0000   0.0000   0.0001
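One practical point about pwcorr is how it treats missing values. The sketch below assumes Stata's bundled auto dataset, in which rep78 has some missing values; it is not the dataset used on the slide.

```stata
* Sketch only: auto.dta is an assumption, chosen because rep78 has gaps.
sysuse auto, clear

* pwcorr uses pairwise deletion: each coefficient is computed from all
* observations available for that pair; obs shows the count per pair
pwcorr price mpg rep78, star(0.05) sig obs

* correlate uses listwise deletion: an observation with any missing
* value among the listed variables is dropped from every coefficient
correlate price mpg rep78
```

With missing data the two commands can report slightly different coefficients for the same pair of variables.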

Regression: graph matrix

The graph matrix command produces a graphical representation of the correlation matrix by presenting a series of scatterplots for all pairs of variables.

graph matrix csat expense percent income high college, half maxis(ylabel(none) xlabel(none))

Regression: managing all this output

Usually when we're running regressions, we'll be testing multiple models at a time, and it can be difficult to compare results.

Stata offers several user-friendly options for storing and viewing regression output from multiple models:

- Store output: eststo / esttab
- Output into Excel: outreg2
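Both tools are user-written packages that are installed from the SSC archive rather than shipped with Stata. The following is a minimal sketch of the outreg2 route, assuming the bundled auto dataset and a hypothetical output file name results.xls. The options shown (replace, append, ctitle) are the commonly used ones; check help outreg2 for the full list.

```stata
* Sketch only: dataset and file name are illustrative assumptions.
* One-time installation from SSC (uncomment if not yet installed):
* ssc install estout      // provides eststo and esttab
* ssc install outreg2

sysuse auto, clear

* First model: create (or overwrite) the output file
regress price mpg, vce(robust)
outreg2 using results.xls, replace ctitle(Model 1)

* Second model: add a column to the same file
regress price mpg weight, vce(robust)
outreg2 using results.xls, append ctitle(Model 2)
```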

Regression: eststo/esttab

We can store this information in Stata. Just type:

regress csat expense, robust
eststo model1

regress csat expense percent income high college, robust
eststo model2

xi: regress csat expense percent income high college i.region, robust
eststo model3

Regression: eststo/esttab

Now Stata will hold your output in memory until you ask to recall it:

esttab model1 model2 model3

                      (1)             (2)             (3)
                     csat            csat            csat
expense           -0.0223***      0.00335        -0.00202
                  (-6.07)          (0.70)         (-0.56)
percent                            -2.618***       -3.008***
                                 (-11.44)        (-12.75)
income                              0.106          -0.167
                                   (0.09)         (-0.14)
high                                1.631           1.815
                                   (1.73)          (1.77)
college                             2.031           4.671**
                                   (0.96)          (2.92)
_Iregion_2                                          69.45***
                                                   (3.86)
_Iregion_3                                          25.40*
                                                   (2.03)
_Iregion_4                                          34.58***
                                                   (3.66)
_cons              1060.7***        851.6***        808.0***
                  (43.55)         (14.86)         (11.91)
N                      51              51              50

t statistics in parentheses * p
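A compact variant of the same workflow is sketched below. It assumes the estout package is installed from SSC and uses Stata's bundled auto dataset instead of the slide's data; the model names and the file name models.csv are hypothetical.

```stata
* Sketch only: dataset, model names and file name are assumptions.
* Requires the user-written estout package: ssc install estout
sysuse auto, clear

* Start from an empty set of stored estimates
eststo clear

* eststo can also be used as a prefix, storing each model as it is fitted
eststo model1: regress price mpg, vce(robust)
eststo model2: regress price mpg weight, vce(robust)

* Show standard errors instead of t statistics, and add R-squared
esttab model1 model2, se r2

* Write the same table to a CSV file that Excel can open
esttab model1 model2 using models.csv, replace csv se r2

* Drop the stored estimates when finished
eststo clear
```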
