Software for data management: The contribution of Stata

Dr Karen Robson, Senior Research Fellow, The Geary Institute, University College Dublin, Ireland

Getting acquainted with Stata

StataCorp develops and distributes Stata, software for statistical analysis.

Stata is available for Windows, Macintosh, and Unix computers.

Stata is used by medical researchers, biostatisticians, epidemiologists, economists, sociologists, political scientists, geographers, psychologists, social scientists, and other research professionals needing to analyze data.

Gaining popularity in the social and medical sciences

Particularly useful for handling large-scale longitudinal data

Stata SE (for large data sets)

can analyze datasets with as many as 32,766 variables, and the only limit on observations is the amount of RAM on your computer

can handle string variables with a maximum length of 244 characters

can handle matrices up to 11,000 x 11,000

requires at least 512 megabytes of RAM and 80 megabytes of disk space

Stata/Intercooled (the standard one)

can analyze datasets with as many as 2,047 variables, and the only limit on observations is the amount of RAM on your computer

can handle string variables with a maximum length of 244 characters

can handle matrices up to 800 x 800.

Small Stata

A smaller, student version of Stata (for educational purchases only)

Stata MP

The fastest version of Stata (for dual-core and multicore/multiprocessor computers)

Stata/MP is the fastest and largest version of Stata.

Resources

StataCorp website (www.stata.com)

Timberlake website (www.timberlake.co.uk)

UCLA Stata “portal” (http://www.ats.ucla.edu/stat/ or statcomp.ats.ucla.edu/stata)

Statalist (www.hsph.harvard.edu/statalist)

Stata Journal (www.stata-journal.com)

As well, available Dec 2008

Launching Stata

OS contingent

Default window preferences

Window preferences fully adjustable

Auto memory set

Comparing with SPSS

Start up differences

With data file open

Viewing data

data viewer, data editor

Viewing variables

Viewing output/commands

output window buffer, log files

Syntax and “do files”

INPUT

Stata command window

Do file

Pull-down menu

Variable window

Review window

Computation

RESULTS

Output window

Log file

Advantages and disadvantages of Stata

User driven

Free STBs

Dedicated journal

Web active

Memory requirements

Backward compatible

Change!

SPSS dominance

Orientated to writing syntax/code

Pull-down windows debate! Now in version 8 forward

Advantages and disadvantages of Stata

Easier code

Easier data handling

Clarity of operations/feedback

Results table function

Before version 8, limited graphics

Now, complex graphics

Variable labelling

Editing of output

Advantages and disadvantages of Stata

Nested/master do files

Flexible terminology

Setting types of data

Interactive help

Switch output (log file) on/off

Copy and paste

Overview of analytic techniques

Too numerous to mention!

Comprehensive manuals

A selection:

All types of regression

Survey package

Epidemiological package

Multilevel modelling

Time series functions

Cluster analysis

Data

Data files: .dta

Stat/Transfer software

Stata – using wide and long file formats

Wide file formats (everything you add goes to the right of the existing data)

Long file formats (everything you add goes underneath the existing data)
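The difference between the two layouts can be sketched in a few lines. This is a language-neutral illustration written in Python, not Stata code; the panel data, the variable names (`id`, `inc2006`, `inc2007`, `year`) and the helper `wide_to_long` are all hypothetical and chosen only to show how the same values move from columns to rows.

```python
# Hypothetical panel: two people, income observed in two survey waves.
# Wide format: one row per person, one column per wave.
wide_data = [
    {"id": 1, "inc2006": 20000, "inc2007": 21000},
    {"id": 2, "inc2006": 35000, "inc2007": 34000},
]


def wide_to_long(rows, stub, id_key="id", j_key="year"):
    """Turn columns such as inc2006, inc2007 into one row per (id, year)."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            # Only columns that begin with the stub carry a wave suffix.
            if name != id_key and name.startswith(stub):
                suffix = name[len(stub):]
                long_rows.append(
                    {id_key: row[id_key], j_key: int(suffix), stub: value}
                )
    return long_rows


# Long format: one row per person per wave.
long_data = wide_to_long(wide_data, "inc")
for record in long_data:
    print(record)
```

In the wide layout a new wave would add one more column to the right of each row; in the long layout it would add one more row per person underneath the existing ones.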

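MERGE: Data 1 and Data 2 placed side by side

APPEND: Data 2 placed underneath Data 1

Data 1 (indi) is the ‘master’ file; Data 2 (indj) is the ‘using’ file

_merge values: 1 (observation in master only), 3 (observation in both files), 2 (observation in using only)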
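The logic of appending and merging, including the three _merge codes, can be sketched as follows. This is an illustrative Python sketch of the idea, not Stata's own implementation; the helpers `append_rows` and `merge_one_to_one` and the example variables (`id`, `age`, `income`) are hypothetical. It assumes a one-to-one match on a single key and, in line with Stata's default behaviour, lets the master file's values win when both files share a variable name.

```python
def append_rows(data1, data2):
    """Stack data2 underneath data1: more observations, same variables."""
    return data1 + data2


def merge_one_to_one(master, using, key):
    """Match rows on `key` and record a Stata-style _merge code:
    1 = master only, 2 = using only, 3 = found in both files."""
    using_by_key = {row[key]: row for row in using}
    master_keys = set()
    result = []
    for row in master:
        master_keys.add(row[key])
        match = using_by_key.get(row[key])
        if match is None:
            result.append({**row, "_merge": 1})
        else:
            # Master values take precedence over using values.
            result.append({**match, **row, "_merge": 3})
    for row in using:
        if row[key] not in master_keys:
            result.append({**row, "_merge": 2})
    return result


# Hypothetical example files.
master = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
using = [{"id": 2, "income": 35000}, {"id": 3, "income": 18000}]

merged = merge_one_to_one(master, using, "id")
for record in merged:
    print(record)

stacked = append_rows(master, [{"id": 4, "age": 29}])
print(len(stacked), "observations after appending")
```

Tabulating the _merge variable after a merge is a quick check on how many observations matched and how many came from only one of the two files.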

Recommended