Course Topics Simple statistical methods for data analysis using Excel. descriptive statistics, an introduction to statistical inference, and linear regression models. Excel workbooks for computing elementary statistics using the Data Analysis toolkit. Transferring digital information (graphs and tables) into Word documents, developing presentations in Power Point. Publishing documents on the web

Page 1

Course Topics

Simple statistical methods for data analysis using Excel:

• descriptive statistics,

• an introduction to statistical inference, and

• linear regression models.

Excel workbooks for computing elementary statistics using the Data Analysis toolkit.

Transferring digital information (graphs and tables) into Word documents; developing presentations in PowerPoint.

Publishing documents on the web.

Page 2

Optional Texts

Statistics with Microsoft Excel, by B.J. Dretzke (recommended for students who are not familiar with Excel)

Introduction to the Practice of Statistics, by David S. Moore and George P. McCabe

Elementary Statistics (2002), by M. F. Triola

The Basic Practice of Statistics (2000), by D.S. Moore

Page 3

Useful links

Surfstat: an online text in introductory Statistics: http://www.anu.edu.au/nceph/surfstat/surfstat-home/surfstat.html

Statistics at Square One: http://bmj.com/collections/statsbk/index.shtml

The DePaul University library offers a number of good books on Excel using books 24X7: IT Pro

Page 4

Getting ready for the class

• Open Excel

• Check that the Tools menu contains the Data Analysis option

• If not, use Tools|Add-Ins… and click on the box labeled Analysis ToolPak

Page 5

Exploratory Data Analysis

The goal of data analysis is to gain information from the data. Long listings of data are of little value; statistical methods help us extract the information they contain.

Exploratory data analysis: a set of methods to display and summarize the data.

Data on just one variable: the distribution of the observations is analyzed by

I. Displaying the data in a graph that shows overall patterns and unusual observations (stem-and-leaf plot, bar chart, histogram, box plot, density curve).

II. Computing descriptive statistics that summarize specific aspects of the data (center and spread).

Page 6

Observed variables

Data contain information about a group of individuals or subjects.

A variable is a characteristic of an observed individual which takes different values for different individuals:

Quantitative variable (continuous) takes numerical values. Ex.: Height, Weight, Age, Income, Measurements

Qualitative/Categorical variable classifies an individual into categories or groups. Ex.: Sex, Religion, Occupation, Age (in classes, e.g. 10-20, 20-30, 30-40)

The distribution of a variable tells us what values it takes and how often it takes those values.

Different statistical methods are used to analyze quantitative or categorical variables.

Page 7

Graphs for categorical variables

The values of a categorical variable are labels.

The distribution of a categorical variable lists the count or percentage of individuals in each category.

Example: wireless surfers by age, from a sample of 400 wireless internet users.

18-34 212 (53%)

35-54 168 (42%)

55> 20 (5%)

[Figures: pie chart and bar chart of wireless surfers by age]
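The percentages shown in the charts follow directly from the counts. The course does this in Excel; the short Python sketch below is only an illustration of the same arithmetic, and the variable names are chosen here for the example.

```python
# Counts from the slide: a sample of 400 wireless internet users by age group.
counts = {"18-34": 212, "35-54": 168, "55>": 20}

total = sum(counts.values())  # 400 users in the sample

# Distribution of the categorical variable: percentage in each category.
percentages = {group: 100 * n / total for group, n in counts.items()}

for group, pct in percentages.items():
    print(f"{group}: {counts[group]} ({pct:.0f}%)")
# 18-34: 212 (53%)
# 35-54: 168 (42%)
# 55>: 20 (5%)
```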

Page 8

Another Example

Wireless internet users by gender

Male 288 (72%)

Female 112 (28%)

Total 400 (100%)

[Figure: bar chart of wireless surfers by gender]

Page 9

Assigning Categories

Example: On the morning of April 10, 1912 the Titanic sailed from the port of Southampton (UK), bound for New York. Altogether there were 2,201 passengers and crew members on board. The table classifies them by group (passenger class or crew), sex, and outcome.

                Survived          Dead
                Male   Female     Male   Female
First class       62      141      118        4
Second class      25       93      154       13
Third class       88       90      422      106
Crew members     192       20      670        3
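As a quick check on the Titanic table, the counts can be added up and collapsed over one of the categorical variables. The Python sketch below is an illustration only (the course works in Excel, and the names `titanic` and `survival_rate` are introduced here): it confirms the total of 2,201 people and computes the percentage of survivors in each group.

```python
# Counts from the table: (survived male, survived female, dead male, dead female).
titanic = {
    "First class":  (62, 141, 118, 4),
    "Second class": (25, 93, 154, 13),
    "Third class":  (88, 90, 422, 106),
    "Crew members": (192, 20, 670, 3),
}

# Every person falls into exactly one cell, so the cells add up to the total.
total_on_board = sum(sum(row) for row in titanic.values())
print("Total on board:", total_on_board)  # 2201, matching the slide

# Collapse over sex: percentage of survivors within each group.
survival_rate = {}
for group, (surv_m, surv_f, dead_m, dead_f) in titanic.items():
    survived = surv_m + surv_f
    survival_rate[group] = 100 * survived / (survived + dead_m + dead_f)
    print(f"{group}: {survival_rate[group]:.1f}% survived")
```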

Page 10

The Histogram

Example: CEO salaries

Forbes magazine published data on the best small firms in 1993. These were firms with annual sales of more than $5 million and less than $350 million. Firms were ranked by five-year average return on investment. The data extracted are the age and annual salary of the chief executive officer for the first 59 ranked firms.

Salary of chief executive officer (including bonuses), in $thousands

145 621 262 208 362 424 339 736 291 58 498 643 390 332 750 368 659 234 396 300 343 536 543 217 298 1103 406 254 862 204 206 250 21 298 350 800 726 370 536 291 808 543 149 350 242 198 213 296 317 482 155 802 200 282 573 388 250 396 572
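Alongside a graph, the center and spread of the CEO salaries can be summarized numerically. The sketch below uses Python's standard `statistics` module as a stand-in for Excel's Data Analysis tools; it is an illustration, not the procedure used in the course.

```python
import statistics

# CEO salaries (including bonuses), in $thousands, as listed on the slide.
salaries = [
    145, 621, 262, 208, 362, 424, 339, 736, 291, 58,
    498, 643, 390, 332, 750, 368, 659, 234, 396, 300,
    343, 536, 543, 217, 298, 1103, 406, 254, 862, 204,
    206, 250, 21, 298, 350, 800, 726, 370, 536, 291,
    808, 543, 149, 350, 242, 198, 213, 296, 317, 482,
    155, 802, 200, 282, 573, 388, 250, 396, 572,
]

print("n      =", len(salaries))   # 59 firms
print("min    =", min(salaries))   # 21
print("max    =", max(salaries))   # 1103

# Center: mean and median.
print("mean   =", round(statistics.mean(salaries), 1))
print("median =", statistics.median(salaries))

# Spread: sample standard deviation.
print("stdev  =", round(statistics.stdev(salaries), 1))
```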

Page 11

Drawing a histogram

1. Construct a distribution table:

i. Define class intervals or bins (choose intervals of equal width!)

ii. Count the percentage of observations in each interval

iii. End-point convention: the left endpoint of the interval is included and the right endpoint is excluded, i.e. [a,b)

2. Draw the horizontal axis.

3. Construct the blocks: height of block = percentage!

The total area under a histogram must be 100%.
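The steps for drawing a histogram can be sketched in Python using the CEO salary data. This is an illustration under stated assumptions: the bin width of 200 ($thousands) and the helper name `distribution_table` are choices made here, not taken from the slides, and the blocks are drawn as text bars rather than as an Excel chart.

```python
# CEO salaries (including bonuses), in $thousands, as listed on the slide.
salaries = [
    145, 621, 262, 208, 362, 424, 339, 736, 291, 58,
    498, 643, 390, 332, 750, 368, 659, 234, 396, 300,
    343, 536, 543, 217, 298, 1103, 406, 254, 862, 204,
    206, 250, 21, 298, 350, 800, 726, 370, 536, 291,
    808, 543, 149, 350, 242, 198, 213, 296, 317, 482,
    155, 802, 200, 282, 573, 388, 250, 396, 572,
]

BIN_WIDTH = 200  # assumed equal width for every class interval


def distribution_table(values, width):
    """Return (a, b, percentage) rows for the intervals [a, b)."""
    counts = {}
    for v in values:
        # Left endpoint included, right endpoint excluded: 200 falls in [200, 400).
        left = (v // width) * width
        counts[left] = counts.get(left, 0) + 1
    n = len(values)
    rows = []
    for left in range(0, max(counts) + width, width):
        rows.append((left, left + width, 100 * counts.get(left, 0) / n))
    return rows


table = distribution_table(salaries, BIN_WIDTH)

# Height of each block = percentage of observations in the interval.
for a, b, pct in table:
    print(f"[{a:4d}, {b:4d})  {pct:5.1f}%  " + "#" * round(pct))

# The block heights add up to 100%.
print("Total:", round(sum(pct for _, _, pct in table), 1), "%")
```

Changing `BIN_WIDTH` changes the shape of the histogram, so it is worth trying a few widths before settling on one.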