
Statistike 1


CHAPTER 1 STATISTICS

1.1 WHAT IS STATISTICS?

Statistics is the science of collecting, describing, and interpreting data.

Statistics can be divided into 2 areas:

1. Descriptive statistics refers to the techniques and methods for organizing and summarizing the information obtained from the sample.

2. Inferential statistics refers to the techniques of interpreting and generalizing about the population based on the information obtained from the sample.

• Statistics is used in nearly all aspects of life; for example, it is widely used in health-related fields, academia, and sports.

• For this reason, it is critical that statistical results are studied carefully and that appropriate conclusions are drawn.

Case Study 1.1

1. Who was surveyed? American travelers who take trips of more than 100 miles

2. Based on this information, how would you describe the “typical” long–distance traveler?

A 38-year-old male whose income is over $50k. He travels by car between 100 and 300 miles, usually to visit friends or relatives, and stays at their homes.

3. How do you compare to this “typical” long–distance traveler?

Answers will vary based on gender, income, age, etc.

Case Study 1.2

1. Who was surveyed? Employed American Adults

2. Who did the surveying? U.S. Labor Department

3. Explain the meaning of: 3,262,120 is the number of cashiers surveyed in 1996, $5.75 is the median hourly pay rate for the cashiers surveyed.


Case Study 1.3

1. Who was surveyed? American Adults

2. When were the surveys taken? On September 1, 1998

3. How accurate are the reported percentages believed to be? Within + or – 4%

4. What do you think the 54% combined with the margin of error means? The actual percentage who rate economic conditions as good could range from 50% to 58%.
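The arithmetic behind the answer to question 4 can be sketched in a few lines of Python. The function name `interval_from_margin` is an illustrative choice, not something taken from the case study; the only inputs are the reported 54% and the margin of error of 4 percentage points.

```python
def interval_from_margin(reported_pct, margin_pct):
    """Return the (low, high) range implied by a reported percentage and its margin of error."""
    return reported_pct - margin_pct, reported_pct + margin_pct


# Values from Case Study 1.3: 54% reported, margin of error of 4 percentage points.
low, high = interval_from_margin(54, 4)
print(f"Between {low}% and {high}%")  # Between 50% and 58%
```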

Assignment: p. 7–9 Exercises 1.4 – 1.8

1.2 INTRODUCTION TO BASIC TERMS

1. Population - a collection, or set, of individuals or objects or events whose properties are to be analyzed.

a. Finite - when the membership of a population can be physically listed.

b. Infinite - when the membership of a population is unlimited.

2. Sample - a subset of a population. The part of the population from which the data values are obtained.

3. Variable - a characteristic of interest about each individual element of a population or sample. (Ex. Height- there are many different heights)

a. Qualitative (or Attribute or Categorical) Variable – a variable that describes or categorizes an element of a population. (Ex. Kinds of fruit, types of music, etc.)

• Nominal – a qualitative variable that categorizes an element of a population. Not only are arithmetic operations not meaningful for data resulting from a nominal variable, an order cannot be assigned to the categories. (Ex. Description or name)

• Ordinal – A qualitative variable that incorporates an ordered position, or ranking. (Ex. First, second, third, etc)

b. Quantitative (or Numerical) Variable – a variable that quantifies an element of a population.

• Discrete – A quantitative variable that can assume a countable number of values. The domain of a discrete variable has gaps between the possible values.

• Continuous – A quantitative variable that can assume an uncountable number of values. Theoretically, the domain of a continuous variable has no gaps since all numerical values are possible. (Ex. Amount of time, age, area, volume)
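As a rough illustration of why the type of variable matters, the sketch below applies to each kind of variable only the operation that the definitions above allow. All four data sets are hypothetical and invented for this example.

```python
from collections import Counter
from statistics import mean

# Hypothetical data, invented for illustration only.
fruit = ["apple", "pear", "apple", "plum"]        # nominal: names only
finish = ["third", "first", "second", "first"]    # ordinal: ranked categories
siblings = [0, 2, 1, 3]                           # discrete: counts
minutes = [12.5, 30.25, 18.0, 22.75]              # continuous: measurements

# Nominal: counting categories is meaningful; ordering or averaging is not.
fruit_counts = Counter(fruit)

# Ordinal: the categories can be put in order once the ranking is stated.
rank = {"first": 1, "second": 2, "third": 3}
finish_sorted = sorted(finish, key=rank.get)

# Quantitative (discrete or continuous): arithmetic such as a mean is meaningful.
mean_siblings = mean(siblings)
mean_minutes = mean(minutes)

print(fruit_counts)
print(finish_sorted)
print(mean_siblings, mean_minutes)
```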


4. Data (singular) - the value of the variable associated with one element of a population or sample. This value may be a number, a word, or a symbol.

(Ex. 5'5"- this height is constant)

5. Data (plural) – The set of values collected for the variable from each of the elements belonging to the sample.

6. Experiment – a planned activity whose results yield a set of data.

7. Parameter – a numerical value summarizing all the data of an entire population. (Parameters are calculated from populations.)

8. Statistic – a numerical value summarizing the sample data. (Statistics are calculated from samples.)
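The difference between a parameter and a statistic can be sketched with a small finite population. The heights below are hypothetical values invented for this illustration, and the fixed seed is only there to make the sketch repeatable.

```python
import random
from statistics import mean

# Hypothetical finite population: heights in inches, invented for illustration.
population = [60, 62, 63, 64, 65, 65, 66, 67, 68, 70, 71, 72]

# Parameter: summarizes every element of the population.
population_mean = mean(population)

# Statistic: summarizes only the elements that ended up in the sample.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 4)
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

A different sample would generally give a different statistic, while the parameter stays fixed for the population.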

Case Study 1.4

1. Name the four variables.

   1. Status: Did most of your family eat dinner together last night?
   2. Importance: How important to you is eating dinner with your family?
   3. Number of dinners: In the last seven days, how many evenings did most of your family eat dinner together?
   4. Length of dinner: How long would you say dinner usually lasts when you eat together?

2. What kind of variable is each?

   1. Attribute (nominal)
   2. Attribute (ordinal)
   3. Numerical (discrete)
   4. Numerical (continuous)

Assignment: p. 14–15 Exercises 1.17, 1.21, 1.22–1.24

1.3 MEASURABILITY AND VARIABILITY

• Within a set of data, variation is expected.

• The discrepancies among the measured values are the variability.

• Variability is expected even if the measurement tool is precise.

• Statistical process control is utilized to try to control or reduce the variability.

Example: When you look at a carton of candy bars, it states that each of the 24 candy bars weighs 7/8 ounce, to the nearest 1/8 ounce. In reality, if you were to weigh each bar, not all of them would weigh exactly 7/8 ounce. The different weights show the variability.

Assignment: p. 16 Exercises 1.25–1.28
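The candy bar example in this section can be made concrete with a short sketch. The weights below are hypothetical (they were not measured or taken from the text); the point is only that values which all round to 7/8 ounce can still differ from one another.

```python
from statistics import mean, stdev

# Hypothetical weights (ounces) for 8 bars from one carton; invented data.
weights = [0.875, 0.880, 0.869, 0.872, 0.879, 0.874, 0.871, 0.878]
label = 7 / 8

spread = max(weights) - min(weights)  # range: largest minus smallest
print(f"label weight: {label} oz")
print(f"mean weight:  {mean(weights):.4f} oz")
print(f"range:        {spread:.3f} oz")
print(f"std. dev.:    {stdev(weights):.4f} oz")

# Every bar rounds to 7/8 oz at the nearest 1/8 oz, yet the bars still differ.
rounded = [round(w * 8) / 8 for w in weights]
```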


1.4 DATA COLLECTION

Basic Definitions:

1. Biased Sampling method – A sampling method that produces values which systematically differ from the population being sampled. An unbiased sampling method is one that is not biased.

a. Convenience sample – when a sample is selected from elements of a population that are easily accessible.

b. Volunteer sample – results collected from those elements of the population that choose to contribute the needed information on their own initiative.

2. Experiment - The investigator controls or modifies the environment and observes the effect on the variable under study.

3. Observational Study – The investigator does not modify the environment and does not control the process being observed.

Case Study 1.5

1. Was this study an experiment or observational study? Observational; the investigator cannot control the weather.

Case Study 1.6

1. What is the population of interest? All adults who deal with stockbrokers.

2. Was this investigation an experiment or an observational study? Experiment

4. Surveys – often are observational studies of people.

5. Census – a 100% survey. It is compiled if every element of the population can be listed and observed.

6. Sampling Frame – A list of the elements belonging to the population from which the sample will be drawn. The sampling frame should be representative of the population.

a. Sample design – the selection process used to select the sample elements from the sampling frame once a representative sampling frame has been established.

i. Judgment samples – samples that are selected on the basis of being “typical”.

ii. Probability samples – samples in which the elements to be selected are drawn on the basis of probability. Each element in a population has a certain known probability of being selected as part of the sample.

7. Simple Random Sample – a sample selected in such a way that every element in the population has an equal probability of being chosen.
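A minimal sketch of drawing a simple random sample, assuming the sampling frame is available as a Python list. The helper name `simple_random_sample` and the frame of ID numbers are hypothetical; `random.Random.sample` from the standard library draws without replacement, so every element has the same chance of being chosen.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; every element is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: ID numbers 1 through 50.
frame = list(range(1, 51))
chosen = simple_random_sample(frame, 5, seed=42)
print(chosen)
```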

Steps to follow for the collection of data:

1. Define the objectives of the survey or experiment.

2. Define the variable and the population of interest.

3. Define the data-collection and data-measuring schemes.

4. Determine the appropriate descriptive or inferential data-analysis techniques.


A convenience sample or volunteer sample, as indicated by its name, can often result in a biased sample. Data collection can be accomplished with experiments (the environment is controlled) or observational studies (the environment is not controlled). Surveys fall under observational studies.

Sample designs can be categorized as judgment samples (believed to be typical) or probability samples (each element in the population is given a certain chance of being selected). The random sample (each element has the same chance) is the most common probability sample. Methods (simply defined) to obtain a random sample include:

1. Random Number Table – A collection of random digits used primarily for one of two reasons:

a. To identify the source element of a population (the source of data).

b. To simulate an experiment.

2. Systematic – every kth element is chosen.

3. Stratified – a fixed number of elements from each stratum (group).

4. Proportional (quota) – the number of elements from each stratum is determined by its size.

5. Cluster – a fixed number of elements, or all elements, from certain strata.
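The four designs listed above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names, the population of 100 ID numbers, and its three strata are all hypothetical; the random starting point in the systematic sample is a common convention that the definition above does not spell out; and the cluster sketch keeps all elements of the chosen strata rather than a fixed number.

```python
import random


def systematic_sample(frame, k, rng):
    """Pick a random start among the first k elements, then take every kth element."""
    start = rng.randrange(k)
    return frame[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Take the same fixed number of elements from each stratum."""
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}


def proportional_sample(strata, total_n, rng):
    """Take from each stratum a number of elements proportional to its size.

    Rounding means the counts may not add up to total_n exactly in general.
    """
    population_size = sum(len(members) for members in strata.values())
    return {
        name: rng.sample(members, round(total_n * len(members) / population_size))
        for name, members in strata.items()
    }


def cluster_sample(strata, n_clusters, rng):
    """Choose some strata at random and keep all of their elements."""
    picked = rng.sample(sorted(strata), n_clusters)
    return {name: list(strata[name]) for name in picked}


# Hypothetical population of 100 IDs split into three strata of sizes 50, 30, and 20.
strata = {
    "A": list(range(0, 50)),
    "B": list(range(50, 80)),
    "C": list(range(80, 100)),
}
frame = [x for members in strata.values() for x in members]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

sys_s = systematic_sample(frame, 10, rng)
strat_s = stratified_sample(strata, 5, rng)
prop_s = proportional_sample(strata, 10, rng)
clus_s = cluster_sample(strata, 1, rng)

print("systematic:", sys_s)
print("stratified sizes:", {name: len(v) for name, v in strat_s.items()})
print("proportional sizes:", {name: len(v) for name, v in prop_s.items()})
print("cluster chosen:", list(clus_s))
```

With strata of sizes 50, 30, and 20 and a total sample of 10, the proportional design takes 5, 3, and 2 elements, whereas the stratified design takes 5 from each stratum regardless of size.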

Assignment: p. 21–22 Exercises 1.35, 1.38, 1.39, 1.44

1.5 COMPARISON OF PROBABILITY AND STATISTICS

Probability – asks you about the chance that something specific will happen when you know the possibilities. (The population is known.) Example: The likelihood that heads will occur when a coin is tossed.

Statistics – asks you to draw a sample, describe it, and then make inferences about the population based on the information found in the sample. Example: The weights of 35 babies are studied to estimate weight gain in the first month after birth.
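The contrast between the two examples can be sketched in code. In the probability half, the possibilities for a coin are known, so the chance of heads is computed directly. In the statistics half, the weight gains are simulated stand-ins for a sample of 35 babies: the mean of 2.0 pounds and the spread of 0.5 pounds used to generate them are assumptions made only so the sketch runs, not real data.

```python
import random
from fractions import Fraction
from statistics import mean

# Probability: the possibilities are known, so the chance follows directly.
outcomes = ["heads", "tails"]
p_heads = Fraction(outcomes.count("heads"), len(outcomes))
print("P(heads) =", p_heads)

# Statistics: only a sample is available; describe it, then infer about the population.
rng = random.Random(7)  # fixed seed so the sketch is repeatable
# Simulated first-month weight gains in pounds for 35 babies (hypothetical, not real data).
gains_lb = [round(rng.gauss(2.0, 0.5), 2) for _ in range(35)]
estimated_mean_gain = mean(gains_lb)
print(f"estimated mean gain from the sample: {estimated_mean_gain:.2f} lb")
```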

Assignment: p. 23 Exercises 1.47 & 1.48