PPAL 6200 Research Methods and Info Systems Intro and Chapter 1


Page 1: PPAL 6200 Research Methods and Info Systems

PPAL 6200 Research Methods and Info Systems

Intro and Chapter 1

Page 2: PPAL 6200 Research Methods and Info Systems

Class Outline

• Intro to the Course

• Discussion of Software and Technical Issues

• Break

• Describing Data “Distributions” with Graphs

Page 3: PPAL 6200 Research Methods and Info Systems

Introduction to the Course

• What you can expect to learn in this class

– A framework for conducting and evaluating empirical research

– A framework for conducting and evaluating statistical research

– The challenges facing those who must deal with information systems as part of their jobs

Page 4: PPAL 6200 Research Methods and Info Systems

• What you should not expect to learn in this class

– A professional capacity to conduct statistical work. At best you will be prepared to learn more about how to undertake statistical work (if you choose to do so) and to be a “knowledgeable consumer” of research prepared using statistics.

Page 5: PPAL 6200 Research Methods and Info Systems

• Some Key Concepts to Start Us Off (source, unless noted: Moore 2009)

– Data
• Numbers with a context (xxiv). The context, including how data is collected, can alter results.

– Variable
• An empirical property that can take on two or more values (Frankfort-Nachmias & Nachmias 1996:50). Don’t get suckered in by small and rapid changes; look at the big picture (xxvii).

– Case
• An individual, event or other thing for which we have data

– Measurement
• The assignment of numbers to objects, events or variables according to rules (ibid: 156-157)

Page 6: PPAL 6200 Research Methods and Info Systems

– Levels of Measurement
• Nominal, Ordinal, Interval, Ratio

– Validity
• Are you measuring what you think you are measuring?

– Reliability
• Are you measuring it accurately and consistently?

– Spuriousness
• Is there something else involved? Beware the lurking variable (xxvii)

– Statistics
• The science of learning from data (xxiv)

Page 7: PPAL 6200 Research Methods and Info Systems

The Book Title Says It All…

• This is a class in the “basic practice of statistics” with a little bit of practical advice thrown in regarding management of information systems

• Inside the front cover of the book is a wonderful set of flow-through figures that show how one can go about statistical thinking in a disciplined manner, along with three four-step plans to guide your work

Page 8: PPAL 6200 Research Methods and Info Systems

Some software and technical issues

• For this portion of the class we will quickly review my website, then leave PowerPoint to look at the electronic resources available there to assist you

Page 9: PPAL 6200 Research Methods and Info Systems

www.yorku.ca/dcohn/PPAL-6200.html

Page 10: PPAL 6200 Research Methods and Info Systems

The Secure Website

Page 11: PPAL 6200 Research Methods and Info Systems

Please note: the secure website will look different for you, as I have access to page design resources that you will not see

• We will now leave PowerPoint to look at these resources

Page 12: PPAL 6200 Research Methods and Info Systems

Describing Data Distributions with Graphs

• As the introductory sections of the book noted, you really cannot go wrong by beginning your work with a visualization of each individual variable in your data (and, on occasion, a plot of a variable against another variable such as time).

• The distribution tells you what values a variable takes and how often it does so
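
To make "what values a variable takes and how often" concrete, here is a minimal sketch that tabulates a distribution as counts and percentages. The `distribution` helper and the `majors` data are hypothetical, invented only for illustration.

```python
from collections import Counter

def distribution(values):
    """Return {value: (count, percent)} for a single variable."""
    counts = Counter(values)
    total = len(values)
    return {v: (n, 100 * n / total) for v, n in sorted(counts.items())}

# Hypothetical data: the declared major of ten students (a nominal variable).
majors = ["Policy", "Law", "Policy", "Economics", "Law",
          "Policy", "Policy", "Economics", "Law", "Policy"]

for value, (count, percent) in distribution(majors).items():
    print(f"{value:<10} {count:>2} {percent:5.1f}%")
```

The same table is what a pie chart or bar chart would then draw.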

Page 13: PPAL 6200 Research Methods and Info Systems

Ways we can Visualize and Explore Data

• Exploratory analysis is not meant to let us reach any deep conclusions. It is meant to help us better understand the data set and the relationships within it

• We want to look both for an overall pattern (consistencies) and for deviations from it (often called outliers)

• Tables

– Tables are effective tools for visualizing data, provided that we do not have too many variables or too many cases

• At a certain point we need to depict our data graphically to make it understandable as a snapshot

Page 14: PPAL 6200 Research Methods and Info Systems

Which Graph?

• The graphic depictions we employ are dependent on:

– The type of data we have

• Level of Measurement

• Whether Stationary or Chronological

Page 15: PPAL 6200 Research Methods and Info Systems

Some Common Graphs

• Pie Chart (good for showing percentages when a nominal or ordinal variable has few categories)

Page 16: PPAL 6200 Research Methods and Info Systems

Percentage of Students Picking a Given Major

Page 17: PPAL 6200 Research Methods and Info Systems

• Bar Charts are equally useful for nominal and ordinal variables but have the benefit of allowing more flexibility

Page 18: PPAL 6200 Research Methods and Info Systems

Foreign Born Population of US States by Percentage

Page 19: PPAL 6200 Research Methods and Info Systems

Histograms

• Histograms can be confusing because they sometimes look like Bar Graphs. In fact, you can make one by carefully specifying a Bar Graph. However, they are really quite different.

• They are meant for use with Interval and Ratio data, where there is a lot of variability among cases because there are so many possible values for the data

Page 20: PPAL 6200 Research Methods and Info Systems

• Therefore we have to “group the data” to a certain extent so that we can represent it

• A histogram shows the percentage of cases whose scores fall within the group represented by each bar
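
The grouping step can be sketched as follows. This is only an illustration under stated assumptions: the `histogram_percentages` helper, the equal-width classes, and the `scores` data are all hypothetical and are not taken from the book or from the course software.

```python
def histogram_percentages(values, start, width):
    """Group values into equal-width classes [start, start + width), ...
    and return (lower, upper, percent of cases) for each class."""
    n_classes = int((max(values) - start) // width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[int((v - start) // width)] += 1
    return [(start + i * width, start + (i + 1) * width, 100 * c / len(values))
            for i, c in enumerate(counts)]

# Hypothetical interval-level data: test scores for twenty cases.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 74,
          75, 78, 79, 81, 83, 85, 88, 90, 94, 97]

for lower, upper, percent in histogram_percentages(scores, start=50, width=10):
    # One text "bar" per class; each # stands for five percent of cases.
    print(f"{lower:>3}-{upper:<3} {'#' * round(percent / 5)} {percent:.0f}%")
```

Each bar covers a range of values rather than a single category, which is the key difference from a bar chart.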

Page 21: PPAL 6200 Research Methods and Info Systems
Page 22: PPAL 6200 Research Methods and Info Systems

• You will notice that this graph looks a bit different from the one in the book.

• This is because the scaling that my software used is a bit different from that used by the person who did the examples in the book.

Page 23: PPAL 6200 Research Methods and Info Systems
Page 24: PPAL 6200 Research Methods and Info Systems

This brings up a good point

• Be careful how you manipulate data. As you will see in the next section of the talk, these two graphs portray the same information, but one will give us a more interesting result.

Page 25: PPAL 6200 Research Methods and Info Systems

Describing a Distribution

• Once we get to developing histograms we can start to evaluate our data in a number of interesting ways (Shape, Centre, Spread)

– What is the shape of the plot? Is it single-peaked or multi-peaked?

– Where is the peak? Is it at the centre or off-centre (skewed)? When the tail of a distribution stretches out further on one side, we say it is skewed to that side. This is confusing, because the skew is named for the side with the long tail, not the side with the peak.

– What about outliers? Any unusually high or low scores?
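
One rough way to check the direction of skew numerically is to compare the mean with the median, since a long tail tends to pull the mean toward it. The sketch below is a rule of thumb only, not a formal test; the `skew_direction` helper and the `incomes` data are hypothetical.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule of thumb: the mean is pulled toward the long tail."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"   # long tail of high values
    if m < med:
        return "left"    # long tail of low values
    return "symmetric"

# Hypothetical data: ten incomes in thousands, with one unusually high case.
incomes = [30, 32, 35, 36, 38, 40, 41, 45, 60, 150]

print(mean(incomes), median(incomes), skew_direction(incomes))
```

Here the single high income drags the mean well above the median, so the distribution is skewed to the right even though most cases sit at the low end.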

Page 26: PPAL 6200 Research Methods and Info Systems

As you can see below: regrouping the data makes one figure more symmetrical than the other
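
The effect of regrouping can also be seen in numbers. The sketch below counts the same hypothetical values under two different class widths; the `class_counts` helper and the `data` list are invented for illustration and are not the figures from the slides.

```python
def class_counts(values, start, width):
    """Count how many values fall into each equal-width class from `start`."""
    n_classes = int((max(values) - start) // width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[int((v - start) // width)] += 1
    return counts

# Hypothetical data, chosen so that the two groupings disagree.
data = [18, 19, 20, 21, 22, 23, 24, 25, 30, 31]

wide = class_counts(data, start=10, width=10)    # classes 10-20, 20-30, 30-40
narrow = class_counts(data, start=10, width=5)   # classes 10-15, 15-20, ...

print(wide, wide == wide[::-1])        # wide classes look symmetrical
print(narrow, narrow == narrow[::-1])  # narrow classes do not
```

Neither picture is wrong, which is why the choice of grouping should be reported along with the graph.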

Page 27: PPAL 6200 Research Methods and Info Systems

A stemplot is not so elegant

• Granted, it is not so elegant, but it does allow us to figure out what is happening inside those bars.

Page 28: PPAL 6200 Research Methods and Info Systems


Stemplots (Stem-and-Leaf Plots)

• For quantitative variables

• Separate each observation into a stem (first part of the number) and a leaf (the remaining part of the number)

• Write the stems in a vertical column; draw a vertical line to the right of the stems

• Write each leaf in the row to the right of its stem; order leaves if desired
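
The steps above can be sketched in a few lines. This is a minimal illustration that assumes non-negative whole numbers, uses the last digit as the leaf, and always orders the leaves; the `stemplot` helper and the `ages` data are hypothetical.

```python
def stemplot(values):
    """Return the rows of a stem-and-leaf plot as strings.

    Stem = all digits but the last; leaf = the last digit.
    Empty stems are kept so that gaps in the data stay visible.
    """
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault(stem, []).append(leaf)
    lines = []
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(d) for d in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical data: ages of nine respondents.
ages = [23, 25, 31, 34, 34, 38, 42, 47, 61]

print("\n".join(stemplot(ages)))
```

Unlike a histogram, every original value can still be read back from the plot.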

Page 29: PPAL 6200 Research Methods and Info Systems


Weight Data

Page 30: PPAL 6200 Research Methods and Info Systems

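BPS - 5th Ed. Chapter 1

Weight Data: Stemplot (Stem & Leaf Plot)

Key: 20|3 means 203 pounds. Stems = 10’s, Leaves = 1’s.

[Stemplot under construction: stems 10 through 26 in a vertical column, with the first three observations (192, 152, 135) entered as leaf 2 on stem 19, leaf 2 on stem 15, and leaf 5 on stem 13.]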

Page 31: PPAL 6200 Research Methods and Info Systems


Weight Data: Stemplot (Stem & Leaf Plot)

Key: 20|3 means 203 pounds. Stems = 10’s, Leaves = 1’s.

[Stemplot under construction: stems 10 through 26 in a vertical column, with the first three leaves (2, 2 and 5) entered.]

Page 32: PPAL 6200 Research Methods and Info Systems


Weight Data: Stemplot (Stem & Leaf Plot)

10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0

Key: 20|3 means 203 pounds. Stems = 10’s, Leaves = 1’s.
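
Because a stemplot keeps every value, the weights on this slide can be read back out and summarized. The sketch below transcribes the stems and leaves from the slide; the `stemplot_to_values` helper is a hypothetical name introduced for illustration.

```python
from statistics import median

# Stem -> leaves, transcribed from the slide (key: 20|3 means 203 pounds).
weight_stemplot = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 23: "", 24: "", 25: "", 26: "0",
}

def stemplot_to_values(plot):
    """Rebuild the sorted observations: value = stem * 10 + leaf."""
    return sorted(stem * 10 + int(leaf)
                  for stem, leaves in plot.items()
                  for leaf in leaves)

weights = stemplot_to_values(weight_stemplot)
print(len(weights), min(weights), median(weights), max(weights))
```

The empty stems 23 to 25 show up as a gap between 220 and 260 pounds, which is the kind of unusually high score the outlier question asks about.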

Page 33: PPAL 6200 Research Methods and Info Systems


Extended Stem-and-Leaf Plots

If there are very few stems (when the data cover only a very small range of values), then we may want to create more stems by splitting the original stems. In other words, you can have more than one stem with the same base number.

Page 34: PPAL 6200 Research Methods and Info Systems


Extended Stem-and-Leaf Plots

Example: if all of the data values were between 150 and 179, then we may choose to use the following stems:

15
15
16
16
17
17

Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”).
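
Splitting stems this way can be sketched as follows, assuming non-negative whole numbers. The `split_stemplot` helper and the `heights` data are hypothetical and serve only to illustrate the rule.

```python
def split_stemplot(values):
    """Stem-and-leaf rows with each stem split in two:
    leaves 0-4 on the first row, leaves 5-9 on the second."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault((stem, leaf >= 5), []).append(leaf)
    lines = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for high_leaves in (False, True):
            leaves = "".join(str(d) for d in rows.get((stem, high_leaves), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical data: fourteen values between 150 and 179.
heights = [151, 153, 154, 156, 158, 159, 160,
           162, 165, 167, 171, 173, 174, 178]

print("\n".join(split_stemplot(heights)))
```

With only three unsplit stems these values would sit in three long rows; six rows give a clearer view of the shape.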

Page 35: PPAL 6200 Research Methods and Info Systems
Page 36: PPAL 6200 Research Methods and Info Systems

Thinking about these Graphs

• When we look at these graphs we have to keep in mind the questions we started with

– Shape

– Centre (other than time-series)

– Outliers