
2019-01-12

1

Week 2 Lecture Notes

PSYC2021: Winter 2019

Psychological Measurement

• Data collection can be thought of as a kind of measurement.

For example:

• Measuring something about human behavior (e.g., people’s attitudes toward learning)

• Measuring something about the human mind (e.g., IQ scores, personality scores, or reaction-time scores)

• What is a measurement?

• Some way of assigning numbers, labels, or some other kind of well-defined description to things

For example:

• Age of a person (measured in years): 0, 1, 2, …, up to at most about 150, since no human has lived that long

(see the Wikipedia article “Oldest people”: https://en.wikipedia.org/wiki/Oldest_people )

• A person’s opinion regarding “learning new things is fun” on a Likert scale of

“1 = Strongly Disagree”, “2 = Disagree”, “3 = Neither Agree nor Disagree”, “4 = Agree”, “5 = Strongly Agree”


Constructs

• Constructs are internal attributes or characteristics that cannot be directly observed but are useful for describing and explaining behavior.

• A construct is the thing that you are trying to measure (e.g., age, attitudes, opinions).

For example: The Survey of Attitude Towards Statistics (SATS-36©) encompasses six attitudinal constructs:

• Affect: Students’ feelings concerning statistics

• Cognitive Competence: Students’ capability in doing statistics

• Difficulty: Students’ perception of the difficulty of statistics

• Value: Students’ valuing (relevance, worth, and usefulness) of statistics

• Interest: Students’ interest in learning statistics

• Effort: Students’ willingness to put effort into learning statistics

The link to this survey (two copyrighted versions): https://www.evaluationandstatistics.com/

Other examples include anxiety, intelligence, etc.

Measure

A measure is the method or tool that you use to make observations.

For example:

• A question/statement in a survey (for instance, one of the statements in the SATS-36©)

• A behavioral observation (e.g., children’s behavior in a playground)

• A brain scan


Operationalisation

• Operationalisation refers to taking a meaningful but somewhat vague concept and turning it into a precise measure.

• It is the logical connection between the measure and the theoretical construct, or the process by which we try to derive a measure from a theoretical construct.

For example:

• We can measure students’ attitudes toward statistics by administering the Survey of Attitude Towards Statistics (SATS-36©), which encompasses six attitudinal constructs: Affect, Cognitive Competence, Difficulty, Value, Interest, and Effort.

• We can measure a person’s intelligence based on their performance on an IQ test.

Operationalisation

Important things to keep in mind when you are collecting quantitative data:

• Be precise about what you want to measure.

• Determine what method you will use to measure it.

• Will you use self-report, or will you look up official records?

For example:

I studied students’ attitudes toward statistics. I administered the SATS-36© and linked students’ responses to their records (e.g., university GPA, program of study) from the Office of the Registrar.

• Define the set of possible values that the measurement can take.

• Note that values do not always have to be numerical (but they often are).

• Think about the unit of measurement (e.g., age in years, attitude scores on a scale of 1 to 5).


Notion of Variation and Variables

• Variation is the of statistics.

• Variation is the foundation of sound reasoning about the data.

• Statistical methods help explain the variation in the data; we model the variation in the data.

• A variable is a characteristic of an individual case (experimental or observational unit) in the population.

• Any characteristic we can measure for each subject/participant/case is a variable.

• A variable can take different values for different cases.

• A variable is the outcome of a psychological measurement.

Examples of Variables and Their Measurement

Consider a database of undergraduate university students.

• Individual cases (subjects): Students of the university.

• Variables: Program of Study, Year of Study, Sex, University GPA, High School GPA, etc. These variables take values like the following:

• Program of Study (Psychology, Visual Arts, Kinesiology, etc.)

• There is student-to-student variation in program of study: different students are in different programs.

• Year of Study (1, 2, 3, 4)

• There is student-to-student variation in year of study.

• Sex: Male, Female (sometimes we assign the value 1 if Male and 0 if Female)

• There is student-to-student variation in sex.

• University GPA (ranges from 1.0 to 4.0)

• There is student-to-student variation in university GPA.


Classify Variables: Quantitative or Categorical

Quantitative variable:

• When the measurement scale has numerical values. They describe amount of something.

• These variables must be accompanied by their unit of measurement.

E.g., University GPA (ranges from 1.0 to 4.0)

E.g., Hours of study (0 to infinity!?)

• They may also arise from the process of counting.

E.g., The number of residents in the province of Ontario.

E.g., The number of siblings a person has.

Categorical variable:

• When the measurement scale is a set of categories.

• It determines what group or category individuals (cases, subjects, participants) belong to.

• Categories/levels are pre-determined for a categorical variable.

• Counting is a natural way to summarize and learn about a categorical variable.

• Often called qualitative variables: distinct categories differ in their qualities, not in their numerical magnitude.

E.g., Program of Study: Behavioral Sciences, Environmental Sciences, Life Sciences, Social Science, and so on.

E.g., Canadian Provinces: Ontario, British Columbia, Alberta, and so on.

Why Classify Variables as Quantitative or Categorical?

• For application of different statistical methods.

• For obtaining appropriate graphs and summary statistics.

Example of a Quantitative Variable:

• Income of Canadian Citizens (in thousands of dollars).

We might be interested in average income of all Canadian Citizens.

Graphical Display (we will learn in week 3): Histogram or Boxplot of distribution of income.

Example of a Categorical Variable:

• Canadian Provinces.

We might be interested in the number of Canadians living in each province (Count).

Graphical Display (we will learn in week 3): Bar chart or Pie Chart
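The contrast between the two examples above can be sketched in a few lines of Python. This is only an illustration: the records, the names `people`, `province_counts`, and `average_income` are hypothetical and do not come from the lecture data.

```python
from collections import Counter
from statistics import mean

# Hypothetical records: (province, income in thousands of dollars).
people = [
    ("Ontario", 52.0),
    ("Alberta", 61.5),
    ("Ontario", 48.0),
    ("British Columbia", 57.5),
    ("Ontario", 45.0),
]

provinces = [province for province, _ in people]
incomes = [income for _, income in people]

# Categorical variable: summarize by counting cases in each category.
province_counts = Counter(provinces)

# Quantitative variable: summarize with a numerical statistic such as the mean.
average_income = mean(incomes)

print(province_counts)
print(average_income)
```

Averaging the province labels would be meaningless, which is one practical reason for classifying a variable before choosing a summary.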


Convert a Quantitative Variable to a Categorical Variable

Simply break up the range of values into several intervals.

Example: Age

Distribution of on- and off-reserve First Nations people (single identity), by age group, 2011

Source: http://www.statcan.gc.ca/pub/89-653-x/2016010/tbl/tbl01-eng.htm
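Breaking a range of values into intervals might look like the following sketch. The cut-points and the function name `age_group` are assumptions chosen for illustration; they are not necessarily the age groups used in the Statistics Canada table.

```python
from collections import Counter


def age_group(age):
    """Map an age in years to a labelled interval (hypothetical cut-points)."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 15:
        return "0 to 14"
    if age < 25:
        return "15 to 24"
    if age < 65:
        return "25 to 64"
    return "65 and over"


# Hypothetical ages (quantitative) converted to age groups (categorical).
ages = [3, 17, 24, 25, 40, 64, 65, 81]
groups = [age_group(a) for a in ages]

# Once categorical, the variable is summarized by counts per group.
group_counts = Counter(groups)
print(group_counts)
```

The conversion discards information: two people aged 25 and 64 land in the same group, so the original ages cannot be recovered from the categories.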

Scales of Measurement

• Recall: A measurement involves assigning individuals or events to categories.

For example:

• Student’s status (Undergraduate, Graduate)

• Hours of study (1, 2, 3, …)

• The categories used to measure a variable, and the relationship between those categories, determine the type of scale:

• Nominal scale

• Ordinal scale

• Interval scale

• Ratio scale


Nominal Scale

A nominal scale consists of a set of categories that have different names.

• There is no particular relationship between the different possibilities (categories, levels, classes).

• We cannot say one outcome/possibility is “bigger” or “better” than any other one.

• It does not make sense to average levels/categories/classes.

• There is no order in which we must list the options (categories, classes, levels).

• Example: Primary mode of transportation to school.

• Categories: automobile, bus, subway, bicycle, walk.

Ordinal Scale

An ordinal scale consists of a set of categories that are organized in a natural, meaningful, ordered sequence.

• The ordered levels of a categorical variable form an ordinal scale.

• Measurements on an ordinal scale rank scores (observations) in terms of size or magnitude.

Examples:

• Socio-economic class with categorical scale: upper, middle, lower.

• Political philosophy with categorical scale: very liberal, moderately liberal, slightly liberal, slightly conservative, moderately conservative, very conservative.


Interval Scale

An interval scale consists of a series of equal intervals (equal distances).

• Numerical values are meaningful.

• The differences between the numbers are interpretable.

• The distance is meaningful.

• Note: The zero point on an interval scale is arbitrary and does not necessarily indicate a zero amount of the variable being measured.

Examples:

• Annual income (in thousands of dollars).

• The interval (distance) between $30,000 and $40,000 is $10,000.

Purpose: We can compare outcomes in terms of how much larger or how much smaller one is than the other (e.g., in which interval an annual income falls).

• Year (2000, 2001, 2002, …)

• Number of children in a family (0, 1, 2, …)

Ratio Scale

A ratio scale is an interval scale with a true zero point.

• 0 means something (e.g., 0 seconds means no time at all, 0 correct answers on a true/false test).

• It is OK to multiply and divide, as well as to add and subtract, measurements.

• Ratios of numbers do reflect ratios of magnitude.

• The differences between the numbers are interpretable.

Example: Response Time (measured in minutes; the amount of time someone takes to solve a problem or answer a question, used because it is an indicator of how difficult the task is).

• Suppose person A takes 3 minutes to solve a problem.

• Suppose person B takes 2 minutes to solve the same problem.

➢ Person A takes one minute (3 – 2 = 1) longer than person B to solve the problem.

➢ Person A takes 3/2 = 1.5 times as long as person B to solve the problem.

• Suppose person C takes 3 minutes to solve the same problem.

➢ Person A takes the same time as person C to solve the problem: 3 – 3 = 0 difference.

➢ Person A takes 3/3 = 1 times as long as person C to solve the problem.
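The arithmetic in the response-time example, and the contrast with an interval-scale variable such as calendar year, can be written out as a short Python sketch. The variable names are illustrative, and the years 1000 and 2000 are assumed values chosen only to make the point.

```python
# Response times in minutes (ratio scale: 0 minutes means no time at all).
time_a = 3.0
time_b = 2.0

difference = time_a - time_b  # person A takes 1 minute longer than person B
ratio = time_a / time_b       # person A takes 1.5 times as long as person B

# Calendar years (interval scale: year 0 is an arbitrary reference point).
year_1 = 1000
year_2 = 2000

year_gap = year_2 - year_1    # a gap of 1000 years is meaningful

# year_2 / year_1 equals 2.0 numerically, but the year 2000 is not
# "twice as much" of anything as the year 1000, so on an interval scale
# the ratio has no interpretation even though the difference does.
print(difference, ratio, year_gap)
```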


Discrete versus Continuous Variables

Discrete Variables:

• Any variable with a finite (countable) number of possible values is discrete.

• That is, it is restricted to whole, countable numbers.

• It consists of separate, indivisible categories.

• There is nothing in the middle (between two values); no values exist between two neighbouring categories.

Examples:

• Number of students at York University.

• Number of students enrolled in the Psychology program at York University.

• Number of times a person dined out in 2018.

Continuous Variables:

• Any variable with an infinite continuum (no ending number) of possible real-number values (e.g., a number with decimal points).

• That is, there are an infinite number of possible values that fall between any two values.

• For any two values that you can think of, it is logically possible to have another in between.

Examples:

• Time (in minutes) taken to finish a test

• Response time (in minutes)

• Test scores (0 to 100)

• GPA (1.0 to 4.0)

Discrete versus Continuous Variables

• ALL categorical variables (nominal or ordinal measurement scales) are discrete, having a finite set of categories.

Examples:

• Program of Study: (Behavioral Sciences, Environmental Sciences, Life Sciences, Social Science) – this is a discrete categorical variable with a nominal scale of measurement.

• Year of Study (1, 2, 3, 4) – this is a discrete categorical variable with an ordinal scale of measurement.

• Quantitative variables (measured on an interval or ratio scale) could be either discrete or continuous.

Examples:

• Annual income (in thousands of dollars) – this is a continuous quantitative variable with an interval scale.

• Test score (0 to 100) – this is a continuous quantitative variable with an interval scale.

• Time (in minutes) taken to finish a test – this is a continuous quantitative variable with a ratio scale.

• Height (in centimeters) – this is a continuous quantitative variable with a ratio scale.

• Number of pets in a family (0, 1, …) – this is a discrete quantitative variable with an interval scale.

• Number of siblings (0, 1, …) – this is a discrete quantitative variable with an interval scale.


Likert Scale

A Likert scale is typically used in surveys.

For example, the Survey of Attitude Towards Statistics (SATS-36©) uses a 7-point Likert scale to measure students’ attitudes toward statistics:

“1 = Strongly Disagree”, …, “4 = Neither Agree nor Disagree”, …, “7 = Strongly Agree”

• What kind of variables are they?

• They are discrete variables, since we cannot give a response of 4.5 (in the example above).

• They are not nominal, since the items are ordered.

• They could be treated as an ordinal scale.

Example: The difference between “strongly agree” and “agree” may not be perceived as the same as the difference between “agree” and “neither agree nor disagree”.

• The position of ordinal scales in the quantitative-qualitative (categorical) classification is fuzzy.

• They could resemble interval scales, since most participants appear to take the whole “on a scale of 1 to 7” part fairly seriously and tend to act as if the differences between adjacent options (for example, between “strongly agree” and “agree”, and between “agree” and “neither agree nor disagree”) are fairly similar to one another.

• Each level has a greater or smaller magnitude than another level.

• Often the methods used for their statistical analysis are the same as those for the nominal scale of measurement.

• We can conduct a sensitivity analysis and check whether conclusions would differ in any significant way for other choices of scores.

Quantitative Aspects of Likert Scale

Consider undergraduate students’ responses to the statement “I will like statistics” on a 7-point Likert scale in the Survey of Attitude Towards Statistics (SATS-36©). We can obtain the extent of students’ agreement with this statement.

• We might want to treat the Likert scale (1 to 7) as an interval scale (the distance between adjacent options is 1) to compute a mean score for an item (e.g., “I will like statistics”).


Asal’s Example: Treating a Likert Scale as a Quasi-Interval Scale

I obtained a statistic (the average) for students’ reported willingness to spend a great deal of effort to learn statistics: 5.88 on a 7-point Likert scale (from “1”, strong disagreement, through “4”, neutral, to “7”, strong agreement).

Since this value of 5.88 is above 4 (the neutral response), I described students as reporting, on average, a great deal of effort to learn statistics in their statistics course.
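The kind of calculation behind such a statement can be sketched as follows. The ten responses are hypothetical and are not the data that produced the 5.88 above; the names `responses`, `item_mean`, and `item_median` are likewise illustrative. The median is included as an assumption-light alternative that relies only on the ordering of the options.

```python
from statistics import mean, median

NEUTRAL = 4  # midpoint of the 7-point scale ("Neither Agree nor Disagree")

# Hypothetical responses of ten students to one item (1 to 7).
responses = [7, 6, 6, 5, 7, 4, 6, 7, 5, 6]

# Treating the scale as (quasi-)interval: equal distance between options.
item_mean = mean(responses)

# Treating the scale as ordinal: only the ordering of options is used.
item_median = median(responses)

above_neutral = item_mean > NEUTRAL

print(item_mean, item_median, above_neutral)
```

If the mean and the median lead to the same reading (here, both sit well above the neutral point), the conclusion does not depend much on the quasi-interval assumption.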

Assessing the Reliability of a Measurement

Reliability refers to the repeatability or consistency of a measurement.

• How precisely you measure something.

There are different ways in which we might measure reliability:

• Test-retest reliability:

o This relates to consistency over time.

o If we repeat the measure at a later time, do we get the same answer/response/outcome?

• Parallel-forms reliability:

o This relates to consistency across theoretically equivalent measurements.

o Example: If an educator used a different set of instruments to measure their students’ course performance, would it give the same outcome?

• Internal consistency reliability:

o If the measurement is constructed from many different parts that perform a similar function, do the individual parts tend to give similar answers?

o Example: In a personality questionnaire, the result is added up across several questions.
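The notes do not say which statistic is used to quantify reliability. One common choice for test-retest reliability, assumed here purely for illustration, is the Pearson correlation between scores from the two occasions. The helper `pearson_r` and the scores below are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: raises ZeroDivisionError if either list has no variation.
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five people on two occasions.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 14, 19, 21]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```

A value close to 1 would indicate that people keep roughly the same relative standing when the measure is repeated.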


Roles of Variables: Predictors and Outcomes


• Response variable (outcome variable/measure):

A variable on which the comparisons are made at levels of the other variable(s).

• Explanatory variable (predictor variable):

A variable that is proposed to explain the variation in the outcome variable in the analysis.

• A variable that may contribute to explaining the differences in the response variable in the analysis.

• An analysis of association between two variables:

o An analysis of association between two variables is called bivariate analysis.

• An association exists when certain values of one variable tend to go with certain values of the other.

• We need to define the response variable and the explanatory variable.

o An analysis of association studies how the outcomes on the response variable depend on, or are explained by, the values of the explanatory variable.

Association Between Two Variables

Consider the following questions:

➢ What is the relationship between educational attainment and gender?

➢ What is the relationship between attitudes toward learning and gender?

➢ What is the relationship between the percentage of births to teenage mothers and the poverty rate?

The above questions can also be phrased as follows:

➢ Does educational attainment depend on gender?

➢ Do attitudes toward learning depend on gender?

➢ Does the percentage of births to teenage mothers depend on the poverty rate?

Or as follows:

➢ Can gender predict educational attainment?

➢ Can gender predict attitudes toward learning?

➢ Can the poverty rate predict the percentage of births to teenage mothers?


Correlational Study

In a correlational (observational) study, the researcher observes individual cases and measures variables of interest but has no experimental control over the subjects and does not attempt to influence the response.

• Such studies may help identify variables that are related to an outcome, but they do not prove cause and effect.

• They are commonly used and are valuable for discovering trends and possible relationships.

• However, correlational studies cannot demonstrate a causal relationship: there is always the possibility that some unmeasured variables (hidden/lurking variables) could be responsible for the pattern observed in the data.

A lurking variable is usually thought of as a variable associated with both the response variable and the explanatory variable, which makes it appear that the explanatory variable may be causing the response variable.
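A small simulation can make the lurking-variable idea concrete. Everything in this sketch is an assumption made for illustration: the data are simulated, `z` plays the role of the lurking variable, and `x` and `y` are each built from `z` plus independent noise, so `x` has no causal effect on `y` by construction.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rng = random.Random(2021)  # fixed seed so the sketch is reproducible
n = 2000

z = [rng.gauss(0, 1) for _ in range(n)]        # lurking variable
noise_x = [rng.gauss(0, 1) for _ in range(n)]  # part of x unrelated to z
noise_y = [rng.gauss(0, 1) for _ in range(n)]  # part of y unrelated to z

x = [zi + e for zi, e in zip(z, noise_x)]      # x depends on z, not on y
y = [zi + e for zi, e in zip(z, noise_y)]      # y depends on z, not on x

r_xy = pearson_r(x, y)                  # clearly positive (about 0.5)
r_noise = pearson_r(noise_x, noise_y)   # near 0 once z is taken out

print(round(r_xy, 2), round(r_noise, 2))
```

An observer who measured only `x` and `y` would see a clear association and might be tempted to read it causally, even though the whole pattern comes from `z`.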

Example of a Correlational Study from OECD-iLibrary

What does age have to do with skills proficiency?

http://www.oecd-ilibrary.org/education/what-does-age-have-to-do-with-skills-proficiency_5jm0mq158zjl-en

“The Survey of Adult Skills, a product of the OECD Programme for the International Assessment of Adult Competencies (PIAAC), shows that there are substantial differences in proficiency in information-processing skills across age groups. Proficiency is highest among adults in their late 20s and early 30s. From that point, proficiency declines with age. Depending on whether and how one tries to account for differences in individuals’ socio-demographic characteristics, 55-65 year-olds can be estimated to score between 18 and 32 points below 25-34 year-olds in literacy” (OECD, 2012).


“As the survey measures the proficiency of adults of different ages at a single point in time, the differences in proficiency related to age may reflect the impact of other factors in addition to biological ageing (the so-called “age effects”). For example, the quantity and quality of education received by individuals born in different years may vary considerably – thus generating so-called “cohort effects”. Yet, the results from the Survey of Adult Skills are broadly similar to those of other studies that are better able to separate age and cohort effects: ageing does have an impact on skills proficiency” (OECD, 2012).


Correlation Does NOT Prove Causation

• In many studies of the relationship between two variables, the goal is to establish that changes in the explanatory variable cause changes in the response variable.

• Even a strong association between two variables does not necessarily imply a causal link between the variables.

Some explanations for an observed association:

• The dashed double-arrow lines show an association.

• The solid arrows show a cause-and-effect link.

• The variable 𝑥 is explanatory, 𝑦 is the response, and 𝑧 is a lurking variable.

• In picture (c), the effects of 𝑥 and 𝑧 are confounded, making it difficult to determine which (if either) of the two confounded variables is the causal culprit. Maybe both have a causal link to the 𝑦-variable, maybe neither. Maybe some other undiscovered lurking variable is the real culprit, and its hidden effect is confounded with the 𝑥-variable.

Note Regarding Correlational Studies

• If an apparent trend is discovered in a correlational study, avoid using terms that have a causal tone.

Avoid these words/terms: “impact”, “influence”, “effect”, “lead to”

• Instead of the above terms, use wordings like:

“connected to”, “related to”, “associated with”, “contribute to”


Experimental Research Methods

• When our goal is to understand cause and effect, designed experiments are the only source of fully convincing data.

• Changes in one variable are responsible for causing differences in a second variable.

• One variable is manipulated, one variable is observed.

• The researcher manipulates one variable by changing its value from one level to another.

• Experimental methods compare the responses of subjects on some outcome measure, under different conditions.

• Those conditions are levels (treatments) of a predictor variable that can influence the outcome.

Elements of an Experimental Designed Study

• Response variable:

A variable of interest to be measured in the experiment.

• Typically, it is quantitative in nature.

• It is observed to assess the effect of treatment.

Example: Undergraduate students’ statistics outcomes.

• Predictor variable (Factor, Experimental variable, Manipulated variable):

A variable whose effect on the response variable is of interest to the researcher.

• Conditions that are manipulated prior to observing the outcome.

• Categorical factors (they are not measured on a numerical scale):

• For example, teaching method (regular method vs. an intervention such as an inverted/flipped classroom).

• They must have at least two levels (at least two groups, two categories).

• Quantitative factors are measured on a numerical scale.

• For example, the number of mathematics courses students completed prior to taking an introductory statistics course.


Elements of an Experimental Designed Study

• Experimental units:

The individual cases (subjects, objects) on which response variables are observed or measured.

For example, undergraduate students enrolled in an introductory statistics course.

• Factor levels:

Levels are the values of the factor (predictor variable) that are used in the experiment.

• They are usually non-numerical.

For example, teaching method with two levels: regular (control condition), intervention (treatment condition)

• They could be numerical.

For example, the number of mathematics courses completed prior to taking an intro statistics course.

• Treatments (conditions):

Factor-level combinations used in the experiment.

• Treatment condition: Subjects receive the experimental treatment.

• Control condition: Subjects do not receive the experimental treatment. Instead, they receive a neutral, placebo treatment.

• Purpose: To provide a baseline for comparison with the experimental treatment.

Experimental Research Methods

• In experimental methods, the researcher controls all aspects of the study.

• The researcher manipulates or varies the predictor variable and then allows the outcome variable to vary naturally.

• The idea is to deliberately vary the predictors to see if they have any causal effects on the outcomes.

• Thus, the researcher must exercise control over the research situation to ensure that other, extraneous (external) variables do not influence the relationship.

• Experimental designs need to ensure that there is no chance that something other than the predictor variable is causing the outcomes.

• Everything else is kept constant or is in some other way “balanced” to ensure that it has no effect on the results. This is achieved by “randomization”.

• Each participant has an equal chance of being assigned to each treatment condition.


Experimental Research Methods

• Good experimental designs use randomization to determine which treatment a subject receives.

• Subjects (experimental units) are randomly assigned to different groups, and then each group receives a different condition (treatment). That is, the researcher assigns different values of the predictor variable to different subjects.

For example:

• Treatments might be different drugs for treating some illness.

• Treatments might be different teaching interventions for improving some learning outcomes.

• This strategy tends to distribute participant characteristics evenly between the treatment groups so that neither group is noticeably, for example, smarter (or older, or faster).

• This strategy reduces bias and allows us to use statistical inference to make predictions.

• The researcher knows that the groups will roughly balance on variables (factors) that can affect the outcome measure (response variable).
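Random assignment as described above can be sketched in Python. This is one simple way to do it (shuffle, then deal subjects out in turn), offered as an assumption rather than as the procedure used in any particular study; the function `randomly_assign` and the subject labels are hypothetical.

```python
import random


def randomly_assign(subjects, conditions, seed=None):
    """Shuffle subjects and deal them out evenly across conditions."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, subject in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(subject)
    return groups


# Twenty hypothetical subjects, labelled S01 to S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = randomly_assign(subjects, ["control", "treatment"], seed=1)

for condition, members in groups.items():
    print(condition, len(members), members)
```

Because the order is shuffled before dealing, every subject has the same chance of landing in each condition, and the groups come out equal in size (ten each here).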

Example of Experimental Design Study

Polman, de Castro, and van Aken (2008) examined the effect of violent video games on 10-year-old boys. They randomly selected twenty 10-year-old boys and randomly assigned them to two groups. One group played a violent video game. The second group played a non-violent video game. After the game-playing session, the children went to a free play period and were monitored for aggressive behaviors (e.g., hitting, kicking, pushing, frightening, fighting, or teasing another child). The researchers assigned a score for each child’s aggressive behavior. They then compared the scores for the violent video game group with the scores for the non-violent video game group.

• Response variable (outcome measure): Aggressive behavior scores

• Predictor variable (factor): Video game

• Treatments (factor levels): Non-violent video game (control condition), violent video game (treatment condition)

• Experimental units: Twenty 10-year-old boys

Note: A systematic difference between the two groups provides evidence for a relationship between playing violent video games and aggressive behavior for 10-year-old boys.


Example of Experimental Design Study

37

• To be able to say that the difference in aggressive behaviour is caused by the amount of violence in the games,

the researcher must rule out any other possible explanation for the difference.

• That is, any other variables that might affect aggressive behavior must be controlled.

1. The researchers must ensure that participant variables such as age, sex, and intelligence do not differ from one group to another.

2. The researchers must ensure that the subjects in each group (treatment, control) are tested in the same environment (e.g., the same lighting, time of day, and weather conditions).

Problems with Confound(s)


If a research study allows more than one explanation for the results, the study is said to be confounded, because it is impossible to reach an unambiguous conclusion.

For example:

• If the non-violent video game group were primarily female and the violent video game group were primarily male, the difference between groups may have been caused by participants’ sex (e.g., if females tend to be less aggressive than males).

• If the non-violent video game group were tested in the morning and the violent video game group were tested in the evening, the researcher could not determine whether the difference in aggressive behavior was caused by the amount of violence in the game or by the time of day.


Control for Other Variables in the Effect of Amount of Violence in Video Games on Aggressive Behavior


1. Random assignment:

Participants could be randomly assigned for testing either in the morning or in the evening.

2. Use matching to ensure equivalent groups or equivalent environment:

The researchers could match the groups by ensuring that every group has exactly 60% females and 40% males.

3. Holding other variables constant:

Study 10-year-old boys only. That is, hold the age and sex of the subjects constant.
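The matching idea in point 2 can be sketched in Python by randomizing separately within each sex, so that both groups end up with the same proportions. The sample of 12 females and 8 males and the helper name `matched_assignment` are assumptions for illustration, not part of the lecture.

```python
import random


def matched_assignment(participants, key, seed=None):
    """Randomly split participants into two groups within each level of `key`,
    so both groups end up with the same composition on that variable."""
    rng = random.Random(seed)
    strata = {}
    for p in participants:
        strata.setdefault(p[key], []).append(p)

    group_a, group_b = [], []
    for members in strata.values():
        rng.shuffle(members)
        half = len(members) // 2
        group_a.extend(members[:half])
        group_b.extend(members[half:])
    return group_a, group_b


# Hypothetical sample: 12 females and 8 males (60% / 40%).
sample = [{"id": i, "sex": "F" if i < 12 else "M"} for i in range(20)]
group_a, group_b = matched_assignment(sample, key="sex", seed=3)

for name, group in (("A", group_a), ("B", group_b)):
    females = sum(p["sex"] == "F" for p in group)
    print(f"group {name}: {females} females, {len(group) - females} males")
```

This sketch assumes each stratum has an even number of members; with odd counts the two groups would differ by one person in that stratum.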

Another Example of An Experimental Study

Suppose a researcher randomly selects students from different programs of study (behavioral sciences, social sciences, sciences, humanities) and randomly assigns these students to either the experimental group (teaching intervention) or the control group (regular teaching) in order to evaluate the effect of teaching method and program of study on students’ statistics course outcomes.

• Response variable (outcome measure): Statistics course outcome

• Predictor Variables (factors): Teaching method, Program of study

• Treatments (factor-level combinations): 2 × 4 = 8

(regular, behavioral sciences), (regular, social sciences), (regular, sciences), (regular, humanities)

(intervention, behavioral sciences), (intervention, social sciences), (intervention, sciences), (intervention, humanities)

• Experimental units: Students
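The eight treatments are simply every pairing of a teaching method with a program of study. A minimal Python sketch of that enumeration, using the factor levels listed above (the variable names are chosen here for illustration):

```python
from itertools import product

teaching_methods = ["regular", "intervention"]
programs = ["behavioral sciences", "social sciences", "sciences", "humanities"]

# Each treatment is one (teaching method, program of study) combination.
treatments = list(product(teaching_methods, programs))

print(len(treatments))
for method, program in treatments:
    print(f"({method}, {program})")
```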


Nonexperimental Research Methods

• Nonexperimental research methods are like correlational methods.

• It is impossible to control the assignment of participants to groups.

• Often referred to as Quasi-experimental research.

• The predictor variable (Quasi-predictor variable) has levels to create groups of scores.


Examples of Nonexperimental Methods

❖ Suppose researchers aim to investigate the effect of gender on verbal scores. That is, they want to examine whether verbal scores differ between boys and girls.

• The researchers have no control over which participants go into which group.

❖ Suppose researchers aim to compare depression scores before therapy and after therapy. This is a pre-post study.

The two groups of scores are obtained by measuring the same variable (depression level) twice for each

participant (once before therapy and another time after therapy).

• In the pre-post study, the researcher has no control over the passage of time. The difference between pre- and post-therapy scores could be caused by the treatment or simply by the passage of time. For example, depression scores may decrease over time in the same way that the symptoms of a cold disappear over time. As another example, the weather could change from dark and gloomy before therapy to bright and sunny after therapy. In this situation, depression could improve because of the weather and not because of the therapy.


Assessing the Validity of a Study

• Can you trust the result of a study?

• The validity of a study is concerned with accuracy (valid measures): How accurate are the measurements in the study?

• There are five different types of validity:

• Internal validity

• External validity

• Construct validity

• Face validity

• Ecological validity


Internal Validity

• Internal validity refers to the extent to which we are able to draw the correct conclusions about the causal relationships between variables.

• It is called internal, because it refers to the relationship between things “inside” the study.

Consider this example:

Suppose we want to examine whether a university education makes a person write better. To investigate this, let’s suppose that we randomly select a group of first-year students and a group of third-year students. We ask the students to write a 1000-word essay and then we count the number of spelling and grammatical errors they make. Suppose that we find that the third-year students (who have more university experience/education than the first-year students) make fewer errors. Can we conclude that university improves writing skills?


Internal Validity

Refer to the example on the previous slide:

Can we conclude that university improves writing skills?

• It is hard to know the causal relationship.

• Do older people write better?

• Or do people who have more writing experience write better?

• Or do people who have more education write better?

Which of the above is the true cause of the superior performance?

• Age

• Experience

• Education

We cannot tell for sure.

Note: This study lacks internal validity because it cannot properly separate the causal contributions of the different variables.


External Validity

• External validity refers to the generalizability of a study’s findings.

• To what extent do you expect to see the same pattern of results in “real life” as you saw in a study?

• Does a study generalize to people and situations beyond that study?

• External validity is threatened by the choice of population.

• Think about whether the population from which the sample is selected is narrow. That is, think about whether that narrow population is systematically different from the general population.

For example: I studied undergraduate social sciences students’ attitudes toward statistics. I used the SATS-36© to measure students’ attitudes about statistics. Can I generalize the results of my study to all undergraduate students?

• Maybe. But a better answer to this question is NO.

• I can only generalize the results to undergraduate social sciences students.

Note: Think about “real life” outcomes. For example, the subjects in a study may not answer a survey questionnaire (attitudinal survey) or perform in a task-based experiment the way they would behave in real life.


Construct Validity

• Construct validity is concerned with what the study aims to measure.

• A study has good construct validity when it actually measures the correct theoretical construct.

For example:

The Survey of Attitudes Toward Statistics (SATS-36©) encompasses six attitudinal constructs (Affect, Cognitive Competence, Difficulty, Value, Interest, and Effort) that are congruent with Eccles et al.’s (1983) expectancy-value model. This model theorizes that students’ expectancy for success and subjective task value directly influence students’ achievement-related choices, and also influence their performance, effort, and persistence.

• A confirmatory factor analysis, a multivariate statistical technique, can test whether items in a component

represent that component.


Face Validity

• Whether a measurement “looks like” it is doing what it is supposed to, nothing more.

For example:

It has been argued in the literature that early surveys of students’ attitudes about statistics included items that measured students’ conceptual understanding of statistics rather than their perception of statistics. Thus, those attitudinal surveys lacked face validity.

The Survey of Attitudes Toward Statistics (SATS-36©) encompasses six attitudinal constructs (Affect, Cognitive Competence, Difficulty, Value, Interest, and Effort) that are congruent with Eccles et al.’s (1983) expectancy-value model. This model theorizes that students’ expectancy for success and subjective task value directly influence students’ achievement-related choices, and also influence their performance, effort, and persistence.


Ecological Validity

Ecological validity of a study means that the entire setup of the study should closely approximate the real-world scenario that is being investigated.

❖ It is related to the concepts of external validity and face validity, but it is much more rigorous than either.


Confounds and Artifacts: Threats to Validity of a Study

Confounds:

• A confound is an additional, often unmeasured variable that is related to both the predictors and the outcome.

• The existence of confounds threatens internal validity, because we cannot tell whether the predictor causes the outcome or the confounding variable causes it.

Artifacts:

• A result is said to be artifactual if it holds only in the special situation that a researcher happened to test in their study.

• It would be a threat to the external validity of the study, because the researcher cannot generalize the result of their

study to the actual population of interest.

Example: Regression to the mean

Asal’s example: Do basic statistics courses have an adverse effect on the attitudes of students who had positive attitudes about statistics at the beginning of their course?
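Regression to the mean can be seen in a small simulation. The Python sketch below assumes a simple model in which every observed attitude score is a stable true attitude plus random noise, with no course effect at all. The score scale, noise levels, and the top-20% selection rule are assumptions made for illustration, not values from any actual study.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical model: each observed score = stable true attitude + random noise.
# No course effect is built in, so any change is purely an artifact.
true_attitude = [rng.gauss(4.0, 0.5) for _ in range(2000)]
pre = [t + rng.gauss(0, 0.5) for t in true_attitude]
post = [t + rng.gauss(0, 0.5) for t in true_attitude]

# Select the students whose pre-test score was in the top 20%.
cutoff = sorted(pre)[int(0.8 * len(pre))]
selected = [i for i, score in enumerate(pre) if score >= cutoff]

pre_mean = mean(pre[i] for i in selected)
post_mean = mean(post[i] for i in selected)

print(f"selected group, pre-test mean:  {pre_mean:.2f}")
print(f"selected group, post-test mean: {post_mean:.2f}")
```

Under these assumptions the selected students score lower on average at post-test, because part of their high pre-test score was noise that does not repeat. A drop among initially positive students is therefore not, by itself, evidence that the course harmed their attitudes.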


Other Threats to Validity of a Study

History Effects:

• It refers to the possibility that specific events may occur during the study itself that might influence the outcome.

Example: Suppose a study aims to investigate the effect of an anti-anxiety drug on participants’ self-reports before and after taking the drug. Suppose that the study’s lab is in Los Angeles and during the course of the study an earthquake occurs.

Maturation Effects:

• Change over time: How people change on their own over time.

Example: In developmental psychology, a researcher might investigate the effect of some educational tricks on vocabulary size among 3-year-olds. The researchers need to take into account the maturation effect: children’s vocabulary grows at a very fast rate (e.g., multiple words per day) on its own.

• When running experiments in the lab (e.g., three hours long), researchers should take into account that people tend to get bored. This maturation effect may cause performance to decline even when everything else going on in the lab is held constant.

Other Threats to Validity of a Study

Repeated Testing Effects:

• It is an important type of history effect.

Example: I assessed undergraduate social sciences students’ attitudes toward statistics once at the beginning of an introductory course for the social sciences and again at the end of the course. I used the SATS-36©.

• Issue: The responses to the attitude items at time 1 (pre-test; at the beginning of the course) have an effect on the responses to the attitude items at time 2 (post-test; at the end of the course).

• Some Reasons: Familiarity with the items and/or familiarity with the course.


Other Threats to Validity of a Study

Selection Bias:

Example: Gender imbalance in a study.

Suppose an experiment aims to investigate whether a new anti-anxiety drug reduces anxiety.

• The researchers design an experiment with two groups:

Treatment condition: anti-anxiety drug. Control condition: placebo (a fake but harmless drug).

• The researcher wants to find out if the different treatments lead to different outcomes.

• Suppose that the study ends up with a gender imbalance across groups. That is, the people randomly selected into the two groups (treatment condition, control condition) have different characteristics

(e.g., treatment group has 80% females, and control group has 50% males).

• If the study reports that treatment works better on females than males, the study’s conclusion is not valid, because gender is confounded with the treatment condition.

Other Threats to Validity of a Study

Differential Attrition:

• It is a kind of selection bias caused by the study itself.

• There are two types:

• Homogeneous attrition: The attrition effect is the same in the treatment group and the control group (all groups).

Example: Long-running experiments tend to make people bored. Suppose that only conscientious people stay in the study (both in the treatment and control groups) and others drop out. The sample becomes unrepresentative of the general population. Thus, the study loses its external validity.

• Heterogeneous attrition: The attrition effect is different for different groups.

Example: Suppose a (hypothetical) study wants to examine whether insulting people makes them act in a more obedient way. Suppose the researchers design their experiment with two conditions. In the treatment condition, the experimenter insults the participants and then gives them a questionnaire to measure obedience. In the control condition, the experimenter engages in a meaningless conversation with the participants and then administers the questionnaire to measure obedience.

• The people in the treatment group are more likely to drop out. The study has a confound (e.g., the people in the treatment group who actually took the questionnaire are already more obedient than the control group).

• In this study, insulting people does not make them obedient but makes them more disobedient (e.g., they leave the study).
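The confound created by heterogeneous attrition can be sketched with a simulation. In the Python sketch below the insult has no effect on obedience at all; the only difference between conditions is who drops out. The obedience scale and the dropout rule are assumptions invented for illustration, matching the hypothetical study above rather than any real data.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical obedience trait on a 0-1 scale; the treatment has NO effect here.
participants = [rng.random() for _ in range(2000)]
rng.shuffle(participants)
treatment, control = participants[:1000], participants[1000:]


def stays(obedience, insulted):
    """Insulted participants stay with probability equal to their obedience;
    control participants always stay (an assumed dropout rule)."""
    return rng.random() < obedience if insulted else True


treatment_completers = [o for o in treatment if stays(o, insulted=True)]
control_completers = [o for o in control if stays(o, insulted=False)]

print(f"treatment completers: {len(treatment_completers)}, "
      f"mean obedience {mean(treatment_completers):.2f}")
print(f"control completers:   {len(control_completers)}, "
      f"mean obedience {mean(control_completers):.2f}")
```

Among those who finish, the treatment group looks more obedient even though the insult changed nobody's obedience, which is exactly the misleading pattern described above.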


Other Threats to Validity of a Study

Non-response Bias:

• When participants cannot be reached or refuse to participate (e.g., mail out questionnaire)

• It is closely related to selection bias.

Example:

In her book Women and Love, author Shere Hite surveyed women in the U.S. She mailed her survey to 100,000 women, and 4,500 women responded (4.5% of the 100,000 women).

• Her conclusions (based on her survey): 70% of women who had been married at least five years have

extramarital affairs.

Concern:

• Are the 4.5% of women who responded representative of the 100,000 women?

• Are the 4.5% of women who responded representative of the entire population of American women?

Other Threats to Validity of a Study

Missing Data:

• Missing responses for some of the variables measured.

• If a response is missing at random, it is not a big problem. But if responses are systematically missing for a challenging question, it is a problem.

Example: EQAO test

Concern:

• How to treat the missing information?

• Should we omit every subject who has missing information?

• Should we impute the missing information (replace it with some other statistic) using some strategy?

• If most cases (subjects) miss reporting their response to a specific question, what could this mean regarding the

questionnaire? In that case, how could we improve our research question(s)?
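The two options raised above (omitting subjects versus imputing values) can be sketched in Python as follows. The four respondents, the item names `q1` to `q3`, and the choice of mean imputation are assumptions made for illustration; other imputation strategies exist and may be more appropriate in practice.

```python
from statistics import mean

# Hypothetical responses to three survey items; None marks a missing answer.
responses = [
    {"id": 1, "q1": 4, "q2": 5, "q3": 3},
    {"id": 2, "q1": 2, "q2": None, "q3": 4},
    {"id": 3, "q1": 5, "q2": 4, "q3": None},
    {"id": 4, "q1": 3, "q2": 3, "q3": 2},
]
items = ["q1", "q2", "q3"]

# Strategy 1: listwise deletion (omit every subject with any missing item).
complete_cases = [r for r in responses if all(r[q] is not None for q in items)]

# Strategy 2: mean imputation (replace a missing value with the item mean).
item_means = {
    q: mean(r[q] for r in responses if r[q] is not None) for q in items
}
imputed = [
    {**r, **{q: item_means[q] for q in items if r[q] is None}} for r in responses
]

# Count missing answers per item to spot a potentially problematic question.
missing_counts = {q: sum(r[q] is None for r in responses) for q in items}

print(len(complete_cases))
print(imputed[1]["q2"])
print(missing_counts)
```

Listwise deletion shrinks the sample, while mean imputation keeps every subject but understates variability, so neither is free of cost. The per-item counts help answer the last concern: an item that many subjects skip may point to a problem with the question itself.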


Other Threats to Validity of a Study


Experimenter Bias:

• When we know what treatment was assigned, it is difficult not to let that knowledge influence our assessment of the response, even when we try to be careful.

• There are two main classes of individuals who can affect the outcome of the experiment:

• Those who could influence the results (subjects, treatment administrators, technicians)

• Those who evaluate the results (judges, treating physicians, etc.)

Example: If researchers believe that certain mice are very intelligent, then those mice will do better on all kinds of learning tasks, even though the mice were randomly “assigned” to be labelled intelligent.

• The classic reference is Robert Rosenthal’s (1966) Experimenter Effects in Behavioral Research.

• Naturally, the expectancy phenomenon applies to teachers and students in a classroom setting, where it is called teacher expectancy. The reference for this is Rosenthal and Jacobson’s Pygmalion in the Classroom.

Pygmalion effect: if we expect great things from people, they will rise to that occasion.

Solution:

In order to avoid the bias that might result from knowing what treatment was assigned, we use blinding.

• When every individual in either one of these classes is blinded, an experiment is said to be single-blind.

• When everyone in both classes is blinded, the experiment is called double-blind.

• In a double-blind study, neither the experimenter nor the participant knows which condition the participant is in or what the desired behaviour is.

Example of Using Blinding in Experimental Study


Pepsi vs Coke: a study of consumer preference

• Single-blind is when the participants don’t know which cup they are tasting, but the administrator knows which cup is Pepsi and which is Coke.

• If the consumer’s preference is shown to the administrator, the administrator may treat the study differently (e.g., change the order of the cups).

• Double-blind is when neither the participants nor the administrators know which cup is Pepsi and which is Coke.
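One common way to arrange double-blinding is to have a third party label the cups with neutral codes and hold the key until all responses are recorded. The Python sketch below illustrates that idea; the helper name `blind_cups`, the "Cup A / Cup B" labels, and the recorded preference are assumptions for illustration and are not described in the lecture.

```python
import random


def blind_cups(drinks, seed=None):
    """Return (labels, key): anonymous cup labels in a random order, plus the
    key mapping each label to its drink. The key is kept by a third party."""
    rng = random.Random(seed)
    order = list(drinks)
    rng.shuffle(order)
    labels = [f"Cup {chr(ord('A') + i)}" for i in range(len(order))]
    key = dict(zip(labels, order))
    return labels, key


labels, sealed_key = blind_cups(["Pepsi", "Coke"], seed=11)

# Participants and administrators see only the labels during the tasting.
print(labels)

# After all preferences are recorded, the third party unseals the key.
preferred_label = "Cup A"  # hypothetical recorded preference
print(sealed_key[preferred_label])
```

In a single-blind version of the same study, the administrator would be given `sealed_key` before the tasting, which is what opens the door to experimenter bias.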


Other Threats to Validity of a Study


Demand Effect:

• It is impossible to stop people from knowing that they are part of a study.

• If participants know that someone (e.g., experimenter) is watching them, it can affect behavior or performance.

• People alter their responses because of the attention the study focuses on them. This situation is referred to as the Hawthorne effect.

• Roles that people might adopt during an experiment:

• Good participant: tries to figure out the experimenter’s hypotheses and confirm them.

• Bad participant: does the exact opposite of the good participant (e.g., sets out to break or destroy the study).

• Faithful participant: unnaturally obedient.

• Apprehensive participant: gets nervous about being tested.

Other Threats to Validity of a Study


Placebo:

• A placebo is a “fake” treatment designed to look like a real one.

• Subjects getting the placebo or standard treatment are called the control group.

Why is including a placebo in an experimental design study important?

• It is known that receiving any treatment can cause a subject to improve.

• We want to show that the “real” treatment is not just effective, but better than a placebo. Then we have evidence that the treatment is worth knowing about.

• We can also use the current standard treatment as the comparison in place of the placebo.

Placebo Effect:

• It is a specific type of demand effect.

• It is a situation where the mere fact of being treated causes an improvement in outcomes. That is, when psychological attitudes can affect the results, a control group given a placebo may show an improvement.

• It is stronger when placebo treatments are administered with authority or by a figure who appears to be an authority.

Example: One common version of a placebo in drug testing is a sugar pill.


Ethical Experiments


• The idea of imposing treatments on subjects might be questionable:

• What if we want to study the effects of smoking on lung disease?

We would have to prevent some subjects from smoking and make other subjects smoke for the duration of the study (!!!)

This is not ethical at all.

• There are some known unhealthy/dangerous things you cannot ask subjects to do.

• Also, giving a placebo when a best proven treatment is available is not ethical.

• Subjects who receive placebo must not be subject to serious harm by so doing.

• See Declaration of Helsinki, which governs experiments on human subjects:

www.wma.net/en/30publications/10policies/b3/index.html

Summary


• We defined the terms psychological measurement, construct, and measure.

• We introduced variables and their scales of measurement.

• We defined the roles of variables in a study.

• We described three types of studies: Correlational, Experimental, Nonexperimental

• We realized that correlation does not imply causation.

• The best experiments are: Randomized, Comparative, Double-blind, Placebo-controlled

• We described the notions of reliability, validity and their types.

• We distinguished different threats to study’s validity.

• In general, when we design a study, we need to think about:

• The target population (who to draw our sample from)

• What to study in the target population?

• Why study that particular context? This means, being able to articulate the purpose of the study.

• How to conduct the study? This is concerned with the tools that we need to use to make our measurement.

• When to conduct the study?

• Where to conduct the study?

The above are important things to consider, as they might affect the results of the study.
