Lab2 Descriptives

  • 8/18/2019 Lab2 Descriptives

    Statistics – Spring 2008

    Lab #2 – Descriptives

Descriptive analysis involves examining the characteristics of individual variables, as compared to inferential statistics, which examines the relationships between variables.

There are two types of variables -- categorical and continuous -- and the characteristics of interest for each type are different. For categorical variables, you are interested in the "count", such as the demographic characteristics of your study (e.g., 50 males and 56 females). For continuous variables, there are many different characteristics to examine, such as mean, median, mode, range, variability, etc., but the mean is typically the most useful descriptor.
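To make the distinction concrete, here is a minimal sketch in Python (not SPSS) of the two kinds of summary: counts for a categorical variable, and mean, median, mode, and range for a continuous one. The data values and variable names are invented for illustration and do not come from the lab dataset.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical data, invented for illustration only.
gender = ["male", "female", "female", "male", "female"]  # categorical
age = [19, 22, 22, 35, 41]                               # continuous

# Categorical variable: the characteristic of interest is the count per category.
gender_counts = Counter(gender)

# Continuous variable: several descriptors are available.
age_summary = {
    "mean": mean(age),
    "median": median(age),
    "mode": mode(age),
    "range": max(age) - min(age),
}

print(gender_counts)
print(age_summary)
```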

This document explains how to examine characteristics of categorical and continuous variables. Descriptive analysis involves the same SPSS commands as Data Screening (e.g., Explore, Frequencies), so you are already intimately familiar with how to conduct descriptive analysis.

This document also explains how to create composites by averaging together individual variables into new composite variables. Compositing can involve a few different tasks, such as reverse coding items, averaging items with different scale ranges, and conducting reliability analysis to determine whether, from a statistical point of view, the individual items "should" be averaged together. All those tasks are described below.

This document also explains new skills related to descriptive analysis that you may want to learn, such as how to transform a continuous variable into a categorical variable, how to create a new variable based upon the combination of two or more variables, and how to use syntax.

1. Descriptive Statistics

Your two options for descriptive analysis are the "Frequencies" command and the "Explore" command. Both provide much of the same information, except:

a. Frequencies -- groups together the descriptive information into one grid and displays histograms with a normal curve (whereas Explore displays histograms but without normal curves)
b. Explore -- displays descriptive information for each variable separately and displays boxplots, which Frequencies does not

To use the Frequencies command:

1. Select Analyze --> Descriptive Statistics --> Frequencies
2. Move all variables into the "Variable(s)" window.
3. Click "Statistics" and put a checkmark next to every descriptive statistic you are interested in viewing.
4. Click "Charts" and put a checkmark next to the chart type you are interested in viewing.
5. Click OK.

Output below is for the first four "demographic" questions.

The "Statistics" box provides a grid format of the descriptive statistics for each variable.

After the Statistics box, the frequency distribution for each variable is displayed. Below is the frequency distribution for gender.

Next come the histograms. Below are the histograms for age and gender. I chose to display these two histograms because they illustrate how the "Frequencies" histogram is useful for displaying both categorical and continuous variables. Also, notice that neither histogram is normally distributed. Not every variable needs to be normally distributed. Plus, categorical variables with few answer choices (e.g., 2, 3, 4, 5, 6) will rarely conform to a normal curve. Finally, in the age histogram notice the sharp drop-off below the "20" line. This is because we restricted participation in the study to people who were aged 18 or older.

If you double-click on the histogram in the SPSS output viewer, it opens a new window containing the histogram with many new drop-down options to manipulate the histogram. There are too many options to explain them all, so feel free to try each one, and if you have specific questions, please let me know.

Now let's use the Explore command.

1. Select Analyze --> Descriptive Statistics --> Explore
2. Move all variables into the "Variable(s)" window.
3. Click "Plots" and unclick "Stem-and-leaf"
4. Click "Options" and click "Exclude cases pairwise"
5. Click OK.

Output below is for only the four "system" variables in our dataset because copy/pasting the output for all variables in our dataset would take up too much space in this document.

"Case Processing Summary" shows the number of cases that are valid, missing, and total.

"Descriptives" shows the same information as the "Frequencies" command, but now each variable is displayed separately.

Next, the boxplot for each variable is displayed. Below is the boxplot for "edu" because I want to show you a boxplot that contains both mild outliers (round dots) and extreme outliers (stars).

What if you want descriptive statistics within groups? For example, imagine a study that manipulated the presence or absence of a weapon during a crime, and the Dependent Variable was the level of emotional reaction to the crime. In addition to looking at descriptive statistics of your DV within the entire study (so collapsing across both groups), you may also want descriptive statistics for your DV within each group. Another example of when you would want descriptive statistics within groups is when your study involves a verdict choice. Typically, you not only report the percentage of guilty/not-guilty verdicts across the entire study, but you also want to report the percentage of guilty/not-guilty verdicts within each group in your study. I present an example of this situation on the next page, and how to present this data in a Figure.

1. Select Analyze --> Descriptive Statistics --> Explore
2. Move all variables into the "Variable(s)" window.
   Move "sex" into the "Factor List"
3. Click "Statistics", and click "Outliers"
4. Click "Plots", and unclick "Stem-and-leaf"
5. Click OK.

Output on next page is for "system1".

The "Descriptives" box tells you descriptive statistics about the variable. Notice that information for "males" and "females" is displayed separately.
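The idea of descriptive statistics within groups can be sketched outside SPSS as follows. This is a Python illustration, not the Explore command itself, and the scores are invented: it computes the mean of one variable separately for each level of "sex", alongside the mean collapsed across both groups.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (sex, system1) pairs, invented for illustration only.
rows = [("male", 3), ("female", 5), ("female", 6), ("male", 4), ("female", 4)]

# Collect the scores separately for each group (the "Factor List" idea).
by_group = defaultdict(list)
for sex, score in rows:
    by_group[sex].append(score)

# Mean within each group, and mean collapsing across both groups.
group_means = {sex: mean(scores) for sex, scores in by_group.items()}
overall_mean = mean([score for _, score in rows])

print(group_means)
print(overall_mean)
```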

WRITE-UP: You typically discuss the characteristics of demographics in the beginning of the Method section, not the Results section, and you also typically only present data for gender (see below). If you want to discuss more than just gender, such as age, education, political affiliation, income, etc., then you would create a Figure to display all the data. For descriptive statistics other than demographics, you would present that data in the Results section. If there are only a few descriptive statistics, you discuss them in the text of the Results section (see below). If there are many descriptive statistics, you present them in a Figure, and then discuss only the most pertinent information from the Figure when you are writing the Results/Discussion section.

a. "[...] participants, with many more females (n = 248) than males (n = 76), and three participants who did not indicate gender."
b. "[...] 8%."
c. [...]

EVALUATION: Since "evaluating" descriptive statistics in Results sections or Figures is simply reading the descriptive statistics that are reported, I don't have any advice for evaluating descriptive statistics other than to pay attention to whether there are any descriptive statistics that were not reported that you may find helpful or would want the author to include in the paper.

2. Other graphs

SPSS has the ability to create other types of graphs beyond histograms and boxplots, but they provide little information beyond the information provided by histograms and boxplots. The other charts are:

a. Bar charts
b. 3D bar charts
c. Line charts
d. Area charts
e. Pie charts

To access these charts:

1. Select Graphs --> choose either "Chart Builder" or "Legacy Charts"
2. Move chosen variables into the appropriate open spaces
3. Click OK.

"Legacy Charts" are the old way that SPSS builds charts. Each chart has a separate command window, each with its own unique options and characteristics. The options and characteristics are very straightforward and easy to use.

"Chart Builder" is new to SPSS. It reportedly has more functionality, but it is also complex and sometimes difficult to manipulate. I would suggest first using the Legacy Charts to get a better understanding of each type of chart.

3. Composites – averaging items together

Why do we create composites? The rule of thumb in statistics is "the more, the better". In terms of measuring constructs, this means that you typically want to ask many questions about the same construct in order to adequately tap into the entire construct of interest. For example, in a study about happiness, asking "how happy are you right now" perfectly maps onto the construct of "how happy you are right now". But if your intended construct is "happiness", you need to ask more questions to tap the entire theoretical construct, such as "how happy do you feel", "how happy are you with your life in general", etc. Thus, for every construct, researchers ask many questions by either using established scales of the topic or creating their own measures to tap all the facets of the construct. When you analyze the data, you start by conducting descriptive analysis of each individual question. Then, you composite all 10 questions together into 1 variable by averaging together all 10 questions. Researchers are typically more interested in that 1 composite variable than the 10 individual items (unless the 10 questions are uniquely tapping different sub-parts of the entire construct, and researchers are interested in each sub-part). So, after first conducting descriptive analysis of each item, you then conduct descriptive analysis of the 1 composite variable.

1. Select Transform --> Compute Variable
2. Type a new name for your composite in the "Target Variable" box.
3. Drag "mean" from the "Function group" into the open box above
4. Replace the question marks (?) with each item to be composited, separated by a comma (,)
5. Click OK.

The newly created composite will appear at the end of the data file.
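The arithmetic behind a composite is simply a per-participant average across items. Here is a rough Python sketch (not SPSS). The item names, the data, and the choice to average whichever items a participant answered are all assumptions made for illustration. Check how your own software treats skipped questions before relying on this behavior.

```python
from statistics import mean

# Hypothetical responses to three happiness items, one dict per participant.
# None marks a skipped question. All names and values are invented.
participants = [
    {"happy1": 6, "happy2": 5, "happy3": 7},
    {"happy1": 2, "happy2": None, "happy3": 3},
]
items = ["happy1", "happy2", "happy3"]


def composite(row, item_names):
    """Average the answered items; return None if every item is missing."""
    answered = [row[name] for name in item_names if row[name] is not None]
    return mean(answered) if answered else None


# Add the composite as a new variable, leaving the original items intact.
for row in participants:
    row["happy_comp"] = composite(row, items)

print([row["happy_comp"] for row in participants])
```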

Is it appropriate to create a composite with my questions? We described above how to create a composite, but another question is whether it's appropriate to create the composite given the questions and data in your study.

You can answer that question from a theoretical point of view and from a statistical point of view. I describe below both points of view.

From a theoretical point of view:

a. From a theoretical point of view, it is possible your questions do not measure the same construct, and thus it is inappropriate to average them together. For example, the face content of each item may measure different concepts. Imagine questions about your political group orientation. A question about whether you "think" of yourself as a republican or democrat may tap a different construct than if you ask whether you "feel" like a republican or democrat. You need to examine your questions and make a determination of whether you feel it's appropriate to average the items together.
b. Another option is to create separate composites, one for each concept that is measured. For example, maybe you composite together all the questions about how you "feel" about your political group membership, and create another composite of the questions about how you "think" of your political group membership. After creating the separate composites, you can then also merge all the questions together (so merge all the separate composites together) into 1 big composite. In this case, you would call the separate composites you merged together the "sub-parts" or "sub-factors" of the 1 big composite. Also, from a theoretical point of view you need to decide how to label or characterize this big composite.
c. It is acceptable to create composites from a theoretical point of view even if it is not appropriate from a statistical point of view. I discuss next the benchmarks for deciding whether or not it's statistically appropriate to merge items together into a composite, but even if those benchmarks are not met in your data, it is still appropriate to merge items together from a purely theoretical point of view.

From a statistical point of view:

1. Select Analyze --> Scale --> Reliability Analysis
2. Move all variables into the "Variable(s)" window.
3. Click "Statistics" and put a checkmark next to "item" and "Scale if item deleted"
4. Click OK.

"Reliability Statistics" gives you the "Alpha" number, which is the determination of whether or not the items group together from a statistical point of view. Alpha typically ranges from 0 to 1, and the higher the number, the stronger the items group together statistically. Output below is for the three "prosecutor" questions.

Alphas above .9 are great, above .8 are good, above .7 are ok, above .6 are borderline.

In this case, Alpha = .68, which is acceptable to merge the three items together into a composite. Also, the smaller the sample, the more likely you will find smaller Alpha levels because there is less data to identify intercorrelations. In smaller samples, smaller Alpha levels are acceptable to create composites.
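For readers who want to see where the Alpha number comes from, here is a minimal Python sketch of the standard Cronbach's alpha formula: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The function name and the scores are invented for illustration; this is not the SPSS output for the "prosecutor" items.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: list of equal-length lists, one list of scores per item,
    with participants in the same order in every list.
    """
    k = len(items)
    # Each participant's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the individual item variances (sample variance, n - 1).
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for three items from five participants.
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 2))
```

Items that rise and fall together across participants push the variance of the total score well above the sum of the item variances, which is what drives Alpha toward 1.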

The other output from the analysis is helpful to interpret your data. "Case Processing Summary" tells you the number of valid cases included in the analysis. Notice that only listwise deletion is possible. "Item Statistics" gives you descriptive information about each item. "Item-Total Statistics" tells you the Alpha level if each item is removed. Notice that Alpha improves to .78 if we remove "prosecutor3". In this case, because there are so few items (e.g., 3), I would suggest not removing "prosecutor3", even though it improves Alpha, because only 2 items is not much of a composite. If we were analyzing many items (e.g., 4+), then it would be more appropriate to exclude items.

WRITE-UP: "The three items measuring attitudes toward prosecutors formed a reliable composite (α = .68)."

EVALUATION: For each composite in the paper, the author(s) need to report the alpha level, which is the statistic that tells you whether or not the items group together statistically. Alpha is determined by the strength of the bivariate relationships amongst all the items in the composite. The higher the internal consistency amongst items, the higher the Alpha level. Alphas above .9 are great, above .8 are good, above .7 are ok, above .6 are borderline. Also, the smaller the sample, the more likely you will find smaller Alpha levels because there is less data to identify intercorrelations.

4. Items with different scale ranges

If you are going to composite together multiple items, all the items need to have the same scale range. For example, let's say we ask two happiness questions: (1) "[...] scale, and (2) "[...] points, BUT the scale ranges are along different dimensions. Compositing involves averaging items together. If we average together these two items, the resulting average will not be interpretable because of the different scale ranges. For example, a "1" on the first item is the lowest possible answer choice, but a "1" on the second item is one of the highest possible choices. The solution is to transform both scale ranges

into a common metric. This is accomplished by first "standardizing" both items. Then, we composite the newly transformed items.

Before we get to how to standardize items, I want to point out why I included in the example a scale that ranged from a negative number (-3) to a positive number (3). Sometimes when you are measuring constructs there is a natural mid-point or neutral point, such as with "happiness" where you could have "0" happiness at the moment. In this situation, it can be beneficial to include an answer choice that is neutral or "0". Notice that if we asked the same question but with a 1-7 scale, if you wanted to indicate you are feeling zero happiness at the moment, your only answer choice would be a "1", which you may not feel indicates your absence of happiness. Another reason to include a scale that ranges from negative to positive is that your construct also ranges from negative to positive. For example, imagine a question that asked about your feelings about the death penalty. You could have a negative view or a positive view of the death penalty, so in order to tap that construct you need to include in the scale range answer choices that reflect positive and negative. Another way to reflect both positive and negative in a scale is with the labels. For example, you could ask the same question about your feelings toward the death penalty on a 1-7 scale, but have the label for "1" be strongly oppose, for "4" be neutral, and for "7" be strongly support.

I also want to point out that standardizing your items to transform them to a common metric is necessary whenever any of the scale ranges differ, not just with negative versus positive items, as in the example above. For example, you may ask questions about the death penalty that are so similar that you want to vary the scale ranges of the items so that you tap into more information (and also force the subjects to pay more attention to the items, because items that all have the same scale range may allow lazy subjects to answer the same way on similar questions without thinking carefully about their answers).

To standardize items:

1. Select Analyze --> Descriptive Statistics --> Descriptives
2. Move all variables into the "Variable(s)" window.
3. Put a checkmark next to "Save standardized values as variables"
4. Click OK.

The newly standardized variables are listed at the end of the data file. Each standardized variable is listed in a separate column. You can then analyze the new standardized variables as you would any other variable in your data set, including averaging them together to create a composite.
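Standardizing means converting each score to a z-score: subtract the item's mean and divide by the item's standard deviation. The Python sketch below (not SPSS) assumes the sample standard deviation (n - 1 in the denominator) and uses invented scores for a 1-7 item and a -3 to 3 item. It shows that the two items land on a common metric and can then be averaged.

```python
from statistics import mean, stdev


def standardize(scores):
    """Convert raw scores to z-scores using the sample standard deviation."""
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]


# Hypothetical answers from five participants, invented for illustration.
item_1_to_7 = [1, 4, 7, 4, 4]      # item answered on a 1 to 7 scale
item_m3_to_3 = [-3, 0, 3, 0, 0]    # item answered on a -3 to 3 scale

z1 = standardize(item_1_to_7)
z2 = standardize(item_m3_to_3)

# Once both items share a metric, the composite is the per-participant average.
z_composite = [(a + b) / 2 for a, b in zip(z1, z2)]

print([round(z, 2) for z in z_composite])
```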

5. Reverse coding items

If you are going to composite together multiple items, all the items need to be "in the same direction". This means that indicating a higher (or lower) response on each scale must correspond conceptually to answering higher (or lower) on the other items you want to composite together. For example, let's say we ask two happiness questions: (1) "[...] scale, and (2) "[...] scale. Notice that the two questions are about the same construct (so theoretically you can merge them together), and also notice that the total range of the scales for both items is 7 points, BUT that conceptually answering higher (or lower) on one item is the same as answering lower (or higher) on the other item. Before we can composite them together, we need to transform all the items so that they are "in the same direction". Thus, we could reverse code the scale range for either the first item or the second item (but obviously not both items). Composites typically contain multiple items, so you typically have to reverse code multiple items. Also, when choosing which set of items to reverse code (e.g., either the items that are in the positive direction, or items that are in the negative direction), you should think ahead to the statistical analyses you want to conduct and how you want output from those statistical analyses (or the relationship between those variables) to be conceptualized. For example, imagine a study testing the relationship between happiness and income. If your hypothesis is that more income is correlated with more happiness, then conceptually we want our "happiness" composite to be coded in the positive direction (so that higher on the scale means more happiness) so that the outcome is easier to interpret. Notice that if we code the happiness composite in the opposite direction (so that lower means more happiness), we will still get the same conceptual outcome as with the positively coded composite -- that more happiness is correlated with

more income -- but the interpretation of the outcome will be more difficult because we will get a negative correlation between the variables (because lower on the happiness scale is more happiness, and more happiness is correlated with higher income). Thus, think ahead to your intended results and code all the items in the appropriate direction.

To reverse code items:

1. Select Transform --> Recode into different variables
2. Move one item into the "Input Window"
3. Type a name for the new variable.
   (I like to use the same name as the original variable, but labeled with "_rev", such as "system1_rev")
4. Click "Changes"
5. Click "Old and New Values"
6. Enter the "old" value and the "new value" and click "Add"
   (If reverse coding a 1-7 scale, then old=1, new=7; old=2, new=6; old=3, new=5; etc.)
7. Click Continue
8. Click OK.

The newly reverse coded variable is listed at the end of the data file.

Notice that instead of "Recode into different variables", there is an option for "Recode into same variables". I do not use this function because I like to leave the old variable intact so that I keep a permanent record of each variable. Otherwise, you may forget you reverse coded it and reverse code it again, or you may make a mistake in the reverse coding that can't be undone if the old variable has been replaced.
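The old/new pairs for reverse coding follow one rule: new = (lowest value + highest value) - old. Here is a short Python sketch (not SPSS) with invented scores and a hypothetical helper name. It writes the result to a new list so the original variable stays intact, in the spirit of "Recode into different variables".

```python
def reverse_code(value, low=1, high=7):
    """Reverse a score on a low-to-high scale, e.g. 1 becomes 7 on a 1-7 scale."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the {low}-{high} scale")
    return low + high - value


# Hypothetical responses on a 1-7 scale, invented for illustration.
system1 = [1, 2, 4, 7]

# Store the reversed scores separately; the original list is not modified.
system1_rev = [reverse_code(v) for v in system1]

print(system1_rev)
```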

6. SYNTAX

Up to this point, we have learned that SPSS has two windows – Data Editor (grid of data) and Viewer (output). SPSS has a third window – Syntax.

What is syntax? When you point-and-click in the Data Editor for SPSS to calculate a mean, or outlier, or correlation, or whatever, SPSS is calculating the statistical formulas for those tests. SPSS is basically a big calculator that can perform many different calculations. When you point-and-click in the Data Editor, you are telling SPSS how to perform those calculations, such as include "Kurtosis", or "exclude cases pairwise", or "run correlations on these three specific variables, and not the other variables". Another way to tell SPSS to perform those same operations is to use a programming language. In the syntax window, you can type the programming language, then hit the "run" button, and SPSS will perform the calculations. This process is analogous to how a website designer writes computer code to design a website, but you don't see the code, only the website design. Similarly, the point-and-click functionality in SPSS is analogous to the website design you see, and the syntax functionality in SPSS is analogous to the background computer code that you typically don't see.

Why use syntax? The "point-and-click" interface is very easy to use. You don't need to learn the syntax programming language, which can sometimes get overwhelming and difficult to understand.

   (I like to use the same name as the original variable, but labeled with "_rev", such as "system1_rev")
4. Click "Changes"
5. Click "Old and New Values"
6. Enter the "old" value and the "new value" and click "Add"
   (If reverse coding a 1-7 scale, then old=1, new=7; old=2, new=6; old=3, new=5; etc.)
7. Click Continue
8. Click PASTE

Notice that the last action is "PASTE", not OK.

The syntax window will open, and the command you just initiated is displayed using syntax code.

I have pasted below the syntax for our example. "RECODE" is the command to perform. Notice that the old variable name and new variable name are in the command line. Notice that it ends with "EXECUTE.". If we wanted to "run" this command, we would highlight the entire syntax and click the arrow button.

RECODE system1 (1=7) (2=6) (3=5) (4=4) (5=3) (6=2) (7=1) INTO system1_rev.
EXECUTE.

We are using this example to show how using syntax can speed up repetitive actions. If we copy/paste the syntax over and over, we can then type in the other variables we need to reverse code. Then, we highlight all the syntax and click the arrow button to run the syntax.

RECODE system1 (1=7) (2=6) (3=5) (4=4) (5=3) (6=2) (7=1) INTO system1_rev.
EXECUTE.
RECODE system2 (1=7) (2=6) (3=5) (4=4) (5=3) (6=2) (7=1) INTO system2_rev.
EXECUTE.
RECODE system3 (1=7) (2=6) (3=5) (4=4) (5=3) (6=2) (7=1) INTO system3_rev.
EXECUTE.
RECODE system4 (1=7) (2=6) (3=5) (4=4) (5=3) (6=2) (7=1) INTO system4_rev.
EXECUTE.
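Because the repeated blocks differ only in the variable name, the copy/paste step can itself be automated. As a rough sketch, the Python helper below (the name recode_syntax is my own invention) builds the same RECODE text shown above for any list of variables. It only produces text, which you would still paste into the SPSS syntax window and run there.

```python
def recode_syntax(var, low=1, high=7):
    """Build the reverse-coding RECODE command for one variable as text."""
    pairs = " ".join(f"({old}={low + high - old})" for old in range(low, high + 1))
    return f"RECODE {var} {pairs} INTO {var}_rev.\nEXECUTE."


variables = ["system1", "system2", "system3", "system4"]

# One RECODE/EXECUTE block per variable, ready to paste into the syntax window.
syntax_text = "\n".join(recode_syntax(v) for v in variables)
print(syntax_text)
```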

Another way to use syntax is to keep a record of your statistical analyses, because the syntax indicates not only which statistical analyses were performed but also how you performed those analyses and which options you chose to use. The Output window provides that record by displaying the syntax for every analysis that is conducted.

7. Transforming continuous variables into categorical variables (and categorical variables into different categorical variables)

It is possible to transform continuous variables into categorical variables. For example, imagine a study about happiness where your happiness item (or composite) ranges from 1 to 7. You might be interested in categorizing the subjects as either high happiness (4 through 7 on the scale) or low happiness (1 through 4 on the scale). This is called "dichotomizing" the variable because you are creating a new variable that has only two options.

Another example of why you would want to transform a continuous variable into a categorical variable is if there are only a few responses on some of the answer choices in the continuous variable. For example, imagine a scale range from 1-11 in which answer choice 4 and/or answer choice 9 received only 1 response each. 1 response is not enough data for meaningful interpretation. You may want to collapse the 11 point scale into 3 or 4 categories. As another example, look at the "rel_category" variable in our dataset, which measures the religious category memberships of the subjects. The frequency distribution is listed on the next page.

Transforming variables in this way uses the same SPSS command as for reverse coding items.

To transform the variables:

1. Select Transform --> Recode into different variables
2. Move one item into the "Input Window"
3. Type a name for the new variable.
   (I like to use the same name as the original variable, but labeled with "_cat", such as "system1_cat")
4. Click "Changes"
5. Click "Old and New Values"
6. Click "Range" and enter the range of values of the "old" variable, and assign a number for the new variable.
   (e.g., 1-3.999 becomes a "1", and 4.0001-7 becomes a "2")
7. Click Continue
8. Click OK.

The newly transformed variable is listed at the end of the data file. I would suggest then going into the "Variable View" and assigning value labels in the "Values" column that reflect how you cut the variable. For example, if you just created a new categorical variable where 1-3.999 becomes a "1" and 4.0001-7 becomes a "2", then assign 1=1-3.999 and 2=4.0001-7. Thus, you keep a record of what the "1" and "2" mean.
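The range recode can be sketched in Python (not SPSS) as below, using the same cut points as the example: 1-3.999 becomes 1 and 4.0001-7 becomes 2. The scores and names are invented. Note one consequence of leaving a small gap between the ranges: a score of exactly 4 falls in neither range, so in this sketch it is left uncategorized (None).

```python
def recode_range(value):
    """Return 1 for 1-3.999, 2 for 4.0001-7, and None for anything else."""
    if 1 <= value <= 3.999:
        return 1
    if 4.0001 <= value <= 7:
        return 2
    return None  # falls in the gap between the two ranges, or off the scale


# Hypothetical composite scores on a 1-7 scale, invented for illustration.
happiness = [1.0, 3.5, 4.0, 4.5, 7.0]
happiness_cat = [recode_range(v) for v in happiness]

# Keep a record of what each category number means (the "value labels").
value_labels = {1: "1-3.999", 2: "4.0001-7"}

print(happiness_cat)
```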

[...] point scale, the median is a 2 or a 6. In this situation half of the scores are bunched into a small range (e.g., 2 points in this example) whereas the other half are more evenly distributed across a larger range (e.g., 5 points in this example). Once again, you are losing valuable information by dichotomizing in this way. In summary, there are theoretical and statistical considerations when dichotomizing variables. One solution is to dichotomize in both ways and analyze the data using both variables.

The same theoretical and practical considerations come into play when you are deciding to split the variable in other ways. You may decide, for example, to cut the continuous variable in thirds, or fourths, or fifths. Sometimes when you cut the variable into thirds, your new categorical variable only includes the top and bottom third. Sometimes you are only interested in the more polarized decisions. Sometimes you can

strengthen the relationship between your variables by only including the polarized judgments. From a theoretical point of view it can make sense to drop the middle third because they are the subjects who are somewhat undecided about the construct. Plus, think about why dichotomizing continuous variables results in reduced information and reduced statistical power. Subjects in the continuous variable who are near the middle are now the same as subjects near the top/bottom after you dichotomize the variable. In a 100 point scale, for example, the subjects who respond 49 and 51 are treated the same as the subjects who respond 0 and 100, respectively. Thus, you are reducing your ability to detect true relationships in the study because the subjects close to the middle may be masking relationships amongst your variables by diluting the strength of the high/low categories in the variable. Eliminating the middle third when you cut the continuous variable in thirds is one way to create a categorical variable while minimizing your loss of power.

From a practical point of view, if you are dichotomizing a variable, you don't truly cut it in half because if you cut a 1-7 point scale from 1-4 and 4-7, for example, a subject who answered "4" is technically in both categories. Thus, you typically create a small degree of separation, such as 1-3.999 and 4.001-7.

When splitting a continuous variable into two groups, another question is whether you want to have equal N size for just that variable, or have equal N across that variable AND another variable. For example, I conducted a study about how republicans and democrats identify with their political party. Let's say I want to dichotomize my measure of "identification". The question is whether I want to have equal N size for just the identification variable, or have equal N across both identification and the republican v. democrat variable. For example, if you split the identification variable down the middle, you might have many more republicans in the low or high identification condition, and vice versa for democrats. On the other hand, you could split the identification variable separately for republicans and then again for democrats, and then combine them together, so that way you have equal N across both variables. I believe both are defensible options to choose. My opinion is that the first option is the best (grand median or midpoint) because then the high and low groups will have equivalent psychological meaning across party affiliation. In other words, "high" and "low" mean the same thing for both republicans and democrats even if cell size is unequal.

8. Creating new variables based upon the combination of two or more variables

Sometimes you want to create a new variable that is a combination of two or more other variables. For example, I conducted a study about how republicans and democrats identify with their political party. For each subject, I asked what their political party affiliation is and how much they identify with that political party. Let's say I want a new variable of only highly identified republicans but lowly identified democrats. In this case I want to create a new variable that is a combination of my two questions.

[...] highly identified republicans. Then, we repeat the process by using the "Compute variable" command to assign a "2" if highly identified democrats.

To transform the variables:

1. Select Transform --> Compute Variable
2. Type a new name for your new variable in the "Target Variable" box.
3. In the "Numeric Expression" box, type the number of a category
   (e.g., let's start by assigning category "1")
4. Click the "If" button, and click "Include if case satisfies condition".
5. Move the old variable into the open box, and specify the restriction.
   (e.g., if identification was the "identify" variable, and political party affiliation was the "party" variable, then I need to specify only those subjects who are highly identified (e.g., greater than 4 on the "identify" variable) and who are simultaneously republicans (e.g., republicans are labeled "1" on the "party" variable). So, I would type the following into the box -- identify>4 & party=1.)
6. Click Continue
7. Click OK.

In summary, the "Numeric Expression" box is the number we want to assign in the new category (1 or 2). The criteria for who is assigned that number are specified in the "If" box. And the name of the new categorical variable is typed in the "Target Variable" box.

The new variable will appear at the end of the data file.
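The logic of combining two variables into one new category variable can be sketched in Python (not SPSS) as follows. The condition for category 1 mirrors the example above (identify > 4 and party = 1). The code "2" for democrats on the "party" variable is an assumption, since the handout only states that republicans are labeled "1". The function name and data are invented, and anyone who meets neither condition is left without a category (None).

```python
def party_identity_group(identify, party):
    """Assign a category from two variables; None if no condition is met.

    Assumes party == 1 means republican (from the handout) and
    party == 2 means democrat (an assumption made for this sketch).
    """
    if identify > 4 and party == 1:
        return 1  # highly identified republican
    if identify > 4 and party == 2:
        return 2  # highly identified democrat
    return None


# Hypothetical (identify, party) pairs, invented for illustration.
subjects = [(6, 1), (5, 2), (3, 1), (4, 2)]
new_variable = [party_identity_group(i, p) for i, p in subjects]

print(new_variable)
```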