Upload
trankhuong
View
213
Download
0
Embed Size (px)
Citation preview
Stat 13, Intro. to Statistical Methods for the Life and Health Sciences.
1. Collect hw2.2. Confounding and lefties example. 3. Comparing two proportions using numerical and visual summaries,
good or bad year example. 4. Comparing 2 proportions with CIs + testing using simulation, dolphin example.5. Comparing 2 props. with theory-based testing, smoking and gender example. Read ch5. http://www.stat.ucla.edu/~frederic/13/F16 .Bring a PENCIL and CALCULATOR and any books or notes you want to the midterm and final.
1
1.Collecthw2.2.Confoundingandleftiesexample.
• Exampleofleft-handednessandageatdeath.PsychologistsDianeHalpernandStanleyCoren lookedat1,000deathrecordsofthosewhodiedinSouthernCaliforniainthelate1980s andearly1990sandcontactedrelativestoseeifthedeceasedwererighthanded orlefthanded.Theyfoundthattheaverageagesatdeathofthelefthanded was66,andfortherighthanded itwas75.Theirresultswerepublishedinprestigiousscientificjournals,NatureandtheNewEnglandJournalofMedicine.
1.Collecthw2.2.Confoundingandlefties.
Allsortsofcausalconclusionsweremadeabouthowthisshowsthatthestressofbeinglefthanded inourrighthanded worldleadstoprematuredeath.
• Aconfoundingfactoristheageofthetwopopulationsingeneral.Leftiesinthe1980swereonaverageyoungerthanrighties.Manyoldleftieswereconvertedtorightiesatinfancy,intheearly20thcentury,butthispracticehassubsided.Thusinthe1980sand1990s,therewererelativelyfewoldleftiesbutmanyyoungleftiesintheoverallpopulation.Thisaloneexplainsthediscrepancy.
Unit2:ComparingTwoGroups
• InUnit1,welearnedthebasicprocessofstatisticalinferenceusingtestsandconfidenceintervals.Wedidallthisbyfocusingonasingleproportion.
• InUnit2,wewilltaketheseideasandextendthemtocomparingtwogroups.Wewillcomparetwoproportions,twoindependentmeans,andpaireddata.
Chapter5:ComparingTwoProportions
5.1:ComparingTwoGroups:CategoricalResponse5.2:ComparingTwoProportions:Simulation-BasedApproach5.3:ComparingTwoProportions:Theory-BasedApproach
Example5.1:PositiveandNegativePerceptions
• Considerthesetwoquestions:– Areyouhavingagoodyear?– Areyouhavingabadyear?
• Dopeopleanswereachquestioninsuchawaythatwouldindicatethesameanswer?(e.g.YesforthefirstoneandNoforthesecond.)
PositiveandNegativePerceptions
• Researchersquestioned30students(randomlygivingthemoneofthetwoquestions).
• Theythenrecordedifapositiveornegativeresponsewasgiven.
• Theywantedtoseeifthewordingofthequestioninfluencedtheanswers.
Positiveandnegativeperceptions
• Observationalunits– The30students
• Variables– Questionwording(goodyearorbadyear)– Perceptionoftheiryear(positiveornegative)
• Whichistheexplanatoryvariableandwhichistheresponse variable?
• Isthisanobservationalstudyorexperiment?
Individual TypeofQuestion
Response Individual TypeofQuestion
Response
1 GoodYear Positive 16 GoodYear Positive2 GoodYear Negative 17 BadYear Positive3 BadYear Positive 18 GoodYear Positive4 GoodYear Positive 19 GoodYear Positive5 GoodYear Negative 20 GoodYear Positive6 BadYear Positive 21 BadYear Negative7 GoodYear Positive 22 GoodYear Positive8 GoodYear Positive 23 BadYear Negative9 GoodYear Positive 24 GoodYear Positive10 BadYear Negative 25 BadYear Negative11 GoodYear Negative 26 GoodYear Positive12 BadYear Negative 27 BadYear Negative13 GoodYear Positive 28 GoodYear Positive14 BadYear Negative 29 BadYear Positive15 GoodYear Positive 30 BadYear Negative
RawDatainaSpreadsheet
Two-WayTables
• Atwo-waytableorganizesdata– Summarizestwo categoricalvariables– Alsocalledcontingencytable
• Arestudentsmorelikelytogiveapositiveresponseiftheyweregiventhegoodyearquestion?
GoodYear BadYear TotalPositiveresponse 15 4 19Negativeresponse 3 8 11Total 18 12 30
Two-WayTables
• Conditionalproportionswillhelpusbetterdetermineifthereisanassociationbetweenthequestionaskedandthetypeofresponse.
• Wecanseethatthesubjectswiththepositivequestionweremorelikely torespondpositively.
GoodYear BadYear TotalPositiveresponse 15/18 ≈0.83 4/12≈0.33 19Negativeresponse 3 8 11Total 18 12 30
SegmentedBarGraphs
• Wecanalsousesegmentedbargraphstoseethisassociation betweenthe"goodyear"questionandapositiveresponse.
Statistic
GoodYear BadYear TotalPositiveresponse 15(83%) 4(33%) 19Negativeresponse 3 8 11Total 18 12 30
� Thestatisticwewillmainlyusetosummarizethistableisthedifferenceinproportionsofpositiveresponsesis0.83− 0.33=0.50.
AnotherStatistic
GoodYear BadYear TotalPositiveresponse 15(83%) 4(33%) 19Negativeresponse 3 8 11Total 18 12 30
� Anotherstatisticthatisoftenused,calledrelativerisk,istheratiooftheproportions:0.83/0.33=2.5.
� Wecansaythatthosewhoweregiventhegoodyearquestionwere2.5timesaslikelytogiveapositiveresponse.
NoAssociation
• Fordatatoshownoassociation,theproportionsofpositiveresponsesshouldbethesameforthosegettingeachquestiontype.
• Sincetheoverallpositiveresponsewas19/30(63%),ifthereisnoassociationweshouldhave63%ofthe18thatgotthegoodyearquestionwithapositiveresponse(11.4)and63%ofthe12thatgotthebadyearquestionshouldgiveapositiveresponse(7.6).Thefollowingtableistheclosestpossible.
Good Year Bad Year TotalPositiveresponse 11(61%) 8(67%) 19Negativeresponse 7 4 11Total 18 12 30
SwimmingwithDolphinsIsswimmingwithdolphinstherapeuticforpatientssufferingfromclinicaldepression?
• ResearchersAntonioli andReveley (2005),inBritishMedicalJournal,recruited30subjectsaged18-65withaclinicaldiagnosisofmildtomoderatedepression
• Discontinuedantidepressantsandpsychotherapy4weekspriortoandthroughouttheexperiment
• 30subjectswenttoanislandnearHonduraswheretheywererandomlyassignedtotwotreatmentgroups
SwimmingwithDolphins• Bothgroupsengagedinonehourofswimmingandsnorkeling
eachday• Onegroupswaminthepresenceofdolphinsandtheother
groupdidnot• Participantsinbothgroupshadidenticalconditionsexceptfor
thedolphins• Aftertwoweeks,eachsubjects’levelofdepressionwas
evaluated,asithadbeenatthebeginningofthestudy• Theresponsevariableiswhetherornotthesubjectachieved
substantialreductionindepression
SwimmingwithDolphins
Nullhypothesis:Dolphinsdonothelp.– Swimmingwithdolphinsisnotassociatedwithsubstantialimprovementindepression
Alternativehypothesis:Dolphinshelp.– Swimmingwithdolphinsincreases theprobabilityofsubstantialimprovementindepressionsymptoms
SwimmingwithDolphins• Theparameteristhe(long-run)differencebetweenthe
probabilityofimprovingwhenreceivingdolphintherapyandtheprob.ofimprovingwiththecontrol(𝜋dolphins - 𝜋control)
• Sowecanwriteourhypothesesas:H0:𝜋dolphins - 𝜋control=0.Ha:𝜋dolphins- 𝜋control>0.or
H0: 𝜋dolphins= 𝜋control
Ha:𝜋dolphins> 𝜋control
(Note:wearenotsayingourparametersequalanycertainnumber.)
SwimmingwithDolphins
Results:
Dolphingroup
Controlgroup
Total
Improved 10(66.7%) 3(20%) 13
DidNot Improve 5 12 17Total 15 15 30
Thedifferenceinproportionsofimproversis:𝒑$𝒅 − 𝒑$𝒄 =0.667– 0.20=0.467.
SwimmingwithDolphins
• Therearetwopossibleexplanationsforanobserveddifferenceof0.467.– Atendencytobemorelikelytoimprovewithdolphins(alternativehypothesis)
– The13subjectsweregoingtoshowimprovementwithorwithoutdolphinsandrandomchanceassignedmoreimproverstothedolphins(nullhypothesis)
SwimmingwithDolphins
• Ifthenullhypothesisistrue(noassociationbetweendolphintherapyandimprovement)wewouldhave13improversand17non-improversregardlessofthegrouptowhichtheywereassigned.
• Hencetheassignmentdoesn’tmatterandwecanjustrandomlyassignthesubjects’resultstothetwogroupstoseewhatwouldhappenunderatruenullhypothesis.
SwimmingwithDolphins• Wecansimulatethiswithcards– 13cardstorepresenttheimprovers– 17cardsrepresentthenon-improvers
• Shufflethecards– put15inonepile(dolphintherapy)– put15inanother(controlgroup)
SwimmingwithDolphins• ComputetheproportionofimproversintheDolphinTherapygroup
• ComputetheproportionofimproversintheControlgroup
• Thedifferenceinthesetwoproportionsiswhatcouldjustaswellhavehappenedundertheassumptionthereisnoassociationbetweenswimmingwithdolphinsandsubstantialimprovementindepression.
20.0%Improvers66.7%Improvers
DolphinTherapyControlNon-
improver
Improver
Improver
Improver
Improver
Improver
Improver
Improver
ImproverImprover
Improver
Improver
Improver
ImproverNon-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
40.0%Improvers 46.7%Improvers0.400– 0.467=-0.067
DifferenceinSimulatedProportions
33.3%Improvers53.3%Improvers 46.7%Improvers40.0%Improvers
Non-improver
Improver
Non-improver
Improver
Improver
Non-improver
Improver
Improver
ImproverNon-improver
Non-improver
Non-improver
Non-improver
ImproverNon-improver
Non-improver
Improver
Improver
Non-improver
Non-improver
Non-improver
Improver
ImproverImprover
Improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
0.533– 0.333=0.200
DifferenceinSimulatedProportions
DolphinTherapy Control
53.3%Improvers 33.3%Improvers40.0%Improvers46.7%Improvers
Non-improver
Improver
Non-improver
Improver
Improver
Non-improver
Improver
Improver
ImproverNon-improver
Non-improver
Non-improver
Non-improver
ImproverNon-improver
Non-improver
Improver
Improver
Non-improver
Non-improver
Non-improver
Improver
ImproverImprover
Improver
Non-improver
Non-improver
Non-improver
Non-improver
Non-improver
0.467– 0.400=0.067
DifferenceinSimulatedProportions
DolphinTherapyControl
MoreSimulations-0.067-0.333 -0.200
0.067 0.2000.333
0.467-0.200
-0.200
-0.200-0.067 -0.067
-0.067
-0.067 -0.067-0.067
0.067
0.067
0.067
0.067
0.067
0.0670.200
0.200
0.200
0.3330.333Onlyonesimulatedstatisticsoutof30wasas
largeorlargerthanourobserveddifferenceinproportionsof0.467,henceourp-valueforthisnulldistributionis1/30≈0.03.
DifferenceinSimulatedProportions
SwimmingwithDolphins
• 13outof1000resultshadadifferenceof0.467orhigher(p-value=0.013).
• 0.467is(.*+,-((../0
≈ 2.52 SDabovezero.� Usingeitherthep-valueorstandardizedstatistic,wehavestrongevidenceagainstthenullandcanconcludethattheimprovementduetoswimmingwithdolphinswasstatisticallysignificant.
SwimmingwithDolphins
� A95%confidenceintervalforthedifferenceintheprobabilityusingthestandarddeviationfromthenulldistributionis0.467+ 2(0.185)=0.467+ 0.370or(0.097to0.837)
• Weare95%confidentthatwhenallowedtoswimwithdolphins,theprobabilityofimprovingisbetween0.097and0.837higherthanwhennodolphinsarepresent.
• Howdoesthisintervalbackupourconclusionfromthetestofsignificance?
SwimmingwithDolphins
• Canwesaythatthepresenceofdolphinscaused thisimprovement?– Sincethiswasarandomizedexperiment,andassumingeverythingwasidenticalbetweenthegroups,wehavestrongevidencethatdolphinswerethecause
• Canwegeneralizetoalargerpopulation?– Maybemildtomoderatelydepressed18-65yearoldpatientswillingtovolunteerforthisstudy
– Wehavenoevidencethatrandomselectionwasusedtofindthe30subjects."Outpatients,recruitedthroughannouncementsontheinternet,radio,newspapers,andhospitals."
Introduction
• Justaswithasingleproportion,wecanoftenpredictresultsofasimulationusingatheory-basedapproach.
• Thetheory-basedapproachalsogivesasimplerwaytogenerateaconfidenceintervals.
• ThemainnewmathematicalfacttouseistheSEforthedifferencebetweentwoproportionsis
�̂�(1 − �̂�) .9:+ .
9<
� .
SmokingandGender
• Howdoesparents’behavioraffectthegenderoftheirchildren?
• Fukudaetal.(2002)foundthefollowinginJapan.– Outof565birthswherebothparentssmokedmorethan
apackaday,255wereboys.Thisis45.1%boys.– Outof3602birthswherebothparentsdidnotsmoke,
1975wereboys.This54.8%boys.– Intotal,outof4170births,2230wereboys,whichis
53.5%.• Otherstudieshaveshownareducedmaletofemale
birthratiowherehighconcentrationsofotherenvironmentalchemicalsarepresent(e.g.industrialpollution,pesticides)
SmokingandGender• A segmentedbargraphand2-waytable• Let’scomparetheproportionstoseeifthedifferenceis
statisticallysignificantly.
BothSmoked Neither Smoked
Boy 255(45.1%) 1,975(54.8%)
Girl 310 1,627
Total 565 3,602
SmokingandGender
NullHypothesis:• Thereisnoassociationbetween
smokingstatusofparentsandsexofchild.
• Theprobabilityofhavingaboyisthesameforparentswhosmokeanddon’tsmoke.
• 𝜋smoking - 𝜋nonsmoking =0
SmokingandGender
AlternativeHypothesis:• Thereisanassociationbetweensmokingstatusofparentsandsexofchild.
• Theprobabilityofhavingaboyisnotthesameforparentswhosmokeanddon’tsmoke
• 𝜋smoking - 𝜋nonsmoking ≠0
SmokingandGender
• Whataretheobservationalunitsinthestudy?• Whatarethevariablesinthisstudy?• Whichvariableshouldbeconsideredtheexplanatoryvariableandwhichtheresponsevariable?
• Whatistheparameterofinterest?• Canyoudrawcause-and-effectconclusionsforthisstudy?
SmokingandGender
Usingthe3SStrategytoassesthestrength1.Statistic:• Theproportionofboysborntononsmokersminustheproportionofboysborntosmokersis0.548– 0.451=0.097.
SmokingandGender
2.Simulate:• UsetheTwoProportionsapplettosimulate• Manyrepetitionsofshufflingthe2230boysand1937girlstothe565smokingand3602nonsmokingparents
• Calculatethedifferenceinproportionsofboysbetweenthegroupsforeachrepetition.
• Shufflingsimulatesthenullhypothesisofnoassociation
SmokingandGender3.Strengthofevidence:• Nothingasextremeas
ourobservedstatistic(≥0.097or≤−0.097)occurredin5000repetitions,
• HowmanySDsis0.097abovethemean?Z=0.097/0.023=4.22usingsimulations.Whataboutusingthetheory-basedapproach?
SmokingandGender
• Noticethenulldistributioniscenteredatzeroandisbell-shaped.
• Thiscanbeapproximatedbythenormaldistribution.
Formulas
• Thetheory-basedapproachyieldsz=4.30.
𝑧 =�̂�. − �̂�@
�̂�(1 − �̂�) 1𝑛.+ 1𝑛@
�
• Here𝑧 = .0*/-.*0.
.0B0(.-.0B0) :CDE<F
:GDG
�=4.30.
• p-valueis2*(1-pnorm(4.30))=0.00171%.•
SmokingandGender
• Fukudaetal.(2002)foundthefollowinginJapan.– Outof3602birthswherebothparentsdidnotsmoke,
1975wereboys.This54.8%boys.– Outof565birthswherebothparentssmokedmorethan
apackaday,255wereboys.Thisis45.1%boys.– Intotal,outof4170births,2230wereboys,whichis
53.5%boys.
Formulas• Howdowefindthemarginoferrorforthedifferencein
proportions?
𝑀𝑢𝑙𝑡𝑖𝑝𝑙𝑖𝑒𝑟 ⨯�̂�.(1 − �̂�.)
𝑛.+�̂�@(1 − �̂�@)
𝑛@
�
• Themultiplierisdependentupontheconfidencelevel.– 1.645for90%confidence– 1.96for95%confidence– 2.576for99%confidence
• Wecanwritetheconfidenceintervalintheform:– statistic± marginoferror.
SmokingandGender• Ourstatisticistheobservedsampledifferenceinproportions,
0.097.
• Pluggingin1.96 ⨯ RS:(.-RS:)9:
+ RS<(.-RS<)9<
� =0.044,
weget0.097± 0.044asour95%CI.• Wecouldalsowritethisintervalas(0.053,0.141).• Weare95%confidentthattheprobabilityofaboybaby
whereneitherfamilysmokesminustheprobabilityofaboybabywherebothparentssmokeisbetween0.053and0.141.