95
Robust confidence intervals for Hodges-Lehmann median differences A simulation study Roger B. Newson [email protected] http://www.imperial.ac.uk/nhli/r.newson/ National Heart and Lung Institute Imperial College London 13th UK Stata Users’ Group Meeting, 10–11 September, 2007 Downloadable from the conference website at http://ideas.repec.org/s/boc/usug07.html Robust confidence intervals for Hodges-Lehmann median differences Frame 1 of 29

Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

  • Upload
    others

  • View
    24

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Robust confidence intervals for Hodges-Lehmannmedian differences

A simulation study

Roger B. [email protected]

http://www.imperial.ac.uk/nhli/r.newson/

National Heart and Lung InstituteImperial College London

13th UK Stata Users’ Group Meeting, 10–11 September, 2007Downloadable from the conference website at

http://ideas.repec.org/s/boc/usug07.html

Robust confidence intervals for Hodges-Lehmann median differences Frame 1 of 29

Page 2: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

What is a Hodges–Lehmann median difference?

I A Theil–Sen median slope of Y with respect to X is a solutionin β to the equation D(Y − βX|X) = 0, where D(·|·) denotes therank association measure Somers’ D.

I In other words, a median slope is a linear effect of X on Y , largeenough to explain the observed association.

I If X is binary with values 0 and 1, then the Theil–Sen medianslope is the Hodges–Lehmann median difference between thesubpopulations in which X = 1 and X = 0.

I In other words, the Hodges–Lehmann median difference is themedian pairwise difference between two Y–values, sampled atrandom from the two subpopulations.

I Note that the median difference is not always the differencebetween the two subpopulation medians!

Robust confidence intervals for Hodges-Lehmann median differences Frame 2 of 29

Page 3: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

What is a Hodges–Lehmann median difference?

I A Theil–Sen median slope of Y with respect to X is a solutionin β to the equation D(Y − βX|X) = 0, where D(·|·) denotes therank association measure Somers’ D.

I In other words, a median slope is a linear effect of X on Y , largeenough to explain the observed association.

I If X is binary with values 0 and 1, then the Theil–Sen medianslope is the Hodges–Lehmann median difference between thesubpopulations in which X = 1 and X = 0.

I In other words, the Hodges–Lehmann median difference is themedian pairwise difference between two Y–values, sampled atrandom from the two subpopulations.

I Note that the median difference is not always the differencebetween the two subpopulation medians!

Robust confidence intervals for Hodges-Lehmann median differences Frame 2 of 29

Page 4: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

What is a Hodges–Lehmann median difference?

I A Theil–Sen median slope of Y with respect to X is a solutionin β to the equation D(Y − βX|X) = 0, where D(·|·) denotes therank association measure Somers’ D.

I In other words, a median slope is a linear effect of X on Y , largeenough to explain the observed association.

I If X is binary with values 0 and 1, then the Theil–Sen medianslope is the Hodges–Lehmann median difference between thesubpopulations in which X = 1 and X = 0.

I In other words, the Hodges–Lehmann median difference is themedian pairwise difference between two Y–values, sampled atrandom from the two subpopulations.

I Note that the median difference is not always the differencebetween the two subpopulation medians!

Robust confidence intervals for Hodges-Lehmann median differences Frame 2 of 29

Page 5: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

What is a Hodges–Lehmann median difference?

I A Theil–Sen median slope of Y with respect to X is a solutionin β to the equation D(Y − βX|X) = 0, where D(·|·) denotes therank association measure Somers’ D.

I In other words, a median slope is a linear effect of X on Y , largeenough to explain the observed association.

I If X is binary with values 0 and 1, then the Theil–Sen medianslope is the Hodges–Lehmann median difference between thesubpopulations in which X = 1 and X = 0.

I In other words, the Hodges–Lehmann median difference is themedian pairwise difference between two Y–values, sampled atrandom from the two subpopulations.

I Note that the median difference is not always the differencebetween the two subpopulation medians!

Robust confidence intervals for Hodges-Lehmann median differences Frame 2 of 29

Page 6: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

What is a Hodges–Lehmann median difference?

I A Theil–Sen median slope of Y with respect to X is a solutionin β to the equation D(Y − βX|X) = 0, where D(·|·) denotes therank association measure Somers’ D.

I In other words, a median slope is a linear effect of X on Y , largeenough to explain the observed association.

I If X is binary with values 0 and 1, then the Theil–Sen medianslope is the Hodges–Lehmann median difference between thesubpopulations in which X = 1 and X = 0.

I In other words, the Hodges–Lehmann median difference is themedian pairwise difference between two Y–values, sampled atrandom from the two subpopulations.

I Note that the median difference is not always the differencebetween the two subpopulation medians!

Robust confidence intervals for Hodges-Lehmann median differences Frame 2 of 29

Page 7: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

What is a Hodges–Lehmann median difference?

I A Theil–Sen median slope of Y with respect to X is a solutionin β to the equation D(Y − βX|X) = 0, where D(·|·) denotes therank association measure Somers’ D.

I In other words, a median slope is a linear effect of X on Y , largeenough to explain the observed association.

I If X is binary with values 0 and 1, then the Theil–Sen medianslope is the Hodges–Lehmann median difference between thesubpopulations in which X = 1 and X = 0.

I In other words, the Hodges–Lehmann median difference is themedian pairwise difference between two Y–values, sampled atrandom from the two subpopulations.

I Note that the median difference is not always the differencebetween the two subpopulation medians!

Robust confidence intervals for Hodges-Lehmann median differences Frame 2 of 29

Page 8: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The Lehmann confidence interval formula

I The conventional confidence interval formula for the mediandifference (Lehmann, 1963)[1] was implemented in Stata byWang (1999)[4].

I It assumes that the two subpopulation distributions are differentonly in location.

I This assumption implies that the median difference is thedifference between the two medians.

I However, it also implies that the subpopulations are equallyvariable.

I The Lehmann formula is therefore robust to non–Normality atthe price of being non–robust to unequal variability. (Whichoften causes even more problems.)

Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29

Page 9: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The Lehmann confidence interval formula

I The conventional confidence interval formula for the mediandifference (Lehmann, 1963)[1] was implemented in Stata byWang (1999)[4].

I It assumes that the two subpopulation distributions are differentonly in location.

I This assumption implies that the median difference is thedifference between the two medians.

I However, it also implies that the subpopulations are equallyvariable.

I The Lehmann formula is therefore robust to non–Normality atthe price of being non–robust to unequal variability. (Whichoften causes even more problems.)

Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29

Page 10: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The Lehmann confidence interval formula

I The conventional confidence interval formula for the mediandifference (Lehmann, 1963)[1] was implemented in Stata byWang (1999)[4].

I It assumes that the two subpopulation distributions are differentonly in location.

I This assumption implies that the median difference is thedifference between the two medians.

I However, it also implies that the subpopulations are equallyvariable.

I The Lehmann formula is therefore robust to non–Normality atthe price of being non–robust to unequal variability. (Whichoften causes even more problems.)

Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29

Page 11: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The Lehmann confidence interval formula

I The conventional confidence interval formula for the mediandifference (Lehmann, 1963)[1] was implemented in Stata byWang (1999)[4].

I It assumes that the two subpopulation distributions are differentonly in location.

I This assumption implies that the median difference is thedifference between the two medians.

I However, it also implies that the subpopulations are equallyvariable.

I The Lehmann formula is therefore robust to non–Normality atthe price of being non–robust to unequal variability. (Whichoften causes even more problems.)

Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29

Page 12: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The Lehmann confidence interval formula

I The conventional confidence interval formula for the mediandifference (Lehmann, 1963)[1] was implemented in Stata byWang (1999)[4].

I It assumes that the two subpopulation distributions are differentonly in location.

I This assumption implies that the median difference is thedifference between the two medians.

I However, it also implies that the subpopulations are equallyvariable.

I The Lehmann formula is therefore robust to non–Normality atthe price of being non–robust to unequal variability. (Whichoften causes even more problems.)

Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29

Page 13: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The Lehmann confidence interval formula

I The conventional confidence interval formula for the mediandifference (Lehmann, 1963)[1] was implemented in Stata byWang (1999)[4].

I It assumes that the two subpopulation distributions are differentonly in location.

I This assumption implies that the median difference is thedifference between the two medians.

I However, it also implies that the subpopulations are equallyvariable.

I The Lehmann formula is therefore robust to non–Normality atthe price of being non–robust to unequal variability. (Whichoften causes even more problems.)

Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29

Page 14: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The cendif confidence interval formula

I An alternative confidence interval formula for the mediandifference (Newson, 2006)[3] is used by the cendif module ofthe SSC package somersd.

I It is derived by inverting a delta–jackknife confidence intervalformula for Somers’ D.

I It should therefore still work if the two subpopulationdistributions differ in ways other than location.

I In particular, it should still work if the two subpopulations areunequally variable.

I The cendif formula therefore contrasts to the Lehmannformula as the unequal–variance t–test contrasts to theequal–variance t–test.

Robust confidence intervals for Hodges-Lehmann median differences Frame 4 of 29

Page 15: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The cendif confidence interval formula

I An alternative confidence interval formula for the mediandifference (Newson, 2006)[3] is used by the cendif module ofthe SSC package somersd.

I It is derived by inverting a delta–jackknife confidence intervalformula for Somers’ D.

I It should therefore still work if the two subpopulationdistributions differ in ways other than location.

I In particular, it should still work if the two subpopulations areunequally variable.

I The cendif formula therefore contrasts to the Lehmannformula as the unequal–variance t–test contrasts to theequal–variance t–test.

Robust confidence intervals for Hodges-Lehmann median differences Frame 4 of 29

Page 16: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The cendif confidence interval formula

I An alternative confidence interval formula for the mediandifference (Newson, 2006)[3] is used by the cendif module ofthe SSC package somersd.

I It is derived by inverting a delta–jackknife confidence intervalformula for Somers’ D.

I It should therefore still work if the two subpopulationdistributions differ in ways other than location.

I In particular, it should still work if the two subpopulations areunequally variable.

I The cendif formula therefore contrasts to the Lehmannformula as the unequal–variance t–test contrasts to theequal–variance t–test.

Robust confidence intervals for Hodges-Lehmann median differences Frame 4 of 29

Page 17: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The cendif confidence interval formula

I An alternative confidence interval formula for the mediandifference (Newson, 2006)[3] is used by the cendif module ofthe SSC package somersd.

I It is derived by inverting a delta–jackknife confidence intervalformula for Somers’ D.

I It should therefore still work if the two subpopulationdistributions differ in ways other than location.

I In particular, it should still work if the two subpopulations areunequally variable.

I The cendif formula therefore contrasts to the Lehmannformula as the unequal–variance t–test contrasts to theequal–variance t–test.

Robust confidence intervals for Hodges-Lehmann median differences Frame 4 of 29

Page 18: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The cendif confidence interval formula

I An alternative confidence interval formula for the mediandifference (Newson, 2006)[3] is used by the cendif module ofthe SSC package somersd.

I It is derived by inverting a delta–jackknife confidence intervalformula for Somers’ D.

I It should therefore still work if the two subpopulationdistributions differ in ways other than location.

I In particular, it should still work if the two subpopulations areunequally variable.

I The cendif formula therefore contrasts to the Lehmannformula as the unequal–variance t–test contrasts to theequal–variance t–test.

Robust confidence intervals for Hodges-Lehmann median differences Frame 4 of 29

Page 19: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

The cendif confidence interval formula

I An alternative confidence interval formula for the mediandifference (Newson, 2006)[3] is used by the cendif module ofthe SSC package somersd.

I It is derived by inverting a delta–jackknife confidence intervalformula for Somers’ D.

I It should therefore still work if the two subpopulationdistributions differ in ways other than location.

I In particular, it should still work if the two subpopulations areunequally variable.

I The cendif formula therefore contrasts to the Lehmannformula as the unequal–variance t–test contrasts to theequal–variance t–test.

Robust confidence intervals for Hodges-Lehmann median differences Frame 4 of 29

Page 20: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 21: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 22: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 23: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 24: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 25: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 26: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 27: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Comparing the two t–tests: Existing results

I Moser and Stevens (1992)[2] compared the Gosset equal–variance andSatterthwaite unequal–variance t–tests, using numerical integration.

I The Satterthwaite method had the advertized coverage probability.

I The equal–variance t–test produced oversized (undersized) confidenceintervals if the smaller sample is sampled from the less variable (morevariable) subpopulation.

I However, the equal–variance t–test had the advertized coverageprobability, if either the subsample numbers or the subpopulationvariances were equal.

I Under the latter conditions, the equal–variance t–test produced smallerconfidence intervals with the same coverage probability.

I The authors therefore recommended the unequal–variance method asthe “default”, and the equal–variance method for the “special occasion”of unequal sample numbers and prior knowledge of equal variability.

I They advised against the “traditional” practice of testing equality ofvariances before choosing a t–test!

Robust confidence intervals for Hodges-Lehmann median differences Frame 5 of 29

Page 28: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Aims

I A simulation study, modelled on the Moser–Stevens study[2],was designed to test cendif to destruction in a wide range ofscenarios.

I The cendif method was compared with 3 other methods (theLehmann method and the two t–tests) for calculating confidenceintervals for median differences.

I In each scenario, coverage probabilities were estimated, togetherwith median confidence interval width ratios.

I 10000 replicate sample pairs were simulated for each scenario.I In this presentation, we focus on comparing coverage

probabilities between the Lehmann and cendif methods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 6 of 29

Page 29: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Aims

I A simulation study, modelled on the Moser–Stevens study[2],was designed to test cendif to destruction in a wide range ofscenarios.

I The cendif method was compared with 3 other methods (theLehmann method and the two t–tests) for calculating confidenceintervals for median differences.

I In each scenario, coverage probabilities were estimated, togetherwith median confidence interval width ratios.

I 10000 replicate sample pairs were simulated for each scenario.I In this presentation, we focus on comparing coverage

probabilities between the Lehmann and cendif methods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 6 of 29

Page 30: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Aims

I A simulation study, modelled on the Moser–Stevens study[2],was designed to test cendif to destruction in a wide range ofscenarios.

I The cendif method was compared with 3 other methods (theLehmann method and the two t–tests) for calculating confidenceintervals for median differences.

I In each scenario, coverage probabilities were estimated, togetherwith median confidence interval width ratios.

I 10000 replicate sample pairs were simulated for each scenario.I In this presentation, we focus on comparing coverage

probabilities between the Lehmann and cendif methods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 6 of 29

Page 31: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Aims

I A simulation study, modelled on the Moser–Stevens study[2],was designed to test cendif to destruction in a wide range ofscenarios.

I The cendif method was compared with 3 other methods (theLehmann method and the two t–tests) for calculating confidenceintervals for median differences.

I In each scenario, coverage probabilities were estimated, togetherwith median confidence interval width ratios.

I 10000 replicate sample pairs were simulated for each scenario.I In this presentation, we focus on comparing coverage

probabilities between the Lehmann and cendif methods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 6 of 29

Page 32: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Aims

I A simulation study, modelled on the Moser–Stevens study[2],was designed to test cendif to destruction in a wide range ofscenarios.

I The cendif method was compared with 3 other methods (theLehmann method and the two t–tests) for calculating confidenceintervals for median differences.

I In each scenario, coverage probabilities were estimated, togetherwith median confidence interval width ratios.

I 10000 replicate sample pairs were simulated for each scenario.I In this presentation, we focus on comparing coverage

probabilities between the Lehmann and cendif methods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 6 of 29

Page 33: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Aims

I A simulation study, modelled on the Moser–Stevens study[2],was designed to test cendif to destruction in a wide range ofscenarios.

I The cendif method was compared with 3 other methods (theLehmann method and the two t–tests) for calculating confidenceintervals for median differences.

I In each scenario, coverage probabilities were estimated, togetherwith median confidence interval width ratios.

I 10000 replicate sample pairs were simulated for each scenario.I In this presentation, we focus on comparing coverage

probabilities between the Lehmann and cendif methods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 6 of 29

Page 34: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Scenarios

I Pairs of subpopulation distributions were selected from 2families: the “t–test friendly” Normal family and theoutlier–prone, “t–test unfriendly” Cauchy family.

I Both families are symmetric, and parameterized by a median µ(set to zero) and a scale parameter σ (measuring variability).

I Subsample numbers were all 10 possible pairs N1 ≤ N2 from theset {5, 10, 20, 40}.

I Variability scale ratios σ1/σ2 between the populations of thesmaller and larger samples were from the symmetrical set of 9values {1/4, 1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4}.

I These 180 scenarios (90 for each distributional family) werechosen to include “best” and “worst” cases for all 4 statisticalmethods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 7 of 29

Page 35: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Scenarios

I Pairs of subpopulation distributions were selected from 2families: the “t–test friendly” Normal family and theoutlier–prone, “t–test unfriendly” Cauchy family.

I Both families are symmetric, and parameterized by a median µ(set to zero) and a scale parameter σ (measuring variability).

I Subsample numbers were all 10 possible pairs N1 ≤ N2 from theset {5, 10, 20, 40}.

I Variability scale ratios σ1/σ2 between the populations of thesmaller and larger samples were from the symmetrical set of 9values {1/4, 1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4}.

I These 180 scenarios (90 for each distributional family) werechosen to include “best” and “worst” cases for all 4 statisticalmethods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 7 of 29

Page 36: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Scenarios

I Pairs of subpopulation distributions were selected from 2families: the “t–test friendly” Normal family and theoutlier–prone, “t–test unfriendly” Cauchy family.

I Both families are symmetric, and parameterized by a median µ(set to zero) and a scale parameter σ (measuring variability).

I Subsample numbers were all 10 possible pairs N1 ≤ N2 from theset {5, 10, 20, 40}.

I Variability scale ratios σ1/σ2 between the populations of thesmaller and larger samples were from the symmetrical set of 9values {1/4, 1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4}.

I These 180 scenarios (90 for each distributional family) werechosen to include “best” and “worst” cases for all 4 statisticalmethods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 7 of 29

Page 37: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Scenarios

I Pairs of subpopulation distributions were selected from 2families: the “t–test friendly” Normal family and theoutlier–prone, “t–test unfriendly” Cauchy family.

I Both families are symmetric, and parameterized by a median µ(set to zero) and a scale parameter σ (measuring variability).

I Subsample numbers were all 10 possible pairs N1 ≤ N2 from theset {5, 10, 20, 40}.

I Variability scale ratios σ1/σ2 between the populations of thesmaller and larger samples were from the symmetrical set of 9values {1/4, 1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4}.

I These 180 scenarios (90 for each distributional family) werechosen to include “best” and “worst” cases for all 4 statisticalmethods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 7 of 29

Page 38: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Scenarios

I Pairs of subpopulation distributions were selected from 2families: the “t–test friendly” Normal family and theoutlier–prone, “t–test unfriendly” Cauchy family.

I Both families are symmetric, and parameterized by a median µ(set to zero) and a scale parameter σ (measuring variability).

I Subsample numbers were all 10 possible pairs N1 ≤ N2 from theset {5, 10, 20, 40}.

I Variability scale ratios σ1/σ2 between the populations of thesmaller and larger samples were from the symmetrical set of 9values {1/4, 1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4}.

I These 180 scenarios (90 for each distributional family) werechosen to include “best” and “worst” cases for all 4 statisticalmethods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 7 of 29

Page 39: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Simulation study: Scenarios

I Pairs of subpopulation distributions were selected from 2families: the “t–test friendly” Normal family and theoutlier–prone, “t–test unfriendly” Cauchy family.

I Both families are symmetric, and parameterized by a median µ(set to zero) and a scale parameter σ (measuring variability).

I Subsample numbers were all 10 possible pairs N1 ≤ N2 from theset {5, 10, 20, 40}.

I Variability scale ratios σ1/σ2 between the populations of thesmaller and larger samples were from the symmetrical set of 9values {1/4, 1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4}.

I These 180 scenarios (90 for each distributional family) werechosen to include “best” and “worst” cases for all 4 statisticalmethods.

Robust confidence intervals for Hodges-Lehmann median differences Frame 7 of 29

Page 40: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Normal coverage probabilities for the Gosset and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Gosset coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Normal distribution

Graphs by First sample number and Second sample number

The equal–variance t–test produces oversized (undersized) confidenceintervals if the smaller sample is from the less (more) variablepopulation.

Robust confidence intervals for Hodges-Lehmann median differences Frame 8 of 29

Page 41: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Normal coverage probabilities for the Lehmann and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Normal distribution

Graphs by First sample number and Second sample number

Under most (but not all) scenarios, the cendif coverage probabilityis closer to the advertized value of 0.95.

Robust confidence intervals for Hodges-Lehmann median differences Frame 9 of 29

Page 42: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Cauchy coverage probabilities for the Lehmann and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Cauchy distribution

Graphs by First sample number and Second sample number

For both rank methods, the Cauchy coverage probabilities are similarto the Normal coverage probabilities. However . . .

Robust confidence intervals for Hodges-Lehmann median differences Frame 10 of 29

Page 43: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Patterns of relative advantage

I . . . the relative advantage between the two rank methods variesbetween scenarios.

I The subsample size pairs N1 ≤ N2 can be classified into 3 “fuzzypatterns”, which blend into each other gradually.

I These 3 patterns can be named “N1 = N2”, “N1 < N2”, and“N1 � N2”.

I We will illustrate this remark by focussing on a “typical”example of each pattern.

Robust confidence intervals for Hodges-Lehmann median differences Frame 11 of 29

Page 44: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Patterns of relative advantage

I . . . the relative advantage between the two rank methods variesbetween scenarios.

I The subsample size pairs N1 ≤ N2 can be classified into 3 “fuzzypatterns”, which blend into each other gradually.

I These 3 patterns can be named “N1 = N2”, “N1 < N2”, and“N1 � N2”.

I We will illustrate this remark by focussing on a “typical”example of each pattern.

Robust confidence intervals for Hodges-Lehmann median differences Frame 11 of 29

Page 45: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Patterns of relative advantage

I . . . the relative advantage between the two rank methods variesbetween scenarios.

I The subsample size pairs N1 ≤ N2 can be classified into 3 “fuzzypatterns”, which blend into each other gradually.

I These 3 patterns can be named “N1 = N2”, “N1 < N2”, and“N1 � N2”.

I We will illustrate this remark by focussing on a “typical”example of each pattern.

Robust confidence intervals for Hodges-Lehmann median differences Frame 11 of 29

Page 46: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Patterns of relative advantage

I . . . the relative advantage between the two rank methods variesbetween scenarios.

I The subsample size pairs N1 ≤ N2 can be classified into 3 “fuzzypatterns”, which blend into each other gradually.

I These 3 patterns can be named “N1 = N2”, “N1 < N2”, and“N1 � N2”.

I We will illustrate this remark by focussing on a “typical”example of each pattern.

Robust confidence intervals for Hodges-Lehmann median differences Frame 11 of 29

Page 47: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 = N2: Both methods are reasonable

I Median differencesbetween 2 Normalsamples of 40 areestimated.

I Both methods havecoverage probabilitiesclose to the advertizedlevel of 0.95.

I However, the Lehmannmethod produces slightlyundersized confidenceintervals under veryunequal variability.

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 12 of 29

Page 48: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 = N2: Both methods are reasonable

I Median differencesbetween 2 Normalsamples of 40 areestimated.

I Both methods havecoverage probabilitiesclose to the advertizedlevel of 0.95.

I However, the Lehmannmethod produces slightlyundersized confidenceintervals under veryunequal variability.

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 12 of 29

Page 49: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 = N2: Both methods are reasonable

I Median differencesbetween 2 Normalsamples of 40 areestimated.

I Both methods havecoverage probabilitiesclose to the advertizedlevel of 0.95.

I However, the Lehmannmethod produces slightlyundersized confidenceintervals under veryunequal variability.

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 12 of 29

Page 50: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 = N2: Both methods are reasonable

I Median differencesbetween 2 Normalsamples of 40 areestimated.

I Both methods havecoverage probabilitiesclose to the advertizedlevel of 0.95.

I However, the Lehmannmethod produces slightlyundersized confidenceintervals under veryunequal variability.

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 12 of 29

Page 51: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 < N2: cendif is robust

I The first sample numberhere is half the second.

I The cendif method hascoverage probabilitiesclose to the advertizedlevel of 0.95 under allvariability ratios.

I The Lehmann methodproduces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation. (Like theequal–variance t–test.)

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

20, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 13 of 29

Page 52: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 < N2: cendif is robust

I The first sample numberhere is half the second.

I The cendif method hascoverage probabilitiesclose to the advertizedlevel of 0.95 under allvariability ratios.

I The Lehmann methodproduces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation. (Like theequal–variance t–test.)

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

20, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 13 of 29

Page 53: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 < N2: cendif is robust

I The first sample numberhere is half the second.

I The cendif method hascoverage probabilitiesclose to the advertizedlevel of 0.95 under allvariability ratios.

I The Lehmann methodproduces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation. (Like theequal–variance t–test.)

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

20, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 13 of 29

Page 54: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 < N2: cendif is robust

I The first sample numberhere is half the second.

I The cendif method hascoverage probabilitiesclose to the advertizedlevel of 0.95 under allvariability ratios.

I The Lehmann methodproduces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation. (Like theequal–variance t–test.)

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

20, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 13 of 29

Page 55: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 � N2: cendif is tested to destruction

I The cendif confidenceinterval is now undersizedunder most variabilityratios.

I The Lehmann methodstill produces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation.

I However, the Lehmanncoverage is at leastcorrect under equalvariability!

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

5, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 14 of 29

Page 56: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 � N2: cendif is tested to destruction

I The cendif confidenceinterval is now undersizedunder most variabilityratios.

I The Lehmann methodstill produces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation.

I However, the Lehmanncoverage is at leastcorrect under equalvariability!

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

5, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 14 of 29

Page 57: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 � N2: cendif is tested to destruction

I The cendif confidenceinterval is now undersizedunder most variabilityratios.

I The Lehmann methodstill produces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation.

I However, the Lehmanncoverage is at leastcorrect under equalvariability!

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

5, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 14 of 29

Page 58: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

N1 � N2: cendif is tested to destruction

I The cendif confidenceinterval is now undersizedunder most variabilityratios.

I The Lehmann methodstill produces oversized(undersized) confidenceintervals if the smallersample is from the less(more) variablepopulation.

I However, the Lehmanncoverage is at leastcorrect under equalvariability!

.25

.3333

.5

.6667

1

1.5

2

3

4.775

.8 .825

.85.85

.875

.9 .925

.95.95

.975

1

5, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability (95% CI) under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 14 of 29

Page 59: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Summary of results

I If N1 = N2, then both methods (especially cendif) producecoverage probabilities close to the advertized level.

I If N1 < N2 (and N1 is not too small), then the Lehmann methodproduces oversized (undersized) confidence intervals if thesmaller sample is from the less (more) variable population, andthe cendif method is more robust.

I However, if N1 � N2 (and N1 is very small), then the cendifmethod produces undersized confidence intervals, and theLehmann method is more correct under equal variability.

I Therefore, cendif is robust to unequal variability, at the priceof being less robust to the possibility that the smaller sample (butnot the larger one) is very small.

Robust confidence intervals for Hodges-Lehmann median differences Frame 15 of 29

Page 60: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Summary of results

I If N1 = N2, then both methods (especially cendif) producecoverage probabilities close to the advertized level.

I If N1 < N2 (and N1 is not too small), then the Lehmann methodproduces oversized (undersized) confidence intervals if thesmaller sample is from the less (more) variable population, andthe cendif method is more robust.

I However, if N1 � N2 (and N1 is very small), then the cendifmethod produces undersized confidence intervals, and theLehmann method is more correct under equal variability.

I Therefore, cendif is robust to unequal variability, at the priceof being less robust to the possibility that the smaller sample (butnot the larger one) is very small.

Robust confidence intervals for Hodges-Lehmann median differences Frame 15 of 29

Page 61: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Summary of results

I If N1 = N2, then both methods (especially cendif) producecoverage probabilities close to the advertized level.

I If N1 < N2 (and N1 is not too small), then the Lehmann methodproduces oversized (undersized) confidence intervals if thesmaller sample is from the less (more) variable population, andthe cendif method is more robust.

I However, if N1 � N2 (and N1 is very small), then the cendifmethod produces undersized confidence intervals, and theLehmann method is more correct under equal variability.

I Therefore, cendif is robust to unequal variability, at the priceof being less robust to the possibility that the smaller sample (butnot the larger one) is very small.

Robust confidence intervals for Hodges-Lehmann median differences Frame 15 of 29

Page 62: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Summary of results

I If N1 = N2, then both methods (especially cendif) producecoverage probabilities close to the advertized level.

I If N1 < N2 (and N1 is not too small), then the Lehmann methodproduces oversized (undersized) confidence intervals if thesmaller sample is from the less (more) variable population, andthe cendif method is more robust.

I However, if N1 � N2 (and N1 is very small), then the cendifmethod produces undersized confidence intervals, and theLehmann method is more correct under equal variability.

I Therefore, cendif is robust to unequal variability, at the priceof being less robust to the possibility that the smaller sample (butnot the larger one) is very small.

Robust confidence intervals for Hodges-Lehmann median differences Frame 15 of 29

Page 63: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Summary of results

I If N1 = N2, then both methods (especially cendif) producecoverage probabilities close to the advertized level.

I If N1 < N2 (and N1 is not too small), then the Lehmann methodproduces oversized (undersized) confidence intervals if thesmaller sample is from the less (more) variable population, andthe cendif method is more robust.

I However, if N1 � N2 (and N1 is very small), then the cendifmethod produces undersized confidence intervals, and theLehmann method is more correct under equal variability.

I Therefore, cendif is robust to unequal variability, at the priceof being less robust to the possibility that the smaller sample (butnot the larger one) is very small.

Robust confidence intervals for Hodges-Lehmann median differences Frame 15 of 29

Page 64: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: General principles

I The Lehmann and cendif methods are both based on CentralLimit Theorems, applied to Somers’ D(Y|X) for a binary X and acontinuous Y .

I However, the cendif method estimates the variance from thejoint sample distribution of X and Y , using jackknife methods.

I By contrast, the Lehmann method calculates the variance fromthe marginal sample distributions of X and Y , using permutationmethods.

I Therefore, the Lehmann method (like the equal–variance t–test)estimates the population variability of the smaller sample usingthe sample variability of the larger sample.

I By contrast, the cendif method (like the unequal–variancet–test) estimates the population variability of the smaller sampleusing the sample variability of the smaller sample.

Robust confidence intervals for Hodges-Lehmann median differences Frame 16 of 29

Page 65: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: General principles

I The Lehmann and cendif methods are both based on CentralLimit Theorems, applied to Somers’ D(Y|X) for a binary X and acontinuous Y .

I However, the cendif method estimates the variance from thejoint sample distribution of X and Y , using jackknife methods.

I By contrast, the Lehmann method calculates the variance fromthe marginal sample distributions of X and Y , using permutationmethods.

I Therefore, the Lehmann method (like the equal–variance t–test)estimates the population variability of the smaller sample usingthe sample variability of the larger sample.

I By contrast, the cendif method (like the unequal–variancet–test) estimates the population variability of the smaller sampleusing the sample variability of the smaller sample.

Robust confidence intervals for Hodges-Lehmann median differences Frame 16 of 29

Page 66: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: General principles

I The Lehmann and cendif methods are both based on CentralLimit Theorems, applied to Somers’ D(Y|X) for a binary X and acontinuous Y .

I However, the cendif method estimates the variance from thejoint sample distribution of X and Y , using jackknife methods.

I By contrast, the Lehmann method calculates the variance fromthe marginal sample distributions of X and Y , using permutationmethods.

I Therefore, the Lehmann method (like the equal–variance t–test)estimates the population variability of the smaller sample usingthe sample variability of the larger sample.

I By contrast, the cendif method (like the unequal–variancet–test) estimates the population variability of the smaller sampleusing the sample variability of the smaller sample.

Robust confidence intervals for Hodges-Lehmann median differences Frame 16 of 29

Page 67: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: General principles

I The Lehmann and cendif methods are both based on CentralLimit Theorems, applied to Somers’ D(Y|X) for a binary X and acontinuous Y .

I However, the cendif method estimates the variance from thejoint sample distribution of X and Y , using jackknife methods.

I By contrast, the Lehmann method calculates the variance fromthe marginal sample distributions of X and Y , using permutationmethods.

I Therefore, the Lehmann method (like the equal–variance t–test)estimates the population variability of the smaller sample usingthe sample variability of the larger sample.

I By contrast, the cendif method (like the unequal–variancet–test) estimates the population variability of the smaller sampleusing the sample variability of the smaller sample.

Robust confidence intervals for Hodges-Lehmann median differences Frame 16 of 29

Page 68: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: General principles

I The Lehmann and cendif methods are both based on CentralLimit Theorems, applied to Somers’ D(Y|X) for a binary X and acontinuous Y .

I However, the cendif method estimates the variance from thejoint sample distribution of X and Y , using jackknife methods.

I By contrast, the Lehmann method calculates the variance fromthe marginal sample distributions of X and Y , using permutationmethods.

I Therefore, the Lehmann method (like the equal–variance t–test)estimates the population variability of the smaller sample usingthe sample variability of the larger sample.

I By contrast, the cendif method (like the unequal–variancet–test) estimates the population variability of the smaller sampleusing the sample variability of the smaller sample.

Robust confidence intervals for Hodges-Lehmann median differences Frame 16 of 29

Page 69: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: General principles

I The Lehmann and cendif methods are both based on CentralLimit Theorems, applied to Somers’ D(Y|X) for a binary X and acontinuous Y .

I However, the cendif method estimates the variance from thejoint sample distribution of X and Y , using jackknife methods.

I By contrast, the Lehmann method calculates the variance fromthe marginal sample distributions of X and Y , using permutationmethods.

I Therefore, the Lehmann method (like the equal–variance t–test)estimates the population variability of the smaller sample usingthe sample variability of the larger sample.

I By contrast, the cendif method (like the unequal–variancet–test) estimates the population variability of the smaller sampleusing the sample variability of the smaller sample.

Robust confidence intervals for Hodges-Lehmann median differences Frame 16 of 29

Page 70: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Interpretation of results

I If N1 = N2, then there is no larger or smaller sample – and bothmethods work (especially cendif).

I If N1 < N2 (and N1 is not too small), then the populationvariability of the smaller sample is best estimated using thesample variability of the smaller sample – favoring cendif.

I If N1 � N2 (and N1 is very small), and we have prior reason toexpect “similar” variability, then the population variability of thesmaller sample is best estimated using the sample variability ofthe larger sample – favoring the Lehmann method.

I This seems to suggest a policy of regarding cendif as thedefault and the Lehmann formula as the “special case”, similar tothe Moser–Stevens[2] policy regarding the two t–tests.

Robust confidence intervals for Hodges-Lehmann median differences Frame 17 of 29

Page 71: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Interpretation of results

I If N1 = N2, then there is no larger or smaller sample – and bothmethods work (especially cendif).

I If N1 < N2 (and N1 is not too small), then the populationvariability of the smaller sample is best estimated using thesample variability of the smaller sample – favoring cendif.

I If N1 � N2 (and N1 is very small), and we have prior reason toexpect “similar” variability, then the population variability of thesmaller sample is best estimated using the sample variability ofthe larger sample – favoring the Lehmann method.

I This seems to suggest a policy of regarding cendif as thedefault and the Lehmann formula as the “special case”, similar tothe Moser–Stevens[2] policy regarding the two t–tests.

Robust confidence intervals for Hodges-Lehmann median differences Frame 17 of 29

Page 72: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Interpretation of results

I If N1 = N2, then there is no larger or smaller sample – and bothmethods work (especially cendif).

I If N1 < N2 (and N1 is not too small), then the populationvariability of the smaller sample is best estimated using thesample variability of the smaller sample – favoring cendif.

I If N1 � N2 (and N1 is very small), and we have prior reason toexpect “similar” variability, then the population variability of thesmaller sample is best estimated using the sample variability ofthe larger sample – favoring the Lehmann method.

I This seems to suggest a policy of regarding cendif as thedefault and the Lehmann formula as the “special case”, similar tothe Moser–Stevens[2] policy regarding the two t–tests.

Robust confidence intervals for Hodges-Lehmann median differences Frame 17 of 29

Page 73: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Interpretation of results

I If N1 = N2, then there is no larger or smaller sample – and bothmethods work (especially cendif).

I If N1 < N2 (and N1 is not too small), then the populationvariability of the smaller sample is best estimated using thesample variability of the smaller sample – favoring cendif.

I If N1 � N2 (and N1 is very small), and we have prior reason toexpect “similar” variability, then the population variability of thesmaller sample is best estimated using the sample variability ofthe larger sample – favoring the Lehmann method.

I This seems to suggest a policy of regarding cendif as thedefault and the Lehmann formula as the “special case”, similar tothe Moser–Stevens[2] policy regarding the two t–tests.

Robust confidence intervals for Hodges-Lehmann median differences Frame 17 of 29

Page 74: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Lehmann versus cendif: Interpretation of results

I If N1 = N2, then there is no larger or smaller sample – and bothmethods work (especially cendif).

I If N1 < N2 (and N1 is not too small), then the populationvariability of the smaller sample is best estimated using thesample variability of the smaller sample – favoring cendif.

I If N1 � N2 (and N1 is very small), and we have prior reason toexpect “similar” variability, then the population variability of thesmaller sample is best estimated using the sample variability ofthe larger sample – favoring the Lehmann method.

I This seems to suggest a policy of regarding cendif as thedefault and the Lehmann formula as the “special case”, similar tothe Moser–Stevens[2] policy regarding the two t–tests.

Robust confidence intervals for Hodges-Lehmann median differences Frame 17 of 29

Page 75: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Possible further improvements to cendif

I The jackknife method used by cendif assumes N1 + N2 − 1degrees of freedom, which may be over–generous if N1 � N2.

I It might be possible to devise an alternative degrees–of–freedomformula for the jackknife, like the Satterthwaite formula used inthe unequal–variance t–test.

I The percentile bootstrap (Wilcox, 1998)[5] might possibly be animprovement on the cendif method.

I However, the 1000 subsamples typically used might make itcomputationally expensive to prove this in a study as large as thisone!

Robust confidence intervals for Hodges-Lehmann median differences Frame 18 of 29

Page 76: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Possible further improvements to cendif

I The jackknife method used by cendif assumes N1 + N2 − 1degrees of freedom, which may be over–generous if N1 � N2.

I It might be possible to devise an alternative degrees–of–freedomformula for the jackknife, like the Satterthwaite formula used inthe unequal–variance t–test.

I The percentile bootstrap (Wilcox, 1998)[5] might possibly be animprovement on the cendif method.

I However, the 1000 subsamples typically used might make itcomputationally expensive to prove this in a study as large as thisone!

Robust confidence intervals for Hodges-Lehmann median differences Frame 18 of 29

Page 77: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Possible further improvements to cendif

I The jackknife method used by cendif assumes N1 + N2 − 1degrees of freedom, which may be over–generous if N1 � N2.

I It might be possible to devise an alternative degrees–of–freedomformula for the jackknife, like the Satterthwaite formula used inthe unequal–variance t–test.

I The percentile bootstrap (Wilcox, 1998)[5] might possibly be animprovement on the cendif method.

I However, the 1000 subsamples typically used might make itcomputationally expensive to prove this in a study as large as thisone!

Robust confidence intervals for Hodges-Lehmann median differences Frame 18 of 29

Page 78: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Possible further improvements to cendif

I The jackknife method used by cendif assumes N1 + N2 − 1degrees of freedom, which may be over–generous if N1 � N2.

I It might be possible to devise an alternative degrees–of–freedomformula for the jackknife, like the Satterthwaite formula used inthe unequal–variance t–test.

I The percentile bootstrap (Wilcox, 1998)[5] might possibly be animprovement on the cendif method.

I However, the 1000 subsamples typically used might make itcomputationally expensive to prove this in a study as large as thisone!

Robust confidence intervals for Hodges-Lehmann median differences Frame 18 of 29

Page 79: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Possible further improvements to cendif

I The jackknife method used by cendif assumes N1 + N2 − 1degrees of freedom, which may be over–generous if N1 � N2.

I It might be possible to devise an alternative degrees–of–freedomformula for the jackknife, like the Satterthwaite formula used inthe unequal–variance t–test.

I The percentile bootstrap (Wilcox, 1998)[5] might possibly be animprovement on the cendif method.

I However, the 1000 subsamples typically used might make itcomputationally expensive to prove this in a study as large as thisone!

Robust confidence intervals for Hodges-Lehmann median differences Frame 18 of 29

Page 80: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Conclusions

I This simulation study compared the coverage probabilities of theLehmann and cendif confidence intervals for mediandifferences.

I Neither method failed “catastrophically”, in the manner of thet–test.

I However, both methods could be made to produce “95%confidence intervals” that were really 90% confidence intervals.

I Under most scenarios, it appears safe to use cendif as thedefault method.

I However, the Lehmann method may be better, if N1 � N2.

Robust confidence intervals for Hodges-Lehmann median differences Frame 19 of 29

Page 81: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Conclusions

I This simulation study compared the coverage probabilities of theLehmann and cendif confidence intervals for mediandifferences.

I Neither method failed “catastrophically”, in the manner of thet–test.

I However, both methods could be made to produce “95%confidence intervals” that were really 90% confidence intervals.

I Under most scenarios, it appears safe to use cendif as thedefault method.

I However, the Lehmann method may be better, if N1 � N2.

Robust confidence intervals for Hodges-Lehmann median differences Frame 19 of 29

Page 82: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Conclusions

I This simulation study compared the coverage probabilities of theLehmann and cendif confidence intervals for mediandifferences.

I Neither method failed “catastrophically”, in the manner of thet–test.

I However, both methods could be made to produce “95%confidence intervals” that were really 90% confidence intervals.

I Under most scenarios, it appears safe to use cendif as thedefault method.

I However, the Lehmann method may be better, if N1 � N2.

Robust confidence intervals for Hodges-Lehmann median differences Frame 19 of 29

Page 83: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Conclusions

I This simulation study compared the coverage probabilities of theLehmann and cendif confidence intervals for mediandifferences.

I Neither method failed “catastrophically”, in the manner of thet–test.

I However, both methods could be made to produce “95%confidence intervals” that were really 90% confidence intervals.

I Under most scenarios, it appears safe to use cendif as thedefault method.

I However, the Lehmann method may be better, if N1 � N2.

Robust confidence intervals for Hodges-Lehmann median differences Frame 19 of 29

Page 84: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Conclusions

I This simulation study compared the coverage probabilities of theLehmann and cendif confidence intervals for mediandifferences.

I Neither method failed “catastrophically”, in the manner of thet–test.

I However, both methods could be made to produce “95%confidence intervals” that were really 90% confidence intervals.

I Under most scenarios, it appears safe to use cendif as thedefault method.

I However, the Lehmann method may be better, if N1 � N2.

Robust confidence intervals for Hodges-Lehmann median differences Frame 19 of 29

Page 85: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Conclusions

I This simulation study compared the coverage probabilities of theLehmann and cendif confidence intervals for mediandifferences.

I Neither method failed “catastrophically”, in the manner of thet–test.

I However, both methods could be made to produce “95%confidence intervals” that were really 90% confidence intervals.

I Under most scenarios, it appears safe to use cendif as thedefault method.

I However, the Lehmann method may be better, if N1 � N2.

Robust confidence intervals for Hodges-Lehmann median differences Frame 19 of 29

Page 86: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

References[1] Lehmann E. L. 1963. Nonparametric confidence intervals for a shift

parameter. The Annals of Mathematical Statistics 34(4): 1507–1512.

[2] Moser B. K. and Stevens G. R. 1992. Homogeneity of variance in thetwo-sample means test. The American Statistician 46(1): 19–21.

[3] Newson R. 2006. Confidence intervals for rank statistics: Percentileslopes, differences, and ratios. The Stata Journal 6(4): 497-520.

[4] Wang D. 1999. sg123: Hodges-Lehmann estimation of a shift inlocation between two populations. Stata Technical Bulletin 52: 52–53.Reprinted in: Stata Technical Bulletin Reprints 9: 255–257. CollegeStation, TX: Stata Press; 2000.

[5] Wilcox R. R. 1998. A note on the Theil-Sen regression estimator whenthe regressor is random and the error term is heteroscedastic.Biometrical Journal 1998; 40(3): 261–268.

This presentation can be downloaded from the conference website athttp://ideas.repec.org/s/boc/usug07.html

Robust confidence intervals for Hodges-Lehmann median differences Frame 20 of 29

Page 87: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Appendix

I This and the following frames are not part of the mainpresentation.

I However, they may be shown to the audience to illustrateresponses to questions.

Robust confidence intervals for Hodges-Lehmann median differences Frame 21 of 29

Page 88: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Median Gosset/cendif confidence interval width ratios under equalvariability

5, 5

5, 10

5, 20

5, 40

10, 10

10, 20

10, 40

20, 20

20, 40

40, 40

.8125.875.93751 1.1251.251.3751.51.6251.751.8752 2.252.52.753 3.253.53.754 4.55 5.5

.8125.875.93751 1.1251.251.3751.51.6251.751.8752 2.252.52.753 3.253.53.754 4.55 5.5

Normal Cauchy

Sam

ple

num

ber

pair

Median Gosset/cendif width ratio (95% CI) under equal variabilityGraphs by Distributional family

Robust confidence intervals for Hodges-Lehmann median differences Frame 22 of 29

Page 89: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Median Lehmann/cendif confidence interval width ratios under equalvariability

5, 5

5, 10

5, 20

5, 40

10, 10

10, 20

10, 40

20, 20

20, 40

40, 40

.9688

1 1.031

1.063

1.094

1.125

1.156

.9688

1 1.031

1.063

1.094

1.125

1.156

Normal Cauchy

Sam

ple

num

ber

pair

Median Lehmann/cendif width ratio (95% CI) under equal variabilityGraphs by Distributional family

Robust confidence intervals for Hodges-Lehmann median differences Frame 23 of 29

Page 90: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Normal coverage probabilities for the Gosset and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Gosset coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 24 of 29

Page 91: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Cauchy coverage probabilities for the Gosset and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951 .55.6 .65.7 .75.8 .85.9 .951

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Gosset coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Cauchy distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 25 of 29

Page 92: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Normal coverage probabilities for the Satterthwaite and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Satterthwaite coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 26 of 29

Page 93: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Cauchy coverage probabilities for the Satterthwaite and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1 .875

.9 .925

.95.95

.975

1

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Satterthwaite coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Cauchy distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 27 of 29

Page 94: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Normal coverage probabilities for the Lehmann and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Normal distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 28 of 29

Page 95: Robust confidence intervals for Hodges-Lehmann median ... · Robust confidence intervals for Hodges-Lehmann median differences Frame 3 of 29 The Lehmann confidence interval formula

Cauchy coverage probabilities for the Lehmann and cendif methods

.25.3333

.5.6667

11.5

234

.25.3333

.5.6667

11.5

234

.775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751 .775.8 .825.85.85.875.9 .925.95.95.9751

5, 5 5, 10 5, 20 5, 40 10, 10

10, 20 10, 40 20, 20 20, 40 40, 40

Lehmann coverage cendif coverage

Firs

t/sec

ond

popu

latio

n sc

ale

ratio

Coverage probability under Cauchy distribution

Graphs by First sample number and Second sample number

Robust confidence intervals for Hodges-Lehmann median differences Frame 29 of 29