THE JOURNAL OF UROLOGY® Vol. 185, 381-382, February 2011. Printed in U.S.A. © 2011 by AMERICAN UROLOGICAL ASSOCIATION EDUCATION AND RESEARCH, INC. DOI:10.1016/j.juro.2010.11.018. 0022-5347/11/1852-0381/0. www.jurology.com

Editorials

Semen and the Curse of Cutoffs

WHAT is it about our profession that causes us to cling so tenaciously to numerical cutoffs? Every introductory medical school statistics course teaches that for any assay interrogating disease, a population suffering from the disease forms a Gaussian, normative or “bell,” curve when its test results are plotted in a histogram, and that curve nearly always collides with the bell curve derived from the population free of disease. The larger the collision, the lousier the test, but that fact is nearly always muted after a threshold is set and we settle into using the assay in our clinics.
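The cost of a single threshold on two colliding bell curves can be sketched numerically. The example below is only an illustration: the means and standard deviations are hypothetical values chosen for the sketch, not data from this editorial or any study, and the names `diseased`, `healthy` and `error_rates` are invented here. It assumes low assay values indicate disease.

```python
from statistics import NormalDist

# Hypothetical distributions for illustration only (not from the editorial).
diseased = NormalDist(mu=40.0, sigma=20.0)
healthy = NormalDist(mu=80.0, sigma=25.0)


def error_rates(threshold):
    """Return (missed_diseased, flagged_healthy) for a single cutoff.

    Values below the threshold are called diseased, so:
    - missed_diseased is the share of diseased patients at or above the cutoff
    - flagged_healthy is the share of healthy patients below the cutoff
    """
    missed_diseased = 1.0 - diseased.cdf(threshold)
    flagged_healthy = healthy.cdf(threshold)
    return missed_diseased, flagged_healthy


# Shared area under the two curves: 0 means no collision, 1 means identical curves.
overlap = diseased.overlap(healthy)

if __name__ == "__main__":
    print(f"overlap of the two curves: {overlap:.2f}")
    for cutoff in (40.0, 60.0, 80.0):
        missed, flagged = error_rates(cutoff)
        print(f"cutoff {cutoff}: missed diseased {missed:.2f}, flagged healthy {flagged:.2f}")
```

Moving the cutoff only trades one error for the other; as long as the curves overlap, no single number removes both.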

Our stubborn reliance on setting a threshold even in the face of a far superior way to communicate test results is fully in evidence in the latest edition of the WHO Laboratory Manual for the Examination and Processing of Human Semen.1 To understand the radical departure and promise lost in the release of the 5th edition of the manual, it is worth remembering how we got here “semen-wise.”

In the second volume of Fertility and Sterility in 1951 the brilliant physiologist John MacLeod published a study of measurements of semen such as sperm count, movement and shape.2 He plotted the histograms of each feature of semen for a population of 1,000 men who were known to have fathered children and for a group of 800 infertile men. MacLeod noticed that the overlap between the 2 histograms for each parameter was large, and as he was most interested in detecting infertile men (as opposed to satisfying those who were fertile), he divided the assay outcomes into quarters, emphasizing that the men in the lowest quartile were those most likely to have difficulty siring offspring. Of course, as the histograms overlapped substantially, the sperm parameters of the infertile men were intermixed with those of the fertile group in the graphs, but MacLeod drew his thresholds to include the fewest fertile men possible.

MacLeod’s caution to exclude fertile men from his thresholds was extended by the World Health Organization in its working group charged with the task of establishing semen analysis cutoff values that physicians and scientists could use when evaluating men across the globe. A panel of WHO experts mulled over sperm data available at the time, and arrived at consensus values such as 20 million/ml for sperm density. Such numerical lines in the sand did not suddenly and magically render fertile all men with sperm parameters exceeding them but, rather, these numbers were simply considered the best choices to ensure that men with levels below them were most likely infertile. The obvious problem remained that some fertile men would have sperm numbers lurking below these thresholds and many infertile men would have values above them.

These man-made cutoff values persisted through the first 4 editions of the WHO manual for the semen analysis but in the 5th edition something remarkable and unexpected happened. A table is printed on page 225 which is suitable for clipping and keeping in your pocket, or for snapping with your camera and carrying in your cell phone. Each semen parameter (volume, total sperm number, concentration etc) is listed in its rows and the columns present numbers for 9 sequential centiles, beginning with the 2.5th, and ending with the 97.5th for a population of men whose partners became pregnant within 1 year.1 For example, a practitioner so inclined when faced with a man with a large varicocele and 21 million sperm/ml could advise that since more than 90% of fertile men have a higher sperm density (the 10th centile is 22 million/ml), the varicocele may be a reasonable culprit in his reproductive difficulty. Or when a man presents with 120 million sperm/ml, he may be comfortably assured that three-quarters of his fellow men have lower sperm concentrations, and his partner’s age of 40 years is of more pressing concern.

However, just as entrusting as the table on page 225 is of clinician abilities in understanding and communicating normative data, the table on page 224 is confounding. This unfortunately placed table lists the 5th centiles and their 95% confidence intervals as the “lower reference limits” for each semen parameter. Whether the rationale to include such a table was born from the traditional fascination of biological science with hypothesis testing at a probability threshold of 0.05 or just an irresistible impulse to prescribe cutoff values is unknown and ultimately unimportant. What is important is the unnecessary confusion engendered by tossing a set of arbitrary thresholds into a proper and commendable evolution away from consensus values and towards accurately describing normative information.

No sooner had the 5th edition of the WHO manual been published than questions such as the following appeared on Androlog, the email users group for male reproductive biologists and clinicians.3 “We are seeing and treating many patients having an infertility problem according to the previous edition of WHO manual. Now, according to the WHO manual 5th edition, they are considered normal. My question is how to convince these men that they are normal.”

If the writer of this question, a well established and highly qualified subspecialty practitioner in male reproductive medicine, was ensnared by the trap of including a capricious threshold alongside rich normative data, I shudder at the bewilderment that is now likely convulsing reproductive medicine. The fallacy of assuming that if, for the parameter of sperm density, a man has greater than 15 million/ml then he is normal is dispelled by running your finger along the row on the table on page 225 for sperm concentration until you arrive at the column for the 50th centile, which displays a value of 73 million/ml, over which lie half of men whose partners conceived within 1 year. The moral is simple: you can’t say that a man is normal who just clears the line marking the lowest 5% of his companions.
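Running a finger along the table row can be sketched as a lookup. The example below uses only the three sperm concentration centiles quoted in this editorial (5th: 15, 10th: 22, 50th: 73 million/ml); the table in the manual has 9 columns, and the remaining values are deliberately left out rather than guessed. The names `QUOTED_CENTILES`, `centile_bracket` and `describe` are invented for this sketch, and treating a value exactly on a centile as belonging to the bracket above it is an assumption.

```python
from bisect import bisect_right

# Sperm concentration centiles (million/ml) as quoted in the editorial text.
# Incomplete on purpose: the source table lists 9 centiles, only 3 are quoted.
QUOTED_CENTILES = [(5, 15.0), (10, 22.0), (50, 73.0)]


def centile_bracket(value, centiles=QUOTED_CENTILES):
    """Return the (lower, upper) centiles that bracket a measured value.

    None on either side means the value falls outside the quoted range.
    A value exactly on a centile is placed in the bracket above it (assumption).
    """
    cutoffs = [cutoff for _, cutoff in centiles]
    i = bisect_right(cutoffs, value)
    lower = centiles[i - 1][0] if i > 0 else None
    upper = centiles[i][0] if i < len(centiles) else None
    return lower, upper


def describe(value):
    """Phrase the result as normative information rather than normal/abnormal."""
    lower, upper = centile_bracket(value)
    if lower is None:
        return f"{value} million/ml: below the {upper}th centile"
    if upper is None:
        return f"{value} million/ml: at or above the {lower}th centile"
    return f"{value} million/ml: between the {lower}th and {upper}th centiles"


if __name__ == "__main__":
    for concentration in (16, 21, 120):
        print(describe(concentration))
```

A man with 16 million/ml clears the 15 million/ml line, yet the lookup places him between the 5th and 10th centiles, which says far more than the word "normal."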

If it is revealed some day that we humans have a deeply ingrained biological need for drawing lines for medical tests akin to food, sleep and sex, then there is a better way to craft them. You can draw 2 lines for any assay, one below which a patient is likely to harbor disease and one above which a patient is probably healthy. A common tool used to compute 2 such thresholds is classification and regression tree analysis (CART). Guzick et al published a CART analysis for semen almost a decade ago.4 As an example, the 2 thresholds calculated for sperm density were 13.5 and 48.0 million/ml, corresponding fairly closely to the 5th and 25th centiles described in the 5th edition of the WHO manual. If you absolutely must use thresholds for clinical decision making, the limits in the CART study by Guzick et al are reasonable. Below the lower limits men are likely infertile, above the upper limits a man may be assured that he is probably fertile, and everything between the limits is subject to interpretation in the context of the clinical picture.

I do not believe that we as a species or a profession are obligated to use 1 or 2 numbers in engaging our brains to solve clinical problems. It is likely that the need to calculate and propagate thresholds is the vestige of an era in which we did not have computers constantly in our pockets, capable of storing and accessing enormous amounts of data and making short work of complex numerical calculations. Now that we have them, we should use them, and the first relic of the past to be shooed out of our clinic doors is the notion and use of the arbitrary cutoff value.

Craig S. Niederberger

Section Editor

REFERENCES

1. World Health Organization: WHO Laboratory Manual for the Examination and Processing of Human Semen. WHO Press 2010.

2. MacLeod J: Semen quality in 1000 men of known fertility and in 800 cases of infertile marriage. Fertil Steril 1951; 2: 115.

3. http://godot.urol.uic.edu/androlog_archive/3347.html.

4. Guzick DS, Overstreet JW, Factor-Litvak P et al: Sperm morphology, motility, and concentration in fertile and infertile men. N Engl J Med 2001; 345: 1388.