

Mammography Screening for Breast Cancer Reply to the Commentaries

Daniel B. Kopans, M.D., Elkan Halpern, Ph.D., and Carol A. Hulka, M.D.

We will try to limit our response to the rather extensive commentaries from Dr. Dupont and Dr. Baines with regard to our evaluation of the statistical power of the randomized, controlled trials of screening for breast cancer.

Neither of the commentaries challenges the basic conclusions of our statistical analysis: that no trials to date have had enough patients to determine a benefit (25% mortality reduction) or a lack of benefit for mammographic screening in 40-49-year-old women, and, therefore, all judgments based on the randomized, controlled trials are premature. Dr. Baines recently acknowledged this when she wrote: “Those who advocate screening of women ages 40-49 years are right when they say no study thus far [including the NBSS] has had adequate power for this age group.”

The fact that the trials lack the statistical power to demonstrate a 25% or lower mortality reduction for women aged 40-49 years has also been the conclusion of an international panel of experts at the International Union Against Cancer meeting last year² and a jointly sponsored American Cancer Society and National Cancer Institute meeting held in Atlanta in April 1994. Although not yet published, the most recent data, provided at the end of this response, show that when the benefit is greater than 25%, the results can be and are now “significant” (Hendrick and Smart, unpublished data).

The majority of Dr. Baines’ points are addressed in our paper. Dr. Baines’ comparison of “detection rates” does not account for the fact that the Vancouver program does not admit symptomatic women, whereas, contrary to Dr. Elwood’s caution against including women with clinically evident cancers in a screening trial,³ the NBSS permitted these women to participate. This would obviously add to the cancer detection rate in the NBSS. A more telling figure is the number of cancers missed at screening that rose to clinical detection between screens. The interval cancer rate in the Vancouver program is much lower than that in the NBSS, confirming that the quality of Vancouver’s screen is much higher.⁴

From the Department of Radiology, Massachusetts General Hospital, and the Harvard Medical School, Boston, Massachusetts.

Address for reprints: Daniel B. Kopans, M.D., Department of Radiology, Harvard Medical School, Massachusetts General Hospital, Boston, MA 02114.

Received May 27, 1994; revisions received June 3, 1994; accepted June 6, 1994.

We are fully aware of the effects of lead-time bias and the system of follow-up in randomized, controlled trials. Part of the problem in the reviews of the data from the trials is that the numbers have been used without regard to how they were actually generated. Randomized, controlled trials are not “black boxes.” To use their results appropriately, the actual performance of the trials and the extent and significance of follow-up must be understood. In view of the long natural history of breast cancer, it is important for the reviewer to realize that early reports of data, such as those provided by the NBSS, do not provide sufficient follow-up for an accurate assessment of the benefit from screening.

Our examination of the statistical power needed in randomized, controlled trials is the type of analysis that should have been done by those opposed to screening women at these ages before they pronounced screening to be ineffective. The statistical power of a trial is its foundation. As we note, the lack of statistical power can result in misleading conclusions from trials. This has clearly happened with regard to screening women aged 40-49 years. None of the trials was designed to have the statistical power to provide statistical significance for an expected mortality reduction of 25%. With the exception of the Canadian trial (which is, itself, too small), the trials were not designed prospectively to separately assess particular sub-groups of women, defined by age. With a relative paucity of women younger than age 50 in the trials, the rates of non-compliance, and the high rates of control group contamination, it would take a mortality reduction of more than 25%, or death rates greater than anticipated, to permit the results to be significant.
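The dependence of statistical power on trial size can be illustrated with a minimal sketch. It uses the textbook normal approximation for comparing two proportions, which is not necessarily the method used in our paper, and the cumulative death rate of 3 per 1,000 control women is a purely hypothetical input chosen for illustration, not a figure from any trial.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_proportions(p_control, reduction, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test to detect a relative
    mortality reduction, given the control-arm death proportion."""
    p_screen = p_control * (1.0 - reduction)
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_screen * (1 - p_screen) / n_per_arm)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_effect = (p_control - p_screen) / se
    # The negligible contribution of the opposite tail is ignored.
    return STD_NORMAL.cdf(z_effect - z_crit)


def women_per_arm(p_control, reduction, power=0.80, alpha=0.05):
    """Approximate number of women needed in each arm for the given power."""
    p_screen = p_control * (1.0 - reduction)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_screen * (1 - p_screen)
    return ceil((z_alpha + z_beta) ** 2 * variance
                / (p_control - p_screen) ** 2)


# Hypothetical input: 3 breast cancer deaths per 1,000 control women.
HYPOTHETICAL_DEATH_RATE = 0.003

if __name__ == "__main__":
    needed = women_per_arm(HYPOTHETICAL_DEATH_RATE, 0.25)
    print("women per arm for 80% power:", needed)
    print("power with 25,000 per arm:",
          round(power_two_proportions(HYPOTHETICAL_DEATH_RATE, 0.25, 25_000), 2))
```

Under these assumed inputs, a trial with far fewer women than the computed requirement has less than an even chance of reaching significance for a true 25% reduction, and a larger true reduction lowers the number of women required.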

The commentaries reiterate many of the same arguments that were provided in the “Fletcher Report,” commissioned by the National Cancer Institute to bolster its decision to remove support for screening women aged 40-49 years. These same arguments have been used by others to deny access to screening for these women.

The “Fletcher Report” described the “consistency” of the data for women age 50 and older, and this is repeated by Dr. Dupont where he stated that “there is consistent evidence” that shows a “reduction of about 30% in breast cancer mortality in women aged 50-69 years.” This is, in fact, not the case. Although the point estimates of the relative risks are less than 1, four of the seven trials that had unscreened control groups do not show statistically significant results for these women, with their confidence intervals extending beyond 1. It is the combination of these data, in a meta-analysis, that has been accepted as “proof” of benefit. Given the much smaller numbers of women in the trials aged 40-49 years, the fact that five of the seven trials with unscreened controls have point estimates that show a mortality reduction (relative risks of less than 1) cannot be dismissed, particularly when the design of the trials and their actual performance are considered.
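How a point estimate below 1 can coexist with a confidence interval that extends beyond 1 is shown in the following minimal sketch. It uses the standard log-scale (Katz) interval for a relative risk; all death counts and arm sizes are hypothetical and are not taken from any of the trials.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk_ci(deaths_screen, n_screen, deaths_control, n_control,
                     level=0.95):
    """Relative risk with a log-scale (Katz) confidence interval.

    Returns (rr, lower, upper)."""
    rr = (deaths_screen / n_screen) / (deaths_control / n_control)
    se_log = sqrt(1 / deaths_screen - 1 / n_screen
                  + 1 / deaths_control - 1 / n_control)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)


if __name__ == "__main__":
    # Hypothetical small trial: same 25% reduction, few deaths.
    print("small trial:", relative_risk_ci(30, 20_000, 40, 20_000))
    # Hypothetical trial ten times larger with the same death rates.
    print("large trial:", relative_risk_ci(300, 200_000, 400, 200_000))
```

With these assumed counts, both trials have the same point estimate, but only the larger one has an interval that lies entirely below 1. The absence of significance in the smaller trial reflects its size, not the absence of an effect.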

Only two of the trials do not, as yet, yield point estimates that are consistent with a benefit for women younger than age 50. One is the Stockholm trial, but it is one of the more recent trials, with only a short follow-up. The women were screened at an interval between screens that was longer than 2 years, and only single view mammography was provided. The available data suggest that younger women should be screened at a shorter interval.⁹,¹⁰ The trials that screened with too long an interval did not provide as much benefit as those that screened at a shorter interval,¹¹ and single view mammography has been shown to miss as many as 11% of cancers.¹² In the Ostergotland trial, the other trial that has not shown a benefit, women were screened every two years with single view mammography. A large proportion of the women in the screened group of that trial who died of breast cancer had actually refused to be screened (they must still be counted as members of the screened group). All of the other five trials consistently estimate a benefit ranging from 22 to 49%⁵ that has been ignored because it lacked “statistical significance.” Our analysis demonstrates that any argument based upon that lack of statistical significance is specious, since the trials individually, as well as combined in a meta-analysis, would have required a mortality reduction much higher than the expected 25% to provide statistical significance with the small numbers of women who have participated.

The commentaries, as well as other analyses, have overlooked other evidence of benefit. For example, the BCDDP data (the largest trial of screening yet performed) show that it is not merely that “mammography can detect occult invasive cancers in younger women,” but that there is no difference in survival for screened women ages 40-49 when compared with women aged 50-59 years.¹³ This was confirmed in the Kopparberg study.¹⁴ Dr. Dupont has further made the assumption that “it is unlikely that mammography will save a higher percentage of breast cancer deaths for women in their forties than for older women.” In fact, we don’t know this. In at least one of the trials, the benefit for women under 40 has been greater than for women in their fifties. Two of the trials have benefits of 40%¹⁵ and 49% for women under age 50 that are higher than what has been shown for women ages 50 and older.

It is also not clear that rates of compliance and contamination are the same for women of all ages, as suggested by Dr. Dupont. Furthermore, the meta-analysis by Elwood et al. cited by both Drs. Baines and Dupont was limited to data published by December of 1992.

Part of the problem lies in the fact that the analyses continue to compare women aged 40-49 years with women ages 50 and older. This is an inappropriate and arbitrary comparison. Any difference will, a priori, appear to occur at age 50, and the weight of increasing cancer incidence with age will skew any analysis. The skew that is caused by such weighting is evident in Dr. Dupont’s statement that the magnitude of risk for “older” women is much greater than for “younger” women. This is true if women aged 40-49 years are compared with ALL women age 50 and older. The fact is that the magnitude of the difference in risk between women aged 40-49 years and those aged 50-59 years is the same as the difference between women aged 50-59 years and those ages 60-69. There is no sudden change at age 50. Analysts should cease comparing women in one decade of life with all women from the next 4 decades.

Women and their physicians have been led to believe that the studies have been done, and there is no benefit from screening women aged 40-49 years. This is clearly not the case since, as we point out, none of the trials was designed or performed properly to provide statistically significant results for a mortality benefit of 25% for women aged 40-49 years. The fact remains that, if the trial data are analyzed as the trials were designed, and the data are not broken into unplanned sub-groups, there is statistically significant benefit for screening beginning by age 40. It has been the use of unplanned sub-group analysis that has produced the statistical weakness. If the type of sub-group analysis, without statistical power, that has been used to deny screening to women aged 40-49 years is accepted, it could also be used to deny screening, for example, to women ages 50-51, or to women aged 59 years 5 months.

CANCER August 15, 1994, Volume 74, No. 4

Dr. Dupont points out that breast cancer accounts for only 6.7% of deaths for American women. Virtually any health care intervention can be made to appear trivial by dividing it by the entire population. Breast cancer does comprise a small percentage of the total deaths each year, but most women who die in a given year are quite elderly, and death cannot, as yet, be postponed indefinitely. Breast cancer is the leading cause of non-preventable cancer deaths among women. Fewer women die each year from automobile accidents that could be avoided by having seat belts than from breast cancer deaths that could be prevented by screening. By Dr. Dupont's argument, would the benefit of seat belts not be worth their expense? "Only 2%" of all deaths in a given year represents thousands of women. The fact that more than 40% of the years of life lost to breast cancer are from women diagnosed before the age of 50¹⁶ should not be forgotten.

Cervical cancer screening is referred to as an example for comparison to breast cancer screening. Some have pointed out that the death rate from cervical cancer was falling before the institution of Pap smears. The efficacy of Pap smears has never been demonstrated in randomized, controlled trials. Certainly far fewer women die each year of cervical cancer than breast cancer, yet there is little argument against cervical cancer screening.

Dr. Dupont points out that there has not, as yet, been a decrease in breast cancer deaths in the U.S. If relevant, this would argue against any breast cancer screening. In fact, widespread breast cancer screening has never happened in the United States. Despite this, the NCI has pointed out that there has been a recent 13% decrease in breast cancer deaths among women younger than age 50 due, at least in part, to earlier detection. In Sweden, where screening is more common, the death rate from breast cancer has been decreasing (personal communication from Dr. Lars E. Rutqvist).

Mortality rates are used incorrectly to calculate the benefit for women in their forties. The appropriate rate for such an analysis is not the rate for women in their forties, but the rate for women whose cancers were, or could have been, detected while they were in their forties. There are many women who die of breast cancer in their fifties and later from cancers that were, or could have been, diagnosed while they were in their forties.

It is pointed out that the benefit for women aged 40-49 years does not begin to appear until 5-7 years after the first screen and that this "delayed" benefit is unimportant. A "delay" negates the value of screening only if the women who had tumors under age 50 became age 50 during the trials, and only then had their tumors detected and cured. Aside from the fact that there is no evidence to support this explanation, the analysts appear to ignore the natural history of breast cancer. Instead of dismissing a "delayed" benefit from screening "younger" women, opponents of screening should explain why they would even expect an immediate mortality reduction. Breast cancer has a long natural history. As we noted, most cancers are not diagnosed until the later years of a trial (follow-up is measured from an individual's date of allocation or first screen). Furthermore, lead times for mammographic detection range from 2-4 or more years before a lesion becomes clinically evident. The time at which a benefit appears is a reflection of how quickly the control group's cancers successfully metastasize (the screened group's cancers must be found before this occurs), and how quickly those metastases grow sufficiently to kill the control women. In order for an immediate benefit to occur, the screened woman, with rapidly growing cancer, must be fortuitously detected early in the trial (this is unlikely because of length bias sampling) just before the successful establishment of metastatic disease (so that she will not die). Her control group counterpart must successfully metastasize very soon after that time so that when her cancer is detected a year or more later she will die rapidly. This is a highly unlikely scenario. It is the "immediate" benefit that requires some explanation.

It is not scientific to reject a "delayed" benefit since it makes more sense biologically than an "immediate" benefit. The time at which a benefit appears is a reflection of the time at which the tumor metastasized, and the rate at which metastases grow to lethality. Slow growth after metastasis means that the benefit of finding and curing cancer before metastasis, when measured in terms of mortality, is evident only after a long enough follow-up period for the control group to die. One might as well argue that "safe sex" is ineffective because the benefit, in terms of AIDS mortality, is delayed and does not appear until years later.

Both Dr. Dupont and Dr. Baines have argued that if the large trial that we describe is needed to prove a benefit, then the benefit must be small. This is a surprising statement. Our calculation was based on the same mortality reduction that has been found, and accepted as valuable, for women age 50 and older. The fact that there are fewer cancers per woman in the age 40-49 cohort [than the age 50-59 cohort] is virtually balanced by the large numbers of younger women. In fact, as a consequence of the "baby boom," there were almost as many women aged 40-49 years who were diagnosed with breast cancer in 1993 (28,900) as were diagnosed among those aged 50-59 years (31,500).¹⁷ It is merely the laws of probability and mathematics that determine the size of the trial.

Dr. Dupont repeats the commonly stated myth that the breast tissues turn to fat at age 50 (a theoretical surrogate for menopause), suddenly making mammography more effective. The perception is an artifact of using the age of 50 as a point of analysis. In fact, any mammographically evident change varies with individuals and, if it occurs at all, takes place gradually, at different rates and at different ages, beginning in some before age 30.¹⁸ There is no abrupt alteration seen on mammograms at menopause or at age 50. The fact that the percentage of fibrous connective tissue (seen as "dense" breast tissue on mammograms) decreases steadily with increasing age in some women (not all), with no sudden alteration at menopause or age 50, is clearly illustrated in textbooks of pathology.¹⁹

The "risk" of a breast biopsy that may arise from mammography screening should certainly not be equated with the risk of dying from breast cancer. The concern is again raised that screening leads to "unnecessary" biopsies. These are defined as biopsies for what prove to be benign abnormalities. It is difficult to understand why mammography is singled out in this regard. Clinical breast examination (CBE) leads to as many, if not more, biopsies for benign reasons than does mammography, and when a cancer is found at CBE it is more likely to be at a later stage, with a worse prognosis, than those detected by mammography. If the goal is to reduce the number of "unnecessary" biopsies, the place to begin is with clinical breast examination. This particular argument seems somewhat paternalistic. Rather than "protecting" women from the "risk" of an abnormal examination and biopsy, women should be provided with information so that an individual can decide for herself.

Part of the difficulty in this debate lies in the fact that an effort to set "national health policy" has entered into what should be a scientific and medical discussion. There is, as yet, no national health policy. Even the National Cancer Institute is not charged with establishing a national health policy. Screening guidelines are exactly that: guidelines, not policy. Individual women and their physicians are asking for guidance as to the best medical recommendation. Those interested in planning national health policy should state their intent so that, as clearly expressed by John M. King, the reader is aware that the analysis is based on, and influenced by, the belief that: "Screening is a public health activity concerned not with individuals, but with populations. . ."²⁰ We would argue just the opposite.

Science should not be compromised to avoid controversial discussions of health resource allocation. It is possible that breast cancer screening may be too expensive for "society" to support, but women should be provided with all the facts so that they can participate in the discussion as to how resources will be allocated, and individuals can decide whether or not to participate in screening.

The preponderance of data suggests that screening will provide the same benefit, if not a greater one, for women aged 40-49 years as for those aged 50-59 years. Dr. Dupont has cited old data. The Edinburgh trial now has a 22% mortality reduction for women aged 40-49 years. In May, the investigators in the Gothenburg trial updated their results for women aged 40-49 years and revealed a 40% decrease in breast cancer deaths for their screened women despite the fact that 30% of the women in the screened group who died of breast cancer had refused screening (they are still counted as having been screened).¹⁶ The investigators clearly stated that the mortality benefit came from women whose cancers were detected in their forties and not, as some have suggested, because they had reached the magical age of 50. Although we would argue that the major differences in the design and execution of the trials reduce the value of a meta-analysis, the preliminary results of a new analysis, using the latest figures, suggest that for the seven trials that did not screen their control group, the mortality reduction for women aged 40-49 years is now statistically significant. This is remarkable given the low statistical power of the trials, including the rates of non-compliance and contamination of the control groups, and suggests that the benefit to be gained is likely greater than the 25% that we estimated in our analysis.
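The way non-compliance and contamination shrink the benefit that a trial can observe is shown in the following minimal sketch. It rests on simplifying assumptions that are ours for illustration only: screening helps only the women who are actually screened, and women who refuse or seek screening have the same underlying risk as everyone else. The compliance and contamination fractions are hypothetical.

```python
def observed_reduction(true_reduction, compliance, contamination):
    """Relative mortality reduction seen when the arms are compared as
    randomized (intention to treat).

    true_reduction: benefit among women who are actually screened
    compliance:     fraction of the invited arm that attends screening
    contamination:  fraction of the control arm screened outside the trial
    Assumes equal baseline risk in attenders and non-attenders.
    """
    relative_mortality_invited = 1.0 - compliance * true_reduction
    relative_mortality_control = 1.0 - contamination * true_reduction
    return 1.0 - relative_mortality_invited / relative_mortality_control


if __name__ == "__main__":
    # Hypothetical: 80% of invited women attend, 20% of controls are screened.
    diluted = observed_reduction(0.25, compliance=0.80, contamination=0.20)
    print(f"observed reduction: {diluted:.1%}")
```

Under these assumptions, a true 25% reduction among screened women appears as a noticeably smaller reduction in the trial comparison. A trial that still reaches significance despite such dilution is consistent with a true benefit larger than the one it reports.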

There has been clear evidence, and now statistically significant "proof," that there is a mortality benefit from screening women aged 40-49 years by mammography. Mammographic screening is not the ultimate solution to the problem of breast cancer, and intense research should be encouraged to find a universal cure, methods of prevention, or methods for earlier detection. It is not sufficient, however, to suggest that we should ignore the present and look to the future for solutions. There is no reason to believe that a solution is near, and wishful thinking will not save lives. Until other solutions are available, or a population in which screening should be concentrated is defined, the available data show a benefit for women aged 40-49 years. Health policy makers may decide otherwise, but women and their physicians should know that screening, beginning by age 40, can reduce the mortality from breast cancer.

References

1. Baines CJ. The Canadian National Breast Screening Study: a perspective on criticisms. Ann Intern Med 1994; 120:326-34.
2. Eckhardt S, Badellino F, Murphy GP. UICC Meeting on Breast-Cancer Screening in Pre-Menopausal Women in Developed Countries. Cancer 1994; 56:1-5 in Cancer 1994; 73:745-53.
3. Elwood JM, Cox B, Richardson AK. The effectiveness of breast cancer screening by mammography in younger women. Online J Curr Clin Trials 1993; 32.
4. Burhenne LJW, Burhenne HJ. The Canadian National Breast Screening Study: a Canadian critique. AJR Am J Roentgenol 1993; 161:761-3.
5. Fletcher SW, Black W, Harris R, Rimer BK, Shapiro S. Report of the International Workshop on Screening for Breast Cancer. J Natl Cancer Inst 1993; 85:1644-56.
6. Lee Davis D, Love SM. Mammographic Screening. JAMA 1994; 271:152-3.
7. Baines CJ. The Canadian National Breast Screening Study: a perspective on criticisms. Ann Intern Med 1994; 120:326-34.
8. Harris R. Breast cancer among women in their forties: toward a reasonable research agenda. J Natl Cancer Inst 1994; 86:410-2.
9. Moskowitz M. Breast cancer: age-specific growth rates and screening strategies. Radiology 1986; 161:37-41.
10. Tabar L, Fagerberg G, Day NE, Holmberg L. What is the optimum interval between screening examinations?: an analysis based on the latest results of The Swedish Two-county Breast Cancer Screening Trial. Br J Cancer 1987; 55:547-551.
11. Sickles EA, Kopans DB. Deficiencies in the analysis of breast cancer screening data. J Natl Cancer Inst 1993; 85:1621-4.
12. Muir BB, Kirkpatrick A, Roberts MM, Duffy SW. Oblique-view mammography: adequacy for screening. Radiology 1984; 151:39-41.
13. Smart CR, Hartmann WH, Beahrs OH, Garfinkel L. Insights into breast cancer screening of younger women. Cancer 1993; 72:1449-56.
14. Tabar L. New Swedish breast cancer detection results for women aged 40-49. Cancer 1993; 72:1437-48.
15. Bjurstam N, Bjorneld L. Mammography screening in women aged 40-49 years at entry: results of the randomized, controlled trial in Gothenburg, Sweden. Presented to the 26th National Conference on Breast Cancer 1994 May 8-13; Palm Springs, California.
16. Shapiro S, Venet W, Strax P, Venet L. Periodic screening for breast cancer: The Health Insurance Plan Project and its sequelae, 1963-1986. Baltimore (MD): The Johns Hopkins University Press, 1988.
17. Smith RA. Epidemiology of breast cancer. A categorical course in physics: technical aspects of breast imaging. 2nd ed. Oak Brook (IL): RSNA Publications, Radiological Society of North America, 1993:21-33.
18. Kopans DB. Conventional wisdom: observation, experience, anecdote, and science in breast imaging. AJR Am J Roentgenol 1994; 162:299-303.
19. Page DL, Anderson TJ. Diagnostic histopathology of the breast. New York: Churchill Livingstone, 1987.
20. King J. Mammography screening for breast cancer. Letter to the editor. Cancer 1994; 73:2003-4.