
Mammography screening for breast cancer


Editorial 1809

Mammography Screening for Breast Cancer

Daniel B. Kopans, M.D.

Randomized, controlled trials have established the fact that screening at periodic intervals can reduce the mortality rate from breast cancer.1 Mortality reduction from screening is likely to vary with the quality of the mammograms, skill of the interpreters, interval between screens, and thresholds for intervention when an abnormality is detected and diagnosis obtained. In this issue of Cancer,2 the breast cancer screening program in Nijmegen, The Netherlands, is reviewed to determine how the effectiveness of that program (and others) might be improved. The investigators reasoned that one measure of the success of a screening program is the determination of how many cancers appear clinically in the interval after a “negative” screen and before the next screen (false-negatives), and how many cancers that are detected in a subsequent screen, in retrospect, could have been detected at the preceding screen (delayed diagnosis). This provides a measure of how much room there is for improvement in the screening program and some important insights into possible ways to improve these programs by minimizing the number of these cases. The term “delayed diagnosis” is perhaps overly harsh. These lesions would not be detected by “usual care” until even later in the absence of screening. They are more properly termed “lost opportunities for earlier diagnosis.”

The goal of breast cancer screening is to detect, diagnose, and treat cancer at a time in its growth when the natural history of the disease can be interrupted and mortality from breast cancer deferred or avoided. Mortality reduction is the ultimate measure. This can be determined only in a randomized, controlled trial, but the ability to reduce mortality has been shown to be directly related to a reduction in the size and stage of the cancers as a result of the screen.3 The success of a screening program is thus primarily determined by the sensitivity of the screen and the size and stage of the cancers diagnosed in the population. In general, the earlier the detection the smaller the size of the lesions and the earlier their stage.

From the Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts.

Address for reprints: Daniel B. Kopans, M.D., Department of Radiology, Massachusetts General Hospital, Boston, MA 02114.

Accepted for publication June 28, 1993.

Image Quality

The quality of the mammography is a key element in the ability to detect breast cancers earlier. Much of the criticism of the recently published results from the National Breast Screening Study of Canada (NBSS), for example, revolves around the poor quality of the mammography in that trial.4 The Nijmegen review does not provide direct information on the quality of their mammograms, but it would appear from the data provided that, even in retrospect, half of the cancers were undetectable by the mammography. The authors believe that very few of these errors were due to poor image quality, but they did not directly address the fact that the number of mammographic projections used can directly influence the detection of cancers. Single-view mammography has been shown to reduce the detection rate by as much as 11%.5 Single-view mammography was used in the Nijmegen program and this might have contributed to the false-negative rate.

Screening Interval

The women in the Nijmegen program were screened every 2 years. It is of interest that many of the cancers that were not detectable on the index screen were in breasts that were mostly fat and ideally suited for early cancer detection. As the authors suggest, it is likely that, unless these were exceedingly fast-growing tumors, they could have been detected earlier if the screening interval were shortened to every year. This analysis is an extension of the observations made by Moskowitz6 and Tabar et al.7 concerning the lead time gained by mammographic screening. They showed that if the time between screens was too long the cancers would grow in the interval to sizes that were clinically detectable by the time of the next screen and the potential gains from earlier detection by mammography would be diminished. If the time between screens is such that the patient or her doctor will detect many cancers at about the same time as the mammogram, the screen will be less likely to push back the time of diagnosis and contribute to mortality reduction. The authors have not provided any age information in their review, but both Moskowitz6 and Tabar et al.7 suggest that women younger than 50 years of age should be screened at a shorter interval than every 2 years, whereas for older women the interval may perhaps be longer than annual with no detrimental effect. Unfortunately, the current screening guidelines in the United States, for unclear reasons, suggest just the opposite. Many believe that this should be changed.

CANCER September 15, 1993, Volume 72, No. 6
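The screening-interval argument above can be made concrete with a rough sketch. It assumes a spherical tumor whose volume grows exponentially, and every number in it (the mammographic and clinical size thresholds, the volume doubling times) is hypothetical and chosen only for illustration; none of these values comes from the Nijmegen review or from the cited work of Moskowitz or Tabar et al. The function names are likewise invented for this example.

```python
import math


def sojourn_days(mammo_mm, clinical_mm, volume_doubling_days):
    """Days for a tumor to grow from the mammographic to the clinical threshold.

    Assumes a sphere growing exponentially in volume, so three volume
    doublings are needed to double the diameter.
    """
    if not 0 < mammo_mm < clinical_mm:
        raise ValueError("need 0 < mammo_mm < clinical_mm")
    volume_doublings = 3 * math.log2(clinical_mm / mammo_mm)
    return volume_doublings * volume_doubling_days


def surfaces_before_next_screen(interval_days, sojourn):
    """True if a tumor that becomes mammographically detectable just after a
    screen would reach clinically detectable size before the next screen."""
    return sojourn < interval_days


# Hypothetical thresholds and growth rates, for illustration only.
MAMMO_MM, CLINICAL_MM = 5.0, 20.0
for doubling in (60, 120, 240):
    s = sojourn_days(MAMMO_MM, CLINICAL_MM, doubling)
    for interval in (365, 730):
        flag = surfaces_before_next_screen(interval, s)
        print(f"doubling={doubling}d sojourn={s:.0f}d "
              f"interval={interval}d surfaces_clinically={flag}")
```

Under these assumed numbers, a faster-growing tumor leaves a shorter window in which mammography can gain lead time, which is the reasoning behind a shorter interval for groups in which tumors are thought to grow faster.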

Lost Opportunities for Earlier Diagnosis

The Nijmegen review revealed that at least 50% of the cancers that went undetected at the index screen were visible on the mammogram and could have been detected at that screen. The cancers were either overlooked, or dismissed as most likely benign. The fact that some were overlooked is due to the complex psychovisual phenomena involved in the perception of an abnormality on a radiograph. The fact that some of the lesions were detected but dismissed involves the thresholds that the radiologists used for intervention when an abnormality was perceived.

Perception and Double Reading

As with the lesions that were not visible on the earlier screen, a shorter interval between screens might provide a second and earlier opportunity for the interpreting radiologist to perceive a cancer that had been previously overlooked. A similar benefit may be derived from the use of double reading. It comes as no surprise that skilled radiologists overlooked cancers that were visible in retrospect. It is a well-known phenomenon that even highly skilled observers periodically fail to perceive significant abnormalities when interpreting images. This has been shown in studies that have tested the detection of lung nodules on chest radiographs as well as the detectability of bowel lesions on barium studies. Because different observers fail to perceive different abnormalities, “double reading,” i.e., having two radiologists review each case, can reduce the false-negative rate. In the two-county trial of breast cancer screening in Sweden, Tabar et al.8 reported that the double reading of screening mammograms increased the detection rate by 15%. Bird found that double reading increased the rate by 5%,9 and in our own practice we have documented a 7% increase by double reading and have made it an integral part of our screening program.

It is likely that a significant number of the false-negative studies in which the cancer was visible in retrospect would have been detected had a two-reader system been used in Nijmegen.
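A minimal sketch of why a second reader helps, assuming the two readers miss cancers independently of each other. That assumption is made here only for illustration: real readers tend to miss some of the same lesions, so the true gain is likely smaller. The sensitivity value and function names are hypothetical. Only the 15%, 5%, and 7% gains are taken from the text above.

```python
def combined_sensitivity(s1, s2):
    """Sensitivity when a case is flagged if either reader flags it,
    assuming the two readers' misses are independent (an idealization)."""
    for s in (s1, s2):
        if not 0.0 <= s <= 1.0:
            raise ValueError("sensitivity must lie in [0, 1]")
    return 1.0 - (1.0 - s1) * (1.0 - s2)


def relative_gain(s1, s2):
    """Fractional increase in detected cancers over reader 1 alone."""
    if s1 <= 0.0:
        raise ValueError("reader 1 sensitivity must be positive")
    return combined_sensitivity(s1, s2) / s1 - 1.0


def extra_cancers(single_reader_detections, reported_gain):
    """Additional cancers found for a given reported relative gain."""
    return single_reader_detections * reported_gain


# Hypothetical single-reader sensitivity, for illustration only.
print(f"modelled gain: {relative_gain(0.85, 0.85):.1%}")

# Relative gains quoted in the text (Tabar et al., Bird, the author's practice).
for label, gain in (("Tabar et al.", 0.15), ("Bird", 0.05), ("Kopans", 0.07)):
    print(f"{label}: {extra_cancers(100, gain):.0f} extra per 100 detected")
```

The second loop simply restates each reported gain as additional cancers per 100 found by a single reader.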

Population-based screening is a public health issue, and the cost of screening is a major impediment to its implementation. Unless it is performed in a cost-sensitive fashion, screening may not be supported. Double reading is effective, but must be accomplished in a highly organized and efficient fashion or it will prohibitively increase the cost of screening. In recent years, as a consequence of the anxieties involved in breast cancer screening, there has been an emphasis placed by some on immediate interpretation of the mammogram so that the screened individual may receive the report before she leaves the screening facility. Although this is psychologically desirable, it is not cost-efficient because the radiologist is required to be on-site and spend more time providing the patient with the results of the screen than in the interpretation of the images themselves. It is potentially hazardous because it may force a rapid analysis and possible hasty interpretation. Furthermore, double reading becomes more expensive in this setting because the films must be handled several times instead of once. The most efficient and least expensive method for the interpretation of screening studies involves batch reading where multiple cases are loaded onto multiviewers to permit the radiologist to maximize the time in viewing the images and reduce the time spent in handling films. This approach permits efficient double reading with essentially no increase in cost. If screening is to be efficacious and cost-effective, women must be educated to understand that the benefits of double reading and the availability of screening through maintenance of low-cost high-quality imaging far outweigh the short delay in reporting that is required in a batch reading system.

Thresholds for Intervention

The investigators clearly acknowledge that a policy in their screening program has been to maintain a low number of false-positive results. This reduces the number of women who undergo biopsies for lesions that prove to be benign, but it translates into a reduction in the number of cancers detected at an earlier stage. An aggressive intervention policy that permits a lower positive predictive value has been criticized as resulting in too many “unnecessary” biopsies. The logic is elusive. There has rarely been criticism for biopsies instigated as a consequence of clinical suspicions although the majority of these prove to be for benign changes. In fact, in the United States, it is likely that more breast biopsies with benign results are performed on the basis of clinical suspicions than those performed on the basis of mammographic concerns, yet it is screening mammography that has been most severely criticized. If the goal is to reduce the number of biopsies for what prove to be benign processes, the biopsy rate for clinically evident lesions should be addressed first because fewer of these prove to be cancers and when they are malignant they are usually larger and at a later stage than those detected by mammography alone.10 It would be interesting to know whether the same policy of reducing false-positive results is followed in Nijmegen for clinically evident abnormalities. If not, then the policy is illogical.

The ability of a screening program to detect cancers at an earlier stage than the usual care is dependent on the quality of the mammography and the skill of the interpreter who is trying to “perceive” the relatively small number of cancers among mammograms from large numbers of healthy women, most of whom do not have breast cancer. Unfortunately, many women do have benign changes that appear on mammograms, and these can have appearances that are indistinguishable from cancer. The threshold for intervention and evaluation of lesions found by mammography will influence the success of the program in detecting cancers earlier. The Nijmegen investigators acknowledged the fact that a number of cancers were perceived, but were dismissed because they did not meet sufficient criteria for earlier intervention. There are certain “classic” signs of breast cancer that are virtually pathognomonic, such as spiculated margins and fine linear, branching calcifications. Unfortunately, the majority of small, early breast cancers do not have classic morphology.11 If the threshold for intervention is set too high, requiring that the classic characteristics be present, small cancers may be ignored until they develop these features and are larger or later stage.

The radiologist’s initial task is to perceive (detect) an abnormality. Once an abnormality has been perceived the finding is analyzed to try to determine its type and probability of malignancy (diagnosis). There are no secrets in determining these relative probabilities. Morphologic criteria have been established that are used to try to determine the likely cause of various findings. Unfortunately, the spectrum of these features overlaps between benign and malignant lesions, and each radiologist establishes his or her own threshold for intervention. For example, the Nijmegen investigators chose not to investigate “benign appearing masses” (presumably those with well-defined margins), and as a consequence, six cancers went undiagnosed. The authors argued that this was justified because it avoided the investigation of 600 other similar-appearing lesions that were benign. This typifies the basic statistical relationships of screening. Assuming that the quality of the images is high, those interpreting the cases are skilled, and lesions are appropriately detected, the false-negative rate can be reduced only at the expense of an increase in the false-positive fraction due to the overlap between benign and malignant features. Most experienced radiologists who interpret screening mammograms recognize the same features demonstrated by various breast lesions. It has been shown that these observers operate on essentially the same receiver operating characteristic curve.12 The only way that a high positive predictive value can be maintained is by accepting a higher false-negative fraction. A program with a low positive predictive value will have a lower false-negative rate, but a higher false-positive rate. Unless the practitioners in screening programs that require a high percentage of cancers per number of biopsies performed are aware of signs that other skilled radiologists do not know, they have to be permitting potentially detectable cancers to pass through the screen. This appears to be the case in Nijmegen. It is somewhat misleading to suggest that to have made an earlier diagnosis of these six cancers would have increased the false-positive cases by 600, implying that 594 women were spared intervention. In fact, the vast majority of the benign lesions would have easily been shown to be cysts if ultrasound or aspiration were used. These are simple procedures that should not be costly, and the earlier detection of the six cancers might have been worthwhile.
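The trade-off described above can be sketched numerically. The first function works from the counts quoted in the text, read here as 6 cancers among 600 other benign lesions. The second uses a hypothetical binormal model of reader suspicion scores, which is an assumption made for illustration and is not taken from the cited receiver operating characteristic study. The function names, the `separation` parameter, and the threshold values are all invented for this example.

```python
from statistics import NormalDist


def positive_predictive_value(cancers, benign):
    """Fraction of worked-up lesions that prove to be cancer."""
    total = cancers + benign
    if total <= 0:
        raise ValueError("need at least one lesion")
    return cancers / total


def operating_point(threshold, separation=1.5):
    """Return (sensitivity, false_positive_fraction) for one threshold.

    Hypothetical model: benign findings score ~ N(0, 1), malignant findings
    ~ N(separation, 1); any finding scoring above `threshold` is worked up.
    All readers share this one curve and differ only in threshold.
    """
    benign = NormalDist(0.0, 1.0)
    malignant = NormalDist(separation, 1.0)
    return 1.0 - malignant.cdf(threshold), 1.0 - benign.cdf(threshold)


# Counts from the text: 6 cancers among 600 other benign-appearing masses.
ppv = positive_predictive_value(6, 600)
print(f"PPV of working up benign-appearing masses: {ppv:.2%}")
print(f"lesions evaluated per cancer found: {1 / ppv:.0f}")

# Lowering the threshold moves along the same curve: fewer missed cancers,
# more false positives. Thresholds are illustrative only.
for t in (2.0, 1.0, 0.0):
    sens, fpf = operating_point(t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} "
          f"false-negative fraction={1 - sens:.2f} false-positive fraction={fpf:.2f}")
```

On a single shared curve, no choice of threshold lowers the false-negative fraction without raising the false-positive fraction, which is the point made in the paragraph above. Whether the extra evaluations are worthwhile depends on how costly each one is, for example ultrasound or aspiration rather than open biopsy.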

The reasons for requiring a low false-positive rate include the economic implications of a more aggressive policy and the desire to not traumatize women. Perhaps efforts should be made to maximize the detection rate of early cancers by devising less costly and less traumatic methods of diagnosis so that aggressive efforts to diagnose can be maintained. Instead of the paternalistic view that women must be “protected” from biopsies, the individual and her physician should be provided with the best estimate of cancer probability so that they can determine what level of certainty is required.

Rather than accept the fact that some cancers that can be diagnosed earlier will be passed and allowed to continue growing, diagnostic procedures that will reduce the costs of screening should be developed. Breast biopsies performed for mammographically detected lesions need not be overly traumatic. They can virtually all be performed in outpatient centers under local anesthesia. Accurate preoperative needle localization can reduce the trauma and cosmetic damage from the surgery and permit the removal of very small amounts of tissue.13 Many European programs have used fine-needle aspiration cytology successfully in the management and triage of suspicious lesions.14 In the United States, it is likely that imaging-guided core-needle biopsies will help reduce the need and expense for open biopsies. These are desirable goals, but should be approached with caution because the benefit from screening can be diminished if early cancers are not diagnosed due to less accurate diagnostic techniques. Methods that do not involve the excision of a suspicious lesion must be carefully and scientifically validated.

Screening is clearly not the ultimate solution to the breast cancer problem, but until methods to prevent breast cancer can be devised, or perfect cures developed, earlier detection offers the only opportunity to reduce the number of breast cancer deaths for a significant number of women. Screening mammograms must be of the highest quality and be performed in an efficient and cost-sensitive fashion so that all women of appropriate age can avail themselves of the test. The interpretation of the studies should be systematized so that such benefits as double reading can be used, and efforts should be made to develop further diagnostic approaches that will permit aggressive investigation to permit earlier diagnosis. As the Nijmegen investigators have done, screening programs should try constantly to evaluate their results so that efforts can be made to improve the value of screening.

References

1. Nystrom L, Rutqvist LE, Wall S, Lindgren A, Lindqvist M, Ryden S, et al. Breast cancer screening with mammography: overview of Swedish randomised trials. Lancet 1993; 341:973-8.

2. van Dijck JAAM, Verbeek ALM, Hendriks JHCL, Holland R. The current detectability of breast cancer in a mammographic screening program: a review of the previous mammograms of interval and screen-detected cancers. Cancer 1993; 72:1933-8.

3. Tabar L, Gad A, Holmberg L, Ljungquist U. Significant reduction in advanced breast cancer: results of the first seven years of mammography screening in Kopparberg, Sweden. Diag Imag Clin Med 1985; 54:158-64.

4. Yaffe MJ. Correction: Canada study [letter to the editor]. J Natl Cancer Inst 1993; 85:94.

5. Muir BB, Kirkpatrick A, Roberts MM, Duffy SW. Oblique-view mammography: adequacy for screening. Radiology 1984; 151:39-41.

6. Moskowitz M. Breast cancer: age-specific growth rates and screening strategies. Radiology 1986; 161:37-41.

7. Tabar L, Fagerberg G, Day NE, Holmberg L. What is the optimum interval between screening examinations?: an analysis based on the latest results of the Swedish Two-County Breast Cancer Screening Trial. Br J Cancer 1987; 55:547-51.

8. Tabar L, Fagerberg G, Duffy S, Day N, Gad A, Grontoft O. Update of the Swedish Two-County Program of Mammographic Screening for Breast Cancer. Rad Clin North Am 1992; 30:187-210.

9. Bird RE. Professional quality assurance for mammographic screening programs. Radiology 1990; 177:587.

10. Bassett LW, Liu TH, Giuliano ARE, Gold RH. The prevalence of carcinoma in palpable vs impalpable mammographically detected lesions. AJR 1991; 157:21-4.

11. Sickles EA. Mammographic features of 300 consecutive nonpalpable breast cancers. AJR 1986; 146:661-3.

12. D’Orsi CJ. To follow or not to follow, that is the question. Radiology 1992; 184:306.

13. Gallagher WJ, Cardenosa G, Rubens JR, McCarthy KA, Kopans DB. Minimal-volume excision of nonpalpable breast lesions. AJR 1989; 153:957-61.

14. Azevado E, Svane G, Auer G. Stereotactic fine-needle biopsy in 2594 mammographically detected non-palpable lesions. Lancet 1989; 1:1033-6.