Posted Date: Oct 5, 2015
Expert-reviewed information summary about tests used to detect or screen for breast cancer.
This PDQ cancer information summary for health professionals provides comprehensive, peer-reviewed, evidence-based information about breast cancer screening. It is intended as a resource to inform and assist clinicians who care for cancer patients. It does not provide formal guidelines or recommendations for making health care decisions.
This summary is reviewed regularly and updated as necessary by the PDQ Screening and Prevention Editorial Board, which is editorially independent of the National Cancer Institute (NCI). The summary reflects an independent review of the literature and does not represent a policy statement of NCI or the National Institutes of Health (NIH).
Note: Separate PDQ summaries on Breast Cancer Prevention, Breast Cancer Treatment, Male Breast Cancer Treatment, and Breast Cancer Treatment and Pregnancy are also available.
This summary covers the topic of breast cancer screening and includes information about breast cancer incidence and mortality, risk factors for breast cancer, the process of breast cancer diagnosis, and the benefits and harms of various breast cancer screening modalities. This summary also includes information about screening among special populations.
Mammography is the most widely used screening modality, with solid evidence of benefit for women aged 40 to 74 years. Clinical breast examination and breast self-exam have also been evaluated but are of uncertain benefit. Technologies such as ultrasound, magnetic resonance imaging, tomosynthesis, and molecular breast imaging are being evaluated, usually as adjuncts to mammography.
Based on solid evidence, screening mammography may lead to the following benefit:
Based on solid evidence, screening mammography may lead to the following harms:
For all these potential harms of screening mammography, internal validity, consistency and external validity are good.
Clinical breast examination (CBE) has not been tested independently; it was used in conjunction with mammography in one Canadian trial, and was the comparator modality versus mammography in another trial. Thus, it is not possible to assess the efficacy of CBE as a screening modality when it is used alone versus usual care (no screening activity).
Screening by CBE may lead to the following harms:
Breast self-examination (BSE) has been compared with usual care (no screening activity) and has not been shown to reduce breast cancer mortality.
Based on solid evidence, formal instruction and encouragement to perform BSE leads to more breast biopsies and diagnosis of more benign breast lesions.
Breast cancer is the most common noncutaneous cancer in U.S. women, with an estimated 60,290 cases of in situ disease, 231,840 new cases of invasive disease, and 40,290 deaths expected in 2015. Thus, fewer than 1 of 6 women diagnosed with breast cancer die of the disease. By comparison, about 71,660 American women are estimated to die of lung cancer in 2015. Males account for 1% of breast cancer cases and breast cancer deaths (refer to the Special Populations section of this summary for more information).
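The "fewer than 1 of 6" statement can be checked with simple arithmetic on the 2015 estimates quoted above. The sketch below assumes the comparison is a crude same-year ratio of expected deaths to expected new diagnoses (not a cohort case-fatality rate); the function name is illustrative only.

```python
# 2015 U.S. estimates quoted in the summary.
in_situ_cases = 60_290
invasive_cases = 231_840
deaths = 40_290


def deaths_per_diagnosis(n_deaths, n_diagnoses):
    """Crude ratio of expected deaths to expected new diagnoses in one year."""
    return n_deaths / n_diagnoses


# Ratio with in situ and invasive diagnoses combined.
ratio_all = deaths_per_diagnosis(deaths, in_situ_cases + invasive_cases)
# Ratio with invasive diagnoses only, for comparison.
ratio_invasive = deaths_per_diagnosis(deaths, invasive_cases)

print(f"All diagnoses: {ratio_all:.3f} (about 1 in {1 / ratio_all:.1f})")
print(f"Invasive only: {ratio_invasive:.3f} (about 1 in {1 / ratio_invasive:.1f})")
```

Under this assumption, the ratio falls below 1 in 6 when in situ and invasive diagnoses are counted together, and is close to 1 in 6 when only invasive cases are counted.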
Widespread adoption of screening increases breast cancer incidence in a given population and changes the characteristics of cancers detected, with increased incidence of lower-risk cancers, premalignant lesions, and ductal carcinoma in situ (DCIS). (Refer to the Ductal Carcinoma In Situ section in the Breast Cancer Diagnosis and Pathology section of this summary for more information.) Ecologic studies from the United States and the United Kingdom demonstrate an increase in DCIS and invasive breast cancer incidence since the 1970s, attributable to the widespread adoption of both postmenopausal hormone therapy and screening mammography. In the last decade, women have refrained from using postmenopausal hormones, and breast cancer incidence has declined, but not to the levels seen prior to the widespread use of screening mammography.
One might expect that if screening identifies cancers before they cause clinical symptoms, then the period of screening will be followed by a period of compensatory decline in cancer rates, either in annual population incidence rates or in incidence rates in older women. However, no compensatory drop in incidence rates has ever been seen following the adoption of screening, suggesting that screening leads to overdiagnosis, the identification of clinically insignificant cancers (refer to the Overdiagnosis section in the Harms of Screening section of this summary for more information).
Breast cancer incidence and mortality risk also vary according to geography, culture, race, ethnicity, and socioeconomic status (refer to the Special Populations section of this summary for more information).
Women with breast symptoms are not candidates for screening because they require a diagnostic evaluation. During a 10-year period, 16% of 2,400 women aged 40 to 69 years sought medical attention for breast symptoms at their health maintenance organization. Women younger than 50 years were twice as likely to seek evaluation. Additional testing was performed in 66% of these women, including invasive procedures performed in 27%. Cancer was diagnosed in 6.2% of these women, most often as stage II or stage III. Of the breast symptoms prompting medical attention, a mass was most likely to lead to a cancer diagnosis (10.7%) and pain was least likely (1.8%) to do so.
Breast cancer risk is affected by many factors besides participation in screening activities. Understanding and quantifying these risks is important to a woman, to her physicians, and to public policy makers. Refer to the PDQ summary on Breast Cancer Prevention for a complete description of factors associated with an increased or decreased risk of breast cancer.
Breast cancer is most often diagnosed by pathologic review of a fixed specimen of breast tissue. The breast tissue can be obtained from a symptomatic area or from an area identified by an imaging test. A palpable lesion can be biopsied with core needle biopsy or, less often, fine-needle aspiration biopsy or surgical excision; image guidance improves accuracy. Nonpalpable lesions can be sampled by core needle biopsy using stereotactic x-ray or ultrasound guidance or can be surgically excised after image-guided localization. In a retrospective study of 939 patients with 1,042 mammographically detected lesions who underwent core needle biopsy or surgical needle localization under x-ray guidance, sensitivity for malignancy was greater than 95% and the specificity was greater than 90%. Compared with surgical needle localization under x-ray guidance, core needle biopsy resulted in fewer surgical procedures for definitive treatment, with a higher likelihood of clear surgical margins at the initial excision.
Ductal carcinoma in situ (DCIS) is a noninvasive condition that can evolve to invasive cancer, with variable frequency and time course. Some authors include DCIS with invasive breast cancer statistics, but others argue that the term be replaced by ductal intraepithelial neoplasia, similar to the terminology used for cervical and prostate precursor lesions, and that breast cancer statistics exclude these DCIS cases.
DCIS is most often diagnosed by mammography. In the United States, only 4,900 women were diagnosed with DCIS in 1983, compared with approximately 64,000 women who are expected to be diagnosed in 2013, when mammographic screening has been widely adopted. The Canadian National Breast Screening Study-2 of women aged 50 to 59 years found a fourfold increase in DCIS cases in women screened by clinical breast examination (CBE) plus mammography compared with those screened by CBE alone, with no difference in breast cancer mortality. (Refer to the PDQ summary on Breast Cancer Treatment for more information.)
The natural history of DCIS is poorly understood because nearly all DCIS cases are treated. A single retrospective review of 11,760 breast biopsies performed between 1952 and 1968 identified 28 cases of DCIS, which were detected by physical examination, biopsied without resection, and then followed for 30 years. Nine women developed invasive breast cancer and four women died of the disease. These findings are interesting but probably not relevant to women with screen-detected DCIS in an era of improved cancer care.
Development of breast cancer after treatment of DCIS depends on the characteristics of the lesion but also on the delivered treatment. One large randomized trial found that 13.4% of women treated by lumpectomy alone developed ipsilateral invasive breast cancer within 90 months, compared with 3.9% of those treated by lumpectomy and radiation. The best evidence indicates that most DCIS lesions will not evolve to invasive cancer and that those that do can still usually be managed successfully, even after that transition. Thus, the detection and treatment of nonpalpable DCIS often represents overdiagnosis and overtreatment.
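As a worked illustration of the randomized trial figures above (13.4% versus 3.9% ipsilateral invasive cancer within 90 months), the following sketch derives the absolute risk reduction, relative risk reduction, and number needed to treat. These derived quantities are not reported in the summary; they are computed here from the two quoted percentages, and the function name is hypothetical.

```python
def risk_reduction(control_risk, treated_risk):
    """Return (absolute risk reduction, relative risk reduction, number needed to treat)."""
    arr = control_risk - treated_risk
    rrr = arr / control_risk
    nnt = 1 / arr
    return arr, rrr, nnt


# Lumpectomy alone (13.4%) versus lumpectomy plus radiation (3.9%).
arr, rrr, nnt = risk_reduction(0.134, 0.039)

print(f"Absolute risk reduction: {arr * 100:.1f} percentage points")
print(f"Relative risk reduction: {rrr * 100:.0f}%")
print(f"Number needed to treat:  {nnt:.1f}")
```

On these figures, roughly 10 to 11 women would need radiation after lumpectomy to prevent one ipsilateral invasive cancer within 90 months, which also means most treated women would not have developed invasive disease in that period.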
Among women diagnosed with (and treated for) DCIS between 1984 and 1989, only 1.9% died of breast cancer within 10 years, which was a lower mortality rate than for the age-matched population at large. This favorable outcome may reflect the benign nature of the condition, the benefits of treatment, or the volunteer effect (women undergoing breast cancer screening are generally healthier than those who do not).
Attempts to define low-risk DCIS cases that can be managed with fewer therapies are important. One such effort analyzed a series of 706 DCIS patients who were monitored to develop the University of Southern California/Van Nuys Prognostic Scoring Index, which defines the risk of recurrent DCIS and invasive cancer among women with DCIS based on age, margin width, tumor size, and grade. The low-risk group, comprising a third of the cases, experienced only 1% DCIS recurrences and no invasive cancers, independent of the use of postoperative radiation therapy. The moderate- and high-risk groups had higher recurrence rates, and they benefited from postlumpectomy radiation therapy. Overall, only approximately 1% died of breast cancer. In a separate study, adjuvant tamoxifen therapy was shown to reduce the incidence of invasive breast cancer.
Numerous uncontrolled trials and retrospective series have documented the ability of mammography to diagnose small, early-stage breast cancers, which have a favorable clinical course. Although several trials also show better cancer-related survival in screened versus nonscreened women, a number of important biases may explain that finding:
Because the extent of these biases is never clear in any particular study, most groups rely on randomized controlled trials to assess the benefits of screening. (Refer to the PDQ summary on Cancer Screening Overview for more information.)
Performance benchmarks for screening mammography in the United States are described on the Breast Cancer Surveillance Consortium (BCSC) website.
The sensitivity of mammography is the percentage of breast cancers detected in a given population, when breast cancer is present. Sensitivity depends on tumor size, conspicuity, and hormone sensitivity as well as breast tissue density, patient age, timing within the menstrual cycle, overall image quality, and interpretive skill of the radiologist. Overall sensitivity is approximately 79% but is lower in younger women and in those with dense breast tissue (see the BCSC website). Delay in diagnosis of breast cancer is the most common cause of medical malpractice litigation, and half of the cases resulting in payment to the claimant involve false-negative mammograms.
The specificity of mammography is the likelihood of the test being normal when cancer is absent, whereas the false-positive rate is the likelihood of the test being abnormal when cancer is absent. If specificity is low, many false-positive examinations result in unnecessary follow-up examinations and procedures. (Refer to the subsection on Harms in the Screening With Mammography section of the Overview section of this summary for more information.)
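The definitions of sensitivity, specificity, and false-positive rate can be expressed directly in terms of screening outcome counts. The sketch below is illustrative: the class name and the counts are hypothetical, with the counts chosen so that sensitivity matches the approximately 79% quoted above; the 90% specificity is an assumption, not a figure from this summary.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    true_pos: int   # cancer present, mammogram abnormal
    false_neg: int  # cancer present, mammogram normal
    true_neg: int   # cancer absent, mammogram normal
    false_pos: int  # cancer absent, mammogram abnormal

    def sensitivity(self):
        """Fraction of cancers detected when cancer is present."""
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self):
        """Fraction of normal tests when cancer is absent."""
        return self.true_neg / (self.true_neg + self.false_pos)

    def false_positive_rate(self):
        """Fraction of abnormal tests when cancer is absent (1 - specificity)."""
        return self.false_pos / (self.true_neg + self.false_pos)


# Hypothetical counts for illustration only.
counts = ScreeningCounts(true_pos=79, false_neg=21, true_neg=8910, false_pos=990)

print(f"Sensitivity:         {counts.sensitivity():.2f}")
print(f"Specificity:         {counts.specificity():.2f}")
print(f"False-positive rate: {counts.false_positive_rate():.2f}")
```

Because specificity and the false-positive rate share the same denominator (women without cancer), they always sum to 1.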
Interval cancers are cancers that are diagnosed in the interval after a normal screening examination and before the subsequent screen. Some of these cancers were present at the time of mammography (false-negatives), and others grew rapidly in the interval between mammography and detection. As a general rule, interval cancers have characteristics of rapid growth and are frequently of advanced stage at the time of discovery/diagnosis.
One study of 576 women with interval cancers reported that interval cancers are more prevalent in women aged 40 to 49 years. Interval cancers appearing within 12 months of a negative screening mammogram appear to be related to decreased mammographic sensitivity, attributable to greater breast density in 68% of cases. Those appearing within a 24-month interval appear to be related both to decreased mammographic sensitivity due to greater breast density in 37.6% and to rapid tumor growth in 30.6%.
Another study that compared the characteristics of 279 screen-detected cancers with those of 150 interval cancers found that interval cancers were much more likely to occur in women younger than 50 years; to be of mucinous or lobular histology; to have high histologic grade, high proliferative activity, and relatively benign mammographic features; and/or to lack calcifications. Screen-detected cancers were more likely to have tubular histology; to be smaller, low stage, and hormone sensitive; and to have a major component of ductal carcinoma in situ.
Mammography utilizes ionizing radiation to image breast tissue. The examination is performed by compressing the breast firmly between two plates. Such compression spreads out overlapping tissues and reduces the amount of radiation needed to image the breast. For routine screening in the United States, examinations are taken in both mediolateral oblique and craniocaudal projections. Both views should include breast tissue from the nipple to the pectoral muscle. Radiation exposure is 4 to 24 mSv per standard two-view screening examination. Two-view examinations are associated with a lower recall rate than are single-view examinations because they eliminate concern about abnormalities due to superimposition of normal breast structures. Two-view exams are also associated with lower interval cancer rates than are single-view exams.
Under the Mammography Quality Standards Act (MQSA) enacted by Congress in 1992, all U.S. facilities that perform mammography must be certified by the U.S. Food and Drug Administration (FDA) to ensure the use of standardized training for personnel and a standardized mammography technique utilizing a low radiation dose. (Refer to the FDA's web page on Mammography Facility Surveys, Mammography Equipment Evaluations, and Medical Physicist Qualification Requirement under MQSA.) The 1998 MQSA Reauthorization Act requires that patients receive a written lay-language summary of mammography results.
The following Breast Imaging Reporting and Data System (BI-RADS) categories are used for reporting mammographic results:
Most screening mammograms are typically interpreted as negative or benign (BI-RADS 1 or 2, respectively), with about 10% of women in the United States being asked to return for additional evaluation. The percentage of women asked to return for additional evaluation varies not only by the underlying characteristics of each woman but also by mammography facility and radiologist. Extensive literature shows increasing rates of malignancy with BI-RADS assessment categories, with less than 1% risk for diagnosis of cancer within the next year after a BI-RADS 1 or 2 assessment, 2% risk for diagnosis of cancer within the next year after a BI-RADS 3 assessment, and 95% risk for diagnosis of cancer within the next year after a BI-RADS 5 assessment. A BI-RADS 4 can optionally be subdivided into categories 4a, low suspicion (>2% to 10% risk of malignancy); 4b, moderate suspicion (>10% to 50% risk of malignancy); and 4c, high suspicion (>50% to <95% risk of malignancy).
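The optional BI-RADS 4 subdivision described above amounts to a set of risk thresholds. The following sketch encodes those thresholds as stated (4a: >2% to 10%; 4b: >10% to 50%; 4c: >50% to <95%). The function name is hypothetical, and in practice the subcategory is assigned by the interpreting radiologist from imaging features rather than computed from a numeric risk.

```python
def birads4_subcategory(risk):
    """Map an estimated malignancy risk (0 to 1) to a BI-RADS 4 subcategory.

    Returns None when the risk falls outside the BI-RADS 4 range.
    """
    if 0.02 < risk <= 0.10:
        return "4a"  # low suspicion
    if 0.10 < risk <= 0.50:
        return "4b"  # moderate suspicion
    if 0.50 < risk < 0.95:
        return "4c"  # high suspicion
    return None


for r in (0.01, 0.05, 0.30, 0.80, 0.97):
    print(f"risk {r:.2f} -> {birads4_subcategory(r)}")
```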
Digital mammography is more expensive than screen-film mammography (SFM) but is more amenable to data storage and sharing. The net impact of screening with digital mammography versus film mammography, in terms of health outcomes and the net difference in rates of overdiagnosis, is unknown. Performance of SFM and digital mammography on measures such as cancer detection rate, sensitivity, specificity, and positive predictive value (PPV) has been compared directly in several trials, and the trials yielded similar results.
A large cohort of women (n = 42,760) who underwent both digital and film mammography was evaluated at 33 U.S. centers in the Digital Mammographic Imaging Screening Trial (DMIST). No differences in breast cancer detection were observed (area under the receiver operating characteristic curve [AUC] of 0.78 +/- 0.02 for digital and AUC of 0.74 +/- 0.02 for film; P = .18). Digital mammography was better at cancer detection in women younger than 50 years (AUC of 0.84 +/- 0.03 for digital; AUC of 0.69 +/- 0.05 for film; P = .002).
A second DMIST report found that film mammography had a higher AUC in women aged 65 years and older (AUC 0.88 for film; AUC 0.70 for digital; P = .025); however, this finding was not statistically significant when multiple comparisons were considered.
In a large U.S. cohort study, sensitivity for women younger than 50 years was 75.7% (95% CI, 71.7–79.3) for film mammography and 82.4% (95% CI, 76.3–87.5) for digital mammography; specificity was 89.7% (95% CI, 89.6–89.8) for film mammography and 88.0% (95% CI, 87.8–88.2) for digital mammography. A comparison of the findings from 1.5 million digital mammography screens and 4.5 million screen-film mammogram (SFM) screens that were performed in the Netherlands from 2004 to 2010 indicated higher recall and detection rates for the digital mammography screens. Among radiologists who read both digital and SFM exams (n = 1.5 million), the recall rates were 2.0% for digital mammography (95% CI, 2.0–2.1) versus 1.6% for SFM (95% CI, 1.6–1.6); the detection rates were 5.9 per 1,000 (95% CI, 5.7–6.0) for digital mammography and 5.1 per 1,000 (95% CI, 5.0–5.2) for SFM. The PPV was statistically significantly lower in the digital mammography group (PPV, 31.2%; 95% CI, 30.6–31.7) than in the screen-film group (PPV, 34.4%; 95% CI, 33.8–35.0). For women aged 49 to 54 years, the recall rates for digital screens versus film screens were 2.7% versus 2.0%, respectively; the detection rates were 5.1 versus 4.0 per 1,000 screens, respectively; and the PPV was 21.4% versus 22.1%, respectively. For women aged 55 to 74 years, the recall rates for digital screens versus film screens were 1.7% versus 1.4%, respectively; the detection rates were 6.2 versus 5.6 per 1,000 screens, respectively; and the PPV was 35.7% versus 40.1%, respectively.
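Recall rate, detection rate, and PPV are linked: if PPV is taken as cancers detected divided by women recalled, it is approximately the detection rate divided by the recall rate. The sketch below applies that assumed relationship to the rounded Dutch figures quoted above. Because the inputs are rounded and the published PPVs may have been computed on unrounded or differently defined counts, only rough agreement should be expected; the function name is illustrative.

```python
def approx_ppv(detection_per_1000, recall_percent):
    """Approximate PPV as (cancers detected per screen) / (recalls per screen)."""
    return (detection_per_1000 / 1000) / (recall_percent / 100)


# (detection per 1,000, recall %, reported PPV %) as quoted in the summary.
rows = {
    "digital, readers of both": (5.9, 2.0, 31.2),
    "film, readers of both": (5.1, 1.6, 34.4),
    "digital, ages 49-54": (5.1, 2.7, 21.4),
    "film, ages 49-54": (4.0, 2.0, 22.1),
    "digital, ages 55-74": (6.2, 1.7, 35.7),
    "film, ages 55-74": (5.6, 1.4, 40.1),
}

for label, (detection, recall, reported) in rows.items():
    estimate = approx_ppv(detection, recall) * 100
    print(f"{label}: approx PPV {estimate:.1f}% (reported {reported}%)")
```

The pattern in the quoted data follows from this relationship: digital screening raised recall proportionally more than it raised detection, so its PPV was lower.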
A meta-analysis of 10 studies, including the DMIST and the aforementioned U.S. cohort study, compared digital mammography with film mammography in 82,573 women who underwent both types of exam. In a random-effects model, there was no statistically significant difference in cancer detection between the two types of mammography (AUC of 0.92 for film and AUC of 0.91 for digital). For women younger than 50 years, all studies found that sensitivity was higher for digital mammography but that specificity was either the same or higher for film mammography. The meta-analysis found no other differences based on age.
Computed radiography (CR) utilizes a cassette-based removable detector and external reading device to generate a digital image. A large concurrent cohort study compared 254,758 full-field digital mammography (FFDM) screens with 487,334 SFM screens and 74,190 CR screens. Again, the cancer detection rate was not different between FFDM (4.9 per 1,000) and SFM (4.8 per 1,000), although the recall rate was higher for FFDM. Importantly, cancer detection was lower for CR at 3.4 per 1,000, adjusted OR 0.79 (95% CI, 0.68–0.93). Two prior studies of noncontemporaneous cohorts showed no difference between CR and SFM or higher cancer-detection rate from CR.
Computer-aided detection (CAD) systems are designed to help radiologists read mammograms by highlighting suspicious regions such as clustered microcalcifications and masses. Generally, CAD systems increase sensitivity, decrease specificity, and increase detection of ductal carcinoma in situ (DCIS). Several CAD systems are in use. One large population-based study comparing recall rates and breast cancer detection rates before and after the introduction of CAD systems found no change in either rate. Another large study noted an increase in recall rate and increased DCIS detection but no improvement in invasive cancer detection rate.
Using a Surveillance, Epidemiology, and End Results–Medicare linked database, the use of new screening mammography modalities by more than 270,000 women aged 65 years and older in two time periods, 2001 to 2002 and 2008 to 2009, was examined. Digital mammography increased from 2% to 30%, CAD increased from 3% to 33%, and spending increased from $660 million to $962 million. There was no difference in detection rates of early-stage (DCIS or stage I) or late-stage (stage IV) tumors.
Tomosynthesis, or 3-dimensional (3-D) mammography, is similar to standard 2-D mammography in how the examination is performed: the breasts are compressed in the same positions as for mammography, and the examination uses x-rays to create the image. In tomosynthesis, multiple short-exposure x-rays are obtained at different angles as the x-ray tube moves over the breast. This process takes a few seconds longer than a standard mammogram. Individual images are then reconstructed into a series of thin slices that can be viewed individually or like a movie. Cancers and other abnormalities are detected because of differences in density and shape compared with surrounding tissue, with some cancers and other findings causing architectural distortion. Overlapping tissues can be more easily recognized accurately as normal with tomosynthesis, and some cancers are better seen than on standard mammography. In some centers, tomosynthesis-guided biopsy may be available because some cancers seen only on tomosynthesis cannot be found with ultrasound.
The combination of 2-D and 3-D mammography has been reported to be more accurate than 2-D mammography alone, with respect to both improved detection of breast cancer (averaging added yield of 1.3/1,000, similar to CAD) and, importantly, reduction in recall rates. On average, 1.8% fewer women will be recalled for extra testing when tomosynthesis is performed in addition to standard 2-D digital mammography for screening. More than 80% of the cancers detected only with tomosynthesis are invasive and node negative. In particular, tomosynthesis depicts architectural distortion better than standard digital mammography; in one series of 26 cases of architectural distortion in women who had both 2-D and 3-D mammography, 19 (73%) were seen only on tomosynthesis, and 4 (21%) of those 19 were malignant.
When tomosynthesis is performed in combination with 2-D mammography, the resulting radiation exposure to the patient is essentially doubled. This is expected to result in another 1.3 fatal cancers per 100,000 women screened at age 40 years (fewer with increasing age), compared with another 130 cancers detected (see Table 1).
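The comparison above puts two figures on the same scale of 100,000 women screened at age 40 years. A minimal sketch of that arithmetic follows, assuming the added detection yield of 1.3 per 1,000 reported for combined 2-D and 3-D mammography is the source of the 130 additional cancers detected; the variable names are illustrative.

```python
# Figures quoted in the summary.
added_detection_per_1000 = 1.3   # extra cancers detected with tomosynthesis
added_fatal_per_100000 = 1.3     # extra radiation-induced fatal cancers

# Put both on the same scale of 100,000 women screened.
added_detected_per_100000 = added_detection_per_1000 * 100

# Additional cancers detected for each additional fatal cancer induced.
detected_per_induced = added_detected_per_100000 / added_fatal_per_100000

print(f"Additional cancers detected per 100,000: {added_detected_per_100000:.0f}")
print(f"Detected per radiation-induced fatal cancer: {detected_per_induced:.0f}")
```

This ratio compares cancers detected with cancers induced; it does not by itself show how many deaths are prevented, since not every additional detection changes the outcome.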
The performance of tomosynthesis in isolation (with synthetic 2-D mammograms created) has not been adequately validated in practice, with only one reader study and one prospective clinical trial undertaken to date. The effect of annual tomosynthesis on breast cancer mortality has not been tested in a prospective clinical trial.
Tomosynthesis in the diagnostic setting (specifically, evaluation of mammographic abnormalities) has been shown to be at least as effective as spot compression views for workup of noncalcified abnormalities, including asymmetries and distortions. Tomosynthesis is not worse than standard 2-D mammography at allowing suspicious microcalcifications to be identified, but magnification views are typically still needed to characterize suspicious calcifications.
The use of tomosynthesis in both screening and diagnosis may decrease the need for ultrasound and other additional testing (see Table 1). At this time, there are no data on the association of tomosynthesis and overall mortality reduction.
The primary role of ultrasound is the diagnostic evaluation of palpable or mammographically identified masses, rather than serving as a primary screening modality. A review of the literature and expert opinion by the European Group for Breast Cancer Screening concluded that "there is little evidence to support the use of ultrasound in population breast cancer screening at any age." In the setting of normal mammography and ultrasonography, less than 3% of women who have a lump will ultimately be found to have breast cancer.
Breast magnetic resonance imaging (MRI) may be used in women for diagnostic evaluation, including evaluating the integrity of silicone breast implants, assessing palpable masses following surgery or radiation therapy, detecting mammographically and sonographically occult breast cancer in patients with axillary nodal metastasis, and preoperative planning for some patients with known breast cancer. There is no ionizing radiation exposure with this procedure. It has been promoted as a screening test for breast cancer among women at elevated risk of breast cancer based on BRCA1/2 mutation carrier status, a strong family history of breast cancer, or several genetic syndromes such as Li-Fraumeni or Cowden disease. Breast MRI is more sensitive but less specific than screening mammography and is more expensive.
Using infrared imaging techniques, thermography of the breast identifies temperature changes in the skin as an indicator of an underlying tumor, displaying these changes in color patterns. Thermographic devices have been approved by the FDA under the 510(k) process, which does not require evidence of clinical effectiveness. There have been no randomized trials of thermography to evaluate the impact on breast cancer mortality or the ability to detect breast cancer. Small cohort studies do not suggest any additional benefit for the use of thermography as an adjunct modality for breast cancer screening.
Randomized controlled trials (RCTs), with participation by nearly half a million women from four countries, examined the breast cancer mortality rates of women who were offered regular screening. One trial, the Canadian National Breast Screening Study (NBSS)-2, compared mammogram plus clinical breast examination (CBE) with CBE alone; the other eight trials compared screening mammogram with or without CBE to a control consisting of usual care.
The trials differed in design, recruitment of participants, interventions (both screening and treatment), management of the control group, compliance with assignment to screening and control groups, and analysis of outcomes. Some trials used individual randomization, while others used cluster randomization in which cohorts were identified and then offered screening; one trial used nonrandomized allocation by day of birth in any given month. Cluster randomization sometimes led to imbalances between the intervention and control groups. Age differences have been identified in several trials, although the differences were probably too small to have a major effect on the trial outcome. In the Edinburgh Trial, socioeconomic status, which correlates with the risk of breast cancer mortality, differed markedly between the intervention and control groups, so it is difficult, if not impossible, to interpret the results.
Breast cancer mortality is the major outcome parameter for each of these trials, so the methods used to determine cause of death are critically important. Efforts to reduce bias in the attribution of mortality cause have been made, including the use of a blinded monitoring committee (New York) and a linkage to independent data sources, such as national mortality registries (Swedish trials). Unfortunately, these attempts could not ensure a lack of knowledge of women's assignments to screening or control arms. Evidence of possible misclassification of breast cancer deaths in the Two-County Trial with possible bias in favor of screening has been analyzed.
There were also differences in the methodology used to analyze the results of these trials. Four of the five Swedish trials were designed to include a single screening mammogram in the control group, timed to correspond with the end of the series of screening mammograms in the study group. The initial analysis of these trials used an "evaluation" analysis, tallying only the breast cancer deaths that occurred in women whose cancer was discovered at or before the last study mammogram. In some of the trials a delay occurred in the performance of the end-of-study mammogram, resulting in more time for members of the control group to develop or be diagnosed with breast cancer. Other trials used a "follow-up" analysis, which counts all deaths attributed to breast cancer, regardless of the time of diagnosis. This type of analysis was used in a meta-analysis of four of the five Swedish trials in response to concerns about the evaluation analyses.
The accessibility of the data for international audits and verification also varies, with formal audit having been undertaken only in the Canadian trials. Other trials have been audited to varying degrees, usually with less rigor.
All of these studies are designed to study breast cancer mortality rather than all-cause mortality because of the infrequency of breast cancer deaths relative to the total number of deaths. When all-cause mortality in these trials was examined retrospectively, only the Edinburgh Trial showed a significant difference, which could be attributed to socioeconomic differences. The meta-analysis (follow-up methods) of the four Swedish trials also showed a small but significant improvement of all-cause mortality.
Refer to the Appendix of Randomized Controlled Trials section of this summary for a detailed description of the trials.
Screening for breast cancer does not affect overall mortality, and the absolute benefit for breast cancer mortality is small.
A way to view the potential benefit of breast cancer screening is to estimate the number of lives extended because of early breast cancer detection. One author estimated the outcomes of 10,000 women aged 50 to 70 years who undergo a single screen. Mammograms will be normal (true-negatives and false-negatives) in 9,500 women. Of the 500 abnormal screens, 466 to 479 will be false-positives, and 100 to 200 of these women will undergo invasive procedures. The remaining 21 to 34 abnormal screens will be true-positives, indicating breast cancer. Some of these women will die of breast cancer in spite of mammographic detection and optimal therapy, and some may live long enough to die of other causes even if the cancer had not been screen detected. The number of extended lives attributable to mammographic detection is between two and six. Another expression of this analysis is that one life may be extended per 1,700 to 5,000 women screened and followed for 15 years. The same analysis for 10,000 women aged 40 to 49 years, assuming the same 500 abnormal examinations, results in an estimate that 488 of these will be false-positives, and 12 will be breast cancer. Of these 12, there will probably be only one or two lives extended. Thus, for women aged 40 to 49 years, it is estimated that one life may be extended per 5,000 to 10,000 mammograms.
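The single-screen estimate above can be reproduced with a few lines of arithmetic. The sketch below uses only the counts quoted in the paragraph (10,000 women screened, 500 abnormal screens, and the stated false-positive counts and ranges of lives extended); the function name and dictionary keys are hypothetical.

```python
def single_screen_summary(screened, abnormal, false_pos_range, lives_extended_range):
    """Derive normal screens, true-positive range, and women screened per life extended."""
    fp_low, fp_high = false_pos_range
    lives_low, lives_high = lives_extended_range
    return {
        "normal": screened - abnormal,
        # More false-positives among the abnormal screens means fewer true-positives.
        "true_pos_range": (abnormal - fp_high, abnormal - fp_low),
        # More lives extended means fewer women screened per life extended.
        "screened_per_life_range": (screened / lives_high, screened / lives_low),
    }


ages_50_70 = single_screen_summary(10_000, 500, (466, 479), (2, 6))
ages_40_49 = single_screen_summary(10_000, 500, (488, 488), (1, 2))

print("Ages 50-70:", ages_50_70)
print("Ages 40-49:", ages_40_49)
```

For women aged 50 to 70 years this gives 21 to 34 true-positives and roughly 1,700 to 5,000 women screened per life extended; for women aged 40 to 49 years it gives 12 true-positives and 5,000 to 10,000 women screened per life extended.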
While the numbers discussed above are from a single mammography exam, women undergo screening throughout their lifetimes, which can include 20 to 30 years of screening activity. A meta-analysis of RCTs conducted for the U.S. Preventive Services Task Force in 2009 (including the AGE Trial) found that the number needed to invite to screen for 10 years to avoid or delay one death from breast cancer was 1,904 for women in their 40s, 1,339 for women in their 50s, and 377 for women in their 60s. A 2009 combined analysis by six Cancer Intervention and Surveillance Modeling Network modeling groups found that screening every 2 years maintained an average of 81% of the benefit of annual screening with almost half the false-positive results. Screening biennially from age 50 to 69 years achieved a median 16.5% reduction in breast cancer deaths versus no screening. Initiating biennial screening at age 40 years (vs. age 50 years) reduced breast cancer mortality by an additional 3%, consumed more resources, and yielded more false-positive results.
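The number needed to invite (NNI) is the reciprocal of the absolute reduction in breast cancer deaths per woman invited, so the figures above can be restated as deaths avoided or delayed per 1,000 women invited to screening for 10 years. The following sketch performs that conversion; the per-1,000 values are derived here and are not reported in the summary, and the function name is illustrative.

```python
# Number needed to invite to screen for 10 years, by age decade, as quoted.
nni_by_age = {"40s": 1904, "50s": 1339, "60s": 377}


def deaths_averted_per_1000(nni):
    """Convert a number needed to invite into deaths avoided per 1,000 invited."""
    return 1000 / nni


for age_group, nni in nni_by_age.items():
    print(f"Women in their {age_group}: "
          f"{deaths_averted_per_1000(nni):.2f} deaths avoided or delayed per 1,000 invited")
```

Expressed this way, the absolute benefit rises from about 0.5 per 1,000 women invited in their 40s to about 2.7 per 1,000 in their 60s.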
Although the RCTs of screening have addressed the issue of screening efficacy (i.e., the extent to which screening reduces breast cancer mortality under the ideal conditions of an RCT), they do not provide information about the effectiveness of screening (i.e., the extent to which screening is reducing breast cancer mortality in the U.S. population). Studies that provide information about this issue include nonrandomized controlled studies of screened versus nonscreened populations, case-control studies of screening in real communities, and modeling studies that examine the impact of screening on large populations. An important issue in all of these studies is the extent to which they can control for additional effects on breast cancer mortality such as improved treatment and heightened awareness of breast cancer in the community.
Three population-based, observational studies from Sweden compared breast cancer mortality in the presence and absence of screening mammography programs. One study compared two adjacent time periods in 7 of the 25 counties in Sweden and concluded a statistically significant breast cancer mortality reduction of 18% to 32% attributable to screening. The most important bias in this study is that the advent of screening in these counties occurred over a period during which dramatic improvements in the effectiveness of adjuvant breast cancer therapy were being made, changes which were not addressed by the study authors. The second study considered an 11-year period comparing seven counties with screening programs with five counties without them. There was a trend in favor of screening, but again, the authors did not consider the effect of adjuvant therapy or differences in geography (urban vs. rural) that might affect treatment practices.
In part to account for the effects of treatment, the third study was a detailed analysis by county and concluded that screening had little impact. These authors made the assumption that the annual decrease in mortality observed during the prescreening period would carry into the postscreening period, and any screening effect would result in an incremental decrease in mortality. Although no such incremental decrease in breast cancer mortality was observed after the introduction of screening, their assumption makes their conclusion weak. Comparisons across counties showed similar reductions in breast cancer mortality regardless of when the counties' screening programs were initiated; however, the authors carried out no formal cross-county analyses.
In Nijmegen, the Netherlands, where a population-based screening program was undertaken in 1975, a case-cohort study showed that screened women have decreased mortality (odds ratio [OR], 0.48). However, a subsequent study comparing Nijmegen breast cancer mortality rates with neighboring Arnhem in the Netherlands, which had no screening program, showed no difference in breast cancer mortality.
A community-based case-control study of screening as practiced in excellent U.S. health care systems between 1983 and 1998 found no association between previous screening and reduced breast cancer mortality. Mammography screening rates, however, were generally low. The association among women at increased risk due to a family history of breast cancer or a previous breast biopsy (OR, 0.74; 95% confidence interval [CI], 0.50–1.03) was stronger than that among women at average risk (OR, 0.96; 95% CI, 0.80–1.14), but the difference was not statistically significant (P = .17).
A well-conducted ecologic study compared three pairs of neighboring European countries, matched on similarity in health care systems and population structure, one of which had started a national screening program some years earlier than the others. The investigators found that each country had experienced a reduction in breast cancer mortality, with no difference between matched pairs that could be attributed to screening. The authors suggested that improvements in breast cancer treatment and/or health care organizations were more likely responsible for the reduction in mortality than was screening.
A systematic review of ecologic and large cohort studies published through March 2011 compared breast cancer mortality in large populations of women aged 50 to 69 years who started breast cancer screening at different times. Seventeen studies met inclusion criteria. All studies had methodological problems, including control group dissimilarities, insufficient adjustment for differences between areas in breast cancer risk and breast cancer treatment, and problems with similar measurement of breast cancer mortality between compared areas. There was great variation in results among the studies, with four studies finding a relative reduction in breast cancer mortality of 33% or more (with wide CIs) and five studies finding no reduction in breast cancer mortality. Because only a part of the overall reduction in breast cancer mortality could possibly be attributed to screening, the review concluded that any relative reduction in breast cancer mortality resulting from screening would likely be no more than 10%, less than predicted by the RCTs.
A U.S. ecologic analysis conducted between 1976 and 2008 examined the incidence of early-stage versus late-stage breast cancer for women aged 40 years and older. To find a screening effect, the authors compared the magnitude of increase in early-stage cancer with the magnitude of an expected decrease in late-stage cancer. Over the study period, the absolute increase in the incidence of early-stage cancer was 122 cancers per 100,000 women, while the absolute decrease in late-stage cancers was 8 cases per 100,000 women. After adjusting for changes in incidence resulting from hormone therapy and other undefined causes, the authors concluded that the screening effect on breast cancer mortality reduction (28% during this period) was small, and that overdiagnosis of breast cancer was likely between 22% and 31% of all diagnosed breast cancers. Most of the reduction in breast cancer mortality, the authors concluded, was probably because of improved treatment rather than screening. To make these adjustments, the authors made uncertain assumptions about the effects of other factors on incidence, and made no mention of the effects of changing treatment over time. Ecologic studies are difficult to interpret because of this type of potential uncontrolled confounding, as well as these types of unfair comparisons. However, this study largely agrees with some similar analyses from other countries (see studies discussed above). A major limitation of this and other ecologic studies is the failure to account for actual exposure to screening. Most late-stage breast cancer occurs in women not exposed to screening.
A prospective cohort study of community-based screening programs in the United States found that annual compared with biennial screening mammography did not reduce the proportion of unfavorable breast cancers detected in women aged 50 to 74 years or in women aged 40 to 49 years who did not have extremely dense breasts. Women aged 40 to 49 years with extremely dense breasts did have a reduction in cancers larger than 2.0 cm (OR for biennial vs. annual screening, 2.39; 95% CI, 1.37–4.18).
The optimal screening interval has been addressed by modelers. Modeling makes assumptions that may not be correct; however, the credibility of modeling is greater when the model produces overall results that are consistent with randomized trials and when the model is used to interpolate or extrapolate. For example, if a model's output agrees with RCT outcomes for annual screening, it has greater credibility in comparing the relative effectiveness of biennial versus annual screening.
In 2000, the National Cancer Institute formed a consortium of modeling groups (Cancer Intervention and Surveillance Modeling [CISNET]) to address the relative contribution of screening and adjuvant therapy to the observed decline in breast cancer mortality in the United States. (Refer to the Randomized controlled trials section of this summary for more information.) These models gave reductions in breast cancer mortality similar to those expected in the circumstances of the RCTs but updated to the use of modern adjuvant therapy. In 2009, CISNET modelers addressed several questions related to the harms and benefits of mammography, including comparing annual versus biennial screening. The proportion of reduction in breast cancer mortality maintained in moving from annual to biennial screening for women aged 50 to 74 years ranged across the six models from 72% to 95%, with a median of 80%.
Several studies have shown that the method of cancer detection is a powerful predictor of patient outcome, which is useful for prognostication and treatment decisions. All of the studies accounted for stage, nodal status, and tumor size.
A 10-year follow-up study of 1,983 Finnish women with invasive breast cancer demonstrated that the method of cancer detection is an independent prognostic variable. When controlled for age, nodal status, and tumor size, screen-detected cancers had a lower risk of relapse and better overall survival. For women whose cancers were detected outside screening, the hazard ratio (HR) for death was 1.90 (95% confidence interval [CI], 1.15–3.11), even though they were more likely to receive adjuvant systemic therapy.
Similarly, an examination of the breast cancers found in three randomized screening trials (Health Insurance Plan, National Breast Screening Study [NBSS]-1, and NBSS-2) accounted for stage, nodal status, and tumor size and determined that patients whose cancer was found via screening have a more favorable prognosis. The relative risks for death were 1.53 (95% CI, 1.17–2.00) for interval and incident cancers, compared with screen-detected cancers; and 1.36 (95% CI, 1.10–1.68) for cancers in the control group, compared with screen-detected cancers.
A third study compared the outcomes of 5,604 English women with screen-detected cancers to those with symptomatic breast cancers diagnosed between 1998 and 2003. After controlling for tumor size, nodal status, grade, and patient age, researchers found that the women with screen-detected cancers fared better than their symptomatic counterparts. The HR for survival of the symptomatic women was 0.79 (95% CI, 0.63–0.99). Thus, method of cancer detection is a powerful predictor of patient outcome, which is useful for prognostication and treatment decisions. The findings of this study are also consistent with the evidence that some screen-detected cancers are low risk and represent overdiagnosis.
Several characteristics of women being screened that are associated with the accuracy of mammography include age, breast density, whether it is the first or subsequent exam, and time since last mammogram. Younger women have lower sensitivity and higher false-positive rates on screening mammography than do older women (refer to the Breast Cancer Surveillance Consortium performance measures by age for more information).
For women of all ages, high breast density is associated with 10% to 29% lower sensitivity. High breast density is an inherent trait, which can be familial but also may be affected by age, endogenous and exogenous hormones, selective estrogen receptor modulators such as tamoxifen, and diet. Hormone therapy is associated with increased breast density and is associated not only with lower sensitivity but also with an increased rate of interval cancers.
The Million Women Study in the United Kingdom revealed three patient characteristics that were associated with decreased sensitivity and specificity of screening mammograms in women aged 50 to 64 years: use of postmenopausal hormone therapy, prior breast surgery, and body mass index below 25. In addition, a longer interval since the last mammogram increases sensitivity, recall rate, and cancer detection rate and decreases specificity.
Strategies have been proposed to improve mammographic sensitivity by altering diet, timing mammograms with menstrual cycles, interrupting hormone therapy before the examination, or using digital mammography machines. Obese women have more than a 20% increased risk of having false-positive mammography results compared with underweight and normal weight women, although sensitivity is unchanged.
Some cancers are more easily detected by mammography than other cancers are. In particular, mucinous, lobular, and rapidly growing cancers can be missed because their appearance on x-rays is similar to that of normal breast tissue. Medullary carcinomas may be similarly missed. Some cancers, particularly those associated with BRCA1/2 mutations, masquerade as benign tumors.
Radiologist performance is critical to assessing mammographic interpretive performance, yet there is substantial, well-documented variability among radiologists. Factors that influence radiologists' performance include their level of experience and the volume of mammograms they interpret. There is often a trade-off between sensitivity and specificity, such that higher sensitivity may be associated with lower specificity. Radiologists in academic settings have a higher positive predictive value (PPV) of their recommendations to undergo biopsy than do community radiologists. Fellowship training in breast imaging may lead to improved cancer detection, but it is associated with higher false-positive rates.
After controlling for patient and radiologist characteristics, screening mammography interpretive performance (specificity, PPV, area under the curve [AUC]) varies by facility and is associated with facility-level characteristics. Higher interpretive accuracy of screening mammography was seen at facilities that offered screening examinations alone, included a breast imaging specialist on staff, did single as opposed to double readings, and reviewed interpretive audits two or more times each year.
False-positive rates vary significantly between facilities performing diagnostic mammography and are higher at facilities where concern about malpractice is high. False-positive rates are also higher at facilities serving vulnerable women (women of racial or ethnic minorities and women with lower educational attainment, limited household income, or rural residence) than at facilities serving nonvulnerable women, perhaps because of poorer compliance with recommendations for follow-up examinations. Analyses that do not adjust for important patient characteristics may falsely conclude that there is more facility variation in overall accuracy than actually exists.
International comparisons of screening mammography have found higher specificity in countries with more highly centralized screening systems and national quality assurance programs. For example, one study reported that the recall rate is twice as high in the United States as it is in the United Kingdom, yet there is no difference in the rate of cancers detected. Such comparisons may be confounded by social, cultural, and economic factors.
The likelihood of diagnosing cancer is highest with the prevalent (first) screening examination, ranging from 9 to 26 cancers per 1,000 screens, depending on the woman's age. The likelihood decreases for follow-up examinations, ranging from 1 to 3 cancers per 1,000 screens. The optimal interval between screening mammograms is unknown. In particular, the breast cancer mortality-focused, randomized, controlled trials used single screening intervals with little variability across the trials. A prospective United Kingdom trial randomly assigned women aged 50 to 62 years to receive mammograms annually or at the standard 3-year interval. Although the grade and node status were similar in both groups, more cancers of slightly smaller size were detected in the annual screening group, with a lead time of approximately 7 months in comparison with triennial screening.
A large observational study found a slightly increased risk of late-stage disease at diagnosis for women in their 40s who were adhering to a 2-year versus a 1-year schedule (28% vs. 21%; odds ratio [OR], 1.35; 95% confidence interval [CI], 1.01–1.81), but no difference was seen for women in their 50s or 60s.
A Finnish study of 14,765 women aged 40 to 49 years assigned women born in even-numbered years to annual screens and women born in odd-numbered years to triennial screens. The study was small in terms of number of deaths, with low power to discriminate breast cancer mortality between the two groups. There were 18 deaths from breast cancer in 100,738 life-years in the triennial screening group and 18 deaths from breast cancer in 88,780 life-years in the annual screening group (hazard ratio, 0.88; 95% CI, 0.59–1.27).
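Because the two groups contributed different numbers of life-years, the equal death counts are easier to compare as crude rates. The sketch below uses only the counts quoted above; the function name and the per-100,000 scale are illustrative assumptions, and a crude rate ratio is not the same as the adjusted hazard ratio reported by the study.

```python
def deaths_per_100k_life_years(deaths, life_years):
    """Crude breast cancer mortality rate per 100,000 life-years."""
    return deaths / life_years * 100_000


# Counts quoted in the text for the Finnish study.
triennial_rate = deaths_per_100k_life_years(deaths=18, life_years=100_738)
annual_rate = deaths_per_100k_life_years(deaths=18, life_years=88_780)

# Crude ratio of the triennial rate to the annual rate (no adjustment).
crude_ratio = triennial_rate / annual_rate

print(f"Triennial: {triennial_rate:.1f} per 100,000 life-years")
print(f"Annual:    {annual_rate:.1f} per 100,000 life-years")
print(f"Crude ratio (triennial/annual): {crude_ratio:.2f}")
```

The crude rates are about 17.9 and 20.3 deaths per 100,000 life-years, and their ratio of about 0.88 is numerically close to the reported hazard ratio. The text does not state the direction of the reported comparison, so this is an observation about the arithmetic rather than an interpretation of the study's estimate.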
Mammography screening may be effective in reducing breast cancer mortality in certain populations, but it can pose harm to w