Posted Date: Apr 27, 2014
Expert-reviewed information summary about tests used to detect or screen for breast cancer.
Note: Separate PDQ summaries on Breast Cancer Prevention, Breast Cancer Treatment, Male Breast Cancer Treatment, and Breast Cancer Treatment and Pregnancy are also available.
This summary covers the topic of breast cancer screening and includes information about breast cancer incidence and mortality, risk factors for breast cancer, the process of breast cancer diagnosis, and the benefits and harms of various breast cancer screening modalities. This summary also includes information about screening among special populations.
Mammography is the most widely used screening modality, with solid evidence of benefit for women aged 40 to 74 years. Clinical breast examination and breast self-exam have also been evaluated but are of uncertain benefit. Technologies such as ultrasound, magnetic resonance imaging, tomosynthesis, and molecular breast imaging are being evaluated, usually as adjuncts to mammography.
Based on solid evidence, screening mammography may lead to the following benefit:
Based on solid evidence, screening mammography may lead to the following harms:
For all these potential harms of screening mammography, internal validity, consistency and external validity are good.
Clinical breast examination (CBE) has not been tested independently; it was used in conjunction with mammography in one Canadian trial, and was the comparator modality versus mammography in another trial. Thus, it is not possible to assess the efficacy of CBE as a screening modality when it is used alone versus usual care (no screening activity).
Screening by CBE may lead to the following harms:
Breast self-examination (BSE) has been compared to usual care (no screening activity) and has not been shown to reduce breast cancer mortality.
Based on solid evidence, formal instruction and encouragement to perform BSE leads to more breast biopsies and diagnosis of more benign breast lesions.
Breast cancer is the most common noncutaneous cancer in U.S. women, with an estimated 62,570 cases of in situ disease, 232,670 new cases of invasive disease, and 40,000 deaths expected in 2014. Thus, fewer than 1 of 6 women diagnosed with breast cancer die of the disease. By comparison, about 72,330 American women are estimated to die of lung cancer in 2014. Males account for 1% of breast cancer cases and breast cancer deaths (refer to the Special Populations section of this summary for more information).
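As a rough check of the "fewer than 1 of 6" comparison above, the quoted 2014 estimates can be divided directly. This is only a sketch: dividing one year's expected deaths by the same year's expected diagnoses is a crude ratio, not a true case-fatality rate, and the variable names are illustrative.

```python
# 2014 U.S. estimates quoted in the paragraph above.
in_situ_cases = 62_570
invasive_cases = 232_670
deaths = 40_000

# Crude deaths-per-diagnosis ratios (not a cohort case-fatality rate).
ratio_all_diagnoses = deaths / (in_situ_cases + invasive_cases)
ratio_invasive_only = deaths / invasive_cases

print(f"All diagnoses (in situ + invasive): {ratio_all_diagnoses:.3f}")
print(f"Invasive diagnoses only:            {ratio_invasive_only:.3f}")
print(f"One in six:                         {1 / 6:.3f}")
```

With these figures the ratio falls below 1 in 6 when in situ cases are counted in the denominator, and sits slightly above it when only invasive cases are counted.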
Widespread adoption of screening increases breast cancer incidence in a given population and changes the characteristics of cancers detected, with increased incidence of lower-risk cancers, premalignant lesions, and ductal carcinoma in situ (DCIS). (Refer to the Ductal Carcinoma In Situ section in the Breast Cancer Diagnosis and Pathology section of this summary for more information.) Ecologic studies from the United States and the United Kingdom demonstrate an increase in DCIS and invasive breast cancer incidence since the 1970s, attributable to the widespread adoption of both postmenopausal hormone therapy and screening mammography. In the last decade, women have refrained from using postmenopausal hormones, and breast cancer incidence has declined, but not to the levels seen prior to the widespread use of screening mammography.
One might expect that if screening identifies cancers before they cause clinical symptoms, then the period of screening will be followed by a period of compensatory decline in cancer rates, either in annual population incidence rates or in incidence rates in older women. However, no compensatory drop in incidence rates has ever been seen following the adoption of screening, suggesting that screening leads to overdiagnosis, the identification of clinically insignificant cancers (refer to the Overdiagnosis section in the Harms of Screening Mammography section of this summary for more information).
Breast cancer incidence and mortality risk also vary according to geography, culture, race, ethnicity, and socioeconomic status (refer to the Special Populations section of this summary for more information).
Breast cancer risk is affected by many factors besides participation in screening activities. Understanding and quantifying these risks is important to a woman, to her physicians, and to public policy makers.
The incidence of breast cancer increases with a woman's age. As shown in Table 1, a 60-year-old woman has a higher risk of being diagnosed with breast cancer in the next 10 years than does a 40-year-old woman.
The cumulative lifetime incidence decreases with advancing age because the longer a woman lives without a breast cancer diagnosis, the lower her lifetime risk compared to a younger woman who might develop breast cancer at a younger or older age. The commonly quoted risk of one in eight women who will be diagnosed with breast cancer is based on lifetime risk of a diagnosis (not death) starting from birth and does not account for the woman's current age.
Breast cancer mortality increases with age. For a 40-year-old woman without a breast cancer diagnosis, the chance of dying from breast cancer within the next 10 years is extremely small, but for a woman older than 65, it is about 1%. For a woman older than 70, the risk of dying of breast cancer is even higher, but the risk of dying of any cause is higher yet.
Women with a personal history of invasive breast cancer, DCIS, or lobular carcinoma in situ also have an increased risk of being diagnosed with a new primary breast cancer. Recommendations for subsequent mammograms vary, but evidence for various strategies is scant.
Women treated with thoracic radiation before the age of 30 years have a 1% annual risk of breast cancer, starting 8 years after the irradiation and for the rest of their lives. Annual screening with magnetic resonance imaging (MRI) has been proposed in such women, beginning 8 years after treatment or by age 25 years, whichever is later. In a study of screening with mammography and MRI, 13 cancers were diagnosed among 98 asymptomatic women who received a chest radiation dose of 15 Gy or less for pediatric or adult cancer. Four of those cancers would not have been detected without the use of MRI. Another study of multiple screening modalities observed a similar increase in cancer detection with the addition of MRI. These data suggest that earlier detection is possible with MRI, but do not demonstrate a definitive benefit of adjunct MRI screening.
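The study counts quoted above can be turned into two simple proportions. The sketch below uses only the numbers in the paragraph; the variable names are illustrative, and the proportions describe this one study, not the expected yield of MRI screening in general.

```python
# Counts quoted from the mammography-plus-MRI study above.
women_screened = 98
cancers_detected = 13
cancers_found_only_by_mri = 4

# Share of screened women in whom a cancer was diagnosed.
detection_yield = cancers_detected / women_screened

# Share of the detected cancers that mammography alone would have missed.
mri_only_share = cancers_found_only_by_mri / cancers_detected

print(f"Cancers per woman screened: {detection_yield:.1%}")
print(f"Detected only by MRI:       {mri_only_share:.1%}")
```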
Women with radiologically dense breasts (heterogeneously dense or extremely dense in the terminology of the Breast Imaging Reporting and Data System [BI-RADS]) have a threefold to sixfold increased risk of breast cancer compared with women who have fatty breasts.
Other risk factors for breast cancer include an inherited predisposition (BRCA1 or BRCA2, and others); early age at menarche and late age at first birth; and previous breast biopsies showing benign proliferative breast disease. Menopausal hormone use, obesity, lack of physical activity, and alcohol intake are associated with an increased risk of breast cancer. (Refer to the PDQ summaries on Cancer Prevention Overview and Breast Cancer Prevention for more information.) Several models estimate an individual woman's risk based on these and other factors.
Women with breast symptoms are not candidates for screening because they require a diagnostic evaluation. During a 10-year period, 16% of 2,400 women aged 40 to 69 years sought medical attention for breast symptoms at their health maintenance organization. Women younger than 50 years were twice as likely to seek evaluation. Additional testing was performed in 66% of these women, including invasive procedures performed in 27%. Cancer was diagnosed in 6.2%, most often stage II or III. Of the breast symptoms prompting medical attention, a mass was most likely to lead to a cancer diagnosis (10.7%) and pain was least likely (1.8%) to do so.
Breast cancer is most often diagnosed by pathologic review of a fixed specimen of breast tissue. The breast tissue can be obtained from a symptomatic area or from an area identified by an imaging test. A palpable lesion can be biopsied with core needle biopsy or, less often, fine-needle aspiration biopsy or surgical excision; image guidance improves accuracy. Nonpalpable lesions can be sampled by core needle biopsy using stereotactic x-ray or ultrasound guidance or can be surgically excised after image-guided localization. In a retrospective study of 939 patients with 1,042 mammographically detected lesions who underwent core needle biopsy or surgical needle localization under x-ray guidance, sensitivity for malignancy was greater than 95% and the specificity was greater than 90%. Compared with surgical needle localization under x-ray guidance, core needle biopsy resulted in fewer surgical procedures for definitive treatment, with a higher likelihood of clear surgical margins at the initial excision.
Ductal carcinoma in situ (DCIS) is a noninvasive condition that can evolve to invasive cancer, with variable frequency and time course. Some authors include DCIS with invasive breast cancer statistics, but others argue that the term be replaced by ductal intraepithelial neoplasia, similar to the terminology used for cervical and prostate precursor lesions, and that breast cancer statistics exclude these DCIS cases.
DCIS is most often diagnosed by mammography. In the United States, only 4,900 women were diagnosed with DCIS in 1983, compared with approximately 64,000 women who are expected to be diagnosed in 2013, when mammographic screening has been widely adopted. The Canadian National Breast Screening Study-2 of women aged 50 to 59 years found a fourfold increase in DCIS cases in women screened by clinical breast examination (CBE) plus mammography compared with those screened by CBE alone, with no difference in breast cancer mortality. (Refer to the PDQ summary on Breast Cancer Treatment for more information.)
The natural history of DCIS is poorly understood because nearly all DCIS cases are treated. A single retrospective review of 11,760 breast biopsies performed between 1952 and 1968 identified 28 cases of DCIS, which were detected by physical examination, biopsied without resection, and then followed for 30 years. Nine women developed invasive breast cancer and four women died of the disease. These findings are interesting but probably not relevant to women with screen-detected DCIS in an era of improved cancer care.
Development of breast cancer after treatment of DCIS depends on the characteristics of the lesion but also on the delivered treatment. One large randomized trial found that 13.4% of women treated by lumpectomy alone developed ipsilateral invasive breast cancer within 90 months, compared with 3.9% of those treated by lumpectomy and radiation. The best evidence indicates that most DCIS lesions will not evolve to invasive cancer and that those that do can still usually be managed successfully, even after that transition. Thus, the detection and treatment of nonpalpable DCIS often represents overdiagnosis and overtreatment.
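The two trial percentages above can be restated as an absolute risk reduction, a relative risk, and a number needed to treat. This is a sketch that assumes the 13.4% and 3.9% figures are cumulative risks over the same 90-month period; the derived quantities are not reported in the text and are computed here only for illustration.

```python
# Ipsilateral invasive breast cancer within 90 months, as quoted above.
risk_lumpectomy_alone = 0.134
risk_lumpectomy_plus_radiation = 0.039

# Absolute risk reduction: difference between the two cumulative risks.
absolute_risk_reduction = risk_lumpectomy_alone - risk_lumpectomy_plus_radiation

# Relative risk: risk with radiation divided by risk without it.
relative_risk = risk_lumpectomy_plus_radiation / risk_lumpectomy_alone

# Number needed to treat with radiation to prevent one invasive recurrence
# over the same period (derived quantity, not stated in the source).
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"Absolute risk reduction: {absolute_risk_reduction:.3f}")
print(f"Relative risk:           {relative_risk:.2f}")
print(f"Number needed to treat:  {number_needed_to_treat:.1f}")
```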
Among women diagnosed with (and treated for) DCIS between 1984 and 1989, only 1.9% died of breast cancer within 10 years, which was a lower mortality rate than for the age-matched population at large. This favorable outcome may reflect the benign nature of the condition, the benefits of treatment, or the volunteer effect (women undergoing breast cancer screening are generally healthier than those who do not).
Attempts to define low-risk DCIS cases that can be managed with less therapy are important. One such effort analyzed a series of 706 DCIS patients who were monitored to develop the University of Southern California/Van Nuys Prognostic Scoring Index, which defines the risk of recurrent DCIS and invasive cancer among women with DCIS based on age, margin width, tumor size, and grade. The low-risk group, comprising a third of the cases, experienced only 1% DCIS recurrences and no invasive cancers, independent of the use of postoperative radiation therapy. The moderate- and high-risk groups had higher recurrence rates, and they benefited from postlumpectomy radiation. Overall, only approximately 1% died of breast cancer. In a separate study, adjuvant tamoxifen therapy was shown to reduce the incidence of invasive breast cancer.
Numerous uncontrolled trials and retrospective series have documented the ability of mammography to diagnose small, early-stage breast cancers, which have a favorable clinical course. Although several trials also show better cancer-related survival in screened versus nonscreened women, a number of important biases may explain that finding:
Because the extent of these biases is never clear in any particular study, most groups rely on randomized controlled trials to assess the benefits of screening. (Refer to the Cancer Screening Summary Overview for more information.)
Performance benchmarks for screening mammography in the United States are described on the Breast Cancer Surveillance Consortium (BCSC) Web site.
The sensitivity of mammography is the percentage of breast cancers detected in a given population, when breast cancer is present. Sensitivity depends on tumor size, conspicuity, and hormone sensitivity as well as breast tissue density, patient age, timing within the menstrual cycle, overall image quality, and interpretive skill of the radiologist. Overall sensitivity is approximately 79% but is lower in younger women and in those with dense breast tissue (see the BCSC Web site). Delay in diagnosis of breast cancer is the most common cause of medical malpractice litigation and half of the cases resulting in payment to the claimant involve false-negative mammograms.
The specificity of mammography is the likelihood of the test being normal when cancer is absent, whereas the false-positive rate is the likelihood of the test being abnormal when cancer is absent. If specificity is low, many false-positive examinations result in unnecessary follow-up examinations and procedures. (Refer to the subsection on Harms in the Screening With Mammography section of the Overview section of this summary for more information.)
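The definitions of sensitivity, specificity, and false-positive rate given above can be written out from the four cells of a screening outcome table. The counts in this sketch are hypothetical: they were chosen so that sensitivity matches the approximately 79% quoted above, and the 90% specificity is an assumption made only for illustration.

```python
def screening_metrics(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity, false_positive_rate)."""
    # Sensitivity: share of women with cancer whose test is abnormal.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of women without cancer whose test is normal.
    specificity = true_neg / (true_neg + false_pos)
    # False-positive rate: share of women without cancer whose test is abnormal.
    false_positive_rate = false_pos / (false_pos + true_neg)
    return sensitivity, specificity, false_positive_rate


# Hypothetical counts for 10,000 screened women (not data from the source).
sens, spec, fpr = screening_metrics(
    true_pos=79, false_neg=21, false_pos=990, true_neg=8_910
)

print(f"Sensitivity:         {sens:.2f}")
print(f"Specificity:         {spec:.2f}")
print(f"False-positive rate: {fpr:.2f}")
```

Specificity and the false-positive rate always sum to 1, which is why low specificity translates directly into more false-positive examinations.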
Interval cancers are cancers that are diagnosed in the interval after a normal screening examination and before the subsequent screen. Some of these cancers were present at the time of mammography (false-negatives), and others grew rapidly in the interval between mammography and detection. As a general rule, interval cancers have characteristics of rapid growth and are frequently of advanced stage at the time of discovery/diagnosis.
One study of 576 women with interval cancers reported that interval cancers are more prevalent in women aged 40 to 49 years. Interval cancers appearing within 12 months of a negative screening mammogram appear to be related to decreased mammographic sensitivity, attributable to greater breast density in 68% of cases. Those appearing within a 24-month interval appear to be related both to decreased mammographic sensitivity due to greater breast density in 37.6% and to rapid tumor growth in 30.6%.
Another study that compared the characteristics of 279 screen-detected cancers with those of 150 interval cancers found that interval cancers were much more likely to occur in women younger than 50 years and to be of mucinous or lobular histology; or to have high histologic grade, high proliferative activity, relatively benign features mammographically and/or to lack calcifications. Screen-detected cancers were more likely to have tubular histology; to be smaller, low stage, and hormone sensitive; and to have a major component of ductal carcinoma in situ.
Mammography utilizes ionizing radiation to image breast tissue. The examination is performed by compressing the breast firmly between two plates. Such compression spreads out overlapping tissues and reduces the amount of radiation needed to image the breast. For routine screening in the United States, examinations are taken in both mediolateral oblique and craniocaudal projections. Both views should include breast tissue from the nipple to the pectoral muscle. Radiation exposure is 4 to 24 mSv per standard two-view screening examination. Two-view examinations are associated with a lower recall rate than are single-view examinations because they eliminate concern about abnormalities due to superimposition of normal breast structures.
Under the Mammography Quality Standards Act (MQSA) enacted by Congress in 1992, all U.S. facilities that perform mammography must be certified by the U.S. Food and Drug Administration (FDA) to ensure the use of standardized training for personnel and a standardized mammography technique utilizing a low radiation dose. (Refer to the FDA's Web page on Mammography Facility Surveys, Mammography Equipment Evaluations, and Medical Physicist Qualification Requirement under MQSA.) The 1998 MQSA Reauthorization Act requires that patients receive a written lay-language summary of mammography results.
On screening, the following Breast Imaging Reporting and Data System (BI-RADS) assessments are used: 1, negative; 2, benign; or 0, incomplete with additional evaluation needed.
About 10% of women screened will be recalled for additional evaluation; more than 80% of these will be considered normal or benign after a full diagnostic workup, which may include additional mammographic views, ultrasound, or both. About 15% of women recalled will be recommended for biopsy, with 30% of cases assessed as BI-RADS 4, suspicious, yielding cancer; and 95% of cases assessed as BI-RADS 5, highly suggestive of malignancy, yielding cancer. About 2% of women screened will be recommended for short-interval follow-up, assessed as BI-RADS 3, probably benign, with fewer than 2% of such women ultimately found to have cancer.
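The percentages above can be applied to a cohort to show how many women reach each step. The sketch below assumes a hypothetical cohort of 10,000 screened women and treats the approximate percentages as exact; "more than 80%" and "fewer than 2%" are used as lower and upper bounds, respectively.

```python
screened = 10_000  # hypothetical cohort size, chosen for round numbers

# About 10% of women screened are recalled for additional evaluation.
recalled = screened * 10 // 100

# More than 80% of recalled women are normal or benign after workup (lower bound).
benign_after_workup_min = recalled * 80 // 100

# About 15% of recalled women are recommended for biopsy.
biopsy_recommended = recalled * 15 // 100

# About 2% of women screened are assessed as BI-RADS 3 (short-interval follow-up).
short_interval_followup = screened * 2 // 100

# Fewer than 2% of BI-RADS 3 women are ultimately found to have cancer (upper bound).
cancers_in_birads3_max = short_interval_followup * 2 // 100

print(f"Recalled:                   {recalled}")
print(f"Normal/benign (at least):   {benign_after_workup_min}")
print(f"Biopsy recommended:         {biopsy_recommended}")
print(f"BI-RADS 3 follow-up:        {short_interval_followup}")
print(f"Cancers in BI-RADS 3 (max): {cancers_in_birads3_max}")
```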
Randomized controlled trials (RCTs), with participation by nearly half a million women from four countries, examined the breast cancer mortality rates of women who were offered regular screening. One trial, the Canadian National Breast Screening Study (NBSS)-2, compared mammogram plus clinical breast examination (CBE) with CBE alone; the other eight trials compared screening mammogram with or without CBE to a control consisting of usual care.
The trials differed in design, recruitment of participants, interventions (both screening and treatment), management of the control group, compliance with assignment to screening and control groups, and analysis of outcomes. Some trials used individual randomization, while others used cluster randomization in which cohorts were identified and then offered screening; one trial used nonrandomized allocation by day of birth in any given month. Cluster randomization sometimes led to imbalances between the intervention and control groups. Age differences have been identified in several trials, although the differences were probably too small to have a major effect on the trial outcome. In the Edinburgh Trial, socioeconomic status, which correlates with the risk of breast cancer mortality, differed markedly between the intervention and control groups, so it is difficult, if not impossible, to interpret the results.
Breast cancer mortality is the major outcome parameter for each of these trials, so the methods used to determine cause of death are critically important. Efforts to reduce bias in the attribution of mortality cause have been made, including the use of a blinded monitoring committee (New York) and a linkage to independent data sources, such as national mortality registries (Swedish trials). Unfortunately, these attempts could not ensure a lack of knowledge of women's assignments to screening or control arms. Evidence of possible misclassification of breast cancer deaths in the Two-County Trial with possible bias in favor of screening has been analyzed.
There were also differences in the methodology used to analyze the results of these trials. Four of the five Swedish trials were designed to include a single screening mammogram in the control group, timed to correspond with the end of the series of screening mammograms in the study group. The initial analysis of these trials used an "evaluation" analysis, tallying only the breast cancer deaths that occurred in women whose cancer was discovered at or before the last study mammogram. In some of the trials a delay occurred in the performance of the end-of-study mammogram, resulting in more time for members of the control group to develop or be diagnosed with breast cancer. Other trials used a "follow-up" analysis, which counts all deaths attributed to breast cancer, regardless of the time of diagnosis. This type of analysis was used in a meta-analysis of four of the five Swedish trials in response to concerns about the evaluation analyses.
The accessibility of the data for international audits and verification also varies, with formal audit having been undertaken only in the Canadian trials. Other trials have been audited to varying degrees, usually with less rigor.
All of these studies are designed to study breast cancer mortality rather than all-cause mortality because of the infrequency of breast cancer deaths relative to the total number of deaths. When all-cause mortality in these trials was examined retrospectively, only the Edinburgh Trial showed a significant difference, which could be attributed to socioeconomic differences. The meta-analysis (follow-up methods) of the four Swedish trials also showed a small but significant improvement of all-cause mortality.
The trials are described in detail in the Appendix of Randomized Controlled Trials section of this summary.
Screening for breast cancer does not affect overall mortality, and the absolute benefit for breast cancer mortality is small.
A way to view the potential benefit of breast cancer screening is to estimate the number of lives extended because of early breast cancer detection. One author estimated the outcomes of 10,000 women aged 50 to 70 years who undergo a single screen. Mammograms will be normal (true-negatives and false-negatives) in 9,500 women. Of the 500 abnormal screens, 466 to 479 will be false-positives, and 100 to 200 of these women will undergo invasive procedures. The remaining 21 to 34 abnormal screens will be true-positives, indicating breast cancer. Some of these women will die of breast cancer in spite of mammographic detection and optimal therapy, and some may live long enough to die of other causes even if the cancer had not been screen detected. The number of extended lives attributable to mammographic detection is between two and six. Another expression of this analysis is that one life may be extended per 1,700 to 5,000 women screened and followed for 15 years. The same analysis for 10,000 women aged 40 to 49 years, assuming the same 500 abnormal examinations, results in an estimate that 488 of these will be false-positives, and 12 will be breast cancer. Of these 12, there will probably be only one or two lives extended. Thus, for women aged 40 to 49 years, it is estimated that one or two lives may be extended per 5,000 to 10,000 mammograms.
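The arithmetic behind the single-screen estimate above can be retraced step by step. The sketch uses only the figures quoted in the paragraph; the positive predictive value is a derived quantity that the text does not state, and the variable names are illustrative.

```python
screened = 10_000
abnormal = 500
normal = screened - abnormal  # 9,500 normal results (true- and false-negatives)

# Women aged 50 to 70 years: ranges quoted above, as (low, high).
false_positives = (466, 479)
true_positives = (abnormal - false_positives[1], abnormal - false_positives[0])
lives_extended = (2, 6)

# Share of abnormal screens that are cancer (derived, not stated in the text).
ppv_range = (true_positives[0] / abnormal, true_positives[1] / abnormal)

# Women screened per life extended: best case uses 6 lives, worst case 2.
women_per_life_50_70 = (screened / lives_extended[1], screened / lives_extended[0])

# Women aged 40 to 49 years: 488 false-positives, 12 cancers, 1 or 2 lives extended.
cancers_40_49 = abnormal - 488
women_per_life_40_49 = (screened / 2, screened / 1)

print(f"True-positives (50-70):  {true_positives}")
print(f"PPV range (50-70):       {ppv_range[0]:.1%} to {ppv_range[1]:.1%}")
print(f"Women per life (50-70):  {women_per_life_50_70[0]:.0f} to {women_per_life_50_70[1]:.0f}")
print(f"Cancers (40-49):         {cancers_40_49}")
print(f"Women per life (40-49):  {women_per_life_40_49[0]:.0f} to {women_per_life_40_49[1]:.0f}")
```

The lower bound for women aged 50 to 70 years comes out near 1,667, which the text rounds to 1,700.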
While the numbers discussed above are from a single mammography exam, women undergo screening throughout their lifetimes, which can include 20 to 30 years of screening activity. A meta-analysis of RCTs conducted for the U.S. Preventive Services Task Force in 2009 (including the AGE Trial) found that the number needed to invite to screen for 10 years to avoid or delay one death from breast cancer was 1,904 for women in their 40s, 1,339 for women in their 50s, and 377 for women in their 60s. A 2009 combined analysis by six Cancer Intervention and Surveillance Modeling Network modeling groups found that screening every 2 years maintained an average of 81% of the benefit of annual screening with almost half the false-positive results. Screening biennially from age 50 to 69 years achieved a median 16.5% reduction in breast cancer deaths versus no screening. Initiating biennial screening at age 40 years (vs. age 50 years) reduced breast cancer mortality by an additional 3%, consumed more resources, and yielded more false-positive results.
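A number needed to invite (NNI) is the reciprocal of an absolute risk reduction, so the three figures above can be restated as breast cancer deaths avoided or delayed per 1,000 women invited to 10 years of screening. This is a sketch of that conversion only; the per-1,000 values are derived here and are not quoted in the text.

```python
# Number needed to invite to screen for 10 years, by age decade, as quoted above.
number_needed_to_invite = {"40s": 1904, "50s": 1339, "60s": 377}

# Deaths avoided or delayed per 1,000 women invited = 1,000 / NNI.
deaths_averted_per_1000 = {
    age: 1000 / nni for age, nni in number_needed_to_invite.items()
}

for age, averted in deaths_averted_per_1000.items():
    print(f"Women in their {age}: {averted:.2f} per 1,000 invited")
```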
Although the RCTs of screening have addressed the issue of screening efficacy (i.e., the extent to which screening reduces breast cancer mortality under the ideal conditions of an RCT), they do not provide information about the effectiveness of screening (i.e., the extent to which screening is reducing breast cancer mortality in the U.S. population). Studies that provide information about this issue include nonrandomized controlled studies of screened versus nonscreened populations, case-control studies of screening in real communities, and modeling studies that examine the impact of screening on large populations. An important issue in all of these studies is the extent to which they can control for additional effects on breast cancer mortality such as improved treatment and heightened awareness of breast cancer in the community.
Three population-based, observational studies from Sweden compared breast cancer mortality in the presence and absence of screening mammography programs. One study compared two adjacent time periods in 7 of the 25 counties in Sweden and concluded a statistically significant breast cancer mortality reduction of 18% to 32% attributable to screening. The most important bias in this study is that the advent of screening in these counties occurred over a period during which dramatic improvements in the effectiveness of adjuvant breast cancer therapy were being made, changes which were not addressed by the study authors. The second study considered an 11-year period comparing seven counties with screening programs to five counties without them. There was a trend in favor of screening, but again, the authors did not consider the effect of adjuvant therapy or differences in geography (urban vs. rural) that might affect treatment practices.
In part to account for the effects of treatment, the third study was a detailed analysis by county and concluded that screening had little impact. These authors made the assumption that the annual decrease in mortality observed during the prescreening period would carry into the postscreening period, and any screening effect would result in an incremental decrease in mortality. Although no such incremental decrease in breast cancer mortality was observed after the introduction of screening, their assumption makes their conclusion weak. Comparisons across counties showed similar reductions in breast cancer mortality regardless of when the counties' screening programs were initiated; however, the authors carried out no formal cross-county analyses.
In Nijmegen, the Netherlands, where a population-based screening program was undertaken in 1975, a case-cohort study showed that screened women have decreased mortality (odds ratio [OR] = 0.48). However, a subsequent study comparing Nijmegen breast cancer mortality rates with neighboring Arnhem in the Netherlands, which had no screening program, showed no difference in breast cancer mortality.
A community-based case-control study of screening as practiced in excellent U.S. health care systems between 1983 and 1998 found no association between previous screening and reduced breast cancer mortality. Mammography screening rates, however, were generally low. The association among women at increased risk due to a family history of breast cancer or a previous breast biopsy (OR = 0.74; 95% confidence interval [CI], 0.50–1.03) was stronger than that among women at average risk (OR = 0.96; 95% CI, 0.80–1.14), but the difference was not statistically significant (P = .17).
A well-conducted ecologic study compared three pairs of neighboring European countries, matched on similarity in health care systems and population structure, one of which had started a national screening program some years earlier than the others. The investigators found that each country had experienced a reduction in breast cancer mortality, with no difference between matched pairs that could be attributed to screening. The authors suggested that improvements in breast cancer treatment and/or health care organizations were more likely responsible for the reduction in mortality than was screening.
A systematic review of ecologic and large cohort studies published through March 2011 compared breast cancer mortality in large populations of women aged 50 to 69 years who started breast cancer screening at different times. Seventeen studies met inclusion criteria. All studies had methodological problems, including control group dissimilarities, insufficient adjustment for differences between areas in breast cancer risk and breast cancer treatment, and problems with similar measurement of breast cancer mortality between compared areas. There was great variation in results among the studies, with four studies finding a relative reduction in breast cancer mortality of 33% or more (with wide CIs) and five studies finding no reduction in breast cancer mortality. Because only a part of the overall reduction in breast cancer mortality could possibly be attributed to screening, the review concluded that any relative reduction in breast cancer mortality due to screening would likely be no more than 10%, less than predicted by the RCTs.
A U.S. ecologic analysis conducted between 1976 and 2008 examined the incidence of early-stage versus late-stage breast cancer for women aged 40 years and older. To find a screening effect, the authors compared the magnitude of increase in early-stage cancer with the magnitude of an expected decrease in late-stage cancer. Over the study period, the absolute increase in the incidence of early-stage cancer was 122 cancers per 100,000 women, while the absolute decrease in late-stage cancers was 8 cases per 100,000 women. After adjusting for changes in incidence due to hormone therapy and other undefined causes, the authors concluded that the screening effect on breast cancer mortality reduction (28% during this period) was small, and that overdiagnosis of breast cancer was likely between 22% and 31% of all diagnosed breast cancers. Most of the reduction in breast cancer mortality, the authors concluded, was probably because of improved treatment rather than screening. To make these adjustments, the authors made uncertain assumptions about the effects of other factors on incidence, and made no mention of the effects of changing treatment over time. Ecologic studies are difficult to interpret because of this type of potential uncontrolled confounding, as well as these types of unfair comparisons. However, this study largely agrees with some similar analyses from other countries (see studies discussed above). A major limitation of this and other ecologic studies is the failure to account for actual exposure to screening. Most late-stage breast cancer occurs in women not exposed to screening.
A prospective cohort study of community-based screening programs in the United States found that annual compared with biennial screening mammography did not reduce the proportion of unfavorable breast cancers detected in women aged 50 to 74 years or in women aged 40 to 49 years who did not have extremely dense breasts. Women aged 40 to 49 years with extremely dense breasts did have a reduction in cancers larger than 2.0 cm (OR for biennial vs. annual screening, 2.39; 95% CI, 1.37–4.18).
The optimal screening interval has been addressed by modelers. Modeling makes assumptions that may not be correct; however, the credibility of modeling is greater when the model produces overall results that are consistent with randomized trials overall and when the model is used to interpolate or extrapolate. For example, if a model's output agrees with RCT outcomes for annual screening, then it has greater credibility in comparing the relative effectiveness of biennial versus annual screening.
In 2000, the National Cancer Institute formed a consortium of modeling groups (Cancer Intervention and Surveillance Modeling Network [CISNET]) to address the relative contribution of screening and adjuvant therapy to the observed decline in breast cancer mortality in the United States. (Refer to the Randomized controlled trials section of this summary for more information.) These models gave reductions in breast cancer mortality similar to those expected in the circumstances of the RCTs but updated to the use of modern adjuvant therapy. In 2009, CISNET modelers addressed several questions related to the harms and benefits of mammography, including comparing annual versus biennial screening. The proportion of reduction in breast cancer mortality maintained in moving from annual to biennial screening for women aged 50 to 74 years ranged across the six models from 72% to 95%, with a median of 80%.
Several studies have shown that the method of cancer detection is a powerful predictor of patient outcome, which is useful for prognostication and treatment decisions. All of the studies accounted for stage, nodal status, and tumor size.
A 10-year follow-up study of 1,983 Finnish women with invasive breast cancer demonstrated that the method of cancer detection is an independent prognostic variable. When controlled for age, nodal status, and tumor size, screen-detected cancers had a lower risk of relapse and better overall survival. For women whose cancers were detected outside screening, the hazard ratio (HR) for death was 1.90 (95% confidence interval [CI], 1.15–3.11), even though they were more likely to receive adjuvant systemic therapy.
Similarly, an examination of the breast cancers found in three randomized screening trials (Health Insurance Plan, National Breast Screening Study [NBSS]-1, and NBSS-2) accounted for stage, nodal status, and tumor size and determined that patients whose cancer was found via screening have a more favorable prognosis. The relative risks for death were 1.53 (95% CI, 1.17–2.00) for interval and incident cancers, compared with screen-detected cancers; and 1.36 (95% CI, 1.10–1.68) for cancers in the control group, compared with screen-detected cancers.
A third study compared the outcomes of 5,604 English women with screen-detected cancers to those with symptomatic breast cancers diagnosed between 1998 and 2003. After controlling for tumor size, nodal status, grade, and patient age, researchers found that the women with screen-detected cancers fared better than their symptomatic counterparts. The HR for survival of the symptomatic women was 0.79 (95% CI, 0.63–0.99). Thus, method of cancer detection is a powerful predictor of patient outcome, which is useful for prognostication and treatment decisions. The findings of this study are also consistent with the evidence that some screen-detected cancers are low risk and represent overdiagnosis.
Several characteristics of women being screened that are associated with the accuracy of mammography include age, breast density, whether it is the first or subsequent exam, and time since last mammogram. Younger women have lower sensitivity and higher false-positive rates on screening mammography than do older women (refer to the Breast Cancer Surveillance Consortium performance measures by age for more information).
For women of all ages, high breast density is associated with 10% to 29% lower sensitivity. High breast density is an inherent trait, which can be familial but also may be affected by age, endogenous and exogenous hormones, selective estrogen receptor modulators such as tamoxifen, and diet. Hormone therapy is associated with increased breast density and is associated not only with lower sensitivity but also with an increased rate of interval cancers.
The Million Women Study in the United Kingdom revealed three patient characteristics that were associated with decreased sensitivity and specificity of screening mammograms in women aged 50 to 64 years: use of postmenopausal hormone therapy, prior breast surgery, and body mass index below 25. In addition, a longer interval since the last mammogram increases sensitivity, recall rate, and cancer detection rate and decreases specificity.
Strategies have been proposed to improve mammographic sensitivity by altering diet, timing mammograms with menstrual cycles, interrupting hormone therapy before the examination, or using digital mammography machines. Obese women have more than a 20% increased risk of having false-positive mammography results compared with underweight and normal weight women, although sensitivity is unchanged.
Some cancers are more easily detected by mammography than other cancers are. In particular, mucinous, lobular, and rapidly growing cancers can be missed because their appearance on x-rays is similar to that of normal breast tissue. Medullary carcinomas may be similarly missed. Some cancers, particularly those associated with BRCA1/2 mutations, masquerade as benign tumors.
Radiologist performance is critical to assessing mammographic interpretive performance, yet there is substantial, well-documented variability among radiologists. Factors that influence radiologists’ performance include their level of experience and the volume of mammograms they interpret. There is often a trade-off between sensitivity and specificity, such that higher sensitivity may be associated with lower specificity. Radiologists in academic settings have a higher positive predictive value (PPV) of their recommendations to undergo biopsy than do community radiologists. Fellowship training in breast imaging may lead to improved cancer detection, but it is associated with higher false-positive rates.
After controlling for patient and radiologist characteristics, screening mammography interpretive performance (specificity, PPV, area under the curve [AUC]) varies by facility and is associated with facility-level characteristics. Higher interpretive accuracy of screening mammography was seen at facilities that offered screening examinations alone, included a breast imaging specialist on staff, did single as opposed to double readings, and reviewed interpretive audits two or more times each year.
False-positive rates vary significantly between facilities performing diagnostic mammography and are higher at facilities where concern about malpractice is high. False-positive rates are also higher at facilities serving vulnerable women (women of racial or ethnic minorities and women with lower educational attainment, limited household income, or rural residence) than at facilities serving nonvulnerable women, perhaps because of poorer compliance with recommendations for follow-up examinations. Analyses that do not adjust for important patient characteristics may falsely conclude that there is more facility variation in overall accuracy than actually exists.
International comparisons of screening mammography have found higher specificity in countries with more highly centralized screening systems and national quality assurance programs. For example, one study reported that the recall rate is twice as high in the United States as it is in the United Kingdom, yet there is no difference in the rate of cancers detected. Such comparisons may be confounded by social, cultural, and economic factors.
The likelihood of diagnosing cancer is highest with the prevalent (first) screening examination, ranging from 9 to 26 cancers per 1,000 screens, depending on the woman’s age. The likelihood decreases for follow-up examinations, ranging from 1 to 3 cancers per 1,000 screens. The optimal interval between screening mammograms is unknown. In particular, the breast cancer mortality-focused, randomized, controlled trials used single screening intervals with little variability across the trials. A prospective United Kingdom trial randomly assigned women aged 50 to 62 years to receive mammograms annually or at the standard 3-year interval. Although the grade and node status were similar in both groups, more cancers of slightly smaller size were detected in the annual screening group, with a lead time of approximately 7 months in comparison with triennial screening.
A large observational study found a slightly increased risk of late-stage disease at diagnosis for women in their 40s who were adhering to a 2-year versus a 1-year schedule (28% vs. 21%; odds ratio [OR], 1.35; 95% confidence interval [CI], 1.01–1.81), but no difference was seen for women in their 50s or 60s.
A Finnish study of 14,765 women aged 40 to 49 years assigned women born in even-numbered years to annual screens and women born in odd-numbered years to triennial screens. The study was small in terms of number of deaths, with low power to discriminate breast cancer mortality between the two groups. There were 18 deaths from breast cancer in 100,738 life-years in the triennial screening group and 18 deaths from breast cancer in 88,780 life-years in the annual screening group (hazard ratio, 0.88; 95% CI, 0.59–1.27).
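As a rough arithmetic check on the Finnish figures, the crude breast cancer death rates per 100,000 life-years and their ratio can be computed directly from the reported counts. This is only a sketch: the function and variable names are illustrative, and a crude rate ratio is not the same thing as the hazard ratio estimated by the investigators.

```python
def rate_per_100k(deaths: int, life_years: int) -> float:
    """Crude event rate per 100,000 life-years."""
    return deaths / life_years * 100_000


# Counts as reported in the text above.
triennial_rate = rate_per_100k(18, 100_738)
annual_rate = rate_per_100k(18, 88_780)

# Crude ratio of the triennial rate to the annual rate (unadjusted).
crude_ratio = triennial_rate / annual_rate

print(f"Triennial: {triennial_rate:.1f} deaths per 100,000 life-years")
print(f"Annual:    {annual_rate:.1f} deaths per 100,000 life-years")
print(f"Crude rate ratio (triennial/annual): {crude_ratio:.2f}")
```

With equal death counts in both groups, the crude ratio is driven entirely by the difference in life-years, and it comes out close to the reported point estimate of 0.88.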
Digital mammography is more expensive than screen-film mammography (SFM) but is more amenable to data storage and sharing. Performance of both technologies has been compared directly in several trials yielding similar results.
A large cohort of women (n = 42,760) who underwent both digital and film mammography was evaluated at 33 U.S. centers in the Digital Mammographic Imaging Screening Trial (DMIST). No statistically significant difference in overall breast cancer detection was observed (AUC of 0.78 +/- 0.02 for digital and AUC of 0.74 +/- 0.02 for film; P = .18). Digital mammography was better at cancer detection in women younger than 50 years (AUC of 0.84 +/- 0.03 for digital; AUC of 0.69 +/- 0.05 for film; P = .002). A second DMIST report found that film mammography had a higher AUC in women aged 65 years and older (AUC 0.88 for film; AUC 0.70 for digital; P = .025); however, this finding was not statistically significant when the multiple comparisons were considered.
In a large U.S. cohort study, sensitivity for women younger than 50 years was 75.7% (95% CI, 71.7–79.3) for film mammography and 82.4% (95% CI, 76.3–87.5) for digital mammography; specificity was 89.7% (95% CI, 89.6–89.8) for film mammography and 88.0% (95% CI, 87.8–88.2) for digital mammography.
A comparison of the findings from 1.5 million digital mammography screens and 4.5 million SFM screens that occurred in the Netherlands from 2004 to 2010 indicated higher recall and detection rates for the digital mammography screens. When SFM exams were restricted to those read by radiologists who read both digital and SFM exams (n = 1.5 million), the recall rates were 2.0% for digital mammography (95% CI, 2.0–2.1) versus 1.6% for SFM (95% CI, 1.6–1.6); the detection rates were 5.9 per 1,000 (95% CI, 5.7–6.0) for digital mammography and 5.1 per 1,000 (95% CI, 5.0–5.2) for SFM. The PPV was statistically significantly lower in the digital mammography group (PPV, 31.2%; 95% CI, 30.6–31.7) than in the screen-film group (PPV, 34.4%; 95% CI, 33.8–35.0). Recall rates were higher for digital screens, occurring at 27% in women aged 49 to 54 years and at 1.7% in women aged 55 to 74 years, but detection rates (5.1 per 1,000 for digital vs. 6.2 per 1,000 for film) and PPV (21.4% for digital vs. 35.7% for film) were lower. Findings for SFM exams (n = 4.5 million) read by radiologists who read only SFM exams were similar to the findings for exams read by radiologists who read both types of mammography exams.
A meta-analysis of 10 studies, including the DMIST and the aforementioned U.S. cohort study, compared digital mammography with film mammography in 82,573 women who underwent both types of the exam. In a random-effects model, there was no statistically significant difference in cancer detection between the two types of mammography (AUC of 0.92 for film and AUC of 0.91 for digital). For women younger than 50 years, all studies found that sensitivity was higher for digital mammography but that specificity was either the same or higher for film mammography. The meta-analysis found no other differences based on age.
Computed radiography (CR) utilizes a cassette-based removable detector and external reading device to generate a digital image. A large concurrent cohort study compared 254,758 full-field digital mammography (FFDM) screens with 487,334 SFM screens and 74,190 CR screens. Again, the cancer detection rate was not different between FFDM (4.9 per 1,000) and SFM (4.8 per 1,000), although the recall rate was higher for FFDM. Importantly, cancer detection was lower for CR at 3.4 per 1,000, adjusted OR 0.79 (95% CI, 0.68–0.93). Two prior studies of noncontemporaneous cohorts showed no difference between CR and SFM or higher cancer detection rate from CR.
Computer-aided detection (CAD) systems are designed to help radiologists read mammograms by highlighting suspicious regions such as clustered microcalcifications and masses. Generally, CAD systems increase sensitivity, decrease specificity, and increase detection of ductal carcinoma in situ (DCIS). Several CAD systems are in use. One large population-based study comparing recall rates and breast cancer detection rates before and after the introduction of CAD systems found no change in either rate. Another large study noted an increase in recall rate and increased DCIS detection but no improvement in invasive cancer detection rate.
Mammography screening may be effective in reducing breast cancer mortality in certain populations, but it can pose harm to women who participate. The limitations are best described as false-positives (related to the specificity of the test), overdiagnosis (true-positives that will not become clinically significant), false-negatives (related to the sensitivity of the test), discomfort associated with the test, radiation risk and anxiety.
The specificity of mammography (refer to the Breast Cancer Screening Concepts section of this summary for more information) affects the number of additional interventions due to false-positive results. Even though breast cancer is the most common noncutaneous cancer in women, fewer than 5 per 1,000 women actually have the disease when they are screened. Therefore, even with a specificity of 90%, most abnormal mammograms are false-positives. Women with abnormal screening mammograms undergo additional mammographic imaging to magnify the area of concern, ultrasound, magnetic resonance imaging, and tissue sampling (by fine-needle aspiration, core biopsy, or excisional biopsy).
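The claim that most abnormal mammograms are false-positives follows from Bayes' rule, and a minimal sketch makes the arithmetic explicit. The prevalence (5 per 1,000, the upper bound given above) and the specificity (90%) come from the text; the sensitivity of 85% is an assumed, purely illustrative value, and the function name is hypothetical.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Probability of disease given a positive test (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


prevalence = 5 / 1000      # from the text: fewer than 5 per 1,000 screened women
specificity = 0.90         # from the text
assumed_sensitivity = 0.85 # ASSUMPTION: illustrative value, not from the text

ppv = positive_predictive_value(prevalence, assumed_sensitivity, specificity)
# Upper bound on PPV at this prevalence and specificity: a test that misses nothing.
ppv_perfect_sensitivity = positive_predictive_value(prevalence, 1.0, specificity)

print(f"PPV with assumed sensitivity: {ppv:.1%}")
print(f"PPV with 100% sensitivity:    {ppv_perfect_sensitivity:.1%}")
```

Under these inputs, even a test with perfect sensitivity yields a PPV below 5%, because false-positives among the large disease-free majority outnumber true-positives.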
A study of breast cancer screening in 2,400 women enrolled in a health maintenance organization found that over a 10-year period, 88 cancers were diagnosed, 58 of which were identified by mammography. During that period, one-third of the women had an abnormal mammogram result that required additional testing, including 539 additional mammograms, 186 ultrasound examinations, and 188 biopsies. The cumulative biopsy rate (the rate of true-positives) due to mammographic findings was approximately 1 in 4 (23.6%). The positive predictive value (PPV) of an abnormal screening mammogram in this population was 6.3% for women aged 40 to 49 years, 6.6% for women aged 50 to 59 years, and 7.8% for women aged 60 to 69 years. A subsequent analysis and modeling of data from the same cohort of women, all of whom were continuously enrolled in the Harvard Pilgrim Health Care plan from July 1983 through June 1995, estimated that the risk of having at least one false-positive mammogram was 7.4% (95% confidence interval [CI], 6.4%–8.5%) at the first mammogram, 26.0% (95% CI, 24.0%–28.2%) by the fifth mammogram, and 43.1% (95% CI, 36.6%–53.6%) by the ninth mammogram. Cumulative risk of at least one false-positive by the ninth mammogram varied from 5% to 100%, depending on four patient variables (younger age, higher number of previous breast biopsies, family history of breast cancer, and current estrogen use) and three radiologic variables (longer time between screenings, failure to compare the current and previous mammograms, and the individual radiologist’s tendency to interpret mammograms as abnormal). Overall, the biggest risk factor for having a false-positive mammogram was the individual radiologist’s tendency to read mammograms as abnormal.
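To see how a per-screen false-positive risk accumulates over repeated screens, a naive model can be sketched in which every screen carries the same, independent risk. This is not the method used in the cohort analysis described above; it is a simplified comparison under a stated assumption, using the reported first-mammogram risk of 7.4% as the constant per-screen value. The function name is hypothetical.

```python
def cumulative_risk(per_screen_risk: float, n_screens: int) -> float:
    """Risk of at least one false-positive in n screens,
    ASSUMING a constant and independent risk at every screen."""
    return 1.0 - (1.0 - per_screen_risk) ** n_screens


first_screen_risk = 0.074  # from the text: risk at the first mammogram

naive_by_fifth = cumulative_risk(first_screen_risk, 5)
naive_by_ninth = cumulative_risk(first_screen_risk, 9)

# Study estimates quoted in the text, for comparison only.
reported = {5: 0.260, 9: 0.431}

for n, naive in ((5, naive_by_fifth), (9, naive_by_ninth)):
    print(f"By screen {n}: naive model {naive:.1%}, reported {reported[n]:.1%}")
```

The naive figures (roughly 32% and 50%) exceed the reported estimates, which suggests that per-screen risk is not constant or independent across screening rounds; the text itself notes that risk depends on patient and radiologic variables.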
A prospective cohort study of community-based screening found that a greater proportion of women undergoing annual screening had at least one false-positive screen after 10 years than did women undergoing biennial screening, regardless of breast density. For women with scattered fibroglandular densities, the difference was 68.9% (annual) versus 46.3% (biennial) for women in their 40s. For women aged 50 to 74 years, the difference for this density group was 49.8% (annual) versus 30.7% (biennial).
By reviewing Medicare claims following mammographic screening in 23,172 women older than 65 years, one study found that, per 1,000 women, 85 had follow-up testing, 23 had biopsies, and 7 had cancer. Thus, the PPV for an abnormal mammogram was 8%. For women older than 70 years, the PPV was 14%.
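The 8% figure can be reproduced from the per-1,000 counts given above. The sketch below assumes that all 7 cancers were found among the 85 women who had follow-up testing and among the 23 who had biopsies; the text does not state this explicitly, and the variable names are illustrative.

```python
# Counts per 1,000 screened women, as reported in the text.
follow_up_tests = 85
biopsies = 23
cancers = 7

# PPV of an abnormal mammogram: cancers among women sent for follow-up.
# ASSUMPTION: every cancer was diagnosed within the follow-up group.
ppv_abnormal = cancers / follow_up_tests

# Share of biopsies that found cancer, under the same assumption.
biopsy_yield = cancers / biopsies

print(f"PPV of an abnormal mammogram: {ppv_abnormal:.0%}")
print(f"Biopsies that found cancer:   {biopsy_yield:.0%}")
```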
An audit of mammograms performed in 1998 at a single institution revealed that 14.7% of examinations resulted in a recommendation for additional testing (Breast Imaging Reporting and Data System category 0), 1.8% resulted in a recommendation for biopsy (categories 4 and 5), and 5.7% resulted in a recommendation for short-term interval mammography (category 3). Cancer was diagnosed in 0.5% of the cases referred for additional testing.
Overdiagnosed disease is a neoplasm that would never become clinically apparent without screening before a patient’s death. The prevalence of cancer in women who died of noncancer causes is surprisingly high. In an overview of seven autopsy studies, the median prevalence of occult invasive breast cancer was 1.3% (range, 0%–1.8%) and of ductal carcinoma in situ was 8.9% (range, 0%–14.7%). A “perfect” screening test would identify approximately 10% of “normal” women as having breast cancer, even though most of those cancers would probably not result in illness or death. Treatment of these cancers would constitute overtreatment.
Currently, cancers that will cause illness and/or death cannot be confidently distinguished from those that will remain occult, so all cancers are treated.
To determine the number of screen-detected cancers that are overdiagnosed, one can compare breast cancer incidence over time in a screened population with that of an unscreened population.
Population-based studies could demonstrate the extent of overdiagnosis if the screened and nonscreened populations were the same except for screening. Unfortunately, the populations may differ in time, geography, culture, and the use of postmenopausal hormone therapy. Investigators also differ in their calculation of overdiagnosis as they adjust for characteristics such as lead-time bias. As a consequence, the magnitude of overdiagnosis due to mammographic screening is controversial, with estimates ranging from 0% to 54%.
Several observational population-based comparis