Overdiagnosis occurs in a population when conditions are diagnosed correctly but the diagnosis produces an unfavourable balance between benefits and harms. In cancer screening, overdiagnosed cancers are those that did not need to be found because they would not have produced symptoms or led to premature death. These overdiagnosed cancers can be distinguished from false positives, which occur when an initial screening test suggests that a person is at high risk but follow-up testing shows them to be at normal risk. The cancers most likely to be overdiagnosed through screening are those of the prostate, thyroid, breast and lung. Overdiagnosis in cancer screening arises largely from the paradoxical problem that screening is most likely to find the slow-growing or dormant cancers that are least likely to harm us, and less likely to find the aggressive, fast-growing cancers that cause cancer mortality. This central paradox has become clearer over recent decades. The more overdiagnosis is produced by a screening program, the less likely the program is to serve its ultimate goal of reducing illness and premature death from cancer. Thus, it is vital that health professionals and researchers continue an open, scientific inquiry into the extent and consequences of overdiagnosis, and devise appropriate responses to it.
A cancer screening program tests a large population of healthy people in a defined age group, ideally using a simple, affordable test. This test, much like a sieve, ‘catches’ those people who are apparently at higher cancer risk and directs them to further testing. If diagnostic testing confirms the presence of cancer precursors or cancer, preventive or curative treatment is provided. This should mean easier, more effective treatment and fewer people progressing to late-stage, advanced cancer. The ultimate goal is to reduce the suffering and death caused by that cancer in that population without introducing any significant additional harms. These are good goals, and some screening programs deliver on them (e.g. cervical screening programs1).
This article explains the problem of overdiagnosis in cancer screening. Because overdiagnosis produces harm, a discussion of overdiagnosis may be perceived to be a general condemnation of cancer screening. We do not intend this, and point to other papers in this issue that focus on some of the benefits screening may offer.
Overdiagnosis matters because it can break the link between screening programs and their ultimate goal. The more overdiagnosis a screening program produces, the more likely screening is to increase rather than decrease suffering, and even death, in populations. Most cancer authorities now acknowledge that overdiagnosis is a consideration in screening, including in their communication to consumers (e.g. Cancer Research UK2, US Preventive Services Task Force3). However, significant expert disagreement exists over the extent of overdiagnosis1, and screening programs with different characteristics are likely to produce overdiagnosis to different degrees.4 In this article, we consider what overdiagnosis is, why it occurs, what forms of screening it is most relevant to, and how we should respond to it.
A common misunderstanding is to conflate overdiagnosis and false positives in screening. A false positive occurs when the ‘sieve’ of screening ‘catches’ a person who is at normal risk of cancer and incorrectly suggests that they may be at high risk. After a sometimes anxious wait for further testing5, the person is shown to be at normal risk. The extent of false positives (and false negatives) in a screening program is partly determined by test characteristics (a good test produces fewer false results) and partly by agreed standards within the program (e.g. the agreed cut-off value for a biomarker).
A false positive occurs when a person is incorrectly told that they may have cancer. Cancer overdiagnosis, in contrast, occurs when cancers are correctly diagnosed but those cancers would not have produced symptoms or been identified clinically.6,7 It is very difficult to determine whether a particular individual has been overdiagnosed8, particularly in cancer screening, because once a person has been diagnosed with cancer and treated, no-one can know what would have happened without that treatment.
Cancer overdiagnosis can therefore only be measured or statistically estimated in populations.8 Typically, a screening program that produces overdiagnosis in a population will greatly increase the incidence of early-stage cancer or pre-cancer, without reducing the incidence of late-stage cancer or mortality from that cancer. This is illustrated in Figure 1, using the example of thyroid cancer.7,9,10
Figure 1. Age-standardised thyroid cancer incidence and mortality rates per 100 000 males/females in Australia, 1968 to 2013
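The population-level signature described above (a sharp rise in incidence with little change in mortality) can be sketched as a simple check. This is a minimal illustration only: the function names, the thresholds and the example rates are assumptions introduced here, not values from Figure 1 or from any published method, and the sketch looks at mortality alone rather than late-stage incidence.

```python
def relative_change(start_rate: float, end_rate: float) -> float:
    """Fractional change between two rates (3.0 means the rate quadrupled)."""
    if start_rate <= 0:
        raise ValueError("start_rate must be positive")
    return (end_rate - start_rate) / start_rate


def matches_overdiagnosis_pattern(
    incidence: tuple[float, float],
    mortality: tuple[float, float],
    min_incidence_rise: float = 0.5,   # assumed threshold, not from the article
    max_mortality_shift: float = 0.1,  # assumed tolerance, not from the article
) -> bool:
    """True when incidence rises sharply while mortality stays roughly flat.

    Each argument is a (start, end) pair of age-standardised rates per 100,000.
    """
    incidence_change = relative_change(*incidence)
    mortality_change = relative_change(*mortality)
    return (
        incidence_change >= min_incidence_rise
        and abs(mortality_change) <= max_mortality_shift
    )


# Made-up rates for illustration only; these are not the values in Figure 1.
print(matches_overdiagnosis_pattern(incidence=(2.0, 8.0), mortality=(0.5, 0.5)))
print(matches_overdiagnosis_pattern(incidence=(2.0, 8.0), mortality=(0.5, 0.3)))
```

The first call describes incidence quadrupling with unchanged mortality, which fits the pattern; the second describes the same rise in incidence accompanied by a clear fall in mortality, which does not. A check like this can only flag a pattern worth investigating; it cannot by itself estimate how much overdiagnosis has occurred.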
An important dual paradox drives the overdiagnosis problem, one that has become more evident over time. First, cancer is complex and heterogeneous.12,13 Second, screening programs are more likely to detect slow-growing, less aggressive cancers and less likely to detect fast-growing, more aggressive cancers. This is simply a function of time. Screening occurs at regular intervals (e.g. 2, 3 or 5 years) calculated to provide the best cost:benefit ratio and least harm. A slow-growing cancer is asymptomatically present in the body for much longer, so is more likely to be present at a screening point. In contrast, a fast-growing, more aggressive cancer produces symptoms sooner, so it is likely to prompt the person to see a doctor and be clinically diagnosed in the gap between scheduled screenings.14 These ‘interval cancers’ are not a sign that the screening program has failed; they simply demonstrate that a small proportion of cancer is extremely aggressive (more so than any reasonable screening schedule could catch). This relationship is shown in Figure 2.
Figure 2. Cancer heterogeneity and multiple screenings drive overdiagnosis
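The timing argument can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that cancers begin at times spread evenly relative to the screening schedule, that the test never misses a cancer that is present, and that everyone attends every screen. Under those assumptions, a cancer that stays asymptomatic for d years is present at a screen held every T years with probability min(1, d/T). The function names and the example figures are hypothetical.

```python
def chance_present_at_screen(preclinical_years: float, interval_years: float) -> float:
    """Chance that at least one scheduled screen falls inside the asymptomatic window."""
    if preclinical_years < 0 or interval_years <= 0:
        raise ValueError("durations must be non-negative and the interval positive")
    return min(1.0, preclinical_years / interval_years)


def screen_detected_shares(
    cancers: dict[str, tuple[int, float]], interval_years: float
) -> dict[str, float]:
    """Share of screen-detected cancers contributed by each cancer type.

    cancers maps a label to (number of cancers arising, asymptomatic years).
    """
    detected = {
        label: count * chance_present_at_screen(years, interval_years)
        for label, (count, years) in cancers.items()
    }
    total = sum(detected.values())
    if total == 0:
        raise ValueError("no cancers would be screen-detected")
    return {label: n / total for label, n in detected.items()}


# Hypothetical mix: equal numbers of slow and fast cancers, screened every 2 years.
mix = {"slow": (100, 4.0), "fast": (100, 0.5)}
print(screen_detected_shares(mix, interval_years=2.0))
```

With these made-up inputs, slow and fast cancers arise in equal numbers, yet slow cancers make up about four in five of those found by screening. The remaining fast cancers surface as interval cancers between screens.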
Slow-growing, less aggressive cancers are an important source of overdiagnosis in screening programs, as they may not progress, or may even regress15, and so do not need to be found. To quote Otis Brawley, the Chief Medical Officer of the American Cancer Society, they are “tumours that appear cancerous under the microscope but are behaviourally of no clinical threat”.13 These cancers bias the perceived outcome of screening in a potentially confusing way.16 It is common to see claims such as the following: “95% of women whose breast cancer is detected by screening are still alive after 5 years, versus only 25% of women whose cancer is diagnosed clinically”. This may be so, but it is at least partly a function of cancer heterogeneity, rather than a benefit of screening. Women with slow-growing (screen-detected) cancers are likely to survive for 5 years with or without screening. In contrast, women with aggressive (clinical, symptomatic) cancers are, sadly, less likely to survive 5 years because the cancer is aggressive (and thus less likely to be detected by screening). Five-year survival will therefore always be higher in screen-detected cohorts than in clinically detected cohorts. This does not, however, prove that screening reduces suffering or premature cancer death.
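A short calculation shows how this survival gap can appear even when screening changes no one's outcome. The sketch assumes that five-year survival is fixed by tumour type alone (95% for slow cancers and 25% for fast ones, echoing the figures in the claim quoted above), and that the same 1000 cancers would eventually be diagnosed with or without screening. The cohort sizes and names are hypothetical.

```python
# Assumed five-year survival, determined by tumour type and not by how it was found.
FIVE_YEAR_SURVIVAL = {"slow": 0.95, "fast": 0.25}


def five_year_survival(cohort: dict[str, int]) -> float:
    """Proportion of a cohort alive five years after diagnosis."""
    total = sum(cohort.values())
    if total == 0:
        raise ValueError("cohort is empty")
    survivors = sum(n * FIVE_YEAR_SURVIVAL[kind] for kind, n in cohort.items())
    return survivors / total


def deaths_within_five_years(cohort: dict[str, int]) -> float:
    """Expected number of deaths in a cohort within five years of diagnosis."""
    return sum(n * (1 - FIVE_YEAR_SURVIVAL[kind]) for kind, n in cohort.items())


# Hypothetical split of the same 1000 cancers by how they came to attention.
screen_detected = {"slow": 720, "fast": 40}
clinically_detected = {"slow": 80, "fast": 160}
without_screening = {"slow": 800, "fast": 200}

print(round(five_year_survival(screen_detected), 3))      # 0.913
print(round(five_year_survival(clinically_detected), 3))  # 0.483
print(round(deaths_within_five_years(screen_detected)
            + deaths_within_five_years(clinically_detected)))  # 190
print(round(deaths_within_five_years(without_screening)))      # 190
```

Survival is far higher in the screen-detected cohort, yet the expected number of deaths is the same with and without screening, because the cohorts differ only in which tumour types they contain.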
A valid measure of the benefit of screening is the difference in absolute mortality rates between screened and unscreened groups over a particular time. Differences in absolute mortality tend to suggest a more delicate balance of benefits and harms than many might expect.17 Nonetheless, the balance of benefits and harms in any type of screening is often contested1, and will vary depending on the disease, program design, protocol design, technology and quality assurance.4
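The difference between absolute and relative measures can be illustrated with a few lines of arithmetic. The function names and the example counts below are hypothetical, chosen only to show how the two measures diverge; they are not drawn from any trial.

```python
def absolute_mortality_difference(
    deaths_unscreened: int,
    n_unscreened: int,
    deaths_screened: int,
    n_screened: int,
    per: int = 1000,
) -> float:
    """Deaths avoided per `per` people screened, over the follow-up period."""
    if n_unscreened <= 0 or n_screened <= 0:
        raise ValueError("group sizes must be positive")
    return (deaths_unscreened / n_unscreened - deaths_screened / n_screened) * per


def relative_mortality_reduction(
    deaths_unscreened: int, n_unscreened: int, deaths_screened: int, n_screened: int
) -> float:
    """Proportional fall in the mortality rate in the screened group."""
    if n_unscreened <= 0 or n_screened <= 0 or deaths_unscreened <= 0:
        raise ValueError("group sizes and unscreened deaths must be positive")
    return 1 - (deaths_screened / n_screened) / (deaths_unscreened / n_unscreened)


# Hypothetical follow-up: 5 deaths per 1000 unscreened, 4 deaths per 1000 screened.
print(round(absolute_mortality_difference(5, 1000, 4, 1000), 2))  # 1.0
print(round(relative_mortality_reduction(5, 1000, 4, 1000), 2))   # 0.2
```

In this made-up example, a 20% relative reduction corresponds to one death avoided per 1000 people screened. It is this absolute figure that has to be weighed against harms such as overdiagnosis.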
Some cancers are more likely to be overdiagnosed than others.7,13 In particular, screening is more likely to overdiagnose cancer if it detects early-stage cancer and triggers cancer treatments (as opposed to screening that detects – and prompts removal of – changes that can later become cancer, e.g. cervical and colorectal screening). To quote Esserman et al.:18
“Physicians, patients, and the general public must recognize that overdiagnosis is common and occurs more frequently with cancer screening. Overdiagnosis, or identification of indolent cancer, is common in breast, lung, prostate, and thyroid cancer. Whenever screening is used, the fraction of tumours in this category increases. By acknowledging this consequence of screening, approaches that mitigate the problem can be tested.”
Consistent with this, those cancers most likely to be overdiagnosed are prostate cancer, from prostate-specific antigen testing of asymptomatic men in primary care; breast cancer, from organised mammography screening; thyroid cancer, from ultrasound of the thyroid and/or adjoining structures; and lung cancer, from screening smokers using computed tomography.7,10,13,19–21
It is the nature of science to change: a scientific attitude entails scepticism and openness to questioning. Overdiagnosis in cancer screening has become topical because, within the science of cancer, it has become increasingly clear that some cancers do not need to be found, and that the benefit/harm balance of screening is less favourable than originally hoped. Unfortunately, at present, overdiagnosis debates sometimes fail to take a scientific approach, instead becoming polarised into perceived ‘pro-screening’ or ‘anti-screening’ camps, where defensiveness collides with zealousness.13
In future, we believe it is critical for healthcare practitioners and researchers to hold open, rigorous scientific conversations about overdiagnosis, staying focused on the central fact that overdiagnosis is correct diagnosis. The people who may cause overdiagnosis are rarely maleficent or incompetent: they are, in large part, professionals doing their job to a high standard and in the way their profession has agreed it should be done, with a desire to prevent suffering and premature death from cancer. The problem is that, within the healthcare system, standards for practice have been set at a point that may produce more harm than good. This harms individuals while appearing to help them.13 In addition, any amount of overdiagnosis will increase the costs and decrease the cost-effectiveness of screening programs, thus leading to opportunity costs.22–24 System standards need to be constantly and collaboratively monitored to ensure that screening produces more good than harm.24,25 More research is needed to test whether treating the lowest-risk screen-detected cancers differently will reduce harm (e.g. Francis et al.25, Lane et al.26).
Citizens expect their healthcare systems to help, not harm. Health professionals make their career choices because they are motivated to relieve suffering, not cause it. If we keep hold of these values, it should be possible for us to continue to have a constructive and productive conversation about overdiagnosis and how to minimise it in cancer screening programs.
SC and AB received funding to support this work from the National Health and Medical Research Council under grant number 1104136.
© 2017 Carter and Barratt. This article is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Licence, which allows others to redistribute, adapt and share this work non-commercially provided they attribute the work and any adapted version of it is distributed under the same Creative Commons licence terms.