Although randomised controlled trials are the preferred basis for policy decisions on cancer screening, it remains difficult to assess all downstream effects of screening, particularly when screening options other than those in the specific trial design are being considered. Simulation models of the natural history of disease can play a role in quantifying harms and benefits of cancer screening scenarios. Recently, the US Preventive Services Task Force issued a C-recommendation on screening for prostate cancer for men aged 55–69 years, implying at least moderate certainty that the benefit is small. However, modelling based on data from the European Randomized study of Screening for Prostate Cancer, which included quality-of-life estimates, showed that the ratio between benefits and harms is better, and likely to be reasonable, for men screened between the ages of 55 and 63 years (i.e. by using an earlier stopping age than applied in the trial setting). This commentary article considers the importance of simulation modelling in the decision-making process for (prostate) cancer screening. By replicating the trial with a simulation model, the paper also explores whether the recently published Cluster Randomized Trial of PSA Testing for Prostate Cancer, a UK trial of a single prostate specific antigen (PSA) testing intervention, changes the evidence for regular PSA testing for men aged 55–63 years.
Although randomised controlled trials (RCTs) are preferred as the basis for decisions regarding efficacy of cancer screening, it is almost impossible to directly assess long-term effects of screening such as overdiagnosis, overtreatment or life-years gained. This would, for instance, require a long or even lifelong follow-up of individuals in both the screening and control arms of such trials. Furthermore, finding the optimal screening strategy for a population would require formal comparisons of different screening strategies, which is impossible to do in a single RCT. This complexity of decision making has led to the need for (simulation) modelling of the natural history of disease. Modelling allows the impact of various screening strategies, as well as the long-term effects of cancer screening, to be assessed, provided that the model is well calibrated and validated.
There are numerous examples of such quantifications being a valuable source for policy decision making. For example, the US Preventive Services Task Force (USPSTF)1-3 used results from modelling studies by Cancer Intervention and Surveillance Modeling Network (CISNET) groups on lung, breast and colorectal cancer screening4-6 to assess the optimum age at which to begin and end screening, the optimal screening interval, and the relative benefits and harms of different screening strategies. Similarly, the Dutch government has implemented a national program for colorectal cancer screening, for which the target age range, the type of test and the cut-off for referral were chosen based on modelling results from several pilot projects and predicted capacity needs for colonoscopy.7 The BreastScreen Australia Evaluation Advisory Committee (EAC), in its final report8, used evidence from modelling studies9,10 on the effectiveness of breast cancer screening by age. A more recent Australian example is the modelling used to assess the possible benefits and cost-effectiveness of the renewed national cervical cancer screening program.11
The risks and benefits of prostate specific antigen (PSA) testing for prostate cancer at a population level have been reviewed for decades, yet no country in the world has found sufficient evidence to fund an organised screening program. Reviews, including in Australia, have deemed that the harms outweigh the benefits at a population level, due primarily to the low specificity of the PSA test and the risks of unnecessary invasive treatments with significant side-effects.12 Prostate cancer is nonetheless a good example of how (simulation) modelling can help to answer important questions about improved targeting of early detection interventions, such as at what age a man might be encouraged to have his first PSA test and especially at what age a man who had already agreed to be tested might stop.
Existing guidelines on PSA screening are contradictory.13,14 For example, the USPSTF issued a C-recommendation on screening for prostate cancer for men aged 55–69 years, advising clinicians to inform men about the potential benefits (cancer deaths prevented, life-years gained and reduction of risk of advanced disease) and harms (overtreatment and living longer with the knowledge of a cancer diagnosis) of PSA screening.13 According to the USPSTF, a C-recommendation means there is at least moderate certainty that the net benefit is small, and, therefore, that selectively offering the test to individual patients based on professional judgement and patient preferences might be appropriate. Based on the 13-year follow-up of the European Randomized study of Screening for Prostate Cancer (ERSPC) trial, the USPSTF concluded that screening may prevent one to two prostate cancer deaths (over 13 years) per 1000 men screened, and that 20–50% of men whose cancer is detected by screening may be overdiagnosed. The risk of overdiagnosis was calculated by comparing the number of cancers diagnosed in the screening group with the number diagnosed in the control group over the years of follow-up. However, estimating overdiagnosis over the trial period alone is often not enough: given the natural history of prostate cancer, longer follow-up is needed or has to be simulated.
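The excess-incidence comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and all input numbers are hypothetical and are not taken from ERSPC or from the USPSTF review. It also shows why the estimate depends on follow-up, because any cases that are merely diagnosed earlier in the screening arm are counted as excess until the control arm catches up.

```python
def excess_incidence_overdiagnosis(cases_screen, n_screen,
                                   cases_control, n_control,
                                   screen_detected):
    """Estimate overdiagnosis as excess cases in the screening arm.

    Returns a tuple: (excess cases in the screening arm,
    excess cases as a share of screen-detected cases).
    """
    # Cases expected in the screening arm if it had the control-arm incidence
    expected = cases_control * n_screen / n_control
    excess = cases_screen - expected
    return excess, excess / screen_detected


# Hypothetical trial: 1000 men per arm, 100 vs 70 cancers diagnosed,
# of which 75 in the screening arm were detected by screening.
excess, share = excess_incidence_overdiagnosis(100, 1000, 70, 1000, 75)
print(excess, share)  # 30.0 0.4
```

With short follow-up, part of this excess reflects lead time rather than true overdiagnosis, which is one reason why lifetime simulation is used alongside the trial data.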
Pashayan et al.15 concluded that the benefit of prostate cancer screening in reducing advanced-stage disease is counterbalanced by overdiagnosis, which is especially frequent at older ages (65–69 years). Pinsky et al.16 concluded that the burden from diagnosis of indolent disease (i.e. tumours that are unlikely to become symptomatic during a man’s lifetime) should be reduced by not diagnosing indolent disease at all and by not aggressively treating indolent disease once diagnosed. One possible way to achieve this is to stop screening before the age of 69. A model developed in the Australian context did not clearly indicate a favourable harm–benefit ratio for prostate cancer screening.17 A comprehensive Australian evaluation of the evidence also found no case for a PSA-based population screening program.18 There may, however, be a role for modelling to help inform targeted approaches, beyond the guidance available through conventional evidence review.
A study using a microsimulation analysis (MISCAN) model, which was calibrated on ERSPC data and included quality-of-life estimates, showed that the ratio between benefits and harms is better for men screened at 55–63 years of age than for the broader age band (55–69/74) screened in the trial.19 The estimated effects of screening men in different age groups are shown in Table 1. The model simulates the whole lifetime, so the numbers of prostate cancer deaths averted (5–10 per 1000 men) are larger than the prostate cancer mortality reduction found in the ERSPC trial at 13 years of follow-up. Screening in the 55–63 years age group averts fewer prostate cancer deaths (7 per 1000 men, compared with 10 for the 55–69 years age group). However, the percentage loss in quality-adjusted life years (QALYs), defined as the difference between life years gained and QALYs gained divided by the life years gained, is smaller in the 55–63 years age group than in the 55–69 years age group; the number of overdiagnoses is also much lower (23 per 1000 men, compared with 49). Although the ratio between harms and benefits (overdiagnosis per prostate cancer death averted) is better for the initial core age group (55–69 years) than for the 64–69 years age group (who have the highest PSA test uptake in daily clinical practice), it is inferior to that for the 55–63 years age group (5.4 vs 3.2, respectively). This ratio of 3.2 between harms and benefits is close to the ratio of 3 found by the UK independent breast screening panel,20 which concluded that such a ratio is acceptable for breast cancer screening.
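The two summary measures used in this comparison can be written out explicitly. The sketch below is illustrative only: the function names are ours, and the inputs are hypothetical round numbers rather than values from Table 1, since published ratios are typically computed from unrounded model output.

```python
def overdiagnosis_per_death_averted(overdiagnosed, deaths_averted):
    """Harm-benefit ratio: overdiagnosed cases per prostate cancer death averted."""
    return overdiagnosed / deaths_averted


def percent_qaly_loss(life_years_gained, qalys_gained):
    """Percentage of the life years gained that is lost after quality adjustment."""
    return 100.0 * (life_years_gained - qalys_gained) / life_years_gained


# Hypothetical strategy, per 1000 men: 30 overdiagnosed cases,
# 10 deaths averted, 80 life years gained, 60 QALYs gained.
print(overdiagnosis_per_death_averted(30, 10))  # 3.0
print(percent_qaly_loss(80, 60))                # 25.0
```

A lower value on either measure indicates a more favourable balance between harms and benefits for the strategy being evaluated.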
Screening in the 55–63 years age group was found to have the best benefit and harm balance in this analysis. In such circumstances there may be a case for the USPSTF to consider a B-recommendation for PSA testing for the 55–63 or 55–59 years age groups, as this modelling indicates there is moderate certainty that the net benefit is moderate to substantial. Further work, including research that improves understanding of the complexities of overdiagnosis in these specific age groups, would add to the quality of information necessary to confidently recommend such a change.
Table 1. Estimated effects of screening men at 2-year intervals compared with no screening
| Screening age group (years) | PC deaths averted^a | Overdiagnosed cases^a | Overdiagnosed cases per PC death averted | Life years gained^a | Life years gained per PC death averted | QALYs gained^a | QALYs gained per PC death averted | % loss in QALYs^b |
The Cluster Randomized Trial of PSA Testing for Prostate Cancer (CAP), conducted in the UK with 408 825 men, is now the largest RCT on PSA screening.21 However, men in the CAP trial were offered only one PSA test, and about 36% of them accepted that offer. In practice, therefore, fewer PSA tests were performed in the CAP trial than in the ERSPC trial (82 299 and 140 040, respectively). The result from the CAP trial must therefore be interpreted bearing in mind the low acceptance rate (36%) and the fact that only a single test was offered, to men aged 50–69 years.
We used a well-validated natural history model (MISCAN) to replicate the CAP trial as closely as we could, based on UK life tables, the men screened by age, and ERSPC incidence, treatment and survival rates, and assuming 80% biopsy compliance and a limited contamination rate of 2% per year. Analogous to experiences in the PLCO (Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial)22, we assumed no difference in the natural history of prostate cancer, the performance of PSA testing and the benefit per screen in the UK compared with other countries in Europe or the US. Figure 1 shows our expected prostate cancer mortality curves for the screen and control arms of CAP. The small expected difference between the arms (given the one test at low compliance) is striking. We estimated a prostate cancer mortality rate ratio of 0.94 after 10 years of follow-up, close to the observed point estimate of 0.96 and well within its 95% confidence interval (0.85, 1.08). Extending the prostate cancer mortality prediction to 15 and 20 years of follow-up barely altered our estimate (mortality rate ratios of 0.94 and 0.95, respectively). We therefore conclude that, although the CAP trial of a single PSA testing intervention did not show a statistically significant difference in prostate cancer mortality after 10 years of follow-up, microsimulation modelling indicates that there may still be a mortality benefit. The small effect observed in CAP (a statistically nonsignificant 4% reduction in prostate cancer mortality) cannot be interpreted as inconsistent with the 27% benefit per screen estimated from ERSPC and confirmed in PLCO. This implies that, even when a trial shows no statistically significant mortality benefit, well-validated modelling can strengthen the evidence on targeted interventions for improved early detection.
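The comparison between a model-predicted rate ratio and an observed confidence interval can be illustrated as follows. This is a simplified sketch, not the analysis used in CAP or in MISCAN: the function names are ours, the deaths and person-years in the first call are hypothetical, and the Wald interval on the log scale ignores the cluster randomisation that a real analysis of CAP would have to account for. Only the values 0.94, 0.85 and 1.08 in the final check come from the text above.

```python
import math


def rate_ratio_with_ci(deaths_a, py_a, deaths_b, py_b, z=1.96):
    """Mortality rate ratio (arm A vs arm B) with a Wald CI on the log scale."""
    rr = (deaths_a / py_a) / (deaths_b / py_b)
    # Standard error of log(rate ratio), assuming Poisson-distributed death counts
    se_log = math.sqrt(1 / deaths_a + 1 / deaths_b)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


def within_interval(value, lower, upper):
    """True if a predicted value lies inside an observed confidence interval."""
    return lower <= value <= upper


# Hypothetical arms: 90 vs 100 deaths over 100 000 person-years each.
rr, lower, upper = rate_ratio_with_ci(90, 100_000, 100, 100_000)
print(round(rr, 2), round(lower, 2), round(upper, 2))

# Model prediction (0.94) against the observed CAP interval (0.85, 1.08).
print(within_interval(0.94, 0.85, 1.08))  # True
```

A prediction falling inside the observed interval does not prove the model correct; it only shows that the trial result does not contradict the model's assumptions.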
Figure 1. Cumulative number of prostate cancer deaths in both arms of the CAP trial by follow-up years, as predicted by the MISCAN model
Validation is one of the main methods for achieving trust and confidence in healthcare models.23 Model validation methods include: face validity, verification (or internal validity), cross validity, external validity and predictive validity; the latter has been suggested to be the most desirable method.23 In several modelling studies, the CISNET models have been replicated by independent researchers (with external validation by others). For example, the MISCAN prostate model was replicated by independent researchers based on the reporting of all basic parameters in our papers.24,25 We have also described how the MISCAN model prediction of the impact of breast cancer screening fulfils predictive validity.26 In short, MISCAN model predictions for the impact of breast cancer screening on incidence, made in 1994 for a steady-state screening situation27, closely resemble the actual breast cancer incidence rates in 2010 in the Netherlands.
A well-validated (simulation) model can play a crucial role in the development of sound cancer control policies, particularly when a lack of diverse trials means that RCTs and other empirical studies cannot provide information on the long-term harm–benefit ratio or on the optimum age or interval for screening. In this commentary article, we showed how modelling can be used to quantify the ratio between harms and benefits, and identified an age category with a better harm–benefit balance for prostate cancer screening than the age range applied in the trials that showed efficacy.
This publication was made possible by Grant Number U01 CA199338 from the National Cancer Institute as part of the Cancer Intervention and Surveillance Modeling Network, which supported the underlying development of the simulation model utilised. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Cancer Institute.
Externally peer reviewed, not commissioned.
© 2019 Getaneh et al. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Licence, which allows others to redistribute, adapt and share this work non-commercially provided they attribute the work and any adapted version of it is distributed under the same Creative Commons licence terms.